Python for Data Analysis: Data Wrangling with Pandas, NumPy, by Wes McKinney

    time_zones = [rec['tz'] for rec in records]
    KeyError: 'tz'

Oops! Turns out that not all of the records have a time zone field. This is easy to handle, as we can add the check if 'tz' in rec at the end of the list comprehension:

    In [26]: time_zones = [rec['tz'] for rec in records if 'tz' in rec]

    In [27]: time_zones[:10]
    Out[27]:
    [u'America/New_York',
     u'America/Denver',
     u'America/New_York',
     u'America/Sao_Paulo',
     u'America/New_York',
     u'America/New_York',
     u'Europe/Warsaw',
     u'',
     u'',
     u'']

Just looking at the first 10 time zones, we see that some of them are unknown (empty).
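The missing-key check described above can be tried on a small made-up dataset. The following is a minimal sketch in Python 3, not the book's own code: the sample `records`, the `cleaned` list, and the `'Unknown'` label are assumptions introduced for illustration, and `collections.Counter` is just one convenient way to tally the result.

```python
from collections import Counter

# Hypothetical sample records standing in for the real dataset.
records = [
    {"tz": "America/New_York", "a": "Mozilla/5.0"},
    {"tz": "America/Denver"},
    {"a": "GoogleMaps/RochesterNY"},  # no 'tz' key at all
    {"tz": ""},                       # key present, value empty
    {"tz": "America/New_York"},
]

# Filtering on "'tz' in rec" skips records that would raise KeyError.
time_zones = [rec["tz"] for rec in records if "tz" in rec]

# Empty strings survive the filter, so label them explicitly
# (the 'Unknown' label is an illustrative choice).
cleaned = [tz if tz else "Unknown" for tz in time_zones]

# Tally how often each time zone occurs.
counts = Counter(cleaned)
print(counts.most_common(2))
```

Note that the filter only guards against a missing key. Records whose time zone is present but empty still pass through, which is why the unknown values show up in the output and need a separate step.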

Some IDEs even provide IPython integration, so it’s possible to get the best of both worlds. The IPython project began in 2001 as Fernando Pérez’s side project to make a better interactive Python interpreter. In the subsequent 11 years it has grown into what’s widely considered one of the most important tools in the modern scientific Python computing stack. While it does not provide any computational or data analytical tools by itself, IPython is designed from the ground up to maximize your productivity in both interactive computing and software development.

Top time zones by Windows and non-Windows users

Figure 2-3. Percentage Windows and non-Windows users in top-occurring time zones

All of the methods employed here will be examined in great detail throughout the rest of the book.

org/node/73) provides a number of collections of movie ratings data collected from users of MovieLens in the late 1990s and early 2000s. The data provide movie ratings, movie metadata (genres and year), and demographic data about the users (age, zip code, gender, and occupation).

