Pandas Quick-Start

I’ve fallen in love with doing data analysis using Python and Pandas. Here are some useful ways to get started:

It’s easy to read data from CSV files, Excel files, HDF5, SQL and lots of other data sources. Use the pd.read_xxx functions (read_csv, read_excel, read_hdf, read_sql and so on) for this.

import pandas as pd
import os

df = pd.read_csv(os.path.expanduser("~/data/mydata.csv"))
print(df.head(5))  # output the first 5 observations

Think of a Pandas DataFrame as being like an Excel sheet, where each column has its own data type. The column types are accessible through the df.dtypes attribute (it is an attribute, not a method, so no parentheses).
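To make the point about per-column types concrete, here is a minimal sketch. The sample data and the column names (name, qty, price) are made up for illustration and are not from any real dataset.

```python
import pandas as pd

# Hypothetical sample data; the column names are invented for this example.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "qty": [1, 2, 3],
    "price": [0.5, 1.25, 2.0],
})

# dtypes is an attribute: it returns a Series mapping column name -> dtype.
col_types = df.dtypes
print(col_types)

# Individual column types can be checked with the helpers in pd.api.types.
qty_is_int = pd.api.types.is_integer_dtype(df["qty"])
price_is_float = pd.api.types.is_float_dtype(df["price"])
```

Each column keeps one type for all of its values, which is the main way a DataFrame differs from a free-form spreadsheet.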

You can use the head() method and tail() method to glance at the first and last values of the dataset.
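As a quick sketch of head() and tail(), using a small made-up DataFrame with a single hypothetical column n:

```python
import pandas as pd

# Hypothetical data: ten rows numbered 0..9.
df = pd.DataFrame({"n": range(10)})

first = df.head(3)  # the first 3 rows
last = df.tail(2)   # the last 2 rows

print(first)
print(last)

# With no argument, both methods return 5 rows.
default_rows = len(df.head())
```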

df.describe() gives a quick statistical summary of the dataset.
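Here is a small sketch of what describe() returns for a numeric column. The data and the column name x are invented for the example.

```python
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

# describe() returns a new DataFrame whose rows are the summary statistics.
summary = df.describe()
print(summary)

# Individual statistics can be looked up by row label and column name.
mean_x = summary.loc["mean", "x"]
```

For numeric columns the rows are count, mean, std, min, the 25%, 50% and 75% percentiles, and max.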

You can grab a single column of the dataset by name, as in df['Blah'], or iterate through the rows using the df.iterrows() method.
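Both approaches can be sketched like this. The DataFrame is made up, and Other is a hypothetical second column added only so that the rows have more than one field.

```python
import pandas as pd

# Hypothetical data; 'Blah' matches the column name used in the text.
df = pd.DataFrame({"Blah": [10, 20, 30], "Other": ["x", "y", "z"]})

# Selecting by name returns a single column as a Series.
col = df["Blah"]

# iterrows() yields (index, row) pairs, where each row is a Series.
total = 0
for index, row in df.iterrows():
    total += row["Blah"]

print(col.sum(), total)
```

Looping with iterrows() is fine for small data or for exploring, but column-wise operations such as col.sum() are usually much faster on large datasets.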

There is a Quick 10 Minute Introduction over at

Jupyter and EIN

I have fallen in love with running a Jupyter server on my notebook, and connecting to it using Emacs and the EIN package. It is great having a proper editor, set up for Python coding, to work on my Math models. I am starting to use it to create a Computable Document repository – and let’s face it – every document should be computable!
