Data structures for statistical computing in Python - SciPy Conferences
Data structures for statistical computing in Python - SciPy Conferences
Data structures for statistical computing in Python - SciPy Conferences
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
Structured arrays<br />
Structured arrays are great <strong>for</strong> many applications, but not always<br />
great <strong>for</strong> general data analysis<br />
Pros<br />
Fast, memory-efficient, good <strong>for</strong> load<strong>in</strong>g and sav<strong>in</strong>g big data<br />
Nested dtypes help manage hierarchical data<br />
Cons<br />
Can’t be immediately used <strong>in</strong> many (most?) NumPy methods<br />
Are not flexible <strong>in</strong> size (have to use or write auxiliary methods to “add”<br />
fields)<br />
Not too many built-<strong>in</strong> data manipulation methods<br />
Select<strong>in</strong>g subsets is often O(n)!<br />
McK<strong>in</strong>ney () Statistical <strong>Data</strong> Structures <strong>in</strong> <strong>Python</strong> <strong>SciPy</strong> 2010 7 / 31