08.08.2013 Views

Data tools tipsheet - Investigative Reporters and Editors

OpenRefine

http://openrefine.org/

Price: Free

OpenRefine (formerly Google Refine) allows rapid cleaning of data through a combination of Excel-like formulas and text faceting/clustering. The downloadable application, which runs in a Web browser, groups similar words together using several algorithms and lets users quickly standardize names, businesses and other data.
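To give a feel for how this kind of clustering can work, here is a minimal Python sketch of a "key collision" approach: each value is reduced to a normalized key, and values that share a key are grouped. It is a simplified illustration in the spirit of fingerprint-style clustering, not OpenRefine's actual code; the function names `fingerprint` and `cluster` and the sample names are made up for this example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalized key (simplified, illustrative)."""
    # Fold accented characters to plain ASCII where possible.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase and drop punctuation.
    cleaned = re.sub(r"[^\w\s]", "", ascii_text.lower())
    # Split into words, remove duplicates, and sort so word order is ignored.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values whose keys collide; return only groups with 2+ members."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical sample data.
names = ["Acme Corp.", "acme corp", "Corp Acme", "Widget LLC", "Acmé Corp"]
for group in cluster(names):
    print(group)
```

Each printed group is a set of candidate variants that a person would then review and merge under one standard spelling. This kind of key is deliberately strict, so it will not catch misspellings; looser methods trade that strictness for more false matches.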

MySQL

http://dev.mysql.com/downloads/installer/5.6.html

Price: Free

Although it’s a powerful (and free) tool for building databases, MySQL isn’t particularly user-friendly. Its open-source community consists mostly of hardcore developers.

Navicat

http://www.navicat.com/

Price: $100

Navicat’s $100 price tag can be worth it if you work with MySQL regularly. It provides a user-friendly front end for the database and reduces how much SQL you need to know. A free trial is available.

Muse

http://mobisocial.stanford.edu/muse/

Price: Free

This experimental research tool from a Stanford computer scientist was built to help users browse large email archives. Although it was originally meant for people to browse their own archives, it’s been adapted to import mailbox files from Outlook and other clients.

OTHER COOL STUFF

Mr. Data Converter

http://shancarter.github.io/mr-data-converter/

Price: Free

This open-source tool, built by Shan Carter, converts Excel data into one of several web-friendly structured formats, including HTML, JSON and XML.
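For readers curious what such a conversion involves, the following Python sketch turns tab-delimited text (the form that cells copied out of a spreadsheet usually take) into JSON. It is only an illustration of the general idea using Python's standard library, not the tool's own code; the function name `tsv_to_json` and the sample rows are invented for this example.

```python
import csv
import io
import json


def tsv_to_json(pasted: str) -> str:
    """Convert tab-delimited text with a header row into a JSON array."""
    # Treat the first row as column names and each later row as one record.
    reader = csv.DictReader(io.StringIO(pasted), delimiter="\t")
    return json.dumps(list(reader), indent=2)


# Hypothetical sample: two columns, two data rows.
sample = "name\tcity\nAda\tLondon\nGrace\tNew York\n"
print(tsv_to_json(sample))
```

Every value comes out as a string in this sketch; numbers and dates would need an extra conversion step if the downstream use requires typed values.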

Data Science Toolkit

http://www.datasciencetoolkit.org/

Price: Free

This toolkit features a suite of easy-to-use Web apps for doing all kinds of cool things to data, like converting PDFs to plain text and converting street addresses to coordinates. It also offers an open API for more advanced users.

Jigsaw

http://www.cc.gatech.edu/gvu/ii/jigsaw/

Price: Free

Another experimental tool born out of academia, this Java application helps users make sense of large collections of documents with the help of text analysis algorithms. It offers a variety of ways to look at the documents, from topic clustering to entity extraction.
