27.12.2016

Hacker Bits, Issue 12

HACKER BITS is the monthly magazine that gives you the hottest technology stories crowdsourced by the readers of Hacker News. We select from the top voted stories and publish them in an easy-to-read magazine format. Get HACKER BITS delivered to your inbox every month! For more, visit https://hackerbits.com/issue12.



Introduction

Data science continues to generate excitement and yet real-world results can often disappoint business stakeholders. How can we mitigate risk and ensure results match expectations? Working as a technical data scientist at the interface between R&D and commercial operations has given me an insight into the traps that lie in our path. I present a personal view on the most common failure modes of data science projects.

The long version with slides and explanatory text below. Slides in one PDF here.

There is some discussion at Hacker News.

First, about me:

This talk is based on conversations I’ve had with many senior data scientists over the last few years. Many companies seem to go through a pattern of hiring a data science team only for the entire team to quit or be fired around 12 months later. Why is the failure rate so high?

Let’s begin:

A very wise data science consultant told me he always asks if the data has been used before in a project. If not, he adds 6-12 months onto the schedule for data cleansing.

Do a data audit before you begin. Check for missing data, or dirty data. For example, you might find
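As a minimal sketch of the kind of data audit described above, the following Python snippet counts missing values per column and exact duplicate rows in CSV data. The sample data, the column names, the `MISSING_TOKENS` set and the `audit` function are all assumptions made for illustration and do not come from the talk; a real audit would likely go further (value ranges, inconsistent encodings, and so on).

```python
import csv
import io
from collections import Counter

# Hypothetical markers for "missing"; adjust to the conventions of your own data.
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def audit(rows):
    """Return row count, per-column missing counts, and exact-duplicate row count."""
    missing = Counter()
    seen = set()
    duplicates = 0
    total = 0
    for row in rows:
        total += 1
        for column, value in row.items():
            # DictReader yields None for fields absent from a short row.
            if value is None or (
                isinstance(value, str) and value.strip().lower() in MISSING_TOKENS
            ):
                missing[column] += 1
        # Rows are compared as tuples of their stringified values.
        key = tuple(str(v) for v in row.values())
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": total, "missing": dict(missing), "duplicates": duplicates}


# Illustrative sample only, not data from the talk.
SAMPLE = """customer_id,age,country
1,34,UK
2,,UK
3,N/A,
2,,UK
"""

report = audit(csv.DictReader(io.StringIO(SAMPLE)))
print(report)
```

To run the same check on a real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample. Even a crude report like this one helps show how much cleansing time a dataset might need before modelling starts.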

