22.07.2019 Views

Real NU MSiA Newsletter


SUMMER 2019

MSIA TIMES

INSIDE:
Interview: Seminar Series Speaker, Patrick Boueri
Alumni Spotlight: Julia Greenberger
Upcoming Events
Reinforcement Learning Videos
MSiA Hackathon made the McCormick news!
Extra-Curricular Projects
Gaming Analytics
Meet the MSiA Staff
Get MSiA famous! Selfies on social media


WELCOME TO MSIA TIMES!

Welcome to the inaugural edition of MSiA Times, the alumni-exclusive newsletter that shares the latest happenings among the MSiA family. MSiA knows life after graduation gets busy and classmates may not be as close by as before, so we present you with a place to share stories, upcoming MSiA events, industry news, and much more.

We would love to hear from you! Send your news to analytics@northwestern.edu.


THE MSIA CLASS OF 2018 ZETAS

SOON TO JOIN THE MSIA ALUMNI COMMUNITY ARE THE ETAS, CLASS OF '19. CLICK HERE TO FIND OUT MORE ABOUT THE ETAS.


AN INTERVIEW WITH PATRICK BOUERI

Patrick Boueri, senior data science manager at Uptake, took some time to speak with Naomi Kaduwela (MSiA '19) before his presentation on challenges in machine learning serving as part of MSiA's quarterly seminar series.

Naomi: What is your favorite part about being a data scientist?

Patrick: My favorite is definitely the team. It is composed of people from different walks of life who have strong, passionate technical interests, which speaks to me. It’s interesting to see the confluence of those across social sciences, math, and hard sciences. They are willing to share knowledge and experience across domains, which fosters creativity and fun discussions. Also, when you get into data work, you can answer your own questions, which is very cool to me. If you want to be able to measure something, you can go out and do the empirical work yourself. You’re constantly uncovering new things.

Naomi: What has been your favorite project?

Patrick: My favorite project was an error analysis where we trained a tree and some error accumulated. We took the tree’s leaves (its terminal nodes) as a feature vector and clustered the terminal predictions to do error analysis on that. It was a way to introspect on the tree and find out where the errors occur. It was fun to do because you get into the guts of the algorithm. Practically, it’s hard to come up with rules that transcend any data set.

Naomi: What are the challenges to getting to more productionized AI solutions?

Patrick: It’s a confluence of things: expertise, right data, right system.

Getting the right consumers is also important, which is why UI/UX initiatives are key to ensure it drives actions downstream and the right behavior down the road.

Marquee products like Google were built for ad and web search, so they have phenomenal ML in that space.

Most companies sell products or services, and analytics is an offering or optimization on a business process. For many companies, analytics is not their core. However, if they have good analytics on their products, they can get increases in margin. But it’s harder for them to do because it’s more spread out and distributed.
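For readers curious about the tree-based error analysis Patrick mentions, here is a minimal Python sketch of the general idea. It is not the project's actual code: it simply groups prediction errors by the leaf each sample landed in, which is a simpler technique than the clustering of leaf-based features described in the interview. The function name `error_by_leaf` and all data are hypothetical, and the leaf IDs are assumed to have been obtained already (for example from scikit-learn's `DecisionTreeClassifier.apply`).

```python
from collections import defaultdict


def error_by_leaf(leaf_ids, y_true, y_pred):
    """Return {leaf_id: (n_samples, error_rate)} for a tree's predictions."""
    counts = defaultdict(int)   # samples that ended in each leaf
    errors = defaultdict(int)   # misclassified samples in each leaf
    for leaf, truth, pred in zip(leaf_ids, y_true, y_pred):
        counts[leaf] += 1
        if truth != pred:
            errors[leaf] += 1
    return {leaf: (counts[leaf], errors[leaf] / counts[leaf]) for leaf in counts}


# Hypothetical data: the leaf each sample fell into, true labels, predictions.
leaf_ids = [3, 3, 3, 5, 5, 8, 8, 8, 8]
y_true = [1, 1, 0, 0, 0, 1, 0, 1, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0]

report = error_by_leaf(leaf_ids, y_true, y_pred)

# Sort leaves so the ones with the highest error rate come first.
worst_first = sorted(report.items(), key=lambda kv: kv[1][1], reverse=True)
for leaf, (n, rate) in worst_first:
    print(f"leaf {leaf}: n={n}, error rate={rate:.2f}")
```

Leaves at the top of the sorted report are the regions of the feature space where the tree is least reliable, which is one way to "find out where the errors occur."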


AN INTERVIEW WITH PATRICK BOUERI, continued

Naomi: Is it possible that there is too much abstraction with programming languages and libraries?

Patrick: Too much abstraction is only a bad thing if they are leaky abstractions. For example, you don’t care about the low-level details of what your compiler is doing because it’s a very hermetic abstraction. But data is leaky, and that’s where you get results that make you question every move, because you don’t have a solid foundation.

I worry if they are used incorrectly, in the sense that decisions are being made off of them. But I’m not sure we are at that point yet, where so many machine learning algorithms are being productionized effectively.

But it’s cool! I wouldn’t want to write a stochastic gradient descent model every time I train a model.

Naomi: What are your thoughts on the future of data science tool sets? I also see a lot of customization within each domain with regard to the data science packages.

Patrick: Even with the abstractions, we get to grapple with the domain. And I think that’s where the future is going to go - to get more productive. You can throw as much machine learning as you want at it, but the way to narrow the search space to get things that are useful, without infinite compute and memory, is to bundle domains into ML libraries.

I think we'll see a lot of specialized domains. What we see growing in audio and natural language processing will expand, for example, to CRM.

You’ll see providers that have a specific domain - like speech to text - with an API built on top of it so plug and play is easy for businesses. We have a lot of disparate tools, so to be able to pull it off the shelf quickly and do it ourselves easily would be a great thing.

Naomi: In our Master of Science in Analytics program this quarter we are learning to build and deploy end-to-end production-ready applications. Can you talk about the uniqueness vs intersections of the data scientist vs software developer role these days?

Patrick: I would call that role a machine learning engineer. There are strong cultural differences between data scientists and software developers. Software developers can unit-test locally and there’s a sense of certainty where the uncertainty is from users. With data science it can be a bit frustrating because there is uncertainty in the data. So when someone says, “If I give you this data, will it work?” the answer is “I think so.” There is also technical rigor. We can learn from software developers: version control and a DevOps mindset.

Click here for the full interview!


SEND US YOUR SELFIES!

We always make Diego take selfies when he's attending conferences. Follow us on Instagram to check it out [instagram.com/nu_analytics]! Also, send us your selfies (analytics@northwestern.edu), whether it be at work, in your city, on a trip, or at a conference. We'll make you an MSiA star!


Summer Internships

MSiA's current cohort is busy at work...and we mean that literally. They marked their internship location on the map. Check out the video to see where they are during the summer!


UPCOMING EVENTS

Student Boot Camp, Evanston, 2019. Please click to volunteer.

San Francisco Alumni Mixer, 2019. Click for details and to RSVP.

August 30


MSiA Staff

Beau Breeden

As the program’s System Administrator, Beau’s mission is to make sure that the technological needs of the students and faculty are met. He calls upon more than 18 years of single-user to multi-enterprise IT experience to assist in that effort.

DEEP DISH OR THIN SLICE?

THIN SLICE. CHANGE MY MIND.

Borchuluun Yadamsuren

As a research facilitator, I will work with students on their extracurricular projects. Previously, I worked for the Information Experience Laboratory of the University of Missouri on various user experience research projects. I also taught computer science and user experience related courses at the University of Missouri and Columbia College.

WHAT IS YOUR GUILTY PLEASURE?

BROWSING THE INTERNET FOR SERENDIPITOUS INFORMATION ENCOUNTERS

Cindy Nguyen

Cindy La Nguyen is our lead on admissions and marketing. She has her PhD and MS in Sociology, and her BA in Comparative Sociology (a mix of sociology and cultural anthropology). Cindy is fascinated by cultural movements big and small. She specializes in ethnographic methods and the study of how race, ethnicity, gender, and citizenship intersect.

WOULD YOU RATHER GIVE ADVICE TO YOUR PAST-SELF OR HAVE A CONVERSATION WITH YOUR FUTURE-SELF?

I’M MORE OF A FORWARD LOOKING PERSON. I’D DEFINITELY SIT DOWN AND HAVE A TALK WITH MY FUTURE-SELF. I’D SAY TO HER, “YOU’RE SITTING DOWN AND HAVING A CONVERSATION WITH YOUR PAST-SELF RIGHT NOW, WOULDN’T YOU RATHER BE TALKING TO YOUR FUTURE-SELF?” AND THEN I’D ASK HER IF TELEPORTATION EVER BECOMES A THING.


Noelle Afolabi

Noelle Afolabi is our Program Assistant. She holds an MFA in Creative Writing from Northwestern University and a BA in Anthropology and Creative Writing from Eastern Michigan University. Noelle uses her cultural anthropology background with her creative writing skills in order to create well-rounded fiction pieces. When she’s not reading or writing, she enjoys biking Chicagoland trails with her husband and knitting.

IF YOU OPENED A BUSINESS, WHAT KIND OF BUSINESS WOULD IT BE?

BARACK OBAMA SAID, "THE FUTURE BELONGS TO YOUNG PEOPLE WITH AN EDUCATION AND THE IMAGINATION TO CREATE." I WOULD LOVE TO BUILD UPON OBAMA'S WORDS AND OPEN AN ARTS MEGA CENTER WHERE PEOPLE OF ALL AGES COULD LEARN AND REFINE SKILLS IN WRITING, MUSIC, DANCE, FINE ART, CULINARY ART AND FASHION.

Sarah Mitchell

Sarah Mitchell has been with the MSiA program for 3.5 years and has her Bachelor's degree in education. She enjoys board games, has a mean cat, and pays to play bass guitar in a blues band. Her favorite local tourist spot is the House on the Rock in southern Wisconsin.

WHAT MAKES YOUR DAY, SARAH?

THE EARTH'S ROTATION.


HACKATHON 2019

MSiA's sixth annual Hackathon, sponsored by ABC Supply Co., challenged students to apply what they learned in coursework to real data.

Thanks to MSiA alumnae Jill Fan (MSiA ’18), Zili Li (MSiA ’18), and Madhuri Gupta (MSiA ’17), who participated as judges for the competition.

“One of the things I enjoyed the most was that the prompts were so open ended,” MSiA student Shreyas Sabnis said. “We didn’t go into the day knowing what problem we were going to solve, but we understood what to look for as we studied the data. Many of the concepts we learned in class proved to be extremely useful.”

Read the FULL ARTICLE to find out more!


Extra-Curricular Projects

Now presenting two summaries from the Etas' extracurricular projects.

BuildChange
by Elliot Gardner (class of '19)

Over the winter and spring quarters, four students (Ruixiang Fan, Elliot Gardner, Kejin Qian, and Yiwei Zhang), advised by PhD student Zhe Su, worked on a project for BuildChange, a non-profit that conducts structural retrofitting projects on housing in third world countries that have faced natural disasters. One impediment to increasing the scope of work that BuildChange can conduct is the time needed to measure houses for suitability for retrofitting. As a result, BuildChange is considering building an app that will use a Machine Learning model, coupled with Computer Vision techniques, to allow users in a country to assess whether their house is a good candidate for retrofitting. Training such a model would require quite a bit of training data (in the form of labeled pictures of houses), much more than BuildChange currently has access to.

Because of this challenge, BuildChange reached out to Northwestern through MSiA alumna Patty Liu Svenson (class of 2016) to get help with generating training data. BuildChange has software which will take measurements of a house and generate a CAD model of the structure, from which pictures of the structure from different perspectives can be extracted. However, generating a statistically distributed set of house measurements to produce the set of houses was beyond the technical capacity of BuildChange. Thus, the MSiA students set to work understanding the problem and the ranges of possible values for measurements, and designing a script with an R Shiny dashboard which would take a set of possible input ranges and output a set of building measurements of the desired size. This would allow BuildChange to generate a sufficiently large training set of images for training a model for house retrofit classification, and also to adjust the parameters in the future for changing environments or changes in the types of structures to be retrofitted.
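To illustrate the idea of turning input ranges into a set of synthetic building measurements, here is a minimal sketch. It is written in Python rather than the R Shiny tool the students built, and it is not their implementation: the measurement names, the ranges, the function name `sample_buildings`, and the choice of independent uniform sampling are all assumptions made for illustration.

```python
import random


def sample_buildings(ranges, n, seed=None):
    """Draw n synthetic buildings.

    Each measurement is sampled independently and uniformly within its
    (low, high) range. Uniform sampling is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n)
    ]


# Hypothetical measurement ranges, in meters.
ranges = {
    "wall_length": (3.0, 12.0),
    "wall_height": (2.2, 3.5),
    "door_width": (0.7, 1.2),
}

buildings = sample_buildings(ranges, n=5, seed=42)
for building in buildings:
    print({name: round(value, 2) for name, value in building.items()})
```

Each generated record could then be fed to CAD software of the kind described above to render labeled images. Changing the ranges, or swapping the uniform draw for another distribution, adjusts the synthetic data to new environments.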

Player Profiling and Retention Prediction in Sandbox Games (Just Cause 2)
by Arpan Venugopal (class of '19)

Open world video games are designed to offer free-roaming virtual environments and agency to the players, providing a substantial degree of freedom to play the games in the way the individual player prefers. Open world games are typically either persistent or, for single-player versions, semi-persistent, meaning that they can be played for long periods of time and generate substantial volumes and variety of user telemetry. Combined, these factors can make it challenging to develop insights about player behavior to inform the design and live operations in open world games. Developing behavioral profiles and predicting the behavior of players are important analytical tools for understanding how a game is being played and why players depart (churn).

During the course of the project, we explored a novel method of building behavioral profiles in a sandbox game (Just Cause 2) and applied the same methodology to predict which players are likely to churn or to continue engaging with the game. We adopted the Relaxed Tensor Dual DEDICOM (RTDD) algorithm for bipartite tensor factorization of spatio-temporal and behavioral data, allowing for semi-autonomous representation learning and dimensionality reduction. We also attempted to interpret the features generated from this tensor factorization and their utility in profiling players and predicting player retention. Our results indicate that the incorporation of the temporal dimension, as made possible by RTDD, provides meaningful prediction improvement when compared to simple lifetime-behavior models.

The tensor factorization methodology and player retention prediction were discussed in a paper submitted to the Artificial Intelligence and Interactive Digital Entertainment (AIIDE’19) conference, which is currently under review.


REINFORCEMENT LEARNING

There's always room for reinforcing your skills and knowledge! Last year, Diego began teaching a new elective called Reinforcement Learning "where an agent learns in real time based on an ever-changing environment and receiving a reward after an action is taken." We have select recordings of Reinforcement Learning class sessions to share with you.

Click to view!

Gaming Analytics

Props go out to Rishabh Joshi, Varun Gupta, Xinyue Li, Yue Cui and Ziwen Wang, a few Zetas who co-wrote the ACSW 2019 paper "A Team Based Player Versus Player Recommender Systems Framework For Player Improvement" while in the program.

"Modern Massively Multi-player Online Games (MMOGs) have grown to become extremely complex in terms of the usable resources in the games, resulting in an increase in the amount of data collected by tracking the in-game activities of players. This has opened the door for researchers to come up with novel methods to utilize this data to improve and personalize the user experience. In this paper, a novel but flexible framework towards building a team based recommender system for player-versus-player (PvP) content in such MMOGs is presented, and applied to a case study in the context of the major commercial title Destiny 2. The framework combines behavioral profiling via cluster analysis with recommendation systems to look at teams of players as a unit, as well as the individual players, to make recommendations to the players, with the purpose of providing information to them towards improving their performance."

Click to read the full analysis


ALUMNI SPOTLIGHT
by Sarah Mitchell

Julia Greenberger (class of 2017) and I spent some time talking in the new offices of Opex in downtown Chicago last week.

Julia has an engineering background and has always been interested in using numbers and problem-solving, but the tipping point for her came when she hit the wall-that-is-Excel in her work as an economic consultant in the energy practice.

Data science is a perfect fit for her because it uses numbers and problem-solving, while also incorporating aspects of visualization and design.

I asked her if there was something she learned about in the program that she unexpectedly found herself using in work. She said that after having tried several more algorithms for a text mining project, the Jaccard Similarity Index (MSiA 431, Data Mining: Malthouse) ended up being the most accurate, demonstrating that sometimes the simplest answer is indeed the best.

Julia's no stranger to hard work. Before her MSiA all-nighters studying and working on projects, her efforts Friday and Saturday nights making stuffed bagels to sell at weekend farmers' markets gave the company she co-founded, Poppy’s Stuffed Bagels, its start in DC years ago.

In her free time, Julia still enjoys spending time in the kitchen, and is currently going through her mother's and grandmother's recipes to try each one out. She also enjoys going to the dog beach with Olive!

Olive.
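For anyone who wants a refresher on the Jaccard Similarity Index mentioned in the spotlight, here is a minimal Python sketch. It is not Julia's project code: the function name `jaccard_similarity`, the example sentences, and the whitespace tokenization are assumptions for illustration. The index itself is the size of the intersection of two sets divided by the size of their union.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of tokens.

    By convention in this sketch, two empty collections score 1.0.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical documents, tokenized by splitting on whitespace.
doc1 = "the simplest answer is often the best".split()
doc2 = "the best answer is the simplest one".split()

score = jaccard_similarity(doc1, doc2)
print(f"Jaccard similarity: {score:.3f}")
```

The two example sentences share five distinct words out of seven overall, so the score is 5/7. A score of 1.0 means identical word sets and 0.0 means no words in common.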
