Real NU MSiA Newsletter
SUMMER 2019<br />
MSIA TIMES<br />
INSIDE:<br />
Interview: Seminar Series Speaker, Patrick Boueri<br />
Alumni Spotlight: Julia Greenberger<br />
Upcoming Events<br />
Reinforcement Learning Videos<br />
<strong>MSiA</strong> Hackathon made the McCormick news!<br />
Extra-Curricular Projects<br />
Gaming Analytics<br />
Meet the <strong>MSiA</strong> Staff<br />
Get <strong>MSiA</strong> famous! Selfies on social media
WELCOME TO MSIA TIMES!<br />
Welcome to the inaugural edition of <strong>MSiA</strong><br />
Times, the alumni-exclusive newsletter that<br />
offers up-to-date happenings among the <strong>MSiA</strong><br />
family. <strong>MSiA</strong> knows life after graduation gets<br />
busy and classmates may not be as close by<br />
as before, so we present you with a place to<br />
share stories, upcoming <strong>MSiA</strong> events, industry<br />
news, and much more.<br />
We would love to hear from you! Send your<br />
news to analytics@northwestern.edu.
THE ZETAS: MSIA CLASS OF 2018<br />
SOON TO JOIN THE MSIA ALUMNI COMMUNITY ARE THE<br />
ETAS, CLASS OF '19. CLICK HERE TO FIND OUT<br />
MORE ABOUT THE ETAS.
AN INTERVIEW WITH<br />
PATRICK BOUERI<br />
Patrick Boueri, senior data science<br />
manager at Uptake, took some time<br />
to speak with Naomi Kaduwela<br />
(<strong>MSiA</strong> '19) before his presentation<br />
on challenges in machine learning<br />
serving as part of <strong>MSiA</strong>'s quarterly<br />
seminar series.<br />
Naomi: What is your favorite part about being a<br />
data scientist?<br />
Patrick: My favorite is definitely the team. It is<br />
composed of people from different walks of life<br />
who have strong, passionate technical interests,<br />
which speaks to me. It’s interesting to see the<br />
confluence of those across social sciences, math,<br />
and hard sciences. They are willing to share<br />
knowledge and experience across domains which<br />
fosters creativity and fun discussions. Also, when<br />
you get into data work, you can answer your own<br />
questions, which is very cool to me. If you want to<br />
be able to measure something, you can go out<br />
and do the empirical work yourself. You’re<br />
constantly uncovering new things.<br />
Naomi: What has been your favorite project?<br />
Patrick: My favorite project was an error analysis<br />
where we trained a tree and then some error<br />
accumulated. So we took the tree’s leaves, the<br />
terminal nodes, as a feature vector and clustered<br />
the terminal predictions to do error analysis on<br />
that. It was a way to introspect on the tree and<br />
find out where the errors occur.<br />
It was fun to do because you get into the guts of<br />
the algorithm. Practically, it’s hard to come up<br />
with rules that transcend any data set.<br />
Naomi: What are the challenges to getting to<br />
more productionized AI solutions?<br />
Patrick: It’s a confluence of things: expertise, right<br />
data, right system.<br />
Getting the right consumers is also important,<br />
which is why UI/UX initiatives are key to ensure it<br />
drives actions downstream and the right behavior<br />
down the road.<br />
Marquee products like Google were built for ad<br />
and web search so they have phenomenal ML in<br />
that space.<br />
Most companies sell products or services, and<br />
analytics is an offering or optimization on a<br />
business process. For many companies, analytics is<br />
not their core. However, if they have good<br />
analytics on their products, they can get increases<br />
in margin. But it’s harder to do for them because<br />
it’s more spread out and distributed.
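For readers curious about the leaf-based error analysis Patrick describes in his favorite-project answer, here is a minimal sketch of one way to read it. It is not Patrick's or Uptake's actual method. It assumes a toy one-feature regression tree written from scratch, treats each sample's terminal node as its feature, and uses a simple largest-gap split as a stand-in for the clustering step. All function names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_tree(xs, ys, depth, path="root"):
    """Fit a tiny 1-D regression tree by greedy squared-error splits."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    if depth > 0:
        for k in range(1, len(order)):
            lo, hi = xs[order[k - 1]], xs[order[k]]
            if lo == hi:
                continue  # cannot split between identical feature values
            left = [ys[i] for i in order[:k]]
            right = [ys[i] for i in order[k:]]
            left_mean, right_mean = mean(left), mean(right)
            sse = sum((y - left_mean) ** 2 for y in left)
            sse += sum((y - right_mean) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, (lo + hi) / 2, order[:k], order[k:])
    if best is None:
        # Terminal node: record which leaf this is and what it predicts.
        return {"leaf": path, "value": mean(ys)}
    _, threshold, left_idx, right_idx = best
    return {
        "threshold": threshold,
        "left": fit_tree([xs[i] for i in left_idx], [ys[i] for i in left_idx],
                         depth - 1, path + "/L"),
        "right": fit_tree([xs[i] for i in right_idx], [ys[i] for i in right_idx],
                          depth - 1, path + "/R"),
    }


def find_leaf(tree, x):
    """Walk from the root to the terminal node that x falls into."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree


def errors_by_leaf(tree, xs, ys):
    """Mean absolute error of the tree's predictions, grouped by leaf."""
    residuals = defaultdict(list)
    for x, y in zip(xs, ys):
        leaf = find_leaf(tree, x)
        residuals[leaf["leaf"]].append(abs(y - leaf["value"]))
    return {leaf_id: mean(errs) for leaf_id, errs in residuals.items()}


def split_by_largest_gap(leaf_errors):
    """Assumed stand-in for clustering: cut the sorted errors at the widest gap."""
    ranked = sorted(leaf_errors.items(), key=lambda item: item[1])
    gaps = [ranked[i + 1][1] - ranked[i][1] for i in range(len(ranked) - 1)]
    if not gaps or max(gaps) == 0:
        return [leaf_id for leaf_id, _ in ranked], []
    cut = gaps.index(max(gaps)) + 1
    return ([leaf_id for leaf_id, _ in ranked[:cut]],
            [leaf_id for leaf_id, _ in ranked[cut:]])


# Toy data: flat for x < 4, noisy around 10 afterwards.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 10, 10, 4, 16]
tree = fit_tree(xs, ys, depth=2)
leaf_errors = errors_by_leaf(tree, xs, ys)
low_error, high_error = split_by_largest_gap(leaf_errors)
print("Leaves where the errors concentrate:", high_error)
```

On this toy data the errors concentrate in a single leaf, which is the kind of "where do the errors occur" answer the analysis is after. A real version would likely use a library tree or an ensemble and a proper clustering algorithm.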
AN INTERVIEW WITH PATRICK BOUERI, continued<br />
Naomi: Is it possible that there is too much<br />
abstraction with programming languages and<br />
libraries?<br />
Patrick: Too much abstraction is only a bad thing if<br />
the abstractions are leaky. For example, you<br />
don’t care about the low level details of what<br />
your compiler is doing because it’s a very hermetic<br />
abstraction. But data is leaky, and that’s where<br />
you get results that make you question every<br />
move, because you don’t have a solid foundation.<br />
I worry if they are used incorrectly, in the sense<br />
that decisions are being made off of them. But, I’m<br />
not sure we are at that point yet, where so many<br />
machine learning algorithms are being<br />
productionized effectively.<br />
But it’s cool! I wouldn’t want to write stochastic<br />
gradient descent from scratch every time I train a model.<br />
Naomi: What are your thoughts on the future of<br />
data science tool sets? I also see a lot of<br />
customization within each domain with regard to<br />
the data science packages.<br />
Patrick: Even with the abstractions, we get to<br />
grapple with the domain. And I think that’s where<br />
the future is going to go - to get more productive.<br />
You can throw as much machine learning as you<br />
want at it, but the way to narrow the search<br />
space to get things that are useful, without infinite<br />
compute and memory is to bundle domains into<br />
ML libraries.<br />
I think we'll see a lot of specialized domains.<br />
What we see growing in audio and natural<br />
language processing will expand, for example,<br />
to CRM.<br />
You’ll see providers that have a specific domain,<br />
like speech to text, with an API built on top of<br />
it so plug and play is easy for businesses. We<br />
have a lot of disparate tools, so to be able to<br />
pull it off the shelf quickly and do it ourselves<br />
easily would be a great thing.<br />
Naomi: In our Master of Science in Analytics<br />
program this quarter we are learning to build<br />
and deploy end-to-end, production-ready<br />
applications. Can you talk about what is unique<br />
to, and what overlaps between, the data scientist<br />
and software developer roles these days?<br />
Patrick: I would call that role a machine<br />
learning engineer. There are strong cultural<br />
differences between data scientists and<br />
software developers. Software developers can<br />
unit-test locally and there’s a sense of<br />
certainty where the uncertainty is from users.<br />
With data science it can be a bit frustrating<br />
because there is uncertainty in the data. So<br />
when someone says, “If I give you this data,<br />
will it work?” The answer is “I think so”. There is<br />
also technical rigor. We can learn from<br />
software developers: version control and a<br />
DevOps mindset.<br />
Click here for the full<br />
interview!
SEND US YOUR<br />
SELFIES!<br />
We always make Diego take selfies when he's attending conferences.<br />
Follow us on Instagram to check it out [instagram.com/nu_analytics]!<br />
Also, send us your selfies (analytics@northwestern.edu), whether it be<br />
at work, in your city, on a trip, or at a conference. We'll make you an<br />
<strong>MSiA</strong> star!
Summer Internships<br />
<strong>MSiA</strong>'s current<br />
cohort is busy at<br />
work...and we mean<br />
that literally. They<br />
marked their<br />
internship location<br />
on the map. Check<br />
out the video to see<br />
where they are<br />
during the summer!
UPCOMING EVENTS<br />
Student Boot Camp, Evanston<br />
Alumni Mixer, San Francisco<br />
August 30, 2019<br />
Please click to volunteer<br />
Click for details and RSVP
<strong>MSiA</strong> Staff<br />
Beau Breeden<br />
Borchuluun Yadamsuren<br />
Cindy Nguyen<br />
As the program’s System<br />
Administrator, Beau’s<br />
mission is to make sure<br />
that the technological<br />
needs of the students and<br />
faculty are met. He calls<br />
upon more than 18 years<br />
of single-user to multi-enterprise<br />
IT experience to<br />
assist in that effort.<br />
DEEP DISH<br />
OR<br />
THIN SLICE?<br />
THIN SLICE.<br />
CHANGE MY<br />
MIND.<br />
As a research facilitator, I<br />
will work with students on<br />
their extracurricular<br />
projects. Previously, I<br />
worked for the Information<br />
Experience Laboratory of the<br />
University of Missouri on<br />
various user experience<br />
research projects. I also<br />
taught computer science and<br />
user experience related<br />
courses at the University of<br />
Missouri and Columbia<br />
College.<br />
WHAT IS<br />
YOUR GUILTY<br />
PLEASURE?<br />
BROWSING THE<br />
INTERNET FOR<br />
SERENDIPITOUS<br />
INFORMATION<br />
ENCOUNTERS<br />
Cindy La Nguyen is our lead on<br />
admissions and marketing. She<br />
has her PhD and MS in<br />
Sociology, and her BA in<br />
Comparative Sociology (a mix of<br />
sociology and cultural<br />
anthropology). Cindy is<br />
fascinated by cultural<br />
movements big and small. She<br />
specializes in ethnographic<br />
methods and the study of how<br />
race, ethnicity, gender, and<br />
citizenship intersect.<br />
WOULD YOU RATHER<br />
GIVE ADVICE TO YOUR<br />
PAST-SELF OR HAVE A<br />
CONVERSATION WITH<br />
YOUR FUTURE-SELF?<br />
I’M MORE OF A FORWARD LOOKING<br />
PERSON. I’D DEFINITELY SIT DOWN<br />
AND HAVE A TALK WITH MY<br />
FUTURE-SELF. I’D SAY TO HER,<br />
“YOU’RE SITTING DOWN AND<br />
HAVING A CONVERSATION WITH<br />
YOUR PAST-SELF RIGHT NOW,<br />
WOULDN’T YOU RATHER BE TALKING<br />
TO YOUR FUTURE-SELF?” AND THEN<br />
I’D ASK HER IF TELEPORTATION<br />
EVER BECOMES A THING.
<strong>MSiA</strong> Staff<br />
Noelle Afolabi<br />
Sarah Mitchell<br />
Noelle Afolabi is our Program<br />
Assistant. She holds an MFA in<br />
Creative Writing from Northwestern<br />
University and a BA in Anthropology<br />
and Creative Writing from Eastern<br />
Michigan University. Noelle uses her<br />
cultural anthropology background<br />
with her creative writing skills in<br />
order to create well-rounded fiction<br />
pieces. When she’s not reading or<br />
writing, she enjoys biking<br />
Chicagoland trails with her husband<br />
and knitting.<br />
IF YOU OPENED A<br />
BUSINESS, WHAT<br />
KIND OF BUSINESS<br />
WOULD IT BE?<br />
BARACK OBAMA SAID, "THE FUTURE<br />
BELONGS TO YOUNG PEOPLE WITH<br />
AN EDUCATION AND THE<br />
IMAGINATION TO CREATE." I WOULD<br />
LOVE TO BUILD UPON OBAMA'S<br />
WORDS AND OPEN AN ARTS MEGA<br />
CENTER WHERE PEOPLE OF ALL<br />
AGES COULD LEARN AND REFINE<br />
SKILLS IN WRITING, MUSIC, DANCE,<br />
FINE ART, CULINARY ART AND<br />
FASHION.<br />
Sarah Mitchell has been<br />
with the <strong>MSiA</strong> program for<br />
3.5 years and has her<br />
Bachelor's degree in<br />
education. She enjoys board<br />
games, has a mean cat, and<br />
pays to play bass guitar in<br />
a blues band. Her favorite<br />
local tourist spot is the<br />
House on the Rock in<br />
southern Wisconsin.<br />
WHAT MAKES<br />
YOUR DAY,<br />
SARAH?<br />
THE EARTH'S ROTATION.
HACKATHON 2019<br />
<strong>MSiA</strong>'s sixth annual Hackathon, sponsored by ABC Supply Co., challenged<br />
students to apply what they learned in coursework to real data.<br />
Thanks to <strong>MSiA</strong> alumnae Jill Fan (<strong>MSiA</strong> ’18), Zili Li (<strong>MSiA</strong> ’18), and Madhuri<br />
Gupta (<strong>MSiA</strong> ’17), who participated as judges for the competition.<br />
“One of the things I<br />
enjoyed the most was<br />
that the prompts were<br />
so open ended,” <strong>MSiA</strong><br />
student Shreyas Sabnis<br />
said. “We didn’t go into<br />
the day knowing what<br />
problem we were going<br />
to solve, but we<br />
understood what to<br />
look for as we studied<br />
the data. Many of the<br />
concepts we learned in<br />
class proved to be<br />
extremely useful.”<br />
Read the FULL ARTICLE<br />
to find out more!
Extra-Curricular Projects<br />
Now presenting two summaries from the Etas' extracurricular projects.
BuildChange<br />
by Elliot Gardner (class of '19)<br />
Over the winter and spring quarters, four students (Ruixiang Fan, Elliot Gardner, Kejin<br />
Qian, and Yiwei Zhang), with the advisement of one PhD student (Zhe Su), worked on a<br />
project for BuildChange, which is a non-profit that conducts structural retrofitting<br />
projects on housing in third world countries that have faced natural disasters. One<br />
impediment to increasing the scope of work that BuildChange can conduct is the time<br />
needed to measure houses for suitability for retrofitting. As a result, BuildChange is<br />
considering building an app that will use a Machine Learning model, coupled with<br />
Computer Vision techniques, to allow users in a country to assess whether their house<br />
is a good candidate for retrofitting. In order to train such a model, quite a bit of<br />
training data (in the form of labeled pictures of houses) would be needed, much more<br />
than BuildChange currently has access to.<br />
As a result of this challenge in obtaining training data, BuildChange reached out to<br />
Northwestern through <strong>MSiA</strong> alumna Patty Liu Svenson (class of 2016) to get help with
generating training data. BuildChange has software which will take measurements of a<br />
house and generate a CAD model of the structure, from which a picture of the structure<br />
from different perspectives can be extracted. However, generating a statistically<br />
distributed set of house measurements for generating the set of houses was beyond the<br />
technical capacity of BuildChange. Thus, the <strong>MSiA</strong> students set to work understanding<br />
the problem set, the ranges of possible values for measurements, and designing a<br />
script with an R Shiny dashboard which would take a set of possible input ranges and<br />
output a set of building measurements of the desired size. This would allow<br />
BuildChange to generate a sufficiently large training set of images for training a model<br />
for house retrofit classification, and also to adjust the parameters in the future for
changing environments or changes in the types of structures to be retrofitted.
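The core idea of the measurement generator can be sketched in a few lines. The team's tool was a script with an R Shiny dashboard, so the Python below is only an illustration of the idea. It assumes uniform sampling within each range, and the measurement names, units, and ranges are hypothetical.

```python
import random


def sample_measurements(ranges, n, seed=0):
    """Draw n synthetic measurement sets, one value per named range."""
    for name, (low, high) in ranges.items():
        if low > high:
            raise ValueError(f"range for {name!r} is inverted: {low} > {high}")
    rng = random.Random(seed)  # seeded so a batch can be regenerated exactly
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n)
    ]


# Hypothetical measurement names and ranges (in metres), for illustration only.
house_ranges = {
    "wall_length_m": (3.0, 12.0),
    "wall_height_m": (2.2, 3.5),
    "door_width_m": (0.7, 1.2),
}
houses = sample_measurements(house_ranges, n=5, seed=42)
for house in houses:
    print({name: round(value, 2) for name, value in house.items()})
```

Changing the entries in `house_ranges` mirrors the adjustable parameters described above: the same function can produce a new batch when the environment or the types of structures change.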
Player Profiling and Retention Prediction in Sandbox Games (Just Cause 2)<br />
by Arpan Venugopal (class of '19)<br />
Open world video games are designed to offer free-roaming virtual environments and agency to the<br />
players, providing a substantial degree of freedom to play the games in the way the individual<br />
player prefers. Open world games are typically either persistent or, for single-player versions, semi-persistent,
meaning that they can be played for long periods of time and generate substantial<br />
volumes and variety of user telemetry. Combined, these factors can make it challenging to develop<br />
insights about player behavior to inform the design and live operations in open world games.<br />
Developing behavioral profiles and predicting the behavior of players are important analytical tools
for understanding how a game is being played and why players depart (churn).
During the course of the project, we explored a novel method of building behavioral profiles in a
sandbox game (Just Cause 2) and applied the same methodology to predict players likely to churn or
to continue engaging with the game. We have adopted the Relaxed Tensor Dual DEDICOM (RTDD)<br />
algorithm for bipartite tensor factorization of spatio-temporal and behavioral data, allowing for<br />
semi-autonomous representation learning and dimensionality reduction. We also attempted to<br />
interpret the features generated from this tensor factorization and their utility in profiling players<br />
and predicting player retention. Our results indicate that the incorporation of the temporal<br />
dimension as made possible by RTDD provides meaningful prediction improvement when compared<br />
to simple lifetime-behavior models.<br />
The tensor factorization methodology and player retention prediction are discussed in a paper
submitted to the Artificial Intelligence and Interactive Digital Entertainment (AIIDE’19) conference,
which is currently under review.
REINFORCEMENT<br />
LEARNING<br />
There's always room for reinforcing your skills<br />
and knowledge! Last year, Diego began teaching<br />
a new elective called Reinforcement Learning<br />
"where an agent learns in real time based on an<br />
ever-changing environment and receiving a<br />
reward after an action is taken." We have select<br />
recordings of Reinforcement Learning class<br />
sessions to share with you.<br />
Click to view!<br />
Gaming Analytics<br />
Props go out to Rishabh Joshi, Varun Gupta, Xinyue Li, Yue Cui and Ziwen Wang, a few Zetas who co-wrote<br />
the ACSW 2019 paper, "A Team Based Player Versus Player Recommender Systems Framework For Player<br />
Improvement" while in the program.<br />
"Modern Massively Multi-player Online Games (MMOGs) have grown to become extremely complex in terms<br />
of the usable resources in the games, resulting in an increase in the amount of data collected by tracking<br />
the in-game activities of players. This has opened the door for researchers to come up with novel methods<br />
to utilize this data to improve and personalize the user experience. In this paper, a novel but flexible<br />
framework towards building a team based recommender system for player-versus-player (PvP) content in<br />
such MMOGs is presented, and applied to a case study in the context of the major commercial title Destiny<br />
2. The framework combines behavioral profiling via cluster analysis with recommendation systems to look<br />
at teams of players as a unit, as well as the individual players, to make recommendations to the players,<br />
with the purpose of providing information to them towards improving their performance."<br />
Click to read<br />
the full<br />
analysis
ALUMNI SPOTLIGHT<br />
by Sarah Mitchell<br />
Julia Greenberger (class of 2017) and I spent some time talking in the<br />
new offices of Opex in downtown Chicago last week.<br />
Julia has an engineering background and has always been interested in using<br />
numbers and problem-solving, but the tipping point for her came when she hit<br />
the wall-that-is-Excel in her work as an economic consultant in the energy<br />
practice.<br />
Data science is a perfect fit for her because it uses numbers and<br />
problem-solving, while also incorporating aspects of visualization and design.<br />
I asked her if there was something she learned about in the program that she<br />
unexpectedly found herself using at work. She said that after having tried<br />
several more algorithms for a text mining project, the Jaccard Similarity Index<br />
(<strong>MSiA</strong> 431, Data Mining: Malthouse) ended up being the most accurate,<br />
demonstrating that sometimes the simplest answer is indeed the best.<br />
Julia's no stranger to hard work. Before her <strong>MSiA</strong> all-nighters studying and<br />
working on projects, her efforts Friday and Saturday nights making stuffed<br />
bagels to sell at weekend farmers' markets gave the company she co-founded,<br />
Poppy’s Stuffed Bagels, its start in DC years ago.<br />
In her free time, Julia still enjoys spending time in the kitchen, and is<br />
currently going through her mother's and grandmother's recipes to try each one<br />
out. She also enjoys going to the dog beach with Olive!
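For anyone who wants a refresher on the Jaccard Similarity Index mentioned in the spotlight, here is a minimal sketch. It measures the overlap between two sets as the size of their intersection divided by the size of their union. The whitespace tokenizer, the example phrases, and the choice to score two empty sets as 1.0 are assumptions made for illustration, not details of Julia's project.

```python
def jaccard_similarity(a, b):
    """Return |A intersect B| / |A union B| for two collections of items."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def tokens(text):
    """Very simple tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


# Two hypothetical snippets of text to compare.
first = tokens("stuffed bagels for sale")
second = tokens("fresh stuffed bagels")
print(jaccard_similarity(first, second))  # 2 shared words out of 5 distinct -> 0.4
```

A score of 1.0 means the two texts use exactly the same set of words, and 0.0 means they share none.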