Data Science Graduate Certificate Program Newsletter
March 2017

Data Science
Graduate Student
Newsletter

Volume 1, Issue 3

Introduction

 

In this edition of the Data Science Graduate Student Newsletter, we connected with two leaders in the analytics space at Plante Moran, a Michigan-based certified public accounting and business advisory firm, to get their insights on how data science is transforming their workplace. We also have two great seminars lined up for the 17th and 24th of this month; the presenters are Laura Balzano, PhD, from the University of Michigan and Tianxi Cai, Sc.D., from Harvard University, respectively. For the month of March, the graduate student meeting will be held on March 20 (please see below for the location and time).
 

Data Science Graduate Student Meeting

Monday, March 20, 2017

Location: School of Public Health I, Suite 7625
Time: 5:00 PM - 6:00 PM




Not able to make it to SPH I for the meeting? Join virtually!
Join the meeting from your computer, tablet or smartphone. 
https://global.gotomeeting.com/join/830378933 
You can also dial in using your phone. 
United States: +1 (571) 317-3122 
Access Code: 830-378-933 
First GoToMeeting? Try a test session: http://help.citrix.com/getready 

 

March Meeting Agenda

  • Insight Data Engineering Presentation

    • Careers in Data Science and Data Engineering
      Anne Bessman from Insight Data Science will be leading a discussion on careers in data science, health data science, and data engineering. Top companies are hiring data scientists and engineers to help find insights in the petabytes of data that they collect every day. Scientists and engineers from diverse fields, including physics, computational biology, neuroscience, math, and statistics are playing key roles in transforming the way of working with data to impact our daily lives.
      The Insight Fellows Program is a training fellowship designed to bridge the gap between academia and a career in data. Insight provides seven-week, full-time, training fellowships in Silicon Valley, New York and Boston. They offer a full tuition scholarship, dedicated office space, and project-based learning under the guidance of top industry mentors. Over 800 Insight alumni are now working at Facebook, Apple, LinkedIn, Uber, Reddit, Netflix, NBC, MTV, Khan Academy, Biogen and other top companies.
       
      In this info session, we will provide a high-level overview of data trends in industry and describe the Insight Fellows Program. The session will include time for Q&A, and advice for those interested in transitioning to careers in data science and data engineering. Learn more at: insightdatascience.com and insightdataengineering.com
       
  • MDST Presentation

    • Arya Farahi, Current and ongoing projects with MDST

MIDAS Seminar Series Presentations


March 17
  • MIDAS Seminar Series
  • Location: Forum Hall, Palmer Commons
  • Time: 4:00 PM - 5:00 PM
    • Presenter: Laura Balzano, PhD, University of Michigan
      • Bio: Laura Balzano is an assistant professor in Electrical Engineering and Computer Science at the University of Michigan. She is an Intel Early Career Faculty Honor Fellow and received an NSF BRIGE award. Her main research focus is on modeling with highly incomplete or corrupted data, and its applications in networks, environmental monitoring, and computer vision. Her expertise is in statistical signal processing, matrix factorization, and optimization.
        Laura received all her degrees in Electrical Engineering: BS from Rice University, MS from the University of California in Los Angeles, and PhD from the University of Wisconsin. She received the Outstanding MS Degree of the year award from the UCLA EE Department, and the Best Dissertation award from the University of Wisconsin ECE Department. She has worked as a software engineer at Applied Signal Technology, Inc. Her PhD was supported by a 3M fellowship.
    • Topic: Finding Low-Rank Structure in Messy Data
      • Abstract: To draw inferences from large, high-dimensional datasets, we often seek simple structures that model the phenomena represented in those data. Low-rank linear structure is one of the most flexible and efficient such models, allowing efficient prediction, inference, and anomaly detection. In this talk we will cover at a high level some results in optimization and high-dimensional probability from the last several years that show how to identify low-rank structure in high-dimensional data despite missing and corrupted data. Additionally, we will discuss two new directions for finding low-rank structure in messy real-world data. In the first, we observe every entry of the matrix through a single unknown monotonic transformation. This is common in calibration and quantization problems. We show that matrix completion is still possible in this context and demonstrate a simple algorithm with guarantees. In the second, our vector observations are heteroscedastic, i.e., corrupted by one of several noise variances. This is common in problems like sensor networks or medical imaging, where different measurements of the same phenomenon are taken with different quality sensing (e.g., high or low radiation). We prove recovery results for principal component analysis (PCA) in this context. We show that recovery for a fixed average noise variance is maximized when the noise variances are equal, implying that while average noise variance is often a convenient measure of the overall quality of the data, it gives an overly optimistic estimate of PCA performance.


March 24
  • MIDAS Seminar Series
  • Location: Forum Hall, Palmer Commons
  • Time: 4:30 PM - 5:30 PM
    • Presenter: Tianxi Cai, Sc.D., Harvard University
      • Bio: Dr. Cai’s current research interests are mainly in the area of biomarker evaluation; model selection and validation; prediction methods; personalized medicine in disease diagnosis, prognosis and treatment; statistical inference with high dimensional data; and survival analysis.
        In addition to her methodological research, Dr. Cai also collaborates with the I2B2 (Informatics for Integrating Biology and the Bedside) center on developing a scalable informatics framework that will bridge clinical research data and the vast data banks arising from basic science research in order to better understand the genetic bases of complex diseases.
    • Topic: Efficient Use of EMR for Biomedical Translational Research
      • Abstract: While clinical trials remain a critical source for studying disease risk, progression and treatment response, they have limitations including the generalizability of the study findings to the real world and the limited ability to test broader hypotheses. In recent years, due to the increasing adoption of electronic health records (EHR) and the linkage of EHR with specimen bio-repositories, large integrated EHR datasets now exist as a new source for translational research. These datasets open new opportunities for deriving real-world, data-driven prediction models of disease risk and progression as well as unbiased investigation of shared genetic etiology of multiple phenotypes. Yet, they also bring methodological challenges. For example, obtaining validated phenotype information, such as presence of a disease condition and treatment response, is a major bottleneck in EHR research, as it requires laborious medical record review. A valuable type of EHR data is narrative free-text data. Extracting accurate yet concise information from the narrative data via natural language processing is also challenging. In this talk, Professor Cai will discuss various statistical and informatics methods that illustrate both opportunities and challenges. These methods will be illustrated using EHR data from Partners HealthCare.

Interview with Two Plante Moran Analytics Directors

 
Before going into the interview, I would like to give you a brief overview of Plante Moran as a company. Plante Moran is a Michigan-based company and the 14th largest certified public accounting and business advisory firm in the United States, offering auditing, accounting, tax, and business advisory consulting services. Over the course of 18 years, Plante Moran has been on FORTUNE magazine’s list of the '100 Best Companies to Work For' in America. The company has a workforce of over 2,000 people.
 
Within the last few years, the company launched a new product offering for its clients: Data Analytics. Helping to provide clients with the new offering are Greg Alonso and Chris Moshier. Greg Alonso leads the Auto Industry Data Analytics Practice and Chris Moshier leads Plante Moran's Data Analytics Center of Excellence. These two leaders work hand in hand to solve the biggest business problems of Plante Moran’s clients with data.
 
Looking at the careers of Greg and Chris, one would, and should, ask the question, “How did you get here?” And I did. As a data scientist, I looked for the theme that was highly “correlated” between the two of them: both worked in various capacities within multiple organizations, using data to solve business problems. Greg spent the first half of his career in finance and financial analysis, and the latter half in more consulting-oriented business development roles, reporting to the CIO for Chrysler. Chris, on the other hand, worked in the public sector, building models to solve business problems.
 
With insight into the path they traveled to reach their roles, the next question on my list was, “How is the advancement of data science changing your workplace?” Chris told me that it is changing from a tool perspective: he has to use more advanced analytic tools to extract and maintain his clients’ data, as the size and complexity of the databases he is given have changed dramatically over the years. He said, “I went from using Excel to R, Alteryx, Tableau, and SQL as tools to mine and extract data.”
 
From an unsolved business problem standpoint, I posed the question, “What are the business problems that could be solved with analytics that keep you up at night?” Greg stated that the biggest problems they are facing are: (1) long-term forecasting (2-5 years); (2) dealing with missing data in database systems caused by acquisitions in the auto industry; and (3) developing the capability to forecast future demand for electric vehicle adoption for suppliers, based on the CAFE standards set in place by the Obama administration.
 
For the spring graduates who would love to tackle Plante Moran’s unsolved business problems, Greg and Chris gave me pointers on how to join the team. Greg stated that he is looking for a data scientist who has a high degree of intellectual curiosity, understands what is driving the business, and is an inquisitive, root-cause thinker. Chris is looking for individuals who are mathematically astute, have computer science capabilities, and are able to apply theoretical principles from math and computer science to solve business problems. Overall, the connection between Greg’s and Chris’s statements is that they both desire someone with strong business acumen and analytical skills.
 
At the end of the interview, I thought it would be fair to ask them, “What do you want future data scientists to know?” Unequivocally, they both stated that they want future data scientists to know that they are growing their team and would love to bring on individuals who have graduated from our department to help them solve their business problems.

Curriculum Updates


For any Curriculum updates, please see: http://midas.umich.edu/certificate/
For questions, comments and advising, please contact: Dr. Ivo Dinov

Career & Opportunity Highlights


For current career opportunities, please see: http://midas.umich.edu/careers/

Accolades


Are you graduating in Winter/Spring/Summer 2017?

Please contact Alison Martin, aalison@umich.edu, to make sure all of your papers are in order and so that we can make sure to announce your success to the other students!
Data Science Certificate Program Statistics

Newsletter Editors


Vincent Harris Lyons & Dr. Ivo Dinov

 

 

Contacts


Ivo Dinov, Ph.D. - Associate Director
dinov@umich.edu
(734)764-5557

Moira Dowling - Project Manager
mdowling@umich.edu
(734)998-9247

Alison Martin - Administrative Assistant
aalison@umich.edu
(734)615-8945
Copyright © 2016 University of Michigan, All rights reserved.

University of Michigan · 540 E. Liberty, Suite 202 · Ann Arbor, MI 48104-2210 · USA