stareightytwo
Action through Analytics

Blog

Introduction: Avivo New Year's Data Science Resolution
By Brian Utz

StarEightyTwo and Social Data Science began 2018 on an exciting note, partnering to host a New Year’s Data Science Resolution. The goal of the resolution is to bring members of StarEightyTwo and Social Data Science together to collaborate on analytics projects that spark their interest. At the first meeting, a project-pitch session in January, members pitched various analytics projects, voted on the top projects, and then split into smaller groups to work on these projects over the next couple of months.

Our Partnership with Avivo

Avivo, a Minneapolis-based non-profit, was one of the organizations that attended the pitch session as a potential partner and data sponsor. Avivo helps more than 18,000 people each year achieve recovery, employment, and economic advancement. Its mission is to reduce economic and health disparities in the community by targeting its services towards individuals facing multiple complex barriers to success, such as poverty, homelessness, unemployment, chemical addiction, or mental health concerns. Avivo’s presentation and mission resonated with many individuals in the room, and a group of us, informally known as “Team Avivo,” joined forces to determine how we could use analytics to help Avivo and the Twin Cities community.

Improving Avivo’s Outpatient Substance Abuse Program

We are fortunate to partner with Boyd Brown, Avivo’s Vice President of Chemical & Mental Health, and the broader Avivo team, to work on improving the success of Avivo’s outpatient substance abuse program. Individuals who complete outpatient substance abuse programs are at a lower risk of full relapse or incarceration in the months after the program than those who do not complete it.

Avivo’s outpatient substance abuse treatment program has a historical completion rate near 60%, roughly 10 percentage points higher than the 2013 national average. Over the past three years, however, Avivo’s outpatient treatment completion rate has fallen to 43%. Avivo is looking for insights to better understand the population it serves, the factors driving successful program completion, and how it can make this information actionable to improve the success of its outpatient substance abuse treatment program.

Avivo Desires to Turn Data into Actionable Insights

“There is a disconnect between the data we are reporting out from a regulatory perspective and how we can use that data to better understand our population and impact of our services,” said Boyd during our team’s site visit to Avivo. Avivo’s regulatory reporting assists with the broader national goal of monitoring disparities in care across demographic groups, but Avivo would like to know how it can use this information to benefit its clients and improve its services. Our team is excited to dig into the data and partner with Avivo to determine how our findings can influence any clinical or operational interventions to improve the success of the outpatient substance abuse program. Through our team’s analytical knowledge and Avivo’s domain expertise, we hope to build a sustainable process of using Avivo’s incoming client data to customize each client’s treatment and put its clients in the best position to complete the outpatient substance abuse program.

Understanding Avivo’s Historical Client Data

To understand Avivo’s client population, each client’s progress through the program, and the key predictors of program completion, our team will use Avivo’s admission and discharge data for the 1,577 clients who participated in the outpatient substance abuse program from late 2014 to the end of 2017. This data set includes each client’s demographic information, medical history, criminal history, and goals and behavior throughout the program.

Beyond the data set provided by Avivo, we are working on incorporating other data sources to enhance our understanding of Avivo’s clients and make the results of our analysis more robust. Avivo is working with the Minnesota Department of Health and Human Services to incorporate Avivo’s admission and discharge data from 2004 to 2013 in order to increase our sample size. We are also looking to add Avivo’s client post-discharge satisfaction survey data to the data set to better understand the impact of Avivo’s services on program completion.

Stay Tuned for Our Next Update on April 7th!

Our next update, on April 7th, will describe the progress we have made on our data enrichment efforts and the results of our initial analysis. We are excited to see what we will find and how we can incorporate our findings into actionable recommendations for Avivo over the upcoming months.

New Year's Data Science Resolution Update

It's been two weeks since our first New Year's Data Science Resolution event, and the three teams are well on their way to improving the world through data science. Team Avivo, which is working to improve a chemical dependency treatment program, has gathered data about the program, including admission and discharge counts, and has also incorporated demographic data. Team Thunder Lizards, which is working to improve fundraising at the Science Museum of Minnesota, is still working on understanding the data dictionary and the exact questions that would help the museum. Tonight's first presentation was from Team Real Estate, which is building predictive models of Twin Cities home prices in collaboration with a local real estate agent.

John Hogue, who is a Lead Data Scientist at General Mills, presented on behalf of Team Real Estate. He showed his general process for getting started on a data science project using real MLS data in Python with Pandas. His presentation included:

  • Exploratory analysis: using the pandas-profiling package to get a simple overview of the data to find potential problems like null values and collinearity

  • Data cleaning using pandas commands

  • Feature engineering:

    • transforming skewed features to be closer to normally distributed

    • splitting categories into one-hot columns

    • binning values to limit the influence of outliers

  • Supplementing data using open APIs and HTML scraping

John’s full presentation is viewable as a Jupyter notebook here.
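
For readers who have not seen these feature engineering steps before, the following is a minimal sketch of the three techniques in the list above using pandas. It is not taken from John's notebook: the `listings` table, its column names, its values, and the bin edges are all invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical listings. The column names and values are invented for
# illustration and are not taken from the MLS data used in the presentation.
listings = pd.DataFrame({
    "sale_price": [185000, 240000, 310000, 1250000, 275000],
    "style": ["rambler", "two_story", "rambler", "split_level", "two_story"],
    "lot_acres": [0.18, 0.25, 0.22, 5.0, 0.3],
})

# 1. Transform a right-skewed feature. log1p compresses the long upper tail,
#    which usually brings the distribution closer to normal.
listings["log_sale_price"] = np.log1p(listings["sale_price"])

# 2. Split a categorical column into one-hot indicator columns.
style_dummies = pd.get_dummies(listings["style"], prefix="style")
features = pd.concat([listings.drop(columns="style"), style_dummies], axis=1)

# 3. Bin a numeric column. The open-ended top bin absorbs extreme lot sizes,
#    so a single 5-acre lot no longer dominates the feature.
#    The bin edges here are arbitrary assumptions.
features["lot_size_bin"] = pd.cut(
    features["lot_acres"],
    bins=[0, 0.2, 0.5, np.inf],
    labels=["small", "medium", "large"],
)

print(features)
```

Whether a log transform or a particular set of bin edges is appropriate depends on the data, so the choices above should be treated as placeholders rather than recommendations.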

Next we had Abhishek Roy, a data science consultant from Slalom, present on behalf of Team Avivo. Abhishek used many of the same techniques, but in R rather than Python. We’ll hear more from Team Avivo next time.

As always, please join our Slack group to participate in the project. (Email dfeldman.mn@gmail.com for an invitation to the Slack group.)

Visualization of the Pearson correlations between real estate variables
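
A correlation matrix like the one pictured can be computed directly in pandas. The sketch below is an illustration only: the `homes` table and its values are invented, and the 0.9 threshold for flagging possibly collinear pairs is an arbitrary assumption rather than anything used by Team Real Estate.

```python
import pandas as pd

# Hypothetical data. These columns and values are invented for illustration.
homes = pd.DataFrame({
    "sale_price": [185000, 240000, 310000, 420000, 275000],
    "square_feet": [1100, 1500, 2000, 2600, None],
    "bedrooms": [2, 3, 4, 5, 3],
})

# Count missing values per column before trusting any summary statistics.
null_counts = homes.isna().sum()

# Pairwise Pearson correlations. pandas skips missing values pair by pair.
corr = homes.corr(method="pearson")

# Flag strongly correlated pairs as collinearity candidates.
threshold = 0.9  # arbitrary cutoff, chosen only for this example
high_pairs = [
    (a, b, round(corr.loc[a, b], 3))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]

print(null_counts)
print(corr.round(3))
print(high_pairs)
```

The resulting `corr` DataFrame can then be handed to any heatmap plotting function to produce a visualization of the correlations.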

Daniel Feldman
Be a Data Science Superhero! StarEightyTwo and Social Data Science Kick-Off a New Year's Data Resolution
Cover photo -- Creative Commons licensed  https://pixabay.com/en/super-hero-man-woman-success-3083468/

By Daniel Feldman

What do a charity that helps with substance abuse, the Science Museum of Minnesota, and a group of Twin Cities homeowners all have in common? All of them need urgent data science help! On Wednesday night, these groups came to the University of Minnesota to present their data science projects to more than 40 students and practitioners of data science at the New Year’s Data Science Resolution event, organized by StarEightyTwo and Social Data Science.

The event united those who want to learn and practice data science with organizations that need data science volunteers. Each organization had a chance to pitch its idea, and the analysts in attendance discussed the ideas, decided which projects to work on, and formed teams to collaborate on the projects for the next three months.

Out of eight project pitches, the data scientists in attendance decided to work on three:

  • A project to help Avivo, a Twin Cities nonprofit dedicated to serving homeless and low-income individuals, improve its chemical health treatment and recovery programs. Courtney Flug and Boyd Brown from Avivo presented their idea: using data from Avivo's case database to build models to achieve better results in these programs. One team of volunteer data scientists will work closely with Avivo experts to reach this goal.

  • A project to help homeowners understand house prices. Working with Shannon Furlong, a realtor who is collecting the data, our volunteer data scientists will build a predictive model of sale prices for thousands of homes in the northeastern Twin Cities metro. With this model, homeowners will take the power of information into their own hands to negotiate better prices when buying and selling.

  • A project to help the Science Museum of Minnesota improve its fundraising. Beth Varro, the Director of Membership Services at the Science Museum of Minnesota, needs help growing the museum’s membership base and enabling the Science Museum to continue to offer great attractions. Using data from the museum’s membership database, the team will build a predictive model to identify the Science Museum’s highest-value members.

The three teams will work on their projects for the next three months, and present the results of their work at a grand finale in April.

If you’re interested in participating in the New Year’s Data Science Resolution, it’s not too late to join any of these teams. Email Daniel Feldman at dfeldman.mn@gmail.com for more information.

StarEightyTwo and Social Data Science are nonprofit organizations that together have held eight data science hackathons and hosted more than 30 free and educational meetups. Past hackathons have featured Generation Next, the Minnesota Pollution Control Agency, Habitat for Humanity, Aeon, and Mapping Prejudice, among other local organizations.

Thanks to MinneAnalytics for sponsoring food and drink, and the Carlson School of Management for providing meeting space.

Present and future data science superheroes listening to project pitches. Join them!

Mitchell Noordyke