https://www.renarepenning.com/weplaynoladata/

Rena Repenning and Eddy Salazar

Intro to Data Science Service Learning - CMPS3160-01 0

Dr Nicholas Mattei, Tulane University

Fall 2021

We PLAY logo1, 3

Project Plan

Data

We PLAY Data 2 Provided by Christine Neely from Training Grounds NOLA. Available on this project's Git repository.

Explanation & Model Proposition

We are inspired by the We PLAY Center's mission "to assist parents, caregivers, and professionals with providing children birth to five years of age with rich learning experiences, positive adult-child interactions, and social-emotional skills that will contribute to success in school and beyond." Melanie Richardson and Christine Neely proposed a Survey Data Synthesis project to Intro to Data Science’s service-learning students.

The data we received consisted of five Google Sheets, a project description, and a Logic Model. Four of the five sheets were pulled from Incoming Parent Reflection Forms (an almost identical survey), and one is a Family Survey with similar but not identical questions. Although extensive, the data proved very difficult to wrangle because of its messy form. We aim to provide recommendations that will let future data analysis proceed with less ETL.

Along with survey data4, Christine Neely provided us with a Logic Model submitted with their IMH Grant proposal. The organization's goal is stated as "Strengthening social-emotional interaction among children ages birth-3 yrs and parents at the We PLAY Center will lead to stronger communities, thriving families and children who are ready to learn." This document's Output Targets and Indicators of Success guided the following data analysis. We PLAY assumes that "embedding [...] the center in low-income communities will yield participation" in utilizing and applying the parenting information families desire. To combat the disorganization of the income data, we created a new sheet that combines all five datasets under consistent labels.

Our Focus

First, we manually removed empty columns and rows from the five Excel files. Then, we broke apart sheets to make data frames that each address a single concern. During service learning, Melanie and Christine heard our Milestone Two plan and asked us to investigate any links between zip code, income, and times returned, as well as between income and requests for different types of information. Due to limitations in time and in the data given, we were not able to investigate these links. We instead focused on visualizing the data to answer the questions posed in the surveys and the logic model.

We intend to model We PLAY's current demographics and strengths, and to provide recommendations for future data collection. To do so we will:

Recommendations

We believe a few improvements to the survey process would ensure that a higher volume of quality data is collected. The surveys we used were collected at irregular time intervals, which affects the quality of the recorded data. Sending out surveys at regular intervals, such as quarterly, would be an improvement. Some survey questions were essentially the same, so removing questions that are too similar would improve the experience of taking the survey. We recommend that surveys track a participant's attendance frequency and time spent in the program. Finally, We PLAY would benefit from employing one survey format. This would allow easier comparison across all data and provide more useful insights.

Recommendations - Bulleted

Tailor questions to IMH logic model

Process

Data transformation was the most time-intensive step of the data life cycle. First, we manually removed empty columns and rows from the five Excel files. Then, we broke apart sheets to make data frames that each address a single concern. During service learning, Melanie and Christine heard our Milestone Two plan and requested further investigation into how participants’ income levels relate to their requested information and zip code.
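We removed empty rows and columns by hand, but the same step could be scripted for future surveys. The following is a minimal sketch, assuming each sheet has already been loaded into a pandas data frame; the `drop_empty` helper and the sample columns are hypothetical and are not part of our project code.

```python
import pandas as pd

def drop_empty(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows and columns that contain no values at all."""
    # how="all" only drops a row/column when every cell in it is missing
    return df.dropna(axis=0, how="all").dropna(axis=1, how="all")

# Hypothetical stand-in for one survey sheet (not real We PLAY data)
raw = pd.DataFrame({
    "zip": ["70117", None, "70119"],
    "income": ["bracket A", None, "bracket C"],
    "unused": [None, None, None],
})

clean = drop_empty(raw)
print(clean)
```

Partially filled rows are kept, so no responses are lost; only fully blank rows and columns disappear.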

Graphing pie charts is difficult and code-intensive in pandas and matplotlib. Writing three of our own ETL functions increased our efficiency and cut down on lines of code. We added functions to plot pie charts easily, grab column labels, and plot a bar chart of a numerical series along with printed summary stats. These will allow us to analyze the rest of the data much more efficiently.

Collaboration Plan

Rena and Eddy will use this repository to edit the project simultaneously. We plan on connecting over Zoom as needed and keeping in constant communication over instant messages. Also, we are both able to meet after class on Thursdays. When convenient, we will export notebooks to Google Colab to support simultaneous editing. (This is easily done using a Colab Chrome extension.) While reorganizing our final product, we used a Google Doc to outline and plan simultaneously.

Footnotes

0. https://nmattei.github.io/cmps3160/

1. Image source: https://www.mytraininggrounds.org/we-play-center.

2. We PLAY data was last downloaded on October 25th, 2021.

3. Cite: https://stackoverflow.com/questions/14675913/changing-image-size-in-markdown

4. Access the data and Logic Model [here](https://github.com/renarepenning/weplaynoladata/tree/main/_data)

5. Cites:

6. 2020 US Census Bureau data: https://www.census.gov/quickfacts/fact/table/neworleanscitylouisiana/INC110219

Check out our final presentation slides here

Exploring We PLAY Survey Results

Imports as needed
Separate paths to accommodate our local environments

Extract, Transform, and Load (ETL) Key Functions

Matplotlib pie charting functions are cumbersome and do not offer flexibility when labeling. During Milestone 1 we found ourselves spending more time on graphing than working with data. Writing graphing functions custom to this project sped up ETL and improved output quality.

Below is our code documentation:

plotPies() 5
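To illustrate the kind of helper described here, the following is a rough sketch of a pie-plotting wrapper. It is not our actual `plotPies()` implementation: the name `plot_pies_sketch`, its parameters, and the sample data are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for running as a script; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

def plot_pies_sketch(df, columns):
    """Draw one pie chart per column, side by side, from value counts."""
    fig, axes = plt.subplots(1, len(columns),
                             figsize=(4 * len(columns), 4), squeeze=False)
    for ax, col in zip(axes[0], columns):
        counts = df[col].value_counts()  # missing answers are excluded by default
        ax.pie(counts.values,
               labels=[str(v) for v in counts.index],
               autopct="%1.0f%%")
        ax.set_title(col)
    return fig

# Hypothetical responses (not real We PLAY data)
df = pd.DataFrame({
    "gender": ["Female", "Female", "Male", None],
    "returning": ["Yes", "No", "Yes", "Yes"],
})
fig = plot_pies_sketch(df, ["gender", "returning"])
```

Taking labels from the index of `value_counts()` keeps each label aligned with its slice, which is one way to sidestep the labeling inconsistencies mentioned above.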

getLabels()

Returns a list of lists of values for each column, using pandas.unique. Will print NaNs! Unfortunately, labeling becomes inconsistent when filtering them out.
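As a sketch of the behavior described above (not the project's actual `getLabels()` code), a helper built on `pandas.unique` might look like the following. The name `get_labels_sketch`, its signature, and the sample data are assumptions.

```python
import pandas as pd

def get_labels_sketch(df, columns):
    """Return one list of unique values per column, in order of first appearance."""
    # pandas.unique keeps missing values, so NaN/None shows up as a label
    return [list(pd.unique(df[col])) for col in columns]

# Hypothetical responses (not real We PLAY data)
df = pd.DataFrame({
    "gender": ["Female", "Male", "Female", None],
    "returning": ["Yes", "No", "Yes", "Yes"],
})
labels = get_labels_sketch(df, ["gender", "returning"])
print(labels)
```

The missing value in the first column is returned as its own label, which matches the warning above about NaNs being printed.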

getColor()

Returns a random selection of one of We PLAY's colors. Used to alternate colors for printCatBars() and printHorizBars().
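A random color picker like this can be written in a few lines. In the sketch below, the hex codes are placeholders and not We PLAY's actual brand colors, and `get_color_sketch` is a hypothetical name rather than our real function.

```python
import random

# Placeholder palette: these hex codes are assumptions, not We PLAY's real colors
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]

def get_color_sketch(palette=PALETTE):
    """Return one color chosen at random from the palette."""
    return random.choice(palette)

# One color per bar for a five-bar chart
bar_colors = [get_color_sketch() for _ in range(5)]
print(bar_colors)
```

Because each choice is independent, adjacent bars can occasionally share a color; cycling through the palette in order would avoid that if strict alternation matters.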

printCatBars()

printHorizBars()

Intended for categorical variables with "long" labels.

plotSection()

Streamlines plotting columns of the aggregated allIncoming df. Takes in indexes and calls other functions appropriately. Cannot be used on other data frames.

Displaying categorical bar charts with the option of summary stats for numeric variables
Plotting batches of aggregated data, with option to add in/change to bar chart

Surveys

We received two survey formats: Incoming Parent Reflection and Family Survey

Incoming Family Data

Aggregation of four "Incoming Parent Reflection Surveys," dated July 2019, November 2019, and 2020, plus an undated version from 2019. These sheets appear to come from the same survey, but the undated sheet differs slightly in column order. We fixed this manually or by dropping indexes.
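One way to aggregate sheets whose columns appear in a different order is to let pandas align them by column name. The sketch below assumes the sheets share column names; the frames, column names, and the `survey` tag column are hypothetical and do not reproduce our actual aggregation code.

```python
import pandas as pd

# Hypothetical stand-ins for two of the four sheets (not real We PLAY data)
july = pd.DataFrame({"zip": ["70117"], "income": ["bracket A"]})
undated = pd.DataFrame({"income": ["bracket E"], "zip": ["70119"]})  # different column order

sheets = {"July 2019": july, "Undated 2019": undated}

# concat matches columns by name, so differing column order is harmless;
# the added "survey" column records which sheet each row came from
all_incoming = pd.concat(
    [df.assign(survey=name) for name, df in sheets.items()],
    ignore_index=True,
)
print(all_incoming)
```

Tagging each row with its source sheet also makes it straightforward to plot the distribution of responses per survey.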

Distribution of data from each survey

Family Survey Data

This section went through ETL separately because of format and question differences. FamilySurvey.xlsx data was collected between Jan-Aug 2019. We suspect its observations overlap with those from Incoming Reflections.

Data

Reported demographics

Participant ID was coded automatically by the survey and did not allow us to track respondents between surveys.

Income

The 2020 New Orleans Census data reports a median income of 45,615 USD. We PLAY's median and most frequent income range lies between 50,000 and 74,000. Although this suggests We PLAY attendees earn about 20 percent more than the typical New Orleanian, the reported income distribution is skewed. We decided not to calculate averages because the income intervals were uneven. 6
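Because income was reported in uneven brackets, the median can only be located to a bracket rather than computed as a dollar figure. The sketch below shows one way to find that bracket from cumulative counts. The bracket labels and respondent counts are invented for illustration and are not the survey results.

```python
# Invented brackets and counts, ordered from lowest to highest income
# (illustrative only; these are NOT the We PLAY survey results)
brackets = [
    ("Under 30,000", 6),
    ("30,000 to 49,999", 3),
    ("50,000 to 74,000", 7),
    ("75,000 to 100,000", 2),
    ("Over 100,000", 2),
]

def median_bracket(brackets):
    """Return the label of the bracket containing the median respondent."""
    total = sum(count for _, count in brackets)
    if total == 0:
        raise ValueError("no respondents")
    running = 0
    for label, count in brackets:
        running += count
        # the median respondent falls in the first bracket that reaches half the total
        if running >= total / 2:
            return label

print(median_bracket(brackets))
```

This approach needs only the ordering of the brackets, not their widths, which is why it remains valid when the intervals are uneven.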

In New Orleans, a family of three is considered to be in poverty if household income is below 21,960 USD. 30 percent of We PLAY attendees earn under 30,000; we estimate that this threshold is closer to an actual living wage. 44 percent of attendees fall within New Orleans' "middle class" range, earning 50,000 to 100,000. 14 percent of participants make over 100/150k.

In the future, we recommend asking participants to report their income to within a couple thousand dollars.

Education

Survey respondents reported educational attainment levels consistent with New Orleans' broader demographic.

11 percent of respondents did not graduate high school, and 42 percent completed some sort of post-high-school education. Thus, about half of We PLAY parents have a high school degree with no bachelor's or trade degree.

Income data was aggregated in a separate spreadsheet to map income onto the group's desired information. We find that families earning less than the median income uniformly seek more information.

Demographics - Identity

General Survey Respondents

Responding caregivers are almost 90% female, and about 80% are within "child rearing" age, between 25 and 45. Results suggest that mostly mothers, and a few grandparents, bring children to the center. 73% of households report themselves as "single adult." This number may or may not exclude siblings who are technically over 18.

The proportion of Black and White attendees is consistent with the broader city's demographic to within two and one percentage points, respectively.

Family Survey Data

The Family Survey reflects similar findings; we suspect that many families took both surveys.

Home population and Zipcode

We PLAY children grow up in overwhelmingly single-adult households. Estimates may be inflated because young adults are included as "Adults" but may not serve a guardian's role.

Zipcodes and attendance

Participants are traveling far, many from

Family survey results are similar to those from reflections.

Participants travel quite far to attend the We PLAY Center.

Attending children's demographic

We decided to omit the ages and genders of third children because the data was very sparse.

Attending WPC - How and Why

66% of respondents have not applied to another childcare program; they are represented in the following questions.

Survey Questions

Why else have you not applied to day care?

Where else have you sought child care?

How respondents found the We PLAY Center

Answers differ greatly in the family survey

The family survey included sparse data on the ages and genders of a respondent's children, but we chose to omit it as it did not improve our model.

Parenting: Concerns, strengths, and progress

Family Survey Questions - Activity Style

Please share the number of days in a typical week that you engage in the following activities
Parents vary in how they discipline. Below are some strategies parents used to discipline their children during the last 7 days:
Please state how strongly you agree with the following statements

Incoming Family Reflection

Here we use plotPies() rather than plotSection() to reuse code we wrote before aggregating all four sets. We chose to cut off the last survey for some sections because its data was largely inconsistent.

Parents vary in how they discipline. Below are some strategies parents used to discipline their children during the last 7 days:

Parenting Self Assessment

Measuring how parents have grown since attending We PLAY. These sections only had a few (omitted) answers besides those pictured.

Parents' opinions on their children's progress

No respondents reported negative outcomes!

Parent reviews and needs

More Family Survey Questions

The We PLAY Center strives to create a welcoming environment with toys and caring, knowledgeable staff. How well do parents think We PLAY meets these goals?
We PLAY strives to help parents and children grow. The questions below help to illustrate how well We PLAY is fulfilling those goals:

Open Response

What will keep people coming back?

Family Survey

Incoming Parent Reflection

WordCloud

Open responses to "What will keep you coming back to the We PLAY Center?" from incoming family surveys.
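A word cloud sizes each word by how often it appears, so the underlying step is a word-frequency count. The sketch below shows that counting step using only the standard library. It is not the code that produced our word cloud: the sample responses and the short stop-word list are invented for illustration.

```python
import re
from collections import Counter

# A short, hypothetical stop-word list; a real analysis would use a fuller one
STOPWORDS = {"the", "and", "a", "to", "my", "for", "of", "with", "is", "it"}

def word_frequencies(responses):
    """Count how often each non-stop-word appears across all responses."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

# Invented example responses (not real survey answers)
responses = [
    "The toys and the staff",
    "Friendly staff",
    "My child loves the toys",
]
freqs = word_frequencies(responses)
print(freqs.most_common(3))
```

The resulting counts could then be passed to a plotting library to render the cloud itself.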

Automatic updating for Git Pages

We use https://github.com/renarepenning/weplaynoladata's gh-pages branch to display on Rena's personal website

Create a pull request to merge main into gh-pages