We PLAY Data 2 Provided by Christine Neely from Training Grounds NOLA. Available on this project's Git repository.
We are inspired by the We PLAY Center's mission "to assist parents, caregivers, and professionals with providing children birth to five years of age with rich learning experiences, positive adult-child interactions, and social-emotional skills that will contribute to success in school and beyond." Melanie Richardson and Christine Neely proposed Survey Data Synthesis to Intro to Data Science’s service-learning students.
The data we received consisted of five Google Sheets, a project description, and a Logic Model. Four of the five sheets were pulled from Incoming Parent Reflection Forms (versions of an almost identical survey), and one is a Family Survey with similar but not identical questions. Although extensive, the data proved very difficult to wrangle because of its messy form. We aim to provide recommendations that make future data analysis possible with less ETL.
Along with survey data4, Christine Neely provided us with a Logic Model submitted to their IMH Grant proposal. The organization's goal is stated as "Strengthening social-emotional interaction among children ages birth-3 yrs and parents at the We PLAY Center will lead to stronger communities, thriving families and children who are ready to learn." This document's Output Targets and Indicators of Success guided the following data analysis. We PLAY assumes that "embedding [...] the center in low-income communities will yield participation" in utilizing and applying the parenting information families desire. To combat the disorganization of the income data, we created a new sheet with appropriate labeling and data from all five datasets.
First, we manually removed empty columns and rows from the five Excel files. Then, we broke apart sheets to make data frames that each address a single concern. During service learning, Melanie and Christine heard our Milestone Two plan and asked us to investigate any links between zip code, income, and times returned, as well as between income and requests for different types of information. Due to limitations in time and in the data given, we were not able to investigate these links. We instead focused on visualizing the data to answer the questions posed in the surveys and the logic model.
We intend to model We PLAY's current demographics and strengths, and to provide recommendations for future data collection. To do so we will:
There are a few improvements we believe could be made to the survey process to ensure a higher volume of quality data is collected. The surveys we used were collected at irregular time intervals, which affects the quality of the recorded data. Sending out surveys at regular intervals, such as quarterly, would be an improvement. Some survey questions were essentially the same, so removing questions that are too similar would improve the experience of taking the survey. We recommend that surveys track a participant's attendance frequency and time spent in the program. Finally, We PLAY would benefit from employing one survey format. This would allow easier comparison across all data and provide more useful insights.
Tailor questions to IMH logic model
Data transformation was the most time-intensive step of the data life cycle. First, we manually removed empty columns and rows from the five Excel files. Then, we broke apart sheets to make data frames that each address a single concern. During service learning, Melanie and Christine heard our Milestone Two plan and requested further investigation into how participants’ income levels relate to their requested information and zip code.
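The manual removal of empty rows and columns described above could also be scripted. The sketch below is one possible approach, not what we ran for this project: `drop_empty` is a hypothetical helper, and the small frame is made-up stand-in data rather than a real We PLAY sheet.

```python
import numpy as np
import pandas as pd


def drop_empty(df):
    """Drop rows and columns in which every cell is missing (hypothetical helper)."""
    return df.dropna(axis=0, how="all").dropna(axis=1, how="all")


# Made-up stand-in for one raw survey sheet (not real We PLAY data).
raw = pd.DataFrame({
    "Zip": ["70117", np.nan, "70119"],
    "Blank": [np.nan, np.nan, np.nan],
    "Income": ["Under $15,000", np.nan, "Over $150,000"],
})

cleaned = drop_empty(raw)
print(cleaned)
```

Rows or columns that hold even one stray value survive this filter, so a manual check of each sheet would likely still be needed.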
Graphing pie charts is difficult and code-intensive in pandas and matplotlib. Writing three of our own ETL functions increased our efficiency and cut down on lines of code. We added functions to plot pie charts easily, grab column labels, plot a bar chart from a numerical series, and print summary stats. These will allow us to analyze the rest of the data much more efficiently.
Rena and Eddy will use this repository to edit the project simultaneously. We plan on connecting over Zoom as needed and keeping in constant communication over instant messages. We are also both able to meet after class on Thursdays. When convenient, we will export notebooks to Google Colab to support simultaneous editing. (This is easily done using a Colab Chrome extension.) While reorganizing our final product, we used a Google Doc to outline and plan together.
0. https://nmattei.github.io/cmps3160/
1. Image source: https://www.mytraininggrounds.org/we-play-center.
2. We PLAY data was last downloaded on October 25th, 2021.
3. Cite: https://stackoverflow.com/questions/14675913/changing-image-size-in-markdown
4. Access the data and Logic Model [here](https://github.com/renarepenning/weplaynoladata/tree/main/_data)
5. Cites:
https://matplotlib.org/stable/tutorials/colors/colormaps.html
https://matplotlib.org/3.1.0/api/_as_gen/matplotlib.pyplot.pie.html
https://matplotlib.org/stable/gallery/pie_and_polar_charts/pie_features.html
6. 2020 US Census Bureau data: https://www.census.gov/quickfacts/fact/table/neworleanscitylouisiana/INC110219
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import math
import random
!pip3 install openpyxl # package to open xlsx files
Requirement already satisfied: openpyxl in /opt/conda/lib/python3.9/site-packages (3.0.9) Requirement already satisfied: et-xmlfile in /opt/conda/lib/python3.9/site-packages (from openpyxl) (1.1.0)
renapath = "../notebooks/_data/WePlayData/CleanedWPC"
edpath = "./_data/WePlayData/CleanedWPC"
path = renapath # path to manually cleaned excels
Matplotlib pie charting functions are cumbersome and do not offer flexibility while labeling. During Milestone 1 we found ourselves spending more time on graphing than working with data. Writing graphing functions custom to this project sped up ETL and improved output quality. Below is our code documentation:
getLabels(df)
Returns a list holding the unique values of each column, found with pandas.unique. NaNs are included! Unfortunately, labeling becomes inconsistent when filtering them out.
getColor()
Returns a random selection of one of We PLAY's colors. Used to alternate colors for printCatBars() and printHorizBars().
printCatBars(df, key, att, printStats=True)
att is the column itself (df.key), passed separately because pandas doesn't allow .key in place of .attributename when calculating summary stats.
printHorizBars(df, key)
Intended for categorical variables with "long" labels.
plotSection(cols, i1, i2, plotBarsHoriz=False, plotBars="key", pie=True, isNAct=False)
Streamlines plotting columns of the aggregated allIncoming df. Takes in indexes and calls other functions appropriately. Cannot be used on other data frames.
def plotPies(df, labels, allLabelsSame, isNAct, ishoriz=True, showKeys=True, fig1size=(20,5)):
    theme=['#794f9e','#55ad51','#db6ea1','#a8a8a8','#b297c9','#8fbf8e','#d6aec0','#6e6e6e','#4a395e'] #We PLAY colors
    cols = df.columns
    # create matplotlib axes, horizontally or vertically. Unfortunately figsize required int literals.
    if ishoriz:
        fig1, ax1 = plt.subplots(nrows=1,ncols=len(cols),figsize=fig1size)
    else:
        fig1, ax1 = plt.subplots(nrows=len(cols),ncols=1,figsize=fig1size)
    if allLabelsSame:
        # turns single list into same format outputted by `getLabels()`
        labels = [labels]*len(cols)
    # print key above all instead of individually
    keys = []
    for i, c in enumerate(df.columns): # in case our graph titles are going to be different
        if isNAct:
            # plot missing vs. answered counts using isna
            wedges = ax1[i].pie(df[c].isna().value_counts(), colors=theme, autopct='%1i%%',textprops={'fontsize': 12, 'color':'w','weight':'bold'}, shadow=True, startangle=90)
        else:
            wedges = ax1[i].pie(df[c].value_counts(), colors=theme, autopct='%1i%%',textprops={'fontsize': 12, 'color':'w','weight':'bold'}, shadow=True, startangle=90)
        ax1[i].set_title(str(i+1))
        if isinstance(cols[i],str):
            keys += [str(i+1) +". "+ cols[i]]
        else:
            keys += [str(i+1) +". "+ cols[i][0]]
        ax1[i].axis('equal')
        # label individually, when we must
        if allLabelsSame == False:
            ax1[i].legend(wedges[0], labels[i],
                          title="Legend", # must plot a unique legend for each to keep value/label pairs uniform
                          loc="upper left",
                          bbox_to_anchor=(1, 0, 0.5, 1))
    # label for all to save space
    if allLabelsSame:
        ax1[0].legend(wedges[0], labels[0],
                      title="Legend", # one shared legend, since every chart uses the same labels
                      loc="upper left",
                      bbox_to_anchor=(1, 0, 0.5, 1))
    # Label each graph - matplot title is messy
    if showKeys == True: # Option to omit keys
        print("KEYS:")
        for k in keys:
            print(k)
    plt.show()
def getLabels(df):
    # beware: will return nan
    labels = ["x"]*len(df.columns)
    for i, c in enumerate(df.columns):
        labels[i] = df[c].unique()
    return labels
def getColor():
    theme=['#794f9e','#55ad51','#db6ea1']
    random.seed()
    i = random.randrange(3)
    return theme[i]
def printCatBars(df, key, att, printStats=True): # att --> the df.col_name column used for the summary stat functions
    p = df[key].value_counts()
    p.sort_index(inplace=True)
    p.plot.bar(align="center", color=getColor())
    if printStats: # skip for qualitative data
        # convert values to numeric (non-numeric entries become NaN) before computing stats
        att = att.apply(pd.to_numeric, errors='coerce')
        print("SUMMARY STATS\n","AVG: ", att.mean()[0],"\nMin,Max: ",att.min()[0],",",att.max()[0])
# printing horizontal bar charts for qualitative values with long labels
def printHorizBars(df, key):
    p = df[key].value_counts()
    p.sort_index(inplace=True)
    p.plot.barh(align="center", color=getColor())
def plotSection(cols, i1, i2, plotBarsHoriz=False, plotBars="key", pie=True, isNAct=False):
    df = allIncoming.iloc[1:, i1:i2]
    df.columns = [cols]
    if (pie) and (plotBarsHoriz==False):
        # use labels function if non-bin question. else we make it T/f
        if isNAct == False:
            plotPies(df, getLabels(df), False, isNAct, True)
        else:
            plotPies(df, ["True", "False"], True, isNAct, True)
    if plotBars != "key":
        if plotBarsHoriz == False:
            print("Taking a closer look at ",plotBars,":")
            printCatBars(df, plotBars, df[plotBars], False)
        else:
            printHorizBars(df, plotBars)
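One caveat about the labeling inconsistency noted for getLabels(): `unique()` returns values in order of first appearance, while `value_counts()`, which sizes the wedges in plotPies(), orders values by frequency. Legend labels and wedges may therefore not line up. The sketch below shows one way to keep them paired by taking both from the same `value_counts()` result. `counts_with_labels` is a hypothetical helper and the responses are made up.

```python
import numpy as np
import pandas as pd


def counts_with_labels(series, dropna=True):
    """Return wedge sizes and labels in matching order (hypothetical helper)."""
    counts = series.value_counts(dropna=dropna)
    labels = [str(value) for value in counts.index]
    return counts.to_numpy(), labels


# Made-up responses, not real We PLAY data.
answers = pd.Series(["Yes", "No", "Yes", np.nan, "Yes"])

sizes, labels = counts_with_labels(answers)
print(labels, sizes)
```

Passing `dropna=False` would keep missing responses as their own labeled wedge instead of dropping them.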
We received two survey formats: Incoming Parent Reflection and Family Survey.
Aggregation of four "Incoming Parent Reflection Surveys," dated July 2019, November 2019, 2020, and an undated version from 2019. These sheets appear to come from the same survey, but the undated sheet differed slightly in column order. We were able to fix this manually or by dropping indexes.
# JULY BATCH
incomingjuly2019 = pd.read_excel(path + "IncomingParentReflection-July2019" + ".xlsx")
incomingjuly2019["Sheet"] = "July19"
incomingjuly2019.drop([1], axis=0, inplace = True) #delete row full of test responses
# NOV BATCH
incomingnov2019 = pd.read_excel(path + "IncomingParentReflection-November2019" + ".xlsx")
incomingnov2019["Sheet"] = "Nov19"
incomingnov2019.drop([0], axis=0, inplace = True)
# 2020 BATCH
incoming2020 = pd.read_excel(path + "IncomingParentReflectionForm2020" + ".xlsx")
incoming2020["Sheet"] = "2020"
incoming2020.drop([0], axis=0, inplace = True)
# UNDATED BATCH
incoming_undated = pd.read_excel(path + "IncomingParentReflection" + ".xlsx")
incoming_undated["Sheet"] = "NoDate"
# Create DF of all similar surveys
pd.set_option('display.max_columns',None) #display all columns
allIncoming = pd.concat([incomingjuly2019, incomingnov2019.iloc[1:,:], incoming2020.iloc[1:,:], incoming_undated.iloc[1:,:]], ignore_index=True)
print("Notes: --\'NaN\' codes lack of response -- Resp = Respondent\n -- Adult implies 18 + -- Child implies under 18 \n")
Notes: --'NaN' codes lack of response -- Resp = Respondent -- Adult implies 18 + -- Child implies under 18
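On the column-order difference mentioned above: `pd.concat` aligns frames by column name, not by position, so sheets whose headers match exactly can be stacked even when the columns appear in a different order. The sketch below illustrates this with two made-up one-row sheets. It assumes matching header text, which our real sheets (with their many `Unnamed:` columns) do not always have.

```python
import pandas as pd

# Made-up sheets with the same questions in a different column order.
july = pd.DataFrame({"Zip": ["70117"], "Income": ["Under $15,000"]})
undated = pd.DataFrame({"Income": ["Over $150,000"], "Zip": ["70119"]})

# Tag each row with its source sheet, as done for the real data above.
july["Sheet"] = "July19"
undated["Sheet"] = "NoDate"

# Columns are matched by name, so the differing order does not matter here.
combined = pd.concat([july, undated], ignore_index=True)
print(combined)
```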
Distribution of data from each survey
printHorizBars(allIncoming,"Sheet")
This section went through ETL separately because of format and question differences. FamilySurvey.xlsx data was collected between January and August 2019. We suspect its observations overlap with those from the Incoming Parent Reflections.
familySurveyDf = pd.read_excel(path + "FamilySurvey" + ".xlsx")
print("Above notes apply.")
adult_FDa = familySurveyDf.iloc[1:, :5]
adult_FDa.columns = [["Resp_Gender", "Resp_Age", "Resp_Race", "Children", "AdultsAtHome"]]
adult_FDb = familySurveyDf.iloc[1:, 5:10]
adult_FDb.columns = [["Zip", "OMIT", "Avg_Income", "highest_edu", "visits_Monthly"]]
adult_FDb.drop(columns=["OMIT"], inplace=True)#,level=level)
Above notes apply.
Participant IDs were assigned automatically by the survey, so we could not track respondents between surveys.
The 2020 New Orleans Census data reports a median income of 45,615 USD. We PLAY's median and most frequent income range lies between 50,000 and 74,000. Although this suggests We PLAY attendees earn about 20 percent more than the average New Orleanian, the reported income distribution is skewed. We decided not to calculate averages because the income intervals were uneven. 6
In New Orleans, a family of three is considered to be in poverty if household income is below 21,960 USD. 30 percent of We PLAY attendees earn under 30,000, a threshold we estimate is closer to an actual living wage. 44 percent of attendees fall within New Orleans' "middle class" range, earning 50,000 to 100,000. 14 percent of participants make over 100,000, including those who make over 150,000.
In the future, we recommend asking participants to report their income to within a couple thousand dollars.
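Because the survey records income as uneven brackets, a median bracket and a most frequent bracket can be read from the counts without computing an average. The sketch below shows one way to do that with a cumulative share. The bracket wording follows the survey, but the counts are made up for illustration and are not our results.

```python
import pandas as pd

# Income brackets in ascending order, worded as in the survey.
brackets = [
    "Under $15,000",
    "Between $15,000 and $29,999",
    "Between $30,000 and $49,999",
    "Between $50,000 and $74,999",
    "Between $75,000 and $99,999",
    "Between $100,000 and $150,000",
    "Over $150,000",
]

# Made-up respondent counts per bracket (illustration only).
counts = pd.Series([10, 20, 15, 30, 14, 8, 3], index=brackets)

# The median bracket is the first one where the running share reaches 50%.
cumulative = (counts / counts.sum()).cumsum()
median_bracket = cumulative[cumulative >= 0.5].index[0]

# The most frequent bracket is simply the one with the largest count.
modal_bracket = counts.idxmax()
print(median_bracket, "|", modal_bracket)
```

This only locates the bracket that contains the median respondent; it says nothing about where inside that bracket the median falls.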
Survey respondents reported educational attainment levels consistent with New Orleans' broader demographic.
11 percent of respondents did not graduate high school, and 42 percent completed some sort of post-high-school education. Thus, about half of We PLAY parents have a high school degree with no bachelor's or trade degree.
plotSection(["Income", "Highest Level of education"], 9, 11)
KEYS: 1. Income 2. Highest Level of education
Income data was aggregated in a separate spreadsheet to map income onto the group's desired information. We find that families earning less than the median income uniformly seek more information.
incomeVsRequests = pd.read_excel(path + "Incomeandrequests" + ".xlsx")
under15k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Under $15,000']
under30k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Between $15,000 and $29,999']
under50k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Between $30,000 and $49,999']
under75k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Between $50,000 and $74,999']
under100k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Between $75,000 and $99,999']
under150k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Between $100,000 and $150,000']
over150k = incomeVsRequests[incomeVsRequests.AvgIncome == 'Over $150,000']
print("INFORMATION REQUESTED BY SALARY -- KEYS")
print('''1. Stress relief
2. Information about community resources
3. Information about how to foster my child’s development
4. Information about nurturing and responsive parenting strategies
5. Information about mouthing (biting, putting objects in their mouth)
6. Information about conflict resolution
7. Information about sharing
8. Information about nursing/ feeding
9. Information about temper tantrums
10. Information about sleeping''')
# charts are printed after all text, so we ran this in separate cells to get accurate headers
def printincomerequests(d):
    print("Salary Range: " + d["AvgIncome"].unique()[0])
    d = d.drop(columns=["AvgIncome"]) # work on a copy so the filtered frame is left unchanged
    plotPies(d, ["Yes","No"], True, True, showKeys=False, fig1size=(20,4))
printincomerequests(under15k)
INFORMATION REQUESTED BY SALARY -- KEYS 1. Stress relief 2. Information about community resources 3. Information about how to foster my child’s development 4. Information about nurturing and responsive parenting strategies 5. Information about mouthing (biting, putting objects in their mouth) 6. Information about conflict resolution 7. Information about sharing 8. Information about nursing/ feeding 9. Information about temper tantrums 10. Information about sleeping Salary Range: Under $15,000
printincomerequests(under30k)
Salary Range: Between $15,000 and $29,999
printincomerequests(under50k)
Salary Range: Between $30,000 and $49,999
printincomerequests(under75k)
Salary Range: Between $50,000 and $74,999
printincomerequests(under100k)
Salary Range: Between $75,000 and $99,999
printincomerequests(under150k)
Salary Range: Between $100,000 and $150,000
''' Only 3% of respondents fit into this category --> Omitted due to quality '''
# printincomerequests(over150k)
' Only 3% of respondents fit into this category --> Omitted due to quality '
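The per-bracket pie charts above can be complemented by a single table showing the share of each income bracket that requested each topic. The sketch below is one possible way to build it with `groupby`. It assumes, as the isna-based plotting does, that a filled cell means the topic was selected. The rows are made up and only two of the ten topics are shown.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the income/requests sheet (not real We PLAY data).
# Assumption: a filled cell means the respondent selected that topic.
requests = pd.DataFrame({
    "AvgIncome": ["Under $15,000", "Under $15,000", "Over $150,000", "Over $150,000"],
    "Stress relief": ["x", "x", np.nan, "x"],
    "Information about sleeping": ["x", np.nan, np.nan, np.nan],
})

topics = requests.columns.drop("AvgIncome")

# Share of respondents in each bracket with a non-missing answer per topic.
share_requested = requests[topics].notna().groupby(requests["AvgIncome"]).mean()
print(share_requested)
```

Brackets with very few respondents, like the omitted top bracket, would still produce shares here, so the respondent count per bracket should be reported alongside the table.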
Responding caregivers are almost 90% female, and about 80% are of "child rearing" age, between 25 and 45. Results suggest that mostly mothers and a few grandparents bring children to the center. 73% of households report themselves as "single adult." This number may or may not exclude siblings who are technically over 18.
The proportion of Black and White attendees is consistent with the broader city's demographic, within two and one percentage points, respectively.
plotSection(["Resp_Gender", "Resp_Age", "Resp_Race"], 1, 4)
print("A closer look at racial demographics:")
plotSection(["Resp_Race"], 3, 4, plotBarsHoriz=True, plotBars="Resp_Race")
KEYS: 1. Resp_Gender 2. Resp_Age 3. Resp_Race
A closer look at racial demographics:
The Family Survey reflects similar findings; we suspect that many families took both surveys.
# drop columns to plot bars
labels=getLabels(adult_FDa)
plotPies(adult_FDa, labels, False, False, True)
KEYS: 1. Resp_Gender 2. Resp_Race 3. Children
labels=getLabels(adult_FDb)
plotPies(adult_FDb, labels, False, False, True)
KEYS: 1. Avg_Income 2. highest_edu 3. visits_Monthly
We PLAY children grow up in overwhelmingly single-adult households. Estimates may be inflated because young adults are counted as "Adults" but may not serve a guardian's role.
plotSection(["Adults", "Children"], 5, 7)
KEYS: 1. Adults 2. Children
Participants are traveling far, many from
Family survey results are similar to those from reflections.
# Incoming reflection aggregation
plotSection(["Zip"], 7, 8, plotBarsHoriz=True, plotBars="Zip")
# Family survey
printHorizBars(adult_FDb, "Zip")
Participants travel quite far to attend the We PLAY Center.
We decided to omit the ages and genders of third children because the data was too sparse.
childDemo = allIncoming[["How old is the 1st child?","What is the 1st child's gender?","What is your relationship to the 1st child?","How old is the 2nd child?","What is the 2nd child's gender?","What is your relationship to the 2nd child?"]]
childDemo.columns = [["1st child age","1st child's gender","resp relationship to 1st child","2nd child age","2nd child's gender","resp relationship to 2nd child"]]
childDemo1 = childDemo.iloc[1:, :3].dropna()
plotPies(childDemo1, getLabels(childDemo1), False, False)
childDemo2 = childDemo.iloc[1:, 3:].dropna()
plotPies(childDemo2, getLabels(childDemo2), False, False)
KEYS: 1. 1st child age 2. 1st child's gender 3. resp relationship to 1st child
KEYS: 1. 2nd child age 2. 2nd child's gender 3. resp relationship to 2nd child
66% of respondents have not applied to another childcare program; they are represented in the following questions.
print("Have you applied to another childcare program?")
s = allIncoming.iloc[:, 30].dropna().describe() # --> functions prohibit plotting one graph
# describe() on a text column returns count, unique, top (most common answer), and freq (its count)
print(str(s["freq"]/s["count"]*100)[:5],"% of respondents say \"", s["top"],"\".")
Have you applied to another childcare program? 65.71 % of respondents say " No ".
print("Why?")
c = ["It is too expensive","I feel my child is not old enough","There is a lack of quality childhood care","I use a nanny","I use a family member"]
d = allIncoming.iloc[1:, 31:36]
d.columns = [c]
plotPies(d, ["False", "True"], True, True)
Why? KEYS: 1. It is too expensive 2. I feel my child is not old enough 3. There is a lack of quality childhood care 4. I use a nanny 5. I use a family member
allIncoming.iloc[:,36].unique()[2:]
array(['While our oldest child is in daycare, it is too cost prohibitive to have our youngest two children in childcare.', 'Prefer him to be at home ', 'He was childcare already ', 'Stay at home mom', 'I want to spend as much time with my child as possible and, honestly, at this time, we can afford to be a single income household.', 'I am a stay at home mom', 'Limited part time options ', 'Fear of entrusting my child’s well-being to a stranger(s)', 'Not ready for him to go to daycare ', 'Child stays at home with me', 'Work part time to stay at home', 'I am not the parent, who prefers to "take care of child until he is vocal"', 'Occasional Babysitter '], dtype=object)
plotSection("y", 37, 38, pie=False, plotBars="y", plotBarsHoriz=True)
cols = ["Sign/Poster/flyer", "Friend", "Website", "Social Media", "Other org"]
plotSection(cols, 120, 125, isNAct=True)
KEYS: 1. Sign/Poster/flyer 2. Friend 3. Website 4. Social Media 5. Other org
how_found_WPC = familySurveyDf.iloc[1:, 10:17]
how_found_WPC.columns = ["Sign/Poster/flyer", "Friend", "Website", "Social Media", "Other org", "Laf-Fau Housing Office", "Other"]
plotPies(how_found_WPC, ["Yes", "No"], True, True)
KEYS: 1. Sign/Poster/flyer 2. Friend 3. Website 4. Social Media 5. Other org 6. Laf-Fau Housing Office 7. Other
print("Other: ")
allIncoming.iloc[:,126].unique()[2:]
Other:
array(['We play staff ', 'Easter 2019 engagement ', 'Random', 'Family (cousin) ', 'BabyCafe', 'Nolan baby cafe ', 'A table at an event', 'Baby cafe ', 'The early childhood development fair thing at Delgado', 'Macaroni Kid newsletter ', 'Nola Baby Cafe', 'My doctor', 'Midwife at touro', 'Ms. Melanie ', 'Macaroni Kid '], dtype=object)
The family survey included sparse data on the ages and genders of a respondent's children, but we chose to omit it as it did not improve our model.
daysActivitys_FD = familySurveyDf.iloc[1:, 24:27]
daysActivitys_FD.columns = ["Read Stories", "Play Music or Sing",
"Engage in language building activities"]
labels = ['Never', 'Every day', '1-2 days', '3-6 days']
plotPies(daysActivitys_FD, labels, True, False)
KEYS: 1. Read Stories 2. Play Music or Sing 3. Engage in language building activities
daysActivitys_FD = familySurveyDf.iloc[1:, 27:29]
daysActivitys_FD.columns = ['Engage in play', 'Create opportunities for your child to "practice" self control']
plotPies(daysActivitys_FD, getLabels(daysActivitys_FD), False, False)
KEYS: 1. Engage in play 2. Create opportunities for your child to "practice" self control
discipline_FD = familySurveyDf.iloc[1:, 34:42] # skip row 0, which holds the question labels
discipline_FD.columns = ['Raising your voice or yelling', 'Spanked/Slapped', 'Took away a toy or treats', 'Positive Reinforcement', 'Offered choices', 'Time-out', 'Explained why his/her behavior is not appropriate', 'Recognized and regulated your feelings']
labels = ["Never", "Always", "Very Often", "Sometimes", "Rarely"]
plotPies(discipline_FD, labels, True, False)
KEYS: 1. Raising your voice or yelling 2. Spanked/Slapped 3. Took away a toy or treats 4. Positive Reinforcement 5. Offered choices 6. Time-out 7. Explained why his/her behavior is not appropriate 8. Recognized and regulated your feelings
parenting_FD = familySurveyDf.iloc[1:, 29:34]
parenting_FD.columns = ['I respond quickly to my child\'s cries', 'I am able to comfort my child when he/she is upset',
'I know the meaning of my child\'s signals (cry, turning away, rubbing eyes)',
'I step back and encourage my child to work through problems',
'I am able to follow my child\'s lead during playtime']
cols = parenting_FD.columns
labels = ['Mostly agree', 'Somewhat agree', 'neither agree nor disagree',
'Strongly agree', 'Mostly disagree', 'Strongly disagree','Somewhat disagree']
plotPies(parenting_FD, labels, True, False)
KEYS: 1. I respond quickly to my child's cries 2. I am able to comfort my child when he/she is upset 3. I know the meaning of my child's signals (cry, turning away, rubbing eyes) 4. I step back and encourage my child to work through problems 5. I am able to follow my child's lead during playtime
Here we use plotPies() rather than plotSection() to reuse code we wrote before aggregating all four sets. We chose to cut off the last survey for some sections because its data was largely inconsistent.
#create df with directly comparable info about how parents interact with their children before and after We PLAY
parent_child_behaviorsa = allIncoming[["Please think about each statement and how you currently feel","Unnamed: 59","Unnamed: 60","Unnamed: 61","Unnamed: 62","Unnamed: 63"]]
parent_child_behaviorsb = allIncoming[['"Since coming to the We PLAY Center I…"',"Unnamed: 97","Unnamed: 98","Unnamed: 99","Unnamed: 100","Unnamed: 101"]]
parent_child_behaviorsa.drop([0], axis=0, inplace = True)
parent_child_behaviorsa.columns = [["Read to my child (Before WP)","Play with my child (Before WP)","Talk to my child (Before WP)", "Listen to my child (Before WP)","Set limits with my child (Before WP)","Yell at my child (Before WP)"]]
parent_child_behaviorsb.drop([0], axis=0, inplace = True)
parent_child_behaviorsb.columns = [["Read to my child (Since WP)","Play with my child (Since WP)","Talk to my child (Since WP)","Listen to my child (Since WP)","Set limits with my child (Since WP)","Yell at my child (Since WP)"]]
print("Parent/child interaction before We PLAY")
plotPies(parent_child_behaviorsa, getLabels(parent_child_behaviorsa.iloc[:100, :]), False, False, ishoriz=True)
print("Parent/child interaction after We PLAY")
plotPies(parent_child_behaviorsb, getLabels(parent_child_behaviorsb.iloc[:115, :]), False, False, ishoriz=True)
Parent/child interaction before We PLAY KEYS: 1. Read to my child (Before WP) 2. Play with my child (Before WP) 3. Talk to my child (Before WP) 4. Listen to my child (Before WP) 5. Set limits with my child (Before WP) 6. Yell at my child (Before WP)
Parent/child interaction after We PLAY KEYS: 1. Read to my child (Since WP) 2. Play with my child (Since WP) 3. Talk to my child (Since WP) 4. Listen to my child (Since WP) 5. Set limits with my child (Since WP) 6. Yell at my child (Since WP)
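The before and since charts above can also be compared in a single table of response shares, which makes a shift easier to read than two rows of pies. The sketch below is one possible way to build such a table for one question. `share_table` is a hypothetical helper, and the response wording and values are made up. Since we cannot track respondents between surveys, this compares group distributions, not individual change.

```python
import pandas as pd


def share_table(before, since):
    """Side-by-side response shares for one question (hypothetical helper)."""
    table = pd.concat(
        [before.value_counts(normalize=True), since.value_counts(normalize=True)],
        axis=1,
        keys=["before", "since"],
    )
    # An answer missing from one column simply had a share of zero there.
    return table.fillna(0.0)


# Made-up answers to one question, not real We PLAY data.
before = pd.Series(["Sometimes", "Often", "Sometimes", "Rarely"])
since = pd.Series(["Often", "Often", "Sometimes", "Often"])

table = share_table(before, since)
print(table)
```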
Measuring how parents have grown since attending We PLAY. These sections had only a few (omitted) answers besides those pictured.
since_WP_Parents_nov2019a = incomingnov2019[["After coming to the center, which of the following aspects have changed for you as a parent? Please choose one option for each statement.","Unnamed: 78","Unnamed: 79","Unnamed: 80","Unnamed: 81","Unnamed: 82","Unnamed: 83"]]
since_WP_Parents_nov2019b = incomingnov2019[['"Since coming to We PLAY Center…"',"Unnamed: 97","Unnamed: 98","Unnamed: 99","Unnamed: 100","Unnamed: 101"]]
since_WP_Parents_nov2019a.columns = [["My stress level as a parent...","My knowledge about parenting support and resources in the community...","My knowledge about my child’s development...","My knowledge about how to foster my child’s development...","My knowledge about nurturing and responsive parenting strategies...","My knowledge about nutrition and breastfeeding...","My confidence in my role as a parent..."]]
since_WP_Parents_nov2019b.columns =[["I engage with my child by repeating his/her words or sounds...","I engage with my child in pretend play...","I stand back and let my child work through problems...","I engage in bonding activities with my child...","I am better able to help my child uses age appropriate social skills...","I am better able to support my child as he/she learns new skills..."]]
print("Note: Both of these questions had many options, but we only show those with more than one response")
plotPies(since_WP_Parents_nov2019a, ["Increased significantly", "Somewhat decreased"], True, True)
plotPies(since_WP_Parents_nov2019b, ["More often", "The same"], True, True)
Note: Both of these questions had many options, but we only show those with more than one response KEYS: 1. My stress level as a parent... 2. My knowledge about parenting support and resources in the community... 3. My knowledge about my child’s development... 4. My knowledge about how to foster my child’s development... 5. My knowledge about nurturing and responsive parenting strategies... 6. My knowledge about nutrition and breastfeeding... 7. My confidence in my role as a parent...
KEYS: 1. I engage with my child by repeating his/her words or sounds... 2. I engage with my child in pretend play... 3. I stand back and let my child work through problems... 4. I engage in bonding activities with my child... 5. I am better able to help my child uses age appropriate social skills... 6. I am better able to support my child as he/she learns new skills...
No respondents reported negative outcomes!
childGrowthSince = allIncoming[["Since attending the We PLAY Center, which of the following aspects of your child's development have changed? Please choose one option for each statement.","Unnamed: 58","Unnamed: 59","Unnamed: 60"]]
childGrowthSince.columns = [["My child's early learning and cognitive skills...","My child's early language abilities...","My child's early social emotional skills...", "My child's school readiness..."]] #Rename columns
childGrowthSince.drop([0], axis=0, inplace = True) #drop unnecessary row with response parameters
labels=getLabels(childGrowthSince.iloc[:115,:])
print("The first chart shows two omitted, other answers - No response/NA")
plotPies(childGrowthSince.iloc[:115,:], ["Somewhat increased","Increased significantly", "Stayed the Same", "Does not apply"], True, False)
The first chart shows two omitted, other answers - No response/NA KEYS: 1. My child's early learning and cognitive skills... 2. My child's early language abilities... 3. My child's early social emotional skills... 4. My child's school readiness...
desired_info = allIncoming[["Which of the following aspects would you like the We Play Center to address more often? Please select all that apply and use the space to write in any additional comments.","Unnamed: 109","Unnamed: 110","Unnamed: 111","Unnamed: 112","Unnamed: 113","Unnamed: 114","Unnamed: 115","Unnamed: 116","Unnamed: 117"]]
desired_info.columns = [["Stress Relief","Information about community resources","Information about how to foster my child’s development", "Information about nurturing and responsive parenting strategies","Information about mouthing (biting, putting objects in their mouth)","Information about conflict resolution","Information about sharing","Information about nursing/ feeding","Information about temper tantrums","Information about sleeping"]]
desired_info = ~desired_info.isna()
print('"Which of the following aspects would you like the We PLAY Center to address more often?\n')
plotPies(desired_info, ["True", "False"], True, False)
"Which of the following aspects would you like the We PLAY Center to address more often? KEYS: 1. Stress Relief 2. Information about community resources 3. Information about how to foster my child’s development 4. Information about nurturing and responsive parenting strategies 5. Information about mouthing (biting, putting objects in their mouth) 6. Information about conflict resolution 7. Information about sharing 8. Information about nursing/ feeding 9. Information about temper tantrums 10. Information about sleeping
other_WP_opinions = allIncoming[['"The We PLAY Center..."',"Unnamed: 86","Unnamed: 87","Unnamed: 88","Unnamed: 89",'"The We PLAY Center has..."',"Unnamed: 91","Unnamed: 92","Unnamed: 93","Unnamed: 94","Unnamed: 95"]]
other_WP_opinions.columns = [["Provides an enriched learning environment, provides educational toys","Offers an adequate number of toys","Creates a positive welcoming friendly atmosphere", "Staff is knowledgeable and answer questions willingly","Staff is patient and empathetic","Been helpful to my family","Given me more confidence as a parent","Helped me to become a better parent","Provided me with positive parenting strategies","Supported me in my role as a parent/caregiver","Created a support system with other members of the We PLAY Center"]]
print('The We PLAY Center...')
plotPies(other_WP_opinions.iloc[:, :5], ["Strongly agree", "Neutral", "No opinion"], True, False)
plotPies(other_WP_opinions.iloc[:, 5:], ["Strongly agree", "Neutral", "No opinion"], True, False)
The We PLAY Center...
KEYS:
1. Provides an enriched learning environment, provides educational toys
2. Offers an adequate number of toys
3. Creates a positive welcoming friendly atmosphere
4. Staff is knowledgeable and answer questions willingly
5. Staff is patient and empathetic
KEYS:
1. Been helpful to my family
2. Given me more confidence as a parent
3. Helped me to become a better parent
4. Provided me with positive parenting strategies
5. Supported me in my role as a parent/caregiver
6. Created a support system with other members of the We PLAY Center
environmentstatements_FD = familySurveyDf.iloc[1:, 42:48]
labels = ['Strongly agree','Strongly disagree', 'Mostly agree',
'Neither disagree nor agree', 'Somewhat agree',
'Somewhat disagree']
environmentstatements_FD.columns = familySurveyDf.iloc[0, 42:48]
plotPies(environmentstatements_FD, labels, True, False)
KEYS:
1. Provides an enriched learning environment
2. Provides educational toys
3. Offers an adequate number of toys
4. Creates a positive, welcoming, friendly atmosphere
5. Staff is knowledgeable and answer questions willingly
6. Staff is patient and empathetic
attendance_feedback = familySurveyDf.iloc[1:, 48:53]
labels = ['Strongly agree','Strongly disagree', 'Mostly agree',
'Neither disagree nor agree', 'Somewhat agree',
'Somewhat disagree']
attendance_feedback.columns = familySurveyDf.iloc[0, 48:53]
plotPies(attendance_feedback, labels, True, False)
KEYS:
1. Do you feel supported in your role as a parent?
2. Do you feel confident in your role as a parent?
3. Do you feel you have a better understanding about age appropriate developmental milestones ?
4. Do you have a stronger bond with their child?
5. Has your child exhibited an increase in age-appropriate social skills?
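The pie charts give proportions, but a plain count table can be easier to compare across questions. The following is a minimal sketch, not the notebook's own method: `toy` is a made-up frame standing in for a slice such as `familySurveyDf.iloc[1:, 48:53]`, the column names `Q1` and `Q2` are hypothetical, and the ordering of `scale` from most to least agreement is our assumption.

```python
import pandas as pd

# Answer scale, ordered from most to least agreement (ordering is an assumption).
scale = ["Strongly agree", "Mostly agree", "Somewhat agree",
         "Neither disagree nor agree", "Somewhat disagree", "Strongly disagree"]

# Hypothetical responses standing in for a slice of the Family Survey.
toy = pd.DataFrame({
    "Q1": ["Strongly agree", "Mostly agree", "Strongly agree", None],
    "Q2": ["Somewhat agree", "Strongly agree", None, None],
})

# Count each answer per question (missing answers are skipped by value_counts),
# then force a fixed row order and show unused answers as 0.
table = (toy.apply(lambda col: col.value_counts())
            .reindex(scale)
            .fillna(0)
            .astype(int))
print(table)
```

Keeping every scale value as a row, even when nobody chose it, makes tables from different surveys line up.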
weplay_review = familySurveyDf.iloc[1:, 56:58]
labels = getLabels(weplay_review)
print("What will keep you coming back to the We PLAY Center?:\n")
print(labels[0][1:])
What will keep you coming back to the We PLAY Center?: ['The positive staff and positive environment.' 'My kids love it there an I enjoy playing with my babies ' 'Friendly staff, fun space ' 'The activities and Social time for kids. ' 'The people' 'The staff!' 'The dedicated playspace for infants. ' 'Convenient location, clean, diverse set of families.' 'Baby cafe is great.. really nice parents there too & staff. I like the diversity a lot too' 'The stimulating activities and toys that I can’t provide. The socialization of having to cooperate with other kids her age. The diversity in children and parents who come' 'Thursday' 'The interaction with other babies We Play provides ' 'Nice place to meet up with other moms. Only working part time right now so its nice to find free activities. ' 'Clean, open, temperate space for babies to safely play and engage and explore and welcoming, helpful staff' 'People and toys ' 'Increased open hours' 'Free and indoor and enough space to play (not too crowded) ' 'Wonderful staff & encouraged & excited for us to have a diverse place to play!' 'I really liked the friendly and helpful staff, the toys my children had access to, the separate baby area, and the relaxed attitudes of the other parents.' 'Access to a beautiful, inclusive family space with friendly staff.' 'The family aspect ' 'Community and engagement with other kids my child’s age' 'Socialization and bright atmosphere and wonderful moms' 'The amazing staff and the engaging activities.' 'The positive atmosphere that encourages good play time with my child.' 'Everything! The staff, the toys, the environment. It’s all great!' 'Doing the same thing they been doin and that’s being nice.' 'It was my first time there and the staff was so welcoming and positive that this will be our new go to. It was also very organized and clean. What a great place to go when the weather gets too humid to go outside. ' 'Yes ' 'If my schedule stays the same. 
' 'The staff, and the people there' 'close to my house, free, friendly, opportunity to meet other moms and babies, lovely staff, opportunity for my baby to play with different things and other children ' 'For my child to acquire social skills ' 'Positive reinforcement explanation ' 'Chance for my kids to interact with others and a chance for me to hang out with other parents' 'The children ' 'The environment ' 'Being able to enjoy activities with my son ' 'The instructions on how to properly deal with my kids' 'Friendly staff and friendly parents, bright airy space & great toys' 'The atmosphere ' 'Playing with my baby' 'The space for 0-3 is great. I value the diversity of parents / caregivers racially and economically. ' 'Love coming as is, would come more if there were additional hours it was open.' 'The atmosphere it creates for my daughter' 'Location, staff, other children' 'We plan to attend regularly so our baby can socialize with other children in a safe and enriching environment. ' 'Tye environment is warm and welcoming! Melanie and Christine remember every child’s name❤️. They are so helpful and encouraging.. patient with the kids.' 'The opportunity for my child to engage with other children, opportunity to have a change of environment for myself and my son, opportunity to engage with other parents and professionals who understand child development ']
print("What aspect of the We PLAY Center do you find most helpful?:\n")
print(labels[1][1:])
What aspect of the We PLAY Center do you find most helpful?: ['Different stuff' 'Interactive space outside the house that is still indoors' 'Safe area to play ' 'Other parents ' 'The staff!' 'Dedicated playspace for infants with developmentally appropriate toys.' 'Free!' 'The staff' 'The staff was very friendly and made me feel very welcome!' 'Being able to engage with other parents. Providing my baby with the opportunity to play with other children. Knowing that the staff is knowledgeable and helpful ' 'being around other adults in a social environment ' 'I’ve only been once but will see' 'Other babies, parents and staff to interact with' 'Various types of engagement for the children' 'Free and indoor' 'Staff' 'Staff members and the song and snack at the end - that really helped transition my toddler for the time to leave.' 'Baby cafe, toddler activities, baby gate on soft area, lovely staff. ' 'Everything' 'It is free!' 'The women on staff' 'The kids are engaged and having fun ' 'The dedicated time and space to really focus on engaging in play with my baby without the distractions at home.' 'That it’s a free space for my child to play with lots of learning experiences offered. ' 'The engagement of other kids' 'The set up is very organized and there are duplicates of most toys which is helpful with toddlers. ' "Being around mom's " 'The separation of the age groups ' 'cost and availability ' 'The staff are knowledgeable in everything about kids' 'They walk your through redirecting ' 'Easy space to watch my son while holding my newborn' 'Free play ' 'Comfort' 'Age appropriate activities for children ' 'The social engagement amongst the children' 'Everything ' 'Location, toys, staff, talking to other parents ' 'Different activities for little to learn ' 'The mothers ' 'Location , clean age appropriate play space. 
Staff modeling best practices for caregivers ' 'That it is staffed with educators that are very knowledgeable in interacting and working with young children.' 'The information the staff gives about toddlers and their behavior' 'Ease of access' "It's open more than one day, so that's easier for us to fit in our schedule. " 'All of it! ' 'The ability to allow my child to engage in play with other kids in a safe space']
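The printed arrays above are hard to scan, partly because some responses carry trailing spaces or embedded line breaks. A minimal sketch of tidying the text before printing one response per line follows; `raw` is a hypothetical stand-in for something like `labels[1][1:]`, and `tidy` is a helper name introduced only for this illustration.

```python
import pandas as pd

# Hypothetical sample with the kinds of problems seen in the output above:
# trailing spaces, an embedded newline, and a missing answer.
raw = [
    "Safe area to play ",
    "Location , clean age appropriate play space. \nStaff modeling best practices for caregivers ",
    "Free!",
    None,
]

def tidy(response):
    """Collapse runs of whitespace (including newlines) and trim both ends."""
    return " ".join(str(response).split())

# Skip missing answers, then skip anything that is empty after tidying.
cleaned = [tidy(r) for r in raw if pd.notna(r)]
cleaned = [r for r in cleaned if r]

for i, r in enumerate(cleaned, start=1):
    print(f"{i}. {r}")
```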
Word cloud of the open responses to "What will keep you coming back to the We PLAY Center?" from the incoming family surveys.
!pip3 install wordcloud
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS
# isolating column for word cloud:
text = allIncoming["What will keep you coming back to the We PLAY Center?"]
text = text.dropna()
# joining the responses into one string for the wordcloud function
text = " ".join(str(response) for response in text)
# filtering out uninformative words as whole words; str.replace would also
# cut them out of longer words (e.g. 'able' inside 'knowledgeable')
omit = ['will', 'Yes', 'able', 'thing', 'old', 'bring', 'around', 'Maybe', 'though', 'Every', 'visit', 'now', 'days', 'year']
stopwords = STOPWORDS.union(w.lower() for w in omit)
# Create and generate a word cloud image:
wordcloud = WordCloud(max_font_size=50, max_words=100, background_color="white", stopwords=stopwords).generate(text)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
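When filtering uninformative words, it matters that they are matched as whole words: a substring replacement of `able` would also alter `knowledgeable` and `table`. The sketch below illustrates whole-word filtering with a simple frequency count, which can serve as a numeric companion to the word cloud. The `responses` list is made up, `omit` is a subset of the list used in the notebook plus nothing else, and `tokens` is a helper name introduced here.

```python
import re
from collections import Counter

# Hypothetical responses; the notebook reads them from allIncoming.
responses = [
    "The staff will keep us coming back",
    "Knowledgeable staff, and a table for toys",
]
omit = {"will", "able", "now", "old"}  # subset of the notebook's omit list

def tokens(text):
    """Lower-case the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z'’]+", text.lower())

# Drop a word only when the whole token is in the omit set.
kept = [w for r in responses for w in tokens(r) if w not in omit]
freq = Counter(kept)
print(freq.most_common(3))
```

Because the comparison is made token by token, "knowledgeable" and "table" survive even though "able" is in the omit set.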
We use the gh-pages branch of https://github.com/renarepenning/weplaynoladata to display this notebook on Rena's personal website.
# update local branch
!git pull
# convert this file to HTML
!jupyter nbconvert --to html FinalTutorial.ipynb
# Delete old index.html
!rm index.html
# rename FinalTutorial.html to index.html
!mv FinalTutorial.html index.html
# Push to Rena's repo
!git config --global user.name "renarepenning"
!git config --global user.email "rrepenning@tulane.edu"
!git add -A
!git commit -m "auto update index.html"
# MUST PUSH MANUALLY
remote: Enumerating objects: 2, done.
remote: Counting objects: 100% (2/2), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 2 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (2/2), 1.22 KiB | 32.00 KiB/s, done.
From https://github.com/renarepenning/weplaynoladata
   25002a2..82e3e5e  gh-pages   -> origin/gh-pages
Already up to date.
[NbConvertApp] Converting notebook FinalTutorial.ipynb to html
[NbConvertApp] Writing 2961869 bytes to FinalTutorial.html
[main 6b900f8] auto update index.html
 2 files changed, 417 insertions(+), 311 deletions(-)
main into gh-pages