SAGE Journals

A Data-Driven Approach to the Fragile Families Challenge: Prediction through Principal-Components Analysis and Random Forests

Posted on 2019-09-11 - 14:08

Sociological research typically involves exploring theoretical relationships, but the emergence of “big data” enables alternative approaches. This work shows the promise of data-driven machine-learning techniques involving feature engineering and predictive model optimization to address a sociological data challenge. The author’s group develops improved generalizable models to identify at-risk families. Principal-components analysis and decision tree modeling are used to predict six main dependent variables in the Fragile Families Challenge, successfully modeling one binary variable but no continuous dependent variables in the diagnostic data set. This indicates that some binary dependent variables are more predictable using a reduced set of uncorrelated independent variables, whereas continuous dependent variables demand more complex models.
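The general technique named in the abstract, reducing correlated predictors to a smaller set of uncorrelated principal components and then fitting a tree-based ensemble, can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic stand-in data rather than the Fragile Families data, and the component count, forest size, and all variable names are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the Fragile Families data):
# 500 rows, 40 features, many of them linear combinations of others.
X, y = make_classification(
    n_samples=500,
    n_features=40,
    n_informative=8,
    n_redundant=20,
    random_state=0,
)

# Hold out a test split to estimate out-of-sample accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize, project onto 10 uncorrelated principal components
# (an assumed number), then fit a random forest on a binary outcome.
model = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
])
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
n_components_kept = model.named_steps["pca"].n_components_
print(f"components kept: {n_components_kept}, held-out accuracy: {accuracy:.3f}")
```

For a continuous dependent variable, the same pipeline shape would use `RandomForestRegressor` in place of the classifier; the abstract reports that none of the continuous outcomes were modeled successfully in this way.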
