Contrasting classical and machine learning approaches in the estimation of value-added scores in large-scale educational data

There is no consensus on which statistical model estimates school value-added (VA) most accurately. To date, the two most common statistical models used for the calculation of VA scores are two classical methods: linear regression and multilevel models. These models have the advantage of being relatively transparent and thus understandable for most researchers and practitioners. However, these statistical models are bound to certain assumptions (e.g., linearity) that might limit their prediction accuracy. Machine learning methods, which have yielded spectacular results in numerous fields, may be a valuable alternative to these classical models. Although big data is not new in general, it is relatively new in the realm of social sciences and education. New types of data require new data analytical approaches. Such techniques have already evolved in fields with a long tradition in crunching big data (e.g., gene technology).
The objective of the present paper is to competently apply these "imported" techniques to education data, more precisely VA scores, and assess when and how they can extend or replace the classical psychometrics toolbox. The different models include linear and non-linear methods and extend classical models with the most commonly used machine learning methods (i.e., random forest, neural networks, support vector machines, and boosting). We used representative data of 3,026 students in 153 schools who took part in the standardized achievement tests of the Luxembourg School Monitoring Program in grades 1 and 3. Multilevel models outperformed classical linear and polynomial regressions, as well as different machine learning models. However, it could be observed that across all schools, school VA scores from different model types correlated highly. Yet, the percentage of disagreements as compared to multilevel models was not trivial and real-life implications for individual schools may still be dramatic depending on the model type used. Implications of these results and possible ethical concerns regarding the use of machine learning methods for decision-making in education are discussed.
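As a rough illustration of the simplest classical approach named in the abstract, the sketch below estimates school VA with a plain linear regression. It rests on assumptions that this record does not confirm: that a school's VA score is the mean residual (observed minus predicted later achievement) of its students, and that prior achievement is the only predictor. The function name `school_value_added` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def school_value_added(prior, outcome, school_ids):
    """Residual-based VA sketch: mean (observed - predicted) score per school.

    Assumption: a single linear regression of the later score on the prior
    score; this is not necessarily the specification used in the paper.
    """
    prior = np.asarray(prior, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    school_ids = np.asarray(school_ids)

    # Design matrix: intercept column plus prior achievement.
    X = np.column_stack([np.ones_like(prior), prior])
    # Ordinary least squares fit.
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ coef

    # Average the student-level residuals within each school.
    return {
        s.item(): float(residuals[school_ids == s].mean())
        for s in np.unique(school_ids)
    }


if __name__ == "__main__":
    # Synthetic data for illustration only (not the Luxembourg data).
    rng = np.random.default_rng(0)
    n_schools, n_per_school = 5, 40
    schools = np.repeat(np.arange(n_schools), n_per_school)
    true_effect = rng.normal(0.0, 0.5, size=n_schools)
    grade1 = rng.normal(0.0, 1.0, size=schools.size)
    grade3 = 0.8 * grade1 + true_effect[schools] + rng.normal(0.0, 1.0, size=schools.size)

    for school, va in school_value_added(grade1, grade3, schools).items():
        print(f"school {school}: VA = {va:+.3f}")
```

A multilevel or machine learning variant would replace the least-squares step with a different predictor of the later score; the per-school averaging of residuals would stay the same under the assumption stated above.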

Metadata
Author details:Jessica Levy, Dominic Mussack, Martin Brunner, Ulrich Keller, Pedro Cardoso-Leite, Antoine Fischbach
DOI:https://doi.org/10.3389/fpsyg.2020.02190
ISSN:1664-1078
Pubmed ID:https://pubmed.ncbi.nlm.nih.gov/32973639
Title of parent work (English):Frontiers in psychology
Publisher:Frontiers Research Foundation
Place of publishing:Lausanne
Publication type:Article
Language:English
Date of first publication:2020/08/21
Publication year:2020
Release date:2023/03/24
Tag:comparison; longitudinal data; machine learning; model; school effectiveness; value-added modeling
Volume:11
Article number:2190
Number of pages:18
Funding institution:PRIDE grant of the Luxembourg National Research Fund (FNR) [PRIDE/15/10921377]; ATTRACT grant of the Luxembourg National Research Fund (FNR) [ATTRACT/2016/ID/11242114/DIGILEARN]
Organizational units:Humanwissenschaftliche Fakultät / Strukturbereich Bildungswissenschaften / Department Erziehungswissenschaft
DDC classification:1 Philosophy and psychology / 15 Psychology / 150 Psychology
Peer review:Refereed
Publishing method:Open Access / Gold Open-Access
Listed in DOAJ
License:CC-BY - Attribution 4.0 International