A system based on data science to predict student dropouts

The authors of the study: Eloi Puertas, Laura Igual and Sergi Rovira.
Research
(30/03/2017)

The student dropout rate in Europe currently stands at around 30%, according to Education at a Glance (EAG). In Spain, the figure is between 25% and 29%. To give lecturers a tool for assessing and improving academic performance and for reducing these rates, a team from the Faculty of Mathematics and Computer Science has published an article in the journal PLoS ONE presenting a data analysis system that predicts student dropout. The tool is based on data science, specifically on machine learning techniques.

The study is signed by the researchers Laura Igual and Eloi Puertas, from the Department of Mathematics and Computer Science of the UB, together with Sergi Rovira, a student on the bachelor's degree in Computer Engineering at the UB. The research is part of the Teaching Innovation Project “Sistema Intel·ligent de Suport als Tutors a la Universitat de Barcelona” (Intelligent System to Support the Teaching Staff at the University of Barcelona), led by Laura Igual, with the participation of researchers from the Faculty of Mathematics and Computer Science and the Faculty of Education. The aim is to develop a tool that gives lecturers recommendations and guidance for their students and lets them assess the risk of student dropout.

“Nowadays the role of the tutor is more important than ever in preventing students from leaving university and in improving their academic performance. The research proposes a system based on objective data that extracts hidden but important information from students' academic records and so helps teachers offer their students personal and proactive guidance,” says Igual.

In this first stage, the objective of the research was to answer the question “is it possible to predict whether a student will continue into the second year at university from the results of the first academic year?” To carry out the analysis, the researchers used data from the first and second academic years of three bachelor's degrees: Mathematics, Computer Science and Law. They applied five different data science algorithms, the best of which showed a precision of 82%. Both the algorithm and the anonymized data have been made publicly available in PLoS ONE.
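To make the 82% figure more concrete, the following minimal sketch shows how a dropout predictor can be scored. It is an illustration under stated assumptions, not the authors' method: the student records are invented, the rule (flag a student who passed fewer than half of the first-year courses) is a hypothetical stand-in for the five algorithms, and the 0 to 10 grading scale with a pass mark of 5 is assumed. The article does not define "precision"; the sketch uses the usual machine learning meaning (the share of flagged students who really dropped out) and shows accuracy alongside it for comparison.

```python
# Illustrative only: toy records and a threshold rule, NOT the study's data or algorithms.

def predict_dropout(first_year_grades, pass_mark=5.0, min_passed_fraction=0.5):
    """Flag a student as at risk if they passed fewer than half of their courses."""
    passed = sum(1 for grade in first_year_grades if grade >= pass_mark)
    return passed / len(first_year_grades) < min_passed_fraction


def precision(predicted, actual):
    """Fraction of students flagged as dropouts who actually dropped out."""
    true_positives = sum(1 for p, a in zip(predicted, actual) if p and a)
    flagged = sum(1 for p in predicted if p)
    return true_positives / flagged if flagged else 0.0


def accuracy(predicted, actual):
    """Fraction of all students whose outcome was predicted correctly."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical students: (first-year grades on an assumed 0-10 scale,
# whether the student dropped out before the second year).
students = [
    ([7.5, 6.0, 8.0, 5.5], False),
    ([3.0, 4.5, 2.0, 5.0], True),
    ([4.0, 4.5, 6.0, 3.5], False),
    ([1.0, 0.0, 3.0, 2.5], True),
    ([9.0, 8.5, 7.0, 9.5], False),
    ([5.0, 4.0, 5.5, 6.0], True),
    ([6.0, 7.0, 5.0, 8.0], False),
]

predicted = [predict_dropout(grades) for grades, _ in students]
actual = [dropped_out for _, dropped_out in students]

print(f"precision: {precision(predicted, actual):.2f}")
print(f"accuracy:  {accuracy(predicted, actual):.2f}")
```

In a real evaluation the predictions would come from a model trained on one group of students and scored on a different group, so that the reported figure reflects performance on unseen data.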

From statistics to data science

Previous studies of university dropout focused on statistical models built on collected data (usually through interviews) about the possible causes of dropout (motivation, relation with students…). Statistical models rest on hypotheses about the underlying problem, so if the factors behind students' performance change over time, a model's assumptions can become obsolete. “However,” continues Igual, “machine learning techniques have a predictive use based on objective data, which makes them more adaptable to new data”. Statistical systems, on the other hand, are better at determining the reasons students leave their studies. “But the predictive power of these tools is lower”, says Laura Igual. This new approach will also give the teaching staff “warnings” about students before they register, she notes.

The system can also predict the grades students are likely to obtain in future courses, which would allow teachers to offer them advice and guidance.

Within the teaching innovation project, “the next step is to analyze, from an educational perspective, how to use this tool, how to assess its impact and how to develop a computer application prototype”, concludes the researcher.

Reference:

S. Rovira, E. Puertas, L. Igual. «Data-driven System to Predict Academic Grades and Dropout». PLoS ONE, February 2017. doi:10.1371/journal.pone.0171207