Predicting academic performance from behavioural and learning data
Carlos J. Villagrá-Arnedo, Francisco J. Gallego-Durán, Patricia Compañ-Rosique, Faraón Llorens-Largo, Rafael Molina-Carmona
Departamento de Ciencia de la Computación e Inteligencia Artificial
Universidad de Alicante
Big Data 2016
Alicante, 3-5 May 2016
http://www.wessex.ac.uk/conferences/2016/big-data-2016
Organisers
Wessex Institute, UK
University Miguel Hernandez, Spain
University of Alicante, Spain
Abstract
The volume and quality of data, as well as their relevance, are crucial in data analysis. This paper studies the influence of different types of data in the context of educational data obtained from Learning Management Systems. These systems provide a large amount of data on student activity, but they usually describe behaviour rather than the results of the learning process. The starting hypothesis is that complementing behavioural data with more relevant data on learning outcomes leads to a better analysis of the learning process and, in particular, makes it possible to predict a student's final performance early. A learning platform has been specially developed to collect data not only on usage but also on the way students learn and progress in training activities. Both types of data are used to build a progressive predictive system that supports the learning process. The model is based on a Support Vector Machine classifier. Each week, the system outputs, for every student, the probability of belonging to each of three classes: high, medium and low performance. The results show that supplementing behavioural data with learning data yields better predictions of student results in a learning system. Moreover, they suggest that using heterogeneous data improves the final performance of the prediction algorithms.
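To make the idea of a progressive, weekly prediction concrete, the following minimal Python sketch shows one way the input to such a classifier could be assembled: for each week, a feature vector that aggregates everything observed so far, with or without the learning data. The field names (logins, minutes_online, exercises_passed, mean_score) and the aggregation choices are assumptions made for illustration only; they are not the variables or the platform schema used in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WeeklyRecord:
    """One student's data for one week (all field names are hypothetical)."""
    week: int
    logins: int             # behavioural: platform usage
    minutes_online: float   # behavioural: platform usage
    exercises_passed: int   # learning: progress in training activities
    mean_score: float       # learning: average score that week


def features_up_to(records: List[WeeklyRecord], week: int,
                   use_learning: bool = True) -> List[float]:
    """Aggregate everything observed up to and including `week`."""
    seen = [r for r in records if r.week <= week]
    behavioural = [
        float(sum(r.logins for r in seen)),
        float(sum(r.minutes_online for r in seen)),
    ]
    if not use_learning:
        return behavioural
    n = len(seen)
    learning = [
        float(sum(r.exercises_passed for r in seen)),
        sum(r.mean_score for r in seen) / n if n else 0.0,
    ]
    return behavioural + learning


def weekly_feature_table(records: List[WeeklyRecord], last_week: int,
                         use_learning: bool = True) -> Dict[int, List[float]]:
    """One cumulative feature vector per week, from week 1 to `last_week`."""
    return {w: features_up_to(records, w, use_learning)
            for w in range(1, last_week + 1)}


# Made-up data for a single student over three weeks.
records = [
    WeeklyRecord(week=1, logins=3, minutes_online=40.0,
                 exercises_passed=1, mean_score=6.0),
    WeeklyRecord(week=2, logins=5, minutes_online=75.0,
                 exercises_passed=2, mean_score=8.0),
    WeeklyRecord(week=3, logins=2, minutes_online=30.0,
                 exercises_passed=0, mean_score=4.0),
]

table = weekly_feature_table(records, last_week=3)
behaviour_only = weekly_feature_table(records, last_week=3, use_learning=False)
```

Under these assumptions, each week's vectors for all students could be passed to a probabilistic three-class classifier (an SVM in the paper), and comparing models trained on `behaviour_only` against those trained on `table` would mirror the comparison between behavioural data alone and behavioural plus learning data.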