Educational Data Mining

Data analysis of educational datasets

This was part of the final project for an Artificial Intelligence class.

Given the task of choosing two datasets from the UC Irvine Machine Learning Repository, I used “Student Performance” and “Turkiye Student Evaluation”. The specific requirements for the project were as follows:

Dataset
  • Classification
  • Multivariate
  • # of Attributes (10 ~ 100)
  • # of Instances (one dataset with 100 ~ 1,000, one with more than 1,000)
Analysis
  • Use at least 4 different classifiers for comparative analysis
  • The final evaluation must use cross-validation
  • Make sure to address the concept of overfitting in the final evaluation
  • Use the results of ZeroR and OneR as the baseline
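
To make the baseline requirement concrete, here is a minimal pure-Python sketch of what ZeroR and OneR do, scored with k-fold cross-validation. It is not the Weka setup used in the project. The function names (zero_r, one_r, cross_val_accuracy) and the toy rows are hypothetical and made up for illustration. It assumes every attribute is nominal.

```python
from collections import Counter, defaultdict


def zero_r(train_rows, train_labels):
    """ZeroR: ignore every attribute and always predict the majority class."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda row: majority


def one_r(train_rows, train_labels):
    """OneR: keep the single-attribute rule with the fewest training errors."""
    default = Counter(train_labels).most_common(1)[0][0]
    best_attr, best_rule, best_errors = None, None, None
    for attr in range(len(train_rows[0])):
        # Count class labels for each value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(train_rows, train_labels):
            counts[row[attr]][label] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best_errors is None or errors < best_errors:
            best_attr, best_rule, best_errors = attr, rule, errors
    # Attribute values never seen in training fall back to the majority class.
    return lambda row: best_rule.get(row[best_attr], default)


def cross_val_accuracy(learner, rows, labels, k=10):
    """Plain (unstratified) k-fold cross-validation; returns overall accuracy."""
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(rows), k))
        train_rows = [r for i, r in enumerate(rows) if i not in test_idx]
        train_labels = [y for i, y in enumerate(labels) if i not in test_idx]
        model = learner(train_rows, train_labels)
        correct += sum(model(rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


# Made-up toy data: (study time, internet access) -> pass/fail.
rows = [
    ("high", "yes"), ("high", "no"), ("low", "yes"), ("high", "yes"),
    ("low", "no"), ("high", "no"), ("high", "yes"), ("low", "yes"),
    ("high", "no"), ("low", "no"), ("high", "yes"), ("high", "no"),
]
labels = ["pass" if study == "high" else "fail" for study, _ in rows]

print(f"ZeroR: {cross_val_accuracy(zero_r, rows, labels, k=4):.3f}")  # ZeroR: 0.667
print(f"OneR:  {cross_val_accuracy(one_r, rows, labels, k=4):.3f}")   # OneR:  1.000
```

Any other classifier in the comparison should beat these two numbers under the same cross-validation. A classifier that scores far higher on its own training data than in cross-validation is likely overfitting.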


If you want to use the dataset I formatted for Weka, you can access it here.