A machine learning system that predicts student academic performance from demographic and behavioral data. It includes regression models for final scores and classification models for pass/fail status and letter grades.

aboalis/Student-Performance-Prediction

🎓 Student Performance Prediction System

Course: CIE 317/417 Machine Learning
Institution: Zewail City of Science, Technology and Innovation


📌 Project Overview

A machine learning system that analyzes educational data and predicts student performance. Using a dataset of more than 20,000 student records, it identifies the key factors influencing academic success and supports three prediction tasks:

  1. Regression: Predicting exact final_score (0-100)
  2. Binary Classification: Determining pass_fail status
  3. Multiclass Classification: Predicting specific final_grade (A, B, C, D, F)

An interactive Streamlit Dashboard visualizes model performance and enables real-time predictions.


👥 Team Members

| Name | Student ID |
|------|------------|
| Mohammed Ali Sadek | 202200594 |
| Ahmed Amgad | 202200393 |
| Abdulrahman Madgy | 202200341 |
| SalahDin Ahmed Rezk | 202201079 |

📂 Dataset Details

Source: Term_Project_Dataset_20K.csv

  • Size: 20,000+ samples
  • Features: 40 input variables across 4 categories
  • Target Variables: final_score, final_grade, pass_fail
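
A minimal sketch of how the three target columns might be separated from the input features before modelling. The target names come from the list above; the feature column names and the tiny demo frame are made up for illustration, and the real notebook may organise this differently.

```python
import pandas as pd

# Target column names as listed above; every other column is treated as a feature.
TARGETS = ["final_score", "final_grade", "pass_fail"]


def split_features_targets(df: pd.DataFrame):
    """Return (X, y_score, y_pass_fail, y_grade) for the three prediction tasks."""
    missing = [col for col in TARGETS if col not in df.columns]
    if missing:
        raise KeyError(f"missing target columns: {missing}")
    # Drop all three targets from X so no task can peek at another task's label.
    X = df.drop(columns=TARGETS)
    return X, df["final_score"], df["pass_fail"], df["final_grade"]


# Made-up rows standing in for Term_Project_Dataset_20K.csv (hypothetical columns).
demo = pd.DataFrame({
    "study_hours": [5.0, 12.0, 8.5],
    "gender": ["F", "M", "F"],
    "final_score": [48.0, 91.0, 72.5],
    "final_grade": ["F", "A", "C"],
    "pass_fail": ["Fail", "Pass", "Pass"],
})
X, y_score, y_pass_fail, y_grade = split_features_targets(demo)
print(list(X.columns))
```

Removing all three targets from the feature matrix matters because final_grade and pass_fail are presumably closely tied to final_score, so leaving any of them in would leak the answer.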

Feature Categories

| Category | Examples |
|----------|----------|
| Demographic | Age, Gender, Parent Income, Sibling Count |
| Academic History | Previous GPA, High School Grade, Attendance Rate |
| Behavioral | Study Hours, Participation, Alcohol Consumption |
| Psychological | Stress Level, Motivation, Anxiety, Sleep Hours |

⚙️ Methodology & Pipeline

1. Exploratory Data Analysis (EDA)

  • Distribution analysis of grades
  • Correlation matrices identifying relationships (e.g., Study Time vs. Score)
  • Outlier detection and visualization
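
The correlation and outlier steps could look roughly like the sketch below. It ranks numeric features by their correlation with final_score and flags outliers with the common 1.5 × IQR rule. The IQR rule, the helper names and the demo columns are assumptions; the README does not say which outlier method was used.

```python
import pandas as pd


def correlations_with_target(df: pd.DataFrame, target: str = "final_score") -> pd.Series:
    """Pearson correlation of each numeric column with the target, strongest first."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Made-up values; sleep_hours contains one implausible entry (30).
demo = pd.DataFrame({
    "study_hours": [2, 4, 6, 8, 10],
    "sleep_hours": [7, 6, 8, 7, 30],
    "final_score": [40, 50, 60, 70, 80],
})
corrs = correlations_with_target(demo)
mask = iqr_outliers(demo["sleep_hours"])
print(corrs)
print(demo[mask])
```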

2. Data Preprocessing

  • Imputation: Handling missing values in numerical and categorical columns
  • Encoding: One-Hot Encoding for nominal features (e.g., Gender)
  • Balancing: SMOTE (Synthetic Minority Over-sampling Technique) for class imbalance
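
One way to express the imputation and encoding steps is a scikit-learn ColumnTransformer, sketched below. The median and most-frequent imputation strategies and the column names are assumptions, not confirmed details of the notebook. SMOTE is left out here; it would normally be applied after this step and only to the training split, so that synthetic samples never reach the evaluation data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def build_preprocessor(numeric_cols, categorical_cols):
    """Impute numeric and categorical columns, then one-hot encode the categoricals."""
    numeric = SimpleImputer(strategy="median")  # assumed strategy
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # assumed strategy
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        [("num", numeric, numeric_cols), ("cat", categorical, categorical_cols)],
        sparse_threshold=0,  # always return a dense array
    )


# Made-up rows with one missing value per column (hypothetical column names).
demo = pd.DataFrame({
    "study_hours": [5.0, np.nan, 9.0, 7.0],
    "gender": ["F", "M", np.nan, "F"],
})
pre = build_preprocessor(["study_hours"], ["gender"])
Xt = pre.fit_transform(demo)
print(Xt)  # columns: study_hours, gender_F, gender_M
```

Fitting the preprocessor on the training split and reusing it on the test split keeps the imputation statistics from leaking test information.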

3. Model Training

  • Training multiple classical ML models
  • Hyperparameter tuning via GridSearchCV/RandomizedSearchCV
  • Cross-validation for robust performance estimation
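
As an illustration of the tuning step, the sketch below runs GridSearchCV with 5-fold cross-validation over a small Ridge grid. The data is synthetic (make_regression) and the grid values are arbitrary; the grids and models tuned in the notebook are not documented here.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the student data.
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Arbitrary example grid; scikit-learn maximises scores, hence the negated RMSE.
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(
    Ridge(), param_grid, cv=5, scoring="neg_root_mean_squared_error"
)
search.fit(X_train, y_train)

cv_rmse = -search.best_score_               # mean RMSE across the 5 folds
test_rmse = -search.score(X_test, y_test)   # RMSE on the held-out split
print(search.best_params_, cv_rmse, test_rmse)
```

RandomizedSearchCV follows the same pattern but samples a fixed number of candidates, which tends to be cheaper for large grids.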

4. Evaluation

  • Regression Metrics: RMSE, MAE, R² Score
  • Classification Metrics: Accuracy, Precision, Recall, F1-Score
  • Visualizations: Confusion Matrices, ROC Curves, Feature Importance
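
For reference, the metrics listed above can be computed by hand as in the sketch below. In practice sklearn.metrics would normally be used; the plain-Python versions just make the definitions explicit. The "Pass"/"Fail" label values and the sample numbers are made up.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall and F1 with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


reg = regression_metrics([50, 60, 70, 80], [52, 58, 73, 79])
clf = binary_metrics(
    ["Pass", "Pass", "Fail", "Fail", "Pass"],
    ["Pass", "Fail", "Fail", "Pass", "Pass"],
    positive="Pass",
)
print(reg)
print(clf)
```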

🧠 Models Implemented

📉 Regression Models (Score Prediction)

  • Linear Regression
  • Ridge & Lasso Regression
  • Random Forest Regressor
  • Support Vector Regressor (SVR)
  • Gradient Boosting Regressor

📊 Classification Models (Grade/Pass-Fail Prediction)

  • Logistic Regression
  • Random Forest Classifier
  • Gradient Boosting Classifier
  • XGBoost Classifier
  • Support Vector Machine (SVM)
  • K-Nearest Neighbors (KNN)
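
A rough sketch of how several of these classifiers might be compared on a multiclass task. It uses synthetic three-class data and default-ish hyperparameters, so the scores say nothing about the student dataset. XGBoost is omitted only to keep the example limited to scikit-learn, and the StandardScaler in front of the distance- and margin-based models is an assumption (scaling is not mentioned in the pipeline above).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class problem standing in for grade prediction.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

classifiers = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Macro-averaged F1 weights every class equally, which matters for rare grades.
macro_f1 = {}
for name, model in classifiers.items():
    model.fit(X_train, y_train)
    macro_f1[name] = f1_score(y_test, model.predict(X_test), average="macro")

for name, score in sorted(macro_f1.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {score:.3f}")
```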

💻 Installation & Usage

Prerequisites

  • Python 3.8 or higher
  • pip package manager

1. Clone the Repository

git clone https://github.com/aboalis/student-performance-prediction.git
cd student-performance-prediction

2. Install Dependencies

pip install -r requirements.txt

3. Run the Jupyter Notebook

To view the complete analysis and training process:

jupyter notebook notebooks/Final_Project.ipynb

4. Launch the Streamlit Dashboard (Optional)

For interactive predictions and visualizations:

streamlit run app.py

The dashboard will open in your browser at http://localhost:8501


📁 Repository Structure

student-performance-prediction/
│
├── data/
│   └── Term_Project_Dataset_20K.csv   # Primary dataset
│
├── notebooks/
│   └── Final_Project.ipynb            # Main analysis & training notebook
│
├── models/                            # Saved trained models (generated)
│
├── app.py                             # Streamlit dashboard application
├── requirements.txt                   # Python dependencies
├── README.md                          # Project documentation
└── .gitignore                         # Git ignore file

📊 Key Findings

Top Predictive Features

  1. Previous GPA - Strongest predictor of academic success
  2. Study Hours per Week - Strong positive correlation with final scores
  3. Attendance Rate - Critical factor in pass/fail outcomes
  4. Sleep Hours - Significant impact on cognitive performance
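
A ranking like the one above is typically read from a tree ensemble's feature_importances_. The sketch below shows the mechanics on synthetic data in which previous_gpa dominates by construction, so its output illustrates the method only and is not evidence for the ranking above. Column names and coefficients are made up.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Hypothetical features; the target is built so that previous_gpa matters most.
X = pd.DataFrame({
    "previous_gpa": rng.uniform(2.0, 4.0, n),
    "study_hours": rng.uniform(0.0, 20.0, n),
    "noise_feature": rng.normal(size=n),
})
y = 20.0 * X["previous_gpa"] + 0.5 * X["study_hours"] + rng.normal(0.0, 1.0, n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means more variance reduction.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

Impurity-based importances can be biased toward high-cardinality features, so permutation importance or SHAP values are often used as a cross-check.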

Model Performance Highlights

  • Best Regression Model: Random Forest Regressor
  • Best Binary Classifier: XGBoost
  • Best Multiclass Classifier: Gradient Boosting

Insights

  • Non-linear relationships between features favor ensemble methods (Random Forest, XGBoost)
  • SMOTE balancing significantly improved minority class predictions (F grades)
  • Behavioral factors (study time, participation) outweigh demographic factors in importance
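
To make the SMOTE idea concrete, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is a simplified stand-in written with NumPy, not the imbalanced-learn implementation (whose SMOTE class would normally be used); the function name and demo points are hypothetical.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority sample
    and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise Euclidean distances; a point is never its own neighbour.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)                     # starting points
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                               # position along the segment
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Four made-up minority-class points in two dimensions.
X_min = np.array([[1.0, 1.0], [1.5, 1.2], [2.0, 0.8], [1.2, 1.9]])
synthetic = smote_like_oversample(X_min, n_new=6, k=2)
print(synthetic)
```

Because every synthetic row lies on a segment between two real minority samples, oversampling should be done on the training split only; otherwise near-copies of test points can leak into training.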

🚀 Future Enhancements

  • Deep Learning models (Neural Networks) for comparison
  • Feature engineering with polynomial features
  • Real-time data integration with student information systems
  • Mobile application deployment
  • Explainable AI (SHAP values) for model interpretability

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

Course: CIE 317/417 Machine Learning
Instructor: Dr. Ahmed Tolba
Institution: Zewail City of Science, Technology and Innovation

Tools & Libraries:

  • Python, Scikit-Learn, XGBoost
  • Pandas, NumPy, Matplotlib, Seaborn
  • Streamlit, Jupyter Notebook
  • Google Colab

👤 Contact

Mohammed Ali Sadek
LinkedIn GitHub


Project Date: Fall 2024
Last Updated: January 2025
