Andrew Benedictus

Projects

Employee Attrition Problem Analysis and Prediction

Jupyter Python scikit-learn Pandas NumPy Matplotlib Seaborn Joblib Streamlit Tableau

This project addresses an employee attrition rate of more than 10% through in-depth analysis. Deliverables include a business dashboard, machine learning attrition predictions, and recommended action items for the company.

Dashboard Repository

Student Dropout Problem Analysis and Prediction

Jupyter Python scikit-learn Pandas NumPy Matplotlib Seaborn Joblib Streamlit Tableau

This project addresses a high student dropout rate of more than 32% through in-depth analysis. Deliverables include a business dashboard, machine learning predictions of student status, and recommended action items for the institution.

Dashboard Repository

Stroke Disease Detection

Jupyter Python Keras TensorFlow TFX Pandas Railway Prometheus

Developed and deployed a stroke detection model using Predictive Analytics and MLOps, achieving 96% accuracy. Evaluated performance with AUC, Binary Accuracy, TFMA metrics, and confusion matrix components. Focused on improving early detection for high-risk individuals, enhancing healthcare insights through machine learning-driven diagnostics.

Deployment Dataset

Disaster Tweets Classification

Jupyter Python Keras TensorFlow TFX Pandas

Developed and deployed a disaster tweets classification model using Natural Language Processing (NLP) and Machine Learning Operations (MLOps), achieving 86% accuracy. Evaluated performance with AUC, Binary Accuracy, TFMA metrics, and confusion matrix components. Focused on detecting fake news and hoaxes about natural disasters, which can spread quickly on widely used platforms such as Twitter.

Repository Dataset

Bike Sharing Analysis Dashboard

Jupyter Python Pandas Matplotlib Seaborn Streamlit

This project analyzes the Bike Sharing Dataset and presents the results as data visualizations in an interactive dashboard.

Dashboard Repository

Book Recommender System

Jupyter Python Keras TensorFlow scikit-learn Pandas NumPy Matplotlib

Developed an advanced book recommendation system to reignite reading interest, leveraging data from AWS. Implemented Content-Based Filtering with TF-IDF and Cosine Similarity, alongside a custom Collaborative Filtering model using RecommenderNet with Binary Cross-entropy and Root Mean Squared Error (RMSE) metrics. Optimized data preprocessing and analysis to enhance recommendations. Notable insights include peak book demand in December (12%) and the lowest in June (6%).
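As a rough illustration of the Content-Based Filtering idea mentioned above, here is a minimal sketch of TF-IDF weighting combined with cosine similarity in plain Python. It is not the project's actual code: the function names (`tfidf_vectors`, `cosine_similarity`, `recommend`), the whitespace tokenizer, the smoothed IDF formula, and the sample titles are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(titles, descriptions, query_index, top_k=2):
    """Rank the other items by similarity of their description to the query item."""
    vectors = tfidf_vectors(descriptions)
    scored = [
        (cosine_similarity(vectors[query_index], vec), titles[i])
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]


# Hypothetical sample data, for illustration only.
titles = ["Book A", "Book B", "Book C"]
descriptions = [
    "wizard school magic",
    "magic wizard adventure",
    "cooking recipes pasta",
]
top_match = recommend(titles, descriptions, query_index=0, top_k=1)
```

Here "Book B" shares two terms with "Book A", while "Book C" shares none, so "Book B" ranks first. A library implementation, such as the TF-IDF vectorizer in scikit-learn, would typically replace the hand-written weighting in practice.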

Repository Dataset

Electric Predictive Analytics

Jupyter Python scikit-learn Pandas Matplotlib Seaborn

Developed a machine learning model to predict electricity consumption in Tétouan, Morocco, using weather data. Analyzed 52,416 observations from three zones, identifying key correlations. Random Forest outperformed other models, achieving a Root Mean Squared Error (RMSE) of 24.15 (train) and 39.28 (test). This project addresses energy efficiency challenges and supports sustainable resource management.
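For readers unfamiliar with the metric, the sketch below shows how RMSE is computed and how the reported train and test values can be compared. The helper name `rmse` and the toy numbers are assumptions for illustration; only the 24.15 and 39.28 figures come from the project summary.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values, not taken from the dataset.
toy_rmse = rmse([0.0, 0.0], [3.0, 4.0])  # sqrt((9 + 16) / 2)

# Gap between the reported test and train RMSE.
reported_gap = round(39.28 - 24.15, 2)
```

A test RMSE that is noticeably higher than the train RMSE, as in the figures above, is often read as a sign that the model fits the training data more closely than unseen data.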

Repository Dataset

Chicago Weather Forecasting

Jupyter Python Keras TensorFlow Pandas Matplotlib

Developed an LSTM-based deep learning model for weather time series prediction in Chicago using 43,824 data points. Trained with the Stochastic Gradient Descent (SGD) optimizer, Huber loss, and the Mean Absolute Error (MAE) metric, achieving an MAE of 2.2306 and a validation MAE of 1.7385. Also implemented early stopping for model training optimization.
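The difference between the Huber loss used for training and the MAE used for reporting can be sketched in a few lines of plain Python. The function names and the threshold `delta=1.0` are assumptions for this example; the project summary does not state which threshold was used.

```python
def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones.

    delta=1.0 is an assumed threshold, not a value taken from the project.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


small_error_loss = huber_loss([0.0], [0.5])   # quadratic branch: 0.5 * 0.5**2
large_error_loss = huber_loss([0.0], [3.0])   # linear branch: 1.0 * (3.0 - 0.5)
toy_mae = mean_absolute_error([0.0, 0.0], [1.0, -3.0])
```

Because large errors are penalized linearly rather than quadratically, Huber loss is less sensitive to outliers than squared error, which is a common reason to pair it with an MAE metric for noisy time series.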

Repository Dataset

News Classification with NLP

Jupyter Python Keras TensorFlow Pandas Matplotlib Seaborn

Developed a deep learning model using Bi-LSTM for news topic classification (world, sports, business, sci-tech) on 120,000 data points. Applied data cleaning, trained with Adam optimizer and categorical cross-entropy loss, achieving 97.27% accuracy. Implemented ReduceLROnPlateau for model training optimization.

Repository Dataset

Rock-Paper-Scissors Image Classification

Jupyter Python Keras TensorFlow Matplotlib

Developed a deep learning model using a Convolutional Neural Network (CNN) to classify rock, paper, and scissors hand images. Trained on 2,188 images (60:40 split) with the Adam optimizer and categorical cross-entropy loss. Achieved 97.62% accuracy and 98.75% validation accuracy in 20 epochs.

Repository Dataset

Check More on My GitHub!