Projects
Selected projects across data, systems, and analytics.

Image-Based Mushroom Edibility Classification
Mar 2026 | Graduate - MDS Capstone Project
Python, ML, Computer Vision
- Built an image-based mushroom edibility classifier using a large real-world dataset (100K+ images).
- Designed a preprocessing pipeline to handle duplicates, class imbalance, and image inconsistencies.
- Compared multiple deep learning models (ResNet-18, ResNet-50, Swin Tiny Transformer).
- Achieved best performance with a threshold-tuned Swin Tiny model (92.26% accuracy, 94.02% toxic recall).
- Prioritized safety by optimizing for toxic recall, reducing the chance that a toxic mushroom is labeled edible.
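
The threshold tuning mentioned above can be illustrated with a minimal sketch. It assumes a binary setup where the model outputs a probability that a mushroom is toxic; the function names and the recall target are hypothetical and are not taken from the project code.

```python
def toxic_recall(y_true, p_toxic, threshold):
    """Fraction of truly toxic samples (label 1) flagged as toxic."""
    toxic_probs = [p for y, p in zip(y_true, p_toxic) if y == 1]
    if not toxic_probs:
        raise ValueError("no toxic samples in y_true")
    return sum(p >= threshold for p in toxic_probs) / len(toxic_probs)


def pick_threshold(y_true, p_toxic, min_recall):
    """Return the highest threshold whose toxic recall meets min_recall.

    Recall can only grow as the threshold drops, so scanning candidates
    from high to low and stopping at the first hit gives the highest
    threshold that still satisfies the constraint.
    """
    for t in sorted(set(p_toxic), reverse=True):
        if toxic_recall(y_true, p_toxic, t) >= min_recall:
            return t
    return 0.0  # fallback: flag everything as toxic
```

Choosing the highest threshold that still satisfies the recall target keeps the number of edible mushrooms flagged as toxic as low as the constraint allows. In practice the threshold would be chosen on a validation split, not on the test set.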

Scoring Risk of Default Using Banking Transaction Data
Mar 2024 | Undergraduate - DSC Capstone Project
Python, ML, NLP, XGBoost
- Developed a cash score model for assessing credit risk of first-time applicants.
- Led data analysis, income estimation, and feature derivation for robust risk assessment.
- Achieved 84% accuracy and 0.87 AUC with XGBoost; identified key default risk factors.
- Provided actionable insights to support better lending decisions and inclusive practices.

Status and Prospects of Data Science Careers
Dec 2023
D3, Storytelling, Visualization
- Visualization project on data science job trends and salary growth.
- Used a drill-down narrative structure from overview to details.
- Covered remote work trends, salary vs. experience, geography, and job categories.

Sudoku Solver
Jul 2023
JavaScript, HTML, CSS, Algorithms
- Built a Sudoku solver using backtracking.
- Maintained both JavaScript and Java versions.
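
For readers unfamiliar with backtracking, here is a minimal sketch of the idea. It is written in Python for brevity and is not the project's JavaScript or Java code; the grid representation (a 9x9 list of lists with 0 for empty cells) is an assumption.

```python
def is_valid(grid, row, col, digit):
    """Check that digit does not clash with its row, column, or 3x3 box."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    box_r, box_c = 3 * (row // 3), 3 * (col // 3)
    return all(
        grid[r][c] != digit
        for r in range(box_r, box_r + 3)
        for c in range(box_c, box_c + 3)
    )


def solve(grid):
    """Fill grid in place; return True if a solution was found."""
    for row in range(9):
        for col in range(9):
            if grid[row][col] == 0:
                for digit in range(1, 10):
                    if is_valid(grid, row, col, digit):
                        grid[row][col] = digit
                        if solve(grid):
                            return True
                        grid[row][col] = 0  # undo and try the next digit
                return False  # no digit fits here: backtrack
    return True  # no empty cells left
```

The solver tries each digit in the first empty cell, recurses, and undoes the placement when a later cell has no valid digit.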

Predictive Analysis on Clothing Fit
Dec 2022
Python, scikit-learn, pandas
- Developed a predictive model for clothing fit based on user measurements.
- Conducted exploratory analysis on size distribution and key drivers.
- Implemented baseline and improved models; addressed class imbalance for better performance.
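
One common way to address class imbalance is inverse-frequency class weighting. The sketch below illustrates that general technique only; the project description does not say which method was used, and the label names in the usage note are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and common classes below 1, so each
    class contributes equally to a weighted loss in aggregate.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}
```

For example, with six "fit", two "small", and two "large" labels, "fit" receives a weight below 1 while the two rarer classes receive equal weights above 1. The same formula is what scikit-learn applies when an estimator is given `class_weight="balanced"`.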

Rank Prediction of NYC Police Officers Based on Civilian Complaints
Jun 2022
Python, ML, Fairness
- Built a model to predict officer rank using civilian complaint data.
- Improved accuracy from 0.13 to 0.34 via feature engineering and tuning.
- Performed fairness analysis and a permutation test to assess potential bias.
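
A permutation test of the kind mentioned above can be sketched as follows. The test statistic (absolute difference in group means) and the two groups are assumptions for illustration; the project description does not specify what was compared.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of label shufflings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

In a fairness setting, `group_a` and `group_b` could hold per-example correctness (1 or 0) for two demographic groups, so a small p-value would suggest the accuracy gap is unlikely to arise from chance alone.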