All Projects
Research, industry, and passion projects across ML, NLP, computer vision, and data analytics.
NLP, LLMs & Machine Learning
Medical Entity Extraction with NLP + RAG Completed · GWU

Fine-tuned BioBERT per disease for clinical NER on 40,000 notes. Built a knowledge graph linking diseases, symptoms, and treatments to support retrieval-augmented generation.

BioBERT · Transformers · Knowledge Graph · RAG · Python
Niti — Legal Research Chatbot Completed

Fine-tuned BERT on 2,500+ legal documents for span extraction. Production legal research app with LangChain RAG for conversational querying. Won the UNDP-Diyo AI Hackathon 2023.

BERT · LangChain · RAG · FastAPI · React
Lysine PTM Site Prediction Academic · Nepal

Stacked ProtBERT, ProtT5, and CNN model predicting lysine post-translational modification sites. Interactive web interface with visualizations for PTM predictions on protein sequences.

ProtBERT · ProtT5 · CNN · ANN · Flask
Balancing Inverted Pendulum — RL Academic

Actor-Critic RL with IoT sensor data and ARX modeling for real-time control. Stabilized the pendulum and outperformed traditional PID controllers in disturbance rejection.

Actor-Critic RL · ARX Model · IoT · Python
Computer Vision
Artwork Similarity Search Research · GWU

Content-based image retrieval for 12K artworks using SimCLR self-supervised learning. FAISS nearest-neighbor search — 96% top-10 similarity accuracy.

SimCLR · PyTorch · FAISS · AWS
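As a rough illustration of the retrieval step, the sketch below ranks items by cosine similarity with a brute-force scan. It is a stand-in for the idea behind nearest-neighbor search, not the FAISS API and not the project's code; the function names and the tiny 2-D `artworks` vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_similar(query, embeddings, k=10):
    """Return the indices of the k embeddings most similar to the query."""
    scored = [(cosine_similarity(query, emb), idx) for idx, emb in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]

# Hypothetical 2-D embeddings; learned image embeddings are much higher-dimensional.
artworks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
nearest = top_k_similar([1.0, 0.05], artworks, k=2)
```

A brute-force scan like this costs one comparison per stored item, which is the cost an indexed library is meant to reduce on larger collections.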
Animal Sound Classifier Research · GWU

CNN, CRNN, and Vision Transformer models classifying animal sounds from Mel-spectrograms. Tuned CNN+Dropout outperformed YAMNet transfer learning — 92% test accuracy.

PyTorch · TensorFlow · Librosa · ViT · CNN
Camera Calibration & Projection Models Research · GWU

Real-world camera calibration for computer vision — intrinsic and extrinsic parameters, distortion correction, and 3D reconstruction techniques with practical implementation.

OpenCV · Python · NumPy
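For readers unfamiliar with intrinsic parameters, here is a minimal sketch of the ideal pinhole projection. It assumes the point is already in camera coordinates (extrinsics applied) and ignores lens distortion. The function name and the sample focal lengths and principal point are illustrative assumptions, not values from the project.

```python
def project_point(point_3d, fx, fy, cx, cy):
    """Project a 3D point (camera coordinates) to pixel coordinates.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Ideal pinhole model with no distortion terms.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v

# Hypothetical intrinsics for a 640x480 image.
pixel = project_point((0.5, -0.25, 2.0), fx=800.0, fy=800.0, cx=320.0, cy=240.0)
```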
Data Engineering & Analytics
Finance Data Analytics at Scale GWU · 2024

Processed Fannie Mae data in Parquet using Spark. Clustering, regression, and risk analysis on corporate filings with PySpark and Neo4j GraphRAG.

PySpark · Neo4j · GraphRAG · Apache Spark
Strategic Content Analysis — YouTube Completed

Leveraged YouTube API telemetry to identify a monetization sweet spot and uncover a growth opportunity through CTR optimization across channels.

YouTube API · Python · Pandas · Matplotlib
NFL Big Data Bowl 2026 2026

Quantified receiver separation, defensive reaction timing, and coverage efficiency from frame-level tracking data. K-Means clustering for defensive movement archetypes.

Scikit-learn · K-Means · NumPy · Matplotlib
E-Voting Database Management System Personal · 2022

Real-time online voting system with strong data integrity. Interactive web dashboard for live results visualization, significantly reducing result processing time.

SQL · ETL · Flask · JavaScript · Hugo
Starting with EDA Writing

A guide to exploratory data analysis — data profiling, distribution analysis, correlation, outlier detection, and visualization best practices for effective dataset understanding.

Pandas · Seaborn · Matplotlib · Python
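One common outlier check in exploratory analysis is the interquartile-range (IQR) rule. The sketch below uses only the standard library. The `iqr_outliers` helper, the conventional 1.5 multiplier, and the sample values are assumptions for illustration and are not taken from the guide.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample with one obviously extreme value.
sample = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(sample)
```

The rule is a heuristic: flagged points deserve a closer look before any decision to drop them.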