Fine-tuning BioBERT per disease for clinical NER on 40,000 notes. Knowledge graph linking diseases, symptoms, and treatments for retrieval-augmented generation.
Fine-tuned BERT on 2,500+ legal documents for span extraction. Production legal research app with LangChain RAG for conversational querying. Won the UNDP-Diyo AI Hackathon 2023.
Stacked ProtBERT, ProtT5, and CNN models to predict lysine post-translational modification (PTM) sites. Interactive web interface with visualizations for PTM predictions on protein sequences.
Actor-Critic RL with IoT sensor data and ARX modeling for real-time control. Achieved stable control that outperformed traditional PID controllers in disturbance rejection.
Content-based image retrieval for 12K artworks using SimCLR self-supervised learning. FAISS nearest-neighbor search — 96% top-10 similarity accuracy.
CNN, CRNN, and Vision Transformer models classifying animal sounds from Mel-spectrograms. Tuned CNN+Dropout outperformed YAMNet transfer learning — 92% test accuracy.
Real-world camera calibration for computer vision — intrinsic and extrinsic parameters, distortion correction, and 3D reconstruction techniques with practical implementation.
Processed Fannie Mae data in Parquet using Spark. Clustering, regression, and risk analysis on corporate filings with PySpark and Neo4j GraphRAG.
Used YouTube API telemetry to identify a monetization sweet spot and uncover a growth opportunity through click-through rate (CTR) optimization across channels.
Quantified receiver separation, defensive reaction timing, and coverage efficiency from frame-level tracking data. K-Means clustering for defensive movement archetypes.
Real-time online voting system with strong data integrity. Interactive web dashboard for live results visualization, significantly reducing result processing time.
A guide to exploratory data analysis — data profiling, distribution analysis, correlation, outlier detection, and visualization best practices for effective dataset understanding.