AI/ML Engineer · Researcher
I build production-grade AI systems: multi-agent LLM pipelines, RAG architectures, and intelligent agents that go from research prototype to real-world deployment. Currently working with Dr. Alhoori as a Researcher at NIU.
From research to production — end-to-end ML engineering across the full stack.
From production ML systems to HPC research infrastructure — one role, many hats.
Working with Dr. Alhoori on LLM-driven scientific figure analysis. Reconstructed scientific flowcharts with 90% accuracy across 1.4M figures using a pipeline that combines Chain-of-Events (CoE) representations, RAG, and multiple agents (Phi-4, Qwen2.5-VL, InternVL, with Llama4-Scout as judge). Deployed and maintained multi-node Ray + vLLM inference on NIU Metis HPC across 32 A100 GPUs. Also serving as Teaching Assistant for CSCI 502 (Java) and CSCI 503 (Python), mentoring 40+ graduate and undergraduate students.
Shipped a classification system auditing 144K+ reports at 95% accuracy, cutting manual review by 60% and eliminating SLA penalty exposure. Lifted minority-class recall by 20% via failure-mode feature engineering, k-fold CV, and hyperparameter tuning. Built D3.js dashboards tracking pipeline-stage progress — adopted as the default daily artifact for the ops team. Deployed on SageMaker and Kubeflow.
Reduced instance costs by 30% by migrating from MacStadium Anka to AWS Mac VMs. Built Jenkins CI/CD pipelines with auto-scaling and load balancing for Docker-based environments, improving resource utilization at peak demand.
Maintained multiple web applications and kept them available through cloud infrastructure monitoring. Mitigated DDoS attacks against the hosted websites and worked with cross-functional teams on continuous deployment.
Peer-reviewed research in visual analytics and human-computer interaction.
Interactive visual analytics workflows are often disrupted by rigid filter panels and context switches that break analysts' cognitive flow. FluidViews elevates filters to first-class, manipulable objects through two novel direct-manipulation interactions: Copy-as-Highlight for persistent cross-view comparison, and Drag-as-Filter for applying context-sensitive filters in place, with no menus or modal dialogs required. A user study showed that FluidViews eliminated cross-view navigation overhead compared with baseline dashboards.
A selection of ML systems, research prototypes, and production work.
Built a framework that processes ~110K arXiv papers and extracts ~1.4M figures with captions via PDFFigures 2.0, GROBID, and PaddleOCR. Fine-tuned Llama4-Scout and Qwen2.5-VL with LoRA/PEFT on the SPIQA and ChartLlama datasets. Designed a multi-agent orchestration pipeline (Phi-4, Qwen2.5-VL, InternVL) with Llama4-Scout as LLM-as-judge and human-in-the-loop (HITL) checkpoints. Built a flowchart reconstruction pipeline using Claude 3.5 Sonnet with structured outputs and RAG to generate Chain-of-Events (CoE) representations, achieving 90% accuracy.
Implemented a majority-vote ensemble combining Decision Trees, Random Forest, SVC, KNN, and GPT-4 with K-fold CV and Optuna hyperparameter tuning — boosting accuracy from 94% to 98% while mitigating overfitting. Applied feature engineering, StandardScaler, and SMOTE class balancing for dataset robustness; evaluated with accuracy, weighted F1-score, recall, and precision.
A custom D3.js glyph visualization that represents movies as flowers. Each flower encodes three variables simultaneously through visual form: the number of petals is scaled by the number of votes, the overall flower size reflects the movie rating, and the petal shape represents the MPAA rating category (G, PG, PG-13, R). A pleasing, scannable alternative to traditional bar charts for multivariate film data.
An interactive D3.js dashboard analyzing ProPublica's Road Home dataset — individual-level grants disbursed to Louisiana homeowners after Hurricanes Katrina and Rita. Built a custom D3 HexGrid map of Louisiana showing average damage per hex bin with brushing for region selection. Designed two novel custom glyphs: a Home glyph where 8 layered triangles encode the repair-cost percentage of total damage (with a blue back-area for rebuild cost), and a Chimney stacked bar showing cumulative distribution across the four grant types — Compensation, Additional Compensation, Elevation, and Individual Mitigation Measure.
Deep dives on LLMs, RAG systems, and ML engineering.
Behind the scenes of orchestrating Phi-4, Qwen2.5-VL, and InternVL with Llama4-Scout as LLM-as-judge — across 1.4M figures from 110K arXiv papers.
Practical notes from fine-tuning vision-language models on SPIQA and ChartLlama datasets across 32 A100 GPUs — what worked, what broke, what I'd do differently.
A step-by-step guide to sharding foundation models across multi-node HPC clusters using Ray + vLLM — from cluster setup to throughput tuning.
I'm actively looking for ML Engineer / Data Scientist roles and am open to opportunities in both India and the US.