Senior ML Engineer @ J-Squared Technologies · Toronto

Jaskaran Bhatia

I make AI run anywhere.

Machine Learning Engineer with a Software Engineer's backbone. I take models from research papers to production: Agentic Pipelines, LLM Inference, and Computer Vision that ships on real hardware for real clients.

Edge AI · LLM Inferencing · Agentic AI · Computer Vision · Full-Stack · AWS

Speaking at AI conferences

4+ years

Production software & ML engineering

JP Morgan Alum

Fintech-grade engineering discipline

UofT MScAC

Master's in Applied Computing

2× CANSEC Speaker

Presented FalconVeo · 2025 & 2026

About

Engineer first, researcher close second.

I take ML from research papers to production — models that are small, fast and local.

Edge AI & Model Optimization
LLM Systems & Agentic AI
Full-Stack ML Engineering

UofT MScAC · 4.0 GPA

Thapar B.E. · 9.55

I own ML systems end-to-end. At J-Squared Technologies I've shipped agentic pipelines that cut labeling effort by 90%, video RAG presented at CANSEC 2025 & 2026, and vision models optimized to sub-5 ms on edge hardware for defence, manufacturing and retail-mining clients.

Before that: production microservices at JP Morgan Chase and a 4.0 GPA Master's at the University of Toronto. A rare combination — equally strong in CUDA / Rust / C++ systems work and React / AWS product work. I'm at my best where models meet production.

University of Toronto

MSc in Applied Computing (MScAC) — Computer Science

2022 – 2023 · GPA 4.0 / 4.0 · A+ in ML, Deep Learning, NLP & Computational Imaging

Thapar Institute of Engineering & Technology

B.E. in Computer Engineering

2018 – 2022 · GPA 9.55 / 10

Edge AI & Model Optimization

Quantization (PTQ + QAT), pruning, distillation and custom CUDA / TensorRT kernels — detection, segmentation, re-ID and pose models meeting sub-5 ms budgets on Jetson and Hailo-8.

LLM Systems & Agentic AI

RAG over knowledge graphs, MCP servers for real tools, multi-agent annotation workflows, and local quantized inference — Ollama, Candle (Rust), vLLM, TensorRT-LLM.

Full-Stack ML Engineering

Lock-free C++ IPC backbones, Rust inference services, REST APIs, React frontends and AWS architecture — the production plumbing that makes models actually usable.

Writing

Latest from the blog.

Deep dives on ML systems, edge AI and computer vision — published on Medium.

Medium

May 2026 · Rust · LLMs

Rust vs Python for LLM Inference: I Benchmarked Everything So You Don't Have To

Medium

Sep 2024 · Edge AI

The Rise of On-Device AI Processing and SLMs

Medium

Sep 2023 · Hardware

AI Accelerators: Tracing the Past, Understanding the Present, and Forecasting the Future

Experience

Where I've shipped.

J-Squared Technologies

2023 — Now

Senior Machine Learning Engineer

Agentic annotation at 100K+ sample scale, FalconVeo video RAG presented at CANSEC 2025 & 2026, and sub-5 ms edge inference for defence, manufacturing and retail-mining clients.

Edge AI · CUDA · Rust

JP Morgan Chase

2022

Software Engineer

Migrated customer-facing CIB microservices from Angular to React (+10% engagement) and shipped hardened .NET Core APIs for production financial transactions.

React · .NET Core

Scaler Academy

2021

SDE Intern

Rebuilt referral and newsletter dashboards in React with Rails APIs and Jenkins CI/CD on AWS — the redesigned flow doubled referral conversion.

React · Rails · AWS

Projects

Selected builds.

fragivo ● LIVE

Fragivo — AI Fragrance Platform

LLM- and vision-powered fragrance discovery on AWS: OAuth, Google-Search-grounded analysis, prompt-engineered recommendations.

LLMs · AWS · RecSys

LLMs for Medical Text Summarization

Fine-tuned and benchmarked GPT-3/4, T5, BART and Pegasus on medical summarization — ROUGE, BERTScore and inference cost head-to-head. Published as a preprint.

LLMs · NLP · Benchmarking

Preprint Code

MedGANs — Medical Image Enhancement

End-to-end GAN pipeline for medical image denoising and enhancement, with single-shot HDR and edge-enhancement post-processing. Published in IJESE (2025).

GANs · Imaging · Published

Paper Code

Research & Speaking