Hello, I'm Arpit Singh Gautam

I am a Data Scientist in the CSG CTO Lab at Dell Technologies, where I focus on optimization, efficient inference, and scalable AI systems. My work spans generative AI, reinforcement learning, neural architecture search, and distributed model serving, with an emphasis on robust systems that hold up at scale.

I have developed systems for disaggregated serving, speculative decoding, and KV cache optimization. My work has been accepted at AAAI, ICCCNT (IIT Indore), and the FEVER Workshop @ EACL 2026 (now on ACL Anthology).

Research interests: Systems for ML & Distributed AI · Efficient and Hardware-Aware Inference · Reasoning-Centric LLMs · Reinforcement Learning for Foundation Models

Publications · Full-Time Experience · Talks, Panels & Mentoring · Hackathon Wins · 1-on-1 Students Helped (rated 5/5 on TopMate ★)

Recent Updates

Papers · Projects · Talks · Blog posts — all in one place.

LLM Quantization Gallery
Project Launch
Apr 7, 2026
An interactive gallery comparing INT4, INT8, GPTQ, AWQ, QLoRA and GGUF methods — with perplexity scores, memory footprints, and throughput benchmarks side by side. Read the blog post →
RAMP Paper
Paper · arXiv
Mar 18, 2026
An off-policy Soft Actor-Critic (SAC) framework that learns per-layer bit-width assignments to minimize perplexity under a global bit budget. It achieves 5.54 perplexity at 3.68 GB on Llama 2 7B, outperforming uniform 4-bit AWQ, and transfers zero-shot to Llama 2 13B and Mistral 7B.
EACL 2026
Paper Accepted · EACL 2026
Apr 2026
Accepted at the FEVER Workshop @ EACL 2026 and now on ACL Anthology. The paper, co-authored with Kailash Talreja and Saurabh Jha, proposes a diffusion-based generative stability method for automated fact verification.
ARC-AGI-3
Competition · Ongoing
2026
Participating in François Chollet's ARC-AGI-3 challenge, one of AI's hardest benchmarks for fluid intelligence. I have submitted a random-agent baseline and am now exploring learning-based approaches. Read the blog post →