Arpit Singh Gautam

Data Scientist | Researcher

I am a Data Scientist in the CSG CTO Lab at Dell Technologies, where I focus on optimization, efficient inference, and scalable AI systems. My work spans generative AI, reinforcement learning, neural architecture search, and distributed model serving, with an emphasis on building robust, efficient systems that work at scale.

I have developed systems for disaggregated serving, speculative decoding, and KV cache optimization that achieve significant improvements in throughput and latency over existing inference frameworks. My experience also includes building reinforced reasoning models for Text-to-SQL, diffusion-based fact verification systems, and multimodal models for medical imaging.

I am passionate about research and have published across areas such as diffusion models for fact verification, theory of mind distillation, hybrid neural networks for medical imaging, hyperspectral image classification, and large-scale SQL reasoning. My work has been accepted at premier venues including AAAI and ICCCNT, and submitted to the EACL FEVER Workshop.

I enjoy mentoring students and have taught at the IBM Z Datathon, where I guided over 3,300 learners. I have also delivered technical sessions, mentored hackathons, and helped develop open-source educational content.

Research Interests: Systems for Machine Learning and Distributed AI; Efficient and Hardware-Aware Inference Optimization; Reasoning-Centric and Aligned Large Language Models; Reinforcement Learning for Foundation Models.

12 Hackathon Wins

10+ Internships

20k+ Conf. Views

5 Papers/Talks

🔥 News & Highlights

2025 Paper "The Energy of Falsehood" submitted to EACL 2026 FEVER Workshop.
2025 Paper "Faithful Theory of Mind Distillation" accepted at AAAI 2026 ToM Workshop.
2025 Released CogniSQL-R1-Zero (Reasoning Model for Text-to-SQL) via arXiv.

💼 Professional Experience

Data Scientist, CSG CTO Lab Dell Technologies, Bengaluru | Jul 2025 - Present

Engineered a distributed inference system using disaggregated serving, speculative decoding, and KV cache quantization, achieving 4x throughput and reducing latency from 2.5s to <1s relative to vLLM baselines (5+ patents awaiting filing).
Developed a reinforcement-learning-based framework for post-training quantization of LLMs that integrates RL-driven neural architecture search, outperforming baseline methods by 2.6x in compression with minimal perplexity loss (paper in progress).
Designed a diffusion-based generative stability method for automated fact verification, improving robustness over discriminative baselines and detecting confidently incorrect claims (paper submitted to the EACL 2026 FEVER Workshop).
Studied reasoning transfer from larger to smaller models using sequential SFT and preference-based refinement, showing clear gains in reasoning fidelity and alignment (paper accepted at the AAAI 2026 ToM Workshop).
Currently designing a State Space Model (Mamba) based reranker to mitigate adversarial attacks and enhance robustness in Retrieval-Augmented Generation (RAG) Systems.

Data Science Intern Dell Technologies, Bengaluru | Jul 2024 - Jun 2025

Created CogniSQL-R1-Zero, a reasoning model for Text-to-SQL, using GRPO reinforcement learning and DeepSpeed distributed training on a 7B backbone across 4 A100 GPUs (paper released on arXiv).
Achieved state-of-the-art execution accuracy on the BIRD benchmark, outperforming 236B+ parameter models by avoiding intermediate supervision and complex reward shaping.
Built an agentic framework incorporating self-healing, test-time scaling, and CoT reasoning, increasing execution accuracy by 30% on proprietary datasets (Copilot now in production).

Computer Vision Research Engineer Stealth Startup (Remote, US) | Mar 2024 - Jun 2024

Developed a real-time theft detection system using SlowFast networks and 3D CNNs, specifically optimizing the architecture for low-latency edge device deployment.

AI Research Intern Renix Informatics | Aug 2023 - Oct 2023

Researched and implemented two new features for the DocX (Document AI) product.
Used SonarQube to analyse and address issues in the legacy DocX code.

AI/ML Developer + Mentoring Intern WictroniX | Jun 2023 - Aug 2023

Contributed to the AI/ML Advance Team, working with drone footage captured 120 m above ground.
Led a group of mentors in guiding more than 30 AI/ML interns.

AI/ML Apprentice IBM Z Systems | Jan 2023 - Jun 2023

Developed and executed two end-to-end machine learning projects (Omnizenon & KnowCrimez).
Applied ML techniques to enhance data analysis and deployed a web user interface.

Machine Learning Intern Suvidha Foundation | Mar 2023 - Apr 2023

Researched abstractive text summarization using Hugging Face Transformers.
Compared accuracy across different LLMs.

Backend Developer Trainee Safcurl | Jul 2022 - Jan 2023

Completed sprint tickets, built models, implemented APIs, and maintained the database.

🔬 Academic Research Experience

Undergraduate Researcher Manipal University Jaipur | Feb 2024 – Jun 2025

Conducted research under Dr. Vivek Bhardwaj, Associate Professor, to develop hybrid CNN + GRU + LSTM models for lymphoma detection from histopathology images, achieving strong accuracy gains (paper accepted at the 16th ICCCNT at IIT Indore, 2025).
Conducted research under Dr. Jayesh Gangrade, Associate Professor, to create a hybrid sentiment classifier blending transformer embeddings, attention-driven recurrence, and numerical feature fusion (paper submitted to the Discover Computing journal).
Conducted research under Mr. Rajesh Kumar, Assistant Professor Selection Grade, to build a real-time QUIC traffic classifier using three raw features with LightGBM and SHAP- and LIME-based explainability (paper accepted at the 16th ICCCNT at IIT Indore, 2025).

Winter Research Intern VECC (Dept. of Atomic Energy), Kolkata | Dec 2023 - Feb 2024

Conducted research under Ushnish Sarkar (Scientific Officer F) to develop a reinforcement learning (TD3 network) based autonomous navigation system for robots operating in nuclear radiation leak scenarios.

Summer Research Intern IIT BHU, Varanasi | Jun 2023 - Jul 2023

Conducted research under Prof. Rajeev Srivastava (Ex-Dean, HoD@CSE) on hyperspectral image classification using a QUH dataset of over 10^6 samples, developing a hybrid deep learning model that achieved 91.90% accuracy and surpassed standard benchmarks.

📝 Publications

The Energy of Falsehood: Generative Calibration of Fact Verification via Diffusion Models

Arpit Singh Gautam, Kailash Talreja, Saurabh Jha

Submitted to the FEVER: Ninth Workshop on Fact Extraction and VERification at EACL 2026

Faithful Theory of Mind Distillation: Why Preference Based Refinement Improves Imitation

Arpit Singh Gautam, Saurabh Jha

Accepted at the Advancing Artificial Intelligence through Theory of Mind Workshop at AAAI 2026

CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation

Kushal Gajjar, Harshit Sikchi, Arpit Singh Gautam, Marc Hammons, Saurabh Jha

Released via arXiv due to model confidentiality (paper and 2 datasets made public)

Enhancing Lymphoma Detection Using Multi-Layer Hybrid Neural Networks

Arpit Singh Gautam, Satyam Kumar, Nishad Khade, Vivek Bhardwaj

Oral presentation at the 16th ICCCNT, IIT Indore (2025)

🎓 Education

Manipal University Jaipur, India | Oct 2021 - Jun 2025

B.Tech. (Hons) Computer Science & Engineering, GPA: 8.56/10.0
Specialization: Artificial Intelligence and Machine Learning

7x Academic awards: Dean's List Award (Academics), 4x Student Excellence Award
Honored by the President with the title of MUJ's Wizard Programmer (Gold Medal)

👨‍🏫 Teaching Experience

Teaching Assistant Manipal University Jaipur | Jan 2024 – May 2024

Assisted professors with designing tutorials and holding lab sessions for the following courses: AI3241 Reinforcement Learning (Dr. Animesh Kumar, Spring 2024), AI3231 Computer Vision and Pattern Lab (Prof. Harish Sharma, Spring 2024).

Subject Matter Expert IBM, Global | Oct 2023 – Jun 2024

Instructor at the IBM Z Datathon, guiding 3,300+ students in setting up LinuxONE for AI development (Oct 2023); later served as a Subject Matter Expert, mentoring winning teams on mainframe-based machine learning integration (Jan 2024 to Jun 2024).

🤝 Volunteering and Service

President & Co‑Founder AIML Community MUJ | Sep 2023 - Ongoing

Attracted 300+ members in the first year. Organized events and hackathons on ML, CV, and NLP.

IBM Z Student Ambassador IBM Z Systems | Aug 2023 - Ongoing

Speaker at IBM Z Day. Promote mainframe tech globally. Acquire technical certifications.

Student Ambassador Leader Streamlit | Aug 2023 - Ongoing

One of 10 leaders among 216 ambassadors. Work closely with the Streamlit team to improve the platform.

Deputy Head of Design Phi Phenomenon MUJ | Jun 2022 - Jul 2023

Working Member CodeChef MUJ Student Chapter | Jun 2022 - Jan 2023

🏆 Honors and Awards

I actively take part in hackathons, having completed 17 of them with 12 wins 🏆.

Dell Technologies Industry Hackathon
🏆 1st Place Overall (500+ participants) | 2024

International Hackathons (6 Wins)
W3B GreenTech, CalCodeFest, Friday Night Firefight, SacHacks V, MLH's Bon Hacketit, SacHacks IV

National Hackathons (2 Wins)
The 418 Hackathon, The Latest Cut

College-Level Hackathons (3 Wins)
ACM MiniHacks 3.0, T-Hunt Hackathon, Panacea Clone Wars

SacHacks IV & V (UC Davis)
🏆 Best Use of IBM Z Winner (2 times in a row) | 2022-23

💬 Testimonials

"I highly recommend Arpit as an exceptional talent in the AI and Data Science space... He swiftly grasped the project’s complex, high-impact vision and approached challenges with an innovative, research-driven mindset."

Filippos Paraskevas Zygouris

Cloud Solution Architect | AI & Data Architect

"Arpit is very energetic, hardworking student with motivation to build some of the cool applications using Data Science and Artificial Intelligence (AI)... I wish him all the very best in his future endeavours."

Chandra Shekhar Reddy Potula

Head Of AI IaaS

"I mentored Arpit as part of IBM Z Mentorship program. He is self motivated, hardworking, highly skilled colleague. I also appreciate the relevancy of problems that he picks to pursue..."

Jidhu Mohan Mattummal Valappil

CEO @ DigiDxDoc | Ex-BOSCH, IBM

"Worked with Arpit, he is really determined and always keen to contribute in every aspect possible and has deep knowledge in field of Machine Learning."

Raghav Kapoor

Software Engineering at VickyBytes

🛠 Skills and Interests

TECHNICAL: Python (TensorFlow, PyTorch, TRL, LangChain, OpenCV, Scikit-learn), Java, SQL, C++

CORE COMPETENCIES: LLMs, Reasoning, RAG, Distributed Training & Scaling, Quantization, RL, CV

INTERESTS: Hiking, FPS Gaming, Chess, Boxing, Fiction Reading