CS + AI/ML student

Devaansh Pathak

AI/ML and systems builder interested in reliable software, thoughtful evaluation, and practical research tools.

I work across LLM agents, reinforcement learning environments, evaluation systems, AI infrastructure, and applied engineering. This site collects my projects, writing, publications, and research notes as they develop.

View projects Research notes

/DevaanshPathak /in/devaanshpa Email CV

Profile

Research-minded engineering

I like problems where models, tools, data, and systems meet, especially when behavior needs to be measured carefully rather than just demoed.

I use this space as a working record of what I am building and learning: research prototypes, software projects, implementation notes, and longer-form writeups. The common thread is a preference for systems that can be inspected, tested, and improved over time.

Interests

Technical interests

A few areas I keep returning to while building projects and reading research.

Reliable LLM systems

Reinforcement learning environments

Evaluation pipelines and benchmarks

AI infrastructure and tooling

Full-stack product engineering

Failure analysis and debugging

Current research thread

SRE-Zero

An environment-grounded benchmark for evaluating reliable tool-using agents in simulated incident-response workflows. The project focuses on sequential decisions, safe tool use, partial evidence, remediation quality, and operational reliability metrics.

LLM Agents, RL Environments, Evaluation, AI Systems

Project page

Writing

Latest blog posts

Research diary entries, project notes, and implementation writeups.

All posts

May 13, 2026 · 6 min read

Starting SRE-Zero: Building RL Environments for Reliable Tool-Using AI Agents

Why I’m starting a long-term research project on environment-grounded evaluation and training for reliable LLM agents.

SRE-Zero, LLM Agents, Reinforcement Learning, AI Systems