Instructors: Prof. Ion Stoica
TAs: Ziming Mao
Schedule: Monday, 9:00 AM - 11:59 AM
Dates: 1/20/2026 - 5/8/2026
Location: Soda 310
Level: Graduate
Units: 4
Course Description
The central inquiry of this course is: Can AI-based tools enable a small team to build
real-world systems orders of magnitude faster than traditional methods?
The rapid advancement of AI has led to significant successes in systems research, particularly in
optimizing the performance of existing systems. In parallel, AI-powered code assistants have
demonstrated increasing capability, often generating entire pull requests for application-level
code. However, their utility for the more demanding task of building robust, concurrent, and
high-performance systems from the ground up remains a largely open question.
This course tackles this challenge. We will investigate the end-to-end lifecycle of building new
systems and subsystems using the latest generative AI tools, as well as improving these tools. Our
exploration will include:
- Designing new architectures that are more amenable to AI tools.
- Developing new AI tools that target building systems or systems components from scratch.
- Developing fast, accurate simulators.
- Generating formal specifications for protocols and components.
- Guiding iterative performance optimization and tuning.
Students will work on hands-on projects to build systems such as distributed collective
communication libraries, distributed storage systems, distributed inference engines, new kernels for
new accelerators, and more.
Note: Here we use "systems" broadly to include databases, distributed systems, networking,
programming languages, and operating systems.
📢 Announcements
Welcome to Spring 2026! Please check back for updates on readings and assignments.
If you haven't done so already, please join our Slack here: CS294-265
Spring 2026 Slack.
Course Schedule
Note: This syllabus is tentative and subject to change. Please check back regularly for
updates.
Week 1 (Jan 19): Academic Holiday

Week 2 (Jan 26): Class overview, logistics, project suggestions
Readings:
- AlphaEvolve: A coding agent for scientific and algorithmic discovery
- OpenEvolve: Automated Discovery of Algorithms and Systems
- ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution
- Glia: A Human-Inspired AI for Automated Systems Design and Optimization

Week 3 (Feb 2): Ongoing projects
Readings:
- AdaEvolve: Adaptive LLM-Driven Zeroth-Order Optimization
- Evolve^2
- Evolving SkyRL into a Highly-Modular RL Framework
- SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent
- GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
- KISS: A Language Model-based Interactive Software Development Environment

Week 4 (Feb 9): Project proposal presentations

Week 5 (Feb 16): Academic Holiday

Week 6 (Feb 23): Repo-level coding and repair
Speaker: Koushik Sen
Readings:
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- Empirical Evaluation of Generalizable Automated Program Repair with Large Language Models
- OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
- Rethinking Verification for LLM Code Generation: From Generation to Testing
Optional:
- NExT: Teaching Large Language Models to Reason about Code Execution
- Code Repair with LLMs gives an Exploration-Exploitation Tradeoff

Week 7 (Mar 2): Formal Verification & Reliability
Readings:
- Chain-of-Verification Reduces Hallucination in Large Language Models
- Dafny as Verification-Aware Intermediate Language for Code Generation
- Solving a Million-Step LLM Task with Zero Errors
- Models That Prove Their Own Correctness
- Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK
Optional:
- Don't Trust: Verify — Grounding LLM Quantitative Reasoning with Autoformalization
- Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-Aware Formal Verification
- Position: Trustworthy AI Agents Require the Integration of Large Language Models and Formal Methods
- Specifications: The missing link to making the development of LLM systems an engineering discipline

Week 8 (Mar 9): Simulations
Speaker: Rishabh Iyer
Readings:
- Fast End-to-End Performance Simulation of Accelerated Hardware-Software Stacks
- Analyzing Metastable Failures
- SimAI: Unifying Architecture Design and Performance Tuning for Large-Scale Large Language Model Training
- Zero-Shot Cost Models for Out-of-Distribution Query Workloads
- CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation

Week 9 (Mar 16): AI-driven kernel optimizations
Speaker: Bing Xu
Readings:
- KernelBench: Can LLMs Write Efficient GPU Kernels?
- Astra: A Multi-Agent System for GPU Kernel Performance Optimization
- Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling
- EvoEngineer: Mastering Automated CUDA Kernel Code Evolution with Large Language Models
- Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization
- VibeTensor: System Software for Deep Learning, Fully Generated by AI Agents

Week 10 (Mar 23): Spring Recess

Week 11 (Mar 30): AI-driven training and inference optimizations
Speaker: Amir Yazdan Bakhsh
Readings:
- Reasoning Compiler: LLM-Guided Optimizations for Efficient Model Serving
- Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
- Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
- BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
- KernelBlaster: Continual Cross-Task CUDA Optimization via Memory-Augmented In-Context Reinforcement Learning

Week 12 (Apr 6): From academia to industry and everything in between
Speakers: Eric Liang, Martin Casado, Azalia Mirhoseini

Week 13 (Apr 13): AI-driven networking optimizations
Speaker: Christopher Fletcher
Readings:
- AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training
- Learning Production-Optimized Congestion Control Selection for Alibaba Cloud CDN
- NetLLM: Adapting Large Language Models for Networking
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches

Week 14 (Apr 20): AI for databases and databases for AI
Speaker: Aditya Parameswaran
Readings:
- Conformal Prediction for Verifiable Learned Query Optimization
- Data-Agnostic Cardinality Learning from Imperfect Workloads
- NeurDB: On the Design and Implementation of an AI-powered Autonomous Database
- ZeroCard: Cardinality Estimation with Zero Dependence on Target Databases - No Data, No Query, No Retraining
- 𝜆-Tune: Harnessing Large Language Models for Automated Database System Tuning
Optional:
- Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing
- Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First

Week 15 (Apr 27): Project presentations
Grading
Paper presentations and reviews: 20%
Class discussions and participation: 20%
Project: 60%
Course Resources
📁 Course Materials
Access homework assignments, signup sheets, and other course materials:
Course Drive Folder
Academic Integrity
All work submitted must be your own. Collaboration is encouraged for understanding concepts and
debugging, but code and written reports must be completed independently unless explicitly stated
otherwise. Use of AI tools for completing assignments must be disclosed and will be discussed as
part of the course content.