Building Real-World Systems with AI from Scratch

CS294 • Spring 2026 • UC Berkeley

Instructor: Prof. Ion Stoica
TA: Ziming Mao
Schedule: Monday, 9:00 AM - 11:59 AM
Dates: 1/20/2026 - 5/8/2026
Location: Soda 310
Level: Graduate
Units: 4

Course Description

The central inquiry of this course is: Can AI-based tools enable a small team to build real-world systems orders of magnitude faster than traditional methods?

The rapid advancement of AI has led to significant successes in systems research, particularly in the performance optimization of existing systems. In parallel, AI-powered code assistants have demonstrated increasing capability, often generating entire pull requests for application-level code. However, their utility for the more demanding task of building robust, concurrent, and high-performance systems from the ground up remains a largely open question.

This course tackles this challenge. We will investigate the end-to-end lifecycle of building new systems and subsystems with the latest generative AI tools, as well as ways to improve the tools themselves. Our exploration will include:

  • Designing new architectures that are more amenable to AI-assisted development.
  • Developing new AI tools that target building systems or system components from scratch.
  • Developing fast, accurate simulators.
  • Generating formal specifications for protocols and components.
  • Guiding iterative performance optimization and tuning.

Students will work on hands-on projects to build systems such as distributed collective communication libraries, distributed storage systems, distributed inference engines, new kernels for new accelerators, and more.

Note: Here we use "systems" broadly to include databases, distributed systems, networking, programming languages, and operating systems.

📢 Announcements

Welcome to Spring 2026! Please check back for updates on readings and assignments.

If you haven't done so already, please join our Slack here: CS294-265 Spring 2026 Slack.

Course Schedule

Note: This syllabus is tentative and subject to change. Please check back regularly for updates.

Readings listed for each week are a tentative list, subject to change.

Week 1 (Jan 19): Academic Holiday

Week 2 (Jan 26): Class overview, logistics, project suggestions
  Readings:
  • AlphaEvolve: A coding agent for scientific and algorithmic discovery
  • OpenEvolve: Automated Discovery of Algorithms and Systems
  • ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution
  • Glia: A Human-Inspired AI for Automated Systems Design and Optimization

Week 3 (Feb 2): Ongoing projects
  Readings:
  • AdaEvolve: Adaptive LLM-Driven Zeroth-Order Optimization
  • Evolve^2
  • Evolving SkyRL into a Highly-Modular RL Framework
  • SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent
  • GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
  • KISS: A Language Model-based Interactive Software Development Environment

Week 4 (Feb 9): Project proposal presentations

Week 5 (Feb 16): Academic Holiday

Week 6 (Feb 23): Repo-level coding and repair
  Readings:
  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
  • SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
  • Empirical Evaluation of Generalizable Automated Program Repair with Large Language Models
  • OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
  • Rethinking Verification for LLM Code Generation: From Generation to Testing
  Optional:
  • NExT: Teaching Large Language Models to Reason about Code Execution
  • Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
  Speaker: Koushik Sen

Week 7 (Mar 2): Formal Verification & Reliability
  Readings:
  • Chain-of-Verification Reduces Hallucination in Large Language Models
  • Dafny as Verification-Aware Intermediate Language for Code Generation
  • Solving a Million-Step LLM Task with Zero Errors
  • Models That Prove Their Own Correctness
  • Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK
  Optional:
  • Don't Trust: Verify — Grounding LLM Quantitative Reasoning with Autoformalization
  • Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-Aware Formal Verification
  • Position: Trustworthy AI Agents Require the Integration of Large Language Models and Formal Methods
  • Specifications: The missing link to making the development of LLM systems an engineering discipline

Week 8 (Mar 9): Simulations
  Readings:
  • Fast End-to-End Performance Simulation of Accelerated Hardware-Software Stacks
  • Analyzing Metastable Failures
  • SimAI: Unifying Architecture Design and Performance Tuning for Large-Scale Large Language Model Training
  • Zero-Shot Cost Models for Out-of-Distribution Query Workloads
  • CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation
  Speaker: Rishabh Iyer

Week 9 (Mar 16): AI-driven kernel optimizations
  Readings:
  • KernelBench: Can LLMs Write Efficient GPU Kernels?
  • Astra: A Multi-Agent System for GPU Kernel Performance Optimization
  • Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling
  • EvoEngineer: Mastering Automated CUDA Kernel Code Evolution with Large Language Models
  • Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization
  • VibeTensor: System Software for Deep Learning, Fully Generated by AI Agents
  Speaker: Bing Xu

Week 10 (Mar 23): Spring Recess

Week 11 (Mar 30): AI-driven training and inference optimizations
  Readings:
  • Reasoning Compiler: LLM-Guided Optimizations for Efficient Model Serving
  • Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
  • Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
  • BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
  • KernelBlaster: Continual Cross-Task CUDA Optimization via Memory-Augmented In-Context Reinforcement Learning
  Speaker: Amir Yazdan Bakhsh

Week 12 (Apr 6): From academia to industry and everything in between
  Speakers: Eric Liang, Martin Casado, Azalia Mirhoseini

Week 13 (Apr 13): AI-driven networking optimizations
  Readings:
  • AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training
  • Learning Production-Optimized Congestion Control Selection for Alibaba Cloud CDN
  • NetLLM: Adapting Large Language Models for Networking
  • TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
  Speaker: Christopher Fletcher

Week 14 (Apr 20): AI for databases and databases for AI
  Readings:
  • Conformal Prediction for Verifiable Learned Query Optimization
  • Data-Agnostic Cardinality Learning from Imperfect Workloads
  • NeurDB: On the Design and Implementation of an AI-powered Autonomous Database
  • ZeroCard: Cardinality Estimation with Zero Dependence on Target Databases - No Data, No Query, No Retraining
  • λ-Tune: Harnessing Large Language Models for Automated Database System Tuning
  Optional:
  • Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing
  • Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First
  Speaker: Aditya Parameswaran

Week 15 (Apr 27): Project presentations

Grading

Paper presentations and reviews: 20%
Class discussions and participation: 20%
Project: 60%

Course Resources

📁 Course Materials

Access homework assignments, signup sheets, and other course materials: Course Drive Folder

💬 Slack

Join our course Slack for announcements and discussions: CS294-264 Spring 2026 Slack Invite

Academic Integrity

All work submitted must be your own. Collaboration is encouraged for understanding concepts and debugging, but code and written reports must be completed independently unless explicitly stated otherwise. Use of AI tools for completing assignments must be disclosed and will be discussed as part of the course content.