Running a ChatGPT-level AI Locally for $100
Andrej Karpathy, the former Director of AI at Tesla and a founding member of OpenAI, has once again dropped a project that demystifies AI for the rest of us. His latest release, nanochat, is a minimalist implementation that shows how to run capable Large Language Models (LLMs) on consumer hardware.
Why This Matters
For a long time, the narrative has been that you need NVIDIA H100s to do anything meaningful with AI. While true for training massive models, inference (running them) is becoming surprisingly accessible.
"The best ChatGPT that $100 can buy." — Andrej Karpathy
What is Nanochat?
nanochat is a lightweight inference engine written in C. It strips away PyTorch and the Python dependency stack, letting you run open-weight models such as Llama 3 or Mistral directly on your CPU or GPU with minimal overhead.
Key Features
- Zero Dependencies: No Python hell. Just compile and run.
- Speed: Optimized for Apple Silicon (Metal) and pure C efficiency.
- Educational: The code is clean enough to read in an afternoon.
Getting Started
```shell
git clone https://github.com/karpathy/nanochat
cd nanochat
make
./nanochat model.bin
```
This project reminds us that the future of AI isn't just in the cloud—it's on our devices. As a developer, understanding how these models run at a low level is becoming a crucial skill.
Check out the repo here: github.com/karpathy/nanochat