CHASING AGI
└[∵┌]
Motivation
Keeping up with AI progress is hard. There is a fair amount of noise and there’s also a lot of signal1. It’s especially hard to follow when you don’t take notes of what’s happening, or a clear idea of where progress comes from. I think jumping between abstraction layers could be a good way to build intuition around the dynamics of this industrialization of intelligence. So I want to dedicate a series of posts that will indeed jump between more low-level technicalities and and higher-order ideas to improve my own and other people’s understanding of what is actually happening2. This will probably be at times a rough scratchpad and some other times a more formal note-taking excercise. I don’t really have a very clear idea of how it will turn out, but I will at least try to nail down a desiderata for the series:
- The info will not be completely self-contained (or else I will never finish), but I will try to add references whenever possible.
- There should be a format indicating ideas that are speculative. I’ll think about this when the time comes. I will definitely try to be very clear whenever I’m adding my personal perspective.
- The notes should follow empricial work more than theory. This will probably help with point 2.
- There should be a rough categorization strategy that serves as a guideline for the series.
Categorization
I will try to use the following categories during the series and tag each post with a theme. Hopefully this will make it easier to navigate the series, in case anyone wants to skip to any specific bit.
- Background
- The promise of AGI
- NN essentials
- The previous paradigm
- The transformer architecture
- Scaling hypothesis
- Evals
- Text
- Pre-training
- Post-training
- Inference
- Image
- Vision
- Diffusion
- Reasoning
- RL
- Agents
- Tools
- Memory
- Other
- Embeddings
- Quantization
I think the list above might have some chronological sense, but I would not want to commit to a specific order, so I’ll consider it to be an unordered list. Also, FWIW: I think I’ll leave this base entry as a draft while I work on the rest. That way I can come back to refine the categorization :)
🍀
-
DeepSeek is an excellent example of a source of great information about the state of the art. It’s been really inspiring to see them pierce through the hype with their open-source philosophy. ↩
-
There are many other similar efforts like this, but I took inspiration from logs like TDM’s Keeping up with AGI. There were others, but I forget now. Anyway, I will probably find them again and share them in the series. ↩