A Brief History of AI

From rule-based systems to AI agents
A 2-minute journey through AI history: rules → machine learning → transformers → agents

To understand where agents are today, it helps to know where we came from. This isn't a textbook — just the highlights that explain why things work the way they do now.

The early days: rules and logic (1950s–1990s)

The first "intelligent" computer programs weren't learned — they were handcrafted. Programmers wrote explicit rules: if X then Y. These systems could play chess, diagnose diseases, and answer narrow questions — but only within the exact boundaries their authors had anticipated.

Feature unlocks: Chess programs, medical diagnosis systems, language translation

Limitation: Brittle — one edge case outside the rules and they'd fail completely.

Machine learning arrives (1990s–2010s)

The breakthrough: instead of writing rules, let computers learn them from data.

Show a machine learning model 10,000 pictures of cats and 10,000 pictures of dogs, and it learns to tell them apart — without anyone writing a single rule about ears or tails.

Feature unlocks: Image recognition, spam filtering, recommendation engines, early voice assistants

Limitation: Each model learned exactly one task — nothing more.

The transformer moment (2017)

In 2017, Google researchers published "Attention Is All You Need" — and it quietly changed everything.

The transformer architecture let models learn language at a scale and quality that had never been possible. Instead of learning one task, these models could learn from the entire internet and develop a general understanding of language, reasoning, and knowledge.

Feature unlocks: Parallel processing of whole sequences (instead of word-by-word like earlier models), context-aware attention, foundation for GPT/Claude/Gemini

Limitation: Required massive compute and data.

Large Language Models emerge (2020–2022)

OpenAI's GPT-3 in 2020 was the first widely accessible demonstration that AI could write, reason, code, and converse at a genuinely useful level. It wasn't perfect — but it was shockingly capable for something that simply predicted the next word.

Feature unlocks: Few-shot learning, code generation, creative writing, natural language understanding

Limitation: Could only respond — not act.

ChatGPT and the mainstream moment (2022–2023)

In November 2022, OpenAI released ChatGPT. It reached an estimated 100 million users within two months, making it the fastest-growing consumer application in history at the time.

For most people, this was their first real experience with a capable AI. But for researchers, it was a proof of concept for something bigger: if you could chat with an AI, could you also give it tools and let it work autonomously?

Feature unlocks: Conversational interface, accessible to non-technical users, proof that AI can be useful to everyone

Limitation: Still just a chatbot — no tools, no action.

The agent era begins (2023–present)

The leap from "smart chatbot" to "autonomous agent" required two things that arrived in 2023–2024:

  1. Tool use — models that could call APIs, browse the web, run code, and interact with external systems
  2. Longer context — models that could hold an entire codebase, document, or conversation history in their working memory

Feature unlocks: Multi-step planning, autonomous workflows, writing and deploying code, operating business workflows

Combine a capable model with tools and enough memory, and you have an agent that can plan and execute multi-step work — not just answer a question.
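That loop of planning, acting, and reacting can be sketched as a toy program. Everything below is illustrative: `fake_model` stands in for a real LLM API call, and `calculator` stands in for a real tool, so this shows the shape of an agent loop rather than any particular framework's implementation.

```python
# A conceptual sketch of the agent loop: the model either calls a tool or
# returns a final answer, and each tool result is fed back into its context.

def calculator(expression: str) -> str:
    """A toy 'tool' the agent can call (no builtins, so only arithmetic)."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_model(history):
    """Stand-in for an LLM: decides the next step from the conversation."""
    last = history[-1]
    if last.startswith("user:"):
        # Plan: the question needs math, so request the calculator tool.
        return {"tool": "calculator", "input": "6 * 7"}
    # A tool result is already in context, so finish with an answer.
    return {"answer": "The result is " + last.split(": ")[1]}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = [f"user: {task}"]
    for _ in range(max_steps):
        step = fake_model(history)
        if "answer" in step:                          # model is done
            return step["answer"]
        result = TOOLS[step["tool"]](step["input"])   # execute the tool
        history.append(f"tool result: {result}")      # feed result back
    return "Gave up after too many steps."

print(run_agent("What is 6 times 7?"))  # The result is 42
```

Real agents replace `fake_model` with a model that emits structured tool calls, but the control flow, a loop of model → tool → model until a final answer, is the same.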


Where are we now?

We're at the very beginning. The models of 2024 are roughly as capable as a very smart intern — excellent at well-defined tasks with good instructions, but still needing oversight for high-stakes or truly novel decisions.

The trajectory is steep. What agents could do in 2023 would have seemed like science fiction in 2020. What they'll do in 2026 is still being written.

Next: Understanding the core concepts →