What is AI?

A plain-English explanation for people who run companies, not computer science departments.

Christian Genco

This article builds on the concepts from:

What is a computer?
You use one every day but probably can't explain how it works. Here's why understanding the basics changes how you think about AI.

Most people treat "artificial intelligence" like it's some new, exotic thing. It isn't. I'd argue you've been using AI your entire life.

Here's my definition: artificial intelligence is human-like intelligence that exists outside a human brain.

That's it. And by that definition, AI is ancient.

You've been using AI since grade school

Think about a piece of paper and a pencil. You can solve math problems on paper that you can't solve in your head. The paper isn't alive. It isn't thinking. But it's holding intermediate results, keeping track of carries, letting you work through logic you'd otherwise lose track of. It's doing cognitive work outside your brain.

An abacus does even more. Slide the beads and it performs arithmetic for you. A shopkeeper in 1200 AD didn't need to multiply in her head; the abacus handled it. Was that artificial intelligence? By my definition: yes. She offloaded a cognitive task to a non-brain thing, and the non-brain thing handled it.

A book is externalized knowledge. A library is externalized collective memory. A calculator is externalized math. A spreadsheet is an externalized financial analyst, at least for the simple stuff.

The whole history of tools is the history of offloading cognitive work to things that aren't brains.

The official version (1956)

In the summer of 1956, a group of researchers got together at Dartmouth College and coined the term "Artificial Intelligence." They had a much narrower definition than mine. They wanted to build machines that could do things we'd call intelligent if a person did them: play chess, prove theorems, understand language, learn from experience.

The early work was all rule-based. Programmers would sit down and hand-code the logic: if the opponent's queen is threatening your king, move the king. They called this approach "expert systems" because you'd essentially interview a human expert, write down all their rules, and teach the computer to follow them.
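To make "hand-code the logic" concrete, here's a toy sketch in Python of how an expert system is built. The domain, rules, and thresholds are all invented for illustration; a real system would encode hundreds of rules dictated by a human expert:

```python
# A toy "expert system": every rule below was hand-written, the way a
# programmer would have transcribed rules from interviewing a human
# expert. All thresholds and rules are made up for illustration.

def diagnose(temp_f: float, coughing: bool, sneezing: bool) -> str:
    if temp_f >= 103:
        return "high fever: see a doctor"
    if temp_f >= 100 and coughing:
        return "possible flu"
    if sneezing and not coughing:
        return "probably allergies"
    # Every situation the programmer failed to anticipate lands here,
    # and fixing it means writing yet another rule by hand.
    return "no rule matches: needs a new rule"

print(diagnose(101.2, coughing=True, sneezing=False))   # possible flu
print(diagnose(98.6, coughing=False, sneezing=False))   # no rule matches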

It worked, kind of. These systems could play decent chess and diagnose certain diseases. But they were brittle. Every edge case needed a new rule. The programmer had to anticipate everything. And it turns out most of what humans know is really hard to express as explicit rules. Try writing down the complete set of instructions for recognizing a face, or understanding sarcasm, or knowing when someone is about to cry. You can't, because you don't actually know how you do those things. You just... do them.

The field went through cycles. A burst of excitement and funding, then the realization that the hard problems were way harder than expected, then funding dried up and researchers scattered. These downturns got a name: AI winters. There were two big ones, in the mid-1970s and late 1980s. Each time, the surviving researchers quietly kept working, and each time they came back with better ideas.

Machine learning: let the data write the rules

The big idea that pulled AI out of its second winter was simple: stop writing rules. Let the machine figure them out.

Instead of a programmer sitting down and coding "if pixel pattern X then it's a cat," you show the machine ten thousand photos labeled "cat" and ten thousand photos labeled "not cat," and it finds the patterns itself.

This is machine learning. The programmer doesn't need to know the rules. They don't need to be able to explain how to recognize a cat. They just need examples, and the machine discovers the rules on its own.
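Here's a minimal sketch of that workflow, using scikit-learn and made-up numeric "features" standing in for real photos (a real classifier would see raw pixels and need far more than four examples):

```python
# A minimal sketch of supervised machine learning with scikit-learn.
# Each "photo" is faked as two numbers (imagine ear pointiness and
# whisker count); the labels are supplied by a human.
from sklearn.linear_model import LogisticRegression

X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]   # example features
y = ["cat", "cat", "not cat", "not cat"]                # example labels

model = LogisticRegression()
model.fit(X, y)   # the machine finds the dividing rule itself

print(model.predict([[0.85, 0.75]]))   # ['cat']
```

Notice that nobody wrote a single if-statement about cats. The rule came out of the examples.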

But how does the machine actually "discover" the rules? It uses a technique that's pretty intuitive once you see it.

Gradient descent: finding the bottom of a hill you can't see

Imagine you're standing in hilly terrain, but you're blindfolded. You want to get to the lowest point. All you can do is take a step, feel whether you went downhill or uphill, and decide where to step next.

You'd probably do something like this: take a step to the right. Did you go downhill? Great, keep going right. Did you go uphill? Turn around and go left. As the slope gets flatter, take smaller steps so you don't overshoot the bottom. Eventually you stop moving because every direction feels flat. You've found a valley.

That's gradient descent, and it's the core technique behind nearly all modern machine learning. Try it yourself:

🎯 Interactive Demo: Gradient Descent

A dial to set position, a button to drop a ball, and a hidden curve to explore. Find the lowest point.

Coming soon
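Until the demo ships, here's the blindfolded strategy as a minimal Python sketch. The hidden curve f(x) is invented for illustration, and this version feels out the slope by probing a step in each direction, to match the analogy (real gradient descent computes the slope directly with calculus):

```python
# The blindfolded-hiker strategy in one dimension. The terrain f(x)
# is "hidden" in the sense that we only ever evaluate it at a point;
# we never get to look at the whole curve.

def f(x):
    return (x - 3) ** 2 + 1   # a valley whose bottom sits at x = 3

x = 0.0      # starting position
step = 0.5   # how far to probe

for _ in range(100):
    if f(x + step) < f(x):    # downhill to the right?
        x += step
    elif f(x - step) < f(x):  # downhill to the left?
        x -= step
    else:
        step *= 0.5           # uphill both ways: take smaller steps

print(round(x, 3))   # ~3.0, the bottom of the valley
```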

In a real machine learning model, the "terrain" is the model's error: how wrong its predictions are. The "position" is the setting of all the model's internal dials (called parameters or weights). Training the model means twisting those dials, checking how wrong the model is, and twisting them again, over and over, trying to get the error as low as possible.

Each time you check the error, that's like dropping a ball. It costs time and compute. So you want a strategy that finds the bottom without wasting drops, which is why you follow the slope downhill instead of guessing randomly.

A cat-vs-not-cat image classifier might have millions of these internal dials. Gradient descent is how you tune all of them at once, feeling your way downhill in a million-dimensional space. You can't picture a million dimensions, but the math works the same way as our one-dimensional demo above.
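Here's that idea at a scale you can read: a model with just two dials, w and b, trained by real gradient descent. The data points are invented, scattered around the line y = 2x + 1, and the slope formulas come from calculus on the squared error:

```python
# A two-dial model (w and b) trained by gradient descent.
# The data is made up: points scattered near the line y = 2x + 1.
data = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8)]   # (x, y) examples

w, b = 0.0, 0.0        # the dials start at arbitrary settings
learning_rate = 0.01   # how far to twist per step

for _ in range(5000):
    # The error's slope with respect to each dial (from calculus on
    # the mean squared error of the predictions w*x + b).
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= learning_rate * grad_w   # twist each dial a little downhill
    b -= learning_rate * grad_b

print(round(w, 2), round(b, 2))   # ~1.94 and ~1.15: close to 2 and 1
```

A cat classifier is this same loop with millions of dials instead of two.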

This approach has been quietly running the world for the past twenty years. Spam filters use it. Netflix recommendations. Fraud detection at your bank. Every time you search Google. Machine learning was making decisions about your life long before anyone started talking about ChatGPT.

But each of those models does one specific thing. A spam filter can't recommend movies. A fraud detector can't filter spam. They're narrow specialists. What changed recently is that people figured out how to train a general-purpose model: one that can handle almost any language task you throw at it.

What people mean by "AI" today

When someone in a boardroom says "AI" in 2026, they're almost never talking about abacuses or spam filters. They mean one of three things:

  1. Large language models (LLMs): the technology behind ChatGPT, Claude, Gemini, and the rest. You type in a question, it types back an answer. These are trained using the same gradient descent technique described above, but at a staggering scale: trillions of words of training data, billions of internal dials, months of training on thousands of specialized chips.
  2. Generative models: systems that create images (Midjourney, DALL·E), video (Sora, Veo), music, and code. Same underlying ideas, different outputs.
  3. Agents: LLMs hooked up to tools. An LLM by itself can only produce text. But give it the ability to search the web, run code, read files, send emails, and call APIs, and it becomes something that can actually do things in the real world. The LLM thinks about what to do, picks a tool, uses it, looks at what happened, then thinks about what to do next. Loop until the task is done. (A sketch of that loop follows this list.)
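Here's roughly what that loop looks like in code. The llm() function and both tools are hypothetical stubs rather than any real API; the shape of the loop (think, act, observe, repeat) is the only part taken from how agents work:

```python
# The agent loop: think, pick a tool, use it, look at the result, repeat.
# llm() and both tools are fake stubs; a real agent would call an actual
# model API and real tools (web search, a code runner, email, etc.).

def llm(transcript: str) -> dict:
    # A real call would send the transcript to a model and get back
    # its chosen next step. This stub simply declares the task done.
    return {"tool": "finish", "answer": "stub answer"}

tools = {
    "search_web": lambda query: f"(results for {query!r})",
    "run_code": lambda source: f"(output of {source!r})",
}

def run_agent(task: str) -> str:
    transcript = f"Task: {task}\n"
    while True:
        step = llm(transcript)                  # think: what next?
        if step["tool"] == "finish":
            return step["answer"]               # loop until done
        result = tools[step["tool"]](step.get("input", ""))  # act
        transcript += f"{step['tool']} -> {result}\n"        # observe

print(run_agent("summarize this week's sales numbers"))
```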

These all feel qualitatively different from the machine learning that came before. Earlier ML models were narrow specialists you'd never want to talk to. LLMs are general-purpose and conversational. You interact with them in plain English. That's why they've captured the public imagination in a way that spam filters never did.

What this means for your business

Understanding these layers, from ancient tools to rule-based systems to machine learning to LLMs and agents, gives you a much better frame for evaluating what AI can and can't do for your company.

When a vendor tells you their product is "AI-powered," you can now ask: what kind of AI? Is it a hand-coded rules engine from the 1980s with a chatbot UI bolted on? A narrow ML model trained on a specific task? Or something built on top of a modern LLM?

When an employee says "we should use AI for this," you'll know what questions to ask. When a competitor announces an "AI initiative," you'll have a better sense of what that might actually mean versus what's just marketing.

If you want to go deeper on how LLMs actually work (what's happening under the hood when you type a question into ChatGPT), I wrote a companion piece that builds the intuition from scratch: