Lesson 4
What AI Gets Wrong (And Why That Is OK)
AI-generated
- Understand hallucination and why AI makes things up
- Recognize common failure modes: math, counting, recent events, niche topics
- Know why verification matters even when AI sounds confident
- Develop healthy skepticism without paranoia
- Learn the "trust but verify" mindset that makes AI useful
Here is an uncomfortable truth: AI is wrong more often than it sounds wrong.
That confident, articulate response that cites specific details? It might be completely fabricated. AI can and does:
- Make up facts
- Invent citations
- Miscalculate numbers
- Confidently state things that are verifiably false
This is not a bug to be fixed in the next version. It is a fundamental characteristic of how current AI works. Understanding this makes you better at using AI effectively.
AI hallucination: a model generates content that is false but presents it as fact.
Types of Hallucinations
Fabricated citations:
- Books that do not exist
- Papers that were never published
- Quotes nobody ever said
- The citations look real (plausible names, dates, titles) but are generated, not retrieved
Invented facts:
- Specific statistics that sound precise but are made up
- "Studies show that 67% of users prefer..." might be pure fabrication
- The oddly specific percentage makes it sound researched
False confidence:
- AI rarely volunteers "I'm not sure about this"
- It states hallucinated content in the same tone as accurate information
Why does this happen? AI generates text by predicting probable next words. When it does not "know" something, it does not stop. It generates plausible-sounding text that fits the pattern of what an answer should look like. The model is optimized to produce likely text, not to check whether that text is true.
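A toy sketch can make the paragraph above concrete. This is not how a real language model is built; it is a tiny word-pair table with made-up counts, and every name in it (`NEXT_WORD_COUNTS`, `next_word`, `generate`) is hypothetical. What it shows is the shape of the problem: the generator has no branch for "I don't know", so an unfamiliar prompt still gets a continuation.

```python
import random

# Made-up "training data": how often each word followed another.
NEXT_WORD_COUNTS = {
    "the": {"study": 3, "author": 2},
    "study": {"found": 4, "shows": 1},
    "found": {"that": 5},
    "that": {"67%": 1, "users": 2},
}


def next_word(word, rng):
    """Pick a probable next word. There is no 'I don't know' branch."""
    options = NEXT_WORD_COUNTS.get(word)
    if not options:
        # Unfamiliar context: fall back to any known word instead of stopping.
        options = {w: 1 for w in NEXT_WORD_COUNTS}
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(start, length, seed=0):
    """Generate `length` more words after `start`, one prediction at a time."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        out.append(next_word(out[-1], rng))
    return " ".join(out)


print(generate("the", 5))
# A word the table has never seen still produces a full-length continuation.
print(generate("zzyzx", 5))
```

The second call is the interesting one: the output has the same length and the same surface fluency as the first, even though nothing in the table supports it.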
Good news: Hallucination rates have dropped significantly.
- Four major models now score below 1% on standard benchmarks
- Some models saw drops of more than 60% in the past year
Bad news: Rates vary dramatically by task:
| Task type | Hallucination rate |
|---|---|
| General factual (top models) | < 1% |
| Legal information (best models) | ~6% |
| Legal information (all models) | up to 18% |
| Specialized/technical | Higher risk |
2025 researcher consensus: Aim for "calibrated uncertainty." Systems should know when they do not know and safely decline to answer. Until that is solved, your verification skills remain essential.
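To see why even small rates still call for verification, here is a rough back-of-the-envelope sketch. It treats each rate in the table as a per-claim error probability and assumes claims fail independently. Both are simplifications made for illustration, not properties of the benchmarks. The function name and the claim counts are hypothetical, and "< 1%" is rounded up to 1%.

```python
def chance_of_any_error(per_claim_rate, num_claims):
    """Probability that at least one of num_claims is wrong,
    assuming each claim fails independently at the same rate."""
    return 1 - (1 - per_claim_rate) ** num_claims


# Rates come from the table above; the claim counts are made-up examples.
examples = [
    ("general factual, 50 claims", 0.01, 50),
    ("legal (best models), 10 claims", 0.06, 10),
    ("legal (all models), 10 claims", 0.18, 10),
]
for label, rate, claims in examples:
    print(f"{label}: {chance_of_any_error(rate, claims):.0%}")
```

Under these assumptions, a long answer from a low-error model can still be more likely than not to contain a mistake somewhere, which is the practical reason to keep checking.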
You might expect a computer to be good at math. Language models often are not.
Counting problems:
- Ask AI how many r's in "strawberry"
- It frequently gets it wrong (answer: 3)
- AI sees tokens, not individual letters
Arithmetic errors:
- Multi-step calculations often contain mistakes
- AI might get the approach right but flub the actual computation
Logic puzzles:
- Step-by-step reasoning can go awry
- The model jumps to an answer instead of working through each step
The workaround: For math that matters, verify independently. Use AI to explain the approach, but do calculations yourself or with a calculator.
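As a small sketch of that workaround, ordinary code gives exact answers for both counting and arithmetic. The token split below is hypothetical and only illustrates the idea of word fragments; real tokenizers differ by model. The arithmetic claim is a made-up example, and `check_arithmetic` is an illustrative helper, not a library function.

```python
word = "strawberry"
print(word.count("r"))  # 3

# Hypothetical token split, for illustration only. Real tokenizers differ.
tokens = ["str", "aw", "berry"]
assert "".join(tokens) == word
# The letters are spread across fragments the model treats as single units.
print([t.count("r") for t in tokens])  # [1, 0, 2]


def check_arithmetic(claimed, actual, tolerance=1e-9):
    """Return True when a claimed result matches the value we computed."""
    return abs(claimed - actual) <= tolerance


claimed_by_ai = 807          # made-up example of a plausible-looking wrong answer
actual = 17 * 24 + 389       # computed directly: 797
print(actual, check_arithmetic(claimed_by_ai, actual))  # 797 False
```

A spreadsheet or pocket calculator does the same job; the point is only that the number comes from computation rather than from text prediction.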
AI models have a knowledge cutoff: a date after which they have no training data.
When you ask about events after the cutoff, you get one of three outcomes:
- A correct admission: "I do not have information about that"
- A hallucinated answer that sounds plausible but is false
- Outdated information presented as current
Some AI systems now have web search capabilities, which helps. But the core model's knowledge is still frozen.
Practical rule: For anything time-sensitive, treat AI responses as a starting point, not a final answer.
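The practical rule can be written down as a simple date comparison. This is a minimal sketch: the cutoff date below is an assumption chosen for the example, since real cutoffs vary by model and are not always published, and `needs_fresh_source` is a hypothetical helper name.

```python
from datetime import date

# Assumed cutoff for illustration only; check your model's documentation.
ASSUMED_CUTOFF = date(2025, 6, 1)


def needs_fresh_source(event_date, knowledge_cutoff=ASSUMED_CUTOFF):
    """True when the event is after the cutoff, so the core model
    cannot have seen it in training and the answer needs an outside source."""
    return event_date > knowledge_cutoff


print(needs_fresh_source(date(2027, 2, 14)))  # True
print(needs_fresh_source(date(2020, 2, 2)))   # False
```

A `False` result does not mean the answer is correct. It only means the event could have been in the training data.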
| Traditional Software | AI |
|---|---|
| Fails loudly | Fails quietly |
| Returns error when data missing | Returns confident prose |
| Easy to spot problems | Hard to spot problems |
Why? AI was trained on human text, and humans rarely write "I don't know" in published content. The training data is full of confident assertions.
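The contrast in the table can be sketched with two small functions. Both are hypothetical stand-ins written for this lesson: `lookup_capital` behaves like traditional software, and `fluent_capital` imitates quiet failure by inventing a fallback instead of raising an error. It is not a model of how any real AI system works internally.

```python
CAPITALS = {"France": "Paris", "Japan": "Tokyo"}


def lookup_capital(country):
    """Traditional software: missing data raises KeyError (fails loudly)."""
    return CAPITALS[country]


def fluent_capital(country):
    """Stand-in for quiet failure: always returns a confident sentence."""
    city = CAPITALS.get(country, country + " City")  # invented fallback
    return f"The capital of {country} is {city}."


try:
    lookup_capital("Atlantis")
except KeyError as err:
    print("Loud failure, missing key:", err)

print(fluent_capital("Japan"))     # correct
print(fluent_capital("Atlantis"))  # fabricated, but reads exactly the same
```

Nothing in the last two printed sentences tells you which one is backed by data, which is why the failure is hard to spot.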
Develop your "AI smell": an intuition for when output might be unreliable.
Red flags that should trigger extra verification:
- Specific claims with precise numbers
- Obscure or niche topics
- Recent events
- High-stakes decisions
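Only the first red flag is visible in the text itself, so a crude pattern check can catch it. The sketch below is a rough heuristic, not a hallucination detector: the patterns and the function name `red_flags` are assumptions made for illustration, and the "vague attribution" pattern is added here based on the "Studies show that 67%..." example earlier in the lesson. Niche topics, recency, and stakes still need your own judgment.

```python
import re

# Illustrative patterns only; they will miss cases and raise false alarms.
RED_FLAG_PATTERNS = {
    "precise number": re.compile(r"\b\d+(?:\.\d+)?%|\b\d{1,3}(?:,\d{3})+\b"),
    "vague attribution": re.compile(
        r"\b(?:studies|research|experts) (?:show|shows|say|suggest)",
        re.IGNORECASE,
    ),
}


def red_flags(text):
    """Return the labels of every pattern found in the text."""
    return [label for label, pattern in RED_FLAG_PATTERNS.items()
            if pattern.search(text)]


print(red_flags("Studies show that 67% of users prefer dark mode."))
print(red_flags("Paris is the capital of France."))
```

A match is a prompt to go and check the claim, not evidence that it is false.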
Why use AI at all given these limitations?
Because AI is still extraordinarily useful when you approach it correctly. The key is calibrated trust.
Trust AI MORE for:
- Brainstorming and ideation
- First drafts of writing
- Explaining well-documented concepts
- Code in common patterns
- Summarizing content you can check
- Tasks where creativity matters more than precision
Trust AI LESS for:
- Factual claims you cannot verify
- Medical or legal advice
- Recent events
- Math and precise calculations
- Obscure or specialized topics
- Anything with significant consequences if wrong
The verification habit: Treat AI like a knowledgeable but occasionally confused colleague whose work you review before using.
Prompt 1: Knowledge Cutoff Test
Who won the Super Bowl in February 2027?
The model should recognize that this is beyond its training data and say so rather than hallucinate an answer.
Prompt 2: Letter Counting (Known Weakness)
How many letter r's are in the word "strawberry"? Count carefully.
Check whether the answer is correct (answer: 3). Many models get this wrong, even when asked to count carefully.
Prompt 3: Hallucination Trap
Name three books by [insert an obscure author you actually know well].
Use an author you know so you can verify the answer. If the author is obscure enough, the AI might hallucinate titles.
Goal: Build your "AI smell" by testing AI on topics you actually know.
Step 1: Think of a topic you know well: your profession, a hobby, your hometown.
Step 2: Ask AI a specific factual question about this topic.
Step 3: Before reading the response, predict: will this be accurate?
Step 4: Read carefully and fact-check at least two claims.
Step 5: Score the response:
- How many claims were accurate?
- How many were wrong or unverifiable?
- Did the tone match actual accuracy?
Repeat with different domains to calibrate your expectations.
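If you want to keep a record across domains, the scoring in Step 5 can be tallied with a few lines of code. This is one possible sketch: the verdict labels, the function name `score_response`, and the example claims are all made up for illustration.

```python
def score_response(claims):
    """Tally fact-check verdicts.

    claims: list of (claim_text, verdict) pairs, where verdict is
    'accurate', 'wrong', or 'unverifiable'.
    Returns (counts, accuracy); accuracy ignores unverifiable claims
    and is None when nothing could be checked.
    """
    counts = {"accurate": 0, "wrong": 0, "unverifiable": 0}
    for _, verdict in claims:
        counts[verdict] += 1
    checked = counts["accurate"] + counts["wrong"]
    accuracy = counts["accurate"] / checked if checked else None
    return counts, accuracy


# Made-up example: four claims from one response about a hometown.
example = [
    ("Founded in 1852", "accurate"),
    ("Population of 48,000", "wrong"),
    ("Sits on the river", "accurate"),
    ("Named after a local merchant", "unverifiable"),
]
counts, accuracy = score_response(example)
print(counts)
print(f"Accuracy on checkable claims: {accuracy:.0%}")
```

Comparing this number with the prediction you made in Step 3 is what calibrates your expectations over time.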
- Hallucination is fundamental, not a bug. AI generates plausible-sounding content without checking accuracy.
- Math and counting are weaknesses. Do not trust AI for precise calculations or character counting.
- Knowledge cutoffs create blind spots. The core model cannot know about events after its training cutoff.
- Confident tone means nothing. AI sounds certain even when completely wrong.
- Trust but verify is the right approach. Use AI freely, but check output when accuracy matters.