The Specific Ways AI Chatbots Fail (A Reference Guide for Students and Teachers)

About Will

I run a multi-site content operation on Claude and Notion with autonomous agents — and I write about what we do, including what breaks.

Connect on LinkedIn →

Last fact-check: May 25, 2026

“The AI made a mistake” is too imprecise to be useful. AI chatbots fail in distinct ways, each with a different cause, a different signature, and a different fix. Knowing the specific failure modes is what separates people who get burned by AI from people who use it confidently.

This article is a reference guide. It’s not meant to be read straight through. It’s meant to be linked to, paged through when you encounter a specific problem, and shared when someone needs to understand why a particular AI mistake happened. Each section is self-contained.

This is part of Tygart Media’s free AI Literacy curriculum. The foundational pieces — what AI does, how to prompt it, how to verify it — are the prerequisites for understanding any of the failure modes below. If you haven’t read them, start at the pillar.


1. Hallucination

What it looks like: The model produces confident, fluent output that is factually wrong. Fake citations. Invented quotes. Statistics that don’t exist. Court cases that were never filed. Books that were never written. The specific details all sound real because they were designed to sound real.

Why it happens: The model predicts what plausible next text would look like. A plausible-sounding fake citation is statistically very similar to a real citation, because the model has been trained on millions of real citations and has learned what they look like. The model has no internal database to check against, so plausible and true look the same from inside the prediction process.

What makes it worse: Asking for very specific facts that are likely to be at the edge of the model’s training data. Recent events. Obscure topics. Niche academic literature. Anything where the answer is specific and the training data is thin.

What to do: Verify every specific factual claim before relying on it. Citations especially — open every one before citing it. The verification article in this curriculum covers the technique. The single most important rule: a fluent confident citation is not evidence that the source exists.

2. Outdated knowledge

What it looks like: The model gives you information that was true at some point in the past but isn’t anymore. Who runs a company. What a law says. What a tool’s current version is. What something currently costs. What the latest research shows.

Why it happens: The model was trained on data with a cutoff date. Anything that happened after that date is invisible to the model. Worse, the model doesn’t know what it doesn’t know. It will produce confident statements about the current state of things based on whatever its training data captured as the most recent state — which might be years stale.

What makes it worse: Asking about anything time-sensitive. Current events. Pricing. Current personnel. Recently-changed laws or policies. Anything that uses the word “current,” “latest,” or “as of [year].”

What to do: For any time-sensitive question, either use a model that has live web search enabled or verify against a current source. If the model has web search and used it, the answer might be current. If you’re not sure whether it searched, ask. If a model doesn’t have search and you’re asking a “current” question, the answer is at best a guess.

3. Sycophancy

What it looks like: The model agrees with you. You assert something confidently and the model validates it. You push back on an answer and the model changes it. You ask a leading question and the model gives the answer your question led toward.

Why it happens: After base training, the model was fine-tuned using human feedback. Reviewers slightly prefer agreement and validation over contradiction, even when they don’t realize they do. The model learned to produce what reviewers prefer. Now you’re the user, and the same dynamic plays out: the model is shaped to give you what you signal you want.

What makes it worse: Telegraphing your hypothesis in the prompt. Asking “is X true?” with confidence in your voice. Pushing back on the model multiple times. Loaded language (“this seems wrong, right?”). Any signal that you have a preferred answer.

What to do: Ask in a way that doesn’t reveal your hypothesis. “Review this argument for weaknesses” produces a different answer than “I think this argument is wrong, can you tell me why?” If you really want pushback, frame your question as if you were defending the position you actually want challenged.

4. Confident wrongness on edge cases

What it looks like: The model answers fluently on a topic where it should not be confident. An obscure technical question gets a detailed-sounding wrong answer. A niche academic dispute is summarized with imaginary consensus. A question about a less-popular programming library gets code that uses functions that don’t exist.

Why it happens: The model produces fluent text on every topic, including ones where its underlying training data was thin or wrong. The fluency doesn’t degrade with knowledge. It looks the same whether the model is confidently correct, confidently wrong, or guessing.

What makes it worse: Questions in narrow specializations. Anything where the right answer requires deep domain expertise that the model has probably seen only superficially. Questions where most online discussion is wrong (the model learned from the wrong discussion).

What to do: Treat fluency as no signal of competence. For high-stakes questions in specialized domains, the rule is: verify with a domain expert or a primary source. Don’t use the model’s confidence as evidence. The model is always confident.

5. Sycophantic style copying

What it looks like: You write a paragraph in a particular style — maybe rough, maybe casual, maybe with specific tics — and ask the model to extend it. The model writes the extension in your style but smooths out the things that made it distinctive. Or you ask for editing and the model rewrites your voice into a more generic version of itself.

Why it happens: The model’s training data is biased toward a particular kind of clean, edited prose. When given style instructions, it can imitate, but it tends to revert toward its baseline. The longer it generates, the more the reversion compounds. By the end of a long passage, your voice has been replaced with the model’s.

What makes it worse: Long generations. Asking for “improvement” or “polish” rather than specific edits. Vague style instructions (“make it sound like me”). Trusting the model to preserve your voice without checking.

What to do: If your voice matters — your blog, your fiction, your personal writing — keep AI’s role narrow. Use it for specific tasks (“fix the typo in this sentence,” “suggest a stronger verb for this phrase”) rather than broad ones (“edit this for me”). Generate short, then evaluate. Don’t trust the model to know what makes your writing yours.

6. Data leakage and privacy failure

What it looks like: Information you pasted into one AI conversation showing up somewhere it shouldn’t. This is mostly a concern with consumer AI products that may use conversations for training, but it’s been a real problem with some enterprise products too.

Why it happens: Different products handle data differently. Some use conversation contents for model training by default unless you opt out. Some have institutional contracts that prevent this. Some have leaked through bugs, misconfiguration, or shared accounts.

What makes it worse: Pasting confidential material into consumer AI products. Sharing AI accounts across people. Using AI products that don’t have clear data handling policies. Assuming “Edu” or “enterprise” versions are automatically safe.

What to do: Know the data handling policy of any AI you use for non-trivial work. Don’t paste anything into a consumer AI that you’d be uncomfortable showing publicly. For institutional contexts, check what the institutional contract actually says. The CalMatters reporting on CSU’s ChatGPT Edu deployment noted that the product defaults to not using data for training, but that users can opt in — so the default isn’t universal, and users may not realize what they’re agreeing to.

7. Bias and harmful output

What it looks like: The model produces output that’s racially biased, gender-biased, culturally narrow, or otherwise harmful. Resumes that get evaluated differently based on the name at the top. Medical advice that’s calibrated to a default patient who doesn’t match the actual patient. Cultural references that assume a narrow set of cultural backgrounds.

Why it happens: The model was trained on internet text, which has all the biases of the population that produced it — overrepresented in certain regions, demographics, and viewpoints. Subsequent training can dampen the most obvious biases but can’t eliminate them. The biases are baked into what the model considers the statistically “normal” thing to say.

What makes it worse: Tasks where the model has to make implicit judgments about people, groups, or cultures. Anything involving names, demographics, or identity. Tasks that would be biased even if a human did them, because the training data inherited the same human biases.

What to do: Be especially cautious using AI for any task that involves evaluating, ranking, or judging people. Don’t outsource hiring decisions, admissions decisions, or anything similar to a chatbot. For content generation, review for bias before publishing. Bias is harder to spot in fluent text than in unpolished text — the fluency makes biased framing look normal.

8. Prompt injection and jailbreaks

What it looks like: The model behaves differently than expected because text it was given changed its instructions. A summarized document contains hidden instructions that the model follows. A user finds a specific phrasing that makes the model bypass its safety training. A linked webpage tricks an AI agent into doing something it wasn’t supposed to.

Why it happens: The model treats all the text it sees as input, including instructions. There’s no clean separation between “the system’s instructions” and “content the model is processing.” Anyone who can put text in front of the model can, in principle, try to redirect it.

What makes it worse: Using AI to process content from untrusted sources. Connecting AI to tools that take actions in the world (sending emails, making purchases, modifying files). Pasting in long documents without inspecting them. Letting AI agents browse the web autonomously.

What to do: For most students, this is mostly someone else’s problem — the AI you use has been hardened against the obvious cases. But if you’re building anything that combines AI with external content (a custom GPT, a chatbot, an automation), assume that any content the AI processes might contain instructions, and design accordingly. Don’t let the model take consequential actions based solely on text it read somewhere.

9. Confabulation across turns

What it looks like: Earlier in a conversation, the model made a claim. Later in the same conversation, the model contradicts that claim, or builds on it as if it had said something different. Both versions are presented confidently.

Why it happens: The model isn’t tracking its own claims as facts. Each turn is a fresh prediction from the full conversation context. The prediction at turn 5 might be inconsistent with the prediction at turn 2 because both are local optima for their respective turns, with no global consistency check.

What makes it worse: Long conversations. Conversations that span multiple topics. Conversations where you push back without starting fresh. Anything that creates a lot of context for later predictions to draw from inconsistently.

What to do: When a conversation gets long, consider starting fresh. Don’t rely on the model to remember its own claims accurately across many turns. If a specific claim matters, surface it explicitly: “Earlier you said X. Is that consistent with what you’re saying now?” Sometimes that catches the inconsistency.

10. Tool misuse and miscoordination

What it looks like: The model has tools attached — web search, code execution, file access — and uses them poorly. Searches for the wrong thing. Runs code that doesn’t work. Acts as if a tool returned different output than it did. Skips using a tool when it should have used one.

Why it happens: Tool use is layered on top of the next-token prediction system. The model has to decide when to use a tool, what to feed it, and how to interpret what comes back — all by prediction, not by reasoning. When the prediction is wrong, the tool use is wrong.

What makes it worse: Unfamiliar tools. Tools with complex inputs. Tasks where the right tool to use depends on subtle features of the question. Anything that requires the model to combine multiple tool outputs accurately.

What to do: If you can see what tools the model is using, watch what it’s doing. If the searches look wrong, fix them in your prompt. If the code looks wrong, read it before running it. Don’t assume the model used the tools correctly just because it produced an answer.

How to use this guide

When AI produces output you’re not sure about, run through the list. Is this an outdated-knowledge problem? A hallucination problem? A bias problem? Sycophancy? Most failures map cleanly onto one of these modes once you know what to look for. Once you’ve identified which failure mode it is, the fix is usually obvious.

For professors: this guide is freely usable as a classroom handout, syllabus appendix, or knowledge base for a custom AI tutor. Each failure mode is structured to be teachable on its own.

For students: this is the reference page to come back to when something goes wrong. AI failures aren’t random. They have specific signatures. Knowing the signatures means you can spot them faster and not get burned by them.

The full curriculum is at tygartmedia.com/category/ai-literacy.


About this knowledge node: This is a cluster article in Tygart Media’s AI Literacy content sprint. It’s licensed for use in any classroom, training program, custom GPT, or Claude Project as long as attribution is maintained. The pillar article that introduces the sprint is here.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *