Lesson 02 · Critically Reviewing AI Output
Beyond the 1Z0-830 exam
The skill that separates productive AI use from dangerous AI use is review. Generated code is confident, idiomatic, and frequently subtly wrong. This lesson catalogs the failure modes — and the lab demonstrates two of them with passing-then-failing tests.
Objectives
After this lesson you will be able to:
- Spot hallucinated APIs and wrong idioms.
- Find subtle correctness bugs that pass the happy path.
- Apply a concrete review checklist to a generated diff.
- Use the earlier modules as your reviewing toolkit.
The failure modes
| Failure | What it looks like | How you catch it |
|---|---|---|
| Hallucinated API | list.stream().toImmutableList() (no such method) | compile; check the Javadoc |
| Wrong idiom | mutating a list during a for-each; == on String | Modules 01, 05 |
| Subtle correctness bug | integer division, off-by-one, overflow | edge-case tests |
| Missing edge cases | no empty/null/boundary handling | your oracle (Lesson 03) |
| Concurrency bug | unsynchronized shared state, non-atomic check-then-act | Module 07; stress tests |
| Outdated/insecure | old library version, string-concatenated SQL | Modules 13, 19 |
| Plausible-but-wrong explanation | a confident comment that misdescribes the code | read the code, not the comment |
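The "wrong idiom" row can be made concrete with a small sketch. The class and method names below (WrongIdioms, sameReference, sameContent, removeInForEachThrows, removeEvens) are illustrative assumptions and are not taken from the lab.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; the names are assumptions for illustration only.
class WrongIdioms {
    // Wrong idiom: == compares object references, not characters.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Correct idiom: equals compares the character content.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    // Wrong idiom: removing from a list inside a for-each over that same list.
    // Returns true if the iterator detected the modification and threw.
    static boolean removeInForEachThrows(List<Integer> values) {
        try {
            for (Integer v : values) {
                if (v % 2 == 0) {
                    values.remove(v); // remove(Object): structural change mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Correct idiom: let the collection do the removal, here on a copy.
    static List<Integer> removeEvens(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        copy.removeIf(v -> v % 2 == 0);
        return copy;
    }
}
```

With a string literal and a `new String("java")`, sameReference returns false while sameContent returns true. The for-each removal throws for an ArrayList holding 1, 2, 3, 4, but the exception is best-effort: for some inputs the loop can end quietly with the wrong result, which is why a compile check and a single passing test are not enough to catch this idiom.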
Two real bugs (from the lab)
Both look correct and pass an obvious test — then fail an edge case:
```java
// "Average these numbers": looks fine, average([2,4]) == 3 ✓
static int averageNaive(int[] numbers) {
    int sum = 0;
    for (int n : numbers) sum += n;
    return sum / numbers.length; // ① integer division ② empty → ArithmeticException
}
```

averageNaive(new int[]{1, 2}) returns 1, not 1.5 (integer division). averageNaive(new int[]{}) throws (divide by zero). The reviewed version returns OptionalDouble and divides as double.
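A reviewed version along the lines described above might look like the following sketch. The lab's actual code may differ; the class name AverageReviewed and the long accumulator are assumptions added here, not details from the lesson.

```java
import java.util.OptionalDouble;

// Hypothetical sketch of a reviewed average; not the lab's own source.
class AverageReviewed {
    static OptionalDouble average(int[] numbers) {
        // Empty input has no average: say so instead of dividing by zero.
        if (numbers.length == 0) {
            return OptionalDouble.empty();
        }
        // Assumption: accumulate in long so the sum itself cannot overflow int.
        long sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        // Cast before dividing so the division happens in floating point.
        return OptionalDouble.of((double) sum / numbers.length);
    }
}
```

Returning OptionalDouble pushes the empty case onto the caller, who has to decide what "no average" means instead of receiving an exception or a misleading 0.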
```java
// Binary-search midpoint: passes every small test
static int midpointNaive(int low, int high) {
    return (low + high) / 2; // overflows when low + high > Integer.MAX_VALUE
}
```

Near Integer.MAX_VALUE the sum overflows to a negative number, which is the infamous JDK binary-search bug. The reviewed version, low + (high - low) / 2, can't overflow as long as 0 <= low <= high, which always holds for array indices. The lab's tests assert exactly these edge cases, turning "looks right" into "proven right or wrong."
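The overflow is easy to reproduce side by side. This is a minimal sketch: the class name MidpointReviewed is an assumption, and the unsigned-shift variant is included only as a commonly used alternative, not something the lesson prescribes.

```java
// Hypothetical comparison class; names are assumptions for illustration.
class MidpointReviewed {
    // Naive form: the intermediate sum can exceed Integer.MAX_VALUE and wrap.
    static int midpointNaive(int low, int high) {
        return (low + high) / 2;
    }

    // Reviewed form: safe when 0 <= low <= high, because high - low fits in int.
    static int midpoint(int low, int high) {
        return low + (high - low) / 2;
    }

    // Alternative for non-negative inputs: the unsigned shift treats the
    // wrapped sum as a 32-bit unsigned value before halving it.
    static int midpointUnsignedShift(int low, int high) {
        return (low + high) >>> 1;
    }
}
```

With low = Integer.MAX_VALUE - 1 and high = Integer.MAX_VALUE, the naive form returns a negative number, while both other forms return Integer.MAX_VALUE - 1.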
A review checklist
Run a generated diff through this before trusting it:
- Does every API exist? Compile, and verify unfamiliar calls against the docs.
- Edge cases — empty, null, zero, negative, max/min, single element, duplicates.
- Numeric correctness — integer vs floating division, overflow, rounding (Module 01).
- Resource & error handling — closing resources, swallowed exceptions (Modules 04, 08).
- Concurrency — shared mutable state, atomicity (Module 07).
- Security — input validation, parameterized SQL, no secrets in code (Module 19).
- Does the test actually test it? — or does it assert the bug? (Lesson 03)
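For the concurrency item in the checklist, a small stress run can expose what a single-threaded test never will. The sketch below is an assumption-laden illustration: the class CounterReview and its run method are hypothetical, and the thread and iteration counts are arbitrary.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress harness; not part of the lab.
class CounterReview {
    private int unsafe = 0;                                  // plain field: ++ is read-modify-write
    private final AtomicInteger safe = new AtomicInteger();  // atomic increment

    // Runs `threads` workers, each incrementing both counters `perThread` times.
    // Returns {unsafeTotal, safeTotal}.
    static int[] run(int threads, int perThread) {
        CounterReview counter = new CounterReview();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.unsafe++;               // increments can be lost under contention
                    counter.safe.incrementAndGet(); // never loses an increment
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new int[]{counter.unsafe, counter.safe.get()};
    }
}
```

The atomic total always equals threads × perThread. The plain total is often lower, but not on every run, so a stress test can only raise suspicion; the review itself has to spot the unsynchronized shared field.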
Trap — the comment can lie
LLMs generate code together with an explanatory comment, and the comment can confidently claim behavior ("handles the empty case") that the code doesn't implement. Review the code, not its description. A comment is a claim to verify, not evidence.
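A made-up example of such a comment, with a check that tests the claim instead of trusting it. Everything here (LyingComment, maxClaimed, handlesEmpty) is hypothetical and written for this illustration.

```java
// Hypothetical example of a comment that misdescribes the code beneath it.
class LyingComment {
    /** Returns the largest value. Handles the empty case. */
    static int maxClaimed(int[] values) {
        int max = values[0]; // the claim is false: this line throws on an empty array
        for (int v : values) {
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    // Verifies the Javadoc's claim by actually passing an empty array.
    static boolean handlesEmpty() {
        try {
            maxClaimed(new int[]{});
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }
}
```

The happy path works (3, 9, 4 yields 9), so a reader who trusts the Javadoc has no reason to look further; only the empty-array check shows the claim is wrong.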
SDET note
Your reviewing power is the sum of the earlier modules. You catch the integer-division bug because of Module 01, the overflow because of JVM/numeric awareness (Modules 00, 12), the ==-on-String idiom because of Module 01, the race because of Module 07. You can't review what you don't understand — which is the case for studying all of it.
Key Takeaways
- AI output fails in predictable ways: hallucinated APIs, wrong idioms, subtle correctness bugs, missing edge cases, concurrency/security flaws.
- The dangerous bugs pass the happy path — only edge-case tests expose them (integer division, overflow).
- Apply a review checklist: APIs exist, edge cases, numeric correctness, resources/errors, concurrency, security, real tests.
- Read the code, not the comment — the explanation can be a confident lie.
- Reviewing well requires the Java fluency the rest of the course builds.
Lesson Quiz
A 'hallucinated API' from an LLM is…
averageNaive([1,2]) returning 1 instead of 1.5 is caused by…
Why does (low + high) / 2 fail near Integer.MAX_VALUE?
When reviewing AI output, the explanatory comment should be treated as…
The dangerous AI bugs are typically the ones that…
Next: Prompt Patterns & Test Oracles. This module's lab is in labs/src/main/java/com/jse21/m21_ai/.