Lesson 03 · Prompt Patterns & Test Oracles
Beyond the 1Z0-830 exam
Two skills make AI-assisted testing trustworthy: writing a prompt that specifies intent precisely, and supplying the test oracle — the source of truth for what "correct" means. The model can generate a thousand test cases; deciding the right answer is the human part.
Objectives
After this lesson you will be able to:
- Write prompts that specify intent, constraints, and edge cases.
- Define a test oracle and list common kinds.
- Derive oracles independently of the implementation.
- Drive the AI to enumerate edge cases you'd otherwise miss.
Anatomy of a good prompt
Vague prompts get plausible-but-unconstrained code. Specify:
- Intent — what it must do, in one sentence.
- Signature — types in and out (`OptionalDouble average(int[])`).
- Constraints — performance, immutability, no external calls, Java 21.
- Edge cases — "handle empty input, overflow, nulls; average of [1,2] is 1.5."
- Examples — a couple of input → expected pairs (these double as oracles).
```text
Write a Java 21 method `OptionalDouble average(int[] numbers)`.
- Return empty for an empty array (do not throw).
- Use double division: average of [1, 2] is 1.5, not 1.
- Sum into a long to avoid overflow.
Then write JUnit 5 + AssertJ tests covering empty, single, and the [1,2] case.
```
Notice that this prompt would have prevented both lab bugs: it pins the empty case and double division up front.
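One implementation that would satisfy this prompt might look like the sketch below. It is not the lab's code: the class name `AverageSketch` is hypothetical, and the checks in `main` use plain Java instead of JUnit 5 + AssertJ so the block runs without dependencies.

```java
import java.util.OptionalDouble;

// Hypothetical class name; only the method signature comes from the prompt.
final class AverageSketch {

    static OptionalDouble average(int[] numbers) {
        // Empty input: return an empty OptionalDouble instead of throwing.
        if (numbers.length == 0) {
            return OptionalDouble.empty();
        }
        // Sum into a long so large ints cannot overflow the accumulator.
        long sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        // Cast before dividing so the division happens in double, not long.
        return OptionalDouble.of((double) sum / numbers.length);
    }

    public static void main(String[] args) {
        System.out.println(average(new int[] {}));      // OptionalDouble.empty
        System.out.println(average(new int[] {7}));     // OptionalDouble[7.0]
        System.out.println(average(new int[] {1, 2}));  // OptionalDouble[1.5]
    }
}
```

Each bullet in the prompt maps to one line of the method, which is what makes the result easy to review against the request.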
What a test oracle is
A test oracle is the mechanism that decides whether an output is correct for an input. Asserting `assertThat(average(new int[]{1,2})).hasValue(1.5)` works only because you know the answer is 1.5. The oracle, not the assertion syntax, is the hard part, and the part you must not outsource blindly to the same AI that wrote the code.
Kinds of oracle
| Oracle | Idea | Example |
|---|---|---|
| Known value | you know the exact answer | average([1,2]) == 1.5 |
| Inverse/round-trip | decode(encode(x)) == x | serialize→deserialize (Module 08) |
| Reference implementation | compare to a trusted version | new fast sort vs Arrays.sort |
| Property/invariant | a rule that always holds | min <= midpoint(min,max) <= max |
| Metamorphic | relation between related inputs | average(xs) == average(reverse(xs)) |
When you don't have an exact expected value, properties and metamorphic relations still pin behavior — and they're excellent for the edge cases AI misses.
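To make two rows of the table concrete, the following sketch checks a sort against `Arrays.sort` (reference implementation) and checks that reversing the input leaves an average unchanged (metamorphic). All class and method names are hypothetical, and a simple insertion sort stands in for the "new fast sort" under test.

```java
import java.util.Arrays;
import java.util.Random;

// All names are hypothetical; the sketch only illustrates two oracle kinds.
final class OracleKindsSketch {

    // Code under test for the reference-implementation oracle.
    static int[] insertionSort(int[] input) {
        int[] a = input.clone();
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return a;
    }

    // Code under test for the metamorphic oracle (assumes a non-empty array).
    static double average(int[] xs) {
        long sum = 0;
        for (int x : xs) {
            sum += x;
        }
        return (double) sum / xs.length;
    }

    static int[] reverse(int[] xs) {
        int[] r = new int[xs.length];
        for (int i = 0; i < xs.length; i++) {
            r[i] = xs[xs.length - 1 - i];
        }
        return r;
    }

    // Reference oracle: the result must equal what Arrays.sort produces.
    static boolean agreesWithReference(int[] input) {
        int[] expected = input.clone();
        Arrays.sort(expected);
        return Arrays.equals(expected, insertionSort(input));
    }

    // Metamorphic oracle: reversing the input must not change the average.
    static boolean averageIgnoresOrder(int[] xs) {
        return average(xs) == average(reverse(xs));
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so runs are repeatable
        for (int i = 0; i < 100; i++) {
            int[] xs = random.ints(1 + random.nextInt(20)).toArray();
            if (!agreesWithReference(xs) || !averageIgnoresOrder(xs)) {
                throw new AssertionError(Arrays.toString(xs));
            }
        }
        System.out.println("all oracle checks passed");
    }
}
```

Neither check needs a hand-computed expected value, so both can run over randomly generated inputs.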
Property oracles catch the overflow bug
You may not know the exact midpoint of two huge ints offhand, but you know the invariant: `low <= midpoint(low, high) <= high`. `midpointNaive` violates it on overflow (it returns a negative value), so a property oracle catches the bug without your having to compute the "right" number. This is exactly what the lab asserts with `isBetween(low, high)`.
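The sketch below shows the invariant at work. The lesson names `midpointNaive` but does not show its body, so the `(low + high) / 2` version here is an assumption, as are `midpointSafe`, `holds`, and the class name. The check is written in plain Java in place of AssertJ's `isBetween`.

```java
// Hypothetical bodies: the lesson names midpointNaive but does not show its code.
final class MidpointPropertySketch {

    // Assumed naive version: low + high can overflow int and wrap negative.
    static int midpointNaive(int low, int high) {
        return (low + high) / 2;
    }

    // One overflow-safe alternative: do the addition in long, then narrow.
    static int midpointSafe(int low, int high) {
        return (int) (((long) low + high) / 2);
    }

    // Property oracle: for low <= high the midpoint must lie in [low, high].
    static boolean holds(int low, int high, int midpoint) {
        return low <= midpoint && midpoint <= high;
    }

    public static void main(String[] args) {
        int low = 2_000_000_000;
        int high = 2_100_000_000;
        System.out.println(holds(low, high, midpointNaive(low, high))); // false
        System.out.println(holds(low, high, midpointSafe(low, high)));  // true
    }
}
```

The oracle never states the expected midpoint; it only states a range the answer cannot leave, which is enough to reject the wrapped negative result.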
Driving AI to find edge cases
Use the model's enumeration strength, then keep the oracle yourself:
```text
List the edge cases a test for `average(int[])` should cover.
```
It will suggest empty, single element, negatives, very large values (overflow), duplicates. You decide the expected result for each. This division of labor, in which the AI enumerates and the human adjudicates, is the heart of AI-assisted testing.
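One way to record that division of labor is a table with one row per suggested edge case, where the expected column is filled in by hand from the spec. The sketch below assumes Java 21 records; `EdgeCaseTableSketch`, `Case`, and `failures` are hypothetical names, and the expected values are worked out by hand for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical names throughout; only average(int[]) comes from the lesson.
final class EdgeCaseTableSketch {

    // One row per edge case; the expected value is decided by a human.
    record Case(String name, int[] input, OptionalDouble expected) {}

    static final List<Case> CASES = List.of(
        new Case("empty", new int[] {}, OptionalDouble.empty()),
        new Case("single element", new int[] {7}, OptionalDouble.of(7.0)),
        new Case("negatives", new int[] {-1, -2}, OptionalDouble.of(-1.5)),
        new Case("overflow", new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE},
                 OptionalDouble.of(Integer.MAX_VALUE)),
        new Case("duplicates", new int[] {3, 3, 3}, OptionalDouble.of(3.0)));

    // Implementation under test (same contract as the earlier prompt).
    static OptionalDouble average(int[] numbers) {
        if (numbers.length == 0) {
            return OptionalDouble.empty();
        }
        long sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        return OptionalDouble.of((double) sum / numbers.length);
    }

    // Returns the names of the cases whose actual result differs from the oracle.
    static List<String> failures() {
        List<String> failed = new ArrayList<>();
        for (Case c : CASES) {
            if (!average(c.input()).equals(c.expected())) {
                failed.add(c.name());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        System.out.println("failing cases: " + failures()); // failing cases: []
    }
}
```

The model can propose more rows at any time, but a row only counts once someone has written its expected value without consulting the implementation.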
Gotcha — don't let code and oracle share a brain
If the same AI writes the method and its tests in one breath, both can encode the same wrong assumption (e.g. that integer division is fine), so the tests pass and the bug ships. Derive the oracle independently — from the spec, a reference implementation, or an invariant — not from the implementation under test.
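The failure mode can be reproduced in a few lines. In this sketch, which uses hypothetical names and a deliberately buggy method, a test whose expected value was copied from the code's own behavior passes, while a test whose expected value comes from the spec ("average of [1,2] is 1.5") fails.

```java
// Hypothetical names; the method is buggy on purpose to illustrate the gotcha.
final class SharedAssumptionSketch {

    // Bug: sum / numbers.length is integer division, so [1, 2] yields 1.0.
    static double averageBuggy(int[] numbers) {
        int sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        return sum / numbers.length;
    }

    // Expected value taken from the implementation: shares its wrong assumption.
    static boolean testDerivedFromCodePasses() {
        return averageBuggy(new int[] {1, 2}) == 1.0;
    }

    // Expected value taken from the spec: independent of the implementation.
    static boolean testDerivedFromSpecPasses() {
        return averageBuggy(new int[] {1, 2}) == 1.5;
    }

    public static void main(String[] args) {
        System.out.println(testDerivedFromCodePasses()); // true, the bug ships
        System.out.println(testDerivedFromSpecPasses()); // false, the bug is caught
    }
}
```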
Key Takeaways
- A good prompt specifies intent, signature, constraints, edge cases, and examples — pinning the cases that hide bugs.
- A test oracle decides correct vs incorrect; it's the human core of testing AI output.
- Use known values, round-trips, reference implementations, properties, and metamorphic relations as oracles.
- Property/invariant oracles catch edge-case bugs (overflow) without an exact expected value.
- Let AI enumerate edge cases, but derive the oracle independently of the code under test.
Lesson Quiz
A test oracle is…
Which prompt is most likely to yield correct code?
A property/invariant oracle for midpoint(low, high) is…
Why not let the same AI write both the method and its only tests, unreviewed?
The best division of labor in AI-assisted testing is…
Next: Guardrails — When Not to Trust AI. This module's lab is in labs/src/main/java/com/jse21/m21_ai/.