Lesson 03 · Regular Expressions

Beyond the 1Z0-830 exam

Regex isn't a 1Z0-830 objective, but it's a daily tool for validation, parsing logs, and data extraction in test code. The traps here (greedy vs lazy, escaping) cause real, hard-to-spot bugs.

Objectives

After this lesson you will be able to:

  • Use Pattern/Matcher and the String regex methods.
  • Extract capturing groups.
  • Tell greedy from lazy quantifiers.

Pattern, Matcher, and String methods

Compile a pattern once, then match against input:

java
Pattern p = Pattern.compile("\\d+");        // one or more digits
Matcher m = p.matcher("abc123def456");
while (m.find()) { System.out.println(m.group()); }   // 123, then 456

String has regex shortcuts:

java
"a1b2".matches("[a-z\\d]+");        // true — whole string must match
"a,b;c".split("[,;]");              // ["a", "b", "c"]
"a1b2".replaceAll("\\d", "#");      // "a#b#"

Exam trap

String.matches (and Matcher.matches) require the entire string to match; Matcher.find matches any substring. In Java string literals every regex backslash is doubled ("\\d" = the regex \d). Pattern.compile throws PatternSyntaxException on an invalid pattern.
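
The three points above can be checked with a small sketch. The class and helper names (MatchVsFind, wholeMatch, anyMatch, isValidRegex) are made up for illustration and are not part of the lesson's lab code; only Pattern, Matcher, and PatternSyntaxException come from java.util.regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class MatchVsFind {
    // Hypothetical helper: true only if the ENTIRE input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Hypothetical helper: true if ANY substring of the input matches the regex.
    static boolean anyMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Hypothetical helper: true if the regex compiles, false if it is malformed.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d+", "abc123"));  // false: letters are not digits
        System.out.println(anyMatch("\\d+", "abc123"));    // true: "123" is a matching substring
        System.out.println("abc123".matches("\\d+"));      // false: same rule as Matcher.matches
        System.out.println(isValidRegex("[a-z"));          // false: unclosed character class
    }
}
```

PatternSyntaxException is unchecked, so the compiler will not force you to catch it; wrapping Pattern.compile is mainly useful when the pattern comes from user input or configuration.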

Capturing groups

Parentheses create numbered groups, counted from 1 by the position of each opening parenthesis, left to right; group 0 is the whole match.

java
Matcher m = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})").matcher("2026-06-19");
if (m.matches()) {
    m.group(0);   // "2026-06-19" (whole match)
    m.group(1);   // "2026"
    m.group(2);   // "06"
}
Matcher named = Pattern.compile("(?<year>\\d{4})").matcher("2026");
if (named.matches()) { named.group("year"); }   // "2026", read from the named group "year"
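
Named groups pay off most when one pattern is reused to pull fields out of longer text. The following is a minimal sketch, assuming ISO-style dates embedded in free text; the class name NamedGroups, the constant ISO_DATE, and the helper years are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroups {
    // Three named groups; each is also reachable by number (1, 2, 3).
    private static final Pattern ISO_DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // Hypothetical helper: collects the "year" group of every date found in the text.
    static List<String> years(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = ISO_DATE.matcher(text);
        while (m.find()) {              // find() scans for the next matching substring
            out.add(m.group("year"));   // read the group by name instead of by index
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(years("released 2026-06-19, patched 2027-01-02"));  // [2026, 2027]
        System.out.println(years("no dates here"));                            // []
    }
}
```

Reading by name keeps the extraction code stable if someone later adds another group to the front of the pattern, which would shift the numeric indexes.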

Greedy vs lazy quantifiers

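By default, quantifiers are greedy: they match as much as possible, then backtrack only as far as needed for the rest of the pattern to match. Append ? to make a quantifier lazy: it matches as little as possible and expands only when the rest of the pattern requires it.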

java
Pattern.compile("<.+>").matcher("<a><b>").results().map(MatchResult::group).toList();    // greedy: ["<a><b>"] (one match)
Pattern.compile("<.+?>").matcher("<a><b>").results().map(MatchResult::group).toList();   // lazy:   ["<a>", "<b>"] (two matches)

Gotcha

Greedy .*/.+ over-match is the #1 regex surprise — <.+> swallows everything between the first < and the last >. Use a lazy quantifier (.+?) or a negated class ([^>]+) to stop at the first delimiter.
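
To compare the greedy pattern with both fixes side by side, here is a small sketch. The class GreedyVsLazy and the helper allMatches are hypothetical names; the code assumes Java 16 or later for Stream.toList() (Matcher.results() needs Java 9 or later).

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: returns the text of every match of regex in input.
    static List<String> allMatches(String regex, String input) {
        return Pattern.compile(regex)
                .matcher(input)
                .results()                  // stream of MatchResult, one per match
                .map(MatchResult::group)    // keep only the matched text
                .toList();
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        System.out.println(allMatches("<.+>", input));     // [<a><b>]   greedy: one over-match
        System.out.println(allMatches("<.+?>", input));    // [<a>, <b>] lazy: stops at first '>'
        System.out.println(allMatches("<[^>]+>", input));  // [<a>, <b>] negated class: cannot cross '>'
    }
}
```

The lazy and negated-class versions give the same result here. The negated class states the intent more directly (anything except the closing delimiter) and leaves the engine nothing to backtrack over.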

SDET note

In tests, prefer precise patterns (anchored with ^...$, negated classes) over broad .* — a loose regex passes on inputs you didn't intend, hiding bugs. Compile patterns once as static final for clarity and speed when validating many values.
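
As an illustration of that advice, the sketch below contrasts an anchored pattern with a loose one. The order-ID format ("ORD-" plus exactly six digits) and the names OrderIdCheck, STRICT, LOOSE, strict, and loose are invented for this example and do not come from the lab code.

```java
import java.util.regex.Pattern;

class OrderIdCheck {
    // Hypothetical format, for illustration only: "ORD-" followed by exactly six digits.
    // Compiled once and reused for every value checked.
    private static final Pattern STRICT = Pattern.compile("^ORD-\\d{6}$");

    // A loose pattern of the kind the note warns about.
    private static final Pattern LOOSE = Pattern.compile("ORD-.*");

    static boolean strict(String s) {
        return STRICT.matcher(s).find();   // anchors force the match to span the input
    }

    static boolean loose(String s) {
        return LOOSE.matcher(s).find();    // succeeds on any input containing "ORD-"
    }

    public static void main(String[] args) {
        System.out.println(strict("ORD-123456"));      // true
        System.out.println(strict("xxORD-123456yy"));  // false: extra text on both sides
        System.out.println(strict("ORD-12345"));       // false: only five digits
        System.out.println(loose("xxORD-oops"));       // true: the loose regex hides the bad value
    }
}
```

With Matcher.matches() the anchors would be redundant, because matches() already requires the whole input to match; they matter when the pattern is used with find().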

Key Takeaways

  • Pattern/Matcher: find matches a substring; matches requires the whole input. String.matches/split/replaceAll are regex shortcuts.
  • Backslashes are doubled in Java string literals ("\\d"); a bad pattern throws PatternSyntaxException.
  • Capturing groups are numbered (group(1), …; group 0 = whole match); named groups via (?<name>…).
  • Quantifiers are greedy by default; add ? for lazy. Greedy .+ over-matches — use .+? or [^x]+.

Lesson Quiz

  1. Difference between String.matches and Matcher.find?

    • A. None
    • B. matches requires the whole string to match; find matches any substring
    • C. find requires the whole string
    • D. matches is case-insensitive
  2. How do you write the regex \d in a Java string literal?

    • A. "\d"
    • B. "\\d"
    • C. "d"
    • D. "//d"
  3. For input "<a><b>", how many matches does <.+> (greedy) find?

    • A. 2
    • B. 1 (the whole <a><b>)
    • C. 0
    • D. 3
  4. What is group(0) in a Matcher?

    • A. The first capturing group
    • B. The whole match
    • C. null
    • D. An error
  5. Which makes a quantifier lazy?

    • A. Appending +
    • B. Appending ?
    • C. Prepending ^
    • D. Appending *

Next: Module 11 Mini-Exam. Run the accompanying code for this lesson in labs/src/main/java/com/jse21/m11_language/.