Lesson 5
Regex Debugging Workflow
Fixtures, highlights, backtracking risks, and common mistakes.
Reliable regex work is test-driven. Treat patterns like code: examples in, expected matches out.
Step 1: Collect representative fixtures
Gather inputs that must match and inputs that must not match:
- Happy path samples from production logs (redact secrets)
- Edge cases: empty strings, Unicode, extra spaces, missing fields
- Known false positives you already hit once
One green highlight on a demo string is not a test suite.
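A fixture set can be kept as two plain lists and checked in a loop. The sketch below uses Python's `re` module purely for illustration; the date pattern, the sample lines, and the `run_fixtures` helper are hypothetical and not taken from any real log.

```python
import re

# Hypothetical pattern under test: an ISO-style date at the start of a log line.
PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}\b")

# Illustrative fixtures; real ones would come from redacted production logs.
MUST_MATCH = [
    "2024-01-15 INFO service started",
    "1999-12-31 WARN disk almost full",
]
MUST_NOT_MATCH = [
    "",                                   # empty string
    "  2024-01-15 INFO leading spaces",   # extra spaces
    "2024-1-15 INFO short month",         # missing digit
    "20240115 INFO no separators",        # plausible false positive
]


def run_fixtures(pattern, must_match, must_not_match):
    """Return a list of readable failures; an empty list means all pass."""
    failures = []
    for text in must_match:
        if pattern.search(text) is None:
            failures.append(f"expected match: {text!r}")
    for text in must_not_match:
        if pattern.search(text) is not None:
            failures.append(f"unexpected match: {text!r}")
    return failures


failures = run_fixtures(PATTERN, MUST_MATCH, MUST_NOT_MATCH)
print(failures or "all fixtures pass")
```

Every false positive found later becomes one more entry in `MUST_NOT_MATCH`, so the same mistake cannot return unnoticed.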
Step 2: Narrow the pattern incrementally
Start anchored and strict, then relax:
- Match a fixed prefix with ^
- Add character classes for variable segments
- Add quantifiers only where length truly varies
- Turn on i or u only when requirements demand it
If the pattern matches too much, tighten before adding exclusions with negative lookaheads.
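The incremental approach can be sketched by keeping each stage of the pattern and counting what it matches. The log format, sample lines, and `matches_per_stage` helper are assumptions made for this example; Python's `re.IGNORECASE` stands in for the `i` flag.

```python
import re

# Each stage adds one piece to the previous one (hypothetical log format).
STAGES = [
    r"^ERROR",                            # 1. fixed prefix, anchored
    r"^ERROR \[[a-z]+\]",                 # 2. character class for the module name
    r"^ERROR \[[a-z]+\] code=\d{3,5}$",   # 3. quantifier where length varies
]

SAMPLES = [
    "ERROR [auth] code=401",
    "ERROR [db] code=50012",
    "ERROR [Auth] code=401",   # uppercase module name
    "WARN [auth] code=401",
]


def matches_per_stage(stages, samples, flags=0):
    """Map each stage pattern to the list of samples it matches."""
    return {p: [s for s in samples if re.search(p, s, flags)] for p in stages}


strict = matches_per_stage(STAGES, SAMPLES)
# Relax with a flag only if uppercase module names are a real requirement.
relaxed = matches_per_stage(STAGES, SAMPLES, re.IGNORECASE)

for pattern, hits in strict.items():
    print(len(hits), pattern)
```

Watching the hit count change from stage to stage shows exactly which addition excluded or admitted a sample.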
Step 3: Inspect capture groups
For each match, verify group values map to the fields you think you parsed. Off-by-one $n bugs in replace templates are common.
Use replace preview to confirm reordered output on every fixture line.
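A replace preview can be approximated in a few lines: print the input, the captured groups, and the replaced output side by side for every fixture. The name-reordering task and the `preview` helper are hypothetical. Note that Python's `re` writes backreferences as `\1` or `\g<name>` where many tools use `$1`; named groups are used here as one way to sidestep off-by-one group numbers.

```python
import re

# Hypothetical task: reorder "last, first" into "first last".
NAME = re.compile(r"^(?P<last>[^,]+),\s*(?P<first>.+)$")

FIXTURES = ["Lovelace, Ada", "Hopper,Grace", "no comma here"]


def preview(pattern, template, lines):
    """Return (input, groups, output) per line; groups is None if no match."""
    rows = []
    for line in lines:
        m = pattern.search(line)
        groups = m.groupdict() if m else None
        rows.append((line, groups, pattern.sub(template, line)))
    return rows


rows = preview(NAME, r"\g<first> \g<last>", FIXTURES)
for row in rows:
    print(row)
```

Lines that do not match pass through unchanged, which makes silent non-matches easy to spot in the preview.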
Step 4: Watch catastrophic backtracking
Nested quantifiers such as (.*)* on large input can stall engines. Symptoms:
- Browser tab hangs on paste
- CI lint step times out
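Fixes include possessive quantifiers or atomic groups (where supported), or rewriting the pattern to avoid nested quantifiers and .* sandwiches. Lazy quantifiers alone usually do not help: they change the order in which alternatives are tried, not how many there are.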
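The effect is easy to reproduce on a small scale. The sketch below, using Python's backtracking `re` engine as an example, compares a nested-quantifier pattern with a rewrite that accepts the same strings; the patterns and the `timed_match` helper are illustrative assumptions, and absolute timings will vary by machine.

```python
import re
import time

RISKY = re.compile(r"^(a+)+$")   # nested quantifier: exponential on near-misses
SAFE = re.compile(r"^a+$")       # accepts the same strings, without nesting


def timed_match(pattern, text):
    """Return (matched, elapsed_seconds) for a single search."""
    start = time.perf_counter()
    matched = pattern.search(text) is not None
    return matched, time.perf_counter() - start


# Near-miss input: a run of 'a' followed by one character that forces failure.
# n stays small for the risky pattern; each extra 'a' roughly doubles the work.
for n in (10, 14, 18):
    text = "a" * n + "b"
    print(n, timed_match(RISKY, text), timed_match(SAFE, text))

# The rewritten pattern copes with a far larger near-miss input.
big_result = timed_match(SAFE, "a" * 100_000 + "b")
print(big_result)
```

Near-miss inputs like these are worth adding to the fixture set, since fully matching input often runs fast and hides the problem.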
Common mistakes checklist
| Mistake | Symptom |
|---|---|
| Forgot g | Only first line highlights |
| . without s | Multiline JSON fails |
| Unescaped $ in replace | Literal $1 inserted wrongly |
| \w for international text | Misses valid names |
| Parsing nested formats | False positives on HTML/JSON |
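Two of these mistakes can be reproduced in a few lines. This sketch uses Python's `re`, where `re.DOTALL` corresponds to the `s` flag; because Python's `\w` is Unicode-aware by default for text patterns, `re.ASCII` is used to imitate engines where `\w` covers only ASCII. The payload and the name are made-up samples.

```python
import re

payload = '{\n  "id": 7,\n  "name": "Zo\u00eb"\n}'

# Without DOTALL, "." stops at newlines, so the brace-to-brace match fails.
no_s = re.search(r"\{.*\}", payload)
with_s = re.search(r"\{.*\}", payload, re.DOTALL)

# With ASCII-only \w, the accented letter splits the name.
ascii_words = re.findall(r"\w+", "Zo\u00eb", re.ASCII)
unicode_words = re.findall(r"\w+", "Zo\u00eb")

print(no_s, with_s is not None, ascii_words, unicode_words)
```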
Key takeaway
Ship regex with fixtures, documented flags, and a replace preview run on input of realistic paste size. When complexity grows, graduate to a parser: regex should stay the scalpel, not the hammer.