Explain a cryptic regex and rewrite it to be readable
Decodes a confusing regex token by token, surfaces edge cases and backtracking risk, then rewrites it readably.
Regexes are write-once, read-never. Six months later nobody remembers what `(?<=\d)(?=(\d{3})+$)` does, and changing it feels dangerous. This prompt does two jobs: it explains an existing pattern token by token in plain English, and it rewrites it to be maintainable. Crucially, it asks what the pattern fails to match and where it matches by accident—the two failure modes that cause production bugs—and it checks for catastrophic backtracking, which can turn one bad input into a denial-of-service. Specify the regex flavor (PCRE, JavaScript, Python re, RE2/Go), because features like lookbehind and named groups vary, and supply sample strings so the explanation is grounded in a concrete match table rather than hand-waving. The rewritten version uses named groups and verbose mode where supported, so the next reader actually understands it. Use it both to learn an inherited pattern and to sanity-check one you just wrote before committing.
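As a concrete illustration of the pattern quoted above, here is a minimal Python sketch of what it does when used in a substitution: it finds the zero-width positions where a thousands separator belongs. The helper name `add_thousands_separators` and the sample strings are assumptions made for illustration, and the verbose version is one possible readable rewrite, not the only one.

```python
import re

# The cryptic original quoted in the text: a zero-width position with a digit
# before it and a whole number of three-digit groups after it, up to the end.
CRYPTIC = re.compile(r"(?<=\d)(?=(\d{3})+$)")

# One possible readable rewrite using Python's verbose mode.
READABLE = re.compile(
    r"""
    (?<=\d)          # a digit sits immediately to the left
    (?=              # looking ahead, without consuming anything:
        (?:\d{3})+   #   one or more complete groups of three digits
        $            #   that run all the way to the end of the string
    )
    """,
    re.VERBOSE,
)


def add_thousands_separators(text, pattern=CRYPTIC):
    """Insert a comma at every position the pattern matches."""
    return pattern.sub(",", text)


# Hypothetical sample strings, including non-matches that expose the gotchas.
SAMPLES = ["1234567", "123", "1234", "abc1234", "1234.56", "12ab"]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample!r:12} -> {add_thousands_separators(sample)!r}")
```

With these samples, `"1234567"` becomes `"1,234,567"`, while `"1234.56"` comes back unchanged because the lookahead requires digits all the way to the end of the string. That second case is the kind of "fails to match" behavior the prompt is meant to surface.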
You are an expert in regular expressions and the [REGEX_FLAVOR] flavor. I will give you a regex I do not fully understand. First, explain it token by token in plain English. Then state exactly what it matches and—just as important—what it fails to match or matches by accident. Identify any catastrophic backtracking risk. Finally, rewrite it to be equivalent but more readable, using named groups and verbose/extended mode if the flavor supports it. Validate against these sample strings, marking each match or non-match: [SAMPLE_STRINGS]. Here is the regex: [REGEX] Return the breakdown, the gotchas, the rewritten version, and a small match table.
What you can expect back
Token breakdown:
- `^` / `$`: anchor to start/end of string.
- `[^@\s]+`: one or more characters that aren't '@' or whitespace (local part).
- `@`: a literal at-sign.
- `[^@\s]+`: the domain label(s).
- `\.[^@\s]+`: a dot followed by the final label.
What it matches/misses: accepts most simple addresses, including things like '[email protected]' (fine), and rejects 'user@example' (no dot, which is correct). It does not accept 'a@b.', because the final group needs one or more characters. No nested quantifiers, so no catastrophic backtracking risk.
Readable rewrite (no functional change): `/^(?<local>[^@\s]+)@(?<domain>[^@\s]+\.[^@\s]+)$/`
Match table:
- [email protected]: MATCH
- user@example: no match (no dot)
- @example.com: no match (empty local)
- [email protected]: MATCH
Illustrative example — your results will vary by tool and inputs.
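A match table like the one in the sample output can be checked mechanically rather than taken on trust. The sketch below ports the readable rewrite to Python, which spells named groups `(?P<name>...)` instead of the JavaScript-style `(?<name>...)`. The addresses in `SAMPLES` and the `match_table` helper are assumptions for illustration, not the strings from the original example.

```python
import re

# Python port of the readable rewrite, in verbose mode.
EMAIL_SHAPE = re.compile(
    r"""
    ^
    (?P<local>  [^@\s]+ )             # local part: no '@', no whitespace
    @
    (?P<domain> [^@\s]+ \. [^@\s]+ )  # at least one dot, with text after it
    $
    """,
    re.VERBOSE,
)

# Hypothetical samples paired with the result we expect from each.
SAMPLES = [
    ("user@example.com", True),
    ("user@example", False),       # no dot
    ("@example.com", False),       # empty local part
    ("a@b.", False),               # nothing after the final dot
    ("a@b@c.com", False),          # second '@'
    ("user@example.com\n", True),  # gotcha: Python's `$` also matches
                                   # just before a trailing newline
]


def match_table(samples):
    """Return (sample, expected, actual) rows for every sample."""
    return [
        (text, expected, EMAIL_SHAPE.match(text) is not None)
        for text, expected in samples
    ]


if __name__ == "__main__":
    for text, expected, actual in match_table(SAMPLES):
        flag = "ok" if expected == actual else "MISMATCH"
        print(f"{text!r:24} expected={expected!s:5} actual={actual!s:5} {flag}")
```

The last sample is a flavor-specific accidental match: replacing `$` with `\Z`, or calling `fullmatch`, would reject the trailing newline in Python.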
Get sharper results
- Always pass the flavor: lookbehind, named groups, and possessive quantifiers differ across engines.
- Include tricky non-matches in [SAMPLE_STRINGS]; that's where accidental matches get exposed.
- If it flags backtracking risk, ask for an atomic-group or RE2-safe rewrite.
- For validation regexes, consider whether a real parser (e.g. a URL/email library) is the better tool.
Adapt it for your case
Replace the regex with a plain-English spec like 'match US phone numbers in 3 formats' and ask it to produce and explain the pattern.
Add 'Rewrite to eliminate any super-linear backtracking, even if the pattern becomes longer.'
Common questions
Can it write a new regex instead of explaining one?
Yes—use the 'Write from scratch' variation: describe what to match in words and it will produce and explain the pattern.
What is catastrophic backtracking?
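It happens when a backtracking engine meets nested or overlapping quantifiers, such as `(a+)+$`, and has to try exponentially many ways to split an input that almost matches, which can freeze your app. The prompt checks for it explicitly.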
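To make the risk concrete, here is a small Python sketch comparing a pattern with a nested quantifier against a rewrite that accepts the same strings without the ambiguity. Both patterns and the `seconds_to_reject` helper are illustrative assumptions, and the timings will vary by machine; in a backtracking engine such as CPython's `re`, the risky pattern's time to reject typically grows very quickly with input length.

```python
import re
import time

# Nested quantifier: a run of letters can be split between iterations of the
# outer group in many ways, and the engine tries them all before giving up.
RISKY = re.compile(r"^(\w+\s?)+$")

# Same accepted strings (words separated by single whitespace characters,
# optional trailing whitespace), but each character can match only one way.
SAFER = re.compile(r"^\w+(?:\s\w+)*\s?$")


def seconds_to_reject(pattern, n):
    """Time how long `pattern` takes to reject n letters followed by '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' can never match
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the risky time.
    for n in (10, 14, 18):
        risky = seconds_to_reject(RISKY, n)
        safer = seconds_to_reject(SAFER, n)
        print(f"n={n:2}  risky={risky:.4f}s  safer={safer:.6f}s")
```

The rewrite is longer than the original, which is the trade-off the "eliminate super-linear backtracking" variation asks the model to accept.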
Why does the regex flavor matter?
Features like lookbehind, named groups, and verbose mode aren't universal. Go's RE2-based engine, for example, supports neither backreferences nor lookaround, so the rewrite must match your engine.
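One quick way to find out what a given engine accepts is to try compiling small probe patterns. The sketch below does this for Python's `re` module only; the `compiles` helper and the probe list are assumptions for illustration, and other engines will give different answers.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's `re` module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Hypothetical probes: each pairs a feature name with a tiny pattern using it.
PROBES = {
    "named group, Python syntax": r"(?P<year>\d{4})",
    "named group, JavaScript syntax": r"(?<year>\d{4})",
    "fixed-width lookbehind": r"(?<=\d)x",
    "variable-width lookbehind": r"(?<=\d+)x",
    "backreference": r"(\w)\1",
}

if __name__ == "__main__":
    for feature, pattern in PROBES.items():
        status = "supported" if compiles(pattern) else "rejected"
        print(f"{feature:32} {pattern:18} {status}")
```

In Python, the JavaScript-style named group and the variable-width lookbehind are rejected at compile time, which is why a rewrite produced for one flavor may not run unchanged in another.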