Regular expressions are one of the most powerful tools in a developer's toolkit. They let you search, match, extract, and replace text using compact pattern descriptions. Whether you are validating user input, parsing log files, or refactoring code, regex saves hours of manual work. This guide covers every major regex concept with practical examples you can use immediately.

The Anatomy of a Regex

A regular expression consists of two parts: a pattern and optional flags. In JavaScript, a regex literal is written between forward slashes:

/pattern/flags

For example, /hello/i matches the word "hello" in a case-insensitive manner. You can also create a regex using the RegExp constructor:

const re = new RegExp('hello', 'i');

JavaScript provides three primary methods for working with regex patterns:

// .test() — boolean check
/^\d{5}$/.test('90210');        // true

// .match() — extract matches
'2026-03-30'.match(/\d+/g);    // ['2026', '03', '30']

// .replace() — transform text
'hello world'.replace(/world/, 'regex');  // 'hello regex'

Understanding when to use each method is essential. Use .test() for simple yes/no validation, .match() when you need to extract data, and .replace() when you need to transform strings.

Anchors and Boundaries

Anchors do not match characters — they match positions within a string. They are essential for ensuring your pattern matches at the right location.

// ^ and $ — full-string validation
/^\d{4}-\d{2}-\d{2}$/.test('2026-03-30');  // true
/^\d{4}-\d{2}-\d{2}$/.test('Date: 2026-03-30');  // false

// \b — whole-word matching
/\bcat\b/.test('the cat sat');     // true
/\bcat\b/.test('concatenate');     // false

// \B — non-boundary matching
/\Bcat\B/.test('concatenate');     // true
/\Bcat\B/.test('the cat sat');     // false

A common mistake is forgetting anchors when validating input. Without ^ and $, the pattern /\d{5}/ would match "12345" inside "abc123456xyz", which is rarely what you want for validation.

Character Classes

Character classes let you match any one character from a defined set. They are the building blocks of most regex patterns.

Each shorthand has a negated counterpart in uppercase:

You can also define custom character classes using square brackets:

// Custom ranges
/[a-z]/        // any lowercase letter
/[A-Z]/        // any uppercase letter
/[a-zA-Z]/     // any letter
/[0-9a-fA-F]/  // any hexadecimal digit

// Negated classes
/[^aeiou]/     // any character except lowercase vowels
/[^0-9]/       // any non-digit (same as \D)

// Special characters inside classes
/[.\-+]/       // literal dot, hyphen, plus sign

Inside square brackets, most special characters lose their meaning and are treated as literals. The exceptions are ], \, ^ (at the start), and - (between characters), which must be escaped or placed strategically.

Quantifiers

Quantifiers control how many times a character or group can repeat. They are appended directly after the element they quantify.

// Basic quantifiers
/\d+/          // one or more digits
/\w*/          // zero or more word characters
/https?/       // "http" or "https" (s is optional)

// Exact and range counts
/\d{3}/        // exactly 3 digits
/\d{2,4}/      // 2 to 4 digits
/[a-z]{1,}/    // one or more lowercase letters (same as [a-z]+)

By default, all quantifiers are greedy — they match as many characters as possible. This can cause unexpected behavior:

// Greedy (default) — matches the longest possible string
'<b>hello</b>'.match(/<.*>/);
// Result: '<b>hello</b>' (entire string)

// Lazy — matches the shortest possible string
'<b>hello</b>'.match(/<.*?>/);
// Result: '<b>' (first tag only)

To make any quantifier lazy, append a ? after it: *?, +?, ??, {n,m}?. Lazy matching is especially important when parsing markup, log formats, or any delimited text where greedy matching would overshoot.

Groups and Capturing

Parentheses () serve two purposes in regex: they group sub-expressions and capture the matched text for later use.

Capturing groups store the matched text so you can reference it in replacements or extract it from match results:

// Capturing groups — extract parts of a match
const match = '2026-03-30'.match(/(\d{4})-(\d{2})-(\d{2})/);
// match[0] = '2026-03-30' (full match)
// match[1] = '2026'       (year)
// match[2] = '03'         (month)
// match[3] = '30'         (day)

// Back-references in .replace()
'John Smith'.replace(/(\w+) (\w+)/, '$2, $1');
// Result: 'Smith, John'

Non-capturing groups (?:) group sub-expressions without storing the match. Use them when you need grouping for quantifiers or alternation but do not need the captured value:

// Non-capturing group — grouping without capturing
/(?:https?|ftp):\/\//
// Matches "http://", "https://", or "ftp://" without capturing the protocol

Lookaheads assert that a pattern follows (or does not follow) the current position, without consuming characters:

// Positive lookahead (?=...) — assert what follows
/\d+(?= dollars)/.exec('100 dollars');
// Matches '100' (followed by ' dollars', but ' dollars' is not consumed)

// Negative lookahead (?!...) — assert what does NOT follow
/\d+(?! dollars)/.exec('100 euros');
// Matches '100' (not followed by ' dollars')

Lookaheads are invaluable for password validation. For example, to require at least one uppercase letter, one digit, and a minimum length of 8:

/^(?=.*[A-Z])(?=.*\d).{8,}$/

Each (?=...) checks a condition from the start of the string without advancing the position, so multiple conditions can be stacked.

Real-World Patterns

Below are battle-tested regex patterns for common validation and extraction tasks. Each pattern balances strictness with practicality — they work for the majority of real-world inputs without being overly complex.

Use Case Pattern Example Match
Email ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$ [email protected]
URL ^https?:\/\/[^\s/$.?#].[^\s]*$ https://toolplex.dev/regex-tester/
Phone (US) ^\+?1?[\s\-.]?\(?\d{3}\)?[\s\-.]?\d{3}[\s\-.]?\d{4}$ (555) 123-4567
Date (YYYY-MM-DD) ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$ 2026-03-30
IPv4 ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$ 192.168.1.1
Hex Color ^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$ #ff5733
Slug ^[a-z0-9]+(?:-[a-z0-9]+)*$ my-blog-post
Postal Code (US) ^\d{5}(-\d{4})?$ 90210 or 90210-1234

A few notes on these patterns. The email regex covers the vast majority of valid addresses but does not attempt to implement the full RFC 5322 specification, which allows quoted local parts and other rarely used syntax. For production email validation, consider combining a regex check with a confirmation email. The URL pattern is intentionally simple — for robust URL parsing, use the built-in URL constructor in JavaScript and catch any errors.

Flags Reference

Flags modify how the regex engine processes the pattern. They are placed after the closing slash in a regex literal or passed as the second argument to new RegExp().

// g — global: find all matches
'aabbaab'.match(/a+/g);            // ['aa', 'aa']

// i — case-insensitive
/hello/i.test('Hello World');       // true

// m — multiline anchors
'line1\nline2'.match(/^\w+/gm);    // ['line1', 'line2']

// s — dotAll: dot matches newlines
/foo.bar/s.test('foo\nbar');        // true

// u — unicode
/\p{Emoji}/u.test('🎉');            // true

Flags can be combined freely. A common combination is gi for a global, case-insensitive search, or gm for matching line-by-line across a multi-line string.