Regular Expressions
In this section, you will learn about Regular Expressions.
What is a Regular Expression?
A Regular Expression (RegExp) is a powerful tool used to define search patterns for strings. It allows you to perform complex pattern matching and manipulation on strings, making it useful for tasks such as validation, extraction, and formatting of data.
Syntax
In JavaScript, a regular expression literal is written as follows:
/pattern/flags
Example
const regex = /abc/i; // Pattern to match 'abc', case insensitive
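As a minimal sketch, the same pattern can be created either as a literal or with the built-in RegExp constructor. The variable names below are illustrative only.

```javascript
// Literal syntax: the pattern is fixed when the script is parsed.
const literal = /abc/i;

// Constructor syntax: useful when the pattern is assembled from a string at runtime.
const constructed = new RegExp('abc', 'i');

// test() returns true when the pattern occurs anywhere in the input.
const literalResult = literal.test('xxABCxx');          // true
const constructedResult = constructed.test('xxABCxx');  // true
const noMatch = literal.test('ab c');                   // false
```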
Flags
Flags are optional modifiers that change how the regular expression behaves. The most commonly used flags in JavaScript are:
- g: Global search. The regex finds all matches, not just the first one.
- i: Case-insensitive search.
- m: Multiline search. Anchors (^ and $) match the start or end of each line.
- s: Dot (.) matches newline characters as well.
- u: Treat pattern as a Unicode pattern.
- y: Sticky mode. Matches only from the current position in the target string.
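The effect of the flags is easiest to see side by side. The following is a small sketch on an assumed three-line sample string; the variable names are illustrative.

```javascript
const text = 'Cat\ncat\nCAT';

// Without g, match() returns only the first match; with g it returns all of them.
const first = text.match(/cat/i)[0];               // 'Cat'
const all = text.match(/cat/gi);                   // ['Cat', 'cat', 'CAT']

// With m, ^ and $ match at the start and end of every line.
const perLine = text.match(/^cat$/gim).length;     // 3
const wholeStringOnly = text.match(/^cat$/gi);     // null

// With s, the dot also matches newline characters.
const dotDefault = /Cat.cat/.test(text);           // false
const dotAll = /Cat.cat/s.test(text);              // true

// With y, matching starts exactly at lastIndex and does not scan ahead.
const sticky = /cat/y;
sticky.lastIndex = 4;
const stickyHit = sticky.test(text);               // true: 'cat' begins at index 4
```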
Special Characters
Here are some special characters used in regular expressions:
- .: Matches any character except newline.
- ^: Matches the beginning of a string.
- $: Matches the end of a string.
- \d: Matches any digit (equivalent to [0-9]).
- \w: Matches any word character (alphanumeric + underscore).
- \s: Matches any whitespace character (spaces, tabs, line breaks).
- \b: Matches a word boundary.
- \: Escapes a special character, treating it as a literal character.
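A short sketch of these special characters in use, run against an assumed sample sentence:

```javascript
const sample = 'Order 66 costs $3.50 today.';

// \d+ finds runs of digits.
const digits = sample.match(/\d+/g);             // ['66', '3', '50']

// \b keeps 'cost' from matching as a whole word inside 'costs'.
const wholeWord = /\bcost\b/.test(sample);       // false
const wordPrefix = /\bcost/.test(sample);        // true

// An unescaped . matches any character; \. matches only a literal dot.
const looseDot = /3.50/.test('3x50');            // true
const literalDot = /3\.50/.test('3x50');         // false

// ^ and $ anchor the pattern to the start and end of the string.
const startsWithOrder = /^Order/.test(sample);   // true
const endsWithDot = /\.$/.test(sample);          // true
```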
Character Sets and Ranges
- [...]: Matches any character within the square brackets.
  const regex = /[abc]/; // Matches 'a', 'b', or 'c'
- [^...]: Matches any character NOT within the square brackets.
  const regex = /[^abc]/; // Matches any character except 'a', 'b', or 'c'
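Inside brackets, a hyphen between two characters defines a range. The sketch below shows a range and a negated set; the hex-digit check and the phone-number input are illustrative choices.

```javascript
// A range such as 0-9 or a-f stands for every character between its endpoints.
const hexDigit = /[0-9a-fA-F]/;
const isHex = hexDigit.test('c');     // true
const notHex = hexDigit.test('g');    // false

// A negated set is handy for stripping unwanted characters.
const digitsOnly = '(562) 988-1688'.replace(/[^0-9]/g, '');  // '5629881688'
```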
Quantifiers
Quantifiers define how many times a character or group can appear.
- *: 0 or more times
  const regex = /ab*c/; // Matches 'ac', 'abc', 'abbc', etc.
- +: 1 or more times
  const regex = /ab+c/; // Matches 'abc', 'abbc', but not 'ac'
- ?: 0 or 1 time
  const regex = /ab?c/; // Matches 'ac' or 'abc'
- {n}: Exactly n times
  const regex = /a{3}/; // Matches 'aaa'
- {n,}: At least n times
  const regex = /a{2,}/; // Matches 'aa', 'aaa', 'aaaa', etc.
- {n,m}: Between n and m times
  const regex = /a{2,4}/; // Matches 'aa', 'aaa', or 'aaaa'
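To compare the quantifiers directly, the following sketch runs each one over the same input. The helper allMatches is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: returns every match of a global pattern, or an empty array.
function allMatches(pattern, input) {
  return input.match(pattern) || [];
}

const input = 'ac abc abbc abbbc';

const star = allMatches(/ab*c/g, input);        // ['ac', 'abc', 'abbc', 'abbbc']
const plus = allMatches(/ab+c/g, input);        // ['abc', 'abbc', 'abbbc']
const optional = allMatches(/ab?c/g, input);    // ['ac', 'abc']
const exact = allMatches(/ab{2}c/g, input);     // ['abbc']
const atLeast = allMatches(/ab{2,}c/g, input);  // ['abbc', 'abbbc']
const between = allMatches(/ab{1,2}c/g, input); // ['abc', 'abbc']
```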
Groups and Alternations
- (...): Groups multiple tokens together for capturing or applying quantifiers.
  const regex = /(abc)+/; // Matches 'abc', 'abcabc', etc.
- |: Alternation, like a logical OR.
  const regex = /cat|dog/; // Matches either 'cat' or 'dog'
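Groups also capture the text they match, and they limit how far an alternation reaches. A brief sketch with illustrative inputs:

```javascript
// Parentheses capture the text they match; match() exposes captures by index.
const fileMatch = 'file: report.pdf'.match(/(\w+)\.(pdf|docx)/);
const baseName = fileMatch[1];    // 'report'
const extension = fileMatch[2];   // 'pdf'

// Alternation has the lowest precedence, so anchors bind to one branch only
// unless the alternatives are grouped.
const ungrouped = /^cat|dog$/.test('catalog');   // true: the '^cat' branch matches
const grouped = /^(cat|dog)$/.test('catalog');   // false: whole string must be 'cat' or 'dog'

// Captured groups can be reused in replace() as $1, $2, ...
const swapped = 'Doe, Jane'.replace(/(\w+), (\w+)/, '$2 $1');  // 'Jane Doe'
```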
Lookaheads and Lookbehinds
Lookaheads and lookbehinds allow for conditional matching without consuming characters.
- Positive Lookahead (?=): Ensures that the following characters match a pattern.
  const regex = /abc(?=d)/; // Matches 'abc' only if followed by 'd'
- Negative Lookahead (?!): Ensures that the following characters do NOT match a pattern.
  const regex = /abc(?!d)/; // Matches 'abc' only if NOT followed by 'd'
- Positive Lookbehind (?<=): Ensures that the preceding characters match a pattern.
  const regex = /(?<=a)bc/; // Matches 'bc' only if preceded by 'a'
- Negative Lookbehind (?<!): Ensures that the preceding characters do NOT match a pattern.
  const regex = /(?<!a)bc/; // Matches 'bc' only if NOT preceded by 'a'
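Because lookarounds do not consume characters, the surrounding context never appears in the match itself. The sketch below uses an assumed price list and needs a JavaScript engine with lookbehind support.

```javascript
const prices = 'USD 40, EUR 35, USD 12';

// Positive lookbehind: digits preceded by 'USD ', without 'USD ' in the result.
const usdAmounts = prices.match(/(?<=USD )\d+/g);        // ['40', '12']

// Negative lookbehind: digits NOT preceded by 'USD ' (or by another digit,
// which prevents matching the tail of a USD amount).
const otherAmounts = prices.match(/(?<!USD |\d)\d+/g);   // ['35']

// Positive lookahead: digits followed by a comma; the comma is not included.
const beforeComma = prices.match(/\d+(?=,)/g);           // ['40', '35']

// Negative lookahead: digits NOT followed by a comma (or by another digit).
const lastAmount = prices.match(/\d+(?!,|\d)/g);         // ['12']
```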
Common Patterns
Numbers
Pattern | Desc | Example |
---|---|---|
\d+ | Whole Numbers | 1, 2, 3 |
\d+\.\d+ | Decimal Numbers | 1.1, 1.2 |
\d+(\.\d+)? | Whole + Decimal Numbers | 1, 1.1 |
-?\d+(\.\d+)? | Negative, Positive Whole + Decimal Numbers | -1, 1, 1.2 |
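The number patterns above are unanchored, so they find a number anywhere in the input. For validating that an entire input is a number, one option is to wrap the pattern in ^ and $, as in this sketch (variable names are illustrative):

```javascript
// Unanchored: succeeds if a number occurs anywhere inside the input.
const foundInside = /-?\d+(\.\d+)?/.test('abc-1.5xyz');   // true

// Anchored: the whole input must be a signed whole or decimal number.
const signedNumber = /^-?\d+(\.\d+)?$/;
const okInteger = signedNumber.test('-1');                // true
const okDecimal = signedNumber.test('1.2');               // true
const trailingDot = signedNumber.test('1.');              // false
const embedded = signedNumber.test('abc-1.5xyz');         // false
```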
Language
Pattern | Desc | Example |
---|---|---|
[a-zA-Z]+ | English | abc |
[\u4e00-\u9fa5]+ | Chinese | 汉语 |
JSON
Pattern | Desc | Example |
---|---|---|
(?<="name":)[^,]+(?=,) | JSON Value | {"name":"Tapicker", "age":18} -> "Tapicker" |
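A quick sketch of the JSON pattern applied to the example row. Note that the match keeps the surrounding quotes; when the complete JSON text is available and valid, the built-in JSON.parse is usually the more robust choice.

```javascript
const raw = '{"name":"Tapicker", "age":18}';

// The lookbehind and lookahead frame the value without being part of the match.
const viaRegex = raw.match(/(?<="name":)[^,]+(?=,)/)[0];   // '"Tapicker"' (quotes included)

// Parsing the whole document yields the unquoted string value.
const viaParse = JSON.parse(raw).name;                     // 'Tapicker'
```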
Credit
Pattern | Desc | Example |
---|---|---|
4[0-9]{12}(?:[0-9]{3})? | Visa Credit | - |
3[47][0-9]{13} | American Express Credit | - |
([1-9]{1})(\d{15}|\d{18}) | China Credit | - |
Phone Number
Pattern | Desc | Example |
---|---|---|
\(\d{3}\) \d{3}-?\d{4} | US Phone Number | (562) 988-1688 or (562) 9881688 |
\d{3}-\d{8}|\d{4}-\d{7} | CN Phone Number | 0511-4405222 or 021-87888822 |
1[3456789]\d{9} | CN Cellphone Number | 18623236565 |
Zip Code
Pattern | Desc | Example |
---|---|---|
[1-9]\d{5}(?!\d) | CN | 516285 |
\d{5}-\d{4}|\d{5} | US | 90807 or 92064-3404 |
Email
Pattern | Desc | Example |
---|---|---|
\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* | Email Address | support@tapicker.com |
Date
Pattern | Desc | Example |
---|---|---|
\d{1,2}\/\d{1,2}\/\d{4} | Date | 10/24/2022 |
\d{4}-\d{1,2}-\d{1,2} | Date | 2022-10-24 |
\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2} | Datetime | 2022-10-24 12:08:16 |
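These date patterns check only the shape of the text, so a value such as 2022-13-45 also matches. As a sketch of one way to combine capture groups with a calendar check, the hypothetical helper parseIsoDate below uses the built-in Date object; it is an illustration, not a complete date parser.

```javascript
// Hypothetical helper: parse 'YYYY-MM-DD' and reject impossible calendar dates.
function parseIsoDate(text) {
  const m = /^(\d{4})-(\d{1,2})-(\d{1,2})$/.exec(text);
  if (!m) return null;

  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);

  // Date silently rolls invalid values over (month 13 becomes January of the
  // next year), so compare the resulting parts with the parsed ones.
  const date = new Date(Date.UTC(year, month - 1, day));
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;

  return valid ? { year, month, day } : null;
}

const good = parseIsoDate('2022-10-24');       // { year: 2022, month: 10, day: 24 }
const impossible = parseIsoDate('2022-13-45'); // null
const wrongShape = parseIsoDate('10/24/2022'); // null
```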
Others
Pattern | Desc | Example |
---|---|---|
https?://([\w-]+\.)+[\w-]+(/[\w-./?%&=]*)? | URL | https://www.tapicker.com/ |
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} | IP Address | 192.168.1.1 |
(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2} | MAC Address | 00:1b:63:84:45:e6 |
<(\S*?)[^>]*>.*?|<\/.*?> | HTML | <p id="test"></p> |
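The IP address pattern accepts any group of one to three digits, including values above 255 such as 999.1.1.1. A minimal sketch of a stricter check, using a hypothetical isIPv4 helper that captures each octet and tests its numeric range:

```javascript
// Hypothetical helper: the regex checks the shape, the numeric test checks the range.
function isIPv4(text) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(text);
  if (!m) return false;
  // m[0] is the full match; m[1] to m[4] are the four captured octets.
  return m.slice(1).every((octet) => Number(octet) <= 255);
}

const validAddress = isIPv4('192.168.1.1');   // true
const outOfRange = isIPv4('999.1.1.1');       // false
const tooShort = isIPv4('1.2.3');             // false
```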