JavaScript RegExp *: Zero or More Occurrences
In JavaScript regular expressions, the * quantifier matches zero or more occurrences of the preceding character, group, or character class. This means the preceding element can appear any number of times, or not at all. This guide explains how to use the * quantifier, with practical examples.
Definition and Purpose
The * (asterisk) quantifier matches a sequence in which the preceding element occurs zero or more times. It is greedy by default: it consumes as many repetitions as it can while still allowing the rest of the pattern to match.
Syntax
The syntax for using the * quantifier is straightforward:
pattern*
Here, pattern represents the character, group, or character class you want to match zero or more times.
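Because * applies only to the element immediately before it, grouping changes what gets repeated. The following is a minimal sketch of that difference; the input string and variable names are illustrative assumptions, not part of the original examples.

```javascript
// Illustrative input; all names here are assumptions made for this sketch.
const input = "ababbb";

// * applies only to the single character "b": "a" followed by zero or more "b".
const singleChar = input.match(/ab*/)[0];

// * applies to the whole group "ab", so the pair itself repeats.
const group = input.match(/(ab)*/)[0];

// * applies to the character class: any run of "a" or "b" characters.
const charClass = input.match(/[ab]*/)[0];

console.log(singleChar); // "ab"
console.log(group);      // "abab"
console.log(charClass);  // "ababbb"
```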
Use Cases
- Matching Optional or Repeated Characters: The * lets a character be absent or repeated any number of times. (Use ? instead when the character should appear at most once.)
- Handling Variable Length Sequences: Useful for matching sequences where the length can vary.
- Finding Patterns with Flexible Structures: Enables matching patterns that can have optional or repeating parts.
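As one possible illustration of the variable-length use case, the sketch below tolerates any amount of whitespace around an equals sign. The parsePair helper and the input lines are hypothetical and exist only for this example.

```javascript
// Hypothetical helper for this sketch: split "key = value" with flexible spacing.
function parsePair(line) {
  // \s* tolerates zero or more whitespace characters on each side of "=".
  const m = line.match(/^(\w+)\s*=\s*(\w+)$/);
  return m ? { key: m[1], value: m[2] } : null;
}

const tight = parsePair("a=1");           // key "a", value "1"
const loose = parsePair("name   =  bob"); // key "name", value "bob"
const invalid = parsePair("no equals");   // null, because there is no "="

console.log(tight, loose, invalid);
```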
Examples
Let’s explore some examples to demonstrate the usage of the * quantifier.
Basic Usage: Matching Zero or More of a Character
In this example, we’ll match the letter “b” followed by zero or more occurrences of the letter “a”.
const str1 = "baaa";
const str2 = "b";
const str3 = "bcd";
const regexStar1 = /ba*/;
console.log(regexStar1.test(str1)); // Output: true (matches "baaa")
console.log(regexStar1.test(str2)); // Output: true (matches "b")
console.log(regexStar1.test(str3)); // Output: true (matches "b")
In the above example:
- ba* matches “b” followed by zero or more “a” characters.
- In "baaa", it matches “baaa” (the quantifier is greedy, so it takes all three “a” characters).
- In "b", it matches “b” (zero occurrences of “a”).
- In "bcd", it matches “b” (zero occurrences of “a”).
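Since test() only reports whether a match exists, it can help to inspect the matched text with match(). The sketch below does that for the ba* pattern; the matchedText helper is a hypothetical name introduced for this example.

```javascript
// Hypothetical helper: return the matched text, or null when nothing matches.
function matchedText(str, regex) {
  const m = str.match(regex); // without the g flag, m[0] is the matched substring
  return m ? m[0] : null;
}

const starPattern = /ba*/;

console.log(matchedText("baaa", starPattern)); // "baaa"
console.log(matchedText("b", starPattern));    // "b"
console.log(matchedText("bcd", starPattern));  // "b"
console.log(matchedText("xyz", starPattern));  // null (there is no "b" to start from)
```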
Matching Zero or More of a Group
You can use the * quantifier to match zero or more occurrences of a group of characters.
const str4 = "ababab";
const str5 = "abc";
const str6 = "xyz";
const regexStar2 = /(ab)*/;
console.log(regexStar2.test(str4)); // Output: true (matches "ababab")
console.log(regexStar2.test(str5)); // Output: true (matches "ab")
console.log(regexStar2.test(str6)); // Output: true (matches "")
In the above example:
- (ab)* matches zero or more occurrences of the group “ab”.
- In "ababab", it matches “ababab”.
- In "abc", it matches “ab” (one occurrence of “ab” at the beginning).
- In "xyz", it matches “” (zero occurrences of “ab” at the beginning).
Because (ab)* can match the empty string, test() returns true for every input here.
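When a capturing group is repeated with *, the group holds only its last repetition, and it is undefined when the group matched zero times. A short sketch, with variable names chosen for illustration:

```javascript
const groupStar = /(ab)*/;

// Three repetitions: the full match is "ababab", but group 1 keeps only the last "ab".
const full = "ababab".match(groupStar);
console.log(full[0], full[1]); // ababab ab

// One repetition followed by "c": the match stops after the first "ab".
const partial = "abc".match(groupStar);
console.log(partial[0]); // ab

// Zero repetitions: an empty match at index 0, and group 1 is undefined.
const none = "xyz".match(groupStar);
console.log(JSON.stringify(none[0]), none[1], none.index); // "" undefined 0
```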
Matching Zero or More of a Character Class
Here, we use the * quantifier to match zero or more word characters.
const str7 = "hello123";
const str8 = "hello";
const str9 = "!@#";
const regexStar3 = /\w*/;
console.log(regexStar3.test(str7)); // Output: true (matches "hello123")
console.log(regexStar3.test(str8)); // Output: true (matches "hello")
console.log(regexStar3.test(str9)); // Output: true (matches "")
In the above example:
- \w* matches zero or more word characters (alphanumeric characters and underscores).
- In "hello123", it matches “hello123”.
- In "hello", it matches “hello”.
- In "!@#", it matches “” (zero word characters at the beginning).
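Empty matches become visible when \w* is used with the g flag: every position where no word character starts still yields an empty string. The sketch below uses an illustrative input that is not part of the original examples.

```javascript
// With the g flag, match() returns every match, including the empty ones.
const wordRuns = "hi there".match(/\w*/g);
console.log(wordRuns); // [ 'hi', '', 'there', '' ]

// Dropping the empty strings leaves only the real runs of word characters.
const nonEmpty = wordRuns.filter((s) => s.length > 0);
console.log(nonEmpty); // [ 'hi', 'there' ]
```

When only non-empty runs are wanted, the related + quantifier (one or more) avoids the empty matches altogether.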
Combining with Other Pattern Elements
The * can be combined with literal characters, character classes, and other quantifiers to create more complex patterns.
const str10 = "abc123xyz";
const str11 = "abcxyz";
const str12 = "abc";
const regexStar4 = /abc\d*xyz/;
console.log(regexStar4.test(str10)); // Output: true (matches "abc123xyz")
console.log(regexStar4.test(str11)); // Output: true (matches "abcxyz")
console.log(regexStar4.test(str12)); // Output: false (no "xyz")
In the above example:
- abc\d*xyz matches “abc”, followed by zero or more digits, followed by “xyz”.
- In "abc123xyz", it matches “abc123xyz”.
- In "abcxyz", it matches “abcxyz” (zero digits between “abc” and “xyz”).
- In "abc", it doesn’t match because “xyz” is missing.
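Wrapping \d* in a capturing group makes it possible to read back the digits that were matched, including the case where there are none. The extractDigits helper below is a hypothetical name used only for this sketch.

```javascript
// Hypothetical helper: return the digits between "abc" and "xyz",
// "" when there are none, or null when the pattern does not match.
function extractDigits(str) {
  const m = str.match(/abc(\d*)xyz/);
  return m ? m[1] : null;
}

console.log(extractDigits("abc123xyz")); // "123"
console.log(extractDigits("abcxyz"));    // "" (zero digits still counts as a match)
console.log(extractDigits("abc"));       // null ("xyz" is missing)
```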
Real-World Example: Matching HTML Tags
In this example, we’ll match HTML tags, which may or may not have attributes.
const html1 = "<p>Hello</p>";
const html2 = "<div class='container'>World</div>";
const html3 = "<span></span>";
const regexStar5 = /<[a-z]+\s*\/?>|<[a-z]+.*?>.*?<\/[a-z]+>/;
console.log(regexStar5.test(html1)); // Output: true
console.log(regexStar5.test(html2)); // Output: true
console.log(regexStar5.test(html3)); // Output: true
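In the above example:
- The regular expression /<[a-z]+\s*\/?>|<[a-z]+.*?>.*?<\/[a-z]+>/ matches HTML tags.
- <[a-z]+\s*\/?> matches a tag without attributes, such as <p>, <img>, or a self-closing <br />.
- \s* allows for zero or more whitespace characters before the closing /> or >.
- <[a-z]+.*?>.*?<\/[a-z]+> matches a full element: an opening tag (with or without attributes), its content, and a closing tag.
Alternatives are tried from left to right, so an element without attributes, such as <p>Hello</p>, is matched by the first alternative as just <p>. The pattern is illustrative only: it accepts lowercase tag names and does not check that the closing tag name matches the opening one.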
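To see which part of each input the pattern actually consumes, the matched text can be printed with match(). This is only a sketch built on the three sample strings above; the variable names are assumptions.

```javascript
const tagPattern = /<[a-z]+\s*\/?>|<[a-z]+.*?>.*?<\/[a-z]+>/;

const samples = [
  "<p>Hello</p>",
  "<div class='container'>World</div>",
  "<span></span>",
];

// Take the matched substring for each sample (all three are known to match).
const matched = samples.map((s) => s.match(tagPattern)[0]);

console.log(matched[0]); // "<p>" (first alternative: opening tag only)
console.log(matched[1]); // "<div class='container'>World</div>" (second alternative)
console.log(matched[2]); // "<span>" (first alternative again)
```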
Tips and Best Practices
- Be Specific: Use more specific patterns to avoid unintended matches. A pattern in which every element is starred, such as /\w*/, also matches the empty string, so test() returns true for any input.
- Combine with Anchors: Use ^ and $ to match the beginning and end of the string, ensuring the entire string matches the pattern.
- Use Non-Greedy Matching: When * is followed by more of the pattern, use the non-greedy version *? to match as few characters as possible.
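The anchoring and non-greedy tips can be sketched as follows. The patterns and sample strings here are illustrative assumptions rather than part of the original examples.

```javascript
// Anchors: the whole string must consist of zero or more digits.
const onlyDigits = /^\d*$/;
console.log(onlyDigits.test("12345")); // true
console.log(onlyDigits.test("12a45")); // false
console.log(onlyDigits.test(""));      // true (zero digits is allowed)

// Greedy vs. non-greedy: .* runs to the last "</b>", .*? stops at the first.
const text = "<b>one</b> and <b>two</b>";
const greedy = text.match(/<b>.*<\/b>/)[0];
const lazy = text.match(/<b>.*?<\/b>/)[0];

console.log(greedy); // "<b>one</b> and <b>two</b>"
console.log(lazy);   // "<b>one</b>"
```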
Conclusion
The * (zero or more) quantifier in JavaScript regular expressions matches any number of repetitions of the preceding element, including none. Keep in mind that it is greedy by default and that it can match the empty string. Combined with anchors, groups, and the non-greedy form *?, the * quantifier covers a wide range of pattern-matching tasks.