Regular expression matching is a fundamental algorithm used in compilers, interpreters, text processing engines, and search systems. The problem statement usually comes in the form:
Given a string
sand a patternpcontaining.(dot as any character) and*(zero or more repetitions), determine if the stringsmatchesp.
This topic is important for both computer science interviews and real-world applications where regex-based pattern engines need efficient solutions. A naΓ―ve recursive approach quickly becomes inefficient. However, the Dynamic Programming (DP) approach offers an optimized polynomial-time solution.
Understanding the Problem
We are given:
.β Matches any single character.*β Matches zero or more occurrences of the preceding element.
For example:
s = "aab"
p = "c*a*b"
Result = True (because "c*" can be skipped, "a*" matches "aa", "b" matches "b")
Dynamic Programming Approach
We define a DP table dp[i][j] as a boolean value indicating whether s[0..i-1] matches p[0..j-1].
Transition rules:
- If
p[j-1]is a normal character or.:dp[i][j] = dp[i-1][j-1]ifs[i-1] == p[j-1]orp[j-1] == '.'.
- If
p[j-1] == '*':- Treat it as zero occurrence:
dp[i][j] = dp[i][j-2] - Treat it as one or more:
dp[i][j] = dp[i-1][j]ifs[i-1] == p[j-2]orp[j-2] == '.'
- Treat it as zero occurrence:
Visualizing State Transition
Example Walkthrough
Let’s take s = "ab" and p = ".*".
p[1] = '*'allows repeating previous element..dp[2][2]becomes true because bothaandbmatch the wildcard sequence.
DP Table Visualization (β = True, β = False):
"" . *
"" [T] [F] [T]
a [F] [T] [T]
b [F] [F] [T]
Python Implementation
def isMatch(s: str, p: str) -> bool:
m, n = len(s), len(p)
dp = [[False] * (n + 1) for _ in range(m + 1)]
dp[0][0] = True
# Initialize patterns like a*, a*b*, etc.
for j in range(2, n + 1):
if p[j - 1] == '*':
dp[0][j] = dp[0][j - 2]
for i in range(1, m + 1):
for j in range(1, n + 1):
if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
dp[i][j] = dp[i - 1][j - 1]
elif p[j - 1] == '*':
dp[i][j] = dp[i][j - 2] or (
(p[j - 2] == '.' or p[j - 2] == s[i - 1]) and dp[i - 1][j]
)
return dp[m][n]
# Example usage
print(isMatch("aab", "c*a*b")) # True
print(isMatch("mississippi", "mis*is*p*.")) # False
Interactive Demo Concept
Imagine typing a string and a regex pattern into a web form to watch the DP state table fill dynamically. For educational platforms, you could represent each dp[i][j] as a square in a grid, lighting up when it becomes True.
Complexity Analysis
- Time Complexity:
O(m * n), wherem= length of text,n= length of pattern. - Space Complexity:
O(m * n)for full DP table; can be optimized toO(n)with rolling arrays.
Applications of Regex Matching with DP
- String validation in compilers/interpreters
- Pattern search in data pipelines
- Form validators in web development
- Advanced text editors (search-and-replace engines)
Conclusion
The Dynamic Programming approach transforms the complex problem of regular expression matching into a systematic process. By breaking it into subproblems, one can efficiently handle . and * operators even in long patterns.
Mastering this algorithm is crucial for coding interviews, compiler theory, and practical regex-based systems. With solid theoretical foundations, visual aids, and hands-on Python examples, one can develop a deep understanding of this core algorithm.








