Regular expression matching is a fundamental algorithm used in compilers, interpreters, text processing engines, and search systems. The problem statement usually comes in the form:

Given a string s and a pattern p containing . (dot as any character) and * (zero or more repetitions), determine if the string s matches p.

This topic is important for both computer science interviews and real-world applications where regex-based pattern engines need efficient solutions. A naΓ―ve recursive approach quickly becomes inefficient. However, the Dynamic Programming (DP) approach offers an optimized polynomial-time solution.


Understanding the Problem

We are given:

  • . β†’ Matches any single character.
  • * β†’ Matches zero or more occurrences of the preceding element.

For example:

s = "aab"
p = "c*a*b"
Result = True (because "c*" can be skipped, "a*" matches "aa", "b" matches "b")

Dynamic Programming Approach

We define a DP table dp[i][j] as a boolean value indicating whether s[0..i-1] matches p[0..j-1].

Transition rules:

  1. If p[j-1] is a normal character or .:
    • dp[i][j] = dp[i-1][j-1] if s[i-1] == p[j-1] or p[j-1] == '.'.
  2. If p[j-1] == '*':
    • Treat it as zero occurrence: dp[i][j] = dp[i][j-2]
    • Treat it as one or more: dp[i][j] = dp[i-1][j] if s[i-1] == p[j-2] or p[j-2] == '.'

Visualizing State Transition


Example Walkthrough

Let’s take s = "ab" and p = ".*".

  • p[1] = '*' allows repeating previous element ..
  • dp[2][2] becomes true because both a and b match the wildcard sequence.

DP Table Visualization (βœ… = True, ❌ = False):

       ""   .   *
   "" [T]  [F] [T]
   a  [F]  [T] [T]
   b  [F]  [F] [T]

Python Implementation


def isMatch(s: str, p: str) -> bool:
    m, n = len(s), len(p)
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = True
    
    # Initialize patterns like a*, a*b*, etc.
    for j in range(2, n + 1):
        if p[j - 1] == '*':
            dp[0][j] = dp[0][j - 2]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            elif p[j - 1] == '*':
                dp[i][j] = dp[i][j - 2] or (
                    (p[j - 2] == '.' or p[j - 2] == s[i - 1]) and dp[i - 1][j]
                )
    
    return dp[m][n]

# Example usage
print(isMatch("aab", "c*a*b"))  # True
print(isMatch("mississippi", "mis*is*p*."))  # False

Interactive Demo Concept

Imagine typing a string and a regex pattern into a web form to watch the DP state table fill dynamically. For educational platforms, you could represent each dp[i][j] as a square in a grid, lighting up when it becomes True.

Regular Expression Matching: Dynamic Programming Approach Explained with Examples


Complexity Analysis

  • Time Complexity: O(m * n), where m = length of text, n = length of pattern.
  • Space Complexity: O(m * n) for full DP table; can be optimized to O(n) with rolling arrays.

Applications of Regex Matching with DP

  • String validation in compilers/interpreters
  • Pattern search in data pipelines
  • Form validators in web development
  • Advanced text editors (search-and-replace engines)

Conclusion

The Dynamic Programming approach transforms the complex problem of regular expression matching into a systematic process. By breaking it into subproblems, one can efficiently handle . and * operators even in long patterns.

Mastering this algorithm is crucial for coding interviews, compiler theory, and practical regex-based systems. With solid theoretical foundations, visual aids, and hands-on Python examples, one can develop a deep understanding of this core algorithm.