The Longest Common Subsequence (LCS) problem is one of the most classic problems in dynamic programming and string matching algorithms. It forms the foundation for solving problems such as file comparison, DNA sequence analysis, diff tools, and text similarity measurement. In this article, we will explore the LCS problem in great detail, breaking it down with multiple examples, visual DP tables, pseudocode, and Python implementations.
What is the Longest Common Subsequence?
A subsequence of a string is a sequence obtained by deleting zero or more characters without changing the order of the remaining characters. The Longest Common Subsequence (LCS) between two strings is the longest sequence that appears as a subsequence in both strings.
Example:
String A = ABCBDAB
String B = BDCAB
One LCS = BCAB (length = 4)
Another LCS = BDAB (length = 4)
Applications of LCS
- Comparing version differences in source code.
- DNA sequence alignment in bioinformatics.
- Plagiarism detection and text comparison tools.
- Spell checkers and suggestion engines.
- Data compression algorithms.
Recursive Definition of LCS
Let X[0âŚm-1] and Y[0âŚn-1] be two given strings. We define LCS(X, Y) as follows:
If X[m-1] == Y[n-1]:
LCS(X[0..m-1], Y[0..n-1]) = 1 + LCS(X[0..m-2], Y[0..n-2])
Else:
LCS(X[0..m-1], Y[0..n-1]) = max(
LCS(X[0..m-2], Y[0..n-1]),
LCS(X[0..m-1], Y[0..n-2])
)
Dynamic Programming Approach
The recursive solution has exponential time complexity. To optimize, we use Dynamic Programming (DP), where we build a table dp[m+1][n+1] in a bottom-up manner.
DP Table Construction
Consider two strings:
X = ABCBDAB (length 7)
Y = BDCAB (length 5)
We construct a DP table of size (7+1) x (5+1) initialized to 0. Each cell dp[i][j] stores the LCS length of X[0..i-1] and Y[0..j-1].
Python Implementation of LCS (DP)
def lcs(X: str, Y: str) -> int:
m, n = len(X), len(Y)
dp = [[0] * (n+1) for _ in range(m+1)]
for i in range(1, m+1):
for j in range(1, n+1):
if X[i-1] == Y[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
return dp[m][n]
# Example
print(lcs("ABCBDAB", "BDCAB")) # Output: 4
Visual Walkthrough of LCS DP Table
Below is a visual explanation of how the DP table evolves for X = ABCBDAB, Y = BDCAB.
Reconstructing the LCS
To reconstruct the actual subsequence, we backtrack from dp[m][n]:
def lcs_string(X: str, Y: str) -> str:
m, n = len(X), len(Y)
dp = [[0] * (n+1) for _ in range(m+1)]
for i in range(1, m+1):
for j in range(1, n+1):
if X[i-1] == Y[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
# Backtracking to build LCS string
i, j = m, n
lcs_str = []
while i > 0 and j > 0:
if X[i-1] == Y[j-1]:
lcs_str.append(X[i-1])
i -= 1
j -= 1
elif dp[i-1][j] >= dp[i][j-1]:
i -= 1
else:
j -= 1
return "".join(reversed(lcs_str))
print(lcs_string("ABCBDAB", "BDCAB")) # Output: BCAB
Complexity Analysis
- Time Complexity:
O(m Ă n)where m and n are lengths of two strings. - Space Complexity:
O(m Ă n)for storing the DP table. - Optimized Space: Using only two 1D arrays, we can reduce space to
O(min(m, n)).
def lcs_optimized(X: str, Y: str) -> int:
if len(Y) > len(X):
X, Y = Y, X # Ensure X is longer
prev = [0] * (len(Y)+1)
curr = [0] * (len(Y)+1)
for i in range(1, len(X)+1):
for j in range(1, len(Y)+1):
if X[i-1] == Y[j-1]:
curr[j] = prev[j-1] + 1
else:
curr[j] = max(prev[j], curr[j-1])
prev, curr = curr, [0]*(len(Y)+1)
return prev[len(Y)]
Interactive Example (Try Yourself)
To deepen understanding, modify and run:
X = input("Enter first string: ")
Y = input("Enter second string: ")
print("LCS length:", lcs(X, Y))
print("LCS string:", lcs_string(X, Y))
Conclusion
The Longest Common Subsequence problem is a cornerstone of dynamic programming and widely applied across computer science fields. By mastering both the recursive and DP approaches, you gain a strong basis for tackling advanced string algorithms, sequence alignment, and coding interview challenges. Whether optimizing space complexity or reconstructing the subsequence, LCS remains a powerful illustration of how optimal substructure and overlapping subproblems guide algorithm design.








