Python Regex Tutorial with Examples

Regular expressions, also known as regex, are a powerful tool for text processing and manipulation in Python. They let us search for and match patterns in strings, and perform operations such as substitution and splitting. In this article, we will explore the basics of regex in Python and how to use it for common tasks.

Basic Regex Syntax

Before we dive into the different ways to use regex in Python, let’s first go over the basic syntax of regex. The following are some of the most commonly used regex characters and symbols:

  • ^: Matches the start of a string (or the start of each line when the re.MULTILINE flag is set).
  • $: Matches the end of a string (or the end of each line when the re.MULTILINE flag is set).
  • .: Matches any single character (except a newline).
  • *: Matches zero or more occurrences of the preceding element (a character, group, or character class).
  • +: Matches one or more occurrences of the preceding element.
  • ?: Matches zero or one occurrence of the preceding element.
  • {n}: Matches exactly n occurrences of the preceding element.
  • {n,}: Matches n or more occurrences of the preceding element.
  • {m,n}: Matches at least m and at most n occurrences of the preceding element.
  • []: Matches any one character listed within the square brackets. For example, [0123456789] (or the range [0-9]) matches any digit.
  • [^]: Matches any one character not listed within the square brackets. For example, [^aeiou] matches any character that is not a lowercase vowel.
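The symbols above can be tried out with a few small checks. This is a minimal sketch: the sample strings and the helper name `matches` are invented for illustration, and it uses `re.fullmatch()` and `re.search()` from the standard library.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Quantifiers apply to the element just before them.
print(matches(r"ab*c", "ac"))           # True: * allows zero b's
print(matches(r"ab+c", "ac"))           # False: + needs at least one b
print(matches(r"colou?r", "color"))     # True: ? makes the u optional
print(matches(r"[0-9]{3}", "123"))      # True: exactly three digits
print(matches(r"[0-9]{2,4}", "12345"))  # False: five digits is more than four

# Anchors and character classes with re.search().
print(bool(re.search(r"^Hello", "Hello, World!")))  # True: string starts with Hello
print(bool(re.search(r"World$", "Hello, World!")))  # False: string ends with "!"
print(re.search(r"[^aeiou]", "audio").group())      # d: first non-vowel character
```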

Searching for Patterns in Strings

One of the most basic operations in regex is searching for a pattern in a string. In Python, the re module provides the search() function for this. It scans the string and returns a match object for the first location where the pattern matches, or None if there is no match. For example:

import re

string = "Hello, World!"

match = re.search("Hello", string)
if match:
    print("Match found!")
else:
    print("Match not found.")

Output:

Match found!

We can also use the group() method on the match object to get the actual matched string. For example:

import re

string = "Hello, World!"

match = re.search("Hello", string)
if match:
    print("Match found:", match.group())
else:
    print("Match not found.")

Output:

Match found: Hello
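Besides group(), a match object also records where the match was found. The sketch below is illustrative only (the variable names are made up here); it uses the start(), end(), and span() methods of the match object and shows why the None check matters.

```python
import re

text = "Hello, World!"

match = re.search(r"World", text)
if match:
    print(match.group())  # World
    print(match.start())  # 7  (index of the first matched character)
    print(match.end())    # 12 (index just past the last matched character)
    print(match.span())   # (7, 12)

# search() returns None when nothing matches, so test the result
# before calling group() on it.
missing = re.search(r"Python", text)
print(missing is None)    # True
```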

Matching Multiple Patterns

Sometimes, we may want every occurrence of a pattern rather than just the first one. In such cases, we can use the findall() function, which returns a list of all non-overlapping matches in the string, in the order they are found. For example:

import re

string = "Hello, World! 12345"

matches = re.findall(r"\d+", string)

print("Matches:", matches)

Output:

Matches: ['12345']

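Note that \d is a shorthand character class that matches any digit. Writing a pattern as a raw string, such as r"\d+", keeps Python from treating the backslash as a string escape before the regex engine sees it.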
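To see findall() return more than one match, here is a small sketch with an invented sample sentence. It also uses the | alternation operator, which is not covered in the syntax list above, as one way to look for either of two words in a single pass.

```python
import re

text = "Order 66 shipped 3 items on day 125"

# Every run of digits, in the order it appears.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['66', '3', '125']

# Alternation: match either "shipped" or "items".
words = re.findall(r"shipped|items", text)
print(words)    # ['shipped', 'items']

# No matches gives an empty list, not None.
nothing = re.findall(r"\d+", "no digits here")
print(nothing)  # []
```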

Search and Replace

Another common operation in regex is search and replace. In Python, we can use the sub() function to replace every match of a pattern in a string with a specified replacement. For example:

import re

string = "Hello, World! 12345"

new_string = re.sub(r"\d+", "#####", string)

print("New string:", new_string)

Output:

New string: Hello, World! #####
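sub() accepts two further options that are often useful: a count argument that limits how many matches are replaced, and a function in place of the replacement string. The following is a sketch under those assumptions; the sample text and the `mask` function are hypothetical.

```python
import re

text = "2021-03-04 and 2022-11-30"

# count=1 replaces only the first match.
first_only = re.sub(r"\d+", "#", text, count=1)
print(first_only)  # #-03-04 and 2022-11-30

def mask(match):
    """Hypothetical replacement function: one # per matched digit."""
    return "#" * len(match.group())

# The function is called once per match with the match object.
masked = re.sub(r"\d+", mask, text)
print(masked)      # ####-##-## and ####-##-##
```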

Splitting Strings

Regex can also be used to split a string into substrings wherever a pattern matches. In Python, the split() function in the re module does this, and the matched text itself is left out of the result. For example:

import re

string = "Hello, World! 12345"

substrings = re.split(r"\d+", string)

print("Substrings:", substrings)

Output:

Substrings: ['Hello, World! ', '']
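The empty string at the end of that result appears because the digits sit at the very end of the input, so nothing follows the last split point. The sketch below, using invented sample data, shows one way to drop such empty pieces, along with splitting on several delimiters and the maxsplit argument.

```python
import re

# Filter out empty pieces left by a match at the start or end.
pieces = [p for p in re.split(r"\d+", "Hello, World! 12345") if p]
print(pieces)    # ['Hello, World! ']

text = "apple, banana;cherry  date"

# Split on a comma, semicolon, or whitespace character,
# plus any whitespace that follows it.
parts = re.split(r"[,;\s]\s*", text)
print(parts)     # ['apple', 'banana', 'cherry', 'date']

# maxsplit=1 stops after the first split.
head_tail = re.split(r"[,;\s]\s*", text, maxsplit=1)
print(head_tail) # ['apple', 'banana;cherry  date']
```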

Conclusion

In this article, we have explored the basics of regex in Python and how to use it for various tasks such as searching for patterns in strings, matching multiple patterns, search and replace, and splitting strings. Regex is a powerful tool for text processing and manipulation, and mastering it can greatly simplify many tasks. With the knowledge gained from this article, you can start using regex to solve your own text-processing problems in Python.
