Learn with Yasir

Learn Python, Microsoft 365 and Google Workspace

Home

Share Your Feedback

Regular Expressions (Regex) in Python

Regular expressions are powerful tools for searching, matching, and manipulating strings based on patterns.

1. Introduction to Regular Expressions

  • The re module in Python provides functions for working with regular expressions.
  • You can use regular expressions to:
    • Search for specific patterns within a string.
    • Replace parts of a string that match a pattern.
    • Split strings based on patterns.

2. Basic Functions in the re module

  • re.match(): Checks if the regular expression matches the beginning of the string.
    import re
    result = re.match(r'Hello', 'Hello, world!')
    print(result)  # Output: <re.Match object; span=(0, 5), match='Hello'>
    
  • re.search(): Searches for the first location where the regular expression matches.
    result = re.search(r'world', 'Hello, world!')
    print(result)  # Output: <re.Match object; span=(7, 12), match='world'>
    
  • re.findall(): Returns a list of all non-overlapping matches of the regular expression in the string.
    result = re.findall(r'\d+', 'There are 12 apples and 34 oranges')
    print(result)  # Output: ['12', '34']
    
  • re.sub(): Replaces occurrences of the pattern with a specified string.
    result = re.sub(r'apples', 'bananas', 'There are 12 apples and 34 oranges')
    print(result)  # Output: There are 12 bananas and 34 oranges
    

3. Special Characters in Regular Expressions

  • .: Matches any character except a newline.
  • ^: Matches the start of the string.
  • $: Matches the end of the string.
  • []: Matches any single character inside the brackets.
  • |: Acts as a logical OR operator.
  • \d: Matches any digit (equivalent to [0-9]).
  • \w: Matches any alphanumeric character (equivalent to [a-zA-Z0-9_]).
  • +: Matches 1 or more occurrences of the preceding character or group.
  • *: Matches 0 or more occurrences of the preceding character or group.

4. Examples

  • Extracting digits from a string:
    text = "The price is 50 dollars"
    numbers = re.findall(r'\d+', text)
    print(numbers)  # Output: ['50']
    
  • Checking if a string starts with a certain word:
    text = "Hello, world!"
    if re.match(r'^Hello', text):
        print("String starts with 'Hello'")
    
  • Validating an email address:
    email = "test@example.com"
    if re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', email):
        print("Valid email")
    else:
        print("Invalid email")
    

5. Exercise:

  • Exercise 1: Write a function that extracts all phone numbers from a given string. Assume phone numbers follow the format XXX-XXX-XXXX.
  • Exercise 2: Create a program that validates a given username. The username must start with a letter and contain only letters, digits, and underscores.

Would you like to work on these exercises, or would you like a deeper dive into any specific part of regular expressions?