
Python Re Sub

## Python `re.sub()` Method

The `re.sub()` function in Python's `re` (regular expression) module is used to **replace occurrences** of a pattern within a string. It searches the target string for all substrings that match the specified regular expression pattern and replaces them with a replacement string or the output of a replacement function.

> **Etymology**: The `sub` in `re.sub` stands for **substitute**.

---

## Syntax and Parameters

### Syntax

```python
re.sub(pattern, repl, string, count=0, flags=0)
```

### Parameter Descriptions

* **`pattern`**: The regular expression pattern to search for.
* **`repl`**: The replacement. This can be either a string or a callable function.
* **`string`**: The original input string to search and modify.
* **`count`**: The maximum number of pattern occurrences to be replaced. The default value is `0`, which means all occurrences will be replaced.
* **`flags`**: Optional regex flags (such as `re.IGNORECASE`, `re.MULTILINE`, etc.) to modify matching behavior.

### Return Value

* Returns a new string with the matched patterns replaced. If the pattern is not found, the original string is returned unchanged.

---

## Code Examples

### Example 1: Basic Replacement

This example demonstrates how to replace a plain text substring with another string.

```python
import re

text = "Python is good, Python is great"

# Replace 'Python' with 'Java'
result = re.sub(r'Python', 'Java', text)
print(result)
```

**Expected Output:**

```text
Java is good, Java is great
```

---

### Example 2: Limiting Replacement Count

By using the `count` parameter, you can control how many matches are replaced.
```python
import re

text = "Python is good, Python is great"

# Replace only the first occurrence of 'Python'
result = re.sub(r'Python', 'Java', text, count=1)
print(result)
```

**Expected Output:**

```text
Java is good, Python is great
```

---

### Example 3: Replacing with Regular Expressions

You can use regular expression patterns to match dynamic content, such as masking sensitive digits.

```python
import re

text = "My phone number is 138-1234-5678"

# Replace the middle 4 digits of the phone number with ****.
# '138' is only three digits long, so the first \d{4} match is '1234'.
result = re.sub(r'\d{4}', '****', text, count=1)
print(result)
```

**Expected Output:**

```text
My phone number is 138-****-5678
```

---

### Example 4: Using Backreferences (Group Referencing)

You can capture groups in your pattern and reference them in the replacement string using `\g<group>` (where `group` is a group number or name) or the shorter `\number` form (e.g., `\1`, `\2`).

```python
import re

text = "2024-04-02"

# Swap the order of Year, Month, and Day (YYYY-MM-DD to DD/MM/YYYY)
result = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\3/\2/\1', text)
print(result)
```

**Expected Output:**

```text
02/04/2024
```

---

### Example 5: Using a Function as the Replacement

If `repl` is a function, it is called for every non-overlapping occurrence of `pattern`. The function takes a single match object argument and must return a replacement string.

```python
import re

text = "3 apples, 5 bananas, 8 oranges"

# Increment each number found in the string by 1
def add_one(match):
    # match.group() retrieves the matched string
    return str(int(match.group()) + 1)

result = re.sub(r'\d+', add_one, text)
print(result)
```

**Expected Output:**

```text
4 apples, 6 bananas, 9 oranges
```

---

## Considerations and Best Practices

1. **Raw Strings (`r'...'`)**: Always use raw strings for regular expression patterns (e.g., `r'\d+'`) to prevent Python's string parser from misinterpreting backslashes as escape characters.
2. **Immutability**: Strings in Python are immutable.
   `re.sub()` does not modify the original string in place; instead, it returns a new string.
3. **Performance**: If you need to perform the same substitution repeatedly in a loop, consider compiling the regular expression first using `re.compile()` and then calling the `sub()` method on the compiled pattern object:

   ```python
   import re

   text = "3 apples, 5 bananas, 8 oranges"
   pattern = re.compile(r'\d+')
   result = pattern.sub('NUMBER', text)
   ```

4. **`re.subn()` Alternative**: If you need to know how many substitutions were actually made, use `re.subn()`. It returns a tuple containing `(new_string, number_of_substitutions_made)`.
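
As a minimal sketch of the `re.subn()` behavior described in the last item, the snippet below reuses the fruit string from Example 5. The variable names `new_text` and `n_subs` are illustrative choices, not part of the `re` API.

```python
import re

text = "3 apples, 5 bananas, 8 oranges"

# re.subn() accepts the same arguments as re.sub(), but returns a
# (new_string, number_of_substitutions_made) tuple instead of a string.
new_text, n_subs = re.subn(r'\d+', 'NUMBER', text)

print(new_text)  # NUMBER apples, NUMBER bananas, NUMBER oranges
print(n_subs)    # 3
```

Checking `n_subs == 0` is a simple way to detect that nothing matched without comparing the old and new strings.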
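
Two features mentioned earlier, the `flags` parameter and `\g<name>` group references, can be sketched in the same way. The input strings and the group names `year`, `month`, and `day` are assumptions chosen for illustration.

```python
import re

text = "python is good, PYTHON is great"

# Pass flags by keyword so they cannot be mistaken for `count`.
# re.IGNORECASE makes the pattern match regardless of letter case.
case_result = re.sub(r'python', 'Java', text, flags=re.IGNORECASE)
print(case_result)  # Java is good, Java is great

# Named groups are defined with (?P<name>...) in the pattern and
# referenced with \g<name> in the replacement string.
date_text = "2024-04-02"
date_result = re.sub(
    r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})',
    r'\g<day>/\g<month>/\g<year>',
    date_text,
)
print(date_result)  # 02/04/2024
```

Named references tend to stay readable when a pattern has many groups, and `\g<1>` avoids the ambiguity of writing `\1` directly before a literal digit.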