
Pandas String

Pandas provides powerful string processing capabilities. Through the `.str` accessor, you can apply string operations to every element of a Series, much as you would to a single Python string.

* * *

## str Accessor Overview

When a Series holds strings (dtype `object` or `string`), you can use the `.str` accessor for string operations.

## Example

```python
import pandas as pd

# Create a Series containing strings
s = pd.Series(["hello", "world", "pandas", "Python"])
print("Original Series:")
print(s)
print()

# Convert to lowercase
print("Convert to lowercase:")
print(s.str.lower())
print()

# Convert to uppercase
print("Convert to uppercase:")
print(s.str.upper())
print()

# Capitalize the first letter and lowercase the rest
print("Capitalize:")
print(s.str.capitalize())
```

* * *

## Common String Functions

### Case Conversion

| Method | Description | Example |
| --- | --- | --- |
| `str.lower()` | Convert to lowercase | "Hello" → "hello" |
| `str.upper()` | Convert to uppercase | "Hello" → "HELLO" |
| `str.title()` | Capitalize first letter of each word | "hello world" → "Hello World" |
| `str.capitalize()` | Capitalize first letter, lowercase the rest | "hELLO" → "Hello" |
| `str.swapcase()` | Swap case | "HeLLo" → "hEllO" |

### String Search and Replace

## Example

```python
import pandas as pd

s = pd.Series(["hello world", "python pandas", "data science", "machine learning"])
print("Original data:")
print(s)
print()

# Check whether each string contains a substring (case-insensitive)
print("Contains 'python':")
print(s.str.contains("python", case=False))
print()

# Check how each string starts
print("Starts with 'hello':")
print(s.str.startswith("hello"))
print()

# Replace
print("Replace spaces with underscores:")
print(s.str.replace(" ", "_"))
print()

# Split
print("Split by space:")
print(s.str.split(" "))
```

### Trimming and Padding

## Example

```python
import pandas as pd

s = pd.Series([" hello ", "world ", " pandas", " python "])
print("Original data:")
print(s)
print()

# Remove leading and trailing spaces
print("Remove leading and trailing spaces:")
print(s.str.strip())
print()

# Remove leading spaces
print("Remove leading spaces:")
print(s.str.lstrip())
print()

# Remove trailing spaces
print("Remove trailing spaces:")
print(s.str.rstrip())
print()

# Pad on the left with "0" up to a total width of 10
print("Left-pad with 0 to length 10:")
print(s.str.pad(10, side="left", fillchar="0"))
```

* * *

### Slicing and Joining

## Example

```python
import pandas as pd

s = pd.Series(["hello", "world", "python", "pandas"])
print("Original data:")
print(s)
print()

# Slice the first 3 characters
print("First 3 characters:")
print(s.str[:3])
print()

# Take 3 characters starting from index 1
print("3 characters starting from index 1:")
print(s.str[1:4])
print()

# String concatenation
s1 = pd.Series(["Hello", "World"])
s2 = pd.Series(["Python", "Pandas"])
print("Concatenate with +:")
print(s1 + " " + s2)
print()

# Concatenate with str.cat, using a specified separator
print("Concatenate with str.cat:")
print(s1.str.cat(s2, sep="-"))
```

* * *

## Regular Expression Support

Many Pandas string methods accept regular expressions, which makes them well suited to matching, extracting, and replacing complex string patterns.

## Example

```python
import pandas as pd

s = pd.Series([
    "user@example.com",
    "test@domain.org",
    "invalid-email",
    "admin@site.net"
])
print("Original data:")
print(s)
print()

# Extract parts of the email address
print("Extract username:")
print(s.str.extract(r"(\w+)@"))
print()

print("Extract domain:")
print(s.str.extract(r"@(\w+\.\w+)"))
print()

# Replace
print("Remove email addresses:")
print(s.str.replace(r"\w+@\w+\.\w+", "", regex=True))
print()

# Pattern matching
print("Contains .com or .org:")
print(s.str.cont
```
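As a supplement to the regular-expression section above, here is a minimal sketch of two related calls: `str.extract` with named groups, and `str.fullmatch` for whole-string validation. The sample data mirrors the email example, and the variable names `emails`, `parts`, and `is_valid` are illustrative assumptions, not part of the original tutorial. The pattern is deliberately simple and is not a complete email validator.

```python
import pandas as pd

# Illustrative sample data (assumed), mirroring the email example above
emails = pd.Series(
    ["user@example.com", "test@domain.org", "invalid-email", "admin@site.net"]
)

# Named groups become column names; rows that do not match are filled with NaN
parts = emails.str.extract(r"(?P<user>\w+)@(?P<domain>\w+\.\w+)")
print(parts)

# fullmatch is True only when the entire string matches the pattern
is_valid = emails.str.fullmatch(r"\w+@\w+\.\w+")
print(is_valid)
```

Named groups keep the extracted columns self-describing, and `fullmatch` avoids accepting strings that merely contain a matching fragment.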