# Pandas Merge
pandas can combine two or more DataFrames on shared keys, much like a SQL JOIN. `merge` and `join` are the two most commonly used methods.

* * *

## merge Basic Usage

The `pd.merge()` function merges two DataFrames on one or more columns, similar to SQL's JOIN operation.

### Simple Merge

#### Example

    import pandas as pd

    # Create two DataFrames that share the "Student ID" column
    df1 = pd.DataFrame({
        "Student ID": ["S001", "S002", "S003"],
        "Name": ["Zhang San", "Li Si", "Wang Wu"]
    })

    df2 = pd.DataFrame({
        "Student ID": ["S001", "S002", "S003"],
        "Math": [85, 92, 78]
    })

    print("DataFrame 1:")
    print(df1)
    print()

    print("DataFrame 2:")
    print(df2)
    print()

    # Merge on the shared key column
    result = pd.merge(df1, df2, on="Student ID")

    print("Merge result:")
    print(result)

### Merging with Different Column Names

#### Example

    import pandas as pd

    df1 = pd.DataFrame({
        "Student ID": ["S001", "S002", "S003"],
        "Name": ["Zhang San", "Li Si", "Wang Wu"]
    })

    df2 = pd.DataFrame({
        "student_id": ["S001", "S002", "S003"],
        "Math": [85, 92, 78]
    })

    # Use left_on and right_on when the key columns have different names
    result = pd.merge(df1, df2, left_on="Student ID", right_on="student_id")

    print("Merge with different column names:")
    print(result)
    print()

    # Both key columns are kept, so drop the redundant one
    result = result.drop("student_id", axis=1)

    print("After removing the extra column:")
    print(result)

* * *

## Merge Types

### inner / left / right / outer

#### Example

    import pandas as pd

    df1 = pd.DataFrame({
        "Student ID": ["S001", "S002", "S003", "S004"],
        "Name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"]
    })

    df2 = pd.DataFrame({
        "Student ID": ["S001", "S002", "S003", "S005"],
        "Math": [85, 92, 78, 88]
    })

    print("DataFrame 1:")
    print(df1)

    print("\nDataFrame 2:")
    print(df2)
    print()

    # inner join (default): keep only rows whose key appears in both tables
    print("inner join:")
    print(pd.merge(df1, df2, on="Student ID", how="inner"))
    print()

    # left join: keep all rows from the left table
    print("left join (all rows from the left table):")
    print(pd.merge(df1, df2, on="Student ID", how="left"))
    print()

    # right join: keep all rows from the right table
    print("right join (all rows from the right table):")
    print(pd.merge(df1, df2, on="Student ID", how="right"))
    print()

    # outer join: keep all rows from both tables
    print("outer join:")
    print(pd.merge(df1, df2, on="Student ID", how="outer"))

> It is important to understand the differences between the four join types: inner keeps only the keys present in both tables, left keeps every row from the left table, right keeps every row from the right table, and outer keeps every row from both tables. Rows with no match get `NaN` in the columns that come from the other table.

* * *

## Multi-key Merge

#### Example

    import pandas as pd

    df1 = pd.DataFrame({
        "City": ["Beijing", "Beijing", "Shanghai", "Shanghai"],
        "Year": [2023, 2024, 2023, 2024],
        "Sales Revenue": [100, 120, 90, 110]
    })

    df2 = pd.DataFrame({
        "City": ["Beijing", "Beijing", "Shanghai", "Shanghai"],
        "Year": [2023, 2024, 2023, 2024],
        "Profit": [20, 25, 18, 22]
    })

    # Multi-key merge: rows match only when both City and Year agree
    result = pd.merge(df1, df2, on=["City", "Year"])

    print("Multi-key merge:")
    print(result)

* * *

## join Method

`DataFrame.join()` is another way to merge. By default it performs a left join and matches rows on the index.

### Merge by Index

#### Example

    import pandas as pd

    # Create the data and set the student ID as the index
    df1 = pd.DataFrame({
        "Name": ["Zhang San", "Li Si", "Wang Wu"],
        "Age": [25, 30, 28]
    }, index=["S001", "S002", "S003"])

    df2 = pd.DataFrame({
        "Math": [85, 92, 78],
        "English": [90, 88, 95]
    }, index=["S001", "S002", "S003"])

    print("DataFrame 1:")
    print(df1)
    print()

    print("DataFrame 2:")
    print(df2)
    print()

    # Use join to merge
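The index-based example breaks off in the source at the point where `join` is called. The following is a minimal sketch of how such a call could look, not the original author's code: the variable names `joined` and `merged` are assumptions introduced here for illustration, and the `pd.merge` call with `left_index=True` and `right_index=True` is included only to show the same index-based left join expressed through `merge`.

```python
import pandas as pd

# Same data as the index-based example: the student ID is the index
df1 = pd.DataFrame(
    {"Name": ["Zhang San", "Li Si", "Wang Wu"], "Age": [25, 30, 28]},
    index=["S001", "S002", "S003"],
)
df2 = pd.DataFrame(
    {"Math": [85, 92, 78], "English": [90, 88, 95]},
    index=["S001", "S002", "S003"],
)

# join matches rows on the index and defaults to how="left"
joined = df1.join(df2)  # `joined` is a hypothetical name

# The same index-based left join written with merge (illustrative only)
merged = pd.merge(df1, df2, left_index=True, right_index=True, how="left")

print("Join result:")
print(joined)
```

`join` is the shorter spelling when both frames are already indexed by the key; `merge` is usually the better fit when the keys live in ordinary columns.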