
Pandas Df Query

`query()` is a practical Pandas method for filtering data with SQL-like string expressions. Compared with traditional boolean indexing, its syntax is more concise and readable, which makes it particularly suitable for complex filtering conditions.

Data analysis often requires filtering rows by various conditions. `query()` takes the condition as a string, much like the WHERE clause of a SQL query, which is convenient for users familiar with SQL. The expression can also use Python variables and functions, which makes dynamic filtering possible.

* * *

## Basic Syntax and Parameters

`query()` is a DataFrame method, called with the dot operator `.`. It accepts a string containing the filtering expression.

### Syntax Format

```python
DataFrame.query(expr, *, inplace=False, **kwargs)
```

### Parameter Description

| Parameter | Type | Required | Description | Default Value |
| --- | --- | --- | --- | --- |
| expr | str | Required | Filtering expression, similar to SQL's WHERE clause. | - |
| inplace | bool | Optional | Whether to modify the original DataFrame directly. Must be passed as a keyword argument. | False |

### Return Value Description

* **Return Type**: A new DataFrame containing the rows that satisfy the filtering condition, or `None` when `inplace=True`.
* **Does not modify original data**: By default, the original DataFrame remains unchanged.

* * *

## Examples

Let's work through the usage of `query()` with examples.

### Example 1: Basic Usage - Single Condition Filtering

The simplest usage of `query()` is to filter with a single condition.
```python
import pandas as pd

# Create sample DataFrame
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
    'age': [18, 19, 17, 18, 20, 19, 18],
    'score': [85, 92, 78, 90, 88, 95, 82],
    'grade': ['A', 'A', 'B', 'A', 'B', 'A', 'B']
}
df = pd.DataFrame(data)

print("Original DataFrame:")
print(df)
print()

# Filter students with score greater than 85
print("Students with score greater than 85:")
print(df.query('score > 85'))
print()

# Filter students with age equal to 18
print("Students with age equal to 18:")
print(df.query('age == 18'))
print()

# Filter students with grade A
print("Students with grade A:")
print(df.query('grade == "A"'))
```

**Running Results:**

```
Original DataFrame:
      name  age  score grade
0    Alice   18     85     A
1      Bob   19     92     A
2  Charlie   17     78     B
3    David   18     90     A
4      Eve   20     88     B
5    Frank   19     95     A
6    Grace   18     82     B

Students with score greater than 85:
    name  age  score grade
1    Bob   19     92     A
3  David   18     90     A
4    Eve   20     88     B
5  Frank   19     95     A

Students with age equal to 18:
    name  age  score grade
0  Alice   18     85     A
3  David   18     90     A
6  Grace   18     82     B

Students with grade A:
    name  age  score grade
0  Alice   18     85     A
1    Bob   19     92     A
3  David   18     90     A
5  Frank   19     95     A
```

**Code Analysis:**

1. Filtering conditions are written as a quoted string, using SQL-like syntax.
2. Numeric comparisons use operators such as `>` and `==`.

Tip: `query()` is particularly suitable for scenarios that require dynamically built filtering conditions, such as filtering data based on user input or configuration files. Combined with Python's string formatting, it allows very flexible data queries.

* * *

## Summary

`query()` is a powerful data filtering method in Pandas that provides SQL-like query syntax, making complex conditional filtering more concise and readable. Its main advantages are concise and intuitive syntax, support for compound logical operations, support for Python variables and functions, and the ability to handle special column names.
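As a minimal sketch of the advantages listed above, the snippet below reuses the sample student data and shows a compound condition, a Python variable referenced with `@`, and a column name containing a space wrapped in backticks. The threshold `min_score` and the renamed column `exam score` are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Same sample data as in Example 1
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David", "Eve", "Frank", "Grace"],
    "age": [18, 19, 17, 18, 20, 19, 18],
    "score": [85, 92, 78, 90, 88, 95, 82],
    "grade": ["A", "A", "B", "A", "B", "A", "B"],
})

# Compound logical operation: combine comparisons with `and` / `or`
top_adults = df.query("score > 85 and age >= 19")

# Python variable: prefix a local name with `@` inside the expression
min_score = 90  # hypothetical threshold
high_scorers = df.query("score >= @min_score")

# Special column name: wrap a name containing a space in backticks
renamed = df.rename(columns={"score": "exam score"})  # hypothetical column name
grade_a_high = renamed.query('`exam score` > 85 and grade == "A"')

print(top_adults)
print(high_scorers)
print(grade_a_high)
```

Each call returns a new DataFrame and leaves `df` unchanged, consistent with the default `inplace=False`.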
In practice, especially when filtering conditions need to be built dynamically, `query()` is a very practical tool.
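One way dynamic condition building might look is sketched below. It assumes the filters arrive as a plain dictionary of equality conditions (for example from user input or a configuration file). The `filters` dictionary and the helper `build_query` are hypothetical names, not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David", "Eve", "Frank", "Grace"],
    "age": [18, 19, 17, 18, 20, 19, 18],
    "score": [85, 92, 78, 90, 88, 95, 82],
    "grade": ["A", "A", "B", "A", "B", "A", "B"],
})


def build_query(frame, filters):
    """Hypothetical helper: turn {column: value} pairs into a query string."""
    parts = []
    for col, value in filters.items():
        # Reject unknown columns so arbitrary text never reaches the expression
        if col not in frame.columns:
            raise KeyError(f"unknown column: {col}")
        # repr() quotes strings ('A') and leaves numbers as-is (18)
        parts.append(f"{col} == {value!r}")
    return " and ".join(parts)


filters = {"grade": "A", "age": 18}  # hypothetical user-supplied conditions
expr = build_query(df, filters)
result = df.query(expr)

print(expr)
print(result)
```

Because the expression is assembled from text, values from untrusted sources are best validated before they are formatted into it. Referencing them through `@variable` instead is a reasonable alternative when the set of conditions is fixed.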