## Python statistics.quantiles() Method

`statistics.quantiles()` is a function in the `statistics` module of Python's standard library. It divides a dataset into $n$ continuous intervals with equal probability and returns a list of the cut points (quantiles) that separate them.

For example, it can be used to find the quartiles (which divide the data into 4 equal groups), deciles (10 equal groups), or percentiles (100 equal groups).

---

### Syntax

```python
statistics.quantiles(data, *, n=4, method='exclusive')
```

#### Parameters:

* **`data`**: An iterable of numeric data (such as a list, tuple, or array). In Python 3.8 through 3.12 it must contain at least two data points; Python 3.13 and later also accept a single data point.
* **`n`** *(optional)*: An integer specifying the number of equal-probability intervals to create. It must be at least `1`. The default is `4`, which calculates the quartiles (returning 3 cut points: $Q_1$, $Q_2$, and $Q_3$).
* **`method`** *(optional)*: A string specifying the estimation method used to calculate the quantiles. It accepts two values:
    * **`'exclusive'`** (Default): Used for samples taken from a population where the extreme values of the population are expected to be more extreme than the minimum and maximum values in the sample.
    * **`'inclusive'`**: Used for describing sample data or for populations where the minimum and maximum values are known to be the absolute boundaries of the population.

#### Return Value:

* Returns a list of cut points (quantiles), which are floats for `int` or `float` input. The length of the returned list is always $n - 1$.

---

### Code Examples

#### Example 1: Calculating Quartiles (n=4) using the 'inclusive' method

The following example calculates the quartiles of a simple dataset from 1 to 10.
```python
import statistics

# Define the dataset
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Calculate quartiles (n=4) using the 'inclusive' method
quartiles = statistics.quantiles(data, n=4, method='inclusive')

print("Quartiles (Inclusive):", quartiles)
```

**Output:**

```text
Quartiles (Inclusive): [3.25, 5.5, 7.75]
```

* **Explanation**: The three cut points divide the dataset into 4 groups of equal probability.
    * **$Q_1$ (25th percentile)**: `3.25`
    * **$Q_2$ (50th percentile / Median)**: `5.5`
    * **$Q_3$ (75th percentile)**: `7.75`

---

#### Example 2: Comparing 'exclusive' vs 'inclusive' methods

This example shows how the choice of the `method` parameter affects the calculated cut points.

```python
import statistics

data = [10, 20, 30, 40, 50]

# Exclusive method (Default)
exclusive_q = statistics.quantiles(data, n=4, method='exclusive')

# Inclusive method
inclusive_q = statistics.quantiles(data, n=4, method='inclusive')

print("Exclusive Quantiles:", exclusive_q)
print("Inclusive Quantiles:", inclusive_q)
```

**Output:**

```text
Exclusive Quantiles: [15.0, 30.0, 45.0]
Inclusive Quantiles: [20.0, 30.0, 40.0]
```

The median is the same in both cases, but the exclusive method places $Q_1$ and $Q_3$ closer to the extremes of the sample than the inclusive method does.

---

#### Example 3: Calculating Deciles (n=10)

You can calculate deciles (dividing the data into 10 equal parts) by setting `n=10`.

```python
import statistics

# A dataset of 20 numbers
data = list(range(1, 21))

# Calculate deciles (default 'exclusive' method)
deciles = statistics.quantiles(data, n=10)

print("Deciles (9 cut points):")
print([round(q, 2) for q in deciles])
```

**Output:**

```text
Deciles (9 cut points):
[2.1, 4.2, 6.3, 8.4, 10.5, 12.6, 14.7, 16.8, 18.9]
```

---

### Key Considerations

1. **Python Version**: The `statistics.quantiles()` function was introduced in **Python 3.8**. Ensure your environment is running Python 3.8 or higher to use it.
2. **Data Sorting**: Unlike some other statistical libraries, you do not need to manually sort the `data` before passing it to `statistics.quantiles()`. The function sorts a copy internally without modifying your original list.
3. **Minimum Data Points**: In Python 3.8 through 3.12, the input `data` must contain at least two data points, otherwise a `StatisticsError` is raised. Python 3.13 and later also accept a single data point; empty input still raises `StatisticsError`.
4. **Handling of NaN**: `NaN` (Not a Number) values have unusual comparison semantics, so the internal sort can produce surprising or undefined results instead of raising an error. Remove `NaN` and other missing values from your dataset before calling the function.
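
The sorting and NaN considerations above can be checked with a short sketch. The helper name `clean_quantiles` is hypothetical (it is not part of the `statistics` module), and the sample readings are made up for illustration. It assumes missing values are encoded as float `NaN` and drops them with `math.isnan` before calling `statistics.quantiles()`.

```python
import math
import statistics


def clean_quantiles(values, n=4, method="exclusive"):
    """Hypothetical helper: drop float NaN values, then compute the cut points."""
    cleaned = [v for v in values if not (isinstance(v, float) and math.isnan(v))]
    return statistics.quantiles(cleaned, n=n, method=method)


# Unsorted illustrative readings with one missing value encoded as NaN
readings = [50, 10, float("nan"), 40, 20, 30]

quartiles = clean_quantiles(readings)
print("Quartiles:", quartiles)  # Quartiles: [15.0, 30.0, 45.0]

# The input list keeps its original order and length
print("First reading:", readings[0], "| length:", len(readings))
```

After the NaN is removed, the remaining values are the same five points used in Example 2, so the default `'exclusive'` method yields the same cut points even though the input was unsorted.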