## Python statistics.quantiles() Method
The `statistics.quantiles()` function is part of the `statistics` module in Python's standard library. It divides a dataset into $n$ continuous intervals of equal probability and returns the list of $n - 1$ cut points (quantiles) that separate those intervals.
For example, it can be used to find the quartiles (which divide the data into 4 equal groups), deciles (10 equal groups), or percentiles (100 equal groups).
---
### Syntax
```python
statistics.quantiles(data, *, n=4, method='exclusive')
```
#### Parameters:
* **`data`**: An iterable of numeric data (such as a list or tuple). Before Python 3.13 the dataset must contain at least two data points; from Python 3.13 onward a single data point is also accepted.
* **`n`** *(optional)*: An integer specifying the number of equal-probability intervals to create. The default is `4`, which calculates the quartiles (returning 3 cut points: $Q_1$, $Q_2$, and $Q_3$).
* **`method`** *(optional)*: A string specifying the estimation method used to calculate the quantiles. It accepts two values:
    * **`'exclusive'`** (Default): Used for samples taken from a population where the extreme values of the population are expected to be more extreme than the minimum and maximum values in the sample.
    * **`'inclusive'`**: Used for describing sample data or for populations where the minimum and maximum values are known to be the absolute boundaries of the population.
#### Return Value:
* Returns a list of float values representing the cut points (quantiles). The length of the returned list is always $n - 1$.
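
The relationship between `n` and the length of the result can be checked directly. The snippet below is a minimal sketch with an illustrative dataset; it also shows that `n=2` yields a single cut point, which coincides with the median.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# The result always holds one fewer cut point than the number of intervals
for n in (2, 4, 10):
    cuts = statistics.quantiles(data, n=n)
    print(n, "intervals ->", len(cuts), "cut points")

# With n=2 the single cut point is the median
median_cut = statistics.quantiles(data, n=2)
print(median_cut)  # [5.5]
```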
---
### Code Examples
#### Example 1: Calculating Quartiles (n=4) using the 'inclusive' method
The following example demonstrates how to calculate the quartiles of a simple dataset from 1 to 10.
```python
import statistics
# Define the dataset
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Calculate quartiles (n=4) using the 'inclusive' method
quartiles = statistics.quantiles(data, n=4, method='inclusive')
print("Quartiles (Inclusive):", quartiles)
```
**Output:**
```text
Quartiles (Inclusive): [3.25, 5.5, 7.75]
```
* **Explanation**: The dataset is divided into 4 equal parts.
    * **$Q_1$ (25th percentile)**: `3.25`
    * **$Q_2$ (50th percentile / Median)**: `5.5`
    * **$Q_3$ (75th percentile)**: `7.75`
---
#### Example 2: Comparing 'exclusive' vs 'inclusive' methods
This example shows how the choice of the `method` parameter affects the calculated cut points.
```python
import statistics
data = [10, 20, 30, 40, 50]
# Exclusive method (Default)
exclusive_q = statistics.quantiles(data, n=4, method='exclusive')
# Inclusive method
inclusive_q = statistics.quantiles(data, n=4, method='inclusive')
print("Exclusive Quantiles:", exclusive_q)
print("Inclusive Quantiles:", inclusive_q)
```
**Output:**
```text
Exclusive Quantiles: [15.0, 30.0, 45.0]
Inclusive Quantiles: [20.0, 30.0, 40.0]
```
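
One way to see why the two methods disagree is to reproduce the interpolation by hand. The sketch below is an illustration only, assuming the linear-interpolation scheme that CPython's implementation appears to use: with $m$ sorted values, `'exclusive'` places cut point $i$ at rank $i(m+1)/n$ (counting from 1), while `'inclusive'` places it at offset $i(m-1)/n$ from the minimum. `sketch_quantiles` is a hypothetical helper written for this explanation; it is not part of the `statistics` module and skips input validation.

```python
import statistics

def sketch_quantiles(data, n=4, method="exclusive"):
    """Hypothetical helper: interpolate cut points between sorted values."""
    values = sorted(data)
    size = len(values)  # assumes at least two data points
    result = []
    for i in range(1, n):
        if method == "inclusive":
            # Minimum and maximum act as the 0th and 100th percentiles
            j, delta = divmod(i * (size - 1), n)
            lower, upper = values[j], values[j + 1]
        else:
            # Leaves room beyond the sample minimum and maximum
            j = i * (size + 1) // n
            j = min(max(j, 1), size - 1)  # keep both neighbours in range
            delta = i * (size + 1) - j * n
            lower, upper = values[j - 1], values[j]
        # Weighted average of the two neighbouring values
        result.append((lower * (n - delta) + upper * delta) / n)
    return result

data = [10, 20, 30, 40, 50]
print(sketch_quantiles(data, method="exclusive"))  # [15.0, 30.0, 45.0]
print(sketch_quantiles(data, method="inclusive"))  # [20.0, 30.0, 40.0]
```

For this dataset the inclusive cut points land exactly on data values, while the exclusive ones fall halfway between neighbours at the two ends, which is why the exclusive quartiles are spread further apart.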
---
#### Example 3: Calculating Deciles (n=10)
You can easily calculate deciles (dividing the data into 10 equal parts) by setting `n=10`.
```python
import statistics
# A dataset of 20 numbers
data = list(range(1, 21))
# Calculate deciles
deciles = statistics.quantiles(data, n=10)
print("Deciles (9 cut points):")
print([round(q, 2) for q in deciles])
```
**Output:**
```text
Deciles (9 cut points):
[2.1, 4.2, 6.3, 8.4, 10.5, 12.6, 14.7, 16.8, 18.9]
```
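
Percentiles follow the same pattern with `n=100`. As a minimal sketch, the snippet below uses a made-up list of response times (`latencies_ms` is a hypothetical dataset, not taken from any real system) and reads individual percentiles out of the returned list. The $k$-th percentile sits at index $k - 1$.

```python
import statistics

# Hypothetical response times in milliseconds (illustrative data only)
latencies_ms = [12, 15, 11, 30, 22, 18, 95, 14, 17, 20, 25, 13]

# n=100 returns 99 cut points: the 1st through 99th percentiles
percentiles = statistics.quantiles(latencies_ms, n=100)

p50 = percentiles[49]  # 50th percentile (the median)
p90 = percentiles[89]  # 90th percentile

print("Number of cut points:", len(percentiles))
print("Median:", p50)
print("90th percentile:", p90)
```

With a dataset this small, the extreme percentiles are interpolated from only a few values, so they are best treated as rough estimates.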
---
### Key Considerations
1. **Python Version**: The `statistics.quantiles()` function was introduced in **Python 3.8**. Ensure your environment is running Python 3.8 or higher to use this method.
2. **Data Sorting**: Unlike some other statistical libraries, you do not need to manually sort the `data` before passing it to `statistics.quantiles()`. The function handles the sorting internally without modifying your original list.
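3. **Minimum Data Points**: Before Python 3.13, the input `data` must contain at least two data points, otherwise a `StatisticsError` is raised. From Python 3.13 onward, a single data point is accepted and only empty input raises the error.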
4. **Handling of NaN**: `NaN` (Not a Number) values have unusual comparison semantics, so sorting a dataset that contains them produces undefined results. The function may silently return misleading cut points instead of raising an error. Remove `NaN` values from your dataset before calling `statistics.quantiles()`.
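
The considerations above can be checked with a short sketch. The datasets are illustrative, and filtering with `math.isnan()` is one possible cleaning strategy among several, assuming the values are floats.

```python
import math
import statistics

# Unsorted input is fine: the function works on a sorted copy
data = [7, 2, 9, 4, 1]
quartiles = statistics.quantiles(data)
print(quartiles)
print(data)  # still [7, 2, 9, 4, 1]; the original order is untouched

# Hypothetical readings with missing values encoded as NaN
readings = [3.2, float("nan"), 4.8, 5.1, float("nan"), 2.9]
clean = [x for x in readings if not math.isnan(x)]
print(statistics.quantiles(clean, method="inclusive"))

# Empty input raises StatisticsError
empty_error = None
try:
    statistics.quantiles([])
except statistics.StatisticsError as exc:
    empty_error = exc
print("Empty input raised:", repr(empty_error))
```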