

## R median() Function: Calculating the Median

In statistics, the **median** is the middle value in a sorted list of numbers. It divides a data sample, population, or probability distribution into two equal halves: 50% of the values are less than or equal to the median, and 50% of the values are greater than or equal to the median. Unlike the mean (average), the median is robust to outliers and skewed data.

In R, the built-in `median()` function calculates the median of a numeric vector.

---

## Syntax and Parameters

The basic syntax of the `median()` function in R is as follows:

```R
median(x, na.rm = FALSE)
```

### Parameter Description:

* **`x`**: A numeric vector containing the data points for which you want to calculate the median.
* **`na.rm`**: A logical value (`TRUE` or `FALSE`). It specifies whether missing values (`NA`) should be stripped before the computation proceeds. The default is `FALSE`.

---

## Code Examples

### Example 1: Basic Median Calculation (Odd Number of Elements)

When a vector has an odd number of elements, the median is the exact middle value after sorting.

```R
# Create a numeric vector
x <- c(1, 2, 3, 4, 5, 6, 7)

# Calculate the median
result.median <- median(x)
print(result.median)
```

**Output:**

```text
[1] 4
```

### Example 2: Median Calculation with an Even Number of Elements

When a vector has an even number of elements, the median is the mean (average) of the two middle values.

```R
# Create a numeric vector with an even number of elements
x <- c(1, 2, 3, 4, 5, 6)

# Calculate the median (the average of 3 and 4)
result.median <- median(x)
print(result.median)
```

**Output:**

```text
[1] 3.5
```

---

## Handling Missing Values (`NA`)

If the input vector contains missing values (`NA`), the `median()` function returns `NA` by default. To calculate the median of the available numbers, set the `na.rm` parameter to `TRUE`.
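
As a minimal sketch of what `na.rm = TRUE` does, the snippet below drops the missing values by hand with `is.na()` and checks that both approaches agree. The vector `scores` and the variable names are hypothetical, chosen only for illustration.

```R
# Illustrative data (hypothetical values) with two missing entries
scores <- c(12, NA, 7, 9, NA, 15)

# Count the missing values before deciding how to handle them
n_missing <- sum(is.na(scores))

# Let median() drop the NA values
med_builtin <- median(scores, na.rm = TRUE)

# Drop the NA values by hand, then take the median of what remains
med_manual <- median(scores[!is.na(scores)])

print(n_missing)                           # 2 missing values
print(med_builtin)                         # 10.5, the average of 9 and 12
print(identical(med_builtin, med_manual))  # TRUE
```

Counting the `NA` values first is a useful habit: if a large share of the data is missing, the median of the remaining values may not represent the full sample well.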

### Example 3: Handling `NA` Values

```R
# Create a vector containing an NA value
x <- c(1, 2, 3, 4.5, 6, NA)

# Calculate the median without removing NA (returns NA)
result.median <- median(x)
print(result.median)

# Calculate the median with na.rm set to TRUE
result.median <- median(x, na.rm = TRUE)
print(result.median)
```

**Output:**

```text
[1] NA
[1] 3
```

---

## Key Considerations

1. **Data Types**: The `median()` function is intended for numeric (integer or double) vectors. Passing a factor results in an error, and character data should be converted to numeric values first.
2. **Robustness**: The median is a measure of central tendency that is largely unaffected by extremely large or small values (outliers), making it well suited to skewed distributions such as household income or real estate prices.
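
The two considerations above can be sketched in a few lines. The `incomes` vector is made-up data, and the use of `tryCatch()` to capture the factor error is just one way to show the behavior without stopping the script.

```R
# Hypothetical incomes (in thousands); the last value is an extreme outlier
incomes <- c(32, 35, 38, 41, 45, 1000)

mean_income <- mean(incomes)      # pulled upward by the outlier
median_income <- median(incomes)  # average of the two middle values, 38 and 41

print(mean_income)    # 198.5
print(median_income)  # 39.5

# median() rejects factors; tryCatch() returns the error message as a string
factor_result <- tryCatch(
  median(factor(c("low", "mid", "high"))),
  error = function(e) conditionMessage(e)
)
print(factor_result)
```

A single outlier moves the mean far from the typical values while the median stays near the middle of the data, which is why the median is often preferred when summarizing skewed data.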