
R Examples

## R Programming Examples: A Comprehensive Developer's Guide

R is a powerful, open-source programming language and environment designed specifically for statistical computing, data analysis, and high-quality data visualization.

This comprehensive guide provides a curated set of practical R examples categorized by core programming concepts, data structures, statistical analysis, and data visualization. Whether you are a beginner looking to understand basic syntax or an experienced developer needing a quick reference, these examples will help you write clean, efficient, and idiomatic R code.

---

## 1. Basic Syntax and Control Flow

These foundational examples cover basic operations, conditional statements, and loops in R.

### Hello World and Basic Arithmetic

```R
# Print a string to the console
print("Hello, World!")

# Basic arithmetic operations
a <- 15
b <- 4

sum_val <- a + b   # Addition
diff_val <- a - b  # Subtraction
prod_val <- a * b  # Multiplication
div_val <- a / b   # Division
mod_val <- a %% b  # Modulo (remainder)
exp_val <- a ^ b   # Exponentiation

cat("Sum:", sum_val, "\nModulo:", mod_val, "\nExponent:", exp_val)
```

### Conditional Statements (if-else)

```R
score <- 85

if (score >= 90) {
  print("Grade: A")
} else if (score >= 80) {
  print("Grade: B")
} else {
  print("Grade: C")
}

# Vectorized conditional using ifelse()
status <- ifelse(score >= 60, "Pass", "Fail")
print(status)
```

### Loops (for, while)

```R
# For loop iterating over a sequence
for (i in 1:5) {
  cat("Iteration:", i, "\n")
}

# While loop
counter <- 1
while (counter <= 3) {
  cat("Counter is at:", counter, "\n")
  counter <- counter + 1
}
```

---

## 2. Core Data Structures

R features several built-in data structures optimized for scientific computing.

### Vectors

Vectors are the most basic data structure in R and must contain elements of the same type.
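One consequence of the same-type rule is that `c()` silently coerces mixed inputs to a common type. The sketch below illustrates this behavior; the variable names `mixed_vector` and `numeric_from_logical` are hypothetical and used only for illustration.

```R
# Mixing a number, a string, and a logical coerces everything to character
mixed_vector <- c(1, "two", TRUE)
print(class(mixed_vector))   # "character"
print(mixed_vector)          # "1" "two" "TRUE"

# Logical values become numbers when combined with numerics
numeric_from_logical <- c(1.5, TRUE, FALSE)
print(numeric_from_logical)  # 1.5 1.0 0.0
```

The examples that follow keep each vector to a single type, which avoids this coercion.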
```R
# Create a numeric vector
numeric_vector <- c(1.2, 3.4, 5.6)

# Create a sequence of integers
sequence_vector <- 1:10

# Vector operations (vectorized by default)
doubled_vector <- numeric_vector * 2
print(doubled_vector)
```

### Lists

Lists can contain elements of different types, including other lists or vectors.

```R
# Create a list with mixed data types
employee_record <- list(
  id = 101,
  name = "Alice",
  skills = c("R", "Python", "SQL"),
  active = TRUE
)

# Accessing list elements
print(employee_record$name)
print(employee_record$skills[1])  # Access first skill
```

### Matrices

Matrices are two-dimensional arrays where all elements must be of the same data type.

```R
# Create a 3x3 matrix
matrix_data <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
print(matrix_data)

# Accessing elements: matrix[row, column]
element_2_3 <- matrix_data[2, 3]  # Row 2, Column 3
row_1 <- matrix_data[1, ]         # Entire first row
```

### Data Frames

Data frames are two-dimensional, tabular structures where columns can contain different data types. This is the standard structure for datasets.

```R
# Create a data frame
df <- data.frame(
  EmployeeID = c(1, 2, 3),
  Name = c("John", "Emma", "Luke"),
  Salary = c(55000, 62000, 48000),
  stringsAsFactors = FALSE
)

# View structure and summary
str(df)
summary(df)

# Subset data frame
high_earners <- df[df$Salary > 50000, ]
print(high_earners)
```

---

## 3. Functions and Functional Programming

Functions in R are first-class citizens. You can pass them as arguments, return them from other functions, and assign them to variables.

### Defining a Custom Function

```R
# Function to calculate the area of a circle
calculate_circle_area <- function(radius = 1) {
  if (radius < 0) {
    stop("Radius cannot be negative!")
  }
  area <- pi * radius^2
  return(area)
}

# Call the function
print(calculate_circle_area(5))
```

### The `apply` Family

Idiomatic R often replaces explicit loops for data transformations with the `apply` family of functions. The main benefit is more concise, readable code; these functions still iterate internally, so they are not necessarily faster than a well-written loop.
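Two other members of the family are worth knowing alongside `apply()` and `sapply()`: `lapply()`, which always returns a list, and `vapply()`, which checks every result against a declared template. The sketch below is a minimal illustration; the input `word_list` is a hypothetical example, not data from this guide.

```R
# Hypothetical input used only for illustration
word_list <- list("alpha", "be", "gamma")

# lapply() always returns a list, one element per input
lengths_list <- lapply(word_list, nchar)

# vapply() returns an atomic vector and verifies that each result
# matches the template given in FUN.VALUE (here: a single integer)
lengths_vec <- vapply(word_list, nchar, FUN.VALUE = integer(1))
print(lengths_vec)  # 5 2 5
```

Declaring the expected result type makes `vapply()` fail loudly if a function returns something unexpected, which can be preferable inside functions and packages.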
```R
# Create a sample matrix
mat <- matrix(1:12, nrow = 3, ncol = 4)

# Calculate sum of each row (MARGIN = 1)
row_sums <- apply(mat, 1, sum)

# Calculate mean of each column (MARGIN = 2)
col_means <- apply(mat, 2, mean)

# sapply() on a vector to return a simplified vector
squared_values <- sapply(1:5, function(x) x^2)
print(squared_values)
```

---

## 4. Data Manipulation (dplyr)

The `dplyr` package (part of the `tidyverse`) is one of the most widely used tools for data manipulation in R.

```R
# Install and load dplyr (uncomment if not installed)
# install.packages("dplyr")
library(dplyr)

# Sample dataset
sales_data <- data.frame(
  Region = c("North", "South", "North", "East", "South", "East"),
  Product = c("A", "B", "B", "A", "A", "B"),
  Revenue = c(1200, 1500, 800, 2000, 1100, 950)
)

# Using the pipe operator (%>%) to chain operations
summary_report <- sales_data %>%
  filter(Revenue > 900) %>%          # Filter rows
  group_by(Region) %>%               # Group by Region
  summarise(
    Total_Revenue = sum(Revenue),    # Aggregate data
    Avg_Revenue = mean(Revenue),
    Transaction_Count = n()
  ) %>%
  arrange(desc(Total_Revenue))       # Sort results

print(summary_report)
```

---

## 5. Statistical Analysis

R was created by statisticians, so the core tools for statistical modeling and analysis ship with the base installation.
### Descriptive Statistics

```R
# Generate a random sample of 100 numbers from a normal distribution
set.seed(42)  # Set seed for reproducibility
data_sample <- rnorm(100, mean = 50, sd = 10)

# Calculate basic statistics
mean_val <- mean(data_sample)
median_val <- median(data_sample)
sd_val <- sd(data_sample)
quantiles <- quantile(data_sample, probs = c(0.25, 0.5, 0.75))

cat("Mean:", mean_val, "\nSD:", sd_val, "\n")
```

### Linear Regression

```R
# Using the built-in 'mtcars' dataset
# Predict Miles Per Gallon (mpg) based on Horsepower (hp) and Weight (wt)
linear_model <- lm(mpg ~ hp + wt, data = mtcars)

# Print the model summary
summary(linear_model)

# Make a prediction for a new car
new_car <- data.frame(hp = 110, wt = 2.8)
predicted_mpg <- predict(linear_model, newdata = new_car)
cat("Predicted MPG:", predicted_mpg, "\n")
```

---

## 6. Data Visualization

R offers two primary plotting systems: Base R (built-in) and `ggplot2` (declarative, grammar of graphics).

### Base R Plotting

```R
# Simple scatter plot
plot(mtcars$wt, mtcars$mpg,
     main = "Weight vs. MPG",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles Per Gallon",
     col = "blue",
     pch = 19)

# Add a trend line
abline(lm(mpg ~ wt, data = mtcars), col = "red", lwd = 2)
```

### Advanced Plotting with ggplot2

```R
# Install and load ggplot2 (uncomment if not installed)
# install.packages("ggplot2")
library(ggplot2)

# Create a styled scatter plot with a regression line
ggplot(data = mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3, alpha = 0.8) +
  geom_smooth(method = "lm", se = FALSE, color = "darkgray") +
  labs(
    title = "Fuel Efficiency vs. Vehicle Weight",
    subtitle = "Grouped by Number of Cylinders",
    x = "Weight (1000 lbs)",
    y = "Miles Per Gallon (MPG)",
    color = "Cylinders"
  ) +
  theme_minimal()
```

---

## Considerations and Best Practices

When writing R code, keep the following professional guidelines in mind:

1. **Use the Arrow Operator (`<-`) for Assignment**: While `=` works in most contexts, `<-` is the idiomatic standard in R for variable assignment. Reserve `=` for passing arguments in function calls.
2. **Leverage Vectorization**: Avoid writing explicit `for` loops when a vectorized alternative exists. R is optimized for vectorized operations (e.g., `x + 1` instead of looping through each element of `x`).
3. **Manage Missing Data (`NA`)**: R uses `NA` to represent missing values. Many functions will return `NA` if the input contains missing values unless you explicitly pass `na.rm = TRUE` (e.g., `mean(x, na.rm = TRUE)`).
4. **Prefer the Tidyverse**: For data manipulation and visualization, packages like `dplyr`, `tidyr`, and `ggplot2` offer a consistent, pipe-friendly syntax that many teams find more readable and maintainable than the base R equivalents.
5. **Set Seeds for Reproducibility**: When using functions that involve randomness (like `rnorm`, `sample`, or machine learning algorithms), call `set.seed()` first so that the same sequence of random numbers is generated on every run.
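
Guidelines 2, 3, and 5 above can be checked in a few lines. The following is a minimal sketch using only base R; the variable names (`x`, `readings`, `first_draw`, and so on) are hypothetical and chosen only for illustration.

```R
# Vectorization: add 1 to every element without an explicit loop
x <- c(4, 8, 15, 16, 23, 42)
x_plus_one <- x + 1

# The equivalent explicit loop, shown only for comparison
x_loop <- numeric(length(x))
for (i in seq_along(x)) {
  x_loop[i] <- x[i] + 1
}
print(identical(x_plus_one, x_loop))  # TRUE

# Missing data: mean() returns NA unless na.rm = TRUE is passed
readings <- c(10, NA, 30)
mean_with_na <- mean(readings)                   # NA
mean_without_na <- mean(readings, na.rm = TRUE)  # 20

# Reproducibility: the same seed yields the same random draws
set.seed(123)
first_draw <- rnorm(3)
set.seed(123)
second_draw <- rnorm(3)
print(identical(first_draw, second_draw))  # TRUE
```

The vectorized form and the loop give identical results; the vectorized form is simply shorter and leaves the iteration to R's internals.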