R Data Types


Data types refer to a broad system used to declare variables or functions of different types.

The type of a variable determines how much storage it occupies and how the stored bit pattern is interpreted.

The three most fundamental data types in R are:

Numeric
Logical
Text

Numeric constants are mainly written in two forms:

Form	Examples
General form	123 -0.125
Scientific notation	1.23e2 -1.25E-1
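
As a quick check of the two notations, the sketch below compares the constants listed above. The variable names are chosen only for illustration.

```r
# The same values written in general form and in scientific notation
general    <- c(123, -0.125)
scientific <- c(1.23e2, -1.25E-1)

# Both notations denote the same double-precision numbers
same_values <- general == scientific
print(same_values)

# Both are stored with the "numeric" class
value_class <- class(scientific)
print(value_class)
```

Either notation can be used wherever a number is expected; R does not distinguish between them once the value is stored.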

The logical type, called Boolean in many other programming languages, has only two constant values: TRUE and FALSE.

Note: R is case-sensitive, so true or True cannot stand in for TRUE.

The most intuitive data type is text, which other languages usually call a string and typically write in double quotes. In R, text constants can be enclosed in either single or double quotes, for example:

Example

> '' == ""
[1] TRUE

Variable definition in R differs from strongly typed languages, where a variable must be declared with a name and a data type. In R, assigning a value to a new name defines the variable, and assigning to an existing name rebinds it, possibly to a value of a different type:

Example

a = 1
b <- TRUE
b = "abc"
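
A minimal sketch of how the type follows the value rather than the name, using the built-in class() function. The variables a and b mirror the example above; the other names are illustrative.

```r
# class() reports the type of the value currently bound to a name
b <- TRUE
first_class <- class(b)     # "logical"

b <- "abc"                  # rebinding b to text changes its type
second_class <- class(b)    # "character"

a <- 1
third_class <- class(a)     # "numeric"

print(c(first_class, second_class, third_class))
```

Note that R reports the text type as "character".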

By object type, R has the following six types (each is introduced in detail later):

Vector
List
Matrix
Array
Factor
Data Frame

Vector

Vectors are often provided in the standard libraries of general-purpose programming languages such as Java, Rust, and C#, because vectors are indispensable in mathematical operations. The most common are two-dimensional vectors, which are used throughout planar coordinate systems.

From a data structure perspective, a vector is a linear list and can be thought of as an array.

The existence of vectors as a type in R makes vector operations easier:

Example

> a = c(3, 4)
> b = c(5, 0)
> a + b
[1] 8 4

c() is a function for creating vectors.

Here, adding two two-dimensional vectors yields a new two-dimensional vector (8, 4). An operation between a two-dimensional vector and a three-dimensional vector has no mathematical meaning. R does not stop in that case: it recycles the shorter vector to match the longer one and issues a warning.

It is best to avoid this situation as a matter of habit.
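
To make the mismatched-length case concrete, here is a small sketch that adds a two-element vector to a three-element one and captures the warning instead of printing it. The variable names and the use of withCallingHandlers() are choices made for this illustration.

```r
a <- c(3, 4)        # two elements
b <- c(1, 2, 3)     # three elements

# Capture the warning so both the result and the message can be inspected
warning_text <- NULL
result <- withCallingHandlers(
  a + b,
  warning = function(w) {
    warning_text <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(result)        # a is recycled to (3, 4, 3) before the addition
print(warning_text)
```

The result has three elements because the shorter vector is reused from its beginning, which is rarely what a mathematical vector operation intends.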

Each element in a vector can be accessed individually using an index:

Example

> a = c(10, 20, 30, 40, 50)
> a[2]
[1] 20

Note: An index in R is a position, not an offset, so indexing starts from 1!
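
The sketch below illustrates position-based indexing at the two ends of a vector, and what index 0 returns. The variable names are illustrative.

```r
a <- c(10, 20, 30, 40, 50)

first <- a[1]          # the first element sits at index 1
last  <- a[length(a)]  # the last element sits at index length(a)
zero  <- a[0]          # index 0 selects nothing: an empty numeric vector

print(first)
print(last)
print(length(zero))
```

Because a[0] silently returns an empty vector rather than raising an error, off-by-one mistakes carried over from zero-based languages can be easy to miss.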

R can also conveniently extract a part of a vector:

Example

> a[1:4]  # Extract items 1 to 4, inclusive
[1] 10 20 30 40
> a[c(1, 3, 5)]  # Extract items 1, 3, and 5
[1] 10 30 50
> a[c(-1, -5)]  # Drop items 1 and 5
[1] 20 30 40

These three partial extraction methods are the most commonly used.

Vectors support scalar operations:

Example

> c(1.1, 1.2, 1.3) - 0.5
[1] 0.6 0.7 0.8
> a = c(1, 2)
> a ^ 2
[1] 1 4

The common mathematical functions mentioned earlier, such as sqrt and exp, also work on vectors, applying the operation to each element.
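
A short sketch of sqrt and exp applied to whole vectors; the input values are picked only for illustration.

```r
x <- c(1, 4, 9)

roots  <- sqrt(x)          # square root of each element: 1 2 3
powers <- exp(c(0, 1))     # e^0 and e^1

print(roots)
print(powers)
```

No loop is needed: each function returns a vector of the same length as its input.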

As a linear list structure, a vector should come with the usual list-processing functions, and R does provide them:

Vector sorting:

Example

> a = c(1, 3, 5, 2, 4, 6)
> sort(a)
[1] 1 2 3 4 5 6
> rev(a)
[1] 6 4 2 5 3 1
> order(a)
[1] 1 4 2 5 3 6
> a[order(a)]
[1] 1 2 3 4 5 6

The order() function returns the indices that would sort the vector, which is why a[order(a)] gives the same result as sort(a).
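
One common use of order() is reordering one vector by the values of another. The sketch below assumes two hypothetical parallel vectors, people and scores, invented for this illustration.

```r
# Hypothetical parallel vectors: scores[i] belongs to people[i]
people <- c("Ann", "Bob", "Cy")
scores <- c(72, 95, 88)

# Indices of scores from highest to lowest
ranking <- order(scores, decreasing = TRUE)

ranked_people <- people[ranking]
print(ranking)         # 2 3 1
print(ranked_people)   # "Bob" "Cy" "Ann"
```

sort() alone could not do this, because it returns the sorted values and discards the information about where they came from.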

Vector Statistics

R has a comprehensive set of statistical functions:

Function Name	Meaning
sum	Sum
mean	Mean
var	Variance
sd	Standard Deviation
min	Minimum
max	Maximum
range	Range (a two-element vector: minimum and maximum)

Vector statistics example:

Example

> sum(1:5)
[1] 15
> sd(1:5)
[1] 1.581139
> range(1:5)
[1] 1 5
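
The remaining functions from the table can be sketched the same way. The manual variance calculation is included to show that var() uses the sample formula, dividing by n - 1; the variable names are illustrative.

```r
x <- 1:5

x_mean <- mean(x)   # 3
x_var  <- var(x)    # 2.5
x_sd   <- sd(x)     # square root of the variance
x_min  <- min(x)    # 1
x_max  <- max(x)    # 5

# Sample variance computed by hand: squared deviations divided by n - 1
manual_var <- sum((x - x_mean)^2) / (length(x) - 1)

print(c(x_mean, x_var, x_sd, x_min, x_max))
print(manual_var)
```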

Vector Generation

Vectors can be generated with the c() function, or with the min:max operator, which produces a sequence of consecutive values.

To generate an arithmetic sequence with a step other than 1, use the seq function:

> seq(1, 9, 2)
[1] 1 3 5 7 9

seq can also generate an arithmetic sequence from m to n if you specify m, n, and the length of the sequence:

> seq(0, 1, length.out = 3)
[1] 0.0 0.5 1.0

rep stands for repeat, and generates a sequence of repeated values:

> rep(0, 5)
[1] 0 0 0 0 0
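
A few further variations on seq and rep, sketched with named arguments: by, times, and each are standard arguments of these functions, while the values are picked for illustration.

```r
# A descending sequence needs a negative step
down <- seq(10, 1, by = -3)          # 10 7 4 1

# rep() also accepts a vector: repeat it as a whole, or element by element
whole   <- rep(c(1, 2), times = 2)   # 1 2 1 2
by_each <- rep(c(1, 2), each = 2)    # 1 1 2 2

print(down)
print(whole)
print(by_each)
```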

NA and NULL often come up when working with vectors. Here is what the two terms mean and how they differ:

NA represents "missing"; NULL represents "non-existent".
NA is like a placeholder: there is no value here, but the position exists.
NULL means that the data does not exist at all.

For example:

Example

> length(c(NA, NA, NULL))
[1] 2
> c(NA, NA, NULL, NA)
[1] NA NA NA

As the output shows, NULL is simply dropped when it is placed in a vector, so it has no meaning there.
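
The sketch below shows how the two values behave in practice: NA keeps its position and propagates through calculations unless it is explicitly removed, while NULL has length zero. The vector x is invented for this illustration.

```r
x <- c(1, NA, 3)

missing_flags <- is.na(x)               # FALSE TRUE FALSE
x_length      <- length(x)              # 3: the NA still occupies a position
plain_mean    <- mean(x)                # NA: a missing value makes the mean unknown
clean_mean    <- mean(x, na.rm = TRUE)  # 2: NA is removed before averaging

null_length <- length(NULL)             # 0: NULL contains nothing
null_check  <- is.null(NULL)            # TRUE

print(missing_flags)
print(c(x_length, plain_mean, clean_mean, null_length))
```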

String

The string data type itself is not complex; this section focuses on the functions for working with strings:

Example

> toupper("Runoob")  # Convert to uppercase
[1] "RUNOOB"
> tolower("Runoob")  # Convert to lowercase
[1] "runoob"
> nchar("中文", type = "bytes")  # Count byte length
[1] 4
> nchar("中文", type = "chars")  # Count number of characters
[1] 2
> substr("123456789", 1, 5)  # Extract substring, characters 1 to 5
[1] "12345"
> substring("1234567890", 5)  # Extract substring, from character 5 to the end
[1] "567890"
> as.numeric("12")  # Convert string to number
[1] 12
> as.character(12.34)  # Convert number to string
[1] "12.34"
> strsplit("2019;10;1", ";")  # Split string by delimiter
[[1]]
[1] "2019" "10"   "1"

> gsub("/", "-", "2019/10/1")  # Replace in string
[1] "2019-10-1"

On a Windows computer that uses the GBK encoding, one Chinese character occupies two bytes. On a computer that uses UTF-8, a single Chinese character occupies 3 bytes.
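
The difference between characters and bytes can be sketched for the UTF-8 case as follows. The string is written with Unicode escapes so that the source itself stays ASCII-only; this is an illustration under the assumption that the string is held as UTF-8, not a description of any particular system's default encoding.

```r
# "\u4e2d\u6587" spells two Chinese characters using Unicode escapes
x <- enc2utf8("\u4e2d\u6587")

n_chars <- nchar(x, type = "chars")   # 2 characters
n_bytes <- length(charToRaw(x))       # 6 bytes: 3 per character in UTF-8

print(n_chars)
print(n_bytes)
```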

R supports regular expressions (POSIX extended syntax by default, and Perl-compatible syntax with perl = TRUE):

Example

> gsub("[[:alpha:]]+", "$", "Two words")
[1] "$ $"
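
A few more regular-expression calls, sketched under the assumption that only base R functions (sub, gsub, grepl) are used. The input strings are invented for illustration.

```r
# sub() replaces only the first match, gsub() replaces every match
only_first <- sub("[[:alpha:]]+", "$", "Two words")              # "$ words"

# perl = TRUE switches to Perl-compatible syntax
digits <- gsub("\\d+", "#", "a1b22c333", perl = TRUE)            # "a#b#c#"

# Parenthesised groups can be reused in the replacement as \\1, \\2, ...
swapped <- gsub("(\\w+) (\\w+)", "\\2 \\1", "Two words")         # "words Two"

# grepl() tests each element of a vector for a match
all_digits <- grepl("^[0-9]+$", c("2019", "10a"))                # TRUE FALSE

print(c(only_first, digits, swapped))
print(all_digits)
```

Backslashes are doubled because the pattern is first an R string, and only then a regular expression.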

For more string content, refer to: R Language String Introduction.

Matrix

R provides a matrix type for the study of linear algebra. This data structure is similar to a two-dimensional array in other languages, but R provides language-level matrix operation support.

First, let's look at how a matrix is created:

Example

> vector = c(1, 2, 3, 4, 5, 6)
> matrix(vector, 2, 3)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

The initial contents of the matrix are supplied as a vector, followed by the number of rows and the number of columns.

The values in the vector fill the matrix column by column. To fill by row instead, set the byrow argument:

Example

> matrix(vector, 2, 3, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6

Each value in the matrix can be accessed directly:

Example

> m1 = matrix(vector, 2, 3, byrow = TRUE)
> m1[1, 1]  # Row 1, Column 1
[1] 1
> m1[1, 3]  # Row 1, Column 3
[1] 3

The rows and columns of an R matrix can be named; all names are set at once with a character vector:

Example

> colnames(m1) = c("x", "y", "z")
> rownames(m1) = c("a", "b")
> m1
  x y z
a 1 2 3
b 4 5 6
> m1["a", ]
x y z 
1 2 3 

Arithmetic on matrices works much as it does on vectors: an operation can combine a matrix with a scalar, or combine the corresponding elements of two matrices of the same size.
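
The sketch below contrasts element-wise arithmetic with the matrix product computed by %*%. The matrix A is chosen for illustration.

```r
A <- matrix(c(1, 3, 2, 4), 2, 2)   # rows are (1, 2) and (3, 4)

doubled     <- A * 2     # every element times a scalar
summed      <- A + A     # corresponding elements added
elementwise <- A * A     # corresponding elements multiplied: rows (1, 4) and (9, 16)
product     <- A %*% A   # true matrix product: rows (7, 10) and (15, 22)

print(elementwise)
print(product)
```

The * operator never performs matrix multiplication, so the two results differ for almost any matrix.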

Matrix multiplication:

Example

> m1 = matrix(c(1, 2), 1, 2)
> m2 = matrix(c(3, 4), 2, 1)
> m1 %*% m2
     [,1]
[1,]   11

Inverse matrix:

Example

> A = matrix(c(1, 3, 2, 4), 2, 2)
> solve(A)
     [,1] [,2]
[1,] -2.0  1.0
[2,]  1.5 -0.5

The solve() function solves systems of linear equations. The basic usage is solve(A, b), where A is the coefficient matrix of the system and b is the vector or matrix of right-hand sides. When b is omitted, it is taken to be the identity matrix, so solve(A) returns the inverse of A.
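
As a worked sketch of solve(A, b), consider the system x + 2y = 5 and 3x + 4y = 11. The right-hand side b is invented for this illustration; A is the same matrix as in the inverse example.

```r
A <- matrix(c(1, 3, 2, 4), 2, 2)   # coefficients: rows (1, 2) and (3, 4)
b <- c(5, 11)                      # right-hand sides

solution <- solve(A, b)            # x = 1, y = 2
print(solution)

# Multiplying A by its inverse should give the identity matrix,
# up to floating-point rounding
inverse <- solve(A)
check   <- A %*% inverse
print(round(check, 10))
```

Calling solve(A, b) directly is generally preferable to computing solve(A) %*% b, since it avoids forming the inverse explicitly.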

The apply() function can treat each row or column of a matrix as a vector for operations:

Example

> (A = matrix(c(1, 3, 2, 4), 2, 2))
     [,1] [,2]
[1,]    1    2
[2,]    3    4
> apply(A, 1, sum)  # Second argument 1: operate row-wise, applying sum() to each row
[1] 3 7
> apply(A, 2, sum)  # Second argument 2: operate column-wise
[1] 4 6
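
apply() is not limited to sum. The sketch below passes other built-in functions and a small anonymous function; the helper logic is illustrative only.

```r
A <- matrix(c(1, 3, 2, 4), 2, 2)   # rows are (1, 2) and (3, 4)

row_max  <- apply(A, 1, max)       # largest value in each row: 2 4
col_mean <- apply(A, 2, mean)      # mean of each column: 2 3

# Any function of a vector can be used, including one defined inline
col_spread <- apply(A, 2, function(col) max(col) - min(col))   # 2 2

# For plain row sums, rowSums() gives the same result as apply(A, 1, sum)
same_sums <- identical(apply(A, 1, sum), rowSums(A))

print(row_max)
print(col_mean)
print(col_spread)
```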

For more matrix content, refer to: R Matrix.
