R Input Xml File
XML stands for eXtensible Markup Language. XML is designed to transport and store data.
If you are not familiar with XML, you can first refer to: (#)
Reading and writing XML files in R requires installing an extension package. We can install it by entering the following command in the R console:
install.packages("XML", repos = "https://mirrors.ustc.edu.cn/CRAN/")
To check if the installation was successful:
> any(grepl("XML",installed.packages())) TRUE
Create a sites.xml file. The XML file should be in the same directory as the test script. The code is as follows:
## Example
1Googlewww.google.com1112Tutorialwww..com2223Taobaowww.taobao.com333
Next, we can use the XML package to load the data from the XML file:
## Example
# Load the XML package
library("XML")
# Set the filename
result <- xmlParse(file="sites.xml")
# Print the result
print(result)
To count the amount of XML data:
## Example
# Load the XML package
library("XML")
# Set the filename
result <- xmlParse(file="sites.xml")
# Extract the root node
rootnode <- xmlRoot(result)
# Count the data amount
rootsize <- xmlSize(rootnode)
# Print the result
print(rootsize)
Executing the above code outputs the following result:
3
To view node data, use for a specific row, and [] for a specific row and column:
## Example
# Load the XML package
library("XML")
# Set the filename
result <- xmlParse(file="sites.xml")
# Extract the root node
rootnode <- xmlRoot(result)
# View the 2nd node's data
print(rootnode)
# View the 1st data item of the 2nd node
print(rootnode[][])
# View the 3rd data item of the 2nd node
print(rootnode[][])
Executing the above code outputs the following result:
$site 2 www. 222 attr(,"class") "XMLInternalNodeList" "XMLNodeList" 2 www.
### Convert XML to a Data List
The above code outputs data in XML format. We can use the `xmlToList()` function to convert the file data into a list format, which is easier to read:
## Example
# Load the XML package
library("XML")
# Set the filename
result <- xmlParse(file="sites.xml")
# Convert to a list
xml_data <- xmlToList(result)
print(xml_data)
print("============================")
# Print the data from the first row, second column
print(xml_data[][])
Executing the above code outputs the following result:
$site $site$id "1" $site$name "Google" $site$url "www.google.com" $site$likes "111" $site $site$id "2" $site$name "" $site$url "www." $site$likes "222" $site $site$id "3" $site$name "Taobao" $site$url "www.taobao.com" $site$likes "333" "============================" "Google"
### Convert XML to a Data Frame
XML file data can be converted into a data frame type, which makes it more convenient for us to manipulate the data:
## Example
# Load the XML package
library("XML")
# Convert XML file data to a data frame
xmldataframe <- xmlToDataFrame("sites.xml")
print(xmldataframe)
Executing the above code outputs the following result:
id name url likes 1 1 Google www.google.com 1112 2 www. 2223 3 Taobao www.taobao.com 333
YouTip