A string is a sequence of characters.

In Lua, a string is a fundamental data type used to store text data.

Strings in Lua can contain any character, including letters, digits, symbols, spaces, and other special characters.

In Lua, strings can be represented in the following three ways:
- A string enclosed in single quotes.

local str1 = 'This is a string.'
local str2 = "This is also a string."

- A string enclosed in double quotes.

local str = "Hello, "
str = str .. "World!" -- Creates a new string and assigns it to str
print(str) -- Outputs "Hello, World!"

- A string enclosed between [[ and ]].

local multilineString = [[This is a multiline string.
It can contain multiple lines of text.
No need for escape characters.]]
print(multilineString)

Examples of strings using the three methods above:
Example

string1 = "Lua"
print("\"String 1 is\"", string1)
string2 = '.com'
print("String 2 is", string2)
string3 = [["Lua Tutorial"]]
print("String 3 is", string3)

The output of the above code is:

"String 1 is" Lua
String 2 is .com
String 3 is "Lua Tutorial"

String Length Calculation

In Lua, the length of a string can be obtained with the string.len function or the utf8.len function. string.len returns the number of bytes, which equals the number of characters only when the string contains just ASCII characters. For strings containing multi-byte characters such as Chinese, utf8.len (available since Lua 5.3) returns the number of characters.
Example

local myString = "Hello, !"

-- Calculate the length of the string (number of bytes)
local length = string.len(myString)
print(length) -- Outputs 8

The myString in the above example contains only ASCII characters, so the string.len function returns the number of characters accurately.
For strings containing Chinese characters, use the utf8.len function:
Example

local myString = "Hello, 世界!"

-- Calculate the length of the string (number of characters)
local length1 = utf8.len(myString)
print(length1) -- Outputs 10

-- string.len counts bytes, so the result differs from the character count
local length2 = string.len(myString)
print(length2) -- Outputs 14

The output is:

10
14

Escape characters are used to represent characters that cannot be typed directly, such as backspace and carriage return. For example, to represent a double quote within a double-quoted string, you can use \".
All escape characters and their corresponding meanings:
| Escape Character | Meaning | ASCII Code (Decimal) |
|---|---|---|
| \a | Bell (BEL) | 007 |
| \b | Backspace (BS), moves cursor to the previous column | 008 |
| \f | Form Feed (FF), moves cursor to the beginning of the next page | 012 |
| \n | Newline (LF), moves cursor to the beginning of the next line | 010 |
| \r | Carriage Return (CR), moves cursor to the beginning of the current line | 013 |
| \t | Horizontal Tab (HT), jumps to the next tab position | 009 |
| \v | Vertical Tab (VT) | 011 |
| \\ | Represents a backslash character '\' | 092 |
| \' | Represents a single quote (apostrophe) character | 039 |
| \" | Represents a double quote character | 034 |
| \0 | Null character (NUL) | 000 |
| \ddd | The character whose code is given by 1 to 3 decimal digits | One to three decimal digits |
| \xhh | The character whose code is given by exactly 2 hexadecimal digits (Lua 5.2 and later) | Two-digit hexadecimal |
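The escape sequences in the table can be checked with a short script. The following is a minimal sketch, not part of the original tutorial: the variable names and sample strings are illustrative assumptions, and the \xhh form assumes Lua 5.2 or later.

```lua
-- Illustrative sketch: names and sample strings are hypothetical.
local quoted = "She said \"hi\""     -- \" embeds a double quote
local path = "C:\\temp"              -- \\ embeds a single backslash
local two_lines = "first\nsecond"    -- \n embeds a line feed (decimal 10)
local letter_dec = "\65"             -- \ddd: decimal code 65 is "A"
local letter_hex = "\x41"            -- \xhh: hexadecimal 41 is also "A" (Lua 5.2+)

print(quoted)                        --> She said "hi"
print(path)                          --> C:\temp
print(#path)                         --> 7
print(string.byte(two_lines, 6))     --> 10
print(letter_dec == letter_hex)      --> true
```

Note that the escape sequence is only source-code notation: the stored string contains one character per escape, which is why the length of path is 7 rather than 8.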
String Operations
Lua provides many functions to support string operations:

| No. | Method & Usage |
|---|---|
| 1 | string.upper(argument): Converts the string to all uppercase letters. |
| 2 | string.lower(argument): Converts the string to all lowercase letters. |
| 3 | string.gsub(mainString, findString, replaceString, num): Replaces occurrences in a string. mainString is the string to operate on, findString is the pattern to be replaced, replaceString is the replacement, and num is the maximum number of replacements (optional; if omitted, all occurrences are replaced). Returns the new string and the number of replacements made. Example: string.gsub("aaaa", "a", "z", 3) returns zzza and 3. |
| 4 | string.find(str, substr [, init [, plain]]): Searches for substr within the target string str. If a match is found, it returns the starting and ending indices of the match; otherwise, it returns nil. init specifies the starting position of the search (default is 1; a negative number counts from the end). plain indicates whether to turn off pattern matching (default is false): if true, substr is searched for as plain text; if false, substr is treated as a Lua pattern. Example: string.find("Hello Lua user", "Lua", 1) returns 7 and 9. |
| 5 | string.reverse(arg): Reverses a string. string.reverse("Lua") returns auL. |
| 6 | string.format(...): Returns a formatted string, similar to printf. string.format("the value is:%d", 4) returns the value is:4. |
| 7 | string.char(arg) and string.byte(arg[, int]): char converts integers to characters and concatenates them. byte converts a character to its integer value (a position can be given; the default is the first character). string.char(97, 98, 99, 100) returns abcd. string.byte("ABCD", 4) returns 68. string.byte("ABCD") returns 65. |
| 8 | string.len(arg): Returns the length of a string in bytes. string.len("abc") returns 3. |
| 9 | string.rep(string, n): Returns n copies of the string string. string.rep("abcd", 2) returns abcdabcd. |
| 10 | ..: Concatenates two strings. print("www.." .. "com") outputs www..com. |
| 11 | string.gmatch(str, pattern): Returns an iterator function. Each time this function is called, it returns the next substring in str that matches pattern. When no further match is found, the iterator function returns nil. Example: for word in string.gmatch("Hello Lua user", "%a+") do print(word) end prints Hello, Lua, and user on separate lines. |
| 12 | string.match(str, pattern, init): Finds only the first match in the source string str. The parameter init is optional and specifies the starting position of the search (default is 1). On a successful match, the function returns all captures from the pattern; if the pattern has no captures, it returns the entire matched string. When there is no match, it returns nil. Example: string.match("I have 2 questions for you.", "%d+ %a+") returns 2 questions. string.format("%d, %q", string.match("I have 2 questions for you.", "(%d+) (%a+)")) returns 2, "questions". |
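Several of the functions in the table return more than one value or take optional arguments, which is easier to see in running code. The following is a minimal sketch under stated assumptions: the sample text and the variable names (text, replaced, pairs_found) are invented for illustration.

```lua
-- Illustrative sketch; the sample strings and variable names are assumptions.
local text = "key1=10, key2=20, key3=30"

-- gsub returns the new string and the number of substitutions made.
local replaced, count = string.gsub(text, "key", "k", 2)
print(replaced)   --> k1=10, k2=20, key3=30
print(count)      --> 2

-- With plain = true, "." is searched for literally instead of as a pattern.
local dot_start, dot_end = string.find("a+b.c", ".", 1, true)
print(dot_start, dot_end)   --> 4   4

-- gmatch with two captures yields both values on each iteration.
local pairs_found = {}
for k, v in string.gmatch(text, "(%w+)=(%d+)") do
  pairs_found[k] = tonumber(v)
end
print(pairs_found.key2)     --> 20

-- match returns the captures of the first match only.
local first_key, first_value = string.match(text, "(%w+)=(%d+)")
print(first_key, first_value)   --> key1   10
```

Captures come back as strings, so the sketch converts the value with tonumber before storing it.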
String Substring
Substrings are extracted with the string.sub() function. Its prototype is:

string.sub(s, i [, j])

Parameter description:

- s: The string to extract from.
- i: The starting position of the extraction (negative values count from the end of the string).
- j: The ending position of the extraction (default is -1, the last character).
Example

-- String
local sourcestr = "prefix--tutorialgoogletaobao--suffix"
print("\nOriginal string", string.format("%q", sourcestr))

-- Extract a part, from the 4th to the 15th character
local first_sub = string.sub(sourcestr, 4, 15)
print("\nFirst extraction", string.format("%q", first_sub))

-- Get the string prefix, from the 1st to the 8th character
local second_sub = string.sub(sourcestr, 1, 8)
print("\nSecond extraction", string.format("%q", second_sub))

-- Extract the last 10 characters
local third_sub = string.sub(sourcestr, -10)
print("\nThird extraction", string.format("%q", third_sub))

-- Start index out of bounds, outputs the original string
local fourth_sub = string.sub(sourcestr, -100)
print("\nFourth extraction", string.format("%q", fourth_sub))

The output of the above code is:

Original string "prefix--tutorialgoogletaobao--suffix"
First extraction "fix--tutoria"
Second extraction "prefix--"
Third extraction "ao--suffix"
Fourth extraction "prefix--tutorialgoogletaobao--suffix"

String Case Conversion

The following example demonstrates how to convert the case of a string:
Example

string1 = "Lua"
print(string.upper(string1))
print(string.lower(string1))

The output of the above code is:

LUA
lua

String Find and Reverse

The following example demonstrates how to find a substring and reverse a string:
Example

-- Named str rather than string, so the standard string library is not shadowed
local str = "Lua Tutorial"

-- Find the substring
print(string.find(str, "Tutorial"))

local reversedString = string.reverse(str)
print("New string is", reversedString)

The output of the above code is:

5 12
New string is lairotuT auL

String Formatting

Lua provides the string.format() function to generate strings with a specific format. The first parameter of the function is the format string, followed by the values corresponding to each specifier in the format.
Format strings make code that builds long strings much easier to read. The function works much like printf() in the C language.

Format strings may contain the following conversion specifiers; examples follow the two lists below:
- %c - Accepts a number and converts it to the corresponding character in the ASCII table
- %d, %i - Accepts a number and converts it to a signed integer format
- %o - Accepts a number and converts it to an octal format
- %u - Accepts a number and converts it to an unsigned integer format
- %x - Accepts a number and converts it to a hexadecimal format, using lowercase letters
- %X - Accepts a number and converts it to a hexadecimal format, using uppercase letters
- %e - Accepts a number and converts it to scientific notation format, using lowercase 'e'
- %E - Accepts a number and converts it to scientific notation format, using uppercase 'E'
- %f - Accepts a number and converts it to floating-point format
- %g (%G) - Accepts a number and converts it to the shorter format between %e (%E, corresponding to %G) and %f
- %q - Accepts a string and converts it to a format that can be safely read back by the Lua interpreter
- %s - Accepts a string and formats it according to the given parameters
To further refine the format, parameters can be added after the % sign. Parameters are read in the following order:

- (1) Sign: A '+' sign indicates that the subsequent numeric conversion will display a positive sign for positive numbers. By default, only negative numbers display a sign.
- (2) Padding character: A '0' is used for padding when a width is specified. The default padding character is a space.
- (3) Alignment flag: When a width is specified, the default is right-aligned. Adding a '-' sign changes it to left-aligned.
- (4) Width value
- (5) Decimal places/String truncation: The precision 'n' added after the width value. If followed by 'f' (floating-point conversion, e.g., %6.3f), it sets the floating-point number to retain only 'n' decimal places. If followed by 's' (string conversion, e.g., %5.3s), it sets the string to display only the first 'n' characters.
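The sign, padding, alignment, width, and precision parameters can be combined in one format string. The following is a minimal sketch that lines up a small table of values; the rows data and the variable names are made up for illustration.

```lua
-- Illustrative sketch; the row data below is invented for the example.
local rows = {
  { name = "apple",  qty = 3,  price = 1.5 },
  { name = "cherry", qty = 12, price = 10.25 },
}

local lines = {}
for _, row in ipairs(rows) do
  -- %-8s   : left-aligned in 8 columns
  -- %03d   : zero-padded to 3 digits
  -- %+8.2f : explicit sign, 8 columns wide, 2 decimal places
  lines[#lines + 1] = string.format("%-8s|%03d|%+8.2f", row.name, row.qty, row.price)
end

print(table.concat(lines, "\n"))
--> apple   |003|   +1.50
--> cherry  |012|  +10.25
```

The '|' characters are ordinary text in the format string; they are included only to make the column boundaries visible in the output.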
Example

string1 = "Lua"
string2 = "Tutorial"
number1 = 10
number2 = 20

-- Basic string formatting
print(string.format("Basic formatting %s %s", string1, string2))

-- Date formatting
date = 2; month = 1; year = 2014
print(string.format("Date formatting %02d/%02d/%04d", date, month, year))

-- Decimal formatting
print(string.format("%.4f", 1/3))

The output of the above code is:

Basic formatting Lua Tutorial
Date formatting 02/01/2014
0.3333

Other examples:

Example

string.format("%c", 83) -- Outputs S
string.format("%+d", 17.0) -- Outputs +17
string.format("%05d", 17) -- Outputs 00017
string.format("%o", 17) -- Outputs 21
string.format("%u", 3.14) -- Outputs 3 in Lua 5.1 and 5.2; Lua 5.3 and later raise an error because 3.14 has no integer representation
string.format("%x", 13) -- Outputs d
string.format("%X", 13) -- Outputs D
string.format("%e", 1000) -- Outputs 1.000000e+03
string.format("%E", 1000) -- Outputs 1.000000E+03
string.format("%6.3f", 13) -- Outputs 13.000
string.format("%q", "One\n Two") -- Outputs "One\ followed by a line break and  Two"
string.format("%s", "monkey") -- Outputs monkey
string.format("%10s", "monkey") -- Outputs "    monkey" (right-aligned in 10 columns)
string.format("%5.3s", "monkey") -- Outputs "  mon" (first 3 characters, right-aligned in 5 columns)

Character and Integer Conversion

The following example demonstrates converting between characters and integers:
Example

-- Character conversion
-- Convert the first character
print(string.byte("Lua"))

-- Convert the third character
print(string.byte("Lua", 3))

-- Convert the last character
print(string.byte("Lua", -1))

-- Convert the second character
print(string.byte("Lua", 2))

-- Convert the second to last character
print(string.byte("Lua", -2))

-- Convert integer ASCII code to character
print(string.char(97))

The output of the above code is:

76
97
97
117
117
a

Other Common Functions

The following example demonstrates other string operations, such as calculating string length, string concatenation, and string repetition:
Example

string1 = "www."
string2 = "tutorial"
string3 = ".com"

-- Use .. for string concatenation
print("Concatenated string", string1 .. string2 .. string3)

-- String length
print("String length ", string.len(string2))

-- Repeat string 2 times
repeatedString = string.rep(string2, 2)
print(repeatedString)

The output of the above code is:

Concatenated string www.tutorial.com
String length 8
tutorialtutorial

Pattern Matching

Patterns in Lua are written directly as ordinary strings. They are used by the pattern matching functions string.find, string.gmatch, string.gsub, and string.match.

You can also use character classes in pattern strings.

A character class is a pattern item that can match any character within a specific character set. For example, the character class %d matches any digit, so the pattern string %d%d/%d%d/%d%d%d%d can be used to search for dates in dd/mm/yyyy format:
Example

s = "Deadline is 30/05/1999, firm"
date = "%d%d/%d%d/%d%d%d%d"
print(string.sub(s, string.find(s, date))) --> 30/05/1999

The following list shows all character classes supported by Lua:
- Single character (except ^$()%.[]*+-?): Matches the character itself.
- . (dot): Matches any character.
- %a: Matches any letter.
- %c: Matches any control character (e.g., \n).
- %d: Matches any digit.
- %l: Matches any lowercase letter.
- %p: Matches any punctuation character.
- %s: Matches any whitespace character.
- %u: Matches any uppercase letter.
- %w: Matches any alphanumeric character.
- %x: Matches any hexadecimal digit.
- %z: Matches the character with code 0 (deprecated since Lua 5.2, where patterns may contain '\0' directly).
- %x (where x is a non-alphanumeric character): Matches the character x. This is mainly used to match the special characters (^$()%.[]*+-?) literally; for example, %% matches %.
- [set]: Matches any character that belongs to one of the characters or character classes listed within []. For example, [%w_] matches any alphanumeric character or underscore (_).
- [^set]: Matches any character that does not belong to set. For example, [^%s] matches any non-whitespace character.
When the letter of one of the above character classes is written in uppercase, the class matches any character that is not in the original class. For example, %S matches any non-whitespace character, and %A matches any non-letter character:

> print(string.gsub("hello, up-down!", "%A", "."))
hello..up.down. 4

The number 4 is not part of the string result; it is the second result returned by gsub, representing the number of replacements that occurred.
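Complement classes are handy for stripping unwanted characters, and the second result of gsub can be kept or discarded as needed. The following is a minimal sketch; the phone-number string and the variable names are invented for illustration.

```lua
-- Illustrative sketch; the phone-number string is an invented sample.
local raw = "+1 (555) 010-9999"

-- %D is the complement of %d, so this removes every non-digit character.
local digits, removed = string.gsub(raw, "%D", "")
print(digits)    --> 15550109999
print(removed)   --> 6

-- Wrapping the call in parentheses keeps only the first result,
-- so the replacement count is not printed.
local compact = (string.gsub(raw, "%s", ""))
print(compact)   --> +1(555)010-9999
```

The extra pair of parentheses is the usual Lua idiom for truncating a multiple-result call to its first value.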
Some characters have special meanings in patterns. The special characters in Lua are as follows:

( ) . % + - * ? [ ^ $

'%' is used as an escape character for special characters, so '%.' matches a dot and '%%' matches the character '%'. The escape character '%' can be used not only for special characters but also for any other non-alphanumeric character.
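Because '%' may precede any non-alphanumeric character, arbitrary text can be turned into a pattern that matches itself literally. The following is a minimal sketch; escape_pattern is a hypothetical helper written for this example, not a function from the standard library.

```lua
-- Illustrative sketch; escape_pattern is a hypothetical helper,
-- not part of Lua's standard string library.
local function escape_pattern(s)
  -- Prefix every non-alphanumeric character with '%' so it matches literally.
  return (string.gsub(s, "%W", "%%%0"))
end

local price = "total: 3.50 (approx.)"

-- Unescaped, "." matches any character, so "3x50" also matches "3.50".
print(string.find("3x50", "3.50"))    --> 1   4
-- Escaped, "%." matches only a real dot.
print(string.find("3x50", "3%.50"))   --> nil

print(escape_pattern("(approx.)"))    --> %(approx%.%)
local s, e = string.find(price, escape_pattern("(approx.)"))
print(s, e)                           --> 13  21
```

When no pattern features are needed at all, passing true as the fourth argument of string.find (the plain flag) is a simpler alternative to escaping.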
Pattern items can be:

- A single character class, which matches any single character in the class;
- A single character class followed by '*', which matches zero or more characters of that class. This item always matches the longest possible string;
- A single character class followed by '+', which matches one or more characters of that class. This item always matches the longest possible string;
- A single character class followed by '-', which matches zero or more characters of that class. Unlike '*', this item always matches the shortest possible string;
- A single character class followed by '?', which matches zero or one character of that class. It will match one if possible;
- %n, where n can be from 1 to 9; this item matches a substring equal to the nth capture (described later);
- %bxy, where x and y are two distinct characters; this item matches a string that starts with x, ends with y, and in which x and y are balanced. This means that, reading the string from left to right, the count goes up by 1 for each x and down by 1 for each y, and the closing y is the first one that brings the count to 0. For example, the item %b() matches an expression with balanced parentheses;
- %f[set], a frontier pattern; this item matches an empty string at a position where the previous character is not in set and the next character is in set. The set is written as described above. The beginning and the end of the subject string are handled as if they were the character '\0'.
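The differences between these pattern items are easiest to see side by side. The following is a minimal sketch; all sample strings and variable names are invented for illustration.

```lua
-- Illustrative sketch; every sample string here is invented for the example.
local html = "<b>one</b><b>two</b>"

-- '*' is greedy: the match runs to the last '>' in the string.
local greedy = string.match(html, "<.*>")
-- '-' is lazy: the match stops at the first '>' that lets it succeed.
local lazy = string.match(html, "<.->")
print(greedy)   --> <b>one</b><b>two</b>
print(lazy)     --> <b>

-- '?' makes the preceding character optional.
print(string.match("color", "colou?r"))   --> color

-- %1 must repeat whatever capture 1 matched (here, the opening quote).
local quote, inner = string.match([[say "it's" now]], "([\"'])(.-)%1")
print(quote, inner)   --> "   it's

-- %b() matches from '(' to its balancing ')', including nested pairs.
local balanced = string.match("f(a, g(b)) + 1", "%b()")
print(balanced)   --> (a, g(b))

-- %f[%a] and %f[%A] mark where a run of letters starts and ends,
-- so only the standalone word "the" is replaced.
local replaced, n = string.gsub("the other then the", "%f[%a]the%f[%A]", "THE")
print(replaced, n)   --> THE other then THE   2
```

In the last call, "other" and "then" are left alone because the letters around their "the" prevent one of the two frontiers from matching.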
Pattern:

A pattern is a sequence of pattern items. Adding a '^' at the very beginning of the pattern anchors the match to the start of the string. Adding a '$' at the very end of the pattern anchors the match to the end of the string. If '^' and '$' appear elsewhere, they have no special meaning and represent themselves.
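Anchors are commonly used to validate or trim a whole string rather than to search inside it. The following is a minimal sketch; trim and is_integer are hypothetical helper names chosen for this example.

```lua
-- Illustrative sketch; trim and is_integer are hypothetical helpers.
local function trim(s)
  -- '^%s*' consumes leading whitespace and '%s*$' trailing whitespace;
  -- the lazy '(.-)' captures everything in between.
  return (string.match(s, "^%s*(.-)%s*$"))
end

local function is_integer(s)
  -- Anchoring at both ends forces the whole string to match.
  return string.find(s, "^[+-]?%d+$") ~= nil
end

print("[" .. trim("  padded text \t") .. "]")   --> [padded text]
print(is_integer("-42"))                        --> true
print(is_integer("42abc"))                      --> false

-- Without anchors, the digits inside "42abc" are still found.
print(string.find("42abc", "%d+"))              --> 1   2
```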
Captures:

A pattern can have sub-patterns enclosed in parentheses; these sub-patterns are called captures. When a match is successful, the substrings matched by the captures are saved for future use. Captures are numbered in the order of their left parentheses. For example, for the pattern "(a*(.)%w(%s*))", the part of the string that matches "a*(.)%w(%s*)" is saved in the first capture (therefore numbered 1); the character matched by "." is capture 2, and the part matching "%s*" is capture 3.
As a special case, an empty capture () captures the current position in the string (which is a number). For example, if the pattern "()aa()" is applied to the string "flaaap", it produces two captures: 3 and 5.
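Captures can be read back as separate results, reused in a gsub replacement, or used to obtain positions. The following is a minimal sketch that reuses the deadline string from the date example earlier in this section; the variable names are assumptions.

```lua
-- Illustrative sketch; variable names are chosen for this example.
local s = "Deadline is 30/05/1999, firm"
local date_pattern = "(%d%d)/(%d%d)/(%d%d%d%d)"

-- Three captures come back as three separate results.
local day, month, year = string.match(s, date_pattern)
print(day, month, year)   --> 30  05  1999

-- In a gsub replacement string, %1 to %9 refer to the captures.
local iso = (string.gsub(s, date_pattern, "%3-%2-%1"))
print(iso)                --> Deadline is 1999-05-30, firm

-- Empty captures return positions (numbers), not substrings.
local first, after = string.match("flaaap", "()aa()")
print(first, after)       --> 3   5
print(type(first))        --> number
```

Ordinary captures are always strings, so day, month, and year would need tonumber before being used in arithmetic.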