## C Library Function - strtok()
The `strtok()` function is part of the C standard library (`<string.h>`). It splits a string into a series of smaller substrings, commonly referred to as **tokens**, separated by any of a specified set of delimiter characters.
---
## Syntax & Declaration
To use `strtok()`, you must include the `<string.h>` header file.
```c
char *strtok(char *str, const char *delim);
```
### Parameters
* **`str`**: The string to be tokenized.
    * On the **first call**, pass the pointer to the target string you want to split.
    * On **subsequent calls** to continue tokenizing the same string, pass `NULL`.
* **`delim`**: A string containing all the delimiter characters. `strtok()` splits the target string wherever it encounters any character in this set.
### Return Value
* Returns a pointer to the **next token** found in the string.
* Returns `NULL` if there are no more tokens left to extract.
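
The calling pattern described above (the string on the first call, `NULL` afterwards, stop when `NULL` comes back) can be wrapped in a small helper. The following is a minimal sketch, not part of the standard library: the name `split_tokens` and its `tokens`/`max_tokens` parameters are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores up to max_tokens token pointers in tokens[]
   and returns how many were stored. text is modified in place, and the
   stored pointers point into text. */
size_t split_tokens(char *text, const char *delims, char *tokens[], size_t max_tokens) {
    size_t count = 0;
    char *token = strtok(text, delims);      /* first call: pass the string */

    while (token != NULL && count < max_tokens) {
        tokens[count++] = token;
        token = strtok(NULL, delims);        /* later calls: pass NULL */
    }
    return count;
}
```

Called on a writable buffer holding `red, green;blue` with the delimiter set `",; "`, this stores three tokens: `red`, `green`, and `blue`. Each character in `delims` acts as a delimiter on its own, so the comma, the semicolon, and the space all separate tokens.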
---
## Code Example
The following program demonstrates how to use the `strtok()` function to split a string using a hyphen (`-`) as a delimiter.
```c
#include <stdio.h>
#include <string.h>

int main(void) {
    /* A writable array, because strtok() modifies the string it tokenizes */
    char str[] = "This is - www.youtip.co - website";
    const char s[] = "-";
    char *token;

    /* Get the first token */
    token = strtok(str, s);

    /* Walk through the remaining tokens */
    while (token != NULL) {
        printf("%s\n", token);

        /* Get the next token */
        token = strtok(NULL, s);
    }

    return 0;
}
```
### Output
When compiled and executed, the program produces the following output. Because the hyphen is the only delimiter, the spaces around each hyphen remain part of the tokens:
```text
This is 
 www.youtip.co 
 website
```
---
## Key Considerations & Pitfalls
While `strtok()` is highly convenient, it has several critical behaviors and side effects that developers must be aware of:
1. **Modifies the Original String**: `strtok()` modifies the input string directly by overwriting the delimiter that ends each token with a null terminator (`\0`). The original string is therefore altered and can no longer be read as a whole. If you need to preserve it, make a copy with `strcpy()` or `strdup()` (POSIX; part of standard C since C23) and tokenize the copy.
2. **Cannot Use Read-Only Memory**: Because `strtok()` modifies the string, passing a string literal (e.g., `strtok("Hello World", " ")`) is undefined behavior, which commonly shows up as a segmentation fault.
3. **Consecutive Delimiters**: If the string contains consecutive delimiters, `strtok()` treats the whole run as a single delimiter. Leading and trailing delimiters are skipped as well, so it never returns an empty token.
4. **Non-Reentrant (Not Thread-Safe)**: `strtok()` keeps hidden internal state (typically a static pointer) to track its position within the string between calls. This makes it unsafe to call from multiple threads, and it cannot tokenize two strings in an interleaved fashion, as happens in nested loops.
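
The first three points can be illustrated with a short sketch that tokenizes a private copy instead of the caller's string. The helper name `count_tokens` is hypothetical, and the copy is made with `malloc()` and `memcpy()` only so that the example does not depend on `strdup()` being available.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in text without modifying it.
   Returns -1 if the working copy cannot be allocated. */
int count_tokens(const char *text, const char *delims) {
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, text, size);   /* strtok() writes '\0' into copy, not into text */

    int count = 0;
    for (char *token = strtok(copy, delims); token != NULL; token = strtok(NULL, delims)) {
        count++;
    }

    free(copy);
    return count;
}
```

Here `count_tokens("a,,b,,,c", ",")` returns 3, since each run of commas counts as one delimiter, and passing a string literal is safe because only the copy is written to. The helper still uses `strtok()` internally, so the fourth point continues to apply to it.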
---
## Thread-Safe Alternative: strtok_r()
`strtok_r()` is the reentrant (thread-safe) version of `strtok()`. It is specified by POSIX and widely available on POSIX-compliant systems (like Linux and macOS). Instead of keeping hidden internal state, it uses an explicit user-provided pointer (`saveptr`) to remember its position between calls.
### Syntax
```c
char *strtok_r(char *str, const char *delim, char **saveptr);
```
* **`saveptr`**: A pointer to a `char*` variable that is used internally by `strtok_r()` to maintain context between successive calls.
### Example Usage of strtok_r()
```c
#include <stdio.h>
#include <string.h>

int main(void) {
    char str[] = "This is a sample string";
    char *token;
    char *saveptr;

    // First call: pass the source string and the address of saveptr
    token = strtok_r(str, " ", &saveptr);

    // Subsequent calls: pass NULL and the address of saveptr
    while (token != NULL) {
        printf("%s\n", token);
        token = strtok_r(NULL, " ", &saveptr);
    }

    return 0;
}
```
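
Nested tokenization is where the separate `saveptr` matters most. The sketch below assumes a made-up input format (records separated by `;`, words within a record separated by spaces) and a hypothetical helper name, `count_words_in_records`. The feature-test macro on the first line is only needed when compiling in a strict standard mode that hides POSIX declarations.

```c
#define _POSIX_C_SOURCE 200809L  /* request the POSIX declaration of strtok_r() */
#include <string.h>

/* Hypothetical helper: counts the words in a string of ';'-separated records,
   where the words inside each record are separated by spaces.
   text is modified in place. */
int count_words_in_records(char *text) {
    int words = 0;
    char *record_state;   /* context for the outer loop */
    char *word_state;     /* context for the inner loop */

    for (char *record = strtok_r(text, ";", &record_state);
         record != NULL;
         record = strtok_r(NULL, ";", &record_state)) {
        /* The inner loop has its own context, so it cannot disturb the outer one. */
        for (char *word = strtok_r(record, " ", &word_state);
             word != NULL;
             word = strtok_r(NULL, " ", &word_state)) {
            words++;
        }
    }
    return words;
}
```

For a buffer holding `a b;c d e;f`, the function returns 6. Written with plain `strtok()`, the inner loop would overwrite the single shared position, and the outer loop would typically stop after the first record.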
### Summary
* Use `strtok()` for quick, single-threaded string parsing where modifying the source string is acceptable.
* Prefer `strtok_r()` in multi-threaded applications or when tokenization is nested, because each sequence of calls keeps its own state, which prevents race conditions and data corruption.