I typed up this block of code for an assignment:

#include <stdio.h>
#include <string.h>

char *tokens[10];

void parse(char* input);

int main(void)
{
    char input[] = "Parse this please.";
    parse(input);

    for(int i = 2; i >= 0; i--) {
        printf("%s ", tokens[i]);
    }
}

void parse(char* input)
{
    int i = 0;
    tokens[i] = strtok(input, " ");

    while(tokens[i] != NULL) {
        i++;
        tokens[i] = strtok(NULL, " ");
    }
}

But, looking at it, I'm not sure how the memory allocation works. I didn't define the length of the individual strings as far as I know, just how many strings are in the string array tokens (10). Do I have this backwards? If not, then is the compiler allocating the length of each string dynamically? In need of some clarification.

2 Answers

strtok is a bad citizen.

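For one thing, it retains state, which you rely on implicitly when you call strtok(NULL, ...). That state lives in static storage private to the C standard library, so strtok is not reentrant and is not guaranteed to be thread-safe: two threads, or two interleaved tokenizing loops in one thread, can corrupt each other's position. POSIX provides a reentrant version called strtok_r, which keeps the state in a caller-supplied pointer instead.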
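To illustrate the reentrancy point, here is a minimal sketch that nests two tokenizing loops, something plain strtok cannot do because the inner loop would overwrite the outer loop's saved position. It assumes a POSIX system where strtok_r is available; count_words is a hypothetical helper name, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX system, exposes strtok_r */
#include <string.h>

/* Hypothetical helper: counts the words in a text whose lines are separated
   by '\n' and whose words are separated by spaces. The text is modified in
   place, just as with strtok. */
int count_words(char *text)
{
    int count = 0;
    char *line_state;   /* saved position of the outer (line) scan */
    char *word_state;   /* saved position of the inner (word) scan */

    for (char *line = strtok_r(text, "\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *word = strtok_r(line, " ", &word_state);
             word != NULL;
             word = strtok_r(NULL, " ", &word_state)) {
            count++;
        }
    }
    return count;
}
```

Each scan keeps its position in its own variable, so the two loops do not interfere. Called on a writable buffer holding "a b\nc d e", this should return 5.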

For another, and to answer your question, strtok modifies its input. It doesn't allocate space for the strings; it writes NUL characters in place of your delimiter in the input string, and returns a pointer into the input string.

You are correct that strtok can return more than 10 results. You should check for that in your code so you don't write beyond the end of tokens. A reliable program would either set an upper limit, like your 10, check for it, and report an error if it's exceeded, or dynamically allocate the tokens array with malloc and realloc it when it fills up. In the second case the only error left is running out of memory.
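As a rough sketch of the first option (a fixed upper limit with a check), parse could be reworked along these lines. MAX_TOKENS and the return convention (token count, or -1 on overflow) are assumptions made for this example, not something from the original code.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 10   /* assumed limit, matching the size of tokens above */

char *tokens[MAX_TOKENS];

/* Splits input on spaces, storing at most MAX_TOKENS pointers in tokens.
   Returns the number of tokens stored, or -1 if input holds more than
   MAX_TOKENS tokens. */
int parse(char *input)
{
    int count = 0;

    for (char *tok = strtok(input, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (count == MAX_TOKENS) {
            return -1;   /* too many tokens: refuse rather than overflow */
        }
        tokens[count++] = tok;
    }
    return count;
}
```

Returning the count also means the caller no longer has to hard-code how many tokens to print, and the array does not need a slot for a terminating NULL.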

Note that you can also work around strtok modifying your input string by making a copy with strdup and passing the copy to strtok. The pointers in tokens then point into that copy, so free it only once you are finished with both the copy and tokens.
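Here is a small sketch of the copy-then-tokenize idea. To stay within standard C it uses malloc and memcpy rather than strdup; copy_string and count_tokens are hypothetical helper names introduced only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s, or NULL if
   allocation fails. Similar in spirit to strdup. The caller must free it. */
char *copy_string(const char *s)
{
    size_t size = strlen(s) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, s, size);
    }
    return copy;
}

/* Counts the space-separated tokens in s without modifying s.
   Returns -1 if the copy cannot be allocated. */
int count_tokens(const char *s)
{
    char *copy = copy_string(s);
    if (copy == NULL) {
        return -1;
    }
    int count = 0;
    for (char *tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " ")) {
        count++;   /* tok points into copy, so it is only valid until free */
    }
    free(copy);
    return count;
}
```

Because strtok only ever sees the copy, count_tokens can safely be given a string literal. Any token pointers you want to keep must be used before the free.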


2 Comments

Oh ok, thanks, that's good to know. So, I was thinking about it wrong. tokens is not a 2 dimensional array of chars, but an array of pointers. So the compiler allocates for sizeof(void *) not for multiple characters.
@Doug Currie It should be noted that strdup was not a standard C library function before C23, and is likely not available on older platforms that are not POSIX compliant.

tokens is an array of pointers.

The distinction between strings and pointers is often fuzzy. In some situations strings are better thought of as arrays, in other situations as pointers.

Anyway... in your example input is an array and tokens is an array of pointers to a place within input.
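One way to see the difference between the two objects is to compare their sizes. This is only a sketch: tokens_bytes and input_bytes are hypothetical helpers, and input is declared at file scope here purely so both can be inspected side by side.

```c
#include <stddef.h>

char *tokens[10];
char input[] = "Parse this please.";

/* tokens holds 10 pointers; no character storage comes with it. */
size_t tokens_bytes(void)
{
    return sizeof tokens;   /* 10 * sizeof(char *) */
}

/* input holds the characters themselves, plus the terminating NUL. */
size_t input_bytes(void)
{
    return sizeof input;    /* 18 characters + 1 NUL = 19 */
}
```

The size of tokens depends only on the pointer size of the platform, never on how long the tokens are, because the characters live in input.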

The data inside input is changed by the calls to strtok(): each delimiter that ends a token is overwritten with '\0'.

So, step by step

// input[] = "foo bar baz";
tokens[0] = strtok(input, " ");
// input[] = "foo\0bar baz";
//            ^-- tokens[0] points here
tokens[1] = strtok(NULL, " ");
// input[] = "foo\0bar\0baz";
//                 ^-- tokens[1] points here
tokens[2] = strtok(NULL, " ");
// input[] = "foo\0bar\0baz";
//                      ^-- tokens[2] points here
// next strtok returns NULL
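If you want to check this picture yourself, a sketch like the following records how far into input each returned pointer lands. token_offsets is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes input on spaces and records, for up to max
   tokens, the offset of each returned pointer from the start of input.
   Returns the number of offsets recorded. */
int token_offsets(char *input, ptrdiff_t *offsets, int max)
{
    int count = 0;
    for (char *tok = strtok(input, " ");
         tok != NULL && count < max;
         tok = strtok(NULL, " ")) {
        offsets[count++] = tok - input;   /* pointer into input, not a copy */
    }
    return count;
}
```

For a writable buffer holding "foo bar baz" the recorded offsets are 0, 4 and 8, matching the three arrows, and the two spaces in the buffer have been replaced by '\0'.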
