1
 struct TokenizerT_ {
    char* separators;
    char* tks;
    char* cur_pos;
    char* next;
  };

  typedef struct TokenizerT_ TokenizerT;

  TokenizerT *TKCreate(char *separators, char *ts) 
  { 
    TokenizerT *tokenizer;
    tokenizer = (TokenizerT*)malloc(sizeof(TokenizerT));

    //some manipulation here

    tokenizer->tks = (char*) malloc (strlen(str)* sizeof(char));
    tokenizer->tks=str;
    printf("size of tokenizer->tks is %zu\n", strlen(tokenizer->tks)); //this prints out the correct number (e.g. 7)
    return tokenizer;
  }

  int main(int argc, char **argv)
  {
    TokenizerT *tk = TKCreate(argv[1], argv[2]);
    printf("tk->tks: %zu\n", strlen(tk->tks)); //HOWEVER, this prints out the wrong number (e.g. 1)
  }

As seen from the above code, I'm working with pointers to structs. For some reason I am not receiving back the correct length for tk->tks. I cannot understand this because it should be the same size as tks in my TKCreate function. Can someone explain this please?

3 Answers 3

2

I suspect str, the definition of which is not shown in your code snippet, is a local variable defined in TKCreate(). If so, you're assigning tokenizer->tks to have the value of str, which points to a proper string inside the scope of TKCreate() but upon exiting TKCreate(), the stack contents (including parameters and local variables) are freed and wiped out so when you try to reference that pointer outside the scope of TKCreate() all bets are off.

One plausible fix is to allocate the storage for tokenizer->tks dynamically, so it persists after you exit TKCreate(). I see you do that with a call to malloc but then you overwrite that with an explicit assignment from str. Instead you should copy the contents of str (using strcpy) into the dynamically allocated memory via: strcpy(tokenizer->tks, str);

Sign up to request clarification or add additional context in comments.

1 Comment

I just added a second paragraph to explain the fix. See if that helps.
0

You should strcpy the contents of str to tokenizer->tks, because when you use the assign operator, you're losing the pointer malloc gave you, creating a memory leak and pointing tokenizer->tks to a local variable, which will be destroyed after the function's return.

So, the approach would be something like this:

tokenizer->tks = (char *)malloc ((strlen(str) + 1) * sizeof(char));
strcpy(tokenizer->tks, str);

Another thing:

Don't forget to free ->tks before you free tk itself.

So, after the printf, you should use:

free(tk->tks);
free(tk);

There's no problem in not freeing the structure and the string (which is in another memory location and not inside the structure's memory space, that's why you have to free they both), if your program is that small, because after it's executed, the program's memory will be wiped out anyways. But if you intend to implement this function on a fully-working and big program, freeing the memory is a good action.

Comments

0

It is not clear where str is defined, but if it is a local variable in the function, your problem is likely that it goes out of scope, so the data gets overwritten.

You're leaking memory because you've forgotten to use strcpy() or memcpy() or memmove() to copy the value in str over the allocated space, and you overwrite the only pointer to the newly allocated memory with the pointer str. If you copied, you would be writing out of bounds because you forgot to allocate enough space for the trailing null as well as the string. You should also check that the allocation succeeds.

Bogus code:

tokenizer->tks = (char*) malloc (strlen(str)* sizeof(char));
tokenizer->tks = str;

Fixed code:

size_t len = strlen(str) + 1;
tokenizer->tks = (char *)malloc(len);
if (tokenizer->tks == 0)
    ...error handling...
memmove(tokenizer->tks, str, len);

Using memmove() or memcpy() can outperform strcpy() handily (see Why is Python faster than C for some illustrations and timing). There are those who would excoriate you (and me) for using the cast on malloc(); I understand why they argue as they do, but I don't fully agree with them (and usually use the cast myself). Since sizeof(char) is 1 by definition, there's no particular need to multiply by it, though there's no harm done in doing so, either.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.