I have been trying to write a function that takes a line of text as a string and returns a pointer to an array of words. Code 2 below does this with strtok(), but I would like to rewrite Code 1 so that it improves on Code 2 by letting me change the delimiter. Code 1 works, but during memory allocation the same memory is reused for every entry of the words array, which causes word duplication.

Code 1:

char *split(const char *string) {
    char *words[MAX_LENGTH / 2];
    char *word = (char *)calloc(MAX_WORD, sizeof(char));
    memset(word, ' ', sizeof(char));
    static int index = 0;
    int line_index = 0;
    int word_index = 0;

    while (string[line_index] != '\n') {
        const char c = string[line_index];
        if (c == ' ') {
            word[word_index+ 1] = '\0';
            memcpy(words + index, &word, sizeof(word));
            index += 1;
            if (word != NULL) {
                free(word);
                char *word = (char *)calloc(MAX_WORD, sizeof(char));
                memset(word, ' ', sizeof(char));
            }
            ++line_index;
            word_index = 0;
            continue;
        }
        if (c == '\t')
            continue;
        if (c == '.')
            continue;
        if (c == ',')
            continue;

        word[word_index] = c;
        ++word_index;
        ++line_index;
    }

    index = 0;
    if (word != NULL) {
        free(word);
    }
    return *words;
}

Code 2:

char **split(char *string) {
    static char *words[MAX_LENGTH / 2];
    static int index = 0;
    // resetting words 
    for (int i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
         words[i] = NULL;
    }
    const char *delimiter = " ";
    char *ptr = strtok(string, delimiter);
    while (ptr != NULL) {
        words[index] = ptr;
        ptr = strtok(NULL, delimiter);
        ++index;
    }
    index = 0;
    return words;
}

However, I noticed that the memory of word+index is being reassigned to the same location, thereby causing word duplication.

  • What is your question? Providing the delimiter to the function, or the memory problem during splitting? BTW: No need to shout in the title. Commented Nov 25, 2019 at 12:01
  • Code 2 does not have a word variable. Is your problem in Code 1 or Code 2? Commented Nov 25, 2019 at 12:04
  • Code 2 works, but I can't change the delimiter since it is a const char *. I want to use Code 1 because it can check for all kinds of non-word characters, but the memory is being reallocated again, which causes word duplication in the words array: the same memory location is duplicated across its indices. Commented Nov 25, 2019 at 23:31
  • Splitting a String and returning an array of Strings may be helpful. Commented Dec 14, 2019 at 4:44

1 Answer

strtok() always returns a different pointer into the initial string. This cannot produce duplicates, unless you call it twice with the same input string (maybe with new contents).
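As a minimal sketch of that point, the helper below (collect_tokens is a hypothetical name, not part of the question's code) stores each pointer returned by strtok() so the addresses can be compared. It assumes the caller passes a writable buffer, because strtok() overwrites delimiters with '\0' in place.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf in place and stores up to max
 * token pointers in out. Returns the number of tokens stored.
 * Every stored pointer points into buf itself; nothing is copied. */
size_t collect_tokens(char *buf, const char *delims, char **out, size_t max)
{
    assert(buf != NULL && delims != NULL && out != NULL);

    size_t n = 0;
    for (char *tok = strtok(buf, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        out[n++] = tok;
    }
    return n;
}
```

Called on a buffer holding "one two three" with the delimiter " ", this stores three pointers at offsets 0, 4 and 8 of that buffer, so no two entries can refer to the same word.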

However, your function returns a pointer to a static array, which is overwritten on each call to split(), voiding the results of all previous calls. To prevent this,

  • either allocate new memory in each call (which must be freed by the caller):

    char **words = calloc(MAX_LENGTH / 2, sizeof *words);
    
  • or return a struct instead (which is always copied by value):

    struct wordlist { char *word[MAX_LENGTH / 2]; };
    
    struct wordlist split(char *string)
    {
        struct wordlist list = {0};
        /* ... */
        list.word[index] = /* ... */;
        /* ... */
        return list;
    }
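
Putting the first option together with the question's wish for a configurable delimiter, one possible sketch follows. The names split_words and free_words are hypothetical, and the approach (scanning with strspn()/strcspn() and copying each word into its own allocation) is a swapped-in technique rather than anything from the question's code. It assumes the caller releases the result with free_words().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees a NULL-terminated array from split_words. */
void free_words(char **words)
{
    if (words == NULL)
        return;
    for (char **w = words; *w != NULL; w++)
        free(*w);
    free(words);
}

/* Hypothetical helper: splits line at any character found in delims.
 * Returns a freshly allocated, NULL-terminated array of freshly
 * allocated word copies, or NULL if an allocation fails.
 * The input line is not modified. */
char **split_words(const char *line, const char *delims)
{
    assert(line != NULL && delims != NULL);

    size_t cap = 8, count = 0;
    char **words = malloc(cap * sizeof *words);
    if (words == NULL)
        return NULL;

    const char *p = line;
    for (;;) {
        p += strspn(p, delims);            /* skip leading delimiters */
        if (*p == '\0')
            break;
        size_t len = strcspn(p, delims);   /* length of the next word */

        if (count + 1 >= cap) {            /* keep room for the NULL terminator */
            char **tmp = realloc(words, 2 * cap * sizeof *words);
            if (tmp == NULL) {
                words[count] = NULL;
                free_words(words);
                return NULL;
            }
            words = tmp;
            cap *= 2;
        }

        char *copy = malloc(len + 1);      /* each word gets its own memory */
        if (copy == NULL) {
            words[count] = NULL;
            free_words(words);
            return NULL;
        }
        memcpy(copy, p, len);
        copy[len] = '\0';
        words[count++] = copy;
        p += len;
    }
    words[count] = NULL;
    return words;
}
```

Because every word is copied into separate memory, two calls never share storage, and a delimiter set such as " \t.,\n" covers the characters that Code 1 tries to skip by hand.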
    