0

I'm trying to analyze a function that a project partner wrote to try and figure out why it's giving me a heap buffer overflow. I have run this through valgrind and cppcheck, but they didn't bring me any closer.

The function is designed to parse through a string and pull out a value to sort on. Special cases to handle were null values, and movie titles with commas in them. (Since we're parsing a CSV file, if data has a comma in it, that data needs to be handled differently.) Here is the relevant snippet:

char* findKey(char lineBuffer[], int columnNumber ){
    char tempArray[512];
    int commasCounted = 0;
    int i =0;

    for(i = 0; i< 1024; i++){
        if (commasCounted == columnNumber){
            commasCounted = i;
            break;
        }

        if (lineBuffer[i] == '\"'){
            while(lineBuffer[i] != '\"'){
                i++;
            }
        }

        if (lineBuffer[i] == ','){
            commasCounted++;
        }
    }

    if(lineBuffer[commasCounted] == ','){
        tempArray[0] = '0';
        tempArray[1] = '0';
        tempArray[2] = '0';
        tempArray[3] = '0';
        tempArray[4] = '\0';
    }else{
        int j = 0;
        for(i = commasCounted; i < 1024; i++){
            if(lineBuffer[i] == '\"'){
                i++;
                while(lineBuffer[i] != '\"'){
                    tempArray[j] = lineBuffer[i];
                    i++;
                    j++;
                }
                break;
            }else if(lineBuffer[i] == ','){
                break;
            }else
                tempArray[j] = lineBuffer[i];
                j++;
        }
        tempArray[j] = '\0';
    }

    char* tempString = strtok(tempArray, "\n");
    return tempString;
}

I believe the part that's blowing up is this section:

        while(lineBuffer[i] != '\"'){
            tempArray[j] = lineBuffer[i];
            i++;
            j++;
        }

I just can't figure out exactly why. Is there a way to fix this? I don't know if this is the cause, but this is the input for lineBuffer when it breaks:

Color,Glenn Ficarra,310,118,43,7000,Emma Stone,33000,84244877,Comedy|Drama|Romance,Ryan Gosling,"Crazy, Stupid, Love. ",375456,57426,Steve Carell,7,bar|divorce|friend|girl|male objectification,http://www.imdb.com/title/tt1570728/?ref_=fn_tt_tt_1,292,English,USA,PG-13,50000000,2011,15000,7.4,2.39,44000,

Any help would be appreciated. Thank you!

6
  • 1
    Is it possible for i to be > strlen(linebuffer)? Make sure that that doesn't happen. Commented Oct 2, 2018 at 15:40
  • 2
    Test against '\0'` while(lineBuffer[i] != '\"' && lineBuffer[i] != '\0'){ Commented Oct 2, 2018 at 15:41
  • 2
    Possibly the 1024 is a risk. Better pass the size as a parameter than assume 1024. Commented Oct 2, 2018 at 15:41
  • 1
    what is a 'heap buffer overflow' Commented Oct 2, 2018 at 15:42
  • 3
    You are returning a pointer to a local variable. Commented Oct 2, 2018 at 15:42

2 Answers 2

2

You have a few problems here.

First you are returning a pointer to a local variable. This invokes undefined behavior. You should allocate the string with malloc or duplicate tempArray with strdup and then return it.

Second you should not declare tmpArray of size 512, which seems to be an arbitrary number. tmpArray should have at least the size to fit lineBuffer in it, so you should declare it as char tmpArray[strlen(lineBuffer)+1];

You iterate from 0 to 1023 when searching for commas, which may access lineBuffer out of bounds, better option would be to iterate form 0 to strlen(lineBuffer)-1. Since you are incrementing i also inside the loop you should always check if your index is inside bounds.

But all those things does not seem to be a problem for the provided test string, since it fit inside all buffers. I think the problem is here:

if (lineBuffer[i] == '\"')
{
    while(lineBuffer[i] != '\"')
    {
        i++;
    }
}

If you find a '\"' you want to skip everything until you find the next one. If you think about it your code does not have any effect at all, since the while loop can not be entered. You should change this to:

if (lineBuffer[i] == '\"')
{
    i++;
    while(lineBuffer[i] && lineBuffer[i] != '\"')
    {
        i++;
    }
}
Sign up to request clarification or add additional context in comments.

3 Comments

Ahhh, I see. This fixed it 100%. Thank you! So, once the outer if loop fired, because i wasn't incremented, the inner while loop evaluated false every time, because it was still at the double quotes char that was just found. Correct?
@Locke Exactly. You want to start the search at the next character not the current one which by definition is already '\"'.
Thank you so much!
0

Make sure index i doesn't move past the end of lineBuffer:

    if (lineBuffer[i] && lineBuffer[i] == '\"'){...

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.