I'm trying to write a program that reads several strings, but when I test it with a very long string, an overflow occurs, and none of the solutions I have seen so far work. Here is the code:

#include <stdio.h>

int main()
{
    char nome[201] = {0};
    char cpf[15] = {0};
    char senha[101] = {0};
    scanf("%200s", nome);
    scanf("%14s", cpf);
    scanf("%100s", senha);
    printf("nome: %s\n", nome);
    printf("cpf: %s\n", cpf);
    printf("senha: %s\n", senha);
    return 0;
}

This code is supposed to prevent the overflow, but consider the following string:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaassssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss

I enter this string at every input. When I enter it the second time, the program finishes immediately, and the excess content ends up in the third string.

  • scanf("%200s", nome); reads up to 200 non-white-space characters, leaving the rest in stdin for the next input function. Code is acting as it should. Your expectations are amiss. If you want to consume and toss characters past the 200, you need other code. Commented Nov 29, 2022 at 18:39
  • Eduardo Mosca, buffer overflow is prevented. What output do you want with the 3 "aaa...sss" input? Commented Nov 29, 2022 at 18:51
  • The output I want is each variable holding the input truncated to its limit, without the third input being skipped. For example, in the second variable I want only the first 14 characters of this input, and likewise for the last input. Commented Nov 29, 2022 at 18:57
  • Eduardo Mosca, OK. If input was "aaa bbb ccc\n" "ddd eee fff\n" "ggg hhh iii\n", what output would you like? (If the line of input contained spaces?) Commented Nov 29, 2022 at 19:01
  • The output needs to be the same since none of these outputs has a length larger than the memory allocated to the variables. Commented Nov 29, 2022 at 19:09

3 Answers

You asked for inputs in order. The first one has a maximum length of 200 characters, the second 14, and the third 100. You entered a string of 160 characters.

Ignoring the first variable for now (since there's no overflow), C takes the first 14 characters from the input buffer and puts them in the second variable. It terminates this with a null terminator. No overflow has occurred.

Now we need to get data for the third variable. Specifically, we need to get the next 100 characters, or all of the characters up to the next whitespace, whichever is shorter. We put 160 characters into the input buffer (your keyboard smash) and took 14 out. Therefore, there are still 146 characters in the input buffer. No need to interactively wait for input anymore; C takes the first 100 of those characters and puts them into the third variable, terminated with a null terminator. Now all of our inputs have been completed, so the program continues.
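The walk-through above can be reproduced with a small sketch. The helper name read_two_fields is made up for illustration, and it reads from a FILE * rather than directly from stdin so that it is easy to try with prepared input. It only shows that the second read is served from whatever the first read left behind.

```c
#include <stdio.h>

// Hypothetical helper (not from the question): performs the second and
// third reads of the original program on an arbitrary stream.
// Returns how many of the two fields were filled.
int read_two_fields(FILE *in, char cpf[15], char senha[101])
{
    int count = 0;

    // Takes at most 14 non-whitespace characters; the rest stays in the stream.
    if (fscanf(in, "%14s", cpf) == 1) {
        count++;
    }

    // Does not wait for a new line of input if characters are still pending:
    // it takes up to 100 of the leftover characters.
    if (count == 1 && fscanf(in, "%100s", senha) == 1) {
        count++;
    }

    return count;
}
```

Calling read_two_fields(stdin, cpf, senha) and typing one long line fills both arrays from that single line, which is the behaviour described in the question.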

There is no buffer overflow vulnerability here. The program is well-defined and does what you asked it to. You asked it to read from the input buffer three times. You never said "from three different lines". If you want to do that, then you need to handle delimiters yourself. In C++, there's a function called std::getline that will do it for you, but in C, you'll need to manually read (and discard) the rest of the line yourself. Something like this would suffice.

scanf("%200s%*[^\n]", nome);

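The * indicates that the newly read value should not be stored anywhere, and the [^\n] indicates that one or more non-newline characters should be read, until the pattern doesn't match anymore (i.e. until the next character is a newline or we hit the end-of-file). If the next character is already a newline, the directive fails to match and consumes nothing, which is harmless here because it is the last directive in the format string.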
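Applied to all three inputs, the same idea might look like the sketch below. The function name read_three_fields is an assumption made for illustration, and it takes a FILE * so it can be tried on prepared input; passing stdin gives the behaviour of the original program. Note that it still reads only one whitespace-delimited word per line.

```c
#include <stdio.h>

// Hypothetical helper: reads one word from each of three lines and throws
// away whatever else is on each line.
// Returns the number of fields successfully read (0 to 3).
int read_three_fields(FILE *in, char nome[201], char cpf[15], char senha[101])
{
    // Each width is one less than the array size, leaving room for '\0'.
    // The suppressed %*[^\n] swallows the rest of the line but not the '\n';
    // the next %s skips that newline as leading whitespace.
    if (fscanf(in, "%200s%*[^\n]", nome) != 1) return 0;
    if (fscanf(in, "%14s%*[^\n]", cpf) != 1) return 1;
    if (fscanf(in, "%100s%*[^\n]", senha) != 1) return 2;
    return 3;
}
```

With three over-long lines as input, each array receives only the start of its own line, and nothing spills over into the next field.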

3 Comments

In order to discard the remainder of the line, I recommend scanf( "%*[^\n]" );
@AndreasWenzel scanf( "%*[^\n]" ); almost discards the remainder of the line. It does not discard the line's '\n'. scanf( "%*[^\n]" ); scanf( "%*1[\n]" ); does.
Note that scanf("%200s%*[^\n]", nome); will not read an empty line ("\n") well, but will instead wait for another non-"\n" line of input. @Andreas Wenzel's idea is a good start as a separate call.
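The two-call idea from these comments could be wrapped in a small helper, sketched here. The name discard_rest_of_line and the return convention are assumptions made for illustration; the helper works on any FILE *, so stdin can be passed for interactive use.

```c
#include <stdio.h>

// Hypothetical helper: skips everything up to and including the next '\n'.
// Returns 0 if a newline was consumed, EOF if the input ended first.
int discard_rest_of_line(FILE *in)
{
    // Skip any non-newline characters. A matching failure (the next
    // character is already '\n') returns 0 and consumes nothing.
    if (fscanf(in, "%*[^\n]") == EOF) {
        return EOF;
    }

    // Consume exactly one newline. At this point the next character can
    // only be '\n' or end-of-file.
    if (fscanf(in, "%*1[\n]") == EOF) {
        return EOF;
    }

    return 0;
}
```

Because the newline is consumed in a separate call, an empty line is handled too: the first call matches nothing and the second removes the lone '\n'.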

Other answers have well addressed why OP's code is performing as it is.


Unfortunately, robustly reading a line in C is not easy, and a really good solution is beyond a beginner's needs.

One modest approach using fgets():

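#include <stdbool.h>  // bool, true, false (needed before C23)
#include <stdio.h>    // FILE, fgets, fgetc, feof, EOF
#include <string.h>   // strlen

// Return 1 on success.
// Return EOF on input error or end-of-file with no input.
// Return 0 when input exceeds buffer space (the excess is read and discarded).
// A line's \n is read, but not saved.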
int read1line(size_t n, char * restrict s, FILE * restrict stream) {
  if (fgets(s, n, stream) == NULL) {
    return EOF;
  }
  size_t len = strlen(s);
  // Was a \n read?
  if (len > 0 && s[len-1] ==  '\n') {
    s[--len] = '\0';
  }
  // Potentially more?
  if (len + 1 == n) {
    int ch;
    bool more_read = false;
    while ((ch = fgetc(stream)) != '\n' && ch != EOF) {
      more_read = true;
    }
    if (ch == EOF && !feof(stream)) {
      return EOF;
    }
    if (more_read) {
      return 0;
    }
  } 
  return 1;
}

The above still has corner weaknesses:

  1. Reading a null character then incorrectly determines len.
  2. s == NULL, n <= 0 or n > INT_MAX remain unhandled pathological cases.
  3. Odd systems where CHAR_MAX > INT_MAX need special handling.
  4. It would be useful to indicate length in buffer, once #1 solved.
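As a rough sketch of how weaknesses 1, 2 (in part) and 4 might be addressed, the variant below reads character by character with fgetc() instead of relying on strlen(), so an embedded null character does not confuse the length, and it reports the stored length through an extra parameter. The name read1line_len, the len_out parameter and the choice to return EOF for bad arguments are all assumptions made for illustration, not part of the answer's code.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical variant of read1line().
// Return 1 on success.
// Return EOF on input error, on end-of-file with no input, or on bad arguments.
// Return 0 when the line exceeds buffer space (the excess is discarded).
// *len_out receives the number of characters stored, not counting the
// terminating null character.
int read1line_len(size_t n, char *s, FILE *stream, size_t *len_out)
{
    if (s == NULL || n == 0 || len_out == NULL) {
        return EOF;  // assumption: treat bad arguments like a failed read
    }

    size_t len = 0;
    int too_long = 0;
    int ch;

    while ((ch = fgetc(stream)) != '\n' && ch != EOF) {
        if (len + 1 < n) {
            s[len++] = (char) ch;  // store, keeping one slot for '\0'
        } else {
            too_long = 1;          // no room left: read and discard
        }
    }

    s[len] = '\0';
    *len_out = len;

    if (ch == EOF && (ferror(stream) || (len == 0 && !too_long))) {
        return EOF;
    }
    return too_long ? 0 : 1;
}
```

Reading one character at a time is usually slower than fgets(), which is the trade-off for tracking the length directly.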

1 Comment

Thank you a lot for your contribution and effort

Your posted code does not have a buffer overflow, but you are right that the input from one input prompt "overflows" into the next input prompt.

What is happening is the following:

Since your input string consists of 160 characters (161 characters including the null terminating character), when you first enter that input, it will fit entirely inside the array nome, so the line

scanf("%200s", nome);

will read this input entirely.

However, when you enter that input a second time, this time at the second input prompt, the line

scanf("%14s", cpf);

will only read the first 14 characters of that input and leave the remaining 146 characters on the input stream.

Therefore, the line

scanf("%100s", senha);

will read 100 of the remaining 146 characters of the input stream and write them into senha. So you are correct in saying that the second input prompt "overflows" into the third input prompt.

If you want to prevent this "overflow" from happening, you will have to discard all remaining characters on the line before the next input prompt, for example by calling:

scanf( "%*[^\n]" );

However, I generally do not recommend using the function scanf for user input, as that is not what it is designed to be used for.

Also, judging from the comments you made in the comments section, you want to be able to read entire lines that may contain spaces, instead of reading single words. However, the %s scanf format specifier will only read a single word.

For this reason, it is probably better for you to use the function fgets. This function will always attempt to read an entire line at once, including the newline character, instead of only a single word.

However, when using fgets, you will probably want to remove the newline character from the input. See the following question on how to do that:

Removing trailing newline character from fgets() input.
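One common idiom for this uses strcspn, sketched here as a tiny helper. The function name strip_newline is an assumption made for illustration.

```c
#include <string.h>

// Hypothetical helper: cuts the string at the first '\n', if there is one.
void strip_newline(char *s)
{
    // strcspn returns the index of the first '\n', or the length of the
    // string if there is none, so this is safe in both cases.
    s[strcspn(s, "\n")] = '\0';
}
```

Unlike the strchr approach, this does not tell you whether a newline was actually found, so on its own it cannot detect a line that was too long for the buffer.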

In the program below, I have created a function get_line_from_user which will read a single line from the user using fgets, discard the newline character, and if the line does not fit into the buffer, it will also discard the rest of the line:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void get_line_from_user( char *buffer, int buffer_size );

int main()
{
    char nome[201] = {0};
    char cpf[15] = {0};
    char senha[101] = {0};

    printf( "Input Phase: \n\n" );

    //read inputs
    printf( "Nome: " );
    get_line_from_user( nome, sizeof nome );
    printf( "Cpf: " );
    get_line_from_user( cpf, sizeof cpf );
    printf( "Senha: " );
    get_line_from_user( senha, sizeof senha );

    printf( "\n\nOutput Phase: \n\n" );

    //output the results
    printf("nome: %s\n", nome);
    printf("cpf: %s\n", cpf);
    printf("senha: %s\n", senha);

    return 0;
}

//This function will read exactly one line of input from the
//user and discard the newline character. If the line does
//not fit into the buffer, it will also discard the rest of
//the line from the input stream.
void get_line_from_user( char *buffer, int buffer_size )
{
    char *p;

    //attempt to read one line of input
    if ( fgets( buffer, buffer_size, stdin ) == NULL )
    {
        printf( "Error reading from input\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find newline character
    p = strchr( buffer, '\n' );

    //determine whether entire line was read in (i.e. whether
    //the buffer was too small to store the entire line)
    if ( p == NULL )
    {
        int c;

        //discard remainder of line from input stream
        do
        {
            c = getchar();
        
        } while ( c != EOF && c != '\n' );
    }
    else
    {
        //remove newline character by overwriting it with
        //null character
        *p = '\0';
    }
}

This program has the following behavior:

Input Phase: 

Nome: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaassssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
Cpf: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaassssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
Senha: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaassssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss


Output Phase: 

nome: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaassssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
cpf: aaaaaaaaaaaaaa
senha: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaassssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss

As you can see, the input from one input prompt no longer overflows into the input of another input prompt.

And it now also works with spaces in the input:

Input Phase: 

Nome: This is a test 
Cpf: Another test
Senha: Yet another test


Output Phase: 

nome: This is a test
cpf: Another test
senha: Yet another test

4 Comments

It worked perfectly. Thank you a lot for your time and effort.
@EduardoMosca: I am pleased that I was able to help. Do you also understand how the function get_line_from_user works?
Actually, I understood everything up to p = strchr( buffer, '\n' );. I am having a bit of difficulty understanding the if and else that follow.
@EduardoMosca: I suggest that you read the documentation of fgets and strchr to see exactly what these two functions do. I use the function strchr to determine whether it can find a newline character in the input string. That way, I can determine whether the entire line was read in or not. If this is the case, then I remove the newline character from the input. If not, then I discard the remainder of the line from the input stream.
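To make the two branches concrete, here is the same test isolated in a small sketch. The helper name line_is_complete is made up for illustration; it assumes the buffer was just filled by a successful fgets call.

```c
#include <string.h>

// Hypothetical helper: inspects a buffer filled by fgets().
// Returns 1 if the whole line fit (and removes its '\n'),
// or 0 if the line was cut off because the buffer was too small.
int line_is_complete(char *buffer)
{
    // strchr returns a pointer to the first '\n', or NULL if there is none.
    char *p = strchr(buffer, '\n');

    if (p == NULL) {
        // No newline stored: fgets stopped because the buffer was full
        // (or the input ended), so the rest of the line is still unread.
        return 0;
    }

    // Newline found: the whole line was read, so overwrite the '\n'.
    *p = '\0';
    return 1;
}
```

In get_line_from_user, the 0 case corresponds to the if branch that discards the rest of the line, and the 1 case corresponds to the else branch.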
