0

My question seems to be very simple but I can't find an answer on Google.

I'd like to know what while (fscanf(inFile, "%[^ \n] ", string) != EOF) does. I'm trying to read in a string from a file by using the above.

However, I am not exactly sure what this statement does, specifically the %[^ \n] part. I know that it will loop until end of file, but is "string" a number value or some other value? Also, how can I use it?

For example, for a sentence "I like trees", what is the string value equivalent to?

Thank you in advance.

5
  • 5
    Why not read the manual page Commented Mar 22, 2014 at 11:52
  • Using fscanf to read strings (or character sequences) is quite dangerous and should be avoided. If char string[buffer_size] is not large enough to hold the contents, it will overflow and could compromise your system. One could use the POSIX.1-2008 m modifier but it's not very portable. Commented Mar 22, 2014 at 12:08
  • 1
    @codebeard: You can specify length. %254s. It is not dangerous if you use it correctly. Commented Mar 22, 2014 at 12:17
  • 1
    @user13500 You're right – I meant to say that using fscanf to read strings of unknown length is dangerous. If you specify a field width like %255[^ \n] then it's safer, but could still lead to unexpected behaviour and it gets cumbersome to handle the question "did strlen(string) == 255 because the word was exactly that long, or will my next read string be a continuation of the previous one, and how can I know?" Commented Mar 22, 2014 at 12:26
  • @codebeard: If one uses %n, one can check: if (items_read > 0 && n > strlen(string)). It will be true if whitespace was consumed, because %n holds the number of characters read so far, including whitespace. Commented Mar 22, 2014 at 14:33

4 Answers

2

%[^ \n] tells fscanf to read all the characters excluding \n and space.


8 Comments

So if it reads, "Hello everyone" what is the result. I mean what is string?
Will it be "everyone" in the second loop?
No. It reads one character per iteration. The result after the while loop quits will be only "Hello"
@aakashjain Is it possible to read different strings per iteration (using some different code)?
@aakashjain actually the while loop won't ever end in this case, because the end of file is not reached
2

Note the space after the caret ^ and the trailing space in the format string "%[^ \n] " of fscanf.

fscanf(inFile, "%[^ \n] ", string)

The above statement means that fscanf will read from the stream inFile and match any nonempty sequence of characters which does not contain either a space ' ' or a newline '\n', write them into the buffer pointed to by the next argument, which is string, and then read and discard any number of whitespace characters, including zero (this is what the trailing space in the format string means). The buffer pointed to by string must be large enough for any such sequence of characters plus the terminating null byte, which is added automatically. If the buffer is not large enough, fscanf will overrun it, invoking undefined behaviour and most likely crashing the program with a segfault. You must guard against this by specifying a maximum field width, which should be one less than the length of the buffer to leave room for the terminating null byte.

fscanf returns the number of input items successfully matched and assigned, which in this case is at most one. It returns EOF if the end of file is reached in the stream inFile before any item has been matched. Therefore the while loop condition means that fscanf will keep reading such sequences of characters into the buffer string until the end of file is reached in the stream inFile.

You should change your while loop to -

// assuming string is a char array

char string[100];

while(fscanf(inFile, "%99[^ \n] ", string) == 1) {
    // return value 1 of fscanf means fscanf call was successful
    // do stuff with string
}
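
To make this concrete for the "I like trees" example from the question, here is a minimal sketch of the same loop wrapped in a function. The name read_words and the fixed word size of 100 are assumptions made for illustration; they are not part of the original code. It assumes the input does not begin with a space or newline.

```c
#include <stdio.h>

/* Hypothetical helper: reads words separated by spaces or newlines from fp
   into out, storing at most max_words of them. Returns the number stored.
   Each word is limited to 99 characters plus the terminating null byte. */
size_t read_words(FILE *fp, char out[][100], size_t max_words)
{
    size_t count = 0;

    /* Same format string as above: one word, then skip trailing whitespace.
       The loop stops as soon as fscanf returns anything other than 1. */
    while (count < max_words &&
           fscanf(fp, "%99[^ \n] ", out[count]) == 1) {
        count++;
    }
    return count;
}
```

For a file containing "I like trees", each iteration stores one word, so string is "I" in the first iteration, "like" in the second and "trees" in the third.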

8 Comments

I cannot suggest an edit for this, because it would be a single-character edit. It is important, however: please change your format string to "%99[^ \n] ", with a space right before the closing double-quote.
@ThoAppelsin thanks for pointing it out. I had also missed explaining the role of the trailing space in the format string.
Note that even with this it is impossible to distinguish between a word being longer than 99 characters and a word being exactly 99 characters. That is, if in one iteration you get string = "aaaa...99...a" and in the next you get string = "b", you cannot tell if the input was aaaa...99...ab or aaaa...99...a b. This may or may not be a problem depending on what the string is being used for.
@codebeard true, but at least this will prevent fscanf from overrunning the buffer string in case the input is 100 or more characters long. If strlen(string) == 99 is true, then the input may have been longer, with the extra characters still waiting in the stream buffer.
@codebeard cont. and this needs to be dealt with depending on what OP wants to do with it. fscanf should not be used if you are not sure that input is formatted.
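
One way to tell a truncated word from a word that is exactly 99 characters long is to peek at the next character after each read. The sketch below is only an illustration: it uses fgetc and ungetc rather than the %n idea mentioned in the question comments, the name read_word_checked is hypothetical, and the whitespace skip is moved to the front of the format string.

```c
#include <stdio.h>

/* Hypothetical helper: reads one word of at most 99 characters into buf.
   Returns what fscanf returned (1 on success, EOF at end of input).
   *truncated is set to 1 when the word continues beyond what fit in buf. */
int read_word_checked(FILE *fp, char buf[100], int *truncated)
{
    *truncated = 0;

    /* The leading space skips any whitespace before the word. */
    int rc = fscanf(fp, " %99[^ \n]", buf);
    if (rc == 1) {
        int c = fgetc(fp);              /* look at the next character */
        if (c != EOF) {
            if (c != ' ' && c != '\n')
                *truncated = 1;         /* the word did not end here */
            ungetc(c, fp);              /* put it back for the next read */
        }
    }
    return rc;
}
```

When truncated is set, the next call returns the continuation of the same word, so the caller can join the pieces or reject the input.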
0

"%[^ \n]" basically matches everything except a \n or a ' ' character.

The fscanf statement you have reads a run of such characters from the file associated with inFile and stores it in a buffer named string, which should be declared as char string[500] (or with a similar, sufficiently large size). Every time it successfully reads such a run, it returns 1, the number of items it assigned.

Since you need a way for the while loop to quit after reading the file, you compare the return value of fscanf to EOF, the special value it returns once the end of the file has been reached.

EDIT: corrected const char*

7 Comments

Note that mine has a space after '[' and before '\'. Is this any different?
string must not be declared as const char *string. string must be a buffer large enough to hold any sequence of non-space/non-newline characters found, which I might note is very insecure.
So, in my example "I like trees", will string be different each loop?
@aakashjain So, in my example "I like trees", will string be different each loop?
@aakashjain %[^ \n] will read a sequence of characters, not just one.
0

while (fscanf(inFile, "%[^ \n] ", string) != EOF) has huge problems.

The format string "%[^ \n] " is made up of 2 directives: "%[^ \n]" and " ".

"%[^ \n]" is a format specifier made up of a scanset. The scanset consists of all char except ' ' and '\n'. So fscanf() looks for 1 or more characters that are in the scanset and saves them to string.

"%[^ \n]" has 2 problems:
1) If the first character encountered is not in the scanset, fscanf() ungets that character back to inFile and the function returns 0. No further scanning occurs. So if inFile begins with a ' ', code puts nothing in string, inFile is not advanced and subsequent code is set up for UB or an infinite loop!

2) The number of characters to save in string is unbounded. So if inFile begins with more than sizeof(string)-1 char in the scanset, undefined behavior ensues.
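
Problem 1 can be observed with a small sketch like the following. The helper name scan_once and the advanced out-parameter are assumptions made for illustration, and a field width of 99 is added so that the sketch itself does not run into problem 2.

```c
#include <stdio.h>

/* Hypothetical helper: performs one fscanf call with the scanset format
   and reports through *advanced whether the file position moved.
   Returns whatever fscanf returned. */
int scan_once(FILE *fp, char buf[100], int *advanced)
{
    long before = ftell(fp);
    int rc = fscanf(fp, "%99[^ \n] ", buf);

    *advanced = (ftell(fp) != before);
    return rc;
}
```

If the file starts with a space, scan_once returns 0 and leaves the position where it was, so a loop that only tests != EOF would repeat the same failing call forever.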

" " directive tells fscanf() to consume 0 or more white-space char. This can be confusing should inFile have the value stdin. In that case, the user needs to enter some non-white-space after the ' ' or '\n'. Since stdin is usually line buffered, that means something like "abc\ndef\n" needs to be entered before scanf() saves "abc" and returns 1. The "def\n" is still in stdin.

Recommend instead:

char string[100];
while (fscanf(inFile, "%99s", string) == 1) {
  ...
}
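
As a slightly fuller sketch of this loop, the function below counts the words in a stream. The name count_words is hypothetical and used only for illustration. Note that, unlike the scanset version, "%99s" also treats tabs as separators, and a word longer than 99 characters is counted as several pieces.

```c
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated words in fp.
   Leading whitespace is skipped by %s itself, so input that starts
   with a space or newline does not stall the loop. */
size_t count_words(FILE *fp)
{
    char word[100];
    size_t count = 0;

    while (fscanf(fp, "%99s", word) == 1) {
        count++;
    }
    return count;
}
```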

"%99s" will consume optional leading white-space, then up to 99 non-white-space char, appending the usual \0.

1 Comment

I believe you meant to say `== 1` or `> 0`, not `!= 1`.
