As the code below shows, it seems strange that the 'read' syscall does not work correctly with C language in Windows.

#include <fcntl.h>
#include <windows.h>
#include <stdio.h>

int main()
{
    int fd = open("a.txt",O_RDONLY);
    char *buf = (char *)malloc(4);
    read(fd,buf,4);
    printf("the string is %s\n",buf);
    return 0;
}

The C code is very short, and a.txt contains 'abcd'. But when I run it on Windows (MinGW environment, gcc compiler), the output is

abcd?

What is the "?" character in this output string?

Can I use the Unix "read" and "write" syscalls on Windows?

Thanks in advance.

  • You failed to ensure your string is NUL terminated. BTW you also fail to check if open was successful. Commented Feb 9, 2018 at 12:46
  • 'the 'read' syscall does not work correctly with C language in windows.' - just think about how unlikely that is, compared with the chance of bugs in your code... Commented Feb 9, 2018 at 13:25
  • FYI, on Windows, open, read, and "file descriptors" are implemented by the C runtime library; they are not system calls. The Windows API uses CreateFile to create a File object for a device or file-system file/directory and return a handle for it. ReadFile reads from a File object that's referenced by a handle. The actual system calls (i.e. that switch to ring-0 kernel mode via SYSCALL, SYSENTER, etc) are NtCreateFile and NtReadFile. Commented Feb 9, 2018 at 13:34

1 Answer

The issue has nothing to do with the platform or operating system. It's just that you're missing the string terminator.

In C a char string is really called a null-terminated byte string. That null-terminated bit is important, because all functions treating a pointer to char as a string look for this terminator to know when the string ends.

That means a string of four characters actually needs space for five, with the last being the null-terminator character '\0'.

Without the terminator, string functions can and will go out of bounds looking for it, leading to undefined behavior.

So:

char buf[5];  // 4 + 1 for terminator
int size = _read(fd, buf, 4);  // Windows and the MSVC compiler don't really have read

// _read (as well as the POSIX read) returns -1 on error, and 0 on end-of-file
if (size > 0)
{
    buf[size] = '\0';  // Terminate string
    printf("the string is %s\n", buf);
}

7 Comments

Might also want to add that some numbers don't map to a printable character in the current font, and that the logic called by printf typically prints a replacement character when this happens. That replacement character is generally a question mark in a diamond, or the simplified form, a question mark.
Thanks, @Some programmer dude. But why does my code run smoothly on Linux, with the header files sys/types.h and sys/stat.h instead?
@suoyong One of the possibilities of undefined behavior is seemingly running well.
@suoyong Or do you mean you can build without errors without those header files?
@Edwin Buck thanks for the addition. Every byte maps to an ASCII code, doesn't it? So where does the question mark come from?