My code

char[] fileContents = "hi\n whats up\n";

    char *output=malloc(sizeof(char)*1024) ;

    int i = 0; int j = 0;
    char *bPtr = fileContents;


    for(i=j=0; bPtr[i]!='\0'; i++)
    {
      if('\n'==bPtr[i])
            outputPtr[j++]='\r';
            outputPtr[j++]=bPtr[i];
    }

In NetBeans this code works, but when compiled with gcc on Linux, the \ and the n are treated as separate characters, whereas in NetBeans \n is a single character. Please help.

When debugging on Linux with GDB, it completely skips the if statement, while in NetBeans it enters the if statement and gets the job done.

  • sizeof(char) is always redundant, and with the hardwired type it's also error prone. When malloc'ing I use the pointer itself to set the type, here it'd be output = malloc(1024*sizeof *output);. Commented Feb 11, 2013 at 4:58
  • Ok, I will fix that but can you help me with my issue? Commented Feb 11, 2013 at 4:59
  • I suspect that the output is correct and consistent, but the way the file is read is not: this outputs \r\n, which is a "DOS newline". If this is not the case, please provide an example of the exact output at the point where it differs and how it differs. (Also, make sure to NUL-terminate output.) Commented Feb 11, 2013 at 4:59
  • @pst What do you mean? Given that input, I want this output: hi\r\n whats up \r\n. I get it in NetBeans using the same input. Commented Feb 11, 2013 at 5:00
  • @user1888502 Your code is working fine on my system (gcc/Linux). The output is also exactly what you want. Maybe the problem is something else? Commented Feb 11, 2013 at 5:09

2 Answers

1

First off, your C code isn't C code. It's close, but as is, it won't compile at all. Second, after cleaning up the code to get it to a compilable state:

#include <stdio.h>
#include <stdlib.h>

char fileContents[] = "hi\n whats up\n";

int main(void)
{
  char *output;
  int   i;
  int   j;
  char *bPtr;

  output = malloc(1024);
  bPtr   = fileContents;

  for (i = j = 0 ; bPtr[i] != '\0' ; i++)
  {
    if ('\n' == bPtr[i])
      output[j++] = '\r';
    output[j++] = bPtr[i];
  }

  output[j] = '\0';
  fputs(output,stdout);
  return EXIT_SUCCESS;
}

And compiling with "gcc -g a.c" and using gdb:

GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/tls/libthread_db.so.1".

(gdb) break 17
Breakpoint 1 at 0x80483fa: file a.c, line 17.
(gdb) run
Starting program: /tmp/a.out 

Breakpoint 1, main () at a.c:17
17        for (i = j = 0 ; bPtr[i] != '\0' ; i++)
(gdb) n
19          if ('\n' == bPtr[i])
(gdb) n
21          output[j++] = bPtr[i];
(gdb) n
17        for (i = j = 0 ; bPtr[i] != '\0' ; i++)
(gdb) n
19          if ('\n' == bPtr[i])
(gdb) n
21          output[j++] = bPtr[i];
(gdb) n
17        for (i = j = 0 ; bPtr[i] != '\0' ; i++)
(gdb) n
19          if ('\n' == bPtr[i])
(gdb) n
20            output[j++] = '\r';
(gdb) n
21          output[j++] = bPtr[i];

The first two times through the loop, we skip over the condition, since it's false. On the third time through, the condition is met, and the "\r" is included in the output.

But from reading some of your other comments, it seems you are confused by line endings. On Unix (and because Linux is a type of Unix, this is true for Linux as well), lines end with one character, LF (ASCII code 10). Windows (and MS-DOS, the precursor to Windows, and CP/M, the precursor to MS-DOS) uses the character sequence CR LF (ASCII code 13, ASCII code 10) to mark the end of line.

Why the two differing standards? Because of the wording of the ASCII standard, when it was created and why. Back when it was created, output was mostly on teletypes---think typewriter. CR was defined as moving the print carriage (or print head) back to the beginning of the line, and LF was defined as advancing to the next line. The action of bringing the print carriage to the beginning of the next line was unspecified. CP/M (and descendants) standardized on using both to mark the end of a line due to a rather literal translation of the standards document. The creators of Unix decided on a more liberal interpretation where LF, a Line Feed, meant to advance to the next line for output and bring the print carriage back to the start (whereas the first computer I used relied on CR for the same thing: bring the carriage back to the start and advance to the next line).

Now, if a teletype device is hooked up to a Unix system and requires both CR and LF, then it's up to the Unix device driver, when it sees a LF, to add the required CR. In other words, the system handles the details on behalf of your program, and you only need the LF to end a line.

To further confound the mess, the C standard weighs in. When you open a file,

FILE *fp = fopen("sometextfile.txt","r");

you open it in "text" mode. Under Unix, this does nothing, but under Windows, the C library will discard "\r" on input so the program only needs to concern itself with looking for "\n" (and for files opened for writing, it will add the CR when a LF is seen). But this is under Windows (there may be other systems out there that do this, but I am unfamiliar with any).

If you really want to see the file, as is, you need to open it in binary mode:

FILE *fp = fopen("sometextfile.txt","rb");

Now, if there are any CRs in the file, your program will see them. Normally, one doesn't need to concern themselves with line endings---it's only when you move a text file from one system to another that uses a different line-ending convention where it becomes an issue, and even then, the transport mechanism might take care of the issue for you (such as FTP). But it doesn't hurt to check.
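One way to "check" is to count the CR and LF bytes yourself. The following is only a minimal sketch: count_line_endings is a hypothetical helper name (not something from the standard library), and it assumes the caller has already opened the stream in binary mode ("rb") so that no translation takes place.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (illustration only): counts CR (ASCII 13) and
   LF (ASCII 10) bytes in an already-open stream. The stream is assumed
   to be opened in binary mode so the C library does not translate
   line endings behind our back. */
static void count_line_endings(FILE *fp, long *cr, long *lf)
{
    int c;

    assert(fp != NULL && cr != NULL && lf != NULL);

    *cr = 0;
    *lf = 0;
    while ((c = getc(fp)) != EOF) {
        if (c == '\r')
            ++*cr;
        else if (c == '\n')
            ++*lf;
    }
}
```

If the CR count comes back as zero and the LF count is nonzero, the file uses Unix line endings; if the two counts match, it most likely uses CR LF throughout.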

Remember when I said that Unix does not make a distinction between "text" and "binary" modes? It doesn't. So if a text file from the Windows world is processed with a Unix program, said Unix program will see the CRs. What happens is really up to the program in question. Programs like grep don't seem to care, but the editor I use will show any CR that exists.

So I guess now, my question is---what are you trying to do?


7 Comments

I am running Linux. I have a file, created using vim, with this text in it: Hi\n how are you\n. I save the file and read it using the read system call. Now I want to convert it to DOS-style newlines, so a new file should be created with the same contents as the previous file but with DOS newlines: hi\r\n how are you \r\n. The code I wrote worked fine in NetBeans on a Mac, but on Linux it is treating \n as \\n, which confuses me about how to approach this and simply replace the \n with \r\n.
At the command line, run "wc -l yourtextfile". If it reports 1, then what you have (as far as Unix is concerned) is a text file of one line, which contains the sequence "\n" (ASCII 92, ASCII 110) twice, not two end of lines.
it reports 1, even though I have \n twice in the same line
Typing "\n" in insert mode of vi does NOT create a new line; it inserts two characters, backslash (ASCII 92) and n (ASCII 110), into the line. If you want a new line, press the Enter key in insert mode.
It sounds like (having no experience with it) NetBeans (whatever that is) will automatically convert the sequence "\n" it sees in a text file into an end-of-line marker. C won't do that for you. The C compiler, when it sees a "\n" in a string constant in C source code, will convert it to an end-of-line marker, but the standard C library (getc(), scanf(), etc.) WON'T.
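If the file really does contain the two characters backslash and n, and the goal is to turn each such pair into CR LF, the loop has to look for the pair rather than for a single '\n'. A minimal sketch follows; escaped_to_crlf is a hypothetical helper name, and the decision to return -1 when the output buffer is too small is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): copies `in` to `out`,
   replacing each literal two-character sequence backslash + 'n'
   (ASCII 92, ASCII 110) with CR LF (ASCII 13, ASCII 10).
   Returns the number of characters written, not counting the
   terminating NUL, or -1 if `out` is too small. */
static long escaped_to_crlf(const char *in, char *out, size_t outsize)
{
    size_t i = 0;
    size_t j = 0;

    assert(in != NULL && out != NULL);

    while (in[i] != '\0') {
        if (in[i] == '\\' && in[i + 1] == 'n') {
            if (j + 2 >= outsize)
                return -1;          /* need room for CR, LF and NUL */
            out[j++] = '\r';
            out[j++] = '\n';
            i += 2;                 /* skip both source characters */
        } else {
            if (j + 1 >= outsize)
                return -1;          /* need room for the char and NUL */
            out[j++] = in[i++];
        }
    }
    if (j >= outsize)
        return -1;                  /* no room even for the NUL */
    out[j] = '\0';
    return (long)j;
}
```

Note that in C source the input for a quick experiment has to be written as "hi\\n whats up\\n", with doubled backslashes, to get the same bytes that vi stored in the file.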
0

Your code runs perfectly on my system with gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 .

The code I ran :

#include <stdio.h>

int main()
{
    char Contents[] = "hi\n whats up\n";

    int i = 0; int j = 0;
    char outputPtr[20];
    for(i=j=0; Contents[i]!='\0'; i++)
    {
      if('\n'==Contents[i]) outputPtr[j++]='\r';
      outputPtr[j++]=Contents[i];
    }
    outputPtr[j]='\0';
    printf("%s %d %d \n", outputPtr,j,i);
    i = 0;
    while(outputPtr[i]!='\0')   printf(" %d ", outputPtr[i++]);
    return 0;
}

Output :

hi
 whats up
 15 13 //Length of the edited string and the original string
 104  105  13  10  32  119  104  97  116  115  32  117  112  13  10 //Ascii values of the characters of the string 

13 is the carriage return character.

8 Comments

Why does it skip the if statement on my machine? Does it treat \n as \\n on your machine? I have a file with that content, and when I read it in, the \n is treated as \\n, which is forcing me to change my if statement.
char outputPtr[10];? and where is segfault?
@Eddy_Em Missed that. Changed and tested. Segfault may not happen if the array is this small.
@user1888502 The problem is in your file read statements and not in the parsing. If the file input has \\n then just adjust for that.
what do you mean by adjust for that, I am having a hard time understanding how the output should then be for linux with the \r\n
