I'm trying to write a function that converts every escape sequence spelled out in a string into its non-printable form. For example, if the string contains a literal backslash followed by n, as in "This \n makes a new line", I would like the result to contain a real newline at that spot, so that "makes a new line" ends up on the next line. So far I've got this. I'm calling it from main:

int main()
{
    unescape("This \\n\\n is \\t\\t\\t string number \\t 7.");
    return 0;
}

char* unescape(char* s)
{
    char *esc[2] = {"\\n", "\\t"};
    int i;
    char* uus = (char*)calloc(80, sizeof(char));
    char* uus2 = (char*)calloc(80,sizeof(char));

    strncpy(uus, s, strlen(s));

    for(i = 0; i < 2; i++)
    {
        while(strstr(uus, esc[i]) != NULL) //checks if \\n can be found
        {
            //printf("\n\n%p\n\n", strstr(uus, esc[i]));
            int c = strstr(uus, esc[i]) - uus; //gets the difference between the address of the beginning of the string and the location
                                           //where the searchable string was found
            uus2 = strncpy(uus2, uus, c); //copies the beginning of the string to a new string

            //add check which esc is being used
            strcat(uus2, "\n"); //adds the non-printable form of the escape sequence
            printf("%s", uus2);

            //should clear the string uus before writing uus2 to it 
            strncpy(uus, uus2, strlen(uus2)); //copies the string uus2 to uus so it can be checked again
        }
    }
    //this should return something in the end. 
}

What I need to do now is take the part of the string uus that comes after "\n" and append it to uus2, so that I can run the while loop again. I thought about using strtok but hit a wall, because it splits the string into separate pieces at a delimiter, and in my case that delimiter is not always there.

edit: Adding the rest of the string to uus2 should be before strncpy. This is the code without it.

edit vol2: This is the code that works and which I ended up using. Basically edited Ruud's version a bit as the function I had to use had to return a string. Thanks a lot.

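char* unescape(const char* s)
{
    /* The result is never longer than the input, so strlen(s) + 1 bytes are enough. */
    char *uus = (char*) calloc(strlen(s) + 1, sizeof(char));
    size_t i = 0;

    if (uus == NULL)
        return NULL;

    while (*s != '\0')
    {
        char c = *s++;
        if (c == '\\' && *s != '\0')
        {
            c = *s++;
            switch (c)
            {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            }
        }
        uus[i] = c;
        i++;
    }
    uus[i] = '\0';
    return uus; /* the caller is responsible for free()ing this */
}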
  • There were at least 3 problems with your original code: (1) you were so busy copying data back and forth that you forgot to close the 'holes' that these escape characters leave behind; (2) you left strings unterminated because strncpy never copied the trailing \0; (3) \t was replaced by a newline because of the hardcoded \n in strcat(uus2, "\n");. Commented Apr 12, 2014 at 21:16
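
To illustrate point (2): strncpy(dst, src, n) only writes a terminating \0 when src is shorter than n, so a call like strncpy(uus2, uus, c) copies a prefix and leaves termination to whatever happened to be in the buffer. Below is a minimal sketch of a prefix copy that always terminates. copy_prefix is a hypothetical helper, not part of the original code, and it assumes dst has room for n + 1 bytes and src has at least n characters.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copy the first n bytes
   of src into dst and always terminate the result.
   Assumes dst has room for n + 1 bytes and src holds at least n bytes. */
static char *copy_prefix(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);   /* copy exactly the prefix */
    dst[n] = '\0';         /* terminate explicitly; strncpy would not here */
    return dst;
}
```

With this, copy_prefix(uus2, uus, c) followed by strcat is safe regardless of what uus2 held before.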

2 Answers

I agree with Anonymouse. It is both clumsy and inefficient to replace first all \n, then all \t. Instead, make a single pass through the string, replacing all escape characters as you go.

I left the space allocation out in the code sample below; IMHO this is a separate responsibility, not a part of the algorithm, and as such does not belong in the same function.

void unescape(char *target, const char *source)
{
    while (*source != '\0')
    {
        char c = *source++;
        if (c == '\\' && *source != '\0')
        {
            c = *source++;
            switch (c)
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
            }
        }
        *target++ = c;
    }
    *target = '\0';
}
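
For completeness, here is one way the caller could handle the allocation that was left out above. This is only a sketch: unescape_alloc is a hypothetical wrapper name, and it relies on the observation that the unescaped result is never longer than the source, so strlen(source) + 1 bytes are enough. The unescape function is repeated so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Single-pass unescape, repeated from the answer above. */
static void unescape(char *target, const char *source)
{
    while (*source != '\0')
    {
        char c = *source++;
        if (c == '\\' && *source != '\0')
        {
            c = *source++;
            switch (c)
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
            }
        }
        *target++ = c;
    }
    *target = '\0';
}

/* Hypothetical wrapper: allocation is the caller-side responsibility.
   Returns NULL if malloc fails; the caller must free() the result. */
static char *unescape_alloc(const char *source)
{
    assert(source != NULL);
    char *target = malloc(strlen(source) + 1); /* output never exceeds input length */
    if (target != NULL)
        unescape(target, source);
    return target;
}
```

Usage would be along the lines of char *s = unescape_alloc("This \\n is \\t 7."); followed by free(s) when done.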

EDIT: Here's an alternative version, using strchr as suggested by Anonymouse. This implementation should be faster, especially on very long strings with relatively few escape characters. I posted it primarily as a demonstration of how optimizations can make your code more complex and less readable; and consequently less maintainable and more error-prone. For a detailed discussion, see: http://c2.com/cgi/wiki?OptimizeLater

void unescape(char *target, const char *source)
{
    while (*source != '\0')
    {
        if (*source++ == '\\' && *source != '\0')
        {
            char c = *source++;
            switch (c)
            {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
            }
            *target++ = c;
        }
        else
        {
            const char *escape = strchr(source--, '\\');
            int numberOfChars = escape != NULL ? escape - source : strlen(source);
            strncpy(target, source, numberOfChars);
            target += numberOfChars;
            source += numberOfChars;
        }
    }
    *target = '\0';
}
4 Comments

Going to try this out. Though, just being curious, is there a way to pull my approach off? Just wondering for future reference as I've spent so much time trying to make it happen that way.
My version with strchr will be faster. Also, I'm (mistakenly) searching for \\n, so delete the if statement. Finally, Ruud's function prototype is better than the questioner's: const is your friend, especially if you try to modify a constant string (such as "This \\n\\n is \\t\\t\\t string number \\t 7.").
I ended up using Ruud's version of the code. I edited my first post with the final code. Sadly, I had some trouble understanding Anonymouse's code but thanks a lot to both of you.
@Anonymouse: Indeed, strchr is faster, but speed isn't everything. Please see my edit; and the wiki discussion I hyperlinked.
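
On the question of whether the original strstr approach can be made to work: it can, provided each replacement also closes the gap left behind. A minimal sketch, assuming the string lives in a writable buffer (not a string literal); unescape_strstr is a hypothetical name, and the sketch keeps the original idea of one pass per escape sequence.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name. Rewrites buf in place, so buf must be writable. */
static void unescape_strstr(char *buf)
{
    static const char *const esc[] = { "\\n", "\\t" };  /* text to search for */
    static const char repl[]       = { '\n',  '\t'  };  /* matching replacement */
    assert(buf != NULL);

    for (size_t i = 0; i < sizeof esc / sizeof esc[0]; i++)
    {
        char *hit = buf;
        while ((hit = strstr(hit, esc[i])) != NULL)
        {
            *hit = repl[i];                                 /* overwrite the backslash */
            memmove(hit + 1, hit + 2, strlen(hit + 2) + 1); /* close the hole, '\0' included */
            hit++;                                          /* resume after the replacement */
        }
    }
}
```

Unlike the single-pass version, this does not treat a doubled backslash specially, so a backslash pair followed by n still has its second backslash and the n turned into a newline.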

You'd be better off using this:

char *p;

p = input_string;
while ((p=strchr (p, '\\')) != NULL)
{
  if (p [1] == '\\')
  {
     switch (p [2])
     {
     case 'n' :
       // handle \n
       break;
     case 't' :
       // handle tab
       break;    
     }
  }
  else
    p++;

}
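
As noted in the comments on the other answer, the if test above looks for a second backslash by mistake, and the loop never advances p when that test succeeds. Here is a sketch of the same strchr loop with the if removed and the "handle" placeholders filled in. It assumes input_string is a writable buffer that can be rewritten in place; unescape_strchr is a hypothetical name, and dropping the backslash from unknown escapes is a choice made here to match the single-pass version.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name. Rewrites input_string in place, so it must be writable. */
static void unescape_strchr(char *input_string)
{
    assert(input_string != NULL);
    char *p = input_string;

    while ((p = strchr(p, '\\')) != NULL)
    {
        char c;
        switch (p[1])
        {
        case 'n':  c = '\n'; break;   /* handle \n */
        case 't':  c = '\t'; break;   /* handle tab */
        case '\0': return;            /* trailing backslash: leave it alone */
        default:   c = p[1]; break;   /* unknown escape: keep the character only */
        }
        *p = c;                                      /* replace the backslash */
        memmove(p + 1, p + 2, strlen(p + 2) + 1);    /* drop the second character */
        p++;                                         /* always move past the result */
    }
}
```

Because p is advanced past every replacement, a doubled backslash collapses to a single one and is not scanned again.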
