1

i've this code:

     int i =0;
  char * str = "ar   bitrary             whitespace";

  int whitespace=0,index;
  for(index = 0;index < strlen(str);index++)
    {
      if(isspace(str[index]) != 0)
    {
      whitespace++;
    }
    }


  char * tmp = (char *)calloc(strlen(str)-whitespace +1,sizeof(char));

  memset(tmp,'\0',strlen(tmp)+1);

  while(i < strlen(str))
    {
      if(isspace(str[i]) != 0)
    {
      i++;
      continue;
    }else if(isspace(str[i]) == 0)
    {

      strcat(tmp,&str[i]);
      i++;
    }

    }


  printf("\nnew string is: %s \n",tmp);

the problem is that the output is a string without the whitespace removed + some garbage character. I've used memset to null terminate tmp,is there the problem?

2
  • Garbage after a string sounds like you missing a string terminator. Commented Feb 21, 2017 at 17:44
  • onlinegdb.com Commented Feb 21, 2017 at 17:55

3 Answers 3

1

The length of the source string could be calculated before this loop

for(index = 0;index < strlen(str);index++)

Otherwise if the code will not be optimized the function strlen will be called for each iteration of the loop. In fact using of the function is redundant for such a task.

This statement

memset(tmp,'\0',strlen(tmp)+1);

does not make sense because the call of calloc already initialized the memory with zeroes.

This statement

strcat(tmp,&str[i]);

also copies blanks from the source string after the position i. So it can write beyond the memory allocated for the array pointed to by the pointer tmp.

You can write a separate function that can look as it is shown in this demonstrative program

#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>

char * remove_blanks( const char *s )
{
    size_t n = 0;

    const char *p = s;

    do
    {
        if ( !isspace( ( unsigned char )*p ) ) ++n;
    } while ( *p++ );

    char *t = malloc( n );

    if ( t )
    {
        char *q = t;
        p = s;

        do
        {
            if ( !isspace( ( unsigned char )*p ) ) *q++ = *p;
        } while ( *p++ );
    }

    return t;
}

int main(void) 
{
    char * str = "ar   bitrary             whitespace";

    printf( "\"%s\"\n", str );

    char *t = remove_blanks( str );

    printf( "\"%s\"\n", t );

    free( t );
}   

The program output is

"ar   bitrary             whitespace"
"arbitrarywhitespace"
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks,I'm trying to understand a couple of things: when you in the beginning say that the "strlen" in the for loop was redundant,could that be solved like "int ln = strlen(str)" and than using ln as upper limit for the loop?(avoiding like this the continuous call to strlen). Second things: the cast to unsigned character is really necessary? Third things: in the do-while loop as the argument of the while you use "*p++" and then in the isspace inside the loop you write "*p" as the argument, but *p don't indicate only to the first element of the string? Sorry for the questions,but I'm learning.
@Collapsed You can calculate the length either before the loop or in the part of the first expression of the for statement. If some characters can have negative values then it is necessary to cast to unsigned char. In the do while loop there is access to each character of the original string using the pointer p. So it must be increased to point each character sequentially and this is done in the condition of the loop.
1

this is your problem

 memset(tmp,'\0',strlen(tmp)+1);

strlen(tmp) works by looking for '\0' in tmp, you have a chicken and egg situation here.

You should not be doing a memset any way, just tack on a '\0' when you fnish copying

And dont use strcat, instead maintain a pointer to tmp and just do *p = str[i] then increment p

1 Comment

I don't understand, are you implying that there is no 0 terminator or that memset() only sets the first byte to 0?
1

I will not read your question, you overwrite the '\0' terminator for sure.

Now that I read your question, it looks like you need to understand strings and arrays better,

  1. Don't ever write while (i < strlen(str))
  2. Don't use strcat() for adding a single character, you apparently did overwrite the '\0' there. Furthermore, don't ever use strcat() for concatenating more than to pieces of a string.

Also notable,

  • You memset() after calloc() which already initialized to 0. That means that you are enforcing something that is not necessary, and trying it twice as if it failed the first time which I can guarantee it didn't.

    In fact, since you have used calloc() and all bytes pointed to by tmp are 0 then strlen(tmp) will return 0, thus your memset() is equivalent to

    tmp[0] = '\0';
    

    and you REALLY don't need initialize tmp except when you finally copy the actual bytes from str.

I always advice against calloc() for strings, because

  1. You don't really need to initialize something twice.
  2. You should be sure your code does take the terminating '\0' into account and not simply assume that it's there because you calloc()ed. That is a bug that you just hide with calloc() but it shows up at some point.

Try this and see if you can understand the reasons for my changes

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>

int main(void)
{
    int whitespace;
    int length;
    char *str = "ar   bitrary             whitespace";
    char *tmp;

    whitespace = 0;
    for (length = 0; str[length] != '\0'; ++length) {
        if (isspace(str[length]) != 0) {
            whitespace++;
        }
    }
    tmp = malloc(length - whitespace + 1);
    if (tmp == NULL)
        return -1;
    for (int i = 0, j = 0; str[i] != '\0'; ++i) {
        if (isspace(str[i]) != 0)
            continue;
        tmp[j++] = str[i];
    }
    tmp[length - whitespace] = '\0';
    printf("new string is: %s\n",tmp);

    free(tmp);
    return 0;
}

3 Comments

Thanks for having answered. What is the reason why i don't have to write "while(i < strlen(str))" ? After everything I've to enforce the '\0' termination as you did(talking about the general case of working with strings)?
the strlen function will take time and spac. when you write while(strlen(str)) the loop will go to the function every time until the loop will end. when you write int l=strlen(str) before and use l you will call strlen only once.
@Collapsed Unlike in PHP where it's common to see strlen() in a loop control, in c it's not a good idea because the length is not stored anywhere. So every time you call it, it counts characters one by one. The same goes for strcat(). Knowing the length of a string whose length will not change during the execution is a nice advantage in c that you can use for improving performance.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.