
As part of learning C, I am developing a couple of string manipulation functions. One of them replaces substrings within a string, and it is raising some questions. I am working in C99, compiling on macOS Sierra and FreeBSD.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *repstr(char input[], char rep[], char new[]) {

    char *output = malloc(strlen(input)*5); // <- Question 2
    int replen = strlen(rep);
    int newlen = strlen(new);
    int a, b, c = 0;

    // printf("input: %ld\t%s\n", strlen(input), input); // <- Question 1

    while(input[a]) {
            if(input[(a+b)] == rep[b]) {
                    if(b == replen - 1) {
                            strcat(output, new);
                            a += replen;
                            c += newlen;
                            b=0;
                    }
                    else b++;
            } else {
                    output[c] = input[a];
                    a++;
                    c++;
            }
    }

    return output;
}


int main() {

    char buffer[] = "This is the test string test string test test string!";
    char rep[] = "test";
    char new[] = "tested";

    int len = strlen(buffer);

    char output[len+5];

    printf("input: %d\t%s\n", len, buffer); // <- Question 1
    strcpy(output, repstr(buffer, rep, new));
    printf("output: %ld\t%s\n", strlen(output), output);

    return 0;
}

Question 1: When this line is executed in main() it causes a segfault. However, when executed within the function everything seems to work fine. Why?

Question 2: I have realized that I need a pretty large piece of memory allocated for the output to look as expected. strlen(input)*5 is an arbitrary number which seems to work, but why do I get seemingly 'random' errors when lowering the number?

NB! As this is part of my process of learning C, I am not primarily interested in (more efficient) pre-fab solutions to the problem (I already have them), but in explanations of the two questions listed, so that I can solve the problem myself.

Also, this is my first post in the SO forums. Hello.

  • strlen returns size_t, so use %zu as the printf format and the correct type for the variables. new is a reserved word in C++, so you should avoid using it as a variable name. Commented Oct 10, 2016 at 8:20
  • 1. %ld is the wrong format type. 2. If you count the occurrences of the substring first, you can calculate how long the new string will be. Commented Oct 10, 2016 at 8:23
  • Moreover: you have to check that malloc's return value is != NULL (it can fail) and initialize the allocated memory, because it contains arbitrary values. Your call to strcat is UB otherwise. Commented Oct 10, 2016 at 8:31
  • BTW, your segmentation fault is mainly caused by int a, b, c = 0; which should be int a=0, b=0, c = 0; With your code, a and b are not initialized to zero. Commented Oct 10, 2016 at 8:39
  • output in main is not big enough to hold the result! Commented Oct 10, 2016 at 8:46

1 Answer


Question 1: When this line is executed in main() it causes a segfault. However, when executed within the function everything seems to work fine. Why?

No, printf("input: %d\t%s\n", len, buffer); // <- Question 1 is not the cause of your segfault.

printf("output: %ld\t%s\n", strlen(output), output);

This line does have a problem, though: strlen returns size_t, not long, so %ld is the wrong conversion specifier. As noted in the comments, use %zu to print it.
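As a small illustration of the %zu specifier, here is a minimal sketch that formats a string together with its length. The helper name format_len is hypothetical and not part of the original code; it writes into a caller-supplied buffer with snprintf so the result can be inspected.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "output: <length>\t<s>\n" into dst.
   %zu matches the size_t that strlen returns.
   Returns what snprintf returns: the number of characters that the
   full text needs, not counting the terminating '\0'. */
int format_len(char *dst, size_t dstsize, const char *s) {
    return snprintf(dst, dstsize, "output: %zu\t%s\n", strlen(s), s);
}
```

Calling format_len(buf, sizeof buf, output) and then printing buf with fputs would give the same line as the printf call in main(), with a matching format.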

Also, while(input[a]) stops at the null terminator without copying it, so output never receives a terminator and printf keeps reading past the end of the data. Add one after the loop:

output[c] = '\0';
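The same issue affects the strcat call in the loop: strcat looks for an existing '\0' in output, and freshly malloc'ed memory does not guarantee one. One possible way around this, sketched under the assumption that the caller has already allocated enough room, is to copy the replacement at index c and write the terminator once at the end. The helper name append_at is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into out starting at index c and
   returns the new write index. It does not write a terminator; the
   caller sets out[c] = '\0' once, after the last append.
   Assumption: out has room for strlen(src) more bytes at index c. */
size_t append_at(char *out, size_t c, const char *src) {
    size_t n = strlen(src);
    memcpy(out + c, src, n);
    return c + n;
}
```

In the loop this would stand in for the strcat call and the c += newlen line together, for example c = append_at(output, c, new);, which keeps c as the single source of truth for where the next character goes.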

Also, as noted by @LPs in the comments, you should zero-initialize the variables you work with. In int a, b, c = 0; only c gets an initializer, while a and b start with indeterminate values:

 int a = 0, b = 0, c = 0;

Question 2: I have realized that I need a pretty large piece of memory allocated for the output to look as expected. strlen(input)*5 is an arbitrary number which seems to work, but why do I get seemingly 'random' errors when lowering the number?

Probably because you haven't allocated enough memory. The length of the result depends on the runtime contents of the strings, so unless you count the matches first you cannot know the exact size in advance. In that case, allocate an upper bound (this one assumes new is not empty):

char *output = malloc(strlen(input) * strlen(new) + 1);
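The suggestion from the comments, counting the occurrences first, makes it possible to compute the exact size instead of an upper bound. A minimal sketch, assuming non-overlapping matches scanned left to right; count_occurrences and replaced_size are hypothetical helper names, and new_str is used instead of new only to keep the snippet valid as C++ too.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of rep in input.
   Returns 0 when rep is empty, since an empty pattern would match everywhere. */
size_t count_occurrences(const char *input, const char *rep) {
    size_t replen = strlen(rep);
    size_t count = 0;

    if (replen == 0)
        return 0;

    /* strstr returns a pointer to the next match, or NULL when there is none */
    for (const char *p = strstr(input, rep); p != NULL; p = strstr(p + replen, rep))
        count++;

    return count;
}

/* Hypothetical helper: bytes needed for the result, including the '\0'. */
size_t replaced_size(const char *input, const char *rep, const char *new_str) {
    size_t n = count_occurrences(input, rep);

    /* every match drops strlen(rep) characters and adds strlen(new_str) */
    return strlen(input) - n * strlen(rep) + n * strlen(new_str) + 1;
}
```

For the strings in the question, the input is 53 characters long and contains "test" four times, so the result needs 53 + 4 * 2 = 61 characters plus one byte for the terminator. The allocation could then read malloc(replaced_size(input, rep, new)), still followed by a check for NULL.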

3 Comments

Thanks for your answer. I'm already feeling a bit smarter. However: Q1: I have replaced %ld with %zu, and am now storing 'len' in a size_t variable. The problem persists that if I comment out the line in main() marked with Question 1, the program runs as it should. If I keep the line I still get the segfault. Q2: In this example the input string is 53 characters long and the output string 61. I still get odd behavior if I allocate strlen(input)*2, for example.
@BNauclér Did you add the null terminator and zero-initialize your ints?
Ah! Zero initialization seems to have been the main issue there. I was under the impression that my int a, b, c = 0; would initialize them all to 0. I have now changed this to int a = 0, b = 0, c = 0; and it runs like a charm. Yes, I also inserted the null terminator. Thank you!
