101

I have two postcodes char* that I want to compare, ignoring case. Is there a function to do this?

Or do I have to loop through each use the tolower function and then do the comparison?

Any idea how this function will react with numbers in the string

Thanks

5
  • I think I wrote that in a bad way, postcode is not a type , just the real world value the char* will hold. Commented Apr 28, 2011 at 15:11
  • 3
    What platform are you on? Many platforms have a platform-specific function to do this. Commented Apr 28, 2011 at 15:11
  • If you are comparing a number with a letter, then you know the strings aren't equivalent, regardless of case. Commented Apr 28, 2011 at 15:11
  • I assume you just mean ASCII string comparison? Not generic to the whole world across multiple locales? Commented Apr 28, 2011 at 15:11
  • The comparison could result in comparing a number and a letter, I need to test if two postcodes are equal to each other, one is greater than or one is less than. The greater than, less than part is confusing, I'm not sure how that's going to work out Commented Apr 28, 2011 at 16:49

13 Answers 13

71

There is no function that does this in the C standard. Unix systems that comply with POSIX are required to have strcasecmp in the header strings.h; Microsoft systems have stricmp. To be on the portable side, write your own:

int strcicmp(char const *a, char const *b)
{
    for (;; a++, b++) {
        int d = tolower((unsigned char)*a) - tolower((unsigned char)*b);
        if (d != 0 || !*a)
            return d;
    }
}

But note that none of these solutions will work with UTF-8 strings, only ASCII ones.

Sign up to request clarification or add additional context in comments.

9 Comments

This implementation is not correct; it will incorrectly return 0 when b is a substring of a. For example it will return 0 for strcicmp("another", "an") but it should return 1
This is bad advice. There is no reason to "write your own" standard C text functions to deal with a simple name difference. Do #ifdef _WINDOWS ... #define strcasecmp stricmp ... #endif and put it in an appropriate header. The above comments where the author had to fix the function to work right is why rewriting standard C functions is counter-productive if a far simpler solution is available.
Neither _stricmp nor strcasecmp is available in -std=c++11. They also have different semantics with regards to locale.
This will break awfully when a or b are NULL.
@YoTengoUnLCD Re: break awfully when a or b are NULL. Breaking with a and/or b as NULL is commonly accepted practice as a null pointer does not point to a string. Not a bad check to add, yet what to return? Should cmp("", NULL) return 0, INT_MIN? There is not consensus on this. Note: C allows UB with strcmp(NULL, "abc");.
|
49

Take a look at strcasecmp() in strings.h.

9 Comments

I think you mean int strcasecmp(const char *s1, const char *s2); in strings.h
This function is non-standard; Microsoft calls it stricmp. @entropo: strings.h is a header for compatibility with 1980s Unix systems.
@entropo: apologies, POSIX does seem to define strings.h. It also defined strcasecmp, to be declared in that header. ISO C doesn't have it, though.
See: difference-between-string-h-and-strings-h . Some C standard libraries have merged all of the non-deprecated functions into string.h. See, e.g., Glibc
Yes it seems there is such header strings.h and in theory strcasecmp should be declared there. But all compilers I used have strcasecmp declared in string.h. at least cl, g++, forte c++ compilers has it.
|
13

Additional pitfalls to watch out for when doing case insensitive compares:


Comparing as lower or as upper case? (common enough issue)

Both below will return 0 with strcicmpL("A", "a") and strcicmpU("A", "a").
Yet strcicmpL("A", "_") and strcicmpU("A", "_") can return different signed results as '_' is often between the upper and lower case letters.

This affects the sort order when used with qsort(..., ..., ..., strcicmp). Non-standard library C functions like the commonly available stricmp() or strcasecmp() tend to be well defined and favor comparing via lowercase. Yet variations exist.

int strcicmpL(char const *a, char const *b) {
  while (*b) {
    int d = tolower(*a) - tolower(*b);
    if (d) {
        return d;
    } 
    a++;
    b++;
  } 
  return tolower(*a);
}

int strcicmpU(char const *a, char const *b) {
  while (*b) {
    int d = toupper(*a) - toupper(*b);
    if (d) {
        return d;
    } 
    a++;
    b++;
  } 
  return toupper(*a);
}

char can have a negative value. (not rare)

touppper(int) and tolower(int) are specified for unsigned char values and the negative EOF. Further, strcmp() returns results as if each char was converted to unsigned char, regardless if char is signed or unsigned.

tolower(*a); // Potential UB
tolower((unsigned char) *a); // Correct (Almost - see following)

char can have a negative value and not 2's complement. (rare)

[edit 2024]
This is no longer possible with C23 as the standard now requires 2's compliment for signed integer types.

The above does not handle -0 nor other negative values properly as the bit pattern should be interpreted as unsigned char. To properly handle all integer encodings, change the pointer type first.

// tolower((unsigned char) *a);
tolower(*(const unsigned char *)a); // Correct

Locale (less common)

Although character sets using ASCII code (0-127) are ubiquitous, the remainder codes tend to have locale specific issues. So strcasecmp("\xE4", "a") might return a 0 on one system and non-zero on another.


Unicode (the way of the future)

If a solution needs to handle more than ASCII consider a unicode_strcicmp(). As C lib does not provide such a function, a pre-coded function from some alternate library is recommended. Writing your own unicode_strcicmp() is a daunting task.


Do all letters map one lower to one upper? (pedantic)

[A-Z] maps one-to-one with [a-z], yet various locales map various lower case characters to one upper and visa-versa. Further, some uppercase characters may lack a lower case equivalent and again, visa-versa.

This obliges code to covert through both tolower() and tolower().

int d = tolower(toupper(*a)) - tolower(toupper(*b));

Again, potential different results for sorting if code did tolower(toupper(*a)) vs. toupper(tolower(*a)).


Portability

@B. Nadolson recommends to avoid rolling your own strcicmp() and this is reasonable, except when code needs high equivalent portable functionality.

Below is an approach that even performed faster than some system provided functions. It does a single compare per loop rather than two by using 2 different tables that differ with '\0'. Your results may vary.

static unsigned char low1[UCHAR_MAX + 1] = {
  0, 1, 2, 3, ...
  '@', 'a', 'b', 'c', ... 'z', `[`, ...  // @ABC... Z[...
  '`', 'a', 'b', 'c', ... 'z', `{`, ...  // `abc... z{...
}
static unsigned char low2[UCHAR_MAX + 1] = {
// v--- Not zero, but A which matches none in `low1[]`
  'A', 1, 2, 3, ...
  '@', 'a', 'b', 'c', ... 'z', `[`, ...
  '`', 'a', 'b', 'c', ... 'z', `{`, ...
}

int strcicmp_ch(char const *a, char const *b) {
  // Compare using tables that differ slightly so no need to check for a null character.
  while (low1[*(const unsigned char *)a] == low2[*(const unsigned char *)b]) {
    a++;
    b++;
  }
  // Either strings differ or null character detected.
  // Perform subtraction using same table.
  return (low1[*(const unsigned char *)a] - low1[*(const unsigned char *)b]);
}

Pedantically: there exist systems where the range of unsigned char is like unsigned, so better to compare in the end.

unsigned char c1 = low1[*(const unsigned char *)a];
unsigned char c2 = low1[*(const unsigned char *)b];
return (c1 > c2) - (c1 < c2);

Comments

8

I've found built-in such method named from which contains additional string functions to the standard header .

Here's the relevant signatures :

int  strcasecmp(const char *, const char *);
int  strncasecmp(const char *, const char *, size_t);

I also found it's synonym in xnu kernel (osfmk/device/subrs.c) and it's implemented in the following code, so you wouldn't expect to have any change of behavior in number compared to the original strcmp function.

tolower(unsigned char ch) {
    if (ch >= 'A' && ch <= 'Z')
        ch = 'a' + (ch - 'A');
    return ch;
 }

int strcasecmp(const char *s1, const char *s2) {
    const unsigned char *us1 = (const u_char *)s1,
                        *us2 = (const u_char *)s2;

    while (tolower(*us1) == tolower(*us2++))
        if (*us1++ == '\0')
            return (0);
    return (tolower(*us1) - tolower(*--us2));
}

3 Comments

Kudos for mentioning the safer strncasecmp() function!
strcasecmp() and strncasecmp() are not part of the standard C library, but common additions in *nix.
Note that there's no reason to implement your own tolower() function if you're compiling with a standards-compliant compiler/C implementation - tolower() is a required function per effectively every version of the C standard.
6

I would use stricmp(). It compares two strings without regard to case.

Note that, in some cases, converting the string to lower case can be faster.

Comments

4

As others have stated, there is no portable function that works on all systems. You can partially circumvent this with simple ifdef:

#include <stdio.h>

#ifdef _WIN32
#include <string.h>
#define strcasecmp _stricmp
#else // assuming POSIX or BSD compliant system
#include <strings.h>
#endif

int main() {
    printf("%d", strcasecmp("teSt", "TEst"));
}

2 Comments

this reminds me that strings.h (with an s), is not the same as string.h.... I've spent some time looking from strcasecmp on the wrong one....
@GustavoVargas Me too, then I decided to write it here and save time for the future myself and others :)
4

POSIX <strings.h> header file replacement for strcasecmp() and strncasecmp() in C

Update 25 July 2024:

My latest work on this is now here:

  1. stringslib.h
  2. stringslib.c
  3. stringslib_unittest.cpp

The above library contains my implementations of my_strcasecmp() and my_strncasecmp() and uses Gtest to test them directly against the POSIX functions strcasecmp() and strncasecmp() which are contained in the non-C-standard POSIX header file named strings.h.

To test and run:

  1. If in Linux, use your regular Bash terminal. If in Windows, use the MSYS2 terminal. See my instructions here: Installing & setting up MSYS2 from scratch, including adding all 7 profiles to Windows Terminal

  2. First, install Gtest by following my instructions here: How do I build and use googletest (gtest) and googlemock (gmock) with gcc/g++ or clang?

  3. Then, clone my repo and run the unit test file as a Bash script:

    # clone it
    git clone https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world.git
    
    # cd into the directory
    cd eRCaGuy_hello_world/c
    
    # build and run the unit test
    ./stringslib_unittest.cpp
    

    Example run and output:

    eRCaGuy_hello_world/c$ ./stringslib_unittest.cpp 
    Running main() from /home/gabriel/Downloads/Install_Files/gtest/googletest-1.14.0/googletest/src/gtest_main.cc
    [==========] Running 2 tests from 1 test suite.
    [----------] Global test environment set-up.
    [----------] 2 tests from stringslib
    [ RUN      ] stringslib.strncasecmp
    [       OK ] stringslib.strncasecmp (0 ms)
    [ RUN      ] stringslib.strcasecmp
    [       OK ] stringslib.strcasecmp (0 ms)
    [----------] 2 tests from stringslib (0 ms total)
    
    [----------] Global test environment tear-down
    [==========] 2 tests from 1 test suite ran. (0 ms total)
    [  PASSED  ] 2 tests.
    

strncmpci(), a direct, drop-in case-insensitive string comparison replacement for strncmp() and strcmp()

I'm not really a fan of the most-upvoted answer here (in part because it seems like it isn't correct since it should continue if it reads a null terminator in either string--but not both strings at once--and it doesn't do this), so I wrote my own.

This is a direct drop-in replacement for strncmp(), and has been tested with numerous test cases, as shown below.

It is identical to strncmp() except:

  1. It is case-insensitive.
  2. The behavior is NOT undefined (it is well-defined) if either string is a null ptr. Regular strncmp() has undefined behavior if either string is a null ptr (see: https://en.cppreference.com/w/cpp/string/byte/strncmp).
  3. It returns INT_MIN as a special sentinel error value if either input string is a NULL ptr.

LIMITATIONS: Note that this code works on the original 7-bit ASCII character set only (decimal values 0 to 127, inclusive), NOT on unicode characters, such as unicode character encodings UTF-8 (the most popular), UTF-16, and UTF-32.

Here is the code only (no comments):

int strncmpci(const char * str1, const char * str2, size_t num)
{
    int ret_code = 0;
    size_t chars_compared = 0;

    if (!str1 || !str2)
    {
        ret_code = INT_MIN;
        return ret_code;
    }

    while ((chars_compared < num) && (*str1 || *str2))
    {
        ret_code = tolower((int)(*str1)) - tolower((int)(*str2));
        if (ret_code != 0)
        {
            break;
        }
        chars_compared++;
        str1++;
        str2++;
    }

    return ret_code;
}

Fully-commented version:

/// \brief      Perform a case-insensitive string compare (`strncmp()` case-insensitive) to see
///             if two C-strings are equal.
/// \note       1. Identical to `strncmp()` except:
///               1. It is case-insensitive.
///               2. The behavior is NOT undefined (it is well-defined) if either string is a null
///               ptr. Regular `strncmp()` has undefined behavior if either string is a null ptr
///               (see: https://en.cppreference.com/w/cpp/string/byte/strncmp).
///               3. It returns `INT_MIN` as a special sentinel value for certain errors.
///             - Posted as an answer here: https://stackoverflow.com/a/55293507/4561887.
///               - Aided/inspired, in part, by `strcicmp()` here:
///                 https://stackoverflow.com/a/5820991/4561887.
/// \param[in]  str1        C string 1 to be compared.
/// \param[in]  str2        C string 2 to be compared.
/// \param[in]  num         max number of chars to compare
/// \return     A comparison code (identical to `strncmp()`, except with the addition
///             of `INT_MIN` as a special sentinel value):
///
///             INT_MIN (usually -2147483648 for int32_t integers)  Invalid arguments (one or both
///                      of the input strings is a NULL pointer).
///             <0       The first character that does not match has a lower value in str1 than
///                      in str2.
///              0       The contents of both strings are equal.
///             >0       The first character that does not match has a greater value in str1 than
///                      in str2.
int strncmpci(const char * str1, const char * str2, size_t num)
{
    int ret_code = 0;
    size_t chars_compared = 0;

    // Check for NULL pointers
    if (!str1 || !str2)
    {
        ret_code = INT_MIN;
        return ret_code;
    }

    // Continue doing case-insensitive comparisons, one-character-at-a-time, of `str1` to `str2`, so
    // long as 1st: we have not yet compared the requested number of chars, and 2nd: the next char
    // of at least *one* of the strings is not zero (the null terminator for a C-string), meaning
    // that string still has more characters in it.
    // Note: you MUST check `(chars_compared < num)` FIRST or else dereferencing (reading) `str1` or
    // `str2` via `*str1` and `*str2`, respectively, is undefined behavior if you are reading one or
    // both of these C-strings outside of their array bounds.
    while ((chars_compared < num) && (*str1 || *str2))
    {
        ret_code = tolower((int)(*str1)) - tolower((int)(*str2));
        if (ret_code != 0)
        {
            // The 2 chars just compared don't match
            break;
        }
        chars_compared++;
        str1++;
        str2++;
    }

    return ret_code;
}

Test code:

Download the entire sample code, with unit tests, from my eRCaGuy_hello_world repository here: "strncmpci.c":

(this is just a snippet)

int main()
{
    printf("-----------------------\n"
           "String Comparison Tests\n"
           "-----------------------\n\n");

    int num_failures_expected = 0;

    printf("INTENTIONAL UNIT TEST FAILURE to show what a unit test failure looks like!\n");
    EXPECT_EQUALS(strncmpci("hey", "HEY", 3), 'h' - 'H');
    num_failures_expected++;
    printf("------ beginning ------\n\n");


    const char * str1;
    const char * str2;
    size_t n;

    // NULL ptr checks
    EXPECT_EQUALS(strncmpci(NULL, "", 0), INT_MIN);
    EXPECT_EQUALS(strncmpci("", NULL, 0), INT_MIN);
    EXPECT_EQUALS(strncmpci(NULL, NULL, 0), INT_MIN);
    EXPECT_EQUALS(strncmpci(NULL, "", 10), INT_MIN);
    EXPECT_EQUALS(strncmpci("", NULL, 10), INT_MIN);
    EXPECT_EQUALS(strncmpci(NULL, NULL, 10), INT_MIN);

    EXPECT_EQUALS(strncmpci("", "", 0), 0);
    EXPECT_EQUALS(strncmp("", "", 0), 0);

    str1 = "";
    str2 = "";
    n = 0;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 0);

    str1 = "hey";
    str2 = "HEY";
    n = 0;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 0);

    str1 = "hey";
    str2 = "HEY";
    n = 3;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 'h' - 'H');

    str1 = "heY";
    str2 = "HeY";
    n = 3;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 'h' - 'H');

    str1 = "hey";
    str2 = "HEdY";
    n = 3;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 'y' - 'd');
    EXPECT_EQUALS(strncmp(str1, str2, n), 'h' - 'H');

    str1 = "heY";
    str2 = "hEYd";
    n = 3;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 'e' - 'E');

    str1 = "heY";
    str2 = "heyd";
    n = 6;
    EXPECT_EQUALS(strncmpci(str1, str2, n), -'d');
    EXPECT_EQUALS(strncmp(str1, str2, n), 'Y' - 'y');

    str1 = "hey";
    str2 = "hey";
    n = 6;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 0);

    str1 = "hey";
    str2 = "heyd";
    n = 6;
    EXPECT_EQUALS(strncmpci(str1, str2, n), -'d');
    EXPECT_EQUALS(strncmp(str1, str2, n), -'d');

    str1 = "hey";
    str2 = "heyd";
    n = 3;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 0);

    str1 = "hEY";
    str2 = "heyYOU";
    n = 3;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 0);
    EXPECT_EQUALS(strncmp(str1, str2, n), 'E' - 'e');

    str1 = "hEY";
    str2 = "heyYOU";
    n = 10;
    EXPECT_EQUALS(strncmpci(str1, str2, n), -'y');
    EXPECT_EQUALS(strncmp(str1, str2, n), 'E' - 'e');

    str1 = "hEYHowAre";
    str2 = "heyYOU";
    n = 10;
    EXPECT_EQUALS(strncmpci(str1, str2, n), 'h' - 'y');
    EXPECT_EQUALS(strncmp(str1, str2, n), 'E' - 'e');

    EXPECT_EQUALS(strncmpci("nice to meet you.,;", "NICE TO MEET YOU.,;", 100), 0);
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "NICE TO MEET YOU.,;", 100), 'n' - 'N');
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "nice to meet you.,;", 100), 0);

    EXPECT_EQUALS(strncmpci("nice to meet you.,;", "NICE TO UEET YOU.,;", 100), 'm' - 'u');
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "nice to uEET YOU.,;", 100), 'm' - 'u');
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "nice to UEET YOU.,;", 100), 'm' - 'U');

    EXPECT_EQUALS(strncmpci("nice to meet you.,;", "NICE TO MEET YOU.,;", 5), 0);
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "NICE TO MEET YOU.,;", 5), 'n' - 'N');

    EXPECT_EQUALS(strncmpci("nice to meet you.,;", "NICE eo UEET YOU.,;", 5), 0);
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "nice eo uEET YOU.,;", 5), 0);

    EXPECT_EQUALS(strncmpci("nice to meet you.,;", "NICE eo UEET YOU.,;", 100), 't' - 'e');
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "nice eo uEET YOU.,;", 100), 't' - 'e');

    EXPECT_EQUALS(strncmpci("nice to meet you.,;", "nice-eo UEET YOU.,;", 5), ' ' - '-');
    EXPECT_EQUALS(strncmp(  "nice to meet you.,;", "nice-eo UEET YOU.,;", 5), ' ' - '-');


    if (globals.error_count == num_failures_expected)
    {
        printf(ANSI_COLOR_GRN "All unit tests passed!" ANSI_COLOR_OFF "\n");
    }
    else
    {
        printf(ANSI_COLOR_RED "FAILED UNIT TESTS! NUMBER OF UNEXPECTED FAILURES = %i"
            ANSI_COLOR_OFF "\n", globals.error_count - num_failures_expected);
    }

    assert(globals.error_count == num_failures_expected);
    return globals.error_count;
}

Sample output:

$ gcc -Wall -Wextra -Werror -ggdb -std=c11 -o ./bin/tmp strncmpci.c && ./bin/tmp
-----------------------
String Comparison Tests
-----------------------

INTENTIONAL UNIT TEST FAILURE to show what a unit test failure looks like!
FAILED at line 250 in function main! strncmpci("hey", "HEY", 3) != 'h' - 'H'
  a: strncmpci("hey", "HEY", 3) is 0
  b: 'h' - 'H' is 32

------ beginning ------

All unit tests passed!

References:

  1. This question & other answers here served as inspiration and gave some insight (Case Insensitive String Comparison in C)
  2. http://www.cplusplus.com/reference/cstring/strncmp/
  3. https://en.wikipedia.org/wiki/ASCII
  4. https://en.cppreference.com/w/c/language/operator_precedence
  5. Undefined Behavior research I did to fix part of my code above (see comments below):
    1. Google search for "c undefined behavior reading outside array bounds"
    2. Is accessing a global array outside its bound undefined behavior?
    3. https://en.cppreference.com/w/cpp/language/ub - see also the many really great "External links" at the bottom!
    4. 1/3: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
    5. 2/3: https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html
    6. 3/3: https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html
    7. https://blog.regehr.org/archives/213
    8. https://www.geeksforgeeks.org/accessing-array-bounds-ccpp/

Topics to further research

  1. (Note: this is C++, not C) Lowercase of Unicode character
  2. tolower_tests.c on OnlineGDB: https://onlinegdb.com/HyZieXcew

TODO:

  1. Make a version of this code which also works on Unicode's UTF-8 implementation (character encoding)!

14 Comments

in part because it isn't correct since ... you code isn't correct either. There is no point to use tolower, it's going to be by far the slowest part of the function. If you really want your function to be locale aware and handle non-ascii chars then you have to cast your chars to unsigned first. Otherwise, your code results in UB
Voting down this solution - it advertizes to be a drop-in/tested solution, but a simple additional test using "" shows that it will not behave like the linux/windows version of it, returning strncmpci("", "", 0) = -9999 instead of 0
Hi @GaspardP, thanks for pointing out this edge case. I've fixed my code now. The fix was simple. I initialized ret_code to 0 instead of to INT_MIN (or -9999 as it was in the code you tested), and then set it to INT_MIN only if one of the input strings is a NULL ptr. Now it works perfectly. The problem was simply that for n is 0, none of the blocks were entered (neither the if nor the while), so it simply returned what I had initialized ret_code to. Anyway, it's fixed now, & I've cleaned up my unit tests a ton and added in the test you mentioned. Hopefully you upvote now.
Forming the address 1-past the range is OK. Dereferencing that address, as code does here with *str1, is UB. Code here is "using" it in that it attempting to read through that pointer. The UB is usually benign, yet remains UB. The whole point of a size parameter is to prevent access outside the array bounds - which this code violated. With num == 0, nothing should be read.
Posted. This is my first question on that site: codereview.stackexchange.com/questions/255344/….
|
2

You can get an idea, how to implement an efficient one, if you don't have any in the library, from here

It use a table for all 256 chars.

  • in that table for all chars, except letters - used its ascii codes.
  • for upper case letter codes - the table list codes of lower cased symbols.

then we just need to traverse a strings and compare our table cells for a given chars:

const char *cm = charmap,
        *us1 = (const char *)s1,
        *us2 = (const char *)s2;
while (cm[*us1] == cm[*us2++])
    if (*us1++ == '\0')
        return (0);
return (cm[*us1] - cm[*--us2]);

Comments

1

Simple solution:

int str_case_ins_cmp(const char* a, const char* b) {
  int rc;

  while (1) {
    rc = tolower((unsigned char)*a) - tolower((unsigned char)*b);
    if (rc || !*a) {
      break;
    }

    ++a;
    ++b;
  }

  return rc;
}

Comments

1

Aspect that need consideration:

Now, if it's just UK post codes (NW3 6SG), then well, |0x20 for A-Z and byte compare. But does the space matter? You might have some special rules.

And for prefix searches, one has to check the function used (e.g. stricmp) for what happens when one string is shorter than the other.

The sad reality is that there are MANY string comparisons that can be used, but the edge cases are complex.

TL;DR: just roll your own, depending on your needs.

Comments

0

if we have a null terminated character:

   bool striseq(const char* s1,const char* s2){ 
     for(;*s1;){ 
       if(tolower(*s1++)!=tolower(*s2++)) 
         return false; 
      } 
      return *s1 == *s2;
    }

or with this version that uses bitwise operations:

    int striseq(const char* s1,const char* s2)
       {for(;*s1;) if((*s1++|32)!=(*s2++|32)) return 0; return *s1 == *s2;}

i'm not sure if this works with symbols, I haven't tested there, but works fine with letters.

1 Comment

This sure seems like it would fail if s1 is longer than s2.
-1
int strcmpInsensitive(char* a, char* b)
{
    return strcmp(lowerCaseWord(a), lowerCaseWord(b));
}

char* lowerCaseWord(char* a)
{
    char *b=new char[strlen(a)];
    for (int i = 0; i < strlen(a); i++)
    {
        b[i] = tolower(a[i]);   
    }
    return b;
}

good luck

Edit-lowerCaseWord function get a char* variable with, and return the lower case value of this char*. For example "AbCdE" for value of char*, will return "abcde".

Basically what it does is to take the two char* variables, after being transferred to lower case, and make use the strcmp function on them.

For example- if we call the strcmpInsensitive function for values of "AbCdE", and "ABCDE", it will first return both values in lower case ("abcde"), and then do strcmp function on them.

5 Comments

some explanation could go a long way
It seems wholly inefficient to lower both input strings, when the function "might" return as soon as after the first character compare instead. e.g. "ABcDe" vs "BcdEF", could return very quickly, without needing to lower or upper anything other than the first character of each string.
Not to mention leaking memory twice.
You don't null-terminate your lower case strings, so the subsequent strcmp() might crash the program.
You also compute strlen(a) a total of strlen(a)+1 times. That together with the loop itself and you're traversing a strlen(a)+2 times.
-1
static int ignoreCaseComp (const char *str1, const char *str2, int length)
{
    int k;
    for (k = 0; k < length; k++)
    {

        if ((str1[k] | 32) != (str2[k] | 32))
            break;
    }

    if (k != length)
        return 1;
    return 0;
}

Reference

2 Comments

The ORing idea is kind of nifty, but the logic is flawed. For example, ignoreCaseComp("`", "@", 1) and perhaps more importantly, ignoreCaseComp("\0", " ", 1) (i.e. where all bits other than bit 5 (decimal 32) are identical) both evaluates to 0 (match).
Short version of the above comment: this code is broken

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.