0

I like "reinventing the wheel" for learning purposes, so I'm working on a container class for strings. Will using the NULL character as an array terminator (i.e., the last value in the array will be NULL) cause interference with the null-terminated strings?

I think it would only be an issue if an empty string is added, but I might be missing something.

EDIT: This is in C++.

6
  • What do you mean by "interference"? Can you post any code to show what you are describing? Commented Dec 2, 2010 at 18:26
  • Is this C or C++? What type are the strings you're storing? Commented Dec 2, 2010 at 18:26
  • @Sht C++. Sorry I didn't make that clear. @SpeksETC I mean that since strings are null-terminated (I think?), an empty string would have the same value as the "end of array" value. Commented Dec 2, 2010 at 18:27
  • You won't be able to have embedded nulls in your string, which are useful for non-ASCII character sets and when using a string to process binary protocols, for example. Commented Dec 2, 2010 at 18:31
  • Are you talking about double-NULL terminated strings? Where you have a set of NULL-terminated strings all concatenated with NULL character separators, and a final extra NULL to terminate the whole array? You want to look at this: blogs.msdn.com/b/oldnewthing/archive/2009/10/08/9904646.aspx Commented Dec 2, 2010 at 18:33

7 Answers 7

3

"" is the empty string in C and C++, not NULL. Note that "" has exactly one element (instead of zero), meaning it is equivalent to {'\0'} as an array of char.

char const *notastring = NULL;
char const *emptystring = "";

emptystring[0] == '\0';  // true
notastring[0] == '\0';   // undefined behavior: dereferences a null pointer (usually a crash)

4 Comments

But strings are null-terminated, so wouldn't an empty string have a value of "" + NULL, and therefore just a value of NULL?
NULL may or may not be the same value as NUL.
There's no such thing as "" + NULL. That's adding an array and a pointer (it might work, but it's not meaningful). Since "" has one element, its address can't be NULL; the element has to be stored somewhere.
Interestingly, it is possible for "" + NULL to mean something, since NULL is not guaranteed to be 0, but it certainly won't mean anything you'd want it to.
3

No, it won't, because you won't be storing in an array of char, you'll be storing in an array of char*.

char const* strings[] = {
  "WTF"
, "Am"
, "I"
, "Using"
, "Char"
, "Arrays?!"
, 0
};
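To make this concrete, here is a minimal sketch of walking such an array. The helper name count_strings is hypothetical, and nullptr assumes C++11 or later (the literal 0 used above works in older code). The scan stops only at the null pointer, so an empty string in the middle is counted like any other entry.

```cpp
#include <cassert>
#include <cstddef>

// Counts the entries in an array of C strings that is terminated by a
// null pointer. An empty string ("") is a valid, non-null entry, so it
// does not end the scan.
std::size_t count_strings(char const* const* strings) {
    assert(strings != nullptr);  // the array itself must exist
    std::size_t n = 0;
    while (strings[n] != nullptr) {
        ++n;
    }
    return n;
}
```

For an array such as {"Hello", "", "World", nullptr}, this counts three entries; the second one is a pointer to a single '\0' character, not a null pointer.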


2

It depends on what kind of string you're storing.

If you're storing C-style strings, which are basically just pointers to character arrays (char*), there's a difference between a NULL pointer value, and an empty string. The former means the pointer is ‘empty’, the latter means the pointer points to an array that contains a single item with character value 0 ('\0'). So the pointer still has a value, and testing it (if (foo[3])) will work as expected.

If what you're storing are C++ standard library strings of type std::string, then there is no NULL value. That's because there is no pointer involved: a std::string is treated as a single value that always holds a (possibly empty) string, whereas a pointer only refers to a value stored elsewhere and may refer to nothing at all.


2

I think you are confused. While C-strings are "null terminated", there is no "NULL" character. NULL is a name for a null pointer. The terminator for a C-string is a null character, i.e. a byte with a value of zero. In ASCII, this byte is (somewhat confusingly) named NUL.

Suppose your class contains an array of char that is used to store the string data. You do not need to "mark the end of the array"; the array has a specific size that is set at compile-time. You do need to know how much of that space is actually being used; the null-terminator on the string data accomplishes that for you - but you can get better performance by actually remembering the length. Also, a "string" class with a statically-sized char buffer is not very useful at all, because that buffer size is an upper limit on the length of strings you can have.

So a better string class would contain a pointer of type char*, which points to a dynamically allocated (via new[]) array of chars. Again, it makes no sense to "mark the end of the array", but you will want to remember both the length of the string (i.e. the amount of space being used) and the size of the allocation (i.e. the amount of space that may be used before you have to re-allocate).
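A rough sketch of that layout might look like the following. The class name SimpleString, the doubling growth policy, and the decision to keep a trailing '\0' for c_str() are all assumptions made for illustration, and the deleted copy operations assume C++11. It is not meant as a replacement for std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal string class: tracks the length in use and the
// size of the allocation separately, instead of searching for a marker.
class SimpleString {
public:
    explicit SimpleString(char const* text = "") {
        assert(text != nullptr);  // a null pointer is not an empty string
        length_ = std::strlen(text);
        capacity_ = length_ + 1;  // room for the characters plus '\0'
        data_ = new char[capacity_];
        std::memcpy(data_, text, length_ + 1);  // copy includes the '\0'
    }

    // Copying is disabled to keep the sketch short.
    SimpleString(SimpleString const&) = delete;
    SimpleString& operator=(SimpleString const&) = delete;

    ~SimpleString() { delete[] data_; }

    void append(char c) {
        if (length_ + 2 > capacity_) {  // need room for c and the '\0'
            std::size_t new_capacity = capacity_ * 2;
            char* bigger = new char[new_capacity];
            std::memcpy(bigger, data_, length_);
            delete[] data_;
            data_ = bigger;
            capacity_ = new_capacity;
        }
        data_[length_++] = c;
        data_[length_] = '\0';  // keep c_str() usable
    }

    std::size_t length() const { return length_; }      // space in use
    std::size_t capacity() const { return capacity_; }  // space allocated
    char const* c_str() const { return data_; }

private:
    std::size_t length_;
    std::size_t capacity_;
    char* data_;
};
```

Because length() is stored, asking for it never has to scan for the terminator, and append() only reallocates when the stored capacity is exhausted.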

1 Comment

Indeed, I was confused. Thank you for clarifying.
1

When you are copying from std::string, use the iterators begin() and end(), and you don't have to worry about the null terminator. Before C++11, the terminator is only guaranteed to be present if you call c_str() (in which case the block of memory it points to ends with a '\0' that terminates the string). If you want to memcpy, use the data() method together with size().
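As a small sketch of both approaches, assuming hypothetical helper names copy_chars and copy_bytes: the first copies through the iterator range, the second copies size() bytes from data(). Neither looks for a terminator, so a std::string that contains embedded '\0' characters is copied in full.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies exactly the characters in [begin(), end()); no terminator is
// read or written.
std::vector<char> copy_chars(std::string const& s) {
    return std::vector<char>(s.begin(), s.end());
}

// Copies s.size() bytes from s.data() into a caller-supplied buffer.
// The buffer is not null-terminated by this function.
void copy_bytes(std::string const& s, char* out, std::size_t out_size) {
    assert(out_size >= s.size());  // caller must provide enough room
    if (!s.empty()) {
        std::memcpy(out, s.data(), s.size());
    }
}
```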


0

Why don't you follow the pattern used by vector - store the number of elements within your container class, then you know always how many values there are in it:

vector<string> myVector;

size_t elements(myVector.size());

Instantiating a string with x where const char* x = 0; can be problematic. See this code in Visual C++ STL that gets called when you do this:

_Myt& assign(const _Elem *_Ptr)
    {   // assign [_Ptr, <null>)
    _DEBUG_POINTER(_Ptr);
    return (assign(_Ptr, _Traits::length(_Ptr)));
    }

static size_t __CLRCALL_OR_CDECL length(const _Elem *_First)
    {   // find length of null-terminated string
    return (_CSTD strlen(_First));
    }
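Since the code above passes the pointer straight to strlen, a container that accepts const char* probably wants to decide what a null pointer means before a std::string is ever built from it. The two helpers below are hypothetical names sketching two possible policies; nullptr assumes C++11.

```cpp
#include <cassert>
#include <string>

// Policy 1: treat a null pointer as "no text" and store an empty string.
std::string to_string_or_empty(char const* text) {
    return text ? std::string(text) : std::string();
}

// Policy 2: treat a null pointer as a caller bug and fail loudly in
// debug builds (assert is compiled out when NDEBUG is defined).
std::string to_string_checked(char const* text) {
    assert(text != nullptr);
    return std::string(text);
}
```

Either way, the null check happens in one place, and the rest of the container only ever deals with valid std::string values.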

5 Comments

That does not preclude you from keeping track of the current valid element count. vector uses contiguous (i.e. array) storage under the covers too.
The OP is not using a vector because it would conflict with the stated preference to re-invent the wheel for learning purposes.
@Thomas - I'm not saying use a vector, I'm saying design the container class so that no artificial 'terminator' construct is required to track its element count.
@Steve Ahh I understand. Maxpm, as part of your learning, consider the way std::vector does it.
@Thomas - I am still checking but almost sure string str(0); is a serious problem on Visual C++. A container class that requires "last element" marker will be hard to generalize, and that should be OP's next project imo.
0
#include "Maxpm_crafts_fine_wheels.h"
MaxpmContainer maxpm;
maxpm.add("Hello");
maxpm.add(""); // uh oh, adding an empty string; should I worry?
maxpm.add(0);

At this point, as a user of MaxpmContainer who had not read your documentation, I would expect the following:

strcmp(maxpm[0],"Hello") == 0;
*maxpm[1] == 0;
maxpm[2] == 0;

Interference between the zero terminator at position two and the empty string at position one is avoided by means of the "interpret this as a memory address" operator *. Position one will not be zero; it will be a non-null pointer which, if you interpret it as a memory address and look at what is stored there, will turn out to hold zero. Position two will be zero, which, if you interpret it as a memory address, will turn out to be an abrupt disorderly exit from your program.

