1

I'm a student taking a class on Data Structures in C++ this semester and I came across something that I don't quite understand tonight. Say I were to create a pointer to an array on the heap:

int* arrayPtr = new int [4];

I can access this array using pointer syntax

int value = *(arrayPtr + index);

But if I were to add another value to the memory position immediately after the end of the space allocated for the array, I would then be able to access it

*(arrayPtr + 4) = 0;
int nextPos = *(arrayPtr + 4);
//the value of nextPos will be 0, or whatever value I previously filled that space with

The position in memory of *(arrayPtr + 4) is past the end of the space allocated for the array. But as far as I understand, the above still would not cause any problems. So aside from it being a requirement of C++, why even give arrays a specific size when declaring them?

3
  • 3
    "the above still would not cause any problems" Why not? If there is something else in that memory region, you might get into great trouble... Commented Oct 1, 2015 at 4:00
  • These things lead to what is known as "undefined behavior". They work sometimes, but the catch is that the probability of the application crashing at an unforeseen time is very high. You are likely to run into memory corruption or a signal such as a segmentation fault (SIGSEGV). Commented Oct 1, 2015 at 4:04
  • 6
    If you build a garden shed that overlaps your neighbor's land he might not notice at first, but he's going to be pretty angry when he finds out you've crushed his petunias. Commented Oct 1, 2015 at 4:05

5 Answers

6

When you go past the end of allocated memory, you are actually accessing the memory of some other object (or memory that is free right now, but that could change later). So it will cause you problems, especially if you try to write something to it.

1 Comment

Okay, I guess I didn't think of the possibility of it being overwritten. I'm only a few weeks into this class and never really thought much about memory management before so I'm still kinda new to this stuff. Thanks for clarifying
3

I can access this array using pointer syntax

int value = *(arrayPtr + index);

Yeah, but don't. Use arrayPtr[index], which means exactly the same thing and is easier to read.

The position in memory of *(arrayPtr + 4) is past the end of the space allocated for the array. But as far as I understand, the above still would not cause any problems.

You understand wrong. Oh so very wrong. You're invoking undefined behavior, and undefined behavior is undefined. It may work for a week, then break one day next week and you'll be left wondering why. If you don't know the collection size in advance, use something dynamic like a std::vector instead of an array.
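As a minimal sketch of that suggestion, the snippet below uses std::vector, whose at() member checks the index and whose push_back() grows the storage. The helper names read_is_rejected and make_and_grow are made up for illustration and are not part of the question.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: true if a checked read of values[index] is rejected.
bool read_is_rejected(const std::vector<int>& values, std::size_t index) {
    try {
        static_cast<void>(values.at(index));  // at() compares index with size()
        return false;
    } catch (const std::out_of_range&) {
        return true;                          // index was past the end
    }
}

// Hypothetical helper: starts with `count` zeros, then appends one more value.
std::vector<int> make_and_grow(std::size_t count) {
    std::vector<int> values(count);  // `count` ints, all initialized to 0
    values.push_back(42);            // the vector reallocates if it needs to
    assert(values.size() == count + 1);
    return values;
}
```

With a four-element vector, at(4) throws std::out_of_range instead of silently touching someone else's memory, and push_back is the supported way to get a fifth element.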

3

Yes, in C/C++ you can access memory outside of the space you claim to have allocated. Sometimes. This is what is referred to as undefined behavior.

Basically, you have told the compiler and the memory management system that you want space to store four integers, and the memory management system allocated space for you to store four integers. It gave you a pointer to that space. In the memory manager's internal accounting, those bytes of RAM are now occupied until you call delete[] arrayPtr;.
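A rough sketch of that allocate, use, release cycle follows. The function names are hypothetical, and the second version assumes you are free to use std::unique_ptr (C++11 or later) so that delete[] cannot be forgotten.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: stores the squares 0, 1, 4, ... in a heap array and sums them.
int sum_of_squares_raw(std::size_t count) {
    assert(count > 0);                         // this sketch assumes a non-empty array
    int* arrayPtr = new int[count];            // the allocator now treats these ints as in use
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {  // valid indices are 0 .. count - 1
        arrayPtr[i] = static_cast<int>(i * i);
        total += arrayPtr[i];
    }
    delete[] arrayPtr;                         // array form, matching new[]
    return total;
}

// Same computation, but std::unique_ptr<int[]> calls delete[] automatically.
int sum_of_squares_owned(std::size_t count) {
    assert(count > 0);
    std::unique_ptr<int[]> values(new int[count]);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i * i);
        total += values[i];
    }
    return total;                              // storage is released when values goes out of scope
}
```

In both versions the loop stops at count - 1, which is the only range the allocator actually handed out.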

However, the memory manager has not allocated that next byte for you. You don't have any way of knowing, in general, what that next byte is, or who it belongs to.

In a simple program like your example, which just allocates a few bytes and doesn't allocate anything else, chances are that the next byte belongs to your program and isn't occupied. If that array is the only dynamically allocated memory in your program, then running over the end will probably appear to work, even though it is still undefined behavior.

But in a more complex program, with multiple dynamic memory allocations and deallocations, especially near the edges of memory pages, you really have no good way of knowing what any bytes outside of the memory you asked for contain. So when you write to bytes outside of the memory you asked for with new, you could be writing to basically anything.

This is where undefined behavior comes in. Because you don't know what's in that space you wrote to, you don't know what will happen as a result. Here are some examples of things that could happen:

  • The memory was not allocated when you wrote to it. In that case, the data is fine, and nothing bad seems to happen. However, if a later memory allocation uses that space, anything you tried to put there will be lost.

  • The memory was allocated when you wrote to it. In that case, congratulations, you just overwrote some random bytes from some other data structure somewhere else in your program. Imagine replacing a variable somewhere in one of your objects with random data, and consider what that would mean for your program. Maybe a list somewhere else now has the wrong count. Maybe a string now has some random values for the first few characters, or is now empty because you replaced those characters with zeroes.

  • The array was allocated at the edge of a page, so the next bytes don't belong to your program. The address is outside your program's allocation. In this case, the OS detects you accessing random memory that isn't yours, and terminates your program immediately with SIGSEGV.

Basically, undefined behavior means that you are doing something illegal, but because C/C++ is designed to be fast, the language designers don't include an explicit check to make sure you don't break the rules, as other languages (e.g. Java, C#) do. They just list the behavior of breaking the rules as undefined, and then the people who make the compilers can have the output be simpler, faster code, since no array bounds checks are made, and if you break the rules, it's your own problem.
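To make the trade-off concrete, here is a sketch of the kind of check that plain array indexing leaves out. Both helpers are hypothetical and written only for illustration; the first always pays for the comparison, the second pays only in debug builds because assert is compiled out when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: always checks the index before reading.
int checked_read(const int* data, std::size_t size, std::size_t index) {
    if (index >= size) {
        throw std::out_of_range("checked_read: index is past the end");
    }
    return data[index];
}

// Hypothetical helper: checks only in debug builds (assert vanishes under NDEBUG).
int debug_checked_read(const int* data, std::size_t size, std::size_t index) {
    assert(index < size);
    return data[index];
}
```

Note that both helpers need the size passed in separately, because a raw pointer from new[] does not carry its length. Tools such as AddressSanitizer (-fsanitize=address in GCC and Clang) can also catch many out-of-bounds accesses at run time without changing the code.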

So yes, this sometimes works, but don't ever rely on it.

0

It would not cause any problems in a purely abstract setting, where you only worry about whether the logic of the algorithm is sound. In that case there's no reason to declare the size of an array at all. However, your computer exists in the physical world, and only has a limited amount of memory. When you're allocating memory, you're asking the operating system to let you use some of the computer's finite memory. If you go beyond that, the operating system should stop you, usually by killing your process/program.

-2

Yes, you must write it as arrayptr[index] because the position in memory of *(arrayptr + 4) is past the end of the space which you have allocated for the array. It's a flaw in C++ that the array size can't be extended once allocated.

1 Comment

"Its the flaw in C++ that the array size cant be extended once allocated." - nonsense... it's easy in C++ to create an array-like container that "auto-extends" to whatever indices are actually used because C++ supports overloading of operator[], and - in a slightly less transparent but also less error prone fashion - vector::push_back and ::emplace support that kind of thing, while reserve and resize are more explicit. ::at allows checked access. That's pretty comprehensive. And all that's got nothing to do with the choice of arrayptr[index] vs. *(arrayptr + 4) anyway.

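The auto-extending container described in the comment above could be sketched roughly as follows. AutoArray is a made-up class for illustration, not a standard type, and it assumes that growing on every out-of-range write is really what you want.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class for illustration; not part of the standard library.
class AutoArray {
public:
    // Non-const access grows the storage so that `index` is always valid.
    int& operator[](std::size_t index) {
        if (index >= data_.size()) {
            data_.resize(index + 1);  // new elements are value-initialized to 0
        }
        return data_[index];
    }

    // Const access cannot grow, so it only checks the index in debug builds.
    int operator[](std::size_t index) const {
        assert(index < data_.size());
        return data_[index];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};
```

Writing a[4] = 7 on an empty AutoArray leaves it with five elements, the first four being 0. A reference returned by operator[] can be invalidated by a later access that grows the array, which is one reason std::vector keeps growth explicit.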