14

This question comes from C. Let's imagine we have a 32-bit pointer:

char* charPointer;

It points to some place in memory that contains some data. It knows that incrementing this pointer moves it by 1 byte, and so on. On the other hand,

int* intPointer;

also points to some place in memory, and it knows that adding 1 to it should move it up by 4 bytes.

The question is: how are we able to address the full 32 bits of addressable space (2^32 bytes, i.e. 4 gigabytes) with those pointers, if they obviously contain some information that distinguishes one kind from another, for example char* from int*? That would leave us with not 32 bits for the address, but fewer.

While typing this question I started thinking: maybe it is all syntactic sugar that really only matters to the compiler? Maybe a raw pointer is just 32 bits and doesn't care about the type? Is that the case?

1 Comment
Yes. The compiler applies pointer math appropriate for the variable type, but this type is not encoded in the pointer variable itself. Commented Nov 16, 2012 at 11:53

6 Answers

18

You might be confused by compile time versus run time.

During compilation, gcc (or any C compiler) knows the type of a pointer, and in particular the type of the data pointed to by that pointer variable. So gcc can emit the right machine code: an increment of an int * variable (on a 32-bit machine with a 32-bit int) is translated to an increment of 4 (bytes), while an increment of a char * variable is translated to an increment of 1.

At runtime, the compiled executable (which neither knows about nor needs gcc) deals only with machine pointers, usually addresses of bytes (or of the start of some word).

Types (in C programs) are not known during runtime.

Some other languages (Lisp, Python, Javascript, ....) require the types to be known at runtime. In recent C++ (but not C) some objects (those having virtual functions) may have RTTI.
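To make the compile time versus run time split concrete, here is a minimal sketch. The helper names `advance_by_elements` and `check_scaling` are hypothetical, chosen only for illustration. The helper does by hand what the compiler does for `p + 1`: it needs the element size passed in explicitly, because the address it receives carries no type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: move an untyped address forward by `count` elements.
   The element size is an explicit argument because the address itself
   holds no record of what it points to. */
void *advance_by_elements(void *base, size_t count, size_t elem_size)
{
    /* char is 1 byte by definition, so this is plain byte arithmetic. */
    return (char *)base + count * elem_size;
}

/* Self-check: the compiler's scaled `p + 1` matches the manual byte math. */
void check_scaling(void)
{
    int values[2] = {10, 20};
    int *p = values;

    /* For p + 1 the compiler fills in sizeof(int) from the static type. */
    assert((void *)(p + 1) == advance_by_elements(p, 1, sizeof(int)));
    assert(*(int *)advance_by_elements(p, 1, sizeof(int)) == 20);
}
```

Calling `check_scaling()` from a `main` should pass silently; the only place the type appears is in the `sizeof(int)` that is resolved at compile time.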


13

It is indeed syntactic sugar. Consider the following code fragment:

int t[2];
int a = t[1];

The second line is equivalent to:

int a = *(t + 1); // pointer addition

which itself is equivalent to:

int a = *(int*)((char*)t + 1 * sizeof(int)); // integer addition

After the compiler has checked the types it drops the casts and works only with addresses, lengths and integer addition.
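The three forms above can be checked against each other in a small sketch. The function name `check_equivalent_forms` and the array contents are assumptions made for illustration; unlike the fragment above, the array is initialized so that reading it is well defined.

```c
#include <assert.h>

/* Reads t[1] in the three forms discussed above and checks they agree. */
void check_equivalent_forms(void)
{
    int t[2] = {7, 42};

    int a = t[1];                                   /* array subscript */
    int b = *(t + 1);                               /* pointer addition */
    int c = *(int *)((char *)t + 1 * sizeof(int));  /* byte-address addition */

    assert(a == 42 && b == 42 && c == 42);

    /* The addresses themselves agree as well. */
    assert(&t[1] == t + 1);
    assert((char *)&t[1] == (char *)t + sizeof(int));
}
```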

4 Comments

I won't call it syntactic sugar; it is somehow "semantic", so it is not really sugar anymore. In contrast, the for notation is syntactic sugar: you can rewrite it to equivalent C.
You could write the last line directly. Pointer arithmetic and array access are just abbreviations that make life easier, i.e. syntactic sugar. They don't give the language additional expressive power.
Did you mean char t[2]; in first one? Because they are all ints now.
@Dvole No, the whole point is that the type here is int, not char.
3

Yes. A raw pointer is 32 bits of data (or 16 or 64 bits, depending on the architecture) and does not contain anything else. Whether it's int *, char *, or struct sockaddr_in * is just information for the compiler, so that it knows what number to actually add when incrementing and what type the result has when you dereference it.
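As a rough check of this, the sketch below stores the same address in an int * and a char * and compares the raw numbers. The function name `check_same_address` is hypothetical. The equalities hold on mainstream flat-memory platforms; the C standard does not strictly guarantee that all pointer types have the same size or integer representation, so treat this as an assumption about typical hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Same address, two static types: the stored number is identical. */
void check_same_address(void)
{
    int value = 0;
    int *ip = &value;
    char *cp = (char *)&value;

    /* Converted to an integer, both pointers give the same address
       (assumed: a typical flat-memory platform). */
    assert((uintptr_t)(void *)ip == (uintptr_t)(void *)cp);

    /* No extra bits are set aside for a type tag
       (true on mainstream platforms, not required by the standard). */
    assert(sizeof ip == sizeof cp);
}
```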

3

Your hypothesis is correct: to see how different kinds of pointer are handled, try running this program:

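#include <stdio.h>

int main(void)
{
    /* Null pointers make the offsets easy to read; arithmetic on them is
       formally undefined, but it behaves as shown on typical platforms. */
    char *pc = 0;
    int *pi = 0;

    /* %p expects a void *, so cast explicitly. */
    printf("%p\n", (void *)(pc + 1));
    printf("%p\n", (void *)(pi + 1));

    return 0;
}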

You will note that adding one to a char* increased its numeric value by 1, while doing the same to the int* increased it by 4 (which is the size of an int on my machine).

2 Comments

How is this relevant to the question?
It's a demonstration that a pointer does not carry internal information about the type it refers to: printing the actual value the pointer contains shows that the difference in size between int and char is "taken care of" at compile time.
2

It's exactly as you say at the end: types in C are just a compile-time concept that tells the compiler how to generate the code for the various operations you can perform on variables.

In the end, pointers just boil down to the address they point to; the semantic information no longer exists once the code is compiled.

1

Incrementing an int* pointer is different from incrementing a char* solely because the pointer variable is declared as int*. You can cast an int* to char*, and then it will increment by 1 byte.

So, yes, it is all just syntactic sugar. It makes some kinds of array processing easier and confuses void* users.
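The cast described above can be sketched as follows. The names `step_in_bytes_int` and `check_cast_changes_step` are hypothetical, used only for this illustration. The same address is viewed through two static types, and only the declared type decides how far `+ 1` moves.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes between p and p + 1, measured through a char * view. */
ptrdiff_t step_in_bytes_int(int *p)
{
    return (char *)(p + 1) - (char *)p;
}

void check_cast_changes_step(void)
{
    int values[2] = {0, 0};
    int *ip = values;
    char *cp = (char *)ip;   /* same address, different static type */

    /* As int *, one step covers a whole int. */
    assert(step_in_bytes_int(ip) == (ptrdiff_t)sizeof(int));

    /* As char *, one step covers a single byte. */
    assert((cp + 1) - cp == 1);

    /* sizeof(int) single-byte steps land where one int step lands. */
    assert((void *)(cp + sizeof(int)) == (void *)(ip + 1));
}
```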
