I was thinking about the two-register timer interview question, which goes as follows:
There is a memory-mapped hardware timer whose value is stored in two registers: one holds the most significant 32 bits, the other holds the least significant 32 bits. They are at 0x1004 and 0x1000 respectively. Read the timer.
The main idea (as far as I've seen) is to show consideration for the fact that the lower register can roll over into the upper one while you read, so you have to make sure the upper 32 bits haven't changed while you read the lower 32 bits.
One of the other things that I've been told to look out for is declaring these types of variables as volatile; otherwise the compiler can optimize things away and the result will not be what I expect.
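For reference, the usual answer to the question can be sketched as below. This is only a minimal sketch: the name read_timer64 is made up for illustration, and the registers are passed in as pointers (rather than hard-coded as 0x1000 and 0x1004) so that it can be tried on ordinary memory. It also assumes the timer only counts upward and that each 32-bit read is a single bus access.

```c
#include <stdint.h>

/* Hypothetical helper: reads a 64-bit counter exposed as two 32-bit
 * registers. On the real device the arguments would be
 * (volatile uint32_t *)0x1000 and (volatile uint32_t *)0x1004. */
static uint64_t read_timer64(const volatile uint32_t *lo_reg,
                             const volatile uint32_t *hi_reg)
{
    uint32_t hi, lo, hi_again;

    do {
        hi = *hi_reg;         /* first read of the upper word */
        lo = *lo_reg;         /* lower word; it may roll over around now */
        hi_again = *hi_reg;   /* re-read the upper word */
    } while (hi != hi_again); /* a carry happened in between: retry */

    return ((uint64_t)hi << 32) | lo;
}
```

The volatile qualifier is what forces the compiler to perform all three reads, in this order, on every pass through the loop.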
I wanted to see how the compiler would optimize this, so I wrote the following code:
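#include <stdint.h>

int main(void) {
    /* Deliberately not volatile, to see what the optimizer does. */
    uint32_t *ap = (uint32_t *)0x1000;  /* least significant 32 bits */
    uint32_t *bp = (uint32_t *)0x1004;  /* most significant 32 bits */
    uint64_t temp = ((uint64_t)(*bp) << 32) | (*ap);
    if (temp > 0x2000) {
        return 0;
    }
    return 1;
}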
I expected (from vague compiler lore I've heard) the compiler to optimize this into a single 64-bit read. But no matter which optimization level I use with gcc, I can't get it to happen. I've also tried using 16-bit "registers", but the compiler still does two separate reads.
My questions are:
- Is what I've been told/what I've gathered wrong?
- Will the compiler ever combine these separate reads into one?
- If so, how can I get it to do it?
- Bonus: I've also heard of possible bus faults in this scenario without volatile.
  1. Can that actually happen?
  2. If it can, would that be because a bus can only do, say, 32-bit reads, and the compiler may ask for a 64-bit read on that bus which it can't provide?
*ap and *bp are adjacent. If you make them adjacent in a way that's built into the language, such as making them a struct, it can happen. (I also tried making them adjacent entries of an array, but that didn't work for gcc or clang.)
*ap and *bp, but in the wrong order.
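The struct variant mentioned above might look like the sketch below. The names timer_regs and combine are made up for illustration, and the layout (low word first, high word at offset 4) is an assumption taken from the 0x1000 / 0x1004 addresses in the question. Whether the two loads are actually merged depends on the compiler, the optimization level and the target, so treat it as something to inspect in the generated assembly rather than a guarantee.

```c
#include <stdint.h>

/* Hypothetical layout: low word at offset 0, high word at offset 4. */
struct timer_regs {
    uint32_t lo;
    uint32_t hi;
};

/* No volatile here on purpose: the fields are known to be adjacent and
 * the accesses are ordinary loads, so the optimizer is allowed (but not
 * required) to fetch both with one 64-bit load on a little-endian target. */
static uint64_t combine(const struct timer_regs *t)
{
    return ((uint64_t)t->hi << 32) | t->lo;
}
```

For the real timer this merged access is exactly what you want to avoid, which is one more reason to keep the register accesses volatile.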