
The book "Computer Architecture", by Hennessy/Patterson, 6th ed, on page 394, includes an example with true sharing and false sharing misses with 2 processors.

here is the example from the book

It says that initially (before time stamp #1), "Assume that words z1 and z2 are in the same cache block, which is in the shared state in the caches of both P1 and P2."

I assume the underlying coherence protocol for this example is MSI.

My question: why do we have a true sharing miss at time stamp #1, and not a true sharing hit?

More generally, why does moving from Shared to Modified in MSI require a write miss for processor P1 (which wants to write to z1), and not just a hit plus an invalidation on the bus (to invalidate all other possible copies of z1's cache line in other processors)?

I don't see a reason to have a write miss and flush this cache line (holding z1 and z2) to main memory if the line is already in the Shared state (meaning the cache and main memory hold the same value).


2 Answers

1

You can't change the data of a shareable cache line before you know the copies in other caches have been invalidated. Unless you have silent-store permission (the line is in the M or E state), you must make sure the other copies are invalidated before changing the cache line's data.

The reason is simple. If you implement "hit + send invalidation" as you described, consider a scenario like this:

Suppose the cache line at address 0x80_000_100 holds the value 128'd100 and is in the Shared state in two cores, CoreA and CoreB. Assume the cache line is 128 bits wide.

In the same clock cycle, CoreA stores 32'd50 to the first word of the cache line, and CoreB stores 32'd200 to the fourth word. Each core performs its store locally and then sends an invalidation on the bus.

What happens? Once each invalidation reaches the other core, no cache holds a valid copy of the line, and neither modified copy was written back. The next time any core reads this address, it gets stale data from main memory.

So, if the line isn't in a state with silent-store permission (M or E), you should first get an acknowledgement from the bus that your invalidation is done, and only then change the value of the cache line. That guarantees this kind of error can't occur.
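The scenario above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: one 4-word line, two caches, and an idealized bus that serializes requests. The names `Cache`, `load_line`, `broken_simultaneous_stores`, and `ordered_store` are hypothetical and do not come from any real simulator.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """One core's copy of a single 128-bit line (4 x 32-bit words)."""
    name: str
    state: str = "S"  # MSI state: "M", "S" or "I"
    data: list = field(default_factory=lambda: [100, 0, 0, 0])


def load_line(cache, others, memory):
    """Read the line from our own valid copy, else an M copy elsewhere, else memory."""
    if cache.state != "I":
        return list(cache.data)
    for other in others:
        if other.state == "M":
            return list(other.data)
    return list(memory)


def broken_simultaneous_stores(a, b):
    """Hypothetical broken scheme: write first, invalidate afterwards."""
    a.data[0] = 50    # CoreA commits its store locally
    b.data[3] = 200   # CoreB commits its store in the same cycle
    # The two invalidations cross on the bus and each core obeys the other's.
    a.state = "I"
    b.state = "I"


def ordered_store(writer, other, memory, word, value):
    """Assumed correct ordering: gain ownership first, then commit the store."""
    if writer.state == "I":
        # No valid copy: fetch the current line from the owner or from memory.
        writer.data = list(other.data) if other.state == "M" else list(memory)
    other.state = "I"      # the other core invalidates and acknowledges
    writer.state = "M"     # only now do we own the line exclusively
    writer.data[word] = value


memory = [100, 0, 0, 0]

a, b = Cache("CoreA"), Cache("CoreB")
broken_simultaneous_stores(a, b)
lost = load_line(a, [b], memory)      # both stores are gone

a, b = Cache("CoreA"), Cache("CoreB")
ordered_store(a, b, memory, 0, 50)    # the bus grants CoreA first
ordered_store(b, a, memory, 3, 200)   # CoreB then obtains the updated line
kept = load_line(a, [b], memory)      # both stores are visible
```

In the broken variant both caches end up Invalid and a later read falls back to the unchanged memory copy. In the ordered variant the second writer picks up the first writer's data before modifying the line, so neither store is lost.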


0

A line in Shared state isn't writeable because we don't have exclusive ownership. (Exclusive ownership means Modified state, or Exclusive state in MESI, which adds the E state for lines that are unmodified but exclusively owned and can be flipped to M state without communicating with other cores.)

A store to that line can't immediately commit to cache, so it's a cache miss because the core (or store buffer) has to wait, exactly as if the line were in Invalid state.

If we have a valid Shared copy, we can just send out an invalidate and wait for responses instead of a Read For Ownership (RFO), so we don't need another core or memory controller to send us a new copy of the line. But we still have to wait for an acknowledgement that all other caches have invalidated their copies.
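As a rough sketch of the point above, the function below classifies a store by the line's current state. It is a simplification that ignores races and arbitration; the name `classify_store` and the `StoreOutcome` fields are hypothetical, and the message names "Invalidate" and "RFO" simply follow the terminology used in this answer.

```python
from typing import NamedTuple, Optional


class StoreOutcome(NamedTuple):
    hit: bool                   # can the store commit to cache right away?
    bus_request: Optional[str]  # message sent to other caches, if any
    needs_data: bool            # must the reply carry a copy of the line?
    next_state: str             # state once the store has committed


def classify_store(state: str) -> StoreOutcome:
    """Classify a store to a line in MSI state "M", "S" or "I" ("E" is MESI only)."""
    if state == "M":
        # Already exclusively owned and dirty: commit immediately.
        return StoreOutcome(True, None, False, "M")
    if state == "E":
        # MESI only: exclusively owned and clean, flips to M silently.
        return StoreOutcome(True, None, False, "M")
    if state == "S":
        # Valid data but no ownership: a miss that waits for acknowledgements only.
        return StoreOutcome(False, "Invalidate", False, "M")
    if state == "I":
        # No valid data: a miss that needs both ownership and the line itself.
        return StoreOutcome(False, "RFO", True, "M")
    raise ValueError(f"unknown state: {state!r}")


shared = classify_store("S")
invalid = classify_store("I")
```

Both the Shared and the Invalid case count as misses because the store has to wait; they differ only in whether the reply must carry data.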

See also https://en.wikipedia.org/wiki/MESI_protocol

5 Comments

Thanks, that helps. I also wonder about the MESI protocol. MESI (despite the addition of the Exclusive state) also allows going directly from Shared to Modified. So even in MESI, a Shared->Modified transition is a cache miss, similar to MSI?
@User710: Going from Shared to Modified in MESI requires communication with other caches to invalidate their copies as part of getting exclusive ownership (RFO or invalidate request). This part isn't different from MSI.
A follow-up question from a different angle: does a cache miss for moving from Shared directly to Modified, in MESI or in MSI, have ONLY the overhead of invalidating the other cached copies, or ALSO the overhead of accessing and reading the same value from global memory?
@User710: It should only need to send an invalidate, not an RFO, so the reply should just be an acknowledgement, not the data. It definitely doesn't need to talk to the memory controller, and there'd be no reason for another core to reply with the data from its cache (or for a shared L2 or L3 cache to send the data).
Unless another core's request for exclusive ownership was already in flight or something? But this core would have to acknowledge it before the line could change, so I guess then it might need to send an RFO, unless the other core upgrades its invalidate to an RFO once it's done modifying the line, to minimize communication messages and round trips? (CPUs have hardware arbitration to manage priority if multiple cores request ownership of the same line at once.)
