CPU · intermediate · cache coherence · multiprocessor · MESI · MOESI

Cache Coherence Protocols

Understanding how multiprocessor systems maintain data consistency across multiple cache levels and cores.

15 min read
Updated 1/15/2024
2 prerequisites

Prerequisites

Make sure you're familiar with these concepts before diving in:

Cache Basics
Multiprocessor Systems

Learning Objectives

By the end of this topic, you will be able to:

Explain the cache coherence problem
Describe the MESI and MOESI protocols
Analyze the performance implications of coherence protocols

Table of Contents

  1. The Coherence Problem
  2. MESI Protocol
  3. State Transitions
  4. Performance Considerations
  5. MOESI Extensions
  6. Directory-Based Coherence
  7. Modern Implementations
  8. Key Takeaways

Cache Coherence Protocols

Cache coherence is one of the most critical challenges in multiprocessor system design. When multiple processors have their own caches, we need mechanisms to ensure that all processors see a consistent view of memory.

1. The Coherence Problem

Consider this scenario:

  • Processor A reads variable X (value = 10) into its cache
  • Processor B reads the same variable X into its cache
  • Processor A modifies X to 20 in its cache
  • What does Processor B see when it reads X?

Without coherence protocols, Processor B would still see the old value (10), leading to incorrect program execution.
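
The same scenario can be sketched in code. The C11 program below is a hypothetical illustration, not tied to any particular processor: one thread plays Processor A and the other plays Processor B. The atomics only make the shared access well-defined at the language level; it is the hardware coherence protocol that invalidates B's stale copy and delivers A's updated cache line.

    // Hypothetical illustration of the scenario above (C11, compile with -pthread).
    // The atomics make the data race well-defined; the coherence protocol is what
    // propagates the updated cache line from A's core to B's core.
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    atomic_int x = 10;                     // both cores may cache the line holding x

    void *processor_a(void *arg) {
        atomic_store(&x, 20);              // A's copy becomes Modified; B's copy is invalidated
        return NULL;
    }

    void *processor_b(void *arg) {
        while (atomic_load(&x) != 20)      // B's next read misses and fetches the new value
            ;
        printf("B observed x = 20\n");
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&b, NULL, processor_b, NULL);
        pthread_create(&a, NULL, processor_a, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }

With the two threads running on different cores, B's spin loop ends because its cached copy of x is invalidated by A's write and the next read fetches the updated line.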

2. MESI Protocol

The MESI protocol is the foundation of most modern coherence systems. Each cache line can be in one of four states:

2.1 Modified (M)

  • The cache line is valid and has been modified
  • Only this cache has a valid copy
  • Memory is stale
  • Must write back to memory on eviction

2.2 Exclusive (E)

  • The cache line is valid and unmodified
  • Only this cache has a copy
  • Memory is up-to-date
  • Can transition to Modified without bus transaction

2.3 Shared (S)

  • The cache line is valid and unmodified
  • Multiple caches may have copies
  • Memory is up-to-date
  • Must broadcast invalidations before modifying

2.4 Invalid (I)

  • The cache line is not valid
  • Must fetch from memory or another cache
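
As a simplified sketch, the four states and the write-side behavior they imply can be written as a small C state machine. The BusRdX/BusUpgr names are textbook-style placeholders (printed here rather than issued to any real bus), not an actual hardware interface.

    // Simplified sketch: the four MESI states and a cache's reaction to a
    // write by its own core. BusRdX/BusUpgr are textbook placeholder names.
    #include <stdio.h>

    typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_state_t;

    typedef struct {
        unsigned long tag;
        mesi_state_t  state;
    } cache_line_t;

    static void bus_read_for_ownership(unsigned long tag) {   // stand-in for BusRdX
        printf("BusRdX  0x%lx: fetch line, invalidate other copies\n", tag);
    }

    static void bus_upgrade(unsigned long tag) {              // stand-in for BusUpgr
        printf("BusUpgr 0x%lx: invalidate other copies\n", tag);
    }

    static void local_write(cache_line_t *line) {
        switch (line->state) {
        case MODIFIED:                     // already dirty and exclusive: silent hit
            break;
        case EXCLUSIVE:                    // sole clean copy: upgrade with no bus traffic
            line->state = MODIFIED;
            break;
        case SHARED:                       // others may hold copies: invalidate them first
            bus_upgrade(line->tag);
            line->state = MODIFIED;
            break;
        case INVALID:                      // write miss: read for ownership, then modify
            bus_read_for_ownership(line->tag);
            line->state = MODIFIED;
            break;
        }
    }

    int main(void) {
        cache_line_t line = { 0x40, SHARED };
        local_write(&line);                // expect a BusUpgr; line.state is now MODIFIED
        return 0;
    }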

3. State Transitions

[Diagram: MESI state-transition graph — moves between Modified, Exclusive, Shared, and Invalid triggered by local reads and writes and by snooped bus requests.]
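
The other half of the diagram is driven by snooping: what a cache does when it observes another cache's request for a line it holds. Continuing the sketch from Section 2 (and reusing its cache_line_t type and includes), a snooped read-for-ownership invalidates our copy, while a plain read demotes it to Shared and, if the line was dirty, forces a write-back so memory is current again.

    // Continuing the sketch from Section 2 (same cache_line_t and includes):
    // the transitions triggered when this cache snoops another cache's read
    // (BusRd) or read-for-ownership (BusRdX) for a line it holds.
    static void snoop_remote_read(cache_line_t *line, int read_for_ownership) {
        switch (line->state) {
        case MODIFIED:                     // dirty: write back so memory is current
            printf("write back 0x%lx\n", line->tag);
            line->state = read_for_ownership ? INVALID : SHARED;
            break;
        case EXCLUSIVE:
        case SHARED:                       // clean: downgrade or drop our copy
            line->state = read_for_ownership ? INVALID : SHARED;
            break;
        case INVALID:                      // we hold no copy: nothing to do
            break;
        }
    }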

4. Performance Considerations

Cache coherence protocols have significant performance implications:

  1. Latency: Remote cache misses are much slower than local hits
  2. Bandwidth: Coherence traffic consumes interconnect bandwidth
  3. False Sharing: Unrelated data on the same cache line causes unnecessary coherence traffic
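
False sharing in particular is easy to provoke. In the hypothetical C program below, two threads increment completely independent counters; the only question is whether the counters live on the same cache line. The 64-byte line size is an assumption (common, but worth querying on a real platform).

    // Hypothetical false-sharing demonstration (C11, compile with -pthread).
    // Two threads increment independent counters; _Alignas keeps each counter
    // on its own cache line. The 64-byte line size is an assumption.
    #include <pthread.h>
    #include <stdio.h>

    #define LINE_SIZE 64
    #define ITERS     100000000UL

    struct counters {
        _Alignas(LINE_SIZE) unsigned long a;   // own cache line
        _Alignas(LINE_SIZE) unsigned long b;   // own cache line
    };                                         // remove _Alignas to force false sharing

    static struct counters c;

    static void *inc_a(void *arg) { for (unsigned long i = 0; i < ITERS; i++) c.a++; return NULL; }
    static void *inc_b(void *arg) { for (unsigned long i = 0; i < ITERS; i++) c.b++; return NULL; }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, inc_a, NULL);
        pthread_create(&t2, NULL, inc_b, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("a = %lu, b = %lu\n", c.a, c.b);
        return 0;
    }

With the alignment in place the counters never compete for a line. Remove the _Alignas specifiers and every increment on one core invalidates the line in the other core's cache, and the same loops typically run several times slower.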

5. MOESI Extensions

Some systems extend MESI with an Owner (O) state:

  • Owner (O): the line is dirty (memory is stale) but may also be present in other caches; the owning cache supplies the data and remains responsible for the eventual write-back
  • Allows sharing of modified data without writing to memory first
  • Reduces memory bandwidth requirements
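
The difference shows up in the snooped-read transition. In MOESI, a Modified line that another cache reads moves to Owned and supplies the data cache-to-cache, instead of writing it back to memory first as plain MESI would. The enum and function below are illustrative only.

    // Illustrative MOESI snooped-read transition: a dirty line is supplied
    // cache-to-cache and its write-back is deferred until eviction.
    typedef enum { MO_INVALID, MO_SHARED, MO_EXCLUSIVE, MO_OWNED, MO_MODIFIED } moesi_state_t;

    moesi_state_t snoop_remote_read_moesi(moesi_state_t s) {
        switch (s) {
        case MO_MODIFIED:                  // supply dirty data cache-to-cache...
        case MO_OWNED:                     // ...and keep write-back responsibility
            return MO_OWNED;
        case MO_EXCLUSIVE:
        case MO_SHARED:
            return MO_SHARED;              // clean copy: just give up exclusivity
        default:
            return s;                      // Invalid: this cache is not involved
        }
    }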

6. Directory-Based Coherence

For larger systems, broadcast-based protocols don't scale. Directory-based protocols maintain:

  • A directory entry for each memory block
  • List of which caches have copies
  • State information (shared/exclusive)

This enables point-to-point coherence messages instead of broadcasts.
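
A directory entry can be as small as a state field plus a bit vector of sharers. The sketch below assumes at most 64 caches, so the sharer list fits in a single 64-bit word; larger systems typically use coarser or compressed sharer representations.

    // Minimal directory entry for one memory block, assuming at most 64 caches
    // so the sharer list fits in one 64-bit word.
    #include <stdint.h>

    typedef enum { DIR_UNCACHED, DIR_SHARED, DIR_EXCLUSIVE } dir_state_t;

    typedef struct {
        dir_state_t state;                 // how the block is currently cached
        uint64_t    sharers;               // bit i set => cache i holds a copy
    } dir_entry_t;

    // On a write request from cache `requester`, the home node sends
    // point-to-point invalidations to every other sharer in this mask.
    uint64_t sharers_to_invalidate(const dir_entry_t *e, int requester) {
        return e->sharers & ~(UINT64_C(1) << requester);
    }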

7. Modern Implementations

Real processors use sophisticated optimizations:

  • Multi-socket systems: MOESI over high-speed interconnects (AMD's processors, for example, use MOESI across HyperTransport and Infinity Fabric links)
  • Modern processors: modified MESI with point-to-point links (Intel's MESIF adds a Forward state for cache-to-cache transfers over QPI/UPI)
  • Embedded systems: coherency extensions for system-on-chip designs (such as Arm's AMBA ACE and CHI interconnect protocols)

Each implementation has unique optimizations for their target workloads and system architectures.

8. Key Takeaways

  1. Coherence ensures all processors see consistent memory values
  2. MESI provides the foundation for most protocols
  3. Performance depends heavily on sharing patterns
  4. Directory protocols enable scalability beyond broadcast limits
  5. Real implementations include many optimizations beyond basic protocols