CXL 1.0 - 2.0

Coherence

  • There is one home agent in a coherence domain
  • The home agent knows which peer cache has a cache line
  • Any device that needs access to a cache line sends a request to the home agent; the home agent then asks the cache that currently holds the line for the data.
  • Snoop: A message sent from the home agent to a peer cache to change the state of a cache line in that cache
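
The request flow in the list above can be sketched as a toy model. This is only an illustration under simplifying assumptions: each line has at most one holder (no shared state), and the names `HomeAgent`, `request`, and `snoops` are hypothetical, not taken from the CXL specification.

```python
class HomeAgent:
    """Toy home agent: tracks which peer cache holds each cache line."""

    def __init__(self):
        self.owner = {}    # line address -> id of the cache holding the line
        self.snoops = []   # log of (target cache, line address) snoops sent

    def request(self, requester, addr):
        """Serve a request for `addr`; return where the data came from."""
        holder = self.owner.get(addr)
        if holder is None:
            source = "memory"          # nobody caches the line yet
        elif holder == requester:
            source = requester         # requester already holds the line
        else:
            # Snoop the current holder so it hands over the line.
            self.snoops.append((holder, addr))
            source = holder
        self.owner[addr] = requester
        return source


agent = HomeAgent()
agent.request("cpu", 0x40)   # first access is served from memory
agent.request("dev", 0x40)   # second access makes the agent snoop "cpu"
```

Every requester talks only to the home agent, which is why a single directory (`owner` here) is enough to locate a line.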

Bias (CXL.cache)

[CXL 1.0, 1.1, 2.0, and 3.0]

Bias in CXL decides who resolves coherence between the device and the host. Resolving coherence here means sending invalidate requests or data requests on behalf of everyone in the coherence domain.

Need for Bias

Bias is needed for efficient access to the device’s own memory from the device. With CXL.cache alone, the device does not have its own memory, but has a cache that it can use to cache host memory. In this case, coherence for all cache misses is resolved by the host. This is the simplest solution to the coherence problem: always ask a single central entity to find the cache line you need.

However, if the device also has its own memory and needs to access it efficiently without talking to the host on every cache miss or eviction, CXL lets the device gain exclusive access to pages of that memory and access them without reaching out to the host. This is where device bias and host bias come into the picture.

If a device is Type 1 (CXL.io + CXL.cache), it does not need support for bias.

Working

A page of device memory can be in either device bias or host bias. When a page is in device bias, the host has to send CXL requests to the device to access it, but the device can access it without talking to the host first. When a page is in host bias, the device has to go through the host before accessing it.

Device bias is optimal when the device needs to access its memory without extra coherence overhead; likewise, host bias is optimal when the host needs to access that memory without extra coherence overhead.
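
The difference between the two modes, seen from the device, can be sketched as follows. This is a minimal illustration, not the protocol itself: the `Bias` enum, the function name, and the step strings are all hypothetical, and real CXL message names are deliberately left out.

```python
from enum import Enum


class Bias(Enum):
    HOST = "host"
    DEVICE = "device"


def device_access_steps(bias):
    """Steps the device takes to access a page of its own memory."""
    if bias is Bias.DEVICE:
        # Device bias: the device owns the page and skips the host.
        return ["read local memory"]
    # Host bias: the host resolves coherence before the device may proceed.
    return ["ask host to resolve coherence", "read local memory"]


for mode in Bias:
    print(mode.value, device_access_steps(mode))
```

The extra step in host bias is the coherence overhead that device bias avoids for device-side accesses.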

Bias can be set to either device or host bias by software or hardware. By default, memory is in host bias mode. Bias is set at a granularity of 4 KiB [1] and is tracked using the bias table on the device.

The bias table needs one entry per page of the device memory exposed to the host (host-managed device memory, HDM). As a result, the bias table size scales with the HDM size; to keep the table small, CXL 3.0 enables snoop filters and back invalidation.
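
To make the scaling concrete, here is a minimal sketch of a bias table with one entry per 4 KiB page. The one-bit-per-entry encoding (0 = host bias, 1 = device bias) and the names `BiasTable`, `set_device_bias`, and `is_device_bias` are assumptions for illustration; the text above only fixes the granularity and the host-bias default.

```python
PAGE_SIZE = 4 * 1024  # bias granularity from the text: 4 KiB


class BiasTable:
    """One bit per HDM page: 0 = host bias (default), 1 = device bias."""

    def __init__(self, hdm_bytes):
        self.num_pages = hdm_bytes // PAGE_SIZE
        self.bits = bytearray((self.num_pages + 7) // 8)  # all zero: host bias

    def set_device_bias(self, addr, enabled):
        page = addr // PAGE_SIZE
        mask = 1 << (page % 8)
        if enabled:
            self.bits[page // 8] |= mask
        else:
            self.bits[page // 8] &= ~mask & 0xFF

    def is_device_bias(self, addr):
        page = addr // PAGE_SIZE
        return bool((self.bits[page // 8] >> (page % 8)) & 1)


table = BiasTable(16 * 2**30)        # 16 GiB of HDM
print(table.num_pages)               # 4194304 pages
print(len(table.bits))               # 524288 bytes, i.e. 512 KiB of bias bits
table.set_device_bias(0x2000, True)  # flip the page holding address 0x2000
```

Under these assumptions the table grows linearly with HDM: doubling the exposed memory doubles the number of entries.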

Multi Host Coherence

[CXL 3.0]

When a single fabric-attached memory is accessible by multiple hosts in a cache-coherent way, the device resolves coherence. If a cache line requested by one host is cached by another host, the device sends a back-invalidation to the current owner host. Regardless of which host owns the cache line, the request is always sent to the device, which then contacts the owner.
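
The multi-host flow can be sketched as a message trace. As before, this is a simplified illustration: it assumes a single owner per line, and `handle_request`, the `owners` dictionary, and the message strings are hypothetical names rather than CXL 3.0 opcodes.

```python
def handle_request(owners, host, addr):
    """Return the messages the device sends to serve `host`'s request.

    `owners` maps a line address to the host currently caching it and is
    updated in place.
    """
    messages = []
    current = owners.get(addr)
    if current is not None and current != host:
        # Another host caches the line: back-invalidate it first.
        messages.append(f"back-invalidate {current} for line {addr:#x}")
    messages.append(f"send data for line {addr:#x} to {host}")
    owners[addr] = host
    return messages


owners = {}
for host in ("hostA", "hostB"):
    for message in handle_request(owners, host, 0x80):
        print(message)
```

Hosts never contact each other directly in this sketch; every request goes to the device, which matches the description above.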

  1. https://www.synopsys.com/designware-ip/technical-bulletin/compute-express-link-standard-2019q3.html