Archil provides two consistency models, depending on how a client accesses the data:
  • Strong “disk consistency” for clients that are connected directly to the Archil disk
  • Eventual “synchronization consistency” for clients that are connected directly to the attached data sources
Consistency Model

Strong disk consistency

When clients are connected directly to the Archil disk, Archil offers strong read-after-write consistency. Specifically, whenever a client issues an fsync operation to the file system, or Archil proactively flushes its cache, the client pushes all of its latest writes to the Archil disk. A successful fsync means the data is durably committed: it is stored redundantly across multiple Availability Zones, as described in Data durability. When another client then issues a read request to the disk, it sees those writes immediately. Operations such as rename are atomic: another client’s next read of the affected path returns either the old state or the new state, never a partial result.
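The fsync and rename guarantees combine into the common write-then-rename pattern for publishing a file to other clients. The following is a minimal sketch in Python, not an Archil API: publish_file is a hypothetical helper name, and the mount path in the usage comment is an assumption. It uses only standard library calls (os.fsync, os.replace).

```python
import os
import tempfile


def publish_file(path, data):
    """Write data to a temporary file, fsync it, then rename it into place.

    Readers of `path` see either the previous contents or the new contents,
    because the final step is a single rename.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # empty Python's userspace buffer
            os.fsync(tmp.fileno())  # ask the file system to commit the bytes
        os.replace(tmp_path, path)  # rename over the destination
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


# Example usage (the mount point /mnt/archil is an assumed path):
# publish_file("/mnt/archil/results/latest.json", b'{"status": "done"}')
```

Writing to a temporary name first means a reader never opens a half-written file under the final name, and calling fsync before the rename ensures the contents are committed before the new name becomes visible.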

Client-side caching

To improve performance, the Archil client aggressively caches data and directory information from the server. As a result, not every read reaches the server, which is where data is strongly consistent, so a second disk client can take seconds to see new files or changes to existing files. The client-side cache expiry is adjustable using the archil set-cache-expiry command. Read more about the command here. You can also force a client to drop its cached entries and re-read from the server with archil invalidate-cache.

Eventual synchronization consistency

Archil disks provide eventual consistency between the file system view and the backing data source. For example, an object that is created directly in Amazon S3 may not appear in the Archil disk (via tools like ls) for a few minutes after creation. Similarly, a file that is created in the Archil disk (using a tool such as touch) can take tens of seconds to appear when calling the S3 API directly (such as by using aws s3 ls).
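Because of this delay, a program that writes through the Archil disk and then reads through the S3 API generally needs to wait for the object to become visible. The sketch below polls with a deadline. It is illustrative only: wait_for_s3_object and its defaults are hypothetical, the bucket and key in the usage comment are placeholders, and it assumes a boto3 S3 client, whose head_object call raises an error carrying a 404 code when the key is not yet present.

```python
import time


def _is_not_found(exc):
    """Return True if exc looks like a botocore ClientError for a missing key."""
    response = getattr(exc, "response", None) or {}
    code = str(response.get("Error", {}).get("Code", ""))
    return code in ("404", "NoSuchKey", "NotFound")


def wait_for_s3_object(s3_client, bucket, key, timeout_s=120.0, interval_s=5.0):
    """Poll head_object until the key is visible or the timeout elapses.

    Returns True once the object is visible, False on timeout. Errors other
    than "not found" (for example, access denied) are re-raised.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            s3_client.head_object(Bucket=bucket, Key=key)
            return True
        except Exception as exc:
            if not _is_not_found(exc):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage (requires boto3; bucket and key are placeholders):
# import boto3
# s3 = boto3.client("s3")
# if not wait_for_s3_object(s3, "my-bucket", "results/latest.json"):
#     raise TimeoutError("object did not synchronize to S3 in time")
```

The default timeout is an assumption chosen to sit above the "tens of seconds" figure; tune it to the delays you observe. The same approach works in the other direction by polling the Archil mount with os.path.exists, using a longer timeout to cover the delay of a few minutes.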

Durability during synchronization

The synchronization delay affects only visibility through the data source’s APIs — data that has been committed to the disk is not at risk while it waits to synchronize. The Archil disk is itself a highly available, durable storage system, not a single server: committed data is stored redundantly across multiple Availability Zones, with no single point of failure that can cause data loss. If an Archil node — or an entire Availability Zone — fails before data has synchronized to the data source, no data is lost: the disk continues serving it from the remaining Availability Zones and completes synchronization normally. See Data durability for the full durability design.