Python SDK

The Archil Python SDK is the archil package on PyPI. It’s a pure-Python control-plane client: create disks, list and inspect them, manage who can mount them, run commands against them with Disk.exec, and read and write their contents through an S3-compatible object API.

pip install archil

archil talks to the Archil control plane over HTTPS and has no native dependencies. It requires Python 3.10 or later. Every method works both synchronously and asynchronously from a single implementation: disk.put_object(...) blocks, while disk.put_object.aio(...) returns a coroutine you can await. See Async below.

Looking for the FUSE mount client? That’s the archil CLI, which mounts a disk as a real local filesystem. The Python SDK does not mount disks; it talks to the control plane, runs serverless commands, and reads and writes objects over HTTPS.

Configuration

The recommended pattern is a one-time configure call, then the module-level helpers:

import archil

archil.configure(api_key="key-...", region="aws-us-east-1")

Both options fall back to environment variables (ARCHIL_API_KEY, ARCHIL_REGION) if omitted, so in most environments you can skip configure entirely and let the SDK read the environment. For multi-tenant scripts that need multiple credentials in one process, instantiate Archil directly instead of using the module-level configure:

from archil import Archil

prod = Archil(api_key=prod_key, region="aws-us-east-1")
staging = Archil(api_key=staging_key, region="aws-us-east-1")

prod_disks = prod.disks.list()
staging_disks = staging.disks.list()
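
The fallback order described above (explicit argument first, then environment variable) can be sketched in plain Python. This is an illustration, not the SDK's actual code: `resolve_setting` is a hypothetical helper, and raising `ValueError` when neither source provides a value is an assumption, since this page does not say how the SDK reports a missing setting.

```python
import os


def resolve_setting(explicit, env_var, environ=None):
    """Return the explicit value if given, else the environment variable's value.

    Hypothetical helper: it mirrors the documented fallback order only.
    """
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    value = environ.get(env_var)
    if not value:
        # Assumption: a missing setting is an error. The SDK may behave differently.
        raise ValueError(f"{env_var} is not set and no explicit value was passed")
    return value
```

Passing the environment as a mapping keeps the helper easy to exercise without touching the real process environment.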

The API key is an account-level credential and is not the same thing as a disk token. API keys authenticate calls to the control plane (everything on this page); a disk token grants mount access to a single disk. See the disk users concept page.

Managing disks

import archil

# Create a disk. `token` here is the disk token — the one-time credential for
# mounting. Save it; it isn't retrievable again.
result = archil.create_disk(name="my-disk")
print(f"Created {result.disk.id}, disk token: {result.token}")

# A freshly-created disk starts in "creating"; block until it's usable.
disk = result.disk.wait_until_ready()  # raises on terminal failure / timeout

# List and look up disks
all_disks = archil.list_disks()
d = archil.get_disk(result.disk.id)

Per-disk operations are methods on the Disk object itself, not top-level functions:

import archil
from archil import TokenUser

d = archil.get_disk("dsk-abc123")

# Add an additional mount token — save user.token; it's only returned once
user = d.add_user(TokenUser(nickname="ci"))

# Revoke access
d.remove_user("token", user.identifier)

# Delete the disk (this does not delete data in your bucket)
d.delete()

Executing commands

Disk.exec(command) runs a shell command inside a container with the file system already mounted, and returns stdout, stderr, exit code, and timing. See the Serverless Execution concept page for the full picture.

d = archil.get_disk("dsk-abc123")

res = d.exec("grep -r ERROR logs")
print(res.stdout, res.stderr, res.exit_code)
print(f"ran in {res.timing.execute_ms}ms (queued {res.timing.queue_ms}ms)")

The disk is the working directory inside the container, so commands can reference files using relative paths.

Billing is based on execute_ms, the wall-clock time your command runs, in 1 ms increments with a 100 ms minimum per call. Queue time is not billed. stdout and stderr are each capped at 128 KiB per invocation; pipe larger outputs to a file on the disk instead.

For multi-disk execs (mount several disks at once, optionally pinned to a subdirectory or read-only), call archil.exec(...) instead of Disk.exec. Each disk is mounted at its own relative path; pass a Disk, a disk-id string, or an ExecMountSpec for finer control:

import archil
from archil import ExecMountSpec

res = archil.exec(
    disks={
        "data": "dsk-abc123",                                   # mount root, read-write
        "ref": ExecMountSpec("dsk-def456", read_only=True),     # read-only
        "logs": ExecMountSpec("dsk-ghi789", subdirectory="2026/01"),
    },
    command="grep -r ERROR data ref logs",
)

See the bash tool for agents guide for an end-to-end example wiring exec into an AI agent loop.
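
The billing rule stated earlier in this section (billed on execute_ms in 1 ms increments, 100 ms minimum per call, queue time free) works out to simple arithmetic. The sketch below is for estimating usage only. `billed_ms` and `total_billed_ms` are hypothetical helpers, and it assumes execute_ms is reported as a whole number of milliseconds.

```python
MIN_BILLED_MS = 100  # per-call minimum, from the billing rule above


def billed_ms(execute_ms: int) -> int:
    """Billable milliseconds for one exec call. Queue time is not an input."""
    if execute_ms < 0:
        raise ValueError("execute_ms must be non-negative")
    return max(execute_ms, MIN_BILLED_MS)


def total_billed_ms(execute_times_ms) -> int:
    """Sum the billable time over several calls; the minimum applies per call."""
    return sum(billed_ms(ms) for ms in execute_times_ms)
```

Because the minimum applies to each call, many very short commands cost more than one command that does the same work in a single invocation.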

Searching files

Disk.grep(...) searches the files on a disk for lines matching a regular expression, fanning the listing and matching out across many ephemeral containers so the search scales across many machines instead of one. Reach for it instead of exec("grep …") whenever you just want matching lines. See Search Files for the full model.

d = archil.get_disk("dsk-abc123")

res = d.grep(directory="logs", pattern="ERROR|FATAL", recursive=True)

for m in res.matches:
    print(f"{m.file}:{m.line}: {m.text}")
print(f"{len(res.matches)} matches in {res.duration_ms}ms ({res.stopped_reason})")

You control cost and latency with three knobs:

max_duration_seconds — wall-clock deadline (default 30, capped at 30).
concurrency — max parallel workers (default 50). More workers scan a large dataset faster, at proportionally more compute.
max_results — short-circuit once this many matches are collected (default 1000).

Always check stopped_reason — it tells you whether the search was exhaustive. When it stops early, the returned matches are a sample of whichever workers reported first, not the lexicographically first N:

`stopped_reason`	Meaning
`completed`	Every file under the directory was scanned successfully. The matches are exhaustive.
`max_results`	Stopped after collecting `max_results` matches before scanning everything.
`deadline`	Hit `max_duration_seconds` before scanning everything.
`incomplete`	The pipeline finished but one or more batches errored (invalid regex, unreadable file). Results may be partial.
`list_failed`	Directory listing failed; only partial results, if any, are present.
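
One way to act on the table above is a small guard that separates exhaustive results from partial ones before the matches are used. The reason strings come from the table. `is_exhaustive` and `summarize` are hypothetical helpers, and the wording of the summaries is illustrative rather than SDK output.

```python
# Short descriptions keyed by the stopped_reason values listed in the table.
REASON_NOTES = {
    "completed": "every file was scanned",
    "max_results": "stopped at max_results before scanning everything",
    "deadline": "hit max_duration_seconds before scanning everything",
    "incomplete": "one or more batches errored",
    "list_failed": "directory listing failed",
}


def is_exhaustive(stopped_reason: str) -> bool:
    """Only 'completed' means the matches cover every file."""
    return stopped_reason == "completed"


def summarize(stopped_reason: str, match_count: int) -> str:
    """Build a one-line summary; unknown reasons are treated as partial."""
    kind = "exhaustive" if is_exhaustive(stopped_reason) else "partial"
    note = REASON_NOTES.get(stopped_reason, "unrecognized reason")
    return f"{match_count} matches ({kind}): {note}"
```

With a real result this would be called as `summarize(res.stopped_reason, len(res.matches))`.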

Reading and writing objects

A Disk doubles as an S3-compatible bucket: read, write, delete, and list its files by key without mounting it. These methods talk to Archil’s S3 endpoint using your same API key — no separate S3 credentials or SigV4 signing on your part.

import json

import archil

d = archil.get_disk("dsk-abc123")
report = {"generated": "2026-01", "rows": 1234}

# Write — accepts str or bytes. content_type is optional. Returns the etag.
result = d.put_object("reports/2026-01/data.json", json.dumps(report), "application/json")

# Read — returns bytes.
data = d.get_object("reports/2026-01/data.json")
text = data.decode("utf-8")

# Metadata / existence without downloading the body
meta = d.head_object("reports/2026-01/data.json")  # None if absent
if d.object_exists("reports/2026-01/data.json"):
    ...

# Delete (idempotent — deleting a missing key succeeds)
d.delete_object("reports/2026-01/data.json")

put_object handles any size with one call: small bodies go through a single request, and larger bodies are uploaded as a multipart upload automatically — split into parts, uploaded with bounded concurrency, and assembled, aborting the upload if any part fails so nothing is left half-staged. Tune the switch point and parallelism with keyword options:

result = d.put_object(
    "backups/2026-01.tar",
    big_bytes,
    content_type="application/x-tar",
    multipart_threshold=5 * 1024 * 1024,  # switch to multipart above 5 MiB; default = part_size
    part_size=32 * 1024 * 1024,           # >= 5 MiB; default 16 MiB
    concurrency=8,                        # parts in flight at once; default 4
)

For very large objects the part size grows automatically so the upload never exceeds S3’s 10,000-part limit.

list_objects auto-paginates by default, returning every matching key. The first argument is a key prefix; a non-recursive listing (the default) returns the immediate level as objects plus subdirectory common_prefixes:

result = d.list_objects("reports/")                       # one level
all_keys = d.list_objects("reports/", recursive=True)     # whole subtree
first_100 = d.list_objects("reports/", limit=100)         # cap the total

# Stream pages instead of buffering everything (large listings):
for page in d.list_objects_pages("reports/"):
    for obj in page.objects:
        print(obj.key, obj.size, obj.last_modified)

# Or drive pagination yourself:
page = d.list_objects("reports/", single_page=True)
if page.is_truncated:
    nxt = d.list_objects("reports/", single_page=True, continuation_token=page.next_continuation_token)
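
The manual pagination snippet above fetches only one follow-up page. A complete loop keeps requesting pages until is_truncated is false. The sketch below runs without network access by using stand-ins: `FakePage` and `make_fake_lister` are hypothetical, and the fake pages hold plain key strings rather than the SDK's object records. Only the single_page and continuation_token keywords and the is_truncated and next_continuation_token fields come from the snippet above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakePage:
    """Stand-in for a listing page (hypothetical; holds plain key strings)."""
    objects: list
    is_truncated: bool
    next_continuation_token: Optional[str] = None


def make_fake_lister(keys, page_size):
    """Build a stand-in for d.list_objects that pages through `keys`."""
    def list_objects(prefix, single_page=True, continuation_token=None):
        matching = sorted(k for k in keys if k.startswith(prefix))
        start = int(continuation_token) if continuation_token else 0
        end = start + page_size
        truncated = end < len(matching)
        return FakePage(matching[start:end], truncated, str(end) if truncated else None)
    return list_objects


def iter_all_keys(list_objects, prefix):
    """Drive pagination by hand: follow continuation tokens until the last page."""
    token = None
    while True:
        page = list_objects(prefix, single_page=True, continuation_token=token)
        yield from page.objects
        if not page.is_truncated:
            return
        token = page.next_continuation_token
```

With a real disk, `d.list_objects` would take the place of the fake lister. In most code list_objects_pages is the simpler choice, since it handles the tokens for you.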

delete_objects removes many keys in one round trip (auto-batched at S3’s 1,000-key limit). Unlike delete_object, per-key failures are returned rather than raised:

result = d.delete_objects(["a.txt", "logs/b.txt", "c.txt"])
for e in result.errors:
    print(f"{e.key}: {e.code} {e.message}")

append_object appends bytes to an existing object (creating it if absent) — handy for log-style writes. Each call may append at most 1 MiB; append in chunks to grow past that:

d.append_object("logs/app.log", b"first line\n")
d.append_object("logs/app.log", b"second line\n")  # concatenated
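
To append a payload larger than the 1 MiB per-call limit, split it into chunks first. This is a minimal sketch. `iter_append_chunks` is a hypothetical helper rather than part of the SDK, and the append_object call appears only in a comment so the block runs on its own.

```python
MAX_APPEND_BYTES = 1024 * 1024  # 1 MiB per append_object call, per the text above


def iter_append_chunks(data: bytes, max_bytes: int = MAX_APPEND_BYTES):
    """Yield consecutive slices of `data`, each at most `max_bytes` long."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    for start in range(0, len(data), max_bytes):
        yield data[start:start + max_bytes]


# Intended use with a real disk `d` (not executed here):
#     for chunk in iter_append_chunks(payload):
#         d.append_object("logs/app.log", chunk)
```

An empty payload yields no chunks, so the loop makes no calls at all.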

For manual control over the multipart lifecycle (e.g. uploading parts from separate processes), the raw S3 primitives live in the opt-in d.multipart namespace — create, upload_part, complete, abort, list_parts, list_uploads. Most code never needs these; prefer put_object, which runs the lifecycle for you.

upload = d.multipart.create("big.bin")
p1 = d.multipart.upload_part("big.bin", upload.upload_id, 1, first_chunk)
p2 = d.multipart.upload_part("big.bin", upload.upload_id, 2, second_chunk)
d.multipart.complete("big.bin", upload.upload_id, [p1, p2])
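
The automatic part-size growth mentioned earlier for put_object follows from S3's 10,000-part limit: once an object is large enough, the configured part size would need more than 10,000 parts, so the part size has to increase. The sketch below shows one way to compute a size that fits. It is an assumption about the arithmetic, not the SDK's actual rule (which may round differently), and `effective_part_size` and `part_count` are hypothetical names.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB       # minimum part size noted in the put_object example
DEFAULT_PART_SIZE = 16 * MIB  # default noted in the put_object example
MAX_PARTS = 10_000            # S3's multipart part-count limit


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating point."""
    return -(-a // b)


def effective_part_size(total_bytes: int, part_size: int = DEFAULT_PART_SIZE) -> int:
    """Smallest size >= part_size that keeps the upload within MAX_PARTS parts."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size must be at least 5 MiB")
    return max(part_size, ceil_div(total_bytes, MAX_PARTS))


def part_count(total_bytes: int, part_size: int = DEFAULT_PART_SIZE) -> int:
    """Number of parts the upload would use at the effective part size."""
    return ceil_div(total_bytes, effective_part_size(total_bytes, part_size))
```

At the 16 MiB default, growth starts once an object exceeds 160,000 MiB (roughly 156 GiB). Below that, the configured size is used unchanged.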

Async

Every method on Archil, Disks, Disk, and Tokens has an .aio variant that returns a coroutine. The module-level helpers (configure, create_disk, get_disk, etc.) are synchronous convenience wrappers, so from async code, construct Archil(...) directly and use .aio:

import asyncio
from archil import Archil

async def main():
    async with Archil(api_key="key-...", region="aws-us-east-1") as client:
        d = await client.disks.get.aio("dsk-abc123")
        await d.put_object.aio("a/b.txt", b"hello")
        data = await d.get_object.aio("a/b.txt")
        async for page in d.list_objects_pages.aio("a/"):
            for obj in page.objects:
                print(obj.key)

asyncio.run(main())
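
Because each .aio call returns a coroutine, independent operations can run concurrently with asyncio.gather instead of being awaited one by one. The sketch below demonstrates the pattern with a stand-in: `fake_get_object` is hypothetical and simply sleeps briefly, where real code would call `d.get_object.aio(key)`.

```python
import asyncio


async def fake_get_object(key: str) -> bytes:
    """Stand-in for `d.get_object.aio(key)`; simulates a short network wait."""
    await asyncio.sleep(0.01)
    return f"contents of {key}".encode("utf-8")


async def fetch_many(keys):
    """Fetch all keys concurrently; gather returns results in input order."""
    bodies = await asyncio.gather(*(fake_get_object(k) for k in keys))
    return dict(zip(keys, bodies))
```

For large key sets, bounding the number of requests in flight (for example with asyncio.Semaphore) avoids opening too many at once.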

Managing API keys

API keys are account-level, so these helpers live at the top level rather than on a Disk:

archil.list_api_keys()
k = archil.create_api_key(name="ci-bot", description="GitHub Actions")
# save k.token — it's only returned once
archil.delete_api_key("key-abc123")

Error handling

All SDK errors extend ArchilError, so except ArchilError handles control-plane and S3 failures uniformly. Object-API failures raise ArchilS3Error with status (HTTP status), code (the S3 error code, e.g. "NoSuchKey"), request_id, and the raw body on raw:

from archil import ArchilError, ArchilS3Error

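try:
    data = d.get_object("reports/missing.json")
except ArchilS3Error as e:
    # Object-API failure: HTTP status, S3 error code, and request ID are available.
    print(e.status, e.code, e.request_id)
except ArchilError as e:
    # Any other SDK failure (for example, a control-plane error).
    print(f"SDK error: {e}")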

get_object on a missing key raises ArchilS3Error with status 404; use head_object or object_exists to probe without catching.

Transient failures (HTTP 429 and 5xx, plus network errors) are retried automatically with jittered exponential backoff before surfacing; other 4xx responses are caller errors and aren’t retried. The two non-idempotent operations, complete (multipart) and append_object, are not auto-retried, since a retry after a succeeded-but-unacknowledged call would return a spurious NoSuchUpload or duplicate the appended bytes.
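
The retry policy described above can be sketched generically. This is not the SDK's implementation: `HttpError` and `call_with_retries` are hypothetical, and the attempt count, base delay, and cap are assumptions, since this page does not state the SDK's actual values. The block uses the common "full jitter" form of exponential backoff.

```python
import random
import time


class HttpError(Exception):
    """Hypothetical stand-in for an HTTP failure carrying a status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status):
    # Per the text: 429 and 5xx are transient; other 4xx are caller errors.
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=0.1, cap=5.0):
    # Full jitter: a random delay in [0, min(cap, base * 2**attempt)).
    return random.random() * min(cap, base * 2 ** attempt)


def call_with_retries(fn, max_attempts=4, sleep=time.sleep):
    """Call fn(), retrying transient failures with jittered exponential backoff."""
    for attempt in range(max_attempts):
        last_attempt = attempt == max_attempts - 1
        try:
            return fn()
        except ConnectionError:
            # Network errors are treated as transient.
            if last_attempt:
                raise
        except HttpError as err:
            if last_attempt or not is_retryable(err.status):
                raise
        sleep(backoff_delay(attempt))
```

A wrapper like this suits idempotent calls only. For the reasons given above, it should not wrap multipart complete or append_object.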

Supported regions

Region	Provider
`aws-us-east-1`	AWS
`aws-us-west-2`	AWS
`aws-eu-west-1`	AWS
`gcp-us-central1`	GCP
