Release Notes

Updates and improvements to Sprites and our SDKs.

Sprites API

Improved: Storage sync jobs are now resilient to deploys

Large storage syncs are now broken into a chain of small jobs, one per page of buckets, instead of one long-running job. A rolling deploy no longer forces a full re-sync from scratch: each page finishes well within the deployment grace period, and any interrupted page is safely retried without producing duplicate billing records.
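To illustrate the idea, here is a minimal sketch of a page-per-job chain with idempotent writes. It is not the Sprites implementation: `PAGE_SIZE`, `SyncState`, `run_page_job`, `drain`, and the `(run_id, bucket)` record key are all hypothetical names chosen for the example, and the queue and billing table are plain in-memory stand-ins.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 2  # hypothetical page size; the real value is not stated


@dataclass
class SyncState:
    """In-memory stand-ins for the bucket listing, job queue, and billing table."""
    buckets: list
    queue: list = field(default_factory=list)    # pending page offsets
    billing: dict = field(default_factory=dict)  # keyed by (run_id, bucket)


def run_page_job(state, run_id, offset):
    """Process one page of buckets, then enqueue the job for the next page."""
    page = state.buckets[offset:offset + PAGE_SIZE]
    for bucket in page:
        # Keying on (run_id, bucket) means a retried page overwrites its own
        # records instead of appending duplicates.
        state.billing[(run_id, bucket)] = {"bucket": bucket, "run": run_id}
    next_offset = offset + PAGE_SIZE
    if next_offset < len(state.buckets) and next_offset not in state.queue:
        state.queue.append(next_offset)


def drain(state, run_id):
    """Run queued page jobs, starting at offset 0, until the chain ends."""
    state.queue.append(0)
    while state.queue:
        run_page_job(state, run_id, state.queue.pop(0))
```

Because each job touches only one page and its writes are keyed, running the same page twice (as happens when a deploy interrupts it) leaves the billing table unchanged.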

Improved: Remote MCP server tools are clearer and safer

The Sprites remote MCP server now exposes metadata that lets Claude distinguish read-only tools from destructive ones. Users also get more actionable error messages when required parameters are missing or a backend routing issue occurs.

Improved: Storage sync resumes from where it failed

If a storage sync job hits an error partway through paginating buckets, it now picks up from the point of failure instead of starting over from the beginning. Buckets already processed are not lost, and a follow-up job is automatically queued to continue from the last successful offset.
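The resume behavior can be sketched roughly as follows. This is an illustration under assumptions, not the actual worker: `sync_buckets`, `run_with_follow_ups`, and the `max_jobs` cap are hypothetical, and a simple loop stands in for the follow-up job being queued.

```python
def sync_buckets(buckets, process, start=0):
    """Process buckets from `start`; return (next_offset, follow_up_needed)."""
    offset = start
    try:
        while offset < len(buckets):
            process(buckets[offset])
            offset += 1  # advance only after the bucket succeeded
    except Exception:
        # Stop here and report where to resume; earlier work is kept.
        return offset, True
    return offset, False


def run_with_follow_ups(buckets, process, max_jobs=10):
    """Keep running follow-up jobs from the last successful offset."""
    offset, jobs, pending = 0, 0, True
    while pending and jobs < max_jobs:
        offset, pending = sync_buckets(buckets, process, start=offset)
        jobs += 1
    return offset, jobs
```

The key design point is that the offset advances only after a bucket has been processed, so the follow-up job begins at the bucket that failed rather than at the start of the list.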

Fixed: Oban queue stats were silently failing to report

A type mismatch in the queue stats query was causing an error on every poll, which meant job queue metrics (sprites_oban_queue_stats_count, oldest_age_ms) were never emitted to Prometheus. The query now runs correctly and metrics appear as expected.

Fixed: Empty job queues now report zero instead of disappearing from metrics

When a job queue drained completely, its metrics series would vanish from Prometheus rather than showing zero. Queue stats now explicitly emit a zero value for every tracked state, so dashboards and alerts behave correctly even when queues are empty.
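A small sketch of the zero-filling idea, assuming the stats query returns `(queue, state, count)` rows only for combinations that currently have jobs. The function name `fill_queue_stats` and the particular list of tracked states are assumptions made for the example.

```python
# Assumed set of tracked states; the real list is not given in these notes.
TRACKED_STATES = ("available", "scheduled", "executing", "retryable")


def fill_queue_stats(rows, queues, states=TRACKED_STATES):
    """Return a count for every (queue, state) pair, defaulting to zero."""
    counts = {(queue, state): 0 for queue in queues for state in states}
    for queue, state, count in rows:
        counts[(queue, state)] = count
    return counts
```

Starting from a dictionary of zeros and overlaying the query results means a drained queue still produces a sample for each state, so the series stays present for dashboards and alert rules.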

Improved: Prometheus metrics are now active in production

The Prometheus metrics endpoint is now enabled in the application’s supervision tree. Metrics were previously collected but not served; they are now available and scraped in production.

Fixed: Metrics server no longer fails to start during blue/green deploys

The Prometheus metrics listener previously crashed with an address-in-use error when a new instance started before the old one finished shutting down during a blue/green deploy. The server now uses socket options that allow both instances to bind the same port simultaneously during the cutover window.
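The notes do not name the socket options involved. One common way to let two processes listen on the same port is `SO_REUSEPORT`, and the sketch below assumes that option for illustration. `bind_reusable` is a hypothetical helper, and `SO_REUSEPORT` is not available on every platform, so the code checks for it before using it.

```python
import socket


def bind_reusable(port, host="127.0.0.1"):
    """Open a listening TCP socket that tolerates a second binder on the same port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Assumption: SO_REUSEPORT is the option in play. Every socket sharing the
    # port must set it before calling bind().
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```

With this in place, a new instance can call `bind_reusable` on the metrics port while the old instance is still listening, and the old socket is simply closed when that instance shuts down.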

Sprites API

New: Prometheus metrics for background job queues

The Sprites API now exposes Prometheus metrics for its internal job pipeline, including job durations, error rates, and queue depths per queue. This makes it possible to set up precise alerts and dashboards using PromQL instead of fragile log-string matching.

Improved: Cleaned up completed backfill infrastructure

The temporary backfill queue and workers used to rebuild app index records have been removed now that the backfill is complete. This keeps the job pipeline tidy and reduces overhead.

Fixed: Storage sync errors for non-active sprites

Storage sync was skipping app index records for destroyed or archived sprites, causing thousands of “AppRecord not found” errors every hour. The fix ensures all sprites, regardless of status, are covered, eliminating the recurring error noise.

Improved: Reduced log noise from orphaned storage buckets

When storage sync finds buckets with no matching app record, it now logs a single batch summary at warning level instead of one error per bucket. Real storage sync failures are now much easier to spot in the logs.
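As a rough illustration of the batching pattern, the sketch below collects the affected buckets and emits one warning for the whole run. The logger name, the `log_missing_records` helper, the message wording, and the `sample_size` parameter are all made up for the example.

```python
import logging

logger = logging.getLogger("storage_sync")  # hypothetical logger name


def log_missing_records(missing_buckets, sample_size=5):
    """Emit one warning summarizing all buckets that have no app record."""
    if not missing_buckets:
        return None
    sample = ", ".join(sorted(missing_buckets)[:sample_size])
    message = f"{len(missing_buckets)} buckets had no app record (sample: {sample})"
    logger.warning(message)
    return message
```

Including a short sample of bucket names keeps the summary useful for debugging without reintroducing one log line per bucket.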

Improved: Increased SQLite queue timeouts

Internal SQLite queue timeouts have been increased to better handle bursts of activity without triggering premature timeout failures.

Improved: Faster, more reliable health probes and self-replay requests

Health probes and internal self-replay requests now use a larger dedicated connection pool (50x4 instead of 50x1), reducing connection checkout timeouts when many sprite channels are active at the same time.

Sprites API

New: Control who can access private sprite URLs

You can now set private_access on a sprite’s URL settings to allow all org members, not just admins, to access private sprite URLs. The default remains admin-only. Admins can view and change this setting from the sprite detail page.

Fixed: Storage usage was being recorded for deleted sprites

A background job was incorrectly tracking storage usage for sprites that had already been deleted, causing phantom billing data to accumulate. Deleted sprites are now correctly identified and skipped during storage accounting.

Sprites Go SDK

New: Labels support

You can now set, update, and clear labels on sprites using the Go SDK. The new types and helpers integrate with sprite create and update flows.

PR #18

New: PrivateAccess field in URLSettings

The Go SDK’s URLSettings struct now includes a PrivateAccess field, supporting the new sprite URL access control option. This change is backward compatible with older servers.

PR #17

Sprites API

New: Restore a deleted sprite from its backup bucket

Deleted sprites can now be recovered through the admin console when their storage bucket is still intact. This gives operators a way to bring back sprites that were removed accidentally.

Fixed: Sprite snapshots, forking, and fork progress now work correctly

Three issues with the admin sprite page have been resolved. The Snapshots panel now loads correctly instead of always showing empty. Forking a sprite now reliably creates and starts the new machine. The Fork button now shows real progress (“Creating fork bucket…”, “Configuring sprite…”, “Booting machine…”) instead of a static “Forking…” message throughout the ~90-second process.

Fixed: Test pipeline runs reliably again after dependency update

An issue with the CI dependency cache was resolved, so automated tests run cleanly again after a recent internal library update.

Sprites API

Improved: Deleting sprites is now allowed even when flagged as high risk

You can now delete sprites that were previously blocked from deletion due to a high-risk status. This removes an unnecessary restriction that could leave unwanted sprites stuck in your account.

Fixed: Sprite deletion events now reach all listeners in your organization

When a sprite is deleted, the notification is now correctly delivered through the organization-scoped channel so all relevant subscribers receive it. Previously, some listeners could miss the deletion event entirely.

Improved: The machines API now uses “version” instead of “instance_id”

The /orgs/machines endpoint now returns a version field instead of instance_id, matching the field naming used by the rest of the API.
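Client code that reads this field may want to tolerate both names while older responses are still cached or replayed. The helper below is a minimal sketch that assumes each machine is a JSON object decoded into a dict. The exact response shape is not documented here, and `machine_version` is a hypothetical function, not part of any SDK.

```python
def machine_version(machine):
    """Return the machine's version, accepting the older instance_id key too."""
    if "version" in machine:
        return machine["version"]
    # Fall back to the previous field name; returns None if neither is present.
    return machine.get("instance_id")
```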

Fixed: Health probes that return “sprite not found” are now handled correctly

Health checks that receive a 404 “sprite not found” response are now treated as their own distinct state rather than being grouped with other failures. The system continues probing and recovers automatically if the sprite becomes available again.
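One way to picture the change is a classifier that maps each probe response to a state, with "sprite not found" kept separate from generic failures. This is only a sketch: the state names, the `classify_probe` function, and the body-matching rule are assumptions, not the actual probe logic.

```python
from enum import Enum


class ProbeState(Enum):
    HEALTHY = "healthy"
    NOT_FOUND = "not_found"  # distinct from FAILED, so it can be handled separately
    FAILED = "failed"


def classify_probe(status_code, body=""):
    """Map one health probe response to a state."""
    if 200 <= status_code < 300:
        return ProbeState.HEALTHY
    if status_code == 404 and "sprite not found" in body.lower():
        return ProbeState.NOT_FOUND
    return ProbeState.FAILED
```

Because every probe is classified independently, a sprite that returns `NOT_FOUND` on one probe and a 2xx response on the next moves back to `HEALTHY` without any manual reset.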