Align download and caching with huggingface_hub for improved performance by DePasqualeOrg · Pull Request #21 · huggingface/swift-huggingface

DePasqualeOrg · 2025-12-26T18:21:22Z

This PR includes significant improvements to download and cache performance, aligning swift-huggingface more closely with the Python huggingface_hub library. Several of the problems it solves originated in design decisions that diverged from the Python library.

Benchmark Results

I've added a separate benchmarks test target that does not run in CI. Run it with `RUN_BENCHMARKS=1 swift test --filter Benchmarks`.

Tested with mlx-community/Qwen3-0.6B-Base-DQ5 (~11 MB tokenizer.json).

Check out commit 84759be to run the benchmarks before changes on this branch.

| Benchmark | Before | After | Improvement |
| --- | --- | --- | --- |
| Cached file retrieval | 678.5 ms | 0.7 ms | ~969x faster |
| Fresh download | 2749.3 ms | 1070.7 ms | ~2.6x faster |

Cached file retrieval: Previously, every call copied files from the cache to a destination directory, even when nothing had changed. Now we return the snapshot cache path directly with no copy step. When the revision is a commit hash (as in this benchmark), we also skip the API call entirely and return immediately.

Fresh download: Previously, files were downloaded sequentially. Now they download concurrently via a task group (default concurrency: 8), and results are written directly to the cache with no extra copy step.

Why this is a single PR

These changes form an interconnected rewrite of the download/cache subsystem. Parallel downloads, resume, cache-path return, progress, file locking, and offline mode all modify the same core functions (downloadSnapshot, downloadFile, downloadToCache) and depend on each other.

Changes

1. Return snapshot cache path instead of copying files

downloadSnapshot and downloadFile now return the snapshot cache path directly (containing symlinks to blobs), matching Python's snapshot_download() default behavior. The previous design always copied every file from the cache to a separate destination directory, which caused redundant disk I/O and duplication. An optional `to destination:` parameter (matching Python's local_dir) is available for callers that need files in a specific location.
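
A minimal sketch of the symlink step, assuming a repository folder laid out as `blobs/<etag>` and `snapshots/<commit>/<filename>`. The helper name `linkIntoSnapshot` and its parameters are hypothetical and are not taken from this PR's code.

```swift
import Foundation

/// Hypothetical helper: exposes a blob inside a snapshot directory via a relative symlink.
/// Assumes `<repoRoot>/blobs/<etag>` and `<repoRoot>/snapshots/<commit>/<filename>`.
func linkIntoSnapshot(repoRoot: URL, etag: String, commit: String, filename: String) throws -> URL {
    let fm = FileManager.default
    let link = repoRoot
        .appendingPathComponent("snapshots")
        .appendingPathComponent(commit)
        .appendingPathComponent(filename)
    try fm.createDirectory(at: link.deletingLastPathComponent(), withIntermediateDirectories: true)

    // Relative target: one "../" per directory level between the link and the repo root.
    // The extra +1 accounts for the "snapshots" directory (the file name itself is not a level).
    let depth = filename.split(separator: "/").count + 1
    let target = String(repeating: "../", count: depth) + "blobs/" + etag

    // Replace a stale link if one is already present.
    if (try? fm.destinationOfSymbolicLink(atPath: link.path)) != nil {
        try fm.removeItem(at: link)
    }
    try fm.createSymbolicLink(atPath: link.path, withDestinationPath: target)
    return link
}
```

Callers read through the returned path as if it were a regular file, so no bytes are copied out of the blob store.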

Python equivalent: _snapshot_download.py:462-465

2. Skip API calls for cached files

When the revision is a commit hash (immutable), the API response is cached as <commit>.json in a separate .metadata directory at the cache root (mirroring how .locks is structured) after the first download. This keeps the snapshot directory clean so it only contains files from the repository. On subsequent calls, this cached response is used to verify that all files matching the requested globs are present in the snapshot. If all files are present, the snapshot is returned immediately with no API call. If any files are missing (e.g., from an interrupted download or different glob patterns), the fast path is skipped and a fresh download proceeds.

Python's snapshot_download acknowledges this limitation in its offline path: "we can't check if all the files are actually there." Since commit hashes are immutable, caching the API response is safe and allows per-file verification without a network round-trip. This improvement could also be added to the Python library.
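
The completeness check could look roughly like the sketch below. The metadata path (`.metadata/<repoFolder>/<commit>.json`), the `CachedRepoInfo` type, the `rfilename` field name, and the `matches` closure standing in for glob matching are all assumptions made for illustration.

```swift
import Foundation

/// Hypothetical shape of the cached API response; only the file list is needed here.
struct CachedRepoInfo: Codable {
    struct Sibling: Codable { let rfilename: String }
    let sha: String
    let siblings: [Sibling]
}

/// Returns the snapshot directory if every requested file is already present, otherwise nil.
/// `matches` stands in for the glob filter applied to the requested files.
func cachedSnapshotIfComplete(
    cacheRoot: URL, repoFolder: String, commit: String, matches: (String) -> Bool
) -> URL? {
    let fm = FileManager.default
    let metadataURL = cacheRoot
        .appendingPathComponent(".metadata")
        .appendingPathComponent(repoFolder)
        .appendingPathComponent("\(commit).json")
    guard let data = try? Data(contentsOf: metadataURL),
          let info = try? JSONDecoder().decode(CachedRepoInfo.self, from: data)
    else { return nil }  // no cached response: the caller falls through to the API

    let snapshot = cacheRoot
        .appendingPathComponent(repoFolder)
        .appendingPathComponent("snapshots")
        .appendingPathComponent(commit)
    let wanted = info.siblings.map(\.rfilename).filter(matches)
    // fileExists follows symlinks, so a link to a missing blob counts as absent.
    let complete = wanted.allSatisfy {
        fm.fileExists(atPath: snapshot.appendingPathComponent($0).path)
    }
    return complete ? snapshot : nil
}
```

A `nil` result means "take the slow path", which covers a missing metadata file, an interrupted download, and a glob set that now matches more files than before.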

Python equivalent: file_download.py:1082-1095

3. Parallel file downloads

Files are now downloaded concurrently using a task group with configurable concurrency (default: 8).
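
One common way to bound concurrency in a task group is to seed it up to the limit and add one new task each time a running one finishes. The sketch below shows that pattern in isolation; `forEachConcurrently` is a hypothetical name, and the PR's own implementation may be structured differently.

```swift
/// Runs `operation` over `items` with at most `maxConcurrent` tasks in flight.
/// Hypothetical helper for illustration only.
func forEachConcurrently<Item: Sendable>(
    _ items: [Item],
    maxConcurrent: Int = 8,
    operation: @escaping @Sendable (Item) async throws -> Void
) async throws {
    try await withThrowingTaskGroup(of: Void.self) { group in
        var iterator = items.makeIterator()

        // Seed the group up to the concurrency limit.
        for _ in 0..<max(1, maxConcurrent) {
            guard let item = iterator.next() else { break }
            group.addTask { try await operation(item) }
        }

        // Each completed task frees a slot for the next pending item.
        while try await group.next() != nil {
            if let item = iterator.next() {
                group.addTask { try await operation(item) }
            }
        }
    }
}
```

If any operation throws, the error propagates out of the group and the remaining child tasks are cancelled, which is usually the desired behavior for a snapshot download.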

Python equivalent: _snapshot_download.py:449-455

4. Size-weighted download progress

Progress is weighted by file size instead of file count, providing accurate progress bars for downloads containing a mix of small config files and large model weights.
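
A sketch of size weighting with Foundation's `Progress`, assuming file sizes in bytes are known up front (for example from the repo listing). The helper name `makeSizeWeightedProgress` is hypothetical.

```swift
import Foundation

/// Builds a parent Progress whose total is the sum of all file sizes in bytes,
/// with one child per file. Hypothetical helper for illustration.
func makeSizeWeightedProgress(
    fileSizes: [String: Int64]
) -> (parent: Progress, children: [String: Progress]) {
    let total = fileSizes.values.reduce(0, +)
    let parent = Progress(totalUnitCount: total)
    var children: [String: Progress] = [:]
    for (name, size) in fileSizes {
        let child = Progress(totalUnitCount: size)
        // The child's share of the parent equals its byte size, so large files dominate.
        parent.addChild(child, withPendingUnitCount: size)
        children[name] = child
    }
    return (parent, children)
}
```

With this weighting, finishing a 1 KB config file in a roughly 1 MB download moves the parent by about 0.1%, rather than by one full "file" as count-based progress would.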

5. Automatic resume for interrupted downloads

Downloads automatically resume from where they left off using HTTP Range headers with a cache-first approach matching huggingface_hub:

1. Check for cache/blobs/{etag}.incomplete and its size
2. Send Range header to download only the remaining bytes
3. Append the remainder to the incomplete file, then rename to cache/blobs/{etag}
4. Create symlink in snapshots/{commit}/{filename}

Uses URLSession.download(for:delegate:) for efficient OS-level streaming to disk. For resume, the downloaded remainder is appended to the incomplete file via chunked FileHandle copy. DownloadProgressDelegate accounts for the resume offset so that progress bars report accurate totals.

This enables cross-client resume -- if a download starts in Python and gets interrupted, Swift can resume it (and vice versa), since incomplete files are stored in the same cache location with the same naming convention.
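
The Range and append steps can be sketched as follows. The function names `resumeRequest` and `appendAndFinalize` and the 1 MiB chunk size are assumptions for illustration. The sketch omits the actual network call and the 416 recovery path.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Builds a request that asks only for the bytes not yet present in the incomplete file.
func resumeRequest(for url: URL, incompleteFile: URL) -> (request: URLRequest, offset: Int64) {
    var request = URLRequest(url: url)
    let attrs = try? FileManager.default.attributesOfItem(atPath: incompleteFile.path)
    let offset = (attrs?[.size] as? NSNumber)?.int64Value ?? 0
    if offset > 0 {
        // Open-ended range: everything from `offset` to the end of the resource.
        request.setValue("bytes=\(offset)-", forHTTPHeaderField: "Range")
    }
    return (request, offset)
}

/// Appends `remainder` to `incompleteFile` in fixed-size chunks, then renames it to `finalBlob`.
func appendAndFinalize(
    remainder: URL, incompleteFile: URL, finalBlob: URL, chunkSize: Int = 1 << 20
) throws {
    let fm = FileManager.default
    if !fm.fileExists(atPath: incompleteFile.path) {
        fm.createFile(atPath: incompleteFile.path, contents: nil)
    }
    do {
        let input = try FileHandle(forReadingFrom: remainder)
        defer { try? input.close() }
        let output = try FileHandle(forWritingTo: incompleteFile)
        defer { try? output.close() }
        try output.seekToEnd()
        // Copy chunk by chunk so memory use stays bounded for multi-gigabyte files.
        while let chunk = try input.read(upToCount: chunkSize), !chunk.isEmpty {
            try output.write(contentsOf: chunk)
        }
    }
    // Both handles are closed at this point; publish the finished blob.
    try fm.moveItem(at: incompleteFile, to: finalBlob)
}
```

The returned `offset` is also what a progress delegate would add to the bytes received so that totals stay accurate after a resume.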

Python equivalent: file_download.py:1850-1855 (incomplete file handling), file_download.py:403-404 (Range header)

6. File locking

Concurrent downloads to the same blob are serialized using swift-filelock, a port of Python's filelock, which correctly handles contention across multiple lock instances for the same path (this was not the case with the file lock implementation that was previously included).
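
To illustrate what "contention across multiple lock instances for the same path" means, here is a small stand-in built on `flock(2)`. This is not the swift-filelock API; the `SimpleFileLock` type is hypothetical and only demonstrates that two independent instances pointing at the same lock file must exclude each other.

```swift
import Foundation

struct LockOpenError: Error { let code: Int32 }

/// Minimal advisory lock built on flock(2). A stand-in for illustration,
/// not the swift-filelock API.
final class SimpleFileLock {
    private let fd: Int32

    init(path: String) throws {
        // Each instance opens its own descriptor, so locks are per instance.
        let descriptor = open(path, O_CREAT | O_RDWR, 0o644)
        guard descriptor >= 0 else { throw LockOpenError(code: errno) }
        fd = descriptor
    }

    deinit { close(fd) }

    /// Attempts to take an exclusive lock without blocking.
    /// Returns false if another instance currently holds it.
    func tryLock() -> Bool { flock(fd, LOCK_EX | LOCK_NB) == 0 }

    func unlock() { _ = flock(fd, LOCK_UN) }
}
```

A download would hold the lock for a blob's path while writing the `.incomplete` file, so a second caller for the same blob waits or retries instead of writing at the same time.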

Python equivalent: file_download.py:1239-1251

7. Offline mode and cache fallback

Added a localFilesOnly parameter on both downloadSnapshot and downloadFile (matching Python's local_files_only), a useOfflineMode parameter, and automatic network detection via NetworkMonitor. When any of these are active, cached files are returned without making network requests.

Additionally, if the API call to fetch repo info fails (network error, server outage, etc.), downloadSnapshot falls back to the local cache before re-throwing the error. This matches Python's try/except pattern that catches errors during repo_info() and retries with local_files_only=True.
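
The control flow described above can be sketched generically. `resolveWithCacheFallback`, `NotCachedError`, and the closure parameters are hypothetical names. The real methods take more parameters and also consult NetworkMonitor.

```swift
struct NotCachedError: Error {}

/// Tries the network first unless `localFilesOnly` is set; on failure, serves from the
/// cache if possible and otherwise rethrows the original error.
func resolveWithCacheFallback<T>(
    localFilesOnly: Bool,
    fetch: () async throws -> T,
    cached: () -> T?
) async throws -> T {
    if localFilesOnly {
        // Offline path: never touch the network.
        guard let local = cached() else { throw NotCachedError() }
        return local
    }
    do {
        return try await fetch()
    } catch {
        // Network error, server outage, timeout: fall back to what is on disk.
        if let local = cached() { return local }
        throw error
    }
}
```

Rethrowing the original error, rather than a generic "not cached" error, keeps the root cause visible to the caller when nothing usable is on disk.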

Python equivalent: _snapshot_download.py:234-330

8. Xet storage compatibility

Added fetchFileMetadata to capture X-Linked-Etag and X-Repo-Commit headers before CDN redirect. Uses same-host redirect handling matching huggingface_hub's _httpx_follow_relative_redirects.

downloadSnapshot uses the Git.TreeEntry overload for size-based Xet transport selection, so small files use LFS transport directly and skip the unnecessary HEAD request to check for Xet support.

9. Linux support

Linux has full feature parity for caching (blob checks, file locking, cache structure) but lacks resume support due to API limitations. Fallback paths are included throughout.

10. Make HubCache required

Changed cache: HubCache? to cache: HubCache across all HubClient initializers. The cache-first download architecture (blob storage, symlinks, resume) inherently requires a cache, and this aligns with huggingface_hub which also always requires a cache directory.

11. Dead code cleanup

Removed FileProgressReporter and ProgressObservation from the Xet PR, which became unused after downloadSnapshot was replaced with a parallel implementation using Foundation's Progress parent-child hierarchy. Also removed copyCachedFiles, filesSameSize, the destination fast path, and the deprecated resumeDownloadFile (which used opaque URLSession resume data, now replaced by automatic Range header resume). Fixed the snapshot progress tests to mock the correct API endpoint (getRepoInfo instead of listFiles).

Tests

Added a comprehensive test suite in SnapshotDownloadTests.swift covering cache hits, incomplete snapshot detection, offline mode, cache fallback on network errors, resume, 416 recovery, file locking, and concurrent downloads. Path traversal protection is tested with 12 cases.
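
As an illustration of the kind of input a path traversal test exercises, the sketch below rejects revision or ref strings that could escape the cache directory. `isSafeRelativePath` is a hypothetical check written for this example. It is not the validation code from this PR, and its rules may be stricter or looser than the real ones.

```swift
/// Returns true only for relative paths whose components cannot escape a base directory.
/// Hypothetical check for illustration.
func isSafeRelativePath(_ path: String) -> Bool {
    // Reject empty and absolute paths outright.
    guard !path.isEmpty, !path.hasPrefix("/") else { return false }
    // Reject empty components ("a//b"), current-directory and parent-directory components.
    return path
        .split(separator: "/", omittingEmptySubsequences: false)
        .allSatisfy { !$0.isEmpty && $0 != "." && $0 != ".." }
}
```

A ref such as `refs/pr/1` passes, while `../../etc/passwd` and `a/../../b` are rejected before any file system access happens.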

Future work and alignment with Python

There are still many aspects in which the Swift client does not match the behavior or design of the Python client, which can result in critical issues. I won't address these in this PR, but in general I recommend closely following the Python implementation rather than trying out different designs, as the Python client has already solved many issues that different designs can introduce.

The Xet PR added a downloadFile(Git.TreeEntry) overload that skips unnecessary HEAD requests for small files. downloadSnapshot was bypassing this by using its own FileEntry type and calling the path-based overload. Replace FileEntry with Git.TreeEntry so downloadSnapshot uses the same size-based transport selection.

Align with Python's huggingface_hub: downloadSnapshot and downloadFile now return the snapshot cache path (containing symlinks to blobs) instead of copying files to a separate destination. This eliminates redundant file copies and disk duplication. Remove copyCachedFiles, downloadToDestinationWithoutCache, filesSameSize, destination fast path, and the redundant downloadContentsOfFile -> URL overload. Rename copyBlobToDestination to createCacheEntries (symlink only). Require a cache to be configured (throw cacheNotConfigured otherwise). Propagate the main session configuration to metadataSession so mock tests can intercept metadata HEAD requests.

Match huggingface_hub's try/except pattern: when getRepoInfo fails (network error, server 500, timeout), try serving from the local cache before re-throwing the error. This makes downloadSnapshot resilient to transient network issues when files are already cached.

Adds optional `to destination:` parameter to downloadSnapshot and downloadFile, matching Python's local_dir. When set, files are copied from the cache to the specified directory. Also adds localFilesOnly to downloadFile for consistency with downloadSnapshot.

Python's snapshot_download acknowledges it can't check if all files are present when returning a cached snapshot. We improve on this by caching the repo info response after the first download and verifying each file's presence on subsequent calls. This detects incomplete snapshots caused by interrupted downloads or different glob patterns.

Replace the byte-by-byte AsyncBytes iteration with URLSession.download(), which handles streaming to disk at the OS level. Resume support is preserved: when an .incomplete file exists, a Range header is sent and the downloaded remainder is appended via chunked FileHandle copy. Also fixes progress reporting during resume by accounting for the resume offset in DownloadProgressDelegate. Remove deprecated resumeDownloadFile, which used opaque URLSession resume data. The new downloadFile handles resume automatically via Range headers.

DePasqualeOrg · 2026-02-24T16:11:57Z

@mattt, it looks like some of your commits in this repo from the past two days are derived from my work in this PR as well as my swift-filelock package, but you did not engage with this PR or credit me.

Cc @pcuenca @LysandreJik @julien-c

DePasqualeOrg mentioned this pull request Dec 26, 2025

Optimizations for significantly faster downloads and cache hits huggingface/swift-transformers#302

Closed

DePasqualeOrg changed the title ~~Add download optimizations and offline mode support~~ Download/cache optimizations and offline mode support Dec 26, 2025

DePasqualeOrg changed the title ~~Download/cache optimizations and offline mode support~~ Optimize download and cache performance, add offline mode support Dec 27, 2025

DePasqualeOrg force-pushed the optimizations branch from 8cb1eb3 to 49ff955 Compare December 27, 2025 10:43

DePasqualeOrg changed the title ~~Optimize download and cache performance, add offline mode support~~ Optimize download and cache; add resumable, parallel downloads and offline mode Dec 27, 2025

This was referenced Dec 27, 2025

Add resumeDownloadSnapshot, add optional resumable argument to fileDownload #16

Closed

Optimizations for significantly faster tokenizer loading huggingface/swift-transformers#303

Closed

DePasqualeOrg changed the title ~~Optimize download and cache; add resumable, parallel downloads and offline mode~~ Improve download and cache performance; add resumable, parallel downloads and offline mode Dec 27, 2025

DePasqualeOrg force-pushed the optimizations branch from 49ff955 to 007d780 Compare December 28, 2025 17:20

DePasqualeOrg marked this pull request as draft January 5, 2026 12:50

DePasqualeOrg force-pushed the optimizations branch from d4a7c1a to ce5642b Compare January 5, 2026 19:18

DePasqualeOrg marked this pull request as ready for review January 5, 2026 19:22

Fix .gitignore

146c9bb

DePasqualeOrg force-pushed the optimizations branch from a427698 to e8c01a2 Compare February 17, 2026 12:05

DePasqualeOrg added 14 commits February 20, 2026 22:12

Add benchmarks for downloads and cache

84759be

Add download and cache functionality from swift-transformers and huggingface_hub

76f4421

Improve FileLock timeout and contention handling

01f8f05

Store lock files in separate directory, like huggingface_hub

837438a

Use shared URLSession for metadata fetching

c60e07b

Add file download tests

873bccc

Demonstrate unreliability of original FileLock with concurrent file access

067a4cd

Prevent infinite recursion on 416 in downloadToCacheApple

b2f54c2

Remove redundant NetworkMonitor startMonitoring call

4f56f3a

Add semantic context to download progress

ffc5a86

Eliminate duplication of normalizeEtag

cf7ea41

Document download cancellation

ce216cc

Document .locks directory

f4202dc

Centralize download constants

d645d1e

DePasqualeOrg force-pushed the optimizations branch 2 times, most recently from dc4029b to 7f3a9e9 Compare February 21, 2026 17:22

DePasqualeOrg added 8 commits February 21, 2026 22:25

Update swift-filelock

58a7b70

Fix snapshot progress tests to mock correct API endpoint

8881e27

The downloadSnapshot implementation calls getRepoInfo (model info API) rather than listFiles (tree API). Update the mock responses to return model info JSON with sha and siblings instead of a tree entry array.

Remove unused FileProgressReporter and ProgressObservation

016d29e

Delete FileLockOriginal and its demonstration test

a5632dd

Add localFilesOnly parameter to downloadSnapshot

cf4c627

Matches Python's local_files_only parameter. When true, returns cached files without making any network requests, resolving branch names via the local refs file.

DePasqualeOrg force-pushed the optimizations branch from 7f3a9e9 to f2d301c Compare February 21, 2026 21:46

DePasqualeOrg changed the title ~~Improve download and cache performance; add resumable, parallel downloads and offline mode~~ Align download and caching with huggingface_hub for improved performance Feb 21, 2026

DePasqualeOrg added 11 commits February 24, 2026 15:14

Make HubCache required instead of optional

ef308d6

The cache-first download architecture (blob storage, symlinks, resume support) inherently requires a cache. This aligns with huggingface_hub, which also always requires a cache directory.

Fix errors

5a1b15a

Fix Sendable capture in snapshot download speed test

02c6019

Cleanup

9f4df4a

Prevent path traversal in cache revision and ref lookups

df69691

Resolve symlinks before copying cached files to destination directory

8a44c62

Include hidden files when copying snapshot to destination directory

4fafdde

Require commit hash from server instead of falling back to revision string

b983324

DePasqualeOrg force-pushed the optimizations branch from f2d301c to b983324 Compare February 24, 2026 14:56

enrico-flo mentioned this pull request May 1, 2026

URLSession temp files leaked on every download (CFNetworkDownload_*.tmp) #51

Closed

es617 mentioned this pull request May 1, 2026

URLSession temp files leaked on every download (CFNetworkDownload_*.tmp) #52

Open

DePasqualeOrg wants to merge 34 commits into huggingface:main from DePasqualeOrg:optimizations
