Skip to content

fix(io): add connection pool semaphore to Azure Blob Storage backend#6305

Merged
rohitkulshreshtha merged 5 commits intoEventual-Inc:mainfrom
singularityDLW:feature/issue-6279-azure-connection-errors
Mar 2, 2026
Merged

fix(io): add connection pool semaphore to Azure Blob Storage backend#6305
rohitkulshreshtha merged 5 commits intoEventual-Inc:mainfrom
singularityDLW:feature/issue-6279-azure-connection-errors

Conversation

@singularityDLW
Copy link
Contributor

Changes Made

Add a connection pool semaphore to AzureBlobSource, matching the existing pattern in S3/GCS/TOS backends. Azure was the only major backend missing this, causing
unbounded concurrent connections (400-800+) when reading multiple large parquet files in parallel.

  • Add max_connections_per_io_thread (default 8) to AzureConfig
  • Add connection_pool_sema to AzureBlobSource with permit lifecycle tied to GetResult::Stream
  • Extract get_size_internal() to avoid deadlock for GetRange::Suffix (Azure SDK doesn't support native suffix ranges)
  • Update Python bindings, type stubs, SQL config

Related Issues

Closes #6279

Azure was the only major cloud storage backend missing a connection pool
semaphore, causing unbounded concurrent connections when reading multiple
large parquet files in parallel (8 files × 50-100 ranges = 400-800+
connections), leading to Azure server-side connection resets.

Add `max_connections_per_io_thread` (default 8) to AzureConfig and
`connection_pool_sema` to AzureBlobSource, matching the pattern used by
S3, GCS, and TOS backends. Extract `get_size_internal()` to avoid
deadlock when `get()` handles GetRange::Suffix requests.
Add missing max_connections parameter documentation to AzureConfig
docstring for parity with S3Config and GCSConfig. Clamp
max_connections_per_io_thread to minimum 1 to prevent deadlock when
user passes max_connections=0.
@singularityDLW singularityDLW requested a review from a team as a code owner February 27, 2026 07:28
@singularityDLW singularityDLW changed the title Feature/issue 6279 azure connection errors fix(io): add connection pool semaphore to Azure Blob Storage backend Feb 27, 2026
@github-actions github-actions bot added the fix label Feb 27, 2026
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 27, 2026

Greptile Summary

This PR adds connection pool management to the Azure Blob Storage backend, limiting concurrent connections to prevent resource exhaustion when reading multiple large files in parallel.

Key changes:

  • Added connection_pool_sema semaphore to AzureBlobSource initialized with max_connections_per_io_thread (default 8) × IO pool threads
  • Permits are acquired using acquire_owned() in get() and held for the lifetime of the stream via GetResult::Stream
  • Introduced get_size_internal() helper method to avoid deadlock when handling GetRange::Suffix (which requires fetching file size while already holding a permit)
  • Added max_connections parameter to Python bindings (AzureConfig) with proper type stubs and SQL config support
  • Clamped max_connections_per_io_thread to minimum 1 to prevent deadlock when users pass 0

The implementation follows the established pattern from S3/GCS/TOS backends and includes comprehensive test coverage. This resolves the unbounded connection issue described in #6279.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation correctly follows established patterns from other storage backends (S3/GCS/TOS) for connection pool management. The semaphore lifecycle is properly managed (permits acquired with acquire_owned and released when streams drop), the get_size_internal pattern correctly avoids potential deadlock, and all configuration layers are updated consistently. Comprehensive test coverage validates the new parameter including edge cases like max_connections=0.
  • No files require special attention

Important Files Changed

Filename Overview
src/daft-io/src/azure_blob.rs Added connection pool semaphore with proper lifecycle management and get_size_internal to avoid deadlock
src/common/io-config/src/azure.rs Added max_connections_per_io_thread field (default 8) with display formatting
src/common/io-config/src/python.rs Added max_connections parameter to Python bindings with proper mapping to internal field

Sequence Diagram

sequenceDiagram
    participant Client
    participant AzureBlobSource
    participant Semaphore as Connection Pool<br/>Semaphore
    participant Azure as Azure Blob<br/>Storage

    Note over AzureBlobSource,Semaphore: Initialization (max 8 × num_threads)
    AzureBlobSource->>Semaphore: Create with capacity

    Note over Client,Azure: get() operation
    Client->>AzureBlobSource: get(uri, range)
    AzureBlobSource->>Semaphore: acquire_owned()
    Semaphore-->>AzureBlobSource: permit
    
    alt GetRange::Suffix
        AzureBlobSource->>AzureBlobSource: get_size_internal()<br/>(no semaphore)
        AzureBlobSource->>Azure: get_properties()
        Azure-->>AzureBlobSource: file size
        AzureBlobSource->>AzureBlobSource: calculate range
    end
    
    AzureBlobSource->>Azure: download with range
    Azure-->>AzureBlobSource: byte stream
    AzureBlobSource-->>Client: GetResult::Stream<br/>(with permit)
    
    Note over Client,Semaphore: Permit released when<br/>stream is dropped

    Note over Client,Azure: get_size() operation
    Client->>AzureBlobSource: get_size(uri)
    AzureBlobSource->>Semaphore: acquire()
    Semaphore-->>AzureBlobSource: permit guard
    AzureBlobSource->>AzureBlobSource: get_size_internal()
    AzureBlobSource->>Azure: get_properties()
    Azure-->>AzureBlobSource: file size
    AzureBlobSource-->>Client: size
    Note over AzureBlobSource,Semaphore: Permit released<br/>when guard drops
Loading

Last reviewed commit: 20583c3

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Copy link
Member

@universalmind303 universalmind303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @singularityDLW!

@rohitkulshreshtha rohitkulshreshtha merged commit 2629d23 into Eventual-Inc:main Mar 2, 2026
30 checks passed
@gsmit
Copy link

gsmit commented Mar 3, 2026

Great! Thanks a lot @singularityDLW

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Connection errors when reading multiple parquet files from Azure

4 participants