Fetch several Parquet row groups when appropriate #234

@severo

See #213 (comment)

When fetching a row group, we only fetch the rows from its first row to its last:

const rowStart = groupEnds[groupIndex - 1] ?? 0
const rowEnd = groupEnds[groupIndex]

But when sorting along a column with small data, this can result in many small requests and trigger a rate limit.

This issue aims to fetch several consecutive Parquet row groups at once when appropriate, reducing the number of requests in exchange for larger responses.

For example, instead of 2,000 requests with 7 KB responses, we would prefer 200 requests with 70 KB responses (only possible when enough row groups are consecutive).
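
As a rough illustration of the idea above, here is a minimal sketch of how runs of consecutive row groups could be merged into single row ranges. It is not the project's implementation: `coalesceRowGroups`, `RowRange`, and the `maxGroupsPerFetch` cap are hypothetical names, and `groupEnds` is assumed to hold the cumulative row count at the end of each row group, as in the snippet quoted earlier. The default cap of 10 only mirrors the 2,000 to 200 requests example.

```typescript
// Hypothetical type: a half-open range of rows [rowStart, rowEnd).
interface RowRange {
  rowStart: number
  rowEnd: number
}

// Hypothetical helper, not part of the existing code base.
// groupEnds[i] is assumed to be the cumulative number of rows at the end of row group i.
// maxGroupsPerFetch is an assumed cap on how many groups are merged into one request.
function coalesceRowGroups(
  groupIndexes: number[],
  groupEnds: number[],
  maxGroupsPerFetch = 10,
): RowRange[] {
  // Deduplicate and sort, so that adjacent indexes end up next to each other.
  const sorted = Array.from(new Set(groupIndexes)).sort((a, b) => a - b)
  const ranges: RowRange[] = []
  let i = 0
  while (i < sorted.length) {
    const first = sorted[i]
    let last = first
    // Extend the run while the next index is adjacent and the cap is not reached.
    while (
      i + 1 < sorted.length &&
      sorted[i + 1] === last + 1 &&
      last - first + 1 < maxGroupsPerFetch
    ) {
      i++
      last = sorted[i]
    }
    // Same bounds as for a single group, but spanning the whole run.
    ranges.push({
      rowStart: groupEnds[first - 1] ?? 0,
      rowEnd: groupEnds[last],
    })
    i++
  }
  return ranges
}
```

With `groupEnds = [100, 200, 300, 400, 500]`, requesting groups `[0, 1, 2, 4]` would yield two ranges, rows 0 to 300 and rows 400 to 500, instead of four separate requests. Non-adjacent groups are never merged, so no unneeded rows are fetched; a real implementation might prefer a byte-size cap over a group-count cap.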
