
chore: avoid some unnecessary hash lookups #7576

Open

fengjiachun wants to merge 1 commit into GreptimeTeam:main from fengjiachun:feat/avoid_hash_lookups

Conversation

@fengjiachun
Collaborator

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

As @copilot said

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
@gemini-code-assist
Contributor

Important

Installation incomplete: to start using Gemini Code Assist, please ask the organization owner(s) to visit the Gemini Code Assist Admin Console and sign the Terms of Service.

@fengjiachun fengjiachun requested a review from Copilot January 15, 2026 08:24
@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Jan 15, 2026
@fengjiachun
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Nice work!


Contributor

Copilot AI left a comment


Pull request overview

This pull request optimizes the convert_bulk_part function by pre-computing column indices to avoid repeated hash map lookups during batch processing. The changes improve performance without altering the function's behavior.

Changes:

  • Pre-compute column indices for primary key, field, timestamp, and sparse primary key dictionary columns before the main processing loops
  • Use SmallVec instead of Vec for primary key values to avoid heap allocations for typical small primary key sizes (≤16 columns)
  • Pre-compute column ID and vector pairs to avoid repeated zip operations in the row processing loop
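The index pre-computation described above can be sketched as follows. This is a minimal, self-contained illustration of the technique, not GreptimeDB's actual `convert_bulk_part` code; the names `precompute_indices`, `schema`, and `wanted` are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: resolve each wanted column name to its index once,
/// so the hot per-row loop uses plain integer indexing instead of hashing
/// the column name for every row.
fn precompute_indices(schema: &[&str], wanted: &[&str]) -> Vec<usize> {
    // Build the name -> index map once.
    let by_name: HashMap<&str, usize> = schema
        .iter()
        .enumerate()
        .map(|(i, name)| (*name, i))
        .collect();
    // One hash lookup per wanted column, outside the per-row loop.
    wanted.iter().map(|name| by_name[name]).collect()
}

fn main() {
    let schema = ["ts", "host", "cpu", "mem"];
    let pk_col_indices = precompute_indices(&schema, &["host"]);
    let field_col_indices = precompute_indices(&schema, &["cpu", "mem"]);

    // The per-row loop now does no hashing at all.
    for row in [["1", "a", "0.5", "0.7"], ["2", "b", "0.6", "0.8"]] {
        let _pk: Vec<&str> = pk_col_indices.iter().map(|&i| row[i]).collect();
        let _fields: Vec<&str> = field_col_indices.iter().map(|&i| row[i]).collect();
    }
    assert_eq!(pk_col_indices, vec![1]);
    println!("pk indices: {:?}", pk_col_indices);
}
```

With N rows and K referenced columns, this turns N×K hash lookups into K lookups up front plus N×K array indexes.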


@fengjiachun
Collaborator Author

@v0y4g3r @evenyag PTAL

Comment on lines +817 to +819
```rust
// Pre-compute column indices.
// For sparse encoding, primary key columns are not in the input schema (already encoded).
let pk_col_indices = if !is_sparse {
```
Contributor


Can we add a benchmark for it to see how much the improvement is?
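A criterion benchmark against the real `convert_bulk_part` would be the idiomatic way to answer this in the repo. As a quick std-only illustration of what such a measurement compares (all names here are hypothetical, not GreptimeDB code):

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::Instant;

fn main() {
    let schema: Vec<String> = (0..16).map(|i| format!("col{i}")).collect();
    let by_name: HashMap<&str, usize> =
        schema.iter().enumerate().map(|(i, n)| (n.as_str(), i)).collect();
    let wanted = ["col3", "col7", "col11"];
    let rows = 1_000_000;

    // Baseline: hash every wanted column name on every row.
    let t = Instant::now();
    let mut acc = 0usize;
    for _ in 0..rows {
        for name in wanted {
            acc += by_name[black_box(name)];
        }
    }
    let per_row = t.elapsed();

    // Optimized: resolve indices once, then use plain integer reads.
    let idx: Vec<usize> = wanted.iter().map(|n| by_name[n]).collect();
    let t = Instant::now();
    let mut acc2 = 0usize;
    for _ in 0..rows {
        for &i in &idx {
            acc2 += black_box(i);
        }
    }
    let precomputed = t.elapsed();

    // Both paths must compute the same result.
    assert_eq!(acc, acc2);
    println!("per-row lookups: {per_row:?}, precomputed: {precomputed:?}");
}
```

`black_box` keeps the optimizer from eliding the lookups; absolute numbers depend on the machine and build flags, which is why a criterion benchmark with proper warm-up would be the better evidence for the PR.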

```rust
        .as_str(),
)
.context(ColumnNotFoundSnafu {
    column: &region_metadata.time_index_column().column_schema.name,
```
Member


nit: with_context

time_index_column has one hash lookup inside it, so building the context eagerly pays that cost even on the success path.
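The eager-vs-lazy distinction behind this nit (snafu's `.context(...)` evaluates its selector fields up front, while `.with_context(|_| ...)` defers them to the error path) has a direct stdlib analogue in `Option::ok_or` vs `Option::ok_or_else`, sketched here with hypothetical `lookup_*` helpers:

```rust
use std::collections::HashMap;

// Eager: the error string (and any lookups needed to build it) is
// constructed even when the key is found.
fn lookup_eager(map: &HashMap<String, usize>, key: &str) -> Result<usize, String> {
    map.get(key).copied().ok_or(format!("column not found: {key}"))
}

// Lazy: the closure runs only when the key is missing, so the success
// path skips the cost of building the error context.
fn lookup_lazy(map: &HashMap<String, usize>, key: &str) -> Result<usize, String> {
    map.get(key)
        .copied()
        .ok_or_else(|| format!("column not found: {key}"))
}

fn main() {
    let mut map = HashMap::new();
    map.insert("ts".to_string(), 0usize);
    assert_eq!(lookup_eager(&map, "ts"), Ok(0));
    assert_eq!(
        lookup_lazy(&map, "host"),
        Err("column not found: host".to_string())
    );
    println!("ok");
}
```

In the snippet under review, switching to `with_context` would move the `time_index_column()` hash lookup off the success path in the same way.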

@killme2008
Member

What's the status of this PR? @fengjiachun

@fengjiachun
Collaborator Author

What's the status of this PR? @fengjiachun

Will work on it (will add benchmark).


Labels

docs-not-required This change does not impact docs. size/S


5 participants