feat: support grpc tokenizer #41994

Open
wants to merge 2 commits into
base: master

sangheee

Related: #41035

This PR adds support for a gRPC-based tokenizer.

  • The protobuf definition was added in milvus-proto#445.
  • Based on this definition, the corresponding Rust client code was generated and added under tantivy-binding.
    • The generated file is milvus.proto.tokenizer.rs.

I'm not very experienced with Rust, so there might be parts of the code that could be improved.
I’d appreciate any suggestions or improvements.

@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sangheee
To complete the pull request process, please assign jiaoew1991 after the PR has been reviewed.
You can assign the PR to them by writing /assign @jiaoew1991 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot added the area/compilation and size/XL (Denotes a PR that changes 500-999 lines.) labels on May 21, 2025
@sre-ci-robot requested review from czs007 and sunby on May 21, 2025 11:14
Contributor

mergify bot commented May 21, 2025

@sangheee Thanks for your contribution. Please submit with DCO, see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.

@mergify bot added the needs-dco (DCO is missing in this pull request.) and kind/feature (Issues related to feature request from users) labels on May 21, 2025
@sangheee force-pushed the add-grpc-tokenizer branch from 3cffa99 to f78f966 on May 21, 2025 11:16
@mergify bot added the dco-passed (DCO check passed.) label and removed the needs-dco (DCO is missing in this pull request.) label on May 21, 2025
Contributor

mergify bot commented May 21, 2025

@sangheee E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.


codecov bot commented May 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.10%. Comparing base (9f866dd) to head (6e12c46).
Report is 71 commits behind head on master.

Current head 6e12c46 differs from pull request most recent head 4d79d5c

Please upload reports for the commit 4d79d5c to get more accurate results.

❌ Your project check has failed because the head coverage (73.10%) is below the target coverage (77.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #41994      +/-   ##
==========================================
+ Coverage   73.03%   73.10%   +0.06%     
==========================================
  Files         335      335              
  Lines       30700    30727      +27     
==========================================
+ Hits        22423    22464      +41     
+ Misses       8277     8263      -14     
Components Coverage Δ
Client ∅ <ø> (∅)
Core 73.10% <ø> (+0.06%) ⬆️
Go ∅ <ø> (∅)

see 10 files with indirect coverage changes


@sangheee force-pushed the add-grpc-tokenizer branch from f78f966 to d21db6c on May 22, 2025 01:31
Contributor

mergify bot commented May 22, 2025

@sangheee E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@sangheee force-pushed the add-grpc-tokenizer branch from d21db6c to 6a9b800 on May 22, 2025 08:54
@sre-ci-robot added the size/L (Denotes a PR that changes 100-499 lines.) label and removed the size/XL (Denotes a PR that changes 500-999 lines.) label on May 22, 2025
Contributor

mergify bot commented May 22, 2025

@sangheee go-sdk check failed, comment rerun go-sdk can trigger the job again.

Signed-off-by: park.sanghee <[email protected]>
@sangheee force-pushed the add-grpc-tokenizer branch from 6a9b800 to 6e12c46 on May 22, 2025 10:31
@sre-ci-robot added the size/XL (Denotes a PR that changes 500-999 lines.) label and removed the size/L (Denotes a PR that changes 100-499 lines.) label on May 22, 2025
Contributor

mergify bot commented May 30, 2025

@sangheee go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented May 30, 2025

@sangheee cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented May 30, 2025

@sangheee E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Labels
area/compilation; dco-passed (DCO check passed.); kind/feature (Issues related to feature request from users); size/XL (Denotes a PR that changes 500-999 lines.)