Add handler logic to ner #774

Draft
wants to merge 1 commit into base: main

Conversation

magdyksaleh
Contributor

No description provided.

magdyksaleh requested review from Copilot and tgaddair on May 21, 2025 17:40
Copilot AI left a comment

Pull Request Overview

This PR enhances the /classify and /classify_batch endpoints to support Named Entity Recognition (NER) by returning raw NER entities and linkable fields, and implements NER postprocessing and regex-based transaction annotation in the inference layer.

  • Switch classification inputs from plain strings to ClassifyInput (with original_description and amount), and add an optional parameters field to the request types.
  • Change handlers to return a structured ClassifyResponse (with raw_ner and linkable_fields) instead of raw entity lists.
  • Introduce NER postprocessing utilities in infer.rs (using lazy_static and Regex) and wire them into Infer::classify and streaming batch classify.
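To make the shape change in the bullets above concrete, here is a minimal sketch of what the new request and response types might look like. Only the names ClassifyInput, ClassifyResponse, original_description, amount, raw_ner and linkable_fields come from the review summary. The field types, the Entity struct, and the build_response helper (including its "highest score wins" rule) are assumptions made for illustration, and the serde derives the real types presumably carry are omitted so the sketch depends only on the standard library.

```rust
use std::collections::HashMap;

/// One transaction to classify. Field names come from the PR summary;
/// the `f64` type for `amount` is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassifyInput {
    pub original_description: String,
    pub amount: f64,
}

/// A single raw NER span (hypothetical shape, not taken from the PR).
#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub entity_group: String,
    pub word: String,
    pub score: f32,
}

/// Structured response: the raw entities plus fields derived from them.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct ClassifyResponse {
    pub raw_ner: Vec<Entity>,
    pub linkable_fields: HashMap<String, String>,
}

/// Hypothetical builder: for each entity group, keep the word with the
/// highest score as the linkable field. The selection rule is an assumption.
pub fn build_response(raw_ner: Vec<Entity>) -> ClassifyResponse {
    let mut best: HashMap<String, (f32, String)> = HashMap::new();
    for e in &raw_ner {
        // Replace the stored word only if this entity scores strictly higher.
        let replace = best
            .get(&e.entity_group)
            .map_or(true, |(score, _)| e.score > *score);
        if replace {
            best.insert(e.entity_group.clone(), (e.score, e.word.clone()));
        }
    }
    let linkable_fields = best.into_iter().map(|(k, (_, w))| (k, w)).collect();
    ClassifyResponse { raw_ner, linkable_fields }
}
```

Returning one struct rather than a bare entity list lets a handler add derived fields later without breaking clients that only read raw_ner.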

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| router/src/server.rs | Updated HTTP handlers to consume ClassifyInput, record NER metrics, and return ClassifyResponse. |
| router/src/lib.rs | Defined ClassifyInput, ClassifyParameters, ClassifyResponse, and BatchClassifyResponse types. |
| router/src/infer.rs | Added NER postprocessing (postprocess_entity_rust, regex annotation, linkable-fields builder) and updated Infer methods. |
| router/Cargo.toml | Added lazy_static dependency for static regex initializations. |
Comments suppressed due to low confidence (6)

router/src/infer.rs:42

  • [nitpick] Suffix _RUST on constant names is redundant and may confuse readers. Consider renaming to MIN_LENGTH_PER_ENTITY for clarity and consistency.
static ref MIN_LENGTH_PER_ENTITY_RUST: HashMap<&'static str, usize> = {

router/src/infer.rs:49

  • The key "store number" includes a space, whereas other entity types use single words or snake_case. Verify this matches upstream entity_group values or consider using store_number for consistency.
m.insert("store number", 5); // Note: Rust variable names typically use snake_case
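As a rough illustration of the two notes above, the sketch below builds a per-entity minimum-length table under the shorter name and uses it to filter entities. It swaps lazy_static for std::sync::OnceLock from the standard library so the example has no external dependencies; the PR itself uses lazy_static. Only the "store number" key and its value 5 come from the quoted code. The "merchant" entry, the default of 1 for unknown groups, and the passes_min_length helper are assumptions.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Lazily initialised table of minimum word lengths per entity group.
fn min_length_per_entity() -> &'static HashMap<&'static str, usize> {
    static MAP: OnceLock<HashMap<&'static str, usize>> = OnceLock::new();
    MAP.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert("store number", 5); // key and value quoted in the review
        m.insert("merchant", 2); // hypothetical entry, for illustration only
        m
    })
}

/// Returns true when `word` is long enough for its entity group.
/// Groups without an entry fall back to a minimum of 1 (an assumption).
pub fn passes_min_length(entity_group: &str, word: &str) -> bool {
    let min = min_length_per_entity()
        .get(entity_group)
        .copied()
        .unwrap_or(1);
    // Count characters, not bytes, and ignore surrounding whitespace.
    word.trim().chars().count() >= min
}
```

Whichever spelling of the key is chosen ("store number" or store_number), it has to match the entity_group strings emitted by the model exactly, since the lookup is a plain string comparison.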

router/src/infer.rs:1111

  • Using expect here will panic the server if an ID is missing. Consider handling this case more gracefully (e.g., returning an error) to avoid runtime panics.
let request_id = id.expect("Classify response in batch missing ID. This is a bug.");
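One way to act on this note is to turn the missing ID into an error value that the caller can map to an HTTP error response. The sketch below is a minimal, standard-library-only illustration: BatchError, collect_by_id, and the (Option<u64>, T) item shape are hypothetical names and types, not taken from the PR.

```rust
use std::fmt;

/// Hypothetical error type for batch classification failures.
#[derive(Debug, PartialEq)]
pub enum BatchError {
    /// A response in the batch carried no request ID.
    MissingId { index: usize },
}

impl fmt::Display for BatchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BatchError::MissingId { index } => {
                write!(f, "classify response {index} in batch is missing an ID")
            }
        }
    }
}

impl std::error::Error for BatchError {}

/// Pairs each value with its ID, returning an error on the first missing ID
/// instead of panicking.
pub fn collect_by_id<T>(items: Vec<(Option<u64>, T)>) -> Result<Vec<(u64, T)>, BatchError> {
    items
        .into_iter()
        .enumerate()
        .map(|(index, (id, value))| {
            id.map(|id| (id, value))
                .ok_or(BatchError::MissingId { index })
        })
        .collect()
}
```

Collecting into a Result stops at the first missing ID, so a single malformed response fails that batch while the server process keeps running.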

router/src/server.rs:1875

  • The new parameters field in ClassifyRequest is never accessed inside the handler. Consider validating or passing it to the inference layer if intended, or remove until needed.
Json(req): Json<ClassifyRequest>,

router/src/lib.rs:1221

  • The optional parameters field is defined in the request types but never read or validated by the handlers. Consider wiring it through or removing until it’s needed.
struct ClassifyParameters {

router/src/server.rs:1952

  • In the utoipa::path macro, body = Vec<ClassifyResponse> may not be recognized as a valid type reference. Verify the OpenAPI schema generates correctly or use a wrapper type.
(status = 200, description = "Classifications", body = Vec<ClassifyResponse>),

Comment on lines +1259 to +1263
#[derive(Debug, Serialize)]
struct BatchClassifyResponse {
    responses: Vec<ClassifyResponse>,
}


Copilot AI May 21, 2025

The BatchClassifyResponse struct is added but never used by the handlers (they now return Vec<ClassifyResponse> directly). Consider removing it to reduce dead code.

Suggested change
#[derive(Debug, Serialize)]
struct BatchClassifyResponse {
    responses: Vec<ClassifyResponse>,
}
