Fix: nlp sample populate entities #973

Open. abdou6666 wants to merge 3 commits into main.

Conversation

abdou6666 (Contributor) commented May 5, 2025

Motivation

This PR fixes an issue where NLP samples are missing their entities when NLP is enabled.

Fixes #974

Type of change:

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have performed a self-review of my own code

Summary by CodeRabbit

  • New Features

    • Integrated NLP entity extraction and language inference into chat message processing, allowing for automated enhancement of NLP samples.
    • Upgraded NLP samples from inbox to training status when relevant entities and language are detected in chat messages.
  • Improvements

    • Enhanced feedback in NLP sample and training forms by displaying loading states during sample creation and updates.
    • Improved validation logic in the NLP training form, including additional checks for language selection and mutation loading status.
  • Style

    • Minor formatting cleanup in NLP service test files.

@abdou6666 abdou6666 self-assigned this May 5, 2025

coderabbitai bot commented May 5, 2025

Warning

Rate limit exceeded

@abdou6666 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 14 minutes and 44 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 78ad9ea and 574bacd.

📒 Files selected for processing (1)
  • api/src/nlp/services/nlp-sample.service.ts (2 hunks)

Walkthrough

The changes integrate NLP entity extraction into the chat workflow and enhance the NLP sample management in both backend and frontend. On the backend, the Chat module now imports the NLP module and registers the NLP sample model. The ChatService is updated to utilize the NlpSampleService, invoking a new method to upgrade samples with detected entities after message creation. The NlpSampleService introduces an upgradeSampleWithEntities method to associate entities and inferred language with samples and promote them for training. On the frontend, loading states for sample creation and updating are propagated to relevant components, and validation logic is centralized and extended.

Changes

| File(s) | Change Summary |
| --- | --- |
| api/src/chat/chat.module.ts | Updated ChatModule to import NlpModule and register NlpSampleModel in the Mongoose schema configuration. |
| api/src/chat/services/chat.service.ts | Injected NlpSampleService; after message creation, now upgrades NLP samples with entities; event emission order in handleNewMessage adjusted. |
| api/src/nlp/services/nlp-sample.service.ts | Added upgradeSampleWithEntities method to NlpSampleService to process entities, associate them with samples, infer language, and promote samples. |
| api/src/nlp/services/nlp-sample.service.spec.ts | Removed an extraneous blank line in the providers array; formatting cleanup only. |
| frontend/src/components/nlp/components/NlpSampleForm.tsx | Exposed mutation loading state (isUpdatingSample) from useUpdate hook and passed it as isMutationLoading prop to NlpDatasetSample. |
| frontend/src/components/nlp/components/NlpTrainForm.tsx | Added isMutationLoading prop to NlpDatasetSample; centralized and extended validation logic to include language and mutation loading state; updated disabling logic for the "Validate" button. |
| frontend/src/components/nlp/index.tsx | Exposed mutation loading state (isLoading) from useCreate hook and passed it as isMutationLoading prop to NlpDatasetSample. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ChatService
    participant NlpSampleService
    participant NlpSampleEntityService
    participant NlpLanguageModel

    User->>ChatService: Send message
    ChatService->>ChatService: Create message
    ChatService->>NlpSampleService: upgradeSampleWithEntities(entities, message)
    NlpSampleService->>NlpSampleService: Check message.text
    NlpSampleService->>NlpSampleService: Separate 'language' entity
    NlpSampleService->>NlpSampleService: Find sample by message.text and type 'inbox'
    alt Sample found
        NlpSampleService->>NlpSampleEntityService: storeSampleEntities(entities, sample)
        NlpSampleService->>NlpLanguageModel: Find language by code
        alt Language found
            NlpSampleService->>NlpSampleService: Update sample type to 'train' and set language
        else Language not found
            NlpSampleService->>NlpSampleService: Log warning
        end
    else Sample not found
        NlpSampleService->>NlpSampleService: Return
    end
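
The branches in the diagram can be read as a small decision function. The sketch below is a simplified, synchronous stand-in written for illustration only: `ParseEntity`, `Language`, `UpgradePlan`, and `planSampleUpgrade` are hypothetical names, the in-memory `languages` array replaces the real language lookup, and the actual service method is asynchronous and persists through repositories.

```typescript
// Hypothetical, simplified shapes; the project's real types carry more fields.
interface ParseEntity {
  entity: string;
  value: string;
}

interface Language {
  id: string;
  code: string;
}

interface UpgradePlan {
  // Entities to attach to the sample (everything except 'language').
  entitiesToStore: ParseEntity[];
  // Update to apply to the sample, or null when the language is unknown.
  update: { type: 'train'; language: string } | null;
}

// Mirrors the diagram: split off the 'language' entity, keep the rest for
// storage, and promote the sample only when the language code resolves.
function planSampleUpgrade(
  entities: ParseEntity[],
  languages: Language[],
): UpgradePlan {
  const inferred = entities.find((e) => e.entity === 'language');
  const entitiesToStore = entities.filter((e) => e.entity !== 'language');
  const language = inferred
    ? languages.find((l) => l.code === inferred.value)
    : undefined;

  return {
    entitiesToStore,
    update: language ? { type: 'train', language: language.id } : null,
  };
}
```

A caller would store `entitiesToStore` first, then apply `update` when it is non-null and log a warning otherwise.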

Suggested labels

needs-review

Suggested reviewers

  • marrouchi
  • MohamedAliBouhaouala
  • IkbelTalebHssan

Poem

A rabbit with code in its hat,
Upgraded the chat—imagine that!
Entities leap from each message anew,
Training samples promoted, language in view.
With loading states clear and validation in line,
This burrow of features is working just fine!
🐇✨

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
api/src/nlp/services/nlp-sample.service.ts (1)

309-349: Well-implemented method for upgrading NLP samples with entities.

This method effectively handles the process of upgrading inbox samples with detected entities and language information. The implementation includes:

  1. Proper validation with early returns when message text is missing
  2. Clean separation of language entity from other entities
  3. Appropriate sample lookup and entity storage
  4. Conditional language assignment with fallback handling

This addresses the PR's core issue of missing entities in NLP samples.

Consider enhancing error handling with more detailed logging, especially for the sample entity storage operation:

  async upgradeSampleWithEntities(
    entities: NLU.ParseEntity[],
    createdMessage: Message,
  ) {
    if (!('text' in createdMessage.message)) {
      this.logger.warn('Received message without text attribute');
      return;
    }
    const inferredLanguage = entities.find((e) => e.entity === 'language');
    const entitiesWithoutLanguage = entities.filter(
      (e) => e.entity !== 'language',
    );

    const foundSample = await this.repository.findOne({
      text: createdMessage.message.text,
      type: 'inbox',
    });
    if (!foundSample) {
+     this.logger.debug(`No inbox sample found for text: "${createdMessage.message.text.substring(0, 50)}..."`);
      return;
    }

+   try {
      await this.nlpSampleEntityService.storeSampleEntities(
        foundSample,
        entitiesWithoutLanguage,
      );
+     this.logger.debug(`Successfully stored ${entitiesWithoutLanguage.length} entities for sample ID: ${foundSample.id}`);
+   } catch (error) {
+     this.logger.error(`Failed to store sample entities: ${error.message}`, error.stack);
+     return;
+   }

    const language = await this.languageService.findOne(
      { code: inferredLanguage?.value },
      undefined,
      { _id: 1 },
    );

    if (!language) {
      this.logger.warn('Unable to find inferred language', inferredLanguage);
    }

+   try {
      await this.repository.updateOne(foundSample.id, {
        type: 'train',
        ...(language && { language: language.id }),
      });
+     this.logger.debug(`Successfully upgraded sample ${foundSample.id} to train type${language ? ` with language ${language.id}` : ''}`);
+   } catch (error) {
+     this.logger.error(`Failed to update sample: ${error.message}`, error.stack);
+   }
  }
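
One caveat with the suggested `catch` blocks: under TypeScript's `strict` mode (which enables `useUnknownInCatchVariables` since TypeScript 4.4), `error` is typed `unknown`, so `error.message` does not compile without narrowing. Whether this project compiles with that flag is not shown here. A minimal sketch of a narrowing helper, where `describeError` and `safeParse` are hypothetical names used only for illustration:

```typescript
// Hypothetical helper: normalizes an unknown thrown value into loggable parts.
function describeError(error: unknown): {
  message: string;
  stack: string | undefined;
} {
  if (error instanceof Error) {
    return { message: error.message, stack: error.stack };
  }
  // Non-Error values (strings, plain objects) can also be thrown in JavaScript.
  return { message: String(error), stack: undefined };
}

// Example of use inside a catch block.
function safeParse(json: string): unknown {
  try {
    return JSON.parse(json);
  } catch (error) {
    const { message } = describeError(error);
    console.warn(`Failed to parse JSON: ${message}`);
    return undefined;
  }
}
```

In the suggested diff, the same helper could feed `this.logger.error(...)` in place of the direct `error.message` and `error.stack` accesses.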
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b419dd5 and 78ad9ea.

📒 Files selected for processing (7)
  • api/src/chat/chat.module.ts (2 hunks)
  • api/src/chat/services/chat.service.ts (4 hunks)
  • api/src/nlp/services/nlp-sample.service.spec.ts (0 hunks)
  • api/src/nlp/services/nlp-sample.service.ts (2 hunks)
  • frontend/src/components/nlp/components/NlpSampleForm.tsx (2 hunks)
  • frontend/src/components/nlp/components/NlpTrainForm.tsx (4 hunks)
  • frontend/src/components/nlp/index.tsx (2 hunks)
💤 Files with no reviewable changes (1)
  • api/src/nlp/services/nlp-sample.service.spec.ts
🧰 Additional context used
🧬 Code Graph Analysis (2)
api/src/chat/chat.module.ts (1)
api/src/nlp/schemas/nlp-sample.schema.ts (1)
  • NlpSampleModel (79-82)
api/src/nlp/services/nlp-sample.service.ts (2)
api/src/helper/types.ts (1)
  • ParseEntity (18-24)
api/src/extensions/channels/web/types.ts (1)
  • Message (221-221)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: API-Tests
  • GitHub Check: Frontend Tests
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (14)
frontend/src/components/nlp/components/NlpSampleForm.tsx (2)

32-33: LGTM: Good loading state management

The extraction of loading state from the useUpdate hook with proper naming improves user experience by allowing the UI to reflect mutation status.


66-70: LGTM: Correctly propagating loading state

Passing the isUpdatingSample loading state to the NlpDatasetSample component enables proper UI feedback during sample updates.

frontend/src/components/nlp/index.tsx (2)

65-65: LGTM: Improved hook destructuring with descriptive naming

Renaming mutate to createSample and extracting the isLoading state improves code readability and enables proper loading state management.


94-97: LGTM: Consistent loading state propagation

Passing the loading state to the NlpDatasetSample component ensures consistent UI feedback during sample creation operations, aligning with the update functionality in other components.

api/src/chat/chat.module.ts (2)

19-20: LGTM: Proper imports for NLP integration

Adding imports for NlpModule and NlpSampleModel enables the required integration between chat and NLP functionality.


67-67: LGTM: Correct module registration for NLP integration

Registering the NlpSampleModel in Mongoose and including NlpModule in the imports provides the necessary foundation for NLP entity extraction in the chat workflow.

Also applies to: 74-74

api/src/chat/services/chat.service.ts (3)

32-32: LGTM: Clean dependency injection of NlpSampleService

Properly importing and injecting the NlpSampleService follows the application's dependency injection pattern and enables the NLP integration.

Also applies to: 50-50


147-154: LGTM: Effective implementation of entity extraction

This change correctly extracts NLP entities from the event, checks for their existence, and calls the NlpSampleService to upgrade samples with the detected entities, addressing the core issue.
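
The guard described here can be sketched as a small selector. The event shape, the name `entitiesForUpgrade`, and the choice to treat an empty entity list as nothing to do are assumptions for illustration; the real code reads entities from the project's own event object before calling `upgradeSampleWithEntities`.

```typescript
// Hypothetical, simplified entity shape.
interface ParseEntity {
  entity: string;
  value: string;
}

// Hypothetical minimal event: only the parsed NLP result matters here.
interface IncomingEvent {
  nlp?: { entities: ParseEntity[] } | null;
}

// Returns the entities worth passing to the sample upgrade, or null when the
// event carries no NLP result or an empty entity list.
function entitiesForUpgrade(event: IncomingEvent): ParseEntity[] | null {
  const entities = event.nlp?.entities ?? [];
  return entities.length > 0 ? entities : null;
}
```

The caller would then await the upgrade method only when the result is non-null.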


369-371: Verify event emission flow change

✅ Verification successful

The emission of the 'hook:chatbot:received' event was moved in the execution flow. Please verify that this change doesn't affect any dependencies that might rely on the specific order of events.

Run this script to check for other listeners of this event:


🏁 Script executed:

#!/bin/bash
# Search for other listeners of the 'hook:chatbot:received' event
rg -A 3 -B 3 "hook:chatbot:received" --type ts --type js --glob "!**/chat.service.ts"

Length of output: 85


Safe to relocate ‘hook:chatbot:received’ emission—no action needed
A repository-wide search found no subscribers or handlers for this event outside of chat.service.ts. Moving the emit call won’t impact any internal dependencies.

frontend/src/components/nlp/components/NlpTrainForm.tsx (4)

52-53: Good addition of loading state handling.

The new isMutationLoading prop allows the component to be aware of mutation operations (create/update) in progress, which improves user experience by preventing multiple submissions while a request is processing.

Also applies to: 58-59


101-101: LGTM - Language field is now tracked for validation.

Adding the language field to the watched values is necessary for proper form validation.


171-180: Improved validation logic with clearer structure.

Breaking down the validation logic into semantic variable names improves readability and maintainability. The new composite boolean shouldDisableValidateButton makes it explicit when the button should be disabled, including validation for language field that was previously missing.
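
A minimal sketch of what such a composite boolean might look like, detached from React. Only `shouldDisableValidateButton` and `isMutationLoading` are names taken from this review; the `TrainFormState` shape, the `text` check, and the helper `computeValidateButtonState` are assumptions, and the component's real conditions may differ.

```typescript
// Hypothetical snapshot of the watched form values plus the loading flag.
interface TrainFormState {
  text: string;
  language: string | undefined;
  isMutationLoading: boolean;
}

// Names each condition before combining them, so the disabled rule reads
// as a sentence rather than an inline expression.
function computeValidateButtonState(form: TrainFormState) {
  const hasText = form.text.trim().length > 0;
  const hasLanguage = form.language !== undefined && form.language !== '';
  const shouldDisableValidateButton =
    !hasText || !hasLanguage || form.isMutationLoading;

  return { hasText, hasLanguage, shouldDisableValidateButton };
}
```

Keeping the rule in a plain function like this also makes it easy to unit test without rendering the form.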


458-458: Clean implementation of button disabled state.

Using the composite boolean shouldDisableValidateButton is much cleaner than an inline condition and properly prevents submissions when a mutation is in progress or when form data is invalid.

api/src/nlp/services/nlp-sample.service.ts (1)

19-19: LGTM - Proper import for ParseEntity type.

The import of NLU from helper types is needed for the new method.

Development

Successfully merging this pull request may close these issues.

🐛 [BUG] - Nlp-sample missing entities when nlp enabled
1 participant