@zahidulhaque
Collaborator

Description

This PR adds support for Microsoft SQL Server 2025 to several components of the GenAIComps and OPEA frameworks. SQL Server can now serve as the backend for both the Dataprep and Retriever microservices, with Docker-based deployment, environment configuration, and embedding support. The PR also adds setup and usage documentation.


Key Features

Docker Deployment for SQL Server

  • Added compose.yaml under comps/third_parties/sqlserver/deployment/docker_compose/ for containerized SQL Server setup.
  • Includes:
    • Health checks
    • Persistent volume configuration
    • Environment variable support
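
The PR description does not show the health check command used in compose.yaml. As a rough client-side illustration only, the sketch below polls a TCP port until it accepts connections. The function names `port_is_open` and `wait_for_port` are hypothetical, and 1433 is SQL Server's default port rather than a value taken from this PR. A successful TCP connection shows only that the listener is up, not that logins succeed.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, retries: int = 10, delay: float = 3.0) -> bool:
    """Poll the port until it accepts connections or the retries run out."""
    for _ in range(retries):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    # Assumption: SQL Server is published on localhost at its default port.
    print("SQL Server reachable:", wait_for_port("localhost", 1433, retries=3))
```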

Documentation Enhancements

  • New README.md under comps/third_parties/sqlserver/src/ with:
    • Docker and Docker Compose instructions
    • Environment setup (e.g., MSSQL_SA_PASSWORD)
    • Usage examples for docker run and docker compose
  • SQL Server-specific README_sqlserver.md added to:
    • dataprep/src
    • retrievers/src
  • Main README files updated to reference SQL Server documentation.

Dataprep Microservice Integration

  • New component: OpeaSqlServerDataprep in integrations/sqlserver.py
  • Features:
    • Ingestion from files and URLs
    • Embedding via HuggingFace or TEI
    • Storage in SQL Server using SQLServer_VectorStore
    • File management: upload, delete, structure retrieval
    • Health check for SQL Server connectivity
  • Registered in opea_dataprep_microservice.py
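
To illustrate how ingested text is typically split before embedding, here is a minimal character-based sketch of chunking with overlap. It is a simplified stand-in, not the splitter used by OpeaSqlServerDataprep, which this description does not show. The function name `split_text` is hypothetical.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters so that content cut
    at a chunk boundary still appears intact in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Each chunk would then be embedded and stored as one row in the vector store.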

Retriever Microservice Integration

  • New component: OpeaSqlServerRetriever in integrations/sqlserver.py
  • Features:
    • Vector search using SQL Server
    • Embedding support (local & TEI)
    • Health check on startup
  • Registered in opea_retrievers_microservice.py
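
Conceptually, a vector search ranks stored chunks by their distance to the query embedding. The sketch below does this in memory with cosine distance. In the actual component the ranking is delegated to SQL Server, and the distance metric is not stated in this description, so cosine is an assumption here. The names `cosine_distance` and `top_k` are hypothetical.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the texts of the k rows whose embeddings are closest to query."""
    scored = [(cosine_distance(query, vector), text) for text, vector in rows]
    scored.sort(key=lambda pair: pair[0])
    return [text for _, text in scored[:k]]


if __name__ == "__main__":
    rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    print(top_k([1.0, 0.0], rows, k=2))
```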

Environment & Configuration

  • New environment variables:
    • MSSQL_CONNECTION_STRING
    • TABLE_NAME
    • CHUNK_SIZE, CHUNK_OVERLAP for splitting documents into chunks before embedding
  • Docker services:
    • dataprep-sqlserver
    • retriever-sqlserver
  • Health checks and logging for debugging
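
As a rough sketch of how a service might read these variables at startup, the example below loads them into a small settings object. The class `SqlServerSettings`, the function `load_settings`, and every default value are hypothetical. The PR description does not state the defaults or how the components parse these variables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SqlServerSettings:
    connection_string: str
    table_name: str
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> SqlServerSettings:
    """Build settings from environment variables, failing fast on bad input."""
    env = os.environ if env is None else env

    connection_string = env.get("MSSQL_CONNECTION_STRING", "")
    if not connection_string:
        raise ValueError("MSSQL_CONNECTION_STRING must be set")

    # The defaults below are placeholders chosen for this sketch only.
    chunk_size = int(env.get("CHUNK_SIZE", "1500"))
    chunk_overlap = int(env.get("CHUNK_OVERLAP", "100"))
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")

    return SqlServerSettings(
        connection_string=connection_string,
        table_name=env.get("TABLE_NAME", "example_table"),
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to test without touching the process environment.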

Issues

NA

Type of change

  • New feature (non-breaking change which adds new functionality)

Dependencies

Added to requirements-cpu.txt:

  • pyodbc==5.2.0
  • langchain-sqlserver==0.1.1

Dockerfiles updated to install:

  • Microsoft ODBC drivers
  • SQL Server tools

Tests

Describe the tests that you ran to verify your changes.

Collaborator

@letonghan left a comment

Thanks @zahidulhaque for your great contribution!
Here are some comments; please check and refine. Thanks!

@zahidulhaque
Collaborator Author

@letonghan Thanks for your review and feedback! I've addressed the comments. Can you please have a look?

Collaborator

@letonghan left a comment

LGTM, thanks @zahidulhaque!

@CICD-at-OPEA
Collaborator

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@joshuayao
Collaborator

Thanks @zahidulhaque for your contribution. Could you please help resolve the conflicts?

@joshuayao requested a review from ZePan110 on September 10, 2025, 01:23

@zahidulhaque
Collaborator Author

Hi @joshuayao, thanks for reviewing. The merge conflicts have been resolved. Please check.

@zahidulhaque
Collaborator Author

Hi @joshuayao, thanks for reviewing. The merge conflicts have been resolved. Please check.

During testing today, I encountered a build failure. The issue stems from the recent upgrade of the base image (python:3.11-slim) used by the dataprep and retrievers components to Debian 13, which was officially released last month. Unfortunately, Microsoft does not yet provide official support for the ODBC Driver for SQL Server on Debian 13.

I am currently exploring workarounds to enable compatibility until Microsoft releases an officially supported driver. I will push a fix once I identify a viable solution and validate it end-to-end.

@joshuayao added this to OPEA on Sep 16, 2025
@joshuayao added this to the v1.5 milestone on Sep 16, 2025

@zahidulhaque
Collaborator Author

The issue with Debian 13 has been resolved. Please check.

@joshuayao
Collaborator

> The issue with Debian 13 has been resolved. Please check.

Thanks. Could you please check the hadolint check failure?

Signed-off-by: Zahidul Haque <[email protected]>
@zahidulhaque
Collaborator Author

> > The issue with Debian 13 has been resolved. Please check.
>
> Thanks. Could you please check the hadolint check failure?

The hadolint failure issue has been resolved.

@joshuayao
Collaborator

> > > The issue with Debian 13 has been resolved. Please check.
> >
> > Thanks. Could you please check the hadolint check failure?
>
> The hadolint failure issue has been resolved.

Thanks. Could you please check the CI failures?

@zahidulhaque
Collaborator Author

I have reviewed the CI failures and it looks like they are not related to the changes I made in this commit.

@srinarayan-srikanthan

@CICD-at-OPEA
Collaborator

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
