Commit fd3466b

Merge pull request #129 from snexus/feature/bump-version
This PR updates package dependencies to uv-based management, removes support for llama-cpp, and introduces an MCP server for semantic search and RAG answer operations:

* Dependency upgrades and configuration updates
* Removal of llama-cpp support and introduction of an MCP server
* Updates to configuration and cache handling throughout the codebase
2 parents 13927a9 + 864c22c commit fd3466b

20 files changed: +6379 -418 lines

.flake8

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+[flake8]
+max-line-length = 119

.github/workflows/release.yml

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ jobs:
     name: Build source distribution
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4

       - uses: actions/[email protected]
         with:
@@ -22,7 +22,7 @@ jobs:
       - name: Run build
         run: python -m build

-      - uses: actions/upload-artifact@v3
+      - uses: actions/upload-artifact@v4
         with:
           path: ./dist/*

@@ -34,7 +34,7 @@ jobs:
     runs-on: ubuntu-latest

     steps:
-      - uses: actions/download-artifact@v3
+      - uses: actions/download-artifact@v4
         with:
           name: artifact
           path: ./dist

README.md

Lines changed: 9 additions & 6 deletions
@@ -4,10 +4,13 @@

 [Documentation](https://llm-search.readthedocs.io/en/latest/)

-The purpose of this package is to offer a convenient question-answering (RAG) system with a simple YAML-based configuration that enables interaction with multiple collections of local documents. Special attention is given to improvements in various components of the system **in addition to basic LLM-based RAGs** - better document parsing, hybrid search, HyDE enabled search, chat history, deep linking, re-ranking, the ability to customize embeddings, and more. The package is designed to work with custom Large Language Models (LLMs) – whether from OpenAI or installed locally.
+The purpose of this package is to offer an advanced question-answering (RAG) system with a simple YAML-based configuration that enables interaction with a collection of local documents. Special attention is given to improvements in various components of the system **in addition to basic LLM-based RAGs** - better document parsing, hybrid search, HyDE, chat history, deep linking, re-ranking, the ability to customize embeddings, and more. The package is designed to work with custom Large Language Models (LLMs) – whether from OpenAI or installed locally.
+
+Interaction with the package is supported through the built-in frontend, or by exposing an MCP server, allowing clients like Cursor, Windsurf or VSCode GH Copilot to interact with the RAG system.

 ## Features

+* Fast, incremental parsing and embedding of medium-size document bases (tested on up to a few gigabytes of markdown and PDFs)
 * Supported document formats
   * Built-in parsers:
     * `.md` - Divides files based on logical components such as headings, subheadings, and code blocks. Supports additional features like cleaning image links, adding custom metadata, and more.
@@ -17,12 +20,13 @@ The purpose of this package is to offer a convenient question-answering (RAG) sy
   * For the list of formats, see [here](https://unstructured-io.github.io/unstructured/core/partition.html).

 * Allows interaction with embedded documents, internally supporting the following models and methods (including locally hosted):
-  * OpenAI models (ChatGPT 3.5/4 and Azure OpenAI).
+  * OpenAI compatible models and APIs.
   * HuggingFace models.
-  * Llama cpp supported models - for full list see [here](https://github.com/ggerganov/llama.cpp#description).

 * Interoperability with LiteLLM + Ollama via OpenAI API, supporting hundreds of different models (see [Model configuration for LiteLLM](sample_templates/llm/litellm.yaml))

+* SSE MCP Server enabling interface with popular MCP clients.
+
 * Generates dense embeddings from a folder of documents and stores them in a vector database ([ChromaDB](https://github.com/chroma-core/chroma)).
   * The following embedding models are supported:
     * Hugging Face embeddings.
@@ -50,12 +54,11 @@ The purpose of this package is to offer a convenient question-answering (RAG) sy

 * Supports optional chat history with question contextualization

-
 * Other features
-  * Simple CLI and web interfaces.
+  * Simple web interfaces.
   * Deep linking into document sections - jump to an individual PDF page or a header in a markdown file.
   * Ability to save responses to an offline database for future analysis.
-  * Experimental API
+  * FastAPI based API + MCP server, allowing interaction from MCP clients.


 ## Demo
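
The "FastAPI based API + MCP server" bullet above is only a one-liner, so here is a brief sketch (not part of the commit) of how the REST side could be inspected once the server is running. The endpoint paths are not shown in this diff, but `/openapi.json` is the default schema route for any FastAPI app, so the routes registered by the server can be discovered from it; the host and port below are assumptions based on the usage docs further down.

```python
# A minimal sketch, not from this commit: assumes a running server on
# localhost:8000. FastAPI serves its OpenAPI schema at /openapi.json by
# default; the interactive /docs page renders this same schema.
import requests

schema = requests.get("http://localhost:8000/openapi.json").json()

# List every route and the HTTP methods it supports.
for path, operations in schema.get("paths", {}).items():
    print(path, "->", ", ".join(op.upper() for op in operations))
```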

docker/Dockerfile

Lines changed: 0 additions & 81 deletions
This file was deleted.

docker/entrypoint.sh

Lines changed: 0 additions & 27 deletions
This file was deleted.

docs/installation.rst

Lines changed: 12 additions & 20 deletions
@@ -8,6 +8,7 @@ Prerequisites
 * Tested with CUDA 11.8 to 12.4 - https://developer.nvidia.com/cuda-toolkit
 * To interact with OpenAI models, create `.env` in the root directory of the repository, containing the OpenAI API key. A template for the `.env` file is provided in `.env_template`
 * For parsing `.epub` documents, Pandoc is required - https://pandoc.org/installing.html
+* `uv` - https://github.com/astral-sh/uv#installation


@@ -16,21 +17,16 @@ Install Latest Version

 .. code-block:: bash
-
     # Create a new environment
-    python3 -m venv .venv
+    uv venv

     # Activate new environment
     source .venv/bin/activate

-    # Install packages using pip
-    pip install pyllmsearch
-
     # Optional dependencies for Azure parser
-    pip install "pyllmsearch[azureparser]"
+    uv pip install "pyllmsearch[azureparser]"

     # Preferred method (much faster) - install packages using uv
-    pip install uv
     uv pip install pyllmsearch

@@ -45,20 +41,16 @@ Install from source

     git clone https://github.com/snexus/llm-search.git
     cd llm-search
+    # Create a new environment
+    uv venv
+    # Activate new environment
+    source .venv/bin/activate
+    # Install packages using uv

-    # Optional - Set variables for llama-cpp to compile with CUDA.
-    # Assuming Nvidia CUDA Toolkit is installed and pointing to `usr/local/cuda` on Ubuntu
-
-    source ./setvars.sh
-
-    # Optional - Install newest stable torch for CUDA 11.x
-    # pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
-
-    # or for CUDA 12.x version
-    # pip3 install torch torchvision
+    uv sync

-    # Install the package
-    pip install . # or `pip install -e .` for development
+    # Optional - install in development mode
+    uv pip install -e .

     # For Azure parser, install with optional dependencies
-    pip install ."[azureparser]"
+    uv pip install ."[azureparser]"

docs/usage.rst

Lines changed: 7 additions & 4 deletions
@@ -46,11 +46,14 @@ To interact with the documents using one of the supported LLMs, follow these ste
 Here `path/to/config/folder` points to a folder with one or more document config files. The tool scans the configs and allows switching between them.


-API (Experimental)
------------------
+API and MCP Server
+------------------

-To launch an api, supply a path config file in the `FASTAPI_LLM_CONFIG` environment variable and launch `llmsearchapi`
+To launch the FastAPI/MCP server, supply a path to the semantic search config file in `FASTAPI_RAG_CONFIG` and a path to the LLM config in `FASTAPI_LLM_CONFIG`, then launch `llmsearchapi`:

 .. code-block:: bash

-    FASTAPI_LLM_CONFIG="/path/to/config.yaml" llmsearchapi
+    FASTAPI_RAG_CONFIG="/path/to/config.yaml" FASTAPI_LLM_CONFIG="/path/to/llm.yaml" llmsearchapi
+
+1. The API server will be available at `http://localhost:8000/docs` and can be used to interact with the documents using the LLMs.
+2. The MCP server will be available at `http://localhost:8000/mcp` and can be used from any MCP client that supports SSE servers, pointed at that URL.
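
The commit adds only the server side. As a client-side illustration, here is a minimal sketch of connecting to the SSE endpoint above, assuming the official MCP Python SDK (the `mcp` package on PyPI); the tool names the server registers are not visible in this diff, so the sketch simply lists whatever the server advertises rather than calling a specific tool.

```python
# A minimal sketch (not from this commit), assuming the official MCP Python
# SDK ("mcp" on PyPI) and the SSE endpoint documented above.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    # Open an SSE transport to the MCP endpoint exposed by llmsearchapi.
    async with sse_client("http://localhost:8000/mcp") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover the semantic search / RAG answer tools the server exposes.
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)


asyncio.run(main())
```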

pyproject.toml

Lines changed: 55 additions & 4 deletions
@@ -5,17 +5,43 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "pyllmsearch"
 description = "LLM Powered Advanced RAG Application"
-dynamic = ["dependencies", "version"]
+# dynamic = ["dependencies", "version"]
+dynamic = ["version"]
 keywords = ["llm", "rag", "retrieval-augemented-generation","large-language-models", "local", "splade", "hyde", "reranking", "chroma", "openai"]
 readme = "README.md"
-requires-python = ">=3.9"
+requires-python = ">=3.10"
 classifiers = [
     "License :: OSI Approved :: MIT License",
     "Operating System :: OS Independent",
     "Programming Language :: Python :: 3 :: Only",
     "Programming Language :: Python :: 3.10",
     "Programming Language :: Python :: 3.11",
 ]
+dependencies = [
+    "langchain-community>=0.3.22",
+    "langchain>=0.3.24",
+    "langchain-huggingface>=0.1.2",
+    "langchain-chroma>=0.2.3",
+    "python-dotenv>=1.1.0",
+    "loguru>=0.7.3",
+    "click>=8.1.8",
+    "openai>=1.76.0",
+    "streamlit>=1.44.1",
+    "tenacity>=9.1.2",
+    "tqdm>=4.67.1",
+    "gmft==0.2.1",
+    "pypdf2>=3.0.1",
+    "pydantic>=2.11.3",
+    "instructorembedding>=1.0.1",
+    "unstructured>=0.17.2",
+    "tiktoken>=0.9.0",
+    "tokenizers>=0.21.1",
+    "langchain-openai>=0.3.14",
+    "python-docx>=1.1.2",
+    "pymupdf>=1.25.5",
+    "termcolor>=3.0.1",
+    "fastapi-mcp>=0.3.3",
+]

 [project.optional-dependencies]

@@ -35,6 +61,10 @@ azureparser = [
     "azure-identity==1.17.1"
 ]

+googleparser = [
+    "google-generativeai>=0.8.5",
+]
+
 [project.urls]
 Homepage = "https://github.com/snexus/llm-search"
 Documentation = "https://llm-search.readthedocs.io/en/latest/"
@@ -46,9 +76,30 @@ local_scheme = "no-local-version"
 [tool.setuptools.packages.find]
 where = ["src"]

-[tool.setuptools.dynamic]
-dependencies = {file = ["requirements.txt"]}
+# [tool.setuptools.dynamic]
+# dependencies = {file = ["requirements.txt"]}

+[tool.flake8]
+docstring-convention = "all"
+ignore = [
+    "D107",
+    "D212",
+    "E501",
+    "W503",
+    "W605",
+    "D203",
+    "D100",
+    "D400",
+    "D415",
+    "D104",
+    "D203",
+    "D213",
+    "D401",
+    "D406",
+    "D417",
+]
+exclude = [ "venv" ]
+max-line-length = 119

 [tool.ruff]
 # Decrease the maximum line length to 79 characters.
