Commit 719bde0

Added vector search example using Azure OpenAI and Elastic (#224)
1 parent de2d5d7 commit 719bde0

2 files changed: +222 -0 lines changed

bin/find-notebooks-to-test.sh

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ EXEMPT_NOTEBOOKS=(
 "notebooks/integrations/gemini/qa-langchain-gemini-elasticsearch.ipynb"
 "notebooks/integrations/openai/openai-KNN-RAG.ipynb"
 "notebooks/integrations/gemma/rag-gemma-huggingface-elastic.ipynb"
+"notebooks/integrations/azure-openai/vector-search-azure-openai-elastic.ipynb"
 "notebooks/enterprise-search/app-search-engine-exporter.ipynb"
 )

Lines changed: 221 additions & 0 deletions
@@ -0,0 +1,221 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "c52e30d1-cb29-4e70-af4a-9c953fcb0f2e",
+   "metadata": {},
+   "source": [
+    "# Quickstart: Vector search using Azure OpenAI Embeddings and Elasticsearch\n",
+    "\n",
+    "This tutorial demonstrates how to use the [Azure OpenAI API](https://azure.microsoft.com/en-in/products/ai-services/openai-service) to create [embeddings](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/embeddings?tabs=console) and store them in Elasticsearch. Elasticsearch then lets us run a k-nearest neighbor (kNN) vector search to find similar documents."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "88303061-f357-43d8-8b63-c4f79e9a1746",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "* Elastic credentials - Create a [Cloud deployment](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud) to get your Elastic credentials (`ELASTIC_CLOUD_ID`, `ELASTIC_API_KEY`).\n",
+    "\n",
+    "* `AZURE_OPENAI_API_KEY` - To use the Azure OpenAI API, you need an API key. Follow the [quickstart](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython-new&pivots=programming-language-python#retrieve-key-and-endpoint) to retrieve your key and endpoint.\n",
+    "* `AZURE_OPENAI_ENDPOINT` - The endpoint of your Azure OpenAI resource.\n",
+    "* `AZURE_DEPLOYMENT_ID` - The deployment name you chose when you deployed the embedding model.\n",
+    "* `AZURE_OPENAI_API_VERSION` - The API version to use for this operation. This follows the YYYY-MM-DD format."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "76ca723c-6148-4682-a5ae-486e73cb2b94",
+   "metadata": {},
+   "source": [
+    "## Install packages"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ef1f1e52-f892-489f-8947-3e4698f5f5c3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install -q -U openai elasticsearch"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3d86d3fa-4ca0-41b6-a4bc-81bacf26bf02",
+   "metadata": {},
+   "source": [
+    "## Import packages and credentials"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bb62d8fb-6c34-44fd-bc94-18b644422ee8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from openai import AzureOpenAI\n",
+    "from elasticsearch import Elasticsearch, helpers\n",
+    "from getpass import getpass\n",
+    "import os\n",
+    "\n",
+    "ELASTIC_API_KEY = getpass(\"Elastic API Key: \")\n",
+    "ELASTIC_CLOUD_ID = getpass(\"Elastic Cloud ID: \")\n",
+    "\n",
+    "AZURE_OPENAI_API_KEY = getpass(\"Azure OpenAI API Key: \")\n",
+    "AZURE_OPENAI_ENDPOINT = getpass(\"Azure OpenAI Endpoint: \")\n",
+    "AZURE_DEPLOYMENT_ID = getpass(\"Azure Deployment ID: \")\n",
+    "AZURE_OPENAI_API_VERSION = getpass(\"Azure OpenAI API Version: \")\n",
+    "\n",
+    "elastic_index_name = \"azure-openai-vector-search-demo\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8b22dc16-c0a0-48f0-979d-5d21c17bd264",
+   "metadata": {},
+   "source": [
+    "## Embedding generation\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ca56532d-7c82-4e2b-aecf-2173520d3696",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def generate_embeddings(text):\n",
+    "    client = AzureOpenAI(\n",
+    "        api_key=AZURE_OPENAI_API_KEY,\n",
+    "        api_version=AZURE_OPENAI_API_VERSION,\n",
+    "        azure_endpoint=AZURE_OPENAI_ENDPOINT,\n",
+    "    )\n",
+    "\n",
+    "    response = client.embeddings.create(\n",
+    "        input=text,\n",
+    "        model=AZURE_DEPLOYMENT_ID,\n",
+    "    )\n",
+    "\n",
+    "    return response.data[0].embedding\n",
+    "\n",
+    "\n",
+    "sample_text = \"India generally experiences a hot summer from March to June, with temperatures often exceeding 40°C in central and northern regions. Monsoon season, from June to September, brings heavy rainfall, especially in the western coast and northeastern areas. Post-monsoon months, October and November, mark a transition with decreasing rainfall. Winter, from December to February, varies in temperature across the country, with colder conditions in the north and milder weather in the south. India's diverse climate is influenced by its geographical features, resulting in regional\"\n",
+    "embeddings = generate_embeddings(sample_text)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6239eda7-3bed-43dd-a6a8-a8369b907d5c",
+   "metadata": {},
+   "source": [
+    "## Connecting to Elasticsearch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7cbade18-3049-46f1-8d3e-5b22d4aade5b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "es = Elasticsearch(cloud_id=ELASTIC_CLOUD_ID, api_key=ELASTIC_API_KEY)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "20d070c8-9e19-48a3-bc3b-5f22067eb63f",
+   "metadata": {},
+   "source": [
+    "## Indexing a document with Elasticsearch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e02ca81e-7caa-4505-95c6-3c6be7843c8f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "doc = {\"text\": sample_text, \"text_embedding\": embeddings}\n",
+    "\n",
+    "resp = es.index(index=elastic_index_name, document=doc)\n",
+    "\n",
+    "print(resp)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "afa0d371-afbf-4f98-9cd1-ee457839f323",
+   "metadata": {},
+   "source": [
+    "## Searching for documents with Elasticsearch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "d71eeacc-d0c8-4035-b052-a1c03300aec0",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "ID: SxtQyY4BMvvuJ06pSACG\n",
+      "\n",
+      "Text: India generally experiences a hot summer from March to June, with temperatures often exceeding 40°C in central and northern regions. Monsoon season, from June to September, brings heavy rainfall, especially in the western coast and northeastern areas. Post-monsoon months, October and November, mark a transition with decreasing rainfall. Winter, from December to February, varies in temperature across the country, with colder conditions in the north and milder weather in the south. India's diverse climate is influenced by its geographical features, resulting in regional\n"
+     ]
+    }
+   ],
+   "source": [
+    "q = \"How's weather in India?\"\n",
+    "\n",
+    "embeddings = generate_embeddings(q)\n",
+    "\n",
+    "resp = es.search(\n",
+    "    index=elastic_index_name,\n",
+    "    knn={\n",
+    "        \"field\": \"text_embedding\",\n",
+    "        \"query_vector\": embeddings,\n",
+    "        \"k\": 10,\n",
+    "        \"num_candidates\": 100,\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "\n",
+    "for result in resp[\"hits\"][\"hits\"]:\n",
+    "    pretty_output = f\"\\n\\nID: {result['_id']}\\n\\nText: {result['_source']['text']}\"\n",
+    "    print(pretty_output)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
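
The notebook indexes `text_embedding` without creating the index first, so the field type is left to Elasticsearch's dynamic mapping. A minimal sketch of declaring the mapping explicitly might look like the following. The helper names `build_vector_mapping` and `create_vector_index`, the `cosine` similarity choice, and the `dims=1536` value in the demo call (the output size of `text-embedding-ada-002`; your deployment may differ) are assumptions for illustration and are not part of this commit.

```python
def build_vector_mapping(dims, similarity="cosine"):
    """Return index mappings with a text field and an indexed dense_vector field."""
    return {
        "properties": {
            "text": {"type": "text"},
            "text_embedding": {
                "type": "dense_vector",
                "dims": dims,  # must equal the length of the embedding vectors
                "index": True,  # index the vector so it can be used in kNN search
                "similarity": similarity,
            },
        }
    }


def create_vector_index(es, index_name, dims):
    """Create the index with an explicit mapping.

    `es` is assumed to be an elasticsearch-py 8.x client, as in the notebook.
    """
    mappings = build_vector_mapping(dims)
    return es.indices.create(index=index_name, mappings=mappings)


# Assumed dimension count for the demo; prefer len(embeddings) in practice.
mapping = build_vector_mapping(dims=1536)
print(mapping["properties"]["text_embedding"])
```

With the notebook's variables, this could be called as `create_vector_index(es, elastic_index_name, len(embeddings))` before the `es.index` call, which avoids hard-coding the dimension count.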

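
To make the `knn` search in the last notebook cell concrete: it ranks stored vectors by similarity to the query vector and returns the top `k`. Elasticsearch does this approximately, considering `num_candidates` candidates per shard. The sketch below is an exact brute-force version in plain Python, using toy 3-dimensional vectors invented for illustration. `cosine_similarity` and `brute_force_knn` are hypothetical helpers, not Elasticsearch APIs, and cosine similarity is an assumed metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_knn(query_vector, docs, k):
    """Return the k (doc_id, score) pairs most similar to query_vector, best first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in docs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors, invented for illustration; real embeddings have far more dimensions.
docs = {
    "weather": [0.9, 0.1, 0.0],
    "cricket": [0.1, 0.9, 0.0],
    "monsoon": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

for doc_id, score in brute_force_knn(query, docs, k=2):
    print(f"{doc_id}: {score:.3f}")
```

Scoring every stored vector like this becomes slow as the index grows, which is the usual reason for using an approximate search instead.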