
Commit c83d4c8

[NeuralChat] Update notebook (intel#928)
Signed-off-by: Liangyx2 <[email protected]>
1 parent 59f15e3 · commit c83d4c8

29 files changed: +372 additions, -642 deletions

intel_extension_for_transformers/neural_chat/docs/full_notebooks.md

Lines changed: 4 additions & 3 deletions
```diff
@@ -25,14 +25,15 @@ Welcome to use Jupyter Notebooks to explore how to build and customize chatbots
 | 3 | Optimizing Chatbots | | |
 | 3.1 | Enabling Chatbot with BF16 Optimization on SPR | Learn how to optimize chatbot using mixed precision on SPR | [Notebook](./notebooks/amp_optimization_on_spr.ipynb) |
 | 3.2 | Enabling Chatbot with BF16 Optimization on Habana Gaudi1/Gaudi2 | Learn how to optimze chatbot using mixed precision on Habana Gaudi1/Gaudi2 | [Notebook](./notebooks/amp_optimization_on_habana_gaudi.ipynb) |
-| 3.3 | Enabling Chatbot with BitsAndBytes Optimization on Nvidia A100 | Learn how to optimize chatbot using BitsAndBytes on Nvidia A100 | [Notebook](./notebooks/weight_only_optimization_on_nv_a100.ipynb) |
+| 3.3 | Enabling Chatbot with BitsAndBytes Optimization on Nvidia A100 | Learn how to optimize chatbot using BitsAndBytes on Nvidia A100 | [Notebook](./notebooks/bits_and_bytes_optimization_on_nv_a100.ipynb) |
 | 3.4 | Enabling Chatbot with Weight Only INT4 Optimization on SPR | Learn how to optimize chatbot using ITREX LLM graph Weight Only INT4 on SPR | [Notebook](./notebooks/itrex_llm_graph_int4_optimization_on_spr.ipynb) |
 | 4 | Fine-Tuning Chatbots | | |
 | 4.1 | Fine-tuning on SPR (Single Node) | Learn how to fine-tune chatbot on SPR with single node | [Notebook](./notebooks/single_node_finetuning_on_spr.ipynb) |
 | 4.2 | Fine-tuning on SPR (Multiple Nodes) | Learn how to fine-tune chatbot on SPR with multiple nodes | [Notebook](./notebooks/multi_node_finetuning_on_spr.ipynb) |
 | 4.3 | Fine-tuning on Habana Gaudi1/Gaudi2 (Single Card) | Learn how to fine-tune on Habana Gaudi1/Gaudi2 with single card | [Notebook](./notebooks/single_card_finetuning_on_habana_gaudi.ipynb) |
-| 4.4 | Fine-tuning on Habana Gaudi1/Gaudi2 (Multiple Cards) | Learn how to fine-tune on Habana Gaudi1/Gaudi2 with multiple cards | [Notebook](./notebooks/multi_card_finetuning_on_habana_gaudi.ipynb) |
-| 4.5 | Fine-tuning on Nvidia A100 (Single Card) | Learn how to fine-tune chatbot on Nvidia A100 | [Notebook](./notebooks/finetuning_on_nv_a100.ipynb) |
+| 4.4 | Fine-tuning on Nvidia A100 (Single Card) | Learn how to fine-tune chatbot on Nvidia A100 | [Notebook](./notebooks/finetuning_on_nv_a100.ipynb) |
+| 4.5 | Finetune Neuralchat on NVIDIA GPU | Learn how to fine-tune Neuralchat on Nvidia GPU | [Notebook](./notebooks/finetune_neuralchat_v2_on_Nvidia_GPU.ipynb) |
+| 4.6 | Finetuning or RAG for external knowledge | Learn how to fine-tune or RAG for external knowledge | [Notebook](./notebooks/Finetuning_or_RAG_for_external_knowledge.ipynb) |
 | 5 | Customizing Chatbots | | |
 | 5.1 | Enabling Plugins to Customize Chatbot | Learn how to customize chatbot using plugins | [Notebook](./notebooks/customize_chatbot_with_plugins.ipynb) |
 | 5.2 | Enabling Fine-tuned Models in Chatbot | Learn how to customize chatbot using fine-tuned models | [Notebook](./notebooks/customize_chatbot_with_finetuned_models.ipynb) |
```

intel_extension_for_transformers/neural_chat/docs/notebooks/Finetuning_or_RAG_for_external_knowledge.ipynb

Lines changed: 12 additions & 14 deletions
```diff
@@ -88,8 +88,11 @@
 "outputs": [],
 "source": [
 "!git clone https://github.com/intel/intel-extension-for-transformers.git\n",
-"!cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/\n",
-"!pip install -r requirements.txt"
+"%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/\n",
+"!pip install -r requirements.txt\n",
+"%cd ../../../\n",
+"!pip uninstall torch -y\n",
+"!pip install torch"
 ]
 },
 {
@@ -234,7 +237,6 @@
 "load_model(model_name=base_model_path,\n",
 "           tokenizer_name=base_model_path,\n",
 "           peft_path=peft_model_path,\n",
-"           device=\"cuda\",\n",
 "           )\n",
 "\n",
 "template = \"\"\"\n",
@@ -248,11 +250,10 @@
 "### Assistant:\n",
 "\"\"\"\n",
 "\n",
-"query = \"who founded cnvrg.io?\"\n",
+"query = \"What is cnvrg.io?\"\n",
 "\n",
 "params = {\n",
 "    \"prompt\": template.format(query),\n",
-"    \"device\": \"cuda\",\n",
 "    \"model_name\": base_model_path,\n",
 "    \"use_cache\": True,\n",
 "    \"repetition_penalty\": 1.0,\n",
@@ -264,11 +265,7 @@
 "    }\n",
 "\n",
 "for new_text in predict_stream(**params):\n",
-"    print(new_text, end=\"\", flush=True)\n",
-"\n",
-"\n",
-"\n",
-"# the response: The cnvrg.io was founded by Yochay Ettun and Leah Forkosh Kolben."
+"    print(new_text, end=\"\", flush=True)"
 ]
 },
 {
@@ -284,7 +281,9 @@
 "source": [
 "##### 1. prepare dataset\n",
 "\n",
-"the format for RAG, refer to: https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.jsonl"
+"the format for RAG, refer to: https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.jsonl\n",
+"\n",
+"For the example as follows, you can define the content of `doc` to be \"The cnvrg.io was founded by Yochay Ettun and Leah Forkosh Kolben.\""
 ]
 },
 {
@@ -307,7 +306,7 @@
 "plugins.retrieval.args['embedding_model'] = \"hkunlp/instructor-large\"\n",
 "plugins.retrieval.args['process'] = False\n",
 "\n",
-"plugins.retrieval.args[\"input_path\"] = './cnvrg_docs_rag'\n",
+"plugins.retrieval.args[\"input_path\"] = './cnvrg_docs_rag/'\n",
 "plugins.retrieval.args[\"persist_dir\"] = \"./test_dir\"\n",
 "plugins.retrieval.args[\"response_template\"] = \"check the result\"\n",
 "plugins.retrieval.args['search_type'] = \"similarity_score_threshold\"\n",
@@ -318,8 +317,7 @@
 "chatbot = build_chatbot(config)\n",
 "\n",
 "response = chatbot.predict(\"Who are the founders of cnvrg.io?\")\n",
-"\n",
-"# the response: Great, thank you for providing me with the necessary information! Based on your query and the context provided, I can confidently answer your question:\\nThe founders of cnvrg.io are Yochay Ettun and Leah Forkosh Kolben."
+"print('response',response)"
 ]
 }
 ],
```
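For context, here is a minimal sketch of how the RAG cell reads once these hunks are applied. The `plugins.retrieval.enable` flag and the `PipelineConfig(plugins=plugins)` construction are not visible in the hunks above and are assumed from the surrounding notebook:

```python
from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig, plugins

# Retrieval plugin settings as they appear in the diff.
plugins.retrieval.enable = True  # assumed: the enable flag sits outside the shown hunks
plugins.retrieval.args['embedding_model'] = "hkunlp/instructor-large"
plugins.retrieval.args['process'] = False
plugins.retrieval.args["input_path"] = './cnvrg_docs_rag/'  # trailing slash added by this commit
plugins.retrieval.args["persist_dir"] = "./test_dir"
plugins.retrieval.args["response_template"] = "check the result"
plugins.retrieval.args['search_type'] = "similarity_score_threshold"

config = PipelineConfig(plugins=plugins)  # assumed; construction not shown in the hunks
chatbot = build_chatbot(config)
response = chatbot.predict("Who are the founders of cnvrg.io?")
print('response', response)
```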

intel_extension_for_transformers/neural_chat/docs/notebooks/amp_optimization_on_habana_gaudi.ipynb

Lines changed: 4 additions & 4 deletions
```diff
@@ -24,7 +24,7 @@
 "git clone https://github.com/intel/intel-extension-for-transformers.git\n",
 "cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docker/\n",
 "docker build --build-arg UBUNTU_VER=22.04 -f Dockerfile -t neuralchat . --target hpu\n",
-"docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host neuralchat:latest\n",
+"docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all neuralchat:latest\n",
 "```\n"
 ]
 },
@@ -41,9 +41,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from intel_extension_for_transformers.neural_chat import build_chatbot\n",
-"from intel_extension_for_transformers.neural_chat.config import PipelineConfig, MixedPrecisionConfig\n",
-"config = PipelineConfig(optimization_config=MixedPrecisionConfig(), model_name_or_path='Intel/neural-chat-7b-v1-1')\n",
+"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
+"from intel_extension_for_transformers.transformers import MixedPrecisionConfig\n",
+"config = PipelineConfig(optimization_config=MixedPrecisionConfig())\n",
 "chatbot = build_chatbot(config)\n",
 "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
 "print(response)"
```

intel_extension_for_transformers/neural_chat/docs/notebooks/amp_optimization_on_spr.ipynb

Lines changed: 0 additions & 14 deletions
```diff
@@ -57,20 +57,6 @@
 "%cd ../../../"
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Prepare the model"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`."
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
```

intel_extension_for_transformers/neural_chat/docs/notebooks/weight_only_optimization_on_nv_a100.ipynb renamed to intel_extension_for_transformers/neural_chat/docs/notebooks/bits_and_bytes_optimization_on_nv_a100.ipynb

Lines changed: 18 additions & 7 deletions
```diff
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Weight Only Quantization Optimization of Chatbot on Nvidia's A100"
+"# Bits And Bytes Optimization of Chatbot on Nvidia's A100"
 ]
 },
 {
@@ -44,15 +44,26 @@
 "outputs": [],
 "source": [
 "!git clone https://github.com/intel/intel-extension-for-transformers.git\n",
-"!cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/\n",
-"!pip install -r requirements.txt"
+"%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/\n",
+"!pip install -r requirements.txt\n",
+"%cd ../../../"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"!pip uninstall torch -y\n",
+"!pip install torch"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Weight Only Quantization"
+"## BitsAndBytes Optimization"
 ]
 },
 {
@@ -61,9 +72,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from intel_extension_for_transformers.neural_chat import build_chatbot\n",
-"from intel_extension_for_transformers.neural_chat.config import PipelineConfig, WeightOnlyQuantConfig\n",
-"config = PipelineConfig(optimization_config=WeightOnlyQuantConfig(), model_name_or_path='neural-chat-7b-v1-1')\n",
+"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
+"from intel_extension_for_transformers.transformers import BitsAndBytesConfig\n",
+"config = PipelineConfig(optimization_config=BitsAndBytesConfig(), device=\"cuda\")\n",
 "chatbot = build_chatbot(config)\n",
 "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
 "print(response)"
```

intel_extension_for_transformers/neural_chat/docs/notebooks/build_chatbot_on_habana_gaudi.ipynb

Lines changed: 0 additions & 14 deletions
```diff
@@ -35,20 +35,6 @@
 "```\n"
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# Prepare the model"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`."
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
```

intel_extension_for_transformers/neural_chat/docs/notebooks/build_chatbot_on_icx.ipynb

Lines changed: 0 additions & 14 deletions
```diff
@@ -76,20 +76,6 @@
 "!conda list"
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# Prepare the model"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`."
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
```

intel_extension_for_transformers/neural_chat/docs/notebooks/build_chatbot_on_nv_a100.ipynb

Lines changed: 0 additions & 14 deletions
```diff
@@ -69,20 +69,6 @@
 "!conda list"
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# Prepare the model"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`."
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
```

intel_extension_for_transformers/neural_chat/docs/notebooks/build_chatbot_on_spr.ipynb

Lines changed: 0 additions & 14 deletions
```diff
@@ -78,20 +78,6 @@
 "!conda list"
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# Prepare the model"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`."
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
```

intel_extension_for_transformers/neural_chat/docs/notebooks/build_chatbot_on_xpu.ipynb

Lines changed: 0 additions & 57 deletions
```diff
@@ -103,20 +103,6 @@
 "Notes: If you face \"GLIBCXX_3.4.30\" not found issue in conda environment, please remove lib/libstdc++* from conda environment. "
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# Prepare the model"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with huggingface-cli login or by passing token=<your_token>."
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -197,49 +183,6 @@
 "response = chatbot.predict(\"How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?\")\n",
 "print(response)"
 ]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Voice Chat with ATS & TTS Plugin"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"In the context of voice chat, users have the option to engage in various modes: utilizing input audio and receiving output audio, employing input audio and receiving textual output, or providing input in textual form and receiving audio output.\n",
-"\n",
-"For the Python API code, users have the option to enable different voice chat modes by setting ASR and TTS plugins enable or disable."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/speaker_embeddings/spk_embed_default.pt\n",
-"!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/welcome.wav"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from intel_extension_for_transformers.neural_chat import PipelineConfig\n",
-"from intel_extension_for_transformers.neural_chat import build_chatbot, plugins\n",
-"plugins.asr.enable = True\n",
-"plugins.tts.enable = True\n",
-"plugins.tts.args[\"output_audio_path\"]=\"./output_audio.wav\"\n",
-"config = PipelineConfig(plugins=plugins, device='xpu')\n",
-"chatbot = build_chatbot(config)\n",
-"result = chatbot.predict(query=\"./welcome.wav\")\n",
-"print(result)"
-]
 }
 ],
 "metadata": {
```
