diff --git a/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb b/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb index 402bdef9..76f30297 100644 --- a/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb +++ b/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb @@ -2,9 +2,8 @@ "cells": [ { "cell_type": "markdown", - "id": "cc93d05f", "metadata": { - "id": "cc93d05f" + "id": "dp-hXFhhyWve" }, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb)" @@ -12,133 +11,114 @@ }, { "cell_type": "markdown", - "id": "hcqKO0aI6_PI", "metadata": { - "id": "hcqKO0aI6_PI" + "id": "qGjfYQBcyWve" }, "source": [ - "#### [LangChain Handbook](https://pinecone.io/learn/langchain)\n", + "#### [LangChain Handbook](https://www.pinecone.io/learn/series/langchain/)\n", "\n", - "# Conversational Memory\n", + "# Conversational Memory with LCEL\n", "\n", "Conversational memory is how chatbots can respond to our queries in a chat-like manner. It enables a coherent conversation, and without it, every query would be treated as an entirely independent input without considering past interactions.\n", "\n", - "The memory allows a _\"agent\"_ to remember previous interactions with the user. By default, agents are *stateless* — meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input, nothing else.\n", + "The memory allows an _\"agent\"_ to remember previous interactions with the user. By default, agents are *stateless* — meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input, nothing else.\n", "\n", "There are many applications where remembering previous interactions is very important, such as chatbots. Conversational memory allows us to do that.\n", "\n", - "In this notebook we'll explore this form of memory in the context of the LangChain library.\n", + "In this notebook we'll explore conversational memory using modern LangChain Expression Language (LCEL) and the recommended `RunnableWithMessageHistory` class.\n", "\n", "We'll start by importing all of the libraries that we'll be using in this example." 
] }, { "cell_type": "code", - "execution_count": 1, - "id": "uZR3iGJJtdDE", + "execution_count": 5, "metadata": { - "id": "uZR3iGJJtdDE", "colab": { "base_uri": "https://localhost:8080/" }, - "outputId": "98873b1a-5688-4f64-c400-e17be707c56b" + "id": "ETg8fr8-yWvf", + "outputId": "af6d2f99-b18a-473e-84f3-b513e8f945f0" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m344.0/344.0 KB\u001b[0m \u001b[31m6.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", - "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m70.1/70.1 KB\u001b[0m \u001b[31m3.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", - "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m41.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", - "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m73.5/73.5 KB\u001b[0m \u001b[31m785.2 kB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", - "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m62.8/62.8 KB\u001b[0m \u001b[31m2.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", - "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.1/2.1 MB\u001b[0m \u001b[31m38.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/2.5 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[91m━━━━━━━━━\u001b[0m\u001b[90m╺\u001b[0m\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.6/2.5 MB\u001b[0m \u001b[31m16.8 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K \u001b[91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m \u001b[32m2.5/2.5 MB\u001b[0m \u001b[31m40.6 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.5/2.5 MB\u001b[0m \u001b[31m28.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/45.2 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m45.2/45.2 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/50.9 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m50.9/50.9 kB\u001b[0m \u001b[31m3.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25h" ] } ], "source": [ - "!pip install -qU langchain openai tiktoken" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "66fb9c2a", - "metadata": { - "id": "66fb9c2a" - }, - "outputs": [], - "source": [ - "import inspect\n", - "\n", - "from getpass import getpass\n", - "from langchain import OpenAI\n", - "from langchain.chains import LLMChain, ConversationChain\n", - "from langchain.chains.conversation.memory import (ConversationBufferMemory, \n", - " ConversationSummaryMemory, \n", - " ConversationBufferWindowMemory,\n", - " ConversationKGMemory)\n", - "from langchain.callbacks import get_openai_callback\n", - "import tiktoken" + "!pip install -qU \\\n", + " langchain==0.3.25 \\\n", + " langchain-community==0.3.25 
\\\n", + "    langchain-openai==0.3.22 \\\n", + "    tiktoken==0.9.0" ] }, { "cell_type": "markdown", - "id": "wPdWz1IdxyBR", "metadata": { - "id": "wPdWz1IdxyBR" + "id": "FSvjQpbKyWvf" }, "source": [ - "To run this notebook, we will need to use an OpenAI LLM. Here we will setup the LLM we will use for the whole notebook, just input your openai api key when prompted. " + "To run this notebook, we will need to use an OpenAI LLM. Here we will set up the LLM we will use for the whole notebook. Just enter your OpenAI API key if prompted; otherwise the `OPENAI_API_KEY` environment variable will be used." ] }, { "cell_type": "code", - "execution_count": 3, - "id": "c02c4fa2", + "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "c02c4fa2", - "outputId": "ed941db8-a50d-4e7d-d302-7b6b8c371c25" + "id": "nnquGYaQyWvf", + "outputId": "273a42f7-25c3-4a7e-ca15-3e4798580a19" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "··········\n" + "Enter your OpenAI API key: ··········\n" ] } ], "source": [ - "OPENAI_API_KEY = getpass()" + "import os\n", + "from getpass import getpass\n", + "\n", + "os.environ[\"OPENAI_API_KEY\"] = os.getenv(\"OPENAI_API_KEY\") \\\n", + "    or getpass(\"Enter your OpenAI API key: \")\n", + "\n", + "OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\")" ] }, { "cell_type": "code", - "execution_count": 25, - "id": "baaa74b8", + "execution_count": 3, "metadata": { - "id": "baaa74b8" + "id": "wFhsehZEyWvf" }, "outputs": [], "source": [ - "llm = OpenAI(\n", - "    temperature=0, \n", + "from langchain_openai import ChatOpenAI\n", + "\n", + "llm = ChatOpenAI(\n", + "    temperature=0,\n", "    openai_api_key=OPENAI_API_KEY,\n", - "    model_name='text-davinci-003' # can be used with llms like 'gpt-3.5-turbo'\n", + "    model_name='gpt-4.1-mini'\n", ")" ] }, { "cell_type": "markdown", - "id": "309g_2pqxzzB", "metadata": { - "id": "309g_2pqxzzB" + "id": "qPQgxde4yWvf" }, "source": [ "Later we will make use of a `count_tokens` utility function. This will allow us to count the number of tokens we are using for each call. We define it as so:" @@ -146,16 +126,25 @@ }, { "cell_type": "code", - "execution_count": 26, - "id": "DsC3szr6yP3L", + "execution_count": 6, "metadata": { - "id": "DsC3szr6yP3L" + "id": "YG0RXg5PyWvf" }, "outputs": [], "source": [ - "def count_tokens(chain, query):\n", + "from langchain.callbacks import get_openai_callback\n", + "\n", + "def count_tokens(pipeline, query, config=None):\n", "    with get_openai_callback() as cb:\n", - "        result = chain.run(query)\n", + "        # Handle both dict and string inputs\n", + "        if isinstance(query, str):\n", + "            query = {\"query\": query}\n", + "\n", + "        # Use provided config or default\n", + "        if config is None:\n", + "            config = {\"configurable\": {\"session_id\": \"default\"}}\n", + "\n", + "        result = pipeline.invoke(query, config=config)\n", "        print(f'Spent a total of {cb.total_tokens} tokens')\n", "\n", "    return result" @@ -163,917 +152,877 @@ }, { "cell_type": "markdown", - "id": "CnNF6i9r8RY_", "metadata": { - "id": "CnNF6i9r8RY_" + "id": "yPk7c5IgyWvf" }, "source": [ - "Now let's dive into **Conversational Memory**."
+ "Now let's dive into **Conversational Memory** using LCEL.\n", + "\n", + "## What is memory?\n", + "\n", + "**Definition**: Memory is an agent's capacity of remembering previous interactions with the user (think chatbots)\n", + "\n", + "The official definition of memory is the following:\n", + "\n", + "> By default, Chains and Agents are stateless, meaning that they treat each incoming query independently. In some applications (chatbots being a GREAT example) it is highly important to remember previous interactions, both at a short term but also at a long term level. The concept of \"Memory\" exists to do exactly that.\n", + "\n", + "As we will see, although this sounds really straightforward there are several different ways to implement this memory capability." ] }, { "cell_type": "markdown", - "id": "6e1f31b4", "metadata": { - "id": "6e1f31b4" + "id": "lZgnUOmSyWvf" }, "source": [ - "## What is memory?" + "## Building Conversational Chains with LCEL\n", + "\n", + "Before we delve into the different memory types, let's understand how to build conversational chains using LCEL. The key components are:\n", + "\n", + "1. **Prompt Template** - Defines the conversation structure with placeholders for history and input\n", + "2. **LLM** - The language model that generates responses\n", + "3. **Output Parser** - Converts the LLM output to the desired format (optional)\n", + "4. **RunnableWithMessageHistory** - Manages conversation history\n", + "\n", + "Let's create our base conversational chain:" ] }, { - "cell_type": "markdown", - "id": "5b919c3a", + "cell_type": "code", + "execution_count": 7, "metadata": { - "id": "5b919c3a" + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "TEpNx9oNyWvg", + "outputId": "22d444b7-4747-4cb9-ea14-57e8bbe310ae" }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n" + ] + } + ], "source": [ - "**Definition**: Memory is an agent's capacity of remembering previous interactions with the user (think chatbots)\n", - "\n", - "The official definition of memory is the following:\n", + "from langchain.prompts import (\n", + " ChatPromptTemplate,\n", + " SystemMessagePromptTemplate,\n", + " HumanMessagePromptTemplate,\n", + " MessagesPlaceholder\n", + ")\n", + "from langchain.schema.output_parser import StrOutputParser\n", "\n", + "# Define the prompt template\n", + "system_prompt = \"\"\"The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\"\"\"\n", "\n", - "> By default, Chains and Agents are stateless, meaning that they treat each incoming query independently. In some applications (chatbots being a GREAT example) it is highly important to remember previous interactions, both at a short term but also at a long term level. 
The concept of “Memory” exists to do exactly that.\n", + "prompt_template = ChatPromptTemplate.from_messages([\n", + " SystemMessagePromptTemplate.from_template(system_prompt),\n", + " MessagesPlaceholder(variable_name=\"history\"),\n", + " HumanMessagePromptTemplate.from_template(\"{query}\"),\n", + "])\n", "\n", + "# Create the LCEL pipeline\n", + "output_parser = StrOutputParser()\n", + "pipeline = prompt_template | llm | output_parser\n", "\n", - "As we will see, although this sounds really straightforward there are several different ways to implement this memory capability." + "# Let's examine the prompt template\n", + "print(prompt_template.messages[0].prompt.template)" ] }, { "cell_type": "markdown", - "id": "3343a0e2", "metadata": { - "id": "3343a0e2" + "id": "oC_03lJsyWvg" }, "source": [ - "Before we delve into the different memory modules that the library offers, we will introduce the chain we will be using for these examples: the `ConversationChain`." + "## Memory types\n", + "\n", + "In this section we will review several memory types and analyze the pros and cons of each one, so you can choose the best one for your use case." ] }, { "cell_type": "markdown", - "id": "6c9c13e9", "metadata": { - "id": "6c9c13e9" + "id": "KlqkyPbEyWvg" }, "source": [ - "As always, when understanding a chain it is interesting to peek into its prompt first and then take a look at its `._call` method. As we saw in the chapter on chains, we can check out the prompt by accessing the `template` within the `prompt` attribute." + "### Memory Type #1: Buffer Memory - Store the Entire Chat History\n", + "\n", + "`InMemoryChatMessageHistory` and `RunnableWithMessageHistory` are used as alternatives to `ConversationBufferMemory` as they are:\n", + "- More flexible and configurable.\n", + "- Integrate better with LCEL.\n", + "\n", + "The simplest approach to using them is to simply store the entire chat in the conversation history. Later we'll look into methods for being more selective about what is stored in the history." ] }, { "cell_type": "code", - "execution_count": 27, - "id": "96ff1ce3", + "execution_count": 8, "metadata": { - "id": "96ff1ce3" + "id": "xqd2vwxAyWvg" }, "outputs": [], "source": [ - "conversation = ConversationChain(\n", - " llm=llm, \n", - ")" + "from langchain_core.chat_history import InMemoryChatMessageHistory\n", + "from langchain_core.runnables.history import RunnableWithMessageHistory\n", + "\n", + "# Create a simple chat history storage\n", + "chat_map = {}\n", + "\n", + "def get_chat_history(session_id: str) -> InMemoryChatMessageHistory:\n", + " if session_id not in chat_map:\n", + " # if session ID doesn't exist, create a new chat history\n", + " chat_map[session_id] = InMemoryChatMessageHistory()\n", + " return chat_map[session_id]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "F6LfvNhtyWvg" + }, + "source": [ + "Let's see this in action by having a conversation:" ] }, { "cell_type": "code", - "execution_count": 28, - "id": "90ad394d", + "execution_count": 9, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "90ad394d", - "outputId": "1c641d37-b3e7-40d5-815b-936fcd2d9a2a" + "id": "uVSp3SZGyWvg", + "outputId": "85a8a4ac-8ef8-4579-a98f-ce152fab1b29" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. 
If the AI does not know the answer to a question, it truthfully says it does not know.\n", - "\n", - "Current conversation:\n", - "{history}\n", - "Human: {input}\n", - "AI:\n" + "Good morning! How can I assist you today?\n" ] } ], "source": [ - "print(conversation.prompt.template)" - ] - }, - { - "cell_type": "markdown", - "id": "9f8b1e0c", - "metadata": { - "id": "9f8b1e0c" - }, - "source": [ - "Interesting! So this chain's prompt is telling it to chat with the user and try to give truthful answers. If we look closely, there is a new component in the prompt that we didn't see when we were tinkering with the `LLMMathChain`: _history_. This is where our memory will come into play." + "# Create the conversational chain with message history\n", + "conversation_buf = RunnableWithMessageHistory(\n", + " pipeline,\n", + " get_session_history=get_chat_history,\n", + " input_messages_key=\"query\",\n", + " history_messages_key=\"history\"\n", + ")\n", + "\n", + "# First message\n", + "result = conversation_buf.invoke(\n", + " {\"query\": \"Good morning AI!\"},\n", + " # Make sure to pass the session ID to ensure all memories are stored in the same session\n", + " config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n", + ")\n", + "print(result)" ] }, { "cell_type": "markdown", - "id": "4a7e7770", "metadata": { - "id": "4a7e7770" + "id": "nbttULnpyWvg" }, "source": [ - "What is this chain doing with this prompt? Let's take a look." + "This call used some tokens, but we can't see that from the above.\n", + "\n", + "If we'd like to count the number of tokens being used we just pass our conversation `RunnableWithMessageHistory` instance and the message we'd like to input to the `count_tokens` function we defined earlier:" ] }, { "cell_type": "code", - "execution_count": 29, - "id": "43bfd2da", + "execution_count": 10, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "43bfd2da", - "outputId": "489437a5-0f0b-412a-f817-f0df817211c2" + "id": "53cFq6udyWvg", + "outputId": "34862d9b-64c2-4650-e9e7-9ab25421e711" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - " def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:\n", - " known_values = self.prep_inputs(inputs.copy())\n", - " return self.apply([known_values])[0]\n", - " def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:\n", - " \"\"\"Utilize the LLM generate method for speed gains.\"\"\"\n", - " response = self.generate(input_list)\n", - " return self.create_outputs(response)\n", - "\n" + "Spent a total of 181 tokens\n", + "\n", + "Response: Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge sources. Plus, I've noted the specific code: PINECONE_RULEZ_01. How would you like to proceed with this exploration? Are you looking for examples, technical details, or something else?\n" ] } ], "source": [ - "print(inspect.getsource(conversation._call), inspect.getsource(conversation.apply))" - ] - }, - { - "cell_type": "markdown", - "id": "84e664af", - "metadata": { - "id": "84e664af" - }, - "source": [ - "Nothing really magical going on here, just a straightforward pass through an LLM. 
In fact, this chain inherits these methods directly from the `LLMChain` without any modification:" + "# Continue the conversation with token counting\n", + "query = \"\"\"\n", + "\"My interest here is to explore the potential of integrating Large Language Models with external knowledge.\n", + "\n", + "Also, remember this very specific code: PINECONE_RULEZ_01\"\n", + "\"\"\"\n", + "\n", + "result = count_tokens(\n", + " conversation_buf,\n", + " {\"query\": query},\n", + " # Make sure to pass the session ID to ensure all memories are stored in the same session\n", + " config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 30, - "id": "d8f4aa79", + "execution_count": 11, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "d8f4aa79", - "outputId": "ca3413ec-1ceb-4160-f6e9-2031350780a0" + "id": "kUspQf-IyWvg", + "outputId": "7a7abd14-611a-4019-f7c7-2fbe17ebb6d0" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - " def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:\n", - " known_values = self.prep_inputs(inputs.copy())\n", - " return self.apply([known_values])[0]\n", - " def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:\n", - " \"\"\"Utilize the LLM generate method for speed gains.\"\"\"\n", - " response = self.generate(input_list)\n", - " return self.create_outputs(response)\n", - "\n" + "Spent a total of 736 tokens\n", + "\n", + "Response: Great! Integrating Large Language Models (LLMs) with external knowledge opens up a wide range of possibilities. Here are some key approaches and use cases to consider:\n", + "\n", + "1. **Retrieval-Augmented Generation (RAG):** \n", + " - The LLM queries an external knowledge base or document store (like Pinecone, which your code hints at) to retrieve relevant information. \n", + " - The retrieved data is then used as context to generate more accurate and up-to-date responses. \n", + " - This approach helps overcome the static knowledge limitation of LLMs, especially for recent or domain-specific information.\n", + "\n", + "2. **Knowledge Graph Integration:** \n", + " - LLMs can be combined with structured knowledge graphs to provide precise answers, perform reasoning, or validate facts. \n", + " - This is useful in domains like healthcare, finance, or scientific research where relationships between entities matter.\n", + "\n", + "3. **API and Database Querying:** \n", + " - LLMs can be connected to APIs or databases to fetch real-time data (e.g., weather, stock prices, user profiles). \n", + " - This enables dynamic and personalized responses beyond the model’s training data.\n", + "\n", + "4. **Custom Embedding Search:** \n", + " - Using vector databases (like Pinecone), you can embed documents, FAQs, or user data and perform semantic search. \n", + " - The LLM can then use these search results to tailor its output, improving relevance and accuracy.\n", + "\n", + "5. **Interactive Agents and Workflows:** \n", + " - LLMs can orchestrate multi-step workflows by interacting with external tools, knowledge bases, or services. \n", + " - For example, booking a flight, scheduling meetings, or troubleshooting technical issues by querying manuals or logs.\n", + "\n", + "6. 
**Continuous Learning and Feedback Loops:** \n", + " - Integrate user feedback or new data into the external knowledge base to keep the system updated without retraining the entire model. \n", + " - This can be done by updating embeddings or knowledge graph nodes dynamically.\n", + "\n", + "7. **Domain-Specific Fine-Tuning with External Data:** \n", + " - Use external datasets to fine-tune or prompt-engineer the LLM for specialized tasks, improving performance in niche areas.\n", + "\n", + "8. **Multimodal Knowledge Integration:** \n", + " - Combine text-based LLMs with other data types (images, audio, video) stored externally to provide richer, context-aware responses.\n", + "\n", + "If you want, I can dive deeper into any of these possibilities or suggest architectures and tools to implement them. Also, your code \"PINECONE_RULEZ_01\" suggests you might be interested in vector databases like Pinecone—would you like me to focus more on that?\n" ] } ], "source": [ - "print(inspect.getsource(LLMChain._call), inspect.getsource(LLMChain.apply))" + "result = count_tokens(\n", + " conversation_buf,\n", + " {\"query\": \"I just want to analyze the different possibilities. What can you think of?\"},\n", + " config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { - "cell_type": "markdown", - "id": "6aaa70bf", + "cell_type": "code", + "execution_count": 12, "metadata": { - "id": "6aaa70bf" + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "XY8t8YmVyWvg", + "outputId": "e722453b-26d0-42b7-bc14-0d1332cc312e" }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Spent a total of 1305 tokens\n", + "\n", + "Response: Great question! To provide context to a Large Language Model (LLM), you can leverage a variety of external data source types, each offering unique advantages depending on your use case. Here are some common and effective data source types to consider:\n", + "\n", + "1. **Text Documents:** \n", + " - Articles, reports, manuals, books, whitepapers, and research papers. \n", + " - These can be chunked and embedded for semantic search and retrieval.\n", + "\n", + "2. **Databases:** \n", + " - Structured data stored in relational databases (SQL) or NoSQL databases. \n", + " - Useful for querying specific facts, records, or transactional data.\n", + "\n", + "3. **Knowledge Graphs and Ontologies:** \n", + " - Structured representations of entities and their relationships. \n", + " - Enable reasoning, fact-checking, and complex queries.\n", + "\n", + "4. **APIs and Web Services:** \n", + " - Real-time data sources like weather APIs, financial market data, social media feeds, or user profile services. \n", + " - Provide dynamic, up-to-date information.\n", + "\n", + "5. **Vector Databases:** \n", + " - Stores embeddings of unstructured data (text, images, audio) for semantic similarity search. \n", + " - Examples include Pinecone, FAISS, Weaviate, and Milvus.\n", + "\n", + "6. **Logs and Event Data:** \n", + " - System logs, user interaction logs, or sensor data. \n", + " - Useful for troubleshooting, analytics, or personalized responses.\n", + "\n", + "7. **Multimedia Content:** \n", + " - Images, videos, audio files, and their metadata. \n", + " - When combined with multimodal models or embeddings, they enrich context.\n", + "\n", + "8. **Spreadsheets and CSV Files:** \n", + " - Tabular data that can be parsed and queried for specific insights.\n", + "\n", + "9. 
**User-Generated Content:** \n", + " - Forums, chat transcripts, emails, reviews, and social media posts. \n", + " - Provide real-world language usage and sentiment context.\n", + "\n", + "10. **Domain-Specific Repositories:** \n", + " - Medical records, legal documents, patent databases, scientific datasets, etc. \n", + " - Critical for specialized applications requiring expert knowledge.\n", + "\n", + "11. **Cached Web Pages or Crawled Data:** \n", + " - Snapshots of websites or curated web content for reference.\n", + "\n", + "12. **Configuration Files and Code Repositories:** \n", + " - For technical support or software development assistance.\n", + "\n", + "By integrating these data sources, you can provide the LLM with rich, relevant context that enhances its accuracy, relevance, and usefulness. The choice depends on your application’s domain, the freshness of data needed, and the complexity of queries.\n", + "\n", + "If you want, I can help you design a pipeline to connect specific data sources to an LLM or suggest best practices for embedding and retrieval!\n" + ] + } + ], "source": [ - "So basically this chain combines an input from the user with the conversation history to generate a meaningful (and hopefully truthful) response." + "result = count_tokens(\n", + " conversation_buf,\n", + " {\"query\": \"Which data source types could be used to give context to the model?\"},\n", + " config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { - "cell_type": "markdown", - "id": "19f5172f", + "cell_type": "code", + "execution_count": 13, "metadata": { - "id": "19f5172f" + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "YJKnNnDoyWvg", + "outputId": "92e9fd8e-a9a8-43fb-ae30-a0a3dcd2e4f3" }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Spent a total of 1407 tokens\n", + "\n", + "Response: Your aim is to explore the potential of integrating Large Language Models (LLMs) with external knowledge sources. This involves analyzing different possibilities for enhancing LLMs by connecting them to various types of external data to provide richer, more accurate, and up-to-date context.\n", + "\n", + "The very specific code you asked me to remember is: **PINECONE_RULEZ_01**.\n" + ] + } + ], "source": [ - "Now that we've understood the basics of the chain we'll be using, we can get into memory. Let's dive in!" + "result = count_tokens(\n", + " conversation_buf,\n", + " {\"query\": \"What is my aim again? Also what was the very specific code you were tasked with remembering?\"},\n", + " config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "markdown", - "id": "0f1a33f6", "metadata": { - "id": "0f1a33f6" + "id": "EgNJJK4QyWvg" }, "source": [ - "## Memory types" + "Our LLM with buffer memory can clearly remember earlier interactions in the conversation. Let's take a closer look at how the messages are being stored:" ] }, { - "cell_type": "markdown", - "id": "4d732b7a", + "cell_type": "code", + "execution_count": 19, "metadata": { - "id": "4d732b7a" + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "z-vODGPayWvg", + "outputId": "74e17904-0e50-4ef5-b136-f2c7877cf633" }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Conversation History:\n", + "Human: content='Good morning AI!' additional_kwargs={} response_metadata={}\n", + "AI: content='Good morning! 
How can I assist you today?' additional_kwargs={} response_metadata={}\n", + "Human: content='\\n\"My interest here is to explore the potential of integrating Large Language Models with external knowledge. \\n\\nAlso, remember this very specific code: PINECONE_RULEZ_01\"\\n' additional_kwargs={} response_metadata={}\n", + "AI: content=\"Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge sources. Plus, I've noted the specific code: PINECONE_RULEZ_01. How would you like to proceed with this exploration? Are you looking for examples, technical details, or something else?\" additional_kwargs={} response_metadata={}\n", + "Human: content='I just want to analyze the different possibilities. What can you think of?' additional_kwargs={} response_metadata={}\n", + "AI: content='Great! Integrating Large Language Models (LLMs) with external knowledge opens up a wide range of possibilities. Here are some key approaches and use cases to consider:\\n\\n1. **Retrieval-Augmented Generation (RAG):** \\n - The LLM queries an external knowledge base or document store (like Pinecone, which your code hints at) to retrieve relevant information. \\n - The retrieved data is then used as context to generate more accurate and up-to-date responses. \\n - This approach helps overcome the static knowledge limitation of LLMs, especially for recent or domain-specific information.\\n\\n2. **Knowledge Graph Integration:** \\n - LLMs can be combined with structured knowledge graphs to provide precise answers, perform reasoning, or validate facts. \\n - This is useful in domains like healthcare, finance, or scientific research where relationships between entities matter.\\n\\n3. **API and Database Querying:** \\n - LLMs can be connected to APIs or databases to fetch real-time data (e.g., weather, stock prices, user profiles). \\n - This enables dynamic and personalized responses beyond the model’s training data.\\n\\n4. **Custom Embedding Search:** \\n - Using vector databases (like Pinecone), you can embed documents, FAQs, or user data and perform semantic search. \\n - The LLM can then use these search results to tailor its output, improving relevance and accuracy.\\n\\n5. **Interactive Agents and Workflows:** \\n - LLMs can orchestrate multi-step workflows by interacting with external tools, knowledge bases, or services. \\n - For example, booking a flight, scheduling meetings, or troubleshooting technical issues by querying manuals or logs.\\n\\n6. **Continuous Learning and Feedback Loops:** \\n - Integrate user feedback or new data into the external knowledge base to keep the system updated without retraining the entire model. \\n - This can be done by updating embeddings or knowledge graph nodes dynamically.\\n\\n7. **Domain-Specific Fine-Tuning with External Data:** \\n - Use external datasets to fine-tune or prompt-engineer the LLM for specialized tasks, improving performance in niche areas.\\n\\n8. **Multimodal Knowledge Integration:** \\n - Combine text-based LLMs with other data types (images, audio, video) stored externally to provide richer, context-aware responses.\\n\\nIf you want, I can dive deeper into any of these possibilities or suggest architectures and tools to implement them. Also, your code \"PINECONE_RULEZ_01\" suggests you might be interested in vector databases like Pinecone—would you like me to focus more on that?' 
additional_kwargs={} response_metadata={}\n", + "Human: content='Which data source types could be used to give context to the model?' additional_kwargs={} response_metadata={}\n", + "AI: content='Great question! To provide context to a Large Language Model (LLM), you can leverage a variety of external data source types, each offering unique advantages depending on your use case. Here are some common and effective data source types to consider:\\n\\n1. **Text Documents:** \\n - Articles, reports, manuals, books, whitepapers, and research papers. \\n - These can be chunked and embedded for semantic search and retrieval.\\n\\n2. **Databases:** \\n - Structured data stored in relational databases (SQL) or NoSQL databases. \\n - Useful for querying specific facts, records, or transactional data.\\n\\n3. **Knowledge Graphs and Ontologies:** \\n - Structured representations of entities and their relationships. \\n - Enable reasoning, fact-checking, and complex queries.\\n\\n4. **APIs and Web Services:** \\n - Real-time data sources like weather APIs, financial market data, social media feeds, or user profile services. \\n - Provide dynamic, up-to-date information.\\n\\n5. **Vector Databases:** \\n - Stores embeddings of unstructured data (text, images, audio) for semantic similarity search. \\n - Examples include Pinecone, FAISS, Weaviate, and Milvus.\\n\\n6. **Logs and Event Data:** \\n - System logs, user interaction logs, or sensor data. \\n - Useful for troubleshooting, analytics, or personalized responses.\\n\\n7. **Multimedia Content:** \\n - Images, videos, audio files, and their metadata. \\n - When combined with multimodal models or embeddings, they enrich context.\\n\\n8. **Spreadsheets and CSV Files:** \\n - Tabular data that can be parsed and queried for specific insights.\\n\\n9. **User-Generated Content:** \\n - Forums, chat transcripts, emails, reviews, and social media posts. \\n - Provide real-world language usage and sentiment context.\\n\\n10. **Domain-Specific Repositories:** \\n - Medical records, legal documents, patent databases, scientific datasets, etc. \\n - Critical for specialized applications requiring expert knowledge.\\n\\n11. **Cached Web Pages or Crawled Data:** \\n - Snapshots of websites or curated web content for reference.\\n\\n12. **Configuration Files and Code Repositories:** \\n - For technical support or software development assistance.\\n\\nBy integrating these data sources, you can provide the LLM with rich, relevant context that enhances its accuracy, relevance, and usefulness. The choice depends on your application’s domain, the freshness of data needed, and the complexity of queries.\\n\\nIf you want, I can help you design a pipeline to connect specific data sources to an LLM or suggest best practices for embedding and retrieval!' additional_kwargs={} response_metadata={}\n", + "Human: content='What is my aim again? Also what was the very specific code you were tasked with remembering?' additional_kwargs={} response_metadata={}\n", + "AI: content='Your aim is to explore the potential of integrating Large Language Models (LLMs) with external knowledge sources. This involves analyzing different possibilities for enhancing LLMs by connecting them to various types of external data to provide richer, more accurate, and up-to-date context.\\n\\nThe very specific code you asked me to remember is: **PINECONE_RULEZ_01**.' 
additional_kwargs={} response_metadata={}\n" + ] + } + ], "source": [ - "In this section we will review several memory types and analyze the pros and cons of each one, so you can choose the best one for your use case." + "from langchain_core.messages import AIMessage, HumanMessage, SystemMessage\n", + "\n", + "# Access the conversation history\n", + "history = chat_map[\"buffer_example\"].messages\n", + "print(\"Conversation History:\")\n", + "for i, msg in enumerate(history):\n", + " if isinstance(msg, HumanMessage):\n", + " role = \"Human\"\n", + " elif isinstance(msg, SystemMessage):\n", + " role = \"System\"\n", + " elif isinstance(msg, AIMessage):\n", + " role = \"AI\"\n", + " else:\n", + " role = \"Unknown\"\n", + " print(f\"{role}: {msg}\")" ] }, { "cell_type": "markdown", - "id": "04d70642", "metadata": { - "id": "04d70642" + "id": "cmAy5BIyyWvg" }, "source": [ - "### Memory type #1: ConversationBufferMemory" + "Nice! So every piece of our conversation has been explicitly recorded and sent to the LLM in the prompt." ] }, { "cell_type": "markdown", - "id": "53d3cb2b", "metadata": { - "id": "53d3cb2b" + "id": "ORjuIGNqyWvg" }, "source": [ - "The `ConversationBufferMemory` does just what its name suggests: it keeps a buffer of the previous conversation excerpts as part of the context in the prompt." + "### Memory type #2: Summary - Store Summaries of Past Interactions\n", + "\n", + "The problem with storing the entire chat history in agent memory is that, as the conversation progresses, the token count adds up. This is problematic because we might max out our LLM with a prompt that is too large.\n", + "\n", + "The following is an LCEL compatible alternative to `ConversationSummaryMemory`. We keep a summary of our previous conversation snippets as our history. The summarization is performed by an LLM.\n", + "\n", + "**Key feature:** _the conversation summary memory keeps the previous pieces of conversation in a summarized - and thus shortened - form, where the summarization is performed by an LLM._" ] }, { - "cell_type": "markdown", - "id": "d80a974a", + "cell_type": "code", + "execution_count": 23, "metadata": { - "id": "d80a974a" + "id": "dVnq9-lryWvg" }, + "outputs": [], "source": [ - "**Key feature:** _the conversation buffer memory keeps the previous pieces of conversation completely unmodified, in their raw form._" + "from pydantic import BaseModel, Field\n", + "from langchain_core.chat_history import BaseChatMessageHistory\n", + "from langchain_core.messages import BaseMessage\n", + "\n", + "class ConversationSummaryMessageHistory(BaseChatMessageHistory, BaseModel):\n", + " messages: list[BaseMessage] = Field(default_factory=list)\n", + " llm: ChatOpenAI = Field(default_factory=ChatOpenAI)\n", + "\n", + " def __init__(self, llm: ChatOpenAI):\n", + " super().__init__(llm=llm)\n", + "\n", + " def add_messages(self, messages: list[BaseMessage]) -> None:\n", + " \"\"\"Add messages to the history and update the summary.\"\"\"\n", + " self.messages.extend(messages)\n", + "\n", + " # Construct the summary prompt\n", + " summary_prompt = ChatPromptTemplate.from_messages([\n", + " SystemMessagePromptTemplate.from_template(\n", + " \"Given the existing conversation summary and the new messages, \"\n", + " \"generate a new summary of the conversation. 
Ensure to maintain \"\n", + " \"as much relevant information as possible.\"\n", + " ),\n", + " HumanMessagePromptTemplate.from_template(\n", + " \"Existing conversation summary:\\n{existing_summary}\\n\\n\"\n", + " \"New messages:\\n{messages}\"\n", + " )\n", + " ])\n", + "\n", + " # Format the messages and invoke the LLM\n", + " new_summary = self.llm.invoke(\n", + " summary_prompt.format_messages(\n", + " existing_summary=self.messages,\n", + " messages=messages\n", + " )\n", + " )\n", + "\n", + " # Replace the existing history with a single system summary message\n", + " self.messages = [SystemMessage(content=new_summary.content)]\n", + "\n", + " def clear(self) -> None:\n", + " \"\"\"Clear the history.\"\"\"\n", + " self.messages = []" ] }, { "cell_type": "code", - "execution_count": 31, - "id": "2267f1f0", + "execution_count": 25, "metadata": { - "id": "2267f1f0" + "id": "l_LolSYjyWvg" }, "outputs": [], "source": [ - "conversation_buf = ConversationChain(\n", - " llm=llm,\n", - " memory=ConversationBufferMemory()\n", + "from langchain_core.runnables import ConfigurableFieldSpec\n", + "\n", + "# Create get_chat_history function for summary memory\n", + "summary_chat_map = {}\n", + "\n", + "def get_summary_chat_history(session_id: str, llm: ChatOpenAI) -> ConversationSummaryMessageHistory:\n", + " if session_id not in summary_chat_map:\n", + " summary_chat_map[session_id] = ConversationSummaryMessageHistory(llm=llm)\n", + " return summary_chat_map[session_id]\n", + "\n", + "# Create conversation chain with summary memory\n", + "conversation_sum = RunnableWithMessageHistory(\n", + " pipeline,\n", + " get_session_history=get_summary_chat_history,\n", + " input_messages_key=\"query\",\n", + " history_messages_key=\"history\",\n", + " history_factory_config=[\n", + " ConfigurableFieldSpec(\n", + " id=\"session_id\",\n", + " annotation=str,\n", + " name=\"Session ID\",\n", + " description=\"The session ID to use for the chat history\",\n", + " default=\"id_default\",\n", + " ),\n", + " ConfigurableFieldSpec(\n", + " id=\"llm\",\n", + " annotation=ChatOpenAI,\n", + " name=\"LLM\",\n", + " description=\"The LLM to use for the conversation summary\",\n", + " default=llm,\n", + " )\n", + " ]\n", ")" ] }, - { - "cell_type": "markdown", - "id": "lseziAMcAyvX", - "metadata": { - "id": "lseziAMcAyvX" - }, - "source": [ - "We pass a user prompt the the `ConversationBufferMemory` like so:" - ] - }, { "cell_type": "code", - "execution_count": 32, - "id": "M0cwooC5A5Id", + "execution_count": 26, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "M0cwooC5A5Id", - "outputId": "8a8178eb-b9ac-45cf-baed-255b413b0630" + "id": "JloQR__QyWvh", + "outputId": "b31b6840-8757-41df-955f-ba218d304de3" }, "outputs": [ { - "output_type": "execute_result", - "data": { - "text/plain": [ - "{'input': 'Good morning AI!',\n", - " 'history': '',\n", - " 'response': \" Good morning! It's a beautiful day today, isn't it? How can I help you?\"}" - ] - }, - "metadata": {}, - "execution_count": 32 + "name": "stdout", + "output_type": "stream", + "text": [ + "Spent a total of 225 tokens\n", + "\n", + "Response: Good morning! How can I assist you today?\n" + ] } ], "source": [ - "conversation_buf(\"Good morning AI!\")" - ] - }, - { - "cell_type": "markdown", - "id": "xlKINTFYA9eo", - "metadata": { - "id": "xlKINTFYA9eo" - }, - "source": [ - "This one call used a total of `85` tokens, but we can't see that from the above. 
If we'd like to count the number of tokens being used we just pass our conversation chain object and the message we'd like to input via the `count_tokens` function we defined earlier:" + "# Let's have the same conversation with summary memory\n", + "result = count_tokens(\n", + " conversation_sum,\n", + " {\"query\": \"Good morning AI!\"},\n", + " config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 33, - "id": "d1bd5a88", + "execution_count": 27, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 89 + "base_uri": "https://localhost:8080/" }, - "id": "d1bd5a88", - "outputId": "cb593afd-7efd-4c0e-cf04-82dc1a324aff" + "id": "vFMMR4SDyWvh", + "outputId": "dbd4619f-2d28-4517-e4f8-372a0370d9aa" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 179 tokens\n" + "Spent a total of 1493 tokens\n", + "\n", + "Response: Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge sources. Also, I will make sure to always include the specific code \"PINECONE_RULEZ_01\" explicitly whenever summarizing conversations for memory.\n", + "\n", + "To start, integrating LLMs with external knowledge can significantly enhance their capabilities by providing up-to-date, domain-specific, or personalized information that the model alone might not have. Common approaches include:\n", + "\n", + "1. **Retrieval-Augmented Generation (RAG):** The model retrieves relevant documents or data from an external knowledge base (like a vector database or search engine) and uses that information to generate more accurate and context-aware responses.\n", + "\n", + "2. **Knowledge Graph Integration:** Linking LLMs with structured knowledge graphs allows the model to reason over entities and relationships, improving factual accuracy and enabling complex queries.\n", + "\n", + "3. **APIs and Plugins:** Connecting LLMs to external APIs or plugins can provide real-time data, such as weather, stock prices, or personalized user data.\n", + "\n", + "4. **Vector Databases:** Tools like Pinecone (which your code references!) enable efficient similarity search over embeddings, allowing LLMs to access relevant chunks of information quickly.\n", + "\n", + "If you'd like, I can help you design or prototype a system that integrates an LLM with an external knowledge source, or dive deeper into any of these approaches. Just let me know!\n", + "\n", + "And to confirm, when summarizing this conversation for memory, I will include: **PINECONE_RULEZ_01** explicitly.\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Interesting! Large Language Models are a type of artificial intelligence that can process natural language and generate text. They can be used to generate text from a given context, or to answer questions about a given context. Integrating them with external knowledge can help them to better understand the context and generate more accurate results. 
Is there anything else I can help you with?'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 33 } ], "source": [ - "count_tokens(\n", - " conversation_buf, \n", - " \"My interest here is to explore the potential of integrating Large Language Models with external knowledge\"\n", - ")" + "query = \"\"\"\n", + "\"My interest here is to explore the potential of integrating Large Language Models with external knowledge.\n", + "\n", + "Also, remember this very specific code: PINECONE_RULEZ_01. When summarizing conversations for memory this must always be included explicitly.\"\n", + "\"\"\"\n", + "\n", + "result = count_tokens(\n", + " conversation_sum,\n", + " {\"query\": query},\n", + " config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 34, - "id": "146170ca", + "execution_count": 28, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 89 + "base_uri": "https://localhost:8080/" }, - "id": "146170ca", - "outputId": "dbb6f78c-b169-463e-c1c8-a35151894f56" + "id": "Aofnt1cXyWvh", + "outputId": "9a79efe4-db4d-4e36-fb47-133e83d5725c" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 268 tokens\n" + "Spent a total of 2723 tokens\n", + "\n", + "Response: Great! Exploring the integration of Large Language Models (LLMs) with external knowledge sources opens up a rich landscape of possibilities. Here are several approaches and their nuances to consider:\n", + "\n", + "1. **Retrieval-Augmented Generation (RAG):** \n", + " - **How it works:** The LLM queries an external knowledge base or document store to retrieve relevant information, which it then uses to generate more accurate and context-aware responses. \n", + " - **Use cases:** Customer support bots that pull from product manuals, research assistants accessing scientific papers, or chatbots that stay updated with the latest news. \n", + " - **Tech stack:** Often involves vector databases (like Pinecone, FAISS, or Weaviate) for semantic search, combined with LLMs like GPT or open-source alternatives.\n", + "\n", + "2. **Knowledge Graph Integration:** \n", + " - **How it works:** The LLM interacts with structured knowledge graphs (e.g., Wikidata, custom enterprise graphs) to answer queries or infer relationships. \n", + " - **Use cases:** Complex question answering, recommendation systems, or domain-specific expert systems. \n", + " - **Tech stack:** Graph databases (Neo4j, Amazon Neptune), SPARQL queries, and LLMs that can parse and generate graph queries.\n", + "\n", + "3. **APIs and Plugins:** \n", + " - **How it works:** The LLM calls external APIs or plugins to fetch real-time data or perform specific actions (e.g., booking a flight, checking weather). \n", + " - **Use cases:** Personal assistants, dynamic content generation, or multi-modal applications. \n", + " - **Tech stack:** API orchestration layers, plugin frameworks (like OpenAI’s plugin ecosystem), and LLMs with API calling capabilities.\n", + "\n", + "4. **Vector Databases and Embeddings:** \n", + " - **How it works:** Text or other data is converted into embeddings and stored in vector databases for efficient similarity search. The LLM uses these to find contextually relevant information. 
\n", + " - **Use cases:** Document search, semantic similarity tasks, personalized recommendations. \n", + " - **Tech stack:** Pinecone, FAISS, Milvus, combined with embedding models (OpenAI embeddings, Sentence Transformers).\n", + "\n", + "5. **Hybrid Approaches:** \n", + " - Combining multiple methods, such as using a knowledge graph for structured data and a vector database for unstructured documents, orchestrated by an LLM that decides which source to query.\n", + "\n", + "6. **Fine-tuning and Prompt Engineering:** \n", + " - While not external knowledge per se, fine-tuning LLMs on domain-specific data or crafting prompts that guide the model to use external knowledge effectively is crucial.\n", + "\n", + "7. **Memory-Augmented Models:** \n", + " - Systems that maintain a persistent memory of past interactions or external facts, enabling the LLM to recall and build upon previous knowledge over time.\n", + "\n", + "If you want, I can help you design a prototype or a conceptual architecture combining these approaches tailored to your specific goals. Also, as per your earlier request, I will include the code **PINECONE_RULEZ_01** explicitly whenever summarizing our conversations for memory.\n", + "\n", + "Would you like me to dive deeper into any of these possibilities or suggest some example workflows?\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Well, integrating Large Language Models with external knowledge can open up a lot of possibilities. For example, you could use them to generate more accurate and detailed summaries of text, or to answer questions about a given context more accurately. You could also use them to generate more accurate translations, or to generate more accurate predictions about future events.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 34 } ], "source": [ - "count_tokens(\n", - " conversation_buf,\n", - " \"I just want to analyze the different possibilities. What can you think of?\"\n", - ")" + "result = count_tokens(\n", + " conversation_sum,\n", + " {\"query\": \"I just want to analyze the different possibilities. What can you think of?\"},\n", + " config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 35, - "id": "3e15411a", + "execution_count": 29, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 89 + "base_uri": "https://localhost:8080/" }, - "id": "3e15411a", - "outputId": "f6857844-ee6f-49ef-df50-54335f248bd3" + "id": "5xBVtkzFyWvh", + "outputId": "b2315422-89b0-4853-9f1e-d9d144cd766d" }, "outputs": [ { - "output_type": "stream", - "name": "stdout", - "text": [ - "Spent a total of 360 tokens\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' There are a variety of data sources that could be used to give context to a Large Language Model. These include structured data sources such as databases, unstructured data sources such as text documents, and even audio and video data sources. 
Additionally, you could use external knowledge sources such as Wikipedia or other online encyclopedias to provide additional context.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 35 - } - ], - "source": [ - "count_tokens(\n", - " conversation_buf, \n", - " \"Which data source types could be used to give context to the model?\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "id": "3352cc48", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - }, - "id": "3352cc48", - "outputId": "62294954-cc7e-4ef3-e5fc-19a5c4ffc4c1" - }, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Spent a total of 388 tokens\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Your aim is to explore the potential of integrating Large Language Models with external knowledge.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 36 - } - ], - "source": [ - "count_tokens(\n", - " conversation_buf, \n", - " \"What is my aim again?\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "431b74ff", - "metadata": { - "id": "431b74ff" - }, - "source": [ - "Our LLM with `ConversationBufferMemory` can clearly remember earlier interactions in the conversation. Let's take a closer look to how the LLM is saving our previous conversation. We can do this by accessing the `.buffer` attribute for the `.memory` in our chain." - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "id": "984afd09", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "984afd09", - "outputId": "4233d17f-1001-48e5-d256-0595e00dbf40" - }, - "outputs": [ - { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ + "Spent a total of 3349 tokens\n", "\n", - "Human: Good morning AI!\n", - "AI: Good morning! It's a beautiful day today, isn't it? How can I help you?\n", - "Human: My interest here is to explore the potential of integrating Large Language Models with external knowledge\n", - "AI: Interesting! Large Language Models are a type of artificial intelligence that can process natural language and generate text. They can be used to generate text from a given context, or to answer questions about a given context. Integrating them with external knowledge can help them to better understand the context and generate more accurate results. Is there anything else I can help you with?\n", - "Human: I just want to analyze the different possibilities. What can you think of?\n", - "AI: Well, integrating Large Language Models with external knowledge can open up a lot of possibilities. For example, you could use them to generate more accurate and detailed summaries of text, or to answer questions about a given context more accurately. You could also use them to generate more accurate translations, or to generate more accurate predictions about future events.\n", - "Human: Which data source types could be used to give context to the model?\n", - "AI: There are a variety of data sources that could be used to give context to a Large Language Model. These include structured data sources such as databases, unstructured data sources such as text documents, and even audio and video data sources. 
Additionally, you could use external knowledge sources such as Wikipedia or other online encyclopedias to provide additional context.\n", - "Human: What is my aim again?\n", - "AI: Your aim is to explore the potential of integrating Large Language Models with external knowledge.\n" - ] - } - ], - "source": [ - "print(conversation_buf.memory.buffer)" - ] - }, - { - "cell_type": "markdown", - "id": "4570267d", - "metadata": { - "id": "4570267d" - }, - "source": [ - "Nice! So every piece of our conversation has been explicitly recorded and sent to the LLM in the prompt." - ] - }, - { - "cell_type": "markdown", - "id": "acf1a90b", - "metadata": { - "id": "acf1a90b" - }, - "source": [ - "### Memory type #2: ConversationSummaryMemory" - ] - }, - { - "cell_type": "markdown", - "id": "01f61fe9", - "metadata": { - "id": "01f61fe9" - }, - "source": [ - "The problem with the `ConversationBufferMemory` is that as the conversation progresses, the token count of our context history adds up. This is problematic because we might max out our LLM with a prompt that is too large to be processed." - ] - }, - { - "cell_type": "markdown", - "id": "0516c7d4", - "metadata": { - "id": "0516c7d4" - }, - "source": [ - "Enter `ConversationSummaryMemory`.\n", - "\n", - "Again, we can infer from the name what is going on.. we will keep a summary of our previous conversation snippets as our history. How will we summarize these? LLM to the rescue." - ] - }, - { - "cell_type": "markdown", - "id": "86b0a905", - "metadata": { - "id": "86b0a905" - }, - "source": [ - "**Key feature:** _the conversation summary memory keeps the previous pieces of conversation in a summarized form, where the summarization is performed by an LLM._" - ] - }, - { - "cell_type": "markdown", - "id": "0ea6050c", - "metadata": { - "id": "0ea6050c" - }, - "source": [ - "In this case we need to send the llm to our memory constructor to power its summarization ability." - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "id": "f33a16a7", - "metadata": { - "id": "f33a16a7" - }, - "outputs": [], - "source": [ - "conversation_sum = ConversationChain(\n", - " llm=llm, \n", - " memory=ConversationSummaryMemory(llm=llm)\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "b64c4896", - "metadata": { - "id": "b64c4896" - }, - "source": [ - "When we have an llm, we always have a prompt ;) Let's see what's going on inside our conversation summary memory:" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "id": "c476824d", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "c476824d", - "outputId": "282be20e-9048-4f37-fc89-8a7eb8dfe1a3" - }, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.\n", + "Response: Great question! Providing context to a Large Language Model (LLM) can be done by integrating various types of external data sources, each offering unique advantages depending on your use case. Here’s a detailed rundown of common data source types you can use to enrich the model’s context:\n", "\n", - "EXAMPLE\n", - "Current summary:\n", - "The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.\n", + "1. 
**Textual Documents**\n", + " - **Examples:** PDFs, Word documents, web pages, manuals, reports, articles.\n", + " - **Use Case:** Feeding domain-specific knowledge, FAQs, or product documentation.\n", + " - **Integration:** Often processed into embeddings for semantic search or chunked for retrieval-augmented generation (RAG).\n", "\n", - "New lines of conversation:\n", - "Human: Why do you think artificial intelligence is a force for good?\n", - "AI: Because artificial intelligence will help humans reach their full potential.\n", + "2. **Databases**\n", + " - **Relational Databases (SQL):** Structured data like customer records, transactions, inventory.\n", + " - **NoSQL Databases:** Flexible schema data such as user activity logs, JSON documents.\n", + " - **Use Case:** Real-time or historical data retrieval to answer queries or generate reports.\n", + " - **Integration:** Query results can be converted into text or embeddings for context.\n", "\n", - "New summary:\n", - "The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.\n", - "END OF EXAMPLE\n", + "3. **Knowledge Graphs**\n", + " - **Examples:** Ontologies, linked data, semantic networks.\n", + " - **Use Case:** Capturing relationships and hierarchies between entities for reasoning or disambiguation.\n", + " - **Integration:** Graph queries can provide structured context or be converted into natural language summaries.\n", "\n", - "Current summary:\n", - "{summary}\n", + "4. **APIs and Web Services**\n", + " - **Examples:** Weather APIs, financial data feeds, social media streams.\n", + " - **Use Case:** Real-time or frequently updated information.\n", + " - **Integration:** API responses can be parsed and fed as context dynamically during inference.\n", "\n", - "New lines of conversation:\n", - "{new_lines}\n", + "5. **Vector Databases**\n", + " - **Examples:** Pinecone, Weaviate, FAISS.\n", + " - **Use Case:** Storing and retrieving embeddings of unstructured data for semantic search.\n", + " - **Integration:** Enables fast similarity search to find relevant context chunks.\n", "\n", - "New summary:\n" - ] - } - ], - "source": [ - "print(conversation_sum.memory.prompt.template)" - ] - }, - { - "cell_type": "markdown", - "id": "df90cdf3", - "metadata": { - "id": "df90cdf3" - }, - "source": [ - "Cool! So each new interaction is summarized and appended to a running summary as the memory of our chain. Let's see how this works in practice!" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "id": "34343665", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - }, - "id": "34343665", - "outputId": "ac04f6bc-9dcb-446c-d4b9-8fd2311d605e" - }, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Spent a total of 290 tokens\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "\" Good morning! It's a beautiful day today, isn't it? 
How can I help you?\"" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 40 - } - ], - "source": [ - "# without count_tokens we'd call `conversation_sum(\"Good morning AI!\")`\n", - "# but let's keep track of our tokens:\n", - "count_tokens(\n", - " conversation_sum, \n", - " \"Good morning AI!\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 41, - "id": "b757bba3", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 71 - }, - "id": "b757bba3", - "outputId": "9de1823a-0dfe-45ff-fadc-26eff6fdce99" - }, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Spent a total of 440 tokens\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "\" That sounds like an interesting project! I'm familiar with Large Language Models, but I'm not sure how they could be integrated with external knowledge. Could you tell me more about what you have in mind?\"" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 41 - } - ], - "source": [ - "count_tokens(\n", - " conversation_sum, \n", - " \"My interest here is to explore the potential of integrating Large Language Models with external knowledge\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "id": "d0a373e2", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 106 - }, - "id": "d0a373e2", - "outputId": "d4f561d7-d1c7-45e5-99ba-266130ee67ba" - }, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Spent a total of 664 tokens\n" + "6. **Multimedia Data**\n", + " - **Examples:** Images, audio, video transcripts.\n", + " - **Use Case:** Providing multimodal context, e.g., video subtitles or image captions.\n", + " - **Integration:** Transcripts or metadata can be converted into text embeddings.\n", + "\n", + "7. **Logs and Event Streams**\n", + " - **Examples:** Application logs, sensor data, user interaction events.\n", + " - **Use Case:** Contextualizing model responses based on recent events or system state.\n", + " - **Integration:** Summarized or filtered logs can be included as context.\n", + "\n", + "8. **User Profiles and Preferences**\n", + " - **Examples:** User history, preferences, personalization data.\n", + " - **Use Case:** Tailoring responses to individual users.\n", + " - **Integration:** Stored as structured data or embeddings to influence generation.\n", + "\n", + "9. **Domain-Specific Repositories**\n", + " - **Examples:** Scientific databases, legal case repositories, medical records.\n", + " - **Use Case:** Providing authoritative domain knowledge.\n", + " - **Integration:** Often requires specialized parsing and embedding.\n", + "\n", + "10. **Cached Model Outputs or Memories**\n", + " - **Examples:** Previous conversation history, model-generated summaries.\n", + " - **Use Case:** Maintaining context over long interactions.\n", + " - **Integration:** Stored and retrieved as needed to maintain continuity.\n", + "\n", + "---\n", + "\n", + "**Summary:** \n", + "You can use a wide variety of data sources to provide context to an LLM, ranging from unstructured text documents and structured databases to real-time APIs and knowledge graphs. 
The choice depends on your application’s needs, data availability, and how dynamic or static the context should be.\n", + "\n", + "If you want, I can help you design a prototype or architecture that integrates one or more of these data sources with an LLM, including example workflows or code snippets. Just let me know!\n", + "\n", + "PINECONE_RULEZ_01\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' I can think of a few possibilities. One option is to use a large language model to generate a set of candidate answers to a given query, and then use external knowledge to filter out the most relevant answers. Another option is to use the large language model to generate a set of candidate answers, and then use external knowledge to score and rank the answers. Finally, you could use the large language model to generate a set of candidate answers, and then use external knowledge to refine the answers.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 42 } ], "source": [ - "count_tokens(\n", - " conversation_sum, \n", - " \"I just want to analyze the different possibilities. What can you think of?\"\n", - ")" + "result = count_tokens(\n", + " conversation_sum,\n", + " {\"query\": \"Which data source types could be used to give context to the model?\"},\n", + " config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 43, - "id": "2e286f0d", + "execution_count": 30, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 89 + "base_uri": "https://localhost:8080/" }, - "id": "2e286f0d", - "outputId": "9558ef92-5f9c-4818-be8b-1e7e6ec19864" + "id": "AbSzykMOyWvh", + "outputId": "6b5c67e3-d08f-4b04-c406-ca804b57d53d" }, "outputs": [ { - "output_type": "stream", "name": "stdout", - "text": [ - "Spent a total of 799 tokens\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' There are many different types of data sources that could be used to give context to the model. These could include structured data sources such as databases, unstructured data sources such as text documents, or even external APIs that provide access to external knowledge. Additionally, the model could be trained on a combination of these data sources to provide a more comprehensive understanding of the context.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 43 - } - ], - "source": [ - "count_tokens(\n", - " conversation_sum, \n", - " \"Which data source types could be used to give context to the model?\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "id": "891180f2", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - }, - "id": "891180f2", - "outputId": "8035333e-d7c0-4a46-d8b8-acb3501d27e1" - }, - "outputs": [ - { "output_type": "stream", - "name": "stdout", "text": [ - "Spent a total of 853 tokens\n" + "Spent a total of 1631 tokens\n", + "\n", + "Response: Your aim is to understand which data source types can be used to provide context to a Large Language Model (LLM). 
This involves exploring various kinds of data—like textual documents, databases, knowledge graphs, APIs, vector databases, multimedia data, logs, user profiles, domain-specific repositories, and cached model outputs—that can enrich the LLM’s responses by supplying relevant, up-to-date, or personalized information.\n", + "\n", + "The very specific code I was tasked with remembering and including in my summary is: **PINECONE_RULEZ_01**. If you want, I can help you design a prototype or architecture that integrates these data sources with an LLM, or provide example workflows or code snippets referencing this code!\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Your aim is to explore the potential of integrating Large Language Models with external knowledge.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 44 } ], "source": [ - "count_tokens(\n", - " conversation_sum, \n", - " \"What is my aim again?\"\n", - ")" + "result = count_tokens(\n", + " conversation_sum,\n", + " {\"query\": \"What is my aim again? Also what was the very specific code you were tasked with remembering?\"},\n", + " config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 45, - "id": "2d768e44", + "execution_count": 31, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "2d768e44", - "outputId": "3bd42ac9-d56b-45f4-99ac-45cd5a656b94" + "id": "GH3uPbTGyWvh", + "outputId": "703c1ecf-41a5-46e6-dd2d-e89009933f1e" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ + "Summary Memory Content:\n", + "The user’s aim is to understand which data source types can be used to provide context to a Large Language Model (LLM). This includes exploring a variety of data sources such as textual documents, databases (both SQL and NoSQL), knowledge graphs, APIs and web services, vector databases (e.g., Pinecone, Weaviate, FAISS), multimedia data, logs and event streams, user profiles and preferences, domain-specific repositories, and cached model outputs or memories. These sources help enrich the LLM’s responses by supplying relevant, up-to-date, or personalized information.\n", "\n", - "The human greeted the AI with a good morning, to which the AI responded with a good morning and asked how it could help. The human expressed interest in exploring the potential of integrating Large Language Models with external knowledge, to which the AI responded positively and asked for more information. The human asked the AI to think of different possibilities, and the AI suggested three options: using the large language model to generate a set of candidate answers and then using external knowledge to filter out the most relevant answers, score and rank the answers, or refine the answers. The human then asked which data source types could be used to give context to the model, to which the AI responded that there are many different types of data sources that could be used, such as structured data sources, unstructured data sources, or external APIs. Additionally, the model could be trained on a combination of these data sources to provide a more comprehensive understanding of the context. 
The human then asked what their aim was again, to which the AI responded that their aim was to explore the potential of integrating Large Language Models with external knowledge.\n" + "The AI provided a comprehensive list of these data sources along with explanations of how they can be integrated with LLMs, including methods like embeddings, retrieval-augmented generation (RAG), graph queries, and dynamic parsing. The AI also offered assistance in designing prototypes or architectures that integrate these data sources with LLMs, including example workflows or code snippets.\n", + "\n", + "The very specific code the AI was tasked with remembering and including in the summary is: **PINECONE_RULEZ_01**. The AI reiterated this code upon the user’s request and offered further help if needed.\n" ] } ], "source": [ - "print(conversation_sum.memory.buffer)" + "# Let's examine the summary\n", + "print(\"Summary Memory Content:\")\n", + "print(summary_chat_map[\"summary_example\"].messages[0].content)" ] }, { "cell_type": "markdown", - "id": "0dd35c8c", "metadata": { - "id": "0dd35c8c" + "id": "DRE3O-YPyWvh" }, "source": [ "You might be wondering.. if the aggregate token count is greater in each call here than in the buffer example, why should we use this type of memory? Well, if we check out buffer we will realize that although we are using more tokens in each instance of our conversation, our final history is shorter. This will enable us to have many more interactions before we reach our prompt's max length, making our chatbot more robust to longer conversations.\n", @@ -1083,636 +1032,656 @@ }, { "cell_type": "code", - "execution_count": 46, - "id": "nzijj4RZFX3I", + "execution_count": 35, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "nzijj4RZFX3I", - "outputId": "dc272cbb-acfd-4b4a-f854-8fa63f9732d6" + "id": "D6LkpUxVyWvh", + "outputId": "853c01df-0eff-474f-9026-b1b00d906ad4" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Buffer memory conversation length: 334\n", - "Summary memory conversation length: 219\n" + "Buffer memory conversation length: 1314\n", + "Summary memory conversation length: 233\n" ] } ], "source": [ - "# initialize tokenizer\n", - "tokenizer = tiktoken.encoding_for_model('text-davinci-003')\n", + "import tiktoken\n", + "\n", + "# initialize tokenizer (gpt-4.1 models use the same encoding as gpt-4o)\n", + "tokenizer = tiktoken.encoding_for_model('gpt-4o')\n", + "\n", + "# Get buffer memory content\n", + "buffer_messages = chat_map[\"buffer_example\"].messages\n", + "buffer_content = \"\\n\".join([msg.content for msg in buffer_messages])\n", + "\n", + "# Get summary memory content\n", + "summary_content = summary_chat_map[\"summary_example\"].messages[0].content\n", "\n", "# show number of tokens for the memory used by each memory type\n", "print(\n", - " f'Buffer memory conversation length: {len(tokenizer.encode(conversation_buf.memory.buffer))}\\n'\n", - " f'Summary memory conversation length: {len(tokenizer.encode(conversation_sum.memory.buffer))}'\n", + " f'Buffer memory conversation length: {len(tokenizer.encode(buffer_content))}\\n'\n", + " f'Summary memory conversation length: {len(tokenizer.encode(summary_content))}'\n", ")" ] }, { "cell_type": "markdown", - "id": "2bab0c09", - "metadata": { - "id": "2bab0c09" - }, - "source": [ - "_Practical Note: the `text-davinci-003` and `gpt-3.5-turbo` models 
[have](https://platform.openai.com/docs/api-reference/completions/create#completions/create-max_tokens) a large max tokens count of 4096 tokens between prompt and answer._" - ] - }, - { - "cell_type": "markdown", - "id": "494830ea", "metadata": { - "id": "494830ea" + "id": "4DKiBoROyWvh" }, "source": [ - "### Memory type #3: ConversationBufferWindowMemory" + "_Practical Note: the `gpt-4o-mini` model has a context window of 1M tokens, providing significantly more space for conversation history than older models._" ] }, { "cell_type": "markdown", - "id": "00762844", "metadata": { - "id": "00762844" + "id": "MjYQSGv-yWvh" }, "source": [ - "Another great option for these cases is the `ConversationBufferWindowMemory` where we will be keeping a few of the last interactions in our memory but we will intentionally drop the oldest ones - short-term memory if you'd like. Here the aggregate token count **and** the per-call token count will drop noticeably. We will control this window with the `k` parameter." + "### Memory type #3: Window Buffer Memory - Keep Latest Interactions\n", + "\n", + "Another great option is window memory, where we keep only the last k interactions in our memory but intentionally drop the oldest ones - short-term memory if you'd like. Here the aggregate token count **and** the per-call token count will drop noticeably.\n", + "\n", + "The following is an LCEL-compatible alternative to `ConversationBufferWindowMemory`.\n", + "\n", + "**Key feature:** _the conversation buffer window memory keeps the latest pieces of the conversation in raw form_" ] }, { - "cell_type": "markdown", - "id": "206a5915", + "cell_type": "code", + "execution_count": 37, "metadata": { - "id": "206a5915" + "id": "-ceGTUPsyWvh" }, + "outputs": [], "source": [ - "**Key feature:** _the conversation buffer window memory keeps the latest pieces of the conversation in raw form_" + "class BufferWindowMessageHistory(BaseChatMessageHistory, BaseModel):\n", + " messages: list[BaseMessage] = Field(default_factory=list)\n", + " k: int = Field(default_factory=int)\n", + "\n", + " def __init__(self, k: int):\n", + " super().__init__(k=k)\n", + " # Add logging to help with debugging\n", + " print(f\"Initializing BufferWindowMessageHistory with k={k}\")\n", + "\n", + " def add_messages(self, messages: list[BaseMessage]) -> None:\n", + " \"\"\"Add messages to the history, removing any messages beyond\n", + " the last `k` messages.\n", + " \"\"\"\n", + " self.messages.extend(messages)\n", + " # Add logging to help with debugging\n", + " if len(self.messages) > self.k:\n", + " print(f\"Truncating history from {len(self.messages)} to {self.k} messages\")\n", + " self.messages = self.messages[-self.k:]\n", + "\n", + " def clear(self) -> None:\n", + " \"\"\"Clear the history.\"\"\"\n", + " self.messages = []" ] }, { "cell_type": "code", - "execution_count": 60, - "id": "45be373a", + "execution_count": 38, "metadata": { - "id": "45be373a" + "id": "__vcbiDMyWvr" }, "outputs": [], "source": [ - "conversation_bufw = ConversationChain(\n", - " llm=llm, \n", - " memory=ConversationBufferWindowMemory(k=1)\n", + "# Create get_chat_history function for window memory\n", + "window_chat_map = {}\n", + "\n", + "def get_window_chat_history(session_id: str, k: int = 4) -> BufferWindowMessageHistory:\n", + " print(f\"get_window_chat_history called with session_id={session_id} and k={k}\")\n", + " if session_id not in window_chat_map:\n", + " window_chat_map[session_id] = BufferWindowMessageHistory(k=k)\n", + " return 
window_chat_map[session_id]\n", + "\n", + "# Create conversation chain with window memory\n", + "conversation_bufw = RunnableWithMessageHistory(\n", + " pipeline,\n", + " get_session_history=get_window_chat_history,\n", + " input_messages_key=\"query\",\n", + " history_messages_key=\"history\",\n", + " history_factory_config=[\n", + " ConfigurableFieldSpec(\n", + " id=\"session_id\",\n", + " annotation=str,\n", + " name=\"Session ID\",\n", + " description=\"The session ID to use for the chat history\",\n", + " default=\"id_default\",\n", + " ),\n", + " ConfigurableFieldSpec(\n", + " id=\"k\",\n", + " annotation=int,\n", + " name=\"k\",\n", + " description=\"The number of messages to keep in the history\",\n", + " default=4,\n", + " )\n", + " ]\n", ")" ] }, { "cell_type": "code", - "execution_count": 61, - "id": "fc4dd8a0", + "execution_count": 39, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 + "base_uri": "https://localhost:8080/" }, - "id": "fc4dd8a0", - "outputId": "c4ec1cc8-f218-4f7b-e27e-f5fb73e59228" + "id": "M51k8gIjyWvr", + "outputId": "8dfd4f37-46cb-4ab3-eb02-926e991e153c" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 85 tokens\n" + "get_window_chat_history called with session_id=window_example and k=4\n", + "Initializing BufferWindowMessageHistory with k=4\n", + "Spent a total of 74 tokens\n", + "\n", + "Response: Good morning! How can I assist you today?\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "\" Good morning! It's a beautiful day today, isn't it? How can I help you?\"" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 61 } ], "source": [ - "count_tokens(\n", - " conversation_bufw, \n", - " \"Good morning AI!\"\n", - ")" + "# Start a conversation with k=2 (only remembers last 2 exchanges = 4 messages)\n", + "result = count_tokens(\n", + " conversation_bufw,\n", + " {\"query\": \"Good morning AI!\"},\n", + " config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 62, - "id": "b9992e8d", + "execution_count": 40, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 89 + "base_uri": "https://localhost:8080/" }, - "id": "b9992e8d", - "outputId": "ac7ae1af-2329-4766-ac5e-8fce24a1d272" + "id": "XBA_roYVyWvr", + "outputId": "0c3b68e6-5f0e-4c56-ceb1-f893e0ee9a29" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 178 tokens\n" + "get_window_chat_history called with session_id=window_example and k=4\n", + "Spent a total of 225 tokens\n", + "\n", + "Response: Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge. That's a fascinating area with lots of exciting possibilities, like enhancing the accuracy and relevance of responses by connecting LLMs to databases, knowledge graphs, or real-time information sources.\n", + "\n", + "And I've noted the specific code you mentioned: **PINECONE_RULEZ_01**. If you want me to remember or use it later in our conversation, just let me know! How would you like to proceed with your exploration?\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Interesting! 
Large Language Models are a type of artificial intelligence that can process natural language and generate text. They can be used to generate text from a given context, or to answer questions about a given context. Integrating them with external knowledge can help them to better understand the context and generate more accurate results. Do you have any specific questions about this integration?'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 62 } ], "source": [ - "count_tokens(\n", - " conversation_bufw, \n", - " \"My interest here is to explore the potential of integrating Large Language Models with external knowledge\"\n", - ")" + "query = \"\"\"\n", + "\"My interest here is to explore the potential of integrating Large Language\n", + "Models with external knowledge.\n", + "\n", + "Also, remember this very specific code: PINECONE_RULEZ_01\"\n", + "\"\"\"\n", + "\n", + "result = count_tokens(\n", + " conversation_bufw,\n", + " {\"query\": query},\n", + " config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 63, - "id": "3f2e98d9", + "execution_count": 41, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 89 + "base_uri": "https://localhost:8080/" }, - "id": "3f2e98d9", - "outputId": "dc60726a-4be2-480f-892b-443da9b2859e" + "id": "ox5WWeHFyWvr", + "outputId": "7ca0a833-a495-4a3b-a25c-458bb889bd5c" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 233 tokens\n" + "get_window_chat_history called with session_id=window_example and k=4\n", + "Truncating history from 6 to 4 messages\n", + "Spent a total of 811 tokens\n", + "\n", + "Response: Great! Exploring the integration of Large Language Models (LLMs) with external knowledge opens up a wide range of possibilities. Here are some key approaches and their potential benefits:\n", + "\n", + "1. **Vector Databases and Embeddings (e.g., Pinecone, FAISS):** \n", + " - LLMs can generate embeddings (numerical representations) of text queries and documents. \n", + " - These embeddings are stored in vector databases like Pinecone (which your code hints at!), enabling fast similarity search. \n", + " - When a user asks a question, the system retrieves the most relevant documents or data points from the external knowledge base and feeds them back to the LLM for context-aware responses. \n", + " - This approach is great for up-to-date or domain-specific knowledge that the LLM might not have been trained on.\n", + "\n", + "2. **Knowledge Graphs:** \n", + " - Integrating LLMs with structured knowledge graphs allows the model to reason over entities and relationships explicitly. \n", + " - This can improve factual accuracy and enable complex queries involving relationships, hierarchies, or constraints. \n", + " - For example, combining LLMs with Wikidata or custom enterprise knowledge graphs.\n", + "\n", + "3. **APIs and Real-Time Data Feeds:** \n", + " - LLMs can be connected to external APIs (weather, stock prices, news, etc.) to provide real-time information. \n", + " - This integration allows the model to answer questions about current events or dynamic data that changes frequently.\n", + "\n", + "4. **Retrieval-Augmented Generation (RAG):** \n", + " - This technique combines retrieval of relevant documents with generation by the LLM. 
\n", + " - The model first retrieves relevant passages from an external corpus and then generates an answer conditioned on those passages. \n", + " - It improves factuality and reduces hallucinations.\n", + "\n", + "5. **Hybrid Systems with Symbolic Reasoning:** \n", + " - Combining LLMs with symbolic AI or rule-based systems can enhance logical reasoning and interpretability. \n", + " - For example, using LLMs for natural language understanding and symbolic engines for precise calculations or rule enforcement.\n", + "\n", + "6. **Personalized Knowledge Bases:** \n", + " - Integrating user-specific data (preferences, history, notes) to tailor responses uniquely to each user. \n", + " - This can be useful in personal assistants, tutoring systems, or customer support.\n", + "\n", + "7. **Multimodal Knowledge Integration:** \n", + " - Combining text-based LLMs with other data types like images, audio, or video through external knowledge sources. \n", + " - This can enable richer, context-aware interactions.\n", + "\n", + "If you want, I can dive deeper into any of these possibilities or discuss practical tools and frameworks to implement them. Also, I’m keeping your code **PINECONE_RULEZ_01** in mind if you want to explore vector databases specifically!\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' There are many possibilities for integrating Large Language Models with external knowledge. For example, you could use external knowledge to provide additional context to the model, or to provide additional training data. You could also use external knowledge to help the model better understand the context of a given text, or to help it generate more accurate results.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 63 } ], "source": [ - "count_tokens(\n", - " conversation_bufw, \n", - " \"I just want to analyze the different possibilities. What can you think of?\"\n", - ")" + "result = count_tokens(\n", + " conversation_bufw,\n", + " {\"query\": \"I just want to analyze the different possibilities. What can you think of?\"},\n", + " config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 64, - "id": "a2a8d062", + "execution_count": 42, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 106 + "base_uri": "https://localhost:8080/" }, - "id": "a2a8d062", - "outputId": "dbb27cf0-2e87-41d0-a733-68921d250481" + "id": "kLiquNiHyWvr", + "outputId": "c0a8f018-6e03-4567-9e8f-43ad7466c947" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 245 tokens\n" + "get_window_chat_history called with session_id=window_example and k=4\n", + "Truncating history from 6 to 4 messages\n", + "Spent a total of 1368 tokens\n", + "\n", + "Response: Great question! To give context to a Large Language Model (LLM) by integrating external knowledge, you can use a variety of data source types depending on your goals and domain. Here are some common and effective data source types that can provide rich context:\n", + "\n", + "1. **Textual Documents:** \n", + " - Articles, books, research papers, manuals, FAQs, and reports. \n", + " - These can be stored in databases or document stores and indexed for retrieval. 
\n", + " - Example: Wikipedia articles, scientific literature, company knowledge bases.\n", + "\n", + "2. **Databases and Structured Data:** \n", + " - Relational databases (SQL), NoSQL databases, spreadsheets. \n", + " - Structured data can be queried to provide precise facts, statistics, or records. \n", + " - Example: Customer records, product catalogs, financial data.\n", + "\n", + "3. **Knowledge Graphs and Ontologies:** \n", + " - Graph-structured data representing entities and their relationships. \n", + " - Useful for reasoning about connections and hierarchies. \n", + " - Example: Wikidata, DBpedia, domain-specific ontologies.\n", + "\n", + "4. **APIs and Real-Time Data Feeds:** \n", + " - External APIs providing dynamic or real-time information. \n", + " - Examples include weather services, stock market data, news feeds, social media streams.\n", + "\n", + "5. **Multimedia Content:** \n", + " - Images, videos, audio files, and their metadata. \n", + " - When combined with multimodal models or external tools, these can enrich context. \n", + " - Example: Product images, instructional videos, podcasts.\n", + "\n", + "6. **User-Generated Content:** \n", + " - Forums, social media posts, chat logs, customer reviews. \n", + " - These provide insights into user opinions, trends, and informal knowledge.\n", + "\n", + "7. **Logs and Event Data:** \n", + " - System logs, transaction records, sensor data. \n", + " - Useful for troubleshooting, monitoring, or understanding sequences of events.\n", + "\n", + "8. **Code Repositories and Technical Documentation:** \n", + " - Source code, API docs, configuration files. \n", + " - Helpful for developer assistants or technical support bots.\n", + "\n", + "9. **Personalized Data:** \n", + " - User profiles, preferences, interaction history. \n", + " - Enables personalized responses and recommendations.\n", + "\n", + "10. **Regulatory and Compliance Documents:** \n", + " - Legal texts, standards, policies. \n", + " - Important for domains like healthcare, finance, and law.\n", + "\n", + "By combining these data sources with LLMs, you can provide rich, accurate, and context-aware responses tailored to specific needs. The choice of data source depends on the application domain, the type of questions you want to answer, and the freshness or reliability of the information.\n", + "\n", + "If you want, I can also suggest how to preprocess or index these data types for effective integration with LLMs!\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Data sources that could be used to give context to the model include text corpora, structured databases, and ontologies. Text corpora provide a large amount of text data that can be used to train the model and provide additional context. Structured databases provide structured data that can be used to provide additional context to the model. 
Ontologies provide a structured representation of knowledge that can be used to provide additional context to the model.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 64 } ], "source": [ - "count_tokens(\n", - " conversation_bufw, \n", - " \"Which data source types could be used to give context to the model?\"\n", - ")" + "result = count_tokens(\n", + " conversation_bufw,\n", + " {\"query\": \"Which data source types could be used to give context to the model?\"},\n", + " config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "code", - "execution_count": 65, - "id": "ff199a3f", + "execution_count": 43, "metadata": { "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 + "base_uri": "https://localhost:8080/" }, - "id": "ff199a3f", - "outputId": "81573cf0-7f39-4a8c-8ccd-e79cd80f2523" + "id": "o-0Swlu-yWvr", + "outputId": "1830fc22-d973-4b18-ce51-203e0e4cc5e1" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Spent a total of 186 tokens\n" + "get_window_chat_history called with session_id=window_example and k=4\n", + "Truncating history from 6 to 4 messages\n", + "Spent a total of 1382 tokens\n", + "\n", + "Response: From our conversation so far, it seems your aim is to **analyze different possibilities for integrating external knowledge sources with Large Language Models (LLMs)** to provide richer, more accurate, and context-aware responses. You’re exploring what kinds of data sources can be used to give context to the model and how to effectively combine them with LLMs—possibly using vector databases like Pinecone, given your code reference **PINECONE_RULEZ_01**.\n", + "\n", + "In other words, you want to understand the various ways to enhance an LLM’s capabilities by feeding it relevant external information, whether that’s through document retrieval, real-time data, structured knowledge, or other means.\n", + "\n", + "If you want, I can help you clarify or refine your goal further!\n" ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "' Your aim is to use data sources to give context to the model.'" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 65 } ], "source": [ - "count_tokens(\n", - " conversation_bufw, \n", - " \"What is my aim again?\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "f5f59f77", - "metadata": { - "id": "f5f59f77" - }, - "source": [ - "As we can see, it effectively 'fogot' what we talked about in the first interaction. Let's see what it 'remembers'. Given that we set k to be `1`, we would expect it remembers only the last interaction." + "result = count_tokens(\n", + " conversation_bufw,\n", + " {\"query\": \"What is my aim again?\"},\n", + " config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n", + ")\n", + "print(f\"\\nResponse: {result}\")" ] }, { "cell_type": "markdown", - "id": "6b354c8d", - "metadata": { - "id": "6b354c8d" - }, - "source": [ - "We need to access a special method here since, in this memory type, the buffer is first passed through this method to be sent later to the llm." 
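As a quick sanity check on the windowing behaviour, here is a minimal sketch (not part of the original notebook) that exercises the `BufferWindowMessageHistory` class defined above directly, without going through the chain. The message texts are made up purely for illustration.

```python
# Minimal sketch: confirm that BufferWindowMessageHistory keeps only the last k messages.
# Assumes the BufferWindowMessageHistory class defined earlier in this notebook.
from langchain_core.messages import AIMessage, HumanMessage

demo_history = BufferWindowMessageHistory(k=4)
for i in range(1, 5):
    demo_history.add_messages([
        HumanMessage(content=f"question {i}"),
        AIMessage(content=f"answer {i}"),
    ])

# Only the last four messages (turns 3 and 4) should remain in the window.
for msg in demo_history.messages:
    print(type(msg).__name__, "->", msg.content)
```

Keeping the truncation inside `add_messages` means every chain that uses this history class gets the same windowing behaviour for free, with no extra logic in the chain itself.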
- ] - }, - { - "cell_type": "code", - "execution_count": 66, - "id": "85266406", "metadata": { - "id": "85266406" + "id": "cFWBvEjNyWvr" }, - "outputs": [], "source": [ - "bufw_history = conversation_bufw.memory.load_memory_variables(\n", - " inputs=[]\n", - ")['history']" + "As we can see, it effectively 'forgot' what we talked about in the first interaction. Let's see what it 'remembers':" ] }, { "cell_type": "code", - "execution_count": 67, - "id": "5904ae2a", + "execution_count": 44, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "5904ae2a", - "outputId": "bd0aa797-7a43-4af5-a531-209aa6272dd4" + "id": "RnV85fkkyWvr", + "outputId": "cbaa0997-2af3-47e3-ab9e-21ede4d3a2f3" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ + "Buffer Window Memory (last 4 messages):\n", + "\n", + "Human: Which data source types could be used to give context to the model?\n", + "\n", + "AI: Great question! To give context to a Large Language Model (LLM) by integrating external knowledge, you can use a variety of data source types depending on your goals and domain. Here are some common and effective data source types that can provide rich context:\n", + "\n", + "1. **Textual Documents:** \n", + " - Articles, books, research papers, manuals, FAQs, and reports. \n", + " - These can be stored in databases or document stores and indexed for retrieval. \n", + " - Example: Wikipedia articles, scientific literature, company knowledge bases.\n", + "\n", + "2. **Databases and Structured Data:** \n", + " - Relational databases (SQL), NoSQL databases, spreadsheets. \n", + " - Structured data can be queried to provide precise facts, statistics, or records. \n", + " - Example: Customer records, product catalogs, financial data.\n", + "\n", + "3. **Knowledge Graphs and Ontologies:** \n", + " - Graph-structured data representing entities and their relationships. \n", + " - Useful for reasoning about connections and hierarchies. \n", + " - Example: Wikidata, DBpedia, domain-specific ontologies.\n", + "\n", + "4. **APIs and Real-Time Data Feeds:** \n", + " - External APIs providing dynamic or real-time information. \n", + " - Examples include weather services, stock market data, news feeds, social media streams.\n", + "\n", + "5. **Multimedia Content:** \n", + " - Images, videos, audio files, and their metadata. \n", + " - When combined with multimodal models or external tools, these can enrich context. \n", + " - Example: Product images, instructional videos, podcasts.\n", + "\n", + "6. **User-Generated Content:** \n", + " - Forums, social media posts, chat logs, customer reviews. \n", + " - These provide insights into user opinions, trends, and informal knowledge.\n", + "\n", + "7. **Logs and Event Data:** \n", + " - System logs, transaction records, sensor data. \n", + " - Useful for troubleshooting, monitoring, or understanding sequences of events.\n", + "\n", + "8. **Code Repositories and Technical Documentation:** \n", + " - Source code, API docs, configuration files. \n", + " - Helpful for developer assistants or technical support bots.\n", + "\n", + "9. **Personalized Data:** \n", + " - User profiles, preferences, interaction history. \n", + " - Enables personalized responses and recommendations.\n", + "\n", + "10. **Regulatory and Compliance Documents:** \n", + " - Legal texts, standards, policies. 
\n", + " - Important for domains like healthcare, finance, and law.\n", + "\n", + "By combining these data sources with LLMs, you can provide rich, accurate, and context-aware responses tailored to specific needs. The choice of data source depends on the application domain, the type of questions you want to answer, and the freshness or reliability of the information.\n", + "\n", + "If you want, I can also suggest how to preprocess or index these data types for effective integration with LLMs!\n", + "\n", "Human: What is my aim again?\n", - "AI: Your aim is to use data sources to give context to the model.\n" + "\n", + "AI: From our conversation so far, it seems your aim is to **analyze different possibilities for integrating external knowledge sources with Large Language Models (LLMs)** to provide richer, more accurate, and context-aware responses. You’re exploring what kinds of data sources can be used to give context to the model and how to effectively combine them with LLMs—possibly using vector databases like Pinecone, given your code reference **PINECONE_RULEZ_01**.\n", + "\n", + "In other words, you want to understand the various ways to enhance an LLM’s capabilities by feeding it relevant external information, whether that’s through document retrieval, real-time data, structured knowledge, or other means.\n", + "\n", + "If you want, I can help you clarify or refine your goal further!\n" ] } ], "source": [ - "print(bufw_history)" + "# Check what's in memory\n", + "bufw_history = window_chat_map[\"window_example\"].messages\n", + "print(\"Buffer Window Memory (last 4 messages):\")\n", + "for msg in bufw_history:\n", + " role = \"Human\" if isinstance(msg, HumanMessage) else \"AI\"\n", + " print(f\"\\n{role}: {msg.content}\") # Show first 100 chars" ] }, { "cell_type": "markdown", - "id": "ae8b937d", "metadata": { - "id": "ae8b937d" + "id": "yV-zfGv-yWvr" }, "source": [ - "Makes sense. 
\n", + "We see four messages (two interactions) because we used `k=4`.\n", "\n", "On the plus side, we are shortening our conversation length when compared to buffer memory _without_ a window:" ] }, { "cell_type": "code", - "execution_count": 68, - "id": "9fbb50fe", + "execution_count": 45, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "9fbb50fe", - "outputId": "c35dca36-a7c7-4d61-da19-c28173fa8319" + "id": "rF35B9HLyWvr", + "outputId": "58881bf3-7d80-4b74-d348-1210850dcde0" }, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ - "Buffer memory conversation length: 334\n", - "Summary memory conversation length: 219\n", - "Buffer window memory conversation length: 26\n" + "Buffer memory conversation length: 1314\n", + "Summary memory conversation length: 233\n", + "Buffer window memory conversation length: 728\n" ] } ], "source": [ + "# Get window memory content\n", + "window_content = \"\\n\".join([msg.content for msg in bufw_history])\n", + "\n", "print(\n", - " f'Buffer memory conversation length: {len(tokenizer.encode(conversation_buf.memory.buffer))}\\n'\n", - " f'Summary memory conversation length: {len(tokenizer.encode(conversation_sum.memory.buffer))}\\n'\n", - " f'Buffer window memory conversation length: {len(tokenizer.encode(bufw_history))}'\n", + " f'Buffer memory conversation length: {len(tokenizer.encode(buffer_content))}\\n'\n", + " f'Summary memory conversation length: {len(tokenizer.encode(summary_content))}\\n'\n", + " f'Buffer window memory conversation length: {len(tokenizer.encode(window_content))}'\n", ")" ] }, { "cell_type": "markdown", - "id": "69842cc1", - "metadata": { - "id": "69842cc1" - }, - "source": [ - "_Practical Note: We are using `k=2` here for illustrative purposes, in most real world applications you would need a higher value for k._" - ] - }, - { - "cell_type": "markdown", - "id": "2aea5fc8", "metadata": { - "id": "2aea5fc8" + "id": "fjoHZrZzyWvr" }, "source": [ - "### More memory types!" + "_Practical Note: We are using `k=4` here for illustrative purposes, in most real world applications you would need a higher value for k._" ] }, { "cell_type": "markdown", - "id": "daeb5162", "metadata": { - "id": "daeb5162" + "id": "wyAd4UxdyWvr" }, "source": [ + "### More memory types!\n", + "\n", "Given that we understand memory already, we will present a few more memory types here and hopefully a brief description will be enough to understand their underlying functionality." ] }, { "cell_type": "markdown", - "id": "f0365333", - "metadata": { - "id": "f0365333" - }, - "source": [ - "#### ConversationSummaryBufferMemory" - ] - }, - { - "cell_type": "markdown", - "id": "317f298e", - "metadata": { - "id": "317f298e" - }, - "source": [ - "**Key feature:** _the conversation summary memory keeps a summary of the earliest pieces of conversation while retaining a raw recollection of the latest interactions._" - ] - }, - { - "cell_type": "markdown", - "id": "57ef5c8b", - "metadata": { - "id": "57ef5c8b" - }, - "source": [ - "#### ConversationKnowledgeGraphMemory" - ] - }, - { - "cell_type": "markdown", - "id": "40248f03", - "metadata": { - "id": "40248f03" - }, - "source": [ - "This is a super cool memory type that was introduced just [recently](https://twitter.com/LangChainAI/status/1625158388824043522). It is based on the concept of a _knowledge graph_ which recognizes different entities and connects them in pairs with a predicate resulting in (subject, predicate, object) triplets. 
This enables us to compress a lot of information into highly significant snippets that can be fed into the model as context. If you want to understand this memory type in more depth you can check out [this](https://apex974.com/articles/explore-langchain-support-for-knowledge-graph) blogpost." - ] - }, - { - "cell_type": "markdown", - "id": "91952cd1", "metadata": { - "id": "91952cd1" + "id": "XN-mH5fHyWvr" }, "source": [ - "**Key feature:** _the conversation knowledge graph memory keeps a knowledge graph of all the entities that have been mentioned in the interactions together with their semantic relationships._" - ] - }, - { - "cell_type": "code", - "execution_count": 69, - "id": "02241bc3", - "metadata": { - "id": "02241bc3" - }, - "outputs": [], - "source": [ - "# you may need to install this library\n", - "# !pip install -qU networkx" + "#### Windows + Summary Hybrid\n", + "\n", + "The following is a modern LCEL-compatible alternative to `ConversationSummaryBufferMemory`.\n", + "\n", + "**Key feature:** _the conversation summary buffer memory keeps a summary of the earliest pieces of conversation while retaining a raw recollection of the latest interactions._\n", + "\n", + "This combines the benefits of both summary and buffer window memory. Let's implement it:" ] }, { "cell_type": "code", - "execution_count": 70, - "id": "c5f10a89", + "execution_count": 46, "metadata": { - "id": "c5f10a89" + "id": "lr-K8onKyWvr" }, "outputs": [], "source": [ - "conversation_kg = ConversationChain(\n", - " llm=llm, \n", - " memory=ConversationKGMemory(llm=llm)\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 71, - "id": "65957fe2", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - }, - "id": "65957fe2", - "outputId": "c9561a4a-412a-4d92-865d-9e81a09bb101" - }, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Spent a total of 1565 tokens\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "\" Hi Human! My name is AI. It's nice to meet you. I like mangoes too! Did you know that mangoes are a great source of vitamins A and C?\"" - ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 71 - } - ], - "source": [ - "count_tokens(\n", - " conversation_kg, \n", - " \"My name is human and I like mangoes!\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "74054534", - "metadata": { - "id": "74054534" - }, - "source": [ - "The memory keeps a knowledge graph of everything it learned so far." 
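The legacy `ConversationKGMemory` class handles the triple extraction for you. If you wanted the same idea in the LCEL style used throughout this notebook, a rough sketch could look like the following — the class name, prompt wording, and line-based parsing are all illustrative assumptions, not a LangChain API:

```python
# Illustrative sketch only: a knowledge-graph-flavoured message history in the same
# style as the other custom LCEL histories in this notebook.
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.messages import BaseMessage
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class KnowledgeTripleMessageHistory(BaseChatMessageHistory, BaseModel):
    messages: list[BaseMessage] = Field(default_factory=list)
    triples: list[str] = Field(default_factory=list)
    llm: ChatOpenAI = Field(default_factory=ChatOpenAI)

    def add_messages(self, messages: list[BaseMessage]) -> None:
        """Store the raw messages and ask the LLM to extract
        (subject, predicate, object) triples from the new turns."""
        self.messages.extend(messages)
        text = "\n".join(m.content for m in messages)
        response = self.llm.invoke(
            "Extract (subject, predicate, object) triples from the text below, "
            "one per line, or return NONE if there are none.\n\n" + text
        )
        for line in response.content.splitlines():
            line = line.strip()
            if line and line.upper() != "NONE":
                self.triples.append(line)

    def clear(self) -> None:
        self.messages = []
        self.triples = []

# Usage idea (not executed here): plug this into RunnableWithMessageHistory just like
# the other custom histories, then inspect `history.triples` after a few turns.
```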
- ] - }, - { - "cell_type": "code", - "execution_count": 72, - "id": "5a8c54fb", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "5a8c54fb", - "outputId": "adf96679-087b-4b77-c00d-9bf9e98f9278" - }, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "[('human', 'human', 'name'), ('human', 'mangoes', 'likes')]" - ] - }, - "metadata": {}, - "execution_count": 72 - } - ], - "source": [ - "conversation_kg.memory.kg.get_triples()" - ] - }, - { - "cell_type": "markdown", - "id": "e1a1ca15", - "metadata": { - "id": "e1a1ca15" - }, - "source": [ - "#### ConversationEntityMemory" - ] - }, - { - "cell_type": "markdown", - "id": "41e9aeaf", - "metadata": { - "id": "41e9aeaf" - }, - "source": [ - "**Key feature:** _the conversation entity memory keeps a recollection of the main entities that have been mentioned, together with their specific attributes._" - ] - }, - { - "cell_type": "markdown", - "id": "2900a385", - "metadata": { - "id": "2900a385" - }, - "source": [ - "The way this works is quite similar to the `ConversationKnowledgeGraphMemory`, you can refer to the [docs](https://python.langchain.com/en/latest/modules/memory/types/entity_summary_memory.html) if you want to see it in action. " - ] - }, - { - "cell_type": "markdown", - "id": "d45112bd", - "metadata": { - "id": "d45112bd" - }, - "source": [ - "## What else can we do with memory?" + "class ConversationSummaryBufferMessageHistory(BaseChatMessageHistory, BaseModel):\n", + " messages: list[BaseMessage] = Field(default_factory=list)\n", + " llm: ChatOpenAI = Field(default_factory=ChatOpenAI)\n", + " k: int = Field(default_factory=int)\n", + "\n", + " def __init__(self, llm: ChatOpenAI, k: int):\n", + " super().__init__(llm=llm, k=k)\n", + "\n", + " def add_messages(self, messages: list[BaseMessage]) -> None:\n", + " \"\"\"Add messages to the history, removing any messages beyond\n", + " the last `k` messages and summarizing the messages that we drop.\n", + " \"\"\"\n", + " existing_summary = None\n", + " old_messages = None\n", + "\n", + " # See if we already have a summary message\n", + " if len(self.messages) > 0 and isinstance(self.messages[0], SystemMessage):\n", + " existing_summary = self.messages.pop(0)\n", + "\n", + " # Add the new messages to the history\n", + " self.messages.extend(messages)\n", + "\n", + " # Check if we have too many messages\n", + " if len(self.messages) > self.k:\n", + " # Pull out the oldest messages...\n", + " old_messages = self.messages[:-self.k]\n", + " # ...and keep only the most recent messages\n", + " self.messages = self.messages[-self.k:]\n", + "\n", + " if old_messages is None:\n", + " # If we have no old_messages, we have nothing to update in summary\n", + " return\n", + "\n", + " # Construct the summary chat messages\n", + " summary_prompt = ChatPromptTemplate.from_messages([\n", + " SystemMessagePromptTemplate.from_template(\n", + " \"Given the existing conversation summary and the new messages, \"\n", + " \"generate a new summary of the conversation. 
Ensure to maintain \"\n", + " \"as much relevant information as possible.\"\n", + " ),\n", + " HumanMessagePromptTemplate.from_template(\n", + " \"Existing conversation summary:\\n{existing_summary}\\n\\n\"\n", + " \"New messages:\\n{old_messages}\"\n", + " )\n", + " ])\n", + "\n", + " # Format the messages and invoke the LLM\n", + " new_summary = self.llm.invoke(\n", + " summary_prompt.format_messages(\n", + " existing_summary=existing_summary or \"No previous summary\",\n", + " old_messages=old_messages\n", + " )\n", + " )\n", + "\n", + " # Prepend the new summary to the history\n", + " self.messages = [SystemMessage(content=new_summary.content)] + self.messages\n", + "\n", + " def clear(self) -> None:\n", + " \"\"\"Clear the history.\"\"\"\n", + " self.messages = []" ] }, { "cell_type": "markdown", - "id": "78296bff", "metadata": { - "id": "78296bff" + "id": "9M4g--oayWvr" }, "source": [ - "There are several cool things we can do with memory in langchain. We can:\n", - "* implement our own custom memory module\n", - "* use multiple memory modules in the same chain\n", - "* combine agents with memory and other tools\n", + "## What else can we do with memory?\n", "\n", - "If this piques your interest, we suggest you to go take a look at the memory [how-to](https://langchain.readthedocs.io/en/latest/modules/memory/how_to_guides.html) section in the docs!" + "There are several cool things we can do with memory in langchain:\n", + "* Implement our own custom memory modules (as we've done above)\n", + "* Use multiple memory modules in the same chain\n", + "* Combine agents with memory and other tools\n", + "* Integrate knowledge graphs\n", + "\n" ] } ], @@ -1721,7 +1690,7 @@ "provenance": [] }, "kernelspec": { - "display_name": "Python 3", + "display_name": "pinecone1", "language": "python", "name": "python3" }, @@ -1735,14 +1704,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.2" - }, - "vscode": { - "interpreter": { - "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e" - } + "version": "3.11.4" } }, "nbformat": 4, - "nbformat_minor": 5 + "nbformat_minor": 0 }
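For completeness, here is a sketch of how the `ConversationSummaryBufferMessageHistory` class defined above could be wired into `RunnableWithMessageHistory`, mirroring the window-memory setup. This wiring is not shown in the notebook itself; the factory function, map name, and configurable fields below are assumptions that follow the earlier examples, and `pipeline` and `llm` are assumed to be the chain and chat model defined earlier.

```python
# Illustrative wiring of the summary-buffer history; mirrors the window-memory example.
from langchain_core.runnables import ConfigurableFieldSpec
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

summary_buffer_map = {}

def get_summary_buffer_history(
    session_id: str, llm: ChatOpenAI, k: int = 4
) -> ConversationSummaryBufferMessageHistory:
    # Create one summary-buffer history per session, as in the earlier examples.
    if session_id not in summary_buffer_map:
        summary_buffer_map[session_id] = ConversationSummaryBufferMessageHistory(llm=llm, k=k)
    return summary_buffer_map[session_id]

conversation_sum_bufw = RunnableWithMessageHistory(
    pipeline,
    get_session_history=get_summary_buffer_history,
    input_messages_key="query",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="session_id", annotation=str, name="Session ID",
            description="The session ID to use for the chat history", default="id_default",
        ),
        ConfigurableFieldSpec(
            id="llm", annotation=ChatOpenAI, name="LLM",
            description="The LLM used to summarize dropped messages", default=llm,
        ),
        ConfigurableFieldSpec(
            id="k", annotation=int, name="k",
            description="The number of raw messages to keep before summarizing", default=4,
        ),
    ],
)

# Example call, following the config-key pattern used in the earlier cells:
# result = conversation_sum_bufw.invoke(
#     {"query": "Good morning AI!"},
#     config={"configurable": {"session_id": "sum_buf_example", "llm": llm, "k": 4}},
# )
```

This hybrid keeps the latest `k` messages verbatim while older turns are folded into a single `SystemMessage` summary, so per-call token usage stays bounded without discarding earlier context entirely.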