Open
Description
I think extract_data() doesn't like it when the pydantic model contains an array (a list field).
Here is a reprex and the error message:
from typing import Literal
from pydantic import BaseModel, Field
from chatlas import ChatAnthropic, content_pdf_file
import pandas as pd
import dotenv
import os
import json
dotenv.load_dotenv()
text = "The new quantum computing breakthrough could revolutionize the tech industry."

class Classification(BaseModel):
    name: Literal[
        "Politics", "Sports", "Technology", "Entertainment", "Business", "Other"
    ] = Field(description="The category name")
    score: float = Field(description="The classification score for the category, ranging from 0.0 to 1.0.")

class Classifications(BaseModel):
    """Array of classification results. The scores should sum to 1."""
    classifications: list[Classification]

chat = ChatAnthropic(
    system_prompt = "you are a friendly but terse assistant"
)
data = chat.extract_data(text, data_model=Classifications)
pd.DataFrame(data["classifications"])
gives
BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'tools.0.custom.input_schema: JSON schema is invalid. It must match JSON Schema draft 2020-12 (https://json-schema.org/draft/2020-12). Learn more about tool use at https://docs.anthropic.com/en/docs/tool-use.'}}
File ~/DsSandbox/price-parser-python/test-example.py:33
     26     classifications: list[Classification]
     29 chat = ChatAnthropic(
     30     system_prompt = "you are a friendly but terse assistant"
     31 )
---> 33 data = chat.extract_data(text, data_model=Classifications)
     34 pd.DataFrame(data["classifications"])
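To see what might be ending up in input_schema, it may help to print the JSON Schema that pydantic generates for each model. This is a minimal sketch, assuming pydantic v2 (where `model_json_schema()` is available) and using slightly trimmed copies of the models from the reprex. For a nested model, pydantic typically places the inner model under `$defs` and points to it with `$ref`; whether that indirection is what the API rejects here is only a guess, not a confirmed cause.

```python
import json
from typing import Literal

from pydantic import BaseModel, Field


class Classification(BaseModel):
    name: Literal[
        "Politics", "Sports", "Technology", "Entertainment", "Business", "Other"
    ] = Field(description="The category name")
    score: float = Field(description="The classification score, from 0.0 to 1.0.")


class Classifications(BaseModel):
    """Array of classification results. The scores should sum to 1."""

    classifications: list[Classification]


# Schema for the model that works, and for the wrapper model that fails.
flat_schema = Classification.model_json_schema()
nested_schema = Classifications.model_json_schema()

# The nested model refers to the inner model through "$defs"/"$ref";
# the flat model has no such indirection.
print(json.dumps(nested_schema, indent=2))
print("$defs" in flat_schema, "$defs" in nested_schema)
```

Comparing the two dumps shows the structural difference between the call that succeeds and the one that fails, which may narrow down where the schema becomes invalid.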
But this works (note that data_model is Classification here, not the array wrapper Classifications):
data = chat.extract_data(text, data_model=Classification)
print(data)
gives
{'name': 'Technology', 'score': 0.95}
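If the `$defs`/`$ref` indirection of the nested model turns out to be the problem (an unconfirmed assumption), one possible stopgap is to inline the references so the schema is self-contained. The helper below, `inline_refs`, is hypothetical and not part of chatlas or pydantic. It handles only local `#/$defs/...` references and does not guard against recursive models. The example schema is hand-written to resemble what pydantic v2 emits for Classifications.

```python
from typing import Any


def inline_refs(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `schema` with local "#/$defs/..." references inlined."""
    defs = schema.get("$defs", {})

    def resolve(node: Any) -> Any:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                # Replace the reference with the resolved definition,
                # keeping sibling keys such as "description".
                target = resolve(defs[ref[len("#/$defs/"):]])
                siblings = {k: resolve(v) for k, v in node.items() if k != "$ref"}
                return {**target, **siblings}
            # Drop "$defs" from the output; everything it held is inlined.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


# Hand-written stand-in for the schema of the Classifications model.
schema = {
    "$defs": {
        "Classification": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "score": {"type": "number"},
            },
            "required": ["name", "score"],
        }
    },
    "type": "object",
    "properties": {
        "classifications": {
            "type": "array",
            "items": {"$ref": "#/$defs/Classification"},
        }
    },
    "required": ["classifications"],
}

flat = inline_refs(schema)
print(flat["properties"]["classifications"]["items"]["type"])
```

The issue does not show a way to hand a pre-built schema to extract_data, so this sketch stops at producing the inlined dict for comparison.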