
Commit 5f79d21

Add pure local mode (#20)
* add local mode and readme updates
* ensure using newest lambdaprompt
* fix typo
1 parent: b4d5f25

File tree

* README.md
* pyproject.toml
* sketch/pandas_extension.py

3 files changed: +15 −5 lines

README.md

Lines changed: 8 additions & 1 deletion
````diff
@@ -68,7 +68,14 @@ df['capitol'] = pd.DataFrame({'State': ['Colorado', 'Kansas', 'California', 'New
 
 ## Sketch currently uses `prompts.approx.dev` to help run with minimal setup
 
-In the future, we plan to update the prompts at this endpoint with our own custom foundation model, built to answer questions more accurately than GPT-3 can with its minimal data context.
+You can also directly use a few pre-built hugging face models (right now `MPT-7B` and `StarCoder`), which will run entirely locally (once you download the model weights from HF).
+Do this by setting 3 environment variables:
+
+```python
+os.environ['LAMBDAPROMPT_BACKEND'] = 'StarCoder'
+os.environ['SKETCH_USE_REMOTE_LAMBDAPROMPT'] = 'False'
+os.environ['HF_ACCESS_TOKEN'] = 'your_hugging_face_token'
+```
 
 You can also call OpenAI directly (and not use our endpoint) by using your own API key. To do this, set 2 environment variables.
````
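In practice the new local path boils down to exporting those three variables before sketch issues its first prompt. A minimal usage sketch, assuming the `df.sketch.ask` accessor that sketch's pandas extension registers (the `pandas_extension.py` hunks below define the underlying `ask_prompt`); the toy dataframe and question are illustrative only:

```python
import os

# Choose the local lambdaprompt backend before importing/using sketch,
# using the three variables from the README addition above.
os.environ["LAMBDAPROMPT_BACKEND"] = "StarCoder"
os.environ["SKETCH_USE_REMOTE_LAMBDAPROMPT"] = "False"
os.environ["HF_ACCESS_TOKEN"] = "your_hugging_face_token"  # used to download model weights from HF

import pandas as pd
import sketch  # registers the .sketch accessor on DataFrames

df = pd.DataFrame({"State": ["Colorado", "Kansas"], "Population": [5_800_000, 2_900_000]})
df.sketch.ask("Which state has the larger population?")  # answered entirely by the local model
```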

pyproject.toml

Lines changed: 4 additions & 1 deletion
```diff
@@ -17,12 +17,15 @@ dependencies = [
   "datasketch>=1.5.8",
   "datasketches>=4.0.0",
   "ipython",
-  "lambdaprompt",
+  "lambdaprompt>=0.5.2",
   "packaging"
 ]
 urls = {homepage = "https://github.com/approximatelabs/sketch"}
 dynamic = ["version"]
 
+[project.optional-dependencies]
+local = ["lambdaprompt[local]"]
+all = ["sketch[local]"]
 
 [tool.setuptools_scm]
 
```
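With these extras in place, pulling in the local backend should be something like `pip install "sketch[local]"` (or `sketch[all]`), which resolves to `lambdaprompt[local]` and its local-model dependencies, while a plain `pip install sketch` keeps the lighter remote-only footprint.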

sketch/pandas_extension.py

Lines changed: 3 additions & 3 deletions
```diff
@@ -181,7 +181,7 @@ def call_prompt_on_dataframe(df, prompt, **kwargs):
     return text_to_copy
 
 
-howto_prompt = lambdaprompt.GPT3Prompt(
+howto_prompt = lambdaprompt.Completion(
     """
 For the pandas dataframe ({{ dfname }}) the user wants code to solve a problem.
 Summary statistics and descriptive data of dataframe [`{{ dfname }}`]:
@@ -234,7 +234,7 @@ def howto_from_parts(
     return code
 
 
-ask_prompt = lambdaprompt.GPT3Prompt(
+ask_prompt = lambdaprompt.Completion(
     """
 For the pandas dataframe ({{ dfname }}) the user wants an answer to a question about the data.
 Summary statistics and descriptive data of dataframe [`{{ dfname }}`]:
@@ -338,7 +338,7 @@ def apply(self, prompt_template_string, **kwargs):
             raise RuntimeError(
                 f"Too many rows for apply \n (SKETCH_ROW_OVERRIDE_LIMIT: {row_limit}, Actual: {len(self._obj)})"
             )
-        new_gpt3_prompt = lambdaprompt.GPT3Prompt(prompt_template_string)
+        new_gpt3_prompt = lambdaprompt.Completion(prompt_template_string)
         named_args = new_gpt3_prompt.get_named_args()
         known_args = set(self._obj.columns) | set(kwargs.keys())
         needed_args = set(named_args)
```
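For context on the rename, the only `Completion` behaviour the diff relies on is construction from a template string and `get_named_args()`. A rough sketch of that surface; the template text and the expected output are illustrative assumptions, not taken from the commit:

```python
import lambdaprompt

# Same construction pattern as the howto_prompt / ask_prompt definitions above:
# a Completion built from a Jinja-style template string.
example_prompt = lambdaprompt.Completion(
    """
For the pandas dataframe ({{ dfname }}) the user wants an answer to a question about the data.
Question: {{ question }}
Answer:
"""
)

# apply() uses get_named_args() to discover the template placeholders before
# filling them from dataframe columns and keyword arguments.
print(example_prompt.get_named_args())  # expected to contain 'dfname' and 'question'
```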
