- Results column visualization: New `Results.show_columns()` method displays columns in a hierarchical tree format, using Mermaid diagrams in Jupyter notebooks and ASCII trees in terminals, organized by data type (agent, answer, model, scenario, etc.).
- Wildcard column selection: Added `*` wildcard pattern support for flexible column selection in the `ResultsSelector` class, enabling patterns like `*_cost`, `answer.*`, and `prefix*suffix` for selecting multiple related columns at once.
- Question groups: New feature for organizing related questions together in surveys.
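The wildcard expansion described above can be sketched in plain Python with the standard library's `fnmatch` module; the column names below are hypothetical, and the actual `ResultsSelector` implementation may differ.

```python
from fnmatch import fnmatchcase

def expand_wildcards(pattern: str, columns: list[str]) -> list[str]:
    """Return the columns matching a glob-style pattern, preserving order.

    fnmatchcase is used so matching is case-sensitive on every platform.
    """
    return [col for col in columns if fnmatchcase(col, pattern)]

columns = [
    "answer.favorite_fruit",
    "answer.favorite_color",
    "raw_model_response.favorite_fruit_cost",
    "raw_model_response.favorite_color_cost",
    "model.model",
]

print(expand_wildcards("*_cost", columns))
# ['raw_model_response.favorite_fruit_cost', 'raw_model_response.favorite_color_cost']
print(expand_wildcards("answer.*", columns))
# ['answer.favorite_fruit', 'answer.favorite_color']
```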
- Question dropdown: New question type for dropdown selection interfaces.
- Results data retrieval: Updated the `fetch_remote` method to use `coop.pull` instead of `coop.get` when fetching remote results, improving clarity and data retrieval semantics.
- Documentation structure: Major reorganization of documentation in `mint.json` with new groups (Getting Started, Core Concepts, Getting Data, Working with Results, etc.), improved navigation, logo configuration, and custom search with feedback options.
- Survey rule validation: Enhanced with proper error handling for unknown question references and improved validation for wildcard patterns.
- Survey navigation logic: Fixed `Survey.next_question_with_instructions` to properly account for before rules created with `add_skip_rule`.
- Makefile workflow: Linting now runs in parallel across modules for significant speed improvements. All test, lint, format, and doctest targets mark completion status for better CI integration.
- Benchmarking: Commands now use the `--no-open` flag for CI-friendly operation, with enhanced performance reporting and regression detection.
- QuestionMultipleChoiceWithOther: Removed the unnecessary `other_instructions` parameter and fixed the answer column text to properly display "{{other_option_text}}: [your answer]" instead of just "Other".
- Dataset export: Removed the overriding `to_docx` method from the Dataset class that was conflicting with the DatasetOperationsMixin method.
- Documentation files: Fixed scenarios.mdx documentation formatting.
- Double print issue: Resolved issue where content was being displayed twice in certain contexts.
- Test matrix: Removed Python 3.9 from CI test matrix to simplify support. Updated starter tutorial integration test to run only for Python 3.10.
- Doctest environment: Doctest runs now set the `EDSL_RUNNING_DOCTESTS=True` environment variable in both the Makefile and the GitHub workflow for consistent test environments.
- CI cleanup: Local GitHub test runs now stash uncommitted changes before running tests with `act`, ensuring only committed files are tested in a clean Docker environment.
- QuestionDemand: New question type for collecting demand curves from language models, enabling economic analysis by asking for quantities demanded at various price points. Supports multiple numeric price points with automatic validation of non-negative quantities.
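The validation rule described for QuestionDemand (a non-negative quantity at each price point) can be sketched in plain Python. The function name and data shape here are illustrative, not the actual EDSL implementation.

```python
def validate_demand_curve(prices: list[float], quantities: list[float]) -> list[float]:
    """Check that each price point received a non-negative numeric quantity."""
    if len(prices) != len(quantities):
        raise ValueError("One quantity is required per price point.")
    for q in quantities:
        if not isinstance(q, (int, float)) or q < 0:
            raise ValueError(f"Quantities must be non-negative numbers, got {q!r}.")
    return list(quantities)

# Quantities demanded at the $1, $2, and $5 price points.
print(validate_demand_curve([1.0, 2.0, 5.0], [10, 6, 1]))  # [10, 6, 1]
```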
- QuestionPydantic: New question type for structured generation with Pydantic models. Enables extraction of structured data using Pydantic schemas with models that support structured outputs (e.g., `gpt-4o-mini`).
- Survey generation methods: New methods `Survey.generate_survey_from_topic()` to create surveys from topic strings with scenario placeholders, and `Survey.generate_survey_from_questions()` to generate surveys from question text lists with automatic question option suggestions.
- ScenarioList to Agent conversion: New `ScenarioList.to_agent_traits()` method returns a single Agent with traits for all scenarios in the list, with auto-incremented keys for duplicates.
- EDSL Apps framework: Introduced the concept of EDSL "Apps" for building interactive data analysis workflows.
- Mintlify documentation: Migrated documentation to Mintlify format with improved navigation and styling.
- Performance optimizations: Significant improvements to prompt rendering with detailed timing and profiling, render caching for agent prompts, increased cache sizes for template compilation, optimized hash computations with result caching in `Agent.__hash__`, and changed `InvigilatorBase.prompt_constructor` to use `@cached_property` to prevent unnecessary instance recreation.
- Cost estimation performance: Fixed prompts and cost estimation computation that was taking 300+ seconds for large jobs with many scenarios and questions.
- Remote job polling: Introduced dynamic and adaptive polling strategy that calculates intervals based on job complexity and progress, reducing unnecessary API calls for long jobs while ensuring quick detection for short jobs.
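An adaptive polling interval of the kind described can be sketched as capped geometric growth; the formula and constants below are illustrative only, not the actual strategy used by the remote job poller.

```python
def polling_interval(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Grow the wait between polls geometrically, up to a cap.

    Early polls are frequent, so short jobs are detected quickly;
    later polls are sparse, so long jobs stop hammering the API.
    """
    return min(cap, base * (2 ** attempt))

print([polling_interval(n) for n in range(6)])  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```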
- Proxy service support: Added repeater proxy logic for Mistral, Azure, OpenRouter, and Together services with cache key computation on client side and prevention of duplicate file uploads.
- FileStore GCS offloading: Implemented automatic offloading of large file contents to Google Cloud Storage during push operations with automatic restoration when accessed, significantly reducing payload sizes.
- Text file support: Enhanced Anthropic model integration with text file decoding support for .txt, .json, and other text formats, with proper handling of images, PDFs, and clear warnings for unsupported file types.
- Survey rule evaluation: Improved safety and reliability of rule evaluation with proper string escaping using `json.dumps`, better error messages, and more robust survey navigation logic that skips unevaluable stop rules.
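Using `json.dumps` to escape strings before interpolating them into a rule expression prevents quote-injection problems; this is a minimal sketch of the idea, not the library's actual rule engine.

```python
import json

def build_rule_expression(question_name: str, answer: str) -> str:
    """Embed a user-supplied answer safely in a rule expression.

    json.dumps adds surrounding quotes and escapes embedded quotes and
    backslashes, so the resulting expression stays parseable.
    """
    return f"{question_name} == {json.dumps(answer)}"

expr = build_rule_expression("color", 'I said "blue"')
print(expr)  # color == "I said \"blue\""

# The escaped literal evaluates back to the original string.
assert eval(expr, {"color": 'I said "blue"'}) is True
```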
- Jobs.pull() method: Fixed incorrect logic in `Jobs.pull()` that was causing "Object not found" errors when pulling jobs.
- Progress bar display: Resolved issues with remote job progress bar display.
- Agent traits persistence: Fixed `traits_presentation_template` not persisting in serialization when set via the setter after agent construction.
- Grok-4 model calls: Fixed parameter compatibility by filtering unsupported parameters (`presence_penalty`, `frequency_penalty`) for the grok-4 model on the xai service.
- Conversation module: Various fixes to the Conversation module functionality.
- Duplicate hash handling: Refactored index tracking to use a `_position_index` attribute directly on objects instead of hash-based dictionaries, fixing issues with duplicate hashes.
- Jobs metadata: Added `nr_questions` property to the Jobs class, computing the total number of questions to be executed (interviews × survey questions).
- Notebook imports: Updated notebook import mechanisms for better compatibility.
- Firecrawl Integration: Complete web scraping capabilities with new FirecrawlScenario class. Enables scraping, crawling, searching, and structured data extraction from websites, returning results as EDSL Scenario and ScenarioList objects for seamless integration with surveys and analysis workflows.
- AgentList Enhancements: Enhanced agent creation with CSV codebook support and an improved `from_results()` method allowing users to specify which questions and answers become agent traits.
- ScenarioList Functionality: Added data manipulation methods including `fillna()` for handling missing values and `transform_by_key()` for reshaping scenario data structures.
- Model Services: Improved the model availability system to show all compatible services per model rather than just preferred ones, and fixed Azure integration for local usage.
- User Experience: Enhanced Jupyter notebook login with rich UI components and improved environment variable handling.
- Agent Generation: Resolved a duplicate agent creation bug in `Results.to_agent_list()` when name fields were present.
- AgentList.from_source(): Unified method for creating agents from various data sources, with an optional `instructions` parameter.
- AgentList.add_instructions(): Method to apply instructions to all agents in an existing list.
- Widget infrastructure: Added support for interactive widgets including ResultsInspector and ResultInspector.
- Local extension testing: Support for testing extensions locally using `extension.local` syntax.
- Remote caching: Universal cache integration for improved performance, with remote cache fetching when local cache misses occur.
- OpenAI reasoning models: Added standardized list of reasoning models including new GPT-5 class models with proper temperature handling.
- Multiple choice validation: Enhanced case-insensitive matching for capitalized responses (e.g., "Grapefruit" now matches "grapefruit").
- Agent combination: Fixed the `+` operator to properly preserve `name` and `traits_presentation_template` from both agents, with conflict warnings.
- Cache consistency: Improved cache key generation by sorting file hashes to prevent missed cache hits due to file order differences.
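Sorting the file hashes before combining them makes the cache key independent of the order files were attached; a minimal sketch with hypothetical inputs, not the library's actual key derivation.

```python
import hashlib

def cache_key(prompt: str, file_hashes: list[str]) -> str:
    """Combine a prompt with file hashes, sorted so order cannot change the key."""
    material = prompt + "".join(sorted(file_hashes))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

key_a = cache_key("Describe these files", ["hash2", "hash1"])
key_b = cache_key("Describe these files", ["hash1", "hash2"])
assert key_a == key_b  # same files in a different order hit the same cache entry
```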
- Google services: Full asynchronous support using `client.aio` for better performance and concurrency.
- Model availability: Enhanced `Model.available()` with better type annotations and archive management.
- Numpy compatibility: Support for both numpy 1.x and 2.x versions (`>=1.22,<3`).
- Results refactoring: Broke down enormous Results and Result classes into smaller helper classes for better maintainability.
- Agent and Scenario refactoring: Improved code organization and structure.
- OpenAI API: Replaced the deprecated `max_tokens` parameter with `max_completion_tokens` for reasoning models.
- Job status polling: Increased the default refresh interval from 1 to 3 seconds to reduce polling frequency.
- Jinja2 template errors: Fixed crashes when loading agents with template syntax patterns like `{#`, `{{`, `{%` in their data.
- NaN handling: Replaced NaN values with `None` in scenarios and question options for proper JSON serialization.
- Double display issue: Fixed `show_prompts()` displaying content twice in REPL/notebook environments.
- Piping bugs: Various fixes for data piping and processing issues.
- Job chaining: Delayed execution of jobs to build longer chains with dependency management. Jobs can now store other jobs they depend on and execute those first.
- QuestionCompute: New question type that renders Jinja2 templates directly without LLM processing, with automatic numeric conversion and access to prior question answers.
- Template validation: Added syntax validation for survey scenarios to ensure correct usage of `{{ scenario.field }}` references.
- Ordered sampling: Support for ordered sampling in data collection.
- Extensions service framework: New framework for creating EDSL-based web services with decorators like `@edsl_service`, `@input_param`, and `@output_schema`. Includes `pip install edsl[services]` for optional FastAPI dependencies.
- PDF handling: Fixed issues with Anthropic & OpenAI not working properly with PDFs.
- Nested scenario options: Fixed piping issues with nested scenario options and QuestionNumerical parameters.
- Agent handling: Better support for name fields in AgentList.from_scenario_list.
- Results association: Jobs now properly append associated results after post-run commands, maintaining Results objects even after dataset operations like `select()`.
- Public object search: Updated the list endpoint to allow users to search public objects.
- Agent display: Fixed display issues for agents with no traits.
- Extension gateway: Replaced the static `EDSL_EXTENSION_GATEWAY_URL` configuration with a dynamic `get_extension_gateway_url()` method.
- Service deployment: Added a comprehensive service framework with examples and documentation for creating web services without FastAPI knowledge.
- Answer validation: Fixed template context in InvigilatorFunctional to include prior question answers.
- Scenario references: Resolved issue with jobs involving scenarios accessing the scenario.target variable correctly.
- Display formatting: Various improvements to result comparison and agent display.
- Version 1.0.0: First major stable release of EDSL, marking production readiness and API stability.
- Auto-update mechanism: Automatic version checking on package import with a `check_for_updates()` function. New CLI command `edsl check-updates` to manually check for available updates.
- ScenarioList offload method: New method to offload scenario lists for improved memory management.
- Dataset unique method: Added method to get unique values from datasets.
- Error messaging: Enhanced messaging for insufficient funds failures, properly retrieving failure reasons from the correct location.
- API key handling: Better support for remote inference with improved API key checks and conditional logic for remote configurations.
- Survey flow testing: Added comprehensive testing for survey flow functionality.
- Report generation: Enhanced report generation capabilities.
- Google Docs integration: Fixed bugs in scenario list generation from Google Docs sources.
- Colab compatibility: Patched error messages and improved handling in Google Colab environments.
- Azure and OpenAI services: Improved error handling to prevent failures when environment variables are missing.
- Service payments functionality for handling payments through the platform.
- `get_profile()` method in the `Coop` class to retrieve the authenticated user's profile information, including username and email.
- Answer validation tracking in `Results` with new columns `validated.{question_name}_validated` to track which answers passed validation.
- ScenarioList now functions as a standard list with improved method compatibility.
- Support for pulling Jobs stored in Google Cloud Storage (new format since ORM migration).
- Enhanced object patch method with proper alias handling and format detection.
- List methods adapted to work with new ORM setup.
- Fixed a typo in `Jobs.humanize()` that was causing a SyntaxWarning.
- Fixed issue #2027 related to scenario handling.
- Removed stray print statements from the codebase.
- Updated the `.pull()` method implementation for questions.
- Linear scale questions now accept label responses in addition to numeric values. Models can return labels like "I love it" which are intelligently matched to the corresponding numeric value with support for exact, partial, and contextual matching.
- Prolific integration for managing studies directly from EDSL. Project endpoints return Prolific data when applicable.
- Proxy keys feature allows creating encrypted keys with usage limits that can be safely shared with third parties.
- Enhanced pull/push methods with alias-based retrieval support and new Google Cloud Storage format.
- Agent list tools for more efficient list operations.
- Support for scenarios in humanize feature.
- Increased maximum concurrent tasks to 1000 for improved performance at scale.
- Enhanced object retrieval with format detection (new/old) and legacy format fallback.
- Simplified error object handling with single parameter approach.
- Fixed pull method to work correctly with aliases.
- Resolved Google Colab environment compatibility issues.
- Fixed issues #1989, #1990, and #1921.
- Support for the OpenAI response API has been added. Job responses now have access to model reasoning summaries.
- Example notebook for the new functionality: https://www.expectedparrot.com/content/arulm/getting-reasoning-summaries-from-thinking-models
- Added a drop method to the Agent class for removing specific fields, and updated the to_dict method to optionally include all fields.
- Enhanced the AgentList class with methods to set instructions and traits presentation templates for all agents, and added a drop method to remove fields across the list.
- Updated the to_dataset method in AgentList to include traits_presentation_template in the agent_parameters when traits_only is set to False.
- Fix error in computing the remote inference cache key for files.
- Fix timeout issues when running jobs with videos.
- Improvements to the job status table to include more details on exceptions and costs.
- Video file handlers: `Scenario` objects can now be videos (MP4 and WebM). Example: https://www.expectedparrot.com/content/RobinHorton/video-scenarios-notebook
- `Results` objects now include separate fields for input tokens, output tokens, input tokens cost, output tokens cost, and total cost for each `Result`. These fields all have the `raw_model_response` prefix.
- The `Jobs` method `estimate_job_cost()` now also includes estimated input tokens, output tokens, input tokens cost, output tokens cost, and total cost for each model, as well as the credits to be placed on hold while the job is running.
- New documentation page on estimating and tracking costs: https://docs.expectedparrot.com/en/latest/costs.html
- Method `get_uuid()` retrieves the Coop UUID for the relevant object, if it exists.
- Method `list()` retrieves details of objects of the relevant type that you have posted to Coop. By default, it returns information about the 10 most recently created objects. Optional parameters: `page=` specifies the pagination (e.g., `page=2` will return the next 10 objects), `page_size=` specifies the number of objects to return (10 by default, up to 100), and `search_query=` returns objects based on the description (if any). The `list()` method is available for all EDSL object types (`Agent`, `Scenario`, `Jobs`, `Results`, `Notebook`, etc.), as well as the `Coop` client object. For example, `Results.list()` will return details on the 10 most recent results, and `Coop().list(page_size=5)` will return details on the 5 most recent objects of any type.
- Method `fetch()` can be combined with the `list()` method to retrieve objects of the relevant type that you have posted to Coop. By default, it returns the 10 most recently created objects. For example, `Results.list().fetch()` will return the 10 most recently created results. The `fetch()` method is available for all EDSL object types (`Agent`, `Scenario`, `Jobs`, `Results`, `Notebook`, etc.), as well as the `Coop` client object.
- Method `fetch_results()` is a special method of `Jobs` objects that can be combined with the `list()` method to retrieve the results of your jobs. For example, `results = Jobs.list(page_size=2).fetch_results()` will retrieve the results of your 2 most recent jobs.
- Methods for auto-generating `ScenarioList` objects from different file types are now available with a single syntax: `ScenarioSource.from_source()`. For example, `sl = ScenarioSource.from_source('csv', 'my_file.csv')` is equivalent to `sl = ScenarioList.from_csv('my_file.csv')`.
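The single-entry-point pattern behind `ScenarioSource.from_source()` can be sketched as a registry that dispatches on the source type; the loader functions below are placeholders standing in for `ScenarioList.from_csv` and friends, not EDSL internals.

```python
from typing import Callable

# Hypothetical loaders standing in for ScenarioList.from_csv, from_sqlite, etc.
def load_csv(path: str) -> str:
    return f"scenarios from CSV {path}"

def load_sqlite(path: str) -> str:
    return f"scenarios from SQLite {path}"

LOADERS: dict[str, Callable[[str], str]] = {"csv": load_csv, "sqlite": load_sqlite}

def from_source(source_type: str, path: str) -> str:
    """Dispatch to the loader registered for the given source type."""
    try:
        return LOADERS[source_type](path)
    except KeyError:
        raise ValueError(f"Unknown source type: {source_type!r}") from None

print(from_source("csv", "my_file.csv"))  # scenarios from CSV my_file.csv
```

Registering loaders in a dictionary keeps the entry point open for extension: supporting a new file type means adding one entry rather than editing a chain of conditionals.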
- Improvements to the Job Status table and Exceptions Report.
- Improved logic for approximating image token usage.
- Improvements to Exceptions Report code for reproducing errors.
- Improvements to answer validation tests.
- Modified `ScenarioList.from_directory()` to wrap files in `Scenario` objects.
- New optional parameter for `QuestionList`: `min_list_items` allows you to specify the minimum number of items that must be returned in the answer formatted as a list. This complements the existing optional parameter `max_list_items`. Example: https://docs.expectedparrot.com/en/latest/questions.html#questionlist-class
- Updated default prompt instructions for `QuestionRank`.
- Improvements to `ScenarioList.from_csv()` to handle non-UTF-8 encodings.
- Improvements to exception messages.
- Codebook support for `AgentList` objects. This facilitates creation of agents based on existing survey data, using a codebook for questions and responses.
- The `Results` method `spot_issues()` runs a survey to spot issues and suggest revised versions of any prompts that did not generate responses in your original survey (i.e., any user/system prompts where your results show a null answer and raw model response). You can optionally pass a list of models to use to run the meta-survey instead of the default model. See details on the meta-questions that are used and how it works: https://www.expectedparrot.com/content/RobinHorton/spot-issues-notebook
- When you post an object to Coop with the `push()` method you can optionally pass a `description`, a convenient `alias` for the Coop URL that is created, and a `visibility` setting (public, private, or unlisted by default). An alias Coop URL is now displayed in the object details that are returned when the object is created. You can then use the `alias_url` to retrieve or modify the object in lieu of the `uuid`. See examples in the Coop section.
- `Scenario` objects can be referenced with the `scenario.` prefix, e.g., "Do you enjoy {{ scenario.activity }}?" (previously "Do you enjoy {{ activity }}?") to standardize syntax with other objects, e.g., when referencing `agent.` fields in the same way, or when piping `answer.` and `question.` fields.
- A universal remote cache (URC) is available for retrieving responses to any questions that have been run at the Expected Parrot server. If you re-run a question that anyone has run before, you can retrieve that response at no cost to you. This cache is available for all jobs run remotely by default, and new responses are automatically added to it. If you want to draw fresh responses you can use `run(fresh=True)`. If you draw a fresh response for a question that has already been run, the new response is also added to the URC with an iteration index. The URC is not available for jobs run locally. See the remote cache section for details and FAQ.
- `ScenarioList` methods for concatenating and collapsing scenarios in a scenario list:
  - `concatenate()` can be used to concatenate specified fields into a single string field
  - `concatenate_to_list()` can be used to concatenate specified fields into a single list field
  - `concatenate_to_set()` can be used to concatenate specified fields into a single set field
  - `collapse()` can be used to collapse a scenario list by grouping on all fields except a specified field

  See examples.
- `ScenarioList` method `from_sqlite()` can be used to create a scenario list from a SQLite database.
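The `collapse()` behavior described above (group on every field except one, gathering that field's values into a list) can be sketched over plain dictionaries; this is an illustrative sketch, not the actual implementation.

```python
from collections import defaultdict

def collapse(scenarios: list[dict], field: str) -> list[dict]:
    """Group scenarios on every field except `field`, collecting its values."""
    groups = defaultdict(list)
    for s in scenarios:
        # The grouping key is every other (key, value) pair, in a stable order.
        key = tuple(sorted((k, v) for k, v in s.items() if k != field))
        groups[key].append(s[field])
    return [dict(key) | {field: values} for key, values in groups.items()]

scenarios = [
    {"city": "Paris", "topic": "food"},
    {"city": "Paris", "topic": "art"},
    {"city": "Rome", "topic": "food"},
]
print(collapse(scenarios, "topic"))
# [{'city': 'Paris', 'topic': ['food', 'art']}, {'city': 'Rome', 'topic': ['food']}]
```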
- Bug causing some tokens generated to be omitted from results when skip logic was applied.
- `ScenarioList` method `from_dta()` creates a scenario list from a Stata file.
- `Results` method `flatten()` will flatten a field of dictionaries into separate fields. It takes a list of the fields to flatten and a boolean indicating whether to preserve the original fields in the new `Results` object that is returned. See example.
- `Results` method `report()` generates a report of selected columns in markdown by iterating through the rows, presented as observations. You can optionally pass headers, a divider, and a limit on the number of observations to include. It can be useful if you want to display a sample of larger results in a working notebook you are sharing. See example.
- `Survey` method `show_flow()` can now also be called on a `Jobs` object, and will show any scenarios and/or agent traits that you have added to questions. See examples.
- Xai models are now available. If you have your own key, you can add it to your Keys page at your Coop account or add `XAI_API_KEY=<your_key_here>` to your `.env` file.
- `Survey` method `humanize()` will create a web-based version of your survey to share with humans. Responses are automatically added to a `Results` object that you can access at your account. This feature is live but in development.
- You can now use your own keys from service providers to run jobs remotely at the Expected Parrot server, and store them at the Keys page of your Coop account (in lieu of your `.env` file). You can also grant access to other users (without sharing the keys directly), set limits on their usage, and set RPM/TPM limits.
- You can run a remote survey in the background (and then continue working or not) by calling `run(background=True)`. You can check the status of the job at any time by (1) viewing the progress bar page (the link is returned while your job is running), (2) calling `results.fetch()` (which will return a status update every 1.0 seconds or the `polling_interval` that you specify, or the completed results), or (3) calling the results as usual, e.g., `results.columns`. Additional planned features: request email notification when your job is completed. See an example.
- Method `ScenarioList.from_pdf_to_image(<filename>)` generates a scenario for each page of a PDF converted into a JPEG (to use as an image instead of converting to text). Companion method `Scenario.from_pdf_to_image(<filename>)` generates a key/value for each page within the same scenario object, allowing you to use multiple images at the same time. See a notebook of examples.
- You can now pull an object from Coop using its alias. Alias routes were previously of the form expectedparrot.com/<owner_username>/; they are now of the form expectedparrot.com/content/<owner_username>/.
- You can now see Mermaid diagrams and inline math in Coop notebooks.
- Improved methods and moved tasks to the background to prevent some timeout errors.
- Upgraded the connection for Anthropic models.
- Fixed a bug preventing iterations on remote inference.
- DeepSeek models are now available (e.g., try `Model("deepseek-reasoner")`). If you have your own key, you can add it to your Keys page at your Coop account or add `DEEPSEEK_API_KEY=<your_key_here>` to your `.env` file.
- The name of the inference service is now included in the `Model` parameters and `Results` objects. This can be useful when the same model is provided by multiple services.
- The model pricing page at Coop shows daily test results for available models: https://www.expectedparrot.com/home/pricing. The same information can also be returned by calling the method `Model.check_working_models()`. Check the models for a particular service provider by passing the name of the service: `Model.check_working_models(service="google")`.
- Default size limits on question texts have been removed.
- Modified default RPM to avoid timeout issues.
- Question type `QuestionDict` returns a response as a dictionary with specified keys and (optionally) specified value types and descriptions. Details: https://docs.expectedparrot.com/en/latest/questions.html#questiodict-class
- Results of jobs run remotely are no longer automatically synced to your local cache. Now, a new cache for results is automatically generated and attached to a results object; you can access it by calling `results.cache`. Results now also include the following fields for the associated cache: `cache_keys.<question_name>_cache_key` (the unique identifier for a cache entry) and `cache_used.<question_name>_cache_used` (an indicator of whether the default cache was used to provide the response: either your local cache or remote cache, or a cache that was passed to the `run` method, if used instead of local or remote).
- Improvements to the web-based progress bar for remote jobs.
- An occasional timeout issue should be fixed by the modifications to caching noted above.
- Question type `QuestionMatrix`. Details: https://docs.expectedparrot.com/en/latest/questions.html#questionmatrix-class
- A `join()` method for objects.
- `FileStore` method `create_link()` embeds a file in the HTML of a notebook and generates a download link for it. Examples: https://docs.expectedparrot.com/en/latest/filestore.html
- The exceptions report is displayed as a clickable link.
- Improvements to the table display of results returned by the `select()` method.
- Improvements to status messages displayed in a table log when a job is running.
- `Model.available()` now uses Coop by default (all models available with remote inference are returned). If remote inference is not activated, only models available locally are returned (based on stored personal API keys).
- Progress bar shows total interviews instead of total unique interviews (iterations may be >1).
- `Results` are now automatically displayed in a scrollable table when you call `select()` on them. You can also call `table().long()` to display results in a long-view table. This replaces the need to call `print(format="rich")`. See examples in the starter tutorial.
- The progress bar is now web-based, and a link to view it in a new tab is automatically returned when you call the `run()` method on a survey (`progress_bar=True` by default). See examples in the starter tutorial.
- Results were automatically appending cache; this was removed.
- EDSL Authentication Token: If you attempt to run a survey remotely without having stored your EXPECTED_PARROT_API_KEY, a message will appear providing a Coop login link. Clicking this link and logging in will automatically store your key in your .env file.
- The `AgentList` method `from_csv()` now allows you to (optionally) automatically specify the `name` parameter for agents by including a column "name" in the CSV. Other columns are (still) passed as agent `traits`. See an example: https://docs.expectedparrot.com/en/latest/agents.html#from-a-csv-file
- The `Jobs` method `run()` now takes a parameter `remote_inference_results_visibility` to set the visibility of the results of jobs that are run remotely. The allowed visibility settings are `public`, `private`, and `unlisted` (the default). This parameter has the same effect as passing the parameter `visibility` to the `push()` and `patch()` methods for posting and updating objects at the Coop. For example, these commands have the same effect when remote inference is activated: `Survey.example().run()` and `Survey.example().run(remote_inference_results_visibility="unlisted")`.
- Bug in using f-strings and scenarios at once. Example usage: https://docs.expectedparrot.com/en/latest/scenarios.html#using-f-strings-with-scenarios
- Bug in the optional question parameters `answering_instructions` and `question_presentation`, which can be used to modify user prompts separately from question texts. Example usage: https://docs.expectedparrot.com/en/latest/questions.html#optional-question-parameters
- Method `show_prompts()` can be called on a `Survey` to display the user prompt and system prompt. This is in addition to the existing method `prompts()`, called on a `Job`, which returns the prompts plus additional information about the questions, agents, models, and estimated costs. Learn more: https://docs.expectedparrot.com/en/latest/prompts.html
- Documentation on storing API keys as "secrets" for using EDSL in Colab.
- The `Conversation` module works with multiple models at once.
- Improved features for adding new models.
- Access to OpenAI o1 models.
- Survey Builder is a new interface for creating and launching hybrid human-AI surveys. It is fully integrated with EDSL and Coop. Get access by activating beta features from your Coop account profile page. Learn more: https://docs.expectedparrot.com/en/latest/survey_builder.html
- `Jobs` method `show_prompts()` returns a table showing the user and system prompts that will be used with a survey, together with information about the agent, model, and estimated cost for each interview. `Jobs` method `prompts` returns the same information as a dataset.
- `Scenario` objects can contain multiple images to be presented to a model at once (works with Google models).
- Bug in piping a `ScenarioList` containing multiple lists of `question_options` to use with questions.
- Optional parameters for `Question` objects:
  - `include_comment = False` prevents a `comment` field from being added to a question (default is `True`: all question types other than free text automatically include a field for the model to comment on its answer, unless this parameter is passed)
  - `use_code = True` modifies user prompts for question types that take `question_options` to instruct the model to return the integer code for an option instead of the option value (default is `False`)
  - `answering_instructions` and `question_presentation` allow you to control exact prompt language and separate instructions for the presentation of a question
  - `permissive = True` turns off enforcement of question constraints (e.g., if min/max selections for a checkbox question have been specified, you can set `permissive = True` to allow responses that contain fewer or greater selections) (default is `False`)
- Methods for `Question` objects: `loop()` generates a list of versions of a question for a `ScenarioList` that is passed to it. Questions are constructed with a `{{ placeholder }}` for a scenario as usual, but each scenario value is added to the question when it is created instead of when a survey is run (which is done with the `by()` method). Survey results for looped questions include fields for each unique question but no `scenario` field. See examples: https://docs.expectedparrot.com/en/latest/starter_tutorial.html#adding-scenarios-using-the-loop-method and https://docs.expectedparrot.com/en/latest/scenarios.html#looping
- Methods for `ScenarioList` objects:
  - `unpivot()` expands a scenario list by specified identifiers
  - `pivot()` undoes `unpivot()`, collapsing scenarios by identifiers
  - `give_valid_names()` generates valid Pythonic identifiers for scenario keys
  - `group_by()` groups scenarios by identifiers or applies a function to the values of the specified variables
  - `from_wikipedia_table()` converts a Wikipedia table into a scenario list. See examples: https://docs.expectedparrot.com/en/latest/notebooks/scenario_list_wikipedia.html
  - `to_docx()` exports scenario lists as structured Docx documents
- Optional parameters for `Model` objects:
  - `raise_validation_errors = False` causes exceptions to only be raised (interrupting survey execution) when a model returns nothing at all (default: `raise_validation_errors = True`).
  - `print_exceptions = False` causes exceptions to not be printed at all (default: `print_exceptions = True`).
- Columns in `Results` for monitoring token usage:
  - `generated_tokens` shows the tokens that were generated by the model.
  - `raw_model_response.<question_name>_cost` shows the cost of the result for the question, applying the token quantities & prices.
  - `raw_model_response.<question_name>_one_usd_buys` shows the number of results for the question that 1 USD will buy.
  - `raw_model_response.<question_name>_raw_model_response` shows the raw response for the question.
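The relationship between the cost columns can be sketched in plain Python: the cost applies token counts to per-token prices, and `one_usd_buys` is its reciprocal. The prices below are made up for illustration; real prices depend on the model:

```python
# Hypothetical per-token prices for illustration only.
input_price_per_token = 0.00001
output_price_per_token = 0.00003
prompt_tokens, completion_tokens = 500, 120

cost = (prompt_tokens * input_price_per_token
        + completion_tokens * output_price_per_token)
one_usd_buys = 1 / cost  # how many such results $1 USD pays for
```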
- Methods for `Results` objects:
  - `tree()` displays a nested tree for specified components.
  - `generate_html()` and `save_html()` generate and save HTML code for displaying results.
- General improvements to exceptions reports.
- General improvements to the progress bar: `survey.run(progress_bar=True)`
- Question validation methods no longer use JSON. This will eliminate exceptions relating to JSON errors previously common to certain models.
- Base agent instructions template is not added to a job if no agent is used with a survey (reducing tokens).
- The `select()` method (for `Results` and `ScenarioList`) now allows partial match on key names to save typing.
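A plain-Python sketch of the partial-match idea: a bare name matches any column whose key component contains it. This is an illustration of the convenience, not EDSL's exact matching rule:

```python
# Hypothetical column names; select() here is a toy, not EDSL's method.
columns = ["answer.politics", "answer.economy", "agent.age"]

def select(pattern, cols):
    # partial match: keep columns whose final key component contains pattern
    return [c for c in cols if pattern in c.split(".", 1)[-1]]

print(select("politics", columns))  # ['answer.politics']
```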
- Bug in enforcement of token/rate limits.
- Bug in generation of exceptions report that excluded agent information.
- Models: AWS Bedrock & Azure
- Question: New method `loop()` allows you to create versions of questions when you are constructing a survey. It takes a `ScenarioList()` as a parameter and returns a list of `Question` objects.
- Bug in `Survey` question piping that prevented you from adding questions after piping.
- `ScenarioList.from_sqlite` allows you to create a list of scenarios from a SQLite table.
- Added LaTeX support to SQL outputs and ability to write to files: `Results.print(format="latex", filename="example.tex")`
- Options that we think of as "terminal", such as `sql()`, `print()`, `html()`, etc., now take a `tee` boolean that causes them to return `self`. This is useful for chaining, e.g., if you run `print(format = "rich", tee = True)` it will return `self`, which allows you to also run `print(format = "rich", tee = True).print(format = "latex", filename = "example.tex")`.
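The `tee` pattern can be sketched in a few lines of plain Python; the class and method names below are illustrative, not EDSL's:

```python
# Sketch of the tee pattern: a terminal method that normally returns None
# can return self instead, so output calls can be chained.
class Table:
    def __init__(self, rows):
        self.rows = rows

    def print(self, format="rich", filename=None, tee=False):
        # rendering in the requested format is omitted in this sketch
        return self if tee else None

t = Table([1, 2, 3])
chained = t.print(format="rich", tee=True)  # returns t, enabling chaining
```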
- Ability to create a `Scenario` for `question_options`. Example:

  ```python
  from edsl import QuestionMultipleChoice, Scenario

  q = QuestionMultipleChoice(
      question_name = "capital_of_france",
      question_text = "What is the capital of France?",
      question_options = "{{question_options}}"
  )
  s = Scenario({'question_options': ['Paris', 'London', 'Berlin', 'Madrid']})
  results = q.by(s).run()
  ```
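Conceptually, the `{{ placeholder }}` string in `question_options` is swapped wholesale for the scenario's list before the question is administered. A toy sketch of that substitution (not EDSL's implementation):

```python
# Toy substitution: look up the placeholder's key in the scenario dict.
question_options = "{{question_options}}"
scenario = {"question_options": ["Paris", "London", "Berlin", "Madrid"]}

key = question_options.strip("{} ").strip()  # -> "question_options"
resolved_options = scenario[key]
print(resolved_options)  # ['Paris', 'London', 'Berlin', 'Madrid']
```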
- Prompts visibility: Call `prompts()` on a `Jobs` object for a survey to inspect the prompts that will be used in a survey before running it. For example:

  ```python
  from edsl import Model, Survey

  j = Survey.example().by(Model())
  j.prompts().print(format="rich")
  ```
- Piping: Use agent traits and components of questions (`question_text`, `answer`, etc.) as inputs to other questions in a survey (e.g., `question_text = "What is your last name, {{ agent.first_name }}?"` or `question_text = "Name some examples of {{ prior_q.answer }}"` or `question_options = ["{{ prior_q.answer[0] }}", "{{ prior_q.answer[1] }}"]`). Examples: https://docs.expectedparrot.com/en/latest/surveys.html#id2
- Agent traits: Call agent traits directly (e.g., `Agent.example().age` will return `22`).
- A bug in piping that prevented you from piping an `answer` into `question_options`. Examples: https://docs.expectedparrot.com/en/latest/surveys.html#id2
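The piping mechanism above can be sketched as template rendering against a context of agent traits and prior answers. EDSL uses a real template engine; this toy renderer only handles dotted names and list indexing, and the context values are made up:

```python
import re

# Hypothetical context: agent traits plus a prior question's answer.
context = {
    "agent": {"first_name": "Ada"},
    "prior_q": {"answer": ["Apples", "Oranges"]},
}

def render(text, ctx):
    # Resolve "{{ a.b }}" and "{{ a.b[0] }}" references against nested data.
    def repl(match):
        obj = ctx
        for part in match.group(1).strip().split("."):
            if part.endswith("]"):
                name, idx = part[:-1].split("[")
                obj = obj[name][int(idx)]
            else:
                obj = obj[part]
        return str(obj)
    return re.sub(r"\{\{(.*?)\}\}", repl, text)

print(render("What is your last name, {{ agent.first_name }}?", context))
# What is your last name, Ada?
```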
- Method `add_columns()` allows you to add columns to `Results`.
- Class `ModelList` allows you to create a list of `Model` objects, similar to `ScenarioList` and `AgentList`.
- `Conjure` module allows you to import existing survey data and reconstruct it as EDSL objects. See details on methods `to_survey()`, `to_results()`, `to_agent_list()` and renaming/modifying objects: https://docs.expectedparrot.com/en/latest/conjure.html
- Method `rename()` allows you to rename questions, agents, scenarios, results.
- New language models from OpenAI, Anthropic, Google will be added automatically when they are released by the platforms.
- Removed an errant breakpoint in the language models module.
- `Scenario.rename()` allows you to rename fields of a scenario.
- `Scenario.chunk()` allows you to split a field into chunks of a given size based on `num_word` or `num_lines`, creating a `ScenarioList`.
- `Scenario.from_html()` turns the contents of a website into a scenario.
- `Scenario.from_image()` creates an image scenario to use with a vision model (e.g., GPT-4o).
- `ScenarioList.sample()` allows you to take a sample from a scenario list.
- `ScenarioList.tally()` allows you to tally fields in scenarios.
- `ScenarioList.expand()` allows you to expand a scenario by a field in it, e.g., if a scenario field contains a list the method can be used to break it into separate scenarios.
- `ScenarioList.mutate()` allows you to add a key/value to each scenario.
- `ScenarioList.order_by()` allows you to order the scenarios.
- `ScenarioList.filter()` allows you to filter the scenarios based on a logical expression.
- `ScenarioList.from_list()` allows you to create a `ScenarioList` from a list of values and a specified key.
- `ScenarioList.add_list()` allows you to use a list to add values to individual scenarios.
- `ScenarioList.add_value()` allows you to add a value to all the scenarios.
- `ScenarioList.to_dict()` allows you to turn a `ScenarioList` into a dictionary.
- `ScenarioList.from_dict()` allows you to create a `ScenarioList` from a dictionary.
- `Results.drop()` complements `Results.select()` for identifying the components that you want to print in a table.
- `ScenarioList.drop()` similarly complements `ScenarioList.select()`.
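Two of the transformations above, `expand()` and `tally()`, can be sketched on plain dicts; EDSL's methods operate on `ScenarioList` objects, but the underlying operations look roughly like this:

```python
from collections import Counter

# Hypothetical scenario whose "items" field holds a list.
scenarios = [{"topic": "fruit", "items": ["apple", "pear"]}]

# expand("items"): one scenario per list element
expanded = [{**s, "items": item} for s in scenarios for item in s["items"]]

# tally("topic"): count occurrences of each value of a field
tally = Counter(s["topic"] for s in expanded)
print(expanded)  # two scenarios, one per fruit
```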
- Improvements to exceptions reports: Survey run exceptions now include the relevant job components and are optionally displayed in an HTML report.
- We started a blog! https://blog.expectedparrot.com
- `Agent`/`AgentList` method `remove_trait(<trait_key>)` allows you to remove a trait by name. This can be useful for comparing combinations of traits.
- `Agent`/`AgentList` method `translate_traits(<codebook_dict>)` allows you to modify traits based on a codebook passed as a dictionary. Example:

  ```python
  agent = Agent(traits = {"age": 45, "hair": 1, "height": 5.5})
  agent.translate_traits({"hair": {1: "brown"}})
  ```

  This will return: `Agent(traits = {'age': 45, 'hair': 'brown', 'height': 5.5})`
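The codebook lookup behind `translate_traits()` amounts to a dictionary translation with a fallback; a one-line plain-Python sketch (not EDSL's implementation):

```python
# Look each trait value up in the codebook, keeping the original value
# when there is no entry for that trait or value.
traits = {"age": 45, "hair": 1, "height": 5.5}
codebook = {"hair": {1: "brown"}}

translated = {k: codebook.get(k, {}).get(v, v) for k, v in traits.items()}
print(translated)  # {'age': 45, 'hair': 'brown', 'height': 5.5}
```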
- `AgentList` method `get_codebook(<filename>)` returns the codebook for a CSV file.
- `AgentList` method `from_csv(<filename>)` loads an `AgentList` from a CSV file with the column names as `traits` keys. Note that the CSV column names must be valid Python identifiers (e.g., `current_age` and not `current age`).
- `Results` method `to_scenario_list()` allows you to turn any components of results into a list of scenarios to use with other questions. A default parameter `remove_prefixes=True` will remove the results component prefixes `agent.`, `answer.`, `comment.`, etc., so that you don't have to modify placeholder names for the new scenarios. Example: https://docs.expectedparrot.com/en/latest/scenarios.html#turning-results-into-scenarios
- `ScenarioList` method `to_agent_list()` converts a `ScenarioList` into an `AgentList`.
- `ScenarioList` method `from_pdf(<filename>)` allows you to import a PDF and automatically turn the pages into a list of scenarios. Example: https://docs.expectedparrot.com/en/latest/scenarios.html#turning-pdf-pages-into-scenarios
- `ScenarioList` method `from_csv(<filename>)` allows you to import a CSV and automatically turn the rows into a list of scenarios.
- `ScenarioList` method `from_pandas(<dataframe>)` allows you to import a pandas dataframe and automatically turn the rows into a list of scenarios.
- `Scenario` method `from_image(<image_path>)` creates a scenario with a base64 encoding of an image. The scenario is formatted as follows: `"file_path": <filename / url>, "encoded_image": <generated_encoding>`. Note that you need to use a vision model (e.g., `model = Model('gpt-4o')`) and you do not need to add a `{{ placeholder }}` for the scenario (for now--this might change!). Example:
  ```python
  from edsl.questions import QuestionFreeText
  from edsl import Scenario, Model

  model = Model('gpt-4o')
  scenario = Scenario.from_image('general_survey.png')  # Image from this notebook: https://docs.expectedparrot.com/en/latest/notebooks/data_labeling_agent.html
  # scenario

  q = QuestionFreeText(
      question_name = "example",
      question_text = "What is this image showing?"  # We do not need a {{ placeholder }} for this kind of scenario
  )
  results = q.by(scenario).by(model).run(cache=False)
  results.select("example").print(format="rich")
  ```
  Returns:

  ```text
  ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
  ┃ answer                                                                                                          ┃
  ┃ .example                                                                                                        ┃
  ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
  │ This image is a flowchart showing the process of creating and administering a survey for data labeling tasks.   │
  │ The steps include importing data, creating data labeling tasks as questions about the data, combining the      │
  │ questions into a survey, inserting the data as scenarios of the questions, and administering the same survey to │
  │ all agents.                                                                                                     │
  └─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
  ```
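The encoding step behind `from_image()` can be sketched with the standard library: read the file's bytes and base64-encode them, storing both alongside the path (keys follow the format described above; the helper name is illustrative):

```python
import base64

def image_scenario(path):
    # Read the image bytes and store a base64 encoding next to the path.
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return {"file_path": path, "encoded_image": encoded}
```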
- `Question` and `Survey` method `html()` generates an improved HTML page representation of the object. You can optionally specify the filename and CSS. See default CSS: `edsl/edsl/surveys/SurveyExportMixin.py` (line 10 in 9d981fa).
- `QuestionMultipleChoice` now takes numbers and lists as `question_options` (e.g., `question_options = [[1,2,3], [4,5,6]]` is allowed). Previously options had to be a list of strings (i.e., `question_options = ['1','2','3']` is still allowed but not required).
- Optional parameter in `Results` method `to_list()` to flatten a list of lists (e.g., responses to `QuestionList`): `results.to_list(flatten=True)`
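What `flatten=True` does conceptually, in plain Python: a list of lists (such as answers to a `QuestionList` across results) becomes a single flat list:

```python
# Hypothetical nested answers; flattening merges the sublists.
nested = [["a", "b"], ["c"]]
flat = [item for sublist in nested for item in sublist]
print(flat)  # ['a', 'b', 'c']
```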
- Erroneous error messages about adding rules to a survey.
- New `Survey` method to export a survey to file. Usage: `generated_code = survey.code("example.py")`
- A bug in `Survey` method `add_skip_logic()`.
- New methods for adding, sampling and shuffling `Results` objects: `dup_results = results + results`, `results.shuffle()`, `results.sample(n=5)`
- Optional parameter `survey.run(cache=False)` if you do not want to access any cached results in running a survey.
- Instructions passed to an agent at creation are now a column of results: `agent_instruction`
- Methods for setting session caches:
  - New function `set_session_cache` will set the cache for a session:

    ```python
    from edsl import Cache, set_session_cache

    set_session_cache(Cache())
    ```

    The cache can be set to a specific cache object, or it can be set to a dictionary or `SQLiteDict` object:

    ```python
    from edsl import Cache, set_session_cache
    from edsl.data import SQLiteDict

    set_session_cache(Cache(data = SQLiteDict("example.db")))
    # or
    set_session_cache(Cache(data = {}))
    ```

  - The `unset_session_cache` function is used to unset the cache for a session:

    ```python
    from edsl import unset_session_cache

    unset_session_cache()
    ```

    This will unset the cache for the current session, and you will need to pass the cache object to the `run` method during the session.

  Details: https://docs.expectedparrot.com/en/latest/data.html#setting-a-session-cache
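The session-cache mechanism amounts to a module-level default that `run()` falls back to when no cache is passed explicitly. A plain-Python sketch (the names mirror the EDSL functions above, but this is an illustration, not the implementation):

```python
# Module-level default cache for the session.
_session_cache = None

def set_session_cache(cache):
    global _session_cache
    _session_cache = cache

def unset_session_cache():
    global _session_cache
    _session_cache = None

def run(cache=None):
    # An explicitly passed cache wins; otherwise use the session default.
    active_cache = cache if cache is not None else _session_cache
    return active_cache

set_session_cache({"entries": {}})
print(run())  # {'entries': {}}
```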
- Answer comments are now a separate component of results. The "comment" field that is automatically added to each question (other than free text) is now stored in `Results` as `comment.<question_name>`. Prior to this change, the comment for each question was stored as `answer.<question_name>_comment`, i.e., if you ran `results.columns` the list of columns would include `answer.<question_name>` and `answer.<question_name>_comment` for each question. With this change, the columns will now be `answer.<question_name>` and `comment.<question_name>_comment`. This change is meant to make it easier to select only the answers, e.g., running `results.select('answer.*').print()` will no longer also include all the comments, which you may not want to display. (The purpose of the comments field is to allow the model to add any information about its response to a question, which can help avoid problems with JSON formatting when the model does not want to return just the properly formatted response.)
- Exceptions: We modified exception messages. If your survey run generates exceptions, run `results.show_exceptions()` to print them in a table.
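Why moving comments under their own prefix helps: a wildcard select of `answer.*` no longer drags the comments along. A sketch of the filtering with hypothetical column names:

```python
from fnmatch import fnmatch

# Hypothetical columns after the change: comments live under "comment.".
columns = ["answer.q1", "comment.q1_comment", "answer.q2", "comment.q2_comment"]
answers_only = [c for c in columns if fnmatch(c, "answer.*")]
print(answers_only)  # ['answer.q1', 'answer.q2']
```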
- A package that was missing for working with Anthropic models.
- `Results` objects now include columns for question components. Call the `.columns` method on your results to see a list of all components. Run `results.select("question_type.*", "question_text.*", "question_options.*").print()` to see them.
- `Survey` objects now also have a `.to_csv()` method.
- Increased the maximum number of multiple choice answer options to 200 (previously 20) to facilitate large codebooks / data labels.
- A bug in the `Survey.add_rule()` method that caused an additional question to be skipped when used to apply a skip rule.
- New models: Run `Model.available()` to see a complete current list.
- A bug in JSON repair methods.
- New documentation: https://docs.expectedparrot.com
- Progress bar: You can now pass `progress_bar=True` to the `run()` method to see a progress bar as your survey is running. Example:

  ```python
  from edsl import Survey

  results = Survey.example().run(progress_bar=True)
  ```
  ```text
  Job Status

  Statistic                                      Value
  ─────────────────────────────────────────────────────
  Elapsed time                                   1.1 sec.
  Total interviews requested                     1
  Completed interviews                           1
  Percent complete                               100 %
  Average time per interview                     1.1 sec.
  Task remaining                                 0
  Estimated time remaining                       0.0 sec.

  Model Queues
  gpt-4-1106-preview;TPM (k)=1200.0;RPM (k)=8.0
  Number question tasks waiting for capacity     0
  new token usage
    prompt_tokens                                0
    completion_tokens                            0
    cost                                         $0.00000
  cached token usage
    prompt_tokens                                104
    completion_tokens                            35
    cost                                         $0.00209
  ```
- New language models: We added new models from Anthropic and Databricks. To view a complete list of available models see `edsl.enums.LanguageModelType` or run:

  ```python
  from edsl import Model

  Model.available()
  ```

  This will return:

  ```python
  ['claude-3-haiku-20240307',
   'claude-3-opus-20240229',
   'claude-3-sonnet-20240229',
   'dbrx-instruct',
   'gpt-3.5-turbo',
   'gpt-4-1106-preview',
   'gemini_pro',
   'llama-2-13b-chat-hf',
   'llama-2-70b-chat-hf',
   'mixtral-8x7B-instruct-v0.1']
  ```

  For instructions on specifying models to use with a survey see new documentation on Language Models. Let us know if there are other models that you would like us to add!
- Cache: We've improved user options for caching LLM calls.

  Old method: Pass a `use_cache` boolean parameter to a `Model` object to specify whether to access cached results for the model when using it with a survey (i.e., add `use_cache=False` to generate new results, as the default value is `True`).

  How it works now: All results are (still) cached by default. To avoid using a cache (i.e., to generate fresh results), pass an empty `Cache` object to the `run()` method that will store everything in it. This can be useful if you want to isolate a set of results to share them independently of your other data. Example:

  ```python
  from edsl.data import Cache
  c = Cache()  # create an empty Cache object

  from edsl.questions import QuestionFreeText
  results = QuestionFreeText.example().run(cache = c)  # pass it to the run method

  c  # inspect the new data in the cache
  ```

  We can inspect the contents:

  ````text
  Cache(data = {'46d1b44cd30e42f0f08faaa7aa461d98': CacheEntry(model='gpt-4-1106-preview', parameters={'temperature': 0.5, 'max_tokens': 1000, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'logprobs': False, 'top_logprobs': 3}, system_prompt='You are answering questions as if you were a human. Do not break character. You are an agent with the following persona:\n{}', user_prompt='You are being asked the following question: How are you?\nReturn a valid JSON formatted like this:\n{"answer": "<put free text answer here>"}', output='{"id": "chatcmpl-9CGKXHZPuVcFXJoY7OEOETotJrN4o", "choices": [{"finish_reason": "stop", "index": 0, "logprobs": null, "message": {"content": "```json\\n{\\"answer\\": \\"I\'m doing well, thank you for asking! How can I assist you today?\\"}\\n```", "role": "assistant", "function_call": null, "tool_calls": null}}], "created": 1712709737, "model": "gpt-4-1106-preview", "object": "chat.completion", "system_fingerprint": "fp_d6526cacfe", "usage": {"completion_tokens": 26, "prompt_tokens": 68, "total_tokens": 94}}', iteration=0, timestamp=1712709738)}, immediate_write=True, remote=False)
  ````

  For more details see new documentation on Caching LLM Calls.

  Coming soon: Automatic remote caching options.
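The hex keys visible in the `Cache` repr above suggest how a call cache can be keyed: hash the model, parameters, and prompts so that an identical call hits the same entry. A standard-library sketch of that idea (EDSL's actual key derivation may differ):

```python
import hashlib
import json

def cache_key(model, parameters, system_prompt, user_prompt):
    # Serialize the call deterministically, then hash it to a fixed-size key.
    payload = json.dumps(
        {"model": model, "parameters": parameters,
         "system_prompt": system_prompt, "user_prompt": user_prompt},
        sort_keys=True,
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

k1 = cache_key("gpt-4-1106-preview", {"temperature": 0.5}, "sys", "How are you?")
k2 = cache_key("gpt-4-1106-preview", {"temperature": 0.5}, "sys", "How are you?")
# Identical calls produce identical keys, so the second call is a cache hit.
```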
- API keys: You will no longer be prompted to enter your API keys when running a session. We recommend storing your keys in a private `.env` file in order to avoid having to enter them at each session. Alternatively, you can still re-set your keys whenever you run a session. See instructions on setting up an `.env` file in our Starter Tutorial.

  The Expected Parrot API key is coming soon! It will let you access all models at once and come with automated remote caching of all results. If you would like to test it out, please let us know!
- Prompts: We made it easier to modify the agent and question prompts that are sent to the models. For more details see new documentation on Prompts.
- `Model` attribute `use_cache` is now deprecated. See details above about how caching now works.
- `.run(n = ...)` now works and will run your survey with fresh results the specified number of times.
- Various fixes and small improvements
- The raw model response is now available in the `Results` object, accessed via the `raw_model_response` keyword. There is one for each question. The key is the question name + `_raw_response_model`.
- The `.run(progress_bar = True)` returns a much more informative real-time view of job progress.
- The `answer` component of the `Results` object is printed in a nicer format.
- `trait_name` descriptor was not working; it is now fixed.
- `QuestionList` is now working properly again.
- Results now provides a `.sql()` method that can be used to explore data in a SQL-like manner.
- Results now provides a `.ggplot()` method that can be used to create ggplot2 visualizations.
- Agent now admits an optional `name` argument that can be used to identify the Agent.
- Fixed various issues with visualizations. They should now work better.
- Question options can now be 1 character long or more (down from 2 characters)
- Fixed a bug where prompts displayed were incorrect (prompts sent were correct)
- Report functionalities are now part of the main package.
- Fixed a bug in the Results.print() function
- The package no longer supports a report extras option.
- Fixed a bug in `EndOfSurvey`
- Better handling of async failures
- Fixed bug in survey logic
- Improvements in async survey running
- Added logging
- Improvements in async survey running
- Improvements in async survey running
- Support for several large language models
- Async survey running
- Asking for API keys before they are used
- Bugs in survey running
- Bugs in several question types
- Unused files
- Unused package dependencies
- Changelog file
- Image display and description text in README.md
- Unused files
- Base feature