Image Understanding with RAG Cookbook #1838
Merged
Summary
This PR adds a new cookbook to the multimodal examples in this repo. OpenAI's new image understanding capabilities unlock use cases for systems that analyze multimodal data, and many real-world datasets are multimodal. The cookbook walks through building a retrieval-augmented generation (RAG) pipeline that leverages both text and image context to answer questions about a synthetic dataset, with evals that compare performance across different models. A rough sketch of the kind of call the notebook builds toward is shown below.
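For readers skimming the PR, here is a minimal sketch (not the cookbook's actual code) of the core step the notebook builds up to: combining retrieved text context with a retrieved image in a single vision-capable chat completion. The model name, context string, and image URL are placeholders; the cookbook's own retrieval and eval logic is more involved.

```python
from openai import OpenAI

client = OpenAI()

# Placeholder for the retrieval step: in the cookbook, the text snippet and
# image most relevant to the question come from a pre-built index over the
# synthetic dataset.
retrieved_text = "Product spec: the lamp base is brushed steel, 40 cm tall."
retrieved_image_url = "https://example.com/datasets/lamp_photo.png"  # illustrative URL

question = "What material is the lamp base made of, and does the photo match the spec?"

# Ask a vision-capable model to answer using both text and image context.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # any vision-capable model can be substituted here
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": f"Context:\n{retrieved_text}\n\nQuestion: {question}",
                },
                {"type": "image_url", "image_url": {"url": retrieved_image_url}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

The cookbook then wraps calls like this in an eval harness so the same question set can be scored against different models.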
Motivation
Many real-world datasets are multimodal, and the new image understanding capabilities make it practical to build systems that reason over text and images together. This cookbook gives developers a worked, evaluated example of such a system to start from.
TODO: add authors in authors.yaml