*Feature issue: #1706*

Write a design proposal for the Gen AI data ingestion workflow using:

- GitLab pipeline as the data ingestion scheduler
- OpenSearch as the vector DB provider
- AWS Lambda to run the ingestion script with access to the database
- AWS for infrastructure (the design may also include considerations for GCP GKE)
- Langfuse as the test dataset storage solution
- Existing Python tooling, reused as much as possible: [tock-llm-indexing-tools](https://github.com/theopenconversationkit/tock/blob/tock-24.3.4/gen-ai/orchestrator-server/src/main/python/tock-llm-indexing-tools/README.md)
- **Optional:** Ragas for evaluators

*The design should be reviewed and approved before any development starts, to make sure we are heading in the right direction.*
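
As a starting point for the design discussion, here is a minimal sketch of what the hand-off between the GitLab pipeline job and the Lambda function might look like. It only covers payload validation. The field names (`index_name`, `source_uri`, `dry_run`), the `IngestionRequest` type, and the response shape are all assumptions made for illustration, not an agreed contract, and the actual call into tock-llm-indexing-tools and OpenSearch is deliberately left as a placeholder.

```python
import json
from dataclasses import dataclass

# Hypothetical payload fields the GitLab job would send; not an agreed contract.
REQUIRED_FIELDS = ("index_name", "source_uri")


@dataclass(frozen=True)
class IngestionRequest:
    """Illustrative request type for one ingestion run (assumed, not from the issue)."""
    index_name: str
    source_uri: str
    dry_run: bool = False


def parse_request(event: dict) -> IngestionRequest:
    """Validate the incoming event and turn it into a typed request."""
    missing = [field for field in REQUIRED_FIELDS if not event.get(field)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    return IngestionRequest(
        index_name=str(event["index_name"]),
        source_uri=str(event["source_uri"]),
        dry_run=bool(event.get("dry_run", False)),
    )


def handler(event, context=None):
    """Lambda-style entry point: validate the payload, then acknowledge it."""
    try:
        request = parse_request(event)
    except ValueError as exc:
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}

    # Placeholder: the real ingestion step (existing indexing tooling writing
    # to OpenSearch) would be invoked here once the design is approved.
    body = {
        "index_name": request.index_name,
        "source_uri": request.source_uri,
        "dry_run": request.dry_run,
        "status": "accepted",
    }
    return {"statusCode": 200, "body": json.dumps(body)}
```

Keeping validation separate from the ingestion step would let the pipeline fail fast on a malformed payload before any database access happens; whether that split is worth it is one of the points the design review could settle.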