-
Notifications
You must be signed in to change notification settings - Fork 6
Fix/memory leaks, add functions to report context usage, add unittest #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…m_chat_check_context Adds a check to return early if the chat struct is already initialized, preventing memory leaks and unnecessary re-initialization and ensuring correct chat context handling.
… avoid a crash on llm_chat_free() Refactored llm_messages_append to always duplicate message content using sqlite_strdup, removing the duplicate_content parameter. This option was causing a double-free crash in llm_chat_free with non-duplicated content. Also, the ai->chat.response buffer is reset on new prompts in the same chat so we cannot rely on that buffer to store previous response messages.
Introduces the llm_context_usage SQLite function, which returns a JSON object with the current context size, tokens used, and usage ratio. This helps users monitor LLM context utilization and manage resources more effectively.
Introduces a new C unit test runner at tests/c/unittest.c with basic tests for the SQLite AI extension. The Makefile is updated to build and run these tests, including logic to download a test model if needed and link against the appropriate SQLite libraries.
Replaces the llm_context_usage function with two separate functions: llm_context_size, which returns the total context size, and llm_context_used, which returns the number of tokens used. Updates function registration accordingly for improved clarity and granularity.
Introduces a new test, test_llm_chat_vtab, to verify the llm_chat virtual table functionality and memory usage. Adds the exec_select_rows helper for row-counting and verbose output. Updates test_llm_chat_respond_repeated to use varied prompts and more detailed context usage queries. Removes redundant SQLITE_ENABLE_LOAD_EXTENSION macro definition.
Introduces a new 'download-model' job in the GitHub Actions workflow to cache and restore the GGUF model, reducing redundant downloads. Updates the Makefile to use GGUF model variables, simplifies test model handling, and ensures the test binary is built with the correct SQLite source. Add SQLite amalgamation for testing.
Replaces the use of the ETag header with a SHA-256 hash of the GGUF model URL for cache key generation in the workflow. This improves reliability by avoiding dependency on remote server ETag support.
…ndows and linux-musl Introduces a new output 'model-path' in the download-model job and updates subsequent steps to use this output. This improves maintainability by centralizing the model path reference.
Introduces a conditional test target for the Android platform in the Makefile. The Android test runs only the sqlite3 CLI smoke test, while other platforms also run additional tests. This prevents the Android job from trying to execute build/tests/sqlite_ai_tests (and other host-only tools) inside the emulator while keeping the rest of the test workflow unchanged. The commands on Android were trying to use the host’s prebuilt toolchain path inside an Android emulator, which doesn’t exist there
The step that verified the existence of the GGUF model file has been removed from the GitHub Actions workflow because on windows and linux-musl runners it was failing. If the file isn’t present, the Makefile’s $(GGUF_MODEL_PATH) rule will download it as before, so we no longer need to enforce the explicit test -f check that was failing in containerized environments.
- Added SKIP_UNITTEST flag to the Makefile so platforms lacking the necessary tooling (Android, Linux-musl, Windows) can run only the sqlite CLI smoke test while macOS/Linux keep running the full C harness. - Fixed platform-specific issues: removed -ldl on Windows builds, adjusted workflow verification steps, and ensured model paths/devices are handled cleanly.
| #include "sqlite-ai.h" | ||
| #endif | ||
|
|
||
| #define DEFAULT_MODEL_PATH "tests/models/unsloth/gemma-3-270m-it-GGUF/gemma-3-270m-it-UD-IQ2_M.gguf" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chore: I'd add a comment to say that // Just a lightweight model to use for testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added: 054798d
| static const test_case TESTS[] = { | ||
| {"issue15_llm_chat_without_context", test_issue15_chat_without_context}, | ||
| {"llm_chat_respond_repeated", test_llm_chat_respond_repeated}, | ||
| {"llm_chat_vtab", test_llm_chat_vtab}, | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: maybe this array should be placed in the head of file so it's easier to remember to fill it with new tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I put the array above the function implementations, I would also need to add the functions prototypes before the array
|
@andinux better to update the API.md as well with the new functions |
Added comments clarifying the use of a lightweight model for testing in both the Makefile and unittest.c. Reordered and grouped related test variables in the Makefile for better readability.
Adds documentation for `llm_context_size()`, `llm_context_used()`, and `llm_token_count(text)` functions.
fix(chat): avoid memory leaks and unnecessary re-initialization in llm_chat_check_context
Adds a check to return early if the chat struct is already initialized, preventing memory leaks and unnecessary re-initialization and ensuring correct chat context handling.
fix(chat): remove the duplicate_content arg in llm_messages_append to avoid a crash on llm_chat_free()
Refactored llm_messages_append to always duplicate message content using sqlite_strdup, removing the duplicate_content parameter. This option was causing a double-free crash in llm_chat_free with non-duplicated content. Also, the ai->chat.response buffer is reset on new prompts in the same chat so we cannot rely on that buffer to store previous response messages.
test: add C unit tests
Introduces a new C unit test runner at tests/c/unittest.c with basic tests for the SQLite AI extension. The Makefile is updated to build and run these tests, including logic to download a test model if needed and link against the appropriate SQLite libraries.
ci: add GGUF model caching to workflow
Introduces a new 'download-model' job in the GitHub Actions workflow to cache and restore the GGUF model, reducing redundant downloads. Updates the Makefile to use GGUF model variables, simplifies test model handling, and ensures the test binary is built with the correct SQLite source.
Add SQLite amalgamation for testing.
feat: add llm_context_used and llm_context_size functions to report context usage
Add two functions: llm_context_size, which returns the total context size, and llm_context_used, which returns the number of tokens used.