Skip to content

[Bug]: Memory leak in text_to_cstring() usage - missing pfree() calls #107

@ishansurdi

Description

@ishansurdi

Bug Description

I was reviewing the codebase and noticed something concerning in src/pg_ai_query.cpp.

The code uses text_to_cstring() to convert PostgreSQL text types to C++ strings, but I don't see any corresponding pfree() calls afterward. According to the PostgreSQL documentation, text_to_cstring() returns a palloc'd pointer that the caller is responsible for freeing.

I think this is causing memory leaks in all the extension functions. The pattern I'm seeing is:

std::string query = text_to_cstring(query_arg);  // char* leaked here!

The std::string constructor copies the char* data, but the original palloc'd memory is never freed. I counted 8 instances of this pattern across the file.

I know PostgreSQL cleans up memory contexts at transaction boundaries, so this might not be immediately obvious in typical usage, but in long-running transactions or with connection pooling, I think this could accumulate quite a bit of memory.

I'm open to review if I didn't grasp the underlying implementations.

Steps to Reproduce

Steps to Reproduce:

1. Install the pg_ai_query extension
2. Run this in a long transaction:
```sql
BEGIN;
  -- Call generate_query() repeatedly
  SELECT generate_query('show all users', 'test_key', 'openai');
  SELECT generate_query('count orders', 'test_key', 'openai');
  -- ... repeat 100+ times ...
  
  -- Check memory usage while still in transaction
  SELECT * FROM pg_backend_memory_contexts() WHERE name = 'MessageContext';
  
  -- You'll see memory has accumulated
COMMIT;  -- Only now does it get cleaned up
  1. Check the memory contexts - I think you'll see the accumulation during the transaction

Expected Behavior

Expected Behavior:

The code should call `pfree()` on the char* returned by `text_to_cstring()` after copying it to std::string. Something like:

```cpp
char* temp = text_to_cstring(query_arg);
std::string query = temp;
pfree(temp);  // This is what's missing!

Or better yet, use a helper function that handles this automatically.

Memory shouldn't accumulate during a transaction - each call should clean up after itself.

Actual Behavior

Actual Behavior:

The char* pointers returned by `text_to_cstring()` are never freed with `pfree()`. 

I found 8 instances of this leak:
- Line 42: `std::string nl_query = text_to_cstring(nl_query_arg);`
- Line 43: `std::string api_key = api_key_arg ? text_to_cstring(api_key_arg) : "";`
- Line 45: `std::string provider = provider_arg ? text_to_cstring(provider_arg) : "auto";`
- Line 123: `std::string table_name = text_to_cstring(table_name_arg);`
- Line 125: `std::string schema_name = schema_name_arg ? text_to_cstring(schema_name_arg) : "public";`
- Line 182: `std::string query_text = text_to_cstring(query_text_arg);`
- Line 183: `std::string api_key = api_key_arg ? text_to_cstring(api_key_arg) : "";`
- Line 185: `std::string provider = provider_arg ? text_to_cstring(provider_arg) : "auto";`

Each function call leaks approximately 120-370 bytes depending on the length of the input strings. In a long transaction with many calls, I think this adds up pretty quickly.

### SQL Query/Function Call

```sql
-- Any of these will leak memory:
SELECT generate_query('show all users from database', 'sk-test123456', 'openai');
SELECT get_table_details('users', 'public');
SELECT explain_query('SELECT * FROM orders WHERE status = ''pending''', 'sk-test123456', 'anthropic');

Error Messages/Logs

No error messages - the code compiles and runs fine. The leak is silent.

I verified the leak exists by searching the code at /src/pg_ai_query.cpp

And confirmed there are zero `pfree()` calls in the file.

PostgreSQL Version

PostgreSQL 14.10

Operating System

Windows

OS Version

No response

Configuration

Additional Context

I created some test files to demonstrate and verify this issue (attached to this bug report):

  1. MEMORY_LEAK_PROOF.md - Detailed analysis showing:

    • Why this is a leak (PostgreSQL API contract violation)
    • All 8 leak locations with line numbers
    • Impact calculation (~120-370 bytes per call)
    • Recommended fix with code examples
  2. tests/sql/memory_leak_test.sql - SQL test that shows memory accumulation in pg_backend_memory_contexts()

  3. tests/memory_leak_demo.cpp - C++ demonstration of the buggy vs correct pattern

I think the fix is pretty straightforward - just add a helper wrapper function like:

namespace {
  inline std::string pg_text_to_string(const text* t) {
    if (!t) return "";
    char* cstr = text_to_cstring(t);
    std::string result(cstr);
    pfree(cstr);  // The missing piece!
    return result;
  }
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions