# Timestamp Precision Fix and Time-Based Recall
This guide details the journey of debugging and fixing timestamp precision issues in the MCP Memory Service, including implementing natural language time queries.
- Overview
- The Problem
- Root Cause Analysis
- The Solution
- Natural Language Time Queries
- Testing Approach
- Lessons Learned
## Overview

Timestamp precision is crucial for accurate time-based memory recall. This case study shows how sub-second precision was lost during storage and how natural language queries like "yesterday" and "last week" were implemented.
## The Problem

- Memories created milliseconds apart showed identical timestamps
- Time-based recall ("yesterday", "last week") returned incorrect results
- Test cases failed when memories were created 0.2 seconds apart
```python
# Test that revealed the issue
async def test_timestamp_precision():
    storage = ChromaMemoryStorage()

    # Create two memories 0.2 seconds apart
    memory1 = Memory(content="First memory")
    await storage.store(memory1)

    await asyncio.sleep(0.2)

    memory2 = Memory(content="Second memory")
    await storage.store(memory2)

    # Both memories had the same timestamp!
    assert memory1.created_at != memory2.created_at  # FAILED
```
## Root Cause Analysis

### Storage Layer Investigation

```python
# Found in chroma.py
def _optimize_metadata_for_chroma(self, metadata: Dict) -> Dict:
    optimized = metadata.copy()
    # Problem: converting to int loses sub-second precision!
    if "created_at" in optimized and isinstance(optimized["created_at"], float):
        optimized["created_at"] = int(optimized["created_at"])  # <-- BUG
```
### Retrieval Layer Issue

```python
# In the recall() method
results = self.collection.query(
    where={
        "$and": [
            {"created_at": {"$gte": int(start_timestamp)}},  # <-- converting to int
            {"created_at": {"$lte": int(end_timestamp)}}     # <-- losing precision
        ]
    }
)
```
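The effect of the truncation is easy to demonstrate in isolation. The two epoch values below are fixed (illustrative) rather than taken from the service, so the example is deterministic:

```python
# Two timestamps 200 ms apart
t1 = 1718400000.10
t2 = 1718400000.30

assert t1 != t2            # distinct as floats
assert int(t1) == int(t2)  # identical after int() truncation: precision lost
```

Any two memories stored within the same wall-clock second therefore became indistinguishable to both storage and retrieval.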
## The Solution

Modified `_optimize_metadata_for_chroma` in `chroma.py`:
```python
def _optimize_metadata_for_chroma(self, metadata: Dict) -> Dict:
    """Optimize metadata for ChromaDB storage while preserving precision"""
    optimized = metadata.copy()

    # Keep timestamps as floats to preserve sub-second precision
    timestamp_fields = ["created_at", "updated_at", "timestamp"]
    for field in timestamp_fields:
        if field in optimized and isinstance(optimized[field], (int, float)):
            # Store as float to maintain precision
            optimized[field] = float(optimized[field])

    return optimized
```
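The float-preserving behavior is easy to sanity-check with a standalone version of the same loop (the function name and the sample `meta` dict below are illustrative, not the service's actual code):

```python
def optimize_metadata(metadata: dict) -> dict:
    """Standalone sketch of the precision-preserving metadata pass"""
    optimized = metadata.copy()
    for field in ("created_at", "updated_at", "timestamp"):
        if field in optimized and isinstance(optimized[field], (int, float)):
            optimized[field] = float(optimized[field])
    return optimized

meta = {"created_at": 1718400000.25, "tag": "demo"}
result = optimize_metadata(meta)

assert result["created_at"] == 1718400000.25  # fractional seconds preserved
assert isinstance(result["created_at"], float)
```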
Updated the comparison logic in `recall()`:
```python
async def recall(
    self,
    start_time: datetime,
    end_time: Optional[datetime] = None,
    n_results: int = 10
) -> List[MemoryQueryResult]:
    """Retrieve memories within a time range with proper precision"""
    # Use float timestamps for precise comparisons
    start_timestamp = start_time.timestamp()  # Returns float
    end_timestamp = end_time.timestamp() if end_time else datetime.now().timestamp()

    results = self.collection.query(
        where={
            "$and": [
                {"created_at": {"$gte": start_timestamp}},  # Float comparison
                {"created_at": {"$lte": end_timestamp}}     # Maintains precision
            ]
        },
        n_results=n_results,
        include=["metadatas", "documents", "distances"]
    )
```
Updated the `to_dict()` method to preserve precision:
```python
def to_dict(self) -> Dict[str, Any]:
    """Convert Memory to dictionary preserving timestamp precision"""
    return {
        "id": self.id,
        "content": self.content,
        "content_hash": self.content_hash,
        "created_at": self.created_at,  # Keep as float
        "created_at_iso": datetime.fromtimestamp(self.created_at).isoformat(),
        "updated_at": self.updated_at,  # Keep as float
        "updated_at_iso": datetime.fromtimestamp(self.updated_at).isoformat(),
        "metadata": self.metadata
    }
```
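A quick standalone check (independent of the service code) confirms that a float epoch value round-trips through `datetime` without losing microseconds:

```python
from datetime import datetime

# A datetime with explicit microseconds
original = datetime(2025, 6, 15, 12, 0, 0, 123456)

# Convert to a float epoch timestamp and back
ts = original.timestamp()
restored = datetime.fromtimestamp(ts)

assert restored.microsecond == 123456  # sub-second precision survives the round trip
assert restored == original
```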
## Natural Language Time Queries

Created `utils/time_parser.py` to handle natural language time expressions:
```python
from datetime import datetime, timedelta
from typing import Tuple, Optional
import re

class TimeParser:
    """Parse natural language time expressions"""

    @staticmethod
    def parse_relative_time(query: str) -> Tuple[datetime, Optional[datetime]]:
        """
        Parse queries like 'yesterday', 'last week', 'past 3 days'
        Returns (start_time, end_time)
        """
        now = datetime.now()
        query_lower = query.lower().strip()

        # Handle ISO date strings first
        if re.match(r'\d{4}-\d{2}-\d{2}', query):
            try:
                date = datetime.fromisoformat(query)
                return (
                    date.replace(hour=0, minute=0, second=0, microsecond=0),
                    date.replace(hour=23, minute=59, second=59, microsecond=999999)
                )
            except ValueError:
                pass

        # Time-relative patterns
        patterns = {
            r'today': (
                now.replace(hour=0, minute=0, second=0, microsecond=0),
                now
            ),
            r'yesterday': (
                (now - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0),
                (now - timedelta(days=1)).replace(hour=23, minute=59, second=59, microsecond=999999)
            ),
            r'last (\d+) days?': lambda m: (
                now - timedelta(days=int(m.group(1))),
                now
            ),
            r'last week': (
                now - timedelta(weeks=1),
                now
            ),
            r'last month': (
                now - timedelta(days=30),
                now
            ),
            r'(\d+) days? ago': lambda m: (
                (now - timedelta(days=int(m.group(1)))).replace(hour=0, minute=0, second=0, microsecond=0),
                (now - timedelta(days=int(m.group(1)))).replace(hour=23, minute=59, second=59, microsecond=999999)
            ),
        }

        # Match patterns
        for pattern, result in patterns.items():
            match = re.search(pattern, query_lower)
            if match:
                if callable(result):
                    return result(match)
                return result

        # Default: last 24 hours
        return (now - timedelta(days=1), now)
```
The server-side recall handler then delegates to the parser:

```python
async def handle_recall(self, query: str, n_results: int = 10) -> List[MemoryQueryResult]:
    """Handle natural language time queries"""
    # Parse the time expression
    start_time, end_time = TimeParser.parse_relative_time(query)

    # Use the fixed recall method
    return await self.storage.recall(
        start_time=start_time,
        end_time=end_time,
        n_results=n_results
    )
```
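The pattern-dispatch idea — fixed `(start, end)` tuples for literal phrases, callables for patterns that capture a number — can be sketched in miniature. The fixed `now` and the tiny two-entry pattern table below are illustrative, not the service's actual code:

```python
import re
from datetime import datetime, timedelta

now = datetime(2025, 6, 15, 12, 0, 0)  # fixed "now" so the example is deterministic

# Fixed tuples for literal phrases; callables when a number is captured
patterns = {
    r'last (\d+) days?': lambda m: (now - timedelta(days=int(m.group(1))), now),
    r'yesterday': (
        (now - timedelta(days=1)).replace(hour=0, minute=0, second=0),
        (now - timedelta(days=1)).replace(hour=23, minute=59, second=59),
    ),
}

def parse(query: str):
    for pattern, result in patterns.items():
        match = re.search(pattern, query.lower())
        if match:
            # Call it with the match if it's a lambda, otherwise return the tuple
            return result(match) if callable(result) else result
    return (now - timedelta(days=1), now)  # default: last 24 hours

start, end = parse("last 3 days")
assert (start, end) == (datetime(2025, 6, 12, 12, 0, 0), now)
```

Keeping the table declarative makes it cheap to add new expressions without touching the matching loop.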
## Testing Approach

Created `tests/test_timestamp_recall.py`:
```python
import asyncio
import pytest
from datetime import datetime, timedelta

class TestTimestampPrecision:
    @pytest.mark.asyncio
    async def test_subsecond_precision(self):
        """Test that sub-second timestamps are preserved"""
        storage = ChromaMemoryStorage()
        memories = []
        base_time = datetime.now()

        # Create memories with 0.1 second intervals
        for i in range(5):
            memory = Memory(
                content=f"Memory {i}",
                created_at=(base_time + timedelta(seconds=i * 0.1)).timestamp()
            )
            await storage.store(memory)
            memories.append(memory)

        # Verify each memory has a unique timestamp
        timestamps = [m.created_at for m in memories]
        assert len(set(timestamps)) == 5, "Timestamps should be unique"

        # Verify order is preserved
        assert timestamps == sorted(timestamps), "Timestamp order should be preserved"

    @pytest.mark.asyncio
    async def test_precise_time_range_queries(self):
        """Test precise time range filtering"""
        storage = ChromaMemoryStorage()

        # Create test memories
        now = datetime.now()
        memories = []
        for i in range(10):
            memory = Memory(
                content=f"Memory {i}",
                created_at=(now - timedelta(seconds=i * 0.5)).timestamp()
            )
            await storage.store(memory)
            memories.append(memory)

        # Query for memories in the last 2 seconds
        start_time = now - timedelta(seconds=2)
        results = await storage.recall(start_time, now, n_results=10)

        # Should get exactly 5 memories (0, 0.5, 1.0, 1.5, and 2.0 seconds ago --
        # both bounds of the $gte/$lte range are inclusive)
        assert len(results) == 5

        # Verify they're the right ones
        for result in results:
            time_diff = now.timestamp() - result.memory.created_at
            assert time_diff <= 2.0, f"Memory outside range: {time_diff}s ago"
```
Created `tests/test_time_parser.py`:
```python
class TestTimeParser:
    def test_relative_expressions(self):
        """Test natural language time expressions"""
        test_cases = [
            ("yesterday", 1, 1),
            ("last 3 days", 3, 0),
            ("last week", 7, 0),
            ("2 days ago", 2, 2),
            ("today", 0, 0),
        ]

        now = datetime.now()
        for query, expected_days_start, expected_days_end in test_cases:
            start, end = TimeParser.parse_relative_time(query)

            # Verify approximate time ranges
            days_diff_start = (now - start).days
            days_diff_end = (now - end).days if end else 0
            assert abs(days_diff_start - expected_days_start) <= 1
            assert abs(days_diff_end - expected_days_end) <= 1

    def test_edge_cases(self):
        """Test edge cases and invalid inputs"""
        # ISO dates
        start, end = TimeParser.parse_relative_time("2025-06-15")
        assert start.date() == datetime(2025, 6, 15).date()
        assert end.date() == datetime(2025, 6, 15).date()

        # Invalid input defaults to the last 24 hours
        start, end = TimeParser.parse_relative_time("invalid query")
        assert (datetime.now() - start).days == 1
```
An end-to-end test exercises the full natural-language recall flow:

```python
@pytest.mark.asyncio
async def test_natural_language_recall():
    """Test the complete natural language recall flow"""
    storage = ChromaMemoryStorage()

    # Create memories at specific times
    now = datetime.now()
    yesterday = now - timedelta(days=1)
    last_week = now - timedelta(days=7)

    memories = [
        Memory(content="Today's memory", created_at=now.timestamp()),
        Memory(content="Yesterday's memory", created_at=yesterday.timestamp()),
        Memory(content="Last week's memory", created_at=last_week.timestamp()),
    ]
    for memory in memories:
        await storage.store(memory)

    # Test natural language queries
    yesterday_results = await storage.recall_natural("yesterday")
    assert len(yesterday_results) == 1
    assert "Yesterday's memory" in yesterday_results[0].memory.content

    week_results = await storage.recall_natural("last week")
    assert len(week_results) == 3  # All memories from the last week
```
## Lessons Learned

Even though humans don't create memories at sub-second intervals in normal usage, preserving precision is important for:
- Data integrity
- Automated imports
- Test reliability
- Future-proofing
Always be explicit about numeric types when dealing with timestamps:

```python
# Good -- keeps sub-second precision
timestamp = float(datetime.now().timestamp())

# Bad -- truncates to whole seconds
timestamp = int(datetime.now().timestamp())
```
The issue was only discovered through aggressive testing with sub-second intervals. Real-world usage patterns might not reveal such bugs.
Implementing natural language time queries greatly improves user experience:
- "show me yesterday's memories" vs "recall from 2025-06-14T00:00:00 to 2025-06-14T23:59:59"
When fixing timestamp issues:
- Fix storage layer (preserve precision)
- Fix retrieval layer (use precise comparisons)
- Fix model layer (maintain types)
- Add comprehensive tests
- Document the changes
This timestamp precision fix demonstrates the importance of understanding data types throughout the entire data pipeline. While the sub-second precision might seem excessive for a human-oriented memory system, maintaining data integrity and supporting edge cases like automated imports makes it worthwhile. The addition of natural language time queries transformed a technical fix into a significant UX improvement.