# Timestamp Precision Fix and Time-Based Recall
This guide details the journey of debugging and fixing timestamp precision issues in the MCP Memory Service, including implementing natural language time queries.
- Overview
- The Problem
- Root Cause Analysis
- The Solution
- Natural Language Time Queries
- Testing Approach
- Lessons Learned
## Overview

Timestamp precision is crucial for accurate time-based memory recall. This case study shows how sub-second precision was lost during storage and how natural language queries like "yesterday" and "last week" were implemented.
## The Problem

- Memories created milliseconds apart showed identical timestamps
- Time-based recall ("yesterday", "last week") returned incorrect results
- Test cases failed when memories were created 0.2 seconds apart
```python
# Test that revealed the issue
async def test_timestamp_precision():
    storage = ChromaMemoryStorage()

    # Create two memories 0.2 seconds apart
    memory1 = Memory(content="First memory")
    await storage.store(memory1)

    await asyncio.sleep(0.2)

    memory2 = Memory(content="Second memory")
    await storage.store(memory2)

    # Both memories had the same timestamp!
    assert memory1.created_at != memory2.created_at  # FAILED
```
## Root Cause Analysis

### Storage Layer Investigation

```python
# Found in chroma.py
def _optimize_metadata_for_chroma(self, metadata: Dict) -> Dict:
    optimized = metadata.copy()
    # Problem: converting to int loses sub-second precision!
    if "created_at" in optimized and isinstance(optimized["created_at"], float):
        optimized["created_at"] = int(optimized["created_at"])  # <-- BUG
```
### Retrieval Layer Issue

```python
# In the recall() method
results = self.collection.query(
    where={
        "$and": [
            {"created_at": {"$gte": int(start_timestamp)}},  # <-- converting to int
            {"created_at": {"$lte": int(end_timestamp)}}     # <-- losing precision
        ]
    }
)
```
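The effect of the truncation is easy to demonstrate in isolation. The two epoch values below are fixed (illustrative) rather than taken from the service, so the example is deterministic:

```python
# Two timestamps 200 ms apart
t1 = 1718400000.10
t2 = 1718400000.30

assert t1 != t2            # distinct as floats
assert int(t1) == int(t2)  # identical after int() truncation: precision lost
```

Any two memories stored within the same wall-clock second therefore became indistinguishable to both storage and retrieval.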
## The Solution

Modified `_optimize_metadata_for_chroma` in `chroma.py`:
```python
def _optimize_metadata_for_chroma(self, metadata: Dict) -> Dict:
    """Optimize metadata for ChromaDB storage while preserving precision"""
    optimized = metadata.copy()

    # Keep timestamps as floats to preserve sub-second precision
    timestamp_fields = ["created_at", "updated_at", "timestamp"]
    for field in timestamp_fields:
        if field in optimized and isinstance(optimized[field], (int, float)):
            # Store as float to maintain precision
            optimized[field] = float(optimized[field])

    return optimized
```
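The float-preserving behavior is easy to sanity-check with a standalone version of the same loop (the function name and the sample `meta` dict below are illustrative, not the service's actual code):

```python
def optimize_metadata(metadata: dict) -> dict:
    """Standalone sketch of the precision-preserving metadata pass"""
    optimized = metadata.copy()
    for field in ("created_at", "updated_at", "timestamp"):
        if field in optimized and isinstance(optimized[field], (int, float)):
            optimized[field] = float(optimized[field])
    return optimized

meta = {"created_at": 1718400000.25, "tag": "demo"}
result = optimize_metadata(meta)

assert result["created_at"] == 1718400000.25  # fractional seconds preserved
assert isinstance(result["created_at"], float)
```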
Updated the comparison logic in `recall()`:
```python
async def recall(
    self,
    start_time: datetime,
    end_time: Optional[datetime] = None,
    n_results: int = 10
) -> List[MemoryQueryResult]:
    """Retrieve memories within a time range with proper precision"""
    # Use float timestamps for precise comparisons
    start_timestamp = start_time.timestamp()  # Returns float
    end_timestamp = end_time.timestamp() if end_time else datetime.now().timestamp()

    results = self.collection.query(
        where={
            "$and": [
                {"created_at": {"$gte": start_timestamp}},  # Float comparison
                {"created_at": {"$lte": end_timestamp}}     # Maintains precision
            ]
        },
        n_results=n_results,
        include=["metadatas", "documents", "distances"]
    )
```
Updated the `to_dict()` method to preserve precision:
```python
def to_dict(self) -> Dict[str, Any]:
    """Convert Memory to dictionary preserving timestamp precision"""
    return {
        "id": self.id,
        "content": self.content,
        "content_hash": self.content_hash,
        "created_at": self.created_at,  # Keep as float
        "created_at_iso": datetime.fromtimestamp(self.created_at).isoformat(),
        "updated_at": self.updated_at,  # Keep as float
        "updated_at_iso": datetime.fromtimestamp(self.updated_at).isoformat(),
        "metadata": self.metadata
    }
```
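A quick standalone check (independent of the service code) confirms that a float epoch value round-trips through `datetime` without losing microseconds:

```python
from datetime import datetime

# A datetime with explicit microseconds
original = datetime(2025, 6, 15, 12, 0, 0, 123456)

# Convert to a float epoch timestamp and back
ts = original.timestamp()
restored = datetime.fromtimestamp(ts)

assert restored.microsecond == 123456  # sub-second precision survives the round trip
assert restored == original
```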
## Natural Language Time Queries

Created `utils/time_parser.py` to handle natural language time expressions:
```python
from datetime import datetime, timedelta
from typing import Tuple, Optional
import re

class TimeParser:
    """Parse natural language time expressions"""

    @staticmethod
    def parse_relative_time(query: str) -> Tuple[datetime, Optional[datetime]]:
        """
        Parse queries like 'yesterday', 'last week', 'past 3 days'
        Returns (start_time, end_time)
        """
        now = datetime.now()
        query_lower = query.lower().strip()

        # Handle ISO date strings first
        if re.match(r'\d{4}-\d{2}-\d{2}', query):
            try:
                date = datetime.fromisoformat(query)
                return (
                    date.replace(hour=0, minute=0, second=0, microsecond=0),
                    date.replace(hour=23, minute=59, second=59, microsecond=999999)
                )
            except ValueError:
                pass

        # Time-relative patterns
        patterns = {
            r'today': (
                now.replace(hour=0, minute=0, second=0, microsecond=0),
                now
            ),
            r'yesterday': (
                (now - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0),
                (now - timedelta(days=1)).replace(hour=23, minute=59, second=59, microsecond=999999)
            ),
            r'last (\d+) days?': lambda m: (
                now - timedelta(days=int(m.group(1))),
                now
            ),
            r'last week': (
                now - timedelta(weeks=1),
                now
            ),
            r'last month': (
                now - timedelta(days=30),
                now
            ),
            r'(\d+) days? ago': lambda m: (
                (now - timedelta(days=int(m.group(1)))).replace(hour=0, minute=0, second=0, microsecond=0),
                (now - timedelta(days=int(m.group(1)))).replace(hour=23, minute=59, second=59, microsecond=999999)
            ),
        }

        # Match patterns
        for pattern, result in patterns.items():
            match = re.search(pattern, query_lower)
            if match:
                if callable(result):
                    return result(match)
                return result

        # Default: last 24 hours
        return (now - timedelta(days=1), now)
```
The server-side recall handler then delegates to the parser:

```python
async def handle_recall(self, query: str, n_results: int = 10) -> List[MemoryQueryResult]:
    """Handle natural language time queries"""
    # Parse the time expression
    start_time, end_time = TimeParser.parse_relative_time(query)

    # Use the fixed recall method
    return await self.storage.recall(
        start_time=start_time,
        end_time=end_time,
        n_results=n_results
    )
```
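The pattern-dispatch idea — fixed `(start, end)` tuples for literal phrases, callables for patterns that capture a number — can be sketched in miniature. The fixed `now` and the tiny two-entry pattern table below are illustrative, not the service's actual code:

```python
import re
from datetime import datetime, timedelta

now = datetime(2025, 6, 15, 12, 0, 0)  # fixed "now" so the example is deterministic

# Fixed tuples for literal phrases; callables when a number is captured
patterns = {
    r'last (\d+) days?': lambda m: (now - timedelta(days=int(m.group(1))), now),
    r'yesterday': (
        (now - timedelta(days=1)).replace(hour=0, minute=0, second=0),
        (now - timedelta(days=1)).replace(hour=23, minute=59, second=59),
    ),
}

def parse(query: str):
    for pattern, result in patterns.items():
        match = re.search(pattern, query.lower())
        if match:
            # Call it with the match if it's a lambda, otherwise return the tuple
            return result(match) if callable(result) else result
    return (now - timedelta(days=1), now)  # default: last 24 hours

start, end = parse("last 3 days")
assert (start, end) == (datetime(2025, 6, 12, 12, 0, 0), now)
```

Keeping the table declarative makes it cheap to add new expressions without touching the matching loop.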
## Testing Approach

Created `tests/test_timestamp_recall.py`:
```python
import asyncio
import pytest
from datetime import datetime, timedelta

class TestTimestampPrecision:
    @pytest.mark.asyncio
    async def test_subsecond_precision(self):
        """Test that sub-second timestamps are preserved"""
        storage = ChromaMemoryStorage()
        memories = []
        base_time = datetime.now()

        # Create memories with 0.1 second intervals
        for i in range(5):
            memory = Memory(
                content=f"Memory {i}",
                created_at=(base_time + timedelta(seconds=i * 0.1)).timestamp()
            )
            await storage.store(memory)
            memories.append(memory)

        # Verify each memory has a unique timestamp
        timestamps = [m.created_at for m in memories]
        assert len(set(timestamps)) == 5, "Timestamps should be unique"

        # Verify order is preserved
        assert timestamps == sorted(timestamps), "Timestamp order should be preserved"

    @pytest.mark.asyncio
    async def test_precise_time_range_queries(self):
        """Test precise time range filtering"""
        storage = ChromaMemoryStorage()

        # Create test memories
        now = datetime.now()
        memories = []
        for i in range(10):
            memory = Memory(
                content=f"Memory {i}",
                created_at=(now - timedelta(seconds=i * 0.5)).timestamp()
            )
            await storage.store(memory)
            memories.append(memory)

        # Query for memories in the last 2 seconds
        start_time = now - timedelta(seconds=2)
        results = await storage.recall(start_time, now, n_results=10)

        # Should get exactly 5 memories (0, 0.5, 1.0, 1.5, and 2.0 seconds ago --
        # both bounds of the $gte/$lte range are inclusive)
        assert len(results) == 5

        # Verify they're the right ones
        for result in results:
            time_diff = now.timestamp() - result.memory.created_at
            assert time_diff <= 2.0, f"Memory outside range: {time_diff}s ago"
```
Created `tests/test_time_parser.py`:
```python
class TestTimeParser:
    def test_relative_expressions(self):
        """Test natural language time expressions"""
        test_cases = [
            ("yesterday", 1, 1),
            ("last 3 days", 3, 0),
            ("last week", 7, 0),
            ("2 days ago", 2, 2),
            ("today", 0, 0),
        ]

        now = datetime.now()
        for query, expected_days_start, expected_days_end in test_cases:
            start, end = TimeParser.parse_relative_time(query)

            # Verify approximate time ranges
            days_diff_start = (now - start).days
            days_diff_end = (now - end).days if end else 0
            assert abs(days_diff_start - expected_days_start) <= 1
            assert abs(days_diff_end - expected_days_end) <= 1

    def test_edge_cases(self):
        """Test edge cases and invalid inputs"""
        # ISO dates
        start, end = TimeParser.parse_relative_time("2025-06-15")
        assert start.date() == datetime(2025, 6, 15).date()
        assert end.date() == datetime(2025, 6, 15).date()

        # Invalid input defaults to the last 24 hours
        start, end = TimeParser.parse_relative_time("invalid query")
        assert (datetime.now() - start).days == 1
```
An end-to-end test exercises the full natural-language recall flow:

```python
@pytest.mark.asyncio
async def test_natural_language_recall():
    """Test the complete natural language recall flow"""
    storage = ChromaMemoryStorage()

    # Create memories at specific times
    now = datetime.now()
    yesterday = now - timedelta(days=1)
    last_week = now - timedelta(days=7)

    memories = [
        Memory(content="Today's memory", created_at=now.timestamp()),
        Memory(content="Yesterday's memory", created_at=yesterday.timestamp()),
        Memory(content="Last week's memory", created_at=last_week.timestamp()),
    ]
    for memory in memories:
        await storage.store(memory)

    # Test natural language queries
    yesterday_results = await storage.recall_natural("yesterday")
    assert len(yesterday_results) == 1
    assert "Yesterday's memory" in yesterday_results[0].memory.content

    week_results = await storage.recall_natural("last week")
    assert len(week_results) == 3  # All memories from the last week
```
## Lessons Learned

Even though humans don't create memories at sub-second intervals in normal usage, preserving precision is important for:
- Data integrity
- Automated imports
- Test reliability
- Future-proofing
Always be explicit about numeric types when dealing with timestamps:

```python
# Good -- keeps sub-second precision
timestamp = float(datetime.now().timestamp())

# Bad -- truncates to whole seconds
timestamp = int(datetime.now().timestamp())
```
The issue was only discovered through aggressive testing with sub-second intervals. Real-world usage patterns might not reveal such bugs.
Implementing natural language time queries greatly improves user experience:
- "show me yesterday's memories" vs "recall from 2025-06-14T00:00:00 to 2025-06-14T23:59:59"
When fixing timestamp issues:
- Fix storage layer (preserve precision)
- Fix retrieval layer (use precise comparisons)
- Fix model layer (maintain types)
- Add comprehensive tests
- Document the changes
This timestamp precision fix demonstrates the importance of understanding data types throughout the entire data pipeline. While the sub-second precision might seem excessive for a human-oriented memory system, maintaining data integrity and supporting edge cases like automated imports makes it worthwhile. The addition of natural language time queries transformed a technical fix into a significant UX improvement.