feat: [ENG-3586] AI Gateway Stats page MVP#5427
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 3 Skipped Deployments
|
|
Claude finished @connortbot's task —— View job 🔍 Comprehensive PR Review CompleteTodo List:
Score: 2/10 - Critical security issues must be fixed before merge🚨 Critical Security Issues1. SQL Injection Vulnerabilities (CRITICAL)
Fix Required: // Line 115 - VULNERABLE:
const inClause = top9Data.length > 0
? top9Data.map((r) => `'${r.model}'`).join(",") // ❌ NO ESCAPING
: "''";
// Line 115 - SECURE:
const inClause = top9Data.length > 0
? top9Data.map((r) => `'${r.model.replace(/'/g, "''")}'`).join(",") // ✅ PROPER ESCAPING
: "''";
// Line 403 - Same issue with provider names🐛 High Priority Issues2. Percentage Normalization Bug
3. ClickHouse Migration Data Duplication Risk
⚡ Performance & Architecture Issues4. Inefficient Parallel Queries
5. Memory Usage for Large Result Sets
✅ Positive Aspects
📋 Suggestions SummaryMust Fix Before Merge:
Should Consider:
Minor Improvements:
🛡️ Security AssessmentThis PR introduces a critical SQL injection vulnerability that could allow attackers to:
The vulnerability exists because model/provider names from the database (which could be user-controlled through the AI Gateway) are directly concatenated into SQL strings without proper escaping. Immediate Action Required: Fix SQL injection vulnerabilities before any merge consideration. |
Greptile SummaryThis PR implements a public AI Gateway statistics page showing model usage trends and market share data. The implementation includes a ClickHouse materialized view for efficient aggregation, new backend API endpoints with 4-hour caching, and a React-based frontend with interactive charts and leaderboards. Key Changes:
Issues Found:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant User as User Browser
participant Page as Stats Page (bifrost)
participant QP as QueryProvider
participant API as Jawn API
participant Cache as KVCache
participant Manager as ModelUsageStatsManager
participant CH as ClickHouse
participant MV as request_stats_mv
User->>Page: Navigate to /stats
Page->>QP: Mount with QueryClient
par Fetch Chart Data (1y)
QP->>API: GET /v1/public/stats/model-usage?timeframe=1y
API->>Cache: Check cache key
alt Cache Hit
Cache-->>API: Return cached data
API-->>QP: Return cached response
else Cache Miss
Cache-->>API: No cached data
API->>Manager: getModelUsage("1y")
Manager->>CH: Query top 9 models
CH->>MV: Aggregate from materialized view
MV-->>CH: Return aggregated data
CH-->>Manager: Top 9 models
Manager->>CH: Parallel queries (other, prev period, time series)
CH-->>Manager: All query results
Manager-->>API: { timeSeries, leaderboard }
API->>Cache: Store response (4hr TTL)
API-->>QP: Return response
end
QP-->>Page: Update chart state
and Fetch Leaderboard Data (7d)
QP->>API: GET /v1/public/stats/model-usage?timeframe=7d
Note over API,Cache: Same caching logic
API-->>QP: Return response
QP-->>Page: Update leaderboard state
and Fetch Market Share Chart (1y)
QP->>API: GET /v1/public/stats/market-share?timeframe=1y
API->>Cache: Check cache
alt Cache Miss
API->>Manager: getMarketShare("1y")
Manager->>CH: Query all models
CH->>MV: Aggregate by model
MV-->>CH: Model totals
CH-->>Manager: Raw model data
Manager->>Manager: Aggregate by author using registry
Manager->>CH: Parallel queries (prev period, time series)
CH-->>Manager: Historical data
Manager->>Manager: Calculate market share percentages
Manager-->>API: { timeSeries, leaderboard }
API->>Cache: Store response
end
API-->>QP: Return response
QP-->>Page: Update market share state
and Fetch Market Share Leaderboard (7d)
QP->>API: GET /v1/public/stats/market-share?timeframe=7d
Note over API,Cache: Same caching logic
API-->>QP: Return response
QP-->>Page: Update market share leaderboard state
end
Page->>User: Render charts and leaderboards
Note over User,Page: User changes timeframe selector
User->>Page: Select "This Month" (30d)
Page->>QP: Update query key
QP->>API: GET /v1/public/stats/model-usage?timeframe=30d
Note over API,Cache: Cached for 4 hours
API-->>QP: Return response
QP-->>Page: Update leaderboard
Page->>User: Re-render with new data
|
| const inClause = top9Data.length > 0 | ||
| ? top9Data.map((r) => `'${r.model}'`).join(",") | ||
| : "''"; |
There was a problem hiding this comment.
logic: SQL injection vulnerability - model names from the database are directly interpolated into SQL without escaping.
| const inClause = top9Data.length > 0 | |
| ? top9Data.map((r) => `'${r.model}'`).join(",") | |
| : "''"; | |
| const inClause = top9Data.length > 0 | |
| ? top9Data.map((r) => `'${r.model.replace(/'/g, "''")}'`).join(",") | |
| : "''" |
Prompt To Fix With AI
This is a comment left during a code review.
Path: valhalla/jawn/src/managers/ModelUsageStatsManager.ts
Line: 92:94
Comment:
**logic:** SQL injection vulnerability - model names from the database are directly interpolated into SQL without escaping.
```suggestion
const inClause = top9Data.length > 0
? top9Data.map((r) => `'${r.model.replace(/'/g, "''")}'`).join(",")
: "''"
```
How can I resolve this? If you propose a fix, please make it concise.| const totalPercentage = point.authors.reduce( | ||
| (sum, a) => sum + a.percentage, | ||
| 0 | ||
| ); | ||
|
|
||
| authors.forEach((author) => { | ||
| const found = point.authors.find((a) => a.author === author); | ||
| const rawPercentage = found?.percentage ?? 0; | ||
| const normalizedPercentage = | ||
| totalPercentage > 0 ? (rawPercentage / totalPercentage) * 100 : 0; | ||
| entry[author] = normalizedPercentage; |
There was a problem hiding this comment.
style: Unnecessary normalization that could introduce rounding inconsistencies. Backend already returns percentages that sum to ~100%, so re-normalizing may cause the chart to display values that don't match the raw data shown in tooltips.
Prompt To Fix With AI
This is a comment left during a code review.
Path: bifrost/app/stats/MarketShareChart.tsx
Line: 120:130
Comment:
**style:** Unnecessary normalization that could introduce rounding inconsistencies. Backend already returns percentages that sum to ~100%, so re-normalizing may cause the chart to display values that don't match the raw data shown in tooltips.
How can I resolve this? If you propose a fix, please make it concise.| : "''"; | ||
|
|
||
| // Step 2: Get "other" total + previous period data + time series (parallel) | ||
| const otherQuery = ` |
There was a problem hiding this comment.
@claude double sql injectable shit here please... make sure shit is air lock
There was a problem hiding this comment.
Claude finished @chitalian's task —— View job
✅ SQL Injection Vulnerabilities Fixed
I've successfully identified and fixed all SQL injection vulnerabilities in the ModelUsageStatsManager.ts file.
Todo List:
- Read and analyze the vulnerable code in ModelUsageStatsManager.ts
- Identify all SQL injection points where model/provider names are interpolated
- Implement proper SQL escaping for all vulnerable locations
- Run linting and type checking to ensure changes are correct
- Commit and push the security fixes
✅ Security Fixes Applied:
3 critical SQL injection vulnerabilities secured:
- Line 115: Model names in
getModelUsage()method ✅ - Line 403: Provider names in
getProviderUsage()method ✅ - Lines 132, 142: Model names in previous period queries ✅ (automatically fixed by fixing inClause)
⚡ Fix Details:
// BEFORE (vulnerable):
top9Data.map((r) => `'${r.model}'`)
top9Data.map((r) => `'${r.provider}'`)
// AFTER (secure):
top9Data.map((r) => `'${r.model.replace(/'/g, "''")}'`)
top9Data.map((r) => `'${r.provider.replace(/'/g, "''")}'`)This implements proper SQL escaping by doubling single quotes (SQL standard), preventing SQL injection attacks from malicious model/provider names.
Example attack prevented:
- Malicious model name:
model'; DROP TABLE request_stats; -- - After escaping:
model''; DROP TABLE request_stats; --(harmless string)
🔒 Security Status: AIRLOCKED ✅
The code is now secure against SQL injection vulnerabilities. All user input from the database is properly escaped before being interpolated into SQL queries.
Changes committed: b6cb110
b6cb110 to
c096869
Compare
MVP including 3 charts:
Later PRs will build on this by adding more charts, and individual stats pages for providers, authors, and models.