Add circuit breaker for database connectivity failures during sync #397
Open
joschiv1977 wants to merge 1 commit into s1t5:main from
When PostgreSQL becomes unreachable during email sync (e.g. container crash, OOM kill, DNS failure), the sync loop previously continued processing every single email, logging a DB error for each one without ever stopping. This caused massive log spam and wasted API calls to Graph/IMAP servers.

This adds a circuit breaker pattern to both GraphEmailService and ImapEmailService:

- Message-level: After 5 consecutive DB connectivity errors, abort the current folder sync immediately
- Folder-level: After 2 consecutive folder failures due to DB issues, check database health before continuing. Abort entire account sync if DB is unreachable.
- Pre-pagination health check: Before fetching the next page from Graph API, verify DB is still reachable to avoid wasting API calls
- New IsDbConnectivityError() helper distinguishes DB connectivity errors (SocketException, DNS failures, transient failures) from application-level errors
- New IsDatabaseReachableAsync() performs lightweight connectivity check via CanConnectAsync()

Non-DB errors (e.g. UTF-8 encoding issues, parse errors) do not trigger the circuit breaker and are handled as before.

Relates to s1t5#382, s1t5#388, s1t5#363

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
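The message-level and folder-level thresholds described above can be illustrated with a small, language-neutral sketch. This is not the PR's C# code: `DbUnavailable`, `FolderAborted`, `sync_folder`, `sync_account`, and the `store` / `db_reachable` callables are hypothetical names chosen for illustration, and only the thresholds (5 and 2) come from the description. How non-DB errors interact with the counter is an assumption.

```python
from typing import Callable, Dict, List, Tuple

MAX_CONSECUTIVE_MESSAGE_ERRORS = 5   # message-level threshold from the description
MAX_CONSECUTIVE_FOLDER_FAILURES = 2  # folder-level threshold from the description


class DbUnavailable(Exception):
    """Hypothetical stand-in for a DB connectivity error."""


class FolderAborted(Exception):
    """Hypothetical signal that the message-level breaker tripped."""


def sync_folder(messages: List[str], store: Callable[[str], None]) -> int:
    """Store each message; abort after too many consecutive DB errors."""
    consecutive_db_errors = 0
    stored = 0
    for message in messages:
        try:
            store(message)
        except DbUnavailable:
            consecutive_db_errors += 1
            if consecutive_db_errors >= MAX_CONSECUTIVE_MESSAGE_ERRORS:
                raise FolderAborted(f"{consecutive_db_errors} consecutive DB errors")
        except ValueError:
            # Non-DB error (e.g. a parse failure): skip this message only.
            # Assumption: such errors leave the DB error counter unchanged.
            continue
        else:
            consecutive_db_errors = 0  # a successful write resets the streak
            stored += 1
    return stored


def sync_account(
    folders: Dict[str, List[str]],
    store: Callable[[str], None],
    db_reachable: Callable[[], bool],
) -> Tuple[List[str], bool]:
    """Return (folders synced, whether the account sync ran to completion)."""
    consecutive_folder_failures = 0
    synced: List[str] = []
    for name, messages in folders.items():
        try:
            sync_folder(messages, store)
        except FolderAborted:
            consecutive_folder_failures += 1
            if (consecutive_folder_failures >= MAX_CONSECUTIVE_FOLDER_FAILURES
                    and not db_reachable()):
                return synced, False  # DB is down: abort the whole account sync
            continue
        consecutive_folder_failures = 0
        synced.append(name)
    return synced, True
```

With a `store` that always raises `DbUnavailable` and a `db_reachable` that returns `False`, this sketch gives up after ten write attempts spread over the first two folders instead of trying every message in every folder.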
Summary
When PostgreSQL becomes unreachable during email sync (e.g. container crash, OOM kill, DNS failure), the sync loop currently continues processing every single email — logging a DB error for each one without ever stopping. This causes massive log spam, wasted Graph API / IMAP calls, and high memory/CPU usage for no benefit.
This PR adds a circuit breaker pattern to both `GraphEmailService` and `ImapEmailService`:

- `IsDbConnectivityError()` helper: Distinguishes DB connectivity errors (SocketException, DNS failures, transient failures) from application-level errors (UTF-8 issues, parse errors, etc.)
- `IsDatabaseReachableAsync()` helper: Lightweight connectivity check via `CanConnectAsync()`

Non-DB errors are not affected by the circuit breaker and continue to be handled as before.
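To make the distinction between connectivity errors and application-level errors concrete, here is a minimal Python sketch of the idea behind the two helpers. It is an illustration under stated assumptions, not the PR's C# implementation: the function names mirror the helpers only loosely, the set of exception types treated as "connectivity" is an assumption, and `connect` is a hypothetical callable standing in for a real connection attempt.

```python
import socket
from typing import Callable

# Assumption: these built-in types approximate "socket, DNS, and timeout" failures.
CONNECTIVITY_ERRORS = (ConnectionError, socket.gaierror, TimeoutError)


def is_db_connectivity_error(exc: BaseException) -> bool:
    """Walk the exception chain and report whether any link is a connectivity error."""
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        if isinstance(current, CONNECTIVITY_ERRORS):
            return True
        seen.add(id(current))  # guard against cycles in the chain
        current = current.__cause__ or current.__context__
    return False


def is_database_reachable(connect: Callable[[], object]) -> bool:
    """Lightweight health check: True if a connection attempt succeeds.

    Assumption: any failure to connect counts as "unreachable".
    """
    try:
        connect()
        return True
    except Exception:
        return False
```

Walking the chain matters because data-access layers often wrap the low-level socket or DNS error inside a higher-level exception. Inspecting only the outermost exception would misclassify those failures as application errors.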
Context
This was discovered when the PostgreSQL container died during a large sync operation (800+ emails across multiple folders). The mail archiver continued running, spamming thousands of identical "Name or service not known" errors in the logs — one for every email it tried to process — while also continuing to fetch pages from the Graph API (wasting API calls). The container had to be manually stopped.
Related issues: s1t5#382, s1t5#388, s1t5#363
Test plan
🤖 Generated with Claude Code