-
Notifications
You must be signed in to change notification settings - Fork 136
[Bug]:WebSocket Connection Stability & Reconnection Logic #201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Bug]:WebSocket Connection Stability & Reconnection Logic #201
Conversation
📝 WalkthroughWalkthroughAdds WebSocket and RTC connection-state tracking, capped retry logic, reconnect/leave UI in OnlineDebateRoom; introduces a new ConnectionStatus React component; and adds Changes
Sequence DiagramsequenceDiagram
participant User
participant OnlineDebateRoom
participant WebSocket
participant ConnectionStatus
participant RTC
User->>OnlineDebateRoom: Open debate page
OnlineDebateRoom->>WebSocket: Create / open connection
WebSocket-->>OnlineDebateRoom: onopen
OnlineDebateRoom->>ConnectionStatus: state = connected
Note over WebSocket: network interruption
WebSocket-->>OnlineDebateRoom: onerror/onclose
OnlineDebateRoom->>OnlineDebateRoom: wsRetryCount++, update wsLastError, set state=reconnecting
OnlineDebateRoom->>ConnectionStatus: update state, retryCount
alt Retry succeeds (< WS_MAX_RETRIES)
OnlineDebateRoom->>WebSocket: reconnect attempt
WebSocket-->>OnlineDebateRoom: onopen
OnlineDebateRoom->>ConnectionStatus: state = connected
else Max retries exceeded
OnlineDebateRoom->>ConnectionStatus: state = error
ConnectionStatus-->>User: show error + Reconnect/Leave buttons
User->>OnlineDebateRoom: click Reconnect or Leave
alt Reconnect
OnlineDebateRoom->>WebSocket: manual reconnect
else Leave
OnlineDebateRoom->>RTC: stop tracks / close peer connection
OnlineDebateRoom->>WebSocket: close
OnlineDebateRoom->>User: navigate away
end
end
RTC->>OnlineDebateRoom: onconnectionstatechange / oniceconnectionstatechange
OnlineDebateRoom->>ConnectionStatus: update rtcState
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @frontend/src/Pages/OnlineDebateRoom.tsx:
- Around line 1406-1419: The connection-state handler in
pc.onconnectionstatechange incorrectly treats pc.connectionState ===
'disconnected' as a terminal failure; change the mapping so 'disconnected' sets
rtc state to 'disconnected' (not 'failed') and only map 'failed' to
setRtcConnectionState('failed'), leaving 'connected' -> 'connected' and the
default -> 'disconnected'; update references to pc.onconnectionstatechange,
pc.connectionState, and setRtcConnectionState accordingly.
- Around line 1129-1141: The onerror handler currently increments wsRetryCount
which leads to double-counting because onclose also increments it; update
rws.onerror to only log and setWsLastError (and optionally setWsConnectionState
to 'error' for immediate UI feedback) without calling setWsRetryCount, and move
all retry-increment logic and WS_MAX_RETRIES checks into the rws.onclose handler
(keep setWsConnectionState updates and newCount logic inside onclose) so retry
attempts are only counted once per failure.
- Around line 1143-1159: The onclose handler captures a stale wsRetryCount;
replace direct references to wsRetryCount with a functional state update to read
the latest value atomically: in rws.onclose use setWsRetryCount(prev => { const
next = prev + 1; if (prev < WS_MAX_RETRIES) {
setWsConnectionState('reconnecting'); console.log(`[Connection] Retrying
(${next}/${WS_MAX_RETRIES})...`); } else { setWsConnectionState('error');
console.log('[Connection] Max retries exceeded'); } return Math.min(next,
WS_MAX_RETRIES); }); and keep setWsLastError as-is; this ensures the retry check
uses the current value rather than the stale captured wsRetryCount.
- Around line 1947-1958: handleLeaveDebate closes wsRef and pcRef directly but
doesn’t stop local media tracks, so extract and reuse the same cleanup logic
used in the WebSocket effect (or explicitly stop tracks) before closing
connections: stop all tracks on localStreamRef.current (or iterate
pcRef.current.getSenders() and stop each sender.track if present), then close
pcRef.current and wsRef.current, and only after that call navigate('/game');
update handleLeaveDebate to call that shared cleanup helper (or include the
track-stopping steps) so camera/mic indicators are turned off on leave.
🧹 Nitpick comments (1)
frontend/src/components/ConnectionStatus.tsx (1)
42-57: Consider consolidating duplicate switch cases.Cases
connecting/reconnectinganddisconnected/errorreturn identical values. Consolidating improves maintainability.♻️ Optional refactor
const getStatusIcon = () => { switch (state) { case 'connected': return '●'; case 'connecting': - return '↻'; case 'reconnecting': return '↻'; case 'disconnected': - return '⚠'; case 'error': return '⚠'; default: return '○'; } };
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/.gitignorefrontend/src/Pages/OnlineDebateRoom.tsxfrontend/src/components/ConnectionStatus.tsx
🔇 Additional comments (7)
backend/.gitignore (1)
2-2: Good practice, but appears unrelated to PR objectives.Ignoring production config files is a security best practice to prevent accidental commits of sensitive settings. The backend config structure contains only
config.prod.sample.yml(a template) andconfig.go, confirming that this change properly ignores the actual production config while preserving the sample file for reference. No other environment-specific config files (staging, dev, test) exist in the repository.However, this backend change seems unrelated to the PR's stated objectives, which focus entirely on WebSocket connection stability in the frontend OnlineDebateRoom component. Consider whether this should be a separate commit for cleaner scope separation.
frontend/src/components/ConnectionStatus.tsx (2)
1-15: LGTM! Well-structured type definitions and props interface.The component defines clear types for
ConnectionStateandRTCState, and the props interface captures all necessary inputs for displaying connection status and handling user actions.
89-145: LGTM! Well-designed progressive disclosure UI.The component appropriately shows a minimal indicator when connected and expands to show detailed status, error messages, and action buttons only when there are issues. The reconnect/leave actions are correctly gated to terminal states.
frontend/src/Pages/OnlineDebateRoom.tsx (4)
248-254: LGTM! Connection state management variables are well-defined.The state variables properly track WebSocket connection state, retry count, last error, and RTC connection state. The
WS_MAX_RETRIES = 5constant aligns with the PR objective of limiting reconnection attempts.
2158-2167: LGTM! ConnectionStatus component properly integrated.The component receives all required props with appropriate values, aligning with the PR objectives to display connection health and provide user recovery options.
2487-2495: LGTM! Button styling correctly reflects recording state.The conditional className properly applies destructive styling when recording is active.
1936-1945: No action required. Thereconnect()method is available in reconnecting-websocket v4.4.0 (installed as per package.json) and the implementation is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
frontend/src/Pages/OnlineDebateRoom.tsx (2)
2489-2497: Consider using Button'svariantprop for consistency.The styling works, but if the
Buttoncomponent supports avariant="destructive"prop (common in shadcn/ui), using it would be more consistent with the component's API and easier to maintain.<Button onClick={ isManualRecording ? handleStopSpeaking : handleStartSpeaking } - className={`h-8 rounded-md px-3 text-xs ${isManualRecording ? 'bg-destructive text-destructive-foreground hover:bg-destructive/90' : ''}`} + variant={isManualRecording ? "destructive" : "default"} + size="sm" disabled={!isManualRecording && !canStartSpeaking} >
1099-1100: Acknowledge intentional dependency omissions in WebSocket effect.The
eslint-disablecomment at line 1099 acknowledges that the dependency array is intentionally incomplete. This is a common pattern for WebSocket initialization effects where re-running the effect on every state change would cause unnecessary reconnections.However, the
onmessagehandler references state variables likelocalRole,peerRole,debatePhase, andcurrentUserthat may become stale. For critical message handling, consider using refs to access current values:const localRoleRef = useRef(localRole); // Already exists at line 187 // Update ref in effect: localRoleRef.current = localRole; // Then use localRoleRef.current in onmessageThis pattern is already partially applied (e.g.,
localRoleRef,currentUserIdRef), which is good.Also applies to: 1454-1460
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
frontend/src/Pages/OnlineDebateRoom.tsx
🔇 Additional comments (8)
frontend/src/Pages/OnlineDebateRoom.tsx (8)
248-254: Well-structured connection state management.The connection state types are appropriately defined with clear union types. The initial state of
'connecting'aligns with the WebSocket being created on mount.
1115-1127: Good connection state reset on open.The handler properly resets retry count and error state when the connection is established, which aligns with the reconnection flow requirements from issue #192.
1399-1419: RTC state tracking is functional.The connection state mapping covers the key states. The
defaultcase appropriately treats'new','connecting', and'closed'as disconnected for UI purposes.Consider also exposing ICE connection failures (e.g.,
'failed'ICE state) in a future iteration if more granular RTC diagnostics are needed.
1942-1960: Leave handler provides thorough cleanup.The cleanup sequence is appropriate: stopping media tracks before closing connections and navigating away. Good use of the navigation hook.
Minor: For consistency with the
wsRef.currentnull check, consider adding a null check forpcRef.current:- if (pcRef.current) { - pcRef.current.close(); - } + pcRef.current?.close();
2160-2169: ConnectionStatus integration meets PR objectives.The component receives all necessary props to display connection state, retry progress (retryCount/maxRetries), RTC state, error messages, and action callbacks—fulfilling the requirements from issue #192 for exposing connection status to users. All prop names and types match the component's interface exactly, and the component properly displays the retry progress indicator as "Reconnecting 2/5" during reconnection attempts.
8-13: LGTM on imports.The new imports for
useNavigateandConnectionStatusare appropriate for the connection management features. TheConnectionStatuscomponent is properly defined with the expected interface and correctly integrated with all required props.
1931-1940: Thereconnect()method does exist inreconnecting-websocketv4.4.0. The code is correct as written.The library's API includes
reconnect(code?: number, reason?: string), which forces a reconnect attempt by closing the existing socket (if present) and immediately starting the reconnection sequence while honoring the instance's reconnection options. The call towsRef.current.reconnect()at line 1938 will work as intended.The proposed refactoring to manually recreate the WebSocket instance is unnecessary and less efficient than using the library's built-in reconnect mechanism.
Likely an incorrect or invalid review comment.
1129-1153: Retry tracking correctly aligns with library behavior.The
onclosehandler properly uses functional state updates to prevent race conditions. The retry count correctly increments on each close event, which aligns with howReconnectingWebSocketfiresonclosefor each underlying socket instance that closes—the library creates a new native WebSocket on each reconnection attempt, soonclosefires once per failed connection attempt before the library attempts to reconnect.The handler assignment approach (
rws.onclose = ...) is correct and avoids listener duplication issues that can occur withaddEventListener.One edge case: when the user manually closes the WebSocket (e.g., via
handleLeaveDebate),onclosewill trigger and increment the retry count. This doesn't cause issues since navigation happens immediately afterward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @frontend/src/Pages/OnlineDebateRoom.tsx:
- Around line 1969-1987: handleLeaveDebate fails to close spectator peer
connections stored in spectatorPCsRef; update the handleLeaveDebate callback to
iterate spectatorPCsRef.current (handle Map/object/array shape used in your
code), call .close() on each RTCPeerConnection, remove any event listeners if
attached, and then clear or reset spectatorPCsRef.current to an empty container;
keep existing cleanup for localStreamRef, audioStream, wsRef, and pcRef and
ensure you reference spectatorPCsRef inside the same callback so all spectator
connections are closed before navigate('/game').
- Around line 1153-1172: The onclose handler's manual retry counter
(setWsRetryCount) can be off-by-one relative to ReconnectingWebSocket's
maxRetries semantics—inspect how ReconnectingWebSocket v4.4.0 defines maxRetries
(whether it's total attempts or retries after the initial attempt) and adjust
the check in rws.onclose accordingly: either change the comparison from if
(newCount >= WS_MAX_RETRIES) to if (newCount > WS_MAX_RETRIES) or compare
against WS_MAX_RETRIES + 1 so wsRetryCount semantics match the library, or
better yet read the library's actual retry/attempt property (e.g.,
rws.retryCount or rws.reconnectAttempts if available) and base the error
transition (setting setWsConnectionState('error')) on that instead of the manual
counter to ensure synchronization.
🧹 Nitpick comments (1)
frontend/src/Pages/OnlineDebateRoom.tsx (1)
248-253: Consider movingWS_MAX_RETRIESoutside the component.The constant
WS_MAX_RETRIESis recreated on every render. While this has minimal performance impact, it's cleaner to define constants outside the component scope for clarity and consistency.Suggested refactor
+const WS_MAX_RETRIES = 5; + const OnlineDebateRoom = (): JSX.Element => { // ... other code - const WS_MAX_RETRIES = 5;
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/.gitignorefrontend/src/Pages/OnlineDebateRoom.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/.gitignore
🔇 Additional comments (10)
frontend/src/Pages/OnlineDebateRoom.tsx (10)
8-13: LGTM!The new imports for
useNavigateandConnectionStatusare correctly added and align with the PR objectives for navigation support and connection status UI.
1134-1146: LGTM!The
onopenhandler correctly resets connection state, retry count, and error message upon successful connection. The logging and subsequent initialization calls are appropriate.
1148-1151: LGTM!The
onerrorhandler correctly logs the error and stores the error message. Not changingwsConnectionStatehere is appropriate sinceonclosetypically follows and handles state transitions.
1426-1446: LGTM with a minor note.The RTC connection state tracking correctly maps the key states. Note that
connectingandnewstates will default todisconnected, which is a reasonable simplification for the UI. The additional ICE connection state logging is helpful for debugging.
2187-2196: LGTM!The
ConnectionStatuscomponent is properly integrated with all necessary props for state, retry information, error messages, and action handlers. Placing it at the top of the container ensures visibility.
2530-2530: LGTM!The button styling update correctly applies destructive styling when recording is active, providing clear visual feedback to users.
144-144: LGTM!The explicit type annotation
(match: string)improves type safety for the protocol replacement logic.
150-150: LGTM!The
useNavigatehook is correctly initialized and used for the leave debate functionality.
1127-1127: LGTM!Using
WS_MAX_RETRIESfor the WebSocket configuration ensures consistent retry limits between the library and UI state tracking.
1957-1967: No action required. ReconnectingWebSocket v4.4.0 resets its internal retry counter on successful connection (onopen event), so the manual reconnect handler correctly manages both library state and UI state. The code is correct as-is.
| rws.onclose = (event: any) => { | ||
| console.log('[WS] WebSocket closed:', event.code, event.reason); | ||
| const errorMessage = event.code === 1006 ? 'Abnormal closure' : | ||
| event.code === 1008 ? 'Policy violation' : | ||
| event.code === 1011 ? 'Server error' : | ||
| `Connection closed (code: ${event.code})`; | ||
| setWsLastError(errorMessage); | ||
|
|
||
| setWsRetryCount(prev => { | ||
| const newCount = prev + 1; | ||
| if (newCount >= WS_MAX_RETRIES) { | ||
| setWsConnectionState('error'); | ||
| console.log('[Connection] Max retries exceeded'); | ||
| } else { | ||
| setWsConnectionState('reconnecting'); | ||
| console.log(`[Connection] Retrying (${newCount}/${WS_MAX_RETRIES})...`); | ||
| } | ||
| return newCount; | ||
| }); | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the OnlineDebateRoom.tsx file to understand the context
wc -l frontend/src/Pages/OnlineDebateRoom.tsxRepository: AOSSIE-Org/DebateAI
Length of output: 107
🏁 Script executed:
# Look for where ReconnectingWebSocket is instantiated and how it's configured
rg "ReconnectingWebSocket" frontend/src/Pages/OnlineDebateRoom.tsx -A 5 -B 5Repository: AOSSIE-Org/DebateAI
Length of output: 1710
🏁 Script executed:
# Check package.json to understand the exact version of ReconnectingWebSocket
rg "reconnecting-websocket|ReconnectingWebSocket" . --type json -A 2 -B 2Repository: AOSSIE-Org/DebateAI
Length of output: 1740
🏁 Script executed:
# Look for wsRetryCount usage throughout the file to understand state management
rg "wsRetryCount|wsConnectionState" frontend/src/Pages/OnlineDebateRoom.tsxRepository: AOSSIE-Org/DebateAI
Length of output: 333
🏁 Script executed:
# Let's look at the specific code around lines 1153-1172 where onclose is defined
sed -n '1140,1180p' frontend/src/Pages/OnlineDebateRoom.tsxRepository: AOSSIE-Org/DebateAI
Length of output: 1584
🏁 Script executed:
# Check if there are other handlers (onopen, onerror, onmessage) to understand full behavior
rg "rws\.on(open|close|error|message)" frontend/src/Pages/OnlineDebateRoom.tsx -A 2Repository: AOSSIE-Org/DebateAI
Length of output: 629
🏁 Script executed:
# Look for WS_MAX_RETRIES definition
rg "WS_MAX_RETRIES" frontend/src/Pages/OnlineDebateRoom.tsx -B 2 -A 2Repository: AOSSIE-Org/DebateAI
Length of output: 1142
🌐 Web query:
reconnecting-websocket 4.4.0 npm onclose retry behavior maxRetries
💡 Result:
Short answer: on an onclose the library will automatically schedule reconnection attempts until options.maxRetries is reached (default: Infinity). retryCount increments on each attempt; if you set maxRetries to a finite number and it is exhausted the library stops retrying and emits an error indicating the host is down (historical docs show an EHOSTDOWN error). [1][2]
Sources:
- reconnecting-websocket (v4.4.0) README / npm (default options include maxRetries: Infinity). [1]
- pladaria GitHub README (reconnect behavior, maxRetries semantics; docs/examples note emission of EHOSTDOWN when retries exhausted in earlier releases). [2]
Verify potential off-by-one error in retry count logic.
The onclose handler manually increments wsRetryCount to mirror ReconnectingWebSocket's internal retry behavior. While the manual counter is tied to the library's events (not a true "double counting" issue), there's a potential synchronization concern:
The condition if (newCount >= WS_MAX_RETRIES) sets state to 'error' when the counter reaches 5. However, the exact semantics of maxRetries: 5 in ReconnectingWebSocket v4.4.0 matters—it could mean either:
- 5 total connection attempts (1 initial + 4 retries)
- 5 retries after the initial attempt (1 initial + 5 retries = 6 total)
If the library interprets maxRetries as the second case, the manual counter will trigger the 'error' state one attempt early, creating a state mismatch.
Clarify the retry counting logic to ensure wsRetryCount and the library's internal retry limit stay synchronized.
🤖 Prompt for AI Agents
In @frontend/src/Pages/OnlineDebateRoom.tsx around lines 1153 - 1172, The
onclose handler's manual retry counter (setWsRetryCount) can be off-by-one
relative to ReconnectingWebSocket's maxRetries semantics—inspect how
ReconnectingWebSocket v4.4.0 defines maxRetries (whether it's total attempts or
retries after the initial attempt) and adjust the check in rws.onclose
accordingly: either change the comparison from if (newCount >= WS_MAX_RETRIES)
to if (newCount > WS_MAX_RETRIES) or compare against WS_MAX_RETRIES + 1 so
wsRetryCount semantics match the library, or better yet read the library's
actual retry/attempt property (e.g., rws.retryCount or rws.reconnectAttempts if
available) and base the error transition (setting setWsConnectionState('error'))
on that instead of the manual counter to ensure synchronization.
| const handleLeaveDebate = useCallback(() => { | ||
| console.log('[Connection] Leaving debate room'); | ||
| // Stop local media tracks | ||
| if (localStreamRef.current) { | ||
| localStreamRef.current.getTracks().forEach(track => track.stop()); | ||
| } | ||
| if (audioStream) { | ||
| audioStream.getTracks().forEach(track => track.stop()); | ||
| } | ||
| // Clean up connections | ||
| if (wsRef.current) { | ||
| wsRef.current.close(); | ||
| } | ||
| if (pcRef.current) { | ||
| pcRef.current.close(); | ||
| } | ||
| // Navigate away | ||
| navigate('/game'); | ||
| }, [navigate, audioStream]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing cleanup of spectator peer connections.
When leaving the debate, spectatorPCsRef connections are not closed. This could leave orphaned connections.
Suggested fix
const handleLeaveDebate = useCallback(() => {
console.log('[Connection] Leaving debate room');
// Stop local media tracks
if (localStreamRef.current) {
localStreamRef.current.getTracks().forEach(track => track.stop());
}
if (audioStream) {
audioStream.getTracks().forEach(track => track.stop());
}
+ // Clean up spectator connections
+ spectatorPCsRef.current.forEach((spectatorPc) => {
+ try {
+ spectatorPc.close();
+ } catch {
+ // Ignore close errors
+ }
+ });
+ spectatorPCsRef.current.clear();
// Clean up connections
if (wsRef.current) {
wsRef.current.close();
}
if (pcRef.current) {
pcRef.current.close();
}
// Navigate away
navigate('/game');
}, [navigate, audioStream]);🤖 Prompt for AI Agents
In @frontend/src/Pages/OnlineDebateRoom.tsx around lines 1969 - 1987,
handleLeaveDebate fails to close spectator peer connections stored in
spectatorPCsRef; update the handleLeaveDebate callback to iterate
spectatorPCsRef.current (handle Map/object/array shape used in your code), call
.close() on each RTCPeerConnection, remove any event listeners if attached, and
then clear or reset spectatorPCsRef.current to an empty container; keep existing
cleanup for localStreamRef, audioStream, wsRef, and pcRef and ensure you
reference spectatorPCsRef inside the same callback so all spectator connections
are closed before navigate('/game').
This PR resolves the WebSocket connection stability issue in OnlineDebateRoom.
Resolve #192
It adds:
This prevents silent freezes when network drops and allows users to recover without refreshing.
Video Demo
https://drive.google.com/file/d/11zSy58asfaIkNIZh4Y76Jv3Xb98V6gzk/view?usp=sharing
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.