AP / hostapd: IoT Device Authentication Limbo Bug #6975

@dacaito

Describe the bug

I think this belongs here; if not, please point me to the right place and I'll refile it there. I believe this is the same issue as #6876.

hostapd: IoT Device Authentication Limbo Bug

Issue Summary

IoT devices get stuck in "authentication limbo" after power cycling, where they appear connected to WiFi but cannot communicate. The device remains in this state indefinitely until manually removed from the access point.

Problem Description

When IoT devices reconnect after a brief disconnection (power cycle, signal loss), they can enter a state where:

  • Device appears connected in iw station dump
  • Device cannot communicate (ping fails, no data flow)
  • RX byte counter remains static despite "connected" status
  • Auto-disconnect timeouts do not trigger
  • Only manual station removal via iw dev <interface> station del <mac> restores connectivity

Root Cause

  1. Trigger: Device power cycle or brief disconnection
  2. Authentication Loop: Device sends authentication/association requests during reconnection
  3. State Mismatch: hostapd believes the session is still valid and ignores requests
  4. Activity Reset: Device activity timer gets reset by these unanswered requests
  5. Limbo State: Normal inactivity timeouts don't apply because WiFi link appears "active"

Technical Details

# Device appears connected but RX counter is frozen
$ iw dev wlan0 station get xx:xx:xx:xx:xx:xx
Station xx:xx:xx:xx:xx:xx (on wlan0)
    inactive time:  300 ms
    rx bytes:       12345  # ← This number stops incrementing
    tx bytes:       67890
    signal:         -45 dBm

# Communication fails
$ ping <device_ip>
PING <device_ip>: No route to host

# Manual fix works immediately
$ sudo iw dev wlan0 station del xx:xx:xx:xx:xx:xx
# Device reconnects successfully within 60 seconds
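
The frozen-counter symptom above can be checked mechanically by sampling "rx bytes" twice. This is a minimal sketch, not a tested tool: the helper names rx_bytes_from_iw and rx_is_frozen and the 10-second default interval are assumptions made for illustration. It relies only on the iw output format shown above.

```shell
#!/bin/sh
# Extract the "rx bytes" value from `iw dev <iface> station get <mac>`
# output supplied on stdin. Prints nothing if the line is absent.
rx_bytes_from_iw() {
    awk '$1 == "rx" && $2 == "bytes:" { print $3; exit }'
}

# Hypothetical helper: sample the counter twice, <interval> seconds apart.
# Exit status 0 means the counter did not move; non-zero means it advanced
# or the station was not found. Runs in a subshell so variables stay local.
rx_is_frozen() (
    iface=$1; mac=$2; interval=${3:-10}
    before=$(iw dev "$iface" station get "$mac" | rx_bytes_from_iw)
    sleep "$interval"
    after=$(iw dev "$iface" station get "$mac" | rx_bytes_from_iw)
    [ -n "$before" ] && [ "$before" = "$after" ]
)
```

Example use: rx_is_frozen wlan0 xx:xx:xx:xx:xx:xx 30 && echo "RX counter stalled". A healthy station that happens to be idle for the whole interval will also report as frozen, so a longer interval gives a more meaningful result.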

Packet Capture Analysis

Packet sniffer analysis using Wireshark confirms the root cause:

  1. Device sends authentication requests: IoT device transmits 802.11 authentication frames during reconnection attempts
  2. hostapd ignores requests: No authentication response frames are sent back to the device
  3. Device retries periodically: Authentication requests continue at regular intervals
  4. No response ever sent: hostapd fails to respond to any authentication attempts
  5. Manual station removal fixes it: After iw station del, subsequent authentication requests receive proper responses

This packet-level evidence confirms that hostapd's station state management is preventing it from processing authentication requests from devices it believes are already connected.
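
For anyone who wants to repeat this analysis on their own capture, the authentication frames can be isolated with a Wireshark display filter. The sketch below only builds the filter string; the function name auth_frame_filter and the capture file name in the usage note are assumptions. Management subtype 0x000b is the 802.11 authentication frame.

```shell
#!/bin/sh
# Build a Wireshark/tshark display filter matching 802.11 authentication
# frames (type_subtype 0x000b) in which the given MAC appears in any
# address field.
auth_frame_filter() {
    printf 'wlan.fc.type_subtype == 0x000b && wlan.addr == %s\n' "$1"
}
```

With a capture taken on a monitor-mode interface and saved as capture.pcapng (hypothetical name), tshark -r capture.pcapng -Y "$(auth_frame_filter xx:xx:xx:xx:xx:xx)" lists requests and responses together, so missing responses from the AP stand out.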

Steps to reproduce the behaviour

Reproduction Steps

  1. Set up Linux WiFi access point using hostapd
  2. Connect IoT device and verify normal operation
  3. Power cycle the IoT device (unplug for 10+ seconds)
  4. Wait for device to attempt reconnection
  5. Observe: Device shows as connected but communication fails
  6. Monitor: RX byte counter remains static

Device(s)

Raspberry Pi 5

System

Environment

  • hostapd: v2.10 (affects multiple versions)
  • Kernel: Linux 6.12+ (likely affects earlier versions)
  • WiFi Driver: brcmfmac (likely affects other drivers)
  • Affected Devices: Confirmed on various IoT devices (smart switches, sensors)

Expected vs Actual Behavior

Expected: Device reconnects normally after power cycle
Actual: Device enters authentication limbo requiring manual intervention

Logs

No response

Additional context

Why Standard Timeouts Don't Work

The 30-second auto-disconnect mechanism fails because:

  • Authentication requests are treated as "activity"
  • Device appears to have an active WiFi connection
  • Inactivity timers reset due to ongoing authentication attempts
  • No timeout mechanism accounts for stalled data flow
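
When comparing setups, it may help to record which inactivity-related options the AP is actually running with. A small sketch, assuming the configuration lives at a path such as /etc/hostapd/hostapd.conf (the path and the function name show_inactivity_opts are assumptions); the three option names are taken from hostapd's example configuration file.

```shell
#!/bin/sh
# Print the inactivity-related options that are explicitly set in a hostapd
# configuration file. Commented-out lines are skipped; options that are not
# listed are left at hostapd's built-in defaults.
show_inactivity_opts() {
    grep -E '^(ap_max_inactivity|skip_inactivity_poll|disassoc_low_ack)=' "$1"
}
```

Example use: show_inactivity_opts /etc/hostapd/hostapd.conf. Empty output means none of the three options is set explicitly.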

Proposed Solution

hostapd should properly handle authentication/association requests from devices attempting to reconnect, even when it believes an existing session is still valid. Specifically:

  1. Detect reconnection attempts: When a station sends authentication requests but has stalled RX activity
  2. Invalidate stale sessions: Clear existing station state before processing new authentication
  3. Respond to authentication: Process the authentication request normally instead of ignoring it
  4. Reset activity timers: Ensure inactivity timeouts work correctly for reconnecting devices

Current Workaround

Since hostapd doesn't handle this scenario properly, a workaround is to monitor station RX byte counters and automatically remove stations with stalled data flow:

# Monitor RX activity every 20 seconds
# Remove station if no RX progress for 60+ seconds
if [ "$current_rx" -eq "$previous_rx" ] && [ "$stall_time" -ge 60 ]; then
    iw dev "$interface" station del "$mac_address"
fi

This forces a clean reconnection by removing the stalled station state.
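
A fuller version of that watchdog could look like the sketch below. It is one possible way to structure it under stated assumptions, not the exact script in use: the function names, the per-station state files in a temporary directory, and the defaults (20-second poll, 60-second limit) are illustrative choices. Only iw dev ... station dump and iw dev ... station del are used against the system.

```shell
#!/bin/sh
# Read `iw dev <iface> station dump` output on stdin and print one
# "<mac> <rx_bytes>" line per station.
station_rx_pairs() {
    awk '$1 == "Station" { mac = $2 }
         $1 == "rx" && $2 == "bytes:" { print mac, $3 }'
}

# Update stall bookkeeping for one station and print its stalled seconds.
# State is kept in <state_dir>/<mac> as "<last_rx> <stalled_seconds>".
# Args: state_dir mac current_rx interval. Subshell keeps variables local.
update_stall() (
    state_file="$1/$2"; current_rx=$3; interval=$4
    previous_rx=""; stall_time=0
    [ -f "$state_file" ] && read -r previous_rx stall_time < "$state_file"
    if [ "$current_rx" = "$previous_rx" ]; then
        stall_time=$((stall_time + interval))
    else
        stall_time=0
    fi
    printf '%s %s\n' "$current_rx" "$stall_time" > "$state_file"
    printf '%s\n' "$stall_time"
)

# Poll forever; delete any station whose RX counter has not moved for
# <limit> seconds or more. Args: interface [interval] [limit].
watch_stations() {
    interface=$1; interval=${2:-20}; limit=${3:-60}
    state_dir=$(mktemp -d) || return 1
    while :; do
        iw dev "$interface" station dump | station_rx_pairs |
        while read -r mac rx; do
            stalled=$(update_stall "$state_dir" "$mac" "$rx" "$interval")
            if [ "$stalled" -ge "$limit" ]; then
                iw dev "$interface" station del "$mac"
                rm -f "$state_dir/$mac"   # start fresh after reconnect
            fi
        done
        sleep "$interval"
    done
}
```

Run as root with watch_stations wlan0. Like the snippet above, this cannot tell a stuck station from a healthy one that has simply sent nothing for 60 seconds, so quiet devices will be forced to reconnect as well.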

Impact

  • Severity: High for IoT/embedded deployments
  • Scope: Any Linux access point with IoT devices
  • Frequency: Occurs on every device power cycle/reconnection
  • Workaround: Manual station removal or automated RX monitoring
