[PW_SID:1028648] Improve roam scan strategy #483
IWDTestBot wants to merge 18 commits into workflow
This is taken care of by the individual cache items and if none exist, tar fails.
Return the number of freqs in a scan_freq_set, useful for knowing how big the set is.
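A minimal Python sketch of what such a size helper computes. The layout used here (a bitmask for 2.4GHz channels plus separate sets for 5GHz and 6GHz channels) and the names FreqSet and freq_set_size are assumptions made for illustration; the actual change is the C code in src/util.c and is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FreqSet:
    """Hypothetical stand-in for a scan frequency set (layout is assumed)."""
    channels_2ghz: int = 0  # bit (n - 1) set means 2.4GHz channel n is present
    channels_5ghz: Set[int] = field(default_factory=set)
    channels_6ghz: Set[int] = field(default_factory=set)


def freq_set_size(freqs: FreqSet) -> int:
    """Return the total number of frequencies held in the set."""
    count_2ghz = bin(freqs.channels_2ghz).count("1")  # popcount of the bitmask
    return count_2ghz + len(freqs.channels_5ghz) + len(freqs.channels_6ghz)
```

A caller could use the result to decide whether a set is empty or how many channels a scan request would cover.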
Splits the scan frequencies into more subsets that have been ordered such that the more common frequencies appear first, and the more uncommon frequencies last. Non-DFS channels are also added to the earlier subsets to prioritize fast-scanning frequencies. This approach allows iwd to scan the frequencies with the statistically highest chance for BSSes first, resulting in shorter scan times until a good BSS is found. In future patches this will also be used when roaming.
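The ordering idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the popularity scores, the subset size, the helper names, and the rule of placing every non-DFS frequency ahead of every DFS frequency are all hypothetical, and the DFS range used (5260 to 5720 MHz) is a simplification of what is typical in many regulatory domains. The patch itself may weigh these factors differently.

```python
from typing import Dict, List


def is_dfs(freq_mhz: int) -> bool:
    # Assumption: treat 5260-5720 MHz (5GHz channels 52-144) as DFS.
    return 5260 <= freq_mhz <= 5720


def build_subsets(popularity: Dict[int, int], subset_size: int) -> List[List[int]]:
    """Order freqs (non-DFS first, then most common first) and chunk them."""
    ordered = sorted(popularity, key=lambda f: (is_dfs(f), -popularity[f], f))
    return [ordered[i:i + subset_size]
            for i in range(0, len(ordered), subset_size)]
```

Scanning the resulting subsets in index order visits the fast, commonly used frequencies first and leaves the slower DFS frequencies for later scans.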
When iwd decides to roam, it scans either neighbor freqs or known freqs (if neighbors are not available). If it fails to roam after getting the scan results, it scans ALL freqs. There's a high chance that both neighbor and/or known scans fail to roam and we end up scanning all freqs. This is very slow and if you're already moving away from the current BSS, there's a high chance you will lose connection completely before the scan is finished.

Instead of scanning all freqs at once, use the already-defined subsets to optimize the scans. The subsets contain a handful of freqs each and are ordered to increase the chance of finding a good BSS early. In order to not scan the same freq multiple times, use a list (scanned_freqs) to keep track of which freqs have been scanned in the current roam attempt.

When a roam scan is triggered, add the most prioritized freqs to the list of freqs that should be scanned. The order of priority is:
1. Neighbor freqs
2. Known freqs
3. Subsets, starting with index 0 and incrementing if the subset is exhausted

A freq is only added to the scan if it has not yet been scanned in the current roam attempt. An exception to this are neighbor freqs. They have a higher chance of containing good BSSes, so they're scanned every 3rd scan (defined by STATION_SCANS_BEFORE_NEIGHBOR_SCAN).

This approach results in more scans, but fewer freqs per scan, leading to shorter delays between scan results. It also avoids scanning the same freqs back-to-back, which is generally not very useful. In combination with the subset ordering, this increases the chance of finding a good BSS early and results in better roaming performance.
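The selection logic described above can be modeled roughly as in the Python sketch below. It is one possible reading of the commit message, not the C implementation: the class name RoamScanPlanner, its methods, and the exact rule for when neighbor freqs are re-added (every scan whose zero-based index is a non-zero multiple of 3) are assumptions. Only the constant mirrors STATION_SCANS_BEFORE_NEIGHBOR_SCAN.

```python
from typing import Iterable, List, Set

# Mirrors STATION_SCANS_BEFORE_NEIGHBOR_SCAN from the commit message.
SCANS_BEFORE_NEIGHBOR_SCAN = 3


class RoamScanPlanner:
    """Hypothetical model of per-roam-attempt frequency selection."""

    def __init__(self, neighbors: Iterable[int], known: Iterable[int],
                 subsets: Iterable[Iterable[int]]) -> None:
        self.neighbors = list(neighbors)
        self.known = list(known)
        self.subsets = [list(s) for s in subsets]
        self.scanned: Set[int] = set()  # plays the role of scanned_freqs
        self.scan_count = 0

    def next_scan(self) -> List[int]:
        freqs: List[int] = []
        # Exception to the no-rescan rule: revisit neighbors every 3rd scan.
        if self.scan_count > 0 and \
                self.scan_count % SCANS_BEFORE_NEIGHBOR_SCAN == 0:
            freqs.extend(self.neighbors)
        # Take the highest-priority tier that still has unscanned freqs.
        for tier in [self.neighbors, self.known, *self.subsets]:
            fresh = [f for f in tier
                     if f not in self.scanned and f not in freqs]
            if fresh:
                freqs.extend(fresh)
                break
        self.scanned.update(freqs)
        self.scan_count += 1
        return freqs
```

In this model each call to next_scan() returns a short list, so results arrive sooner than with a single scan of all freqs, and an empty list signals that every tier is exhausted for the current roam attempt.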
Reaching the full-roam-scan event now requires scanning several subsets which takes a bit longer, so the default timeout of 10 isn't long enough anymore.
These tests can't be guaranteed to reach a full scan anymore, as iwd will find the BSS before that happens. Getting the roaming and connected events should be enough to pass these tests.
This solves an issue where test_ignore_candidate_list_quirk() uses set_value() to set a vendor element telling the station that the candidate list should be ignored, and that config was never cleared in teardown. This resulted in subsequent tests ignoring the candidate list and doing a normal scan. This caused them to connect to a BSS other than the one in the candidate list, causing the tests to fail. When calling default(), it first resets all configs to default, then calls reload(). So we still get the reload() call as before, with the added benefit of also clearing the configs. This makes the tests pass.
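The difference between the two teardown behaviors can be illustrated with a small Python stand-in. FakeApConfig and its attributes are hypothetical and only mimic the behavior described above (default() resets all values, then calls reload()); the key and value used in the usage note are placeholders, and this is not the hostapd.py helper from the test suite.

```python
from typing import Dict, Optional


class FakeApConfig:
    """Hypothetical stand-in for an AP control wrapper used in tests."""

    def __init__(self, defaults: Dict[str, str]) -> None:
        self._defaults = dict(defaults)
        self._values = dict(defaults)
        self.reload_count = 0

    def set_value(self, key: str, value: str) -> None:
        self._values[key] = value

    def get_value(self, key: str) -> Optional[str]:
        return self._values.get(key)

    def reload(self) -> None:
        # Re-applies whatever is currently configured; clears nothing.
        self.reload_count += 1

    def default(self) -> None:
        # Reset every value to its default first, then reload as before.
        self._values = dict(self._defaults)
        self.reload()
```

With this model, a value set by one test (for example set_value("vendor_elements", "dd00")) survives a teardown that only calls reload(), but is cleared by a teardown that calls default().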
This test no longer triggers a full scan. Instead, we verify that it doesn't scan the bad candidate channel, which is verified by not getting "no-roam-candidates". Then we also verify that it roams to the other available BSS, which it should do immediately because that BSS is on a channel that's in the first subset of channels to be scanned. This means it should roam directly without getting "no-roam-candidates" in-between. To allow for this, add an optional "disallow" list of events to wait_for_event(). This functionality already exists in hostapd.py, so the same solution was copied to iwd.py.
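A rough Python sketch of the "disallow" idea follows. It is a simplified model that reads events from an in-memory sequence; the function signature, the DisallowedEvent exception, and the use of TimeoutError when the stream ends are assumptions for illustration and do not reflect the actual wait_for_event() in iwd.py, which waits on live events with a timeout.

```python
from typing import Iterable, Sequence


class DisallowedEvent(Exception):
    """Raised when a forbidden event shows up before the wanted one."""


def wait_for_event(events: Iterable[str], wanted: str,
                   disallow: Sequence[str] = ()) -> bool:
    for name in events:
        if name in disallow:
            # Fail fast: the test must not see this event while waiting.
            raise DisallowedEvent(name)
        if name == wanted:
            return True
    # Hypothetical: an exhausted stream stands in for a timeout.
    raise TimeoutError("event %r was never seen" % wanted)
```

For the test described above, waiting for "roaming" with disallow=["no-roam-candidates"] passes only if the station roams without reporting "no-roam-candidates" in between.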
The scan subsets now contain both 2.4GHz and 5GHz channels. This seems to cause an issue when MAC randomization is used together with the simulated hardware. The AP(s) in these tests operate on 2.4GHz. From the logs it's clear that the station's probe request to the 2.4GHz AP does arrive, and the AP responds, but the station never processes the response. If the AP is changed to 5GHz then everything works fine. I believe this is because 2.4GHz is scanned first, then there's a MAC change, then 5GHz is scanned. For some reason, this causes the simulated hardware to miss the response from the first scan, possibly because the MAC address doesn't match.
CI checks: Fetch PR, Prep - Setup ELL, Make Distcheck, Build - Configure, Make Check, Make Check w/Valgrind, Incremental Build with patches
CI checks: Fetch PR, GitLint, Prep - Setup ELL, Make Distcheck, Build - Configure, Make Check, Make Check w/Valgrind, Incremental Build with patches, Autotest Runner, Clang Build
34cec50 to 720e28a
720e28a to d0811d4
Return the number of freqs in a scan_freq_set, useful for knowing how
big the set is.
src/util.c | 11 +++++++++++
src/util.h | 1 +
2 files changed, 12 insertions(+)