654517304167bb4675aa14ada0a1dac9efe71c9b Convert to if_foreach_llmaddr() KPI.

5ab0c8434a7befeafbaf81e899fc23f660b64c9d ath(4) processing input packets in taskqueue.

6c3e93cb5a4aa4b8a2d8d4d326f2a7c34d3a4458 Use NET_TASK_INIT() and NET_GROUPTASK_INIT()

Add extra ANI fields for the AR9300 HAL.

I'm trying to debug why reception upstairs here is so terrible and it turns out ANI is buggy. (Which is no surprise, ANI is always buggy.)

Tested:
* Carambola2 (AR9331), STA/AP modes

ANI fixes and preparation for userland control.

* The ANI function bitmap was being badly used when determining if a command could be used. In hostap modes only a couple of the ANI control parameters are enabled.
* The ANI function bitmap was not being reset to HAL_ANI_ALL if transitioning from AP -> STA.
* Change mrcCckOff to mrcCck: 1 == on, rather than 1 == off. This matches the API used to set the value from userland via the diagnostic API.
* Handle OFDM/CCK noise immunity level commands in ar9300_ani_control(). These will only come from userland, and it will go and program the rest of the ANI control parameters with the values in the ANI table.
* Ensure all of the ANI parameters can be tweaked at runtime, even if they're disabled.

Tested:
* Carambola2 (AR9331), STA/AP modes

Fix return value check to not complain.

Compilers complain more about things, so let's keep them happy.

Extend the start PCU receive call to handle resetting ANI.

One of the fun issues with scanning has been how the existing ANI values were programmed into the hardware when channels were changed. If you're on a really crappy channel and ANI has made you deaf, then when you scan you continue to be deaf on all channels.

This code passes a flag to startpcureceive which in AR5416 and later is also used to enable ANI. This allows it to know whether it's a normal operation or a scan operation. This fixes my situation at home where a device that temporarily goes deaf due to interference starts scanning and .. can't hear anything until I restart.

Now, this isn't the full fix - ideally: (a) all the ANI config and per-channel information would be migrated to the shared HAL stuff and enabled for all of the NICs; and (b) a knob to reset the ANI parameters when a station reassociates, or on other error conditions (missed beacons, NF calibration failures, etc), would likely help recovery. But hey, I'm committing bits of code again! woo!

Tested:
* AR9344 (2G), STA operation

Fix ANI calibration during non-ACTIVE states; start poking at rate control.

These are some fun issues I've found with my upstairs wifi link at such a ridiculously low signal level (like, < 5dB.)

* Add per-station tx/rx rssi statistics, in potential preparation to use that in the RX rate control.
* Call the rate control code on each received frame to let it potentially use that as a hint for what rates to use. It's a no-op right now.
* Do ANI calibration during scan as well.

The ath_newstate() call was disabling the ANI timer and only re-enabling it during transitions to _RUN. This has the unfortunate side-effect that if ANI deafened the NIC because of interference and it disassociated, it wouldn't be reset and the scan would never hear beacons. The ANI configuration is stored at least globally on some HALs and per-channel on others. Because of this a NIC reset wouldn't help; the ANI parameters would simply be programmed back in.

Now, I have a feeling I also need to do this during AUTH/ASSOC too and maybe, if I'm feeling clever, I need to reset the ANI parameters on a given channel during a transition through INIT or if the VAP is destroyed/re-created. However, for now this gets me out of the immediate weeds with connectivity upstairs (and thus I /can/ commit); I'll keep chipping away at tidying this stuff up in subsequent commits.

Tested:
* AR9344 (Wasp), 2G STA mode
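To make the scan/ANI interaction above a bit more concrete, here is a minimal, hypothetical sketch of the idea: the "start PCU receive" path is told whether it's being called for a scan, and uses that to decide whether to re-apply the stored (possibly deafened) ANI levels or fall back to the default, most sensitive ones. All names here are made up for illustration; this is not the driver's actual HAL interface.

```c
/*
 * Illustrative sketch only: a scan-aware "start PCU receive" path.
 * Hypothetical types and function names; the real HAL interface differs.
 */
#include <stdbool.h>

struct demo_ani_state {
	int	ofdm_noise_immunity_level;
	int	cck_noise_immunity_level;
};

struct demo_hal {
	struct demo_ani_state	ani;	/* stored per-channel/global ANI levels */
};

static void
demo_ani_apply(struct demo_hal *ah, const struct demo_ani_state *st)
{
	/* Program the hardware with the given noise immunity levels. */
	(void)ah;
	(void)st;
}

/*
 * When starting receive for normal operation, restore the tuned ANI
 * levels.  When starting receive for a scan, reset to defaults so a
 * deafened radio can still hear beacons on other channels.
 */
static void
demo_start_pcu_receive(struct demo_hal *ah, bool is_scanning)
{
	static const struct demo_ani_state defaults = { 0, 0 };

	if (is_scanning)
		demo_ani_apply(ah, &defaults);
	else
		demo_ani_apply(ah, &ah->ani);

	/* ... then enable the receive PCU as before ... */
}
```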
Have the final attempted rate in 11n modes be the lowest one.

Right now ath_rate_sample has a fixed rate schedule, rather than the minstrel_ht style "best, good, most reliable" triplet. So, if higher rates are tried then it'll not fall back to a lower MCS rate in that transmission schedule. This means that in low SNR situations it'll not easily drop to MCS0 unless enough transmissions occur to allow rate control to eventually decide to drop; and if it's TCP traffic it'll get slowed down because of packet loss. It's worse for 2-stream and 3-stream rates; it doesn't ever fall back to lower stream rates, and those higher stream rates require higher SNR to work.

So instead let's (for now?) have each of the 11n transmit rates use MCS0 as the last attempt. ath_rate_sample will quickly see that rate succeeds more and will move to it much quicker.

Testing:
* AR9344 (Wasp) - 2G STA mode

Fix queue bits a bit.

Found by PVS Studio: remove a duplicate assignment; add the missing assignment of tqi_compBuf.

Submitted by: <mizhka@gmail.com>
Differential Revision: https://reviews.freebsd.org/D20431

ath_hal: fix typo in ath_hal_printf.

Fix endian macros to work in and out of the kernel tree.

Yes, people shouldn't use bitfields in C for structure parsing. If someone ever wants a cleanup task then it'd be great to remove them from this vendor code and other places in the ar9285/ar9287 HALs. Alas, here we are.

AH_BYTE_ORDER wasn't defined, and neither were the two values it could be. So when compiling ath_ee_print_9300 it'd default to the big endian struct layout and get a WHOLE lot of stuff wrong.

So:
* move AH_BYTE_ORDER into ath_hal/ah.h where it can be used by everyone;
* ensure that AH_BYTE_ORDER is actually defined before using it!

This should work on both big and little endian platforms.

Add some extra data into the rate control lookup.

Right now (well, since I did this in 2011/2012) the rate control code makes some super bad choices for 11n aggregates/rates, and it tracks statistics even more questionably. It's been long enough and I'm now trying to use it again daily, so let's start by:

* telling the rate control code if it's an aggregate or not;
* being clearer about the TID - yes, it can be extracted from the ath_buf, but this way it can be overridden by the caller without changing the TID itself. (This is for doing experiments with voice/video QoS at some point..)
* returning an optional field to limit how long the aggregate is in microseconds. Right now the rate control code supplies a rate table and the ath aggregate-forming code will look at the rate table and limit the aggregate size to 4ms at the slowest rate. Yeah, this is pretty terrible.
* adding some more TODO comments around handling txpower, rate and filtered frames status, so if I continue to have spoons for this I can go poke at it.
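To make those lookup-side additions a bit more concrete, here's a rough, hypothetical sketch of what such an interface could look like. The structures, fields and function name below are illustrative assumptions, not the driver's actual types.

```c
/*
 * Illustrative only: a rate lookup request/result carrying the extra
 * hints described above.  Field and function names are hypothetical.
 */
#include <stdbool.h>
#include <stdint.h>

struct demo_rate_lookup {
	bool		is_aggregate;	/* frame will go out as an A-MPDU */
	uint8_t		tid;		/* TID, possibly overridden by the caller */
	uint32_t	pktlen;		/* likely length of the burst, in bytes */
};

struct demo_rate_result {
	uint8_t		rix[4];		/* rate series to try */
	uint8_t		tries[4];	/* tries per series entry */
	uint32_t	max_aggr_usec;	/* 0 = no limit; else cap A-MPDU airtime */
};

/*
 * A caller forming an aggregate could then clamp the A-MPDU so its
 * on-air duration stays within max_aggr_usec, instead of relying on a
 * fixed "4ms at the slowest rate" rule.
 */
void demo_rate_findrate(const struct demo_rate_lookup *req,
    struct demo_rate_result *res);
```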
Extend ath_rate_sample to better handle 11n rates and aggregates.

My initial rate control code was .. suboptimal. I wanted to at least get MCS rates sent, but it didn't do anywhere near enough to handle low signal level links or remotely keep accurate statistics.

So, 8 years later, here's what I should've done back then.

* Firstly, I wasn't at all tracking packet sizes other than the two buckets (250 and 1600 bytes.) So, extend it to include 4096, 8192, 16384, 32768 and 65536. I may go add 2048 at some point if I find it's useful. This is important for a few reasons. First, when forming A-MPDU or A-MSDU aggregates the frame sizes are larger, and thus the TX time calculation is woefully, increasingly wrong. Secondly, the behaviour of 802.11 channels isn't some fixed thing, both due to channel conditions and the radios themselves. Notably, there were some observations done a few years ago on 11n chipsets which noticed that longer aggregates showed an increase in failed A-MPDU sub-frame reception as you got further along in the transmit time. It could be due to a variety of things - transmitter linearity, channel conditions changing, frequency/phase drift, etc - but the observation was to potentially form shorter aggregates to improve BER.
* .. and then modify the ath TX path to report the length of the aggregate sent, so the statistics kept would line up with the correct bucket.
* Then on the rate control look-up side - I was also only using the first frame length for an A-MPDU rate control lookup, which isn't good enough here. So, add a new method that walks the TID software queue for that node to find out what the likely length of data available is. It isn't ALL of the data in the queue, because we'll only ever send enough data to fit inside the block-ack window, so limit how many bytes we return to roughly what ath_tx_form_aggr() would do.
* .. and cache that in the first ath_buf in the aggregate so it and the eventual A-MPDU length can be returned to the rate control code.
* THEN, modify the rate control code to look at them both when deciding which bucket to attribute the sent frame to. I'm erring on the side of caution and using the size bucket that the lookup is based on.

Ok, so now the rate lookups and statistics are "more correct". However, MCS rates are not the same as 11abg rates in that they're not a monotonically increasing set of faster rates, and you can't assume that just because a given MCS rate fails, the next higher one wouldn't work better or have a lower average tx time. So, I had to do a bunch of surgery to the best rate and sample rate math. This is the bit that's a WIP.

* First, simplify the statistics updates (update_stats()) to do a single pass on all rates.
* Next, make sure that each rate's average tx time is updated based on /its/ failure/success. Eg, if you sent a frame with { MCS15, MCS12, MCS8 } and MCS8 succeeded, MCS15 and MCS12 would have their average tx time updated for /their/ part of the transmission, not the whole transmission.
* Next, EWMA wasn't being fully calculated based on the /failures/ in each of the rate attempts. So, if MCS15 and MCS12 failed above but MCS8 didn't, then ensure that the statistics noted that /all/ subframes failed at those rates, rather than the eventual set of transmitted/sent frames. This ensures the EWMA /and/ average TX time are updated correctly (see the sketch after this list).
* When picking a sample rate and initial rate, probe rates around the current MCS but limit it to MCS0..7 /for all spatial streams/, rather than doing crazy things like hitting MCS7 and then probing MCS8 - MCS8 is basically MCS0 but two spatial streams. It's a /lot/ slower than MCS7. Also, the reverse is true - if we're at MCS8 then don't probe MCS7 as part of it; it's not likely to succeed.
* Fix bugs in pick_best_rate() where I was /immediately/ choosing the highest MCS rate if there weren't any frames yet transmitted. I was defaulting to 25% EWMA and .. then each comparison would accept the higher rate. Just skip those; sampling will fill in the details.
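As a hedged illustration of that per-rate bookkeeping - each attempted rate gets its average TX time and sub-frame EWMA updated from the tries spent at that rate, with a fully failed attempt counting all of its subframes as bad - here's a minimal sketch. It is not the actual update_stats() code; the weights and field names are invented.

```c
/*
 * Illustrative only: per-rate statistics update.  Not the real
 * update_stats(); weights and field names are invented for the sketch.
 */
#include <stdint.h>

struct demo_rate_stats {
	uint64_t	avg_tx_time;	/* running average airtime, microseconds */
	uint32_t	ewma_pct;	/* 0..100, sub-frame success EWMA */
	uint32_t	attempts;	/* number of attempts recorded */
};

/*
 * nframes:    subframes attempted at this rate in this transmission.
 * nbad:       subframes that didn't make it at this rate (all of them,
 *             if the whole attempt at this rate failed).
 * tx_time_us: airtime spent at this rate only, not the whole series.
 */
static void
demo_update_one_rate(struct demo_rate_stats *rs, uint32_t nframes,
    uint32_t nbad, uint32_t tx_time_us)
{
	uint32_t pct;

	if (nframes == 0)
		return;
	pct = (100 * (nframes - nbad)) / nframes;

	if (rs->attempts == 0) {
		rs->ewma_pct = pct;
		rs->avg_tx_time = tx_time_us;
	} else {
		/* 90/10 EWMA of sub-frame success, purely illustrative. */
		rs->ewma_pct = (rs->ewma_pct * 90 + pct * 10) / 100;
		/* Simple smoothed average of this rate's airtime. */
		rs->avg_tx_time = (rs->avg_tx_time * 3 + tx_time_us) / 4;
	}
	rs->attempts++;
}
```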
So, this seems to work a lot better. It's not perfect; I'm still seeing a lot of instability around higher MCS rates because there are bursts of loss/retransmissions that aren't /too/ bad. But I'll keep iterating over this and tidying up my hacks.

Ok, so why is this still something I'm poking at, rather than porting minstrel_ht? ath_rate_sample tries to minimise airtime, not maximise throughput. I have extended it with an EWMA based on sub-frame success/failures - high MCS rates that have partially successful receptions still show super short average frame times, but a /lot/ of retransmits have to happen for that to work. So for MCS rates I also track this EWMA and ensure that the rates I'm choosing don't have super crappy packet failures. I don't mind getting lower peak throughput versus minstrel_ht; instead I want to see if I can make "minimise airtime" work well.

Tested:
* AR9380, STA mode
* AR9344, STA mode
* AR9580, STA/AP mode

le oops, trim out an #if 1 that I didn't fully delete.

Cool, so now I know it's about 3 weeks between starting on freebsd coding and breaking the build again. Queue dunce cap.

Fix logic for determining whether to bump up an MCS rate.

* Fix formatting, cause reasons;
* Put back the "and the chosen rate is within 90% of the current rate" logic;
* Ensure the best rate and the current rate aren't the same; this fixes the packets_since_switch[] tracking to actually count how many frames have been sent since the rate switched, so now I know how stable stuff is; and
* Ensure that MCS can go up to a higher MCS at this or any other spatial stream. My previous quick hack attempt was doing > rather than >=, so you had to go to both a higher root MCS rate (0..7) and a higher spatial stream. Eg, you couldn't go from MCS0 (1ss) to MCS8 (2ss) this way.

The best rate and switching rate logic still have a bunch more work to do because they're still quite touchy when it comes to average tx time, but at least now it's choosing higher rates correctly when it wants to try a higher rate.

Tested:
* AR9380, STA mode
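As a hedged sketch of that "should we move up?" decision - the candidate must not be the current rate, the per-stream MCS index comparison uses >= (so MCS0 -> MCS8 is allowed), and an airtime comparison stands in for the 90% rule - here's one hypothetical shape. The interpretation of the 90% check and all names are assumptions, not the driver's actual logic.

```c
/*
 * Hypothetical sketch only; not ath_rate_sample's actual switch logic.
 */
#include <stdbool.h>
#include <stdint.h>

struct demo_rate {
	uint8_t		mcs;		/* 0..23: base index 0..7 per spatial stream */
	uint64_t	avg_tx_time;	/* average airtime, microseconds */
};

static bool
demo_should_switch_rate(const struct demo_rate *cur, const struct demo_rate *best)
{
	/* Don't count staying on the same rate as a "switch". */
	if (best->mcs == cur->mcs)
		return (false);

	/*
	 * Compare the per-stream MCS index with >= rather than >, so a
	 * candidate like MCS8 (same base index as MCS0, more streams) is
	 * still considered.
	 */
	if ((best->mcs % 8) < (cur->mcs % 8))
		return (false);

	/*
	 * "Within 90% of the current rate", interpreted here as the
	 * candidate needing an average airtime no worse than 90% of ours.
	 */
	return (best->avg_tx_time * 10 <= cur->avg_tx_time * 9);
}
```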
Limit the tx schedules for A-MPDU; don't take short retries into account; and remove the requirement that the MCS rate is "higher" if we're considering a new rate.

Ok, another fun one.

* In order for reliable non-software-retried higher MCS rates, the TX schedules (inconsistently!) use hard-coded lower rates at the end of the schedule. Now, hard-coded is a problem because (a) it means that aggregate formation is limited by the SLOWEST rate, so I never formed large A-MPDU frames for 3-stream rates, and (b) if the AP disables lower rates as base rates, it complains about "unknown rix" for every frame you transmit at that rate. So, for now just disable the third and fourth schedule entry for A-MPDUs. Now I'm forming 32k and 64k aggregates for the higher density MCS rates much more reliably. It would be much nicer if the rate schedule stuff wasn't fixed but instead I'd just populate ath_rc_series[] when I fetch the rates. This is all a holdover of ye olde pre-11n stuff and I really just need to nuke it. But for now, ye hack.
* The check for "is this MCS rate better" based on MCS itself is just garbage. It meant things like going MCS0->7 would be fine, and say 0->8->16 is fine (as they're equivalent encoding but 1, 2, 3 spatial streams), BUT it meant going something like MCS7->11 would fail, even though it's likely that MCS11 would just be better, both for EWMA/BER and throughput. So for now just use the average tx time. The "right" way for this comparison would be to compare PHY bitrates rather than MCS / rate indexes, but I'm not yet there. The bit rates ARE available in the PHY index, but honestly I have a lot of other cleaning up to do here before I think about that.
* Don't include the RTS/CTS retry count (and thus time) in the average tx time calculation. It just makes temporary failures make the rate look bad by QUITE A LOT, as RTS/CTS exchanges are (a) long, and (b) mostly irrelevant to the actual rate being tried. If we keep hitting RTS/CTS failures then there's something ELSE wrong on the channel, not our selected rate.

Report the correct status when completing frames with short failures.

My previous logic was a bit wrong. This caused transmissions that failed due to a mix of short and long retries to count intermediate rates as OK if the LONG retry count indicated some retries had made it to this intermediate rate, but the SHORT retry count was the one that caused the whole transmit to fail. Now status is passed in again - and this is the status for the whole transmission - and then update_stats() does some quick math to see if the current transmission series hit its long retry count or not before updating things as a success or failure.

Obey the maximum frame length even when using static rates.

I wasn't enforcing the maximum packet length when using static rates, so although the driver was enforcing it itself OK, the statistics were sometimes going into the wrong bin.

Tested:
* AR9380, STA mode

Reset the hardware if this particular MAC bug is seen.

I have to dig into why I'm seeing it on chips as late as the AR9380 era stuff (as it's marked as an AR5416 bug, but who knows!), but I'm seeing aggregate TX frames complete with no blockack bit set. So, everything should be treated as a failure and do a hardware reset for good measure.

Tested:
* AR9380, STA mode
* AR9580 (5GHz), AP mode

Hopefully recover better-er upon RX restart on AR9380.

This is all very long-standing bug stuff that is touchy and still poorly documented. Ok, here goes.

The basic bug:
* Deleting a VAP causes the RX path (and TX path too) to be restarted without a full chip reset, which causes RX hangs on the AR9380 and later (ie, the ones with the newer DMA engine.)

The basic fix:
* Do an RX flush when stopping RX in ath_vap_delete() to match what happens when RX is stopped elsewhere. This ensures any pending frames are completed and we restart at the right spot; it also ensures we don't push new RX buffers into the hardware if we're stopping receive.

The other issues I found:
* Don't bother checking the RX packet ring in the deferred read taskqueue; that's specifically supposed to be for completing frames rather than just yanking them off the receive ring.
* Cancel/drain any pending deferred read taskqueue. This isn't done inside any locks, so we should be super careful here. This stops the hardware being reprogrammed at the same time in another thread/CPU whilst we're stopping RX.
* .. (yes, this should be better serialised, but that's for another day. maybe.)
* Add more debugging to trace what's going on here.

And the fun bit:
* Reinitialise the RX FIFO ONLY if we've been reset or stopped, rather than just reset.

I noticed that after all the above was done I was STILL seeing RXEOL. RXEOL isn't enabled on the AR9380, so I'd only see it if I was sending TX frames (ie, a ping where it'd be transmitted but never received), so I was not being spammed by RXEOL. So, as long as stuff is stopped, restart it. This seems to be doing the right thing in both AP and STA modes.
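As a hedged illustration of the stop-side ordering described above - drain any deferred read work first, then stop RX, then flush whatever is on the ring without pushing fresh buffers back to the hardware - here's a rough sketch with hypothetical names. It is not the driver's actual code path.

```c
/*
 * Illustrative sketch only: ordering for stopping receive safely.
 * All names are hypothetical stand-ins, not the driver's real API.
 */
struct demo_softc;

void demo_rx_task_drain(struct demo_softc *sc);	/* wait for deferred RX work */
void demo_rx_stop_dma(struct demo_softc *sc);	/* stop the RX DMA engine */
void demo_rx_flush(struct demo_softc *sc, int dopending); /* complete/drop queued frames */

static void
demo_rx_stop(struct demo_softc *sc)
{
	/*
	 * 1. Make sure no deferred RX processing is still running (or about
	 *    to run) and reprogramming the hardware underneath us.
	 */
	demo_rx_task_drain(sc);

	/* 2. Stop the RX engine so no new descriptors are consumed. */
	demo_rx_stop_dma(sc);

	/*
	 * 3. Flush what is already on the ring so completion state stays
	 *    consistent, without handing fresh RX buffers back to hardware.
	 */
	demo_rx_flush(sc, 1);
}
```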
What I should do next, if I ever get time:
* as I said above, serialise the receive stop/start to include the taskqueues;
* monitor RXEOL on the AR9380 and, if I keep seeing it spammed / seeing lockups, just go do a full chip reset to get things back on track. It sucks, but it is better than nothing.

Tested:
* AR9380 AP/STA mode, adding/deleting a hostap VAP to trigger the TX/RX queue stop/start, whilst also running an iperf through it. Lots of times. Lots. Of.. Times.

Propagate the HAL_RESET_TYPE through to the chip reset; set it during ath_reset().

Although I added the reset type field to ath_hal_reset() years ago, I never finished adding it throughout both the HALs and if_ath.c. This will eventually deprecate the ath_hal force_full_reset option because it can be requested at the driver layer.

So:
* Teach ar5416ChipReset() and ar9300_chip_reset() about the HAL reset type
* Use it in ar5416Reset() and ar9300_reset() when doing a full chip reset
* Extend ath_reset() to include the HAL_RESET_TYPE parameter added in the above functions
* Use HAL_RESET_NORMAL in most calls to ath_reset()
* .. but use HAL_RESET_BBPANIC for the BB panics, and HAL_RESET_FORCE_COLD during fatal errors, beacon misses and other hardware related hangs.

This should be a glorified no-op outside of actual hardware issues. I've tested things with ath_hal force_full_reset set to 1 for years now, so I know that feature and a full reset work (albeit much slower than a warm reset!) and that it does unwedge hardware.

The eventual aim is to use this for all the places where the driver detects a potential hang, as well as if long calibration - ie, noise floor calibration - fails to complete. That's one of the big hardware related things that causes station mode operation to hang without easy recovery.

Differential Revision: https://reviews.freebsd.org/D24981

Update ath_rate_sample to use the same base type as ticks.

Until net80211 grows a specific ticks type that matches the system, manually use the same type as the kernel/net80211 'ticks' type (signed int.)

Tested:
* AR9380, STA mode

Don't re-program the beacon timers if we miss a beacon in software-beacon STA mode.

This is something I added a few years ago to handle resyncing the beacon if we miss a beacon or need to sync after association/reassociation/powersave. However, if we're doing STA+AP mode (eg DWDS) then we don't want to reprogram the beacons here; this may upset normal AP operation. I missed checking for the sc->sc_swbmiss flag, so I was reinitialising the beacon timers after every beacon miss / TSFOOR interrupt, and that isn't likely good.

This, plus ensuring that STAs are created with "-beacon" to disable BMISS/TSFOOR processing, will hopefully quieten some of the issues I've seen with missed beacons / TSFOOR (out of range) interrupts coming in when operating in STA mode.

Tested:
* AR9380/AR9580, STA+AP modes
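As a minimal, hypothetical sketch of that guard (not the actual if_ath.c code): skip the beacon-timer resync in the beacon-miss path when software beacon-miss handling is in use, since that's the STA+AP case where reprogramming can upset the AP side. Only the sc_swbmiss name comes from the commit text; everything else is invented.

```c
/*
 * Illustrative sketch only; names besides sc_swbmiss are invented.
 */
#include <stdbool.h>

struct demo_softc {
	bool	sc_swbmiss;	/* software beacon-miss handling in use */
};

/* Hypothetical helper that reprograms the STA beacon timers. */
void demo_beacon_config(struct demo_softc *sc);

static void
demo_handle_beacon_miss(struct demo_softc *sc)
{
	/*
	 * In software beacon-miss mode (e.g. STA+AP / DWDS), don't
	 * reprogram the beacon timers on every miss; it may upset the
	 * hostap side sharing the hardware.
	 */
	if (sc->sc_swbmiss)
		return;

	demo_beacon_config(sc);
}
```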
Don't update the beacon bits from beacon frames in hostap mode.

This logic is running the beacon receive bits in STA+AP mode on both the STA and AP side. The STA side sees its beacons from the BSS fine; the AP side is seeing other beacons on the same channel, but with the BSS node, for some odd reason. (I think it's a valid reason, but I currently forget what that valid reason is.)

So, just to be cleaner about things, don't run the nexttbtt/etc bits at all if we're in hostap mode. If I ever get mesh working then maybe I'll make sure it works right in mesh+ap and mesh+sta modes.

Whilst here, log the VAP I'm being called on to make it clearer what is going on. I may end up adding a VAP dprintf version of this at some point.

Tested:
* AR9380, STA (DWDS client) + hostap on the same NIC

Add KeyMiss for AR5212/AR5416 series chips.

This is a flag from the MAC that says the received packet didn't match a keycache slot. This isn't technically a problem, as WEP keys don't match keycache slots (they're "global" keys), but it could be useful for tracking down CCMP decryption failures.

Right now it's a no-op - it mirrors what the AR9300 HAL does and just increments a counter. But, hey, maybe one day I'll use it for diagnosing keycache/CCMP decrypt issues.

ath: clean up empty lines in .c and .h files.

Replace the hard-coded magic values in if_athioctl.h with constant defines.

Replace some hard-coded magic values in the ioctl stats struct with defines. I'm going to follow up with some more sanity checking in the receive path that also uses these values, so we don't do bad things if the hardware is (more) confused.

Don't use hard-coded values in the sanity check.

Don't use hard-coded values in the phy error and receive antenna checks; also remove the magic size value here for the transmit antenna statistics.

Change-Id: Ifebebef068d9abe7abfa61204dc900095bd6914c
Reviewed-on: https://review.haiku-os.org/c/haiku/+/3921
Reviewed-by: Adrien Destugues <pulkomandy@gmail.com>
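As a rough illustration of the magic-number cleanup in the last few entries above: replace bare array sizes in the ioctl stats struct with named constants, then reuse those constants when sanity-checking indices in the receive path. The names and sizes below are hypothetical, not the actual if_athioctl.h contents.

```c
/*
 * Hypothetical sketch only; not the real if_athioctl.h definitions.
 */
#include <stdint.h>

#define	DEMO_IOCTL_STATS_NUM_RX_PHYERR	64	/* previously a bare numeric size */
#define	DEMO_IOCTL_STATS_NUM_ANTENNA	8	/* previously a bare numeric size */

struct demo_ioctl_stats {
	uint32_t	ast_rx_phyerr[DEMO_IOCTL_STATS_NUM_RX_PHYERR];
	uint32_t	ast_ant_rx[DEMO_IOCTL_STATS_NUM_ANTENNA];
	uint32_t	ast_ant_tx[DEMO_IOCTL_STATS_NUM_ANTENNA];
};

/* Receive path: clamp indices coming from (possibly confused) hardware. */
static void
demo_count_phyerr(struct demo_ioctl_stats *st, unsigned int phyerr_code)
{
	if (phyerr_code < DEMO_IOCTL_STATS_NUM_RX_PHYERR)
		st->ast_rx_phyerr[phyerr_code]++;
}
```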
Haiku
Homepage | Mailing Lists | IRC Channels | Issue Tracker | API docs
Haiku is an open-source operating system that specifically targets personal computing. Inspired by the BeOS, Haiku is fast, simple to use, easy to learn and yet very powerful.
Goals
- Sensible defaults with minimal configuration required.
- Clean, clear, concise code.
- Unified desktop environment.
Trying Haiku
Haiku provides pre-built nightly images and release images. Haiku is compatible with a large variety of hardware, but in case you don't want to "take the plunge" and install Haiku on bare metal, you can install it on a virtual machine (VM) instead. If you've never used a VM before, you can follow one of the "Emulating Haiku" guides.
Compiling Haiku
See ReadMe.Compiling.
Contributing
Haiku is a meritocratic open source project with a large variety of tasks. Even if you can't write code, you can still help! Haiku needs designers, (technical) writers, translators, testers... Get involved and help out!
Contributing code
If you're submitting a patch to us, please make sure you're following the patch submitting guidelines.
If you're having trouble finding something in the source tree, you can use one of our web-based source code browsers:
- https://xref.landonf.org/ (OpenGrok, provided by Landon Fuller)
- https://git.haiku-os.org/ (git, provided by Haiku, Inc.)
Contributing documentation
The main piece of documentation that still needs work is the API documentation (found in the tree at docs/user). Just find an undocumented class, write documentation for it, and submit a patch.
Contributing translations
See wiki:i18n.
Contributing software ports
See HaikuPorts.
Contributing to our infrastructure
See Infrastructure.