View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0005229||unreal||ircd||public||2019-03-11 17:58||2019-03-11 18:48|
|Summary||0005229: Some issues when you have a ton of glines|
|Description||1) As I mentioned on IRC before, if you have 10k+ glines and you do /stats G, you're very likely to be killed with "Max SendQ exceeded". Perhaps sending it in batches would work, maybe with a configurable setting like set::gline-batch-size or something similar. =] I've been thinking about a proper way to resume the dump (to prevent the race condition you mentioned) and the only "safe" way I could think of was to duplicate the aTKLine list, but that seems pretty damn wasteful. :D An easier way (I guess) would be to simply not check the sendQ for stats/gline output, although that might cause problems with processing all that data on the client side. I'd go for an explicit flag for the latter option, like an operpriv or something in a class block (à la nofakelag).|
2) Similarly, if a lot of these glines expire simultaneously the IRCd takes hours to walk through them all. This leads to a lot of timeshifting between IRCds, which in turn causes netsplits and potentially re-setting of previously removed glines (thereby causing a loop of sorts). We actually had 50k glines which started getting removed around 7 AM, but it was still going at 7 *PM* so I decided "fuck it I'll just stop all IRCds at once". :>
I set the severity to major because it causes an entirely unusable network. I also think there's a high probability of many other networks experiencing the same issues at some point, not to mention the spam waves seem to increase in "strength" as time goes by so the risk only goes up. :D
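The batching idea from point 1 can be sketched roughly as follows. This is Python rather than the IRCd's C, and every name in it is hypothetical: `batch_size` stands in for the proposed set::gline-batch-size, and the list copy is the "duplicate the list" approach the description calls wasteful. It only illustrates why a snapshot keeps the dump safe when the live list changes halfway through.

```python
from itertools import islice

def batches(entries, batch_size):
    """Yield lists of at most batch_size entries from a snapshot of entries."""
    # Copy first, so later additions or removals cannot affect the dump.
    snapshot = list(entries)
    it = iter(snapshot)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk

# Hypothetical data: 10,000 bans on individual IPs.
glines = [f"*@10.0.{i // 256}.{i % 256}" for i in range(10_000)]

dump = batches(glines, batch_size=500)
first = next(dump)   # first batch goes out; the snapshot is taken here
glines.clear()       # the live list changes while the dump is paused
rest = sum(len(chunk) for chunk in dump)  # the remaining batches are unaffected
```

The trade-off is the one already named in the description: the copy costs memory proportional to the list size for as long as the dump is in progress.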
Everything except the /stats thing was fixed a few days ago with the new system. These are the draft release notes for that change:
This release focuses on performance improvements, especially when the
server is under attack. UnrealIRCd now uses a technique that makes KLINE's,
GLINE's and (G)ZLINE's placed on individual IP's (*@IP) extremely fast.
Just to illustrate the performance improvements:
* Previously it took 129 seconds to add 100k ZLINE's, now it takes 2.5 secs.
* Checking a connection against 100,000 ZLINE's is now 250 times faster.
* Previously 7,500 clients could connect per minute, now 33,560 per minute.
* Even with 1 million ZLINE's on *@IP it can handle 30,000 connections p/m.
* Rejecting Z-lined users is even faster at 435,000 connections per minute
with 100,000 active ZLINE's.
Benchmarked on a 2GHz Intel Xeon Skylake CPU with Linux 4.15.
So it's extremely fast. Besides the *@IP speedup there are also performance fixes for adding, removing, expiring, slowness in connecting, etc., and those apply to any sort of *LINES.
So I think you should be good with this new version :)
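The release notes do not say which technique is used. As an illustration only, one common way to make lookups of bans on individual IPs fast is to keep them in a hash table keyed by IP and to scan only the remaining wildcard masks. The sketch below assumes that approach; the class and method names are made up and are not UnrealIRCd's.

```python
from fnmatch import fnmatchcase

class BanList:
    """Hypothetical ban store: exact *@IP bans hashed, the rest scanned."""

    def __init__(self):
        self.by_ip = {}      # exact "*@IP" bans: IP -> reason, O(1) average lookup
        self.wildcards = []  # everything else: (mask, reason), scanned linearly

    def add(self, mask, reason):
        user, _, host = mask.partition("@")
        if user == "*" and host and not any(c in host for c in "*?"):
            self.by_ip[host] = reason
        else:
            self.wildcards.append((mask, reason))

    def find(self, user, ip):
        # Fast path: one dictionary lookup, regardless of how many bans exist.
        if ip in self.by_ip:
            return self.by_ip[ip]
        # Slow path: only the (usually few) wildcard masks are scanned.
        target = f"{user}@{ip}"
        for mask, reason in self.wildcards:
            if fnmatchcase(target, mask):
                return reason
        return None

bans = BanList()
for i in range(100_000):
    bans.add(f"*@10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}", "spam")
bans.add("*@192.0.2.*", "range ban")
```

With this layout, checking a connection costs about the same with 100,000 individual bans as with 100; only the wildcard list still grows the per-connection work.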
As for /stats.. I have considered the LIST-style approach, but as I explained (and you repeat here) that makes it a bit difficult to keep track. I also wonder who would really want to see 10k or even 100k of *LINES? We have a search system anyway, although it's probably underdocumented.
It is probably a whole lot easier to just bump the sendq in the config file. Some quick calculations show that even a 2M sendq should give you enough for 10k zlines most of the time (it depends on the length of the reason field and such). Personally I think 5M sendq for opers is just fine... after all I trust them enough to have a huge sendq. This makes me wonder what sendq you use... for 50k *LINES I would think that 10M should be sufficient in any case.
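The quick calculation can be reproduced as a back-of-the-envelope estimate. The 200 bytes per /stats reply line used here is an assumption for illustration; the real figure depends on the mask, the setter and above all the length of the reason field.

```python
MIB = 1024 * 1024

def sendq_needed(num_lines, avg_bytes_per_line=200):
    """Bytes queued if every reply line is buffered before the client reads any."""
    # avg_bytes_per_line is an assumed average, not a measured value.
    return num_lines * avg_bytes_per_line

for count in (10_000, 50_000):
    print(f"{count} entries: about {sendq_needed(count) / MIB:.1f} MiB")
```

Under that assumption 10k entries need roughly 1.9 MiB and 50k entries roughly 9.5 MiB, which is in line with the 2M and 10M figures above. Longer reasons push the requirement up proportionally.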
Anyway, to get back to the original question... I wonder who really wants to see the full list of 10k or 100k zlines. I mean.. you are not going to read 10,000 entries, right? So what is it for? Dumping to somewhere? Maybe a module would be more appropriate for that?
Anyway, in light of the recent attacks I could bump the defaults in the example conf... nowadays even the default server sendq of 5M seems a bit low.
As mentioned on IRC:
* Some command to display counts (totals)
* Document the searching options, like /stats G +r xyz. Apparently there is +m for mask, +r for reason, and +s for the setter.
* Fix and test the searching options. I don't think it works for zline? It seems para is not passed.
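To make the first two items concrete, here is a minimal sketch of a counts command and of +m / +r / +s filtering. It assumes case-insensitive wildcard matching, which the notes above do not confirm, and the `Tkl` record and function names are hypothetical rather than the IRCd's own.

```python
from collections import Counter
from fnmatch import fnmatchcase
from typing import NamedTuple

class Tkl(NamedTuple):
    kind: str    # e.g. "G" for gline, "Z" for zline
    mask: str
    setter: str
    reason: str

# Search flag -> field it filters on (as listed above).
FIELDS = {"m": "mask", "r": "reason", "s": "setter"}

def search(entries, flag, pattern):
    """Return entries whose selected field matches the wildcard pattern."""
    field = FIELDS[flag.lstrip("+")]
    pat = pattern.lower()
    return [e for e in entries if fnmatchcase(getattr(e, field).lower(), pat)]

def counts(entries):
    """Totals per ban type, for a 'display counts' style command."""
    return Counter(e.kind for e in entries)

entries = [
    Tkl("G", "*@192.0.2.1", "Gottem", "spam bot"),
    Tkl("Z", "*@198.51.100.7", "syzop", "open proxy"),
    Tkl("G", "*@203.0.113.9", "syzop", "spam wave"),
]
by_reason = search(entries, "+r", "spam*")
by_setter = search(entries, "+s", "SYZOP")
totals = counts(entries)
```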
18:44 <~Syzop[AWAY]> so this leaves the case of where someone does /zline and you have 10k zlines...
18:44 <~Syzop[AWAY]> maybe some kind of limit of output? like with WHO
18:44 <~Syzop[AWAY]> and displaying a notice regarding the search options (and count thingy) that will
18:44 <@Gottem> ya sounds fine by me
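The output limit discussed in the log could work along these lines. This is a sketch only: the limit value, the notice wording and the function name are all made up for illustration.

```python
def limited_output(lines, limit):
    """Return at most `limit` lines, plus a notice if anything was cut off."""
    shown = list(lines[:limit])
    hidden = len(lines) - len(shown)
    if hidden > 0:
        # Hypothetical notice text pointing at the search and count options.
        shown.append(
            f"*** {hidden} more entries not shown (total {len(lines)}); "
            "use the search options to narrow the list"
        )
    return shown

zlines = [f"*@10.0.{i // 256}.{i % 256}" for i in range(10_000)]
reply = limited_output(zlines, limit=1000)
```

A short list passes through unchanged, so the notice only appears when the cap actually hides something.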
|2019-03-11 17:58||Gottem||New Issue|
|2019-03-11 18:11||syzop||Note Added: 0020544|
|2019-03-11 18:13||syzop||Note Edited: 0020544||View Revisions|
|2019-03-11 18:14||syzop||Note Edited: 0020544||View Revisions|
|2019-03-11 18:42||syzop||Note Added: 0020545|
|2019-03-11 18:47||syzop||Note Edited: 0020545||View Revisions|
|2019-03-11 18:48||syzop||Assigned To||=> syzop|
|2019-03-11 18:48||syzop||Status||new => acknowledged|