View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0006145||unreal||ircd||public||2022-06-26 11:14||2023-07-16 20:41|
|Fixed in Version||6.1.2-rc1|
|Summary||0006145: Make spamfilter less intrusive|
|Description||Someone in IRC complained that when a message on IRC was too long and matched a spamfilter, it would be truncated due to the IRC protocol.|
That led to another discussion about the entire message that triggered the spamfilter being shown, which could reveal personal messages when target "p" is chosen.
So I suggest that the spamfilter output look something like:
[info] [Spamfilter] nick!ident@host matches filter '*https://bad.site.tld*': [string: https://bad.site.tld?form=login] [reason: Testing] [action: block]
|Additional Information||Snippet of the discussion generated on IRC|
<Nini> when the link in spamfilter is too long, we do not see the end of the link , the [reason: ] and [action: ]
<@PeGaSuS> screenshots are harder to understand than a paste with the output
<alice> huh... are you just banning all links, lol
<alice> also wow that's really weird and creepy that it shows you the contents of a PM to another user, huh, gosh
<alice> I guess it's just because it literally cannot fit the rest of the line from the user into the snomask... but hrm
<Nini> the regexp is perfect, it is the truncation that is the problem
<Nini> here is what I see in my snotices when a link is too long
<@PeGaSuS> <alice> also wow that's really weird and creepy that it shows you the contents of a PM to another user, huh, gosh >> a spamfilter is meant to protect the network and the users from unwanted spam. that's why it is triggered by any PRIVMSG (and NOTICE, probably) and it shows the offender, target and spamfilter being hit (to help fighting possible false positives) and also sends,
<@PeGaSuS> by default, a notice to the offender saying they've actually hit a spamfilter.
<alice> yes, and i still find it very creepy that it broadcasts the actual contents of a private message that matched the filter
<alice> the fact that you've hit this regex is fine, but the actual contents of the message... that's... less than ideal (in my personal opinion)
<alice> (i hadn't actually realised that it would do that, i made an assumption that it would just show the filter and that's it... so i might go patch that out of mine, but)
<@PeGaSuS> AFAIK, all the IRCds that support spamfilter (or other kind of filtering) act this way. otherwise you don't know what the actual message is and you can't see if it's a false positive because you're probably using a too greedy regex or something
<alice> Charybdis/Solanum doesn't, for precisely the reason I was stating before, to avoid the possibility of it being used to spy on users.
<alice> It'll tell you what you matched on, but not the actual contents of the message
<@PeGaSuS> you could probably suggest that on a feature request on https://bugs.unrealircd.org
<%UnrealFeeds> ^ Main - UnrealIRCd Bug Tracker
<alice> (but then again, I suspect I have a different view to how IRC networks should be ran, compared to a majority of unrealircd operators (and I'm old and grumpy, so :p) :)
<@PeGaSuS> I do understand the concern but I do also understand why it is built this way. one enhancement that could be done, then, is to only show the string that hit the actual spamfilter
<@PeGaSuS> like the `https://this.hit.the.spamfilter` only instead the whole message
<alice> Hm? I mean, that doesn't solve my problem. I *do not* want to see the contents of private messages between users, if I'm not intentionally a party to that communication... And it would impact a data protection impact assessment if I was still running a semi-serious network anymore, heh... as there's no way I'dve guessed that that behaviour would've happened
<alice> Did this get introduced in unreal 6? I don't recall it being in unreal 5, but... I might've missed that
<@PeGaSuS> iirc, that's the behaviour of spamfilter since always..
|Tags||No tags attached.|
|3rd party modules|
First, a word about spying. Spying is covertly reading or gathering information. Spamfilter cannot be used that way: it always takes an action, such as killing or *lining, so it is very visible and not covert. For the block action, the user gets a notice that they triggered a spamfilter. For action "warn", which still allows the message through, UnrealIRCd sends a numeric (RPL_SPAMCMDFWD) which contains this phrase: "Command processed, but a copy has been sent to ircops for evaluation (anti-spam) purposes."
None of this is new; we have done this since UnrealIRCd 3.2.x.
With that aside, alice does raise a legitimate concern: she never wants to see a private conversation that she is not part of. I think we should add an option for this (in the set block). I don't think hiding the text should be the default, because it can be genuinely useful. Alternatively, we could hide it only for private messages by default, or at least offer that as a choice; the new set option should take that into account.
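The choice described above (show the text always, only for channel messages, or never) can be sketched as a small policy function. This is a minimal illustration in Python, not UnrealIRCd's implementation: the names `ShowContent` and `content_for_log` and the three mode strings are hypothetical and do not claim to match the actual set option.

```python
from enum import Enum
from typing import Optional


class ShowContent(Enum):
    """Hypothetical modes for showing the matched message text to IRCOps."""
    ALWAYS = "always"              # show text for channel messages and PMs
    CHANNEL_ONLY = "channel-only"  # show text for channels, hide it for PMs
    NEVER = "never"                # hide text for channel messages and PMs


def content_for_log(text: str, is_private: bool, mode: ShowContent) -> Optional[str]:
    """Return the text to include in the spamfilter notice, or None to omit it."""
    if mode is ShowContent.NEVER:
        return None
    if mode is ShowContent.CHANNEL_ONLY and is_private:
        return None
    return text
```

With a mode like `CHANNEL_ONLY`, an operator would still see the filter, reason, and action for a private message that hit a spamfilter, but not the message body.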
As for the cutoff (if you do want to see the text, as is currently the case), which is what Nini raised: there's not much that can be done about it, but we should probably move the [cmd: ...] bits to the very end so that they cannot be abused to cause a cutoff that hides the other fields, such as [reason]. If you want the full text and all fields, use JSON logging on disk or unrealircd.org/json-log on IRC. That is the only way to solve the problem completely.
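The idea of putting the user-controlled text last can be sketched as follows. This is a rough Python illustration under stated assumptions, not the actual UnrealIRCd code: `build_notice` and its `budget` parameter are hypothetical, and the budget of 400 bytes is an arbitrary stand-in for whatever room is left in an IRC line (limited to 512 bytes including CRLF) after the prefix and command.

```python
def build_notice(who: str, pattern: str, reason: str, action: str,
                 text: str, budget: int = 400) -> str:
    """Build a spamfilter notice with the variable-length text as the last field.

    Fields set by the server or the admin come first, so truncating the line
    to `budget` bytes can only cut into the trailing [cmd: ...] field as long
    as the fixed part itself fits within the budget.
    """
    fixed = (f"[Spamfilter] {who} matches filter '{pattern}': "
             f"[reason: {reason}] [action: {action}]")
    line = f"{fixed} [cmd: {text}]"
    encoded = line.encode("utf-8")
    if len(encoded) <= budget:
        return line
    # Cut at the byte budget; errors="ignore" drops a split multi-byte character.
    return encoded[:budget].decode("utf-8", errors="ignore")


notice = build_notice("nick!ident@host", "*https://bad.site.tld*",
                      "Testing", "block", "A" * 1000)
```

Here the overly long message is cut off, but `[reason: Testing]` and `[action: block]` remain visible because they precede the text.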
PeGaSuS also has another suggestion, which is to show only the part that matched. That's also an idea. It won't really solve alice's concerns, though, and for tracking false positives it may leave out too much context. For example, someone was talking about blocking phone numbers: with PeGaSuS's reduced output you would get to see a number, but you still might not know whether it was actually a phone number or a number about something else. It would also need to be handled in both PCRE2 and the simple matcher, so it's more work, but it's not a bad idea in itself.
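The loss of context can be illustrated with a short sketch. It uses Python's `re` module as a stand-in for PCRE2, and the helper name `matched_fragment`, the pattern, and the sample messages are all made up for the example.

```python
import re
from typing import Optional


def matched_fragment(pattern: str, text: str) -> Optional[str]:
    """Return only the part of `text` that matched `pattern`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# Hypothetical filter meant to catch phone numbers.
PHONE = r"\d{3}-\d{4}"

a = matched_fragment(PHONE, "call me at 555-0100")
b = matched_fragment(PHONE, "order 555-0100 shipped")
```

Both messages produce the same fragment, so an operator who only sees the fragment cannot tell the real phone number from the false positive.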
I'm closing this one. I added the option to hide the content (for PM or for channel+PM) in https://github.com/unrealircd/unrealircd/commit/f333aa0c09054e1e6bf3d59f86527ee93d3c4826 a few days ago.
I don't think the alternative idea is worth pursuing... I'm happy with what we have now :D
|2022-06-26 11:14||PeGaSuS||New Issue|
|2022-06-26 11:27||syzop||Note Added: 0022583|
|2022-06-26 11:27||syzop||Note Edited: 0022583|
|2022-06-26 11:28||syzop||Additional Information Updated|
|2022-06-26 11:28||syzop||Additional Information Updated|
|2022-06-26 11:30||syzop||Note Edited: 0022583|
|2022-06-26 11:31||syzop||Note Edited: 0022583|
|2023-07-16 20:41||syzop||Assigned To||=> syzop|
|2023-07-16 20:41||syzop||Status||new => resolved|
|2023-07-16 20:41||syzop||Resolution||open => fixed|
|2023-07-16 20:41||syzop||Fixed in Version||=> 6.1.2-rc1|
|2023-07-16 20:41||syzop||Note Added: 0022960|