0000920: Regex documentation - UnrealIRCd Bug Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0000920	unreal	ircd	public	2003-04-26 20:35	2004-12-26 15:30

Reporter	Rocko	Assigned To	~~codemastr~~
Priority	normal	Severity	feature	Reproducibility	always
Status	resolved	Resolution	fixed
OS	Debian Woody	OS Version	3.0
Product Version	3.2-beta17
Fixed in Version	3.2.3

Summary	0000920: Regex documentation
Description	I don't know if wildcards are allowed, but in the beta15 release there is an entry "badword channel { word "fuck"; };", so I think they are. (By the way, there is also an entry with only fuck, so it exists twice!) When I use an entry like badword channel { word "m*se"; }; and then say "sei", it comes out as "<censored>i", and that's not right, because there isn't an "m" in front of "sei".
Tags	No tags attached.

3rd party modules

syzop 2003-04-26 22:10 administrator ~0002507	Just for the record: this bug is unrelated to fast badwords replace, since the entry is recognized as a regex (c R m*se <censored>). I'm a regex n00b, so I don't know whether this is good or bad... Someone reported a similar issue (on IRC) about "irc..*", which caused "hi irc.blah.com is nice" to be replaced with "hi irc.repl.aced" (the part after it got dropped)...
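The "irc..*" report is what greedy matching would produce. A minimal sketch, assuming Python's re module as a stand-in for the regex library Unreal used at the time (the engines differ in details, but a greedy ".*" runs to the end of the line in both). The bounded pattern is a suggested alternative that does not come from this thread:

```python
import re

message = "hi irc.blah.com is nice"

# "irc..*" reads as: "irc", then any one character, then as many
# characters as possible. The trailing ".*" swallows the rest of the line.
greedy = re.sub(r"irc..*", "irc.repl.aced", message)

# Assumed alternative: escape the dot and stop at the first whitespace.
bounded = re.sub(r"irc\.\S*", "irc.repl.aced", message)

print(greedy)   # hi irc.repl.aced
print(bounded)  # hi irc.repl.aced is nice
```

Bounding the repeat with a character class such as \S keeps the replacement from eating the text that follows the hostname.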

Schutzgeist 2003-04-27 23:43 reporter ~0002524	Well, if I set badword channel { word "lame"; replace "leet";}; and someone writes lame, the user will see "leet". That's okay. If I set badword channel { word "lame"; replace "leet";}; and someone says lame, everyone sees "Lame", so it doesn't work. BUT e is then a BadChar, because when I type Unreal it will put out: Unrleetal. Another problem is when * is a bad character. When you set badword channel { word "lame"; replace "leet";}; badword channel { word "test"; replace "<censored word>";}; and someone types just a star like "*", the output will be: <cleetnsorleetd word>. I think Rocko typed the same ;) but I just wanted to show some more examples ;) The new way of the BadWordConfig is very interesting but very difficult to handle. With wildcards you can make a lot of mistakes. Without wildcards the BadWordList will lose its worth.

~~codemastr~~ 2003-04-28 04:18 reporter ~0002528	This is part your problem, part Unreal's problem. In regex, the * operator does NOT mean the same thing as it does in a wildcard or glob expression. It does NOT match "any characters"; it matches zero or more of the previous character. For example, "t*st" says "match 0 or more t's followed by st". What you probably want is ".": the "." says any character, therefore ".*" says "match 0 or more of any character". But this still produces some odd results in some rare circumstances. After beta16 I will be making Unreal use a new regex library that supports "non-greedy repeat-operations", which will make it function 100% as expected, assuming you know the correct syntax. Additionally, the new library is much faster than the current one, so it has more features and it is faster, which makes it an obvious choice. Perhaps I'll consider writing up a simple document on some of the basic features of regex...
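The difference between the glob reading and the regex reading of "*" can be checked directly. This is a rough sketch in Python, with fnmatch standing in for wildcard matching and re standing in for a regex engine; neither is the library Unreal uses, and the sample strings other than "sei" and "m*se" are made up for illustration:

```python
import fnmatch
import re

# Glob reading: "*" stands for any run of characters.
glob_mouse = fnmatch.fnmatchcase("mouse", "m*se")  # True
glob_sei = fnmatch.fnmatchcase("sei", "m*se")      # False

# Regex reading: "m*" means zero or more "m", so "m*se" finds the
# "se" inside "sei" and reproduces the output from the report.
censored = re.sub(r"m*se", "<censored>", "sei")

# ".*" is the regex spelling of "any run of characters", but it is
# greedy: it extends to the last possible "se" on the line.
text = "mouse and moose"
greedy = re.sub(r"m.*se", "X", text)
# The non-greedy form ".*?" stops at the first possible "se".
lazy = re.sub(r"m.*?se", "X", text)

print(glob_mouse, glob_sei)  # True False
print(censored)              # <censored>i
print(greedy)                # X
print(lazy)                  # X and X
```

The last two lines show why non-greedy repeat operations matter for badwords: the greedy pattern replaces everything between the first "m" and the last "se".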

AngryWolf 2003-06-29 19:37 reporter ~0003131 Last edited: 2003-07-26 13:11	I don't think it is Unreal's problem. Fast badword replace was designed for "blah", "*blah", "blah*" and "*blah*", where blah is a string of alphabetic characters. Rocko would do better to use "m[[:alnum:]]+se", because "m*se" matches any number of occurrences of "m" followed by "se". In addition, I don't understand why Syzop's fast badword replace system is not documented in unreal32docs.html. Instead, badword::word is only described as "a simple word" (or a regex), implying that no wildcards are accepted, which is not true. By the way, what about, for example, http://www.pcre.org/ ? In my opinion, perl-style regular expressions are easy to use. (Just an idea.) Or, to satisfy users, how about making a not-as-fast-replace-system-as-fast-badword-replace which supports blah*? :-) [Corrected a mistake: s/alphanumeric/alphabetic/, sorry.]
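A quick way to see what the suggested pattern does and does not match is sketched below. Python's re module has no POSIX bracket classes, so [A-Za-z0-9] is substituted for [[:alnum:]] as an assumed ASCII equivalent; the censor helper and the test words other than "sei" are hypothetical:

```python
import re

# Python's re has no POSIX classes such as [[:alnum:]], so the
# ASCII equivalent [A-Za-z0-9] is used in its place here.
pattern = re.compile(r"m[A-Za-z0-9]+se")

def censor(text: str) -> str:
    """Replace every match of the pattern with <censored>."""
    return pattern.sub("<censored>", text)

print(censor("sei"))    # sei
print(censor("mouse"))  # <censored>
print(censor("mse"))    # mse
```

Because "+" requires at least one character between "m" and "se", the bare string "mse" is left alone while "sei" no longer triggers a false match.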

~~codemastr~~ 2003-06-30 20:30 reporter ~0003139	I don't like PCRE, it is slow. I'm going to be using TRE which is lightning fast and supports some rather advanced features. One (that no other regex lib supports) is approximate matching. That is a very nice feature for badwords.
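To give a feel for why approximate matching helps with badwords, here is a conceptual sketch only. It does not use TRE or its syntax; it swaps in a plain Levenshtein edit distance written in Python, and the function names, the max_edits parameter, and the sample messages are all hypothetical:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or keep
        prev = cur
    return prev[-1]

def censor_approx(message: str, badword: str, max_edits: int = 1,
                  replacement: str = "<censored>") -> str:
    """Replace each space-separated word within max_edits of badword."""
    out = []
    for word in message.split(" "):
        if edit_distance(word.lower(), badword) <= max_edits:
            out.append(replacement)
        else:
            out.append(word)
    return " ".join(out)

print(censor_approx("that is l4me", "lame"))  # that is <censored>
print(censor_approx("my name", "lame"))       # my <censored>
```

The second call shows the trade-off: allowing one edit catches "l4me" but also hits the innocent word "name", so the allowed error count has to be chosen with care.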

AngryWolf 2003-07-06 06:18 reporter ~0003173	If you still intend to write that tutorial, could you please illustrate the features of regexes with clear examples in it? The official documentation, which is available at http://kouli.iki.fi/~vlaurika/tre/syntax.html, is surely not easy to understand, particularly for newbies. Something like the tutorial at http://www.zytrax.com/tech/web/regex.htm, but specialized for TRE, would probably be enough to also understand the way approximate matching works.

AngryWolf 2003-07-26 13:10 reporter ~0003338	Can I ask a question? If the fast badword replace system is designed to accept all alphabetical characters, but nothing more, why does it allow an opening curly bracket? (character: "{", code: 123)

~~codemastr~~ 2004-12-26 15:30 reporter ~0008673	Done in .211

Date Modified	Username	Field	Change
2003-04-26 20:35	Rocko	New Issue
2003-04-26 22:10	syzop	Note Added: 0002507
2003-04-27 23:43	Schutzgeist	Note Added: 0002524
2003-04-28 04:18	~~codemastr~~	Note Added: 0002528
2003-06-29 19:37	AngryWolf	Note Added: 0003131
2003-06-29 19:39	AngryWolf	Note Edited: 0003131
2003-06-30 13:24	syzop	Severity	minor => feature
2003-06-30 13:24	syzop	Category	=> ircd
2003-06-30 13:24	syzop	Product Version	3.2-beta15 => 3.2-beta17
2003-06-30 13:24	syzop	Summary	Bugs with matching badwords when wildcards are used. => badwords wildcard confusion / TRE
2003-06-30 20:30	~~codemastr~~	Note Added: 0003139
2003-07-06 06:18	AngryWolf	Note Added: 0003173
2003-07-26 13:10	AngryWolf	Note Added: 0003338
2003-07-26 13:11	AngryWolf	Note Edited: 0003131
2004-01-18 01:51	~~codemastr~~	Status	new => assigned
2004-01-18 01:51	~~codemastr~~	Assigned To	=> codemastr
2004-12-26 15:28	~~codemastr~~	Summary	badwords wildcard confusion / TRE => Regex documentation
2004-12-26 15:30	~~codemastr~~	Status	assigned => resolved
2004-12-26 15:30	~~codemastr~~	Fixed in Version	=> 3.2.3
2004-12-26 15:30	~~codemastr~~	Resolution	open => fixed
2004-12-26 15:30	~~codemastr~~	Note Added: 0008673
