UnrealIRCd Bug Tracker - unreal
Viewing Issue Advanced Details
2852 ircd feature always 2006-03-15 10:23 2007-05-02 21:07
Spider84  
*UNIX, win32  
normal all  
acknowledged 3.2.4  
open  
none    
none  
Not touched yet by developer
No need for upstream InspIRCd patch
Not decided
None
0002852: Codepage support added based
This is codepage recode support for UnrealIRCd.
child of 0003284 (acknowledged): 3rd Party Module Wishlist
Unreal3.2.4-cp.diff (37,162 bytes) 2006-03-15 10:23
Unreal3.2.4-cp.new.diff (40,183 bytes) 2006-03-15 13:09
Unreal3.2.4-cp.new1.diff (38,229 bytes) 2006-03-15 20:39
codepages.diff.gz (20,627 bytes) 2006-11-04 11:55
unreal3.2.5.cvs.codepages.diff (63,102 bytes) 2006-11-07 18:00
unreal3.2.5.cvs.codepages-2.diff (63,202 bytes) 2006-11-08 11:24
Unreal3.2.5-cp.diff (218,861 bytes) 2006-12-01 11:58
ircrecoder.py (3,621 bytes) 2007-05-02 21:07
Issue History
2006-03-15 10:23 Spider84 New Issue
2006-03-15 10:23 Spider84 File Added: Unreal3.2.4-cp.diff
2006-03-15 10:52 stskeeps Note Added: 0011369
2006-03-15 11:03 Spider84 Note Added: 0011370
2006-03-15 11:06 stskeeps Note Added: 0011372
2006-03-15 13:09 Spider84 File Added: Unreal3.2.4-cp.new.diff
2006-03-15 13:09 Spider84 Note Added: 0011373
2006-03-15 20:39 Spider84 File Added: Unreal3.2.4-cp.new1.diff
2006-03-15 20:39 Spider84 Note Added: 0011384
2006-03-17 13:54 syzop Note Added: 0011388
2006-03-17 22:09 Spider84 Note Added: 0011389
2006-03-18 08:18 stskeeps Note Added: 0011391
2006-03-18 16:16 syzop Note Added: 0011394
2006-04-28 02:42 Xuefer Note Added: 0011642
2006-04-28 09:09 Spider84 Note Added: 0011649
2006-06-16 05:10 reaper Note Added: 0011968
2006-06-16 06:24 Spider84 Note Added: 0011970
2006-07-25 06:37 reaper Note Added: 0012085
2006-07-25 08:26 Spider84 Note Added: 0012086
2006-09-10 23:03 reaper Note Added: 0012376
2006-09-11 02:53 Spider84 Note Added: 0012377
2006-11-01 06:54 syzop Note Added: 0012536
2006-11-01 06:55 syzop Note Edited: 0012536
2006-11-04 11:55 fbi File Added: codepages.diff.gz
2006-11-04 11:55 fbi Note Added: 0012581
2006-11-04 11:57 fbi Note Edited: 0012581
2006-11-04 12:00 Bock Note Added: 0012582
2006-11-04 16:29 Bock Note Edited: 0012582
2006-11-07 17:59 Bock Note Added: 0012603
2006-11-07 18:00 Bock File Added: unreal3.2.5.cvs.codepages.diff
2006-11-08 11:24 Bock File Added: unreal3.2.5.cvs.codepages-2.diff
2006-11-08 11:29 Bock Note Added: 0012604
2006-12-01 11:58 Spider84 File Added: Unreal3.2.5-cp.diff
2006-12-01 12:04 Spider84 Note Added: 0012780
2006-12-02 16:35 Bock Note Added: 0012783
2007-04-27 04:02 stskeeps Status new => closed
2007-04-27 04:02 stskeeps Note Added: 0013784
2007-04-27 04:03 stskeeps Resolution open => won't fix
2007-04-27 17:44 syzop Note Added: 0013896
2007-04-27 17:44 syzop Status closed => acknowledged
2007-04-27 17:44 syzop Resolution won't fix => open
2007-04-27 17:44 syzop Relationship added child of 0003284
2007-04-27 17:45 syzop Note Added: 0013897
2007-04-27 17:51 syzop Note Edited: 0013897
2007-04-28 08:00 Bock Note Added: 0013908
2007-05-02 21:07 pv2b File Added: ircrecoder.py
2007-05-02 21:07 pv2b Note Added: 0013983

Notes
(0011369)
stskeeps   
2006-03-15 10:52   
I'll have to decline this patch, on the grounds that it contains stuff like "bopm-nick" and references to variables you're not declaring in the patch. You need to clean up your patch first. (Yes, I noticed it's commented out.)
(0011370)
Spider84   
2006-03-15 11:03   
Which variables, for example? I don't understand what the problem is.
(0011372)
stskeeps   
2006-03-15 11:06   
+DLLFUNC int codepage_rehash()
+{
+ if (bopmnick) {
+ free(bopmnick);
+ bopmnick = NULL;
+ bopmnick_present = 0;
+ }
+ return 1;
+}
+*/

You need to remove the stuff you have commented out and clean up the patch a little.
(0011373)
Spider84   
2006-03-15 13:09   
Sorry :)
(0011384)
Spider84   
2006-03-15 20:39   
Unreal3.2.4-cp.new1.diff
+ Added libiconv version report to the /VERSION command, like OpenSSL or zlib.
(0011388)
syzop   
2006-03-17 13:54   
I've never been keen on all this charset stuff... It always seems to be Russian people (no offence) who come up with it, because they could not agree on a standard codepage (having 2 or 3 instead of 1 major one).

(Actually, I see you are the same guy who mailed me this in January 2004.)

I could, of course, still allow it in, so you .ru guys don't need to run a modified ircd anymore, but then I have to note that this is pretty much a Russian-specific third-party feature at the moment and that I'm not responsible for any bugs (and indeed, it should be explicitly enabled, for example in ./Config -advanced or include/config.h).
Maybe I can live with it then for 3.2* ;)
(0011389)
Spider84   
2006-03-17 22:09   
You are right:)
I am simply tired of patching each new version for those who use UnrealIRCd.
(0011391)
stskeeps   
2006-03-18 08:18   
I still somewhat think message codepage/encoding stuff should be on the client side. I see clients showing up with UTF-8 and iconv conversion support. If messages can be transmitted as proper text over the IRC protocol, it's not really our business, is it?
(0011394)
syzop   
2006-03-18 16:16   
*nod*
(0011642)
Xuefer   
2006-04-28 02:42   
Server-side iconv is nice and would help a lot in migrating the world to UTF-8 or any other encoding. codemastr might know more about the situation, as I sometimes see him around the mIRC forum. Before mIRC implemented UTF-8 in 6.17, all users used the system's default locale (even on Win2k/XP), especially in East Asia. How can you imagine an IRC network with hundreds of thousands of users upgrading to mIRC+UTF-8 all at once? If they upgrade one by one, they will change mIRC's default setting to use the system locale, not UTF-8, in order to communicate with others.

With server-side conversion, you can upgrade all servers to UTF-8 while the old ip:port keeps accepting clients in the local encoding and every new ip:port uses UTF-8 or so, gradually migrating all users to UTF-8. Unless, well, you don't go to UTF-8.
(0011649)
Spider84   
2006-04-28 09:09   
@Xuefer: Thx for your voice :)
(0011968)
reaper   
2006-06-16 05:10   
This patch doesn't work if I use glibc (which includes iconv, the iconv headers and the iconv libs) and don't have libiconv. How can I solve this problem?
(0011970)
Spider84   
2006-06-16 06:24   
It is too hard to quickly convert the iconv code to gconv. I don't have time for this now.
If you know C, you can try to do it yourself; just read http://gnu.dp.ua/software/libc/manual/html_node/glibc-iconv-Implementation.html
(0012085)
reaper   
2006-07-25 06:37   
Now that the new version of UnrealIRCd has been released, will a new patch be released soon? If yes, will it use gconv (glibc)?
(0012086)
Spider84   
2006-07-25 08:26   
Yes. I'll try to make a new patch soon. I just came back from holidays :)
I'll try to use gconv, but I can't promise.

Sorry for my English and the late answer.
(0012376)
reaper   
2006-09-10 23:03   
Is there any progress?
(0012377)
Spider84   
2006-09-11 02:53   
I have no time now. Sorry.
(0012536)
syzop   
2006-11-01 06:54   
(edited on: 2006-11-01 06:55)
It would be nice if this would get 100% modularized... It would only require a couple of new hooks on our side (Unreal), such as a callback for sending / receiving buffers, to name the two most important ones.

It seems, though, that neither the author nor anyone else has time to develop this patch any further?

In the positive case that someone would have time to develop this, this would be a simple add-on module that would work on newer Unreal versions (say, starting next version), without constant code changes whenever something changes in the Unreal core that breaks the patch. The code would be more maintainable than a patch, and could later on be worked on by multiple people (bugfixes, enhancements).

BUT, if nobody has time, then.. it's going to stay like it is now.. a patch for 3.2.4, without gconv support, and which is so big/hackish that we find it scary to apply to 3.2*, and applying to 3.3* doesn't help anyone atm.

So, if anyone (Russian? or not..) who has time would jump in to get this developed further, that would be helpful. Not so much helpful to me, but to the general Russian community... I find it hard to believe there are no coders alive there or that they would all be unwilling to help :P

If anyone is interested, post it here or mail me at syzop@unrealircd.com so we can discuss what's needed.

(0012581)
fbi   
2006-11-04 11:55   
(edited on: 2006-11-04 11:57)
Stable codepages patch created and uploaded.

It works with all codepages supported by iconv/libiconv, including Unicode (UTF), without any anomalies.
The patch also adds support for Russian translit.

Patch coded by me and Spider.

File name is codepages.diff.gz

(0012582)
Bock   
2006-11-04 12:00   
(edited on: 2006-11-04 16:29)
And it supports *nix and Windows (I compiled it with libiconv 1.9.2 for Windows).

UPD: I looked at the diff and found some blemishes and some poor code:
diff -urN Unreal3.2/include/config.h Unreal3.2cp/include/config.h
--- Unreal3.2/include/config.h Fri Jun 16 18:29:14 2006
+++ Unreal3.2cp/include/config.h Sat Nov 4 21:45:24 2006
@@ -461,6 +461,9 @@
  */
 #define JOINTHROTTLE
 
+#define CODEPAGE /* enable codepage support by i (admin@i386.net.ru) & Spider */
+//#define WITH_ICONV /* Need only in Win32 */
+

and some incorrect things. I'll look at it tomorrow and try to make it more comfortable for win32...

(0012603)
Bock   
2006-11-07 17:59   
This is the diff against the latest CVS.
For compiling on Windows, USE_ICONV is in a separate block (like SSL, ZIP and CURL).

Known issues:
1) Segfault on /version if we try to add the iconv version to it
(+#ifdef WITH_ICONV
+ if (IsAnOper(sptr))
+ sendto_one(sptr, ":%s NOTICE %s :libiconv %s", me.name, _libiconv_version >> 8, _libiconv_version & 0xff);
+#endif
) ((not included in this diff))

2) To see the translation scheme used by a client on another server via whois, i.e. to get the TRUE info, we must do /whois nick nick (as when checking idle time).

3) m_codepage.c: if the server is compiled without iconv, should we tell clients "command /codepage not supported by server", or not (Command CODEPAGE not found)?
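
A note on item 1: in the quoted call the format string has three `%s` conversions, but after `me.name` it is given two integers and no nick, so integers end up being read as string pointers, which is a plausible cause of the crash. GNU libiconv packs its version number as `(major << 8) + minor`. The following Python sketch only illustrates the decoding the notice presumably intended; the function name is hypothetical and this is not code from the patch:

```python
def format_libiconv_version(packed: int) -> str:
    """Decode a libiconv-style packed version: (major << 8) + minor."""
    major = packed >> 8      # high byte(s) hold the major version
    minor = packed & 0xFF    # low byte holds the minor version
    return f"libiconv {major}.{minor}"


# 0x0109 is how version 1.9 would be packed under this scheme.
print(format_libiconv_version(0x0109))
```

In the C code the equivalent would be to pass the nick as the second argument and format the two integers with integer conversions rather than `%s`.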
(0012604)
Bock   
2006-11-08 11:29   
Compiles fine on Windows/Linux/BSD (unreal3.2.5.cvs.codepages-2.diff).

The case handling of the /codepage output needs attention:

1)
[19:21:16] -› Your codepage is now cP1251
[19:21:24] -› Your codepage is now KoI8-R
[19:25:03] -› Your codepage is now uTf-8
[19:25:16] -› Your codepage is now iSo8859-5

/Maybe all capital letters?/

2) The version of iconv (GNU iconv or the iconv in libc (on Linux)).

Working stably on all servers of the fbi IRC network (Windows, Linux, BSD).
Huh.. It was hard. :]
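
The case issue in item 1 could be handled by canonicalizing the name before echoing it back. A minimal Python sketch, assuming Python's codec registry stands in for iconv (iconv's own name table differs, and `normalize_codepage` is a hypothetical helper, not part of the patch):

```python
import codecs
from typing import Optional


def normalize_codepage(name: str) -> Optional[str]:
    """Return an upper-case canonical name, or None if the codec is unknown."""
    try:
        # codecs.lookup() is case-insensitive and returns the registry's
        # canonical spelling in .name.
        canonical = codecs.lookup(name).name
    except LookupError:
        return None
    return canonical.upper()


for requested in ("cP1251", "KoI8-R", "uTf-8"):
    print(f"Your codepage is now {normalize_codepage(requested)}")
```

Rejecting unknown names at this point (the `None` case) would also avoid storing a codepage that later conversions cannot open.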
(0012780)
Spider84   
2006-12-01 12:04   
Unreal3.2.5-cp.diff
* fixed some bugs in ch_codepage()/dech_codepage() functions
* fixed /VERSION report for opers
+ Tried to remove codepage info from WHOIS on remote servers - need to test.
* removed file based codepage support and #ifdef CODEPAGE_USE
+ Added --with-iconv option to configure script on *UNIX based systems - need to test.
All fixes and additions were received from different users of the network. Thx.
(0012783)
Bock   
2006-12-02 16:35   
Some more unclean things in the patch...
Do as I did: include only codepage-related changes in the diff.
(0013784)
stskeeps   
2007-04-27 04:02   
Too early a version; please resubmit for 3.2.6/3.3
(0013896)
syzop   
2007-04-27 17:44   
hmmm, if we start closing issues for this reason, we'll lose track of history... (sure, there are relationships, but people forget to add/mention them and already seem not to be reading most things).

-> Bock and all:
see my last comment on how we can get this codepage thing working (I never got contacted): basically, we'll just provide enough hooks and such to make this possible in a module.
it seems though, my comments were completely ignored. a missed opportunity IMO.

Let us know what kind of hooks will be needed and in which places (you could try adding them yourself as well, of course). For example, a hook right before queueing some text ("send") and the same on receive are the most important ones that come to mind.
Hybrid has provided such hooks, which have worked nicely for similar codepage translation modules.
Please remember: 3.3* is meant to provide a more modularized approach, so this does not mean put-as-many-new-subsystems-in-the-core :P.

on a side note, stuff like adding lines to whois and all are more cosmetic things, and can be added at a later stage.

So let's focus on how to make this possible as a module. My offer of November 2006 to discuss it via email no longer stands, but you can use this bugtracker, and someone will deal with it :P
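
To make the send/receive hook idea above concrete, here is a language-neutral sketch in Python. It is not UnrealIRCd's or Hybrid's module API: `HookChain`, `on_receive`, `on_send` and `CLIENT_CHARSET` are all hypothetical names, and a real module would be written in C against whatever hooks the core ends up exposing.

```python
from typing import Callable, List

BufferHook = Callable[[bytes], bytes]


class HookChain:
    """Runs a buffer through every registered hook, in registration order."""

    def __init__(self) -> None:
        self._hooks: List[BufferHook] = []

    def add(self, hook: BufferHook) -> None:
        self._hooks.append(hook)

    def run(self, data: bytes) -> bytes:
        for hook in self._hooks:
            data = hook(data)
        return data


# Hypothetical hook points the core would call around its I/O buffers.
on_receive = HookChain()  # right after reading a line from a client
on_send = HookChain()     # right before queueing a line to a client

# A codepage module would then only need to register two converters.
CLIENT_CHARSET = "cp1251"  # assumed per-client setting, fixed here for brevity
on_receive.add(
    lambda raw: raw.decode(CLIENT_CHARSET, errors="replace").encode("utf-8")
)
on_send.add(
    lambda raw: raw.decode("utf-8", errors="replace").encode(
        CLIENT_CHARSET, errors="replace"
    )
)
```

With this shape the core never needs to know about character sets; it only calls `on_receive.run(...)` and `on_send.run(...)` at the two points named in the note.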
(0013897)
syzop   
2007-04-27 17:45   
(edited on: 2007-04-27 17:51)
I've added this to the '3rd Party Module Wishlist'. I should note though, that if this is implemented and looks OK, we can include it in unreal, or even make it an official module. In any case, discussion about that is completely irrelevant now, so focus on the rest ;p
EDIT: clarification

(0013908)
Bock   
2007-04-28 08:00   
Oh, if only I could program and knew about HOOKS... But I don't. :P
So I will look for people who can do this.
(0013983)
pv2b   
2007-05-02 21:07   
I didn't see this discussion on the bug tracker earlier, so I just spent a few hours making what I deemed to be a rather clean solution which didn't involve any modification of the UnrealIRCD code.

Basically, it's a little Python program that sits in between the user and the unrealircd performing encoding conversion. It requires the cgiirc block in the IRC server to be configured properly. This is because the UnrealIRCD WEBIRC extension is used to change the source hostmask (which would otherwise be localhost, because of the way the program works).
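
For illustration, a gateway of this kind announces the real client to the server with a single WEBIRC line sent before registration. The sketch below assumes the commonly documented form `WEBIRC <password> <gateway> <hostname> <ip>`; the helper name and all example values are hypothetical, and this is not taken from ircrecoder.py:

```python
def webirc_line(password: str, gateway: str, hostname: str, ip: str) -> bytes:
    """Build the WEBIRC line a gateway sends before NICK/USER."""
    return f"WEBIRC {password} {gateway} {hostname} {ip}\r\n".encode("ascii")


# Example values only; the password must match the server's cgiirc block.
print(webirc_line("secret", "recoder", "client.example.org", "192.0.2.10"))
```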

In its current configuration, it listens for clients on port 6666 and 6668. Port 6666 is intended for Latin-1 clients and port 6668 for UTF-8 clients.

If a client connects on port 6666, it'll receive any data it would have on port 6667 -- but anything that validates as UTF-8 is automatically converted back into Latin-1. Any text sent by the client is treated as Latin-1 and converted to UTF-8 before being sent back to port 6667. This in essence makes a typical Latin-1 client limitedly UTF-8 aware with fallback support.

If a client connects on port 6668, it'll receive any data it would have on port 6667 -- but anything that does not validate as UTF-8 is automatically converted from Latin-1 to UTF-8. Any text sent by the client is treated as UTF-8 and sent as such to port 6667. This in essence makes a typical UTF-8 client able to decode Latin-1 as well if it does not have built in Latin-1 fallback support.
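
The per-port rules described in the two paragraphs above can be sketched as four small functions operating on raw IRC lines. This is a reconstruction from the description, not the code of ircrecoder.py; the function names are hypothetical, and replacing characters that Latin-1 cannot represent with `?` is an assumption:

```python
def is_utf8(data: bytes) -> bool:
    """True if the bytes validate as UTF-8 (plain ASCII always does)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Latin-1 port (6666 in the description)
def to_latin1_client(line: bytes) -> bytes:
    """Server -> client: valid UTF-8 is converted down to Latin-1."""
    if is_utf8(line):
        # Assumption: characters outside Latin-1 become '?'.
        return line.decode("utf-8").encode("latin-1", errors="replace")
    return line  # already a legacy encoding, pass through unchanged


def from_latin1_client(line: bytes) -> bytes:
    """Client -> server: everything is treated as Latin-1, sent as UTF-8."""
    return line.decode("latin-1").encode("utf-8")


# UTF-8 port (6668 in the description)
def to_utf8_client(line: bytes) -> bytes:
    """Server -> client: anything that is not valid UTF-8 is read as Latin-1."""
    if is_utf8(line):
        return line
    return line.decode("latin-1").encode("utf-8")


def from_utf8_client(line: bytes) -> bytes:
    """Client -> server: passed through as-is."""
    return line
```

Because every byte sequence is valid Latin-1, the fallback decode never fails; the cost is that a Latin-1 message that happens to validate as UTF-8 will be misread, which is the usual limitation of this kind of detection.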

Any clients wishing to perform their own logic may continue to use 6667, so this scheme is optional to users. IRC clients do not have to be modified, since port number changes are already supported in all IRC clients.

Now, there are a lot of reasons against distributing my program as-is. It's meant as a proof of concept only. But I hope maybe somebody might find it useful either to use separately, or maybe as inspiration for a future patch. I don't believe in adding another command to perform character set selection -- that just requires even more IRC client support. Keeping it simple is a good idea. :-)