View Issue Details

ID: 0002051
Project: unrealircd
Category:
View Status: public
Last Update: 2004-10-10 23:47
Reporter: tomcat188hk
Assigned To: syzop
Priority: normal
Severity: minor
Reproducibility: N/A
Status: resolved
Resolution: fixed
OS: Linux - RedHat
OS Version: RedHat 9
Product Version: 3.2.1
Fixed in Version: 3.2.2
Summary: 0002051: Problems for Chinese nick
Description: There is a problem with Chinese nick support in Unreal 3.2.1! It only supports very few Chinese words, and otherwise always shows:
Erroneous Nickname: Illegal characters
Is it a bug or something? I think I edited config.h correctly, so there should be nothing wrong with it.
Additional Information: I am using Traditional Chinese.
Tags: No tags attached.
3rd party modules

Activities

codemastr

2004-09-02 19:30

reporter   ~0007476

None of us know anything about Chinese, there really is nothing we can do for you. The Chinese and Japanese support was added by a user who submitted a patch.

But anyway, my guess is you are using BIG5 glyphs which are NOT supported by Unreal. Unreal only supports GBK.

tomcat188hk

2004-09-02 23:54

reporter   ~0007477

nah, you are wrong

It doesn't support any GBK Chinese, I tried that.

Is there any way to make Unreal allow all kinds of nicks to be used?

AngryWolf

2004-09-03 15:51

reporter   ~0007482

For example, "@" and "!"? Ok, let's say, my nick is "a!b@c". If I want to ban you from a #channel, would I type: "/mode #channel +b a!b@c!*@*"? That's one reason why there must be a limitation to what characters can be used in nicknames. At least protocol-specific characters must be excluded.

Unreal follows the same rules for allowed characters in nicknames as RFC 1459 defines. Because that's a standard, IRC clients do the same, and thus any additional characters can break clients.
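To illustrate the ambiguity described above, here is a minimal sketch of how a parser might split a nick!user@host mask at the first '!' and the first '@' after it. The helper name split_mask and the buffer handling are assumptions made for this example; this is not UnrealIRCd's own parsing code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split a mask of the form nick!user@host at the
 * first '!' and at the first '@' that follows it. Each output buffer
 * must hold at least len bytes. Returns 1 on success, 0 if either
 * separator is missing. */
static int split_mask(const char *mask, char *nick, char *user,
                      char *host, size_t len)
{
    const char *bang = strchr(mask, '!');
    const char *at = bang ? strchr(bang + 1, '@') : NULL;

    if (!bang || !at)
        return 0;
    /* snprintf truncates safely if a part is longer than the buffer. */
    snprintf(nick, len, "%.*s", (int)(bang - mask), mask);
    snprintf(user, len, "%.*s", (int)(at - bang - 1), bang + 1);
    snprintf(host, len, "%s", at + 1);
    return 1;
}
```

Given the mask "a!b@c!*@*", a splitter like this yields nick "a", user "b" and host "c!*@*", so the ban would not refer to the nick "a!b@c" at all.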

codemastr

2004-09-03 20:16

reporter   ~0007484

Well, I know of many people using Unreal with GBK and it seems to work fine for them.

tomcat188hk

2004-09-04 02:48

reporter   ~0007492

I saw an Unreal server that works fine with BIG5 Chinese:
irc.chatcafe.net

perhaps codemastr could get them to send you a patch to fix this problem...

codemastr

2004-09-04 03:29

reporter   ~0007494

tomcat, perhaps you could ask them? When I went there, I couldn't find out where to go (all the channels have topics in Chinese) and I couldn't find anyone that spoke English. My guess is you'd have much better luck talking to them than I would. I'd be glad to add BIG5 support, it sounds like a very useful thing for Chinese users. If they are willing to provide their patch, I'll definitely look into adding it, but I don't think I'm capable of asking them for it :)

tomcat188hk

2004-09-04 04:36

reporter   ~0007504

Sure, not a problem. =)

tomcat188hk

2004-09-04 17:09

reporter   ~0007524

I asked them to send you an email, hopefully they are going to help you.

codemastr

2004-09-04 19:11

reporter   ~0007525

Sounds good. I'll let you know when I hear back from them.

codemastr

2004-09-18 03:21

reporter   ~0007700

Well they never emailed me, so I don't know what to do about this...

syzop

2004-10-06 21:51

administrator   ~0007889

I did some research and it seems we should (and intended to) allow 2 ranges:
0x8140 - 0xa0fe (CJK UNIFIED IDEOGRAPH)
0xaa40 - 0xfe4f (CJK UNIFIED IDEOGRAPH)
our code currently fails to fully allow the latter one; it only allows a subset of it, namely 0xb0a1 - 0xf7fe (which seems totally odd ;p, I looked at the GBK/CP936 codepage and that doesn't make any sense).

As a temporary fix, could you, tomcat188hk, change the following code in src/s_user.c around line 636 from:
    const unsigned int GBK_S = 0xb0a1;
    const unsigned int GBK_E = 0xf7fe;
to:
    const unsigned int GBK_S = 0xaa40;
    const unsigned int GBK_E = 0xfea0;
and run 'make'

And let us know if this solves the problem ;).
[I looked at the GBK codepage, and AFAIK that should solve it :p]

codemastr

2004-10-07 02:48

reporter   ~0007894

Yeah, I came to a similar conclusion. However, since I know next to nothing about Chinese, I figured I'd wait for someone who does know to respond. But I still haven't received an email from those guys...

Xuefer

2004-10-08 08:22

reporter   ~0007915

Last edited: 2004-10-08 08:34

GBK and BIG5 are completely different standards with different organization.
They overlap in code range and share many of the same chars.
You can't modify the code to support both GBK and BIG5; only one can be chosen.
We check nicks because we don't want invalid chars.
What is valid for GBK may be invalid for BIG5, and vice versa.

And the current implementation of GBK checking is wrong too:

const unsigned int GBK_S = 0xb0a1;
const unsigned int GBK_E = 0xf7fe;
AWord >= GBK_S && AWord <= GBK_E

It seems the author of this patch was trying to reject what lies beyond 0xb1fe, 0xb2fe?
No: 0xb1ff, 0xb2ff etc. will be accepted as valid, yet they shouldn't be. It needs:
c2 != 0xFF
Try //nick $chr($base(b1, 16, 10)) $+ $chr($base(ff, 16, 10)) on mIRC and you will get an invisible nick.

We have to limit c1 and c2 exactly, not AWord.

btw: the JAP part is GBK too; it's not JIS or whatever Japanese users use, but it could be used by some Chinese users (for fun)
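The difference between checking the combined word and checking each byte can be sketched as follows. How AWord is built from the two bytes is an assumption here (lead byte in the high 8 bits), and in_word_range and in_byte_range are hypothetical names used only for this comparison; the byte bounds are simply read off the two constants quoted above.

```c
/* Word-range check in the style quoted above: the two bytes are combined
 * into one 16-bit value and compared against a single range.
 * Assumption: the lead byte c1 sits in the high 8 bits. */
static int in_word_range(unsigned char c1, unsigned char c2)
{
    const unsigned int GBK_S = 0xb0a1;
    const unsigned int GBK_E = 0xf7fe;
    unsigned int AWord = ((unsigned int)c1 << 8) | c2;

    return AWord >= GBK_S && AWord <= GBK_E;
}

/* Per-byte check: lead and trail byte are each limited separately,
 * so a trail byte such as 0xFF is rejected for every lead byte. */
static int in_byte_range(unsigned char c1, unsigned char c2)
{
    return c1 >= 0xB0 && c1 <= 0xF7 && c2 >= 0xA1 && c2 <= 0xFE;
}
```

With these two functions, the pair (0xB1, 0xFF) passes in_word_range but fails in_byte_range, because 0xb1ff lies between 0xb0a1 and 0xf7fe even though its trail byte is out of bounds. The same holds for a pair like (0xB1, 0x20), whose second byte is a plain space.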


codemastr

2004-10-08 14:14

reporter   ~0007918

What?

Xuefer

2004-10-08 14:31

reporter   ~0007921

Last edited: 2004-10-08 14:40

what I said in the previous note is:
one cannot enable both GBK and BIG5; if one does, he will be sorry

ok, here is the source for GBK only
note: the macro names are renamed to NICK_*

#if defined(NICK_GB2312) || defined(NICK_GBK) || defined(NICK_GBK_JAP) || defined(NICK_BIG5)
#define NICK_MULTIBYTE
#endif

#ifdef NICK_MULTIBYTE
int isvalidChinese(const unsigned char c1, const unsigned char c2)
{
    unsigned int w = (((unsigned int)c1) << 8) | c2;

/* range checks on w/c1/c2 (rw is never used) */
#define rw(s, e) (w >= ((unsigned int)(s)) && w <= ((unsigned int)(e)))
#define r1(s, e) (c1 >= ((unsigned char)(s)) && c1 <= ((unsigned char)(e)))
#define r2(s, e) (c2 >= ((unsigned char)(s)) && c2 <= ((unsigned char)(e)))
#define e1(e) (c1 == (unsigned char)(e))

#ifdef NICK_GBK_JAP
    /* GBK/1 */
    /* JIS_PIN part 1 */
    if (e1(0xA4) && r2(0xA1, 0xF3)) return 1;
    /* JIS_PIN part 2 */
    if (e1(0xA5) && r2(0xA1, 0xF6)) return 1;
#endif
#if defined(NICK_GB2312) || defined(NICK_GBK)
    /* GBK/2 BC with GB2312 */
    if (r2(0xA1, 0xFE))
    {
        /* Block 16-55, ordered by Chinese Spelling(PinYin) 3755 chars */
        if (r1(0xB0, 0xD6)) return 1;
        /* Block 55 is NOT full (w <= 0xd7f9) */
        if (e1(0xD7) && c2 <= (unsigned char)0xF9 /* r2(0xA1, 0xF9)*/) return 1;
        /* Block 56-87 is level 2 chars, ordered by writing 3008 chars */
        if (r1(0xD8, 0xF7)) return 1;
    }
#endif

#ifdef NICK_GBK
    /* GBK/3 */
    if (r1(0x81, 0xA0) && r2(0x40, 0xFE)) return 1;
    /* GBK/4 */
    if (r2(0x40, 0xA0) && r1(0xAA, 0xFE)) return 1;
#endif

#ifdef NICK_BIG5
    /* check BIG5 here */
#endif

    /* all failed */
    return 0;

#undef rw
#undef r1
#undef r2
#undef e1
}

#endif



It is recommended that do_nick_name use #ifdef NICK_MULTIBYTE.
c1 is already checked in do_nick_name, so the BIG5 part (if filled in with code) may not work. Because I'm not familiar with the "strict limit" on BIG5, I won't implement it.
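As a rough sketch of how a two-byte validator could be driven from a nick-scanning loop, the code below walks a nick and consumes one byte for ASCII and two bytes for a multibyte char. Everything here is an assumption for illustration: is_valid_pair is a cut-down stand-in for isvalidChinese (GB2312 level 1 rows only), is_valid_ascii is a simplified single-byte rule, and valid_nick_prefix is a hypothetical name, not the actual do_nick_name.

```c
#include <stddef.h>

/* Stand-in for the two-byte validator above; covers only lead bytes
 * 0xB0-0xD6 with trail bytes 0xA1-0xFE to keep the sketch short. */
static int is_valid_pair(unsigned char c1, unsigned char c2)
{
    return c1 >= 0xB0 && c1 <= 0xD6 && c2 >= 0xA1 && c2 <= 0xFE;
}

/* Simplified single-byte rule: letters, digits and a few symbols. */
static int is_valid_ascii(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_' || c == '-' ||
           c == '[' || c == ']' || c == '\\' || c == '`' ||
           c == '^' || c == '{' || c == '}' || c == '|';
}

/* Returns the length in bytes of the longest valid prefix of nick. */
static size_t valid_nick_prefix(const char *nick)
{
    const unsigned char *p = (const unsigned char *)nick;
    size_t i = 0;

    while (p[i] != '\0') {
        if (p[i] < 0x80) {
            if (!is_valid_ascii(p[i]))
                break;
            i += 1;
        } else {
            /* Reading p[i + 1] is safe: at worst it is the terminating
             * NUL, which is_valid_pair rejects. */
            if (!is_valid_pair(p[i], p[i + 1]))
                break;
            i += 2;
        }
    }
    return i;
}
```

Checking the pair as a unit keeps the lead-byte and trail-byte limits in one place, which is the point made above about c1 already being checked elsewhere.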


syzop

2004-10-08 15:31

administrator   ~0007923

Last edited: 2004-10-09 03:53

**scratch that.. I'll use a different approach :p, still... might mail you at some point later, who knows ;p**


syzop

2004-10-10 23:47

administrator   ~0007946

Xuefer's fix is now in CVS [.151].

Issue History

Date Modified Username Field Change
2004-09-02 08:19 tomcat188hk New Issue
2004-09-02 19:30 codemastr Note Added: 0007476
2004-09-02 23:54 tomcat188hk Note Added: 0007477
2004-09-03 15:51 AngryWolf Note Added: 0007482
2004-09-03 20:16 codemastr Note Added: 0007484
2004-09-04 02:48 tomcat188hk Note Added: 0007492
2004-09-04 03:29 codemastr Note Added: 0007494
2004-09-04 04:36 tomcat188hk Note Added: 0007504
2004-09-04 17:09 tomcat188hk Note Added: 0007524
2004-09-04 19:11 codemastr Note Added: 0007525
2004-09-18 03:21 codemastr Note Added: 0007700
2004-10-06 21:51 syzop Note Added: 0007889
2004-10-07 02:48 codemastr Note Added: 0007894
2004-10-08 08:22 Xuefer Note Added: 0007915
2004-10-08 08:34 Xuefer Note Edited: 0007915
2004-10-08 14:14 codemastr Note Added: 0007918
2004-10-08 14:31 Xuefer Note Added: 0007921
2004-10-08 14:36 Xuefer Note Edited: 0007921
2004-10-08 14:40 Xuefer Note Edited: 0007921
2004-10-08 14:40 Xuefer Note Edited: 0007921
2004-10-08 15:31 syzop Note Added: 0007923
2004-10-09 03:53 syzop Note Edited: 0007923
2004-10-10 23:47 syzop Status new => resolved
2004-10-10 23:47 syzop Fixed in Version => 3.2.2
2004-10-10 23:47 syzop Resolution open => fixed
2004-10-10 23:47 syzop Assigned To => syzop
2004-10-10 23:47 syzop Note Added: 0007946