View Issue Details

ID: 0003734
Project / Category: unrealircd
View Status: public
Last Update: 2016-03-20 11:56
Reporter: ramitta
Assigned To:
Priority: normal
Severity: feature
Reproducibility: always
Status: feedback
Resolution: open
Platform:
OS: Windows
OS Version: Vista
Product Version: 3.2.7
Target Version:
Fixed in Version:
Summary: 0003734: about allowing arabic nicknames in unreal
Description:

Hello, I use Unreal for my IRC server.

I read http://www.unrealircd.com/files/docs/unreal32docs.html#feature_nickchars and Arabic is not listed there.
Arabic uses windows-1256, so if windows-1256 were allowed in Unreal, Arabic nicknames would work.

Is there a way to allow it in Unreal, now or in the future?

Thank you
Tags: No tags attached.
3rd party modules:

Relationships

has duplicate 0003861 closed About adding more languages for nicks ( Arabic ) 

Activities

syzop

2010-09-26 20:17

administrator   ~0016381

If someone supplies us with a list of characters that need to be allowed, then we can implement it.

With that, I mean a list of:
1) ascii code numbers like '140, 150, 160' etc
OR:
2) just type all characters like 'äÄöÖüÜß' (and then I hope I can display them well with some tricks)... better send it to me by email at syzop@unrealircd.com

Be careful to only allow 'letters', and NOT things like quotes, whitespace/space, etc...

I really don't know much about the Arabic language, so we need a list which we can just copy/paste, we can't decide ourselves which characters are 'ok' :)
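
A list in the first format is easy to turn into a lookup table. The following is a minimal sketch in C and is not UnrealIRCd code: `parse_allowed_list` is a hypothetical helper, and rejecting values below 128 is an assumption made here (the idea being that plain ASCII is already covered by the normal nickname rules).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Parse a comma-separated list of decimal byte values, e.g. "140, 150, 160",
 * into a 256-entry lookup table (1 = allowed in nicknames).
 * Returns the number of distinct values accepted, or -1 on malformed input.
 * Hypothetical helper for illustration only. */
int parse_allowed_list(const char *list, unsigned char allowed[256])
{
    const char *p = list;
    int count = 0;

    memset(allowed, 0, 256);
    while (*p) {
        char *end;
        long v;

        /* Skip separators between entries. */
        while (isspace((unsigned char)*p) || *p == ',')
            p++;
        if (!*p)
            break;

        v = strtol(p, &end, 10);
        if (end == p || v < 128 || v > 255)
            return -1;  /* not a number, or not a high (non-ASCII) byte */

        if (!allowed[v]) {
            allowed[v] = 1;
            count++;
        }
        p = end;
    }
    return count;
}
```

A caller would pass the pasted list and a 256-byte array, then consult `allowed[byte]` for each byte of a nickname. This does nothing to decide whether a value is a letter; that still needs someone who knows the script.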

syzop

2010-11-17 18:03

administrator   ~0016422

Changing status to 'Feedback' as I cannot work on this alone.
I don't know arabic, and I don't know how the character set works. You mailed me with some webpage which shows all arabic characters, but that does not help me much.

How come people assume that I know and understand all languages around the world.. be it Arabic, Greek, Turkish, Chinese, Japanese, etc etc !?

syzop

2013-05-11 10:17

administrator   ~0017538

I wonder if this post will go well, but I received this by e-mail. I did not verify it in any way, for example whether it really only includes letters and not also spaces etc.

****
list of arabic characters : (leters)
 
=================================
??????????????????????????????
=================================
OR
 
list of arabic characters : (acsii)
 
=================================
 '153,154,157,158,160,161,162,163,164,165,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253'
=================================
****

syzop

2013-05-11 10:21

administrator   ~0017539

This was the table he pasted:

Glancing over it, it doesn't seem like a good idea to include 160 (A0) and some other characters. We don't allow spaces in nicknames, so it would be illogical to allow the equivalent characters from other character sets.
It seems to me that this table includes far more characters that shouldn't be allowed (a currency sign??).

If someone with an Arab background could comment on this, and filter out unneeded characters, that would be greatly appreciated.
I hope the same person can then comment on any other possible issues.


153 99 ? ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE ISOLATED FORM
154 9A ? ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE FINAL FORM
157 9D ? ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
158 9E ? ARABIC LIGATURE LAM WITH ALEF FINAL FORM
160 A0 NO-BREAK SPACE
161 A1 ­ SOFT HYPHEN
162 A2 ? ARABIC LETTER ALEF WITH MADDA ABOVE FINAL FORM
163 A3 £ POUND SIGN
164 A4 ¤ CURRENCY SIGN
165 A5 ? ARABIC LETTER ALEF WITH HAMZA ABOVE FINAL FORM
168 A8 ? ARABIC LETTER ALEF FINAL FORM
169 A9 ? ARABIC LETTER BEH
170 AA ? ARABIC LETTER TEH
171 AB ? ARABIC LETTER THEH
172 AC ? ARABIC COMMA
173 AD ? ARABIC LETTER JEEM
174 AE ? ARABIC LETTER HAH
175 AF ? ARABIC LETTER KHAH
176 B0 ? ARABIC-INDIC DIGIT ZERO
177 B1 ? ARABIC-INDIC DIGIT ONE
178 B2 ? ARABIC-INDIC DIGIT TWO
179 B3 ? ARABIC-INDIC DIGIT THREE
180 B4 ? ARABIC-INDIC DIGIT FOUR
181 B5 ? ARABIC-INDIC DIGIT FIVE
182 B6 ? ARABIC-INDIC DIGIT SIX
183 B7 ? ARABIC-INDIC DIGIT SEVEN
184 B8 ? ARABIC-INDIC DIGIT EIGHT
185 B9 ? ARABIC-INDIC DIGIT NINE
186 BA ? ARABIC LETTER FEH
187 BB ? ARABIC SEMICOLON
188 BC ? ARABIC LETTER SEEN
189 BD ? ARABIC LETTER SHEEN
190 BE ? ARABIC LETTER SAD
191 BF ? ARABIC QUESTION MARK
192 C0 ¢ CENT SIGN
193 C1 ? ARABIC LETTER HAMZA
194 C2 ? ARABIC LETTER ALEF WITH MADDA ABOVE
195 C3 ? ARABIC LETTER ALEF WITH HAMZA ABOVE
196 C4 ? ARABIC LETTER WAW WITH HAMZA ABOVE
197 C5 ? ARABIC LETTER AIN FINAL FORM
198 C6 ? ARABIC LETTER YEH WITH HAMZA ABOVE INITIAL FORM
199 C7 ? ARABIC LETTER ALEF
200 C8 ? ARABIC LETTER BEH INITIAL FORM
201 C9 ? ARABIC LETTER TEH MARBUTA
202 CA ? ARABIC LETTER TEH INITIAL FORM
203 CB ? ARABIC LETTER THEH INITIAL FORM
204 CC ? ARABIC LETTER JEEM INITIAL FORM
205 CD ? ARABIC LETTER HAH INITIAL FORM
206 CE ? ARABIC LETTER KHAH INITIAL FORM
207 CF ? ARABIC LETTER DAL
208 D0 ? ARABIC LETTER THAL
209 D1 ? ARABIC LETTER REH
210 D2 ? ARABIC LETTER ZAIN
211 D3 ? ARABIC LETTER SEEN INITIAL FORM
212 D4 ? ARABIC LETTER SHEEN INITIAL FORM
213 D5 ? ARABIC LETTER SAD INITIAL FORM
214 D6 ? ARABIC LETTER DAD INITIAL FORM
215 D7 ? ARABIC LETTER TAH
216 D8 ? ARABIC LETTER ZAH
217 D9 ? ARABIC LETTER AIN INITIAL FORM
218 DA ? ARABIC LETTER GHAIN INITIAL FORM
219 DB ¦ BROKEN BAR
220 DC ¬ NOT SIGN
221 DD ÷ DIVISION SIGN
222 DE × MULTIPLICATION SIGN
223 DF ? ARABIC LETTER AIN
224 E0 ? ARABIC TATWEEL
225 E1 ? ARABIC LETTER FEH INITIAL FORM
226 E2 ? ARABIC LETTER QAF INITIAL FORM
227 E3 ? ARABIC LETTER KAF INITIAL FORM
228 E4 ? ARABIC LETTER LAM INITIAL FORM
229 E5 ? ARABIC LETTER MEEM INITIAL FORM
230 E6 ? ARABIC LETTER NOON INITIAL FORM
231 E7 ? ARABIC LETTER HEH INITIAL FORM
232 E8 ? ARABIC LETTER WAW
233 E9 ? ARABIC LETTER ALEF MAKSURA
234 EA ? ARABIC LETTER YEH INITIAL FORM
235 EB ? ARABIC LETTER DAD
236 EC ? ARABIC LETTER AIN MEDIAL FORM
237 ED ? ARABIC LETTER GHAIN FINAL FORM
238 EE ? ARABIC LETTER GHAIN
239 EF ? ARABIC LETTER MEEM
240 F0 ? ARABIC SHADDA MEDIAL FORM
241 F1 ? ARABIC SHADDA
242 F2 ? ARABIC LETTER NOON
243 F3 ? ARABIC LETTER HEH
244 F4 ? ARABIC LETTER HEH MEDIAL FORM
245 F5 ? ARABIC LETTER ALEF MAKSURA FINAL FORM
246 F6 ? ARABIC LETTER YEH FINAL FORM
247 F7 ? ARABIC LETTER GHAIN MEDIAL FORM
248 F8 ? ARABIC LETTER QAF
249 F9 ? ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM
250 FA ? ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE FINAL FORM
251 FB ? ARABIC LETTER LAM
252 FC ? ARABIC LETTER KAF
253 FD ? ARABIC LETTER YEH
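
One rough way to pre-sort a pasted table like this is by keywords in the character names. The following is a minimal sketch, assuming the names follow the style shown above; `classify_name` and the enum are hypothetical, and anything it cannot place still needs review by someone who knows the script.

```c
#include <string.h>

enum char_class {
    CLASS_LETTER,   /* name contains LETTER or LIGATURE */
    CLASS_DIGIT,    /* name contains DIGIT */
    CLASS_REVIEW    /* everything else: spaces, punctuation, signs, marks */
};

/* Classify one table entry by keywords in its character name.
 * This is a first-pass filter only, not a linguistic judgement. */
enum char_class classify_name(const char *name)
{
    if (strstr(name, "LETTER") || strstr(name, "LIGATURE"))
        return CLASS_LETTER;
    if (strstr(name, "DIGIT"))
        return CLASS_DIGIT;
    return CLASS_REVIEW;
}
```

With the names from the table, "NO-BREAK SPACE", "CURRENCY SIGN" and "ARABIC COMMA" all land in `CLASS_REVIEW`, and so do "ARABIC TATWEEL" and "ARABIC SHADDA", which is why the filter can only flag entries, not decide them.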

aldheffiry

2013-05-11 10:40

reporter   ~0017540

Last edited: 2013-05-11 10:52

I appreciate your help.
It is my pleasure to do this.

aldheffiry

2013-05-11 11:24

reporter   ~0017541

acsii_Numbers

176,177,178,179,180,181,182,183,184,185 (total 10)

176 B0 ? ARABIC-INDIC DIGIT ZERO
177 B1 ? ARABIC-INDIC DIGIT ONE
178 B2 ? ARABIC-INDIC DIGIT TWO
179 B3 ? ARABIC-INDIC DIGIT THREE
180 B4 ? ARABIC-INDIC DIGIT FOUR
181 B5 ? ARABIC-INDIC DIGIT FIVE
182 B6 ? ARABIC-INDIC DIGIT SIX
183 B7 ? ARABIC-INDIC DIGIT SEVEN
184 B8 ? ARABIC-INDIC DIGIT EIGHT
185 B9 ? ARABIC-INDIC DIGIT NINE
===========================================

acsii_special Arabic Letter Forms (total 7)

153,154,157,158,224,249,250

153 99 ? ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE ISOLATED FORM
154 9A ? ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE FINAL FORM
157 9D ? ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
158 9E ? ARABIC LIGATURE LAM WITH ALEF FINAL FORM
224 E0 ? ARABIC TATWEEL
249 F9 ? ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM
250 FA ? ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE FINAL FORM

===========================================

acsii_Common Arabic Letters (total 65)

162,165,168,169,170,171,173,174,175,186,188,189,190,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,223,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,242,243,244,245,246,247,248,251,252,253

162 A2 ? ARABIC LETTER ALEF WITH MADDA ABOVE FINAL FORM
165 A5 ? ARABIC LETTER ALEF WITH HAMZA ABOVE FINAL FORM
168 A8 ? ARABIC LETTER ALEF FINAL FORM
169 A9 ? ARABIC LETTER BEH
170 AA ? ARABIC LETTER TEH
171 AB ? ARABIC LETTER THEH
173 AD ? ARABIC LETTER JEEM
174 AE ? ARABIC LETTER HAH
175 AF ? ARABIC LETTER KHAH
186 BA ? ARABIC LETTER FEH
188 BC ? ARABIC LETTER SEEN
189 BD ? ARABIC LETTER SHEEN
190 BE ? ARABIC LETTER SAD
193 C1 ? ARABIC LETTER HAMZA
194 C2 ? ARABIC LETTER ALEF WITH MADDA ABOVE
195 C3 ? ARABIC LETTER ALEF WITH HAMZA ABOVE
196 C4 ? ARABIC LETTER WAW WITH HAMZA ABOVE
197 C5 ? ARABIC LETTER AIN FINAL FORM
198 C6 ? ARABIC LETTER YEH WITH HAMZA ABOVE INITIAL FORM
199 C7 ? ARABIC LETTER ALEF
200 C8 ? ARABIC LETTER BEH INITIAL FORM
201 C9 ? ARABIC LETTER TEH MARBUTA
202 CA ? ARABIC LETTER TEH INITIAL FORM
203 CB ? ARABIC LETTER THEH INITIAL FORM
204 CC ? ARABIC LETTER JEEM INITIAL FORM
205 CD ? ARABIC LETTER HAH INITIAL FORM
206 CE ? ARABIC LETTER KHAH INITIAL FORM
207 CF ? ARABIC LETTER DAL
208 D0 ? ARABIC LETTER THAL
209 D1 ? ARABIC LETTER REH
210 D2 ? ARABIC LETTER ZAIN
211 D3 ? ARABIC LETTER SEEN INITIAL FORM
212 D4 ? ARABIC LETTER SHEEN INITIAL FORM
213 D5 ? ARABIC LETTER SAD INITIAL FORM
214 D6 ? ARABIC LETTER DAD INITIAL FORM
215 D7 ? ARABIC LETTER TAH
216 D8 ? ARABIC LETTER ZAH
217 D9 ? ARABIC LETTER AIN INITIAL FORM
218 DA ? ARABIC LETTER GHAIN INITIAL FORM
223 DF ? ARABIC LETTER AIN
225 E1 ? ARABIC LETTER FEH INITIAL FORM
226 E2 ? ARABIC LETTER QAF INITIAL FORM
227 E3 ? ARABIC LETTER KAF INITIAL FORM
228 E4 ? ARABIC LETTER LAM INITIAL FORM
229 E5 ? ARABIC LETTER MEEM INITIAL FORM
230 E6 ? ARABIC LETTER NOON INITIAL FORM
231 E7 ? ARABIC LETTER HEH INITIAL FORM
232 E8 ? ARABIC LETTER WAW
233 E9 ? ARABIC LETTER ALEF MAKSURA
234 EA ? ARABIC LETTER YEH INITIAL FORM
235 EB ? ARABIC LETTER DAD
236 EC ? ARABIC LETTER AIN MEDIAL FORM
237 ED ? ARABIC LETTER GHAIN FINAL FORM
238 EE ? ARABIC LETTER GHAIN
239 EF ? ARABIC LETTER MEEM
242 F2 ? ARABIC LETTER NOON
243 F3 ? ARABIC LETTER HEH
244 F4 ? ARABIC LETTER HEH MEDIAL FORM
245 F5 ? ARABIC LETTER ALEF MAKSURA FINAL FORM
246 F6 ? ARABIC LETTER YEH FINAL FORM
247 F7 ? ARABIC LETTER GHAIN MEDIAL FORM
248 F8 ? ARABIC LETTER QAF
251 FB ? ARABIC LETTER LAM
252 FC ? ARABIC LETTER KAF
253 FD ? ARABIC LETTER YEH

syzop

2013-05-11 13:39

administrator   ~0017542

Thanks.

So if we want to support Arabic nicknames, we need to include 'acsii_Common Arabic Letters', and also include 'acsii_Numbers'.

Do we also need to include 'acsii_special Arabic Letter Forms (total 7)' ?
Are these letters? Sorry I know nothing about Arabic.
If they are letters then they need to be included.
If they are punctuation symbols like ' " ` : ; , . and such as used in Latin/other scripts, then they probably should not be included, as we don't allow such symbols in western nick names either (with a few exceptions).

aldheffiry

2013-05-11 16:59

reporter   ~0017543

Last edited: 2013-05-11 19:48

The Special Arabic Letter Forms are two linked letters that together act as one letter, like the "ch" in (ch)annel.

They are considered as important as the common letters but are rarely used. I suppose you can add them too.

aldheffiry

2013-05-11 19:47

reporter   ~0017544

Last edited: 2013-05-12 04:13

While I was on my way to take a shower I figured out that we don't need to include 'ASCII_special Arabic Letter Forms', because each one is a combination of two common letters and that could cause trouble.
===================================
except (224 E0 ? ARABIC TATWEEL).
===================================
We have to include (224 E0 ? ARABIC TATWEEL) because it is one unique letter with its own shape, and it is included in most services scripts without problems. It is just rarely used.
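
Taken together, the recommendation at this point in the ticket is the 65 common letters, the 10 digits, and TATWEEL. The following sketch shows how that could be written as a table. All names are hypothetical, and the byte values are simply those posted in this ticket; they have not been verified against any code page or against the UnrealIRCd source.

```c
#include <string.h>

struct byte_range { unsigned char from, to; };

/* 'acsii_Common Arabic Letters' (65 values) from the breakdown, as ranges. */
static const struct byte_range common_letters[] = {
    {162, 162}, {165, 165}, {168, 171}, {173, 175}, {186, 186},
    {188, 190}, {193, 218}, {223, 223}, {225, 239}, {242, 248},
    {251, 253}
};

/* 'acsii_Numbers' (10 values) and the one special form kept, TATWEEL. */
static const struct byte_range digits  = {176, 185};
static const struct byte_range tatweel = {224, 224};

/* Mark every byte in r as allowed; returns how many were newly added. */
static int add_range(unsigned char allowed[256], struct byte_range r)
{
    unsigned int b;
    int added = 0;

    for (b = r.from; b <= r.to; b++) {
        if (!allowed[b]) {
            allowed[b] = 1;
            added++;
        }
    }
    return added;
}

/* Fill the table and return the number of allowed byte values. */
int build_arabic_table(unsigned char allowed[256])
{
    size_t i;
    int total = 0;

    memset(allowed, 0, 256);
    for (i = 0; i < sizeof(common_letters) / sizeof(common_letters[0]); i++)
        total += add_range(allowed, common_letters[i]);
    total += add_range(allowed, digits);
    total += add_range(allowed, tatweel);
    return total;
}
```

`build_arabic_table` returns 76 (65 + 10 + 1). The space, punctuation and sign entries such as 160, 164 and 172 stay disallowed, as do the six ligature forms and the two SHADDA entries, which appear in none of the three groups.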

nenolod

2013-05-12 07:23

reporter   ~0017548

hello,

we already support iso8859-6, based on atheme's libguess table on unrealircd 3.4.

===
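        if (latin1 || !strcmp(name, "arabic"))
        {
                /* Single ISO 8859-6 bytes: no-break space, currency sign, Arabic comma,
                 * soft hyphen, Arabic semicolon, Arabic question mark; 0x00 ends the list. */
                char bytes[] = { 0xa0, 0xa4, 0xac, 0xad, 0xbb, 0xbf, 0x00 };
                charsys_addallowed(bytes);
                charsys_addallowed_range(0xc1, 0xda); /* ISO 8859-6: HAMZA .. GHAIN */
                charsys_addallowed_range(0xe0, 0xf2); /* ISO 8859-6: TATWEEL .. SUKUN */
        }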
===

if it is not working, can you provide a list of bytes that you are seeing rejection issues with, and the relevant portions of the configuration?

syzop

2013-05-12 16:06

administrator   ~0017550

I see. I'm not sure if libguess is a good source for this purpose...

aldheffiry provided a broken down list with which characters to allow a few comments up, so we can use that and compare it with your table. See 0003734:0017541.

I see A0 / A4 and such in nenolod's paste. I think those shouldn't be included, or at least I have my doubts. I happened to comment on them in my previous comment at 0003734:0017539; note that the same comment may apply to other characters as well.
I also miss A9, AA, and a lot of other characters, which according to aldheffiry's table from above should be included.
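
Both observations can be checked against the snippet. This sketch rebuilds the table from the posted bytes and ranges. `addallowed` and `addallowed_range` are stand-ins written for this example; they are assumed, not confirmed, to behave like the `charsys_*` functions. It only compares byte values and does not establish that both lists use the same character set.

```c
#include <string.h>

static unsigned char allowed[256];  /* 1 = byte may appear in a nickname */

/* Stand-in: allow every byte of a 0x00-terminated list. */
static void addallowed(const unsigned char *bytes)
{
    for (; *bytes; bytes++)
        allowed[*bytes] = 1;
}

/* Stand-in: allow every byte in the inclusive range from..to. */
static void addallowed_range(unsigned char from, unsigned char to)
{
    unsigned int i;

    for (i = from; i <= to; i++)
        allowed[i] = 1;
}

/* Rebuild the table exactly as the posted "arabic" snippet describes it. */
void load_snippet_table(void)
{
    static const unsigned char bytes[] = {
        0xa0, 0xa4, 0xac, 0xad, 0xbb, 0xbf, 0x00
    };

    memset(allowed, 0, sizeof(allowed));
    addallowed(bytes);
    addallowed_range(0xc1, 0xda);
    addallowed_range(0xe0, 0xf2);
}

int byte_is_allowed(unsigned char b)
{
    return allowed[b];
}
```

After `load_snippet_table()`, `byte_is_allowed(0xa0)` and `byte_is_allowed(0xa4)` are true, while `byte_is_allowed(0xa9)` and `byte_is_allowed(0xaa)` are false, which matches the two points raised above.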

aldheffiry

2013-05-12 16:36

reporter   ~0017551

Last edited: 2013-05-12 17:07

I'd like to try atheme's libguess table on my unrealircd. Would you mind telling me the steps I need to install and configure it?

By the way, I am using Unreal3.2.10.1.

I patched charsys according to your changes: http://hg.unrealircd.org/hg/unreal/rev/ad2e45a60382

nenolod

2013-05-13 07:44

reporter   ~0017553

hmm.

libguess defines iso8859-6 as such:

https://github.com/atheme/libguess/blob/master/src/libguess/guess.scm#L415

are we talking about windows-1256 instead of iso8859-6? libguess just accepts all byte sequences there at the moment.

(libguess is indeed not really meant for nicknames -- but can be used to guess codepages successfully, some patches for other ircds exist for that)

aldheffiry

2013-05-13 10:48

reporter   ~0017554

Last edited: 2013-05-13 10:54

What about UTF-8? It supports Arabic letters.

By the way, which directory should libguess be installed in?

nenolod

2013-05-14 00:18

reporter   ~0017555

utf-8 has problems for nicknames because there are duplicate characters that could look like rfc1459 frame elements or other special characters in irc (@+~&% for example).

a small, non-exhaustive search yields at least these, but there are likely many more:

=====
- U+02F8 -- Raised colon (:)
- U+FE30 -- Vertical two-dot (looks like colon) (:)
- U+FE55 -- Small colon (:)
- U+FE5F -- Small hash symbol (#)
- U+FE60 -- Small ampersand (&)
- U+FE61 -- Small asterisk (*)
- U+FE69 -- Small dollar symbol ($)
- U+FE6B -- Small at symbol (@)
- U+FF03 -- (Japanese) Full-width hash symbol (#)
- U+FF04 -- (Japanese) Full-width dollar symbol ($)
- U+FF06 -- (Japanese) Full-width ampersand (&)
- U+FF0A -- (Japanese) Full-width asterisk (*)
- U+FF1A -- (Japanese) Full-width colon (:)
- U+FF20 -- (Japanese) Full-width at symbol (@)
=====

i think the right way to proceed is to encourage clients to use punycode representation for nicknames containing multi-byte utf-8 sequences.

this requires teaching clients to understand punycode, but poses fewer problems.
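
To make the look-alike problem concrete, here is a minimal sketch of a check that scans a UTF-8 nickname for the code points listed above. It is not taken from any IRC server: the function names are hypothetical, the list is only the non-exhaustive one from this comment, and the decoder is deliberately small (it does not reject overlong encodings or surrogates).

```c
#include <stddef.h>

/* Code points from the list above that resemble IRC-special characters. */
static const unsigned long confusables[] = {
    0x02F8, 0xFE30, 0xFE55, 0xFE5F, 0xFE60, 0xFE61, 0xFE69,
    0xFE6B, 0xFF03, 0xFF04, 0xFF06, 0xFF0A, 0xFF1A, 0xFF20
};

static int is_confusable(unsigned long cp)
{
    size_t i;

    for (i = 0; i < sizeof(confusables) / sizeof(confusables[0]); i++)
        if (confusables[i] == cp)
            return 1;
    return 0;
}

/* Decode one UTF-8 sequence at s into *cp.
 * Returns the number of bytes consumed, or 0 if the sequence is malformed. */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t len, i;

    if (s[0] < 0x80)                { *cp = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else return 0;

    for (i = 1; i < len; i++) {
        /* A continuation byte must be 10xxxxxx; this also stops at a NUL. */
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Returns 1 if nick contains a listed look-alike, 0 if it does not,
 * and -1 if nick is not valid UTF-8 as far as this decoder can tell. */
int nick_has_confusable(const char *nick)
{
    const unsigned char *p = (const unsigned char *)nick;

    while (*p) {
        unsigned long cp;
        size_t n = utf8_decode(p, &cp);

        if (n == 0)
            return -1;
        if (is_confusable(cp))
            return 1;
        p += n;
    }
    return 0;
}
```

A deny-list like this can only ever cover the characters someone has already noticed, which is the weakness the comment points at: the set of look-alikes is open-ended.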

aldheffiry

2013-05-16 11:03

reporter   ~0017577

IRC support for displaying UTF-8 text as Unicode has been added. This works in status, channel, query, and other windows, and in nickname listboxes, window titlebars, switchbar, and tooltips. The display of UTF-8 can be enabled by default for all windows in the Options/IRC/Messages dialog, or individually for any window you like via the Fonts dialog. Use the /font command to open the Fonts dialog. Make sure you select a font that contains the characters or script (hebrew, arabic, greek, cyrillic,...) you want to see!

syzop

2013-05-16 21:17

administrator   ~0017579

For UTF8 see 0003719. It also touches on something similar to what nenolod just said: the 'similar characters problem'.

Best to keep the discussion in there.

And keep this ticket on iso8859-6 or the windows-whatever codepage(s).

aldheffiry

2013-05-17 07:32

reporter   ~0017581

Last edited: 2013-05-17 11:32

But what about the file I sent you? Could that help?

syzop

2016-03-20 11:56

administrator   ~0019142

This was added at some point by nenolod in 4.x (based on someone's patch?) but was later disabled because the feature was untested and caused UnrealIRCd to crash.

So at this point the feature is not usable yet.

Issue History

Date Modified Username Field Change
2008-10-06 08:11 ramitta New Issue
2009-07-24 01:14 Stealth Relationship added has duplicate 0003861
2010-09-26 20:17 syzop Note Added: 0016381
2010-11-17 18:03 syzop Note Added: 0016422
2010-11-17 18:03 syzop Status new => feedback
2013-05-11 10:17 syzop Note Added: 0017538
2013-05-11 10:21 syzop Note Added: 0017539
2013-05-11 10:40 aldheffiry Note Added: 0017540
2013-05-11 10:51 aldheffiry Note Edited: 0017540
2013-05-11 10:52 aldheffiry Note Edited: 0017540
2013-05-11 10:52 aldheffiry Note Edited: 0017540
2013-05-11 11:24 aldheffiry Note Added: 0017541
2013-05-11 13:39 syzop Note Added: 0017542
2013-05-11 16:59 aldheffiry Note Added: 0017543
2013-05-11 16:59 aldheffiry Note Edited: 0017543
2013-05-11 17:00 aldheffiry Note Edited: 0017543
2013-05-11 19:47 aldheffiry Note Added: 0017544
2013-05-11 19:48 aldheffiry Note Edited: 0017543
2013-05-11 19:49 aldheffiry Note Edited: 0017544
2013-05-11 19:53 aldheffiry Note Edited: 0017544
2013-05-11 19:54 aldheffiry Note Edited: 0017544
2013-05-11 19:55 aldheffiry Note Edited: 0017544
2013-05-12 04:13 aldheffiry Note Edited: 0017544
2013-05-12 07:23 nenolod Note Added: 0017548
2013-05-12 16:06 syzop Note Added: 0017550
2013-05-12 16:36 aldheffiry Note Added: 0017551
2013-05-12 17:05 aldheffiry Note Edited: 0017551
2013-05-12 17:06 aldheffiry Note Edited: 0017551
2013-05-12 17:07 aldheffiry Note Edited: 0017551
2013-05-13 07:44 nenolod Note Added: 0017553
2013-05-13 10:48 aldheffiry Note Added: 0017554
2013-05-13 10:54 aldheffiry Note Edited: 0017554
2013-05-14 00:18 nenolod Note Added: 0017555
2013-05-16 11:03 aldheffiry Note Added: 0017577
2013-05-16 21:17 syzop Note Added: 0017579
2013-05-17 07:32 aldheffiry Note Added: 0017581
2013-05-17 11:32 aldheffiry Note Edited: 0017581
2013-05-17 11:32 aldheffiry Note Edited: 0017581
2015-08-08 17:49 syzop Severity minor => feature
2016-03-20 11:56 syzop Note Added: 0019142