View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001332 | unreal | ircd | public | 2003-11-02 07:09 | 2003-12-23 22:46 |
Reporter | AngryWolf | Assigned To | |||
Priority | normal | Severity | crash | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Platform | Linux | OS | SuSE | OS Version | 8.2 |
Product Version | 3.2-beta18 | ||||
Summary | 0001332: Rehash won't reload modules and can cause trouble | ||||
Description | On the 31st of August, Rocko reported a crash bug (0001222), which codemastr said was caused by an unofficial module. I'm the author of that module, and I'm stating UnrealIRCd has something to do with the bug. I've installed a test server and started it, with no errors. After connecting to the server, I type /module:
*** m_objects - $Id: m_objects.c,v 1.3 2003/11/01 20:52:34 angrywolf Exp $ (Objects dumper (commands /events, /hooks & /modules))
*** m_bopmhelper - v0.0.1 (BOPM Helper)
*** commands - $Id: l_commands.c,v 1.1.2.49 2003/05/10 18:53:50 codemastr Exp $ (Wrapper library for m_ commands)
As you can see, I'm using an "official" module, m_bopmhelper, and my m_objects module for debugging. I've uploaded the latter on this page. Now I use the /hooks command:
*** Hook LOCAL_CONNECT: o/m_bopmhelper a/135813048
*** Hook CONFIGPOSTTEST: o/m_bopmhelper a/135796096
*** Hook REHASH: o/m_bopmhelper a/135813144
*** Hook STATS: o/commands a/135822456
*** Hook CONFIGTEST: o/m_bopmhelper a/135795088
*** Hook CONFIGTEST: o/commands a/135795040
*** Hook CONFIGRUN: o/commands a/135822408
*** Hook CONFIGRUN: o/m_bopmhelper a/135813096
*** 8 hooks
That's OK, now let's do a /rehash:
** Notice -- AngryWolf is rehashing server config file
382 unrealircd.conf : Rehashing
*** Notice -- Loading IRCd configuration ..
*** Notice -- Configuration loaded without any problems ..
And /hooks again:
*** Notice -- Loading IRCd configuration ..
*** Notice -- Configuration loaded without any problems ..
*** Hook LOCAL_CONNECT: o/m_bopmhelper a/135795064
*** Hook REHASH: o/m_bopmhelper a/135809840
*** Hook STATS: o/commands a/135819392
*** Hook CONFIGTEST: o/commands a/136601280
*** Hook CONFIGRUN: o/commands a/135819344
*** Hook CONFIGRUN: o/m_bopmhelper a/135795112
*** 6 hooks
Hm? The CONFIGTEST and CONFIGPOSTTEST hooks of m_bopmhelper have been removed? Why?
Well, if these hooks are freed up from the memory, the MOD_UNLOAD function must be lucky if it expects that the memory space for the hook is still allocated in the memory. Otherwise in function HookDel "hook->type" might get a false value, and "p" too (p = Hooks[hook->type]). I think it's not my bug; actually, I don't know which is correct: removing hooks in MOD_UNLOAD, or skipping hooks added in MOD_TEST. | ||||
Tags | No tags attached. | ||||
Attached Files | |||||
3rd party modules | |||||
|
I can reproduce the bug anytime I want, with a very large b2.conf included in the main configuration file. When I start the server, it doesn't respond for a very long time, and I have to kill ircd. Then I get a core file. Details:
Core was generated by `/home/angrywolf/IRCNetwork/Unreal-server1/src/ircd'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libssl.so.0.9.6...done.
Loaded symbols for /usr/lib/libssl.so.0.9.6
Reading symbols from /usr/lib/libcrypto.so.0.9.6...done.
Loaded symbols for /usr/lib/libcrypto.so.0.9.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libz.so.1...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from src/modules/commands.so...done.
Loaded symbols for src/modules/commands.so
Reading symbols from src/modules/m_bopmhelper.so...done.
Loaded symbols for src/modules/m_bopmhelper.so
#0 0x0806ce4c in HookDel (hook=0x0) at modules.c:944
944 for (p = Hooks[hook->type]; p; p = p->next) {
(gdb) bt
#0 0x0806ce4c in HookDel (hook=0x0) at modules.c:944
#1 0x40017eb2 in Mod_Unload (module_unload=0) at m_bopmhelper.c:103
#2 0x0806d160 in unload_all_modules () at modules.c:1063
#3 0x08068dfb in s_die () at ircd.c:229
#4 <signal handler called>
#5 0x080764d7 in config_parse (filename=0x8186570 "b2.conf", confdata=0x402c0008 "/* \n Unreal Internet Relay Chat Daemon\n Copyright (C) Carsten V. Munk 2000\n\n NOTE: Those words are not meant to insult you (the user)\n but is meant to be a list of words so that the +G channel/"...) at s_conf.c:986
#6 0x0807623b in config_load (filename=0x8186570 "b2.conf") at s_conf.c:885
#7 0x08077352 in load_conf (filename=0x8186570 "b2.conf") at s_conf.c:1486
#8 0x080793af in _conf_include (conf=0x8181d40, ce=0x8181e68) at s_conf.c:2480
#9 0x08077421 in load_conf (filename=0x80a2801 "unrealircd.conf") at s_conf.c:1507
#10 0x080770e3 in init_conf (rootconf=0x80a2801 "unrealircd.conf", rehash=0) at s_conf.c:1372
#11 0x08069d44 in main (argc=1, argv=0xbffff6b4) at ircd.c:1094
#12 0x401979ed in __libc_start_main () from /lib/libc.so.6
Hmm.. OK, in this case ircd was killed, so it's not the original bug if it is. edited on: 11-02-03 14:04 |
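Frame #0 of the backtrace shows HookDel being entered with hook=0x0 and then dereferencing hook->type. A defensive variant could look roughly like the sketch below. The struct, the table size and the names hook_add and hook_del_checked are simplified stand-ins invented for illustration; they are not the actual UnrealIRCd definitions.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_HOOK_TYPES 16          /* hypothetical table size */

/* Simplified stand-in for the server's hook record (assumption, not the real struct). */
typedef struct Hook {
    int type;                      /* index into Hooks[] */
    struct Hook *next;
} Hook;

Hook *Hooks[MAX_HOOK_TYPES];       /* one singly linked list per hook type */

/* Registers a hook of the given type; returns NULL on bad type or allocation failure. */
Hook *hook_add(int type)
{
    if (type < 0 || type >= MAX_HOOK_TYPES)
        return NULL;
    Hook *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->type = type;
    h->next = Hooks[type];
    Hooks[type] = h;
    return h;
}

/* Removes a hook. Returns 0 on success, -1 if the pointer is NULL,
 * has an out-of-range type, or is not found in its list. */
int hook_del_checked(Hook *hook)
{
    if (hook == NULL)
        return -1;                 /* the hook=0x0 case from the backtrace */
    if (hook->type < 0 || hook->type >= MAX_HOOK_TYPES)
        return -1;
    for (Hook **pp = &Hooks[hook->type]; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == hook) {
            *pp = hook->next;      /* unlink, then release */
            free(hook);
            return 0;
        }
    }
    return -1;
}
```

With a guard like this, a Mod_Unload that runs before its hooks were ever registered (as when the server is killed during config parsing) would get an error code back instead of a segmentation fault.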
|
Well, if these hooks are freed up from the memory, the MOD_UNLOAD function must be lucky if it expects that the memory space for the hook is still allocated in the memory. Otherwise in function HookDel "hook->type" might get a false value, and "p" too (p = Hooks[hook->type]).
I haven't a clue what you are trying to say there... How is the Mod_Unload function being lucky? I don't understand what you are saying at all. At the time Mod_Unload is called, NO memory has been freed. Therefore no "luck" is involved, the hook IS allocated. So how is it possible that hook->type has a "false" value? You're going to have to be more specific on what exactly you think is causing this problem, because from your description, I haven't a clue. |
|
Yeah, right, okay. Another stupid bugreport... edited on: 11-03-03 08:47 |
|
Actually no, I did some more testing and I found the problem. I have no idea how to fix it yet, but the problem stems from the unfortunate way that Linux handles .so files. |
|
While you are at it, maybe you can fix that "Rehash won't reload modules" (0001325) bug too (somehow... if possible...) :). |
|
The rehash problem cited there is the exact same problem. Well, not the same problem, but the same thing causes both. The problem is, *nix only lets you have one copy of a .so loaded. It uses reference counting. If I dlopen() the file 15 times, each after making a change, dlopen() will return the same pointer every time. The only solution I can come up with, and it definitely has problems (I'm totally open to suggestions), is that we must "rename" the .so file somehow so dlopen() is fooled into thinking this .so is actually a different .so and will load the new code rather than returning a pointer to the old code. If anyone has any suggestions on how this could be accomplished better, I'm willing to listen.
In any case, to describe this bug a bit more: since the new dlopen() just returns a pointer to the old module, when the new hook is added in Mod_Test (ConfHookTest) it overwrites the old value there. The module pointer is the same, so ConfHookTest is the same variable rather than a new one, and when HookAdd is called, ConfHookTest is changed. Now, when the "old" module calls HookDel, since it is the same module as the "new" one, it too has the new ConfHookTest value. As a result, HookDel deletes the new hook, and the old hook is simply leaked. |
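The overwrite-then-leak sequence described in this note can be modelled without any real modules. The following is only a toy sketch: the list, hook_add, hook_del, mod_test and mod_unload are simplified, hypothetical stand-ins, and the single global ConfHookTest plays the role of the module variable that both "instances" share because dlopen() handed back the same image.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy hook record; the id only exists so we can tell hooks apart. */
typedef struct Hook {
    int id;
    struct Hook *next;
} Hook;

Hook *hooks = NULL;            /* server-side list of registered hooks */
int next_id = 1;

Hook *hook_add(void)
{
    Hook *h = malloc(sizeof *h);
    if (h == NULL)
        abort();
    h->id = next_id++;
    h->next = hooks;
    hooks = h;
    return h;
}

void hook_del(Hook *h)
{
    for (Hook **pp = &hooks; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == h) {
            *pp = h->next;
            free(h);
            return;
        }
    }
}

int hook_count(void)
{
    int n = 0;
    for (Hook *p = hooks; p != NULL; p = p->next)
        n++;
    return n;
}

/* Module-side state: ONE variable, shared by the "old" and "new" instance. */
Hook *ConfHookTest = NULL;

void mod_test(void)   { ConfHookTest = hook_add(); }
void mod_unload(void) { hook_del(ConfHookTest); ConfHookTest = NULL; }

/* Replays the rehash sequence and returns the id of the hook left behind. */
int simulate_rehash(void)
{
    mod_test();      /* initial load registers hook 1 */
    mod_test();      /* rehash: same image, hook 2 overwrites the pointer */
    mod_unload();    /* "old" instance unloads, but deletes hook 2 */
    return hooks != NULL ? hooks->id : 0;
}
```

After simulate_rehash(), the only hook still registered is the first one, and nothing holds a pointer to it any more, which mirrors the "new hook deleted, old hook leaked" outcome.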
|
ah now I get it... fun! :P. |
|
Also hmm... will that work (having two instances loaded)? Won't it cause any symbol conflicts? Like.. I have a modname_config_test() routine.. what if another module with the same function name (since it's the same module) tries to load? :P I do make a lot of functions static, but.. not the ones that get called by unrealircd (hooks, new commands, callbacks, etc) ;p. |
|
As far as I know, no symbols are exported from dynamically loaded modules, you have to get the addresses of such symbols with dlsym(). By the way, think about what would happen if you load tons of modules, each of them having Mod_Init, Mod_Load, Mod_Unload, etc... |
|
Yeah AngryWolf is correct, the symbols are not exported in the standard sense. They are "exportable" meaning you can dlsym() them, but they aren't loaded into the symbol table of the loading program. |
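Both behaviours discussed so far, the same handle coming back from repeated dlopen() calls and symbols being fetched explicitly with dlsym(), can be checked with a small sketch. It assumes a glibc-based Linux system where "libm.so.6" can be found by the loader; the helper names same_handle_twice and call_by_name are made up for this example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if two dlopen() calls on the same library yield the same handle. */
int same_handle_twice(const char *lib)
{
    void *a = dlopen(lib, RTLD_NOW);
    void *b = dlopen(lib, RTLD_NOW);
    int same = (a != NULL && a == b);
    if (b != NULL)
        dlclose(b);            /* each dlopen() needs its own dlclose() */
    if (a != NULL)
        dlclose(a);
    return same;
}

/* Looks up a double(double) function by name with dlsym() and calls it.
 * Returns 0 on success and stores the result in *out, -1 on failure. */
int call_by_name(const char *lib, const char *sym, double arg, double *out)
{
    void *h = dlopen(lib, RTLD_NOW);
    if (h == NULL)
        return -1;
    double (*fn)(double) = NULL;
    *(void **)(&fn) = dlsym(h, sym);   /* POSIX idiom for function pointers */
    if (fn == NULL) {
        dlclose(h);
        return -1;
    }
    *out = fn(arg);
    dlclose(h);
    return 0;
}
```

For example, call_by_name("libm.so.6", "cos", 0.0, &r) resolves cos at run time without the program being linked against libm. On older glibc versions the program has to be linked with -ldl.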
|
Good. Hope it works... More and more people are asking for it. For example when I released a new anti dcc module people didn't understand it didn't upgrade on /rehash (and maybe some have not even noticed it didn't) :). Also then, finally the commands.so stuff would be reloaded (since there's no alternative except to reboot), which would be great if we find some minor bug :). Patch-while-you-run :p. |
|
Yeah I agree, it would be great, the problem is renaming. For example, say I do, at reload, rename commands.so to commands.so.2, well some people run Unreal with setuid, therefore, we can't necessarily be sure that that rename will succeed. Another alternative is to write it to /tmp, but I'm not sure I like that idea. Again, any suggestions? |
|
Yeah I played a bit and it sucks indeed. Before I continue I would like to say that we probably shouldn't rename since we _might_ get interrupted somehow between rename and rename-it-back (eg: killed, crash, whatever.. not unlikely). I tried (success=loaded it twice, failed=used old handle):
- ./bla.so & /full/path/to/bla.so: failed
- making a symlink from ./bla.so to ./bla.copy.so and loading it: failed
- making a hardlink from ./bla.so to ./bla.copy.so and loading it: succeeded
- copying ./bla.so to ./bla.copy.so and loading it: succeeded
As you can see, I too, don't see any other way than modifying the filesystem (too bad trick 1 doesn't work)... So yes, I think we should use a temp directory or something [1] and try to hardlink (or if that fails, copy) and load it.
[1] Perhaps we could add a directory tmp/ in the Unreal3.2 directory. I always find public directories (like /tmp) a bit scary.. race conditions, but also that $$$ 3rd party modules might stay in /tmp or can be grabbed by other users. Since pretty much all users run it on a multi-user shell that's not-so-good, ok.. well.. not very good reasons maybe, I just hate writing stuff to public(-writable) dirs :P. |
|
I agree with Syzop that copying the module to DPATH/tmp (or anywhere else in the main dir of Unreal) would be better than using /tmp, which is too risky. As another example, hardlinks cannot cross filesystem boundaries. Most people install UnrealIRCd in their own directory under /home, which is usually on a different partition from the one mounted on /. So hardlinking to /tmp would fail in this case. |
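The "hardlink, or if that fails, copy" idea from the two notes above might be sketched as follows. This is not the code that went into UnrealIRCd; link_or_copy and copy_file are hypothetical helpers, and the caller is assumed to pick a fresh destination name inside a private tmp/ directory and pass that path to dlopen() afterwards.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback: plain byte copy of src into a newly created dst. Returns 0 on success. */
static int copy_file(const char *src, const char *dst)
{
    char buf[8192];
    ssize_t n = 0;
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0700);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write(out, buf, (size_t)n) != n) {   /* treat short writes as errors */
            n = -1;
            break;
        }
    }
    close(in);
    if (close(out) != 0)
        n = -1;
    if (n < 0) {
        unlink(dst);                             /* do not leave a partial copy */
        return -1;
    }
    return 0;
}

/* Tries a hardlink first and falls back to copying when link() fails,
 * for instance with EXDEV when src and dst are on different filesystems.
 * An already existing dst is reported as an error rather than overwritten. */
int link_or_copy(const char *src, const char *dst)
{
    if (link(src, dst) == 0)
        return 0;
    if (errno == EEXIST)
        return -1;
    return copy_file(src, dst);
}
```

Keeping the destination inside the server's own directory means the hardlink usually succeeds, and the copy path only matters on setups where linking is not permitted or crosses a filesystem boundary.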
|
Something else I'm considering. Someone, a long time ago, mentioned to me it isn't the fact that it is the same file, it is the fact that it is the same "version" of the file. That kind of leads me to believe there is some flag we can pass to the linker or something that after each build it can increase the version number or something. Granted, that doesn't fix the HookDel issue, but it does make /rehash load the new .so after it has been changed. |
|
[changed title] |
|
i'm not experienced at all with programming ;) but i just found at man ld
--major-image-version value: Sets the major number of the "image version". Defaults to 1.
--major-os-version value: Sets the major number of the "os version". Defaults to 4.
--major-subsystem-version value: Sets the major number of the "subsystem version". Defaults to 4.
--minor-image-version value: Sets the minor number of the "image version". Defaults to 0.
--minor-os-version value: Sets the minor number of the "os version". Defaults to 0.
--minor-subsystem-version value: Sets the minor number of the "subsystem version". Defaults to 0.
could that be the solution for the reload on /rehash bug with setting a version number higher and higher on every compile ? |
|
I believe this is now fixed (along with the /rehash doesn't load new .so file thing) as of CVS .2000. I'm not closing the bug report just yet simply because I think this new system is going to need a bit of testing. For example I wasn't able to test to make sure I didn't add anything that will screw up on Windows. This definitely does solve the hook problem, I tested that, I just haven't tested to ensure this works on all OSes and all system configurations. If you have any problems let me know. |
|
penna22, yeah that seems like it would do the trick, the problem is, not everyone has ld... |
|
well as i said i'm not experienced at all with all that compiling/coding stuff ;) |
|
Syzop: in your latest commit (.2004), you wrote "temporarely file" two times, which should be "temporary file", just a suggestion. |
|
could you say where? ;) |
|
modules.c and support.c (let me know if you want more exact info). |
|
yeah I've no idea what you are talking about... |
|
Sorry, probably it's me not giving you detailed information. Here they are:
[angrywolf@gep1 Unreal]$ grep -r -n temporarely .
./Changes:2073:- Added some temporarely SQLINE debugging/trace code
./src/modules.c:164: return "Unable to create temporarely file!";
./src/support.c:1707: config_error("Unable to create temporarely file in directory '%s': %s",
Better? |
|
oh that, I think that was intentional. |
|
I think he is saying you spelled the word wrong. It's "temporary" not "temporarely" |
|
oh 2 times != 2 times but 2 times.. right! confusing indeed. *edit* will fix that later (or if code is bored...), working on other stuff atm :| edited on: 12-11-03 01:28 |
|
Sorry, Syzop, next time I'll keep the confusion out of my messages. :-) There is one more issue I've experienced recently: if I run ./Configure twice or more, I get this message:
mkdir: cannot create directory `/home/angrywolf/IRC/IRCNetwork/Server1/tmp': File exists
So, at some point, if you coders really get bored, I'd be glad to see this very minor non-crash bug fixed. :-) Thanks in advance. |
|
Oh, and another one (not a major issue either):
* unrealircd.conf:6: loadmodule src/modules/m_getinfo.so: failed to load: tmp/327B23C6.m_getinfo.so: cannot open shared object file: No such file or directory
You know, that happens when src/modules/m_getinfo.so doesn't exist. I just don't think the error message above is the proper one. People might think there's something wrong with their temporary directories. |
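One way to get a less misleading message would be to check the original module path before any temporary copy is made, and to name that path in the error. The sketch below only illustrates the idea; check_module_source is a hypothetical helper and the message wording is invented, not what the server prints.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Verifies that the module named in the config is readable.
 * Returns 0 if it is; otherwise writes a message that mentions the
 * ORIGINAL path (not the temporary copy) into err and returns -1. */
int check_module_source(const char *path, char *err, size_t errlen)
{
    if (access(path, R_OK) == 0)
        return 0;
    snprintf(err, errlen, "loadmodule %s: cannot read module file: %s",
             path, strerror(errno));
    return -1;
}
```

Run before the link-or-copy step, a check like this would make a missing src/modules/m_getinfo.so show up under its own name rather than as a problem with a file under tmp/.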
|
question: could that be extended to external channel modes or is it just not possible? it would be very useful for my first coded module for unreal if i could just make changes to it without restarting my server ;) |
|
that takes a lot of work and is dangerous, therefore we won't do it (or at least not anytime soon). |
|
ah ok it's just the last feature i need from unreal but if it's hard to code it's understandable that it's still missing ;) |
|
No one has reported any problems, so I consider this fixed. |
Date Modified | Username | Field | Change |
---|---|---|---|
2003-11-02 07:09 | AngryWolf | New Issue | |
2003-11-02 07:09 | AngryWolf | File Added: m_objects.c | |
2003-11-02 11:23 | AngryWolf | Note Added: 0003929 | |
2003-11-02 14:04 | AngryWolf | Note Edited: 0003929 | |
2003-11-03 04:34 | | Note Added: 0003946 | |
2003-11-03 08:24 | AngryWolf | Note Added: 0003948 | |
2003-11-03 08:25 | AngryWolf | Note Edited: 0003948 | |
2003-11-03 08:47 | AngryWolf | Note Edited: 0003948 | |
2003-11-03 14:49 | | Note Added: 0003949 | |
2003-11-04 00:29 | syzop | Note Added: 0003958 | |
2003-11-04 05:39 | | Note Added: 0003959 | |
2003-11-04 14:05 | syzop | Note Added: 0003965 | |
2003-11-04 14:10 | syzop | Note Added: 0003966 | |
2003-11-04 18:33 | AngryWolf | Note Added: 0003968 | |
2003-11-04 21:55 | | Note Added: 0003970 | |
2003-11-04 21:59 | syzop | Note Added: 0003971 | |
2003-11-05 04:29 | | Note Added: 0003972 | |
2003-11-05 15:41 | syzop | Note Added: 0003973 | |
2003-11-05 20:01 | AngryWolf | Note Added: 0003974 | |
2003-11-05 23:42 | | Note Added: 0003975 | |
2003-11-10 01:04 | syzop | Status | new => confirmed |
2003-11-10 01:04 | syzop | ETA | none => < 1 month |
2003-11-10 01:04 | syzop | Summary | Crash with HookDel => Rehash won't reload modules and can cause trouble |
2003-11-10 01:05 | syzop | Note Added: 0003996 | |
2003-11-10 08:14 | penna22 | Note Added: 0003997 | |
2003-12-06 01:11 | | Note Added: 0004217 | |
2003-12-06 01:13 | | Status | confirmed => feedback |
2003-12-06 05:43 | | Note Added: 0004219 | |
2003-12-06 08:42 | penna22 | Note Added: 0004220 | |
2003-12-10 21:44 | AngryWolf | Note Added: 0004286 | |
2003-12-10 22:02 | syzop | Note Added: 0004287 | |
2003-12-10 22:08 | AngryWolf | Note Added: 0004288 | |
2003-12-10 22:33 | syzop | Note Added: 0004289 | |
2003-12-10 22:59 | AngryWolf | Note Added: 0004290 | |
2003-12-10 23:18 | syzop | Note Added: 0004291 | |
2003-12-11 00:24 | | Note Added: 0004297 | |
2003-12-11 01:18 | syzop | Note Added: 0004298 | |
2003-12-11 01:28 | syzop | Note Edited: 0004298 | |
2003-12-11 17:32 | AngryWolf | Note Added: 0004303 | |
2003-12-11 18:07 | AngryWolf | Note Added: 0004304 | |
2003-12-11 19:26 | penna22 | Note Added: 0004311 | |
2003-12-11 20:09 | syzop | Note Added: 0004312 | |
2003-12-11 20:25 | penna22 | Note Added: 0004313 | |
2003-12-23 22:46 | | Status | feedback => resolved |
2003-12-23 22:46 | | Resolution | open => fixed |
2003-12-23 22:46 | | Assigned To | => codemastr |
2003-12-23 22:46 | | Note Added: 0004419