View Issue Details

IDProjectCategoryView StatusLast Update
0005099unrealircdpublic2018-06-15 07:44
ReporterGottem Assigned Tosyzop  
PrioritynormalSeverityminorReproducibilityalways
Status closedResolutionno change required 
Summary0005099: StripControlCodes() reduces multi-word string to single-word
DescriptionRemember how we spoke about my usage of StripControlCodes(StripColors(text)) and how just the first one should be enough? Welllll it turns out that if you do that, for some reason a multi-word string gets reduced to just one word. :D Check/run the code snippet under reproduction steps, then swap the lines assigning stuff to plaintext and you'll notice only the second variant returns a proper string thru CallCmdoverride(). Strangely enough both flavours yield the same result up to and including sendto_realops().

I've compared the 2 functions but I don't see anything that could be wrong with it. There's one difference in the first if within the while loop but idk what's going on with the variable nc etc. ;_;

My dev net is still (partially) on 4.0.12.1 but the original report about my module doing this (m_anticaps) came from someone on the support IRC (I wanna say Amiga or DG but idr for sure) and he was on 4.0.17 or maybe the git version.
Steps To Reproducestatic int someoverride(Cmdoverride *ovr, aClient *cptr, aClient *sptr, int parc, char *parv[]) {
    // Input: PRIVMSG #test :This is multi word
    char p[BUFSIZE];
    char *plaintext;
    snprintf(p, sizeof(p), "%s", parv[2]);
    plaintext = (char *)StripControlCodes(p);
    //plaintext = (char *)StripColors(StripControlCodes(p)); // Unsafe, but just testing anyways
    parv[2] = plaintext;
    sendto_realops("%s -- %d", parv[2], parc); // This is multi word -- 3
    return CallCmdoverride(ovr, cptr, sptr, parc, parv);
    // Output: This
}

I know I can use ircd_log() instead of sendto_realops() but my terminal will get covered by something eventually anyways so this is just easier imo. :>
3rd party modulesN/A

Activities

syzop

2018-06-13 08:13

administrator   ~0020146

So I try both:
plaintext = (char *)StripControlCodes(p);
and also:
plaintext = (char *)StripColors(StripControlCodes(p));

For both I see:
*** This is multi word -- 3
And if I break on CallCmdoverride in gdb:
Breakpoint 1, CallCmdoverride (ovr=0x555555f19c90, cptr=0x555555f77200, sptr=0x555555f77200, parc=3,
    parv=0x555555920040 <para>) at modules.c:1514
1514 {
(gdb) n
1515 if (ovr->prev)
(gdb) p parv[1]
$1 = 0x7fffffffe028 "#test"
(gdb) p parv[2]
$2 = 0x7ffff2663540 <new_str> "This is multi word"
(gdb) p parv[3]
$3 = 0x0
(gdb)

I don't see anything special.

But you were saying you couldn't reproduce it either ? Or could you?

If you can reproduce the problem with 4.0.18-rc1, let me know.

Gottem

2018-06-14 17:50

developer   ~0020147

Yeah I can reproduce it eternally it seems. :D

I did some more digging and it took a bit but I think I figured out the exact reproduction conditions. I'm on the git version (which hasn't received another commit since your 4.0.18-rc1 version change) and I have exactly 1 module loaded, so I can add a cmdoverride and experiment with it. Also, the sendto_realops() message always gave me proper output, it's that another channel user would get the truncated one.

So:
* I'm connected twice to the same server, once as oper via SSL and another time as a regular user on a plaintext connection
* I send the message from the regular user and read it on the oper side (since opers bypass many checks I put in my modules I always test from this angle)
* I restarted the server after removing *every other module* from the config file (it actually crashed on rehash but that's not really a shocker :D)
* For good measure I'm snprintf()'ing parv[2] and char *text into a char p[BUFSIZE] -- I noticed that your m_tkl for example does StripControlCodes() directly on str_in and judging from the stripping code it's perfectly safe since you create another buffer there, but let's exclude every possibility =]

Now, my testing module contains:
* Override for PRIVMSG: the code is exactly the example one from the OP
* Handler for HOOKTYPE_PRE_CHANMSG: see below

The hook code looks something like:
char *test_hook_prechanmsg(aClient *sptr, aChannel *chptr, char *text, int notice) {
    char txt[BUFSIZE];
    char *cleantext;
    char *word;
    char *p = NULL;
    snprintf(txt, sizeof(txt), "%s", text);
    cleantext = (char *)StripControlCodes(txt);
    for(word = strtoken(&p, cleantext, " ,.-"); word; word = strtoken(&p, NULL, " ,.-")) { }
    return text;
}

If I remove the strtoken() call/for loop I see the entire message on both sides. Which is especially weird considering I'm first cloning the "text", then StripControlCodes() returns another (different) buffer which is then finally used for strtoken(). So it shouldn't even come near the original char *text. Then I dug even deeper and it gets a bit more strange: both the override and hook handler seem to be necessary, since disabling either results in seeing the full message. I'm guessing something static gets messed up along the way, depending on what order the code is called in? Like StripControlCodes() > StripControlCodes() > strtoken() might trigger it (override, then hook) but StripControlCodes() > strtoken() > StripControlCodes() might not.

syzop

2018-06-14 18:11

administrator   ~0020148

Could you give me a complete module file that I can test with? So I can just load that and see the problem? (And tell me what to do)

Gottem

2018-06-14 18:20

developer   ~0020149

Sure: https://pastebin.com/raw/BJszh1ek
Simply load just this module, connect twice to the same server, send a multi-word message from one end and check the message on the other end.

syzop

2018-06-14 18:53

administrator   ~0020150

Thanks. This has to do with buffer re-use.
1) The override is called and:
plaintext = (char *)StripControlCodes(p);
parv[2] = plaintext;
This means the internal buffer ('new_str') from StripControlCodes() is now parv[2]
2) Now the prechanmsg hook is called:
The parameter 'text' is still the internal buffer from StripControlCodes()
But then:
cleantext = (char *)StripControlCodes(txt);
Since StripControlCodes() works on the same internal buffer again for output (the 'new_str' in StripControlCodes), it will also affect 'text'.
In other words:
cleantext = text :)
Then the subsequent strtoken() will frag the buffer.

So what would fix this? A couple of options but best would be to copy the result of StripControlCodes to somewhere else before you assign it to parv[2]. Eg:

static int test_override_privmsg(Cmdoverride *ovr, aClient *cptr, aClient *sptr, int parc, char *parv[])
{
    char plaintext[BUFSIZE];
    char *res;
    res = StripControlCodes(parv[2]);
    strlcpy(plaintext, res, sizeof(plaintext));
    parv[2] = plaintext;
    return CallCmdoverride(ovr, cptr, sptr, parc, parv);
}

Also, be aware of the following. Your example didn't do this, but if you ever need the storage to be accessible AFTER you return, such as in the example below, then you need to use 'static char' as storage so it's not on the stack:

char *test_hook_prechanmsg(aClient *sptr, aChannel *chptr, char *text, int notice)
{
    static char plaintext[BUFSIZE];
    char *res;
    res = StripControlCodes(text);
    strlcpy(plaintext, res, sizeof(plaintext));
    return plaintext;
    /* because 'plaintext' is not on the stack, it can safely be accessed by the next hook, or PRIVMSG itself.. */
}

syzop

2018-06-14 19:09

administrator   ~0020151

Btw, if you are hitting a conflict with something that you can't control, for example in Unreal (not in the example from abov3e), then let me know and we can sort something out :)

Gottem

2018-06-14 19:15

developer   ~0020152

Figured it was something like that, but I don't quite know the internals as well as you (like, I have no idea what happens to parv[2] after CallCmdoverride). Also explains why StripColors(StripControlCodes(text)) seems to work just fine. :D So thanks for pointing it out.

And yeah, I'm familiar with returning stuff not on the stack. But doesn't prechanmsg just ignore whatever text you return? It'll break only if it's NULL and otherwise it'll use the original text pointer afaik (I think you told me even :>).

As for conflicts, I tend to create bug reports for stuff I'm missing so that should be snazzy. =]

syzop

2018-06-14 19:20

administrator   ~0020153

Last edited: 2018-06-14 19:21

Good :)

And yes, sorry, prechanmsg (HOOKTYPE_PRE_CHANMSG) was indeed a bad example... should have been chanmsg (HOOKTYPE_CHANMSG).

Gottem

2018-06-14 23:09

developer   ~0020154

Helder. :>

Issue History

Date Modified Username Field Change
2018-05-30 22:49 Gottem New Issue
2018-06-13 08:13 syzop Note Added: 0020146
2018-06-14 07:39 syzop Assigned To => syzop
2018-06-14 07:39 syzop Status new => feedback
2018-06-14 17:50 Gottem Note Added: 0020147
2018-06-14 18:11 syzop Note Added: 0020148
2018-06-14 18:20 Gottem Note Added: 0020149
2018-06-14 18:53 syzop Note Added: 0020150
2018-06-14 19:09 syzop Note Added: 0020151
2018-06-14 19:15 Gottem Note Added: 0020152
2018-06-14 19:20 syzop Note Added: 0020153
2018-06-14 19:20 syzop Note Edited: 0020153
2018-06-14 19:21 syzop Note Edited: 0020153
2018-06-14 23:09 Gottem Note Added: 0020154
2018-06-15 07:44 syzop Status feedback => closed
2018-06-15 07:44 syzop Resolution open => no change required