View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0004948 | unreal | ircd | public | 2017-05-10 15:55 | 2017-05-10 17:05 |
Reporter | Eman | Assigned To | syzop | ||
Priority | normal | Severity | block | Reproducibility | have not tried |
Status | resolved | Resolution | open | ||
Product Version | 4.0.9 | ||||
Summary | 0004948: large number of connections causes port to stop working | ||||
Description | after a large number of server connection attempts (every 30 seconds for at least 48 hours), unreal gives the following errors at every subsequent server connect attempt: [2017/05/10 - 9:48:10AM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) [2017/05/10 - 9:48:10AM] -coruscant.r-type.ca- Connect to host yavin.r-type.us[172.16.2.14] failed: Success [2017/05/10 - 9:48:31AM] -coruscant.r-type.ca- *** All connections in use. ([@*/6697]) [2017/05/10 - 9:48:40AM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) [2017/05/10 - 9:48:40AM] -coruscant.r-type.ca- Connect to host yavin.r-type.us[172.16.2.14] failed: Success [2017/05/10 - 9:49:01AM] -coruscant.r-type.ca- *** All connections in use. ([@*/6697]) this causes the port to be entirely unresponsive for even user connections. the only fix appears to be restarting the ircd. | ||||
Additional Information | for reference, this is the server output when it 'switched' to spitting out the no more connections: [2017/05/09 - 10:12:10PM] -coruscant.r-type.ca- Exiting ssl client yavin.r-type.us[@172.16.2.14.0]: SSL_connect(): Connection refused [2017/05/09 - 10:12:10PM] -coruscant.r-type.ca- Lost connection to yavin.r-type.us[172.16.2.14]: SSL_connect(): Connection refused [2017/05/09 - 10:12:40PM] -coruscant.r-type.ca- *** Connection to yavin.r-type.us[172.16.2.14] activated. [2017/05/09 - 10:12:40PM] -coruscant.r-type.ca- Exiting ssl client yavin.r-type.us[@172.16.2.14.0]: SSL_connect(): Connection refused [2017/05/09 - 10:12:40PM] -coruscant.r-type.ca- Lost connection to yavin.r-type.us[172.16.2.14]: SSL_connect(): Connection refused [2017/05/09 - 10:13:10PM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) [2017/05/09 - 10:13:11PM] -coruscant.r-type.ca- Connect to host yavin.r-type.us[172.16.2.14] failed: Success [2017/05/09 - 10:13:36PM] -coruscant.r-type.ca- *** All connections in use. ([@*/6697]) [2017/05/09 - 10:13:40PM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) [2017/05/09 - 10:13:40PM] -coruscant.r-type.ca- Connect to host yavin.r-type.us[172.16.2.14] failed: Success [2017/05/09 - 10:14:10PM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) [2017/05/09 - 10:14:10PM] -coruscant.r-type.ca- Connect to host yavin.r-type.us[172.16.2.14] failed: Success [2017/05/09 - 10:14:40PM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) [2017/05/09 - 10:14:40PM] -coruscant.r-type.ca- Connect to host yavin.r-type.us[172.16.2.14] failed: Success [2017/05/09 - 10:15:10PM] -coruscant.r-type.ca- *** No more connections allowed (yavin.r-type.us) | ||||
Tags | No tags attached. | ||||
3rd party modules | |||||
|
running /close does nothing to help this: [2017/05/10 - 9:55:55AM] -> Server: close [2017/05/10 - 9:55:56AM] 0: Connections Closed [2017/05/10 - 9:55:56AM] -coruscant.r-type.ca- *** [email protected] closed 0 unknown connections |
|
Which Operating System? (Strange, I thought that was a mandatory field :D) |
|
centos 6.9 |
|
If it happens again, could you run an: lsof -n -p PID-OF-IRCD-PROCESS Or if you don't have lsof, you can do: ls -al /proc/PID-OF-IRCD-PROCESS/fd Naturally, replace PID-OF-IRCd-PROCESS with an actual number :) What you describe ("all connections in use") is that all file descriptors are in use. If you think this is incorrect, for example because you see 800 clients and you have compiled with a MAXCONNECTIONS of 1024 then it is probably "leaking" file descriptors. These will show up in the lsof / ls from above. |
|
Ah, just saw your strace (downloaded and deleted the attachment, for some reason attachments cannot be private so it was public). Strange, it sends "All connections in use" to FD 13. Such a low FD number during such an error is a bit unexpected, though theoretically possible. In fact all the FD numbers I saw in your trace are low, as in <40. You probably restarted by now, but.. the lsof or ls from above would still be useful when you hit the problem OR you can run it before that if it starts leaking fd's gradually.. then they will show up already after a while and no need to wait until you hit 1024. ** I could try to reproduce it by the method you describe - server connection attempts - it may or may not be related. ** Any 3rd party modules loaded ? |
|
Nevermind tracing. I can reproduce the problem here. It's not a FD leak, but an incorrect counter. |
|
** public ** https://github.com/unrealircd/unrealircd/commit/a6f5460ad819902920146c37a04bfbc7051de9ed commit a6f5460ad819902920146c37a04bfbc7051de9ed Author: Bram Matthys <[email protected]> Date: Wed May 10 17:03:45 2017 +0200 Update OpenFiles upon failed SSL connect to remote server. Reported by Eman (0004948). |
Date Modified | Username | Field | Change |
---|---|---|---|
2017-05-10 15:55 | Eman | New Issue | |
2017-05-10 15:56 | Eman | Note Added: 0019759 | |
2017-05-10 16:03 | syzop | Note Added: 0019760 | |
2017-05-10 16:04 | Eman | File Added: strace.txt | |
2017-05-10 16:04 | Eman | Note Added: 0019762 | |
2017-05-10 16:46 | syzop | Note Added: 0019763 | |
2017-05-10 16:47 | syzop | File Deleted: strace.txt | |
2017-05-10 16:50 | syzop | Note Added: 0019764 | |
2017-05-10 16:52 | syzop | Note Edited: 0019764 | |
2017-05-10 16:58 | syzop | Assigned To | => syzop |
2017-05-10 16:58 | syzop | Status | new => confirmed |
2017-05-10 16:58 | syzop | Note Added: 0019765 | |
2017-05-10 16:59 | syzop | Severity | minor => block |
2017-05-10 16:59 | syzop | View Status | public => private |
2017-05-10 17:05 | syzop | Status | confirmed => resolved |
2017-05-10 17:05 | syzop | View Status | private => public |
2017-05-10 17:05 | syzop | Note Added: 0019767 |