ASTERISK-19251: Manager eventq fills up with events with Usecount neq 0


Summary: ASTERISK-19251: Manager eventq fills up with events with Usecount neq 0

Reporter: John Covert (jcovert)
Labels:
Date Opened: 2012-01-26 22:47:47.000-0600
Date Closed: 2017-12-12 19:16:19.000-0600
Priority: Major
Regression?
Status: Closed/Complete
Components: Core/ManagerInterface
Versions: 1.8.8.2
Frequency of Occurrence: Constant
Related Issues:
Environment: Linux 2.6.32-71.29.1.el6.i686 #1 SMP Mon Jun 27 18:07:00 BST 2011 i686 i686 i386 GNU/Linux
Attachments:
( 0) Console-2012-01-27.txt
( 1) Console-2012-01-30.txt
( 2) Inscribe-2012-01-26.txt

Description: Once a user has connected to the manager, the eventq begins to fill up. The attached output from "manager show eventq" shows over 3400 events on a system up for less than two weeks.

Comments:

By: John Covert (jcovert) 2012-01-26 22:53:21.321-0600

Please note that there are no currently connected users.

By: Matt Jordan (mjordan) 2012-01-27 08:47:51.109-0600

Does the "manager show connected" list any users?

How are you connecting to the manager session?
If over TCP, with the displayconnects config option set to yes, do you see the log statement "Manager '[username]' logged off from [address]"?
If over HTTP, with the displayconnects config option set to yes, do you see the log statement "HTTP Manager '[username]' logged off from [address]\n"?

By: John Covert (jcovert) 2012-01-27 14:48:58.760-0600

I am connecting with http, and yes, I do see the disconnect message. This is easy to reproduce: the entries with non-zero Usecounts start showing up once there has been a login and then further call activity.

In the attachment Console-2012-01-27.txt, I show a recently started system with no problems yet.

An http manager "Login" is done. Then a call is placed through the PBX. Subsequently, an http "WaitEvent" is issued, and the http session is allowed to time out.

As you can see, there is an event left on the queue with a non-zero Usecount, even after the call is gone, and even after there are further logins. This event will never go away, and over time thousands of them build up.

In addition to fixing the bug, it would be ideal if it were possible to allow a Login parameter which indicates that this session will only be issuing commands (other than WaitEvent) and looking at output from those commands, so that events don't even need to be queued unless other logged-in sessions need them.

/john

By: Matt Jordan (mjordan) 2012-01-30 07:08:49.365-0600

John:

From your log, it doesn't look like you are seeing the disconnect message. Instead, you are seeing:

"HTTP Manager 'local' timed out from 127.0.0.1"

That's not the same message I posted. It appears as if an HTTP manager session that times out is not fully destructed, leaving the session object in memory. This would account for why the events are being queued, even when no manager session is technically active.
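The suspected mechanism can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in the manager module: the `Event`, `EventQueue`, and `Session` classes are hypothetical, and the only behaviour taken from this report is that events carry a Usecount and that events with a zero Usecount are eventually purged. The model assumes a session pins the newest event when it is created and releases it only when it is torn down cleanly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    seq: int
    usecount: int = 0  # number of sessions still holding this event


@dataclass
class EventQueue:
    events: List[Event] = field(default_factory=list)

    def append(self, seq: int) -> Event:
        ev = Event(seq)
        self.events.append(ev)
        return ev

    def purge(self) -> None:
        # Drop every unreferenced event except the newest one.
        if not self.events:
            return
        newest = self.events[-1]
        self.events = [
            ev for ev in self.events if ev.usecount > 0 or ev is newest
        ]


class Session:
    """Hypothetical manager session that pins one event in the queue."""

    def __init__(self, queue: EventQueue) -> None:
        self.last_ev: Optional[Event] = queue.events[-1] if queue.events else None
        if self.last_ev is not None:
            self.last_ev.usecount += 1

    def destroy(self) -> None:
        # Clean teardown releases the pinned event.
        if self.last_ev is not None:
            self.last_ev.usecount -= 1
            self.last_ev = None


def run(clean_teardown: bool) -> List[int]:
    """Return the Usecounts left on the queue after a purge."""
    queue = EventQueue()
    queue.append(1)
    session = Session(queue)  # pins event 1
    queue.append(2)
    queue.append(3)
    if clean_teardown:
        session.destroy()
    # Otherwise the session is abandoned without releasing its reference.
    queue.purge()
    return [ev.usecount for ev in queue.events]


print(run(clean_teardown=True))   # [0]
print(run(clean_teardown=False))  # [1, 0]
```

In the abandoned case the pinned event can never be purged, which matches the symptom of entries with a non-zero Usecount accumulating over time.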

In the meantime, you should be able to avoid this by setting the httptimeout to a larger value, and explicitly logging out, instead of letting the session time out.

By: John Covert (jcovert) 2012-01-30 08:13:50.799-0600

Matt:

Doesn't matter. Same thing happens even if you log off.

See the attached file Console-2012-01-30.txt. I rebooted my test system to clear the queue, logged in with the HTTP manager, issued a WaitEvent, placed a call, issued another WaitEvent, hung up the call, issued a Logoff, and waited 2.5 minutes for purge_events to remove the events with a zero Usecount. The log then shows that there is still an event with a Usecount of 1.
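For anyone trying to reproduce this, the request sequence might look like the sketch below. It only builds URLs and sends nothing. The host, port 8088, the `/rawman` path without a prefix, and the secret are assumptions about the test system and are not taken from the attached logs; the username `local` is the one shown in the timeout message quoted earlier. A real client would also need to send back the session cookie returned by the Login request on each later request.

```python
from typing import List
from urllib.parse import urlencode

# Assumed location of the HTTP manager interface on the test system.
BASE_URL = "http://127.0.0.1:8088/rawman"


def action_url(action: str, **params: str) -> str:
    """Build the URL for one manager action sent over HTTP."""
    query = urlencode({"action": action, **params})
    return f"{BASE_URL}?{query}"


def repro_steps(username: str, secret: str) -> List[str]:
    """URLs for the reproduction sequence, in the order they are issued."""
    return [
        action_url("Login", username=username, secret=secret),
        action_url("WaitEvent"),
        # ... place a call through the PBX here ...
        action_url("WaitEvent"),
        # ... hang up the call here ...
        action_url("Logoff"),
    ]


for url in repro_steps("local", "s3cret"):  # placeholder credentials
    print(url)
```

After the last request, wait for the purge interval to pass and check "manager show eventq" for entries whose Usecount is still non-zero.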

Also, even though unrelated to this problem, setting httptimeout to a larger value is disastrous (we already tried that) because then the purge_events algorithm, which uses httptimeout*2.5 to determine which events with a zero use count to remove, allows the queue to grow even more.
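To put numbers on that, here is a small sketch of the arithmetic. The factor of 2.5 is the one described above; the function names and the event rate are hypothetical and serve only to show how the retained backlog scales with httptimeout.

```python
def retention_seconds(httptimeout: int) -> float:
    """Seconds a zero-Usecount event is kept, per the httptimeout*2.5 rule."""
    return httptimeout * 2.5


def retained_idle_events(httptimeout: int, events_per_second: float) -> float:
    """Rough steady-state count of zero-Usecount events awaiting purge.

    events_per_second is a made-up workload figure, not a measurement.
    """
    return retention_seconds(httptimeout) * events_per_second


print(retention_seconds(60))           # 150.0 seconds, i.e. 2.5 minutes
print(retention_seconds(3600))         # 9000.0 seconds, i.e. 2.5 hours
print(retained_idle_events(60, 0.5))   # 75.0
print(retained_idle_events(3600, 0.5)) # 4500.0
```

With the timeout at 60 seconds the retention window is the 2.5 minutes mentioned above; raising the timeout lengthens the window, and the idle backlog, in direct proportion.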

While you're working in this module, btw, this line in the output of "manager show settings"
HTTP Timeout (minutes): 60
actually displays the timeout in seconds. It would be simplest to change the message to seconds, rather than calculate and display a fractional value for minutes.

By: John Covert (jcovert) 2012-08-29 11:27:07.804-0500

Ping.

By: Corey Farrell (coreyfarrell) 2017-12-12 19:16:19.142-0600

I believe this was fixed by ASTERISK-24505.