[Home]

Summary:ASTERISK-17036: [patch] Output of queue_log stopped after several hours
Reporter:Juan Carlos Valero (eb3dgz)Labels:
Date Opened:2011-01-19 05:23:21.000-0600Date Closed:2011-08-10 15:55:52
Priority:MajorRegression?
Status:Closed/CompleteComponents:Applications/app_queue
Versions:Frequency of
Occurrence
Related
Issues:
causesASTERISK-20504 Asterisk 1.8 should use [compat] section in asterisk.conf and behave acconding to the flag pbx_realtime=1.6
is caused byASTERISK-15863 [patch] Improve realtime queue logging
is duplicated byASTERISK-18208 logger doesn't append data to queue_log file after asterisk starts
Environment:Attachments:( 0) backtrace-threads.txt
( 1) core-show-locks.txt
( 2) jira_asterisk_17036_v1.8.patch
( 3) queue_logger.patch
Description:Tested in two systems with Asterisk 1.8.2. After several hours the logger output to queue_log is stopped and there are no more writes of statistics in this file.

After a reload of the logger module the queue_log output starts working well again but with the loss of the events processed while logger was not wrinting queue_log.

This affect report statistics of queues, and is very important in call center environments. No effect has been noted with CDR, as CDR works OK. No call lost in CDR, only in Queue_log.

****** ADDITIONAL INFORMATION ******

In logger.conf:

queue_log = yes
queue_log_to_file = yes
queue_log_name = queue_log

An example:

1295004727|1295004638.165|operadora|SIP/501|COMPLETECALLER|9|52|1
1295435167|NONE|NONE|NONE|CONFIGRELOAD|
1295435679|1295435652.224|comercial|NONE|ENTERQUEUE||660481917|1

As can see, after an output of Jabuary 11th, the statistics stopped. After a reload works again.

No selinux, no errors in messages or full logs. Simply stopped.
Comments:By: Leif Madsen (lmadsen) 2011-01-19 14:55:31.000-0600

I think you should provide a 'core show locks' when it is stuck, as that could be a deadlock. Additionally a backtrace would also be useful.

https://wiki.asterisk.org/wiki/display/AST/Debugging

By: Juan Carlos Valero (eb3dgz) 2011-01-19 15:57:08.000-0600

No locks under "core show locks", and can not produce a backtrace as no core dumped. Asterisk is running OK, no apparently any problem, but no queue_log information update is produced.

By: Sean Bright (seanbright) 2011-01-19 16:52:06.000-0600

You can generate a backtrace without a core file, see:

https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace#GettingaBacktrace-GettingInformationForADeadlock

By: Juan Carlos Valero (eb3dgz) 2011-01-20 02:25:22.000-0600

I have uploaded a backtrace and outut of show locks. After I do a "module logger reload" I can see in the console:

[Jan 20 09:22:09]   == Parsing '/etc/asterisk/logger.conf': [Jan 20 09:22:09]   == Found
[Jan 20 09:22:09]  Asterisk Queue Logger restarted

After this, tue queue statistics are updated OK from a random period, until stopped now.

By: Sean Bright (seanbright) 2011-01-20 08:53:28.000-0600

I would assume this is affecting all of your log files (the CDRs use a separate mechanism).  Do you have logging output going to your full or messages file and do these stop at the same time?

By: Juan Carlos Valero (eb3dgz) 2011-01-20 08:56:28.000-0600

No, the output of /var/log/asterisk/full and /var/log/asterisk/messages logfiles are OK. Only the queue_log file is stopped.

For now doing a periodical "logger reload" seems to stop the problem, but this is a bad solution (only temporal).

By: Denis Smirnov (mithraen) 2011-01-26 07:11:39.000-0600

I also have this issue, but queue_log in realtime. Small part of calls writed to database, but it <10% of calls.

"module reload" command doesn't help.

By: Juan Carlos Valero (eb3dgz) 2011-02-14 02:48:33.000-0600

This patch solve the problem. Tested in two machines for more than one week without any problem.

--- main/logger.c 2011-02-11 10:49:55.000000000 +0100
+++ main/logger.c 2011-02-11 10:49:47.000000000 +0100
@@ -393,12 +393,13 @@
AST_RWLIST_INSERT_HEAD(&logchannels, chan, list);
global_logmask |= chan->logmask;
}
- if (qlog) {
- char tmp[4096];
+
+ if (qlog)
fclose(qlog);
- snprintf(tmp, sizeof(tmp), "%s/%s", ast_config_AST_LOG_DIR, queue_log_name);
- qlog = fopen(tmp, "a");
- }
+
+ char tmp[4096];
+ snprintf(tmp, sizeof(tmp), "%s/%s", ast_config_AST_LOG_DIR, queue_log_name);
+ qlog = fopen(tmp, "a");

if (!locked) {
AST_RWLIST_UNLOCK(&logchannels);

By: David Woolley (davidw) 2011-02-14 05:28:07.000-0600

Digium won't read patches that are submitted as comments, because they are not supported by suitable copyright clearances.  You need to complete the electronic licence grant, then submit the patch as an attachment, with the documentation or code box ticked.

Alternatively, describe the nature of the problem in sufficient detail that it is easy to work out how to patch it.

By: Juan Carlos Valero (eb3dgz) 2011-02-14 05:29:21.000-0600

Sorry, I have uploaded the patch and have do the licence grant.

By: Ronald Chan (loloski) 2011-02-17 21:52:36.000-0600

eb3dgz: is this applicable for 1.6.2 branch? thanks

By: Juan Carlos Valero (eb3dgz) 2011-02-18 01:13:10.000-0600

Loloski: sorry, I don't known.I don't have any 1.6 running for do testing. It tested with 1.8.2 and 1.8.3 releases and works ok! No more issues after do the patch two weeks ago in several systems.

By: Ronald Chan (loloski) 2011-02-22 15:59:29.000-0600

I think 1.6 is not affected with this issue.

By: Amilcar S Silvestre (amilcar) 2011-03-01 08:36:58.000-0600

I have 1.8.4-rc2 installed and after asterisk boots, it doesn't write do queue_log until we do a "logger reload".

This patch corrects the problem. Thanks.

By: Dmitry Manin (d1m4) 2011-03-10 06:49:40.000-0600

Thanks for the patch.

By: Juan Carlos Valero (eb3dgz) 2011-03-10 07:29:12.000-0600

After one month with the problem solved, this code is secure for include into future versions of Asterisk, but no insertion into SVN. Why ?

By: Adrien P. (adri) 2011-04-27 15:03:21

I am affected by this bug and the patch work well on my production system for 2 weeks without problem.

By: Juan Carlos Valero (eb3dgz) 2011-04-27 15:20:44

To the admins: can be possible to include this patch into the mainsteam code ?

By: Bill Mania (billmania) 2011-07-21 12:07:12.025-0500

This patch also appears to correct a problem with asterisk 1.8.4.2 on debian. After restarting asterisk, the queue_log file wasn't updated until after I manually executed "logger reload" via the CLI. After applying the patch, I no longer have to execute the "logger reload".

By: zebble (zebble) 2011-07-27 16:28:33.308-0500

Any chance of this being included at all?  1.8.5.0 is out now, and this was reported back in 1.8.2.1 (6 months ago).  It's a fairly critical bug, as it prevents packages such as Asternic's Call Center Stats or QueueMetrics from working at all.

Are most people using SQL-based queue logging?  If they are, which Queue stats packages are they using as I can't seem to find one that uses a SQL-based queue log.

By: Benoit P (b_plessis) 2011-07-27 16:54:02.214-0500

As for the last question, well, QueueMetrics for one is able to  ...

By: Richard Mudgett (rmudgett) 2011-07-28 18:34:04.299-0500

[^jira_asterisk_17036_v1.8.patch] is the first patch version posted on https://reviewboard.asterisk.org/r/1333

By: Richard Mudgett (rmudgett) 2011-08-10 15:55:52.018-0500

[^jira_asterisk_17036_v1.8.patch] committed.

By: Indreias Ioan (indreias) 2011-09-24 13:16:41.929-0500

Issue looks to be fixed but not on 1.8.6.0 (installed from Asterisk RPM repo) even that this version was released on 30/Aug/2011 (issue marked as resolve on 10/Aug/2011)
Strange enough is that the ChangeLog for the last version (1.8.7.0) do not mention that this issue (#18642) was fixed.
Could somebody confirm the real status of this issue?

By: Richard Mudgett (rmudgett) 2011-09-26 09:57:24.524-0500

Look above.  There is a tab called "Subversion Commits".  Clicking on that tab you will see the commit message.  The ChangeLog will have that commit message if it is in that version.  The Mantis issue 18642 reference is no longer used to reference this issue since the Mantis to JIRA transition.  The JIRA reference for this issue is ASTERISK-17036.

The reason a particular commit is not in a particular revision has to do with when the -rc1 version of an Asterisk version was tagged.  If the -rc1 tag was done before an issue fix commit then it will not likely be put in that release version.


By: Indreias Ioan (indreias) 2011-09-26 15:05:08.958-0500

Thanks Richard for sorting this out - it was my confusion between JIRA and Mantis issue IDs.