Summary: ASTERISK-23617: Asterisk segfault when AgentLogin concur in a time frame using Realtime ODBC
Reporter: VoIPCamp (voipcamp)
Labels:
Date Opened: 2014-04-10 13:03:22
Date Closed: 2014-07-09 15:02:48
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_agent Resources/res_odbc
Versions: 11.8.1 11.9.0
Frequency of Occurrence: Frequent
Related Issues:
Environment: CentOS release 6.5 2.6.32-431.11.2.el6.x86_64 (was 2.6.32-431.5.1.el6.x86_64) Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz 12Gb RAM
Attachments:
( 0) core.1014.gdb.txt
( 1) core.1014.log.txt
( 2) core.12979.txt
( 3) core.17055.gdb.txt
( 4) core.17055.log.txt
( 5) core.17381.txt
( 6) core.21046.txt
( 7) core.28334.gdb.txt
( 8) core.28334.log.txt
( 9) core.28596.gdb.txt
(10) core.28596.log.txt
(11) core.4930.txt
(12) core.5904.txt
(13) core.6580.gdb.txt
(14) core.6580.log.txt
(15) core.ast1190.14903.gdb.txt
(16) core.ast1190.14903.log.txt
(17) core.ast1190.1620.gdb.txt
(18) core.ast1190.1620.log.txt
(19) core.ast1190.7930.gdb.txt
(20) core.ast1190.7930.log.txt
Description: Asterisk segfaults when AgentLogin or logoff (hanging up the chan_agent call) operations occur concurrently within a short time frame. This usually happens when operations start or when the agents' shift is over.

Realtime ODBC is used for dynamic agents and parts of the dialplan. I've isolated some backtraces that seem to point to an issue with the ODBC driver itself and not Asterisk (even so, I think Asterisk should avoid crashing).
Comments:

By: Richard Mudgett (rmudgett) 2014-04-10 16:18:58.950-0500

Please attach text files as {{.txt}} files so it is easy to view the backtraces.  JIRA sucks at figuring out what mime type a file is on its own and seems to depend upon file name extensions instead.

By: VoIPCamp (voipcamp) 2014-04-10 17:54:03.839-0500

bt full
thread apply all bt
compiled with DONT_OPTIMIZE and BETTER_BACKTRACES
Last messages in log are "Asterisk uncleanly ending" and the modules unregistering process.

By: VoIPCamp (voipcamp) 2014-04-10 21:54:13.913-0500

Sorry about the file extension, [~rmudgett]. It's fixed now.

By: Rusty Newton (rnewton) 2014-04-14 15:21:03.453-0500

Can you provide an Asterisk log as shown on the wiki here https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information ?

Please verify it has DEBUG and VERBOSE messages in the log, and runs up to the time of the crash.

By: VoIPCamp (voipcamp) 2014-04-14 17:19:05.206-0500

[~rnewton]: Unfortunately some logs were logrotated and then deleted. I've just uploaded new backtraces which have a corresponding log file. Thank you.

By: Rusty Newton (rnewton) 2014-05-03 13:47:22.692-0500

Please re-test with 11.9.0. The non-ODBC crash should be fixed. (see ASTERISK-23103)

For this one: https://issues.asterisk.org/jira/secure/attachment/49948/core.17055.gdb.txt  , I don't know.

First, let's see if both still occur in the newer version.


By: VoIPCamp (voipcamp) 2014-05-05 12:18:53.304-0500

[~rnewton] We have been working with 11.9.0 since April 27, no problems so far. Please give us a few more days to see whether the crash happens with this version.

By: VoIPCamp (voipcamp) 2014-05-13 02:20:42.597-0500

Two cores today using Asterisk 11.9.0.

Several "audiohook.c: Read factory %p and write factory %p both fail to provide 160 samples" and "Read/Write factory %p was pretty quick last time, waiting for them." messages were stripped from the logs to keep them short.

By: VoIPCamp (voipcamp) 2014-05-26 22:05:02.900-0500

Another core today. This one looks pretty similar to the already uploaded [^core.ast1190.1620.gdb.txt].



By: Rusty Newton (rnewton) 2014-06-10 09:58:12.058-0500

Didn't see your response since Enter Feedback or Send Back were not selected when you commented. Thanks for the additional logs and traces. We'll take a look.

By: Rusty Newton (rnewton) 2014-06-10 12:59:44.187-0500

I thought we already had this, but can you go ahead and post all relevant configuration on here? res_odbc, your backend, agents.conf, queues.conf, everything involved that could lead to the specific configuration needed to reproduce. Of course scrub confidential information please.

By: Rusty Newton (rnewton) 2014-06-10 13:00:10.528-0500

Remember to press Enter Feedback or Send Back when you have responded so that we'll see it.

By: Rusty Newton (rnewton) 2014-06-25 12:59:51.229-0500

[~kmoore] looked into the backtraces. It appears you have roughly five different crashes. Memory corruption is suspected. To validate, you should run under [Valgrind|https://wiki.asterisk.org/wiki/display/AST/Valgrind] and provide the resulting output. You'll also want to run a memory diagnostic to check your system RAM.

Press Enter Feedback when you have the requested information.

By: Rusty Newton (rnewton) 2014-07-09 15:02:37.611-0500

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.  Further information can be found at http://www.asterisk.org/developers/bug-guidelines