
Summary: ASTERISK-24450: Random Segmentation fault crash due to static realtime
Reporter: Deepak Singh Rawat (dsr)
Labels:
Date Opened: 2014-10-26 00:51:28
Date Closed: 2014-11-20 15:49:42.000-0600
Priority: Critical
Regression?:
Status: Closed/Complete
Components: Applications/app_agent_pool Resources/res_odbc
Versions: 12.4.0
Frequency of Occurrence: Frequent
Related Issues:
Environment: uname -a: Linux prod2-asterisk02 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 21:36:05 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
OS: CentOS release 6.4 (Final)
Memory: 24 GB
Attachments: (0) backtrace_odbc_crash.txt
Description: Please find the backtrace attached. Asterisk 12.4 crashes randomly in production. The crash appears to be related to static realtime. We use an MS SQL Server database, and the agent info and credentials are stored in a remote db. We use func_odbc.conf to read the password from the db.
Comments:

By: Michael L. Young (elguero) 2014-10-26 09:46:13.208-0500

Do you have idlecheck turned on?

By: Deepak Singh Rawat (dsr) 2014-10-26 12:20:24.667-0500

[~elguero] idlecheck is turned off. Earlier we were using Asterisk 1.4 with the same database, also with idlecheck turned off, and we never faced this issue. We have other servers on Asterisk 1.4 and they are running fine. When we revert this server to 1.4, it does not crash. We only see this issue with Asterisk 12.4, and only on one server. We have one more production server on 12.4 that does not show the issue. The one difference between the two servers is user/call load: the server that crashes carries almost 10x the user load of the stable one. Did something related to idlecheck change in 12.4? Also, wasn't the idlecheck issue only with MySQL?

By: Rusty Newton (rnewton) 2014-11-03 16:10:40.996-0600

I've had a developer look over the backtrace and the issue is likely a bug with libodbc. What version of the libodbc library are you using? Have you checked for open issues against that library?

Were you running any Asterisk logging up to that crash? Can you attach that to the issue?



By: Deepak Singh Rawat (dsr) 2014-11-04 12:30:37.482-0600

[~rnewton] We are using unixODBC 2.2.14. It's an old version released in 2008. We have been using it successfully with Asterisk 1.4 for a long time. I will upgrade to the latest version and see if that fixes this issue.

We did have Asterisk logging enabled. There are a lot of entries with the following message just before the crash:

{noformat}
[Oct 21 13:57:53] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/XXXXXXXX-XXXXXXXXXX6@outc-1-0000f42e;2
[Oct 21 13:57:54] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/XXXXXXXX-XXXXXXXXXX3@outc-1-0000f430;2
[Oct 21 13:57:54] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/XXXXXXXX-XXXXXXXXXX3@outc-1-0000f431;2
[Oct 21 13:57:54] NOTICE[4631][C-0001113d] res_odbc.c: Connecting prod-db
{noformat}

We are also looking into our code to see whether our application was sending wrong hangup requests to Asterisk and whether it crashed due to the load. I just want to make sure this crash is not due to a bug in Asterisk 12.x, as we have been using Asterisk 1.4 without such crashes for a very long time.
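
To check whether the hangup requests really do spike before a crash, the NOTICE lines could be tallied per second. The following is only a minimal sketch, assuming the log format matches the excerpt above; the function name `count_bad_hangups` and the `SAMPLE` list are hypothetical and not part of Asterisk.

```python
import re
from collections import Counter

# Matches manager.c notices in the format shown in the log excerpt, e.g.
# [Oct 21 13:57:53] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/...
# The optional "[C-...]" call-id field after the thread id is tolerated.
LINE_RE = re.compile(
    r"^\[(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2})\] "
    r"NOTICE\[\d+\](?:\[[^\]]+\])? manager\.c: "
    r"Request to hangup non-existent channel: (?P<channel>\S+)"
)


def count_bad_hangups(lines):
    """Return a Counter mapping each timestamp (one-second resolution)
    to the number of hangup requests for non-existent channels."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("ts")] += 1
    return counts


# Hypothetical sample, copied from the excerpt above.
SAMPLE = [
    "[Oct 21 13:57:53] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/XXXXXXXX-XXXXXXXXXX6@outc-1-0000f42e;2",
    "[Oct 21 13:57:54] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/XXXXXXXX-XXXXXXXXXX3@outc-1-0000f430;2",
    "[Oct 21 13:57:54] NOTICE[29205] manager.c: Request to hangup non-existent channel: Local/XXXXXXXX-XXXXXXXXXX3@outc-1-0000f431;2",
    "[Oct 21 13:57:54] NOTICE[4631][C-0001113d] res_odbc.c: Connecting prod-db",
]

if __name__ == "__main__":
    for timestamp, count in sorted(count_bad_hangups(SAMPLE).items()):
        print(timestamp, count)
```

Running the same function over the full log file (for example `count_bad_hangups(open(path))`, with `path` pointing at the Asterisk log) would show whether the rate of failed hangup requests rises sharply in the seconds before each crash.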




By: Rusty Newton (rnewton) 2014-11-05 16:49:58.167-0600

Let us know what you find. There is nothing apparent for us to investigate and as such the next step is really for you to file an issue with the unixODBC project.

{quote}
I just want to make sure this crash is not due to a bug in Asterisk 12.x, as we have been using Asterisk 1.4 without such crashes for a very long time.
{quote}

There are about 7 years between the release of 1.4 and 12, so there are many dramatic differences in Asterisk functionality between the two. I wouldn't be surprised if Asterisk did something different that, although valid, may expose a bug in libodbc.

That being said, a developer has already reviewed the trace and it doesn't appear to be a bug in Asterisk.

By: Michael L. Young (elguero) 2014-11-06 07:21:24.457-0600

Just to clarify my earlier comment: it was based on a quick search prompted by what I saw in the backtrace. Rusty has already pointed out that this appears to be a bug in libodbc. I was searching to see if there were any known bugs in libodbc, and the second result from Google showed this issue: ASTERISK-14161. On that issue, a developer asked the user to try setting idlecheck. That was all that was on that issue, and it was closed. So I didn't put too much thought into my simple question, beyond thinking that maybe the connection to the db server was being closed and therefore causing the crash. When troubleshooting, I tend to start with the quick, simple, back-to-basics checks, since in most cases that is where the fix lies, before diving deep into other possibilities. As Rusty pointed out, you cannot really compare 1.4 with 12, since there have been a lot of major changes.

With the detailed information that Rusty has given you, hopefully you can track this down and get it fixed.



By: Rusty Newton (rnewton) 2014-11-20 15:49:42.041-0600

Closing this out since by all indications this is not a bug in Asterisk.

[~dsr] When you figure out whether a newer version of your ODBC library fixes the issue, or if you file a bug with the unixODBC project, please post here to let us know what happened. Thanks.