Summary: ASTERISK-16902: [patch] Random segfault when querying MySQL via func_odbc
Reporter: Kevin Sandy (ks3)
Labels:
Date Opened: 2010-11-02 13:30:22
Date Closed: 2011-03-08 15:07:40.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Functions/func_odbc
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) console.txt.20101103T1501
( 1) res_odbc.patch
( 2) valgrind.txt.20101102T1353
( 3) valgrind.txt.20101103T1501
Description: For the past couple of months, our two Asterisk servers have been crashing (at different times) about once a week, and the last entries in the console log are calls to some of our func_odbc functions. Within the last couple of weeks, the crashes have increased to about once a day, sometimes 2-3 times per day.

****** ADDITIONAL INFORMATION ******

Our servers are running CentOS 5.4, 64 bit. They do have updates available, but none related to MySQL or unixODBC.

I have tried to get a core dump (dumpcore = yes is set in asterisk.conf), but none is ever generated. I'll attach the valgrind file from the latest crash; unfortunately, when I set it up to run under valgrind I forgot to redirect the console output to a file, so I don't have that yet. I'll attach that the next time it crashes.
Comments:

By: Kevin Sandy (ks3) 2010-11-03 14:28:19

From the console log, it appears the crash is related to "SQL Alloc Handle failed". We have several other processes that tie into the same MySQL server / database, and they aren't exhibiting any errors, so I don't think there are any overall database issues.

By: Walter Doekes (wdoekes) 2010-11-04 03:43:30

I can confirm part of the problem. It is however not random. It exists in both 1.4 and 1.6 (probably 1.8 too).

Whenever a query fails (in my case usually if I forget to add a view and asterisk therefore cannot select on it), the asterisk odbc link thinks the link may be down. It then attempts to reconnect.

However, if someone else does a query while it is reconnecting, the handle is pointing to garbage and it crashes in libodbc.

Try setting pooling=no.
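
The failure mode described in this comment can be illustrated with a minimal, single-threaded sketch in C. Every name here (struct odbc_link, begin_reconnect, finish_reconnect, try_query) is hypothetical and is not taken from res_odbc.c; the int pointer merely stands in for an ODBC connection handle. The sketch only models the idea that a handle must not be dereferenced while a reconnect is in progress. In a real multithreaded server the state change would presumably also need to be protected by a lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link states; not the real res_odbc data model. */
enum link_state { LINK_UP, LINK_RECONNECTING };

struct odbc_link {
    enum link_state state;
    int *handle;            /* stand-in for an ODBC connection handle */
};

/* Start a reconnect: from here on the old handle must not be used. */
static void begin_reconnect(struct odbc_link *l)
{
    l->state = LINK_RECONNECTING;
    l->handle = NULL;
}

/* Finish a reconnect by installing a fresh, valid handle. */
static void finish_reconnect(struct odbc_link *l, int *fresh)
{
    assert(fresh != NULL);
    l->handle = fresh;
    l->state = LINK_UP;
}

/*
 * Run a "query" (here: read the value behind the handle).
 * Returns 0 on success, or -1 if the link is mid-reconnect, instead of
 * dereferencing a handle that may point to garbage.
 */
static int try_query(const struct odbc_link *l, int *out)
{
    if (l->state != LINK_UP || l->handle == NULL) {
        return -1;
    }
    *out = *l->handle;
    return 0;
}
```

A caller that receives -1 from try_query can retry or fail the dialplan function cleanly, rather than passing a stale handle into the ODBC driver.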

By: Kevin Sandy (ks3) 2010-11-04 08:15:51

That certainly makes sense. We do receive periodic errors on SQL queries - a developer here created a trigger that runs when CDR records are added, and this seems to fail periodically - I haven't seen the statements in the trigger, but I assume there's little to no error checking. I'll look into that side of it.

I checked our res_odbc.conf, and pooling was already off (that appears to be the default setting). This seems related to issue 0014748, and one of the workarounds in that is to enable pooling. I've enabled pooling for the moment while I do a bit more research.

By: Kevin Sandy (ks3) 2010-11-04 10:36:47

I've attached a patch against the current svn version of res_odbc.c. There was an update at some point to use ast_odbc_sanity_check in the ast_odbc_prepare_and_execute function, but ast_odbc_direct_execute continued to blindly disconnect and reconnect.

Also, each of those functions sets obj->up to 0, which appears to have the effect of forcing ast_odbc_sanity_check to disconnect and reconnect. It seems to me that it should only disconnect / reconnect if the testsql statement fails, so I have removed the obj->up = 0 statements as well.

We are running 1.6.2.13, and the patch also applies cleanly against its source. I'll be putting this in testing later today.
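
The behavioural difference described in this comment can be sketched with a small self-contained C model. This is not the code from res_odbc.patch: struct conn, run_test_sql, reconnect, sanity_check, on_query_failure_old and on_query_failure_new are all hypothetical names, and link_healthy is a simulated server state used only for the demonstration. The assumption being modelled is that a sanity check treats a connection with up == 0 as dead, so clearing the flag before the check forces a reconnect every time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection object; not the real struct. */
struct conn {
    int up;            /* 1 when the handle is believed to be usable */
    int reconnects;    /* number of disconnect/reconnect cycles so far */
    int link_healthy;  /* simulated server state, for this demo only */
};

/* Simulated test query: succeeds (0) only if the link is up and healthy. */
static int run_test_sql(const struct conn *c)
{
    return (c->up && c->link_healthy) ? 0 : -1;
}

/* Simulated disconnect/reconnect cycle that always restores the link. */
static void reconnect(struct conn *c)
{
    assert(c != NULL);
    c->up = 0;
    c->reconnects++;
    c->link_healthy = 1;
    c->up = 1;
}

/* Reconnect only when the test query fails. Returns the final up flag. */
static int sanity_check(struct conn *c)
{
    if (run_test_sql(c) != 0) {
        reconnect(c);
    }
    return c->up;
}

/* Old behaviour as described: clear the flag first, forcing a reconnect. */
static void on_query_failure_old(struct conn *c)
{
    c->up = 0;
    sanity_check(c);
}

/* Patched behaviour as described: let the test query decide. */
static void on_query_failure_new(struct conn *c)
{
    sanity_check(c);
}
```

In this model a failed query on an otherwise healthy link (for example, a statement that fails because of a faulty trigger) causes a reconnect under the old path but not under the new one, which is the window the patch aims to close.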

By: Kevin Sandy (ks3) 2010-12-09 10:31:15.000-0600

Just to follow up, for the past 2-3 weeks we have been running Asterisk 1.6.2.14 with the attached patch and connection pooling disabled - everything appears to be working as anticipated. We have gotten the warning in the patch (performing sanity check) several times, but the server has never crashed or failed in any other way.

By: Digium Subversion (svnbot) 2011-01-05 12:47:47.000-0600

Repository: asterisk
Revision: 300621

U   branches/1.4/res/res_odbc.c

------------------------------------------------------------------------
r300621 | tilghman | 2011-01-05 12:47:47 -0600 (Wed, 05 Jan 2011) | 10 lines

Use the sanity check in place of the disconnect/connect cycle.

The disconnect/connect cycle has the potential to cause random crashes.

(closes issue ASTERISK-16902)
Reported by: ks3
Patches:
      res_odbc.patch uploaded by ks3 (license 1147)
Tested by: ks3

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=300621

By: Digium Subversion (svnbot) 2011-01-05 12:55:00.000-0600

Repository: asterisk
Revision: 300622

_U  branches/1.6.2/
U   branches/1.6.2/res/res_odbc.c

------------------------------------------------------------------------
r300622 | tilghman | 2011-01-05 12:54:59 -0600 (Wed, 05 Jan 2011) | 17 lines

Merged revisions 300621 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r300621 | tilghman | 2011-01-05 12:47:46 -0600 (Wed, 05 Jan 2011) | 10 lines
 
 Use the sanity check in place of the disconnect/connect cycle.
 
 The disconnect/connect cycle has the potential to cause random crashes.
 
 (closes issue ASTERISK-16902)
  Reported by: ks3
  Patches:
        res_odbc.patch uploaded by ks3 (license 1147)
  Tested by: ks3
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=300622

By: Digium Subversion (svnbot) 2011-01-05 12:56:13.000-0600

Repository: asterisk
Revision: 300623

_U  branches/1.8/
U   branches/1.8/res/res_odbc.c

------------------------------------------------------------------------
r300623 | tilghman | 2011-01-05 12:56:13 -0600 (Wed, 05 Jan 2011) | 24 lines

Merged revisions 300622 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

................
 r300622 | tilghman | 2011-01-05 12:54:58 -0600 (Wed, 05 Jan 2011) | 17 lines
 
 Merged revisions 300621 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r300621 | tilghman | 2011-01-05 12:47:46 -0600 (Wed, 05 Jan 2011) | 10 lines
   
   Use the sanity check in place of the disconnect/connect cycle.
   
   The disconnect/connect cycle has the potential to cause random crashes.
   
   (closes issue ASTERISK-16902)
    Reported by: ks3
    Patches:
          res_odbc.patch uploaded by ks3 (license 1147)
    Tested by: ks3
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=300623

By: Digium Subversion (svnbot) 2011-01-05 12:57:06.000-0600

Repository: asterisk
Revision: 300624

_U  trunk/
U   trunk/res/res_odbc.c

------------------------------------------------------------------------
r300624 | tilghman | 2011-01-05 12:57:06 -0600 (Wed, 05 Jan 2011) | 31 lines

Merged revisions 300623 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

................
 r300623 | tilghman | 2011-01-05 12:56:12 -0600 (Wed, 05 Jan 2011) | 24 lines
 
 Merged revisions 300622 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.6.2
 
 ................
   r300622 | tilghman | 2011-01-05 12:54:58 -0600 (Wed, 05 Jan 2011) | 17 lines
   
   Merged revisions 300621 via svnmerge from
   https://origsvn.digium.com/svn/asterisk/branches/1.4
   
   ........
     r300621 | tilghman | 2011-01-05 12:47:46 -0600 (Wed, 05 Jan 2011) | 10 lines
     
     Use the sanity check in place of the disconnect/connect cycle.
     
     The disconnect/connect cycle has the potential to cause random crashes.
     
     (closes issue ASTERISK-16902)
      Reported by: ks3
      Patches:
            res_odbc.patch uploaded by ks3 (license 1147)
      Tested by: ks3
   ........
 ................
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=300624