Summary: | ASTERISK-16902: [patch] Random segfault when querying MySQL via func_odbc | ||
Reporter: | Kevin Sandy (ks3) | Labels: | |
Date Opened: | 2010-11-02 13:30:22 | Date Closed: | 2011-03-08 15:07:40.000-0600 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Functions/func_odbc |
Versions: | | Frequency of Occurrence: | |
Related Issues: | |||
Environment: | | Attachments: | (0) console.txt.20101103T1501 (1) res_odbc.patch (2) valgrind.txt.20101102T1353 (3) valgrind.txt.20101103T1501 |
Description: | For the past couple of months, our two Asterisk servers have crashed (at different times) about once a week, and the last entries in the console log are calls to some of our func_odbc functions. Within the last couple of weeks, these crashes have increased to about once a day, sometimes 2-3 times per day.

****** ADDITIONAL INFORMATION ******

Our servers are running CentOS 5.4, 64-bit. Updates are available for them, but none related to MySQL or unixODBC. I have tried to get a core dump (dumpcore = yes is set in asterisk.conf), but none is ever generated. I'll attach the valgrind file from the latest crash; unfortunately, when I set it up to run under valgrind I forgot to redirect the console output to a file, so I don't have that yet. I'll attach it the next time it crashes. | ||
Comments: | By: Kevin Sandy (ks3) 2010-11-03 14:28:19
From the console log, it appears the crash is related to "SQL Alloc Handle failed". We have several other processes that tie into the same MySQL server / database, and they aren't exhibiting any errors, so I don't think there are any overall database issues.

By: Walter Doekes (wdoekes) 2010-11-04 03:43:30
I can confirm part of the problem. It is, however, not random. It exists in both 1.4 and 1.6 (probably 1.8 too). Whenever a query fails (in my case usually if I forget to add a view and Asterisk therefore cannot select on it), the Asterisk ODBC link thinks the link may be down. It then attempts to reconnect. However, if someone else does a query while it is reconnecting, the handle is pointing to garbage and it crashes in libodbc. Try setting pooling=no.

By: Kevin Sandy (ks3) 2010-11-04 08:15:51
That certainly makes sense. We do receive periodic errors on SQL queries: a developer here created a trigger that runs when CDR records are added, and this seems to fail periodically. I haven't seen the statements in the trigger, but I assume there's little to no error checking. I'll look into that side of it. I checked our res_odbc.conf, and pooling was already off (that appears to be the default setting). This seems related to issue 0014748, and one of the workarounds there is to enable pooling. I've enabled pooling for the moment while I do a bit more research.

By: Kevin Sandy (ks3) 2010-11-04 10:36:47
I've attached a patch against the current svn version of res_odbc.c. There was an update at some point to use ast_odbc_sanity_check in the ast_odbc_prepare_and_execute function, but ast_odbc_direct_execute continued to blindly disconnect and reconnect. Also, each of those functions sets obj->up to 0, which appears to have the effect of forcing ast_odbc_sanity_check to disconnect and reconnect.
It seems to me that it should only disconnect / reconnect if the testsql statement fails, so I have removed the obj->up = 0 statements as well. We are running 1.6.2.13, and the patch also applies cleanly against its source. I'll be putting this in testing later today.

By: Kevin Sandy (ks3) 2010-12-09 10:31:15.000-0600
Just to follow up, for the past 2-3 weeks we have been running Asterisk 1.6.2.14 with the attached patch and connection pooling disabled, and everything appears to be working as anticipated. We have gotten the warning in the patch (performing sanity check) several times, but the server has never crashed or failed in any other way.

By: Digium Subversion (svnbot) 2011-01-05 12:47:47.000-0600
Repository: asterisk
Revision: 300621
U branches/1.4/res/res_odbc.c
------------------------------------------------------------------------
r300621 | tilghman | 2011-01-05 12:47:47 -0600 (Wed, 05 Jan 2011) | 10 lines
Use the sanity check in place of the disconnect/connect cycle.
The disconnect/connect cycle has the potential to cause random crashes.
(closes issue ASTERISK-16902)
Reported by: ks3
Patches: res_odbc.patch uploaded by ks3 (license 1147)
Tested by: ks3
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=300621

By: Digium Subversion (svnbot) 2011-01-05 12:55:00.000-0600
Repository: asterisk
Revision: 300622
_U branches/1.6.2/
U branches/1.6.2/res/res_odbc.c
------------------------------------------------------------------------
r300622 | tilghman | 2011-01-05 12:54:59 -0600 (Wed, 05 Jan 2011) | 17 lines
Merged revisions 300621 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r300621 | tilghman | 2011-01-05 12:47:46 -0600 (Wed, 05 Jan 2011) | 10 lines
Use the sanity check in place of the disconnect/connect cycle.
The disconnect/connect cycle has the potential to cause random crashes.
(closes issue ASTERISK-16902)
Reported by: ks3
Patches: res_odbc.patch uploaded by ks3 (license 1147)
Tested by: ks3
........
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=300622

By: Digium Subversion (svnbot) 2011-01-05 12:56:13.000-0600
Repository: asterisk
Revision: 300623
_U branches/1.8/
U branches/1.8/res/res_odbc.c
------------------------------------------------------------------------
r300623 | tilghman | 2011-01-05 12:56:13 -0600 (Wed, 05 Jan 2011) | 24 lines
Merged revisions 300622 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2
................
r300622 | tilghman | 2011-01-05 12:54:58 -0600 (Wed, 05 Jan 2011) | 17 lines
Merged revisions 300621 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r300621 | tilghman | 2011-01-05 12:47:46 -0600 (Wed, 05 Jan 2011) | 10 lines
Use the sanity check in place of the disconnect/connect cycle.
The disconnect/connect cycle has the potential to cause random crashes.
(closes issue ASTERISK-16902)
Reported by: ks3
Patches: res_odbc.patch uploaded by ks3 (license 1147)
Tested by: ks3
........
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=300623

By: Digium Subversion (svnbot) 2011-01-05 12:57:06.000-0600
Repository: asterisk
Revision: 300624
_U trunk/
U trunk/res/res_odbc.c
------------------------------------------------------------------------
r300624 | tilghman | 2011-01-05 12:57:06 -0600 (Wed, 05 Jan 2011) | 31 lines
Merged revisions 300623 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.8
................
r300623 | tilghman | 2011-01-05 12:56:12 -0600 (Wed, 05 Jan 2011) | 24 lines
Merged revisions 300622 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2
................
r300622 | tilghman | 2011-01-05 12:54:58 -0600 (Wed, 05 Jan 2011) | 17 lines
Merged revisions 300621 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r300621 | tilghman | 2011-01-05 12:47:46 -0600 (Wed, 05 Jan 2011) | 10 lines
Use the sanity check in place of the disconnect/connect cycle.
The disconnect/connect cycle has the potential to cause random crashes.
(closes issue ASTERISK-16902)
Reported by: ks3
Patches: res_odbc.patch uploaded by ks3 (license 1147)
Tested by: ks3
........
................
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=300624 |
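Note: the approach described in the comments (run a sanity check after a failed query and reconnect only if the test statement also fails, rather than always cycling the connection) can be illustrated with a small, single-threaded sketch. This is not the code from res_odbc.patch or res_odbc.c. It uses a simulated link instead of unixODBC, and every name in it (fake_link, fake_query, fake_reconnect, sanity_check, execute_with_sanity_check) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a database link; NOT the res_odbc structure. */
struct fake_link {
    bool server_reachable; /* simulated state of the database server */
    bool connected;        /* whether we currently hold a live session */
    int reconnects;        /* number of disconnect/connect cycles performed */
};

/* Simulated query: fails on a bad statement (NULL here) or a dead session. */
bool fake_query(const struct fake_link *link, const char *sql)
{
    return link->connected && link->server_reachable && sql != NULL;
}

/* Simulated disconnect/connect cycle; counts how often it runs. */
void fake_reconnect(struct fake_link *link)
{
    link->connected = link->server_reachable;
    link->reconnects++;
}

/* Sanity check: reconnect only when a trivial test statement fails. */
void sanity_check(struct fake_link *link)
{
    if (!fake_query(link, "select 1")) {
        fake_reconnect(link);
    }
}

/*
 * Run a statement. On failure, sanity-check the link and retry once.
 * A bad statement on a healthy link therefore never triggers a reconnect.
 */
bool execute_with_sanity_check(struct fake_link *link, const char *sql)
{
    assert(link != NULL);
    if (fake_query(link, sql)) {
        return true;
    }
    sanity_check(link);
    return fake_query(link, sql);
}
```

In this sketch, a failing statement on a healthy link leaves `reconnects` at 0, because the test statement succeeds and the session is left alone. Only a session that fails the test statement goes through `fake_reconnect`. The sketch does not model the concurrent access that Walter Doekes describes; it only shows why skipping unnecessary reconnects narrows the window in which another query could see a half-initialized handle.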