Summary: | ASTERISK-17378: [patch] SEGFAULT in remote_bridge_loop after a SIP to SIP attended transfer with external IAX2 or DAHDI call | ||
Reporter: | Alec Davis (alecdavis) | Labels: | |
Date Opened: | 2011-02-09 19:18:11.000-0600 | Date Closed: | 2011-03-11 00:56:08.000-0600 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Core/RTP |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) bug18781.diff.txt ( 1) bug18781.diff2.txt ( 2) bug18781.diff3.txt ( 3) gdb.txt ( 4) gdb-SVN-branch-1.8-r307793.txt ( 5) gdb-SVN-trunk-r308157.txt | |
Description: | Bug initially reported on 1.8.2 tag release, after experiencing similar on 1.8.2.3 tagged release. Is still repeatable with trunk ****** ADDITIONAL INFORMATION ****** minimal log file '/var/log/asterisk/full' Last three segfaults have similar last events recorded as <b><u>[Feb 9 xx:xx:xx] VERBOSE[4786] res_musiconhold.c: -- Stopped music on hold on SIP/GXP00xx-00000xxx</u></b> | ||
Comments: | By: Alec Davis (alecdavis) 2011-02-09 19:33:50.000-0600 ===========1.8.2.3============ [Feb 9 13:59:42] VERBOSE[21815] pbx.c: -- Executing [s@macro-procexten:32] Dial("SIP/GXP0016-000001df", "SIP/GXP0011,30,r") in new stack [Feb 9 13:59:42] VERBOSE[21815] netsock2.c: == Using SIP RTP CoS mark 5 [Feb 9 13:59:42] VERBOSE[21815] app_dial.c: -- Called GXP0011 [Feb 9 13:59:42] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0009 [Feb 9 13:59:42] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0012 [Feb 9 13:59:42] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0010 [Feb 9 13:59:42] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0013 [Feb 9 13:59:42] VERBOSE[21815] app_dial.c: -- SIP/GXP0011-000001e0 is ringing [Feb 9 13:59:48] VERBOSE[21815] app_dial.c: -- SIP/GXP0011-000001e0 answered SIP/GXP0016-000001df [Feb 9 13:59:48] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state InUse for Notify User GXP0009 [Feb 9 13:59:48] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state InUse for Notify User GXP0012 [Feb 9 13:59:48] VERBOSE[21815] rtp_engine.c: -- Remotely bridging SIP/GXP0016-000001df and SIP/GXP0011-000001e0 [Feb 9 13:59:48] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state InUse for Notify User GXP0010 [Feb 9 13:59:48] VERBOSE[4779] chan_sip.c: == Extension Changed 8524[trusted] new state InUse for Notify User GXP0013 [Feb 9 14:00:10] VERBOSE[4779] chan_sip.c: == Extension Changed 8530[trusted] new state InUse for Notify User GXP0016 [Feb 9 14:00:10] VERBOSE[4779] chan_sip.c: == Extension Changed 8530[trusted] new state InUse for Notify User GXP0015 [Feb 9 14:00:10] VERBOSE[4779] chan_sip.c: == Extension Changed 8530[trusted] new state InUse for Notify User GXP0018 [Feb 9 14:00:19] VERBOSE[21815] res_musiconhold.c: -- Started music on hold, class 'default', on SIP/GXP0011-000001e0 [Feb 9 14:00:20] VERBOSE[4786] res_musiconhold.c: -- Stopped music on hold on IAX2/auckbdt-3109 [Feb 9 14:00:20] VERBOSE[4786] res_musiconhold.c: -- Stopped music on hold on SIP/GXP0011-000001e0 [Feb 9 14:01:43] VERBOSE[21850] config.c: == Parsing '/etc/asterisk/asterisk.conf': [Feb 9 14:01:43] VERBOSE[21850] config.c: == Found ===========1.8.2.3============ [Feb 9 15:55:24] VERBOSE[2590] app_dial.c: -- SIP/GXP0003-000000b2 answered SIP/8514-000000b1 [Feb 9 15:55:24] VERBOSE[21863] chan_sip.c: == Extension Changed 8513[trusted] new state InUse for Notify User GXP0005 [Feb 9 15:55:24] VERBOSE[21863] chan_sip.c: == Extension Changed 8513[trusted] new state InUse for Notify User GXP0003 [Feb 9 15:55:24] VERBOSE[21863] chan_sip.c: == Extension Changed 8513[trusted] new state InUse for Notify User GXP0001 [Feb 9 15:55:24] VERBOSE[2590] rtp_engine.c: -- Remotely bridging SIP/8514-000000b1 and SIP/GXP0003-000000b2 [Feb 9 15:55:26] VERBOSE[2590] res_musiconhold.c: -- Started music on hold, class 'default', on SIP/GXP0003-000000b2 [Feb 9 15:55:26] VERBOSE[21863] chan_sip.c: == Extension Changed 8520[trusted] new state InUse for Notify User GXP0012 [Feb 9 15:55:26] VERBOSE[21863] chan_sip.c: == Extension Changed 8520[trusted] new state InUse for Notify User GXP0013 [Feb 9 15:55:26] VERBOSE[21863] chan_sip.c: == Extension Changed 8520[trusted] new state InUse for Notify User GXP0010 [Feb 9 15:55:26] VERBOSE[21863] chan_sip.c: == Extension Changed 8520[trusted] new state InUse for Notify User GXP0011 [Feb 9 15:55:26] VERBOSE[21870] res_musiconhold.c: -- Stopped music on hold on IAX2/auckbdt-3475 [Feb 9 15:55:26] VERBOSE[21870] res_musiconhold.c: -- Stopped music on hold on SIP/GXP0003-000000b2 [Feb 9 15:59:19] VERBOSE[2640] config.c: == Parsing '/etc/asterisk/asterisk.conf': [Feb 9 15:59:19] VERBOSE[2640] config.c: == Found ===========1.8.2============ [Feb 10 13:25:34] VERBOSE[19781] pbx.c: -- Executing [s@macro-procexten:32] Dial("SIP/GXP0018-00000155", "SIP/GXP0010,30,r") in new stack [Feb 10 13:25:34] VERBOSE[19781] netsock2.c: == Using SIP RTP CoS mark 5 [Feb 10 13:25:34] VERBOSE[19781] app_dial.c: -- Called GXP0010 [Feb 10 13:25:34] VERBOSE[11880] chan_sip.c: == Extension Changed 8523[trusted] new state Ringing for Notify User GXP0011 [Feb 10 13:25:34] VERBOSE[11880] chan_sip.c: == Extension Changed 8523[trusted] new state Ringing for Notify User GXP0009 [Feb 10 13:25:34] VERBOSE[11880] chan_sip.c: == Extension Changed 8523[trusted] new state Ringing for Notify User GXP0012 [Feb 10 13:25:35] VERBOSE[19781] app_dial.c: -- SIP/GXP0010-00000156 is ringing [Feb 10 13:25:38] VERBOSE[11880] chan_sip.c: == Extension Changed 8523[trusted] new state InUse for Notify User GXP0011 [Feb 10 13:25:38] VERBOSE[11880] chan_sip.c: == Extension Changed 8523[trusted] new state InUse for Notify User GXP0009 [Feb 10 13:25:38] VERBOSE[11880] chan_sip.c: == Extension Changed 8523[trusted] new state InUse for Notify User GXP0012 [Feb 10 13:25:38] VERBOSE[19781] app_dial.c: -- SIP/GXP0010-00000156 answered SIP/GXP0018-00000155 [Feb 10 13:25:38] VERBOSE[19781] rtp_engine.c: -- Remotely bridging SIP/GXP0018-00000155 and SIP/GXP0010-00000156 [Feb 10 13:25:48] VERBOSE[19781] res_musiconhold.c: -- Started music on hold, class 'default', on SIP/GXP0010-00000156 [Feb 10 13:25:51] VERBOSE[11887] res_musiconhold.c: -- Stopped music on hold on IAX2/auckbdt-2102 [Feb 10 13:25:51] VERBOSE[11887] res_musiconhold.c: -- Stopped music on hold on SIP/GXP0010-00000156 [Feb 10 13:30:01] VERBOSE[19834] config.c: == Parsing '/etc/asterisk/asterisk.conf': [Feb 10 13:30:01] VERBOSE[19834] config.c: == Found By: Alec Davis (alecdavis) 2011-02-09 20:31:14.000-0600 reproducable I can do it on demand, better wait till after hours, unhappy staff. may be over complicated, but from the logs I could see an IAX trunk seemed to be involved. SIPA rings SIPB over IAX trunk SIPB does an attended transfer to SIPC Then segfault! If SIPB does a direct transfer to SIPC no segfault Setup: All SIP phones are Grandstream GXP series<pre> SIPA ASTERISK-1 IAX IAX ASTERISK-2 SIPB SIPC</pre> By: Alec Davis (alecdavis) 2011-02-09 22:20:25.000-0600 BAD MAGIC NUMBER: A dahdi call dials GXP0001, then GXP0001 attended transfer to GXP0004 Test: DAHDI ASTERISK GXP0001 GXP0004 -- Called GXP0004 -- SIP/GXP0004-000000b4 is ringing -- SIP/GXP0004-000000b4 answered SIP/GXP0001-000000b3 -- Remotely bridging SIP/GXP0001-000000b3 and SIP/GXP0004-000000b4 -- Started music on hold, class 'default', on SIP/GXP0004-000000b4 -- Stopped music on hold on DAHDI/i1/5604866-e2 -- Stopped music on hold on SIP/GXP0004-000000b4 [Feb 10 17:11:58] ERROR[14146]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xb03eeb20 [Feb 10 17:11:58] ERROR[14146]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xb03eeb20 By: Alec Davis (alecdavis) 2011-02-09 22:38:43.000-0600 Confirmed segfault is only when a call comes in over IAX2 TRUNK GXP9000 ASTERISK-1 IAX2 || IAX2 ASTERISK-2 GXP0001 GXP0004 >>>>> Call coming from IAX2 to GXP0001 <<<<< -- Executing [s@macro-procexten:32] Dial("IAX2/auckbdt-46", "SIP/GXP0001,10,r") in new stack == Using SIP RTP CoS mark 5 -- Called GXP0001 -- SIP/GXP0001-00000008 is ringing -- SIP/GXP0001-00000008 answered IAX2/auckbdt-46 >>>>> GXP0001 now talking on line1 to IAX2 caller <<<<< -- Started music on hold, class 'default', on IAX2/auckbdt-46 >>>>> GXP0001 has select another line, IAX2 caller has MOH <<<<< >>>>> GXP0001 dials GXP0004 <<<<< -- Executing [s@macro-procexten:32] Dial("SIP/GXP0001-00000009", "SIP/GXP0004,20,r") in new stack == Using SIP RTP CoS mark 5 -- Called GXP0004 -- SIP/GXP0004-0000000a is ringing -- SIP/GXP0004-0000000a answered SIP/GXP0001-00000009 -- Remotely bridging SIP/GXP0001-00000009 and SIP/GXP0004-0000000a >>>>> GXP0001 now talking on line2 with GXP0004 <<<<< >>>>> GXP0001 now transfers line1 to line 2 <<<<< -- Started music on hold, class 'default', on SIP/GXP0004-0000000a -- Stopped music on hold on IAX2/auckbdt-46 -- Stopped music on hold on SIP/GXP0004-0000000a astrid*CLI> Disconnected from Asterisk server >>>>> SEGFAULTED <<<<< By: Alec Davis (alecdavis) 2011-02-16 22:31:46.000-0600 Segfault still exists with SVN-branch-1.8-r307793 and SVN-trunk-r308157 uploaded: gdb-SVN-branch-1.8-r307793.txt gdb-SVN-trunk-r308157.txt By: Alec Davis (alecdavis) 2011-02-17 03:31:39.000-0600 This is with SVN-trunk-r308157M with bug18781.diff.txt applied Setup: FXS ASTERISK-myhome IAX2 || || IAX2 ASTERISK-work SIP/GXP8923 SIP/GXP8922 console output: with simple call, note patch has extra debug prints. -- Accepting AUTHENTICATED call from XXX.YYY.ZZZ.AAA: > requested format = alaw, > requested prefs = (alaw|ulaw|gsm), > actual format = alaw, > host prefs = (alaw|ulaw|gsm), > priority = mine -- Executing [8923@trusted:1] Dial("IAX2/myhome-2102", "SIP/GXP8923,,r") in new stack == Using SIP RTP CoS mark 5 -- Called GXP8923 -- SIP/GXP8923-00000006 is ringing -- SIP/GXP8923-00000006 answered IAX2/myhome-2102 -- Started music on hold, class 'default', on IAX2/myhome-2102 == Using SIP RTP CoS mark 5 -- Executing [8922@trusted:1] Dial("SIP/GXP8923-00000007", "SIP/GXP8922,,r") in new stack == Using SIP RTP CoS mark 5 -- Called GXP8922 -- SIP/GXP8922-00000008 is ringing -- SIP/GXP8922-00000008 answered SIP/GXP8923-00000007 -- Remotely bridging SIP/GXP8923-00000007 and SIP/GXP8922-00000008 <b>[Feb 17 22:20:51] WARNING[3043]: rtp_engine.c:1078 remote_bridge_loop: Channels c0->'SIP/GXP8923-00000007' c1->'SIP/GXP8922-00000008' in remote_bridge_loop</b> -- Started music on hold, class 'default', on SIP/GXP8922-00000008 -- Stopped music on hold on IAX2/myhome-2102 -- Stopped music on hold on SIP/GXP8922-00000008 <b>[Feb 17 22:20:53] WARNING[3043]: rtp_engine.c:1086 remote_bridge_loop: Oooh, something is weird, backing out <u>>>>>>>>>>>>>>>> Normally would segfault here <<<<<<<<<<<<<<<<<<<<<<</u> [Feb 17 22:20:53] WARNING[3043]: rtp_engine.c:1276 remote_bridge_loop: Channel c0->'IAX2/myhome-2102' changed, in bridge with c1->'SIP/GXP8922-00000008'</b> == Spawn extension (trusted, 8923, 1) exited non-zero on 'SIP/GXP8923-00000007<ZOMBIE>' == Spawn extension (trusted, 8922, 1) exited non-zero on 'IAX2/myhome-2102' -- Hungup 'IAX2/myhome-2102' By: Alec Davis (alecdavis) 2011-02-17 04:13:07.000-0600 bug18781.diff.txt applied also fixes BAD MAGIC number, when call is from DAHDI, and call is transferred. Purposely disabled part of patch to demonstrate. Setup: DAHDI E1 ASTERISK-work SIP/GXP8923 SIP/GXP8922 <b>[Feb 17 23:08:18] WARNING[7746]: rtp_engine.c:1022 remote_bridge_loop: Channels c0->'SIP/GXP8923-00000001' c1->'SIP/GXP8922-00000002' in remote_bridge_loop</b> -- Started music on hold, class 'default', on SIP/GXP8922-00000002 -- Stopped music on hold on DAHDI/i1/4XXXX855-1 -- Stopped music on hold on SIP/GXP8922-00000002 <b>[Feb 17 23:08:22] WARNING[7746]: rtp_engine.c:1030 remote_bridge_loop: Oooh, something is weird, backing out [Feb 17 23:08:22] WARNING[7746]: rtp_engine.c:1208 remote_bridge_loop: Channel c0->'DAHDI/i1/4XXXX855-1' changed, in bridge with c1->'SIP/GXP8922-00000002' <u>[Feb 17 23:08:22] ERROR[7746]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xafba43d8 [Feb 17 23:08:22] ERROR[7746]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xafba43d8</u> [Feb 17 23:08:22] WARNING[7746]: rtp_engine.c:1216 remote_bridge_loop: Channel c1->'SIP/GXP8922-00000002' changed, in bridge with c0->'DAHDI/i1/4XXXX855-1'</b> == Spawn extension (trusted, 8923, 1) exited non-zero on 'SIP/GXP8923-00000001<ZOMBIE>' By: Alec Davis (alecdavis) 2011-02-17 05:48:55.000-0600 Different approach with bug18781.diff2.txt only update_peer if glueX 'technology type' hasn't changed. By: Matt Riddell (zx81) 2011-02-23 17:03:05.000-0600 We've just had the same crash and applied the diff2 patch - will report on whether we see the crash again. By: Alec Davis (alecdavis) 2011-02-23 17:14:05.000-0600 ZX81: You may also need to consider ASTERISK-17431 By: Matt Riddell (zx81) 2011-02-23 18:40:36.000-0600 Ok, cool thanks Alec By: Matt Riddell (zx81) 2011-02-24 16:21:46.000-0600 Just had the following before a crash: [Feb 25 11:09:40] -- Remotely bridging SIP/716-0000007b and SIP/707-0000007c [Feb 25 11:09:40] WARNING[29487]: rtp_engine.c:1022 remote_bridge_loop: Channels c0->'SIP/716-0000007b' c1->'SIP/707-0000007c' in remote_bridge_loop [Feb 25 11:09:44] == Manager 'andrew' logged on from 10.1.1.45 [Feb 25 11:09:47] -- Stopped music on hold on lcr/35 [Feb 25 11:09:47] WARNING[29487]: rtp_engine.c:1030 remote_bridge_loop: Oooh, something is weird, backing out [Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818 [Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818 [Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818 [Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818 [Feb 25 11:09:47] == Spawn extension (custom-day, s, 1) exited non-zero on 'SIP/716-0000007b<ZOMBIE>' [Feb 25 11:09:47] == Extension Changed 716[extensions] new state Idle for Notify User 700 lucas*CLI> Disconnected from Asterisk server [Feb 25 11:09:48] Executing last minute cleanups Asterisk ending (0). By: Matt Riddell (zx81) 2011-02-24 16:25:54.000-0600 Is there any way to just disable remote bridging? Other than forcing every phone to use a different codec :) By: Alec Davis (alecdavis) 2011-02-24 18:13:58.000-0600 ZX81: sip.conf directmedia=no previously was canreinvite=no seemed to disable remote bridge, at least for a sip to sip call, and a transfer to sip call By: Matt Riddell (zx81) 2011-02-24 18:19:03.000-0600 Yeah, I've just set that :) Now dealing with another problem :) By: Alec Davis (alecdavis) 2011-03-03 14:59:19.000-0600 bug18781.diff3.txt update from reviewboard don't call update_peer if one of the following; where 'x' is either 0/1 1). ZOMBIE 2). cx->tech_pvt != pvtx 3). gluex != ast_rtp_instance_get_glue(cx->tech->type)) By: Alec Davis (alecdavis) 2011-03-03 15:04:56.000-0600 ZX81: are you able to test the diff3 patch. If I read you last crash correctly, IAX or DAHDI wern't involved. By: Matt Riddell (zx81) 2011-03-03 15:18:44.000-0600 I've reverted everything to stock 1.8 SVN, applied your patch (diff3) and will let you know if I see a crash. Oh, and I've removed the directmedia=no. By: Alec Davis (alecdavis) 2011-03-10 21:58:01.000-0600 ZX81: Do you still have these patches in place? You and I seem to be the only ones that have tripped over this :( They're working well for us, just need to get another positve and I'll get these committed. By: Matt Riddell (zx81) 2011-03-10 22:03:15.000-0600 Yep, no crashes even with directmedia removed. Patch working well. By: Digium Subversion (svnbot) 2011-03-11 00:47:46.000-0600 Repository: asterisk Revision: 310287 U branches/1.8/main/rtp_engine.c ------------------------------------------------------------------------ r310287 | alecdavis | 2011-03-11 00:47:45 -0600 (Fri, 11 Mar 2011) | 17 lines remote_bridge_loop: prevent segfault when after transfer of IAX2 of DAHDI call If the channel condition is one of the following after breaking out of the loop, don't try to update_peer (where x = 0/1) 1). ZOMBIE 2). cx->tech_pvt != pvtx 3). gluex != ast_rtp_instance_get_glue(cx->tech->type)) (closes issue ASTERISK-17378) Reported by: alecdavis Patches: bug18781.diff3.txt uploaded by alecdavis (license 585) Tested by: alecdavis, ZX81 Review: https://reviewboard.asterisk.org/r/1128/ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=310287 By: Digium Subversion (svnbot) 2011-03-11 00:56:07.000-0600 Repository: asterisk Revision: 310288 _U trunk/ U trunk/main/rtp_engine.c ------------------------------------------------------------------------ r310288 | alecdavis | 2011-03-11 00:56:07 -0600 (Fri, 11 Mar 2011) | 23 lines Merged revisions 310287 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.8 ........ r310287 | alecdavis | 2011-03-11 19:47:44 +1300 (Fri, 11 Mar 2011) | 17 lines remote_bridge_loop: prevent segfault when after transfer of IAX2 of DAHDI call If the channel condition is one of the following after breaking out of the loop, don't try to update_peer (where x = 0/1) 1). ZOMBIE 2). cx->tech_pvt != pvtx 3). gluex != ast_rtp_instance_get_glue(cx->tech->type)) (closes issue ASTERISK-17378) Reported by: alecdavis Patches: bug18781.diff3.txt uploaded by alecdavis (license 585) Tested by: alecdavis, ZX81 Review: https://reviewboard.asterisk.org/r/1128/ ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=310288 |