[Home]

Summary:ASTERISK-17378: [patch] SEGFAULT in remote_bridge_loop after a SIP to SIP attended transfer with external IAX2 or DAHDI call
Reporter:Alec Davis (alecdavis)Labels:
Date Opened:2011-02-09 19:18:11.000-0600Date Closed:2011-03-11 00:56:08.000-0600
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Core/RTP
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) bug18781.diff.txt
( 1) bug18781.diff2.txt
( 2) bug18781.diff3.txt
( 3) gdb.txt
( 4) gdb-SVN-branch-1.8-r307793.txt
( 5) gdb-SVN-trunk-r308157.txt
Description:Bug initially reported on 1.8.2 tag release, after experiencing similar on 1.8.2.3 tagged release.

Is still repeatable with trunk




****** ADDITIONAL INFORMATION ******

minimal log file '/var/log/asterisk/full'
Last three segfaults have similar last events recorded as
<b><u>[Feb  9 xx:xx:xx] VERBOSE[4786] res_musiconhold.c:     -- Stopped music on hold on SIP/GXP00xx-00000xxx</u></b>

Comments:By: Alec Davis (alecdavis) 2011-02-09 19:33:50.000-0600

===========1.8.2.3============
[Feb  9 13:59:42] VERBOSE[21815] pbx.c:     -- Executing [s@macro-procexten:32] Dial("SIP/GXP0016-000001df", "SIP/GXP0011,30,r") in new stack
[Feb  9 13:59:42] VERBOSE[21815] netsock2.c:   == Using SIP RTP CoS mark 5
[Feb  9 13:59:42] VERBOSE[21815] app_dial.c:     -- Called GXP0011
[Feb  9 13:59:42] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0009
[Feb  9 13:59:42] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0012
[Feb  9 13:59:42] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0010
[Feb  9 13:59:42] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state Ringing for Notify User GXP0013
[Feb  9 13:59:42] VERBOSE[21815] app_dial.c:     -- SIP/GXP0011-000001e0 is ringing
[Feb  9 13:59:48] VERBOSE[21815] app_dial.c:     -- SIP/GXP0011-000001e0 answered SIP/GXP0016-000001df
[Feb  9 13:59:48] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state InUse for Notify User GXP0009
[Feb  9 13:59:48] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state InUse for Notify User GXP0012
[Feb  9 13:59:48] VERBOSE[21815] rtp_engine.c:     -- Remotely bridging SIP/GXP0016-000001df and SIP/GXP0011-000001e0
[Feb  9 13:59:48] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state InUse for Notify User GXP0010
[Feb  9 13:59:48] VERBOSE[4779] chan_sip.c:   == Extension Changed 8524[trusted] new state InUse for Notify User GXP0013
[Feb  9 14:00:10] VERBOSE[4779] chan_sip.c:   == Extension Changed 8530[trusted] new state InUse for Notify User GXP0016
[Feb  9 14:00:10] VERBOSE[4779] chan_sip.c:   == Extension Changed 8530[trusted] new state InUse for Notify User GXP0015
[Feb  9 14:00:10] VERBOSE[4779] chan_sip.c:   == Extension Changed 8530[trusted] new state InUse for Notify User GXP0018
[Feb  9 14:00:19] VERBOSE[21815] res_musiconhold.c:     -- Started music on hold, class 'default', on SIP/GXP0011-000001e0
[Feb  9 14:00:20] VERBOSE[4786] res_musiconhold.c:     -- Stopped music on hold on IAX2/auckbdt-3109
[Feb  9 14:00:20] VERBOSE[4786] res_musiconhold.c:     -- Stopped music on hold on SIP/GXP0011-000001e0
[Feb  9 14:01:43] VERBOSE[21850] config.c:   == Parsing '/etc/asterisk/asterisk.conf': [Feb  9 14:01:43] VERBOSE[21850] config.c:   == Found

===========1.8.2.3============
[Feb  9 15:55:24] VERBOSE[2590] app_dial.c:     -- SIP/GXP0003-000000b2 answered SIP/8514-000000b1
[Feb  9 15:55:24] VERBOSE[21863] chan_sip.c:   == Extension Changed 8513[trusted] new state InUse for Notify User GXP0005
[Feb  9 15:55:24] VERBOSE[21863] chan_sip.c:   == Extension Changed 8513[trusted] new state InUse for Notify User GXP0003
[Feb  9 15:55:24] VERBOSE[21863] chan_sip.c:   == Extension Changed 8513[trusted] new state InUse for Notify User GXP0001
[Feb  9 15:55:24] VERBOSE[2590] rtp_engine.c:     -- Remotely bridging SIP/8514-000000b1 and SIP/GXP0003-000000b2
[Feb  9 15:55:26] VERBOSE[2590] res_musiconhold.c:     -- Started music on hold, class 'default', on SIP/GXP0003-000000b2
[Feb  9 15:55:26] VERBOSE[21863] chan_sip.c:   == Extension Changed 8520[trusted] new state InUse for Notify User GXP0012
[Feb  9 15:55:26] VERBOSE[21863] chan_sip.c:   == Extension Changed 8520[trusted] new state InUse for Notify User GXP0013
[Feb  9 15:55:26] VERBOSE[21863] chan_sip.c:   == Extension Changed 8520[trusted] new state InUse for Notify User GXP0010
[Feb  9 15:55:26] VERBOSE[21863] chan_sip.c:   == Extension Changed 8520[trusted] new state InUse for Notify User GXP0011
[Feb  9 15:55:26] VERBOSE[21870] res_musiconhold.c:     -- Stopped music on hold on IAX2/auckbdt-3475
[Feb  9 15:55:26] VERBOSE[21870] res_musiconhold.c:     -- Stopped music on hold on SIP/GXP0003-000000b2
[Feb  9 15:59:19] VERBOSE[2640] config.c:   == Parsing '/etc/asterisk/asterisk.conf': [Feb  9 15:59:19] VERBOSE[2640] config.c:   == Found

===========1.8.2============
[Feb 10 13:25:34] VERBOSE[19781] pbx.c:     -- Executing [s@macro-procexten:32] Dial("SIP/GXP0018-00000155", "SIP/GXP0010,30,r") in new stack
[Feb 10 13:25:34] VERBOSE[19781] netsock2.c:   == Using SIP RTP CoS mark 5
[Feb 10 13:25:34] VERBOSE[19781] app_dial.c:     -- Called GXP0010
[Feb 10 13:25:34] VERBOSE[11880] chan_sip.c:   == Extension Changed 8523[trusted] new state Ringing for Notify User GXP0011
[Feb 10 13:25:34] VERBOSE[11880] chan_sip.c:   == Extension Changed 8523[trusted] new state Ringing for Notify User GXP0009
[Feb 10 13:25:34] VERBOSE[11880] chan_sip.c:   == Extension Changed 8523[trusted] new state Ringing for Notify User GXP0012
[Feb 10 13:25:35] VERBOSE[19781] app_dial.c:     -- SIP/GXP0010-00000156 is ringing
[Feb 10 13:25:38] VERBOSE[11880] chan_sip.c:   == Extension Changed 8523[trusted] new state InUse for Notify User GXP0011
[Feb 10 13:25:38] VERBOSE[11880] chan_sip.c:   == Extension Changed 8523[trusted] new state InUse for Notify User GXP0009
[Feb 10 13:25:38] VERBOSE[11880] chan_sip.c:   == Extension Changed 8523[trusted] new state InUse for Notify User GXP0012
[Feb 10 13:25:38] VERBOSE[19781] app_dial.c:     -- SIP/GXP0010-00000156 answered SIP/GXP0018-00000155
[Feb 10 13:25:38] VERBOSE[19781] rtp_engine.c:     -- Remotely bridging SIP/GXP0018-00000155 and SIP/GXP0010-00000156
[Feb 10 13:25:48] VERBOSE[19781] res_musiconhold.c:     -- Started music on hold, class 'default', on SIP/GXP0010-00000156
[Feb 10 13:25:51] VERBOSE[11887] res_musiconhold.c:     -- Stopped music on hold on IAX2/auckbdt-2102
[Feb 10 13:25:51] VERBOSE[11887] res_musiconhold.c:     -- Stopped music on hold on SIP/GXP0010-00000156
[Feb 10 13:30:01] VERBOSE[19834] config.c:   == Parsing '/etc/asterisk/asterisk.conf': [Feb 10 13:30:01] VERBOSE[19834] config.c:   == Found



By: Alec Davis (alecdavis) 2011-02-09 20:31:14.000-0600

reproducable I can do it on demand, better wait till after hours, unhappy staff.

may be over complicated, but from the logs I could see an IAX trunk seemed to be involved.

 SIPA rings SIPB over IAX trunk
 SIPB does an attended transfer to SIPC
 Then segfault!
 
 If SIPB does a direct transfer to SIPC no segfault

Setup: All SIP phones are Grandstream GXP series<pre>  SIPA
ASTERISK-1
  IAX
  IAX
ASTERISK-2
  SIPB   SIPC</pre>



By: Alec Davis (alecdavis) 2011-02-09 22:20:25.000-0600

BAD MAGIC NUMBER: A dahdi call dials GXP0001, then GXP0001 attended transfer to GXP0004

Test:

DAHDI
ASTERISK
 GXP0001 GXP0004

  -- Called GXP0004
   -- SIP/GXP0004-000000b4 is ringing
   -- SIP/GXP0004-000000b4 answered SIP/GXP0001-000000b3
   -- Remotely bridging SIP/GXP0001-000000b3 and SIP/GXP0004-000000b4
   -- Started music on hold, class 'default', on SIP/GXP0004-000000b4
   -- Stopped music on hold on DAHDI/i1/5604866-e2
   -- Stopped music on hold on SIP/GXP0004-000000b4
[Feb 10 17:11:58] ERROR[14146]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xb03eeb20
[Feb 10 17:11:58] ERROR[14146]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xb03eeb20



By: Alec Davis (alecdavis) 2011-02-09 22:38:43.000-0600

Confirmed segfault is only when a call comes in over IAX2 TRUNK

GXP9000
ASTERISK-1
  IAX2
   ||
  IAX2
ASTERISK-2
GXP0001   GXP0004

>>>>> Call coming from IAX2 to GXP0001 <<<<<
   -- Executing [s@macro-procexten:32] Dial("IAX2/auckbdt-46", "SIP/GXP0001,10,r") in new stack
 == Using SIP RTP CoS mark 5
   -- Called GXP0001
   -- SIP/GXP0001-00000008 is ringing
   -- SIP/GXP0001-00000008 answered IAX2/auckbdt-46
>>>>> GXP0001 now talking on line1 to IAX2 caller <<<<<

   -- Started music on hold, class 'default', on IAX2/auckbdt-46
>>>>> GXP0001 has select another line, IAX2 caller has MOH <<<<<

>>>>> GXP0001 dials GXP0004 <<<<<
   -- Executing [s@macro-procexten:32] Dial("SIP/GXP0001-00000009", "SIP/GXP0004,20,r") in new stack
 == Using SIP RTP CoS mark 5
   -- Called GXP0004
   -- SIP/GXP0004-0000000a is ringing
   -- SIP/GXP0004-0000000a answered SIP/GXP0001-00000009
   -- Remotely bridging SIP/GXP0001-00000009 and SIP/GXP0004-0000000a
>>>>> GXP0001 now talking on line2 with GXP0004 <<<<<

>>>>> GXP0001 now transfers line1 to line 2 <<<<<
   -- Started music on hold, class 'default', on SIP/GXP0004-0000000a
   -- Stopped music on hold on IAX2/auckbdt-46
   -- Stopped music on hold on SIP/GXP0004-0000000a
astrid*CLI>
Disconnected from Asterisk server
>>>>> SEGFAULTED <<<<<



By: Alec Davis (alecdavis) 2011-02-16 22:31:46.000-0600

Segfault still exists with SVN-branch-1.8-r307793 and SVN-trunk-r308157
uploaded:
  gdb-SVN-branch-1.8-r307793.txt
  gdb-SVN-trunk-r308157.txt



By: Alec Davis (alecdavis) 2011-02-17 03:31:39.000-0600

This is with SVN-trunk-r308157M with bug18781.diff.txt applied

Setup:

  FXS
ASTERISK-myhome
  IAX2
   ||
   ||
  IAX2
ASTERISK-work
SIP/GXP8923 SIP/GXP8922

console output:
with simple call, note patch has extra debug prints.

  -- Accepting AUTHENTICATED call from XXX.YYY.ZZZ.AAA:
      > requested format = alaw,
      > requested prefs = (alaw|ulaw|gsm),
      > actual format = alaw,
      > host prefs = (alaw|ulaw|gsm),
      > priority = mine
   -- Executing [8923@trusted:1] Dial("IAX2/myhome-2102", "SIP/GXP8923,,r") in new stack
 == Using SIP RTP CoS mark 5
   -- Called GXP8923
   -- SIP/GXP8923-00000006 is ringing
   -- SIP/GXP8923-00000006 answered IAX2/myhome-2102
   -- Started music on hold, class 'default', on IAX2/myhome-2102
 == Using SIP RTP CoS mark 5
   -- Executing [8922@trusted:1] Dial("SIP/GXP8923-00000007", "SIP/GXP8922,,r") in new stack
 == Using SIP RTP CoS mark 5
   -- Called GXP8922
   -- SIP/GXP8922-00000008 is ringing
   -- SIP/GXP8922-00000008 answered SIP/GXP8923-00000007
   -- Remotely bridging SIP/GXP8923-00000007 and SIP/GXP8922-00000008
<b>[Feb 17 22:20:51] WARNING[3043]: rtp_engine.c:1078 remote_bridge_loop: Channels c0->'SIP/GXP8923-00000007' c1->'SIP/GXP8922-00000008' in remote_bridge_loop</b>
   -- Started music on hold, class 'default', on SIP/GXP8922-00000008
   -- Stopped music on hold on IAX2/myhome-2102
   -- Stopped music on hold on SIP/GXP8922-00000008
<b>[Feb 17 22:20:53] WARNING[3043]: rtp_engine.c:1086 remote_bridge_loop: Oooh, something is weird, backing out
<u>>>>>>>>>>>>>>>> Normally would segfault here <<<<<<<<<<<<<<<<<<<<<<</u>
[Feb 17 22:20:53] WARNING[3043]: rtp_engine.c:1276 remote_bridge_loop: Channel c0->'IAX2/myhome-2102' changed, in bridge with c1->'SIP/GXP8922-00000008'</b>
 == Spawn extension (trusted, 8923, 1) exited non-zero on 'SIP/GXP8923-00000007<ZOMBIE>'
 == Spawn extension (trusted, 8922, 1) exited non-zero on 'IAX2/myhome-2102'
   -- Hungup 'IAX2/myhome-2102'



By: Alec Davis (alecdavis) 2011-02-17 04:13:07.000-0600

bug18781.diff.txt applied also fixes BAD MAGIC number, when call is from DAHDI, and call is transferred.

Purposely disabled part of patch to demonstrate.

Setup:
 DAHDI E1
ASTERISK-work
SIP/GXP8923 SIP/GXP8922

<b>[Feb 17 23:08:18] WARNING[7746]: rtp_engine.c:1022 remote_bridge_loop: Channels c0->'SIP/GXP8923-00000001' c1->'SIP/GXP8922-00000002' in remote_bridge_loop</b>
   -- Started music on hold, class 'default', on SIP/GXP8922-00000002
   -- Stopped music on hold on DAHDI/i1/4XXXX855-1
   -- Stopped music on hold on SIP/GXP8922-00000002
<b>[Feb 17 23:08:22] WARNING[7746]: rtp_engine.c:1030 remote_bridge_loop: Oooh, something is weird, backing out
[Feb 17 23:08:22] WARNING[7746]: rtp_engine.c:1208 remote_bridge_loop: Channel c0->'DAHDI/i1/4XXXX855-1' changed, in bridge with c1->'SIP/GXP8922-00000002'
<u>[Feb 17 23:08:22] ERROR[7746]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xafba43d8
[Feb 17 23:08:22] ERROR[7746]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0x1965 for 0xafba43d8</u>
[Feb 17 23:08:22] WARNING[7746]: rtp_engine.c:1216 remote_bridge_loop: Channel c1->'SIP/GXP8922-00000002' changed, in bridge with c0->'DAHDI/i1/4XXXX855-1'</b>
 == Spawn extension (trusted, 8923, 1) exited non-zero on 'SIP/GXP8923-00000001<ZOMBIE>'



By: Alec Davis (alecdavis) 2011-02-17 05:48:55.000-0600

Different approach with bug18781.diff2.txt
only update_peer if glueX 'technology type' hasn't changed.

By: Matt Riddell (zx81) 2011-02-23 17:03:05.000-0600

We've just had the same crash and applied the diff2 patch - will report on whether we see the crash again.

By: Alec Davis (alecdavis) 2011-02-23 17:14:05.000-0600

ZX81: You may also need to consider ASTERISK-17431

By: Matt Riddell (zx81) 2011-02-23 18:40:36.000-0600

Ok, cool thanks Alec

By: Matt Riddell (zx81) 2011-02-24 16:21:46.000-0600

Just had the following before a crash:

[Feb 25 11:09:40]     -- Remotely bridging SIP/716-0000007b and SIP/707-0000007c
[Feb 25 11:09:40] WARNING[29487]: rtp_engine.c:1022 remote_bridge_loop: Channels c0->'SIP/716-0000007b' c1->'SIP/707-0000007c' in remote_bridge_loop
[Feb 25 11:09:44]   == Manager 'andrew' logged on from 10.1.1.45
[Feb 25 11:09:47]     -- Stopped music on hold on lcr/35
[Feb 25 11:09:47] WARNING[29487]: rtp_engine.c:1030 remote_bridge_loop: Oooh, something is weird, backing out
[Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818
[Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818
[Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818
[Feb 25 11:09:47] ERROR[29487]: astobj2.c:116 INTERNAL_OBJ: bad magic number 0xa01 for 0x18a1818
[Feb 25 11:09:47]   == Spawn extension (custom-day, s, 1) exited non-zero on 'SIP/716-0000007b<ZOMBIE>'
[Feb 25 11:09:47]   == Extension Changed 716[extensions] new state Idle for Notify User 700
lucas*CLI>
Disconnected from Asterisk server
[Feb 25 11:09:48] Executing last minute cleanups
Asterisk ending (0).

By: Matt Riddell (zx81) 2011-02-24 16:25:54.000-0600

Is there any way to just disable remote bridging?  Other than forcing every phone to use a different codec :)

By: Alec Davis (alecdavis) 2011-02-24 18:13:58.000-0600

ZX81: sip.conf directmedia=no previously was canreinvite=no seemed to disable remote bridge, at least for a sip to sip call, and a transfer to sip call



By: Matt Riddell (zx81) 2011-02-24 18:19:03.000-0600

Yeah, I've just set that :)

Now dealing with another problem :)

By: Alec Davis (alecdavis) 2011-03-03 14:59:19.000-0600

bug18781.diff3.txt update from reviewboard

don't call update_peer if one of the following; where 'x' is either 0/1
1). ZOMBIE
2). cx->tech_pvt != pvtx
3). gluex != ast_rtp_instance_get_glue(cx->tech->type))

By: Alec Davis (alecdavis) 2011-03-03 15:04:56.000-0600

ZX81: are you able to test the diff3 patch.

If I read you last crash correctly, IAX or DAHDI wern't involved.

By: Matt Riddell (zx81) 2011-03-03 15:18:44.000-0600

I've reverted everything to stock 1.8 SVN, applied your patch (diff3) and will let you know if I see a crash.  Oh, and I've removed the directmedia=no.

By: Alec Davis (alecdavis) 2011-03-10 21:58:01.000-0600

ZX81: Do you still have these patches in place?
You and I seem to be the only ones that have tripped over this :(

They're working well for us, just need to get another positve and I'll get these committed.

By: Matt Riddell (zx81) 2011-03-10 22:03:15.000-0600

Yep, no crashes even with directmedia removed. Patch working well.

By: Digium Subversion (svnbot) 2011-03-11 00:47:46.000-0600

Repository: asterisk
Revision: 310287

U   branches/1.8/main/rtp_engine.c

------------------------------------------------------------------------
r310287 | alecdavis | 2011-03-11 00:47:45 -0600 (Fri, 11 Mar 2011) | 17 lines

remote_bridge_loop: prevent segfault when after transfer of IAX2 of DAHDI call

If the channel condition is one of the following after breaking out of the loop, don't try to update_peer
(where x = 0/1)
1). ZOMBIE
2). cx->tech_pvt != pvtx
3). gluex != ast_rtp_instance_get_glue(cx->tech->type))

(closes issue ASTERISK-17378)
Reported by: alecdavis
Patches:
     bug18781.diff3.txt uploaded by alecdavis (license 585)
Tested by: alecdavis, ZX81

Review: https://reviewboard.asterisk.org/r/1128/


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=310287

By: Digium Subversion (svnbot) 2011-03-11 00:56:07.000-0600

Repository: asterisk
Revision: 310288

_U  trunk/
U   trunk/main/rtp_engine.c

------------------------------------------------------------------------
r310288 | alecdavis | 2011-03-11 00:56:07 -0600 (Fri, 11 Mar 2011) | 23 lines

Merged revisions 310287 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

........
 r310287 | alecdavis | 2011-03-11 19:47:44 +1300 (Fri, 11 Mar 2011) | 17 lines
 
 remote_bridge_loop: prevent segfault when after transfer of IAX2 of DAHDI call
 
 If the channel condition is one of the following after breaking out of the loop, don't try to update_peer
 (where x = 0/1)
  1). ZOMBIE
  2). cx->tech_pvt != pvtx
  3). gluex != ast_rtp_instance_get_glue(cx->tech->type))
 
 (closes issue ASTERISK-17378)
 Reported by: alecdavis
 Patches:
       bug18781.diff3.txt uploaded by alecdavis (license 585)
 Tested by: alecdavis, ZX81
 
 Review: https://reviewboard.asterisk.org/r/1128/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=310288