Summary: ASTERISK-17080: [patch] Deadlock in chan_sip
Reporter: alric (alric)
Labels:
Date Opened: 2010-12-08 17:10:34.000-0600
Date Closed: 2011-05-05 14:33:16
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) additional_requested_debugging.txt
( 1) another_core_show_locks.txt
( 2) bug18441.patch
( 3) core_show_locks.txt
( 4) locks_after_REFER.txt
Description: I've been testing T.38 with my SIP provider using a Linksys ATA and a fax machine behind it.

I initiate the fax with my fax machine and it connects with the other end.  Shortly after the fax machine thinks the call disconnects.  If I type a 'core show channels' at this point, I don't see any active channels.  At this point the PBX does not accept any more calls.  Commands also seem to not go through on the CLI until I 'exit' and 'asterisk -vvvvvr' again.

Asterisk is compiled with DONT_OPTIMIZE and DEBUG_THREADS.  I have run this scenario 3 to 5 times now with the same result, so I am confident I can reproduce it as needed.

Attached is the output from 'core show locks' after the presumed deadlock.
Comments:

By: alric (alric) 2010-12-08 17:12:14.000-0600

I originally ran into this on 1.8.0, and after upgrading to 1.8.1 I can still reproduce it.

By: alric (alric) 2010-12-08 17:17:52.000-0600

OS is Ubuntu 10.04

By: Leif Madsen (lmadsen) 2010-12-08 19:00:33.000-0600

I'd suggest getting a backtrace from the running process as well.

By: alric (alric) 2010-12-09 09:53:08.000-0600

Attached file has a bt, bt full, thread apply all bt and another core show locks per our IRC chat.

By: viniciusfontes (viniciusfontes) 2010-12-10 04:29:42.000-0600

I'm having similar problems with deadlocks but I don't think fax is involved. Asterisk just deadlocks "for no reason".

Posting here because I'm almost sure it's the same bug.

Attached is the output of core show locks on my system.



By: Leif Madsen (lmadsen) 2010-12-16 09:42:53.000-0600

After talking to mnicholson, this is a legit deadlock, but is not fax related.

By: Kadir Terzi (kterzi) 2011-01-13 16:49:40.000-0600

I think my issue (ASTERISK-1831569) is also related to this one.

By: alric (alric) 2011-01-17 09:17:17.000-0600

Issue persists in 1.8.2.

By: Wolfgang Liegel (wliegel) 2011-01-18 09:16:58.000-0600

I can reproduce this deadlock in 1.8.2 by doing a simple SIP REFER (blind transfer).
The output of 'core show locks' looks much the same as the first locks file posted by Alric.

By: alric (alric) 2011-01-24 10:36:47.000-0600

Just tested 1.8.2.2, as well as 1.6.2.16.1.  I can reproduce the deadlock in both versions.

By: alric (alric) 2011-01-24 13:06:34.000-0600

I tested 14 different versions of Asterisk today, ranging from 1.6.1.0 to 1.8.2.2.

As near as I can tell, 1.6.1.11 does not deadlock in the same situation where 1.6.1.12 does.

Additionally, I tried the latest 1.8.2.2 with normal sip friends (non-realtime) and udptl disabled to perhaps try and isolate things a bit. I can still reproduce the deadlock under those conditions.

By: alric (alric) 2011-01-27 09:02:11.000-0600

I traced this back to faxdetect = yes.

I had that parameter set to yes, but I had no fax extension defined.  I got the inspiration to check on that from the fax detection things that were added between 1.6.1.11 and 1.6.1.12.  I see some additional locks and unlocks were added.  I don't have the C knowledge to say if those locks are related though.

It does look like that section of code (near line 6610 on 1.6.1.12, line 6980 on 1.8.2.2) has remained relatively unchanged through all those versions.

By: Jeff Peeler (jpeeler) 2011-01-28 11:51:53.000-0600

The patch I uploaded should resolve the deadlock shown in the "Another core show locks" file. I believe the other two deadlocks (which are the same) were already fixed in ASTERISK-17046, but that fix has not yet made it into a release. viniciusfontes, please test the patch; everyone else, please test the head of the branch.

By: Digium Subversion (svnbot) 2011-05-05 14:09:15

Repository: asterisk
Revision: 317283

U   branches/1.8/channels/chan_sip.c

------------------------------------------------------------------------
r317283 | jrose | 2011-05-05 14:09:14 -0500 (Thu, 05 May 2011) | 10 lines

Resolves a deadlock that occurs during sip_new

This is based on an uncommitted patch by jpeeler for the issue.  Instead of
relocking and then unlocking the channel though, we keep the lock on the channel
until we are finished doing what we need to the channel.

(closes issue ASTERISK-17080)
Reported by: Alric


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=317283

By: Digium Subversion (svnbot) 2011-05-05 14:33:13

Repository: asterisk
Revision: 317334

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r317334 | jrose | 2011-05-05 14:33:12 -0500 (Thu, 05 May 2011) | 16 lines

Merged revisions 317283 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

........
 r317283 | jrose | 2011-05-05 14:09:13 -0500 (Thu, 05 May 2011) | 10 lines
 
 Resolves a deadlock that occurs during sip_new
 
 This is based on an uncommitted patch by jpeeler for the issue.  Instead of
 relocking and then unlocking the channel though, we keep the lock on the channel
 until we are finished doing what we need to the channel.
 
 (closes issue ASTERISK-17080)
 Reported by: Alric
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=317334
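
Note on the commit message above: r317283 describes keeping the channel lock held until all work on the channel is finished, rather than unlocking and then relocking it. The following is a minimal sketch of that general pattern using plain pthreads. It is not the chan_sip code; the struct, field, and function names (demo_channel, setup_relock, setup_hold, and so on) are hypothetical and exist only for illustration.

```c
#include <pthread.h>

/* Hypothetical stand-in for a channel; NOT Asterisk's struct ast_channel. */
struct demo_channel {
    pthread_mutex_t lock;
    int lock_acquisitions; /* counts how often the lock was taken */
    int format_set;
    int name_set;
};

static void demo_channel_init(struct demo_channel *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->lock_acquisitions = 0;
    c->format_set = 0;
    c->name_set = 0;
}

static void demo_lock(struct demo_channel *c)
{
    pthread_mutex_lock(&c->lock);
    c->lock_acquisitions++;
}

static void demo_unlock(struct demo_channel *c)
{
    pthread_mutex_unlock(&c->lock);
}

/*
 * Variant A: unlock after the first step, then relock for the second.
 * Between the two critical sections another thread can take the lock,
 * and the second acquisition may happen while the caller already holds
 * other locks, which is where lock-order problems tend to appear.
 */
static void setup_relock(struct demo_channel *c)
{
    demo_lock(c);
    c->format_set = 1;
    demo_unlock(c);

    /* window in which the channel is unprotected */

    demo_lock(c);
    c->name_set = 1;
    demo_unlock(c);
}

/*
 * Variant B: take the lock once and hold it until every step that
 * touches the channel is done. There is no second acquisition and no
 * unprotected window.
 */
static void setup_hold(struct demo_channel *c)
{
    demo_lock(c);
    c->format_set = 1;
    c->name_set = 1;
    demo_unlock(c);
}
```

Calling setup_relock() on an initialized demo_channel takes the lock twice, while setup_hold() takes it once and leaves the same final state. Whether holding a lock longer is safe depends on what else runs inside the critical section, so this should be read as an illustration of the idea in the commit message, not as a general rule.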