[Home]

Summary:ASTERISK-16711: [patch] Exceptionally long queue length queuing to XXXXX
Reporter:Alec Davis (alecdavis)Labels:
Date Opened:2010-09-22 06:18:50Date Closed:2012-05-04 11:31:19
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Core/PBX
Versions:Frequency of
Occurrence
Related
Issues:
is related toASTERISK-19807 Create masquerade "super-test" for Asterisk Test Suite
Environment:Attachments:( 0) expectionally-us.txt
( 1) fix-issue_18020-trunk.diff
( 2) fix-issue_18020-v1.4.patch
( 3) fix-issue_18028-v1.8.1.1.patch
( 4) Longqueue.txt
( 5) putty1.log
( 6) putty2.log
Description:Although this is under test conditions, this can go on for ever - seems like 100's per second.

But the state of code at the moment, allows you to hangup, and it clears.

****** ADDITIONAL INFORMATION ******

Using the following dialplan: and dialling 10030 to create 30 looped localchan calls.

[echo-test]
exten => s,1,NoOp(${CALLERID(num):-4})
exten => s,n,Answer()
exten => s,n,Playback(echo-test)
exten => s,n,Echo()
exten => s,n,Playback(goodbye)
exten => s,n,HangUp()

[phones]
exten => 10000,1,Goto(echo-test,s,1)

exten => _1XXXX,1,Set(i=${MATH(${EXTEN}-1,int)})
exten => _1XXXX,n,Dial(Local/${i}@phones)
Comments:By: Alec Davis (alecdavis) 2010-09-22 06:22:15

attached file Longqueue.txt is a 2 second call to 10020.

By: Stefan Schmidt (schmidts) 2010-09-22 08:02:52

it looks like there is a maquerading timing issue with this.
could you try this with asterisk -vvvvgT started, cause normally you should see a better time description on this (not only sec, also ms) and it would be clearer to see what happens when.

By: Alec Davis (alecdavis) 2010-09-23 01:50:08

Log file exceptionally-us.txt with microsecond resolution attached.

To get microsecond resolution required a change in logger.conf

[general]
;
; Customize the display of debug message time stamps
; this example is the ISO 8601 date format (yyyy-mm-dd HH:MM:SS)
;
; see strftime(3) Linux manual for format specifiers.  Note that there is also
; a fractional second parameter which may be used in this field.  Use %1q
; for tenths, %2q for hundredths, etc.
;
;dateformat=%F %T       ; ISO 8601 date format
;dateformat=%F %T.%3q   ; with milliseconds
dateformat=%F %T.%6q   ; with microseconds

By: Stefan Schmidt (schmidts) 2010-09-23 03:00:42

sorry was my fault i mean you should start it with console -cvvvvgT so the time would also be there on the verbose.
your log only shows the microseconds after the problem starts.

i have tested this by myself on trunk and with 20 or 50 local loops i didnt see a problem. with 300 i got a deadlock. with 1000 after 372 local calls i get a chanunavail.



By: Stefan Schmidt (schmidts) 2010-09-23 03:39:22

but this was only sip, i see you have this with dahdi, maybe thats the point why i didnt see this queue length message.

By: Alec Davis (alecdavis) 2010-09-24 21:56:08

That deadlock, although under 'testing' conditions, potentially could happen at any time, it's just a matter of time. We need to at least get it documented.

For the record, my machine isn't a core2duo:
Intel(R) Pentium(R) 4 CPU 2.60GHz (hyper threading enabled)
Memtotal: 2068264

When were getting the "Exceptionally long queue", the CPU is at 188%
Then the messages stop, seems like it goes back to normal, no deadlocks, no high CPU. But no audio channel!

Also many open channels: where they would have normally been optimized out.
note: we are not using the '/n' switch.

asterix*CLI> core show channels
Channel              Location             State   Application(Data)
Local/10013@phones-7 (None)               Up      AppDial((Outgoing Line))
Local/10013@phones-7 10013@phones:2       Up      Dial(Local/10012@phones)
Local/10005@phones-0 (None)               Up      AppDial((Outgoing Line))
Local/10005@phones-0 10005@phones:2       Up      Dial(Local/10004@phones)
Local/10028@phones-2 (None)               Up      AppDial((Outgoing Line))
Local/10028@phones-2 10028@phones:2       Up      Dial(Local/10027@phones)
Local/10012@phones-a (None)               Up      AppDial((Outgoing Line))
Local/10012@phones-a 10012@phones:2       Up      Dial(Local/10011@phones)
Local/10027@phones-2 10027@phones:2       Up      Dial(Local/10026@phones)
Local/10027@phones-2 (None)               Up      AppDial((Outgoing Line))
Local/10033@phones-8 10033@phones:2       Up      Dial(Local/10032@phones)
Local/10033@phones-8 (None)               Up      AppDial((Outgoing Line))
Local/10009@phones-1 (None)               Up      AppDial((Outgoing Line))
Local/10009@phones-1 10009@phones:2       Up      Dial(Local/10008@phones)
Local/10006@phones-0 (None)               Up      AppDial((Outgoing Line))
Local/10006@phones-0 10006@phones:2       Up      Dial(Local/10005@phones)
Local/10018@phones-a (None)               Up      AppDial((Outgoing Line))
Local/10018@phones-a 10018@phones:2       Up      Dial(Local/10017@phones)
Local/10032@phones-1 (None)               Up      AppDial((Outgoing Line))
Local/10032@phones-1 10032@phones:2       Up      Dial(Local/10031@phones)
Local/10022@phones-6 (None)               Up      AppDial((Outgoing Line))
Local/10022@phones-6 10022@phones:2       Up      Dial(Local/10021@phones)
Local/10026@phones-9 (None)               Up      AppDial((Outgoing Line))
Local/10026@phones-9 10026@phones:2       Up      Dial(Local/10025@phones)
Local/10040@phones-4 10040@phones:2       Up      Dial(Local/10039@phones)
Local/10040@phones-4 (None)               Up      AppDial((Outgoing Line))
Local/10001@phones-8 (None)               Up      AppDial((Outgoing Line))
Local/10001@phones-8 10001@phones:2       Up      Dial(Local/10000@phones)
Local/10046@phones-e (None)               Up      AppDial((Outgoing Line))
Local/10046@phones-e 10046@phones:2       Up      Dial(Local/10045@phones)
Local/10000@phones-0 s@echo-test:4        Up      Echo()
Local/10008@phones-4 10008@phones:2       Up      Dial(Local/10007@phones)
Local/10000@phones-0 (None)               Up      AppDial((Outgoing Line))
Local/10008@phones-4 (None)               Up      AppDial((Outgoing Line))
Local/10020@phones-3 10020@phones:2       Up      Dial(Local/10019@phones)
Local/10020@phones-3 (None)               Up      AppDial((Outgoing Line))
Local/10047@phones-c 10047@phones:2       Up      Dial(Local/10046@phones)
Local/10047@phones-c (None)               Up      AppDial((Outgoing Line))
Local/10045@phones-4 10045@phones:2       Up      Dial(Local/10044@phones)
Local/10045@phones-4 (None)               Up      AppDial((Outgoing Line))
Local/10023@phones-8 (None)               Up      AppDial((Outgoing Line))
Local/10023@phones-8 10023@phones:2       Up      Dial(Local/10022@phones)
Local/10019@phones-1 10019@phones:2       Up      Dial(Local/10018@phones)
Local/10019@phones-1 (None)               Up      AppDial((Outgoing Line))
SIP/bt100black-00000 10050@phones:2       Up      Dial(Local/10049@phones)
Local/10011@phones-8 (None)               Up      AppDial((Outgoing Line))
Local/10011@phones-8 10011@phones:2       Up      Dial(Local/10010@phones)
Local/10016@phones-1 (None)               Up      AppDial((Outgoing Line))
Local/10029@phones-4 (None)               Up      AppDial((Outgoing Line))
Local/10016@phones-1 10016@phones:2       Up      Dial(Local/10015@phones)
Local/10029@phones-4 10029@phones:2       Up      Dial(Local/10028@phones)
Local/10010@phones-6 10010@phones:2       Up      Dial(Local/10009@phones)
Local/10010@phones-6 (None)               Up      AppDial((Outgoing Line))
Local/10025@phones-b (None)               Up      AppDial((Outgoing Line))
Local/10025@phones-b 10025@phones:2       Up      Dial(Local/10024@phones)
Local/10007@phones-f (None)               Up      AppDial((Outgoing Line))
Local/10007@phones-f 10007@phones:2       Up      Dial(Local/10006@phones)
Local/10017@phones-7 10017@phones:2       Up      Dial(Local/10016@phones)
Local/10017@phones-7 (None)               Up      AppDial((Outgoing Line))
Local/10015@phones-e (None)               Up      AppDial((Outgoing Line))
Local/10015@phones-e 10015@phones:2       Up      Dial(Local/10014@phones)
Local/10037@phones-4 (None)               Up      AppDial((Outgoing Line))
Local/10037@phones-4 10037@phones:2       Up      Dial(Local/10036@phones)
Local/10042@phones-3 (None)               Up      AppDial((Outgoing Line))
Local/10042@phones-3 10042@phones:2       Up      Dial(Local/10041@phones)
Local/10035@phones-a (None)               Up      AppDial((Outgoing Line))
Local/10035@phones-a 10035@phones:2       Up      Dial(Local/10034@phones)
Local/10036@phones-a (None)               Up      AppDial((Outgoing Line))
Local/10030@phones-3 (None)               Up      AppDial((Outgoing Line))
Local/10036@phones-a 10036@phones:2       Up      Dial(Local/10035@phones)
Local/10030@phones-3 10030@phones:2       Up      Dial(Local/10029@phones)
Local/10044@phones-e 10044@phones:2       Up      Dial(Local/10043@phones)
Local/10044@phones-e (None)               Up      AppDial((Outgoing Line))
Local/10021@phones-6 10021@phones:2       Up      Dial(Local/10020@phones)
Local/10021@phones-6 (None)               Up      AppDial((Outgoing Line))
Local/10024@phones-9 10024@phones:2       Up      Dial(Local/10023@phones)
Local/10024@phones-9 (None)               Up      AppDial((Outgoing Line))
Local/10049@phones-9 10049@phones:2       Up      Dial(Local/10048@phones)
Local/10049@phones-9 (None)               Up      AppDial((Outgoing Line))
Local/10039@phones-1 10039@phones:2       Up      Dial(Local/10038@phones)
Local/10039@phones-1 (None)               Up      AppDial((Outgoing Line))
Local/10038@phones-e (None)               Up      AppDial((Outgoing Line))
Local/10038@phones-e 10038@phones:2       Up      Dial(Local/10037@phones)
Local/10031@phones-e 10031@phones:2       Up      Dial(Local/10030@phones)
Local/10031@phones-e (None)               Up      AppDial((Outgoing Line))
Local/10014@phones-2 (None)               Up      AppDial((Outgoing Line))
Local/10014@phones-2 10014@phones:2       Up      Dial(Local/10013@phones)
Local/10048@phones-f 10048@phones:2       Up      Dial(Local/10047@phones)
Local/10048@phones-f (None)               Up      AppDial((Outgoing Line))
Local/10043@phones-3 10043@phones:2       Up      Dial(Local/10042@phones)
Local/10043@phones-3 (None)               Up      AppDial((Outgoing Line))
Local/10034@phones-7 (None)               Up      AppDial((Outgoing Line))
Local/10034@phones-7 10034@phones:2       Up      Dial(Local/10033@phones)
Local/10041@phones-2 10041@phones:2       Up      Dial(Local/10040@phones)
Local/10041@phones-2 (None)               Up      AppDial((Outgoing Line))
95 active channels
48 active calls
56 calls processed
asterix*CLI> core show locks

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
===
=======================================================================

asterix*CLI>

Then, hang up, they all clear down.

 == Spawn extension (phones, 10050, 2) exited non-zero on 'SIP/bt100black-00000001'
 == Spawn extension (phones, 10049, 2) exited non-zero on 'Local/10049@phones-97ad;2'
 == Spawn extension (phones, 10047, 2) exited non-zero on 'Local/10047@phones-c5e1;2'
 == Spawn extension (phones, 10046, 2) exited non-zero on 'Local/10046@phones-e980;2'
 == Spawn extension (phones, 10045, 2) exited non-zero on 'Local/10045@phones-4580;2'
 == Spawn extension (phones, 10044, 2) exited non-zero on 'Local/10044@phones-e526;2'
 == Spawn extension (phones, 10043, 2) exited non-zero on 'Local/10043@phones-3d9a;2'
 == Spawn extension (phones, 10048, 2) exited non-zero on 'Local/10048@phones-f59b;2'
 == Spawn extension (phones, 10042, 2) exited non-zero on 'Local/10042@phones-30c5;2'
 == Spawn extension (phones, 10041, 2) exited non-zero on 'Local/10041@phones-2fe9;2'
 == Spawn extension (phones, 10040, 2) exited non-zero on 'Local/10040@phones-4528;2'
 == Spawn extension (phones, 10037, 2) exited non-zero on 'Local/10037@phones-4d65;2'
 == Spawn extension (phones, 10036, 2) exited non-zero on 'Local/10036@phones-a3d2;2'
 == Spawn extension (phones, 10035, 2) exited non-zero on 'Local/10035@phones-adc6;2'
 == Spawn extension (phones, 10038, 2) exited non-zero on 'Local/10038@phones-e3c2;2'
 == Spawn extension (phones, 10034, 2) exited non-zero on 'Local/10034@phones-70fc;2'
 == Spawn extension (phones, 10033, 2) exited non-zero on 'Local/10033@phones-81ef;2'
 == Spawn extension (phones, 10032, 2) exited non-zero on 'Local/10032@phones-1854;2'
 == Spawn extension (phones, 10031, 2) exited non-zero on 'Local/10031@phones-e84d;2'
 == Spawn extension (phones, 10030, 2) exited non-zero on 'Local/10030@phones-3107;2'
 == Spawn extension (phones, 10029, 2) exited non-zero on 'Local/10029@phones-4301;2'
 == Spawn extension (phones, 10028, 2) exited non-zero on 'Local/10028@phones-2b78;2'
 == Spawn extension (phones, 10027, 2) exited non-zero on 'Local/10027@phones-2bb8;2'
 == Spawn extension (phones, 10026, 2) exited non-zero on 'Local/10026@phones-909b;2'
 == Spawn extension (phones, 10025, 2) exited non-zero on 'Local/10025@phones-baf9;2'
 == Spawn extension (phones, 10023, 2) exited non-zero on 'Local/10023@phones-80f8;2'
 == Spawn extension (phones, 10021, 2) exited non-zero on 'Local/10021@phones-6344;2'
 == Spawn extension (phones, 10022, 2) exited non-zero on 'Local/10022@phones-6f2b;2'
 == Spawn extension (phones, 10020, 2) exited non-zero on 'Local/10020@phones-33e8;2'
 == Spawn extension (phones, 10019, 2) exited non-zero on 'Local/10019@phones-180a;2'
 == Spawn extension (phones, 10018, 2) exited non-zero on 'Local/10018@phones-a6c5;2'
 == Spawn extension (phones, 10024, 2) exited non-zero on 'Local/10024@phones-9111;2'
 == Spawn extension (phones, 10017, 2) exited non-zero on 'Local/10017@phones-72fb;2'
 == Spawn extension (phones, 10016, 2) exited non-zero on 'Local/10016@phones-1f65;2'
 == Spawn extension (phones, 10039, 2) exited non-zero on 'Local/10039@phones-1985;2'
 == Spawn extension (phones, 10014, 2) exited non-zero on 'Local/10014@phones-2566;2'
 == Spawn extension (phones, 10015, 2) exited non-zero on 'Local/10015@phones-e4d8;2'
 == Spawn extension (phones, 10013, 2) exited non-zero on 'Local/10013@phones-7335;2'
 == Spawn extension (phones, 10012, 2) exited non-zero on 'Local/10012@phones-aee5;2'
 == Spawn extension (phones, 10011, 2) exited non-zero on 'Local/10011@phones-89fa;2'
 == Spawn extension (phones, 10010, 2) exited non-zero on 'Local/10010@phones-6ef1;2'
 == Spawn extension (phones, 10009, 2) exited non-zero on 'Local/10009@phones-1488;2'
 == Spawn extension (phones, 10008, 2) exited non-zero on 'Local/10008@phones-41ee;2'
 == Spawn extension (phones, 10007, 2) exited non-zero on 'Local/10007@phones-f3c8;2'
 == Spawn extension (phones, 10006, 2) exited non-zero on 'Local/10006@phones-0a93;2'
 == Spawn extension (phones, 10005, 2) exited non-zero on 'Local/10005@phones-0ae1;2'
 == Spawn extension (echo-test, s, 4) exited non-zero on 'Local/10000@phones-056b;2'
 == Spawn extension (phones, 10001, 2) exited non-zero on 'Local/10001@phones-82b6;2'
asterix*CLI>
asterix*CLI> core show channels
Channel              Location             State   Application(Data)
0 active channels
0 active calls
56 calls processed
asterix*CLI>

Note: 10002, 10003, 10004 had been sucessfully optimized out.



By: Alec Davis (alecdavis) 2010-09-24 22:04:53

enabling '/n' on the dial command, leaves all channels open as it should, but still get exceptionally long queue.

By: John Covert (jcovert) 2011-02-16 17:16:15.000-0600

It doesn't take 30 local channels, it can happen with just one active.
The following is on 1.8.1.1

I just had an occurrence of this (message spewed 58,057 times over a three minute period) at a client who is getting more and more frustrated with the state of things:

[root@asterisk asterisk]# grep queue /tmp/x.tmp |wc
 58057  638627 7024897
[root@asterisk asterisk]# grep queue /tmp/x.tmp
[14-Feb-2011 13:16:36.75] WARNING[9558] channel.c: Exceptionally long queue length queuing to Local/sleep@monitor-be7c;2
[14-Feb-2011 13:16:36.75] WARNING[9558] channel.c: Exceptionally long queue length queuing to Local/sleep@monitor-be7c;2
...
[14-Feb-2011 13:19:30.78] WARNING[9559] channel.c: Exceptionally long queue length queuing to Local/sleep@monitor-be7c;2
[14-Feb-2011 13:19:31.37] WARNING[9559] channel.c: Exceptionally long queue length queuing to Local/sleep@monitor-be7c;2

While this was going on a dozen or so unrelated SIP calls got fouled up, and we just noticed they were still hanging around two days later, because a client called us and complained about missing recordings (which would have been queued when the Mixmonitor command completed)

Channel              Context              Extension        Prio State   Application  Data                      CallerID        Duration Accountcode PeerAccount BridgedTo
SIP/onstate-sip-serv macro-anyoldcall     s                  10 Up      Dial         SIP/+16239357327@onstate- +15132409191    51:33:49                         SIP/onstate-sip-serv
SIP/onstate-sip-serv macro-anyoldcall     s                  10 Up      Dial         SIP/+16232437242@onstate- +16102524416    51:44:38                         SIP/onstate-sip-serv

and these others:
SIP/asterisk4-00017e callskype            gmagent1            1 Up      AppDial      (Outgoing Line)           gmagent1        51:49:01                         SIP/onstate-sip-serv
SIP/asterisk4-00017e callskype            klauss-ia           1 Up      AppDial      (Outgoing Line)           klauss-ia       51:48:41                         SIP/onstate-sip-serv
SIP/onstate-sip-serv callskype            klauss-ia          31 Up      Dial         SIP/8759klauss-ia@asteris +12145457068    51:48:41                         SIP/asterisk4-00017e
SIP/brendar.moving.s onstate              8717brendar.movi    1 Up      AppDial      (Outgoing Line)           8717brendar.mov 51:53:47                         SIP/onstate-sip-serv
SIP/asterisk4-00017e callskype            aahcs.ladyclaire    1 Up      AppDial      (Outgoing Line)           aahcs.ladyclair 51:50:19                         SIP/onstate-sip-serv
SIP/onstate-sip-serv callskype            gmagent1           31 Up      Dial         SIP/8759gmagent1@asterisk +14104300932    51:49:01                         SIP/asterisk4-00017e
SIP/onstate-sip-serv onstate              8717brendar.movi   13 Up      Dial         SIP/brendar.moving.specia +12487988748    51:53:47                         SIP/brendar.moving.s
SIP/jenilynn-00017d8 onstate              8717jenilynn        1 Up      AppDial      (Outgoing Line)           8717jenilynn    51:54:38                         SIP/onstate-sip-serv
SIP/johns.moving.spe onstate              8717johns.moving    1 Up      AppDial      (Outgoing Line)           8717johns.movin 51:47:54                         SIP/onstate-sip-serv
SIP/onstate-sip-serv macro-anyoldcall     s                   1 Up      AppDial      (Outgoing Line)           +16128861133    51:54:23                         SIP/onstate-sip-serv
SIP/onstate-sip-serv onstate              8717topdog.james   13 Up      Dial         SIP/topdog.james.etchegar +12392623643    51:54:27                         SIP/topdog.james.etc
SIP/onstate-sip-serv onstate              8717jenilynn       13 Up      Dial         SIP/jenilynn,120,M(onansw +14058485000    51:54:38                         SIP/jenilynn-00017d8
SIP/onstate-sip-serv onstate              8717bcrann.movin   13 Up      Dial         SIP/bcrann.moving.special +14044318017    51:48:13                         SIP/bcrann.moving.sp
SIP/onstate-sip-serv callskype            aahcs.rowena.raq   27 Up      Dial         SIP/8759aahcs.rowena.raqu +13096696012    51:49:57                         SIP/asterisk2-00017e
SIP/onstate-sip-serv callskype            osa.ivyjoy.villa   27 Up      Dial         SIP/8759osa.ivyjoy.villan +14434800725    51:50:42                         SIP/asterisk2-00017e
SIP/asterisk2-00017e callskype            aahcs.christine.    1 Up      AppDial      (Outgoing Line)           aahcs.christine 51:51:24                         SIP/onstate-sip-serv
SIP/onstate-sip-serv callskype            np.agent3          27 Up      Dial         SIP/8759np.agent3@asteris +14085957148    51:50:51                         SIP/asterisk2-00017e
SIP/onstate-sip-serv callskype            aahcs.ladyclaire   31 Up      Dial         SIP/8759aahcs.ladyclaire. +18707465111    51:50:20                         SIP/asterisk4-00017e
SIP/asterisk2-00017e callskype            np.agent3           1 Up      AppDial      (Outgoing Line)           np.agent3       51:50:51                         SIP/onstate-sip-serv
SIP/topdog.james.etc onstate              8717topdog.james    1 Up      AppDial      (Outgoing Line)           8717topdog.jame 51:54:27                         SIP/onstate-sip-serv
SIP/asterisk2-00017e callskype            osa.ivyjoy.villa    1 Up      AppDial      (Outgoing Line)           osa.ivyjoy.vill 51:50:42                         SIP/onstate-sip-serv
SIP/onstate-sip-serv callskype            aahcs.christine.   27 Up      Dial         SIP/8759aahcs.christine.c +15802239447    51:51:24                         SIP/asterisk2-00017e
SIP/asterisk2-00017e callskype            aahcs.rowena.raq    1 Up      AppDial      (Outgoing Line)           aahcs.rowena.ra 51:49:56                         SIP/onstate-sip-serv
SIP/bcrann.moving.sp onstate              8717bcrann.movin    1 Up      AppDial      (Outgoing Line)           8717bcrann.movi 51:48:13                         SIP/onstate-sip-serv
SIP/hollye.moving.sp onstate              8717hollye.movin    1 Up      AppDial      (Outgoing Line)           8717hollye.movi 51:48:11                         SIP/onstate-sip-serv
SIP/onstate-sip-serv macro-anyoldcall     s                  10 Up      Dial         SIP/+16128861133@onstate- +17154565448    51:54:23                         SIP/onstate-sip-serv
SIP/onstate-sip-serv onstate              8717johns.moving   13 Up      Dial         SIP/johns.moving.speciali +14044318017    51:47:54                         SIP/johns.moving.spe
SIP/onstate-sip-serv onstate              8717hollye.movin   13 Up      Dial         SIP/hollye.moving.special +18054534687    51:48:11                         SIP/hollye.moving.sp

Here's one of the two day old calls:

[root@asterisk asterisk]# asterisk -rx 'core show channel SIP/onstate-sip-server-00017e79'
-- General --
          Name: SIP/onstate-sip-server-00017e79
          Type: SIP
      UniqueID: asterisk.onstate.net-1297707348.133129
      LinkedID: asterisk.onstate.net-1297707348.133129
     Caller ID: +12145457068
Caller ID Name: (N/A)
   DNID Digits: 8759klauss-ia
      Language: en
         State: Up (6)
         Rings: 0
 NativeFormats: 0x80004 (ulaw|h263)
   WriteFormat: 0x4 (ulaw)
    ReadFormat: 0x4 (ulaw)
WriteTranscode: No
 ReadTranscode: No
1st File Descriptor: 583
     Frames in: 6107
    Frames out: 4
Time to Hangup: 0
  Elapsed Time: 52h58m40s
 Direct Bridge: SIP/asterisk4-00017e7a
Indirect Bridge: SIP/asterisk4-00017e7a
--   PBX   --
       Context: callskype
     Extension: klauss-ia
      Priority: 31
    Call Group: 0
  Pickup Group: 0
   Application: Dial
          Data: SIP/8759klauss-ia@asterisk4,120,M(onanswer)
   Blocking in: ast_waitfor_nandfds
     Variables:
BRIDGEPVTCALLID=4cc2bf9b63318baf0f1ba4874ed94005@63.210.44.146:5060
BRIDGEPEER=SIP/asterisk4-00017e7a
DIALEDPEERNUMBER=8759klauss-ia@asterisk4
DIALEDPEERNAME=SIP/asterisk4-00017e7a
DIALSTATUS=ANSWER
~HASH~SIP_CAUSE~SIP/asterisk4-00017e7a~=SIP 200 OK
DIALEDTIME=
ANSWEREDTIME=
retry=0
MACRO_DEPTH=0
which=server2
server2now=44
server1now=66
server2=asterisk4
server2max=40
server1=asterisk2
server1max=65
DB_RESULT=241
SkypeCount=111
GOSUB_RETVAL=
numcalls=91
prefix=
db_tracker=recording/test/test
SIPCALLID=2691d25b@sipserver1.On-State.com
SIPDOMAIN=gw.onstate.net
SIPURI=sip:SipServerClient@63.210.44.142:5061

 CDR Variables:
level 1: dnid=8759klauss-ia
level 1: clid=+12145457068
level 1: src=+12145457068
level 1: dst=klauss-ia
level 1: dcontext=callskype
level 1: channel=SIP/onstate-sip-server-00017e79
level 1: dstchannel=SIP/asterisk4-00017e7a
level 1: lastapp=Dial
level 1: lastdata=SIP/8759klauss-ia@asterisk4,120,M(onanswer)
level 1: start=2011-02-14 13:15:48
level 1: answer=2011-02-14 13:15:54
level 1: duration=190720
level 1: billsec=190714
level 1: disposition=ANSWERED
level 1: amaflags=DOCUMENTATION
level 1: uniqueid=asterisk.onstate.net-1297707348.133129
level 1: linkedid=asterisk.onstate.net-1297707348
level 1: sequence=183520

[root@asterisk asterisk]# asterisk -rx 'core show channel SIP/asterisk4-00017e7a'
-- General --
          Name: SIP/asterisk4-00017e7a
          Type: SIP
      UniqueID: asterisk.onstate.net-1297707348.133130
      LinkedID: asterisk.onstate.net-1297707348.133129
     Caller ID: klauss-ia
Caller ID Name: (N/A)
   DNID Digits: (N/A)
      Language: en
         State: Up (6)
         Rings: 0
 NativeFormats: 0x80004 (ulaw|h263)
   WriteFormat: 0x4 (ulaw)
    ReadFormat: 0x4 (ulaw)
WriteTranscode: No
 ReadTranscode: No
1st File Descriptor: 598
     Frames in: 6170
    Frames out: 0
Time to Hangup: 0
  Elapsed Time: 53h0m6s
 Direct Bridge: SIP/onstate-sip-server-00017e79
Indirect Bridge: SIP/onstate-sip-server-00017e79
--   PBX   --
       Context: callskype
     Extension: klauss-ia
      Priority: 1
    Call Group: 0
  Pickup Group: 0
   Application: AppDial
          Data: (Outgoing Line)
   Blocking in: ast_waitfor_nandfds
     Variables:
BRIDGEPVTCALLID=2691d25b@sipserver1.On-State.com
BRIDGEPEER=SIP/onstate-sip-server-00017e79
MACRO_DEPTH=0
DIALEDPEERNUMBER=8759klauss-ia@asterisk4
SIPCALLID=4cc2bf9b63318baf0f1ba4874ed94005@63.210.44.146:5060

 CDR Variables:
level 1: dnid=
level 1: clid=klauss-ia
level 1: src=klauss-ia
level 1: dst=s
level 1: dcontext=onstate
level 1: channel=SIP/asterisk4-00017e7a
level 1: lastapp=MacroExit
level 1: start=2011-02-14 13:15:48
level 1: answer=2011-02-14 13:15:54
level 1: duration=190805
level 1: billsec=190799
level 1: disposition=ANSWERED
level 1: amaflags=DOCUMENTATION
level 1: uniqueid=asterisk.onstate.net-1297707348.133130
level 1: linkedid=asterisk.onstate.net-1297707348
level 1: sequence=183514

By: Rafael Prado Rocchi (prado) 2011-04-12 19:49:33

Alecdavis, thank you a lot for that dialplan! It was essencial for me to find the problem on the asterisk code.

Jcovert, I made a patch for your version (1.8.1.1).
This problem also affected 1.4 as well, patch solved too. (did not test on 1.6)

Now I'm gonna test it in trunk and upload it in a few minutes.

By: Rafael Prado Rocchi (prado) 2011-04-12 20:08:17

3 patches uploaded (Trunk, 1.8.1.1 and 1.4)

It solves the problems reported here, but I would like some feedback from other developers about it.

Prado

By: Rafael Prado Rocchi (prado) 2011-04-12 20:28:29

Some information:

Searching SVN code now, I found this problem was created by issue 14249, more than 2 years ago. It's a suspended issue for lack of feedback but the code was put in asterisk.

It first appeared in asterisk 1.4.24 and is present in all versions from that until 1.8 and trunk.

https://issues.asterisk.org/view.php?id=14249

I'll do some testing about that old issue 14249..... feedback later.


Prado

By: John Covert (jcovert) 2011-04-12 20:46:04

I'm on 1.8.3.2 now.  I tried your patch.  No joy.

See issue ASTERISK-16258  (note that the following happens if I go back to the default helperthreads)

asterisk*CLI> originate sip/x33 application dial IAX2/guest@iax.covert.org/ringnoanswer
   -- Launching dial(IAX2/guest@iax.covert.org/ringnoanswer) on SIP/x33-00000004
   -- Called guest@iax.covert.org/ringnoanswer
   -- Accepting UNAUTHENTICATED call from 192.168.0.100:
      > requested format = ulaw,
      > requested prefs = (ulaw|gsm),
      > actual format = ulaw,
      > host prefs = (ulaw|gsm),
      > priority = caller
   -- Call accepted by 192.168.0.100 (format ulaw)
   -- Format for call is ulaw
   -- Executing [ringnoanswer@sip:1] Progress("IAX2/192.168.0.100:4569-519", "") in new stack
   -- Executing [ringnoanswer@sip:2] NoOp("IAX2/192.168.0.100:4569-519", ""Cisco 33" <33>") in new stack
   -- Executing [ringnoanswer@sip:3] PlayTones("IAX2/192.168.0.100:4569-519", "440+495/2000,0/4000") in new stack
   -- Executing [ringnoanswer@sip:4] Wait("IAX2/192.168.0.100:4569-519", "120") in new stack
   -- IAX2/192.168.0.100:4569-2748 is making progress passing it to SIP/x33-00000004
[Apr 12 21:43:02] WARNING[26350]: channel.c:1426 __ast_queue_frame: Exceptionally long voice queue length queuing to IAX2/192.168.0.100:4569-2748
[Apr 12 21:43:07] WARNING[26350]: channel.c:1426 __ast_queue_frame: Exceptionally long voice queue length queuing to IAX2/192.168.0.100:4569-2748
[Apr 12 21:43:12] WARNING[26350]: channel.c:1426 __ast_queue_frame: Exceptionally long voice queue length queuing to IAX2/192.168.0.100:4569-2748
   -- Hungup 'IAX2/192.168.0.100:4569-2748'
[Apr 12 21:43:29] WARNING[26350]: channel.c:1426 __ast_queue_frame: Exceptionally long voice queue length queuing to IAX2/192.168.0.100:4569-519
[Apr 12 21:43:29] WARNING[26350]: channel.c:1426 __ast_queue_frame: Exceptionally long voice queue length queuing to IAX2/192.168.0.100:4569-519
 == Spawn extension (sip, ringnoanswer, 4) exited non-zero on 'IAX2/192.168.0.100:4569-519'
   -- Hungup 'IAX2/192.168.0.100:4569-519'
asterisk*CLI>



By: Rafael Prado Rocchi (prado) 2011-04-13 11:32:27

All my testes were done using alecdavis dialplan. It worked.
I'll try to reproduce yours, using IAX.

By: John Covert (jcovert) 2011-04-19 12:20:39

So prado, what was the theory behind the patch when it was applied, and what is  your theory about why it causes this problem, and finally, what about the original problem this was supposed to fix?

It seems that if simply removing "ast_indicate(chan, -1);" from the routine really does fix the problem, the right fix should be to determine what ast_indicate is doing that's so bad and fix it there.

/john

By: Leif Madsen (lmadsen) 2011-05-04 14:32:12

Have a client also experiencing this. I have implemented the patch onto a 1.6.2.14 system and am continuing to monitor it. We'll see what happens.

It does feel like this isn't the solution, but rather a work around right now. It'd be nice to know why this happens.

By: Leif Madsen (lmadsen) 2011-05-09 13:53:43

I'm dropping this status to Confirmed as there is a patch, but I don't think this is going to be the right solution.

By: John michelle (jhmichelle) 2011-05-22 23:46:58

I can confirm that my asterisk box have the same problem, i have tested the above patch still no joy.one more note this warning seems to appear with peers that are
Transatlantic i.e longer delay. so maybe we need more tolerance for longer delays.

By: David Vossel (dvossel) 2011-05-25 15:14:33

The exceptional queue length is probably due to having debug threads enabled.  Debug threads is crazy expensive, especially when dealing with local_channels.

By: Alec Davis (alecdavis) 2011-05-26 06:11:57

Turning off 'DONT OPTIMIZE' and 'DEBUG THREADS', and logger.conf disable full, and core set verbose = 0 and core set debug = 0.
Still gives, "Exceptionally long queue length queuing to Local/10036@phones-f6c1;2"

But applying reviewboard https://reviewboard.asterisk.org/r/1231/ atleast no more deadlocks.

By: John michelle (jhmichelle) 2011-05-26 12:39:08

This error happens when i am using IAX2 , so i hope there would be a similar patch for the deadlocks for CHAN_IAX like the one alecdavis referred to.

By: David Vossel (dvossel) 2011-05-26 16:11:00

I thought this issue was isolated to the use of local channels. What iax2 deadlock are you referring to?

By: John michelle (jhmichelle) 2011-05-26 16:51:30

when this problem occurs in iax it happens when calls are transatlantic so
network delay is what causes the frames to be queued for more than 96 which
causes the frame head to be removed which leads to voice degradation. when that happen while channel.c is still printing the error. the callee tries to hangup
And acquire a lock. this is where the deadlock happens between ast_queue_frame
and ast_write. network delay is a major factor and i don't know how to reproduce that.



By: John michelle (jhmichelle) 2011-05-26 16:56:29

dvossel : please check jcovert post he also mentions iax2 problem

By: Digium Subversion (svnbot) 2011-05-31 13:52:58

Repository: asterisk
Revision: 321515

U   branches/1.8/channels/chan_local.c

------------------------------------------------------------------------
r321515 | dvossel | 2011-05-31 13:52:57 -0500 (Tue, 31 May 2011) | 12 lines

Chan_local locking cleanup.

This patch removes all of the unnecessary deadlock
avoidance loops that occur in chan_local.  It also
resolves an issue with a deadlock triggered by
local channel optimizations.

(issue ASTERISK-16711)

Review: https://reviewboard.asterisk.org/r/1231/


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=321515

By: Digium Subversion (svnbot) 2011-05-31 14:01:46

Repository: asterisk
Revision: 321516

_U  trunk/
U   trunk/channels/chan_local.c

------------------------------------------------------------------------
r321516 | dvossel | 2011-05-31 14:01:45 -0500 (Tue, 31 May 2011) | 18 lines

Merged revisions 321515 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

........
 r321515 | dvossel | 2011-05-31 13:52:54 -0500 (Tue, 31 May 2011) | 12 lines
 
 Chan_local locking cleanup.
 
 This patch removes all of the unnecessary deadlock
 avoidance loops that occur in chan_local.  It also
 resolves an issue with a deadlock triggered by
 local channel optimizations.
 
 (issue ASTERISK-16711)
 
 Review: https://reviewboard.asterisk.org/r/1231/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=321516

By: John michelle (jhmichelle) 2011-06-03 09:18:50

one more note , i am on asterisk 1.6.2.18 now, when asterisk starts spewing
Exceptionally long voice queue length queuing to IAX2/XXX , it also show the warning MAX retries exceeded to host IAX/XXX and it doesn't stop there , the channel Doesn't get hangup and 'MAX retries...' warning keeps showing up every
30 to 40 secs for every channel that didn't hangup.

By: Vladimir Mikhelson (vmikhelson) 2011-06-05 01:25:17

Tested the patch as a possible remedy for issue 18800. Asterisk seg faulted 7 times upon restart.

After the patch reversal everything is back to normal.

Core dump analysis is attached to issue 18800.  I can re-attach here if needed.

-Vladimir

By: ppower (ppower) 2011-06-24 09:31:17.075-0500

This issue caught my eye because the 'Exceptionally long voice queue length' message was plaguing me with regards to MOH and parking (mantis 0018262). My understanding of this problem is that the timer events  needed to process the queued frames are not getting noticed. In my experience, with IAX if you get 'Exceptionally long voice queue length' you will eventually get 'MAX retries exceeded to host' messages.


By: Leif Madsen (lmadsen) 2011-09-22 07:39:53.884-0500

Are people still running into this on 1.8.7.0-rc1 or later?

By: JoshE (n8ideas) 2011-12-27 16:49:02.772-0600

I seem to have just run into this issue on 1.8.7.0.  Not sure of exact circumstances, but this was a Local/ issue not an IAX channel problem, so far as I can tell.  Don't have better debugging on this now, but will keep an eye on this.

By: Alex (alexrixhardson) 2012-01-25 07:58:14.287-0600

Yes, I have also run into this issue on 1.8.7.0. The warnings in the messages log file were "Exceptionally long queue length queueing to Local/XXXX". There was no IAX2 active channel. Only SIP, DAHDI and Local channels were active at the time. There was about 10-20 messages per seconds and it was impossible to make any additional call which would include Local channel at that point. Restart of Asterisk was required. No other warnings  were logged prior this incident.

By: Alex (alexrixhardson) 2012-01-30 11:08:10.737-0600

I managed to capture two "core show locks" outputs from Asterisk once it started spitting the "Exceptionally long queue length queueing to Local" messages. Please, see the putty1.log and putty2.log files.
*Edit:* This was captured on 1.8.9.0 (I did an upgrade on Friday, but unfortunately it didn't solve this problem)

By: Alex (alexrixhardson) 2012-01-31 05:49:35.356-0600

I have a few production servers using 1.8.X Asterisk with several hundred concurrent calls going on almost 24/7. This error crashes at least one of the servers per week.

Is it possible to raise the severity of this problem from "Minor" to "Major"?

By: Ole Kaas (ole.kaas) 2012-06-06 08:17:20.087-0500

FWIW I've just seen several of these on 1.8.13.0. I was inspecting log after restarting asterisk because it didn't repond to SIP requests. When I get an alert, I log onto asterisk CLI and issus an "core show channels" to verify the alert. When I get an output like the one below I kill asterisk with a -9 and start it again with the init script. It could be unrelated, but the below sample log entry appeared just before asterisk was killed. It could also be that because SIP was dead, the below log entry appeared because the SIP phone with that extension didn't respond.


{noformat}
Connected to Asterisk 1.8.13.0 currently running on server (pid = 25677)
asterisk*CLI> core show channels
Channel              Location             State   Application(Data)            
asterisk*CLI> exit
{noformat}

{noformat}
asterisk[25677]: WARNING[8788]: channel.c:1474 in __ast_queue_frame: Exceptionally long voice queue length queuing to Local/ext@some-context
{noformat}

By: Richard Mudgett (rmudgett) 2012-06-06 09:58:32.034-0500

The "Exceptionally long queue length queuing to XXXXX" message can be caused by many things.  This issue is specific to local channels optimizing themselves out of the call.  If you are seeing this message, you should create a new issue and supply the necessary debug information to determine the cause of the problem.