[Home]

Summary:ASTERISK-20486: MeetMe Unable to write frame to channel after SIP channel hangs up.
Reporter:Michael Cargile (mcargile)Labels:
Date Opened:2012-09-28 14:45:38Date Closed:2012-11-15 16:43:45.000-0600
Priority:MajorRegression?
Status:Closed/CompleteComponents:Applications/app_meetme Channels/chan_sip/General
Versions:1.8.16.0 Frequency of
Occurrence
Occasional
Related
Issues:
is caused byASTERISK-16783 Playback stalls when playing demo-congrats to an IAX2 channel
is caused byASTERISK-17211 [patch] AMI redirect from meetme - calls fail
duplicatesASTERISK-19594 app_meetme unable to write frame (stuck channel)
duplicatesASTERISK-19949 app_meetme unable to write frame to Local/XXX channel (stuck channel)
duplicatesASTERISK-19491 MeetMe fails to remove participant from conference when user leaves during playback of 'conf-onlyperson.ulaw'
Environment:I have two asterisk boxes. One is running asterisk 1.4.43. It is acting like a test carrier. The other box is running 1.8.16.0 and is placing calls to the test carrier and routing them to MeetMe conferences.Attachments:( 0) asterisk-20486-ref-leak.diff
( 1) meetme_hangup_patch_ASTERISK-20486_v3.diff
( 2) meetme_hangup_patch_ASTERISK-20486.diff
( 3) refs.gz
( 4) refs-with-patch
( 5) refs-with-patch.gz
( 6) timer_patch_ASTERISK-20486.diff
Description:Basically it looks like in some instances MeetMe is still trying to pass frames to a channel that should have hung up.  If you look the caller id name which we use as a Unique ID matches the caller id name in the BYE event before this error.  It looks like it is generating about 50 errors per second for one of these calls.  I am not sure if this is a SIP issue, or a MeetMe issue.  I think this might be related to this issue:

ASTERISK-19594

but that one has already been closed.


Output from AGI script called before getting into the MeetMe conference to show the channel as well as the calleridname (which we use to keep track of the call):

{noformat}
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi|AGI Environment Dump:
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- accountcode =
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- arg_1 = NORMAL-----LB
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- callerid = 8633939330
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- calleridname = V9281514180000238212
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- callingani2 = 0
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- callingpres = 0
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- callingtns = 0
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- callington = 0
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- channel = SIP/testcarrier-00017f4b
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- context = default
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- dnid = unknown
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- enhanced = 0.0
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- extension = 8368
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- language = en
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- priority = 3
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- rdnis = unknown
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- request = agi-VDAD_ALL_outbound.agi
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- threadid = -1268466832
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- type = SIP
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- uniqueid = 1348859661.314636
2012-09-28 15:14:33|agi-VDAD_ALL_outbound.agi| -- version = 1.8.16.0
{noformat}

Asterisk CLI output with SIP debugging enabled:

{noformat}
[Sep 28 15:14:35]     -- Executing [8600058@default:1] MeetMe("SIP/testcarrier-00017f4b", "8600058,F") in new stack
[Sep 28 15:14:35]   == Parsing '/etc/asterisk/meetme.conf': [Sep 28 15:14:35]   == Found
[Sep 28 15:14:35]   == Parsing '/etc/asterisk/meetme-vicidial.conf': [Sep 28 15:14:35]   == Found
[Sep 28 15:14:35]     -- Created MeetMe conference 1022 for conference '8600058'
[Sep 28 15:14:35]     -- <SIP/testcarrier-00017f4b> Playing 'conf-onlyperson.gsm' (language 'en')
...
[Sep 28 15:14:38]
<--- Transmitting (NAT) to 192.168.198.14:5060 --->
SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.168.198.14:5060;branch=z9hG4bK099f4e9c;received=192.168.198.14;rport=5060
From: <sip:9999024387@192.168.198.14>;tag=as657a2bf3
To: "V9281514180000238212" <sip:8633939330@192.168.198.15>;tag=as4bc64123
Call-ID: 05a3c6f264fb72905243ca216bb0eba2@192.168.198.15:5060
CSeq: 102 BYE
Server: Asterisk PBX 1.8.16.0
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO, PUBLISH
Supported: replaces, timer
Content-Length: 0
...
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:39] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
...
[Sep 28 15:14:45] WARNING[3688]: app_meetme.c:3678 conf_run: Unable to write frame to channel SIP/testcarrier-00017f4b
[Sep 28 15:14:45] WARNING[8004]: chan_sip.c:3918 __sip_autodestruct: Autodestruct on dialog '05a3c6f264fb72905243ca216bb0eba2@192.168.198.15:5060' with owner in place (Method: BYE). Rescheduling destruc
tion for 10000 ms
[Sep 28 15:14:45]     -- Hungup 'DAHDI/pseudo-753634465'
[Sep 28 15:14:45]   == Spawn extension (default, 8600058, 1) exited non-zero on 'SIP/testcarrier-00017f4b'
[Sep 28 15:14:45]     -- Executing [h@default:1] AGI("SIP/testcarrier-00017f4b", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----111---------------") in new stack
{noformat}

1.8 extension.conf

{noformat}
; global varibale for carrier
TESTSIPTRUNK = SIP/testcarrier

; Outbound dial to test carrier
exten => _91999NXXXXXX,1,AGI(agi://127.0.0.1:4577/call_log)
exten => _91999NXXXXXX,2,Dial(${TESTSIPTRUNK}/${EXTEN:2},,To)
exten => _91999NXXXXXX,3,Hangup

; agent connection
exten => _999999999999,1,AGI(agi://127.0.0.1:4577/call_log)
exten => _999999999999,2,Dial(${TESTSIPTRUNK}/${EXTEN},,To)
exten => _999999999999,3,Hangup

; routing extension
exten => 8368,1,Playback(sip-silence)
exten => 8368,n,AGI(agi://127.0.0.1:4577/call_log)
exten => 8368,n,AGI(agi-VDAD_ALL_outbound.agi,NORMAL-----LB)
exten => 8368,n,AGI(agi-VDAD_ALL_outbound.agi,NORMAL-----LB)
exten => 8368,n,Hangup()

; meetme conference extensions
exten => _86000[5-9]X,1,Meetme(${EXTEN},F)
exten => _86000[5-9]X,n,Hangup()
exten => _8600[1-2]XX,1,Meetme(${EXTEN},F)
exten => _8600[1-2]XX,n,Hangup()
{noformat}

1.4 extension.conf

{noformat}
; PERFORMANCE TESTING
exten => _999XXXXXX1,1,Answer
exten => _999XXXXXX1,2,Wait(2)
exten => _999XXXXXX1,3,Playback(vicidial-welcome)
exten => _999XXXXXX1,4,Hangup

exten => _999XX11112,1,Wait(2)
exten => _999XX11112,2,Answer
exten => _999XX11112,3,Playback(ss-noservice)
exten => _999XX11112,4,Playback(vm-goodbye)
exten => _999XX11112,5,Hangup

exten => _999XX18112,1,Wait(2)
exten => _999XX18112,2,Answer
exten => _999XX18112,3,Playback(vtiger-fax)
exten => _999XX18112,4,Playback(vtiger-fax)
exten => _999XX18112,5,Hangup

exten => _999XX19112,1,Wait(2)
exten => _999XX19112,2,Answer
exten => _999XX19112,3,Playback(vtiger-email)
exten => _999XX19112,4,Playback(vtiger-email)
exten => _999XX19112,5,Hangup

exten => _999XXXX112,1,Wait(5)
exten => _999XXXX112,2,Answer
exten => _999XXXX112,3,Playback(demo-instruct)
exten => _999XXXX112,4,Playback(demo-instruct)
exten => _999XXXX112,5,Hangup

exten => _999XXXXXX2,1,Wait(8)
exten => _999XXXXXX2,2,Answer
exten => _999XXXXXX2,3,Playback(demo-instruct)
exten => _999XXXXXX2,4,Hangup

exten => _999XXXXXX3,1,Set(PRI_CAUSE=1)
exten => _999XXXXXX3,2,Hangup

exten => _999XXXXXX4,1,Set(PRI_CAUSE=27)
exten => _999XXXXXX4,2,Hangup

exten => _999XXXXXX5,1,Wait(60)
exten => _999XXXXXX5,2,Hangup

exten => _999XXXXXX6,1,Wait(10)
exten => _999XXXXXX6,2,Answer
exten => _999XXXXXX6,3,Playback(demo-instruct)
exten => _999XXXXXX6,4,Hangup

exten => _999XXXXXX7,1,Wait(12)
exten => _999XXXXXX7,2,Answer
exten => _999XXXXXX7,3,Playback(demo-enterkeywords)
exten => _999XXXXXX7,4,Hangup

exten => _999XXXXXX8,1,Set(PRI_CAUSE=17)
exten => _999XXXXXX8,2,Hangup

exten => _999XXXXXX9,1,Wait(6)
exten => _999XXXXXX9,2,Answer
exten => _999XXXXXX9,3,Playback(demo-abouttotry)
exten => _999XXXXXX9,4,Hangup

exten => _999XXXXXX0,1,Wait(5)
exten => _999XXXXXX0,2,Answer
exten => _999XXXXXX0,3,Playback(vm-goodbye)
exten => _999XXXXXX0,4,Hangup

; agent playback
exten => 99999999999,1,Answer
exten => 99999999999,2,Playback(conf)
exten => 99999999999,3,Playback(conf)
exten => 99999999999,4,Playback(conf)
exten => 99999999999,5,Playback(conf)
exten => 99999999999,6,Playback(conf)
exten => 99999999999,7,Playback(conf)
exten => 99999999999,8,Playback(conf)
exten => 99999999999,9,Playback(conf)
exten => 99999999999,10,Playback(conf)
exten => 99999999999,11,Playback(conf)
exten => 99999999999,12,Playback(conf)
exten => 99999999999,13,Playback(conf)
exten => 99999999999,14,Hangup
{noformat}

1.8 sip.conf

{noformat}
[testcarrier]
type=peer
context=trunkinbound
usecallerid=yes
trustrpid=yes
sendrpid=yes
host=192.168.198.14
qualify=yes
insecure=port,invite
disallow=all
allow=ulaw
dtmfmode=RFC2833
{noformat}

1.4 sip.conf

{noformat}
[testcarrier]
type=peer
context=trunkinbound
usecallerid=yes
trustrpid=yes
sendrpid=yes
host=192.168.198.15
qualify=yes
insecure=port,invite
disallow=all
allow=ulaw
dtmfmode=RFC2833
{noformat}

System in question is being used to port the Vicidial Call Center Suite over to Asterisk 1.8.  It is currently operating in test mode.  

In test mode agents are simulated by originating a call with a Local channel.  The local channel is pointed at the 999999999999 extension which routes through to the 1.4 test carrier.  On the test carrier the call ends up in an extension that plays classical music for 12 hours.  The extension for the originated call is one of the meetme conferences.  This calls stays there throughout the test.

Outbound calls are originated as a Local channel as well.  The Local channel is pointed at the 91999NXXXXXX extension depending on the "phone number" being dialed.  This routed through to the 1.4 test carrier where depending on what "phone number" was dialed different things happen.  The extension for the originated call is the 8368.  This extension first plays sip-silence which is a 40ms sound file of nothing.  This gives the Local channel a chance to optimize out as this is only done during processing of audio frames and Vicidial was written at a time when Asterisk had issues routing Local channels so refuses to do so.  Next the call is logged in Vicidial with the call_log AGI and then goes into the outbound routing AGI.  There it finds an agent that is available to take the call and then transfers the call to the MeetMe conference that is associated with that agent.

As stated this error pops up occasionally.  With five 'agents' logged in and about 15 'calls' being attempted at any given time it tends to take about 15 to 30 minutes before this pops up.  If I up the number of 'agent' and 'calls' it can take as little as five minutes.




CLI output after applying the timer and meetme patch passed through a grep filter to get just the warnings:
{noformat}
dev-dial2:~ # asterisk -r | grep WARN
[Nov  7 12:30:17] WARNING[7727]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-000000df
[Nov  7 12:34:32] WARNING[11859]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000251
[Nov  7 12:37:41] WARNING[14182]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000312
[Nov  7 12:46:45] WARNING[23962]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000631
[Nov  7 12:48:51] WARNING[24887]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000681
{noformat}

Full CLI output after timer and meetme patches applied:
{noformat}
[Nov  7 12:30:15]     -- <SIP/testcarrier-000000ee>AGI Script agi-VDAD_ALL_outbound.agi completed, returning 0
[Nov  7 12:30:15]     -- Executing [192*168*198*015*8600060@default:1] Goto("SIP/testcarrier-000000ee", "default,8600060,1") in new stack
[Nov  7 12:30:15]     -- Goto (default,8600060,1)
[Nov  7 12:30:15]     -- Executing [8600060@default:1] MeetMe("SIP/testcarrier-000000ee", "8600060,F") in new stack
[Nov  7 12:30:15]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:30:15]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:30:15]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:30:16]     -- <Local/919999003312@default-000000ee;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----8-----0 completed, returning 0
[Nov  7 12:30:16]   == Spawn extension (default, 919999003312, 2) exited non-zero on 'Local/919999003312@default-000000ee;2'
[Nov  7 12:30:17]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:30:17] WARNING[7727]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-000000df
[Nov  7 12:30:17]   == Spawn extension (default, 8600059, 1) exited non-zero on 'SIP/testcarrier-000000df'
[Nov  7 12:30:17]     -- Executing [h@default:1] AGI("SIP/testcarrier-000000df", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:30:18]     -- <SIP/testcarrier-000000df>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
[Nov  7 12:30:19]     -- SIP/testcarrier-000000fc answered Local/919999000612@default-000000fc;2
[Nov  7 12:30:19]        > Channel Local/919999000612@default-000000fc;1 was answered.
[Nov  7 12:30:19]     -- Executing [8368@default:1] Playback("Local/919999000612@default-000000fc;1", "sip-silence") in new stack
[Nov  7 12:30:19]     -- <Local/919999000612@default-000000fc;1> Playing 'sip-silence.gsm' (language 'en')
[Nov  7 12:30:19]     -- Executing [8368@default:2] AGI("Local/919999000612@default-000000fc;1", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:30:19]     -- Executing [h@default:1] AGI("Local/919999000612@default-000000fc;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----8-----0") in new stack
[Nov  7 12:30:19]     -- <SIP/testcarrier-000000fc>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:30:19]     -- Executing [8368@default:3] AGI("SIP/testcarrier-000000fc", "agi-VDAD_ALL_outbound.agi,NORMAL-----LB") in new stack
[Nov  7 12:30:19]     -- Launched AGI Script /var/lib/asterisk/agi-bin/agi-VDAD_ALL_outbound.agi
[Nov  7 12:30:19]     -- SIP/testcarrier-000000ef answered Local/919999054327@default-000000ef;2
[Nov  7 12:30:19]        > Channel Local/919999054327@default-000000ef;1 was answered.
[Nov  7 12:30:19]     -- Executing [8368@default:1] Playback("Local/919999054327@default-000000ef;1", "sip-silence") in new stack
[Nov  7 12:30:19]     -- <Local/919999054327@default-000000ef;1> Playing 'sip-silence.gsm' (language 'en')
[Nov  7 12:30:19]     -- Executing [8368@default:2] AGI("Local/919999054327@default-000000ef;1", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:30:19]     -- Executing [h@default:1] AGI("Local/919999054327@default-000000ef;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----12-----0") in new stack
[Nov  7 12:30:19]     -- <SIP/testcarrier-000000ef>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:30:19]     -- Executing [8368@default:3] AGI("SIP/testcarrier-000000ef", "agi-VDAD_ALL_outbound.agi,NORMAL-----LB") in new stack
...
[Nov  7 12:34:32]     -- Executing [919999056546@default:1] AGI("Local/919999056546@default-00000261;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:34:32]     -- <Local/919999056546@default-00000261;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:34:32]     -- Executing [919999056546@default:2] Dial("Local/919999056546@default-00000261;2", "SIP/testcarrier/9999056546,,To") in new stack
[Nov  7 12:34:32]   == Using SIP RTP CoS mark 5
[Nov  7 12:34:32]     -- Called SIP/testcarrier/9999056546
[Nov  7 12:34:32]   == Manager 'sendcron' logged on from 127.0.0.1
[Nov  7 12:34:32]     -- Executing [919999025898@default:1] AGI("Local/919999025898@default-00000262;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:34:32]     -- <Local/919999025898@default-00000262;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:34:32]     -- Executing [919999025898@default:2] Dial("Local/919999025898@default-00000262;2", "SIP/testcarrier/9999025898,,To") in new stack
[Nov  7 12:34:32]   == Using SIP RTP CoS mark 5
[Nov  7 12:34:32]     -- Called SIP/testcarrier/9999025898
[Nov  7 12:34:32]     -- Got SIP response 603 "Declined" back from 192.168.198.14:5060
[Nov  7 12:34:32]     -- SIP/testcarrier-00000262 is busy
[Nov  7 12:34:32]   == Everyone is busy/congested at this time (1:1/0/0)
[Nov  7 12:34:32]     -- Executing [919999025898@default:3] Hangup("Local/919999025898@default-00000262;2", "") in new stack
[Nov  7 12:34:32]   == Spawn extension (default, 919999025898, 3) exited non-zero on 'Local/919999025898@default-00000262;2'
[Nov  7 12:34:32]     -- Executing [h@default:1] AGI("Local/919999025898@default-00000262;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY----------") in new stack
[Nov  7 12:34:32]     -- <Local/919999058508@default-00000257;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:34:32]     -- <Local/919999036624@default-00000258;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:34:32]   == Spawn extension (default, 8600069, 1) exited non-zero on 'SIP/testcarrier-0000024c'
[Nov  7 12:34:32]     -- Executing [h@default:1] AGI("SIP/testcarrier-0000024c", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:34:32] WARNING[11859]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000251
[Nov  7 12:34:32]   == Spawn extension (default, 8600070, 1) exited non-zero on 'SIP/testcarrier-00000251'
[Nov  7 12:34:32]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000251", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:34:33]     -- <Local/919999049964@default-0000025c;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:34:33]   == Spawn extension (default, 8600054, 1) exited non-zero on 'SIP/testcarrier-00000202'
[Nov  7 12:34:33]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000202", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:34:33]     -- <Local/919999025898@default-00000262;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:34:33]     -- <SIP/testcarrier-0000024c>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
[Nov  7 12:34:34]     -- <SIP/testcarrier-00000251>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
[Nov  7 12:34:34]     -- <SIP/testcarrier-00000202>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
[Nov  7 12:34:34]   == Spawn extension (default, 919999010855, 2) exited non-zero on 'Local/919999010855@default-0000022e;2'
[Nov  7 12:34:34]     -- Executing [h@default:1] AGI("Local/919999010855@default-0000022e;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0-----CANCEL----------") in new stack
[Nov  7 12:34:34]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:34:34]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:34:34]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:34:35]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:34:35]   == Manager 'sendcron' logged on from 127.0.0.1
[Nov  7 12:34:35]     -- Executing [919999039607@default:1] AGI("Local/919999039607@default-00000263;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:34:35]     -- <Local/919999039607@default-00000263;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:34:35]     -- Executing [919999039607@default:2] Dial("Local/919999039607@default-00000263;2", "SIP/testcarrier/9999039607,,To") in new stack
...
[Nov  7 12:37:38]     -- Got SIP response 603 "Declined" back from 192.168.198.14:5060
[Nov  7 12:37:38]     -- SIP/testcarrier-00000360 is busy
[Nov  7 12:37:38]   == Everyone is busy/congested at this time (1:1/0/0)
[Nov  7 12:37:38]     -- Executing [919999036794@default:3] Hangup("Local/919999036794@default-00000360;2", "") in new stack
[Nov  7 12:37:38]   == Spawn extension (default, 919999036794, 3) exited non-zero on 'Local/919999036794@default-00000360;2'
[Nov  7 12:37:38]     -- Executing [h@default:1] AGI("Local/919999036794@default-00000360;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY----------") in new stack
[Nov  7 12:37:38]   == Spawn extension (default, 8600064, 1) exited non-zero on 'SIP/testcarrier-0000030e'
[Nov  7 12:37:38]     -- Executing [h@default:1] AGI("SIP/testcarrier-0000030e", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:37:39]     -- <Local/919999041574@default-0000035a;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:37:39]     -- <SIP/testcarrier-0000030e>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
[Nov  7 12:37:39]     -- <Local/919999019583@default-0000035e;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:37:39]     -- <Local/919999038488@default-0000035f;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:37:39]     -- <Local/919999036794@default-00000360;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:37:41] WARNING[14182]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000312
[Nov  7 12:37:41]   == Spawn extension (default, 8600055, 1) exited non-zero on 'SIP/testcarrier-00000312'
[Nov  7 12:37:41]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000312", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:37:41]   == Spawn extension (default, 8600068, 1) exited non-zero on 'SIP/testcarrier-00000315'
[Nov  7 12:37:41]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000315", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:37:41]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:37:41]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:37:41]   == Manager 'sendcron' logged on from 127.0.0.1
[Nov  7 12:37:41]     -- Executing [919999041872@default:1] AGI("Local/919999041872@default-00000361;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:37:41]     -- <Local/919999041872@default-00000361;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:37:41]     -- Executing [919999041872@default:2] Dial("Local/919999041872@default-00000361;2", "SIP/testcarrier/9999041872,,To") in new stack
[Nov  7 12:37:41]   == Using SIP RTP CoS mark 5
[Nov  7 12:37:41]     -- Called SIP/testcarrier/9999041872
[Nov  7 12:37:41]   == Manager 'sendcron' logged on from 127.0.0.1
[Nov  7 12:37:41]     -- Executing [919999024382@default:1] AGI("Local/919999024382@default-00000362;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:37:41]     -- <Local/919999024382@default-00000362;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:37:41]     -- Executing [919999024382@default:2] Dial("Local/919999024382@default-00000362;2", "SIP/testcarrier/9999024382,,To") in new stack
...
[Nov  7 12:46:45]   == Spawn extension (default, 8600066, 1) exited non-zero on 'SIP/testcarrier-0000062d'
[Nov  7 12:46:45]     -- Executing [h@default:1] AGI("SIP/testcarrier-0000062d", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:46:45]     -- <Local/919999023730@default-0000062d;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----5-----0 completed, returning 0
[Nov  7 12:46:45]   == Spawn extension (default, 919999023730, 2) exited non-zero on 'Local/919999023730@default-0000062d;2'
[Nov  7 12:46:45] WARNING[23962]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000631
[Nov  7 12:46:45]   == Spawn extension (default, 8600075, 1) exited non-zero on 'SIP/testcarrier-00000631'
[Nov  7 12:46:45]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000631", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:46:45]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:46:45]   == Spawn extension (default, 919999022935, 2) exited non-zero on 'Local/919999022935@default-0000060b;2'
[Nov  7 12:46:45]     -- Executing [h@default:1] AGI("Local/919999022935@default-0000060b;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0-----CANCEL----------") in new stack
[Nov  7 12:46:45]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:46:45]     -- Executing [h@default:1] AGI("Local/919999021501@default-00000642;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----2-----2") in new stack
[Nov  7 12:46:45]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:46:45]   == Spawn extension (default, 919999031375, 2) exited non-zero on 'Local/919999031375@default-0000060d;2'
[Nov  7 12:46:45]     -- Executing [h@default:1] AGI("Local/919999031375@default-0000060d;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0-----CANCEL----------") in new stack
[Nov  7 12:46:45]     -- <Local/919999040000@default-00000631;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----5-----0 completed, returning 0
[Nov  7 12:46:45]   == Spawn extension (default, 919999040000, 2) exited non-zero on 'Local/919999040000@default-00000631;2'
[Nov  7 12:46:45]     -- Executing [h@default:1] AGI("Local/919999057221@default-00000644;2", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----16-----ANSWER-----2-----2") in new stack
[Nov  7 12:46:45]   == Manager 'sendcron' logged off from 127.0.0.1
[Nov  7 12:46:45]     -- SIP/testcarrier-00000633 answered Local/919999025549@default-00000633;2
[Nov  7 12:46:45]        > Channel Local/919999025549@default-00000633;1 was answered.
[Nov  7 12:46:45]     -- Executing [8368@default:1] Playback("Local/919999025549@default-00000633;1", "sip-silence") in new stack
...
[Nov  7 12:48:50]   == Manager 'sendcron' logged on from 127.0.0.1
[Nov  7 12:48:50]     -- Executing [919999006037@default:1] AGI("Local/919999006037@default-000006f6;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:48:50]     -- <Local/919999006037@default-000006f6;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:48:50]     -- Executing [919999006037@default:2] Dial("Local/919999006037@default-000006f6;2", "SIP/testcarrier/9999006037,,To") in new stack
[Nov  7 12:48:50]   == Using SIP RTP CoS mark 5
[Nov  7 12:48:50]     -- Called SIP/testcarrier/9999006037
[Nov  7 12:48:50]   == Manager 'sendcron' logged on from 127.0.0.1
[Nov  7 12:48:50]     -- Executing [919999037520@default:1] AGI("Local/919999037520@default-000006f7;2", "agi://127.0.0.1:4577/call_log") in new stack
[Nov  7 12:48:50]     -- <Local/919999037520@default-000006f7;2>AGI Script agi://127.0.0.1:4577/call_log completed, returning 0
[Nov  7 12:48:50]     -- Executing [919999037520@default:2] Dial("Local/919999037520@default-000006f7;2", "SIP/testcarrier/9999037520,,To") in new stack
[Nov  7 12:48:50]   == Using SIP RTP CoS mark 5
[Nov  7 12:48:50]     -- Called SIP/testcarrier/9999037520
[Nov  7 12:48:50]   == Spawn extension (default, 8600064, 1) exited non-zero on 'SIP/testcarrier-0000067a'
[Nov  7 12:48:50]     -- Executing [h@default:1] AGI("SIP/testcarrier-0000067a", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:48:50]   == Spawn extension (default, 8600068, 1) exited non-zero on 'SIP/testcarrier-0000067c'
[Nov  7 12:48:50]     -- Executing [h@default:1] AGI("SIP/testcarrier-0000067c", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:48:50]     -- <Local/919999057448@default-000006ee;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:48:50]     -- <Local/919999010774@default-000006ef;2>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----21-----BUSY---------- completed, returning 0
[Nov  7 12:48:51] WARNING[24887]: app_meetme.c:3778 conf_run: Unable to write frame to channel SIP/testcarrier-00000681
[Nov  7 12:48:51]   == Spawn extension (default, 8600062, 1) exited non-zero on 'SIP/testcarrier-00000681'
[Nov  7 12:48:51]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000681", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:48:51]     -- <SIP/testcarrier-0000067a>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
[Nov  7 12:48:51]   == Spawn extension (default, 8600059, 1) exited non-zero on 'SIP/testcarrier-00000683'
[Nov  7 12:48:51]     -- Executing [h@default:1] AGI("SIP/testcarrier-00000683", "agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0---------------") in new stack
[Nov  7 12:48:51]     -- <Local/919999039611@default-000006f3;1>AGI Script agi-VDAD_ALL_outbound.agi completed, returning 0
[Nov  7 12:48:51]     -- Executing [8368@default:4] AGI("Local/919999039611@default-000006f3;1", "agi-VDAD_ALL_outbound.agi,NORMAL-----LB") in new stack
[Nov  7 12:48:51]     -- Launched AGI Script /var/lib/asterisk/agi-bin/agi-VDAD_ALL_outbound.agi
[Nov  7 12:48:51]     -- <SIP/testcarrier-0000067c>AGI Script agi://127.0.0.1:4577/call_log--HVcauses--PRI-----NODEBUG-----0--------------- completed, returning 0
{noformat}
Comments:By: Matt Jordan (mjordan) 2012-09-28 14:58:08.908-0500

Looks like something isn't releasing the SIP channel.  Do you think you could try getting a ref debug log from chan_sip, using the instructions on the wiki here:

https://wiki.asterisk.org/wiki/display/AST/Reference+Count+Debugging

By: Michael Cargile (mcargile) 2012-09-28 15:05:37.926-0500

Sure I can do that.

By: Michael Cargile (mcargile) 2012-09-28 15:44:04.139-0500

This is the refs log from Asterisk.  I stopped Asterisk shortly after this problem popped up.

By: Michael Cargile (mcargile) 2012-09-28 15:57:45.962-0500

Did you need the CLI output that matches the refs file?

By: Michael L. Young (elguero) 2012-09-30 22:35:07.189-0500

Give this patch a try and report back if it resolves this for you.

By: Michael Cargile (mcargile) 2012-10-01 08:15:46.955-0500

That did not seem to do it.  I tested it with 1.8.17.0-rc1 because it would not apply to 1.8.16.0 and I could not get svn to compile.  I am uploading a new refs file.  I stopped asterisk shortly after the error stopped.

By: David Brillert (aragon) 2012-10-04 09:31:37.498-0500

I couldn't read your refs-with-patch.gz file...
Not that I can help fix the problem if I could read it but I think I have a duplicate issue of yours and would like to help you get your patch so we can both test...


By: Michael Cargile (mcargile) 2012-10-04 10:01:21.297-0500

Interesting.  Must have gotten corrupted.  I just uploaded the uncompressed version and double checked it.  I can read it fine now.

By: David Brillert (aragon) 2012-10-04 10:12:56.007-0500

I can read it now

By: Johan Wilfer (johan) 2012-10-05 02:07:01.383-0500

You can try the patch in ASTERISK-19594. This only masks the problem, but I use it as a work-around myself.
Maybe this issue and ASTERISK-19949 is related - I have the same problem with Local-channels with a pure Dahdi-environment and no sip at all.

By: Michael L. Young (elguero) 2012-10-06 20:00:06.838-0500

I have been analyzing the code to see if I could help or not.  At first, I was focused on chan_sip thinking there was a ref leak there.  Then Johan mentioned in his comments other issues that seem very similar.  The debug log in ASTERISK-19594 shows chan_local and chan_sip having this issue.

The conclusion I have come to is this.  If I am wrong, perhaps somebody will chime in.  It would appear that app_meetme references the channel entering the conference by linking it into a table containing references to channels participating in the conference.  I do not see a check, or am not finding where it would be, to check that the channel has hungup or not.  Since the reference to the channel is not being released by app_meetme, the channel is not being destroyed even after the last bye is received.  Everything continues in app_meetme, announcements are trying to be written to the now "hungup" channel, until finally the auto_destroy runs in the case of chan_sip which finally kills the channel and therefore, the messages stop being displayed on the console and everything goes back to normal.

I have not been able to lab up this scenario yet to see if my analysis is correct or not.

By: Michael Cargile (mcargile) 2012-10-06 20:11:57.948-0500

I was actually just working on another section of our program that records the meetme conferences and got the same error.  We use chan_local routed to Monitor extensions.  This gives us more flexibility.  However as soon as the the local channel enters the meetme conference this error starts popping up accept for the local channel to the Monitor.  On top of that the recording generated by the Monitor is 0 bytes.  This worked perfectly in Asterisk 1.4.  I was about to open another ticket regarding this, but it sounds like it might be related to what you were saying.

By: Michael L. Young (elguero) 2012-10-06 21:15:11.594-0500

Michael, I am trying to recreate this scenario.  How are you initiating the call from 1.8 to your "testcarrier" and then routing the call to MeetMe?  Is there more than one channel connected to the MeetMe room when this message is displayed on the console?

If you could provide the dial plan for this scenario and any pertinent info to help try to setup a test environment, would be very helpful.

By: Johan Wilfer (johan) 2012-10-07 07:37:57.524-0500

Michael Cargile: I would open a separate bug report on the Monitor problem. I had a problem with MixMonitor in previous versions, but this is fixed as of 1.8.13 (Bug report: ASTERISK-19426). Maybe you can switch to MixMonitor?

By: Johan Wilfer (johan) 2012-10-07 07:38:29.969-0500

The problem with the Local-channel seem to happen on a frequent basis for me, but not frequent enough to be easy to debug. As opposed to the sip-channel there is not auto-destruct on the Local-channel so I found this out when our server had filled the hard-drive with millions of these messages.

Yet another issue that may be related is ASTERISK-19491, maybe this could be an easier way to troubleshoot this as this seems to occur constant according to the reporter.

By: Michael Cargile (mcargile) 2012-10-08 11:01:01.640-0500

Michael L. Young, as mentioned in the updated description the system is being used to port the Vicidial Call Center Suite over to Asterisk 1.8.  If you have any questions let me know and I can pop into IRC.

By: Johan Wilfer (johan) 2012-10-09 14:23:12.521-0500

I've updated the related issue ASTERISK-19949 with a way to recreate my scenario. Maybe it can help you to if these issues are related? In the issue summary there is a dialplan that creates this error every time.

By: Jonathan Rose (jrose) 2012-11-05 16:20:42.231-0600

I don't know about chan_sip itself having anything to do with this. I don't think that's the case. It seems like this hangup condition affects chan_local as well in the same way. I've got a patch that addresses the problem for both channel drivers from the angle of meetme itself, and as far as I can tell it doesn't look like the channels themselves are leaking afterwards.  Please give it a shot and report back.

meetme_hangup_patch_ASTERISK-20486.diff

By: Michael L. Young (elguero) 2012-11-05 16:29:44.178-0600

Jonathan, I just wanted to concur with you.  Once we had received more information on this report, I didn't think it had anything to do with chan_sip either.  I was unable to reproduce the issue on my end and that is as far as I was able to get with it.  That patch you posted is along the lines of what I was thinking was possibly needed.

For those who could reproduce this issue, please report back on if it fixes the issue or not.

By: Jonathan Rose (jrose) 2012-11-05 16:52:11.863-0600

Well, I'm tracking the changelog at the moment and it looks like one of Russell's later patches (while he was still leading the Asterisk project) here was something along the lines of taking the sort of behavior I would have been introducing here out.

Is a puzzlement!

By: Johan Wilfer (johan) 2012-11-06 02:42:50.634-0600

I've tried you patch with the method I documented in ASTERISK-19949, and the patch from Jonathan works for me. (https://issues.asterisk.org/jira/secure/attachment/45269/meetme_hangup_patch_ASTERISK-20486.diff). Before the patch I got these messages constant.

Just noted that your patch is quite similar to https://issues.asterisk.org/jira/secure/attachment/43067/meetme-hangup-trunk-360459.patch that also resolves my issue. Matt Jordan commented on ASTERISK-19949 about this patch:

"In general, 'stuck' channels are indicative of something more fundamentally 'wrong', and the patch on ASTERISK-19594 would only mask the problem (and would only resolve it in MeetMe - if the channel happened to get 'stuck' in another application, it would not fix it there)."

I have however used the above patch in production for five months without problems. And this is a system that only does conferencing with Meetme.

By: Michael Cargile (mcargile) 2012-11-06 13:23:46.719-0600

I have applied this patch and ran the system under heavy load for over two hours.  I am still getting the warning, but when I do it is not spamming to the screen.  I am getting one warning per instance it seems.  Before I would get 25+ warning per second till the channel completely cleared out.  This is a big improvement, but I am not sure the issue is completely fixed.

By: Jonathan Rose (jrose) 2012-11-07 10:47:58.930-0600

You would both be correct, and yes, I'm aware of the warning you still receive with the patch above in place. This appears to have been the result of a questionable patch a while back against channel.c involving timerfd that was obscured for a long while by a check similar to the one in this patch.  The patch I mentioned by Russell (described here: http://lists.digium.com/pipermail/asterisk-commits/2011-January/047556.html) removed this check exposing the problem we've noted in this bug with the nearly infinitely spammed log messages against our ast_write.

I've discussed this with another developer (Mark Michelson) and the timerfd check in channel.c is sketchy at best, so our current plan (pending review) is to remove this patch and if the issue it was intended to solve comes back up, to deal with the consequences of that in a more appropriate fashion. I'm posting a new patch to this issue (timer_patch_ASTERISK-20486.diff) which you all may confirm works.

By: Jonathan Rose (jrose) 2012-11-07 11:13:16.632-0600

Adding caused by links.

By: Michael Cargile (mcargile) 2012-11-07 11:35:18.283-0600

Still getting the WARNING occasionally.

I noticed something though.  When I first tested just now, I unapplied the meetme patch and applied the timer patch.  I ran like that for about five minutes.  I would occasionally get a single instance of that WARNING, then every so often the screen would flood with the same WARNING.  I then reapplied the meetme patch, and I cannot get the flood to come back, but I am still getting the occasional instance of that warning.  This makes me think that there are two different scenarios generating this.

By: Jonathan Rose (jrose) 2012-11-07 11:46:39.949-0600

If you could, would you please post a log showing an instance of these occasional reproductions?

EDIT:  I've confirmed that the timerfd problem isn't the only instance of these odd looping errors from conf_run. It seems that addresses hangups during the playback of conf-onlyperson, but not of certain others like hanging up from the menu provided by the 's' option.

By: Michael Cargile (mcargile) 2012-11-07 11:49:38.137-0600

Having run this for a while now I must say the instances of the single WARNING are also greatly reduced.

By: Michael L. Young (elguero) 2012-11-07 12:05:03.784-0600

Jonathan, I thought that app_meetme only worked with dahdi and depends on dahdi as a timer.  Am I mistaken?

By: Johan Wilfer (johan) 2012-11-07 12:08:04.394-0600

I'll run some tests with this in in the morning (it's evening here right now in Sweden).

Maybe we have to test to redirect participants out of Meetme if parts of the resolution in ASTERISK-17211 gets reverted?

I have seen these single warnings on asterisk 1.4 for a long time when SIP channels was leaving the conference. This occurred every now and then so the single warnings may have existed a long time. I never noticed any ill effects on the calls though.

By: Jonathan Rose (jrose) 2012-11-07 13:32:51.938-0600

No, we aren't reverting anything from ASTERISK-17211.  The reversion is against a part of ASTERISK-16783 and has nothing in particular to do with meetme.

@Michael Young: Well, yes, it depends on DAHDI, but I can definitely confirm that timer_fd has an effect as well. Disabling timerfd stops the repeating errors I described above with the most recent patch. It turns out there are others though (such as hanging up while the menu is playing) and I'm presently looking into them.

By: Johan Wilfer (johan) 2012-11-07 15:45:39.629-0600

I couldn't help myself spending the evening on this.. The new patch "timer_patch_ASTERISK-20486.diff" does not remedy the problem for me. I run the setup described in the description in ASTERISK-19949 and the CLI gets spammed with WARNING's just like before.

@Jonathan Rose: Can you reproduce this with you machine? If you can't you can have ssh access to my dev-server if you want to have a look at it yourself.

By: Jonathan Rose (jrose) 2012-11-07 16:12:16.207-0600

Yeah, it's looking like the new patch is actually the wrong approach after I've spent a little more time with this. The patch works for some of these errors, but not all of them. Yours in particular is obviously not going to be fixed by it since you aren't getting the failed to write frame error that I always saw from the timerfd problem.

I'm currently trying to reproduce your case to see if the current approach is going to fix it, which is similar to the old one, only taking a few extra cases into account.


EDIT:  I can go ahead and confirm that your method is reproducable and that the log message appears identical.  The new new patch that I'm working on solves it already though. I haven't tested it with the timerfd method, but I'm pretty sure it would have no effect.


@Michael:

So I've finally figured out what you meant by single warnings...  They are somewhat rare (I have to make roughly 20 to 30 calls with hangup interrupted streams to get one most of the time) and I still don't understand the cause of them very well at all.  For now I'm going to consider them to be a different problem since they aren't reproducable and don't seem to be impeded by the patches I've tossed around so far. It seems to be completely contained within the actual streaming function.

By: Michael Cargile (mcargile) 2012-11-07 16:26:46.244-0600

The timer and meetme patches have also fixed the other issue I had with Local Channels and meetme.  We use a Local channel to bring in a Monitor extension.  Before this error would flood to the screen like crazy when we did this and the recording for the Monitor would be blank.  Now it works great.

By: Jonathan Rose (jrose) 2012-11-07 16:46:42.096-0600

If you would, please clarify whether either patch works or if it's both patches in tandem that fix these problems.

By: Michael Cargile (mcargile) 2012-11-08 08:16:42.221-0600

Neither apparently.  When I first tested that issue it was with asterisk 1.8.16.  I have been applying your patches to 1.8.18-rc1 and figure that that was what fixed those issues.  I just removed both patched and it still works.  That issue appears to have been fixed between 1.8.16 and 1.8.18-rc1.

By: Johan Wilfer (johan) 2012-11-08 09:58:42.604-0600

Ok, I tested with asterisk-1.8.17.0 yesterday and The new patch "timer_patch_ASTERISK-20486.diff" does not remedy the problem for me. I run the setup described in the description in ASTERISK-19949 and the CLI gets spammed with WARNING's just like before.

I've seen this happen in two scenarios in my production-server:

First with a Originate that connected a Meetme to a Local-channel that could do some announcements. In some cases the Local channel didn't do any announcement and exited after about 0.1-0.4 seconds. This is pretty much the case the example in ASTERISK-19949 reproduces. This is constant and is always the same.

The second scenario is when ~ 10 users connected via SIP leaves the Meetme simultaneously. Every user that leaves the conference leads to a Originate + Meetme and a Local-channel. At this point I had changed the code with Originate and the Local channel to always run announcements but even in this case the flood of warnings was triggered. This however is hard to reproduce.

By: Michael Cargile (mcargile) 2012-11-08 10:21:54.789-0600

Johan try the timer patch and the meetme patch on asterisk 1.8.18-rc1.  My MeetMe / Local Channel problems which were there in 1.8.16 are gone.

By: Johan Wilfer (johan) 2012-11-08 10:25:30.067-0600

I would like to do that right away, but my dev-server dropped of the Internet earlier today. I'll have to go to the datacenter and investigate this in the morning or try to lab up this test on another machine. Grr..

By: Jonathan Rose (jrose) 2012-11-08 11:08:07.985-0600

Alright, I think for the rest of these issues we are just going to reintroduce the hangup check before the ast_waitfor_nandfds function is used. Mark and I went over Russell's commit log and he removed that bit simply because it no longer appeared to be necessary, and it seems it actually was.  We can't find any real problematic issues causing this problem aside from possibly the timerfd thing in some of the cases and it also is caught by the hangup check.  We'll keep our eyes on that one.

I thought about adding a bunch of bailout conditions for every streamfile function since having multiple of those in a single pass of the loop without bailing out on the exit condition produces a warning, but the changes seem a little severe just to keep a more or less harmless warning from occurring in MeetMe.  The newest patch does however fix a small problem with menus not checking the response of the DTMF while hanging up the menus. It ends up using the 'default' case when the user hangs up because DTMF gets set to -1.  To address that I just made the menus only respond to positive values for dtmf (which should cover every valid DTMF a phone can send anyway).

The new patch I'd like people concerned with these issues to try is meetme_hangup_patch_ASTERISK-20486_v3.diff
It's a lot like the old patch and to be blunt it basically just reintroduces the hangup check that I mentioned Russell removed.

From this point in the issue, we are only concerned with the looping messages along the lines of:

[Nov  8 11:02:12] WARNING[7858]: app_meetme.c:3774 conf_run: Unable to write frame to channel SIP/gold-00000001

Where this error continues to occur frequently.

Single warnings produced by playback of files on channels that have already hung up aren't a concern. In otherwords, if you just see:

WARNING[7858]: file.c:764 ast_readaudio_callback: Failed to write frame

And it isn't looping like crazy (in fact, I imagine there are probably certain ways to get this message to crop up 3 or 4 times in a row with the right options to meetme) then it isn't really a concern, meetme is just haphazardly programmed without proper error response to ast_streamfile in a lot of places. Attempting to account for every place that ast_streamfile isn't checked appropriately would be a major effort that could break things. It's just not worthwhile.

By: Michael Cargile (mcargile) 2012-11-08 11:28:40.719-0600

@Jonathan Rose

Which version of Asterisk do you want me to test with and which patches should I apply?  Should I still apply the timer patch?

By: Jonathan Rose (jrose) 2012-11-08 11:35:31.242-0600

Any recent release of 1.8 (last couple months or so) should be fine. The only patch I want you to apply is meetme_hangup_patch_ASTERISK-20486_v3.diff

By: Michael Cargile (mcargile) 2012-11-08 12:11:37.820-0600

Testing now.  Will get back to you in a hour or so.

By: Michael Cargile (mcargile) 2012-11-08 16:14:59.129-0600

Sorry got side tracked by work.  I have not gotten any of the looping WARNINGs.  I am still getting about 50 of the single WARNINGs per hour, but they are never for the same channel like before.

By: Johan Wilfer (johan) 2012-11-11 11:47:57.574-0600

Okay, I have now tested the new patch. I run the setup described in the description in ASTERISK-19949.

tested asterisk-1.8.17.0 (unpatched): CLI spawns WARNINGS in a loop
tested asterisk-1.8.17.0 (patched meetme_hangup_patch_ASTERISK-20486_v3.diff): No warnings.
tested asterisk-1.8.19.0-rc1 (unpatched): CLI spawns WARNINGS in a loop
tested asterisk-1.8.19.0-rc1 (patched meetme_hangup_patch_ASTERISK-20486_v3.diff): No warnings.

Also, this patch has a lot of similarity with https://issues.asterisk.org/jira/secure/attachment/43067/meetme-hangup-trunk-360459.patch which I've been using for about five months now and it has worked perfectly. Good work in trackning down why this happened! Thank you Jonathan!