[Home]

Summary:ASTERISK-24831: Upon receiving BYE a dialog is scheduled for destruction in 32000 ms and gets stuck, causing long calls
Reporter:Fabian Borot (fborot)Labels:
Date Opened:2015-02-25 18:57:44.000-0600Date Closed:
Priority:MajorRegression?
Status:Open/NewComponents:Channels/chan_sip/General
Versions:11.2.1 13.1.0 Frequency of
Occurrence
Frequent
Related
Issues:
is related toASTERISK-24747 Call is not terminated in time due to lack of RTP nor it is shutdown if peer sends BYE
Environment:Asterisk 13.1-cert1 built by root @ AW-MIA-CORE-14 on a x86_64 running Linux on 2015-02-22 04:22:00 UTC Linux AW-MIA-CORE-14 2.6.18-308.24.1.el5 #1 SMP Tue Dec 4 17:43:34 EST 2012 x86_64 x86_64 x86_64 GNU/Linux Attachments:( 0) autodestruct_msgs_2nd_channel__b_leg_call-id.txt
( 1) debug_log_filtered_by_11820_c-000da360-with_private_IPs.txt
( 2) sip.conf
( 3) sip_msgs_a_leg.txt
( 4) sip_msgs_b_leg.txt
( 5) snapshot_a_leg.png
( 6) snapshot_b_leg.png
Description:We are having a very serious with calls not being terminated upon receiving of the BYEs sometimes.

A call is established, then one of the legs(in this case  B leg, outgoing channel) sends a BYE, the log shows the BYE being received and replied to (sip debug, plus debug level), but the call is scheduled to be destroyed in 32000 ms. (really long time!!) and nothing is shown in the logs regarding the termination of the whole dialog right away.

Then the A leg ( incoming channel) sends a BYE before that time (7 secs after we received the BYE in the B leg!!) because the asterisk did not send the BYE upon receiving the Bye for the B leg.

It seems in this case that the BYE from the A leg (2nd BYE) was the one that shutdown the call and we can see the DIAL command returning but sometimes also the process is repeated (also incoming channel scheduled for destruction in 32000 ms)

Sometimes the difference is more than a minute and this is causing us to have different times for the call record, over billing out customers and creating huge disputes.

I have a pcap file with both legs of the call, the asterisk log with debug info but I can publish it here because it has public IPs (from us, our customer and our providers)

I am submitting a summary of the debug session for this call replacing the IPs by private IPs, I can provide the rest to a safe place upon request.

At first sight it seems to be that the "sip_scheddestroy_final" inside "handle_request_bye" in chan_sip.c is called with the def value = -1, causing it to scheduled the destruction to "timer1 * 64" exactly 32000 ms
which should not be the case because asterisk received the BYE and replied to it as well right away with 200 OK, so there is nothing to wait for.

In case that the asterisk is the one sending the BYE then it has to go through the process of retransmitting the BYE request at T1, then 2T1 etc

let me know what else is needed please, we have a lot or urgency to fix this issue, although it can not be replicated we thing that we can provide enough information to help find the problem


Comments:By: Fabian Borot (fborot) 2015-02-25 18:58:34.820-0600

debug information

By: Fabian Borot (fborot) 2015-02-25 19:01:23.197-0600

debug info

By: Matt Jordan (mjordan) 2015-02-25 19:36:39.964-0600

We require additional debug to continue with triage of your issue. Please follow the instructions on the wiki [1] for how to collect debugging information from Asterisk. For expediency, where possible, attach the debug with a '.txt' file extension so that the debug will be usable for further analysis.

Thanks!

[1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

Please note that we need 'sip set debug on' enabled in your log. Your 'sip.conf' as well as a cap showing the message traffic would also help.

By: Fabian Borot (fborot) 2015-02-25 20:19:34.836-0600

Thank you Matt I have all of that it, but cant posted it here, what can we do?


By: Matt Jordan (mjordan) 2015-02-25 20:24:55.798-0600

As an open source project, you'll need to sanitize the input and provide the information, or else provide explicit instructions that allows a bug marshal to reproduce the issue in a controlled environment.

We don't close issues down unless they are security related, as that eliminates transparency in the project, and prevents the community from participating or helping with the issue.

By: Fabian Borot (fborot) 2015-02-25 20:30:45.934-0600

I have the pcap file and the complete log with debug containing the call with sip debug also, it is 65 MB compressed as rar file. Uncompressed it is 1+ GB, but I can not post it here as it is (and also the max size for attachments is 10 GB) I would need to remove the sensitive information and it would take more time.
I can provide it to you in another way, please email me at fborot at hotmail.com to exchange information about how to provide it to you, thank you a lot


By: Fabian Borot (fborot) 2015-02-26 09:07:19.142-0600

I think the debug information that I provided yesterday should be enough to get you started, I am sanitizing more files for submitting

By: Fabian Borot (fborot) 2015-02-26 09:36:15.939-0600

sip messages from both legs

By: Fabian Borot (fborot) 2015-02-26 09:36:42.413-0600

I just added the sip messages captured with sip set debug on.

By: Fabian Borot (fborot) 2015-02-26 09:44:26.749-0600

snapshots of both legs

By: Fabian Borot (fborot) 2015-02-26 09:45:09.990-0600

I attached the pcap snapshot from both legs,

By: Fabian Borot (fborot) 2015-02-26 09:48:44.841-0600

autodestruct messages , they keep showing even after the call is gone