Summary: | ASTERISK-27251: chan_sip doesn't honour rtptimeout setting | ||
Reporter: | Ian Gilmour (tuxian) | Labels: | |
Date Opened: | 2017-09-05 10:57:13 | Date Closed: | |
Priority: | Major | Regression? | |
Status: | Open/New | Components: | Channels/chan_sip/General |
Versions: | 13.17.1 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Ubuntu 16.04 (64-bit) | Attachments: | ( 0) ast-conf.tgz ( 1) cli.tgz |
Description: | I have Asterisk running in the cloud with sip.conf configured with:
{noformat} rtptimeout=60 rtpholdtimeout=300 rtpkeepalive=20 {noformat} I have a confbridge managed conference room, configured to play MOH if there is only 1 participant. I have a SIP client (Jitsi) configured to use TLS+SRTP. It registers with Asterisk on a chan_sip managed IP address + port. The SIP client is behind a NAT. If the SIP client enters the conference room MOH plays as expected. If I then disconnect the SIP client from the network (so it doesn't deregister, or issue a SIP BYE) I see Asterisk sending media (MOH) to the SIP client's orginal IP address and port indefinitely (>24hours+) and the CLI shows the user as present in the conf call. Media output only terminates if I kick the user manually out of the conf call via the CLI. Once the SIP client is disconnected from the network wireshark shows ICMP Destination unreachable messages being returned in response to each Asterisk (MOH) outgoing media packet. n.b. I see similar behaviour if I run both Asterisk and client locally. i.e. no NAT. The bug seems to be related to ASTERISK-26523. Reverting the ASTERISK-26523 change so that it only updates the lastrtprx if the frame isn't AST_FRAME_NULL gives me a chan_sip that honours the sip.conf rtptimeout. | ||
Comments: | By: Asterisk Team (asteriskteam) 2017-09-05 10:57:15.003-0500 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: Rusty Newton (rnewton) 2017-09-06 17:46:48.672-0500 Thanks for the report and debug. However we also need protocol specific debug captured at the time of the issue. Please include the following: * Asterisk log files generated using the instructions on the Asterisk wiki [1], with the appropriate protocol debug options enabled, e.g. 'pjsip set logger on' if the issue involves the chan_pjsip channel driver. * Configuration information for the relevant channel driver, e.g. pjsip.conf. * A wireshark compatible packet capture, captured at the same time as the Asterisk log output. [1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information By: Ian Gilmour (tuxian) 2017-09-11 06:55:32.469-0500 [^cli.tgz] contains a cli.txt file. This shows the CLI output of a chan_sip Jitsi client joining a conf call with the following debug enabled: {noformat} sip set debug on core set verbose 4 core set debug 4 {noformat} As it's the only user in the conf call, MOH is played to the user. Around cli.txt:1206 the sip client (Jitsi) is removed from the network. Asterisk CLI "sip show channelstats" shows no media pkts being received from the client beyond this point. Yet Asterisk continues to send MOH media to it for the next 5mins i.e. well beyond the rtptimeout period (60 secs), with no SIP or RTP pkts being received. "sip show channelstats" output: {noformat} grep -nH -e "10.10.0.19 539ab9d4d04" cli.txt cli.txt:1154:10.10.0.19 539ab9d4d04 00:01:11 0000003350 0000000000 ( 0.00%) 0.0000 0000002569 0000000000 ( 0.00%) 0.0060 cli.txt:1158:10.10.0.19 539ab9d4d04 00:01:13 0000003457 0000000000 ( 0.00%) 0.0000 0000002677 0000000000 ( 0.00%) 0.0061 cli.txt:1168:10.10.0.19 539ab9d4d04 00:01:15 0000003539 0000000000 ( 0.00%) 0.0000 0000002759 0000000000 ( 0.00%) 0.0057 cli.txt:1206:10.10.0.19 539ab9d4d04 00:01:32 0000004052 0000000000 ( 0.00%) 0.0000 0000003625 0000000000 ( 0.00%) 0.0063 cli.txt:1214:10.10.0.19 539ab9d4d04 00:01:35 0000004052 0000000000 ( 0.00%) 0.0000 0000003733 0000000000 ( 0.00%) 0.0063 cli.txt:1218:10.10.0.19 539ab9d4d04 00:01:36 0000004052 0000000000 ( 0.00%) 0.0000 0000003802 0000000000 ( 0.00%) 0.0063 cli.txt:1233:10.10.0.19 539ab9d4d04 00:01:38 0000004052 0000000000 ( 0.00%) 0.0000 0000003910 0000000000 ( 0.00%) 0.0063 cli.txt:1240:10.10.0.19 539ab9d4d04 00:01:53 0000004052 0000000000 ( 0.00%) 0.0000 0000004626 0000000000 ( 0.00%) 0.0063 cli.txt:1244:10.10.0.19 539ab9d4d04 00:01:54 0000004052 0000000000 ( 0.00%) 0.0000 0000004720 0000000000 ( 0.00%) 0.0063 cli.txt:1256:10.10.0.19 539ab9d4d04 00:02:32 0000004052 0000000000 ( 0.00%) 0.0000 0000006588 0000000000 ( 0.00%) 0.0063 cli.txt:1260:10.10.0.19 539ab9d4d04 00:02:34 0000004052 0000000000 ( 0.00%) 0.0000 0000006714 0000000000 ( 0.00%) 0.0063 cli.txt:1272:10.10.0.19 539ab9d4d04 00:03:29 0000004052 0000000000 ( 0.00%) 0.0000 0000009439 0000000000 ( 0.00%) 0.0063 cli.txt:1353:10.10.0.19 539ab9d4d04 00:05:31 0000004052 0000000000 ( 0.00%) 0.0000 0000015508 0000000000 ( 0.00%) 0.0063 cli.txt:1357:10.10.0.19 539ab9d4d04 00:05:34 0000004052 0000000000 ( 0.00%) 0.0000 0000015675 0000000000 ( 0.00%) 0.0063 cli.txt:1361:10.10.0.19 539ab9d4d04 00:05:36 0000004052 0000000000 ( 0.00%) 0.0000 0000015759 0000000000 ( 0.00%) 0.0063 cli.txt:1365:10.10.0.19 539ab9d4d04 00:05:37 0000004052 0000000000 ( 0.00%) 0.0000 0000015822 0000000000 ( 0.00%) 0.0063 cli.txt:1369:10.10.0.19 539ab9d4d04 00:05:39 0000004052 0000000000 ( 0.00%) 0.0000 0000015892 0000000000 ( 0.00%) 0.0063 {noformat} The cli.txt output also confirms that: {noformat} RTP Keepalive: 20 RTP Timeout: 60 RTP Hold Timeout: 300 {noformat} By: Ian Gilmour (tuxian) 2017-09-11 08:59:07.957-0500 Note, the mod I'm suggesting was originally applied as the fix for ASTERISK-25270 back in Jun-2016. For some reason this change was reversed in the fix for ASTERISK-26523. By: Joshua C. Colp (jcolp) 2017-09-11 09:06:13.322-0500 The fix was reversed with this comment: Unfortunately, the aforementioned commit caused a regression (incoming calls would eventually disconnect). Thus it is being removed. I believe the change caused local RTP bridging to fail, specifically the code never knew that media was flowing (when it was) and terminated the call incorrectly. By: Ian Gilmour (tuxian) 2017-09-11 10:06:52.155-0500 Note, I can also reregister the same chan_sip SIP client (Jitsi) with Asterisk, rejoin the same conf call, and again kill the sip client's network connection, and I can do this multiple times. This results in the same user appearing in the conf call multiple times. Every time I do this it adds another outgoing Asterisk media stream, that continues to be sent packets indefinitely, with (potentially) nothing at the other end to receive it. {noformat} *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 0f2e4ed2355 00:00:22 0000000901 0000000000 ( 0.00%) 0.0000 0000000822 0000000000 ( 0.00%) 0.0060 10.10.0.19 daa1c4bb415 00:51:44 0000001761 0000000000 ( 0.00%) 0.0000 0000000154K 0000000000 ( 0.00%) 0.0063 *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 ac65cff28ff 00:01:09 0000003297 0000000000 ( 0.00%) 0.0000 0000003076 0000000003 ( 0.10%) 0.0062 10.10.0.19 0f2e4ed2355 00:21:52 0000034224 0000000000 ( 0.00%) 0.0000 0000064726 0000000000 ( 0.00%) 0.0063 10.10.0.19 daa1c4bb415 01:13:14 0000001761 0000000000 ( 0.00%) 0.0000 0000000217K 0000000000 ( 0.00%) 0.0063 10.10.0.19 1e1f8c48ab4 00:03:12 0000002196 0000000000 ( 0.00%) 0.0000 0000008971 0000000000 ( 0.00%) 0.0060 4 active SIP channels *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 ac65cff28ff 00:01:11 0000003413 0000000000 ( 0.00%) 0.0000 0000003192 0000000003 ( 0.09%) 0.0063 10.10.0.19 0f2e4ed2355 00:21:54 0000034224 0000000000 ( 0.00%) 0.0000 0000064842 0000000000 ( 0.00%) 0.0063 10.10.0.19 daa1c4bb415 01:13:17 0000001761 0000000000 ( 0.00%) 0.0000 0000000218K 0000000000 ( 0.00%) 0.0063 10.10.0.19 1e1f8c48ab4 00:03:15 0000002196 0000000000 ( 0.00%) 0.0000 0000009087 0000000000 ( 0.00%) 0.0060 4 active SIP channels *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 ac65cff28ff 00:01:13 0000003517 0000000000 ( 0.00%) 0.0000 0000003296 0000000003 ( 0.09%) 0.0057 10.10.0.19 0f2e4ed2355 00:21:56 0000034224 0000000000 ( 0.00%) 0.0000 0000064946 0000000000 ( 0.00%) 0.0063 10.10.0.19 daa1c4bb415 01:13:19 0000001761 0000000000 ( 0.00%) 0.0000 0000000218K 0000000000 ( 0.00%) 0.0063 10.10.0.19 1e1f8c48ab4 00:03:17 0000002196 0000000000 ( 0.00%) 0.0000 0000009191 0000000000 ( 0.00%) 0.0060 4 active SIP channels *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 ac65cff28ff 00:01:16 0000003629 0000000000 ( 0.00%) 0.0000 0000003408 0000000003 ( 0.09%) 0.0066 10.10.0.19 0f2e4ed2355 00:21:58 0000034224 0000000000 ( 0.00%) 0.0000 0000065058 0000000000 ( 0.00%) 0.0063 10.10.0.19 daa1c4bb415 01:13:21 0000001761 0000000000 ( 0.00%) 0.0000 0000000218K 0000000000 ( 0.00%) 0.0063 10.10.0.19 1e1f8c48ab4 00:03:19 0000002196 0000000000 ( 0.00%) 0.0000 0000009303 0000000000 ( 0.00%) 0.0060 4 active SIP channels *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 ac65cff28ff 00:01:17 0000003687 0000000000 ( 0.00%) 0.0000 0000003466 0000000003 ( 0.09%) 0.0061 10.10.0.19 0f2e4ed2355 00:22:00 0000034224 0000000000 ( 0.00%) 0.0000 0000065116 0000000000 ( 0.00%) 0.0063 10.10.0.19 daa1c4bb415 01:13:22 0000001761 0000000000 ( 0.00%) 0.0000 0000000218K 0000000000 ( 0.00%) 0.0063 10.10.0.19 1e1f8c48ab4 00:03:20 0000002196 0000000000 ( 0.00%) 0.0000 0000009361 0000000000 ( 0.00%) 0.0060 4 active SIP channels *CLI> sip show channelstats Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter 10.10.0.19 ac65cff28ff 00:01:18 0000003754 0000000000 ( 0.00%) 0.0000 0000003533 0000000003 ( 0.08%) 0.0060 10.10.0.19 0f2e4ed2355 00:22:01 0000034224 0000000000 ( 0.00%) 0.0000 0000065183 0000000000 ( 0.00%) 0.0063 10.10.0.19 daa1c4bb415 01:13:23 0000001761 0000000000 ( 0.00%) 0.0000 0000000218K 0000000000 ( 0.00%) 0.0063 10.10.0.19 1e1f8c48ab4 00:03:21 0000002196 0000000000 ( 0.00%) 0.0000 0000009428 0000000000 ( 0.00%) 0.0060 4 active SIP channels *CLI> conf confbridge config *CLI> confbridge kick list lock mute record show unlock unmute *CLI> confbridge list 1234 Channel Flags User Profile Bridge Profile Menu CallerID ============================== ====== ================ ================ ================ ================ SIP/1006-00000000 ME USER conferences admin_menu 1006 SIP/1006-00000001 ME USER conferences user_menu 1006 SIP/1006-00000002 ME USER conferences user_menu 1006 SIP/1006-00000003 ME USER conferences user_menu 1006 *CLI> {noformat} This simulates a SIP client at the end of a dodgy network connection - so it seems reasonable to expect Asterisk to handle it slightly more gracefully. By: Ian Gilmour (tuxian) 2017-09-11 10:44:45.539-0500 Rereading ASTERISK-25270, I checked my confbridge setup. In the previous tests I had "jitterbuffer=yes" configured in the confbridge.conf user profile. Setting "jitterbuffer=no" results in a conf room that honours the rtptimeout setting again, but presumably will potentially result in poorer audio heard by the conf room participants. I also confirmed that setting confbridge "music_on_hold_when_empty=no", when jitterbuffer="yes", results in the same behaviour as when "music_on_hold_when_empty=yes". i.e. I can get the same user in a conference call multiple times, each being sent a media stream indefinitely, with no client at the other end of the media stream. Presumably the music on hold media is simply replaced with silence frames. By: Rusty Newton (rnewton) 2017-09-13 16:04:41.272-0500 Ian, if you can, to make this a little easier for folks who want to reproduce and investigate the issue - you can attach chan_sip, musiconhold and dialplan to the issue with some instructions on how to quickly reproduce the issue. By: Ian Gilmour (tuxian) 2017-09-14 10:36:18.690-0500 [^ast-conf.tgz] contains the requested config files. Asterisk is running on a local ubuntu 16.04 VM. The asterisk config supports registrations from 20 chan_sip users (1001-1020 - defined in users.conf), each with no password. I've simplified the dialplan to just support a conf room on 1234 being dialed by a local chan_sip user. The user gets prompted for conf room number and pin (both 1-20 digits in length). To aid debugging I've also simplified the chan_sip setup to accept clients on tcp rather than tls and use rtp rather than srtp. musiconhold is now off by default. The problem still occurs. The Jitsi SIP client is running on a local host and registers with asterisk as user 1006, on tcp port 5060. Jitsi is configured to use ulaw and alaw. To reproduce the problem: * Having registered a SIP client with asterisk, dial 1234. You'll get prompted for a conf room number and pin. Enter 1-20 digits for each and you'll enter the dynamically created conf room. * You'll be told "you are the only participant in the conference". * Disconnect the SIP client host from the network and terminate the SIP client app. * "sip show channelstats" confirms media continues to play beyond the rtptimeout configured in sip.conf (60 secs). Media continues to flow indefinitely. * Reconnect the SIP client host, restart the SIP client app (Jitsi) and reregister it with asterisk. * Again dial 1234 and re-enter the same conf room as before, using the same pin. * You'll get told "there is 1 other user in the conference". * "confbridge list <confRoomNumber>" confirms the same user is now in the conference twice. * "sip show channelstats" confirms media is now playing to 2 users. Media continues to flow to both users indefinitely. You can repeat the above to get the same user in the conference multiple times, with media streams flowing indefinitely to each. The only way to stop the media flow is to kick the user out of the conference. Or to disable the confbridge jitterbuffer (i.e. set jitterbuffer=no in confbridge.conf) and restart asterisk. Or to revert the ASTERISK-26523 change. |