Summary: | ASTERISK-16665: Asterisk Crash on RTCP package in SRTP mode | ||||
Reporter: | Bernhard S (bernhards) | Labels: | |||
Date Opened: | 2010-09-10 01:52:08 | Date Closed: | 2011-10-05 17:33:29 | ||
Priority: | Critical | Regression? | No | ||
Status: | Closed/Complete | Components: | Resources/res_srtp | ||
Versions: | Frequency of Occurrence | ||||
Related Issues: |
| ||||
Environment: | Attachments: | ( 0) bt_full_not_optimized.txt ( 1) bt_full.txt ( 2) bt_not_optimized.txt ( 3) bt.txt ( 4) bt-full.txt ( 5) srtp-crash-after-5sek.pcap | |||
Description: | "snom360-SIP 8.4.18 42570" is connected to Asterisk over TLS. The snom makes an outbound call to another phone (without SRTP). The other phone rings, then Asterisk crashes. libsrtp version 1.4.4 was used, unmodified.
****** ADDITIONAL INFORMATION ******
Last output of the "full-log":
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Ooh, format changed from unknown to ulaw
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Created smoother: format: ulaw ms: 20 len: 160
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Starting RTCP transmission on RTP instance '0x9bab4d8'
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Setting RTCP address on RTP instance '0x9bab4d8'
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: RTP NAT: Got audio from other end. Now sending to address 212.47.191.70:30456
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Ooh, format changed from unknown to alaw
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Created smoother: format: alaw ms: 20 len: 160
[Sep 10 08:26:34] DEBUG[22299] res_rtp_asterisk.c: Starting RTCP transmission on RTP instance '0xb4334c58'
GDB backtrace:
#0 0x0085d612 in rdb_add_index () from /usr/lib/asterisk/modules/res_srtp.so
(gdb) bt
#0 0x0085d612 in rdb_add_index () from /usr/lib/asterisk/modules/res_srtp.so
#1 0x008577a1 in srtp_unprotect_rtcp () from /usr/lib/asterisk/modules/res_srtp.so
#2 0x00856199 in ast_srtp_unprotect (srtp=0x9baee20, buf=0xb4184668, len=0xb418c6f4, rtcp=1) at res_srtp.c:328
#3 0x0074855a in __rtp_recvfrom (instance=0x9bab4d8) at res_rtp_asterisk.c:328
#4 rtcp_recvfrom (instance=0x9bab4d8) at res_rtp_asterisk.c:337
#5 ast_rtcp_read (instance=0x9bab4d8) at res_rtp_asterisk.c:1637
#6 0x0074b4af in ast_rtp_read (instance=0x9bab4d8, rtcp=1) at res_rtp_asterisk.c:1975
#7 0x045a84c5 in sip_rtp_read (ast=0xb432c260) at chan_sip.c:6839
#8 sip_read (ast=0xb432c260) at chan_sip.c:6920
#9 0x080bb5b3 in __ast_read (chan=0xb432c260, dropaudio=0) at channel.c:3742
#10 0x0089666b in wait_for_answer (in=0xb432c260, outgoing=0xb4331890, to=0xb418e170, peerflags=0xb418e1a0, opt_args=0xb418e0b0, pa=0xb418d7f0, num_in=0xb418e104, result=0xb418e15c, dtmf_progress=0x0, ignore_cc=1) at app_dial.c:1334
#11 0x00898a11 in dial_exec_full (chan=0xb432c260, data=<value optimized out>, peerflags=0xb418e1a0, continue_exec=0x0) at app_dial.c:2236
#12 0x0089d199 in dial_exec (chan=0xb432c260, data=0xb4190254 "SIP/sip07/0041445633931:::,120,t") at app_dial.c:2733
#13 0x08137fab in pbx_exec (c=0xb432c260, app=0x9adf560, data=0xb4190254 "SIP/sip07/0041445633931:::,120,t") | ||||
Comments: |
By: Marcello Ceschia (marcelloceschia) 2010-09-10 08:17:33
Workaround for this: disable RTCP Support in Advanced->SIP/RTP->RTCP Support.

By: Bernhard S (bernhards) 2010-09-10 08:46:09
Yes, I know. But this does not solve the main problem :-)

By: Marcello Ceschia (marcelloceschia) 2010-09-10 09:24:54
But it looks like a firmware issue on the snom. I had the same issue with 8.4.18. I downgraded to 8.2.35 -> no problem; upgraded back to 8.4.18 -> no problem any more.

By: Bernhard S (bernhards) 2010-09-10 09:34:51
That might be a timing issue, but neither the server nor the client should crash if the other party does something wrong.

By: Marcello Ceschia (marcelloceschia) 2010-09-10 09:44:25
You are right, but first of all we should find the reason this happens.

By: Paul Belanger (pabelanger) 2010-09-10 16:10:40
We will need an unoptimized (see below) backtrace.
---
Thank you for your bug report. In order to move your issue forward, we require a backtrace from the core file produced after the crash. Please see the doc/backtrace.txt file in your Asterisk source directory. Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then: make install. After enabling, reproduce the crash, and then execute the instructions in doc/backtrace.txt. When complete, attach that file to this issue report. Thanks!

By: Bernhard S (bernhards) 2010-09-10 17:03:17
I already added the last ~20 lines of the backtrace, which show what happened within the RTCP flow. As marcelloceschia said, he also has the bug, so I think it should be possible to reproduce this ticket. I think this is a timing issue related to the establishment of the SRTP session. Maybe the RTCP is the VERY first SRTP packet that comes in and some SRTP state is not yet initialized.

By: Paul Belanger (pabelanger) 2010-09-10 19:04:15
Yes, however we still need you to follow the instructions in doc/backtrace.txt. More information is better than less.

By: Bernhard S (bernhards) 2010-09-11 01:48:09
Hi, I added the backtrace and backtrace_full. I don't know whether they were created on an optimized Asterisk. I have to check that and try to get non-optimized ones if they were. Maybe you can already find something interesting, or someone else can reproduce the failure too. Thanks for your help and your effort. Best regards, Bernhard

By: Marcello Ceschia (marcelloceschia) 2010-09-11 02:26:37
I hope I can reproduce this issue. I always got the crash after 5 seconds of connection time, but it looks like it only happened if the phone had an uptime > 12 hours.

By: Paul Belanger (pabelanger) 2010-09-11 08:44:22
bernhards: Yes, they are still optimized; you can tell when you see "<value optimized out>" within a backtrace.

By: Bernhard S (bernhards) 2010-09-13 01:23:26
Please find attached the non-optimized backtraces of the segfault. Best regards, Bernhard

By: Marcello Ceschia (marcelloceschia) 2010-09-13 08:03:23
I added a non-optimized backtrace: bt-full.txt. After 5 seconds of connection I got this core dump. Wireshark trace for this: srtp-crash-after-5sek.pcap. Uptime of the snom360: 1 days, 2 hours, 33 minutes. I use the res_srtp.c.patch from issue ASTERISK-1732563.

By: Leif Madsen (lmadsen) 2010-09-13 10:43:27
Also, issues that are somewhat hardware related require information to be produced by the reporter, as the developers do not necessarily have access to the same hardware you are using to produce the crash.

By: Bernhard S (bernhards) 2010-09-27 01:50:06
What information do you need to fix this issue for rc3? I think this is a major bug within the SRTP stack, which may also be a release blocker.

By: Bernhard S (bernhards) 2010-10-21 01:30:01
Will there be a bugfix before Asterisk 1.8 or not?

By: 1stbs (1stbs) 2010-10-26 12:25:24
Hello, I have the same problem. I tried to fix it and arrived at the following analysis. The error occurs in svn/trunk/srtp/crypto/replay/rdb.c, in the function rdb_add_index, when it calls
v128_set_bit(&rdb->bitmask, rdb_bits_in_bitmask-delta);
I added some debugging output to the function:
index is : 3010
rdb->window_start is : 0
rdb_bits_in_bitmask is : 128
delta in else tree is : 2883
rdb_bits_in_bitmask-delta= -2755
With that negative value you get the core dump. For now I have just commented out the else branch to avoid the crash. It seems to work, but replay protection is somewhat impaired. Perhaps somebody here has a better idea for solving this issue.

By: Birger "WIMPy" Harzenetter (wimpy) 2010-11-08 14:10:51.000-0600
I've got the same issue in Asterisk 1.8.0. With a freshly started phone everything is OK. After some uptime of the phone (about half an hour seems to be enough), Asterisk segfaults for me as well. Interesting points for me are:
- According to the phone's display it's an insecure connection. (Yes, I'm using TLS as well.)
- Asterisk always crashes after nearly 5 seconds. But during that time audio works in both directions.
What can we do to help pinpoint the issue?

By: Andrew Fried (afried) 2010-12-06 10:01:47.000-0600
This problem also exists in 1.8.1-rc1. I'm also using Snom 300 phones with the 8.4.18 firmware. As marcelloceschia stated, disabling RTCP Support on the phone itself did seem to solve the immediate problem. I'd like to mirror bernhards' concerns that this is a very serious bug and hope that Digium elevates this bug in their queue accordingly.

By: Gianluca Varisco (gvarisco) 2010-12-22 03:14:13.000-0600
I confirm the behavior also on 1.8.x trunk.

By: Andrew Fried (afried) 2010-12-22 14:46:05.000-0600
Is there any way we can get this issue elevated to a priority above "regular" and assigned to someone for remediation?
The default configuration of all Snom phones has "RTCP Support" turned on, and any phone accepting a secure call with that setting causes Asterisk to core dump within 5 seconds of receiving a call when TLS/SRTP are active. The bug surfaced in 1.8.0 (with the introduction of SRTP support) and remains in both 1.8.1 and 1.8.2-rc1.

By: Leif Madsen (lmadsen) 2011-01-05 15:03:18.000-0600
Issues are prioritized and assigned to developers as time and resources allow.

By: Bob Beers (bbeers) 2011-02-09 09:40:22.000-0600
I have a similar issue, but I did not get an Asterisk crash, just a call hangup. I would get this warning from Asterisk when an RTCP packet arrived, about 5 seconds after the call started:
WARNING[3181]: res_rtp_asterisk.c:1817 ast_rtcp_read: RTCP Read error: Success. Hanging up.
My problem occurs with Asterisk 1.8.0-beta2 and libsrtp-1.4.4. The other end is Avaya Aura AS5300 13.0.0.4.
TEO TSG-6 <-- no TLS --> Asterisk <-- TLS --> Avaya <-???-> ???
No, I don't have a wireshark trace, but I have it on my todo list. I worked around the issue by patching res_rtp_asterisk.c to ignore RTCP packets if SRTP was active. Then calls would stay up. I don't recommend the patch, but I'll attach it to help developers isolate the issue.

By: Bob Beers (bbeers) 2011-02-09 09:43:17.000-0600
Well, truth is I feel so sure that the patch is a bad idea, I don't even want to attach it as a submission, but I'll paste it in-line here for informational purposes ...
Index: res/res_rtp_asterisk.c
===================================================================
--- res/res_rtp_asterisk.c (revision 303637)
+++ res/res_rtp_asterisk.c (working copy)
@@ -1640,6 +1640,7 @@
 static struct ast_frame *ast_rtcp_read(struct ast_rtp_instance *instance)
 {
     struct ast_rtp *rtp = ast_rtp_instance_get_data(instance);
+    struct ast_srtp *srtp = ast_rtp_instance_get_srtp(instance);
     struct ast_sockaddr addr;
     unsigned int rtcpdata[8192 + AST_FRIENDLY_OFFSET];
     unsigned int *rtcpheader = (unsigned int *)(rtcpdata + AST_FRIENDLY_OFFSET);
@@ -1652,6 +1653,11 @@
                 0, &addr)) < 0) {
         ast_assert(errno != EBADF);
         if (errno != EAGAIN) {
+            if (srtp != NULL) {
+                /* If we have an srtp stream, ignore srtcp (for now). */
+                ast_log(LOG_WARNING, "RTCP Read error: %s. We ignore during SRTP.\n", strerror(errno));
+                return &ast_null_frame;
+            }
             ast_log(LOG_WARNING, "RTCP Read error: %s. Hanging up.\n", strerror(errno));
             return NULL;
         }

By: Roman Shpount (rshpount) 2011-04-07 23:33:45
There is an invalid index access in rdb.c when adding an index. When the sequence number of the SRTCP packet is outside of the window by more than the window size, a bit is set outside of the window bitmap. When the sequence number is outside of the window by more than one, a wrong bit is set. Here is the proposed fix.
In the current code in rdb.c:

err_status_t
rdb_add_index(rdb_t *rdb, uint32_t index) {
  uint32_t delta;

  /* here we *assume* that index > rdb->window_start */
  delta = (index - rdb->window_start);
  if (delta < rdb_bits_in_bitmask) {
    /* if the index is within the window, set the appropriate bit */
    v128_set_bit(&rdb->bitmask, delta);
  } else {
    delta -= rdb_bits_in_bitmask - 1;
    /* shift the window forward by delta bits*/
    v128_left_shift(&rdb->bitmask, delta);
    v128_set_bit(&rdb->bitmask, rdb_bits_in_bitmask-delta);
    rdb->window_start += delta;
  }
  return err_status_ok;
}

It should be (only the v128_set_bit call in the else branch changes):

err_status_t
rdb_add_index(rdb_t *rdb, uint32_t index) {
  uint32_t delta;

  /* here we *assume* that index > rdb->window_start */
  delta = (index - rdb->window_start);
  if (delta < rdb_bits_in_bitmask) {
    /* if the index is within the window, set the appropriate bit */
    v128_set_bit(&rdb->bitmask, delta);
  } else {
    delta -= rdb_bits_in_bitmask - 1;
    /* shift the window forward by delta bits*/
    v128_left_shift(&rdb->bitmask, delta);
    v128_set_bit(&rdb->bitmask, rdb_bits_in_bitmask-1);
    rdb->window_start += delta;
  }
  return err_status_ok;
}

By: Gregory Hinton Nietsky (irroot) 2011-05-23 04:16:39
This is not an Asterisk problem; the problem is in SRTP. The patch above SHOULD not be on Mantis. The patch is available on the SRTP SourceForge page.

By: Vladimir Mikhelson (vmikhelson) 2011-05-24 15:47:34
rshpount, thank you for posting here. It definitely helps with troubleshooting Asterisk crashes. Can you please post a diff? I would like to test your patch. Here is the link to the SourceForge LibSRTP Bug Tracker irroot referred to: http://sourceforge.net/tracker/?func=detail&aid=3280295&group_id=38894&atid=423799
-Vladimir

By: Roman Shpount (rshpount) 2011-05-26 00:50:56
Please see the difference between versions 1.4 and 1.5 of rdb.c in libsrtp: http://srtp.cvs.sourceforge.net/viewvc/srtp/srtp/crypto/replay/rdb.c?r1=1.4&r2=1.5
This is the fix for this bug.

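Editorial note: the arithmetic in 1stbs's debug output and the one-line change proposed by Roman Shpount can be replayed in isolation. The following is a minimal sketch, not libsrtp code: `replay_window`, `unpatched_bit_position`, `slide`, `window_add` and `window_seen` are hypothetical names introduced here for illustration, and the 128-bit mask is modelled with two 64-bit words instead of libsrtp's `v128_t`. It assumes only what the thread states: a 128-bit window, bit i tracking `window_start + i`, and the control flow of `rdb_add_index` as quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the value of rdb_bits_in_bitmask printed in the thread. */
#define WINDOW_BITS 128u

/* Hypothetical stand-in for libsrtp's rdb_t: bit i tracks window_start + i. */
typedef struct {
    uint32_t window_start;
    uint64_t lo; /* bits 0..63 */
    uint64_t hi; /* bits 64..127 */
} replay_window;

/* Bit position the unpatched code computed for v128_set_bit, widened to a
 * signed type so an out-of-range result is visible instead of wrapping. */
static int64_t unpatched_bit_position(uint32_t window_start, uint32_t index)
{
    uint32_t delta = index - window_start;
    if (delta < WINDOW_BITS)
        return (int64_t)delta;
    delta -= WINDOW_BITS - 1;
    return (int64_t)WINDOW_BITS - (int64_t)delta;
}

static void set_bit(replay_window *w, uint32_t pos)
{
    assert(pos < WINDOW_BITS); /* a write outside the window is a logic error */
    if (pos < 64)
        w->lo |= UINT64_C(1) << pos;
    else
        w->hi |= UINT64_C(1) << (pos - 64);
}

/* Slide the window forward: bit i moves to bit i - n, vacated bits clear. */
static void slide(replay_window *w, uint32_t n)
{
    if (n >= WINDOW_BITS) {
        w->lo = w->hi = 0;
    } else if (n >= 64) {
        w->lo = w->hi >> (n - 64);
        w->hi = 0;
    } else if (n > 0) {
        w->lo = (w->lo >> n) | (w->hi << (64 - n));
        w->hi >>= n;
    }
}

/* Same control flow as the corrected rdb_add_index quoted in the thread. */
static void window_add(replay_window *w, uint32_t index)
{
    uint32_t delta = index - w->window_start;
    if (delta < WINDOW_BITS) {
        set_bit(w, delta);
    } else {
        delta -= WINDOW_BITS - 1;
        slide(w, delta);
        set_bit(w, WINDOW_BITS - 1); /* the new index is always the top bit */
        w->window_start += delta;
    }
}

/* Returns 1 if index lies inside the window and is marked as seen, else 0. */
static int window_seen(const replay_window *w, uint32_t index)
{
    uint32_t delta = index - w->window_start;
    if (delta >= WINDOW_BITS)
        return 0;
    return (int)((delta < 64 ? w->lo >> delta : w->hi >> (delta - 64)) & 1u);
}
```

With `window_start` 0 and index 3010, `unpatched_bit_position` evaluates to -2755, the value 1stbs logged. With index 130 it evaluates to 125, which after the window advances to start at 3 would mark index 128 instead of 130; this is the "wrong bit" case Roman Shpount describes. `window_add` always sets bit 127 after sliding, so the newly added index is the one recorded.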
By: Leif Madsen (lmadsen) 2011-10-05 17:33:29.269-0500
Closing as it was an issue outside of Asterisk.

By: Thomas Arimont (tomaso) 2011-12-07 03:06:10.438-0600
The patch referenced here did not work in my environment (srtp lib 1.4.4, Asterisk 1.8.8.0-rc4, SNOM Version 7). I had to use this one (found it somewhere, but can't remember where):
{noformat}
--- srtp-1.4.4-r0/crypto/replay/rdb.c 2006/06/08 17:00:28
+++ srtp-1.4.4-r0/crypto/replay/rdb.c 2010/05/14 22:15:12
@@ -115,7 +115,7 @@
     /* shift the window forward by delta bits*/
     v128_left_shift(&rdb->bitmask, delta);
-    v128_set_bit(&rdb->bitmask, rdb_bits_in_bitmask-delta);
+    v128_set_bit(&rdb->bitmask, 127);
     rdb->window_start += delta;
   }
{noformat}

By: Iñaki Baz Castillo (ibc) 2012-03-29 07:42:16.801-0500
Assuming that this bug is fixed in libsrtp 1.5 (as Roman Shpount clarifies in a previous comment), note that the current libsrtp version on most systems is still 1.4.X. So wouldn't it be better if Asterisk included libsrtp 1.5 within its sources rather than depending on an external library that will probably be a 1.4.X version? |