
Summary: ASTERISK-26835: res_rtp_asterisk: Crash when freeing RTCP address string
Reporter: Niklas Larsson (pnlarsson)
Labels:
Date Opened: 2017-03-03 02:02:06.000-0600
Date Closed: 2017-04-21 13:12:05
Priority: Major
Regression?:
Status: Closed/Complete
Components: Resources/res_rtp_asterisk
Versions: 13.14.0
Frequency of Occurrence: Occasional
Related Issues:
is duplicated by ASTERISK-27018 Crash in res_rtp_asterisk.c
is related to ASTERISK-26853 res_rtp_asterisk: Crash in pjnath when receiving packet
Environment: Debian 8
Attachments:
( 0) 0001-res_rtp_asterisk-Set-rtp-rtcp-to-NULL-to-prevent-dou.patch
( 1) backtrace_20170306_clean.txt
( 2) backtrace_core.uc51-2017-03-17T08-34-30+0100.txt
( 3) backtrace_core.uc62-2017-03-02T11-47-33+0100.txt
( 4) backtrace-threads-clean.txt
( 5) core_show_locks.txt
( 6) core-asterisk-running-2017-03-31T09-33-43-0400-brief.txt
( 7) core-asterisk-running-2017-03-31T09-33-43-0400-full.txt
( 8) core-asterisk-running-2017-03-31T09-33-43-0400-locks.txt
( 9) core-asterisk-running-2017-03-31T09-33-43-0400-thread1.txt
(10) core-asterisk-running-2017-04-03T09-07-25-0400-brief.txt
(11) core-asterisk-running-2017-04-03T09-07-25-0400-full.txt
(12) core-asterisk-running-2017-04-03T09-07-25-0400-locks.txt
(13) core-asterisk-running-2017-04-03T09-07-25-0400-thread1.txt
(14) example.mp3
Description: Now and then we get this segfault, and it has been around for some versions (at least since 13.13; it could be earlier as well).
Comments:
By: Asterisk Team (asteriskteam) 2017-03-03 02:02:06.861-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Ross Beer (rossbeer) 2017-03-06 06:33:08.874-0600

I am experiencing the same issue, please see attached backtrace.

By: Sean Bright (seanbright) 2017-03-08 15:52:46.529-0600

I've attached a total stab-in-the-dark patch. Could you give it a whirl and let me know?

By: Ross Beer (rossbeer) 2017-03-08 16:33:39.590-0600

I've applied the patch, I'll let you know the outcome. Thank you for your assistance.

By: Sean Bright (seanbright) 2017-03-08 16:36:01.483-0600

It's either going to have no effect whatsoever, or it may reduce the occurrence of the problem, but I don't believe it is a complete fix. If it reduces the occurrences, it's just turning a timing problem into a slightly less likely timing problem.
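
To illustrate the distinction drawn here, the following is a minimal sketch in C. The struct and function names (`demo_rtp`, `demo_release_unlocked`, `demo_release_locked`, `demo_rtp_new`) are hypothetical and are not taken from res_rtp_asterisk.c or from the attached patch. Setting a pointer to NULL after freeing it makes a later repeat call harmless, but two threads can still both see the non-NULL pointer before either one clears it. Only performing the check, the free, and the clear as one critical section closes that window.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an RTP instance that owns an RTCP address string. */
struct demo_rtp {
    pthread_mutex_t lock;
    char *rtcp_addr;
};

/* Allocate an instance holding a private copy of addr. */
static struct demo_rtp *demo_rtp_new(const char *addr)
{
    struct demo_rtp *rtp;
    size_t len;

    assert(addr != NULL);
    rtp = calloc(1, sizeof(*rtp));
    if (!rtp) {
        return NULL;
    }
    len = strlen(addr) + 1;
    rtp->rtcp_addr = malloc(len);
    if (!rtp->rtcp_addr) {
        free(rtp);
        return NULL;
    }
    memcpy(rtp->rtcp_addr, addr, len);
    pthread_mutex_init(&rtp->lock, NULL);
    return rtp;
}

/* Free-and-NULL with no lock: a later second call is harmless, but two
 * concurrent callers can both pass the NULL check and both call free(). */
static int demo_release_unlocked(struct demo_rtp *rtp)
{
    if (!rtp->rtcp_addr) {
        return 0; /* already released */
    }
    free(rtp->rtcp_addr);
    rtp->rtcp_addr = NULL;
    return 1;
}

/* Same logic, but check + free + clear happen as one critical section. */
static int demo_release_locked(struct demo_rtp *rtp)
{
    int released;

    pthread_mutex_lock(&rtp->lock);
    released = demo_release_unlocked(rtp);
    pthread_mutex_unlock(&rtp->lock);
    return released;
}
```

In this sketch `demo_release_locked` returns 1 for the caller that actually freed the string and 0 for every later caller, which is the behavior the unlocked variant can only guarantee when calls never overlap.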

By: Ross Beer (rossbeer) 2017-03-13 12:26:55.520-0500

The patch is still in place with no further crashes. Could this patch be put on Gerrit for inclusion in the releases?

By: Sean Bright (seanbright) 2017-03-13 14:02:12.811-0500

If it actually fixed the problem I would submit for inclusion, but it doesn't. It just makes the chances of a crash less likely. Until there is an actual fix, I would suggest this issue remain open.

In the meantime, you're obviously free to continue patching your local installation.

By: Niklas Larsson (pnlarsson) 2017-03-17 02:46:02.164-0500

With the stab in the dark patch applied

By: Richard Mudgett (rmudgett) 2017-03-27 17:22:56.596-0500

A patch is up for review at https://gerrit.asterisk.org/#/c/5341/ and needs some real-world testing. I have run it through the testsuite a couple of times and done some test calls.

By: Sebastian Gutierrez (sum) 2017-03-30 09:50:16.830-0500

I tested the patch in production to see whether it resolves my DTLS timeout crash, but the result was worse. I think a deadlock occurred, and I had to go back to a previous version. It processed more than 1000 calls.

By: Richard Mudgett (rmudgett) 2017-03-30 15:41:13.584-0500

New patch up on gerrit.  Still at https://gerrit.asterisk.org/#/c/5341/

By: Sebastian Gutierrez (sum) 2017-03-31 08:48:34.261-0500

It deadlocked again. This time we have all the traces, with "don't optimize" and "debug threads" enabled, and if needed the core dump has malloc debug.

By: Sebastian Gutierrez (sum) 2017-04-03 09:09:02.592-0500

With the latest patch I had some RTP issues: the calls had audio cuts (it seems like missing packets). I have attached all the logs and an MP3. All calls behaved the same way, and going back to a previous version solved the issue.

By: Richard Mudgett (rmudgett) 2017-04-03 12:59:55.357-0500

New patch to fix another race condition segfault exposed by the patch up on gerrit.

By: Richard Mudgett (rmudgett) 2017-04-04 16:53:54.920-0500

New patch to fix another race condition segfault exposed by the patch up on gerrit.

By: Richard Mudgett (rmudgett) 2017-04-06 13:17:25.959-0500

New patch up on gerrit.  The new patch adds more protection from reinvites restarting ICE negotiations.

By: Sebastian Gutierrez (sum) 2017-04-10 14:24:57.027-0500

I had the same crash I was getting with patchset 5; I haven't tested patchset 6 yet (I will today).

{noformat}
#0  dtls_srtp_handle_timeout (instance=instance@entry=0x7f387c0131a0, rtcp=rtcp@entry=1) at res_rtp_asterisk.c:2050
#1  0x00007f38256f1408 in dtls_srtp_handle_rtcp_timeout (data=0x7f387c0131a0) at res_rtp_asterisk.c:2085
#2  0x00000000005b0fab in ast_sched_runq (con=0x1e89570) at sched.c:783
#3  0x00007f381ddfff8e in do_monitor (data=data@entry=0x0) at chan_sip.c:29615
#4  0x00000000005e817d in dummy_start (data=<optimized out>) at utils.c:1235
#5  0x00007f388aefb6ba in start_thread (arg=0x7f38156c4700) at pthread_create.c:333
#6  0x00007f388a4e482d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
{noformat}


I didn't get the full data because I think there are already several full logs of the issue. If needed, I will collect them.
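
The backtrace above shows a scheduler callback (`dtls_srtp_handle_rtcp_timeout`) running against an RTP instance. As a hedged illustration of the general hazard, and not the actual Asterisk code or the Gerrit fix, the sketch below uses hypothetical names (`demo_instance`, `demo_dtls`, `demo_timeout_cb`, `demo_teardown`) to show a timer callback that re-checks, under the instance lock, whether the state it needs still exists before touching it. It assumes the instance itself outlives the callback (for example through reference counting, which is not shown).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-stream DTLS state owned by the instance. */
struct demo_dtls {
    int timeouts_handled;
};

/* Hypothetical RTP instance; rtcp_dtls is NULL once torn down. */
struct demo_instance {
    pthread_mutex_t lock;
    struct demo_dtls *rtcp_dtls;
};

/* Scheduler-style callback: returns 1 to ask for rescheduling, 0 to stop.
 * The NULL check and the use of rtcp_dtls share one critical section, so a
 * concurrent teardown cannot free the state between the two. */
static int demo_timeout_cb(void *data)
{
    struct demo_instance *inst = data;
    int reschedule = 0;

    assert(inst != NULL);
    pthread_mutex_lock(&inst->lock);
    if (inst->rtcp_dtls) {
        inst->rtcp_dtls->timeouts_handled++;
        reschedule = 1;
    }
    pthread_mutex_unlock(&inst->lock);
    return reschedule;
}

/* Teardown frees and clears the state under the same lock. */
static void demo_teardown(struct demo_instance *inst)
{
    pthread_mutex_lock(&inst->lock);
    free(inst->rtcp_dtls);
    inst->rtcp_dtls = NULL;
    pthread_mutex_unlock(&inst->lock);
}
```

A callback that fires after `demo_teardown` finds `rtcp_dtls` cleared and returns 0 instead of dereferencing freed memory.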

By: Richard Mudgett (rmudgett) 2017-04-12 12:35:45.618-0500

Patch version 6 up on gerrit is expected to be merged in a few days (after a merge conflict is resolved).  For those testing the patch, I haven't heard about how well the patch is working for you.

By: Ross Beer (rossbeer) 2017-04-12 12:41:53.849-0500

I've been testing this from around Patch V2 and it has resolved the crash and deadlocks I was getting.

I don't use ICE and therefore didn't have any of the issues relating to that.

By: Ross Beer (rossbeer) 2017-04-14 02:31:12.025-0500

Bad news: when testing with Asterisk GIT-13-13.15.0-rc1-79-g5e2a8efM using patchset 7, there is a deadlock. However, using Asterisk GIT-13-13.15.0-rc1-64-ge851412M with patchset 6, there is no deadlock.

This may mean the rebase has caused issues between patchset 6 and 7 or another piece of code has been committed that is causing a new deadlock.




By: Richard Mudgett (rmudgett) 2017-04-14 09:32:33.352-0500

[~rossbeer] You should know by now what kind of information we need to fix issues:
See "Getting Information For A Deadlock" on this page https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

By: Ross Beer (rossbeer) 2017-04-14 09:43:31.303-0500

@Richard Mudgett I did get a core dump; however, I generated the backtrace after downgrading to a previous version, so the stack was corrupt.

I will install the latest GIT version plus patchset 7 and will wait for a further deadlock. Once I have the required information I will update the ticket.

By: Ross Beer (rossbeer) 2017-04-14 17:13:46.272-0500

Richard, please find attached the backtrace and 'core show locks'

By: Richard Mudgett (rmudgett) 2017-04-14 17:28:17.577-0500

The patch for ASTERISK-26923 is causing the deadlock.  Commit 3e7c396a51b240088c475dd53e7bac9869376129

Revert that commit and you shouldn't have a deadlock anymore.

By: Ross Beer (rossbeer) 2017-04-18 04:24:40.196-0500

I can confirm reverting the commit has resolved the deadlock. There have been no further issues with this patch since the 14th April.

By: Friendly Automation (friendly-automation) 2017-04-21 13:12:06.564-0500

Change 5342 merged by George Joseph:
rtp_engine/res_rtp_asterisk: Fix RTP struct reentrancy crashes.

[https://gerrit.asterisk.org/5342|https://gerrit.asterisk.org/5342]

By: Friendly Automation (friendly-automation) 2017-04-21 13:12:26.104-0500

Change 5341 merged by George Joseph:
rtp_engine/res_rtp_asterisk: Fix RTP struct reentrancy crashes.

[https://gerrit.asterisk.org/5341|https://gerrit.asterisk.org/5341]

By: Friendly Automation (friendly-automation) 2017-04-21 15:47:42.528-0500

Change 5343 merged by George Joseph:
rtp_engine/res_rtp_asterisk: Fix RTP struct reentrancy crashes.

[https://gerrit.asterisk.org/5343|https://gerrit.asterisk.org/5343]