Summary: ASTERISK-28510: Asterisk crashing on dtls handshake
Reporter: Jørgen Ligaard Sørensen (jls@adversus.dk)
Labels: pjsip
Date Opened: 2019-08-21 08:23:34
Date Closed: 2019-08-27 05:01:00
Priority: Critical
Regression?:
Status: Closed/Complete
Components: Resources/res_rtp_asterisk
Versions: 16.5.0
Frequency of Occurrence: Occasional
Related Issues:
Environment: Ubuntu 16.04.1
Attachments:
( 0) core.asterisk.29-brief.txt
( 1) core.asterisk.29-full.txt
( 2) core.asterisk.29-locks.txt
( 3) core.asterisk.29-thread1.txt
Description: We have observed several occurrences of Asterisk crashing after recently changing our dialplan to PJSIP.
We've not found any direct way to reproduce the problem, but it does seem to be restricted to high-traffic hours. However, there doesn't seem to be any particular threshold that causes the crash.

For every occurrence we've observed, the following entries appear in the log immediately prior to the crash.

[Aug 21 12:18:19] VERBOSE[9439][C-000001c8] res_rtp_asterisk.c: 0x7f5682408ca0 -- Strict RTP switching to RTP target address 194.247.61.32:25184 as source
[Aug 21 12:18:19] ERROR[74] pjproject:   icess0x7f568244d588 ...Error sending STUN request: Invalid argument
[Aug 21 12:18:19] ERROR[137] pjproject:   icess0x7f568244d588 ..Error sending STUN request: Invalid argument
[Aug 21 12:18:19] ERROR[137] pjproject:   icess0x7f568244d588 ..Error sending STUN request: Invalid argument

After enabling backtraces, we also consistently see the same backtrace for the crash (in res_rtp_asterisk.c).
Comments:

By: Asterisk Team (asteriskteam) 2019-08-21 08:23:34.971-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Joshua C. Colp (jcolp) 2019-08-21 08:30:13.425-0500

What version of OpenSSL is in use? The crash itself seems to be occurring deep down in OpenSSL, when asked to start the DTLS handshake.

By: Jørgen Ligaard Sørensen (jls@adversus.dk) 2019-08-21 08:59:27.055-0500

openssl version
OpenSSL 1.0.2g  1 Mar 2016

I also caught the following from stderr:
d1_both.c(419): OpenSSL internal error, assertion failed: len == (unsigned int)ret

By: Joshua C. Colp (jcolp) 2019-08-22 08:08:23.280-0500

I don't think we're going to be able to do anything within Asterisk itself to resolve this problem, as it does appear to be in OpenSSL itself. I'd suggest upgrading to a newer version of OpenSSL, as numerous DTLS fixes have gone in since.

By: Jørgen Ligaard Sørensen (jls@adversus.dk) 2019-08-22 10:13:55.487-0500

Ok. I'll try to get a new build running sometime next week and see if that helps.
Meanwhile, I've done some digging around, and the issue seems to be restricted to INVITEs where the SDP has ICE candidates using a host name rather than an IP address.
I still haven't figured out exactly how to reproduce it, but we're increasing our logging in an attempt to narrow down the specifics.
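
A minimal sketch of how such INVITEs could be flagged while the logging is being increased, assuming the SDP body is available as plain text. The function name `hostname_candidates` and the sample addresses are hypothetical and are not part of Asterisk or pjproject. The field positions assume the usual `a=candidate` layout from the ICE specification (foundation, component, transport, priority, connection address, port).

```python
import ipaddress

# Hypothetical SDP fragment for illustration only; not taken from the crash logs.
SAMPLE_SDP = """\
v=0
m=audio 25184 UDP/TLS/RTP/SAVPF 0
a=candidate:1 1 UDP 2130706431 192.0.2.10 25184 typ host
a=candidate:2 1 UDP 2130706431 client.example.invalid 25184 typ host
"""


def hostname_candidates(sdp):
    """Return the connection-address of each ICE candidate that is not an IP literal."""
    found = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line.startswith("a=candidate:"):
            continue
        # Fields: foundation, component, transport, priority, address, port, ...
        fields = line[len("a=candidate:"):].split()
        if len(fields) < 6:
            continue  # malformed candidate line, skip it
        address = fields[4]
        try:
            ipaddress.ip_address(address)  # accepts IPv4 and IPv6 literals
        except ValueError:
            found.append(address)  # not an IP literal, so treat it as a host name
    return found


if __name__ == "__main__":
    print(hostname_candidates(SAMPLE_SDP))
```

Running a check like this over captured SDP bodies would only show whether host-name candidates correlate with the crashes; it does not establish that they are the cause.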

By: Jørgen Ligaard Sørensen (jls@adversus.dk) 2019-08-26 10:40:01.364-0500

I upgraded the platform to Ubuntu 18.04 (which includes OpenSSL 1.1.1) and this has fixed the problem.
However, this does appear to be a regression from Asterisk 16.4.0, as I can only reproduce the issue on Asterisk 16.5.0.

By: Joshua C. Colp (jcolp) 2019-08-26 11:20:34.488-0500

16.5.0 uses a different method of handling packets with DTLS, so it would exercise code in OpenSSL that was not used before and could hit a bug within it.