[Home]

Summary:ASTERISK-27381: Crash inside opus codec
Reporter:Torrey Searle (tsearle)Labels:
Date Opened:2017-10-30 10:18:20Date Closed:2017-12-01 11:52:30.000-0600
Priority:MajorRegression?
Status:Closed/CompleteComponents:Codecs/codec_opus
Versions:13.15.1 Frequency of
Occurrence
Related
Issues:
is duplicated byASTERISK-26810 codec_opus: Crash when translating frame with opus section
is related toASTERISK-26870 codec_opus: Codec configured with constant bit rate, but frame sizes changes
Environment:Attachments:( 0) codecs.conf
( 1) core.asterisk.ast13sip01-1509443936-11-brief.txt
( 2) core.asterisk.ast13sip01-1509443936-11-full.txt
( 3) core.asterisk.ast13sip01-1509443936-11-locks.txt
( 4) core.asterisk.ast13sip01-1509443936-11-thread1.txt
( 5) log
Description:Crash has been detected on our WEBRTC platform inside the opus codec version 1.1.0 attached is the backtrace
{noformat}
#0  0x00007f0102b5c642 in ?? () from /usr/lib/asterisk/modules/codec_opus-13.0_1.1.0-x86_64.so
#1  0x00007f0102b50f26 in ?? () from /usr/lib/asterisk/modules/codec_opus-13.0_1.1.0-x86_64.so
#2  0x00000000005dd38e in framein (f=<optimized out>, pvt=<optimized out>) at translate.c:423
#3  ast_translate (path=0x7f00ec02cdd0, f=0x7f0030004560, consume=0) at translate.c:573
#4  0x00000000004b9c3c in ast_write (chan=0x7f008837fec0, fr=0x7f0030004560) at channel.c:5290
#5  0x00000000004822ea in bridge_channel_handle_write (bridge_channel=<optimized out>) at bridge_channel.c:2346
#6  bridge_channel_wait (bridge_channel=<optimized out>) at bridge_channel.c:2593
#7  bridge_channel_internal_join (bridge_channel=0x7f00ec0080b0) at bridge_channel.c:2728
#8  0x000000000046c3d6 in ast_bridge_join (bridge=bridge@entry=0x7f00ec00d120, chan=chan@entry=0x7f008837fec0, swap=swap@entry=0x0, features=features@entry=0x7f00aab08a10,
   tech_args=tech_args@entry=0x0, flags=flags@entry=(AST_BRIDGE_JOIN_PASS_REFERENCE | AST_BRIDGE_JOIN_INHIBIT_JOIN_COLP)) at bridge.c:1713
#9  0x000000000050536f in ast_bridge_call_with_flags (chan=chan@entry=0x7f008837fec0, peer=peer@entry=0x7f00ec015bf0, config=config@entry=0x7f00aab08e00, flags=flags@entry=0)
   at features.c:672
#10 0x0000000000505477 in ast_bridge_call (chan=chan@entry=0x7f008837fec0, peer=peer@entry=0x7f00ec015bf0, config=config@entry=0x7f00aab08e00) at features.c:711
#11 0x00007f0056643019 in dial_exec_full (chan=0x7f008837fec0, data=<optimized out>, peerflags=peerflags@entry=0x7f00aab09740, continue_exec=continue_exec@entry=0x0)
   at app_dial.c:3224
#12 0x00007f0056644126 in dial_exec (chan=<optimized out>, data=<optimized out>) at app_dial.c:3280
#13 0x0000000000579d5e in pbx_exec (c=c@entry=0x7f008837fec0, app=app@entry=0x2224550, data=data@entry=0x7f00aab09c50 "PJSIP/883510080318@cnhk1ser,180,b(predial^s^1)")
   at pbx_app.c:491
#14 0x000000000056ee41 in pbx_extension_helper (c=0x7f008837fec0, context=0x7f0088380878 "webrtc", exten=0x7f00883808c8 "s", priority=1, label=<optimized out>,
   callerid=<optimized out>, action=E_SPAWN, found=0x7f00aab0bcec, combined_find_spawn=1, con=0x0) at pbx.c:2884
#15 0x0000000000570fd9 in ast_spawn_extension (combined_find_spawn=<optimized out>, found=<optimized out>, callerid=<optimized out>, priority=<optimized out>,
   exten=<optimized out>, context=<optimized out>, c=<optimized out>) at pbx.c:4109
#16 __ast_pbx_run (c=0x7f008837fec0, args=0x7f00ec02cec0, args@entry=0x0) at pbx.c:4286
#17 0x000000000057251b in pbx_thread (data=data@entry=0x7f008837fec0) at pbx.c:4608
#18 0x00000000005e2eda in dummy_start (data=<optimized out>) at utils.c:1238
#19 0x00007f0115bb1064 in start_thread (arg=0x7f00aab0c700) at pthread_create.c:309
{noformat}
Comments:By: Asterisk Team (asteriskteam) 2017-10-30 10:18:21.287-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Richard Mudgett (rmudgett) 2017-10-30 10:27:12.579-0500

Potentially related to ASTERISK-27194 but doesn't appear to be.

By: Torrey Searle (tsearle) 2017-10-31 05:26:20.971-0500

I tested the version with the debug symbols in lab and it crashed immediately with the same back trace.  Attached is the coredump output

By: Richard Mudgett (rmudgett) 2017-11-08 14:18:16.390-0600

Please attach the codecs.conf file or the equivalent database config settings.
Also what are the last few lines in the log.

By: Torrey Searle (tsearle) 2017-11-10 02:15:32.222-0600

The crash is a result of failure to initialise the codec.  The codec fails to initialize  because of no value set for application.

Application is not getting set (not even to a default) because codec_opus_config_attr_cfg somewhere between line 62 and 69 sets cfg to null

{noformat}
Breakpoint 1, codec_opus_config_attr_cfg (fmt=0x7fffdc014588) at codec_opus_config.c:62
62 codec_opus_config.c: No such file or directory.
(gdb) print *fmt
$1 = {name = 0x620429 "opus", codec = 0x11e51b8, attribute_data = 0x7fffdc0145b0, interface = 0x7fffc9c3b4c0 <opus_interface>}
(gdb) info locals
cfg = 0x7fffd0053300
__PRETTY_FUNCTION__ = "codec_opus_config_attr_cfg"
(gdb) next
66 in codec_opus_config.c
(gdb) next
[New Thread 0x7fffa37f5700 (LWP 21948)]
[New Thread 0x7fffa38ed700 (LWP 21946)]
[New Thread 0x7fffa3871700 (LWP 21947)]
[New Thread 0x7fffa3779700 (LWP 21949)]
[Thread 0x7fffc7448700 (LWP 21930) exited]
[Thread 0x7fffc7540700 (LWP 21939) exited]
[New Thread 0x7fffa36fd700 (LWP 21952)]
[Thread 0x7fffc75bc700 (LWP 21940) exited]
[New Thread 0x7fffa3681700 (LWP 21953)]
[Thread 0x7fffa39e5700 (LWP 21943) exited]
[New Thread 0x7fffa39e5700 (LWP 21954)]
[Thread 0x7fffa37f5700 (LWP 21948) exited]
[Thread 0x7fffa3779700 (LWP 21949) exited]
69 in codec_opus_config.c
(gdb) info locals
cfg = 0x0
__PRETTY_FUNCTION__ = "codec_opus_config_attr_cfg"
(gdb) next
[Thread 0x7fffc74c4700 (LWP 21938) exited]
70 in codec_opus_config.c
(gdb) info locals
cfg = 0x0
__PRETTY_FUNCTION__ = "codec_opus_config_attr_cfg"
(gdb) next
[New Thread 0x7fffa3779700 (LWP 21956)]
[New Thread 0x7fffc74c4700 (LWP 21955)]
[Thread 0x7fffa3969700 (LWP 21944) exited]
opus_dec_set (pvt=0x7fffd0011210, fmt_src=0x7fffdc014588, fmt_dst=0x11e3f48) at codec_opus.c:264
264 codec_opus.c: No such file or directory.
(gdb) print *pvt
$2 = {t = 0x7fffcdaf1b60 <opustolin>, f = {frametype = AST_FRAME_VOICE, subclass = {integer = 0, format = 0x11e3f48,
     frame_ending = 0}, datalen = 0, samples = 0, mallocd = 0, mallocd_hdr_len = 0, offset = 64,
   src = 0x7fffcdaf1b60 <opustolin> "opustolin", data = {ptr = 0x7fffd0011358, uint32 = 3489731416,
     pad = "X\023\001\320\377\177\000"}, delivery = {tv_sec = 0, tv_usec = 0}, frame_list = {next = 0x0}, flags = 1,
   ts = 3150328, len = 20, seqno = 20501}, samples = 0, datalen = 0, pvt = 0x7fffd00112f0, outbuf = {c = 0x7fffd0011358 "",
   uc = 0x7fffd0011358 "", i16 = 0x7fffd0011358, ui8 = 0x7fffd0011358 ""}, plc = 0x0, next = 0x0, nextin = {tv_sec = 1509618132,
   tv_usec = 716600}, nextout = {tv_sec = 1509618132, tv_usec = 696600}, explicit_dst = 0x11e3f48}
(gdb) print cfg
$3 = (struct codec_opus_config *) 0x0
(gdb) info locals
dec = 0x7fffd00112f0
sample_rate = 48000
ptime = 0
error = 0
cfg = 0x0
__PRETTY_FUNCTION__ = "opus_dec_set"
{noformat}

By: Richard Mudgett (rmudgett) 2017-11-14 16:11:21.572-0600

I am unable to reproduce the crash.  I tried sending a WebRTC call to a playback exten, to an indirect playback exten (using a local channel), and sending the call a SIP phone exten.  I can see in the code how a global defaults config might not be initialized if there is a codecs.conf [opus] section but I don't see how it could get used as a result.

Could you attach an Asterisk log of a call that crashes with verbose of at least 4 and debug of at least 4.

Also codec_opus 1.2.0 was released yesterday.  It has a fix that should prevent the crash but not fix why it was trying to use an uninitialized config.

By: Torrey Searle (tsearle) 2017-11-17 03:00:22.164-0600

The test call I was doing passes through asterisk twice, I filtered the log on the channel id so you only have the channel with the issue in this log.  This channel is receiving a call in opus and performing an echo test.

{noformat}
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'AGI'
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:1] AGI("PJSIP/srnd0ser-000000a2", "agi://127.0.0.1:20203/agi_load_balance_handler") in new stack
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] res_agi.c: Wow, connected!
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: agi://127.0.0.1:20203/agi_load_balance_handler: [ast13sip01] request for VCC node received
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: agi://127.0.0.1:20203/agi_load_balance_handler: statement : SELECT ms.ip, ms.max_calls  FROM callcontrollocal.machine_status ms  WHERE ms.hostname LIKE ?    AND ms.service = vcc    AND IFNULL(ms.force_alive, ms.is_alive) = 1;
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: agi://127.0.0.1:20203/agi_load_balance_handler: selected VCC node (in pop ast13) : 10.1.25.3
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: <PJSIP/srnd0ser-000000a2>AGI Script agi://127.0.0.1:20203/agi_load_balance_handler completed, returning 0
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'NoOp'
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:2] NoOp("PJSIP/srnd0ser-000000a2", "using VCC node : 10.1.25.3") in new stack
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx_variables.c: Function PJSIP_HEADER(read,X-VCC-CallType) result is '(null)'
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx_variables.c: Expression result is '0'
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'GotoIf'
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:3] GotoIf("PJSIP/srnd0ser-000000a2", "0?voxfax-init,,1:") in new stack
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx_builtins.c: Not taking any branch
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'AGI'
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:4] AGI("PJSIP/srnd0ser-000000a2", "agi://10.1.25.3/init") in new stack
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] res_agi.c: Wow, connected!
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: <PJSIP/srnd0ser-000000a2>AGI Script agi://10.1.25.3/init completed, returning 0
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'Answer'
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [echo@ivrs2:1] Answer("PJSIP/srnd0ser-000000a2", "") in new stack
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] res_rtp_asterisk.c: 0x7f3f781151f0 -- Probation learning mode pass with source address 10.1.25.2:12238
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] chan_pjsip.c: Oooh, got a frame with format of opus on channel 'PJSIP/srnd0ser-000000a2' when we're sending 'alaw', switching to match
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] channel.c: Channel PJSIP/srnd0ser-000000a2 setting write format path: alaw -> opus
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] channel.c: Channel PJSIP/srnd0ser-000000a2 setting read format path: alaw -> opus
[Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'Wait'
[Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [echo@ivrs2:2] Wait("PJSIP/srnd0ser-000000a2", "4") in new stack
[Nov 17 08:52:59] DEBUG[2430][C-0000009c] pbx.c: Launching 'Playback'
[Nov 17 08:52:59] VERBOSE[2430][C-0000009c] pbx.c: Executing [echo@ivrs2:3] Playback("PJSIP/srnd0ser-000000a2", "demo-echotest") in new stack
[Nov 17 08:52:59] DEBUG[2430][C-0000009c] channel.c: Channel PJSIP/srnd0ser-000000a2 setting write format path: gsm -> opus
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] DEBUG[2430][C-0000009c] channel.c: Scheduling timer at (50 requested / 50 actual) timer ticks per second
[Nov 17 08:52:59] VERBOSE[2430][C-0000009c] file.c: <PJSIP/srnd0ser-000000a2> Playing 'demo-echotest.gsm' (language 'en')
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
{noformat}

By: Torrey Searle (tsearle) 2017-11-17 03:11:15.933-0600

full log in attachment

By: Richard Mudgett (rmudgett) 2017-11-17 15:53:44.979-0600

I've managed to recreate the {{codec_opus.c: Opus: failed to create encoder: invalid argument}} symptoms of what is in the [^log] file.  I did it by manually loading codec_opus after Asterisk was up and then did a call scenario similar to what is in the log.  Could you attach your {{modules.conf}} file so I can see when you are loading the codec?

By: Torrey Searle (tsearle) 2017-11-17 16:04:20.018-0600

I also renamed the opus codec files, to add the arch & version in the filename (mentioning that in case that would also impact loading order)

they are installed with the following names
codec_opus-13.0_1.1.0-x86_64.so
format_ogg_opus-13.0_1.1.0-x86_64.so
codec_opus_config-en_US.xml


{noformat}
[modules]
autoload=yes
noload => pbx_ael.so
noload => pbx_dundi.so
noload => pbx_gtkconsole.so
noload => pbx_kdeconsole.so
noload => chan_sip.so
noload => chan_alsa.so
noload => chan_iax2.so
noload => chan_mgcp.so
noload => chan_multicast_rtp.so
noload => chan_oss.so
noload => chan_phone.so
noload => chan_skinny.so
noload => chan_unistim.so
noload => cdr_adaptive_odbc.so
noload => cdr_csv.so
noload => cdr_custom.so
noload => cdr_manager.so
noload => cdr_mysql.so
noload => cdr_odbc.so
noload => cdr_pgsql.so
noload => cdr_radius.so
noload => cdr_sqlite.so
noload => cdr_sqlite3_custom.so
noload => cdr_syslog.so
noload => res_config_ldap.so
noload => res_config_pgsql.so
{noformat}

By: Torrey Searle (tsearle) 2017-11-20 07:54:44.076-0600

changing
[opus] to [myopus]

causes the echo test to work

By: Richard Mudgett (rmudgett) 2017-11-30 14:34:42.876-0600

{noformat}
[Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument
{noformat}
The above error message as well as the ability to use [opus] instead of [myopus] should be fixed by Digium's codec_opus v1.3.0 when it is released.