Summary: | ASTERISK-27381: Crash inside opus codec | ||||||
Reporter: | Torrey Searle (tsearle) | Labels: | |||||
Date Opened: | 2017-10-30 10:18:20 | Date Closed: | 2017-12-01 11:52:30.000-0600 | ||||
Priority: | Major | Regression? | |||||
Status: | Closed/Complete | Components: | Codecs/codec_opus | ||||
Versions: | 13.15.1 | Frequency of Occurrence | |||||
Related Issues: |
| ||||||
Environment: | Attachments: | ( 0) codecs.conf ( 1) core.asterisk.ast13sip01-1509443936-11-brief.txt ( 2) core.asterisk.ast13sip01-1509443936-11-full.txt ( 3) core.asterisk.ast13sip01-1509443936-11-locks.txt ( 4) core.asterisk.ast13sip01-1509443936-11-thread1.txt ( 5) log | |||||
Description: | Crash has been detected on our WEBRTC platform inside the opus codec version 1.1.0 attached is the backtrace
{noformat} #0 0x00007f0102b5c642 in ?? () from /usr/lib/asterisk/modules/codec_opus-13.0_1.1.0-x86_64.so #1 0x00007f0102b50f26 in ?? () from /usr/lib/asterisk/modules/codec_opus-13.0_1.1.0-x86_64.so #2 0x00000000005dd38e in framein (f=<optimized out>, pvt=<optimized out>) at translate.c:423 #3 ast_translate (path=0x7f00ec02cdd0, f=0x7f0030004560, consume=0) at translate.c:573 #4 0x00000000004b9c3c in ast_write (chan=0x7f008837fec0, fr=0x7f0030004560) at channel.c:5290 #5 0x00000000004822ea in bridge_channel_handle_write (bridge_channel=<optimized out>) at bridge_channel.c:2346 #6 bridge_channel_wait (bridge_channel=<optimized out>) at bridge_channel.c:2593 #7 bridge_channel_internal_join (bridge_channel=0x7f00ec0080b0) at bridge_channel.c:2728 #8 0x000000000046c3d6 in ast_bridge_join (bridge=bridge@entry=0x7f00ec00d120, chan=chan@entry=0x7f008837fec0, swap=swap@entry=0x0, features=features@entry=0x7f00aab08a10, tech_args=tech_args@entry=0x0, flags=flags@entry=(AST_BRIDGE_JOIN_PASS_REFERENCE | AST_BRIDGE_JOIN_INHIBIT_JOIN_COLP)) at bridge.c:1713 #9 0x000000000050536f in ast_bridge_call_with_flags (chan=chan@entry=0x7f008837fec0, peer=peer@entry=0x7f00ec015bf0, config=config@entry=0x7f00aab08e00, flags=flags@entry=0) at features.c:672 #10 0x0000000000505477 in ast_bridge_call (chan=chan@entry=0x7f008837fec0, peer=peer@entry=0x7f00ec015bf0, config=config@entry=0x7f00aab08e00) at features.c:711 #11 0x00007f0056643019 in dial_exec_full (chan=0x7f008837fec0, data=<optimized out>, peerflags=peerflags@entry=0x7f00aab09740, continue_exec=continue_exec@entry=0x0) at app_dial.c:3224 #12 0x00007f0056644126 in dial_exec (chan=<optimized out>, data=<optimized out>) at app_dial.c:3280 #13 0x0000000000579d5e in pbx_exec (c=c@entry=0x7f008837fec0, app=app@entry=0x2224550, data=data@entry=0x7f00aab09c50 "PJSIP/883510080318@cnhk1ser,180,b(predial^s^1)") at pbx_app.c:491 #14 0x000000000056ee41 in pbx_extension_helper (c=0x7f008837fec0, context=0x7f0088380878 "webrtc", exten=0x7f00883808c8 "s", priority=1, label=<optimized out>, callerid=<optimized out>, action=E_SPAWN, found=0x7f00aab0bcec, combined_find_spawn=1, con=0x0) at pbx.c:2884 #15 0x0000000000570fd9 in ast_spawn_extension (combined_find_spawn=<optimized out>, found=<optimized out>, callerid=<optimized out>, priority=<optimized out>, exten=<optimized out>, context=<optimized out>, c=<optimized out>) at pbx.c:4109 #16 __ast_pbx_run (c=0x7f008837fec0, args=0x7f00ec02cec0, args@entry=0x0) at pbx.c:4286 #17 0x000000000057251b in pbx_thread (data=data@entry=0x7f008837fec0) at pbx.c:4608 #18 0x00000000005e2eda in dummy_start (data=<optimized out>) at utils.c:1238 #19 0x00007f0115bb1064 in start_thread (arg=0x7f00aab0c700) at pthread_create.c:309 {noformat} | ||||||
Comments: | By: Asterisk Team (asteriskteam) 2017-10-30 10:18:21.287-0500 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: Richard Mudgett (rmudgett) 2017-10-30 10:27:12.579-0500 Potentially related to ASTERISK-27194 but doesn't appear to be. By: Torrey Searle (tsearle) 2017-10-31 05:26:20.971-0500 I tested the version with the debug symbols in lab and it crashed immediately with the same back trace. Attached is the coredump output By: Richard Mudgett (rmudgett) 2017-11-08 14:18:16.390-0600 Please attach the codecs.conf file or the equivalent database config settings. Also what are the last few lines in the log. By: Torrey Searle (tsearle) 2017-11-10 02:15:32.222-0600 The crash is a result of failure to initialise the codec. The codec fails to initialize because of no value set for application. Application is not getting set (not even to a default) because codec_opus_config_attr_cfg somewhere between line 62 and 69 sets cfg to null {noformat} Breakpoint 1, codec_opus_config_attr_cfg (fmt=0x7fffdc014588) at codec_opus_config.c:62 62 codec_opus_config.c: No such file or directory. (gdb) print *fmt $1 = {name = 0x620429 "opus", codec = 0x11e51b8, attribute_data = 0x7fffdc0145b0, interface = 0x7fffc9c3b4c0 <opus_interface>} (gdb) info locals cfg = 0x7fffd0053300 __PRETTY_FUNCTION__ = "codec_opus_config_attr_cfg" (gdb) next 66 in codec_opus_config.c (gdb) next [New Thread 0x7fffa37f5700 (LWP 21948)] [New Thread 0x7fffa38ed700 (LWP 21946)] [New Thread 0x7fffa3871700 (LWP 21947)] [New Thread 0x7fffa3779700 (LWP 21949)] [Thread 0x7fffc7448700 (LWP 21930) exited] [Thread 0x7fffc7540700 (LWP 21939) exited] [New Thread 0x7fffa36fd700 (LWP 21952)] [Thread 0x7fffc75bc700 (LWP 21940) exited] [New Thread 0x7fffa3681700 (LWP 21953)] [Thread 0x7fffa39e5700 (LWP 21943) exited] [New Thread 0x7fffa39e5700 (LWP 21954)] [Thread 0x7fffa37f5700 (LWP 21948) exited] [Thread 0x7fffa3779700 (LWP 21949) exited] 69 in codec_opus_config.c (gdb) info locals cfg = 0x0 __PRETTY_FUNCTION__ = "codec_opus_config_attr_cfg" (gdb) next [Thread 0x7fffc74c4700 (LWP 21938) exited] 70 in codec_opus_config.c (gdb) info locals cfg = 0x0 __PRETTY_FUNCTION__ = "codec_opus_config_attr_cfg" (gdb) next [New Thread 0x7fffa3779700 (LWP 21956)] [New Thread 0x7fffc74c4700 (LWP 21955)] [Thread 0x7fffa3969700 (LWP 21944) exited] opus_dec_set (pvt=0x7fffd0011210, fmt_src=0x7fffdc014588, fmt_dst=0x11e3f48) at codec_opus.c:264 264 codec_opus.c: No such file or directory. (gdb) print *pvt $2 = {t = 0x7fffcdaf1b60 <opustolin>, f = {frametype = AST_FRAME_VOICE, subclass = {integer = 0, format = 0x11e3f48, frame_ending = 0}, datalen = 0, samples = 0, mallocd = 0, mallocd_hdr_len = 0, offset = 64, src = 0x7fffcdaf1b60 <opustolin> "opustolin", data = {ptr = 0x7fffd0011358, uint32 = 3489731416, pad = "X\023\001\320\377\177\000"}, delivery = {tv_sec = 0, tv_usec = 0}, frame_list = {next = 0x0}, flags = 1, ts = 3150328, len = 20, seqno = 20501}, samples = 0, datalen = 0, pvt = 0x7fffd00112f0, outbuf = {c = 0x7fffd0011358 "", uc = 0x7fffd0011358 "", i16 = 0x7fffd0011358, ui8 = 0x7fffd0011358 ""}, plc = 0x0, next = 0x0, nextin = {tv_sec = 1509618132, tv_usec = 716600}, nextout = {tv_sec = 1509618132, tv_usec = 696600}, explicit_dst = 0x11e3f48} (gdb) print cfg $3 = (struct codec_opus_config *) 0x0 (gdb) info locals dec = 0x7fffd00112f0 sample_rate = 48000 ptime = 0 error = 0 cfg = 0x0 __PRETTY_FUNCTION__ = "opus_dec_set" {noformat} By: Richard Mudgett (rmudgett) 2017-11-14 16:11:21.572-0600 I am unable to reproduce the crash. I tried sending a WebRTC call to a playback exten, to an indirect playback exten (using a local channel), and sending the call a SIP phone exten. I can see in the code how a global defaults config might not be initialized if there is a codecs.conf [opus] section but I don't see how it could get used as a result. Could you attach an Asterisk log of a call that crashes with verbose of at least 4 and debug of at least 4. Also codec_opus 1.2.0 was released yesterday. It has a fix that should prevent the crash but not fix why it was trying to use an uninitialized config. By: Torrey Searle (tsearle) 2017-11-17 03:00:22.164-0600 The test call I was doing passes through asterisk twice, I filtered the log on the channel id so you only have the channel with the issue in this log. This channel is receiving a call in opus and performing an echo test. {noformat} [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'AGI' [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:1] AGI("PJSIP/srnd0ser-000000a2", "agi://127.0.0.1:20203/agi_load_balance_handler") in new stack [Nov 17 08:52:55] DEBUG[2430][C-0000009c] res_agi.c: Wow, connected! [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: agi://127.0.0.1:20203/agi_load_balance_handler: [ast13sip01] request for VCC node received [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: agi://127.0.0.1:20203/agi_load_balance_handler: statement : SELECT ms.ip, ms.max_calls FROM callcontrollocal.machine_status ms WHERE ms.hostname LIKE ? AND ms.service = vcc AND IFNULL(ms.force_alive, ms.is_alive) = 1; [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: agi://127.0.0.1:20203/agi_load_balance_handler: selected VCC node (in pop ast13) : 10.1.25.3 [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: <PJSIP/srnd0ser-000000a2>AGI Script agi://127.0.0.1:20203/agi_load_balance_handler completed, returning 0 [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'NoOp' [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:2] NoOp("PJSIP/srnd0ser-000000a2", "using VCC node : 10.1.25.3") in new stack [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx_variables.c: Function PJSIP_HEADER(read,X-VCC-CallType) result is '(null)' [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx_variables.c: Expression result is '0' [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'GotoIf' [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:3] GotoIf("PJSIP/srnd0ser-000000a2", "0?voxfax-init,,1:") in new stack [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx_builtins.c: Not taking any branch [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'AGI' [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [3227470203@default:4] AGI("PJSIP/srnd0ser-000000a2", "agi://10.1.25.3/init") in new stack [Nov 17 08:52:55] DEBUG[2430][C-0000009c] res_agi.c: Wow, connected! [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] res_agi.c: <PJSIP/srnd0ser-000000a2>AGI Script agi://10.1.25.3/init completed, returning 0 [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'Answer' [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [echo@ivrs2:1] Answer("PJSIP/srnd0ser-000000a2", "") in new stack [Nov 17 08:52:55] DEBUG[2430][C-0000009c] res_rtp_asterisk.c: 0x7f3f781151f0 -- Probation learning mode pass with source address 10.1.25.2:12238 [Nov 17 08:52:55] DEBUG[2430][C-0000009c] chan_pjsip.c: Oooh, got a frame with format of opus on channel 'PJSIP/srnd0ser-000000a2' when we're sending 'alaw', switching to match [Nov 17 08:52:55] DEBUG[2430][C-0000009c] channel.c: Channel PJSIP/srnd0ser-000000a2 setting write format path: alaw -> opus [Nov 17 08:52:55] DEBUG[2430][C-0000009c] channel.c: Channel PJSIP/srnd0ser-000000a2 setting read format path: alaw -> opus [Nov 17 08:52:55] DEBUG[2430][C-0000009c] pbx.c: Launching 'Wait' [Nov 17 08:52:55] VERBOSE[2430][C-0000009c] pbx.c: Executing [echo@ivrs2:2] Wait("PJSIP/srnd0ser-000000a2", "4") in new stack [Nov 17 08:52:59] DEBUG[2430][C-0000009c] pbx.c: Launching 'Playback' [Nov 17 08:52:59] VERBOSE[2430][C-0000009c] pbx.c: Executing [echo@ivrs2:3] Playback("PJSIP/srnd0ser-000000a2", "demo-echotest") in new stack [Nov 17 08:52:59] DEBUG[2430][C-0000009c] channel.c: Channel PJSIP/srnd0ser-000000a2 setting write format path: gsm -> opus [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] DEBUG[2430][C-0000009c] channel.c: Scheduling timer at (50 requested / 50 actual) timer ticks per second [Nov 17 08:52:59] VERBOSE[2430][C-0000009c] file.c: <PJSIP/srnd0ser-000000a2> Playing 'demo-echotest.gsm' (language 'en') [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument {noformat} By: Torrey Searle (tsearle) 2017-11-17 03:11:15.933-0600 full log in attachment By: Richard Mudgett (rmudgett) 2017-11-17 15:53:44.979-0600 I've managed to recreate the {{codec_opus.c: Opus: failed to create encoder: invalid argument}} symptoms of what is in the [^log] file. I did it by manually loading codec_opus after Asterisk was up and then did a call scenario similar to what is in the log. Could you attach your {{modules.conf}} file so I can see when you are loading the codec? By: Torrey Searle (tsearle) 2017-11-17 16:04:20.018-0600 I also renamed the opus codec files, to add the arch & version in the filename (mentioning that in case that would also impact loading order) they are installed with the following names codec_opus-13.0_1.1.0-x86_64.so format_ogg_opus-13.0_1.1.0-x86_64.so codec_opus_config-en_US.xml {noformat} [modules] autoload=yes noload => pbx_ael.so noload => pbx_dundi.so noload => pbx_gtkconsole.so noload => pbx_kdeconsole.so noload => chan_sip.so noload => chan_alsa.so noload => chan_iax2.so noload => chan_mgcp.so noload => chan_multicast_rtp.so noload => chan_oss.so noload => chan_phone.so noload => chan_skinny.so noload => chan_unistim.so noload => cdr_adaptive_odbc.so noload => cdr_csv.so noload => cdr_custom.so noload => cdr_manager.so noload => cdr_mysql.so noload => cdr_odbc.so noload => cdr_pgsql.so noload => cdr_radius.so noload => cdr_sqlite.so noload => cdr_sqlite3_custom.so noload => cdr_syslog.so noload => res_config_ldap.so noload => res_config_pgsql.so {noformat} By: Torrey Searle (tsearle) 2017-11-20 07:54:44.076-0600 changing [opus] to [myopus] causes the echo test to work By: Richard Mudgett (rmudgett) 2017-11-30 14:34:42.876-0600 {noformat} [Nov 17 08:52:59] ERROR[2430][C-0000009c] codec_opus.c: Opus: failed to create encoder: invalid argument {noformat} The above error message as well as the ability to use [opus] instead of [myopus] should be fixed by Digium's codec_opus v1.3.0 when it is released. |