[Home]

Summary:ASTERISK-28319: musl: Crash on startup when loading modules
Reporter:Sebastian Kemper (skemper)Labels:pjsip
Date Opened:2019-03-02 11:58:45.000-0600Date Closed:2019-04-19 08:34:05
Priority:MinorRegression?Yes
Status:Closed/CompleteComponents:General
Versions:16.2.1 Frequency of
Occurrence
Constant
Related
Issues:
Environment:OpenWrt trunk with musl libc 1.1.21Attachments:
Description:Hello Asterisk team,

I updated asterisk 15 to 16 on OpenWrt. After getting it to compile I saw that there was a segmentation fault every time I started the PBX. It happened in main/loader.c. I got help on the musl list: https://www.openwall.com/lists/musl/2019/02/28/4

This doesn't happen with Asterisk 15. But in Asterisk 16 I see this every time I start asterisk:

Asterisk PBX Core Initializing
 == Registering builtin functions:
 == Registered custom function 'EXCEPTION'
 == Registered custom function 'TESTTIME'
 == Manager registered action ShowDialPlan
 == Manager registered action ExtensionStateList
 == Registered application 'Answer'
 == Registered application 'BackGround'
 == Registered application 'Busy'
 == Registered application 'Congestion'
 == Registered application 'ExecIfTime'
 == Registered application 'Goto'
 == Registered application 'GotoIf'
 == Registered application 'GotoIfTime'
 == Registered application 'ImportVar'
 == Registered application 'Hangup'
 == Registered application 'Incomplete'
 == Registered application 'NoOp'
 == Registered application 'Proceeding'
 == Registered application 'Progress'
 == Registered application 'RaiseException'
 == Registered application 'Ringing'
 == Registered application 'SayAlpha'
 == Registered application 'SayAlphaCase'
 == Registered application 'SayDigits'
 == Registered application 'SayNumber'
 == Registered application 'SayPhonetic'
 == Registered application 'SetAMAFlags'
 == Registered application 'Wait'
 == Registered application 'WaitDigit'
 == Registered application 'WaitExten'
 == Registered application 'Set'
 == Registered application 'MSet'
 == Registered channel type 'Local' (Local Proxy Channel Driver)
 == Manager registered action LocalOptimizeAway
Asterisk Dynamic Loader Starting:
[Mar  2 17:51:37] NOTICE[19567]: loader.c:2230 load_modules: 91 modules will be loaded.
Segmentation fault
root@hank2:/tmp#

Backtrace:

(gdb)

Thread 1 "asterisk" hit Breakpoint 1, load_dlopen (resource_in=0x77d8f52b <parseHhMmSs+578> "D\350\240e", resource_in@...ry=0x5fa910 "res_pjproject.so", so_ext=0x0,
   so_ext@...ry=0x53b91c "", filename=0x5fa910 "res_pjproject.so", filename@...ry=0x7fff7a04 "/usr/lib/asterisk/modules/res_pjproject.so", flags=796226418,
   flags@...ry=258, suppress_logging=suppress_logging@...ry=0) at loader.c:952
952 if (resource_being_loaded) {
(gdb)
951 mod->lib = dlopen(filename, flags);
(gdb)
952 if (resource_being_loaded) {
(gdb)
955 const char *dlerror_msg = ast_strdupa(dlerror());
(gdb)

Thread 1 "asterisk" received signal SIGSEGV, Segmentation fault.
strlen (s=0x0, s@...ry=0x48d79d <load_dynamic_module+120> "\t\360\"\223\f\234\200\353\216#\005\032\240z\364e") at src/string/strlen.c:17
17 for (w = (const void *)s; !HASZERO(*w); w++);
(gdb) bt
#0  strlen (s=0x0, s@...ry=0x48d79d <load_dynamic_module+120> "\t\360\"\223\f\234\200\353\216#\005\032\240z\364e") at src/string/strlen.c:17
#1  0x0048d5db in load_dlopen (resource_in=0x77d8f52b <parseHhMmSs+578> "D\350\240e", resource_in@...ry=0x5fa910 "res_pjproject.so", so_ext=0x0,
   so_ext@...ry=0x53b91c "", filename=0x5fa910 "res_pjproject.so", filename@...ry=0x7fff7a04 "/usr/lib/asterisk/modules/res_pjproject.so", flags=796226418,
   flags@...ry=258, suppress_logging=suppress_logging@...ry=0) at loader.c:955
#2  0x0048d79d in load_dynamic_module (resource_in=resource_in@...ry=0x5fa910 "res_pjproject.so", suppress_logging=suppress_logging@...ry=1) at loader.c:1039
#3  0x0048eea3 in load_resource (resource_name=0x5fa910 "res_pjproject.so", suppress_logging=suppress_logging@...ry=1,
   module_priorities=module_priorities@...ry=0x7fff8c24, required=0, preload=0) at loader.c:1635
#4  0x0048f5e1 in load_resource_list (mod_count=<synthetic pointer>, load_order=0x7fff8c1c) at loader.c:1984
#5  load_modules () at loader.c:2232
#6  0x0042c99d in asterisk_daemon (isroot=<optimized out>, rungroup=<optimized out>, runuser=<optimized out>) at asterisk.c:4146
#7  main (argc=<optimized out>, argv=<optimized out>) at asterisk.c:3918
(gdb)

On the musl list it was explained to me that what happens is that dlerror() returned 0 and ast_strdupa calls strlen on this return value which segfaults as expected.

I tried replacing

const char *dlerror_msg = ast_strdupa(dlerror());

with

const char *dlerror_msg = dlerror(); dlerror_msg = ast_strdupa(dlerror_msg ? dlerror_msg : "");

This indeed stopped the segmentation fault. But the general opinion on the musl list is that this is just fixing the crash, not the underlying general issue.

Here's the boot log with this:

Asterisk PBX Core Initializing
 == Registering builtin functions:
 == Registered custom function 'EXCEPTION'
 == Registered custom function 'TESTTIME'
 == Manager registered action ShowDialPlan
 == Manager registered action ExtensionStateList
 == Registered application 'Answer'
 == Registered application 'BackGround'
 == Registered application 'Busy'
 == Registered application 'Congestion'
 == Registered application 'ExecIfTime'
 == Registered application 'Goto'
 == Registered application 'GotoIf'
 == Registered application 'GotoIfTime'
 == Registered application 'ImportVar'
 == Registered application 'Hangup'
 == Registered application 'Incomplete'
 == Registered application 'NoOp'
 == Registered application 'Proceeding'
 == Registered application 'Progress'
 == Registered application 'RaiseException'
 == Registered application 'Ringing'
 == Registered application 'SayAlpha'
 == Registered application 'SayAlphaCase'
 == Registered application 'SayDigits'
 == Registered application 'SayNumber'
 == Registered application 'SayPhonetic'
 == Registered application 'SetAMAFlags'
 == Registered application 'Wait'
 == Registered application 'WaitDigit'
 == Registered application 'WaitExten'
 == Registered application 'Set'
 == Registered application 'MSet'
 == Registered channel type 'Local' (Local Proxy Channel Driver)
 == Manager registered action LocalOptimizeAway
Asterisk Dynamic Loader Starting:
[Mar  2 17:45:41] NOTICE[18734]: loader.c:2230 load_modules: 91 modules will be loaded.
Loading extconfig.
[ Initializing Custom Configuration Options ]
 == extconfig => (Configuration)
Loading logger.
 == logger => (Logger)
Loading res_sorcery_astdb.so.

<snip>

Loading codec_alaw.so.
 == Registered translator 'alawtolin' from codec alaw to slin, table cost, 900000, computational cost 845
 == Registered translator 'lintoalaw' from codec slin to alaw, table cost, 600000, computational cost 1064
 == codec_alaw.so => (A-law Coder/Decoder)
[Mar  2 17:45:43] WARNING[18734]: loader.c:2234 load_modules: Some non-required modules failed to load.
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_pjproject.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_pjsip.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_sorcery_astdb.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_sorcery_realtime.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_http_websocket.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_pjsip_config_wizard.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_crypto.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_pjsip_pubsub.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_pjsip_session.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_sorcery_memory.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_sorcery_config.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: Module 'res_pjsip_outbound_publish.so' did not register itself during load
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: res_pjsip_transport_websocket declined to load.
[Mar  2 17:45:43] ERROR[18734]: loader.c:2249 load_modules: cdr_csv declined to load.
Asterisk Ready.
Asterisk cleanly ending (0).
Executing last minute cleanups
 == Manager unregistered action DBGet
 == Manager unregistered action DBPut
 == Manager unregistered action DBDel
 == Manager unregistered action DBDelTree
root@hank2:~#

The trailing errors ("did not register itself during load") I'm not sure of. I get these as will on x86_64 with glibc with some modules and they don't seem to cause any harm, most of the modules seem to load after all and work fine.

Kind regards,
Seb
Comments:By: Asterisk Team (asteriskteam) 2019-03-02 11:58:45.872-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Joshua C. Colp (jcolp) 2019-03-04 09:06:45.387-0600

I've accepted this issue and it is now open. The use of musl as a library is community supported, however.

By: Sebastian Kemper (skemper) 2019-03-11 12:28:13.342-0500

Hi Joshua,

I needed some time to wrap my head around this. Yesterday I finally understood Rich's reply on the musl list. So dlopen on musl is permanent. dlclose is a no-op (shared objects are only closed once the whole application is shut down). And the constructor is only executed once, not every time dlopen is called.

As a workaround I thought that I could manually start ast_module_register after the 2nd dlopen.

--- a/main/loader.c
+++ b/main/loader.c
@@ -949,6 +949,18 @@ static struct ast_module *load_dlopen(co

       resource_being_loaded = mod;
       mod->lib = dlopen(filename, flags);
+#if !defined(__UCLIBC__) && !defined(__GLIBC__)
+       if (mod->lib && flags==(RTLD_NOW | RTLD_GLOBAL)) {
+               const struct ast_module_info *musl_info;
+               const char *musl_dlerror_msg;
+               dlerror();
+               musl_info = (const struct ast_module_info *) dlsym(mod->lib, "__mod_info");
+               musl_dlerror_msg = dlerror();
+               if (musl_info) {
+                       ast_module_register(musl_info);
+               }
+       }
+#endif
       if (resource_being_loaded) {
               struct ast_str *list;
               int c = 0;

But the __mod_info symbol is not exported by the shared objects, so dlsym never finds it. :-D

Anyway, thought I'd add this to this item.

Kind regards,
Seb

By: Friendly Automation (friendly-automation) 2019-04-19 08:34:06.795-0500

Change 11228 merged by Friendly Automation:
loader: support for permanent dlopen()

[https://gerrit.asterisk.org/c/asterisk/+/11228|https://gerrit.asterisk.org/c/asterisk/+/11228]

By: Friendly Automation (friendly-automation) 2019-04-19 09:07:56.438-0500

Change 11217 merged by Friendly Automation:
loader: support for permanent dlopen()

[https://gerrit.asterisk.org/c/asterisk/+/11217|https://gerrit.asterisk.org/c/asterisk/+/11217]