Summary: | ASTERISK-27531: Compiler optimizations can break module load sequence. | ||
Reporter: | abelbeck (abelbeck) | Labels: | patch |
Date Opened: | 2017-12-23 16:45:58.000-0600 | Date Closed: | 2017-12-27 19:36:51.000-0600 |
Priority: | Major | Regression? | Yes |
Status: | Closed/Complete | Components: | Core/General |
Versions: | 13.19.0 15.2.0 | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | AstLinux 1.3.x, Linux 3.16.51, cross-compiled | Attachments: | ( 0) asterisk-13-attr-opt-O0-loader.patch ( 1) asterisk-13-loader.patch ( 2) asterisk-13-x86_64-config.log ( 3) ASTERISK-27531.patch ( 4) without-usleep-debug_log_27531.txt ( 5) with-usleep-debug_log_27531.txt |
Description: | No modules load, all module logs of the form ...
{noformat} loader.c:583 load_dlopen: Module 'res_musiconhold.so' did not register itself during load {noformat} | ||
Comments: | By: Asterisk Team (asteriskteam) 2017-12-23 16:45:59.241-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: abelbeck (abelbeck) 2017-12-23 16:51:00.791-0600 Tested patch that fixes the issue. Some small amount of time is needed after calling dlopen() before resource_being_loaded is guaranteed to change. The value "1" for usleep is adequate for a tested 2 GHz 4-core box. By: Corey Farrell (coreyfarrell) 2017-12-23 17:22:17.100-0600 This is odd since ast_module_register should run from the same thread as dlopen, so it should be impossible for dlopen to return before ast_module_register. We will need debug logs to troubleshoot the cause of your issue. In addition: * Please run {{core show settings}} from Asterisk CLI and tell us what your "Build Options:" say. * Please attach the config.log produced when you ran ./configure on the Asterisk source. ---- We require additional debug to continue with triage of your issue. Please follow the instructions on the wiki \[1\] for how to collect debugging information from Asterisk. For expediency, where possible, attach the debug with a '.txt' file extension so that the debug will be usable for further analysis. Thanks! \[1\] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information By: abelbeck (abelbeck) 2017-12-23 17:48:05.842-0600 ...run {{core show settings}} from Asterisk CLI and tell us what your "Build Options:" say. {noformat} PBX Core settings ----------------- Version: 13.19.0-rc1 Build Options: OPTIONAL_API {noformat} After my current test build is finished, I'll post the config.log . By: abelbeck (abelbeck) 2017-12-23 19:29:51.665-0600 Attachment: Build system generated config.log as asterisk-13-x86_64-config.log By: Corey Farrell (coreyfarrell) 2017-12-23 19:35:57.540-0600 Please collect a debug log. By: abelbeck (abelbeck) 2017-12-23 19:40:02.854-0600 This issue was discovered on a x86_64 system. For an additional data point, I built an image using an i586 toolchain for a 1-core 500 Mhz AMD Geode LX (Soekris net5501). The exact same results ... without the {{usleep(1)}} no modules loaded, but with the {{usleep(1)}} the modules loaded just fine. Corey, if ast_module_register is indeed running in the same thread as dlopen, could this be a compiler optimization that is causing the problem ? By: Corey Farrell (coreyfarrell) 2017-12-23 19:56:17.130-0600 Could be compiler optimizations, you can see if DONT_OPTIMIZE (without the usleep) works around it but that will have a negative impact on performance. *We still need debug logs.* I won't be responding further until they're posted. By: abelbeck (abelbeck) 2017-12-23 20:22:25.874-0600 Attachment: without-usleep-debug_log_27531.txt (fails) Attachment: with-usleep-debug_log_27531.txt (succeeds) By: Corey Farrell (coreyfarrell) 2017-12-23 21:53:51.620-0600 I've acknowledged this issue, I still need more help to determine the best solution. I don't think usleep is the solution. I suspect usleep is just causing the compiler to disable an incorrect optimization. A couple more things: * Does setting {{DONT_OPTIMIZE}} work around the issue? * What do I need to setup a build VM? Can this issue be reproduced without cross-compiling? By: abelbeck (abelbeck) 2017-12-23 22:55:31.210-0600 Attachment: New patch to disable optimizations for function load_dlopen() Tested and it solves this issue. By: Corey Farrell (coreyfarrell) 2017-12-23 22:56:32.395-0600 [~abelbeck]: Could you try [^ASTERISK-27531.patch]? Please use your original build settings and do not add the usleep. By: abelbeck (abelbeck) 2017-12-23 22:59:03.440-0600 I tested by building with {{--enable DONT_OPTIMIZE}} and the problem goes away, but not a solution. Alternatively, I tested with the attached {{asterisk-13-attr-opt-O0-loader.patch}} patch, which solves the problem. By: abelbeck (abelbeck) 2017-12-23 23:02:09.556-0600 I first tried a similar patch as [^ASTERISK-27531.patch], but the compiler complained {{volatile}} does not work with pointers, but I will try your exact patch to be sure, stay tuned ... By: abelbeck (abelbeck) 2017-12-23 23:16:21.230-0600 Corey, your [^ASTERISK-27531.patch] also solves the problem. (My test had the {{volatile}} left of the {{*}} and did not work) Your patch looks like a solution to me. Thanks ! By: Corey Farrell (coreyfarrell) 2017-12-23 23:19:47.172-0600 I'll get this submitted for review, thanks for testing. By: Friendly Automation (friendly-automation) 2017-12-27 19:36:52.348-0600 Change 7733 merged by Jenkins2: loader: Add volatile to resource_being_loaded. [https://gerrit.asterisk.org/7733|https://gerrit.asterisk.org/7733] By: Friendly Automation (friendly-automation) 2017-12-27 19:44:33.797-0600 Change 7734 merged by Jenkins2: loader: Add volatile to resource_being_loaded. [https://gerrit.asterisk.org/7734|https://gerrit.asterisk.org/7734] By: Friendly Automation (friendly-automation) 2017-12-27 19:46:50.601-0600 Change 7735 merged by Jenkins2: loader: Add volatile to resource_being_loaded. [https://gerrit.asterisk.org/7735|https://gerrit.asterisk.org/7735] By: Friendly Automation (friendly-automation) 2017-12-27 21:09:41.416-0600 Change 7736 merged by Jenkins2: loader: Add volatile to resource_being_loaded. [https://gerrit.asterisk.org/7736|https://gerrit.asterisk.org/7736] By: Friendly Automation (friendly-automation) 2017-12-27 23:01:00.438-0600 Change 7737 merged by Richard Mudgett: loader: Add volatile to resource_being_loaded. [https://gerrit.asterisk.org/7737|https://gerrit.asterisk.org/7737] |