
Summary: ASTERISK-25301: asterisk segfault in res_hep_pjsip.so on client connect
Reporter: Michel R. Vaillancourt (jkl5group)
Labels:
Date Opened: 2015-08-03 14:33:03
Date Closed: 2020-01-14 11:13:50.000-0600
Priority: Critical
Regression?:
Status: Closed/Complete
Components: pjproject/pjsip
Versions: 13.4.0
Frequency of Occurrence: Constant
Related Issues:
Environment: PJPROJECT version currently running against: 2.4; Ubuntu 14.04.2 LTS; Digital Ocean VM Droplet
Attachments: ( 0) backtrace.txt
Description: Brand new install, minimum system configuration.

Using Bria X-Lite softphone by CounterPath.

Extension config is:
{noformat}
Mon Aug 03 15:24:50
(97)[switch2 ext_setup]$ ll
total 12K
-rwxrwx--- 1 asterisk asterisk 241 Aug  3 13:32 6001.conf
-rw-rw-r-- 1 jkl5tech asterisk 216 Aug  3 13:26 01_standard_codecs.sip.conf
-rw-rw-r-- 1 jkl5tech asterisk 138 Aug  3 13:25 00_standard_transport.sip.conf
Mon Aug 03 15:27:55
(98)[switch2 ext_setup]$ cat *
; ----- start of file -----
[standard-transports-template](!)
 type=transport
 protocol=udp
 bind=0.0.0.0

; -----  end of file  -----
; ----- start of file -----
[standard-codec-template](!)
 dtmfmode=RFC2833
 disallow=all
 allow=g722:10
 allow=g726:20
 allow=ilbc:25
 allow=ulaw:30
 allow=alaw:30
 allow=gsm:50

; -----  end of file  -----

[6001](standard-transports-template,standard-codec-template)
type=endpoint
context=from-internal
auth=6001
aors=6001

[6001]
type=auth
auth_type=userpass
password=**redacted**
username=6001

[6001]
type=aor
max_contacts=2
; ---- end of file ---
{noformat}

As soon as the client connects to Asterisk, a segfault occurs:
{noformat}
==> /var/log/syslog <==
Jul 31 16:17:42 switch2 kernel: [ 8962.569425] asterisk[23536]: segfault at c ip 00007f1b38909bc4 sp 00007f1b822ba030 error 4 in res_hep_pjsip.so[7f1b388dc000+45000]

==> /var/log/kern.log <==
Jul 31 16:17:42 switch2 kernel: [ 8962.569425] asterisk[23536]: segfault at c ip 00007f1b38909bc4 sp 00007f1b822ba030 error 4 in res_hep_pjsip.so[7f1b388dc000+45000]

==> /var/log/apport.log <==
ERROR: apport (pid 23591) Fri Jul 31 16:17:42 2015: called for pid 23507, signal 11, core limit 18446744073709551615
ERROR: apport (pid 23591) Fri Jul 31 16:17:42 2015: ignoring implausibly big core limit, treating as unlimited
ERROR: apport (pid 23591) Fri Jul 31 16:17:42 2015: executable: /usr/sbin/asterisk (command line "/usr/sbin/asterisk")
ERROR: apport (pid 23591) Fri Jul 31 16:17:42 2015: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 23591) Fri Jul 31 16:17:42 2015: apport: report /var/crash/_usr_sbin_asterisk.1001.crash already exists and unseen, doing nothing to avoid disk usage DoS
{noformat}
Asterisk logs show nothing;  even the client connection is not logged.
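
The kernel line in the syslog excerpt already carries some detail that can be decoded by hand. As a minimal sketch (the regular expression and the helper name `decode_segfault` are illustrative assumptions, not part of any Asterisk or kernel tooling), the faulting offset inside the module is the instruction pointer minus the module's load base, and the x86 page-fault error code can be split into its low three bits:

```python
import re

# Sample line copied from the syslog excerpt in this report.
SAMPLE_LINE = (
    "Jul 31 16:17:42 switch2 kernel: [ 8962.569425] asterisk[23536]: "
    "segfault at c ip 00007f1b38909bc4 sp 00007f1b822ba030 error 4 "
    "in res_hep_pjsip.so[7f1b388dc000+45000]"
)

# Hypothetical pattern for the kernel's user-space segfault message (all numbers are hex).
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<module>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "module": m["module"],
        "fault_address": int(m["addr"], 16),  # address that was accessed
        "module_offset": ip - base,           # where in the .so the fault happened
        "page_present": bool(err & 1),        # bit 0: 0 means the page was not mapped
        "write_access": bool(err & 2),        # bit 1: 0 means the access was a read
        "user_mode": bool(err & 4),           # bit 2: 1 means the fault came from user space
    }


if __name__ == "__main__":
    info = decode_segfault(SAMPLE_LINE)
    print(info["module"], hex(info["module_offset"]), hex(info["fault_address"]))
```

For the line above this gives an offset of 0x2dbc4 into res_hep_pjsip.so and a user-mode read of the unmapped address 0xc, which is consistent with dereferencing a field at a small offset from a NULL pointer.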
Comments:

By: Asterisk Team (asteriskteam) 2015-08-03 14:33:04.975-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Michel R. Vaillancourt (jkl5group) 2015-08-03 15:21:24.570-0500

Backtrace after crash.

By: Michel R. Vaillancourt (jkl5group) 2015-08-03 15:22:56.321-0500

Finally caught something "in the act" at the console when a Segfault occurs:

57)[switch2 asterisk]$ ps -C asterisk u
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
asterisk 21764 63.2  0.8 2964892 71808 pts/2   Sl+  16:10   0:07 /usr/sbin/asterisk -cdfgitTvvvv

*CLI> 16:11:10.250 sip_endpoint.c !Processing incoming message: Request msg REGISTER/cseq=1 (rdata0x373fcc8)
Segmentation fault (core dumped)

By: Richard Mudgett (rmudgett) 2015-08-03 17:37:36.333-0500

A couple of things:
# You are including the {{standard-transports-template}} as part of the {{\[6001\]}} endpoint.  I'm surprised that you are not getting any config error messages when pjsip loads.
# The backtrace is optimized and missing some useful information as a result.

https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace
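
The first point, a section of one {{type}} inheriting a template of another, can be checked mechanically. The following is only a sketch under stated assumptions: the parser is a simplified, hypothetical reading of the `[name](!)` and `[name](template,...)` header syntax used in the report, it strips `;` comments naively, and it keeps only the last value of a repeated option. It is not Asterisk's own config loader.

```python
import re

# Config text condensed from the report (codec lines shortened).
SAMPLE_CONF = """
[standard-transports-template](!)
 type=transport
 protocol=udp
 bind=0.0.0.0

[standard-codec-template](!)
 disallow=all
 allow=ulaw:30

[6001](standard-transports-template,standard-codec-template)
type=endpoint
context=from-internal

[6001]
type=auth
auth_type=userpass
"""

SECTION_RE = re.compile(r"^\[(?P<name>[^\]]+)\](?:\((?P<opts>[^)]*)\))?")


def parse_sections(text):
    """Parse sections into dicts with name, is_template, parents and options."""
    sections, current = [], None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        m = SECTION_RE.match(line)
        if m:
            opts = [o.strip() for o in (m["opts"] or "").split(",") if o.strip()]
            current = {
                "name": m["name"],
                "is_template": "!" in opts,
                "parents": [o for o in opts if o not in ("!", "+")],
                "options": {},
            }
            sections.append(current)
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current["options"][key.strip()] = value.strip()
    return sections


def mismatched_templates(sections):
    """List (section, type, template, template_type) where the types differ."""
    templates = {s["name"]: s for s in sections if s["is_template"]}
    problems = []
    for s in sections:
        own_type = s["options"].get("type")
        for parent in s["parents"]:
            parent_type = templates.get(parent, {}).get("options", {}).get("type")
            if own_type and parent_type and own_type != parent_type:
                problems.append((s["name"], own_type, parent, parent_type))
    return problems


if __name__ == "__main__":
    for problem in mismatched_templates(parse_sections(SAMPLE_CONF)):
        print("section %s (type=%s) inherits %s (type=%s)" % problem)
```

Run against the reporter's configuration, the sketch flags the {{6001}} endpoint inheriting the transport template. The codec template declares no {{type}}, so it is not reported.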

By: Joshua C. Colp (jcolp) 2015-08-04 09:29:02.710-0500

After looking at this, it appears that PJSIP was built with static libraries rather than shared ones, so each module ends up with its own instance of the library. This is a recipe for disaster.

Can you confirm how pjproject was built and ensure that shared libraries are available?

By: Michel R. Vaillancourt (jkl5group) 2015-08-04 09:42:21.518-0500

Followed instructions at https://wiki.asterisk.org/wiki/display/AST/Building+and+Installing+pjproject

If you are seeing this as a static libraries build, I will uninstall and redo from scratch.  I'll get back to you when that is done.

Is there any way I can see for myself via the core dump or other means if the libraries are a static build again for some reason?

By: Joshua C. Colp (jcolp) 2015-08-04 09:45:46.786-0500

Yes. Symbols implemented by pjsip appear to be in the res_hep_pjsip.so module:

{code}
#0  0x00007fac51ee2bc4 in find_entry () from /usr/lib/asterisk/modules/res_hep_pjsip.so
#1  0x00007fac51ee3049 in pj_hash_get_lower () from /usr/lib/asterisk/modules/res_hep_pjsip.so
#2  0x00007fac51ec40bf in pjsip_ua_find_dialog () from /usr/lib/asterisk/modules/res_hep_pjsip.so
{code}

Normally they would be in a libpjsip.so module instead.
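
The same check can be applied to a whole backtrace. This is a rough sketch, assuming gdb-style frame lines like the ones quoted above; the function name, the regular expression, and the prefix list are illustrative assumptions, and a prefix match is only a heuristic (a static helper such as {{find_entry}} carries no prefix and is missed).

```python
import re

# Frames copied from the backtrace excerpt quoted above.
SAMPLE_BACKTRACE = """\
#0  0x00007fac51ee2bc4 in find_entry () from /usr/lib/asterisk/modules/res_hep_pjsip.so
#1  0x00007fac51ee3049 in pj_hash_get_lower () from /usr/lib/asterisk/modules/res_hep_pjsip.so
#2  0x00007fac51ec40bf in pjsip_ua_find_dialog () from /usr/lib/asterisk/modules/res_hep_pjsip.so
"""

# Hypothetical pattern for "#N  0xADDR in func (args) from /path/to/lib.so".
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\S+) \(.*\) from (?P<lib>\S+)"
)

# Assumed naming convention for pjproject's public symbols.
PJ_PREFIXES = ("pj_", "pjsip_", "pjmedia_")


def misplaced_pj_frames(backtrace):
    """Return (frame, function, library) for pj* symbols not resolved from a libpj* file."""
    hits = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None:
            continue
        lib_name = m["lib"].rsplit("/", 1)[-1]
        if m["func"].startswith(PJ_PREFIXES) and not lib_name.startswith("libpj"):
            hits.append((int(m["num"]), m["func"], lib_name))
    return hits


if __name__ == "__main__":
    for num, func, lib in misplaced_pj_frames(SAMPLE_BACKTRACE):
        print("frame #%d: %s resolved from %s" % (num, func, lib))
```

Any hit suggests the pjproject code was linked statically into that Asterisk module rather than loaded from a shared library.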

By: Michel R. Vaillancourt (jkl5group) 2015-08-04 10:33:11.667-0500

In spite of using the following:

./configure --prefix=/usr --enable-shared --disable-sound --disable-resample --disable-video --disable-opencore-amr CFLAGS='-O2 -DNDEBUG'

... based on the "signature" you describe, I'm still getting a static build.  

In fact, I'm noticing that it's also building the sound modules and the video modules.  Essentially, "configure" is acting as though the command line is just "./configure --prefix=/usr"
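
One quick way to see what a build actually installed is to list the pjproject libraries and sort them into shared objects and static archives. This is a sketch only: the function name is hypothetical, the `/usr/lib` location is an assumption derived from the `--prefix=/usr` option above, and the exact file names depend on the platform.

```python
from pathlib import Path


def classify_pj_libraries(lib_dir):
    """Split libpj* files in lib_dir into (shared, static) lists of file names."""
    shared, static = [], []
    for path in sorted(Path(lib_dir).glob("libpj*")):
        name = path.name
        if name.endswith(".a"):
            static.append(name)   # static archive
        elif ".so" in name:
            shared.append(name)   # shared object, possibly versioned (.so.2)
    return shared, static


if __name__ == "__main__":
    # Assumed install location for a build configured with --prefix=/usr.
    shared, static = classify_pj_libraries("/usr/lib")
    print("shared:", shared)
    print("static:", static)
```

If the shared list comes back empty while the static list does not, the `--enable-shared` option did not take effect for that build.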



By: Michel R. Vaillancourt (jkl5group) 2015-08-04 12:18:48.568-0500

At this point, I'm being forced to conclude the issue might be with PJSIP since I am completely unable to get it to build with Shared Libraries.  I'll take this over to them and see what I get there.

By: Rusty Newton (rnewton) 2015-08-04 15:56:37.698-0500

Did you try the uninstall steps linked on this page?

https://wiki.asterisk.org/wiki/display/AST/Building+and+Installing+pjproject#BuildingandInstallingpjproject-UninstallingaPreviousVersionofpjproject



By: Michel R. Vaillancourt (jkl5group) 2015-08-06 10:36:38.449-0500

Hello, Rusty.  Yes, I have.

I've uninstalled both PJSIP and Asterisk, and then recompiled and re-installed both.

I'm going to do this build on another server and see if I get the same behavior.

By: Rusty Newton (rnewton) 2015-08-06 19:20:36.843-0500

Very strange... I'm of course unable to reproduce here on my lab systems. Let us know what you find out.

By: Michel R. Vaillancourt (jkl5group) 2015-08-07 07:03:23.280-0500

I'm setting up another server using the exact same procedure I used for this one to try and reproduce.   I'll let you know what comes of it.

By: Asterisk Team (asteriskteam) 2015-08-21 12:00:20.925-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened, please create a new issue instead. If the new issue is related to this one, a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines