Summary: ASTERISK-24448: Crash on multi-leg call with directmedia=yes using pjsip
Reporter: Dave Fullerton (davton)
Labels:
Date Opened: 2014-10-24 08:38:03
Date Closed: 2014-12-01 09:44:08.000-0600
Priority: Critical
Regression?
Status: Closed/Complete
Components: pjproject/pjsip, Resources/res_pjsip
Versions: 13.0.0-beta3
Frequency of Occurrence: Constant
Related Issues: is duplicated by ASTERISK-24556 Asterisk 13 core dumps when calling from pjsip extension to another pjsip extension
Environment: Slackware 14.0 pjproject 2.3
Attachments:
( 0) asterisk.log
( 1) backtrace.txt
( 2) extensions-A.ael
( 3) extensions-B.ael
( 4) pjsip_show.txt
( 5) pjsip-A.conf
( 6) pjsip-B.conf
Description: When endpoint A registered to Asterisk 1 calls endpoint B registered to Asterisk 2, all via PJSIP, both Asterisk instances will segfault once endpoint B answers when directmedia=yes on all legs of the call. I suspect it crashes when trying to switch to a native bridge.
Comments:

By: Dave Fullerton (davton) 2014-10-24 08:45:34.032-0500

The asterisk.log file has the console output with pjsip debugging enabled. The pjsip_show.txt file shows the transport, the phone endpoint (3817), and the remote PBX endpoint (pbxbeta). The other Asterisk system is configured identically, with the phone endpoint being 3700 and the remote PBX endpoint being holtestpbx.


By: Richard Mudgett (rmudgett) 2014-10-24 10:17:30.273-0500

Unfortunately the backtrace has no symbol information to figure out how Asterisk crashed.  Was Asterisk compiled on the machine where Asterisk crashed so the source is available for the backtrace?

Please follow the instructions on the given link:
https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

By: Dave Fullerton (davton) 2014-10-24 10:56:54.771-0500

Apologies, I followed the instructions on the wiki. Asterisk was built with: DONT_OPTIMIZE, LOADABLE_MODULES, BETTER_BACKTRACES, OPTIONAL_API and I copy-pasted the gdb command line from the wiki. I'm getting a weird message from gdb on second look so I may have a problem with gdb itself. I will investigate and try to upload a new backtrace when I get it figured out.


By: Dave Fullerton (davton) 2014-10-24 13:44:21.231-0500

Here is a new backtrace (backtrace2.txt). Asterisk was compiled on the same machine where the backtrace was generated, and the source files were not touched. I recompiled pjproject with "CFLAGS += -g" in user.mak and recompiled Asterisk as well. I created the backtrace by doing the following:
{quote}
gdb -se asterisk -c core | tee /tmp/backtrace2.txt
(gdb) bt
(gdb) bt full
(gdb) thread apply all bt
(gdb) quit
{quote}
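
The interactive session above can also be run unattended. The following is a minimal sketch, assuming gdb is on the PATH and that the binary and core file are named as in the comment; the function names `gdb_backtrace_command` and `save_backtrace` are hypothetical helpers, not part of Asterisk or gdb. It relies on gdb's `-batch` and `-ex` options to issue the same three backtrace commands and exit.

```python
import subprocess

# gdb commands taken from the comment above.
GDB_COMMANDS = ("bt", "bt full", "thread apply all bt")


def gdb_backtrace_command(binary, core):
    """Build a non-interactive gdb invocation that prints all three backtraces."""
    cmd = ["gdb", "-batch"]          # -batch: run the given commands, then exit
    for gdb_cmd in GDB_COMMANDS:
        cmd += ["-ex", gdb_cmd]      # -ex: execute a single gdb command
    cmd += ["-se", binary, "-c", core]
    return cmd


def save_backtrace(binary, core, out_path):
    """Run gdb and write its combined output to out_path (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    with open(out_path, "w") as out:
        out.write(result.stdout)
    return result.returncode
```

Calling `save_backtrace("asterisk", "core", "/tmp/backtrace2.txt")` would be the rough equivalent of the session quoted above, without the need to type the commands and quit by hand.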

By: Richard Mudgett (rmudgett) 2014-10-24 14:06:08.934-0500

Unfortunately there are still no symbols. :(

By: Dave Fullerton (davton) 2014-10-24 14:21:25.326-0500

I'm sorry. This isn't something I have previous experience with. I will be happy to try whatever you suggest, but for now I'm at the end of what I know how to do.

By: Dave Fullerton (davton) 2014-10-27 15:17:15.869-0500

Turns out the build script I was using was doing a "strip" at the end of it. So I was compiling in all the info and then stripping it back out. I feel like such a fool. Here is a new backtrace that hopefully has what you need.
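
A quick sanity check before generating a backtrace can catch this kind of accidental strip. The sketch below is a rough heuristic only, not a real ELF parser: it assumes the binary is an ELF file and simply looks for the section names `.debug_info` and `.symtab` anywhere in its bytes, which normally appear in the section name table of an unstripped build. The function name `symbol_sections_present` is hypothetical.

```python
# Section names that normally survive only in unstripped ELF binaries.
SYMBOL_MARKERS = (b".debug_info", b".symtab")


def symbol_sections_present(path):
    """Heuristically report which symbol/debug section names appear in an ELF file.

    This scans raw bytes rather than parsing section headers, so it can give
    false positives if the names happen to occur in the file's data.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(b"\x7fELF"):
        raise ValueError("%s is not an ELF file" % path)
    return {marker.decode(): marker in data for marker in SYMBOL_MARKERS}
```

If both entries come back `False` for the installed asterisk binary, the build most likely had its symbols stripped, and a backtrace from it will not be useful.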

By: Rusty Newton (rnewton) 2014-11-03 16:39:27.076-0600

[~davton] thanks for the additional backtrace. That looks like what we need.

Can you also attach the pjsip.conf for each system with dialplan (simplified if needed) so that we can reproduce the issue in-house?

Thanks!

By: Dave Fullerton (davton) 2014-11-04 09:34:03.615-0600

I have uploaded the requested files. I cleaned them up to just the basics. One other thing I noticed is that if a phone on A calls the phone on B, it core dumps 100% of the time. If a phone on B calls a phone on A, there's no core dump, but there is also no audio and the call terminates after 30 seconds. The phones and Asterisk servers are all on the same WAN, with no firewalls between them.

By: Dave Fullerton (davton) 2014-11-04 10:18:31.680-0600

After my last comment I started thinking about the WAN/LAN impacts. I moved server B, as well as all the phones, onto the same LAN as A. The segmentation faults have stopped; however, I still have no audio between the two phones.

By: Joshua C. Colp (jcolp) 2014-12-01 09:43:51.304-0600

I'm closing this out as ASTERISK-24556 is a duplicate and has more information (valgrind output).