Summary: ASTERISK-27321: Asterisk Crashing with FRACK Errors and Serious Network Trouble
Reporter: Steven Sedory (stevensedory)
Labels: fax pjsip
Date Opened: 2017-10-06 11:28:22
Date Closed:
Priority: Critical
Regression?
Status: Open/New
Components: Channels/chan_sip/General
Versions: 13.17.0
Frequency of Occurrence: Frequent
Related Issues:
is related to ASTERISK-27300 Asterisk crashes randomly (FRACK!, chan_sip)
is related to ASTERISK-27412 core: Audiohook freeing interpolated frame when it shouldn't.
Environment: FreePBX 13.0.192.16 and Asterisk 13.17.0, proxmox 4.4 on Dell R720, local RAID volume. Using TCP and obscure port for SIP. UDP 5060 still open/enabled, but firewalled to only allow Anveo Direct servers.
Attachments:
( 0) backtrace.txt
( 1) backtrace-2017-10-25.txt
( 2) backtrace-with-debuginfo.txt
( 3) best-backtrace.txt
( 4) core.2018-01-26T14-36-26-0800-brief.txt
( 5) core.2018-01-26T14-36-26-0800-full.txt
( 6) core.2018-01-26T14-36-26-0800-locks.txt
( 7) core.2018-01-26T14-36-26-0800-thread1.txt
( 8) full
Description: Running FreePBX 13.0.192.16 and Asterisk 13.17.0

I have previously posted about this issue in the FreePBX and Asterisk forums. Here are those links:

https://community.asterisk.org/t/asterisk-freepbx-crashing-and-frack-errors/72159
https://community.freepbx.org/t/consistent-asterisk-freepbx-crash-issue/43682/1

Host: Dell R720 with 2x Xeon E5-2620 2.00GHz (6 Core) and 64GB DDR3 ECC RAM, local PERC storage.

Hypervisor: Proxmox 4.4-1.

Network: using onboard Quad NIC. Bridge “vmbr0” points to “bond0” as the bridge port, and bond0 has eth0 and eth1 in it in “active-backup” mode, each going to one of our two core switches. Using Cisco 3560G. Switch ports are in trunk mode, with native vlan set to our management vlan. VMs are tagged to our public facing vlan, for direct internet access.

VMs are running FreePBX/Asterisk versions mentioned above. Each have 4GB RAM fixed with ballooning disabled, 4 cores (2 sockets, 2 cores; have tried with NUMA enabled and disabled) with type “Default (kvm64)”, NIC using E1000 model, vdisk is 300G presented as ide0 as a raw image on a local LVM-Thin volume.

Endpoints: All endpoints are NAT’d. We use TCP for SIP with an obscure port (not 5060 or near that). RTP traffic on our VSP’s required port range is allowed as well. All other traffic is dropped per the FreePBX firewall.

In summary, what is happening is that we get a bunch of errors like this:

[2017-09-28 02:05:18] ERROR[7061] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)
[2017-09-28 02:05:24] ERROR[6934] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)
[2017-09-28 02:05:28] ERROR[7107] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)

and right before and after, we have most of our peers go unreachable. Sometimes Asterisk will crash afterwards, sometimes not.
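
One thing visible in the three lines above is that the same object address is reported by three different thread ids. A minimal sketch of how that grouping could be pulled out of a larger log, assuming Python 3 and the exact "bad magic number" line format quoted above (the function name `group_fracks_by_object` is illustrative and not part of Asterisk):

```python
import re
from collections import defaultdict

# Matches the "bad magic number" FRACK lines in the format quoted above.
FRACK_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] ERROR\[(?P<thread>\d+)\] astobj2\.c: FRACK!, "
    r"Failed assertion bad magic number (?P<magic>0x[0-9a-fA-F]+) "
    r"for object (?P<obj>0x[0-9a-fA-F]+)"
)

def group_fracks_by_object(lines):
    """Map each object address to the (timestamp, thread id) pairs that reported it."""
    hits = defaultdict(list)
    for line in lines:
        m = FRACK_RE.match(line)
        if m:
            hits[m.group("obj")].append((m.group("ts"), int(m.group("thread"))))
    return dict(hits)

# The three lines from the report, used as sample input.
sample = [
    "[2017-09-28 02:05:18] ERROR[7061] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)",
    "[2017-09-28 02:05:24] ERROR[6934] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)",
    "[2017-09-28 02:05:28] ERROR[7107] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)",
]

by_object = group_fracks_by_object(sample)
for obj, entries in by_object.items():
    threads = sorted({thread for _, thread in entries})
    print(f"{obj}: {len(entries)} hits from threads {threads}")
```

Grouping this way only summarizes what the log already says; it does not identify which code path is touching the object.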

The issue happens intermittently, but seems to happen more frequently on the VMs that have more peers/endpoints (100+). I don’t think we’ve had it happen on any VMs that had less than 100 peers/endpoints.

We recently chopped a server that had about 130 endpoints into two of 110 and 20. More accurately, we moved 110 off server A to server B, leaving 20 on server A. Before that move, we were experiencing FRACK! errors every day (anywhere from 20-300, usually all within a 20 minute window or so). Once the 110 were moved to server B, server A has never again had FRACK! errors or Asterisk crashes. Server B however is having them now, just much less often than when all 130 endpoints were on server A. My assumption is that this is due to the slightly lower endpoint total on the VM.

This morning was one of those instances. We had 193 errors, identical to the three I posted above (minus the ERROR[number] being different). AND, we had a crash afterwards. Here is the backtrace: http://pastebin.freepbx.org/view/8cccc15f2

So I come to you, the asterisk community, for help. I first posted on the FreePBX forum, and was directed here.

I understand this may point to a memory issue, but what is strange is that the Dell iDrac log doesn’t show any memory errors in it. Perhaps there are errors but iDrac just isn’t seeing them to report them. I’m hoping someone out there can parse through the backtrace and give me a clear answer to what the problem is. Thanks in advance.
Comments:
By: Asterisk Team (asteriskteam) 2017-10-06 11:28:22.854-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Steven Sedory (stevensedory) 2017-10-06 11:31:02.995-0500

Note that there are links to crash dumps, and backtraces on the above urls.

I think that the "Serious Network Trouble" errors are what cause the FRACK errors, which then cause asterisk to crash.

By: Steven Sedory (stevensedory) 2017-10-09 19:13:39.298-0500

How long does this typically take to get assigned?

By: Joshua C. Colp (jcolp) 2017-10-09 19:44:17.314-0500

The chan_sip module is community supported and there is no current active maintainer. There is no timeframe on when this issue would get assigned or worked.

By: Steven Sedory (stevensedory) 2017-10-10 12:58:24.868-0500

Would you then suggest we switch to pjsip? The main reason I feel uncomfortable doing so is that we had issues with it in the past, and in FreePBX, it says under the sip driver selector "The chan_pjsip channel driver is considered "experimental" with known issues and does not work on Asterisk 11 or lower."

By: Richard Mudgett (rmudgett) 2017-10-10 13:28:35.771-0500

Of course chan_pjsip doesn't work on Asterisk 11 or lower.  It doesn't exist in those versions.  chan_sip became extended support when Asterisk 13 was released.  If you could ever consider chan_pjsip "experimental" it would have been in Asterisk 12 when it first appeared.  Since then chan_pjsip has steadily improved.

By: Steven Sedory (stevensedory) 2017-10-10 13:56:13.235-0500

Thanks for the quick replies. Very appreciated.

So, us being on Asterisk 13.17.0, and having the issues mentioned above, do you think switching to chan_pjsip is a good next troubleshooting step?

By: Richard Mudgett (rmudgett) 2017-10-10 14:44:27.754-0500

Well, switching to chan_pjsip isn't really a trouble shooting step, more like trouble avoidance.

The FRACK messages are only telling us that someone is using an ao2 object after it is destroyed and nothing else.  Depending upon how Asterisk is compiled, there could also be a small backtrace in the log showing who is trying to use it.  To get the best information for that backtrace you need to enable DONT_OPTIMIZE and BETTER_BACKTRACES in menuselect before compiling Asterisk.

A "Serious Network Trouble" log message is chan_sip specific.  I think it means that chan_sip couldn't send a packet with the socket.

The link to the backtrace pastebin is broken likely because it expired.  This is why the issue guidelines [1] want you to *attach* backtraces to the issue.

The number in "ERROR[number]" is a thread id number and not an error code.  You can grep through the log for that number to see what else that thread has done.

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines
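
The "grep for the thread id" suggestion above can also be done with a small script when the surrounding lines need further processing. This is a rough sketch, assuming Python 3 and log lines that begin with `[timestamp] LEVEL[thread-id]` as in the report; `lines_for_thread` is a hypothetical helper name, and the sample input is just the three lines from the description:

```python
import re

# Leading "[timestamp] LEVEL[thread-id]" prefix of an Asterisk log line.
THREAD_RE = re.compile(r"^\[[^\]]+\] [A-Z]+\[(\d+)\]")

def lines_for_thread(lines, thread_id):
    """Return only the log lines written by the given thread id."""
    selected = []
    for line in lines:
        m = THREAD_RE.match(line)
        if m and int(m.group(1)) == thread_id:
            selected.append(line)
    return selected

sample = [
    "[2017-09-28 02:05:18] ERROR[7061] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)",
    "[2017-09-28 02:05:24] ERROR[6934] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)",
    "[2017-09-28 02:05:28] ERROR[7107] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x3e7c690 (0)",
]

for line in lines_for_thread(sample, 7061):
    print(line)

# Against a real log file (path assumed; adjust to the local log directory):
# with open("/var/log/asterisk/full") as f:
#     for line in lines_for_thread(f, 7061):
#         print(line, end="")
```

Matching on the bracketed prefix rather than a bare number avoids false hits where the same digits appear in an address or port.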

By: Steven Sedory (stevensedory) 2017-10-10 15:11:08.882-0500

Sorry about that. I'm in a bit of a shotgun approach with this as it is really hurting us. I apologize for overlooking the need to attach.

That said, backtrace from the core dump referenced above attached. Any insight you can provide would be greatly appreciated.

The reason I asked about going to chan_pjsip as a next step is that it seems to me these Serious Network Trouble errors, once they become frequent or concentrated, cause the FRACKs, which in turn cause Asterisk to crash or become unresponsive. Do you think that the FRACKs are because of the Serious Network Trouble errors?

And regarding recompiling, we are using the FreePBX distro, and would like to continue to. I will have to do some lab work recompiling before I do it with any of these production VMs, as it is something I've never done before.

By: Steven Sedory (stevensedory) 2017-10-10 15:46:05.144-0500

So after some more research, found the following article. It's from a much older version of asterisk, but mentioned KVM.

It says: In case you wonder why on earth is asterisk failing with a segmentation fault after a fresh install (compilation by hand from sources) – and you happen to run it in a KVM virtual machine – then the answer is pretty easy: make sure you run make menuconfig before you start the compilation and remove the compilation flag called “BUILD NATIVE”. Once you do that asterisk will run normally.

I just find it strange that I haven't heard this from anyone else. Thoughts?

By: Richard Mudgett (rmudgett) 2017-10-10 16:43:25.600-0500

Unfortunately, the [^backtrace.txt] file doesn't have any symbolic information.  You would have to build asterisk yourself to be able to get symbolic information for the next crash.

I cannot say if the FRACKs are related to the "Serious Network Trouble" messages.

BUILD_NATIVE lets the compiler use instructions specific to the machine it is compiled on.  If you then copy the executable to another machine with a different flavor processor it may crash because it doesn't implement an instruction.  Disabling BUILD_NATIVE makes the compiler use only generic instructions supported by x86 platforms.

By: Steven Sedory (stevensedory) 2017-10-10 17:40:11.259-0500

Thanks Richard, this is good info.

So given that I'm using the FreePBX distro, which is an ISO that installs CentOS and Asterisk and FreePBX all preconfigured, there's no way it's being set up in such a way to best support my KVM vCPUs, correct? Meaning, there's a possibility that their distro isn't set up to support KVM well, and I would have to compile Asterisk on my own FROM THE KVM VM (vs from somewhere else). Am I on the right track here, you think?

Next step is to confirm it's okay to install their distro on KVM. I automatically assumed so, as they rave KVM as the first best hypervisor for FreePBX, followed by VMWare.  

Oh, and here's our build info:
{noformat}
PBX Core settings
-----------------
 Version:                     13.17.0
 Build Options:               DONT_OPTIMIZE, COMPILE_DOUBLE, OPTIONAL_API
 Maximum calls:               Not set
 Maximum open file handles:   193350
 Root console verbosity:      3
 Current console verbosity:   0
 Debug level:                 0
 Maximum load average:        0.000000
 Minimum free memory:         0 MB
 Startup time:                21:20:34
 Last reload time:            10:04:56
 System:                      Linux/2.6.32-642.6.2.el6.x86_64 built by mockbuild on x86_64 2017-07-27 17:42:42 UTC
 System name:
 Entity ID:                   52:77:c4:10:09:7d
 PBX UUID:                    22d4fa19-2724-4df7-9f24-dfd30eabfb09
 Default language:            en
 Language prefix:             Enabled
 User name and group:         /
 Executable includes:         Enabled
 Transcode via SLIN:          Enabled
 Transmit silence during rec: Enabled
 Generic PLC:                 Disabled
 Min DTMF duration::          80
 RTP dynamic payload types:   96-127

* Subsystems
 -------------
 Manager (AMI):               Enabled
 Web Manager (AMI/HTTP):      Disabled
 Call data records:           Enabled
 Realtime Architecture (ARA): Disabled

* Directories
 -------------
 Configuration file:
 Configuration directory:     /etc/asterisk
 Module directory:            /usr/lib64/asterisk/modules
 Spool directory:             /var/spool/asterisk
 Log directory:               /var/log/asterisk
 Run/Sockets directory:       /var/run/asterisk
 PID file:                    /var/run/asterisk/asterisk.pid
 VarLib directory:            /var/lib/asterisk
 Data directory:              /var/lib/asterisk
 ASTDB:                       /var/lib/asterisk/astdb
 IAX2 Keys directory:         /var/lib/asterisk/keys
 AGI Scripts directory:       /var/lib/asterisk/agi-bin
{noformat}


By: Richard Mudgett (rmudgett) 2017-10-10 18:08:28.740-0500

I have never installed nor used FreePBX.

{quote}
Build Options:               DONT_OPTIMIZE, COMPILE_DOUBLE, OPTIONAL_API
{quote}
BUILD_NATIVE is not listed in the build options you showed.

By: Steven Sedory (stevensedory) 2017-10-10 18:11:30.439-0500

In other words, the post I referenced above wouldn't make any difference for me to follow, as that's already the case, correct?

By: Richard Mudgett (rmudgett) 2017-10-10 18:55:39.600-0500

BUILD_NATIVE is not listed so it is not enabled so it should then work in KVM according to the post you referenced.

To get any useful backtraces from FRACKs you would need to enable BETTER_BACKTRACES.

To get any useful backtraces from a crash you need symbolic debug information in the executable.  I'm not sure what is specifically needed to get that as part of the build though.  I used the contrib/scripts/install_prereq script which gets the dependencies to build most modules.

By: Corey Farrell (coreyfarrell) 2017-10-10 19:03:54.473-0500

To get backtraces from a crash you should be able to {{yum install asterisk-debuginfo}} (assuming the asterisk RPM is named {{asterisk}}).  This will install symbols so you can load a core dump into gdb.

By: Steven Sedory (stevensedory) 2017-10-10 19:33:47.353-0500

Hi Corey, can I run this command on a FreePBX distro without jacking anything up, to your knowledge?

Also, will I have to wait for another crash, or will I be able to generate a useful backtrace with the crash dump I already have after running the above command?

By: Corey Farrell (coreyfarrell) 2017-10-10 19:57:37.010-0500

If you haven't upgraded Asterisk since the crash dump happened then you should be able to just install the debuginfo for the matching version and it'll allow you to get a backtrace.  Installing the debuginfo wouldn't cause any problems on CentOS.   I've never used FreePBX so I can't say for sure but it should be fine.  I suggest asking FreePBX support/community about installing debuginfo & getting a backtrace.

By: Steven Sedory (stevensedory) 2017-10-10 21:25:36.502-0500

So, to be clear, I can already produce a backtrace (one is attached). But what you've mentioned will cause the backtrace to have the useful symbols?

By: Corey Farrell (coreyfarrell) 2017-10-11 07:29:09.663-0500

Yes, it will replace {{0x00000000005ef6bb in ??}} with function/source info and give some info from local variables from the thread that crashed.

Note if yum says it can't find asterisk-debuginfo you will need to ask FreePBX how to get the debuginfo package.  Sometimes debuginfo is put into separate repositories to avoid installing it for {{yum install asterisk*}}.

By: Steven Sedory (stevensedory) 2017-10-11 08:59:05.163-0500

Great, thank you. Will work on getting this installed asap.

By: Steven Sedory (stevensedory) 2017-10-11 10:44:36.171-0500

So installed it by running the following per https://wiki.freepbx.org/display/SUP/Providing+Great+Debug
yum install pjproject-debuginfo asterisk13-debuginfo

But when I ran the following to get the backtrace, it spit out these errors, and the exact same backtrace file: gdb -se "asterisk" -ex "bt full" -ex "thread apply all bt" --batch -c /tmp/core.v12.ringvertical.com-2017-09-28T02\:05\:33-0700 > /tmp/better-backtrace2.txt

Do I need to restart asterisk or the server before it will work correctly?

warning: the debug information found in "/usr/lib/debug//usr/lib64/asterisk/modules/format_jpeg.so.debug" does not match "/usr/lib64/asterisk/modules/format_jpeg.so" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug/usr/lib64/asterisk/modules/format_jpeg.so.debug" does not match "/usr/lib64/asterisk/modules/format_jpeg.so" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug//usr/lib64/asterisk/modules/res_fax_spandsp.so.debug" does not match "/usr/lib64/asterisk/modules/res_fax_spandsp.so" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug/usr/lib64/asterisk/modules/res_fax_spandsp.so.debug" does not match "/usr/lib64/asterisk/modules/res_fax_spandsp.so" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug//usr/lib64/asterisk/modules/res_ari_events.so.debug" does not match "/usr/lib64/asterisk/modules/res_ari_events.so" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug/usr/lib64/asterisk/modules/res_ari_events.so.debug" does not match "/usr/lib64/asterisk/modules/res_ari_events.so" (CRC mismatch).

By: Corey Farrell (coreyfarrell) 2017-10-11 11:05:45.280-0500

Restarting won't help.  This looks like it's saying that the asterisk binary package does not match the asterisk debuginfo package.  My guess is that this is a packaging bug with FreePBX.  The debuginfo packages are extracted from the binary packages, and it's only usable if the CRC of each file matches what is expected.

By: Steven Sedory (stevensedory) 2017-10-11 12:17:54.954-0500

bummer. Will do some searching around for a fix.

By: Steven Sedory (stevensedory) 2017-10-12 19:29:13.817-0500

Just attached "best-backtrace.txt".

It didn't give the same CRC mismatch error after I updated Asterisk to 13.17.1. However, someone on the FreePBX forums said that this backtrace won't be helpful anymore as I updated Asterisk and will have to wait for another crash.

True or false? Also, is there any info in this new one that helps?

By: Richard Mudgett (rmudgett) 2017-10-12 19:37:51.854-0500

Sorry.  The new backtrace isn't any help.

By: Steven Sedory (stevensedory) 2017-10-15 11:36:49.833-0500

So we were able to cause a crash by slamming one of the servers with SIPp.

The FRACK Errors before the crash are different, but hopefully related.

backtrace-with-debuginfo.txt attached.

By: Steven Sedory (stevensedory) 2017-10-16 14:37:43.713-0500

Anything helpful in the next backtrace-with-debuginfo.txt I attached?

By: Richard Mudgett (rmudgett) 2017-10-16 15:11:05.044-0500

The backtrace still does not have any symbolic information in it.  Before trying again you need to verify that you can get backtraces with the needed symbolic information.

From your backtrace thread 1 (at the end of the file):
{noformat}
Thread 1 (Thread 0x7f855c0ca700 (LWP 15151)):
#0  0x00007f853ba85a43 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so
#1  0x00007f853ba6ac66 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so
#2  0x00007f853baee0c6 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so
#3  0x00007f853baa886c in ?? () from /usr/lib64/asterisk/modules/chan_sip.so
#4  0x00000000005c23de in ast_sched_runq ()
#5  0x00007f853baec405 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so
#6  0x0000000000602d4c in ?? ()
#7  0x00007f85daa07aa1 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f85d9d8f93d in clone () from /lib64/libc.so.6
{noformat}

From ASTERISK-27346 core.FreePBX-2017-10-15T15-45-36-0500-brief.txt thread 1 (at the end of the file):
{noformat}
Thread 1 (Thread 0x7480ddc0 (LWP 4428)):
#0  0x728afef4 in iks_filter_add_rule () from /usr/lib/arm-linux-gnueabihf/libiksemel.so.3
#1  0x70bc621c in custom_connection_handler (opt=0x2262554, var=0x2266e28, obj=0x226330c) at chan_motif.c:2681
#2  0x00115d70 in aco_process_var (type=0x70bd9380 <endpoint_option>, cat=0x2266c30 "g********gmailcom", var=0x2266e28, obj=0x226330c) at config_options.c:743
#3  0x00115e60 in aco_process_category_options (type=0x70bd9380 <endpoint_option>, cfg=0x2263a90, cat=0x2266c30 "g********gmailcom", obj=0x226330c) at config_options.c:756
#4  0x001150c4 in process_category (cfg=0x2263a90, info=0x70bd944c <cfg_info>, file=0x70bd93c0 <jingle_conf>, cat=0x2266c30 "g********gmailcom", preload=0) at config_options.c:521
#5  0x001153ac in internal_process_ast_config (info=0x70bd944c <cfg_info>, file=0x70bd93c0 <jingle_conf>, cfg=0x2263a90) at config_options.c:560
#6  0x001159a0 in aco_process_config (info=0x70bd944c <cfg_info>, reload=0) at config_options.c:686
#7  0x70bc69bc in load_module () at chan_motif.c:2751
#8  0x00177550 in start_resource (mod=0x2181788) at loader.c:986
#9  0x001782e0 in load_resource_list (load_order=0x7ed4a794, global_symbols=0, mod_count=0x7ed4a78c) at loader.c:1238
#10 0x00178aa8 in load_modules (preload_only=0) at loader.c:1373
#11 0x0006f3d4 in asterisk_daemon (isroot=1, runuser=0x7ed4b988 "asterisk", rungroup=0x7ed4b978 "asterisk") at asterisk.c:4693
#12 0x0006e864 in main (argc=8, argv=0x7ed4ccb4) at asterisk.c:4443
{noformat}

See the difference?

To verify if you have symbolic information available, you can attach gdb to a running asterisk by "sudo gdb asterisk <asterisk-process-id>".  At the gdb command prompt you can then type "bt" to see the backtrace of the current thread.  If you have symbols available, the displayed backtrace will have function names, line numbers, and function parameter values similar to the example above.  To exit gdb issue the "quit" command.

[1] man gdb for gdb specific documentation
[2] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace
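
The difference between the two backtraces above can also be checked mechanically before attaching a file. Below is a minimal sketch, assuming Python 3 and gdb frame lines shaped like the ones quoted above; the three categories and the function names are illustrative choices, not gdb terminology:

```python
import re

# A gdb frame line: "#N  [0xADDR in ]function (args) ..."
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?(?P<func>\S+) \(")
# Source information at the end of a frame line: ") at file.c:123"
SOURCE_RE = re.compile(r"\) at \S+:\d+\s*$")

def classify_frame(line):
    """Return 'unresolved', 'name-only', 'full', or None for non-frame lines."""
    m = FRAME_RE.match(line)
    if not m:
        return None
    if m.group("func") == "??":
        return "unresolved"
    if SOURCE_RE.search(line):
        return "full"
    return "name-only"

def summarize(lines):
    """Count frames per category."""
    counts = {"unresolved": 0, "name-only": 0, "full": 0}
    for line in lines:
        kind = classify_frame(line)
        if kind:
            counts[kind] += 1
    return counts

# Thread 1 from the attached backtrace, as quoted above.
without_symbols = [
    "#0  0x00007f853ba85a43 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so",
    "#1  0x00007f853ba6ac66 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so",
    "#2  0x00007f853baee0c6 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so",
    "#3  0x00007f853baa886c in ?? () from /usr/lib64/asterisk/modules/chan_sip.so",
    "#4  0x00000000005c23de in ast_sched_runq ()",
    "#5  0x00007f853baec405 in ?? () from /usr/lib64/asterisk/modules/chan_sip.so",
    "#6  0x0000000000602d4c in ?? ()",
    "#7  0x00007f85daa07aa1 in start_thread () from /lib64/libpthread.so.0",
    "#8  0x00007f85d9d8f93d in clone () from /lib64/libc.so.6",
]

# Three frames from the ASTERISK-27346 example, as quoted above.
with_symbols = [
    "#0  0x728afef4 in iks_filter_add_rule () from /usr/lib/arm-linux-gnueabihf/libiksemel.so.3",
    "#7  0x70bc69bc in load_module () at chan_motif.c:2751",
    "#10 0x00178aa8 in load_modules (preload_only=0) at loader.c:1373",
]

print(summarize(without_symbols))
print(summarize(with_symbols))
```

A backtrace where most Asterisk frames fall into "unresolved" is in the same state as the attached one; frames in system libraries are commonly "name-only" even when Asterisk's own debug information is installed.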

By: Steven Sedory (stevensedory) 2017-10-16 19:25:21.980-0500

So does this mean that our FreePBX distro, pre-compiled Asterisk, does not have the proper debugging information enabled?

We followed the instructions here: https://wiki.freepbx.org/display/SUP/Providing+Great+Debug

It's funny because it doesn't say anything about needing to recompile asterisk or anything, but simply to install debuginfo.

By: Steven Sedory (stevensedory) 2017-10-18 11:20:28.326-0500

October 18th Asterisk Full Log (3AM to 9:15AM)

By: Steven Sedory (stevensedory) 2017-10-18 11:24:26.073-0500

Okay, so I just uploaded "full", which is the Asterisk log from one of our servers today.

It, seemingly out of nowhere, started FRACKing like crazy this morning. I'm hoping there are some clues in this log that will point to the issue.

Please see file attached "full".

By: Steven Sedory (stevensedory) 2017-10-18 11:42:28.804-0500

So looking at the log, I noticed that about 20 minutes after one of our users unplugged his phone, the Serious Network Error came up, and nine seconds later, the first FRACK.

[2017-10-18 06:32:31] NOTICE[2636] chan_sip.c: Peer '235' is now UNREACHABLE!  Last qualify: 23
[2017-10-18 06:32:31] VERBOSE[2559] chan_sip.c: Extension Changed 235[ext-local] new state Unavailable for Notify User 201
[2017-10-18 06:53:14] WARNING[2636] chan_sip.c: sip_xmit of 0x7f6784903580 (len 759) to 76.168.37.93:59972 returned -2: Interrupted system call
[2017-10-18 06:53:14] ERROR[2636] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data
[2017-10-18 06:53:15] WARNING[2636] chan_sip.c: sip_xmit of 0x7f678421a740 (len 760) to 76.168.37.93:59972 returned -2: No such file or directory
[2017-10-18 06:53:15] ERROR[2636] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data
[2017-10-18 06:53:16] WARNING[2636] chan_sip.c: sip_xmit of 0x7f678417cbc0 (len 760) to 76.168.37.93:59972 returned -2: No such file or directory
[2017-10-18 06:53:16] ERROR[2636] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data
[2017-10-18 06:53:25] ERROR[7153] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x22c55a0 (0)
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: Got 21 backtrace records
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #0: [0x603fb2] /usr/sbin/asterisk(__ast_assert_failed+0x88) [0x603fb2]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #1: [0x45d11a] /usr/sbin/asterisk() [0x45d11a]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #2: [0x45d147] /usr/sbin/asterisk() [0x45d147]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #3: [0x45e4f2] /usr/sbin/asterisk() [0x45e4f2]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #4: [0x45e729] /usr/sbin/asterisk(__ao2_link+0x43) [0x45e729]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #5: [0x45eb9c] /usr/sbin/asterisk() [0x45eb9c]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #6: [0x45ee3f] /usr/sbin/asterisk(__ao2_callback+0x5f) [0x45ee3f]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #7: [0x7f672298a04c] /usr/lib64/asterisk/modules/chan_sip.so(+0x6d04c) [0x7f672298a04c]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #8: [0x7f6722989d8e] /usr/lib64/asterisk/modules/chan_sip.so(+0x6cd8e) [0x7f6722989d8e]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #9: [0x4da4cc] /usr/sbin/asterisk(ast_cli_command_full+0x274) [0x4da4cc]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #10: [0x54d66e] /usr/sbin/asterisk() [0x54d66e]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #11: [0x5534fe] /usr/sbin/asterisk() [0x5534fe]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #12: [0x553e34] /usr/sbin/asterisk() [0x553e34]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #13: [0x5542ff] /usr/sbin/asterisk() [0x5542ff]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #14: [0x5ed787] /usr/sbin/asterisk() [0x5ed787]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #15: [0x600bb4] /usr/sbin/asterisk() [0x600bb4]
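
The intervals mentioned above (about 20 minutes, then nine seconds) can be computed directly from the timestamps. A small sketch, assuming Python 3 and the `[YYYY-MM-DD HH:MM:SS]` prefix shown in the excerpt; the helper names are illustrative, and the input is a subset of the lines above:

```python
from datetime import datetime

def parse_ts(line):
    """Parse the leading '[YYYY-MM-DD HH:MM:SS]' timestamp of a log line."""
    # Characters 1..19 hold the timestamp, between the square brackets.
    return datetime.strptime(line[1:20], "%Y-%m-%d %H:%M:%S")

def seconds_between(earlier, later):
    """Whole seconds elapsed between two log lines."""
    return int((parse_ts(later) - parse_ts(earlier)).total_seconds())

log = [
    "[2017-10-18 06:32:31] NOTICE[2636] chan_sip.c: Peer '235' is now UNREACHABLE!  Last qualify: 23",
    "[2017-10-18 06:53:14] ERROR[2636] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data",
    "[2017-10-18 06:53:15] ERROR[2636] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data",
    "[2017-10-18 06:53:16] ERROR[2636] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data",
    "[2017-10-18 06:53:25] ERROR[7153] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x22c55a0 (0)",
]

unreachable = next(l for l in log if "is now UNREACHABLE" in l)
trouble = [l for l in log if "Serious Network Trouble" in l]
first_frack = next(l for l in log if "FRACK!" in l)

gap_unreachable_to_trouble = seconds_between(unreachable, trouble[0])
gap_trouble_to_frack = seconds_between(trouble[-1], first_frack)

print(f"UNREACHABLE -> first Serious Network Trouble: {gap_unreachable_to_trouble} s")
print(f"last Serious Network Trouble -> first FRACK: {gap_trouble_to_frack} s")
```

The short gap shows only that the events are close in time. The network errors are logged by thread 2636 and the FRACK by thread 7153, so this alone does not establish that one causes the other.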

By: Corey Farrell (coreyfarrell) 2017-10-18 11:55:35.187-0500

{{ast_cli_command_full}}: what CLI command is being run?  Is this a sip reload?

By: Steven Sedory (stevensedory) 2017-10-18 13:09:34.600-0500

No one was touching the server at that time, so not sure :/

By: Corey Farrell (coreyfarrell) 2017-10-18 14:19:57.730-0500

{quote}
So does this mean that our FreePBX distro, pre-compiled Asterisk, does not have the proper debugging information enabled?
We followed the instructions here: https://wiki.freepbx.org/display/SUP/Providing+Great+Debug
It's funny because it doesn't say anything about needing to recompile asterisk or anything, but simply to install debuginfo.
{quote}

The Asterisk project does not provide any binaries so this is a question for folks in the FreePBX community.  We do need backtraces with debugging info like the example [~rmudgett] showed, otherwise progress on this issue will be impossible.

By: Steven Sedory (stevensedory) 2017-10-18 14:24:23.050-0500

Okay. I'll reach out to FreePBX and see what they can provide.

As for crash logs though, those FRACKs happened this morning, and the phone server was unusable until a complete reboot, but Asterisk did not crash (or at least didn't put a core dump file in /tmp/).

By: Corey Farrell (coreyfarrell) 2017-10-18 14:27:32.826-0500

Getting more information from backtraces produced by logger.c would require recompiling asterisk with BETTER_BACKTRACES enabled.

By: Steven Sedory (stevensedory) 2017-10-18 14:44:13.866-0500

Thank you Corey. Will that be helpful even without there needing to be a crash first? Meaning, just in the standard Asterisk full log?

By: Corey Farrell (coreyfarrell) 2017-10-18 14:56:02.116-0500

Recompiling with BETTER_BACKTRACES would make the standard Asterisk full log more useful.  For starters it would tell us which CLI command is being run:
{quote}
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #8: [0x7f6722989d8e] /usr/lib64/asterisk/modules/chan_sip.so(+0x6cd8e) [0x7f6722989d8e]
[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #9: [0x4da4cc] /usr/sbin/asterisk(ast_cli_command_full+0x274) [0x4da4cc]
{quote}

Notice the lack of function name associated with the chan_sip.so line of the backtrace.  This means we have no idea which CLI command was running - chan_sip has 23 commands.

It's possible that recompiling with {{REF_DEBUG}} enabled could provide useful information but be aware that option has a performance impact (it writes a lot of info to a log file).  More information about reference count debugging is at https://wiki.asterisk.org/wiki/display/AST/Reference+Count+Debugging.
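
The missing function name in the chan_sip.so record quoted above can be detected automatically across a whole log. The following is a sketch only, assuming Python 3 and the `logger.c` backtrace record format shown in this issue (`#N: [addr] module(symbol+offset) [addr]`); `parse_backtrace_record` is a hypothetical helper:

```python
import re

# "#N: [0xADDR] /path/to/module(symbol+0xOFF) [0xADDR]" as written by logger.c
RECORD_RE = re.compile(
    r"#(?P<num>\d+): \[(?P<addr>0x[0-9a-fA-F]+)\] "
    r"(?P<module>\S+?)\((?P<sym>[^)]*)\)"
)

def parse_backtrace_record(line):
    """Return (frame number, module path, symbol name or None), or None if no record."""
    m = RECORD_RE.search(line)
    if not m:
        return None
    # The text in parentheses is "", "+0xOFFSET", or "name+0xOFFSET".
    name = m.group("sym").split("+")[0]
    return int(m.group("num")), m.group("module"), name or None

records = [
    "[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #8: [0x7f6722989d8e] /usr/lib64/asterisk/modules/chan_sip.so(+0x6cd8e) [0x7f6722989d8e]",
    "[2017-10-18 06:53:26] VERBOSE[7153] logger.c: #9: [0x4da4cc] /usr/sbin/asterisk(ast_cli_command_full+0x274) [0x4da4cc]",
]

parsed = [parse_backtrace_record(r) for r in records]
for num, module, name in parsed:
    status = name if name else "no symbol name"
    print(f"frame {num}: {module}: {status}")
```

Frames reported with no symbol name are the ones that a build with more symbol information would be expected to fill in.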

By: Andrew Nagy (tm1000) 2017-10-18 15:00:06.596-0500

Corey

"BETTER_BACKTRACES" also has performance impacts as far as we have been told? Which is why we don't put this in our RPM. Unless this is no longer true?

By: Steven Sedory (stevensedory) 2017-10-18 15:00:19.352-0500

Okay. Compiling Asterisk is something I've never done before, so I will need a little time to do it and make sure I'm doing it right. We've always used the FreePBX distro. Any pointers you could give would be great.

By: Richard Mudgett (rmudgett) 2017-10-18 20:27:07.757-0500

[~tm1000] I don't see how BETTER_BACKTRACES could have a performance impact under normal circumstances so it is OK to always enable it.  The only reason it is not enabled by default is because it uses an optional library.  The library it uses is only called to create backtraces.   Backtraces are created for the DEBUG_THREADS lock tracking (CLI "core show locks") and assertions (FRACKs) generating backtraces to go in the log.  The DEBUG_THREADS functionality itself has a performance impact because it has to serialize all Asterisk locking.

By: Andrew Nagy (tm1000) 2017-10-19 15:03:32.100-0500

Richard,

I talked to Corey, Matt Fredrickson, Jason Parker and Scott Griepentrog before your reply, they came to the same conclusions as you. So it is now enabled in all of our RPMs. Thanks for your help!

By: Joshua C. Colp (jcolp) 2017-10-22 17:38:50.926-0500

Placing this back into feedback pending a good backtrace.

By: Steven Sedory (stevensedory) 2017-10-22 19:57:31.139-0500

Okay. We've updated to the current freepbx 13 distro, which should now include everything needed to provide helpful debug info in the FRACKs and crash dumps.

By: Steven Sedory (stevensedory) 2017-10-23 18:47:02.578-0500

Just an update, we have updated all Asterisk to 13.17.2 and installed debug. So far no FRACKs or crashes, but it sometimes goes a week or two before any.

By: Steven Sedory (stevensedory) 2017-10-24 19:24:16.474-0500

Another update: Andrew informed me that BETTER_BACKTRACES is not working correctly with the RPMs, so we unfortunately won't be seeing the details we want when/if another crash or FRACK happens.

Any suggestions?

By: Andrew Nagy (tm1000) 2017-10-24 19:51:14.270-0500

What I said was "Unfortunately we have realized that better backtraces is not getting compiled correctly into our RPMs so at this time better backtraces is not part of the RPM. Just FYI."

We will have this resolved shortly.

By: Steven Sedory (stevensedory) 2017-10-25 00:54:14.059-0500

Great to hear, thanks.

By: Steven Sedory (stevensedory) 2017-10-25 09:31:43.436-0500

The FreePBX team updated their RPMs and BETTER_BACKTRACES is now working. I have updated our servers, so we should have some good debug info on the next crash or FRACK.

Also, as just mentioned here ASTERISK-27371 we crashed a server last night, and then replicated it, by running "core set debug on" for about an hour. We crashed it again the second time in an attempt to replicate it, but it took 2-3 hours that time.

Now that we've updated Asterisk to include BETTER_BACKTRACES, we should have some better info on the next crash.

That said, the FRACKs generated from what we could replicate were different than the ones mentioned at the top of this thread. These ones look like this:

[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
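
For anyone skimming a long log, a rough way to group FRACK lines like the ones above is sketched below. It is only an illustration: the regular expression and the `count_fracks` helper are made up for this excerpt's layout and are not part of Asterisk.

```python
import re
from collections import Counter

# The eight lines from the excerpt above, copied verbatim.
LOG = """\
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-24 18:09:07] ERROR[3003] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
[2017-10-25 01:08:40] ERROR[5840] astobj2.c: FRACK!, Failed assertion user_data is NULL (0)
"""

# Hypothetical pattern: [timestamp] ERROR[thread] file: FRACK!, Failed assertion <message>
FRACK_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] ERROR\[(?P<tid>\d+)\] \S+: "
    r"FRACK!, Failed assertion (?P<msg>.*)$"
)


def count_fracks(text):
    """Count FRACK lines per (timestamp, thread id, assertion message)."""
    counts = Counter()
    for line in text.splitlines():
        match = FRACK_RE.match(line.strip())
        if match:
            key = (match.group("ts"), match.group("tid"), match.group("msg"))
            counts[key] += 1
    return counts


counts = count_fracks(LOG)
for (ts, tid, msg), n in sorted(counts.items()):
    print(f"{ts} thread {tid}: {n} x {msg}")
```

On this excerpt it reports two bursts of four identical assertions, one per thread, which is all the lines themselves say; without a backtrace under each FRACK there is nothing further to extract.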



By: Steven Sedory (stevensedory) 2017-10-26 01:02:19.511-0500

So we had a crash this evening. This was without "sip set debug on", just normal operation.

This was the output when I created the backtrace. What does this mean?

[root@v12 ~]# gdb -se "asterisk" -ex "bt full" -ex "thread apply all bt" --batch -c /tmp/core.v12.ringvertical.com-2017-10-25T01\:08\:40-0700 > /tmp/backtrace-2017-10-25.txt
Cannot access memory at address 0x6f632e6c6163697c
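
One possible reading of that message, offered only as a sketch: the address gdb could not read decodes to printable ASCII when its bytes are taken in little-endian order (the order x86-64 uses in memory). The assumption that the host is little-endian x86-64 is mine; the address is copied from the gdb output above.

```python
# Address from the gdb message above.
addr = 0x6F632E6C6163697C

# Assumption: little-endian x86-64, so this is the byte order the
# value would have had in memory.
raw = addr.to_bytes(8, byteorder="little")
text = raw.decode("ascii")

print(text)  # prints |ical.co
```

Text where a pointer was expected may hint that gdb was following overwritten or otherwise invalid data, but this alone does not show where the problem came from.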

Also, new backtrace file is backtrace-2017-10-25.txt

CORRECTION: so we must have had a false notification of a crash around 6PM today, because I just realized the attached backtrace is from a crash from 1AM. Anyway, maybe there is some useful information in it.

By: Joshua C. Colp (jcolp) 2017-10-26 05:12:08.626-0500

There is absolutely nothing useful in it. It contains nothing, basically.

By: Steven Sedory (stevensedory) 2017-10-26 09:18:34.752-0500

Copy. Will wait for another crash or FRACK.

By: Asterisk Team (asteriskteam) 2017-11-09 12:00:01.553-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Steven Sedory (stevensedory) 2017-11-09 16:18:22.856-0600

To update everyone, we've had no more crashes since the last on 10/25. I'd say it's due to 13.17.2, but we had a server FRACK on that version on 10/25.

By: Asterisk Team (asteriskteam) 2017-11-09 16:18:23.169-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Richard Mudgett (rmudgett) 2017-11-09 16:54:34.237-0600

We need the requested usable backtrace or more importantly the FRACK log backtrace produced by BETTER_BACKTRACES.  We have shown you what usable backtraces and FRACK log backtraces look like.

Without that information this issue is going nowhere.  I am closing this issue.

If you ever get the requested information all you have to do is attach the files and comment on the issue for it to automatically re-open.

By: Steven Sedory (stevensedory) 2017-11-12 19:26:28.120-0600

So we had some FRACKs yesterday. Do you just need to see the asterisk log from yesterday? If so, I've attached "full-20171112".

Here's part of the log near one of the FRACKs:

[2017-11-11 06:03:49] VERBOSE[7195] asterisk.c: Remote UNIX connection
[2017-11-11 06:03:49] VERBOSE[31005] asterisk.c: Remote UNIX connection disconnected
[2017-11-11 06:03:50] WARNING[8721] chan_sip.c: Unable to cancel schedule ID 0.  This is probably a bug (chan_sip.c: do_dialog_unlink_sched_items, line 3266).
[2017-11-11 06:03:50] ERROR[5146] /builddir/build/BUILD/asterisk-13.17.2/include/asterisk/utils.h: Memory Allocation Failure in function ast_str_create at line 655 of /builddir/build/BUILD/asterisk-13.17.2/include/asterisk/strings.h
[2017-11-11 06:03:50] WARNING[5146] chan_sip.c: sip_xmit of 0x7f0428c3af80 (len 139655827686296) to 108.23.78.98:4279 returned -2: Cannot allocate memory
[2017-11-11 06:03:50] ERROR[5146] chan_sip.c: Serious Network Trouble; __sip_xmit returns error for pkt data
[2017-11-11 06:03:50] ERROR[5146] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f04286eac38 (0)
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: Got 23 backtrace records
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #1: [0x45e2c6] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #2: [0x45e958] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #3: [0x45edcc] asterisk __ao2_ref() (0x45ed9b+31)
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #4: [0x7f03be9d6a65] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #5: [0x7f03be9d6eab] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #6: [0x7f03be9d8e9d] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #7: [0x7f03bea0685b] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #8: [0x7f03bea57ec8] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #9: [0x7f03bea14753] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #10: [0x7f03bea1777f] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #11: [0x7f03bea513e0] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #12: [0x7f03bea52c56] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #13: [0x7f03bea535fd] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #14: [0x7f03be9d2c62] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #15: [0x7f03be9d181c] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #16: [0x5f076f] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #17: [0x603d14] asterisk <unknown>()
[2017-11-11 06:03:51] ERROR[5146] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f04286eac38 (0)
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: Got 22 backtrace records
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #1: [0x45e2c6] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #2: [0x45e958] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #3: [0x45edcc] asterisk __ao2_ref() (0x45ed9b+31)
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #4: [0x7f03be9d6ebc] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #5: [0x7f03be9d8e9d] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #6: [0x7f03bea0685b] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #7: [0x7f03bea57ec8] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #8: [0x7f03bea14753] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #9: [0x7f03bea1777f] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #10: [0x7f03bea513e0] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #11: [0x7f03bea52c56] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #12: [0x7f03bea535fd] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #13: [0x7f03be9d2c62] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #14: [0x7f03be9d181c] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #15: [0x5f076f] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[5146] logger.c: #16: [0x603d14] asterisk <unknown>()
[2017-11-11 06:03:51] ERROR[8721] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f04286eac38 (0)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: Got 13 backtrace records
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #1: [0x45e2c6] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #2: [0x45e958] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #3: [0x45edcc] asterisk __ao2_ref() (0x45ed9b+31)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #4: [0x7f03be9d6a25] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #5: [0x5c33a6] asterisk ast_sched_runq() (0x5c3267+13F)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #6: [0x7f03bea55405] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #7: [0x603d14] asterisk <unknown>()
[2017-11-11 06:03:51] ERROR[8721] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f04286eac38 (0)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: Got 13 backtrace records
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #1: [0x45e2c6] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #2: [0x45e958] asterisk <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #3: [0x45edcc] asterisk __ao2_ref() (0x45ed9b+31)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #4: [0x7f03be9d6a41] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #5: [0x5c33a6] asterisk ast_sched_runq() (0x5c3267+13F)
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #6: [0x7f03bea55405] chan_sip.so <unknown>()
[2017-11-11 06:03:51] VERBOSE[8721] logger.c: #7: [0x603d14] asterisk <unknown>()
[2017-11-11 06:03:57] VERBOSE[7195] asterisk.c: Remote UNIX connection
[2017-11-11 06:03:57] VERBOSE[31025] asterisk.c: Remote UNIX connection disconnected
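
As a quick check of how much of such a backtrace is actually resolved, the frames can be split into named and `<unknown>` entries. The sketch below uses the frames logged by thread 8721 above, with the timestamp and `logger.c:` prefix trimmed for brevity; the pattern and the `summarize` helper are hypothetical and written only for this log layout.

```python
import re

# Frames from thread 8721 in the excerpt above (log prefix trimmed).
FRAMES = """\
#0: [0x607112] asterisk __ast_assert_failed() (0x60708a+88)
#1: [0x45e2c6] asterisk <unknown>()
#2: [0x45e958] asterisk <unknown>()
#3: [0x45edcc] asterisk __ao2_ref() (0x45ed9b+31)
#4: [0x7f03be9d6a25] chan_sip.so <unknown>()
#5: [0x5c33a6] asterisk ast_sched_runq() (0x5c3267+13F)
#6: [0x7f03bea55405] chan_sip.so <unknown>()
#7: [0x603d14] asterisk <unknown>()
"""

# Hypothetical pattern: #index: [address] module function()
FRAME_RE = re.compile(r"#(\d+): \[(0x[0-9a-fA-F]+)\] (\S+) (\S+)\(\)")


def summarize(text):
    """Split frames into (resolved, unresolved) lists of (index, module, function)."""
    resolved, unresolved = [], []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue
        index, _addr, module, func = match.groups()
        frame = (int(index), module, func)
        if func == "<unknown>":
            unresolved.append(frame)
        else:
            resolved.append(frame)
    return resolved, unresolved


resolved, unresolved = summarize(FRAMES)
print("resolved:  ", [f"{m}:{fn}" for _, m, fn in resolved])
print("unresolved:", [f"#{i} {m}" for i, m, _ in unresolved])
```

In this excerpt only three frames in the `asterisk` binary carry names, and every `chan_sip.so` frame is `<unknown>`, so the log does not show which chan_sip function took the bad reference.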

By: Asterisk Team (asteriskteam) 2017-11-12 19:26:28.772-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Richard Mudgett (rmudgett) 2017-11-14 12:03:14.796-0600

Closing again as more useless information posted.

NOTHING CAN BE DONE IF BACKTRACES DO NOT HAVE SYMBOLS.  This includes FRACK backtraces.

By: xrobau (xrobau) 2017-11-14 18:59:52.863-0600

If this is a FreePBX-compiled Asterisk, the debug symbols need to be installed, and a valid backtrace needs to be generated by following the instructions at:

https://wiki.freepbx.org/display/SUP/Providing+Great+Debug#ProvidingGreatDebug-Backtraces(Segfaults/CoreDumps/AsteriskCrashing)



By: Steven Sedory (stevensedory) 2017-11-16 18:38:12.085-0600

So we would generate the backtrace with the instructions at that URL, but they seem to apply only to a full crash, not just FRACK errors. That said, be it a full backtrace on a crash, or FRACK backtraces in the asterisk log, we're missing the debug symbols.

Any guidance on how we can get those installed?

By: Asterisk Team (asteriskteam) 2017-11-16 18:38:12.557-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: xrobau (xrobau) 2017-11-16 19:22:05.321-0600

Please follow the instructions in that link to install the debug symbols. This is not an Asterisk issue. If you are unsure of how to install the debug symbols, after following those instructions, please take it to the FreePBX forums - community.freepbx.org


By: Steven Sedory (stevensedory) 2017-11-16 19:33:12.895-0600

Hi Rob, so I've done that. Does this mean Asterisk MUST crash before I will get useful debug info? Meaning the FRACKs just won't, even though they seem like they should (i.e. "logger.c: Got 23 backtrace records")?

I have installed pjproject-debuginfo and asterisk13-debuginfo.

As for gdb, I don't have a recent core dump file to run that on. But it is my understanding that the asterisk log shouldn't be showing things like "chan_sip.so <unknown>()" but rather more useful information.

What am I missing?

By: Joshua C. Colp (jcolp) 2017-12-05 11:27:29.618-0600

Have you been able to work with [~xrobau] to get the needed information for this?

By: Steven Sedory (stevensedory) 2017-12-06 17:59:58.938-0600

I have not; however, we have not had any more crashes. We are now on 13.17.2. Perhaps it was one of the fixed bugs that was related to this?

By: Joshua C. Colp (jcolp) 2017-12-06 18:05:38.064-0600

There's generally no active development on chan_sip, unless a community member fixed something or made a change. That's generally rare and the changes minor. I don't think anything targeting this went in which would explain it unless the problem was outside of chan_sip in some way. I'll keep this in feedback though and if you see it again you can respond back.

By: Asterisk Team (asteriskteam) 2017-12-21 12:00:00.658-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Steven Sedory (stevensedory) 2018-01-26 16:04:14.195-0600

So we are right now having several of these FRACKs a minute. Sadly, our backtraces still suck.

Andrew Nagy answered me on a FreePBX Community post today, after I asked if better backtraces work now on the FreePBX distro, and he said, "No. We have followed all of Digium’s recommendations."

Since these FRACKs are happening in real time, is there anything we can do to identify the cause other than the "unknown" backtrace info it's spitting out? OR, does anyone have any suggestions as to how we can get these to properly show the backtrace information?

This is the worst...

By: Asterisk Team (asteriskteam) 2018-01-26 16:04:14.531-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Andrew Nagy (tm1000) 2018-01-26 16:09:25.995-0600

Steven Sedory,

The response that you have quoted from me above is out of context from what you asked me. You specifically asked "Has anything changed to where we can see what is behind the "unknown"s above?". To which I replied "No. We have followed all of Digium’s recommendations.". Nothing about my reply is in response to better backtraces, it's in response to 'has anything changed'.

You did not ask me "if better backtraces work now on the FreePBX distro". The answer to that is yes, they do. Asterisk is compiled with better backtraces according to the Asterisk wiki documentation on the subject.

By: Steven Sedory (stevensedory) 2018-01-26 16:50:39.393-0600

You're right Andrew, I rushed my post in hopes of getting some help quickly and was sloppy. I didn't intend to misrepresent you. I apologize. So if we run the upgrade scripts again, do you believe we will get the proper backtrace information we're looking for if the issue happens again?

By: Steven Sedory (stevensedory) 2018-01-26 18:55:20.092-0600

So I think we may finally have some useful backtraces! I've just attached them.

By: George Joseph (gjoseph) 2018-01-29 08:08:23.727-0600

Were these backtraces from an actual crash or were they generated from a running Asterisk instance?


By: Steven Sedory (stevensedory) 2018-01-29 12:19:37.941-0600

From an actual crash. We ran the following against the core dump file that was generated after the crash:

/var/lib/asterisk/scripts/ast_coredumper /tmp/[name of the core file]

By: George Joseph (gjoseph) 2018-02-01 08:51:41.490-0600

The backtrace still points to chan_sip and at the present time, chan_sip is entirely community supported.
I'll leave the issue open in case a community member can work on it.



By: Steven Sedory (stevensedory) 2018-02-02 16:47:07.894-0600

So is it advisable for us to switch all our endpoints to chan_pjsip instead?

By: Joshua C. Colp (jcolp) 2018-02-02 16:59:16.468-0600

The chan_pjsip module is core supported and thus worked on by us (Digium). It's up to you whether you would like to switch or not.

By: Steven Sedory (stevensedory) 2018-02-02 17:29:19.542-0600

Understood. In your opinion, with a mixed set of endpoints (polycom, grandstream, obihi, yealink, panasonic, cisco), would we be better off with one vs the other in a production environment?

By: Joshua C. Colp (jcolp) 2018-02-02 17:43:37.976-0600

PJSIP speaks SIP and there are people who have used it with those devices. If you want a core supported module then chan_pjsip is the option, otherwise it is up to the community to investigate and fix any problems you may encounter or have with chan_sip.

By: Steven Sedory (stevensedory) 2018-02-02 17:46:26.638-0600

Great, thanks for clarifying. If you'd like, we can close this.