Summary: ASTERISK-26252: Segfault when using SendFax / ReceiveFax via T.38
Reporter: Michal Rybarik (pixall)
Labels:
Date Opened: 2016-07-31 16:19:56
Date Closed: 2016-08-13 09:00:05
Priority: Major
Regression?
Status: Closed/Complete
Components: Channels/chan_sip/T.38 Resources/res_fax_spandsp
Versions: 11.23.0
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
(0) backtrace-receive.txt
(1) backtrace-send.txt
(2) debug_log_26252_receivefax.log
(3) debug_log_26252_sendfax.log
(4) FaxS-5411.tiff
Description: SendFax and ReceiveFax segfault on the latest Asterisk 11 release shortly after SendFax/ReceiveFax is invoked. It happens on every call. In my setup, faxes go to and from Asterisk via a SIP trunk (chan_sip + T.38). There was no such problem with the older Asterisk 11 r412438 on the same host and setup.
Comments:

By: Asterisk Team (asteriskteam) 2016-07-31 16:19:57.090-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Joshua C. Colp (jcolp) 2016-07-31 16:27:23.789-0500

Thank you for the crash report. However, we need more information to investigate the crash. Please provide:

1. A backtrace generated from a core dump using the instructions provided on the Asterisk wiki [1].
2. Specific steps taken that lead to the crash.
3. All configuration information necessary to reproduce the crash.

Thanks!

[1]: https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace



By: Michal Rybarik (pixall) 2016-07-31 16:53:18.116-0500

Asterisk debug log - ReceiveFax

By: Michal Rybarik (pixall) 2016-07-31 16:53:58.116-0500

Asterisk debug log - SendFax

By: Michal Rybarik (pixall) 2016-07-31 17:00:57.210-0500

core gdb backtrace

By: Joshua C. Colp (jcolp) 2016-07-31 17:03:27.724-0500

What version of SpanDSP is in use? The crash appears to be down there.

By: Michal Rybarik (pixall) 2016-07-31 17:12:22.918-0500

SpanDSP is version 0.0.6, snapshot from 2013-01-28.

By: Joshua C. Colp (jcolp) 2016-07-31 17:17:21.804-0500

We actually have tests which run nightly that do faxing in 11 using T.38 and audio which are running fine thus the SpanDSP question.

By: Michal Rybarik (pixall) 2016-07-31 17:26:37.432-0500

Should I try another SpanDSP version? Which one exactly? Or will you try to find the problem using the logs & backtraces?

By: Joshua C. Colp (jcolp) 2016-07-31 17:29:48.727-0500

You can try another one just in case. As for isolating the issue, it's an unknown until someone dives deep into it. There's no time frame on that.

By: Michal Rybarik (pixall) 2016-07-31 17:39:41.884-0500

Could you please check which version you use for the nightly tests that work fine? I'd like to try the same one and see whether it works for me too or the problem persists.

By: Joshua C. Colp (jcolp) 2016-07-31 17:43:50.940-0500

The Ubuntu one is using a package of version "0.0.6~pre21-2". What that translates to for SpanDSP I'm uncertain of.

By: Michal Rybarik (pixall) 2016-07-31 19:27:41.041-0500

I have compared your SpanDSP sources with mine and there is no significant difference between them. I have tried another SpanDSP too, and the problem remains.

So I looked deeper into the backtraces and sources, and it seems that spandsp segfaults because of incorrect data received from Asterisk. In both backtraces the segfault is in t38_core_rx_ifp_stream(), and if I read the backtraces correctly, it receives an empty buffer ("") in one argument and buflen=1 in the next. spandsp then tries to read 1 byte (buflen) from the empty buffer, which produces the segfault.

In res_fax_spandsp.c, in the function spandsp_fax_write(), I see that the invalid data (empty buffer with buflen=1) comes from the ast_frame fields f->data.ptr and f->datalen. But I'm not sure where the ast_frame f comes from or why it holds such inconsistent data; I'm a little lost here.

By: Joshua C. Colp (jcolp) 2016-08-01 07:04:12.935-0500

Do you have a TIFF you can share which would exhibit the issue?

By: Michal Rybarik (pixall) 2016-08-01 07:14:51.227-0500

I'm attaching the TIFF file that was passed to SendFax in the debug log and backtrace above. It was produced from a PDF using Ghostscript.

By: Michal Rybarik (pixall) 2016-08-01 07:36:15.757-0500

BTW, just to save time: I'm not sure the problem is related to the contents of the fax document. In the opposite direction (ReceiveFax), I was sending a fax from the analog PSTN; the call was passed via SS7 to an Asterisk 11 acting as a T.38 gateway, and then via a SIP/T.38 trunk to the machine with Asterisk 11 running ReceiveFax. ReceiveFax crashed at a very early stage of the UDPTL dialog. The sending fax machine hadn't started scanning the document yet, so I think no TIFF/image data was available at the moment of the segfault.

By: Rusty Newton (rnewton) 2016-08-10 08:57:25.691-0500

Cancel my previous comment, I misread a few things!

I'm going to open this up. If you can narrow down an exact process for step by step reproduction that would be very helpful. You probably want to attach as much of your relevant Asterisk configuration as possible.

By: Michal Rybarik (pixall) 2016-08-10 10:12:41.719-0500

Hi Rusty,
for me it's really easy to reproduce this bug: simply make a T.38 call via the SIP trunk, originated from SendFax() or terminated at ReceiveFax(). It crashes on every attempt, at exactly the same phase of T.38 negotiation. I really don't know of anything special in my setup. If I downgrade back to branch 11 r412348M, fax works OK, with thousands of faxes sent and received without a restart. If I upgrade to 11.23.0, the problem is back and it crashes on every fax attempt. I discovered that the ast_frame structure probably has incorrect data inside (see my previous comments), but I'm unable to find where this data comes from.

I was thinking about trying older versions (releases) in the 11 branch, from r412348 to 11.23.0, one by one, to find out when this problem began, but it takes some time to compile, install, test, etc. I have already provided Asterisk debug logs and backtraces; if you have an idea of what else I can provide to help find the problem, please let me know and I'll do it.
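
Editor's sketch: testing releases one by one is slow because each step needs a full compile, install and fax test. A binary search over the ordered release list needs far fewer builds. The Python below is only an illustration under stated assumptions: the release list is made up for the example, and `is_bad` is a hypothetical callback standing in for "build this release, place a T.38 fax call, report whether Asterisk crashed".

```python
from typing import Callable, Optional, Sequence


def first_bad(releases: Sequence[str],
              is_bad: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest release for which is_bad() is true.

    Assumes `releases` is ordered oldest to newest and that once a
    release is bad, every later release is bad as well.
    Returns None if no release is bad.
    """
    lo, hi = 0, len(releases)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(releases[mid]):
            hi = mid          # the first bad release is at mid or earlier
        else:
            lo = mid + 1      # the first bad release is after mid
    return releases[lo] if lo < len(releases) else None


# Illustrative release list (an assumption, not taken from the report).
RELEASES = ["11.9.0", "11.10.0", "11.15.0", "11.20.0", "11.23.0"]
```

If such a search reports the oldest release as bad even though that release is known to work elsewhere, the ordering assumption does not hold, and the build environment is a more likely suspect than the source code.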

By: Michal Rybarik (pixall) 2016-08-13 08:59:42.477-0500

We started to compile and test releases one by one, as I mentioned above, and we saw the same segfaults on 11.9.0 and 11.10.0, which should work OK. So it looked much more like a problem with compilation than with the source code. Then we found that the build machine has libspandsp built against libtiff4, while the production server has libspandsp built against libtiff5. We installed the right libspandsp on the build machine, rebuilt Asterisk, and everything works now, with no segfaults at all. So that's definitely not a bug in Asterisk, and I'm very sorry for reporting this... I'm closing the issue as "Not a Bug".
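
Editor's sketch: a mismatch like the one described here can be spotted by comparing which libtiff each host's libspandsp links against. The Python below is a minimal illustration, assuming the text output of `ldd` run on the libspandsp shared object has been captured on both hosts. The function names and the sample outputs are hypothetical and were not part of the original report.

```python
import re
from typing import Set


def tiff_sonames(ldd_output: str) -> Set[str]:
    """Extract libtiff sonames (e.g. 'libtiff.so.5') from `ldd` output text."""
    return set(re.findall(r"\blibtiff\.so\.\d+", ldd_output))


def same_libtiff(build_ldd: str, prod_ldd: str) -> bool:
    """True only if both outputs mention libtiff and the sonames are identical."""
    build, prod = tiff_sonames(build_ldd), tiff_sonames(prod_ldd)
    return bool(build) and build == prod


# Made-up sample lines in the usual `ldd` layout, for illustration only.
BUILD_HOST = "\tlibtiff.so.4 => /usr/lib/libtiff.so.4 (0x00007f0000000000)\n"
PROD_HOST = "\tlibtiff.so.5 => /usr/lib/libtiff.so.5 (0x00007f0000000000)\n"

if __name__ == "__main__":
    if not same_libtiff(BUILD_HOST, PROD_HOST):
        print("libspandsp links against different libtiff versions")
```

In practice the two strings would come from running `ldd` on the installed libspandsp library on the build machine and on the production server.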

By: Rusty Newton (rnewton) 2016-08-15 14:50:30.897-0500

[~pixall] Thanks for following up to let us know what was going on. That will help other community members that might run into a similar issue. Thanks again for your work!