Summary: ASTERISK-21305: Segfault when hanging up channels active in MeetMe with recording
Reporter: Pedro Peña (pedropena)
Labels:
Date Opened: 2013-03-20 07:54:15
Date Closed:
Priority: Major
Regression?:
Status: Open/New
Components: Applications/app_meetme
Versions: 1.8.15.1 1.8.20.0 11.2.1 13.18.4
Frequency of Occurrence: Constant
Related Issues:
Environment: Asterisk Now 2.0.0 Asterisk Now 3.0.0
Attachments:
( 0) backtrace1.txt
( 1) backtrace2.txt
( 2) backtrace3.txt
( 3) extensions_custom.conf
( 4) meetme_segfault.sh
( 5) trace-ASTERISK21305.txt
( 6) trace-patched-ASTERISK21305.txt
Description: The attached bash script (meetme_segfault.sh) causes a segfault when used with the attached dialplan (extensions_custom.conf). The segfault almost always occurs within 10 seconds.

Note that if recording is disabled (comment out line 3, uncomment line 4), the segfault does not occur.
Comments:

By: Michael L. Young (elguero) 2013-03-20 11:31:04.438-0500

Thank you for your bug report. In order to move your issue forward, we require a backtrace[1] from the core file produced after the crash. Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:

make install

After enabling, reproduce the crash, and then execute the backtrace[1] instructions. When complete, attach that file to this issue report.

[1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace



By: Pedro Peña (pedropena) 2013-03-20 14:00:23.820-0500

backtrace1.txt, backtrace2.txt and backtrace3.txt were generated with version 1.8.15-cert1.

By: Rusty Newton (rnewton) 2013-03-27 20:20:41.217-0500

Pedro, this feels like a corner case, especially slamming Asterisk with call hangups from the CLI at an unlimited rate through a script. Did you see this happen in a production environment, and how does the method of reproduction represent what was going on in production? I imagine you had a lot of MeetMe conferences running and had a random crash when some channels were hung up?

I am able to reproduce, but my backtraces are not exactly the same. It looks like the core dumps when Asterisk is in a corrupted state. I'll upload my trace tomorrow and ping someone to look at them.

By: Pedro Peña (pedropena) 2013-03-28 02:34:12.710-0500

This test really does reproduce the crashes we see on a production system, which occur every 4 to 16 hours; it just triggers them much faster. Note that when MeetMe recording is disabled the crash does not occur (the same holds in production). In production the backtraces are not identical either, but they are always related to a channel being hung up while the conference is being established.

I also observed corruption in the variables of the recording thread.

At the moment the only workaround in production is to start recording once the conference is established, not from the beginning (not using the 'r' param).

By: Rusty Newton (rnewton) 2013-03-29 10:34:59.476-0500

Attaching trace from my reproduction (trace-ASTERISK21305.txt)

* Used the reporter's meetme_segfault.sh script to trigger the crash.
* res_timing_timerfd.so was in use for timing.
* Compiled SVN-branch-1.8-r381770 with DONT_OPTIMIZE, DEBUG_THREADS, BETTER_BACKTRACES.


By: Michael L. Young (elguero) 2013-03-29 12:26:57.127-0500

Based on a quick look at the code and the backtrace, I think that the listening channel used for recording is being hung up while the recording thread is still trying to access it. That is why it only happens when the record option is turned on.

I will admit that I am not an expert with the app_meetme code, but I think this patch might fix the problem. It checks whether a hangup has been requested on the channel before trying to read from it.
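
For readers following along, the idea described above (check for a requested hangup before each read in the recording loop) can be sketched in isolation. This is only an illustrative, single-threaded simulation: the struct and the `demo_*` functions are hypothetical stand-ins, not Asterisk's channel API, and this is not the patch attached to this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel; NOT the real struct ast_channel. */
struct demo_channel {
    bool hangup_requested;   /* set once a hangup has been requested */
    int frames_available;    /* frames left to read in this simulation */
};

/* Hypothetical helper: returns true once a hangup has been requested. */
static bool demo_check_hangup(const struct demo_channel *chan)
{
    assert(chan != NULL);
    return chan->hangup_requested;
}

/* Hypothetical read: consumes one frame, returns false when none remain. */
static bool demo_read_frame(struct demo_channel *chan)
{
    assert(chan != NULL);
    if (chan->frames_available <= 0) {
        return false;
    }
    chan->frames_available--;
    return true;
}

/*
 * Simulated recording loop. The hangup flag is checked before every read,
 * and the loop stops as soon as the flag is set. The hangup_after argument
 * simulates a hangup request arriving once that many frames have been
 * recorded (pass a negative value for "never").
 * Returns the number of frames recorded.
 */
static int demo_record_loop(struct demo_channel *chan, int hangup_after)
{
    int recorded = 0;

    for (;;) {
        if (recorded == hangup_after) {
            chan->hangup_requested = true;   /* simulated hangup request */
        }
        if (demo_check_hangup(chan)) {
            break;                           /* do not touch the channel again */
        }
        if (!demo_read_frame(chan)) {
            break;                           /* nothing left to record */
        }
        recorded++;
    }
    return recorded;
}
```

Calling `demo_record_loop(&chan, 3)` on a channel with ten frames stops after three reads. In a real multi-threaded program a check like this still leaves a window between the check and the read in which another thread can hang up or free the channel, so on its own it narrows the race rather than closing it.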

By: Rusty Newton (rnewton) 2013-04-01 19:40:21.734-0500

Michael - Asterisk SVN-branch-1.8-r384410M with your patch failed to compile with:

{noformat}
app_meetme.c: In function ‘recordthread’:
app_meetme.c:5168:32: error: expected ‘;’ before ‘{’ token
{noformat}

Not being a C developer and not knowing the Asterisk source, I took a wild guess at what you were doing and put your expression into an if condition. Compilation was successful, and I was still able to reproduce a crash using the script.

Attaching trace-patched-ASTERISK21305.txt. The trace looks quite different this time. Asterisk was compiled with DEBUG_THREADS, BETTER_BACKTRACES, DONT_OPTIMIZE.

By: Michael L. Young (elguero) 2013-04-01 20:41:19.411-0500

Rusty... yikes, how did I do that.  Sorry about that.

I was hoping it was going to be that simple but a part of me had a feeling that it wasn't going to be.

By: Rusty Newton (rnewton) 2013-04-02 15:15:03.554-0500

No worries at all! Thanks for attempting. Maybe that additional trace will be helpful to someone.