Summary: ASTERISK-24317: Crash preceded by potential memory leak.
Reporter: Roberto (tel.medola)
Labels:
Date Opened: 2014-09-10 13:55:27
Date Closed: 2014-11-13 15:10:19.000-0600
Priority: Major
Regression?:
Status: Closed/Complete
Components: General
Versions: 11.11.0 11.12.1
Frequency of Occurrence:
Related Issues: duplicates ASTERISK-22945 [patch] Memory leaks in chan_sip.c with realtime peers
Environment:
Attachments:
( 0) backtrace_tee.txt
( 1) backtrace.txt
( 2) build_peer.txt
( 3) memory_usage.zip
( 4) mmlog
( 5) mmlog.zip
( 6) ref_leak.zip
( 7) refs_0x285ef98.txt
( 8) refs_0x4a10f18.txt
( 9) refs_0x4a25968.txt
(10) refs_0x7f43a002ab08.txt
(11) refs_0x7f4400028898.txt
(12) refs_0x7f444c01bfb8.txt
(13) sip_data.sql
(14) sip.conf
Description:
Hi.
I'm running Asterisk 11.11.0 on CentOS release 6.5 (Final), x86_64.

I'm having serious problems with a memory leak in Asterisk.
The server has 16GB of RAM, and the process uses about 30% on average to serve my 100 users. However, I noticed that usage never drops below 30%, even when all users disconnect from Asterisk. Today Asterisk used all the memory and all the swap, and then crashed.

Could you help me please?
Thanks
Comments:By: Rusty Newton (rnewton) 2014-09-11 13:48:33.470-0500

Thank you for taking the time to report this bug and helping to make Asterisk better. Unfortunately, we cannot work on this bug because your description did not include enough information. You may find it helpful to read the Asterisk Issue Guidelines http://www.asterisk.org/developers/bug-guidelines. We would be grateful if you would then provide a more complete description of the problem. At a minimum, we need:

1. the specific steps or actions you took that caused you to encounter the problem,
2. the behavior you expected, and
3. the behavior you actually encountered (in as much detail as possible).

This likely includes output from the console with debug level logging, a SIP trace (if this is SIP related), and configuration information such as dialplan (e.g. extensions.conf) and channel configuration (e.g. sip.conf). Thanks!



By: Rusty Newton (rnewton) 2014-09-11 13:50:14.807-0500

Specifically:

1. Read through the guidelines: https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

2. Collect a backtrace for your crash, following the instructions closely. https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

From there we'll figure out what other diagnostics you need.

By: Roberto (tel.medola) 2014-09-11 14:40:32.374-0500

Hi, thanks for the answer.

I will try updating to Asterisk 11.12 first.

http://downloads.asterisk.org/pub/telephony/asterisk/ChangeLog-11-current



By: Roberto (tel.medola) 2014-09-16 14:59:04.840-0500

Hi. The update to 11.12 did not help.
Asterisk still crashes...

gdb prints this at the end of its output:
gdb asterisk /tmp/core.asteriskProducao-2014-09-16T16:14:39-0300

...
...
Program terminated with signal 11, Segmentation fault.
#0  0x00007fdb5ee59be9 in ?? () from /usr/lib64/asterisk/modules/codec_siren7.so

Could the problem be with siren7?


By: Rusty Newton (rnewton) 2014-10-01 08:58:38.550-0500

We still need you to follow the instructions provided:

{quote}
2. Collect a backtrace for your crash, following the instructions closely. https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace
{quote}

Without the debug requested we cannot investigate the issue.

By: Roberto (tel.medola) 2014-10-10 14:01:21.426-0500

Core dump backtraces attached...

thanks

By: Rusty Newton (rnewton) 2014-11-03 12:35:08.562-0600

Unfortunately the backtraces are not enough to track down what is happening.

Is your configuration and call-flow fairly simple? Could you possibly describe the general behavior of the system and provide a simplified configuration with which we could reproduce the issue?

You probably can't run Valgrind in production, so the next step is likely for you to re-compile with [MALLOC_DEBUG|https://wiki.asterisk.org/wiki/display/AST/MALLOC_DEBUG+Compiler+Flag] and attach the mmlog so that we can analyze it.

By: Roberto (tel.medola) 2014-11-04 05:13:19.414-0600

Ok.
After I deactivated the siren7 and siren14 codecs, it took a while for the crash to happen. The crash occurred at 5:00 PM yesterday.

I will enable the MALLOC_DEBUG option, but will it affect server performance?
Thanks.

By: Roberto (tel.medola) 2014-11-06 05:12:44.119-0600

Hi.
I enabled the siren7 and siren14 codecs and Asterisk crashed.

I downloaded both codecs (generic version) from http://my.digium.com/en/docs/siren14/siren14-download-overlay/

I first tried installing the barcelona version, but it also crashed.

By: Roberto (tel.medola) 2014-11-06 05:30:15.845-0600

... but the memory leak continues.
I have to restart Asterisk every day, even when not using the codecs.

By: Corey Farrell (coreyfarrell) 2014-11-06 05:38:23.050-0600

Since you have asterisk compiled with MALLOC_DEBUG, please provide output of 'memory show summary' and 'memory show allocations'.  Please run each command as soon as you start Asterisk, then again after Asterisk has run for a bit, upload the output.  This should help give us a better idea of where the leaks are occurring.

By: Roberto (tel.medola) 2014-11-07 10:52:41.540-0600

Another CRASH !!!

By: Roberto (tel.medola) 2014-11-07 12:28:28.043-0600

Attached is the memory output ('memory show summary' and 'memory show allocations') as requested.

I tried to collect it again, but the server froze and aborted the command.

I need help here. My server crashes almost every day and I can not give an answer to my manager.

Thanks.


By: Corey Farrell (coreyfarrell) 2014-11-07 13:08:25.757-0600

You are definitely leaking peers (Allocations_2014_11_07_09_48.txt shows almost 30,000 allocated peers).  Since peers are reference-counted objects, please turn MALLOC_DEBUG off and recompile with REF_DEBUG.  Follow the instructions on the [wiki|https://wiki.asterisk.org/wiki/display/AST/Reference+Count+Debugging].  Note that since you're running 11.12.0 you will have to run refcounter.py found in the Asterisk source.  Given the rate you are leaking it will not take very long to produce useful results.  If the output of refcounter.py is too large, you can extract a single leaked object that is constructed by 'build_peer'.  I also need to see your sip.conf.

By: Roberto (tel.medola) 2014-11-11 08:31:17.490-0600

Files attached.

I had to stop REF_DEBUG logging because the 'refs' file grew to almost 1GB in less than 20 minutes.

I extracted the 'build_peer' entries into a file as you requested.
While locating the build_peer entries, I found it odd that there is a very high incidence of chan_iax2 channel creation, even though I do not use IAX peers.

Anyway, I sent all the information you requested.

Thanks.

By: Corey Farrell (coreyfarrell) 2014-11-11 08:39:46.874-0600

Please provide the complete history of a single object created by build_peer.  For example if you {{grep refs -e '0x285ef98' > refs.txt}} you will get the history for that object.
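
Note: the grep command above can also be written as a short Python filter, which may be convenient when the refs file is very large. This is only a minimal sketch. The function names and the sample lines are hypothetical, and the sample does not reproduce the real refs log layout; the filter is a plain substring match, just like grep.

```python
def object_history(lines, address):
    """Yield every line that mentions the given object address."""
    for line in lines:
        if address in line:
            yield line


def write_history(refs_path, address, out_path):
    """Stream refs_path line by line and copy matching lines to out_path."""
    count = 0
    with open(refs_path, errors="replace") as src, open(out_path, "w") as dst:
        for line in object_history(src, address):
            dst.write(line)
            count += 1
    return count


# Hypothetical sample lines, only to show the filtering behaviour.
sample = [
    "0x285ef98 first event\n",
    "0x4a10f18 unrelated event\n",
    "0x285ef98 second event\n",
]
history = list(object_history(sample, "0x285ef98"))
print(len(history))
```

Because the file is streamed one line at a time, memory use stays small even for a refs file close to 1GB. A call such as write_history("refs", "0x285ef98", "refs.txt") would mirror the grep example.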

By: Roberto (tel.medola) 2014-11-11 08:51:20.695-0600

Ok, sorry...

Attached are extracts from different parts of the file for you to check.
Thanks

By: Corey Farrell (coreyfarrell) 2014-11-11 09:17:20.859-0600

I'm sorry this is actually showing me peers that did not leak.  Did {{refcounter.py -f /var/logs/asterisk/refs -sn}} fail to work?  If you can run that and extract an object from that it would be most helpful.

By: Roberto (tel.medola) 2014-11-11 09:25:28.642-0600

Yes, this is the result:

Traceback (most recent call last):
 File "./refcounter.py", line 187, in <module>
   sys.exit(main(sys.argv))
 File "./refcounter.py", line 166, in main
   skewed_objects) = process_file(options)
 File "./refcounter.py", line 88, in process_file
   current_objects[obj]['curcount'] += int(parsed_line['delta'])
TypeError: cannot concatenate 'str' and 'int' objects



By: Corey Farrell (coreyfarrell) 2014-11-11 09:31:12.502-0600

It looks like your copy of refcounter.py has a bug that has been fixed.  Please download [the latest for 12|http://svn.asterisk.org/svn/asterisk/branches/12/contrib/scripts/refcounter.py] and run that.  You may have to {{chmod +x refcounter.py}} after downloading.  This bug did not affect the data collected, so no need to rerun the test as long as you still have the refs file.
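
Note: the TypeError in the traceback above is the kind Python raises when a running count is still a string at the moment an integer is added to it. The sketch below illustrates that class of error and one way to avoid it. It is not the actual refcounter.py code and not necessarily how the upstream fix was made; both function names are hypothetical.

```python
def add_delta_buggy(curcount, delta):
    # Raises TypeError when curcount is still a string read from the log.
    return curcount + int(delta)


def add_delta_fixed(curcount, delta):
    # Convert both operands to int before adding.
    return int(curcount) + int(delta)


try:
    add_delta_buggy("1", "+1")
    failed = False
except TypeError:
    failed = True

print(failed)
print(add_delta_fixed("1", "+1"))
```

The exact wording of the TypeError message differs between Python versions, so the sketch only checks the exception type.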

By: Roberto (tel.medola) 2014-11-11 09:53:06.105-0600

Various memory leaks, such as:

==== Leaked Object 0x7f43a002ab08 history ====
[24577] format_cap.c:116 ast_format_cap_add: +1  - [**constructor**]
[24577] format_cap.c:120 ast_format_cap_add: +1  - [1]
[24577] format_cap.c:121 ast_format_cap_add: -1  - [2]

Do you need me to send the whole memory leak file? Zipped, it is 4MB.
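
Note: a history like the one quoted above can be tallied by summing the +1/-1 deltas. An object whose net count never returns to zero is never destroyed, which is why it is reported as leaked. The sketch below parses only the three lines shown in this comment. The regular expression is an assumption based on those lines and may not match every line refcounter.py can print.

```python
import re

# Assumed layout, taken from the three lines quoted in this comment:
# [thread] file:line function: delta  - [state]
HISTORY_LINE = re.compile(
    r"^\[(\d+)\]\s+(\S+):(\d+)\s+(\w+):\s+([+-]\d+)\s+-\s+\[(.*)\]$"
)

history = """\
[24577] format_cap.c:116 ast_format_cap_add: +1  - [**constructor**]
[24577] format_cap.c:120 ast_format_cap_add: +1  - [1]
[24577] format_cap.c:121 ast_format_cap_add: -1  - [2]
"""


def net_refcount(text):
    """Sum the reference count deltas of every parsable history line."""
    total = 0
    for line in text.splitlines():
        match = HISTORY_LINE.match(line.strip())
        if match:
            total += int(match.group(5))
    return total


remaining = net_refcount(history)
print(remaining)
```

For this object the deltas are +1, +1 and -1, so one reference is still held at the end of the history.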

By: Corey Farrell (coreyfarrell) 2014-11-11 09:58:01.299-0600

If it's 4MB ZIP then JIRA should accept it.  It will be easier that way.

By: Roberto (tel.medola) 2014-11-11 10:00:16.448-0600

File attached.

thanks.

By: Corey Farrell (coreyfarrell) 2014-11-11 10:51:40.798-0600

I believe you are suffering from ASTERISK-22945.  Please update to Asterisk 11.7.0 to get this fix, let us know if this resolves your issue.

By: Roberto (tel.medola) 2014-11-11 11:05:05.757-0600

Sorry, but I do not understand. My Asterisk is version 11.12.0.

By: Corey Farrell (coreyfarrell) 2014-11-11 11:10:45.860-0600

I typed wrong.. it's 11.14.0 you need.  In general if you are having any issue you should always try updating to the latest version of the current branch, as the issue may already be resolved.

By: Roberto (tel.medola) 2014-11-11 12:28:08.163-0600

OK Farrell, I appreciate it.
I will update today after work hours and, after one or two days of use, I will comment on the result.

Regarding updating before opening the issue, I think I did that. However, I do not think version 11.14 had been released on 09-11.

Thanks.

By: Roberto (tel.medola) 2014-11-13 03:52:34.925-0600

Hi.
Yesterday Asterisk was using 0.7% of my server's 32GB and was stable throughout the day.
I think the memory leak problem is solved with version 11.14.

Thanks a lot.
Roberto.