[Home]

Summary:ASTERISK-28609: Memory Leak in res_rtp_asterisk.c
Reporter:Ted G (tgwaste)Labels:fax patch
Date Opened:2019-11-05 22:24:08.000-0600Date Closed:2019-12-18 08:47:54.000-0600
Priority:MinorRegression?
Status:Closed/CompleteComponents:General
Versions:16.3.0 Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) ASTERISK-28609-1.patch
( 1) frame-malloc-debug.patch
( 2) leak.txt
( 3) mem-20191216.txt
Description:Hello,
 There appears to be a memory leak in frame.c.

Sometimes I received these errors in my logs:
[Nov  5 21:14:49] WARNING[14918][C-00028a27]: file.c:258 ast_writestream: Translated frame write failed

I believe this happenes because of some issues between my carrier and the asterisk system and most of the time these errors are not frequent.
However sometimes these errors are produced thousands of times per day and when that happens the memory usage for asterisk shoots up and never recovers:

...
  1073986 bytes in       5455 allocations in file xmldoc.c
  1327380 bytes in       1819 allocations in file chan_sip.c
  1351770 bytes in        840 allocations in file res_rtp_asterisk.c
  1707720 bytes in        105 allocations in file smoother.c
  2409853 bytes in        255 allocations in file file.c
  2988480 bytes in        163 allocations in file bridge_softmix.c
  7310456 bytes in        633 allocations in file translate.c
  7434868 bytes in       2773 allocations in file confbridge/conf_config_parser.c
899355372 bytes ( 899355372 cache) in    2046261 allocations in file frame.c         <----
933339512 bytes allocated (899355372 in caches) in 2102261 selected allocations
933339512 bytes in all allocations
934016850 bytes in all allocations and deferred free allocations


axis/root# top -b -n 1 | grep asterisk
19727 root      20   0 3353860 1.314g   8256 S 111.8  2.1  16620:19 asterisk     1.3GB :(

System uptime: 1 week, 3 days, 23 hours, 34 minutes, 42 seconds

1 week, 3 days worth of frame write failed error counts:

mmdd errors
1027 2218
1028 1426
1029 5275
1030 2454
1031 1612
1101 2797
1102 1635
1103 3278
1104 743
1105 1117
Comments:By: Asterisk Team (asteriskteam) 2019-11-05 22:24:09.137-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Joshua C. Colp (jcolp) 2019-11-06 05:03:04.498-0600

What codecs are in use? What type of file is it writing to? Can you provide call flow and console output?

By: Ted G (tgwaste) 2019-11-06 11:11:51.717-0600

What codecs are in use?
I'm only using ulaw.

What type of file is it writing to?
I don't quite understand the question.  Its writing to a stream afaik:   file.c:258 ast_writestream: Translated frame write failed

Can you provide call flow and console output?
Again, I don't understand the question.  The system is enormous.  To provide a complete map of call flow would be weeks of effort.
The basics are:  Voip Phone company -> My Asterisk System -> Do Stuff -> Hang up

By: Joshua C. Colp (jcolp) 2019-11-06 11:17:05.330-0600

The message is being output as a result of an audio frame being given to file.c and a format module. This is used to record audio into a file. Knowing what file is being recorded to (the extension) can help to understand precisely what is going on. The message itself is only one part and provides a hint that can then be used to identify the full scenario of what is going on. As it is there isn't enough information to be able to understand that.

By: Ted G (tgwaste) 2019-11-06 12:22:46.817-0600

Ah.  The file extension is always .wav

Input File     : 'test-rec.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Duration       : 00:04:29.55 = 2156392 samples ~ 20216.2 CDDA sectors
File Size      : 4.31M
Bit Rate       : 128k
Sample Encoding: 16-bit Signed Integer PCM


By: Joshua C. Colp (jcolp) 2019-11-07 07:45:51.612-0600

How is the recording being done? MixMonitor? Record? I've been looking through the code and experimenting, and haven't yet been able to reproduce the frame leak given the information.

By: Ted G (tgwaste) 2019-11-07 08:18:21.361-0600

I do both types of recording, could be either.  The system has a lot of traffic.  I am not able to reproduce this at will.  It happens over time (~week)

By: Corey Farrell (coreyfarrell) 2019-11-07 09:01:52.687-0600

If you can retest with the attached patch it will give better debugging information for frame allocations.  Specifically MALLOC_DEBUG will report allocations as coming from the caller to ast_frdup / ast_frisolate instead of reporting frame.c.  So for example if MixMonitor were leaking frames you would see allocations for app_mixmonitor.c increasing.

Note this disables allocation caching in frame.c so it will likely have a performance impact but since you are already MALLOC_DEBUG I do not expect this to be major.

By: Ted G (tgwaste) 2019-11-07 09:55:51.147-0600

Thank you!  Will test as soon as I can.  I haven't noticed any performance issues with MALLOC_DEBUG so I just keep it on.

By: Ted G (tgwaste) 2019-11-08 12:09:04.810-0600

First I apologize as I know this is an amateur hour question but its probably been at least a decade since I applied a patch and there are multiple ways I've come across in my searching for how to do it.

Can you tell me the proper way?

Im getting the following in my attempt:

/usr/src/asterisk-16.3.0# patch include/asterisk/frame.h frame-malloc-debug.patch
patching file include/asterisk/frame.h
patching file include/asterisk/frame.h
Hunk #1 FAILED at 43.
Hunk #2 FAILED at 185.
Hunk #3 FAILED at 194.
Hunk #4 FAILED at 205.
Hunk #5 FAILED at 291.
Hunk #6 FAILED at 313.
Hunk #7 FAILED at 736.
7 out of 7 hunks FAILED -- saving rejects to file include/asterisk/frame.h.rej

/usr/src/asterisk-16.3.0# diff include/asterisk/frame.h ../asterisk-16.3.0.bak/include/asterisk/frame.h
563,564c563
< #define ast_frisolate(fr) __ast_frisolate(fr, __FILE__, __LINE__, __PRETTY_FUNCTION__)
< struct ast_frame *__ast_frisolate(struct ast_frame *fr, const char *file, int line, const char *func);
---
> struct ast_frame *ast_frisolate(struct ast_frame *fr);
571,572c570
< #define ast_frdup(fr) __ast_frdup(fr, __FILE__, __LINE__, __PRETTY_FUNCTION__)
< struct ast_frame *__ast_frdup(const struct ast_frame *fr, const char *file, int line, const char *func);
---
> struct ast_frame *ast_frdup(const struct ast_frame *fr);

Is that patched or not? :)


By: Sean Bright (seanbright) 2019-11-08 12:12:03.488-0600

{noformat}
cd /usr/src/asterisk-16.3.0
patch -p1 < /path/to/frame-malloc-debug.patch
{noformat}

By: Ted G (tgwaste) 2019-11-08 12:20:15.293-0600

Thank you Sean. All good.

By: Ted G (tgwaste) 2019-11-08 20:57:37.532-0600

So far since the patch I've received 230 errors but I'm not seeing anything different in memory show summary or console output

{code}
# asterisk -x 'memory show summary' | grep fr
     9520 bytes in        170 allocations in file framehook.c
   679167 bytes in deferred free large allocations
    22601 bytes in deferred free small allocations
   701768 bytes in deferred free allocations
 43627447 bytes in all allocations and deferred free allocations
{code}

[Nov  8 19:53:47] WARNING[13303][C-00001291]: file.c:258 ast_writestream: Translated frame write failed

Memory hasn't gone crazy yet but I am unsure what to be on the lookout for exactly.


By: Corey Farrell (coreyfarrell) 2019-11-10 19:18:18.845-0600

With my patch you will not see allocations under frame.c anymore, they will appear elsewhere.

By: Sean Bright (seanbright) 2019-11-12 08:37:10.479-0600

[~tgwaste], in your {{MixMonitor}} usage - are you using the {{r}} and/or {{t}} options along with the newer {{S}} option?

By: Ted G (tgwaste) 2019-11-12 17:18:54.395-0600

The only 2 flags I use for MixMonitor are 'a' and 'b'

IE:
EXEC MIXMONITOR /path/to/file,ab

By: Joshua C. Colp (jcolp) 2019-11-13 13:04:14.578-0600

Per the comment from [~coreyfarrell] the allocations will appear elsewhere. Can you please attach the full output?

By: Sean Bright (seanbright) 2019-11-20 10:10:35.954-0600

[~tgwaste], can you attach the output from {{memory show summary}} (don't filter it in any way).

By: Corey Farrell (coreyfarrell) 2019-11-23 13:37:37.734-0600

{code}
 27446120 bytes in     132721 allocations in file res_rtp_asterisk.c
{code}

How many channels were active when you retrieved this information?  Please post the complete output of asterisk CLI {{memory show allocations res_rtp_asterisk.c}} and tell us how many channels were active at the time of retrieval.

By: Ted G (tgwaste) 2019-11-23 13:50:23.256-0600

Was really slow pasting this.. I am unable to delete my duplicate messages.  Both messages resulted in an error from this site.

By: Richard Mudgett (rmudgett) 2019-11-23 14:10:02.803-0600

Please *attach* large blocks of text information as *files* to the issue instead of pasting them into a comment.  Your comment can then refer to points in the attached file.

By: Corey Farrell (coreyfarrell) 2019-11-23 14:52:26.188-0600

Sorry partly my fault I didn't specify.  For some reason I had expected the CLI command to group the output.  I've processed it with {{sort|uniq -c}}, attached as leak.txt.  It looks like {{ast_rtp_read}} has some paths where the {{frames}} list is non-empty but does not get returned or freed.

By: Benjamin Keith Ford (bford) 2019-11-25 09:43:26.096-0600

Looking at {{ast_rtp_read}}, it looks like Corey is right. I'm going to open an issue for an audit of the function.

By: George Joseph (gjoseph) 2019-12-05 15:08:04.438-0600

Ted, Can you upgrade to the latest version of 16 (16.6.2)?  I have a patch for you to try but I don't want to backport it to 16.3.

Apply the patch with
{noformat}
patch -p1 < ASTERISK-28609-1.patch
{noformat}
and rebuild.

THIS IS A TEST PATCH, not a permanent one.  

Give it a shot and let me know how it works.


By: Ted G (tgwaste) 2019-12-06 00:54:14.253-0600

Thanks George, I will try to get to this as soon as possible and report back.  

By: George Joseph (gjoseph) 2019-12-09 10:29:27.570-0600

Any news?


By: Ted G (tgwaste) 2019-12-09 11:20:06.003-0600

I'm applying the patch now.  I'll let it run a week(?) and then provide the same data Corey requested.


By: Ted G (tgwaste) 2019-12-10 08:03:19.463-0600

To clarify: Should I have installed the other patch as well?  frame-malloc-debug.patch?

By: George Joseph (gjoseph) 2019-12-10 08:40:11.510-0600

No.  I'd prefer that you only apply the fix patch.


By: George Joseph (gjoseph) 2019-12-12 09:16:22.304-0600

Hi [~tgwaste],  Any improvement?   We'd like to get this closed out. :)


By: Ted G (tgwaste) 2019-12-16 23:03:55.634-0600

Included is the output of {code}asterisk -x 'memory show summary'{code} for 1 full week.

>> mem-20191216.txt


By: George Joseph (gjoseph) 2019-12-17 06:23:34.229-0600

Well, that's certainly better than your original report of
{noformat}
899355372 bytes ( 899355372 cache) in    2046261 allocations in file frame.c         <----
{noformat}

Do you agree that the patch provided  solves your issue?


By: Ted G (tgwaste) 2019-12-17 16:06:53.297-0600

Well, as I originally stated this leak happens randomly.  Currently the memory usage is not crazy.  It can happen tomorrow or in a month there is no way of knowing.

Having said that, this doesn't quite seem right but it could be my lack of knowledge:

{code}
# asterisk -x 'core show channels concise' | grep SIP | wc -l
86
# asterisk -x 'memory show allocations res_rtp_asterisk.c' | wc -l
703
{code}


By: Sean Bright (seanbright) 2019-12-17 16:19:07.851-0600

[~tgwaste], what about that output doesn't seem right to you?

By: Ted G (tgwaste) 2019-12-17 16:29:04.068-0600

I thought it should closely match the number of channels but I guess that's not right?

I guess I miss-understood what Corey was getting at when he asked for the same information and asked:

"How many channels were active when you retrieved this information? Please post the complete output of asterisk CLI memory show allocations res_rtp_asterisk.c and tell us how many channels were active at the time of retrieval."

If everyone thinks the patch is the answer, then I'm OK to close the issue out.  If memory goes crazy again I can reopen. Will the patch be in the next version?

By: Sean Bright (seanbright) 2019-12-17 16:34:14.811-0600

The number of allocations in {{res_rtp_asterisk}} is some function of the number of channels, but I would not expect them to be the same number. 8 allocations in {{res_rtp_asterisk}} per SIP channel (703 / 86) does not seem unreasonable to me. Assuming that ratio remains constant, then I don't think there is anything to be concerned about.

By: George Joseph (gjoseph) 2019-12-17 16:34:55.264-0600

Yeah, there's only an approximate relationship.  We just wanted to get an idea of how busy the system was.

The patch does plug a few holes in the exact place shown in the debug information so I'm fairly confident that you won't see any increase over time but yeah feel free to re-open the issue if you do see them.

The patch was just officially too late for the next release but we're going to have to do another set of release candidates anyway so I'll see if we can get this in.


By: Friendly Automation (friendly-automation) 2019-12-18 08:47:56.072-0600

Change 13466 merged by Joshua C. Colp:
res_rtp_asterisk:  Add frame list cleanups to ast_rtp_read

[https://gerrit.asterisk.org/c/asterisk/+/13466|https://gerrit.asterisk.org/c/asterisk/+/13466]

By: Friendly Automation (friendly-automation) 2019-12-18 08:57:37.290-0600

Change 13393 merged by George Joseph:
res_rtp_asterisk:  Add frame list cleanups to ast_rtp_read

[https://gerrit.asterisk.org/c/asterisk/+/13393|https://gerrit.asterisk.org/c/asterisk/+/13393]

By: Friendly Automation (friendly-automation) 2019-12-18 09:01:33.379-0600

Change 13392 merged by George Joseph:
res_rtp_asterisk:  Add frame list cleanups to ast_rtp_read

[https://gerrit.asterisk.org/c/asterisk/+/13392|https://gerrit.asterisk.org/c/asterisk/+/13392]

By: Friendly Automation (friendly-automation) 2019-12-18 09:02:45.561-0600

Change 13483 merged by Friendly Automation:
res_rtp_asterisk:  Add frame list cleanups to ast_rtp_read

[https://gerrit.asterisk.org/c/asterisk/+/13483|https://gerrit.asterisk.org/c/asterisk/+/13483]

By: Friendly Automation (friendly-automation) 2019-12-18 09:03:31.215-0600

Change 13484 merged by George Joseph:
res_rtp_asterisk:  Add frame list cleanups to ast_rtp_read

[https://gerrit.asterisk.org/c/asterisk/+/13484|https://gerrit.asterisk.org/c/asterisk/+/13484]

By: Friendly Automation (friendly-automation) 2019-12-18 09:04:55.082-0600

Change 13412 merged by George Joseph:
res_rtp_asterisk:  Add frame list cleanups to ast_rtp_read

[https://gerrit.asterisk.org/c/asterisk/+/13412|https://gerrit.asterisk.org/c/asterisk/+/13412]