[Home]

Summary:ASTERISK-23616: Big memory leak in logger.c
Reporter:ibercom (ibercom)Labels:
Date Opened:2014-04-10 12:56:34Date Closed:2014-04-14 10:52:48
Priority:MajorRegression?
Status:Closed/CompleteComponents:Core/Logging
Versions:11.8.0 Frequency of
Occurrence
Constant
Related
Issues:
is duplicated byASTERISK-22540 WARNING[2324] asterisk.c: Fork failed: Cannot allocate memory in /var/log/asterisk/messages then segfault
Environment:Attachments:( 0) autoservice-callid-leak.patch
( 1) autoservice-callid-leak-v2.patch
( 2) callid-refleaks.txt
( 3) jira_asterisk_23616_v11_autoservice_leak.patch
( 4) memory.tar.gz
( 5) memory2.tar.gz
( 6) refs.gz
( 7) refs2.gz
Description:Once under a non-trivial load, Asterisk 11 produces a high volume of memory leak in logger.c

After processing 40039 calls "memory show summary" shows 176983 allocations:
560830 bytes in      13817 allocations in file logger.c

It grows and grows.
Comments:By: ibercom (ibercom) 2014-04-10 13:02:15.651-0500

memory show allocations shows:

40 bytes allocated by    ast_create_callid() line  1322 of logger.c

13809 lines.

By: Matt Jordan (mjordan) 2014-04-10 13:28:10.384-0500

I'm fairly confident that this was already fixed in r399513:

{noformat}
------------------------------------------------------------------------
r399513 | kharwell | 2013-09-20 09:23:30 -0500 (Fri, 20 Sep 2013) | 8 lines

Fix memory leak in logger.

Fixed a memory leak discovered in the logger where a temporary string buffer
was not being freed.

(closes issue ASTERISK-22540)
Reported by: John Hardin
{noformat}

{noformat}
Index: main/logger.c
===================================================================
--- main/logger.c (revision 399512)
+++ main/logger.c (revision 399513)
@@ -1811,6 +1811,7 @@
res = ast_str_set_va(&buf, 0, fmt, ap);
/* If the build failed then we can drop this allocated message */
if (res == AST_DYNSTR_BUILD_FAILED) {
+ ast_free(buf);
return;
}

@@ -1838,6 +1839,7 @@
} while (p && *p);

ast_log_callid(__LOG_VERBOSE, file, line, func, callid, "%s", ast_str_buffer(prefixed));
+ ast_free(buf);
}

void __ast_verbose(const char *file, int line, const char *func, int level, const char *fmt, ...)
{noformat}

This should have been fixed in the final release of Asterisk 11.7.0; if you're running the release candidate you would be running into this. You may want to jump to 11.8.0, as that should definitely include the fix.

By: ibercom (ibercom) 2014-04-10 13:43:08.809-0500

I'm sorry but I'm running asterisk-11.9.0-rc1

I have enabled Reference Count Debugging for logger.c and I have attached the results: /tmp/refs

By: Matt Jordan (mjordan) 2014-04-10 14:03:27.431-0500

Actually, if you're running with {{MALLOC_DEBUG}}, enable memory leaks at exit - that should tell us who is allocating all the memory.

By: Corey Farrell (coreyfarrell) 2014-04-10 14:11:43.128-0500

Based on the memalloc inside memory.tar.gz, you are leaking memory allocated by {{ast_create_callid}}.  Based on this we need to see /tmp/refs with {{REF_DEBUG}} defined for the whole system.  It's unlikely this is an actual leak in the logger.  Most likely something else that uses the callid is leaking it's reference.

By: ibercom (ibercom) 2014-04-10 14:13:43.773-0500

Matt: Yes, I'm running with MALLOC_DEBUG.

memory atexit --> list or summary ?

Which command ?


By: ibercom (ibercom) 2014-04-11 08:50:41.451-0500

Corey: Here you have the files with REF_DEBUG defined for the whole system.
refs2 is a big file, asterisk has run for 5 minutes and it has processed about 12 calls.
This has been a lightly loaded server.

By: Corey Farrell (coreyfarrell) 2014-04-11 16:16:06.456-0500

I've attached the history of 2 out of 3 callid's with leaks.  The 3rd one was omitted because of it's involvement in chan_iax2.c:socket_process.  The history of that 3rd was 76000 lines so I'm leaning towards looking at the other leaks first, and seeing if they resolve the chan_iax2 leak.

By: Corey Farrell (coreyfarrell) 2014-04-11 16:24:47.020-0500

I believe you are experiencing a leak from autoservice.  I was unable to match the following ref's to unref's:
{noformat}
  0x25b3450 +1   channel_internal_api.c:869:ast_channel_callid () [@2]
  0x25b3450 +1   logger.c:1368:ast_callid_threadassoc_change () [@3]
{noformat}

I searched the asterisk source, and found autoservice is the only caller of {{ast_read_threadstorage_callid}}.

[~ibercom]: Please try autoservice-callid-leak.patch, let us know if this solves your problem (either fully or partially).

By: Richard Mudgett (rmudgett) 2014-04-11 17:29:57.444-0500

Good job finding that ref leak Corey.

After looking at your patch [^autoservice-callid-leak.patch], I think this patch [^jira_asterisk_23616_v11_autoservice_leak.patch] is better in that it won't spam the debug logs when TEST_FRAMEWORK in enabled.  IIRC, not spamming the debug log is why ast_callid_threadassoc_change() was created.

By: Corey Farrell (coreyfarrell) 2014-04-11 18:40:00.223-0500

Good catch on the TEST_FRAMEWORK debug logs, we should go with your patch.

By: Corey Farrell (coreyfarrell) 2014-04-12 19:25:03.671-0500

I just realized, the clean-up section at the end of the procedure is pointless - it's a thread local reference and the thread is exiting.  [^autoservice-callid-leak-v2.patch] just leaves the thread local reference to by cleaned by thread exit.

By: ibercom (ibercom) 2014-04-14 08:51:48.907-0500

I'm trying the [^jira_asterisk_23616_v11_autoservice_leak.patch] and it is ok. This solves my problem with logger.c
Now "memory show summary" is only:
     8390 bytes in          6 allocations in file logger.c
with about 7000 calls.

Although there may be memory leak in iax2-parser.c:
<      62560 bytes (     62560 cache) in         55 allocations in file iax2-parser.c        (Before)
---
>     223648 bytes (    223648 cache) in        197 allocations in file iax2-parser.c     (After)


By: Richard Mudgett (rmudgett) 2014-04-14 09:54:18.165-0500

Corey, the code at the end of the procedure is really dead code since the autoservice thread never exits.  However, the thread exit will not release the callid reference held by the thread local storage since it only knows how to release the storage itself.

By: Corey Farrell (coreyfarrell) 2014-04-14 10:04:32.744-0500

I realize the code at the end is currently dead and plan to eventually submit a patch to exit autoservice at asterisk shutdown.  However on the other point, {{unique_callid_cleanup}} is used to clean the thread local storage, and it does release the reference if one exists.

By: Richard Mudgett (rmudgett) 2014-04-14 10:44:45.375-0500

Corey: Ok.  I see that now.