
Summary: ASTERISK-26516: pjsip: Memory corruption with possible memory leak.
Reporter: Richard Mudgett (rmudgett)
Labels:
Date Opened: 2016-10-28 11:18:20
Date Closed: 2016-11-16 23:20:39.000-0600
Priority: Major
Regression?:
Status: Closed/Complete
Components: Channels/chan_pjsip Resources/res_pjsip
Versions: 13.11.2
Frequency of Occurrence: Occasional
Related Issues:
  is a clone of ASTERISK-26517 pjsip: Memory corruption with possible memory leak.
  is related to ASTERISK-26344 Asterisk 13.11.0 + PJSIP crash
  is related to ASTERISK-26387 Asterisk segfaults shortly after starting even with no active calls.
Environment:
Attachments: ( 0) 0007_jira_asterisk_26516_v13_free_mem_fix.patch
( 1) 2016-10-28.tgz
( 2) cli-and-gdb-2016-10-27.tgz
( 3) jira_asterisk_26344_v13_committed_fix.patch
( 4) jira_asterisk_26344_v13_debuging.patch
( 5) jira_asterisk_26516_test_fix_v13.patch
( 6) jira_asterisk_26516_v13_debugging_v2.patch
( 7) jira_asterisk_26516_v13_debugging.patch
( 8) screenlog-2016-11-01.tgz
( 9) screenlog-2016-11-02.tgz
(10) screenlog-2016-11-03.tgz
(11) screenlog-2016-11-10.tgz
(12) screenlog-2016-11-14.tgz
Description: This is a continuation of ASTERISK-26344 and ASTERISK-26387, since the original crashes were fixed with patches added to pjproject.  For the setup description, see ASTERISK-26344.
Comments:

By: Richard Mudgett (rmudgett) 2016-10-28 18:18:35.269-0500

[~tuxian] - The valgrind logs for runs 2 and 3 show who is corrupting memory.  The dns_a_callback() is using the passed-in query pointer, which was allocated from a destroyed tdata's memory pool.  Now I just need to figure out why the tdata object is getting destroyed before dns_a_callback() is called to complete the tdata's DNS query.
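
To make the failure mode concrete, here is a minimal, hypothetical sketch of the pattern described above (illustrative only; the struct, callback name, and pool names are invented for the example and are not the actual pjproject code):

{noformat}
#include <pjlib.h>

/* Hypothetical async DNS query state, allocated from a request's memory pool. */
struct dns_query {
    char name[128];
    int resolved;
};

static void dns_a_callback_example(struct dns_query *q)
{
    /* If the pool that owned 'q' was destroyed along with its tdata before the
     * resolver completed, this write lands in freed memory -- the corruption
     * valgrind reported. */
    q->resolved = 1;
}

int main(void)
{
    pj_caching_pool cp;
    pj_pool_t *tdata_pool;
    struct dns_query *q;

    pj_init();
    pj_caching_pool_init(&cp, NULL, 0);

    /* Pool owned by a transmit buffer (tdata) in the real code. */
    tdata_pool = pj_pool_create(&cp.factory, "tdta", 512, 512, NULL);
    q = pj_pool_zalloc(tdata_pool, sizeof(*q));

    /* The tdata (and therefore its pool) is destroyed before the DNS lookup
     * completes... */
    pj_pool_release(tdata_pool);

    /* ...so when the resolver later invokes the callback, 'q' is dangling. */
    dns_a_callback_example(q);

    pj_caching_pool_destroy(&cp);
    pj_shutdown();
    return 0;
}
{noformat}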

By: Richard Mudgett (rmudgett) 2016-10-31 18:41:29

[^jira_asterisk_26344_v13_committed_fix.patch] - This is the actual patch that got committed for ASTERISK-26344.  Apply by {{patch -p1 -i <patch-file-name>}}

[^jira_asterisk_26516_v13_debugging.patch] - This is an updated debug message patch that replaces [^jira_asterisk_26344_v13_debuging.patch].  Apply by {{patch -p1 -i <patch-file-name>}}

[^0007_jira_asterisk_26516_v13_free_mem_fix.patch] - Copy this file into the third-party/pjproject/patches directory.

[~tuxian] I'm fairly certain I found the primary cause of the memory corruption.  Ironically, it turns out to be a hole in the original fix patch on ASTERISK-26344, which was closed in the patch that got committed.  However, I think there is another potential source of corruption that is fixed by [^0007_jira_asterisk_26516_v13_free_mem_fix.patch].

# From a clean workspace, apply the three patches.
# Run your DNS test with MALLOC_DEBUG.  It should be able to run overnight.

I expect the memory leak was another symptom of the bug that was also causing the primary memory corruption.

Please test and report back.  Thanks.

By: Ian Gilmour (tuxian) 2016-11-01 10:06:23.782-0500

Hi Richard,

The patches supplied didn't apply cleanly, so I updated to the current origin/13 branch, which appears to already include jira_asterisk_26344_v13_committed_fix.patch. The other patches then applied cleanly.

Leaving res_pjsip_outbound_registration.c with:

{noformat}
schedule_registration(state->client_state, 3 + 1);
{noformat}

gives me a system that still appears to be leaking memory.

mmlog simply has:

{noformat}
1478001617 - New session
{noformat}

But grepping the CLI output after running "memory show summary" and "memory show allocations" occasionally throughout the test run gives me:

{noformat}
grep -e "bytes in all allocations and deferred free allocations" -e "allocations in file.*pool_policy_malloc.c" ~/screenlog.0
  6656448 bytes in        682 allocations in file ../src/pj/pool_policy_malloc.c
 13437987 bytes in all allocations and deferred free allocations
  6697408 bytes in        722 allocations in file ../src/pj/pool_policy_malloc.c
 13571471 bytes in all allocations and deferred free allocations
  6728128 bytes in        752 allocations in file ../src/pj/pool_policy_malloc.c
 13577326 bytes in all allocations and deferred free allocations
  6733248 bytes in        757 allocations in file ../src/pj/pool_policy_malloc.c
 13605876 bytes in all allocations and deferred free allocations
  6738368 bytes in        762 allocations in file ../src/pj/pool_policy_malloc.c
 13613843 bytes in all allocations and deferred free allocations
  6755776 bytes in        776 allocations in file ../src/pj/pool_policy_malloc.c
 13573691 bytes in all allocations and deferred free allocations
  6760896 bytes in        781 allocations in file ../src/pj/pool_policy_malloc.c
 13624265 bytes in all allocations and deferred free allocations
 13607107 bytes in all allocations and deferred free allocations
 13650533 bytes in all allocations and deferred free allocations
  6791616 bytes in        811 allocations in file ../src/pj/pool_policy_malloc.c
 13670895 bytes in all allocations and deferred free allocations
 13716032 bytes in all allocations and deferred free allocations
 13673257 bytes in all allocations and deferred free allocations
 13698627 bytes in all allocations and deferred free allocations
  6853056 bytes in        871 allocations in file ../src/pj/pool_policy_malloc.c
 13726484 bytes in all allocations and deferred free allocations
 13720457 bytes in all allocations and deferred free allocations
  6991296 bytes in       1006 allocations in file ../src/pj/pool_policy_malloc.c
 13851289 bytes in all allocations and deferred free allocations
 13851545 bytes in all allocations and deferred free allocations
 13934849 bytes in all allocations and deferred free allocations
 13933169 bytes in all allocations and deferred free allocations
{noformat}
I highlighted the pool_policy_malloc.c above as this seems to be the file that shows a steady increase in usage.

I've attached the CLI output as [^screenlog-2016-11-01.tgz].  

By: Friendly Automation (friendly-automation) 2016-11-01 14:36:46.118-0500

Change 4270 had a related patch set uploaded by Richard Mudgett:
bundled pjproject: Fix DNS write to freed memory.

[https://gerrit.asterisk.org/4270|https://gerrit.asterisk.org/4270]

By: Friendly Automation (friendly-automation) 2016-11-01 14:37:05.533-0500

Change 4271 had a related patch set uploaded by Richard Mudgett:
bundled pjproject: Fix DNS write to freed memory.

[https://gerrit.asterisk.org/4271|https://gerrit.asterisk.org/4271]

By: Friendly Automation (friendly-automation) 2016-11-01 14:37:16.457-0500

Change 4272 had a related patch set uploaded by Richard Mudgett:
bundled pjproject: Fix DNS write to freed memory.

[https://gerrit.asterisk.org/4272|https://gerrit.asterisk.org/4272]

By: Richard Mudgett (rmudgett) 2016-11-01 18:24:40.105-0500

Teluu has accepted the [^0007_jira_asterisk_26516_v13_free_mem_fix.patch] and it will be merged into the v13+ branches in a few hours.

The increasing memory may be a red herring if it levels out after a while.  Pjproject caches released memory pools to be reused the next time a pool is needed.  You should run a longer-duration test to see if the memory keeps growing.
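
For context, the caching behaviour being described is roughly the following (a minimal sketch using the public pjlib pool API; the sizes and names are arbitrary):

{noformat}
#include <pjlib.h>

int main(void)
{
    pj_caching_pool cp;
    pj_pool_t *pool;

    pj_init();
    /* max_capacity = 0 means the factory's cache size is unlimited. */
    pj_caching_pool_init(&cp, NULL, 0);

    /* Releasing a pool does NOT return its memory to the OS... */
    pool = pj_pool_create(&cp.factory, "demo", 4000, 4000, NULL);
    pj_pool_release(pool);

    /* ...the block is kept in the factory's cache and handed back out the
     * next time a similarly sized pool is requested, so process memory tends
     * to plateau rather than shrink between registration cycles. */
    pool = pj_pool_create(&cp.factory, "demo2", 4000, 4000, NULL);
    pj_pool_release(pool);

    /* Only destroying the caching pool frees the cached blocks. */
    pj_caching_pool_destroy(&cp);
    pj_shutdown();
    return 0;
}
{noformat}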

By: Friendly Automation (friendly-automation) 2016-11-02 01:22:13.339-0500

Change 4272 merged by zuul:
bundled pjproject: Fix DNS write to freed memory.

[https://gerrit.asterisk.org/4272|https://gerrit.asterisk.org/4272]

By: Friendly Automation (friendly-automation) 2016-11-02 01:22:16.075-0500

Change 4271 merged by zuul:
bundled pjproject: Fix DNS write to freed memory.

[https://gerrit.asterisk.org/4271|https://gerrit.asterisk.org/4271]

By: Friendly Automation (friendly-automation) 2016-11-02 05:24:59.974-0500

Change 4270 merged by Joshua Colp:
bundled pjproject: Fix DNS write to freed memory.

[https://gerrit.asterisk.org/4270|https://gerrit.asterisk.org/4270]

By: Ian Gilmour (tuxian) 2016-11-02 11:58:53.312-0500

Hi Richard, yes, I realise pjproject caches released memory pools for reuse the next time a pool is needed, but this test only re-registers 5 TLS users every couple of minutes, so I would have thought it would reach a stable state reasonably quickly, and well before a couple of hours, no?

Running the same system I built yesterday for 9 hours today (see [^screenlog-2016-11-02.tgz]) I see the cache pools growing:

{noformat}
grep -e "bytes in all allocations and deferred free allocations" -e "allocations in file.*pool_policy_malloc.c" -e "cachpool.*Total" screenlog.0
  6640576 bytes in        673 allocations in file ../src/pj/pool_policy_malloc.c
 13189225 bytes in all allocations and deferred free allocations
 13189497 bytes in all allocations and deferred free allocations
  6813632 bytes in        839 allocations in file ../src/pj/pool_policy_malloc.c
 13656403 bytes in all allocations and deferred free allocations
 13656395 bytes in all allocations and deferred free allocations
 13755992 bytes in all allocations and deferred free allocations
  6926272 bytes in        928 allocations in file ../src/pj/pool_policy_malloc.c
 13755980 bytes in all allocations and deferred free allocations
  7065248 bytes in       1015 allocations in file ../src/pj/pool_policy_malloc.c
 13927279 bytes in all allocations and deferred free allocations
 13927293 bytes in all allocations and deferred free allocations
  7298496 bytes in       1227 allocations in file ../src/pj/pool_policy_malloc.c
 14128893 bytes in all allocations and deferred free allocations
 14168989 bytes in all allocations and deferred free allocations
  7393728 bytes in       1320 allocations in file ../src/pj/pool_policy_malloc.c
 14251689 bytes in all allocations and deferred free allocations
 14250852 bytes in all allocations and deferred free allocations
  7788992 bytes in       1641 allocations in file ../src/pj/pool_policy_malloc.c
 14654274 bytes in all allocations and deferred free allocations
 14654285 bytes in all allocations and deferred free allocations
  8027072 bytes in       1806 allocations in file ../src/pj/pool_policy_malloc.c
 14876070 bytes in all allocations and deferred free allocations
 14876084 bytes in all allocations and deferred free allocations
  8444352 bytes in       2084 allocations in file ../src/pj/pool_policy_malloc.c
 15328742 bytes in all allocations and deferred free allocations
 15331609 bytes in all allocations and deferred free allocations
     cachpool   Total   7300064 of   8496320 (85 %) used!
  8728000 bytes in       2271 allocations in file ../src/pj/pool_policy_malloc.c
 15614381 bytes in all allocations and deferred free allocations
 15614392 bytes in all allocations and deferred free allocations
     cachpool   Total   7405112 of   8688320 (85 %) used!
     cachpool   Total   7405496 of   8690368 (85 %) used!
  8789440 bytes in       2331 allocations in file ../src/pj/pool_policy_malloc.c
 15668738 bytes in all allocations and deferred free allocations
 15668722 bytes in all allocations and deferred free allocations
     cachpool   Total   7416632 of   8749760 (84 %) used!
  8852928 bytes in       2393 allocations in file ../src/pj/pool_policy_malloc.c
 15732482 bytes in all allocations and deferred free allocations
 15732493 bytes in all allocations and deferred free allocations
     cachpool   Total   7428536 of   8813248 (84 %) used!
{noformat}

and

{noformat}
grep -e ":13:0" -A 8 screenlog.2 | sed 's/[[:space:]]*$//' | grep -e "top -" -e "21121"
top - 07:13:07 up 20 days, 23:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  53m  13m S 18.4  0.7   0:03.80 asterisk
top - 08:13:02 up 21 days, 12 min,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  58m  13m S  0.4  0.7   0:20.59 asterisk
top - 09:13:02 up 21 days,  1:12,  0 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  59m  13m S  0.4  0.8   0:38.15 asterisk
top - 10:13:05 up 21 days,  2:12,  0 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  61m  13m S  0.4  0.8   0:54.69 asterisk
top - 11:13:00 up 21 days,  3:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  62m  13m S  0.4  0.8   1:12.00 asterisk
top - 11:13:09 up 21 days,  3:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  62m  13m S  0.3  0.8   1:12.03 asterisk
top - 12:13:03 up 21 days,  4:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  63m  13m S  0.2  0.8   1:28.45 asterisk
top - 13:13:06 up 21 days,  5:12,  0 users,  load average: 0.00, 0.01, 0.00
21121 iang      20   0 2777m  63m  13m S  0.4  0.8   1:46.21 asterisk
top - 14:13:02 up 21 days,  6:12,  6 users,  load average: 0.00, 0.09, 0.13
21121 iang      20   0 2777m  64m  13m S  0.2  0.8   2:03.18 asterisk
top - 15:13:06 up 21 days,  7:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  65m  13m S  0.6  0.8   2:19.99 asterisk
top - 16:13:07 up 21 days,  8:12,  6 users,  load average: 0.05, 0.01, 0.00
21121 iang      20   0 2777m  65m  13m S  0.6  0.8   2:38.45 asterisk
{noformat}

I'll leave it running overnight.

By: Ian Gilmour (tuxian) 2016-11-03 03:16:09.340-0500

24+ hours later and the pjproject cache pool is still growing:

{noformat}
  6640576 bytes in        673 allocations in file ../src/pj/pool_policy_malloc.c
 13189225 bytes in all allocations and deferred free allocations
 13189497 bytes in all allocations and deferred free allocations
  6813632 bytes in        839 allocations in file ../src/pj/pool_policy_malloc.c
 13656403 bytes in all allocations and deferred free allocations
 13656395 bytes in all allocations and deferred free allocations
 13755992 bytes in all allocations and deferred free allocations
  6926272 bytes in        928 allocations in file ../src/pj/pool_policy_malloc.c
 13755980 bytes in all allocations and deferred free allocations
  7065248 bytes in       1015 allocations in file ../src/pj/pool_policy_malloc.c
 13927279 bytes in all allocations and deferred free allocations
 13927293 bytes in all allocations and deferred free allocations
  7298496 bytes in       1227 allocations in file ../src/pj/pool_policy_malloc.c
 14128893 bytes in all allocations and deferred free allocations
 14168989 bytes in all allocations and deferred free allocations
  7393728 bytes in       1320 allocations in file ../src/pj/pool_policy_malloc.c
 14251689 bytes in all allocations and deferred free allocations
 14250852 bytes in all allocations and deferred free allocations
  7788992 bytes in       1641 allocations in file ../src/pj/pool_policy_malloc.c
 14654274 bytes in all allocations and deferred free allocations
 14654285 bytes in all allocations and deferred free allocations
  8027072 bytes in       1806 allocations in file ../src/pj/pool_policy_malloc.c
 14876070 bytes in all allocations and deferred free allocations
 14876084 bytes in all allocations and deferred free allocations
  8444352 bytes in       2084 allocations in file ../src/pj/pool_policy_malloc.c
 15328742 bytes in all allocations and deferred free allocations
 15331609 bytes in all allocations and deferred free allocations
     cachpool   Total   7300064 of   8496320 (85 %) used!
  8728000 bytes in       2271 allocations in file ../src/pj/pool_policy_malloc.c
 15614381 bytes in all allocations and deferred free allocations
 15614392 bytes in all allocations and deferred free allocations
     cachpool   Total   7405112 of   8688320 (85 %) used!
     cachpool   Total   7405496 of   8690368 (85 %) used!
  8789440 bytes in       2331 allocations in file ../src/pj/pool_policy_malloc.c
 15668738 bytes in all allocations and deferred free allocations
 15668722 bytes in all allocations and deferred free allocations
     cachpool   Total   7416632 of   8749760 (84 %) used!
  8852928 bytes in       2393 allocations in file ../src/pj/pool_policy_malloc.c
 15732482 bytes in all allocations and deferred free allocations
 15732493 bytes in all allocations and deferred free allocations
     cachpool   Total   7428536 of   8813248 (84 %) used!
  9724864 bytes in       3017 allocations in file ../src/pj/pool_policy_malloc.c
 16593522 bytes in all allocations and deferred free allocations
 16593514 bytes in all allocations and deferred free allocations
     cachpool   Total   7821800 of   9681600 (80 %) used!
 10305984 bytes in       3427 allocations in file ../src/pj/pool_policy_malloc.c
 17144668 bytes in all allocations and deferred free allocations
 17179679 bytes in all allocations and deferred free allocations
     cachpool   Total   8092448 of  10264768 (78 %) used!
 13430720 bytes in       5626 allocations in file ../src/pj/pool_policy_malloc.c
 20263597 bytes in all allocations and deferred free allocations
 20284056 bytes in all allocations and deferred free allocations
     cachpool   Total   9554320 of  13394112 (71 %) used!
{noformat}

top confirms it:

{noformat}
top - 07:13:07 up 20 days, 23:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  53m  13m S 18.4  0.7   0:03.80 asterisk
top - 08:13:02 up 21 days, 12 min,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  58m  13m S  0.4  0.7   0:20.59 asterisk
top - 09:13:02 up 21 days,  1:12,  0 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  59m  13m S  0.4  0.8   0:38.15 asterisk
top - 10:13:05 up 21 days,  2:12,  0 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  61m  13m S  0.4  0.8   0:54.69 asterisk
top - 11:13:00 up 21 days,  3:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  62m  13m S  0.4  0.8   1:12.00 asterisk
top - 11:13:09 up 21 days,  3:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  62m  13m S  0.3  0.8   1:12.03 asterisk
top - 12:13:03 up 21 days,  4:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  63m  13m S  0.2  0.8   1:28.45 asterisk
top - 13:13:06 up 21 days,  5:12,  0 users,  load average: 0.00, 0.01, 0.00
21121 iang      20   0 2777m  63m  13m S  0.4  0.8   1:46.21 asterisk
top - 14:13:02 up 21 days,  6:12,  6 users,  load average: 0.00, 0.09, 0.13
21121 iang      20   0 2777m  64m  13m S  0.2  0.8   2:03.18 asterisk
top - 15:13:06 up 21 days,  7:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  65m  13m S  0.6  0.8   2:19.99 asterisk
top - 16:13:07 up 21 days,  8:12,  6 users,  load average: 0.05, 0.01, 0.00
21121 iang      20   0 2777m  65m  13m S  0.6  0.8   2:38.45 asterisk
top - 17:13:02 up 21 days,  9:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  65m  13m S  0.6  0.8   2:56.43 asterisk
top - 18:13:05 up 21 days, 10:12,  0 users,  load average: 0.06, 0.02, 0.00
21121 iang      20   0 2777m  65m  13m S  0.2  0.8   3:13.08 asterisk
top - 19:13:03 up 21 days, 11:12,  6 users,  load average: 0.01, 0.04, 0.00
21121 iang      20   0 2777m  67m  13m S  0.4  0.9   3:29.53 asterisk
top - 20:13:06 up 21 days, 12:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  68m  13m S  0.4  0.9   3:46.76 asterisk
top - 21:13:01 up 21 days, 13:12,  6 users,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  70m  13m S  0.6  0.9   4:04.42 asterisk
top - 22:13:04 up 21 days, 14:12,  1 user,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  70m  13m S  0.6  0.9   4:22.45 asterisk
top - 23:13:08 up 21 days, 15:12,  1 user,  load average: 0.00, 0.01, 0.00
21121 iang      20   0 2777m  73m  13m S  0.4  0.9   4:39.52 asterisk
top - 00:13:02 up 21 days, 16:12,  1 user,  load average: 0.00, 0.03, 0.01
21121 iang      20   0 2777m  75m  13m S  0.6  1.0   4:56.93 asterisk
top - 01:13:05 up 21 days, 17:12,  1 user,  load average: 0.02, 0.06, 0.03
21121 iang      20   0 2777m  75m  13m S  0.4  1.0   5:14.31 asterisk
top - 02:13:08 up 21 days, 18:12,  1 user,  load average: 0.04, 0.11, 0.05
21121 iang      20   0 2777m  78m  13m S  0.6  1.0   5:30.85 asterisk
top - 03:13:03 up 21 days, 19:12,  1 user,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  81m  13m S  0.7  1.0   5:47.84 asterisk
top - 04:13:06 up 21 days, 20:12,  1 user,  load average: 0.01, 0.01, 0.00
21121 iang      20   0 2777m  81m  13m S  0.7  1.0   6:05.10 asterisk
top - 05:13:01 up 21 days, 21:12,  1 user,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  83m  13m S  0.3  1.1   6:22.42 asterisk
top - 06:13:05 up 21 days, 22:12,  1 user,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  83m  13m S  0.4  1.1   6:39.28 asterisk
top - 07:13:08 up 21 days, 23:12,  1 user,  load average: 0.00, 0.00, 0.00
21121 iang      20   0 2777m  85m  13m S  0.6  1.1   6:57.48 asterisk
{noformat}

Full CLI log in [^screenlog-2016-11-03.tgz] (and mmlog has no entries other than the initial timestamp).

By: Richard Mudgett (rmudgett) 2016-11-09 13:07:57.134-0600

[^jira_asterisk_26516_v13_debugging_v2.patch] - Replaces the earlier [^jira_asterisk_26516_v13_debugging.patch].  The new version removes the grp_lock logging and adds logging for memory pools.  Hopefully, this will give enough information to determine which pool is leaking.

From an updated and clean workspace, install [^jira_asterisk_26516_v13_debugging_v2.patch] and your sip_outbound_registration_perform() {{schedule_registration(state->client_state, (ast_random() % 10) + 1);}} change.
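
For readability, this is the one-line test change being referred to, as it would sit inside sip_outbound_registration_perform() in res/res_pjsip_outbound_registration.c (only the quoted call comes from this issue; the explanatory comment is an assumption about its intent):

{noformat}
/* Test-only change: re-register each client after a random 1-10 second delay
 * instead of the normally scheduled interval, to drive rapid REGISTER cycles. */
schedule_registration(state->client_state, (ast_random() % 10) + 1);
{noformat}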

A log capture of 30 minutes should accumulate enough leaked memory to identify which pool is leaking.

There is a recent fix that is likely to affect the memory leak, and it should be merged by the time you are able to test: https://gerrit.asterisk.org/#/c/4351/

By: Ian Gilmour (tuxian) 2016-11-10 10:15:20.730-0600

[^screenlog-2016-11-10.tgz] was generated using the current branch 13 trunk + your v2 patch.

Memory usage still looks to be creeping up...

{noformat}
grep -e "bytes in all allocations and deferred free allocations" -e "allocations in file.*pool_policy_malloc.c" -e "cachpool.*used_cnt" ~/screenlog.0
  6643136 bytes in        676 allocations in file ../src/pj/pool_policy_malloc.c
 13473977 bytes in all allocations and deferred free allocations
 13474266 bytes in all allocations and deferred free allocations
     cachpool    Capacity=52480, max_capacity=1048576, used_cnt=40
  6669248 bytes in        695 allocations in file ../src/pj/pool_policy_malloc.c
 13509783 bytes in all allocations and deferred free allocations
 13509767 bytes in all allocations and deferred free allocations
     cachpool    Capacity=63232, max_capacity=1048576, used_cnt=50
  6710208 bytes in        735 allocations in file ../src/pj/pool_policy_malloc.c
 13571270 bytes in all allocations and deferred free allocations
 13572012 bytes in all allocations and deferred free allocations
     cachpool    Capacity=63232, max_capacity=1048576, used_cnt=90
  6725568 bytes in        750 allocations in file ../src/pj/pool_policy_malloc.c
 13572492 bytes in all allocations and deferred free allocations
 13572506 bytes in all allocations and deferred free allocations
     cachpool    Capacity=63232, max_capacity=1048576, used_cnt=105
  6740928 bytes in        765 allocations in file ../src/pj/pool_policy_malloc.c
 13627074 bytes in all allocations and deferred free allocations
 13627085 bytes in all allocations and deferred free allocations
     cachpool    Capacity=63232, max_capacity=1048576, used_cnt=120
  6797248 bytes in        820 allocations in file ../src/pj/pool_policy_malloc.c
 13685842 bytes in all allocations and deferred free allocations
 13684196 bytes in all allocations and deferred free allocations
     cachpool    Capacity=63232, max_capacity=1048576, used_cnt=175
  6822848 bytes in        845 allocations in file ../src/pj/pool_policy_malloc.c
 13676524 bytes in all allocations and deferred free allocations
 13674824 bytes in all allocations and deferred free allocations
     cachpool    Capacity=63232, max_capacity=1048576, used_cnt=200
{noformat}

and...

{noformat}
grep -e ":.0:0" -A 8 ~/screenlog.2 | sed 's/[[:space:]]*$//' | grep -e "top -" -e 18993
top - 15:00:02 up 29 days,  6:59,  6 users,  load average: 0.02, 0.18, 0.22
18993 iang      20   0 2776m  53m  13m S  0.3  0.7   0:04.30 asterisk
top - 15:10:05 up 29 days,  7:09,  6 users,  load average: 0.00, 0.02, 0.10
18993 iang      20   0 2776m  57m  13m S  0.4  0.7   0:07.62 asterisk
top - 15:20:00 up 29 days,  7:19,  6 users,  load average: 0.00, 0.00, 0.04
18993 iang      20   0 2776m  57m  13m S  0.4  0.7   0:10.38 asterisk
top - 15:20:09 up 29 days,  7:19,  6 users,  load average: 0.00, 0.00, 0.03
18993 iang      20   0 2776m  57m  13m S  0.4  0.7   0:10.42 asterisk
top - 15:30:04 up 29 days,  7:29,  6 users,  load average: 0.00, 0.00, 0.00
18993 iang      20   0 2776m  58m  13m S  0.6  0.7   0:13.56 asterisk
top - 15:40:07 up 29 days,  7:39,  6 users,  load average: 0.00, 0.00, 0.00
18993 iang      20   0 2776m  58m  13m S  0.6  0.7   0:16.52 asterisk
top - 15:50:02 up 29 days,  7:49,  6 users,  load average: 0.00, 0.00, 0.00
18993 iang      20   0 2776m  58m  13m S  0.6  0.7   0:19.34 asterisk
top - 16:00:05 up 29 days,  7:59,  6 users,  load average: 0.00, 0.00, 0.00
18993 iang      20   0 2776m  59m  13m S  0.4  0.8   0:22.53 asterisk
{noformat}

By: Richard Mudgett (rmudgett) 2016-11-10 19:15:51.199-0600

The 'auth_cli0x7fff' pools seem to be leaked, as they are repeatedly created but never released.

By: Alexei Gradinari (alexei gradinari) 2016-11-11 08:20:48.177-0600

The 'pept0xf3cf10' pool increased rapidly.
This pool is used for the ast_pjsip_endpoint, which is never released.

{noformat}
[2016-11-10 14:58:30.049] DEBUG[18993]: pjproject:0 <?>:          pept0xf3cf10 Pool: created, size=4096
...
[2016-11-10 14:58:30.553] DEBUG[18993]: pjproject:0 <?>:          pept0xf3cf10 Pool: 584 bytes requested, resizing pool by 4000 bytes (used=5413984, cap=5416096)
{noformat}


By: Richard Mudgett (rmudgett) 2016-11-11 18:23:41.518-0600

[~alexei gradinari] The 'pept0xf3cf10' pool is used to configure pjproject for operation; it grows only during Asterisk startup, must remain for the life of the Asterisk process, and the log does not capture Asterisk being shut down.  The problem is that memory usage is growing *after* startup.  That pool is not causing the leak.

[~tuxian] The 'auth_cli0x????' pools appear to be leaking.  A new pool is created in the authentication cache for each different authentication realm.

The MALLOC_DEBUG log showed a growth of 179,712 bytes allocated to pjproject pools between the first summary and the last in the log.  That memory corresponds to about 175 auth_cli pools and matches closely with the 184 remaining 'auth_cli0x7fff' pools.  No other remaining pools created after startup represent close to that much memory.

There are also 180 of these lines.
{noformat}
[2016-11-10 16:02:54.489] DEBUG[19397]: pjproject:0 <?>:        sip_auth_clien ...Unable to set auth for tdta0x7fffd4024740: can not find credential for dev.dev.acme.com/Digest
{noformat}
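
For illustration, the pattern being described would look roughly like this (a hypothetical sketch; the cache structure and function are invented for the example and are not the actual res_pjsip code):

{noformat}
#include <pjlib.h>

/* Hypothetical cache entry: one client-auth session kept per realm. */
struct auth_cache_entry {
    pj_pool_t *pool;    /* the "auth_cli..." pool backing the session */
    /* credentials, realm string, etc. */
};

static struct auth_cache_entry *auth_cache_add(pj_pool_factory *factory,
                                               const char *realm)
{
    /* Each authentication attempt creates a fresh pool... */
    pj_pool_t *pool = pj_pool_create(factory, "auth_cli", 1024, 1024, NULL);
    struct auth_cache_entry *entry = pj_pool_zalloc(pool, sizeof(*entry));

    entry->pool = pool;
    (void)realm;

    /* ...but nothing ever calls pj_pool_release(entry->pool), so every
     * registration cycle leaves another small pool behind -- the growing
     * number of 'auth_cli0x????' pools in the MALLOC_DEBUG summaries. */
    return entry;
}

int main(void)
{
    pj_caching_pool cp;

    pj_init();
    pj_caching_pool_init(&cp, NULL, 0);

    /* Two registration cycles against the same realm leak two pools. */
    auth_cache_add(&cp.factory, "dev.dev.acme.com");
    auth_cache_add(&cp.factory, "dev.dev.acme.com");

    pj_caching_pool_destroy(&cp);   /* only reclaims them at shutdown */
    pj_shutdown();
    return 0;
}
{noformat}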

I'd like to see a SIP message log of the registrations (pjsip set logger on) to see what the realm strings are and verify if the realms are changing.  I'll likely need to create another debugging patch to track why these pools are created.

I do appreciate your patience and ability to obtain the needed information.

By: Ian Gilmour (tuxian) 2016-11-14 05:50:28.250-0600

[^screenlog-2016-11-14.tgz] has pjsip logging enabled.

It also has a breakpoint added on the return from pjsip_endpt_create_pool() in sip_endpoint.c (line 675). I added the following commands to run whenever the breakpoint is hit:

{noformat}
(gdb) comm 6
Type commands for breakpoint(s) 6, one per line.
End with a line saying just "end".
>print *pool
>bt
>cont
>end
(gdb)
{noformat}

So now we can see the full pool names of all these 'auth_cli0x????' pools.

Hopefully this, together with the pjsip logger enabled, gives a bit more info as to why these pools are being created and never released.

By: Richard Mudgett (rmudgett) 2016-11-14 15:37:12.697-0600

[^jira_asterisk_26516_test_fix_v13.patch] - This is an Asterisk patch that should fix the memory pool leak.  PJPROJECT 2.5.5 introduced an API change that resulted in Asterisk leaking the memory pools.

Please test to make sure there aren't any more issues.

By: Ian Gilmour (tuxian) 2016-11-16 04:04:16.895-0600

Hi Richard, that last patch appears to do the job. Memory peaks and remains constant for the next 24 hours.

{noformat}
grep -e ":42:0" -A 8 ~/screenlog.2 | sed 's/[[:space:]]*$//' | grep -e "top -" -e 2033
top - 08:42:00 up 34 days, 41 min,  6 users,  load average: 0.12, 0.12, 0.16
2033 iang      20   0 2787m  54m  13m S  0.4  0.7   0:04.19 asterisk
top - 08:42:09 up 34 days, 41 min,  6 users,  load average: 0.10, 0.11, 0.16
2033 iang      20   0 2787m  54m  13m S  0.3  0.7   0:04.22 asterisk
top - 09:42:04 up 34 days,  1:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   0:21.79 asterisk
top - 10:42:05 up 34 days,  2:41,  6 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   0:39.74 asterisk
top - 11:42:00 up 34 days,  3:41,  6 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.3  0.8   0:58.17 asterisk
top - 11:42:09 up 34 days,  3:41,  6 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   0:58.22 asterisk
top - 12:42:03 up 34 days,  4:41,  6 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  62m  13m S  0.6  0.8   1:16.65 asterisk
top - 13:42:07 up 34 days,  5:41,  6 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  62m  13m S  0.6  0.8   1:35.97 asterisk
top - 14:42:01 up 34 days,  6:41,  6 users,  load average: 0.02, 0.11, 0.09
2033 iang      20   0 2787m  62m  13m S  0.4  0.8   1:54.32 asterisk
top - 15:42:05 up 34 days,  7:41,  6 users,  load average: 0.11, 0.08, 0.02
2033 iang      20   0 2787m  62m  13m S  0.9  0.8   2:13.25 asterisk
top - 16:42:00 up 34 days,  8:41,  6 users,  load average: 0.04, 0.02, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   2:31.44 asterisk
top - 16:42:09 up 34 days,  8:41,  6 users,  load average: 0.03, 0.02, 0.00
2033 iang      20   0 2787m  61m  13m S  0.3  0.8   2:31.47 asterisk
top - 17:42:03 up 34 days,  9:41,  6 users,  load average: 0.04, 0.01, 0.00
2033 iang      20   0 2787m  61m  13m S  0.3  0.8   2:50.06 asterisk
top - 18:42:06 up 34 days, 10:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  62m  13m S  0.6  0.8   3:07.97 asterisk
top - 19:42:00 up 34 days, 11:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  62m  13m S  0.7  0.8   3:25.87 asterisk
top - 19:42:09 up 34 days, 11:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  62m  13m S  0.4  0.8   3:25.91 asterisk
top - 20:42:04 up 34 days, 12:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  62m  13m S  0.6  0.8   3:43.51 asterisk
top - 21:42:07 up 34 days, 13:41,  6 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.7  0.8   4:01.59 asterisk
top - 22:42:01 up 34 days, 14:41,  0 users,  load average: 0.01, 0.02, 0.00
2033 iang      20   0 2787m  61m  13m S  0.7  0.8   4:19.05 asterisk
top - 23:42:05 up 34 days, 15:41,  0 users,  load average: 0.06, 0.03, 0.00
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   4:36.24 asterisk
top - 00:42:00 up 34 days, 16:41,  0 users,  load average: 0.31, 0.17, 0.11
2033 iang      20   0 2787m  61m  13m S  0.7  0.8   4:53.69 asterisk
top - 00:42:09 up 34 days, 16:41,  0 users,  load average: 0.34, 0.18, 0.11
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   4:53.73 asterisk
top - 01:42:03 up 34 days, 17:41,  0 users,  load average: 0.00, 0.03, 0.02
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   5:11.26 asterisk
top - 02:42:07 up 34 days, 18:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.3  0.8   5:28.71 asterisk
top - 03:42:01 up 34 days, 19:41,  0 users,  load average: 0.02, 0.03, 0.00
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   5:46.11 asterisk
top - 04:42:05 up 34 days, 20:41,  0 users,  load average: 0.00, 0.02, 0.00
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   6:03.62 asterisk
top - 05:42:00 up 34 days, 21:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   6:20.73 asterisk
top - 05:42:09 up 34 days, 21:41,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   6:20.78 asterisk
top - 06:42:03 up 34 days, 22:41,  0 users,  load average: 0.07, 0.03, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   6:38.54 asterisk
top - 07:42:06 up 34 days, 23:41,  0 users,  load average: 0.03, 0.01, 0.00
2033 iang      20   0 2787m  61m  13m S  0.4  0.8   6:55.97 asterisk
top - 08:42:00 up 35 days, 41 min,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   7:13.20 asterisk
top - 08:42:09 up 35 days, 41 min,  0 users,  load average: 0.00, 0.00, 0.00
2033 iang      20   0 2787m  61m  13m T  0.6  0.8   7:13.25 asterisk
top - 09:42:05 up 35 days,  1:41,  6 users,  load average: 1.00, 0.97, 0.67
2033 iang      20   0 2787m  61m  13m S  0.6  0.8   7:32.41 asterisk
{noformat}

Many thanks.


By: Friendly Automation (friendly-automation) 2016-11-16 23:20:40.496-0600

Change 4442 merged by zuul:
res_pjsip_outbound_authenticator_digest.c: Fix memory pool leak.

[https://gerrit.asterisk.org/4442|https://gerrit.asterisk.org/4442]

By: Friendly Automation (friendly-automation) 2016-11-16 23:39:36.164-0600

Change 4443 merged by zuul:
res_pjsip_outbound_authenticator_digest.c: Fix memory pool leak.

[https://gerrit.asterisk.org/4443|https://gerrit.asterisk.org/4443]

By: Friendly Automation (friendly-automation) 2016-11-17 04:57:03.675-0600

Change 4444 merged by Joshua Colp:
res_pjsip_outbound_authenticator_digest.c: Fix memory pool leak.

[https://gerrit.asterisk.org/4444|https://gerrit.asterisk.org/4444]