Summary: ASTERISK-28695: core: minmemfree watermark uses free RAM, not available RAM
Reporter: Kevin Flyn (Kevin_Flyn)
Labels:
Date Opened: 2020-01-15 10:47:46.000-0600
Date Closed: 2020-01-20 07:10:53.000-0600
Priority: Minor
Regression?: No
Status: Closed/Complete
Components: PBX/General
Versions: 16.3.0
Frequency of Occurrence: Constant
Related Issues:
Environment: Manjaro Linux 4.19.28-1
Attachments:
Description: My Asterisk system stopped accepting incoming and outgoing calls and was delivering a "fast busy" tone to my home phone the other day, so I fired up the CLI, attempted an outgoing call with SIP debugging enabled, and got the following output:

{noformat}
...
[2020-01-15 10:33:12.524] DEBUG[111414][C-00000003]: chan_sip.c:3801 __sip_xmit: Trying to put 'SIP/2.0 100' onto UDP socket destined for XXX.XXX.X.XX:5060
[2020-01-15 10:33:12.524] WARNING[111414][C-00000003]: pbx.c:4623 increase_call_count: Available system memory (~423MB) is below the configured low watermark (500MB)
[2020-01-15 10:33:12.525] DEBUG[111388]: chan_sip.c:30608 sip_devicestate: Checking device state for peer grandstream1
[2020-01-15 10:33:12.525] DEBUG[111414][C-00000003]: chan_sip.c:3457 sip_alreadygone: Setting SIP_ALREADYGONE on dialog 581189762-5060-5@BJC.BGI.F.ED
[2020-01-15 10:33:12.525] WARNING[111414][C-00000003]: chan_sip.c:26866 handle_request_invite: Failed to start PBX (call limit reached)
[2020-01-15 10:33:12.525] DEBUG[111388]: devicestate.c:466 do_state_change: Changing state for SIP/grandstream1 - state 2 (In use)
...
{noformat}

As you can see in the output, Asterisk is telling me that my system is low on memory and has hit the "minmemfree" watermark, which I had set to 500MB in asterisk.conf. Knowing my server is minimally loaded, I logged in via ssh and executed the "free" command to investigate RAM status:

{noformat}
...
[hpz230]# free
             total        used        free      shared  buff/cache   available
Mem:          15792         786         424           8       14581       14668
Swap:          8192           0        8192
...
{noformat}

The amount of "free" memory is indeed below 500MB, but only because my system has been up for 120+ days and the kernel is using the remaining memory as a filesystem block cache. Since RAM used as block cache can be reclaimed for application use without triggering the kernel OOM killer, it seems to me that Asterisk is calculating the amount of free RAM for this setting incorrectly.
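
To make the "free" versus "available" distinction concrete: on reasonably recent kernels and procps versions, the "available" column printed by free is taken from the MemAvailable field of /proc/meminfo. The following is a minimal sketch, not Asterisk code, of reading that field directly. The helper names `meminfo_field_kb` and `mem_available_mb` are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find "<key>: <value> kB" in meminfo-style text.
 * Returns 0 and stores the value (in kB) on success, -1 if the key is
 * absent or its value cannot be parsed. */
static int meminfo_field_kb(const char *text, const char *key, unsigned long long *kb)
{
    size_t keylen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* A field matches only if the key is followed directly by ':' */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            return sscanf(line + keylen + 1, "%llu", kb) == 1 ? 0 : -1;
        }
        line = strchr(line, '\n');
        if (line) {
            line++; /* step past the newline to the start of the next line */
        }
    }
    return -1;
}

/* Hypothetical helper: read /proc/meminfo and return MemAvailable in MB,
 * or -1 if the file cannot be read or the field is missing. */
static long long mem_available_mb(void)
{
    char buf[8192];
    size_t n;
    unsigned long long kb;
    FILE *fp = fopen("/proc/meminfo", "r");

    if (!fp) {
        return -1;
    }
    n = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    if (meminfo_field_kb(buf, "MemAvailable", &kb) != 0) {
        return -1;
    }
    return (long long)(kb / 1024);
}
```

Calling `mem_available_mb()` on the machine above should give a figure close to the "available" column rather than the "free" column, which is the number a low-memory watermark arguably cares about.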

I issued a quick "sync" to flush all filesystem data to disk, then cleared the Linux filesystem cache and ran free again. The output is below:

{noformat}
...
[hpz230]# sync
[hpz230]# echo "3" > /proc/sys/vm/drop_caches
[hpz230]# free
             total        used        free      shared  buff/cache   available
Mem:          15792         785       14605           1         401       14736
Swap:          8192           0        8192
...
{noformat}

Now the "free" RAM is 14GB+ and, sure enough, Asterisk happily started accepting phone calls again.

The following code snippet from GitHub is where Asterisk makes this decision:

https://github.com/asterisk/asterisk/blob/391aafb97172e3beb9a779458456d2e75ecf4610/main/pbx.c

{noformat}
#if defined(HAVE_SYSINFO)
    if (option_minmemfree) {
        if (!sysinfo(&sys_info)) {
            /* make sure that the free system memory is above the configured low watermark
             * convert the amount of freeram from mem_units to MB */
            curfreemem = sys_info.freeram * sys_info.mem_unit;
            curfreemem /= 1024 * 1024;
            if (curfreemem < option_minmemfree) {
                ast_log(LOG_WARNING, "Available system memory (~%ldMB) is below the configured low watermark (%ldMB)\n", curfreemem, option_minmemfree);
                failed = -1;
            }
        }
    }
#endif
{noformat}

As you can see, asterisk is calculating the watermark value based on the sysinfo() function results, using the fields "freeram" and "mem_unit" without regard to the amount of memory being used as a filesystem cache.

A quick look at the manpage for sysinfo() shows that it also returns a "bufferram" field, described as "Memory used by buffers", so it stands to reason that the line of code above that calculates the amount of free RAM should read:

{noformat}
curfreemem = (sys_info.freeram + sys_info.bufferram) * sys_info.mem_unit;
{noformat}
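
The difference between the two calculations can be tried outside Asterisk with a small standalone sketch. The helper names `units_to_mb`, `below_watermark`, and `report_sysinfo` are hypothetical and are not part of pbx.c; only `sysinfo()` and the `freeram`, `bufferram`, and `mem_unit` fields come from the system API. One caveat, stated as an observation rather than a verdict on the proposed fix: on Linux, `bufferram` appears to correspond to the "Buffers" line of /proc/meminfo, which is reported separately from the page cache ("Cached"), so the sum may still come out well below what free shows as "available".

```c
#include <stdio.h>
#include <sys/sysinfo.h>

/* Convert a count of mem_unit-sized blocks to whole megabytes,
 * the same scaling the pbx.c snippet applies to freeram. */
static unsigned long long units_to_mb(unsigned long long units, unsigned int mem_unit)
{
    return (units * mem_unit) / (1024ULL * 1024ULL);
}

/* Hypothetical helper mirroring the proposed check: returns 1 when
 * freeram + bufferram (scaled to MB) is below the watermark, else 0. */
static int below_watermark(unsigned long long freeram, unsigned long long bufferram,
                           unsigned int mem_unit, unsigned long long min_mb)
{
    return units_to_mb(freeram + bufferram, mem_unit) < min_mb;
}

/* Print both calculations for the running system.
 * Returns 0 on success, -1 if sysinfo() fails. */
static int report_sysinfo(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0) {
        return -1;
    }
    printf("freeram only:        ~%lluMB\n",
           units_to_mb(si.freeram, si.mem_unit));
    printf("freeram + bufferram: ~%lluMB\n",
           units_to_mb((unsigned long long)si.freeram + si.bufferram, si.mem_unit));
    return 0;
}
```

Running `report_sysinfo()` before and after dropping caches shows how much of the gap the `bufferram` term closes on a given machine.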

Any system using the current code base that remains powered up for an extended period of time will eventually run into this issue unless it periodically clears the kernel filesystem cache buffers.
Comments:

By: Asterisk Team (asteriskteam) 2020-01-15 10:47:47.172-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Sean Bright (seanbright) 2020-01-15 13:27:21.315-0600

[~Kevin_Flyn], are you planning to [submit a patch|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process#PatchContributionProcess-SubmittingaPatch] for this?

By: Kevin Flyn (Kevin_Flyn) 2020-01-15 16:25:28.782-0600

Sean Bright, I was not planning on submitting a patch for this myself, no. What seems to clearly be a bug in my mind, may in fact be considered a feature by a member of the Asterisk development community, which I don't consider myself to be a member of.

Perhaps by the time this gets reviewed, and the politics play out, and people in positions of ultimate power decide this is in fact a bug, someone more actively involved in the development of asterisk, with a configured development environment, will create what seems to be the most trivial of patches and submit it, but I consider my personal involvement in this matter closed with the report itself.

By: Sean Bright (seanbright) 2020-01-15 16:28:18.962-0600

[~Kevin_Flyn], sorry to bother you.

By: Sean Bright (seanbright) 2020-01-16 07:39:30.431-0600

We got the Illuminati involved and they were OK with making this change.

By: Friendly Automation (friendly-automation) 2020-01-20 07:10:54.603-0600

Change 13597 merged by Joshua Colp:
pbx.c: Include filesystem cache in free memory calculation

[https://gerrit.asterisk.org/c/asterisk/+/13597|https://gerrit.asterisk.org/c/asterisk/+/13597]

By: Friendly Automation (friendly-automation) 2020-01-20 07:11:06.082-0600

Change 13637 merged by Joshua Colp:
pbx.c: Include filesystem cache in free memory calculation

[https://gerrit.asterisk.org/c/asterisk/+/13637|https://gerrit.asterisk.org/c/asterisk/+/13637]

By: Friendly Automation (friendly-automation) 2020-01-20 07:11:47.305-0600

Change 13636 merged by Joshua Colp:
pbx.c: Include filesystem cache in free memory calculation

[https://gerrit.asterisk.org/c/asterisk/+/13636|https://gerrit.asterisk.org/c/asterisk/+/13636]

By: Friendly Automation (friendly-automation) 2020-01-20 07:12:02.478-0600

Change 13635 merged by Joshua Colp:
pbx.c: Include filesystem cache in free memory calculation

[https://gerrit.asterisk.org/c/asterisk/+/13635|https://gerrit.asterisk.org/c/asterisk/+/13635]