
Summary: ASTERISK-29367: app_voicemail: reload causes voicemail taskprocessor overload with ODBC and pollmailboxes
Reporter: Luke Escude (lukeescude)
Labels:
Date Opened: 2021-03-23 10:18:17
Date Closed:
Priority: Minor
Regression?:
Status: Open/New
Components: Applications/app_voicemail
Versions: 16.14.0
Frequency of Occurrence:
Related Issues:
Environment: Centos 7 x64
Attachments:
Description: We have the default taskprocessor queue size configured for voicemail: the low water mark is 450 and the high water mark is 500.

We have a higher-volume instance of Asterisk 16.14 (typically around 35 simultaneous calls at 11am on weekdays) that appears to be hitting the high water mark whenever a Core Reload occurs.

I'm looking around to try and find out what exactly the Voicemail module is doing but I can't find any diagnostic commands to run for you guys.

There are only like maybe 30 voicemail boxes configured for this system. Additionally, the following parameters in voicemail.conf exist:

{noformat}
pollmailboxes=yes
pollfreq=60
maxsecs=600
maxsilence=10
minsecs=1
serveremail=no-reply@domain
attach=yes
fromstring=From
emailsubject=New Voicemail
emailbody=Body
externnotify=sh /primevox_scripts/voicemail_notify.sh
externpassnotify=sh /primevox_scripts/update_vm_password.sh
odbcstorage=vmodbc
odbctable=vm_table
{noformat}

So we're using ODBC for voicemail storage. Newest MariaDB Connector and UnixODBC 2.3.9 compiled.
Comments:

By: Asterisk Team (asteriskteam) 2021-03-23 10:18:18.767-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: George Joseph (gjoseph) 2021-03-23 11:54:41.522-0500

app_voicemail uses only 1 taskprocessor and that's for mwi notification.  You say 30 mailboxes but how many subscriptions do those mailboxes have?  Does the taskprocessor clear by itself and how long does it take?

You can run {{pjsip show subscriptions inbound like message-summary}} to see how many mailbox subscriptions there are.

By: Luke Escude (lukeescude) 2021-03-23 15:38:13.814-0500

Hey George,

Result of `pjsip show subscriptions inbound like message-summary` is the following:

0 active subscriptions matched "message-summary"

I don't think we utilize any of the "SUBSCRIBE FOR MWI" settings on IP phones.

EDIT: Running `core show hints` shows there are a maximum of 12 people watching any particular mailbox. (We have Custom: hints for all vm boxes.)

By: George Joseph (gjoseph) 2021-03-24 07:26:56.404-0500

We're talking about this taskprocessor right...
{code}
*CLI> core show taskprocessors

Processor                                                               Processed   In Queue  Max Depth  Low water High water
app_voicemail                                                                   0          0          0        450        500
{code}
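
For anyone who wants to watch this queue across a reload, the listing above can be parsed mechanically. The following is a minimal sketch, assuming the six-column layout shown in the `core show taskprocessors` output quoted here; the function names and the 0.8 alert fraction are hypothetical choices for illustration, not part of Asterisk.

```python
import re

# Sample copied from the `core show taskprocessors` output quoted above.
SAMPLE = """\
Processor                                                               Processed   In Queue  Max Depth  Low water High water
app_voicemail                                                                   0          0          0        450        500
"""

# One data row: a name without spaces followed by five integer columns.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_taskprocessors(text):
    """Return {name: {column: int}} for every data row; header lines are skipped."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, blank line, or anything not shaped like a data row
        processed, in_queue, max_depth, low, high = map(int, match.groups()[1:])
        rows[match.group(1)] = {
            "processed": processed,
            "in_queue": in_queue,
            "max_depth": max_depth,
            "low_water": low,
            "high_water": high,
        }
    return rows


def near_high_water(row, fraction=0.8):
    """True when the deepest queue seen reached `fraction` of the high water mark."""
    return row["max_depth"] >= fraction * row["high_water"]


if __name__ == "__main__":
    for name, row in parse_taskprocessors(SAMPLE).items():
        print(name, row, "near limit" if near_high_water(row) else "ok")
```

Capturing the CLI output before and after a reload and comparing the Max Depth column would show how close the queue gets to the 500 limit.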

Something is familiar about this issue so I'm going to have to investigate a bit more to jog my memory.  In the meantime, would it be possible to upgrade to 18 and see if you still have the issue?  Lots of plumbing changed in app_voicemail in 18, including splitting out odbc and imap support into their own modules.



By: Luke Escude (lukeescude) 2021-03-24 08:41:07.700-0500

Correct.

Hmm I could try out 18. It would be an update made available across our entire customer base (almost 500 Asterisk instances) though, so there's not really a way for me to roll it out to an individual customer. I'll start doing some testing with it - It may have some memory leak fixes for some leaks we're experiencing on 16 as well.

By: George Joseph (gjoseph) 2021-03-24 08:47:19.348-0500

OK.  I'll try and set up an ODBC instance as well.


By: Luke Escude (lukeescude) 2021-03-24 12:55:05.281-0500

Hmm I am noticing something added to Asterisk 17 (and also 18):

`A new module "res_mwi_devstate" has been added that allows subscriptions to voicemail boxes using "presence" events.  This allows common BLF keys to act as voicemail waiting indicators.`


The thing is, we're already able to perform that functionality in 16 using Hints. Is it possible we're doing something drastically incorrect with Hints to cause MWI handling to become overtaxed?

By: Joshua C. Colp (jcolp) 2021-03-24 13:09:39.221-0500

The res_mwi_devstate module is also in Asterisk 16.

By: Luke Escude (lukeescude) 2021-03-25 09:32:49.134-0500

Gotcha - I may look at using that in my upcoming dialplan re-write, instead of whatever method we're using now.

This poor customer has reloaded his PBX 6 times today, and every time it hits the 500 app_voicemail taskprocessor limit and a few of his phones get sent 503 responses to their REGISTERs. He's the same customer whose PBX self-reboots every 3 days because of memory leaks >.> I need to investigate that as well.

I am really hoping my dialplan rewrite fixes all of this. Our current dialplan is from 3 years ago and it's absolute garbage.

By: Joshua C. Colp (jcolp) 2021-03-25 09:38:52.678-0500

You can configure PJSIP so it only cares about overloads in relation to its own taskprocessors[1].

[1] https://github.com/asterisk/asterisk/blob/master/configs/samples/pjsip.conf.sample#L1224

By: Luke Escude (lukeescude) 2021-03-25 11:22:44.981-0500

Also interesting, but I think I'd rather make my dial plan more efficient (or find/fix a bug) than turn off safety features.

I am going to replace our Custom:vm hints with the new res_mwi_devstate module, and eliminate the use of externnotify (its only purpose was to set the Custom state whenever a voicemail box was updated).

I'll update this thread with test results on 16 and on 18.

By: Luke Escude (lukeescude) 2021-03-25 15:58:18.871-0500

Okay cool, so on Asterisk 16.14, I made the following changes:

1. Switched our voicemail hints to use the new MWI devstate instead of using externnotify to manually set devstates.
2. Eliminated externnotify from voicemail.conf entirely.

No change in behavior: a whole lot of depth is still added to the app_voicemail taskprocessor queue. On this no-volume test system, it will get as deep as 98 queued tasks, despite there only being 24 mailboxes and maybe 5 endpoints watching any 1 mailbox.

So, I will recompile with Asterisk 18 and see if that changes anything.

By: Luke Escude (lukeescude) 2021-03-25 16:02:16.357-0500

Just figured it out... the "pollmailboxes" option is causing it. I have pollmailboxes=yes, with a pollfreq of 60, because I believe at some point (years ago) MWI wasn't working properly without it.

Does this qualify as a bug with pollmailboxes, or should I just avoid using that option?

EDIT: Interestingly, the app_voicemail taskprocessor is being used absolutely 0 now.... Even though MWI is working just fine. I have no idea why we ever needed to use pollmailboxes...

By: Joshua C. Colp (jcolp) 2021-03-26 08:54:22.708-0500

Did you externally manipulate mailboxes in the database? The pollmailboxes would be needed if that was occurring.

By: Luke Escude (lukeescude) 2021-03-26 09:04:49.375-0500

Yes, that occurred to me late last night. We have a web portal that allows users to move around, listen to, delete, and transfer their voicemails.

I will likely change over to using the VoicemailReload AMI action to facilitate the same goal, unless there in fact is a bug with the taskprocessor overloading when pollmailboxes is used.

By: George Joseph (gjoseph) 2021-03-26 09:49:38.036-0500

Let me test with pollmailboxes and see what the deal is.



By: Luke Escude (lukeescude) 2021-03-29 09:46:05.879-0500

So we pushed a change to production that eliminates pollmailboxes (and externnotify) and the taskprocessor overload no longer occurs.

By: Luke Escude (lukeescude) 2021-04-15 09:24:05.377-0500

Just a quick update, we're still seeing absolutely no overloads with voicemail taskprocessors now that pollmailboxes is turned off.
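
The combination this thread narrowed the overload down to (pollmailboxes enabled together with ODBC storage) can be checked for in a voicemail.conf fragment. This is only a sketch under stated assumptions: it handles simple `key=value` lines like the ones in the description, treats `;` as the start of a comment, and the helper names are hypothetical. It does not read or validate a real Asterisk configuration.

```python
# Values treated as "enabled"; assumed to mirror the usual Asterisk boolean spellings.
TRUTHY = {"yes", "true", "y", "t", "1", "on"}


def parse_options(text):
    """Collect key=value pairs from a config fragment into a dict (last value wins)."""
    options = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments (simplified, no escapes)
        if not line or line.startswith("[") or "=" not in line:
            continue  # blank line, section header, or not a key=value line
        key, value = line.split("=", 1)
        options[key.strip().lower()] = value.strip()
    return options


def polls_with_odbc(options):
    """True when pollmailboxes is enabled and an odbcstorage value is set."""
    polling = options.get("pollmailboxes", "no").lower() in TRUTHY
    return polling and bool(options.get("odbcstorage"))


if __name__ == "__main__":
    fragment = "pollmailboxes=yes\npollfreq=60\nodbcstorage=vmodbc\nodbctable=vm_table\n"
    print(polls_with_odbc(parse_options(fragment)))
```

A check like this only flags the setting. Whether polling is actually needed depends on whether mailboxes are changed outside Asterisk, as discussed above.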