[Home]

Summary:ASTERISK-15479: [patch] [regression] [patch] chan_sip does not check other mailboxes on AST_EVENT_MWI
Reporter:Ben Winslow (rain)Labels:
Date Opened:2010-01-20 13:51:59.000-0600Date Closed:2011-08-23 23:45:24
Priority:MinorRegression?Yes
Status:Closed/CompleteComponents:Channels/chan_sip/Subscriptions
Versions:Frequency of
Occurrence
Related
Issues:
is related toASTERISK-18067 REGRESSION: After upgrading from 1.4.41 to *1.8.4.2 a SIP extension with a voicemail box can no longer monitor mwi of another extension
Environment:Attachments:( 0) chan_sip.c-mwi_multi_mailbox_fix-1.6.2.13.diff
( 1) chan_sip-mwi_fixes.diff
( 2) chan_sip-use_cached_mwi_events.diff
Description:When chan_sip receives an AST_EVENT_MWI, the callback will always believe the message counters in the event (AST_EVENT_IE_NEWMSGS/AST_EVENT_IE_OLDMSGS); however, if the peer receiving the event has voicemail messages in a different mailbox that it is ALSO subscribed to, these counters will be wrong (since the callback makes no attempt to check other mailboxes if event is not NULL.)

The easiest solution to this problem seems to be ignoring the IEs in the AST_EVENT_MWI event and going right to get_cached_mwi, which appears to check for MWI events on all subscribed inboxes.  The attached patch does this, but this caused some of my phones to incorrectly show messages waiting when I restarted Asterisk.  From some debug logging I added locally to get_cached_mwi, it seems that ast_event_get_cached() can return an MWI event with incorrect data in the NEWMSGS/OLDMSGS IEs while app_voicemail is loading (e.g. 1 new, 0 old), although a cache query for the same mailbox@context will (accurately) return 0/7 a moment later.  (I'm trying to debug that.)

Disclaimer on file.

****** STEPS TO REPRODUCE ******

Subscribe a SIP peer to multiple mailboxes (mailbox=123,456,789).
Leave a message on at least two of the subscribed mailboxes.
Delete all new messages from one of the subscribed mailboxes.
Observe the MWI on the SIP peer turning off.

****** ADDITIONAL INFORMATION ******

As far as I can tell, this bug exists in all versions of Asterisk using events for MWI (all of 1.6.x?)
Comments:By: Ben Winslow (rain) 2010-01-20 14:21:51.000-0600

After some additional testing, it appears that the incorrect AST_EVENT_MWIs from ast_event_get_cached are not limited to Asterisk startup -- ast_event_get_cached just gave me 0 AST_EVENT_IE_NEWMSGS, 0 AST_EVENT_IE_OLDMSGS for a mailbox showing 1 new message in 'voicemail show users'.

By: Ben Winslow (rain) 2010-03-05 15:32:51.000-0600

It looks like my issues after this patch were caused by ASTERISK-15431.  I've fixed that bug locally and I'm retesting this patch now.



By: Ben Winslow (rain) 2010-03-22 15:22:41

I've been testing the original patch for a couple of weeks without any issues.  I looked over this again today, and I noticed that there's still a case where the code could do the wrong thing: if there is a cached MWI event for a mailbox that has only old messages, but another mailbox without a cached MWI event DOES have new messages, the SIP peer will not be informed.

I'm attaching a new patch that fixes this issue by making get_cached_mwi() abort when ANY subscribed mailbox does not have a cached event; this causes the only caller (sip_send_mwi_to_peer) to poll the mailbox for an accurate count.



By: Jamuel Starkey (jamuel) 2010-04-01 16:46:44

Rain: have you seen cases (before your patch) where a peer can be sent the wrong message count?  In 1.6.1.12 I'm seeing an issue where a peer is receiving the message count for another peer's mailbox:

Peer A has messages 10/8
Peer B as messages 0/0

on SUBSCRIBE by Peer B to Peer B's mailbox, * returns Peer A's message count in the NOTIFY to Peer B.

I have polling turned on in app_voicemail so if I then were to go and add a message to Peer B's INBOX via the linux filesystem (e.g. touch /var/spool/asterisk/voicemail/default/PEERB/INBOX/msg0000.txt) then after the poll interval, Peer B's message count is correctly reflected (1/0). Then if I delete this message again, * correctly sends a NOTIFY reflecting (0/0) to Peer B.  But then after some time has elapsed (not sure of how long just yet) Peer B starts getting Peer A's message count again.

Any thoughts would be appreciated.

By: Ben Winslow (rain) 2010-04-13 11:12:39

Jamuel: I was seeing this issue before I applied the patch in ASTERISK-15431.  I suspect that if you apply the diff attached to that bug (or upgrade to 1.6.1.18, which includes that bug fix), your problem will go away.

By: Jamuel Starkey (jamuel) 2010-04-13 13:13:44

RAIN -- THANKS.  I will apply that patch and confirm.  Will probably take a day or so to get the patch in.

On a related note -- Bug: 17153 my solution of bypassing the event api seems to being working.

I'm glad to hear that I should be able to revert my patch and apply the one-liner that you suggested.

Longer term do you suggest applying the patch from 16607 and then applying the patches you've provided here in 16660?

By: Ben Winslow (rain) 2010-04-13 13:26:37

The patch from ASTERISK-15431 is already in the latest versions of Asterisk (so if you can upgrade or are already running the latest 1.6.1.x version, you should be good there.)  The patches I've supplied here are for a different issue than the one you describe, but the latest version of the patch will probably result in the mailbox being polled for messages more often (vs. using cached events), which may help in your case.

By: Ben Winslow (rain) 2010-05-04 14:46:07

I've been testing the second version of this patch for a couple of months now with no issues, but I wouldn't mind getting some more eyes on this one.  I would say it's ready for general testing, at least.

Is it worth counting the number of subscribed mailboxes and using the IEs in the event when the peer is subscribed to only one mailbox?  This is only a little less work than get_cached_mwi() is already doing.

By: Ben Winslow (rain) 2010-10-07 12:32:59

I have attached a new version of this patch that applies against 1.6.2.13.  The other two patches are obsolete.