[Home]

Summary:ASTERISK-26696: pjsip_pubsub: PJSIP Subscription Persistence in AstDB Does not update on subscription refresh
Reporter:Zach R (zrothy)Labels:
Date Opened:2017-01-05 12:28:08.000-0600Date Closed:2017-02-16 07:49:04.000-0600
Priority:MinorRegression?
Status:Closed/CompleteComponents:pjproject/pjsip Resources/res_pjsip_pubsub
Versions:13.0.0 13.13.1 Frequency of
Occurrence
Frequent
Related
Issues:
is duplicated byASTERISK-26821 Subscriptions with Expiry=0 is not removed
is duplicated byASTERISK-26724 Hints appear to not expire properly resulting in duplicates
is related toASTERISK-26756 res_pjsip_mwi: Asterisk does not terminate MWI subscription
Environment:CentOS 6.6 FinalAttachments:( 0) set1-duplicates.txt
( 1) set1-hints.txt
( 2) set1-subscription_persistence.txt
( 3) set2-duplicates.txt
( 4) set2-hints.txt
( 5) set2-subscription_persistence.txt
Description:When a phone (say a Cisco SPA 504G) subscribes to a hint about another extension it properly creates the persistence information in the AstDB. But after 30 minutes or whenever the phone sends another subscribe to refresh its subscription the persistence key does not have its expiration timestamp updated to the new time.

This results in the fact subscriptions are not persistent across crashes or restarts, since it sees the subscription as expired when it goes to rebuild subscriptions from the persistence keys. So it would start with no watchers instead of however many it had before the server restarted or crashed.

For example, say extension 500 subscribes to the hint for x501 at 4 AM and has an expiration of 30 minutes. The subscription persistence token in the astdb (database show subscription_persistence) has a timestamp that says it expires at 4:30 AM. Around 4:29 x500 sends a subscribe to refresh the subscription so that it should expire at 5 AM. Yet the astdb database still shows that subscription as expiring at 4:30 AM.

Say the Asterisk server than restarts at 4:45 AM, when x500's subscription to the hint for x501 is still valid. Once it comes back up the hint for x501 will show as 0 watchers instead of 1 as it was before the restart.

I've included an example below from a development server:
{noformat}
/subscription_persistence/subscription_persistence/b83309ef-e958-429a-a150-02c01327b2b1: {"local_port":"5060","endpoint":"107-eng4","src_port":"5060","transport_key":"UDP","packet":"SUBSCRIBE sip:288@eng4.sip.monmouth.com SIP/2.0\r\nVia: SIP/2.0/UDP 64.19.186.174:5060;branch=z9hG4bK5a9203f345803ecac6c09fc88d86b8bf\r\nVia: SIP/2.0/UDP 10.0.15.188:5060;branch=z9hG4bK-9a01007c\r\nFrom: \"107 test\" <sip:107-eng4@eng4.sip.monmouth.com>;tag=dbf1bb778d0a69c1\r\nTo: <sip:288@eng4.sip.monmouth.com>\r\nCall-ID: 96752e2f-e962d669@10.0.15.188\r\nCSeq: 26153 SUBSCRIBE\r\nContact: \"107 test\" <sip:107-eng4@64.19.186.174>\r\nAuthorization: Digest username=\"107-eng4\", realm=\"asterisk\", nonce=\"1483464068/53bc0175eb9b908d6651b2792c4250a8\", uri=\"sip:288@eng4.sip.monmouth.com\", response=\"07a7013bef2c6312a810d3de45900e51\", algorithm=MD5, cnonce=\"86b9bb98\", opaque=\"0428c5a113a17b58\", qop=auth, nc=00000001\r\nAccept: application/dialog-info+xml\r\nMax-forwards: 69\r\nExpires: 1800\r\nEvent: dialog\r\nUser-agent: foobar\r\nContent-Length: 0\r\n\r","expires":"1483465868","src_name":"64.19.186.174","cseq":"2924","local_name":"209.191.9.28","tag":"07c6bd3d-37db-4031-b82b-92572eccbf0a"}
{noformat}

The expires timestamp shows it should expire at 2017-01-03 12:51:08 -0500 and throughout the day this timestamp didn't change even though the phone never got unplugged or restarted. The subscription was still valid as well since it showed a watcher for extension 288 throughout the day (and was tested on the phone as well to confirm the blf updated properly based on the state of x288). When I restarted the server this key from the astdb got removed, and the watchers for 288 was at 0 until the extension 107 that was watching 288 sent another subscribe.
Comments:By: Asterisk Team (asteriskteam) 2017-01-05 12:28:09.484-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Rusty Newton (rnewton) 2017-01-05 16:02:13.604-0600

[~rmudgett] tells he has been able to reproduce this after seeing your issue, so this is confirmed. I'm opening it up.

By: Richard Mudgett (rmudgett) 2017-01-20 12:46:14.949-0600

There are three things I am aware of concerning subscriptions that relate to updating Asterisk level data structures.

# The expiry value in the "pjsip show subscriptions" commands (ASTERISK-23828) currently count down to zero because the Asterisk level structures don't get updated with the new expiration time when the subscription is refreshed.
# ASTERISK-26696 is the same as 1 but concerns the AstDB record.
# Stopping Asterisk, rebooting phones, and then starting Asterisk will result in persisted subscriptions that haven't expired being recreated and not getting deleted when the old subscriptions expire because the phone created new subscriptions and no longer knows about the old ones.  However, the old subscriptions are removed the next time asterisk restarts.


By: Zach R (zrothy) 2017-01-20 15:58:43.106-0600

{quote}
Richard Mudgett

There are three things I am aware of concerning subscriptions that relate to updating Asterisk level data structures.

# The expiry value in the "pjsip show subscriptions" commands (ASTERISK-23828) currently count down to zero because the Asterisk level structures don't get updated with the new expiration time when the subscription is refreshed.
# ASTERISK-26696 is the same as 1 but concerns the AstDB record.
# Stopping Asterisk, rebooting phones, and then starting Asterisk will result in persisted subscriptions that haven't expired being recreated and not getting deleted when the old subscriptions expire because the phone created new subscriptions and no longer knows about the old ones. However, the old subscriptions are removed the next time asterisk restarts.


{quote}

In response to Richard's comment about subscriptions, I'm not sure if the following data will help in response to item #3. Twice last week on the 11th to 12th and 13th to 14th I had experienced phones which made duplicate subscriptions that have not expired. Unlike the method you described to reproduce the bug, the server these cases occurred on did not reboot any time recently. The uptime for asterisk on the server currently shows over 2 weeks of uptime. I included the output of core show uptime below.
{noformat}
System uptime: 2 weeks, 1 day, 19 hours, 7 minutes, 46 seconds
Last reload: 1 week, 4 hours, 15 minutes, 2 seconds
{noformat}

I have attached the data I have on these cases to the ticket, with the first set as a set of files prefixed/starting with set1 and the second as set2. There are 3 files for each set:

* setx-hints.txt - Contains the diff of the hints for the day I noticed the trouble and the previous day
* setx-duplicates.txt  - Contains the duplicate subscriptions seen in the AstDB in the subscription_persistence family. These are grouped which extension they subscribe to.
* setx-subs.txt - Contains the subscriptions unsorted subscriptions in the subscription_persistence family of the AstDB for that day.

The first set happened on Jan 11th to Jan 12th 2017, and the second set happened on Jan 13th to Jan 14th 2017.

In set1 the phones rebooted at 4AM as we programmed them for synchronization of their configs, and as expected they sent a message to unsubscribe for the current subscription, and resubscribed when they came back up. Later that day it appears that they had been rebooted, as a new subscription was made at 10 AM on Jan 12th. This showed up as it had 2 subscriptions to each phone it was watching. The phone that had duplicate subscriptions to multiple hints was a Cisco SPA 50XG series phone.

In set 2 a phone that had been watching 4 extensions gained 2 more subscriptions to each of the phones it was watching. As with the first set the phone had rebooted at 4 AM to resync its configuration file with our provisioning server. Later that day it looks like it rebooted twice adding 2 additional subscriptions to the one it remade at 4 AM causing this phone to have 3 subscriptions to each of the 4 hints it was watching.

If needed I have dumps of the hints and subscriptions from the AstDB for each day starting from Jan 5th 2017.

By: Zach R (zrothy) 2017-01-20 16:01:07.759-0600

Data for hints that did not properly expire ever. During this time asterisk did not reboot.

By: Friendly Automation (friendly-automation) 2017-02-16 07:49:05.226-0600

Change 4916 merged by zuul:
res_pjsip_pubsub:  Correctly implement persisted subscriptions

[https://gerrit.asterisk.org/4916|https://gerrit.asterisk.org/4916]

By: Friendly Automation (friendly-automation) 2017-02-16 07:49:10.851-0600

Change 4917 merged by zuul:
res_pjsip_pubsub:  Correctly implement persisted subscriptions

[https://gerrit.asterisk.org/4917|https://gerrit.asterisk.org/4917]

By: Friendly Automation (friendly-automation) 2017-02-16 09:49:54.050-0600

Change 4918 merged by Joshua Colp:
res_pjsip_pubsub:  Correctly implement persisted subscriptions

[https://gerrit.asterisk.org/4918|https://gerrit.asterisk.org/4918]