Summary: ASTERISK-27759: res_pjsip_pubsub: Subscription persistence does not preserve XML <dialog-info> version number
Reporter: Bryan Nelson (bnelsonfs)
Labels: pjsip
Date Opened: 2018-03-21 17:11:05
Date Closed: 2020-01-09 16:55:42.000-0600
Priority: Minor
Regression? No
Status: Closed/Complete
Components: Resources/res_pjsip_pubsub
Versions: 15.3.0
Frequency of Occurrence: Occasional
Related Issues:
Environment: CentOS Linux release 7.4.1708 (Core)
Attachments:
Description: When subscription persistence via astDB is used to restore dialog-info+xml subscriptions after an Asterisk restart, the rebuilt subscriptions do not carry the correct 'version' number in the XML content, and phones discard the message per the RFC. I cannot find anywhere in the persistence table where the current XML version number is stored, so it makes sense that it cannot be restored on restart.

The desired behavior would be similar to how the SIP CSeq is stored and then used when restoring the subscription.

Example:

Prior to restart, the version number has reached 6:
{quote}
<?xml version="1.0" encoding="UTF-8"?>
<dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" *version="6"* state="full" entity="sip:701@1.1.1.1:5060">
<dialog id="701">
 <state>terminated</state>
</dialog>
</dialog-info>
{quote}

After restart, subscription is restored, but version number starts at 1:
{quote}
<?xml version="1.0" encoding="UTF-8"?>
<dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" *version="1"* state="full" entity="sip:701@1.1.1.1:5060">
<dialog id="701">
 <state>confirmed</state>
</dialog>
</dialog-info>
{quote}

In this particular case, the version number would 'catch up' to the phones after 5 more state changes, and the BLF would resume functioning on the phone. With a very long-running subscription that has seen many state changes, the version number can be in the hundreds, so the delay before it begins functioning again is quite long.
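A minimal sketch of the desired behavior, written in Python rather than Asterisk's C: the persisted record carries the last dialog-info version next to the CSeq, and the restored subscription continues from it. `PersistenceTable`, `render_dialog_info`, the `701-sub` key, and the CSeq value are hypothetical names and values chosen for illustration; this is not the res_pjsip_pubsub implementation.

```python
import json


class PersistenceTable:
    """Hypothetical stand-in for the astdb-backed persistence table."""

    def __init__(self):
        self._rows = {}

    def save(self, sub_id, cseq, last_version):
        # Store the dialog-info version alongside the CSeq.
        self._rows[sub_id] = json.dumps({"cseq": cseq, "version": last_version})

    def load(self, sub_id):
        return json.loads(self._rows[sub_id])


def render_dialog_info(entity, dialog_id, state, version):
    """Build a dialog-info body shaped like the examples in this report."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" '
        f'version="{version}" state="full" entity="{entity}">\n'
        f'<dialog id="{dialog_id}">\n'
        f' <state>{state}</state>\n'
        '</dialog>\n'
        '</dialog-info>\n'
    )


table = PersistenceTable()

# Before the restart: the last NOTIFY carried version 6 (CSeq 42 is made up).
table.save("701-sub", cseq=42, last_version=6)

# After the restart: continue from the stored version instead of starting over.
row = table.load("701-sub")
body = render_dialog_info(
    "sip:701@1.1.1.1:5060", "701", "confirmed", row["version"] + 1
)
print(body)
```

With the version restored, the first NOTIFY after the restart would carry `version="7"`, which a phone that last processed version 6 would accept.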
Comments:

By: Asterisk Team (asteriskteam) 2018-03-21 17:11:06.894-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: ADTopkek (ADTopkek) 2018-08-16 15:15:47.267-0500

This is now a problem on all Yealink firmware V82+. It also seems to affect some newer Grandstream models.

See: https://community.asterisk.org/t/blf-s-not-working-right-with-yealink-83-0-x-firmware-response-from-yealink/75708/7

Restarting the phones temporarily fixes the issue until the next server reboot. This causes major problems for some customers who rely heavily on BLF states.

By: Bryan Nelson (bnelsonfs) 2018-08-16 16:18:44.748-0500

@ADTopkek

As far as I can tell, this will impact any phone that properly follows the spec for dialog-info+xml subscriptions. I've tested Polycom, Grandstream, Yealink, Cisco, and Obihai with the same results. Any phone that continues to work fine is technically misbehaving, and risks processing out-of-order BLF NOTIFY messages and displaying the incorrect BLF state.
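To illustrate the point, here is a small Python sketch comparing a phone that enforces version ordering with one that ignores it. The class names are hypothetical, and the discard rule (drop any document whose version is lower than the last one processed) is my reading of the dialog event package spec, not code taken from any phone firmware.

```python
class StrictPhone:
    """Hypothetical phone that enforces version ordering."""

    def __init__(self, version=-1, state=None):
        self.version = version
        self.state = state

    def on_notify(self, version, state):
        # Assumed rule: discard a document older than the last one processed.
        if version < self.version:
            return False
        self.version = version
        self.state = state
        return True


class LenientPhone:
    """Hypothetical phone that ignores the version attribute."""

    def __init__(self):
        self.state = None

    def on_notify(self, version, state):
        self.state = state
        return True


# Two NOTIFY bodies sent as version 7 then 8, but delivered in reverse order.
delivered = [(8, "terminated"), (7, "confirmed")]

strict, lenient = StrictPhone(), LenientPhone()
for version, state in delivered:
    strict.on_notify(version, state)
    lenient.on_notify(version, state)

print(strict.state)   # terminated (the stale version 7 was dropped)
print(lenient.state)  # confirmed (stale state overwrote the newer one)

# The restart case from this report: the phone is at version 6, server sends 1.
restored = StrictPhone(version=6, state="terminated")
accepted = restored.on_notify(1, "confirmed")
print(accepted, restored.state)  # False terminated
```

The strict phone ends up with the correct state when delivery is reordered, which is also why it rejects the version 1 document sent after a restart.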

By: ADTopkek (ADTopkek) 2018-09-28 15:20:20.421-0500

That's really bad. It means Asterisk is incompatible with modern SIP phones....

This is becoming a major problem: we need to update phones to newer firmware for certain compatibility and bug fixes, which then causes the BLFs to break completely every morning.

By: Bryan Nelson (bnelsonfs) 2018-09-28 16:36:25.435-0500

@ADTopkek

I definitely wouldn't go so far as to say this makes Asterisk incompatible with modern SIP phones. This problem only occurs when Asterisk is restarted and the subscriptions are restored from the astdb, not when a phone restarts. If you are having issues with BLF when a phone restarts, I suspect you are having a different problem. I can't think of any reason to need an Asterisk restart when updating firmware for phones...?

We have been running for a while with the sorcery config for persistence set to memory rather than astdb storage. This "disables" the persistence, which basically brings functionality back to the pre-PJSIP days.
{quote}
[res_pjsip_pubsub]
subscription_persistence=memory
{quote}

(If anyone knows a more correct way to simply disable the persistence completely, I'd appreciate the help)

When Asterisk restarts, all subscriptions are lost, and phones need to re-subscribe. Setting a relatively low subscription interval helps restore things pretty quickly. Since Asterisk restarts "should" be a rare occurrence, and generally happen during off-hours, the BLF subscriptions are restored long before anyone is back in the office to use them.



By: ADTopkek (ADTopkek) 2018-10-01 08:37:18.753-0500

We restart Asterisk every day to keep call quality from slowly degrading, so every morning the BLF buttons on phones with newer firmware do not work.

What do you mean brings it back to the pre-pjsip days? This happens on both PJSIP and SIP.

Resubscribing does not update the BLF version number. My phone is set to 5 min subscriptions and the problem persists for hours/days or until I manually reboot the phone. Same with clients.

By: ADTopkek (ADTopkek) 2018-11-07 10:20:33.380-0600

The same problem is happening on the Polycom VVX410.

By: Jens Beyer (j.beyer) 2018-12-12 06:35:45.951-0600

I can confirm the BLF problem with Yealink phones and firmware 44.82.X. The workaround that Bryan Nelson has posted is working for me as well, but is only a partial solution because you still cannot just reload but must restart asterisk. So I tried this:

{quote}
[res_pjsip_pubsub]
subscription_persistence/cache=memory_cache,object_lifetime_maximum=4000,object_lifetime_stale=2000
subscription_persistence=config,dummy.conf
{quote}

with the idea to have a memory cache on top of an empty config file. The memory cache should be cleared on reload and then the phones must resubscribe. But this has the same effect as the idea from Bryan Nelson - after reload the BLF will not work. Maybe it is necessary to add the option "expire_on_reload" but I have not tested this yet.

By: Jens Beyer (j.beyer) 2018-12-12 10:41:31.648-0600

Even with added memory cache option "expire_on_reload" a restart is still necessary to keep the BLFs alive. This is strange because it contradicts what the option seems to be for.

By: John Smith (user1) 2018-12-19 12:24:11.959-0600

Is Bryan's sorcery.conf fix a good workaround for a noob user like me? What should we do until this is fixed? Just stay with the old versions of the Yealink firmware? Thanks.

By: Friendly Automation (friendly-automation) 2020-01-09 16:24:38.208-0600

Change 13575 merged by Friendly Automation:
res_pjsip_pubsub: Add ability to persist generator state information.

[https://gerrit.asterisk.org/c/asterisk/+/13575|https://gerrit.asterisk.org/c/asterisk/+/13575]

By: Friendly Automation (friendly-automation) 2020-01-09 16:26:12.559-0600

Change 13581 merged by Friendly Automation:
res_pjsip_pubsub: Add ability to persist generator state information.

[https://gerrit.asterisk.org/c/asterisk/+/13581|https://gerrit.asterisk.org/c/asterisk/+/13581]

By: Bryan Nelson (bnelsonfs) 2020-01-09 16:34:55.822-0600

Thanks for the work on this one!

I am available for testing if needed.

Edit: Apologies, didn't mean to re-open the issue here.

By: Asterisk Team (asteriskteam) 2020-01-09 16:34:56.140-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Friendly Automation (friendly-automation) 2020-01-09 17:29:28.856-0600

Change 13574 merged by Kevin Harwell:
res_pjsip_pubsub: Add ability to persist generator state information.

[https://gerrit.asterisk.org/c/asterisk/+/13574|https://gerrit.asterisk.org/c/asterisk/+/13574]

By: Friendly Automation (friendly-automation) 2020-01-09 17:29:46.247-0600

Change 13520 merged by Kevin Harwell:
res_pjsip_pubsub: Add ability to persist generator state information.

[https://gerrit.asterisk.org/c/asterisk/+/13520|https://gerrit.asterisk.org/c/asterisk/+/13520]