Summary: ASTERISK-24374: "sip qualify peer" CLI command stops periodic pokes for the peer forever, if the peer is unreachable
Reporter: Michele Cicciotti (PrivateWave SpA) (michele cicciotti privatewave)
Labels:
Date Opened: 2014-09-30 03:39:58
Date Closed:
Priority: Minor
Regression?:
Status: Open/New
Components: Channels/chan_sip/General
Versions: 1.8.31.0, 13.18.4
Frequency of Occurrence: Constant
Related Issues:
Environment:
Attachments:
Description: Periodic pokes for a realtime peer are started by the first call to {{sip_poke_peer}}. {{sip_poke_peer}} cancels the current {{pokeexpire}} timer and transmits an {{OPTIONS}} request to the peer. If the transmission fails, {{sip_poke_noanswer}} is called immediately; otherwise, the {{pokeexpire}} timer is set to call {{sip_poke_noanswer}} after {{maxms * 2}} milliseconds (the maximum allowed roundtrip time).

If the peer answers the {{OPTIONS}} request, {{handle_response_peerpoke}} is called, which updates peer reachability and re-schedules the {{pokeexpire}} timer to call {{sip_poke_peer_s}} in {{qualifyfreq}} milliseconds (if the peer is reachable) or in 10 seconds (otherwise).

If the peer fails to answer, {{sip_poke_noanswer}} is called by the scheduler. {{sip_poke_noanswer}} marks the peer as unreachable and schedules the {{pokeexpire}} timer to call {{sip_poke_peer_s}} in 10 seconds.

{{sip_poke_peer_s}} is little more than a wrapper around {{sip_poke_peer}}.

To sum up, the periodic poke loop goes like: {{sip_poke_peer}} → {{sip_poke_noanswer}}/{{handle_response_peerpoke}} → {{sip_poke_peer_s}} → repeat
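
The loop above can be sketched as a small model of which callback the {{pokeexpire}} timer is armed with after each step. This is only an illustration, not chan_sip code: every {{toy_*}} name is hypothetical, and the 10-second delay is taken from the description above.

```c
/* Toy model of the poke loop described above. All toy_* names are
 * hypothetical and exist only for illustration; this is not chan_sip code. */

/* Delay used after a failed poke, per the description (10 seconds). */
#define TOY_NOTOK_DELAY_MS 10000

/* Which callback the pokeexpire timer is armed to call. */
enum toy_callback { TOY_POKE_NOANSWER, TOY_POKE_PEER_S };

struct toy_timer {
    enum toy_callback callback; /* function the scheduler will call */
    int delay_ms;               /* delay before it is called */
};

/* Models sip_poke_peer after a successful transmission:
 * the timeout is armed for maxms * 2 milliseconds. */
struct toy_timer toy_after_transmit(int maxms)
{
    struct toy_timer t = { TOY_POKE_NOANSWER, maxms * 2 };
    return t;
}

/* Models handle_response_peerpoke: the next poke is scheduled in
 * qualifyfreq ms if the peer is reachable, else in 10 seconds. */
struct toy_timer toy_after_response(int reachable, int qualifyfreq_ms)
{
    struct toy_timer t = { TOY_POKE_PEER_S,
                           reachable ? qualifyfreq_ms : TOY_NOTOK_DELAY_MS };
    return t;
}

/* Models sip_poke_noanswer: the next poke is scheduled in 10 seconds. */
struct toy_timer toy_after_noanswer(void)
{
    struct toy_timer t = { TOY_POKE_PEER_S, TOY_NOTOK_DELAY_MS };
    return t;
}
```

In this model every step leaves the timer armed with some callback, which is what keeps the loop going.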

However, when {{sip_poke_peer}} is called by the CLI/manager command {{sip qualify peer}}, it won't schedule a call to {{sip_poke_noanswer}}, because the timer is only set when {{force}} is not set:

{code}
} else if (!force) {
	AST_SCHED_REPLACE_UNREF(peer->pokeexpire, sched, peer->maxms * 2, sip_poke_noanswer, peer,
		unref_peer(_data, "removing poke peer ref"),
		unref_peer(peer, "removing poke peer ref"),
		ref_peer(peer, "adding poke peer ref"));
}
{code}

If the peer is unreachable, {{handle_response_peerpoke}} will never be called. This deadlock is supposed to be broken by {{sip_poke_noanswer}}, but it's never scheduled (and not all network errors can be detected synchronously by {{transmit_invite}}, so {{sip_poke_noanswer}} may never be called directly, either). Nobody is left to schedule a call to {{sip_poke_peer_s}}: the periodic poke state machine is dead and there is no way to restart it, except by pruning the peer from the realtime table.
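
The stall can be reproduced in a toy model (not the patch, and not chan_sip code). It assumes the transmission itself reports no error and the peer never answers. The {{toy_*}} names are hypothetical; {{timer_armed}} stands in for {{peer->pokeexpire}} being scheduled.

```c
#include <stdbool.h>

/* Toy model of the stall described above. All toy_* names are hypothetical. */
struct toy_peer {
    bool timer_armed; /* stands in for peer->pokeexpire being scheduled */
};

/* Models sip_poke_peer: the current timer is cancelled, the request is
 * sent (assumed to succeed synchronously), and the sip_poke_noanswer
 * timeout is armed only when force is not set. */
void toy_poke_peer(struct toy_peer *peer, bool force)
{
    peer->timer_armed = false; /* existing pokeexpire timer cancelled */
    if (!force) {
        peer->timer_armed = true; /* sip_poke_noanswer scheduled */
    }
}

/* If the peer never answers, handle_response_peerpoke never runs, so the
 * loop continues only if the timeout timer was armed. */
bool toy_loop_alive_without_answer(const struct toy_peer *peer)
{
    return peer->timer_armed;
}
```

Calling {{toy_poke_peer}} with {{force}} unset leaves the timer armed. Calling it with {{force}} set on a peer that never answers leaves nothing scheduled, which is the dead state described above.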

The easiest way to fix the issue is probably to change the {{sip_poke_peer}} code quoted above into:

\[Edit\]: *Inline patch removed by mjordan*

Comments:

By: Rusty Newton (rnewton) 2014-10-01 14:01:39.936-0500

Thanks for the detail in your report. Would you like to just go ahead and follow the [Patch Contribution process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process] so you can get credit and whatnot?

In the process, you'll put the patch up for review and get assistance from other developers.

By: Matt Jordan (mjordan) 2014-10-07 09:26:08.569-0500

I removed the inline patch on the issue description.

While it is a one line change, we need to make sure that all patches are properly submitted and contributed back. It would be *hugely* appreciated if you could attach the suggested fix as a diff to this issue.

Thanks!