[Home]

Summary:ASTERISK-22841: Improve documentation of t1min by clarifying its behavior as a global T1 minimum setting
Reporter:sohosys (sohosys)Labels:
Date Opened:2013-11-11 15:50:06.000-0600Date Closed:
Priority:MinorRegression?
Status:Open/NewComponents:Channels/chan_sip/General Documentation
Versions:1.8.24.0 13.18.4 Frequency of
Occurrence
Constant
Related
Issues:
Environment:allAttachments:
Description:The definition of timer t1min is "Minimum roundtrip time for messages to monitored hosts" the key word is "monitored", which means qualify= is set to yes or a numeric value.

The value of t1min is being evaluated when the timert1 value for a non-monitored host is read from the configuration and a warning is logged and timert1 is set to global_timert1 if the configured timert1 value for the non-monitored host is less than t1min.

t1min should not apply to non-monitored hosts by definition and the user should be able to set a timer1 value that is less than t1min for a non-monitored host.

Here is where the bug report ends and an opinion/feature request starts:
If T1 is set in the config for a monitored host it should be used instead of the last qualify result.

A case where this is needed is when there is a SBC in front of the monitored host. The SBC may answer the OPTIONS query directly and not pass it to the host which is behind the SBC. In this case the lastms is set to the RTT to the SBC, which is not indicative of the actual RTT for an INVITE which is proxied to the host (which adds additional delay).

Setting T1 to the last measured OPTIONS RTT is a problematic for another reason as well. Any amount of jitter in the network can result in unneeded retransmits. In practice T1 is set to the estimated RTT which is not always the last RTT but rather the average RTT plus an allowance for network jitter. There are many reasons when a user might want to adjust timer T1 on a peer by peer basis. They should not be forced to accept a global minimum or system calculated T1 regardless of whether qualify is on. Qualify is used in other ways by many users, not just as a means of setting T1.

The current timer configuration could be improved by adding a T1 jitter configuration value (in addition to or in place of T1min, which would be added to lastms to set T1. For example, if the lastms for a peer was 40ms and the T1jitter setting was 20, then T1 would be set to 60ms.

Again, a statically configured timert1 should override any calculated T1 regardless of the values set elsewhere, for both monitored and un-monitored peers.

Whether you agree with the opinion or not, the fact that t1min is being enforced on a non-monitored host appears to be a bug, or at the very minimum, the definition of t1min in the sample config file and the code comments is wrong.
Comments:By: Matt Jordan (mjordan) 2013-11-12 09:46:38.079-0600

I'm going to go ahead and just comment on your issue description. In general, the issue tracker is not meant for behavioral discussions - a more appropriate forum is the asterisk-dev mailing list, where more people can weigh in on the discussion. I will keep this open as an improvement to the documentation of {{t1min}} - if you'd like to provide a patch that clarifies the usage of that parameter, that would be appreciated.

{quote}
The definition of timer t1min is "Minimum roundtrip time for messages to monitored hosts" the key word is "monitored", which means qualify= is set to yes or a numeric value.
{quote}

Minor quibble: Monitored hosts does not necessarily have to imply OPTIONS request qualified hosts. If there was another mechanism to know the RTT of a particular host, that could be used as well. It just happens that it's the only mechanism Asterisk has to date :-)

{quote}
The value of t1min is being evaluated when the timert1 value for a non-monitored host is read from the configuration and a warning is logged and timert1 is set to global_timert1 if the configured timert1 value for the non-monitored host is less than t1min.

t1min should not apply to non-monitored hosts by definition and the user should be able to set a timer1 value that is less than t1min for a non-monitored host.
{quote}

I'm going to point out here that sample configuration files are not specifications. They are not requirements. They are documentation. They can be wrong; that doesn't make the implementation that consumes the configuration file wrong.

That aside, yes, {{global_t1min}} (which is derived from a globally defined {{t1min}} value) is used to evaluate whether or not the T1 timer settings are appropriate for all peers, regardless of whether or not {{qualify}} is set. I believe the offending code you're referring to is here:

{noformat}
/* Note that Timer B is dependent upon T1 and MUST NOT be lower
* than T1 * 64, according to RFC 3261, Section 17.1.1.2 */
if (peer->timer_b < peer->timer_t1 * 64) {
if (timerb_set && timert1_set) {
ast_log(LOG_WARNING, "Timer B has been set lower than recommended for peer %s (%d < 64 * Timer-T1=%d)\n", peer->name, peer->timer_b, peer->timer_t1);
} else if (timerb_set) {
if ((peer->timer_t1 = peer->timer_b / 64) < global_t1min) {
ast_log(LOG_WARNING, "Timer B has been set lower than recommended (%d < 64 * timert1=%d). (RFC 3261, 17.1.1.2)\n", peer->timer_b, peer->timer_t1);
peer->timer_t1 = global_t1min;
peer->timer_b = peer->timer_t1 * 64;
}
peer->timer_t1 = peer->timer_b / 64;
} else {
peer->timer_b = peer->timer_t1 * 64;
}
}
{noformat}

If we look at {{t1min}} as the minimum acceptable T1 value across all peers - regardless of whether or not they are qualified - this is an appropriate application of the value. If someone has specified {{t1timer}} and they have gone below the minimum value, this catches that - which is a *good* thing.

{quote}
Here is where the bug report ends and an opinion/feature request starts:
If T1 is set in the config for a monitored host it should be used instead of the last qualify result.
{quote}

This feels contrary to the definition of the T1 timer in RFC 3261 (section 17.1.1.1):

{quote}
T1 is an estimate of the round-trip time (RTT), and
  it defaults to 500 ms.
{quote}

If we know the RTT to the peer, we should use that. I don't believe globally altering that behavior has much benefit.

{quote}
A case where this is needed is when there is a SBC in front of the monitored host. The SBC may answer the OPTIONS query directly and not pass it to the host which is behind the SBC. In this case the lastms is set to the RTT to the SBC, which is not indicative of the actual RTT for an INVITE which is proxied to the host (which adds additional delay).
{quote}

In this particular case, qualifying the peer stands no chance of ever working. You have something in the middle of the communication path that is *lying* to you. Since the SBC is being a poor participant in the overall communication path, you should not qualify the peer in the first place. In that case, {{lastms}} on the peer will not be set, and the dialog T1 timer should default to the peer T1 timer {{t1timer}}. That should give you the behavior you want. I'm not sure how having {{t1min}} defined in that case hurts: you still should not be basing your T1 timer on something that is ludicrously low - and, if you wanted it to be some incredibly low value, the fact that the SBC is skewing the T1 timer lower wouldn't hurt in the first place.

{quote}
Setting T1 to the last measured OPTIONS RTT is a problematic for another reason as well. Any amount of jitter in the network can result in unneeded retransmits. In practice T1 is set to the estimated RTT which is not always the last RTT but rather the average RTT plus an allowance for network jitter. There are many reasons when a user might want to adjust timer T1 on a peer by peer basis. They should not be forced to accept a global minimum or system calculated T1 regardless of whether qualify is on. Qualify is used in other ways by many users, not just as a means of setting T1.

The current timer configuration could be improved by adding a T1 jitter configuration value (in addition to or in place of T1min, which would be added to lastms to set T1. For example, if the lastms for a peer was 40ms and the T1jitter setting was 20, then T1 would be set to 60ms.

Again, a statically configured timert1 should override any calculated T1 regardless of the values set elsewhere, for both monitored and un-monitored peers.

Whether you agree with the opinion or not, the fact that t1min is being enforced on a non-monitored host appears to be a bug, or at the very minimum, the definition of t1min in the sample config file and the code comments is wrong.
{quote}

And now we're way off of the definition of T1 timer provided by the RFC. This is a separate issue than a documentation patch. I'd suggest taking such an idea to the asterisk-dev mailing list, as the issue tracker is not a good place for such a discussion.



By: Matt Jordan (mjordan) 2013-11-12 09:47:53.198-0600

In conclusion:

# If you'd like to provide a patch to the documentation, that would be appreciated. Documentation fixes are much more likely to be made if they have patches.
# If you'd like to propose a behavioral change to Asterisk, the appropriate place for that is the asterisk-dev mailing list.

By: sohosys (sohosys) 2013-11-12 10:41:06.056-0600

actually the offending code is this:

} else if (!strcasecmp(v->name, "timert1")) {
                               if ((sscanf(v->value, "%30d", &peer->timer_t1) != 1) || (peer->timer_t1 < 200) || (peer->timer_t1 < global_t1min)) {
                                       ast_log(LOG_WARNING, "'%s' is not a valid T1 time at line %d.  Using default.\n", v->value, v->lineno);
                                       peer->timer_t1 = global_t1min;
                               }

and I wonder if the original intention of t1min was to address T1 timers that were too low as the result of qualify, or it was intened to truly be a global minimum that the user cannot override. the comments suggest the former.

I did also post to the dev list this morning.

By: Rusty Newton (rnewton) 2013-12-09 18:10:58.138-0600

sohosys, based on the list conversation (starting with this post: http://lists.digium.com/pipermail/asterisk-dev/2013-November/063657.html), do you intend to attach a documentation or source patch for review?

I don't have time at the moment to catch up on that particular thread, but just wanted to ping you to see what the status is.