ASTERISK-29241: pjsip / register: wrong port used in Contact and Via if multiple transports are defined.

Summary: ASTERISK-29241: pjsip / register: wrong port used in Contact and Via if multiple transports are defined.

Reporter: Michael Maier (micha)
Labels:
Date Opened: 2021-01-10 15:00:21.000-0600
Date Closed:
Priority: Minor
Regression?: No
Status: Open/New
Components: Resources/res_pjsip_outbound_registration
Versions: 18.0.1
Frequency of Occurrence: Constant
Related Issues:
Environment: CentOS 7 x86_64
Attachments:

Description: Define some transports:

CLI> pjsip show transports
{code}
Transport: <TransportId........> <Type> <cos> <tos> <BindAddress....................>
==========================================================================================

Transport: 0.0.0.0-tls tls 3 184 0.0.0.0:5061
Transport: 0.0.0.0-tls2 tls 3 184 0.0.0.0:5062
Transport: 0.0.0.0-tls3 tls 3 184 0.0.0.0:5063
Transport: 0.0.0.0-udp udp 3 184 0.0.0.0:5060
{code}

The three TLS transports are assigned to different trunk configurations. Two examples:

{code}
CLI> pjsip show registration telekomPJSIP-001

<Registration/ServerURI..............................> <Auth..........> <Status.......>
==========================================================================================

telekomPJSIP-001/sip:tel.t-online.de telekomPJSIP-001 Registered

ParameterName : ParameterValue
============================================================
auth_rejection_permanent : true
client_uri : sip:+49...@tel.t-online.de
contact_header_params :
contact_user : +49...
endpoint : telekomPJSIP-001
expiration : 660
fatal_retry_interval : 0
forbidden_retry_interval : 10
line : true
max_retries : 10000
outbound_auth : telekomPJSIP-001
outbound_proxy :
retry_interval : 60
server_uri : sip:tel.t-online.de
support_mediasec : true
support_outbound : no
support_path : false
transport : 0.0.0.0-tls

CLI> pjsip show registration telekomPJSIP-002

<Registration/ServerURI..............................> <Auth..........> <Status.......>
==========================================================================================

telekomPJSIP-002/sip:tel.t-online.de telekomPJSIP-002 Registered

ParameterName : ParameterValue
=============================================================
auth_rejection_permanent : true
client_uri : ...
contact_header_params :
contact_user : ...
endpoint : telekomPJSIP-002
expiration : 660
fatal_retry_interval : 0
forbidden_retry_interval : 10
line : true
max_retries : 10000
outbound_auth : telekomPJSIP-002
outbound_proxy :
retry_interval : 60
server_uri : sip:tel.t-online.de
support_mediasec : true
support_outbound : no
support_path : false
transport : 0.0.0.0-tls2

CLI> pjsip show endpoint telekomPJSIP-001

Endpoint: telekomPJSIP-001 Not in use 0 of inf
OutAuth: telekomPJSIP-001/+49...
Aor: telekomPJSIP-001 0
Contact: telekomPJSIP-001/sip:+49...@tel.t-o 88c72b9045 Avail 13.944
Transport: 0.0.0.0-tls tls 3 184 0.0.0.0:5061
Identify: telekomPJSIP-001/telekomPJSIP-001
Match: 127.0.0.10/32

CLI> pjsip show endpoint telekomPJSIP-002

Endpoint: telekomPJSIP-002 Not in use 0 of inf
OutAuth: telekomPJSIP-002/+49...
Aor: telekomPJSIP-002 0
Contact: telekomPJSIP-002/sip:+49...@tel.t- 7f03d717f5 Avail 13.425
Transport: 0.0.0.0-tls2 tls 3 184 0.0.0.0:5062
Identify: telekomPJSIP-002/telekomPJSIP-002
Match: 127.0.0.10/32

[root@myfw ~]# netstat -n | grep 506
tcp 0 0 3.2.1.5:53527 217.0.20.195:5061 ESTABLISHED
tcp 0 0 3.2.1.5:49161 217.0.20.195:5061 ESTABLISHED
tcp 0 0 3.2.1.5:56727 217.0.20.195:5061 ESTABLISHED

I verified via tcpdump, that each Register now uses its own connection.

Next, I checked the Register packets - to telekomPJSIP-001, e.g:

Via: SIP/2.0/TLS 3.2.1.5:5062;rport;branch=...
^^^^
Contact: <sip:+49...@3.2.1.5:5062;transport=TLS;line=...>
^^^^
{code}
=> This should be 5061 (because transport 0.0.0.0-tls is bound to 5061), not 5062.
=> It turns out that *all* REGISTER requests of all trunks now use port 5062. Why is that? Ports 5061 and 5063 are ignored completely.

The wrong port can be seen in the log, in a debug message printed by pjsip_message_filter.c:
{code}
static pj_status_t filter_on_tx_message(pjsip_tx_data *tdata)
...
ast_debug(5, "Re-wrote Contact URI host/port to %.*s:%d (this may be re-written again later)\n",
(int)pj_strlen(&uri->host), pj_strbuf(&uri->host), uri->port);
{code}

I didn't test with 18.1.x - but I guess that it could behave the same way.

Comments: By: Asterisk Team (asteriskteam) 2021-01-10 15:00:22.778-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].
By: Florian Floimair (f.floimair) 2021-03-17 10:35:28.690-0500

I submitted a patch for this today, as we also ran into this issue.

By: Michael Maier (micha) 2021-03-17 14:08:57.840-0500

I tested your patch against the configuration described [here|https://www.mail-archive.com/asterisk-dev@lists.digium.com/msg45516.html]. I'm now getting the correct local ports for each configured transport. Seems to work fine for me! Thanks!
By: Florian Floimair (f.floimair) 2021-03-18 04:51:02.962-0500

Great that you were able to verify it this quickly. Thanks!
Now waiting for feedback in Gerrit!
By: Friendly Automation (friendly-automation) 2021-06-15 09:06:39.398-0500

Change 15620 merged by Friendly Automation:
res_pjsip/pjsip_message_filter: set preferred transport in pjsip_message_filter

[https://gerrit.asterisk.org/c/asterisk/+/15620|https://gerrit.asterisk.org/c/asterisk/+/15620]
By: Friendly Automation (friendly-automation) 2021-06-15 09:07:55.703-0500

Change 15619 merged by Friendly Automation:
res_pjsip/pjsip_message_filter: set preferred transport in pjsip_message_filter

[https://gerrit.asterisk.org/c/asterisk/+/15619|https://gerrit.asterisk.org/c/asterisk/+/15619]
By: Friendly Automation (friendly-automation) 2021-06-15 09:11:50.606-0500

Change 15630 merged by George Joseph:
res_pjsip/pjsip_message_filter: set preferred transport in pjsip_message_filter

[https://gerrit.asterisk.org/c/asterisk/+/15630|https://gerrit.asterisk.org/c/asterisk/+/15630]
By: Michael Maier (micha) 2021-06-15 12:14:03.409-0500

Sorry - the current patch is wrong and leads to a missing port in the Via and Contact headers on outbound calls, e.g. in INVITE requests. The fixed patch would be:
{code}
+ if ((tdata->tp_info.transport->key.type != PJSIP_TRANSPORT_UDP) &&
+ (tdata->tp_info.transport->key.type != PJSIP_TRANSPORT_UDP6)) {
+ sel.type = PJSIP_TPSELECTOR_TRANSPORT; /* not: ..._LISTENER */
+ sel.u.listener = tdata->tp_info.transport->factory;
+ prm.tp_sel = &sel;
+ }
+
{code}

Admittedly, the above patch works for me - but I fear there is an inconsistency in line 4: this should be sel.u.transport, too, as in the original patch by Florian.
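
The inconsistency mentioned above can be illustrated with a small stand-alone sketch. It assumes, as the field names sel.type, sel.u.listener and sel.u.transport suggest, that the selector is a tagged union whose type field says which union member is valid. All names below (mock_selector, mock_transport, mock_listener, select_transport, select_listener, selected_port) are hypothetical stand-ins and not PJSIP types or functions; the sketch only shows why the tag and the union member should always be set together.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a connection (transport) and a listener (factory). */
struct mock_transport { int local_port; };
struct mock_listener  { int bound_port; };

enum mock_sel_type { MOCK_SEL_NONE, MOCK_SEL_TRANSPORT, MOCK_SEL_LISTENER };

/* Tagged union: 'type' says which member of 'u' may be read. */
struct mock_selector {
    enum mock_sel_type type;
    union {
        struct mock_transport *transport;
        struct mock_listener *listener;
    } u;
};

/* Set the tag and the matching member in one place so they cannot diverge. */
static void select_transport(struct mock_selector *sel, struct mock_transport *tp)
{
    sel->type = MOCK_SEL_TRANSPORT;
    sel->u.transport = tp;
}

static void select_listener(struct mock_selector *sel, struct mock_listener *l)
{
    sel->type = MOCK_SEL_LISTENER;
    sel->u.listener = l;
}

/* Read the port through the member that the tag names; -1 if nothing is selected. */
static int selected_port(const struct mock_selector *sel)
{
    switch (sel->type) {
    case MOCK_SEL_TRANSPORT:
        return sel->u.transport ? sel->u.transport->local_port : -1;
    case MOCK_SEL_LISTENER:
        return sel->u.listener ? sel->u.listener->bound_port : -1;
    default:
        return -1;
    }
}
```

With helpers like these, a selector tagged as "transport" can never carry a listener pointer, which is the mismatch suspected in line 4 of the patch fragment.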
By: Asterisk Team (asteriskteam) 2021-06-15 12:14:03.606-0500

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.
By: Joshua C. Colp (jcolp) 2021-06-15 12:18:00.879-0500

Which port are you referring to? Bound, or ephemeral?
By: Joshua C. Colp (jcolp) 2021-06-15 12:24:06.730-0500

The patch does not add support for placing the ephemeral port in the messages. This would be a separate thing, and behind an option that defaults to off. The patch resolves the issue where if a transport is explicitly configured it is used in the signalling.
By: Michael Maier (micha) 2021-06-15 12:24:17.655-0500

It's the port pjsip uses for an outbound call to a trunk that Asterisk has registered to as a client. It is independent of the listener port (if there is a listener at all - you don't need one if you register to a trunk via TLS / TCP).
By: Joshua C. Colp (jcolp) 2021-06-15 12:25:19.333-0500

This issue did not make it clear that you were referring to the ephemeral port. I shall leave this issue open.
By: Michael Maier (micha) 2021-06-15 13:15:17.891-0500

I have now tested with a listener enabled. The current patch fixes "pjsip show transports", which now shows the correct port - that's correct - but at the same time it breaks the ports used in the Via and Contact headers.

Example:
{code}
[t-easybell]
type=transport
protocol=tls
bind=192.168.1.94:5063
ca_list_file=/etc/pki/tls/certs/ca-bundle.crt
method=tlsv1_2
verify_server=yes
{code}
{code}
localhost*CLI> pjsip show transports

Transport: <TransportId........> <Type> <cos> <tos> <BindAddress....................>
==========================================================================================

Transport: t-easybell tls 3 184 192.168.1.94:5063
{code}

The registration to easybell opens the following static TCP connection:
{code}
[root@localhost ~]# netstat -tnup
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 192.168.1.94:33917 212.172.58.207:5061 ESTABLISHED 9001/asterisk
{code}

If you now place an outbound call via easybell, the existing connection 192.168.1.94:33917 -> 212.172.58.207:5061 is used. But port 5063 is placed in the Via and Contact headers - which is wrong.
The original patch fixed both scenarios, whereas the current patch only fixes the presentation in pjsip show transports and breaks the SIP protocol.
By: Joshua C. Colp (jcolp) 2021-06-15 13:28:56.763-0500

Can you cite where it actually says that it is incorrect and that this breaks the SIP protocol?

From my reading of the RFC, comments by others, and mailing list posts, that statement is not incorrect. The port is supposed to be a port that can be used to establish a new connection back. Some implementations may have policy that does connection reuse if the ephemeral port is used there, instead, but that is not present within RFC3261. There is also [RFC5626|https://www.rfc-editor.org/rfc/rfc5626.txt] which does specify connection reuse, but that is not supported.

The patch itself does resolve the issue where if there are multiple transports and one is explicitly set, the correct port and IP address are placed in the signaling.
By: Michael Maier (micha) 2021-06-16 05:13:10.367-0500

Some background to hopefully give you a better understanding of how Asterisk / pjsip already works when it comes to TCP/TLS.

*1) Asterisk as client, which registers to SIP trunk provider*

a) pjsip opens a TCP connection to the SIP trunk provider: localIP:someSrcPort - destIP:destPort (destPort usually 5060/5061) / via usual TCP-Handshake
b) If TLS is in use, pjsip starts the TLS handshake
c) This new connection is now a static connection. All messages, regardless of direction and regardless of whether they belong to inbound or outbound calls, are sent through this connection. Means: The SIP trunk uses exactly this connection for Inbound calls, too (connection reuse). In Short: absolutely all messages go through this connection - no other connections are involved - this would be foolish, because it would consume a lot of useless time and resources for starting new TCP/TLS-connections (each time TCP handshakes and TLS handshakes).
d) The described behavior is security by design, because Asterisk doesn't need any TCP listener at all for the connection to the SIP Trunk. Therefore it's impossible for any other attacker from outside to open a new connection to the Asterisk server to try to place a call.

*2) Asterisk as a server that SIP phones register to*

a) SIP phones like Zoiper, sipdroid, Linphone, Gigaset, ... start registration to the internal Asterisk listener (TCP/5060 or 5061)
b) If TLS is in use, SIP phone starts the TLS handshake
c) This new connection is now a static connection. All messages, regardless of direction and regardless of whether they belong to inbound or outbound calls, are sent through this connection. Means: Asterisk itself uses exactly this connection for inbound calls, too. In Short: absolutely all messages go through this single connection - no other connections are involved - this would be foolish, because it would consume a lot of useless time and resources for starting new TCP/TLS-connections (each time TCP handshakes and TLS handshakes).
d) The described behavior is security by design, because Asterisk doesn't need any SYN right to connect to the phones at all. Therefore it's impossible to connect via TCP from Asterisk server to the voice devices. The opening of TCP connections from Asterisk server to (phone) networks are prohibited by firewall rules.

Everything I wrote here works perfectly out of the box with Asterisk / pjsip. You can verify it yourself using netstat, tcpdump, and the pjsip log.

To be honest - I don't really understand what exactly you mean by stating "There is also RFC5626 which does specify connection reuse, but that is not supported.". I could find a [patch|https://trac.pjsip.org/repos/ticket/2149] of yours for pjsip from 9/2018, resulting from a problem with Google Voice. I, on the other hand, am [referring to the behavior documented here|https://www.giacomovacca.com/2020/12/sip-connection-reuse-vs-persistent.html] - Asterisk / pjsip behaves exactly this way regarding persistent connections and connection reuse. During REGISTER, Asterisk sends the alias parameter in the Via header, which tells the server that connection reuse is desired. All following requests from Asterisk to the SIP provider contain the alias parameter, too. Things are working brilliantly (Google is by far not a reference for me - it's even the opposite). BTW: I'm building the source based on the Asterisk src.rpm file from Sangoma. Maybe they don't activate your patch? In config.log, the define HAVE_PJSIP_TRANSPORT_DISABLE_CONNECTION_REUSE is set to 1. It should be active then? If not - that's perfect, too - I don't want to be restricted because of Google problems. The world is much bigger than Google.

Now back to the original question here: Does the current patch match the requirements? The scenario is transports providing both client and listener functionality (client is the default in pjsip and can't be deactivated - the listener can be deactivated by pjsip means). The current patch writes the listener ports to the SIP messages - though they are never used in the described scenario - not even the initial connection uses this port.

Florian's original patch wrote the local port actually in use to the Via and Contact headers. For my part, I can say: the current variant has also been working in a short test with two SIP trunk providers (even in the NAT case). But I can't speak for Florian.

If some RFC requires the listener port to be in the Via / Contact header even if it's not used at all - as long as all participants ignore this wrong port and it works - I don't have any problem. I can't say in general whether every trunk server behaves that way in every situation (I doubt it, to be honest).

Now to the next step, using transports for registration to a SIP trunk via TCP/TLS with the transport listener completely deactivated: what do you want to write to the Via / Contact header? There is no listener at all. The only thing you can write to the headers is the local port actually in use, i.e. the one used to build the connection from Asterisk to the SIP trunk. That's exactly what Florian's patch did.

But I admit that there could be / are scenarios, too, where a listener is additionally really needed and used. In this case, you should certainly add the listener port. Maybe add a switch to let the admin decide which port to place in the SIP headers in case of trouble? Or check whether the transport provides a listener - if yes, take the listener port - if no, take the client port. This seems reasonable to me.
By: Joshua C. Colp (jcolp) 2021-06-16 05:28:17.532-0500

"The SIP trunk uses exactly this connection for Inbound calls, too (connection reuse). In Short: absolutely all messages go through this connection - no other connections are involved - this would be foolish, because it would consume a lot of useless time and resources for starting new TCP/TLS-connections (each time TCP handshakes and TSL handshakes)."

That statement is incorrect. A SIP trunk MAY reuse this connection, but that is not part of the standard SIP RFC and is a policy decision or deployment decision based on their implementation. It is a choice they've made. It is also part of SIP outbound and the SIP connection reuse RFC, which we don't support in Asterisk itself.

"Asterisk itself uses exactly this connection for inbound calls, too. In Short: absolutely all messages go through this single connection - no other connections are involved - this would be foolish, because it would consume a lot of useless time and resources for starting new TCP/TLS-connections (each time TCP handshakes and TSL handshakes)."

This is also incorrect by default. If rewrite_contact is enabled then we will reuse the existing connection, but it is not by default.

Google is a special case because of their unique usage of SIP (which required special behavior for them in the first place). The patch you are referring to was disabling connection reuse of outgoing connections to the same target IP address and port, as each connection was essentially bound to the authenticated user and you could not use multiple over the same connection. PJSIP by default reused the same connection causing it to not work.
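
The difference between reusing one connection per target IP address and port and opening a separate connection per registration can be sketched as a toy model in C. This is not PJSIP code: mock_pool, mock_conn, pool_init, acquire_conn and the disable_reuse flag are hypothetical names chosen for illustration, and the lookup is a plain linear search under the assumption that a connection is identified only by its target address and port.

```c
#include <string.h>

#define MOCK_MAX_CONNS 8

/* Hypothetical record of one outgoing connection, keyed by its target. */
struct mock_conn {
    char ip[46];
    int port;
};

struct mock_pool {
    struct mock_conn conns[MOCK_MAX_CONNS];
    int count;
};

static void pool_init(struct mock_pool *pool)
{
    memset(pool, 0, sizeof(*pool));
}

/*
 * Return the index of the connection to use for ip:port, or -1 if the pool
 * is full. With disable_reuse == 0 an existing connection to the same target
 * is shared; with disable_reuse != 0 a new connection is always created.
 */
static int acquire_conn(struct mock_pool *pool, const char *ip, int port,
                        int disable_reuse)
{
    struct mock_conn *conn;
    int i;

    if (!disable_reuse) {
        for (i = 0; i < pool->count; i++) {
            if (pool->conns[i].port == port &&
                strcmp(pool->conns[i].ip, ip) == 0) {
                return i;
            }
        }
    }
    if (pool->count >= MOCK_MAX_CONNS) {
        return -1;
    }
    conn = &pool->conns[pool->count];
    strncpy(conn->ip, ip, sizeof(conn->ip) - 1);
    conn->ip[sizeof(conn->ip) - 1] = '\0';
    conn->port = port;
    return pool->count++;
}
```

In this model, two requests for the same target share index 0 by default, while a request with disable_reuse set gets its own entry, which corresponds to one connection per authenticated user.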

The patch as merged is correct in what it was fixing, it just didn't introduce the behavior you desire. If a patch is put up for review which does so, then it will be reviewed. As I stated on my review, if an option was put in place to put the ephemeral port in the SIP signaling and it defaulted to off then such a change could be accepted. The same goes for disabling a listener.

If Florian wants to work on such a thing then he can.
By: Joshua C. Colp (jcolp) 2021-06-16 05:39:48.035-0500

As well, as I stated before I will keep this issue open in case someone does want to do the above so they have an issue to reference and you are notified.
By: Michael Maier (micha) 2021-06-21 02:00:33.604-0500

Thanks Joshua!
I have to apologize! Unfortunately, I forgot to switch to the correct configuration when I did my first test. Meanwhile I found some time to test the current patch even against an "ephemeral" listener port. Your patch works fine in this case, too. But that's not surprising, as it always sets the current transport to the listener part. If, for example, the creation of a listener is disabled (because it's not needed at all), you need the first patch, which sets the current transport to the client part.
To get both things working at the same time, you just have to check at this point whether the listener part of the given transport contains a port != 0. If yes, set the current transport to the listener - if no, set the current transport to the client part.
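
The check proposed above can be sketched in a few lines of stand-alone C. This is only an illustration under stated assumptions: mock_transport_info, advertised_port and format_via are hypothetical names, not Asterisk or PJSIP code, and a listener port of 0 is assumed to mean that no listener was created.

```c
#include <stdio.h>

/* Hypothetical view of one configured transport; not a PJSIP or Asterisk type. */
struct mock_transport_info {
    const char *host;
    int listener_port; /* 0 is assumed to mean "no listener created" */
    int client_port;   /* local port of the established outbound connection */
};

/* Prefer the listener port if a listener exists, else fall back to the client port. */
static int advertised_port(const struct mock_transport_info *t)
{
    return t->listener_port != 0 ? t->listener_port : t->client_port;
}

/* Write a simplified Via header using the chosen port; returns snprintf()'s result. */
static int format_via(char *buf, size_t len, const struct mock_transport_info *t)
{
    return snprintf(buf, len, "Via: SIP/2.0/TLS %s:%d;rport",
                    t->host, advertised_port(t));
}
```

For the easybell example from this issue, a transport with a listener on 5063 would advertise 5063, while the same transport with the listener disabled would advertise the connection's local port 33917.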