ASTERISK-19430: 1.8.9.1 SIP NOTIFY crashes 2wire (U-Verse) routers


Summary: ASTERISK-19430: 1.8.9.1 SIP NOTIFY crashes 2wire (U-Verse) routers

Reporter: Schmooze Com (schmoozecom)
Labels:

Date Opened: 2012-02-23 15:29:54.000-0600
Date Closed: 2012-03-01 08:23:19.000-0600

Priority: Blocker
Regression?

Status: Closed/Complete
Components: Channels/chan_sip/Messaging

Versions: 1.8.9.1
Frequency of Occurrence: Constant

Related Issues:
must be completed before resolving ASTERISK-19128 Asterisk 1.8.10 Blockers

must be completed before resolving ASTERISK-19129 Asterisk 10.2.0 Blockers

must be completed before resolving ASTERISK-19271 Asterisk 1.8.11.0 Blockers

must be completed before resolving ASTERISK-19272 Asterisk 10.3.0 Blockers

Environment: 2wire 3801HGV router/DSL modem, firmware version 6.3.7.40-enh.tm

Attachments:
( 0) empty_port.diff
( 1) nc.pcap
( 2) notify_crash.pcap
( 3) notify_nocrash.pcap
( 4) notify.pcap

Description: When a SIP NOTIFY is sent to a phone behind a 2wire router, it causes the router to crash and reboot itself. My suspicion is that the :0 at the end of the Call-ID (which was not present in 1.4 or 1.6) may be causing a problem with the ALG implementation in the 2wire router.
This can be easily replicated by sending a sipura-check-cfg to any device behind a 2wire 38xx router.
This problem does not occur when sending NOTIFY from 1.4 or 1.6.2.

Comments: By: Schmooze Com (schmoozecom) 2012-02-23 15:31:15.440-0600

Attached pcap file shows NOTIFY packets that will consistently crash the 2wire router.
By: Schmooze Com (schmoozecom) 2012-02-23 15:33:27.245-0600

Attached pcap shows NOTIFY packet from 1.4 that does not crash the 2wire router.
By: Schmooze Com (schmoozecom) 2012-02-23 15:35:30.150-0600

Additional pcap that will consistently crash the 2wire router. Same NOTIFY type as notify_nocrash.pcap, but sent from 1.8.
By: Jason Parker (jparker) 2012-02-23 15:57:36.651-0600

I made this issue private, for somewhat obvious reasons. It can be made public again if 2wire does anything about it.
By: Matt Jordan (mjordan) 2012-02-23 16:38:29.858-0600

Just to check - has this been reported to 2wire? This is really more of a bug in their firmware than an Asterisk problem.
By: Schmooze Com (schmoozecom) 2012-02-23 16:43:46.505-0600

Yes, we sent them an email. Maybe I am misunderstanding SIP NOTIFY, but why is it adding a :0 to the Call-ID? Asterisk 1.4 and 1.6 do not do this and they work fine. Once you add the :0, which I assume is the port number, it crashes. Even if I set bindport in sip.conf to 5060, it still adds the :0.
By: Matt Jordan (mjordan) 2012-02-23 17:00:27.806-0600

The Call-ID can be any unique identifier that ties together the messages in a dialog. It must be globally unique over space and time with respect to the UAs involved. The RFC doesn't even mandate the localid@host form that Asterisk uses; it simply recommends it. The fact that the "host" part of the string now appends some form of port information (albeit apparently somewhat incorrect port information) is a detail that other UAs shouldn't care about. This can (obviously) change from version to version, and so long as uniqueness is maintained, it is an implementation detail that no one should depend on.

As for that implementation decision...

The Call-ID's host portion is constructed from one of two fields: the fromdomain setting (SIPFROMDOMAIN channel variable) if present, or the hostname of the local machine. The hostname of the machine is "stringified" in the netsock2 library using a flag that specifies that the address should be used, but the port shouldn't. Instead of the port that the host is bound on being presented to the getnameinfo library call, a port of "0" is presented.

Hence the :0.

I don't know why the behavior was changed from 1.6.x; that would take some digging. Again, however, the behavior isn't wrong: it's just different.
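
To illustrate the stringification described above, here is a minimal sketch in Python rather than the actual netsock2 code. The helper name stringify_host and the sample address 192.0.2.10 are made up for illustration. It only shows that when a port of 0 is handed to getnameinfo, the formatted result carries a ":0" suffix.

```python
import socket

def stringify_host(ip: str, port: int = 0) -> str:
    """Format a numeric IPv4 address as "host:port" via getnameinfo.

    Hypothetical helper for illustration only; this is not the
    Asterisk netsock2 function.
    """
    # Numeric flags avoid any DNS or service-name lookup.
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    host, serv = socket.getnameinfo((ip, port), flags)
    return f"{host}:{serv}"

print(stringify_host("192.0.2.10"))        # port never set, so 0 is passed
print(stringify_host("192.0.2.10", 5060))  # port set explicitly
```

The first call corresponds to the situation in this issue: nothing supplied a real port, so 0 ends up in the string.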
By: Schmooze Com (schmoozecom) 2012-02-23 19:33:02.216-0600

OK, we did some testing and it does not seem to be the Call-ID with a :0. It's something else in the packet compared to what 1.4 and 1.6 did.

I am setting up a packet replay and will change the NOTIFY one line at a time back to how 1.4 did it until I find which part is causing the issue. I will report back what we find. Thanks for the explanation of Call-ID.

I also see the :0 is added to the From, Via or Contact headers, whereas 1.4 and 1.6 did not have the :0.

Thanks for the feedback and help
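
The one-line-at-a-time replay described above could be organized roughly as follows. This is only a sketch: the function names and the sample header lines are invented for illustration and are not taken from the attached pcaps. Each variant is the crashing message with exactly one header replaced by the version from the known-good message, so a variant that stops crashing the router points at the responsible header.

```python
from typing import Iterator, List, Tuple

def header_name(line: str) -> str:
    """Return the lowercase header name of a "Name: value" line."""
    return line.split(":", 1)[0].strip().lower()

def single_header_variants(
    bad: List[str], good: List[str]
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (header name, message lines) with one header taken from `good`."""
    good_by_name = {header_name(line): line for line in good}
    for i, line in enumerate(bad):
        name = header_name(line)
        replacement = good_by_name.get(name)
        # Skip headers that are missing from, or identical in, the good message.
        if replacement is not None and replacement != line:
            yield name, bad[:i] + [replacement] + bad[i + 1:]

# Invented sample headers (assumption, not copied from the captures).
bad_headers = [
    "Via: SIP/2.0/UDP 192.0.2.10:0;branch=z9hG4bK1",
    "From: <sip:asterisk@192.0.2.10:0>;tag=as1",
    "Contact: <sip:asterisk@192.0.2.10:0>",
    "Max-Forwards: 70",
]
good_headers = [
    "Via: SIP/2.0/UDP 192.0.2.10;branch=z9hG4bK1",
    "From: <sip:asterisk@192.0.2.10>;tag=as1",
    "Contact: <sip:asterisk@192.0.2.10>",
    "Max-Forwards: 70",
]

for name, variant in single_header_variants(bad_headers, good_headers):
    print(f"--- variant with {name} header from the good packet ---")
    print("\r\n".join(variant))
```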
By: Schmooze Com (schmoozecom) 2012-02-23 22:01:00.883-0600

We verified that it is the Contact: header. If the port is set to 0, the router crashes, but if it is set to 5060 it does not crash. Uploading nc.pcap, which is the packet that confirmed the Contact header as the cause.
By: Schmooze Com (schmoozecom) 2012-02-23 22:01:30.043-0600

This packet with the Contact header having port 0 causes the crash.
By: Matt Jordan (mjordan) 2012-02-24 11:29:30.516-0600

Can you provide:
a) The SIP configuration of the peers involved with this NOTIFY
b) A pcap that includes the offending NOTIFY, as well as the other SUBSCRIBEs/NOTIFYs that occurred in relation to it

By: Schmooze Com (schmoozecom) 2012-02-24 11:45:48.986-0600

The nc.pcap file attached last night is the offending NOTIFY packet that confirmed that the Contact header field is the cause.

The sip.conf entry for the phone used for testing is as follows:
[512]
deny=0.0.0.0/0.0.0.0
secret=#########
dtmfmode=rfc2833
canreinvite=no
context=from-internal
host=dynamic
trustrpid=yes
sendrpid=no
type=friend
setvar=FAXOPT(gateway)=no
nat=yes
port=5060
qualify=yes
qualifyfreq=60
transport=udp
encryption=no
callgroup=
pickupgroup=
dial=SIP/512
mailbox=512@default
permit=0.0.0.0/0.0.0.0
callerid=device <512>
callcounter=yes
faxdetect=no
cc_monitor_policy=generic

By: Schmooze Com (schmoozecom) 2012-02-24 11:47:21.211-0600

Note that the original packet had :0 as the port number on the Contact header. I tested it both with :0 and :0000 with the same result.
By: Kinsey Moore (kmoore) 2012-02-29 14:18:53.874-0600

I have reproduced the packet in question and it seems that the move from 1.6.2 to 1.8 included the introduction of ast_sockaddr (for IPv6 support) and a new function call to generate the contact header in netsock2.c. The pre-1.8 code ignored the port if it was the default and seems to have had access to the correct port information. The underlying problem is that the sockaddr's port is not initialized properly for these notifications.
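
As a rough illustration of the pre-1.8 behavior described above (leaving the port out of the Contact header when it is the default), here is a small sketch. It is not the Asterisk code and not the contents of empty_port.diff. The function name build_contact is hypothetical, and treating a port of 0 as "unset" is an assumption made for this example.

```python
DEFAULT_SIP_PORT = 5060  # standard default port for SIP over UDP/TCP

def build_contact(user: str, host: str, port: int = 0) -> str:
    """Build a Contact header, omitting an unset (0) or default port.

    Hypothetical helper; treating 0 as "unset" is an assumption.
    """
    if port in (0, DEFAULT_SIP_PORT):
        hostport = host
    else:
        hostport = f"{host}:{port}"
    return f"Contact: <sip:{user}@{hostport}>"

print(build_contact("asterisk", "192.0.2.10"))        # no port emitted
print(build_contact("asterisk", "192.0.2.10", 5060))  # default port omitted
print(build_contact("asterisk", "192.0.2.10", 5080))  # non-default port kept
```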
By: Kinsey Moore (kmoore) 2012-02-29 15:24:26.163-0600

The attached patch resolves the issue for my reproduction. Could you test it out to make sure that we are experiencing the same mechanism of failure? With all the changes that ast_sockaddr brought, there is a possibility that we are experiencing this via different code paths.
By: Schmooze Com (schmoozecom) 2012-02-29 17:39:18.871-0600

The patch supplied by Kinsey appears to solve the issue for us.
By: Matt Jordan (mjordan) 2012-03-01 17:03:02.698-0600

Made this issue public as the release candidate with the fix has gone out onto asterisk.org.
By: Alec Davis (alecdavis) 2012-03-02 02:17:06.426-0600

Looks like this problem was noticed in June 2011.

I'd rather point it out, than quote it here. Search for "SIP NOTIFY"
Read http://mailman.nanog.org/pipermail/nanog/2011-June/036688.html