[Home]

Summary:ASTERISK-21203: res_xmpp socket error: takes upto 19 minutes to restore xmpp socket connection to google
Reporter:Chirag Chhatriwala (cchhat01)Labels:
Date Opened:2013-03-03 10:29:16.000-0600Date Closed:2013-03-12 09:27:42
Priority:MajorRegression?
Status:Closed/CompleteComponents:Resources/res_xmpp
Versions:11.3.0 Frequency of
Occurrence
Constant
Related
Issues:
Environment:Linux OpenWrt 3.7.5 #2 Wed Feb 6 20:40:48 EST 2013 mips GNU/LinuxAttachments:( 0) messages
Description:11.3.0-rc1 addresses a problem in which if a res_xmpp connection becomes stale (network interruption/IP address change), the res_xmpp performs a reconnect and the filters are also refreshed to make sure it responds to jingle/chan_motif signalling.

Now, it takes about 18-20 minutes to recover from a network outage. I.e. IP Address change. There doesn't seem to be any way to minimize the time before the following message:

[2013-03-02 09:18:13] WARNING[15652][C-00000026] res_rtp_asterisk.c: RTP Transmission error of packet to 74.125.135.127:19305: Network is unreachable
[2013-03-02 09:18:13] WARNING[15652][C-00000026] res_rtp_asterisk.c: RTP Transmission error of packet to 74.125.135.127:19305: Network is unreachable
[2013-03-02 09:18:13] WARNING[15652][C-00000026] res_rtp_asterisk.c: RTP Transmission error of packet to 74.125.135.127:19305: Network is unreachable
[2013-03-02 09:36:29] WARNING[2872] res_xmpp.c: JABBER: socket read error

I have a log file which is about 21.6MB in size so I'm not uploading it here just as yet.
I have a smaller log file which does show this time difference of 18+ minutes as well. The file shows that a google voice call gets interrupted (as a result of IP renewal resulting in a changed IP) and then the jabber socket read error appears around 18+ minutes later.

This is indicative of all IP Address renewals.

In the above, the time between 09:18:13 and 09:36:29, any outbound calls [through google voice] do not work and any inbound calls [through google voice] do not terminate on my SIP phone. After, 09:36:29, everything works as it should. I'm trying to minimize this window where the connection is in limbo.

Please help.

Thanks,
Chirag
Comments:By: Rusty Newton (rnewton) 2013-03-07 11:57:23.322-0600

Can you provide a similar log to what you have already, but with VERBOSE and DEBUG message types enabled? (be sure to turn both verbose and debug up to level 5)

https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

If the log is just too huge, then at least provide excerpts from the time the outage happens, and when it finally attempts a reconnection.

You might also host the full size log somewhere where we can download it.


By: Chirag Chhatriwala (cchhat01) 2013-03-07 12:29:49.341-0600

Rusty,

I've uploaded the log file to my GoogleDrive. Feel free to download and investigate.
I'll give you a headstart as to where the problem start. It happens at [2013-02-13 09:17:02].
Search that particular string and you will be pointed to the first instance of res_rtp_asterisk.c error which results from a connection loss while the call was in progress.

https://docs.google.com/file/d/0B8cmL6zE2dvDeVphcU9XSDMyN00/edit

Cheers,
Chirag

By: Matt Jordan (mjordan) 2013-03-12 09:26:44.401-0500

First, I'm not sure I'd call this a bug. Google apparently changed the IP address out from under Asterisk in the middle of a call - and although I could be mistaken, I don't see anything informing Asterisk of the new RTP destination.

You can see where Google informed Asterisk initially where to send the call:

{noformat}
[2013-02-13 09:01:44] DEBUG[2903][C-0000000c] logger.c: Call_ID [C-0000000c] being removed from thread.
[2013-02-13 09:01:45] VERBOSE[2898] res_xmpp.c:
<--- XMPP received from 'google' --->
<iq to="myuserid@gmail.com/asterisk-x1297FE8D" from="+13475551212@voice.google.com/srvres-MTAuMjIwLjIyNi4xOTM6OTgzNg==" id="aaael" type="result"/>
<------------->
[2013-02-13 09:01:45] DEBUG[2898] res_xmpp.c: XML parsing successful
[2013-02-13 09:01:45] VERBOSE[2898] res_xmpp.c:
<--- XMPP received from 'google' --->
<iq from="+13475551212@voice.google.com/srvres-MTAuMjIwLjIyNi4xOTM6OTgzNg=="
   to="myuserid@gmail.com/asterisk-x1297FE8D"
   id="jingle:10.220.226.193-11467418:1:CB2823BE" type="set">
   <ses:session type="candidates" id="SIP1662940797@10.221.5.195"
                initiator="+13475551212@voice.google.com/srvres-MTAuMjIwLjIyNi4xOTM6OTgzNg=="
                xmlns:ses="http://www.google.com/session">
       <ses:candidate name="rtp"
                      address="74.125.135.127"
                      port="19305"
                      username="9DD85EE42FE277FD"
                      preference="3.0"
                      protocol="udp"
                      network="mediaproxy"
                      generation="0"
                      password=""
                      type="relay"/>
       <ses:candidate name="rtp"
                      address="74.125.135.127"
                      port="19305"
                      username="9DD85EE42FE277FD"
                      preference="2.0"
                      protocol="tcp"
                      network="mediaproxy"
                      generation="0"
                      password=""
                      type="relay"/>
       <ses:candidate name="rtp"
                      address="74.125.135.127"
                      port="443"
                      username="9DD85EE42FE277FD"
                      preference="1.0"
                      protocol="ssltcp"
                      network="mediaproxy"
                      generation="0"
                      password=""
                      type="relay"/>
   </ses:session>
</iq>
<------------->
[2013-02-13 09:01:45] DEBUG[2898][C-0000000c] logger.c: CALL_ID [C-0000000c] bound to thread.
{noformat}

Google doesn't send us an update of candidates. It doesn't tell us when it decides to switch the RTP address - so unfortunately, there doesn't appear to be any action we can take as we don't know where to send the media to.

As it is, I don't see a whole lot we can do here, even as an improvement.

By: Matt Jordan (mjordan) 2013-03-12 09:27:42.020-0500

Closing out as "Not a bug". If you feel that this can be addressed in some fashion based on information that Google provides when the media address is no longer valid, let us know either on this issue or in #asterisk-bugs and we'll be happy to reopen the issue.

By: Chirag Chhatriwala (cchhat01) 2013-03-13 01:46:13.945-0500

Matt Jordan:

You misunderstood the point. I'm not concerned about the IP Address being changed in the middle of the call.
The log file isn't very clear in indicating that "MY" IP address changed. In other words, the box which runs asterisk had its IP address changed as a result of PPPoE based renewal. When that happened the call and the connection to google via xmpp was lost.

What I want to know is, what can be done to minimize the time spent in limbo by the asterisk server figuring out whether it is still connected to google or not. Apparently, it takes about 18-20 minutes EACH time the IP address changes for the connection to be revived. All the buddies see the connection as offline for this time and when it re-connects (after 18-20 minutes), the buddies now see the connection as active. All calls then originating and terminating to the google voice number are properly established (only after the 18-20 minutes spent in limbo).

The fact that the call dropped is inherent to my ISP because every 6 or so hours my IP address renews to a different one and all calls during that period will be terminated for obvious reason (something I already know about).
I want to have the connection to google "re-activated or refreshed" as soon as possible (and not have to wait 20 minutes). That is what I'm trying to draw attention to.

Thanks,
Chirag

By: David Woolley (davidw) 2013-03-13 06:37:18.806-0500

My understanding is that such gratuitous IP address changes are done to frustrate the use of servers on cheap accounts.  Asterisk is a server.

As the internet and PPP were not designed to work in this fashion, I think the right solution here is to use an account that is suitable fro the purpose for which it is being used.

By: Matt Jordan (mjordan) 2013-03-13 08:44:10.462-0500

I did misunderstand the point - it's usually easy to point the finger at Google on these kinds of issues, since they're notorious for doing "interesting" things.

However, in this case, it's actually even easier to point out that this is not a bug. As David pointed out, changing a server IP address during operation is generally considered a "bad thing". If nothing else, it invalidates any information sent to any endpoint regarding Asterisk's location - any operation in progress is at risk of failing.

In general, if you make major system configuration changes - which changing the IP address of the server would amount to - you should, at a minimum, reload modules in Asterisk that communicate with other system - and more likely, restart Asterisk.

Anything that changed this behavior would be considered an improvement, and not a bug fix. I'm in complete agreement with David here - this is generally something that should not be done to any system that is providing services.

By: Chirag Chhatriwala (cchhat01) 2013-03-13 15:36:33.744-0500

David: In certain regions of the world, there are no other options because of the internet providers that are available in the area. Point is the ISP may also be doing this to protect against DDoS attacks on the service to its customers (who happen to be facing a lot of hacker traffic, which is also common). I'm not trying to argue how the internet or the PPP protocol were designed to work. I also want to show that Asterisk is working as a server on a connection which is forced to renew an IP address however the connection "google xmpp re-establishment" takes a long time. If it does the re-establishment, why take 20 minutes ? Why not a shorter interval to realize that the connection is stale and re-connect quicker?

Did anyone even look at the Logfile and try to understand why the re-connection took so long?

[2013-02-13 09:17:02]... call dropped due to IP address change...
.
.
.
[2013-02-13 09:35:41] WARNING[2898] res_xmpp.c: JABBER: socket read error
[2013-02-13 09:35:41] DEBUG[2898] res_xmpp.c: Connecting client 'google'

Thats about 19 minutes. Surely there has to be a way to speed this up.

You have to understand that even DHCP which upon IP renewal can issue a different IP address if the ISP forces. This isn't really about PPPoE.
Its about the re-establishment of the xmpp connection to google.

Edit: This instance shows the time it took to reconnect as about 19 minutes because the call dropped during the IP address change. However, in the case no calls are on-going, there is "no-way" to tell that the google connection is stale unless I look at the time it elapsed since the IP address changed (which only my router can tell).

By: David Woolley (davidw) 2013-03-15 10:28:20.376-0500

I find it difficult to believee that there are countries where it is impossible to get a proper internet connection, however, it is so, this needs to be addressed at the right level, not by tweaking applications to work round the problem.

First of all you need a new sockets API that will actively notify when a binding has been invalidated by an address change.

Secondly, you need to raise an RFC on how XMPP should react in such circumtances.  Similarly you need to raise RFCs for SIP and any other protocols standardised by IETF.