[Home]

Summary:ASTERISK-20453: res_xmpp.c: JABBER: socket read error, afterwards outgoing connections never get answered
Reporter:Andrey Petrov (anpetrov)Labels:
Date Opened:2012-09-20 12:53:40Date Closed:2012-10-08 08:03:12
Priority:CriticalRegression?
Status:Closed/CompleteComponents:Channels/chan_motif Resources/res_xmpp
Versions:11.0.0-beta1 Frequency of
Occurrence
Constant
Related
Issues:
Environment:Linux lucid 32bit running under openvzAttachments:
Description:Hello,

I have been using asterisk with google voice already for a 1+ years.

Recently I decided to switch to new asterisk11beta1, because of all new goodies coming with new Motif.

I switched about 2 weeks ago and everything worked pretty well.
However just a few days ago, XMPP connections seem to start ending abruptly:

{noformat}
[Sep 18 04:47:56] WARNING[8510] res_xmpp.c: JABBER: socket read error
[Sep 18 07:17:00] WARNING[8510] res_xmpp.c: JABBER: socket read error
[Sep 18 07:20:22] WARNING[8510] res_xmpp.c: JABBER: socket read error
[Sep 18 07:21:12] WARNING[8510] res_xmpp.c: JABBER: socket read error
{noformat}

Interestingly enough, this error first started occuring pretty seldom, perhaps 1 a day. So in the begginning I discarded it as mundane tcp error. However, the frequency somehow increased and now it happens every other minute. It also happened to me during actual call several times. I tried setting keepalive=yes in xmpp.conf but to no avail. I tried different transports for Motif and it didnt help either.

If I connected with asterisk -r,
"xmpp show connections" seems to show connection is alive. And it is alive, as I am receiving status notifications.

However, outgoing calls (and perhaps incoming, but I am not sure) placed through Motif never get answered.
I am getting something like this in the log when placing a call:
{noformat}
[Sep 18 08:05:54] VERBOSE[8542][C-00000000] netsock2.c:   == Using SIP RTP CoS mark 5
[Sep 18 08:05:54] VERBOSE[9012][C-00000000] pbx.c:     -- Executing [9093900003@outbound:1] MixMonitor("SIP/android-00000000", "to-9093900003-Tue Sep 18 08:05:54 2012.wav,b") in new stack
[Sep 18 08:05:54] VERBOSE[9013][C-00000000] app_mixmonitor.c:   == Begin MixMonitor Recording SIP/android-00000000
[Sep 18 08:05:54] VERBOSE[9012][C-00000000] pbx.c:     -- Executing [9093900003@outbound:2] Dial("SIP/android-00000000", "Motif/google/9093900003@voice.google.com,,Tr") in new stack
[Sep 18 08:05:54] VERBOSE[9012][C-00000000] app_dial.c:     -- Called Motif/google/9093900003@voice.google.com
[Sep 18 08:05:55] VERBOSE[9012][C-00000000] app_dial.c:     -- Motif/9093900003@voice.google.com-a3a3 is proceeding passing it to SIP/android-00000000
[Sep 18 08:06:10] VERBOSE[9012][C-00000000] pbx.c:   == Spawn extension (outbound, 9093900003, 2) exited non-zero on 'SIP/android-00000000'
[Sep 18 08:06:11] VERBOSE[9013][C-00000000] app_mixmonitor.c:   == MixMonitor close filestream (mixed)
[Sep 18 08:06:11] VERBOSE[9013][C-00000000] app_mixmonitor.c:   == End MixMonitor Recording SIP/android-00000000
{noformat}


This is a critical issue to me, because it forced me to downgrade back to 1.8.

Additional information:

I tried trunk, branches/11, but I can not confirm or deny if it is reproducible or not, because with these versions once call is established the voice is never heard both ways for unknown reason.

Comments:By: Matt Jordan (mjordan) 2012-09-25 08:25:28.414-0500

We require a complete debug log to help triage the issue. This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue: https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

Please also provide your xmpp.conf and motif.conf configuration files.

By: Andrey Petrov (anpetrov) 2012-10-08 01:24:52.340-0500

somehow the issue seem to have disappeared. I will switch back to 11.

By: Aaron Chan (aaronc99) 2013-03-07 10:20:08.666-0600

i'm on 11.2.1 and i've been seeing this issue pretty frequently. It happens when:

- I answer a call on the Motif channel too fast. I mitigated this by issuing a Wait(2) before Answer() whenever i receive a google voice call

- Just happens randomly. Last night, after Asterisk was running for about 7 hours, I got "JABBER: socket read error" on the log even when i wasn't placing or receiving any calls. afterwards i was unable to place/receive calls through google voice.

In attempting to figure out why this happens on the new Motif channel but not on the old GTalk channel in 1.8, i read the code on res_xmpp.c and tried to compare it with res_jabber.c.

On res_xmpp.c's xmpp_client_thread() function, the following code was placed before the xmpp_client_reconnect() function:

              if (client->state == XMPP_STATE_DISCONNECTING) {
                       break;
               }

this, i assume, would break out of the infinite loop and terminate the thread.

however, on res_jabber.c's aji_recv_loop() function, the same code was placed AFTER the aji_reconnect() function.

i suspect this was the cause... i'm not sure why we would be terminating the thread in res_xmpp.c before attempting to reconnect, while for res_jabber.c we would try to reconnect before terminating the thread? if we move that code snippet so it's run after xmpp_client_receive(), will that solve the problem?