Summary: ASTERISK-18210: chan_iax2 Asterisk crashes after client "crash"
Reporter: Nic Colledge (nic)
Labels:
Date Opened: 2011-07-29 17:51:56
Date Closed: 2012-02-20 16:45:46.000-0600
Priority: Critical
Regression?
Status: Closed/Complete
Components: Channels/chan_iax2
Versions: 1.8.5.0 1.8.6.0 10.0.0-beta1
Frequency of Occurrence: Occasional
Related Issues:
Environment: Ubuntu server 11.04. Dahdi used for timing.
Attachments: ( 0) backtrace.txt
Description: I've recently been having a problem where a user on a pretty unreliable internet connection using Zoiper can cause Asterisk 1.8.5 to crash.

The following series of errors show up on the console.

[Jul 29 23:25:44] WARNING[4381]: chan_iax2.c:3484 __attempt_transmit: Max retries exceeded to host 192.168.1.101 on IAX2/103-1736 (type = 6, subclass = 11, ts=10005, seqno=5)
[Jul 29 23:25:56] NOTICE[4382]: chan_iax2.c:11877 __iax2_poke_noanswer: Peer '103' is now UNREACHABLE! Time: 35
[Jul 29 23:26:06] NOTICE[4377]: chan_iax2.c:11934 iax2_poke_peer: Still have a callno...

The crash will happen within a few minutes of the "still have a callno" message.

I can reproduce this issue pretty reliably on two servers.
However the issue is very sensitive to the timing of events so please follow the instructions very closely.
For example if the "UNREACHABLE" message comes before the "Max retries exceeded" message then the crash does not happen.
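
Since the crash depends on the order of the console messages, a small script can check a captured log for the crash-prone sequence. This is only a rough sketch, not part of Asterisk: the function names are made up for illustration, the marker strings are copied from the console output quoted above, and it assumes the log excerpt covers a single peer over a short window.

```python
# Message fragments copied from the console output quoted in this report.
MARKERS = (
    ("max_retries", "Max retries exceeded"),
    ("unreachable", "is now UNREACHABLE"),
    ("still_callno", "Still have a callno"),
)

# Order reported to precede the crash (hypothetical constant name).
CRASH_ORDER = ["max_retries", "unreachable", "still_callno"]

SAMPLE_LINES = [
    "[Jul 29 23:25:44] WARNING[4381]: chan_iax2.c:3484 __attempt_transmit: "
    "Max retries exceeded to host 192.168.1.101 on IAX2/103-1736 "
    "(type = 6, subclass = 11, ts=10005, seqno=5)",
    "[Jul 29 23:25:56] NOTICE[4382]: chan_iax2.c:11877 __iax2_poke_noanswer: "
    "Peer '103' is now UNREACHABLE! Time: 35",
    "[Jul 29 23:26:06] NOTICE[4377]: chan_iax2.c:11934 iax2_poke_peer: "
    "Still have a callno...",
]


def first_seen_order(lines):
    """Return the message kinds in the order each one first appears."""
    order = []
    for line in lines:
        for name, marker in MARKERS:
            if marker in line and name not in order:
                order.append(name)
    return order


def matches_crash_pattern(lines):
    """True when the three messages first appear in the crash-prone order."""
    return first_seen_order(lines) == CRASH_ORDER


if __name__ == "__main__":
    print(matches_crash_pattern(SAMPLE_LINES))
```

Feeding it the lines of a console capture (for example from a saved log file) gives a quick yes/no on whether a reproduction attempt hit the ordering described above.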

To Reproduce:
Set up a dialplan with just the Echo application (no playback before it).
Set up an IAX peer and use Zoiper Free on Windows to connect to Asterisk.

0) Open Windows Task Manager on the Processes tab and sort by "Image Name" descending so you can quickly find Zoiper.
1) Open Zoiper and unregister the IAX connection.
2) Type in your Echo extension number (do not dial it).
3) Register the IAX extension.
4) Dial the Echo extension (that you already typed in).
5) End the process using Task Manager (to simulate a crash of Zoiper / loss of connection).
6) Wait for the errors on the Asterisk console (and the crash).

Steps 3 to 5 need to be completed as quickly as possible (within, say, 10 seconds); otherwise Zoiper will have time to reregister and the events will happen in the wrong order (on the console), making the crash less likely.

If backtraces / core dumps would be useful here, let me know and I can produce some.

Thanks,
Nic.
Comments:

By: Robert Verspuy (exarv) 2011-07-30 03:26:59.142-0500

I don't know if my problem is related.
See http://lists.digium.com/pipermail/asterisk-dev/2011-July/050135.html
When benchmarking with 120 simultaneous SIP calls forwarded over an iax2 link with 2-5% packet loss, I get about 10 calls that keep hanging indefinitely on the remote Asterisk with the max retries messages.
I don't get a crash, however, but that could be because qualify was turned off.
I will retry the benchmark next monday with qualify turned on.

By: Nic Colledge (nic) 2011-08-01 17:03:27.997-0500

Yeah, the calls in my case don't hang around for too long. I would be interested to see what happens when you try with qualify turned on.

By: Nic Colledge (nic) 2011-08-03 07:03:06.379-0500

I tried the patch from ASTERISK-17610 because of similarities in the error messages and a suggestion in the dev mailing list but it made no difference.

By: Nic Colledge (nic) 2011-08-09 07:24:27.960-0500

Had another crash today with:
[Aug  9 12:54:22] NOTICE[1446] chan_iax2.c: Still have a callno...
[Aug  9 12:54:33] ERROR[1439] chan_iax2.c: Bad address cast to IPv4
[Aug  9 12:54:33] NOTICE[1439] chan_iax2.c: Still have a callno...
as the last messages on the console.
Not sure if it's related.

By: Nic Colledge (nic) 2011-08-15 09:47:23.464-0500

Managed to test this weekend against 1.8.6.0-rc1 and 10.0.0-beta1. Crash still happens in both of these versions.

By: Gregory Massel (gmza) 2011-10-31 06:42:38.840-0500

I can confirm that I experience similar things on a box with 218 iax2 peers [171 online, 43 offline, 4 unmonitored].

Twice now (about seven days apart), the Asterisk process has crashed into a state where you have to kill -9 it before the process will end.

The first time was on Asterisk 1.8.7.0 and the second on Asterisk 1.8.7.1.

Prior to 1.8 I was running 1.4.42, and it was stable for as long as I was on the 1.4 series (well over a year).

When the problem occurs, logs flood with messages like:
[Oct 31 11:21:22] WARNING[2528] chan_iax2.c: Max retries exceeded to host x.x.x.x on IAX2/xxxxxxxxxx-xxxxx (type = 6, subclass = 11, ts=119999, seqno=40)
These messages log in respect of ALL active IAX2 peers.
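
To get a feel for how widely such a flood is spread across peers, the warnings can be tallied per host. The following is a minimal sketch, assuming the message layout shown in the log lines quoted in this issue; the function names are hypothetical and the script only extracts the fields, it does not interpret them.

```python
import re
from collections import Counter

# Matches the "Max retries exceeded" warning as quoted in this issue,
# with or without the file:line and function prefix before it.
MAX_RETRIES_RE = re.compile(
    r"Max retries exceeded to host (?P<host>\S+) on (?P<channel>\S+) "
    r"\(type = (?P<type>\d+), subclass = (?P<subclass>\d+), "
    r"ts=(?P<ts>\d+), seqno=(?P<seqno>\d+)\)"
)


def parse_max_retries(line):
    """Return the warning's fields as a dict, or None if the line differs."""
    match = MAX_RETRIES_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("type", "subclass", "ts", "seqno"):
        fields[key] = int(fields[key])
    return fields


def count_by_host(lines):
    """Count "Max retries exceeded" warnings per destination host."""
    counts = Counter()
    for line in lines:
        fields = parse_max_retries(line)
        if fields is not None:
            counts[fields["host"]] += 1
    return counts
```

Running `count_by_host` over the lines of a log capture shows whether the warnings really cover every active peer or cluster around a few hosts.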

Unfortunately I cannot replicate the problem and, as this is a production box, I cannot fiddle with it.



By: Sean Bright (seanbright) 2012-02-20 14:14:54.389-0600

I've tried to replicate the original crash (using Zoiper as mentioned in the original report) in my development environment (1.8 SVN) but haven't been able to.  Can the OP confirm that he is still seeing this problem with the latest 1.8?

By: Nic Colledge (nic) 2012-02-20 15:06:05.141-0600

Yeah, I just tested this again on 1.8.10.0-rc2.

After killing the zoiper process it takes a couple of minutes for asterisk to crash, with the messages (below) showing up about half way through the process.

[Feb 20 20:58:18] WARNING[4659]: chan_iax2.c:3488 __attempt_transmit: Max retries exceeded to host X.X.X.X on IAX2/103-7703 (type = 6, subclass = 11, ts=10018, seqno=5)
[Feb 20 20:58:19] NOTICE[4658]: chan_iax2.c:11947 __iax2_poke_noanswer: Peer '103' is now UNREACHABLE! Time: 38
[Feb 20 20:58:36] NOTICE[4651]: chan_iax2.c:12004 iax2_poke_peer: Still have a callno...

I'll get the trunk version later tonight and test again with that.

Thanks,
Nic.

By: Nic Colledge (nic) 2012-02-20 16:45:03.706-0600

Hi,

Just tested SVN-branch-1.8-r355997 and it seems OK now, so I've closed the issue.

Thanks,
Nic.

By: Nic Colledge (nic) 2012-02-20 16:45:46.115-0600

Tested SVN-branch-1.8-r355997 and can't reproduce the issue.