Summary: ASTERISK-09028: segfault after a while of operation (maybe a month)
Reporter: Anton Vazir (vazir)
Labels:
Date Opened: 2007-03-16 03:08:37
Date Closed: 2008-04-14 13:29:30
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Channels/chan_h323
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) h323-bt1.txt
(1) h323-bt2.txt
Description: Subj.
Asterisk 1.4.1


****** ADDITIONAL INFORMATION ******

(gdb) bt full
#0  0xb6e791b0 in strcmp () from /lib/tls/libc.so.6
No symbol table info available.
#1  0xb554864d in find_call_locked (call_reference=25316, token=0x82c0f18 "ip$localhost/25316") at chan_h323.c:1154
       pvt = (struct oh323_pvt *) 0x8262688
#2  0xb5547b6f in cleanup_connection (call_reference=25316, call_token=0x82c0f18 "ip$localhost/25316") at chan_h323.c:2300
       pvt = (struct oh323_pvt *) 0x82c0f30
#3  0xb554aa20 in MyH323EndPoint::OnConnectionCleared () from /usr/lib/asterisk/modules/chan_h323.so
No symbol table info available.
#4  0xb7b5b9c1 in H323Connection::OnCleared () from /usr/local/lib/libh323_linux_x86_r.so.1.18.0
No symbol table info available.
#5  0xb7b93342 in H323EndPoint::CleanUpConnections () from /usr/local/lib/libh323_linux_x86_r.so.1.18.0
No symbol table info available.
#6  0xb7b843c4 in H323ConnectionsCleaner::Main () from /usr/local/lib/libh323_linux_x86_r.so.1.18.0
No symbol table info available.
#7  0xb7392a9f in PThread::PX_ThreadStart () from /usr/local/lib/libpt_linux_x86_r.so.1.10.0
No symbol table info available.
#8  0xb7ef5b63 in start_thread () from /lib/tls/libpthread.so.0
No symbol table info available.
#9  0xb6eda18a in clone () from /lib/tls/libc.so.6
No symbol table info available.
(gdb)      
Comments:

By: Paul Cadach (pcadach) 2007-04-24 18:25:25

This is a known issue, but I would like to collect a bit more information to figure out the exact call scenario in which it happens.

In my view, this could happen when two calls have the same call reference ID. The call reference ID is generated by the call origination side, so two independent nodes can produce the same call reference ID. OpenH323's call token usually provides more uniqueness by prepending the peer's host name (or IP address) to the call reference, but when we use a gatekeeper, we will have the same peer name for every call (the gatekeeper's name), so duplicates are possible (a GK usually distinguishes calls by call GUID instead of call reference). Also, call references may not be unique for two calls made in opposite directions (one outgoing, made by Asterisk, and another incoming to Asterisk).

Anton, could you take a look at your logs to check for duplicated call tokens? I do not have a system running under a gatekeeper and do not have enough traffic to reproduce this issue reliably. I could probably make some sort of special call generator that produces duplicated call references.


WBR,
Paul.
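
The collision described in the comment above can be illustrated with a small C sketch. It assumes tokens have the shape `ip$<peer>/<call reference>`, as seen in the backtrace ("ip$localhost/25316"). `make_call_token` and `tokens_collide` are hypothetical helpers written for this illustration; they are not OpenH323 or chan_h323 functions, and the real token format may differ in detail.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a token shaped like the ones in the
 * backtrace ("ip$localhost/25316"). Returns 0 on success, -1 if the
 * buffer is too small. Not part of OpenH323 or chan_h323. */
static int make_call_token(char *buf, size_t len, const char *peer,
                           unsigned int call_reference)
{
    int n = snprintf(buf, len, "ip$%s/%u", peer, call_reference);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: returns 1 when two calls cannot be told apart
 * by token alone, 0 otherwise (including on formatting failure). */
static int tokens_collide(const char *peer_a, unsigned int ref_a,
                          const char *peer_b, unsigned int ref_b)
{
    char a[128], b[128];

    if (make_call_token(a, sizeof(a), peer_a, ref_a) != 0 ||
        make_call_token(b, sizeof(b), peer_b, ref_b) != 0)
        return 0;
    return strcmp(a, b) == 0;
}
```

Under these assumptions, two calls that arrive through the same gatekeeper name with the same call reference produce identical tokens, while the same reference from two different peer addresses does not.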

By: Lennon Lim (lennon) 2007-06-20 10:44:57

I have the same problem too. It occurs at high call volume, around 60 concurrent calls. I'm not using a GK. Uploaded 2 bt logs.

By: Andrey Solovyev (corruptor) 2007-06-22 08:14:45

I started to use chan_h323 recently.
Yesterday evening I had about 150+ concurrent H323 to SIP calls.
This morning Asterisk crashed on hangup, when it had only 10 calls or so.
Last message in the log:
DEBUG[20427] chan_h323.c: Cleaning connection to ip$10.12.12.12:3310/2576

Looked through the logs - no calls with the same ID.
I don't use gk.

I use the same versions of PwLib (1.10) and OpenH323 (1.18) as lennon.



By: Paul Cadach (pcadach) 2007-09-25 04:00:44

As far as I can see, the common case in which this can happen is when Asterisk receives a call with its goal set to e_callIndependentSupplementaryService: the regular processing of the Setup message is skipped, but the Release message is handled as usual, which creates an entry with a NULL call token in chan_h323's call table.

Could someone please confirm my idea about the wrong call goal parameter?


Thanks,
Paul.

By: Andrey Solovyev (corruptor) 2007-09-25 06:13:35

PCadach, could you explain what I should look for? Is it enough to enable H.323 debug?

By the way, today I looked through the logs and noticed that one crash was due to a duplicate ref ID. I noticed this because I use the h extension.
The call with ID ip$10.12.12.52:4962/2919 ended, and 2 seconds before the connection was finally cleaned up (Asterisk had begun to process h,1, h,2 and so on) a new call with the same ID appeared.

PS. Paul, is it possible to discuss this problem on the #asteriskru channel at irc.freenode.net?

By: Sergey Tamkovich (sergee) 2007-09-25 06:54:36

PCadach, yes Paul, we are really missing you at #asteriskru :)

By: danpwi (danpwi) 2008-02-20 19:32:31.000-0600

Any updates on this one?  I can confirm that it's still present in 1.4.18:

(gdb) bt
#0  0x00c060d8 in strcmp () from /lib/tls/libc.so.6
#1  0x04df484c in find_call_locked (call_reference=27138,
   token=0xb7326a88 "ip$localhost/27138") at chan_h323.c:1148
#2  0x04df8b88 in cleanup_connection (call_reference=27138,
   call_token=0xb7326a88 "ip$localhost/27138") at chan_h323.c:2290
#3  0x04dfdf79 in MyH323EndPoint::OnConnectionCleared ()
  from /usr/lib/asterisk/modules/chan_h323.so
#4  0x005cd18d in H323Connection::OnCleared (this=0xb7326a88) at h323.cxx:2110
#5  0x005e0194 in H323EndPoint::CleanUpConnections (this=0xa099a98)
   at h323ep.cxx:2193
#6  0x005dde9d in H323ConnectionsCleaner::Main (this=0xa09a068)
   at h323ep.cxx:929
#7  0x01020b1f in PThread::PX_ThreadStart (arg=0xa09a068) at tlibthrd.cxx:1340
#8  0x00d7a3cc in start_thread () from /lib/tls/libpthread.so.0
#9  0x00c65c3e in clone () from /lib/tls/libc.so.6
(gdb) up
#1  0x04df484c in find_call_locked (call_reference=27138,
   token=0xb7326a88 "ip$localhost/27138") at chan_h323.c:1148
1148                            if ((token != NULL) && (!strcmp(pvt->cd.call_token, token))) {
(gdb) print pvt->cd
$1 = {call_reference = 27138, call_token = 0x0, call_source_aliases = 0x0,
 call_dest_alias = 0x0, call_source_name = 0x0, call_source_e164 = 0x0,
 call_dest_e164 = 0x0, redirect_number = 0x0, redirect_reason = -1,
 presentation = 0, type_of_number = 0, transfer_capability = -1,
 sourceIp = 0x0}

By: Sergey Tamkovich (sergee) 2008-03-03 04:23:02.000-0600

danpwi, are you available on IRC? or email?

By: danpwi (danpwi) 2008-03-04 19:09:20.000-0600

sure, email (reversed) moc.htuosiwp@piov

By: Digium Subversion (svnbot) 2008-04-14 13:26:54

Repository: asterisk
Revision: 114120

U   branches/1.4/channels/chan_h323.c

------------------------------------------------------------------------
r114120 | qwell | 2008-04-14 13:26:53 -0500 (Mon, 14 Apr 2008) | 7 lines

The call_token on the pvt can occasionally be NULL, causing a crash.

If it is NULL, we can skip this channel, since it can't the one we're looking for.

(closes issue ASTERISK-9028)
Reported by: vazir

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114120
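
For readers without the diff at hand, the idea in the commit message can be sketched in C as follows. This is a simplified illustration, not the code from r114120. The structures are reduced stand-ins that keep only the fields visible in the gdb output (`call_reference`, `call_token`) plus an assumed `next` link, and `find_call_sketch` is a hypothetical name. The point is that an entry whose `call_token` is NULL is skipped before it reaches `strcmp()`.

```c
#include <stddef.h>
#include <string.h>

/* Reduced stand-ins for the structures named in the backtrace; the
 * real definitions in chan_h323 carry many more fields. */
struct call_details {
    unsigned int call_reference;
    const char *call_token;      /* may be NULL, as in the gdb dump */
};

struct oh323_pvt {
    struct call_details cd;
    struct oh323_pvt *next;      /* assumed singly linked list */
};

/* Hypothetical lookup: returns the entry whose reference and token
 * both match. Entries with a NULL call_token are skipped, so strcmp()
 * never sees a NULL pointer. Returns NULL when nothing matches or
 * when the caller passes a NULL token. */
static struct oh323_pvt *find_call_sketch(struct oh323_pvt *head,
                                          unsigned int call_reference,
                                          const char *token)
{
    struct oh323_pvt *pvt;

    for (pvt = head; pvt != NULL; pvt = pvt->next) {
        if (pvt->cd.call_reference != call_reference)
            continue;
        if (token != NULL && pvt->cd.call_token != NULL &&
            strcmp(pvt->cd.call_token, token) == 0)
            return pvt;
    }
    return NULL;
}
```

With a list that holds one entry with a NULL token followed by one with a valid token for the same call reference, this lookup returns the second entry instead of crashing on the first.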

By: Digium Subversion (svnbot) 2008-04-14 13:29:09

Repository: asterisk
Revision: 114121

_U  trunk/
U   trunk/channels/chan_h323.c

------------------------------------------------------------------------
r114121 | qwell | 2008-04-14 13:29:09 -0500 (Mon, 14 Apr 2008) | 15 lines

Merged revisions 114120 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r114120 | qwell | 2008-04-14 13:31:57 -0500 (Mon, 14 Apr 2008) | 7 lines

The call_token on the pvt can occasionally be NULL, causing a crash.

If it is NULL, we can skip this channel, since it can't the one we're looking for.

(closes issue ASTERISK-9028)
Reported by: vazir

........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114121

By: Digium Subversion (svnbot) 2008-04-14 13:29:30

Repository: asterisk
Revision: 114122

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_h323.c

------------------------------------------------------------------------
r114122 | qwell | 2008-04-14 13:29:29 -0500 (Mon, 14 Apr 2008) | 23 lines

Merged revisions 114121 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r114121 | qwell | 2008-04-14 13:34:17 -0500 (Mon, 14 Apr 2008) | 15 lines

Merged revisions 114120 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r114120 | qwell | 2008-04-14 13:31:57 -0500 (Mon, 14 Apr 2008) | 7 lines

The call_token on the pvt can occasionally be NULL, causing a crash.

If it is NULL, we can skip this channel, since it can't the one we're looking for.

(closes issue ASTERISK-9028)
Reported by: vazir

........

................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114122