[Home]

Summary:ASTERISK-14500: [patch] SIP OPTIONS qualify message forever
Reporter:Marc LEURENT (lftsy)Labels:
Date Opened:2009-12-03 10:04:46.000-0600Date Closed:2010-11-11 16:01:14.000-0600
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Channels/chan_sip/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 0245667911.pcap
( 1) bt_full-patchv2-101200.txt
( 2) bug16382.patch
( 3) bug16382-2.patch
( 4) bug16382-3.patch
( 5) flood.txt
( 6) mysql_bytes-day.png
( 7) options_storm.txt
( 8) options.JPG
( 9) sipshowsettings.txt
Description:Hello, I have a trouble with different Asterisk versions (1.4.26, 1.4.27, 1.4.27.1). When I use the steps below, Asterisk starts to send SIP OPTIONS to the previous IP/port used by a SIP realtime peer (that has been pruned) and will keep trying to send SIP OPTIONS pings forever, event if the peer is connected with a new IP/port address.

I have just checked with my old Asterisk 1.2.27 with the same sip.conf and I do not have the problem, the SIP OPTIONS stops once register timer has expired.

During my experience to reproduce the bug, I have been able to have 10 IP/port currently pinged by the Asterisk server for one single peer.
And the only way to stop it is to restart Asterisk...

Thank you for your attention!
Marc Leurent

****** ADDITIONAL INFORMATION ******

Here is my sip.conf

[general]
context=default ; Default context for incoming calls
allowguest=no ; Allow or reject guest calls (default is yes)
allowoverlap=no ; Disable overlap dialing support. (Default is yes)
realm=voip ; Realm for digest authentication
bindport=5060 ; UDP Port to bind to (SIP standard port is 5060)
bindaddr=0.0.0.0 ; IP address to bind to (0.0.0.0 binds to all)
srvlookup=no ; Enable DNS SRV lookups on outbound calls
pedantic=no ; Enable checking of tags in headers
tos_sip=cs3 ; Sets TOS for SIP packets.
tos_audio=ef ; Sets TOS for RTP audio packets.
maxexpiry=1800 ; Maximum allowed time of incoming registrations
minexpiry=60 ; Minimum length of registrations/subscriptions (default 60)
defaultexpiry=120 ; Default length of incoming/outgoing registration
disallow=all ; First disallow all codecs
allow=alaw ; Allow codecs in order of preference
language=en ; Default language setting for all users/peers
useragent=voipua ; Allows you to change the user agent string
dtmfmode=rfc2833 ; Set default dtmfmode for sending DTMF. Default: rfc2833
rtptimeout=60 ; Terminate call if 60 seconds of no RTP or RTCP activity
rtpholdtimeout=300 ; Terminate call if 300 seconds of no RTP or RTCP activity
rtpkeepalive=20 ; Number of seconds, when a RTP Keepalive packet will be sent if no other RTP traffic on that connection.
notifyringing=yes ; Control whether subscriptions already INUSE get sent
nat=yes ; Global NAT settings (Affects all peers and users)
rtcachefriends=yes ; Cache realtime friends by adding them to the internal list
rtupdate=yes ; Send registry updates to database using realtime? (yes|no)
qualify=yes ; Check if client is reachable. If yes, the checks occur every 60 seconds
t38pt_udptl=yes ; T.38 faxing only works in SIP to SIP calls, with no local or agent channel being used.
;progressinband=never ; never|no|yes : If we should generate in-band ringing always. Default never.
checkmwi=30 ; Global interval (in seconds) between mailbox checks. Default 10 seconds. (New in v1.2.x)
;videosupport=yes ; Turn on support for SIP video. Default no.
rtautoclear=no ; Auto-Expire friends created on the fly. If yes the autoexpire will be in 120 seconds. Default yes.
ignoreregexpire=yes ; The registration information will be used regardless of whether it has expired or not; (default no)
limitonpeers=yes ; This setting tells Asterisk to apply call limits to peers only. (default no)
pedantic=yes ; Setting it to yes enables slow pedantic checking for phones that require it, enables more strict SIP RFC compliancy.
rtsavesysname=yes ; Specifies whether or not Asterisk should save the systemname in the realtime database at the time of registration

Comments:By: c0rnoTa (c0rnota) 2009-12-04 02:13:02.000-0600

I'm experianced this issue too but only with INVITE messages. Two months ago, i'v had qualify problem with result as "Too many opened files" and "Couldn't create socket". Qualify=no solved this. But today i'v got another "Too many opened files". As I think, reason is too many opened channels for old peer's IP adress.
"sip show channels" gives me over 500 opened channels with INVITE message like this:
10.31.100.119    301         433be6354bc  00102/00000  0x0 (nothing)    No       Init: INVITE
10.31.100.195    404         5143cfaf114  00102/00000  0x0 (nothing)    No       Init: INVITE
10.31.10.21      400         41745a2b7d3  00102/00000  0x0 (nothing)    No       Init: INVITE
10.31.100.119    301         134d0b4c659  00102/00000  0x0 (nothing)    No       Init: INVITE
10.31.100.195    404         41a9bf4670b  00102/00000  0x0 (nothing)    No       Init: INVITE
10.31.10.21      400         1d70ecc27d0  00102/00000  0x0 (nothing)    No       Init: INVITE

But for example my peer 400 already registered with new IP:
 * Name       : 400
 Realtime peer: Yes, cached
 Expire       : 2366
 LastMsg      : 0
 Addr->IP     : 10.31.10.26 Port 50462
 Defaddr->IP  : 0.0.0.0 Port 5060
 Def. Username: 400
 SIP Options  : (none)
 Status       : Unmonitored
 Useragent    : eyeBeam release 1102q stamp 51814
 Reg. Contact : sip:400@10.31.10.26:50462;rinstance=65ac9666d82b1e98

And 10.31.10.21 is not ping-able.
Please notice, that all peers are realtime.

Is my problem have relationship for this issue or i should repost it with new one?

By: Marc LEURENT (lftsy) 2009-12-04 02:58:37.000-0600

Hello c0rnoTa, I have been monitoring the problem for a while but I did not notice trouble with INVITE retransmission. Indeed, when I call a peer that has just gone away, there will be retransmission because of no answer, and after a 408 Timeout From my OpenSIPs Proxy.
Furthermore, I didn't notice your error message related to this problem...

You might need to post your message in a new bug report
Have a nice day!

By: dant (dant) 2009-12-04 06:28:56.000-0600

c0rnoTa, you issue is these ones ASTERISK-14653, ASTERISK-14576...

By: Leif Madsen (lmadsen) 2009-12-04 08:09:44.000-0600

The relationships added do not necessarily mean the issue is in the same part of the code, but simply that the related issues appear to have a similar type of issue (for those who may run into this issue but not the same issue)

By: Leif Madsen (lmadsen) 2009-12-04 08:11:12.000-0600

Could you also provide a console trace while this is happening, along with SIP debug and SIP history enabled? Thanks!

By: c0rnoTa (c0rnota) 2009-12-04 08:17:34.000-0600

Thanks for redirection to 0015716, 0015627!!

By: Marc LEURENT (lftsy) 2009-12-04 10:30:09.000-0600

Ok, I have set these 2 commands and launched a pcap capture.
bas-flu-vp-ast-03*CLI> core set debug 20
Core debug was 0 and is now 20
bas-flu-vp-ast-03*CLI> sip history
SIP History Recording Enabled (use 'sip show history')

I'm going to add the pcap file with the filter sip contains "0245667911"
* I have simply registered the 0245667911
* I use sip prune realtime 0245667911 to remove it (or reload command) from memory and I unplug the phone
  bas-flu-vp-ast-03*CLI> sip show peers like 0245667911
  Name/username              Host            Dyn Nat ACL Port     Status     Realtime
  0245667911/0245667911      212.147.47.91    D   N      5060     OK (17 ms) Cached RT
  1 sip peers [Monitored: 1 online, 0 offline Unmonitored: 0 online, 0 offline]


bas-flu-vp-ast-03*CLI> sip show peers like 0245667911
Name/username              Host            Dyn Nat ACL Port     Status     Realtime
0245667911/0245667911      212.147.47.91    D   N      5060     UNREACHABLE Cached RT

* I call 0245667911 from 0245667945, it recreates the peer with MySQL DB contents and Expire -1 (IP 213.162.3.159:5060 is pinged)
* Even if I connect the phone from a different port/IP (194.38.160.113:5060), the first port/IP (IP 213.162.3.159:5060) is flooded by SIP OPTIONS forever... As you can see, the register time was 600s, but after 15 min after, the qualify OPTIONS message are still there, and it last until next restart. You will also notice that the new IPs is qualified too. And I can do it again and again with other IP/ports and there will all be SIP OPTIONS qualified

Thank you for your time

By: John Todd (jtodd) 2009-12-04 10:58:58.000-0600

Note to person who works on this or anyone else: can we verify that this is not the case in 1.6.x?

By: Marc LEURENT (lftsy) 2009-12-09 08:47:14.000-0600

Any news? Is there someone how could test it please?
I'm going to try to install Asterisk 1.6.2, but I'm sure it will take me many times to test it.

If someone with Asterisk 1.6.x with sip realtime user could test it, I will be very grateful! It won't take much than 2 minutes to test!

Best Regards,
Marc Leurent

By: Marc LEURENT (lftsy) 2009-12-09 10:26:40.000-0600

Eureka, I can confirm you that the trouble is specific to Asterisk 1.4.x
The problem is not present on Asterisk 1.2.27 and on Asterisk 1.6.2.0-rc7

When I use the procedure above, and when I register from a different IP, the first IP is not qualified anymore!

What can we do now to help solving this trouble on Asterisk 1.4.27.1?
Thanks

By: Tilghman Lesher (tilghman) 2009-12-09 11:07:43.000-0600

If you apply the patch in ASTERISK-14576, does that solve the issue for you?

By: Digium Subversion (svnbot) 2009-12-10 10:53:43.000-0600

Repository: asterisk
Revision: 234095

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 17 lines

When we receive no response at all to our INVITE, allow the channel to be destroyed.
(closes issue ASTERISK-14576)
Reported by: falves11
Patches:
      20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
      20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
Tested by: falves11
Review: https://reviewboard.asterisk.org/r/446/
(closes issue ASTERISK-14653)
Reported by: dant
(closes issue ASTERISK-15161)
Reported by: corruptor
(closes issue ASTERISK-14337)
Reported by: falves11
(issue ASTERISK-14500)
Reported by: lftsy

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234095

By: Digium Subversion (svnbot) 2009-12-10 10:55:10.000-0600

Repository: asterisk
Revision: 234129

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 24 lines

Merged revisions 234095 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
 
 When we receive no response at all to our INVITE, allow the channel to be destroyed.
 (closes issue ASTERISK-14576)
  Reported by: falves11
  Patches:
        20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
        20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
  Tested by: falves11
 Review: https://reviewboard.asterisk.org/r/446/
 (closes issue ASTERISK-14653)
 Reported by: dant
 (closes issue ASTERISK-15161)
 Reported by: corruptor
 (closes issue ASTERISK-14337)
 Reported by: falves11
 (issue ASTERISK-14500)
 Reported by: lftsy
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234129

By: Digium Subversion (svnbot) 2009-12-10 10:56:15.000-0600

Repository: asterisk
Revision: 234131

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_sip.c

------------------------------------------------------------------------
r234131 | tilghman | 2009-12-10 10:30:22 -0600 (Thu, 10 Dec 2009) | 29 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576)
    Reported by: falves11
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
   (closes issue ASTERISK-14653)
   Reported by: dant
   (closes issue ASTERISK-15161)
   Reported by: corruptor
   (closes issue ASTERISK-14337)
   Reported by: falves11
   (issue ASTERISK-14500)
   Reported by: lftsy
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234131

By: Digium Subversion (svnbot) 2009-12-10 10:57:09.000-0600

Repository: asterisk
Revision: 234132

_U  branches/1.6.1/
U   branches/1.6.1/channels/chan_sip.c

------------------------------------------------------------------------
r234132 | tilghman | 2009-12-10 10:30:32 -0600 (Thu, 10 Dec 2009) | 31 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576)
    Reported by: falves11
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
   (closes issue ASTERISK-14653)
   Reported by: dant
   (closes issue ASTERISK-15161)
   Reported by: corruptor
   (closes issue ASTERISK-14337)
   Reported by: falves11
   (issue ASTERISK-14500)
   Reported by: lftsy
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234132

By: Digium Subversion (svnbot) 2009-12-10 10:57:38.000-0600

Repository: asterisk
Revision: 234133

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_sip.c

------------------------------------------------------------------------
r234133 | tilghman | 2009-12-10 10:30:40 -0600 (Thu, 10 Dec 2009) | 31 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576)
    Reported by: falves11
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
   (closes issue ASTERISK-14653)
   Reported by: dant
   (closes issue ASTERISK-15161)
   Reported by: corruptor
   (closes issue ASTERISK-14337)
   Reported by: falves11
   (issue ASTERISK-14500)
   Reported by: lftsy
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234133

By: Leif Madsen (lmadsen) 2009-12-11 10:33:01.000-0600

Fixed by patch in issue ASTERISK-14576. Thanks!

By: Marc LEURENT (lftsy) 2009-12-22 03:34:54.000-0600

Good morning!
Sorry, but the problem is still there...
I have applied the patch you suggested on the Asterisk 1.4.28-rc1 version. But I still have the problem.

My Asterisk is using 100% of one of my CPU, is flooding one of my peer with SIP OPTIONS and is flooding my MySQL server with requests (about 35Mbps of bandwidth)

Here is the SIP message sent to the proxy in front of the users from Asterisk each 10^-6s (so there is a lot of messages!!)
OPTIONS sip:212.147.44.91 SIP/2.0
Via: SIP/2.0/UDP 212.147.45.83:5060;branch=z9hG4bK496a1eaf;rport
From: "asterisk" <sip:asterisk@212.147.45.83>;tag=as57017886
To: <sip:212.147.44.91>
Contact: <sip:asterisk@212.147.45.83>
Call-ID: 3d82b1ef25ae14a2462447f3529bb8cf@212.147.45.83
CSeq: 102 OPTIONS
User-Agent: Asterisk
Max-Forwards: 70
Date: Tue, 22 Dec 2009 08:21:42 GMT
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
Content-Length: 0

SIP/2.0 403 Forbidden - Flood Detection
Via: SIP/2.0/UDP 212.147.45.83:5060;branch=z9hG4bK7c87247c;rport=5060
From: "asterisk" <sip:asterisk@212.147.45.83>;tag=as086c62f2
To: <sip:212.147.44.91>;tag=00f65ed7cfd3e4fbbaec16b7a8524c0b.9ba2
Call-ID: 1584ec2d6084f6e67f56917a70e77f20@212.147.45.83
CSeq: 102 OPTIONS
Server: SIP Proxy
Content-Length: 0


And here is the MySQL request also sent by Asterisk each 10^-3 s
SELECT * FROM sip_users WHERE name = '024xxxxxxx' AND host = 'dynamic'


I have blocked with iptables the requests so I can keep the Asterisk in this flooding mode to be able to give you as many bug report as we will need!!!


I have typed Asterisk debugging on google and I have found http://www.voip-info.org/wiki/view/Asterisk+debugging
But I have some trouble understanding it...

Here a set of command I have typed:

(SCREEN):root@bas-flu-vp-ast-03:[~]# asterisk -rx "show version"
Asterisk 1.4.28-rc1 built by root @ bas-flu-vp-ast-03 on a x86_64 running Linux on 2009-12-11 16:34:43 UTC
(SCREEN):root@bas-flu-vp-ast-03:[~]#    

(SCREEN):root@bas-flu-vp-ast-03:[~]# asterisk -rx "database show"
/SIP/Registry/0245667945



(gdb) info thread
 15 Thread 0x4201e940 (LWP 30774)  0x00000032642ca556 in poll () from /lib64/libc.so.6
 14 Thread 0x412ed940 (LWP 30775)  0x00000032642ca556 in poll () from /lib64/libc.so.6
 13 Thread 0x4074d940 (LWP 30776)  0x0000003264e0d56b in connect () from /lib64/libpthread.so.0
 12 Thread 0x409b4940 (LWP 30777)  0x00000032642cc5e2 in select () from /lib64/libc.so.6
 11 Thread 0x40554940 (LWP 30778)  0x00000032642cc5e2 in select () from /lib64/libc.so.6
 10 Thread 0x40e4f940 (LWP 30779)  0x00000032642cc5e2 in select () from /lib64/libc.so.6
 9 Thread 0x41ab3940 (LWP 30787)  ast_sched_add_variable (con=0x19eef820, when=4000, callback=0x2aaab9654f70 <sip_poke_noanswer>,
   data=0x2aaaac3daee0, variable=0) at sched.c:178
 8 Thread 0x41ec0940 (LWP 30788)  0x0000003264e0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
 7 Thread 0x411fb940 (LWP 30789)  0x00000032642cc5e2 in select () from /lib64/libc.so.6
 6 Thread 0x401df940 (LWP 30791)  0x00000032642ca556 in poll () from /lib64/libc.so.6
 5 Thread 0x4042f940 (LWP 30792)  0x0000003264e0db71 in nanosleep () from /lib64/libpthread.so.0
 4 Thread 0x40937940 (LWP 30794)  0x00000032642ca556 in poll () from /lib64/libc.so.6
 3 Thread 0x4209a940 (LWP 30805)  0x00000032642ca556 in poll () from /lib64/libc.so.6
 2 Thread 0x41e3e940 (LWP 30809)  0x0000003264e0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x2b964422c510 (LWP 30770)  0x0000003264e0d2cb in read () from /lib64/libpthread.so.0
(gdb)




Thread 15 (Thread 0x4201e940 (LWP 30774)):
#0  0x00000032642ca556 in poll () from /lib64/libc.so.6
#1  0x0000000000422774 in listener (unused=<value optimized out>) at asterisk.c:1010
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 14 (Thread 0x412ed940 (LWP 30775)):
#0  0x00000032642ca556 in poll () from /lib64/libc.so.6
#1  0x000000000046a591 in accept_thread (ignore=<value optimized out>) at manager.c:2484
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x4074d940 (LWP 30776)):
#0  0x0000003264e0d56b in connect () from /lib64/libpthread.so.0
#1  0x00002aaab3e243b6 in my_connect () from /usr/lib64/mysql/libmysqlclient.so.15
#2  0x00002aaab3e26b8d in mysql_real_connect () from /usr/lib64/mysql/libmysqlclient.so.15
#3  0x00002aaab6fcd9d0 in ?? () from /usr/lib/asterisk/modules/res_config_mysql.so
#4  0x00002aaab6fcec41 in ?? () from /usr/lib/asterisk/modules/res_config_mysql.so
ASTERISK-1  0x0000000000446ee4 in ast_load_realtime (family=0x2aaab967d8e2 "sippeers") at config.c:1428
ASTERISK-2  0x00002aaab964bc13 in find_peer (peer=0x4074cb20 "0245667911", sin=0x0, realtime=<value optimized out>, devstate_only=1)
   at chan_sip.c:2671
ASTERISK-3  0x00002aaab9654dcc in sip_devicestate (data=0x4074cfa4) at chan_sip.c:17255
ASTERISK-4  0x000000000044c803 in ast_device_state (device=0x2aaac401a148 "SIP/0245667911") at devicestate.c:170
ASTERISK-5  0x000000000044cc90 in do_state_change (device=0x5 <Address 0x5 out of bounds>) at devicestate.c:285
ASTERISK-6 0x000000000044cd6d in do_devstate_changes (data=<value optimized out>) at devicestate.c:346
ASTERISK-7 0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
ASTERISK-8 0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-9 0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 12 (Thread 0x409b4940 (LWP 30777)):
#0  0x00000032642cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaab1025a4f in do_parking_thread (ignore=<value optimized out>) at /usr/src/asterisk-1.4.28-rc1/include/asterisk/channel.h:1378
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6


Thread 11 (Thread 0x40554940 (LWP 30778)):
#0  0x00000032642cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaab31956ab in sound_thread (arg=<value optimized out>) at /usr/src/asterisk-1.4.28-rc1/include/asterisk/channel.h:1378
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x40e4f940 (LWP 30779)):
#0  0x00000032642cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaab31956ab in sound_thread (arg=<value optimized out>) at /usr/src/asterisk-1.4.28-rc1/include/asterisk/channel.h:1378
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x41ab3940 (LWP 30787)):
#0  ast_sched_add_variable (con=0x19eef820, when=4000, callback=0x2aaab9654f70 <sip_poke_noanswer>, data=0x2aaaac3daee0, variable=0)
   at sched.c:178
#1  0x00002aaab9648bd8 in sip_poke_peer (peer=0x2aaaac3daee0) at chan_sip.c:17186
#2  0x00002aaab9666509 in sip_poke_peer_s (data=0x211e3c90) at chan_sip.c:8497
#3  0x00000000004a3728 in ast_sched_runq (con=<value optimized out>) at sched.c:363
#4  0x00002aaab9662ae5 in do_monitor (data=<value optimized out>) at chan_sip.c:17020
ASTERISK-1  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
ASTERISK-2  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-3  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x41ec0940 (LWP 30788)):
#0  0x0000003264e0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aaaba09a79b in device_state_thread (data=<value optimized out>) at /usr/src/asterisk-1.4.28-rc1/include/asterisk/lock.h:791
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6


Thread 7 (Thread 0x411fb940 (LWP 30789)):
#0  0x00000032642cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaaba865bca in agent_check_and_process () from /usr/lib64/libnetsnmpagent.so.10
#2  0x00002aaaba2aebfa in agent_thread (arg=<value optimized out>) at snmp/agent.c:162
#3  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#4  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-1  0x00000032642d309d in clone () from /lib64/libc.so.6


Thread 6 (Thread 0x401df940 (LWP 30791)):
#0  0x00000032642ca556 in poll () from /lib64/libc.so.6
#1  0x00002aaabf4491fb in do_monitor (data=<value optimized out>) at chan_dahdi.c:7333
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x4042f940 (LWP 30792)):
#0  0x0000003264e0db71 in nanosleep () from /lib64/libpthread.so.0
#1  0x00002aaac23edd4a in scan_thread (unused=<value optimized out>) at pbx_spool.c:458
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x40937940 (LWP 30794)):
#0  0x00000032642ca556 in poll () from /lib64/libc.so.6
#1  0x0000000000420a65 in monitor_sig_flags (unused=<value optimized out>) at asterisk.c:2679
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x4209a940 (LWP 30805)):
#0  0x00000032642ca556 in poll () from /lib64/libc.so.6
#1  0x0000000000422bdd in netconsole (vconsole=<value optimized out>) at asterisk.c:959
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x41e3e940 (LWP 30809)):
#0  0x0000003264e0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000042a2e5 in autoservice_run (ign=<value optimized out>) at /usr/src/asterisk-1.4.28-rc1/include/asterisk/lock.h:791
#2  0x00000000004b00bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003264e06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000032642d309d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2b964422c510 (LWP 30770)):
#0  0x0000003264e0d2cb in read () from /lib64/libpthread.so.0
#1  0x00000000004ba9e0 in read_char (el=0x19e6df20, cp=0x7fffbb9960ff "") at read.c:298
#2  0x00000000004b59a0 in el_getc (el=0x19e6df20, cp=0x7fffbb9960ff "") at read.c:350
#3  0x00000000004b5e00 in el_gets (el=0x19e6df20, nread=0x7fffbb9966dc) at read.c:243
#4  0x0000000000427059 in main (argc=<value optimized out>, argv=0x1) at asterisk.c:3212

By: Marc LEURENT (lftsy) 2010-01-08 02:39:40.000-0600

Happy Ney Year!
Is there something else I can do to help debugging it to solve this matter on Asterisk 1.4.x ?
Thanks

Best Regards

By: Alan Graham (zerohalo) 2010-01-26 09:52:21.000-0600

I'm finding this happening in 1.4.29 as well - signs that it's happening for us include a large increase in mysql queries (from qualify of the peer, in this case), and an effective DoS of the peer after some time. It also seems to create starvation of realtime extensions in the process, taking out call processing for periods. If there anything I can post that would help here?

By: dtyoo (dtyoo) 2010-01-29 21:49:05.000-0600

I can confirm that I'm seeing this issue of runaway options requests, under 1.4.29.  It took us a while to figure out what was happening.  But we were seeing a huge number of sustained mysql realtime queries.  wireshark showed a substantial number of options messages going to a particular peer, several meg worth of traffic.  This effectively took down the internet connection / router that the remote peer was behind.

What information can I provide to help narrow down this issue?

By: Marc LEURENT (lftsy) 2010-02-02 01:47:10.000-0600

Hello again, I have the same problem on a running 1.4.26 server. It is flooding a sip peers that is not connected anymore on the platform. I have previously tested the 1.4.29 but the bug described above was still present.

The problem is not present on asterisk 1.2.x and 1.6.x but the upgrade will be very complicated for me since I will have to change the database entries with the new syntax...

reload chan_sip.c doesn't stop the flooding, only a restart solve the problem.

As suggested by "zerohalo" and "dtyoo", is there something I can do to help developpers understanding what's wrong?
Best Regards

By: Leif Madsen (lmadsen) 2010-02-02 08:50:54.000-0600

I have nothing further at this time to request; we're simply waiting on a developer who has the time and resources to move this issue forward.

By: Alan Graham (zerohalo) 2010-02-02 11:16:38.000-0600

I'm actually finding this happening on 1.4.20rc2 machines as well

By: Leif Madsen (lmadsen) 2010-02-03 10:27:33.000-0600

Actually, I'll go ahead and request some addtional information:

* console output demonstrating this issue, with sip history and debug level logging enabled (probably best done from logger.conf by enabling 'full' and then doing a 'logger reload' at the CLI, followed by 'core set verbose 10' and 'core set debug 10')

* SIP console output from Asterisk with SIP history enabled from sip.conf

* sip.conf demonstrating this issue, and what topology is being used, and how best to reproduce this.

Thanks!

By: Marc LEURENT (lftsy) 2010-02-04 03:24:45.000-0600

* Here are some steps to start reproducing manually the bug:
-> Please refer to post above (0114758) lftsy (reporter) 2009-12-04 10:30

* When problem appears without my intervention, it appears on a peer that has a bad xDSL connection so that it often changes it's IP/port of contact.

* The topology I use is a simple OpenSIPs 1.4.5 server in front of 2 Asterisk servers, the proxy does nothing except Keepalive and loadbalancing on source IP. All AAA is done on Asterisk. BUT, I can reproduce the bug on a direct IP connection to the Asterisk server!

* I have already put a 0245667911.pcap [^] (443,527 bytes) 2009-12-04 10:30 files above to see the problem. It corresponds to the manipulation to reproduce the bug. You will see that 2 different Contact IP/ports are PINGed from Asterisk. A sip prune realtime / or a reload of Asterisk server doesn't stop the Flooding.

* My sip.conf is also at the beginning of the ticket

Next time it will happen, I will print you further output you have requested here, hoping it will help us solving this bug.

Best Regards,
Marc Leurent

By: Marc LEURENT (lftsy) 2010-02-11 04:43:33.000-0600

Hello,
Yes!! One of my server had the problem currently! I'm going to print here everything you need!

By: Marc LEURENT (lftsy) 2010-02-11 04:54:46.000-0600

Here is all the details you asked. I'm going to keep the server flooding for a few yours if you need further information, but I will have to restart it before tonight....

* SIP CHOW CHANNELS is chowing many many SIP OPTIONS transactions... (16670 active SIP channels)

212.147.45.91    (None)      308bd32143e  00102/00000  0x0 (nothing)    No       Init: OPTIONS
212.147.45.91    (None)      21d699095af  00102/00000  0x0 (nothing)    No       Init: OPTIONS
212.147.45.91    (None)      73df27e6236  00102/00000  0x0 (nothing)    No       Init: OPTIONS
212.147.45.91    (None)      7e27d4444ef  00102/00000  0x0 (nothing)    No       Init: OPTIONS
212.147.45.91    (None)      41e8bf2d66d  00102/00000  0x0 (nothing)    No       Init: OPTIONS
212.147.45.91    (None)      146b3535531  00102/00000  0x0 (nothing)    No  (d)  Init: OPTIONS
212.147.45.91    (None)      41008d2b599  00102/00000  0x0 (nothing)    No  (d)  Init: OPTIONS
212.147.20.222   0009004141  25bfed474d3  00102/00000  0x8 (alaw)       No       Tx: ACK
212.147.45.91    0323440749  3c2fabb6a42  00101/00002  0x100 (g729)     No       Rx: ACK
16670 active SIP channels



* CORE SET DEBUG 20 is showing many many things.... I cannot read the screen, I got all the transaction destroy below

pul-lav-vp-ast-03*CLI> core set debug 20
Core debug was 0 and is now 20
Really destroying SIP dialog '6ee0fd1c263972af67a4e37702b2cd6b@212.147.47.83' Method: OPTIONS
Really destroying SIP dialog '75f3ab7f25a26e3d539b67510b7e9be8@212.147.47.83' Method: OPTIONS
Really destroying SIP dialog '075157740620306c1ccc1e065c1bc32b@212.147.47.83' Method: OPTIONS
Really destroying SIP dialog '6adbda4500c070b36328e47425ff0cfd@212.147.47.83' Method: OPTIONS
Really destroying SIP dialog '32f7f8694761279e0ea1db58685ca106@212.147.47.83' Method: OPTIONS
Really destroying SIP dialog '4006ff767b3101b93fc938b12239f557@212.147.47.83' Method: OPTIONS
Really destroying SIP dialog '7538ed357da4d5933acf29054e2555a5@212.147.47.83' Method: OPTIONS


* CORE SET VERBOSE 20 is not showing anything, because no new calls are possible since the server is flooding the proxy and has been blacklisted
pul-lav-vp-ast-03*CLI> core set verbose 20
Verbosity was 3 and is now 20
pul-lav-vp-ast-03*CLI>
pul-lav-vp-ast-03*CLI>
pul-lav-vp-ast-03*CLI>



* The topology I use is a simple OpenSIPs 1.4.5 server in front of 2 Asterisk servers, the proxy does nothing except Keepalive and loadbalancing on source IP. All AAA is done on Asterisk. BUT, I can reproduce the bug on a direct IP connection to the Asterisk server!


* And here is my sip.conf
[general]
context=default ; Default context for incoming calls
allowguest=no ; Allow or reject guest calls (default is yes)
allowoverlap=no ; Disable overlap dialing support. (Default is yes)
realm=voip ; Realm for digest authentication
bindport=5060 ; UDP Port to bind to (SIP standard port is 5060)
bindaddr=0.0.0.0 ; IP address to bind to (0.0.0.0 binds to all)
srvlookup=no ; Enable DNS SRV lookups on outbound calls
pedantic=no ; Enable checking of tags in headers
tos_sip=cs3 ; Sets TOS for SIP packets.
tos_audio=ef ; Sets TOS for RTP audio packets.
maxexpiry=180 ; Maximum allowed time of incoming registrations
minexpiry=60 ; Minimum length of registrations/subscriptions (default 60)
defaultexpiry=120 ; Default length of incoming/outgoing registration
disallow=all ; First disallow all codecs
allow=alaw ; Allow codecs in order of preference
language=en ; Default language setting for all users/peers
useragent=voipua ; Allows you to change the user agent string
dtmfmode=rfc2833 ; Set default dtmfmode for sending DTMF. Default: rfc2833
rtptimeout=60 ; Terminate call if 60 seconds of no RTP or RTCP activity
rtpholdtimeout=300 ; Terminate call if 300 seconds of no RTP or RTCP activity
rtpkeepalive=20 ; Number of seconds, when a RTP Keepalive packet will be sent if no other RTP traffic on that connection.
notifyringing=yes ; Control whether subscriptions already INUSE get sent
nat=yes ; Global NAT settings (Affects all peers and users)
rtcachefriends=yes ; Cache realtime friends by adding them to the internal list
rtupdate=yes ; Send registry updates to database using realtime? (yes|no)
rtautoclear=60 ; Auto-Expire friends created on the fly. If yes the autoexpire will be in 120 seconds. Default yes.
qualify=yes ; Check if client is reachable. If yes, the checks occur every 60 seconds
t38pt_udptl=yes ; T.38 faxing only works in SIP to SIP calls, with no local or agent channel being used.
progressinband=never ; never|no|yes : If we should generate in-band ringing always. Default never.


And the includes:
#include sip.local
regcontext=pul-lav-vp-ast-03


and another include (there are other trunks of course like the one below)
#include sip-trunk.conf
[sip_trunk_vm]
host=212.147.45.17
type=peer
context=default
dtmfmode=info
insecure=port,invite
nat=never
sendrpid=yes
disallow=all
allow=alaw

By: Marc LEURENT (lftsy) 2010-02-11 05:02:51.000-0600

Hello,
You will also found the 300 last lines of my full log message attached to this bug: https://issues.asterisk.org/file_download.php?file_id=25295&type=bug

You will see that the Asterisk server is sending many SIP OPTIONS and is also flooding the mysql database!

Best Regards,
Marc Leurent

By: Marc LEURENT (lftsy) 2010-02-11 05:44:36.000-0600

And here is some gdb debug commands that I have found on http://www.voip-info.org/tiki-index.php?page=Asterisk%20debugging

(SCREEN):root@pul-lav-vp-ast-03:[~]# gdb /usr/sbin/asterisk 3947
...
...
Reading symbols from /usr/lib/asterisk/modules/app_queue.so...done.
Loaded symbols for /usr/lib/asterisk/modules/app_queue.so
Reading symbols from /usr/lib/asterisk/modules/format_vox.so...done.
Loaded symbols for /usr/lib/asterisk/modules/format_vox.so
Reading symbols from /usr/lib/asterisk/modules/codec_alaw.so...done.
Loaded symbols for /usr/lib/asterisk/modules/codec_alaw.so
Reading symbols from /usr/lib/asterisk/modules/app_url.so...done.
Loaded symbols for /usr/lib/asterisk/modules/app_url.so
Reading symbols from /usr/lib/asterisk/modules/app_mp3.so...done.
Loaded symbols for /usr/lib/asterisk/modules/app_mp3.so
Reading symbols from /usr/lib/asterisk/modules/codec_ulaw.so...done.
Loaded symbols for /usr/lib/asterisk/modules/codec_ulaw.so
0x0000003addc0d2cb in read () from /lib64/libpthread.so.0
(gdb)

(gdb) bt
#0  0x0000003addc0d2cb in read () from /lib64/libpthread.so.0
#1  0x00000000004b9ad0 in read_char (el=0x1add0f10, cp=0x7fff0c034d8f "") at read.c:298
#2  0x00000000004b4a90 in el_getc (el=0x1add0f10, cp=0x7fff0c034d8f "") at read.c:350
#3  0x00000000004b4ef0 in el_gets (el=0x1add0f10, nread=0x7fff0c03536c) at read.c:243
#4  0x0000000000426fb9 in main (argc=<value optimized out>, argv=0x1) at asterisk.c:3212
(gdb)
(gdb)
(gdb) info thread
 16 Thread 0x41412940 (LWP 3948)  0x0000003add0ca556 in poll () from /lib64/libc.so.6
 15 Thread 0x4148e940 (LWP 3949)  0x0000003add0ca556 in poll () from /lib64/libc.so.6
 14 Thread 0x40fdf940 (LWP 3950)  0x0000003addc0d56b in connect () from /lib64/libpthread.so.0
 13 Thread 0x41d46940 (LWP 3951)  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
 12 Thread 0x40f28940 (LWP 3952)  0x0000003add0ca556 in poll () from /lib64/libc.so.6
 11 Thread 0x407da940 (LWP 3959)  0x0000003addc0db71 in nanosleep () from /lib64/libpthread.so.0
 10 Thread 0x41dc2940 (LWP 3960)  ast_sched_add_variable (con=0x1adfbf30, when=4000,
   callback=0x2aaab0b21490 <sip_poke_noanswer>, data=0x2aaab8e04e40, variable=0) at sched.c:178
 9 Thread 0x42055940 (LWP 3961)  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
 8 Thread 0x4150a940 (LWP 3963)  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
 7 Thread 0x41586940 (LWP 3964)  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
 6 Thread 0x4168f940 (LWP 3968)  0x0000003addc0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
 5 Thread 0x4036e940 (LWP 3969)  0x0000003add0ca556 in poll () from /lib64/libc.so.6
 4 Thread 0x40b91940 (LWP 3993)  0x0000003addc0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
 3 Thread 0x417b9940 (LWP 7132)  0x0000003add0ca556 in poll () from /lib64/libc.so.6
 2 Thread 0x41f3b940 (LWP 7442)  0x0000003add0ca556 in poll () from /lib64/libc.so.6
* 1 Thread 0x2ae795e3f510 (LWP 3947)  0x0000003addc0d2cb in read () from /lib64/libpthread.so.0
(gdb) thread apply all bt

Thread 16 (Thread 0x41412940 (LWP 3948)):
#0  0x0000003add0ca556 in poll () from /lib64/libc.so.6
#1  0x00000000004226d4 in listener (unused=<value optimized out>) at asterisk.c:1010
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 15 (Thread 0x4148e940 (LWP 3949)):
#0  0x0000003add0ca556 in poll () from /lib64/libc.so.6
#1  0x0000000000469991 in accept_thread (ignore=<value optimized out>) at manager.c:2481
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 14 (Thread 0x40fdf940 (LWP 3950)):
#0  0x0000003addc0d56b in connect () from /lib64/libpthread.so.0
#1  0x00002aaab1dd03b6 in my_connect () from /usr/lib64/mysql/libmysqlclient.so.15
#2  0x00002aaab1dd2b8d in mysql_real_connect () from /usr/lib64/mysql/libmysqlclient.so.15
#3  0x00002aaac2cad9cf in ?? () from /usr/lib/asterisk/modules/res_config_mysql.so
#4  0x00002aaac2caec11 in ?? () from /usr/lib/asterisk/modules/res_config_mysql.so
ASTERISK-1  0x0000000000446b14 in ast_load_realtime (family=0x2aaab0b4bd31 "sippeers") at config.c:1419
ASTERISK-2  0x00002aaab0b19b33 in find_peer (peer=0x40fdeb20 "022566xxyy", sin=0x0, realtime=<value optimized out>,
   devstate_only=1) at chan_sip.c:2592
ASTERISK-3  0x00002aaab0b212ec in sip_devicestate (data=0x40fdefa4) at chan_sip.c:16977
ASTERISK-4  0x000000000044c403 in ast_device_state (device=0x2aaab88bf7c8 "SIP/022566xxyy") at devicestate.c:170
ASTERISK-5  0x000000000044c890 in do_state_change (device=0x1b <Address 0x1b out of bounds>) at devicestate.c:285
ASTERISK-6 0x000000000044c96d in do_devstate_changes (data=<value optimized out>) at devicestate.c:346
ASTERISK-7 0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
ASTERISK-8 0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-9 0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x41d46940 (LWP 3951)):
#0  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
---Type <return> to continue, or q <return> to quit---
#1  0x00002aaaab2cc99f in do_parking_thread (ignore=<value optimized out>)
   at /usr/src/asterisk-1.4.26/include/asterisk/channel.h:1376
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 12 (Thread 0x40f28940 (LWP 3952)):
#0  0x0000003add0ca556 in poll () from /lib64/libc.so.6
#1  0x00002aaaadd92a6b in do_monitor (data=<value optimized out>) at chan_dahdi.c:7310
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 11 (Thread 0x407da940 (LWP 3959)):
#0  0x0000003addc0db71 in nanosleep () from /lib64/libpthread.so.0
#1  0x00002aaaaf48fd4a in scan_thread (unused=<value optimized out>) at pbx_spool.c:458
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x41dc2940 (LWP 3960)):
#0  ast_sched_add_variable (con=0x1adfbf30, when=4000, callback=0x2aaab0b21490 <sip_poke_noanswer>, data=0x2aaab8e04e40,
   variable=0) at sched.c:178
#1  0x00002aaab0b16cd8 in sip_poke_peer (peer=0x2aaab8e04e40) at chan_sip.c:16908
#2  0x00002aaab0b34589 in sip_poke_peer_s (data=0x2aaaba43a000) at chan_sip.c:8251
#3  0x00000000004a2858 in ast_sched_runq (con=<value optimized out>) at sched.c:363
#4  0x00002aaab0b21df3 in do_monitor (data=<value optimized out>) at chan_sip.c:16742
ASTERISK-1  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
ASTERISK-2  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-3  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x42055940 (LWP 3961)):
#0  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaab3d02bca in agent_check_and_process () from /usr/lib64/libnetsnmpagent.so.10
---Type <return> to continue, or q <return> to quit---
#2  0x00002aaab374cbfa in agent_thread (arg=<value optimized out>) at snmp/agent.c:162
#3  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#4  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-1  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x4150a940 (LWP 3963)):
#0  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaac12006ab in sound_thread (arg=<value optimized out>)
   at /usr/src/asterisk-1.4.26/include/asterisk/channel.h:1376
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x41586940 (LWP 3964)):
#0  0x0000003add0cc5e2 in select () from /lib64/libc.so.6
#1  0x00002aaac12006ab in sound_thread (arg=<value optimized out>)
   at /usr/src/asterisk-1.4.26/include/asterisk/channel.h:1376
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x4168f940 (LWP 3968)):
#0  0x0000003addc0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aaac6d37bbb in device_state_thread (data=<value optimized out>)
   at /usr/src/asterisk-1.4.26/include/asterisk/lock.h:790
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x4036e940 (LWP 3969)):
#0  0x0000003add0ca556 in poll () from /lib64/libc.so.6
#1  0x00000000004209c5 in monitor_sig_flags (unused=<value optimized out>) at asterisk.c:2679
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x40b91940 (LWP 3993)):
#0  0x0000003addc0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000042a0be in autoservice_run (ign=<value optimized out>)
   at /usr/src/asterisk-1.4.26/include/asterisk/lock.h:790
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x417b9940 (LWP 7132)):
#0  0x0000003add0ca556 in poll () from /lib64/libc.so.6
#1  0x000000000043b84f in ast_waitfor_nandfds (c=0x417b2ef0, n=2, fds=0x0, nfds=0, exception=0x0, outfd=0x0, ms=0x417b2f9c)
   at channel.c:1783
#2  0x000000000043f726 in ast_channel_bridge (c0=0x2aaab8216280, c1=0x0, config=0x417b46d0, fo=0x417b32d8, rc=0x417b32d0)
   at channel.c:1845
#3  0x00002aaaab2ca6a5 in ast_bridge_call (chan=0x2aaab8216280, peer=0x1b057f70, config=0x417b46d0) at res_features.c:1806
#4  0x00002aaac65173b7 in dial_exec_full (chan=0x2aaab8216280, data=<value optimized out>, peerflags=0x417b4990,
   continue_exec=0x0) at app_dial.c:1804
ASTERISK-1  0x00002aaac65189a5 in dial_exec (chan=0x417b2d00, data=0x8) at app_dial.c:1843
ASTERISK-2  0x00000000004803fd in pbx_extension_helper (c=0x2aaab8216280, con=<value optimized out>,
   context=0x2aaab82164d0 "Dial-Out", exten=0x2aaab8216520 "0412502020", priority=40, label=<value optimized out>,
   callerid=0x1af74190 "0009004132344xxyy", action=E_SPAWN) at pbx.c:536
ASTERISK-3  0x0000000000482a82 in __ast_pbx_run (c=0x2aaab8216280) at pbx.c:2283
ASTERISK-4  0x0000000000483949 in pbx_thread (data=0x417b2d00) at pbx.c:2599
ASTERISK-5  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
ASTERISK-6 0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
ASTERISK-7 0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x41f3b940 (LWP 7442)):
#0  0x0000003add0ca556 in poll () from /lib64/libc.so.6
#1  0x0000000000422b3d in netconsole (vconsole=<value optimized out>) at asterisk.c:959
#2  0x00000000004af1bc in dummy_start (data=<value optimized out>) at utils.c:856
#3  0x0000003addc06367 in start_thread () from /lib64/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
#4  0x0000003add0d309d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2ae795e3f510 (LWP 3947)):
#0  0x0000003addc0d2cb in read () from /lib64/libpthread.so.0
#1  0x00000000004b9ad0 in read_char (el=0x1add0f10, cp=0x7fff0c034d8f "") at read.c:298
#2  0x00000000004b4a90 in el_getc (el=0x1add0f10, cp=0x7fff0c034d8f "") at read.c:350
#3  0x00000000004b4ef0 in el_gets (el=0x1add0f10, nread=0x7fff0c03536c) at read.c:243
#4  0x0000000000426fb9 in main (argc=<value optimized out>, argv=0x1) at asterisk.c:3212



By: Marc LEURENT (lftsy) 2010-02-11 06:35:22.000-0600

To summarize the problem, I would say that the first thing that is not normal, (from my point of view), since a peer can have only one location, is the fact (you can see it in the 1st pcap file I have posted https://issues.asterisk.org/file_download.php?file_id=24694&type=bug) that a realtime peer can have several Contact location in the same time.

And I think that this is what leads to the flooding...
Ex:
OPTIONS sip:0245667911@194.38.160.113:5060 SIP/2.0
OPTIONS sip:0245667911@213.162.3.159:5060;transport=udp;user=phone SIP/2.0


And on my last pcap file today, the same problem appears, one peer has many Contact entries that are relayed via the proxy IP 212.147.45.91:5060 (but as I said before, the problem can appear with several destination IP/port in the same time on a direct connection without going through the proxy)

OPTIONS sip:044444xxyy@92.104.53.35:65082;line=0ahhazhb SIP/2.0
OPTIONS sip:044444xxyy@92.104.53.35:59188;rinstance=66d72254a0b7f923 SIP/2.0
OPTIONS sip:044444xxyy@92.104.53.35:65082;line=0ahhazhb SIP/2.0
OPTIONS sip:044444xxyy@92.104.53.35:59188;rinstance=66d72254a0b7f923 SIP/2.0
OPTIONS sip:044444xxyy@92.104.53.35:65246;line=snv7vwjm SIP/2.0
OPTIONS sip:044444xxyy@92.104.53.35:64749;line=1y8wt5tb SIP/2.0
OPTIONS sip:044444xxyy@92.104.53.35:55382;line=7tp7l2hd SIP/2.0
...

And of course the peer in not reachable and had a "Expire : -1" even if we have set a max expiration timer.

pul-lav-vp-ast-03*CLI> sip show peer 044444xxyy

 * Name       : 044444xxyy
 Realtime peer: Yes, cached
 Secret       : <Set>
 MD5Secret    : <Not set>
 Context      : sipresidential
 Subscr.Cont. : <Not set>
 Language     : de
 AMA flags    : Unknown
 Transfer mode: open
 CallingPres  : Presentation Allowed, Not Screened
 Callgroup    :
 Pickupgroup  :
 Mailbox      : 044444xxyy@default
 VM Extension : asterisk
 LastMsgsSent : 32767/65535
 Call limit   : 6
 Dynamic      : Yes
 Callerid     : "044444xxyy" <044444xxyy>
 MaxCallBR    : 384 kbps
 Expire       : -1LI>
 Insecure     : no
 Nat          : Always
 ACL          : No
 T38 pt UDPTL : Yes
 CanReinvite  : NoLI>
 PromiscRedir : No
 User=Phone   : No
 Video Support: No
 Trust RPID   : No
 Send RPID    : No
 Subscriptions: Yes
 Overlap dial : No
 DTMFmode     : rfc2833
 LastMsg      : 0
 ToHost       :
 Addr->IP     : (Unspecified) Port 0
 Defaddr->IP  : 0.0.0.0 Port 5060
 Reg. exten   :
 Def. Username: 044444xxyy
 SIP Options  : (none)
 Codecs       : 0x102 (gsm|g729)
 Codec Order  : (g729:20,gsm:20)
 Auto-Framing:  No
 Status       : UNKNOWN
 Useragent    :
 Reg. Contact :

And pruning this peer does not stop the flooding OPTIONS...
I hope all these informations will help you pinpoint the problem.

Best Regards,

By: Alan Graham (zerohalo) 2010-02-11 15:08:08.000-0600

I don't think I can add too much more to what iftsy has posted above. It looks related to the https://issues.asterisk.org/view.php?id=15768 that was closed as normal behavior. I have a pcap that shows this behavior to a single peer (of over a thousand on the server) sending options at the rate of about 5M over a couple of minutes. I can provide this pcap as well as access to production machines, if needed.

By: Marc LEURENT (lftsy) 2010-02-12 05:02:44.000-0600

Hello everybody,
I have discussed on #asterisk-dev with some developers, they were very reactive!
I have recompiled Asterisk with DO NOT OPTIMIZE flag and I have upgraded to version 1.4.30-rc2 in order to provide them more useful feedback.

I have more trouble reproducing the bug on the dev platform, but even after testing the patch above (commit SVN 234132), the problem was more difficult to reproduce but was still there.

@zerohalo Could you confirmed that you have the trouble on 1.4.30rc2, and not "1.4.20rc2" as you said before?

Thanks



By: Alan Graham (zerohalo) 2010-02-16 10:56:31.000-0600

@iftsy - the behavior was observed on 1.4.20rc2, not 30. As soon as I can capture it again, I will post, was just an observation that this may have been going on for a long time.

In the meantime, I just had another 1.4.29 DoS of one of our peers. Relevant BT thread and 'core show locks' below - the rest of the backtrace is uninteresting. This caused an immediate mysql starvation and OPTIONS DoS of around 4MB traffic. Attached is the munin graph showing the mysql throughput.


=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: -1208628320 (do_devstate_changes  started at [  363] devicestate.c ast_device_state_engine_init())
=== ---> Lock #0 (res_config_mysql.c): MUTEX 115 realtime_mysql &mysql_lock 0x973600 (1)
=== -------------------------------------------------------------------
===
=== Thread ID: 8866720 (do_monitor           started at [17064] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 17010 do_monitor &monlock 0xa662c0 (1)
=== -------------------------------------------------------------------
===
=======================================================================


Thread 42 (process 19188):
#0  0x0810a99c in schedule (con=0x9c248f0, s=0xb0ccda00) at sched.c:179
#1  0x0810ac30 in ast_sched_add_variable (con=0x9c248f0, when=1000, callback=0x9faf60 <retrans_pkt>, data=0xaed39280, variable=1) at sched.c:231
#2  0x009fc770 in __sip_reliable_xmit (p=0xaed37a00, seqno=102, resp=0,
   data=0x8730bc "OPTIONS sip:99.XX.XX.XX SIP/2.0\r\nVia: SIP/2.0/UDP 70.XX.XX.XX:5060;branch=z9hG4bK28204ebb;rport\r\nFrom: \"unknown\" <sip:unknown@70.XX.XX.XX>;tag=as58a737ba\r\nTo: <sip:99.XX.XX.XX>\r\nContact: <sip:"..., len=495, fatal=1, sipmethod=3) at chan_sip.c:2130
#3  0x009fe435 in send_request (p=0xaed37a00, req=0x872ea0, reliable=XMIT_CRITICAL, seqno=102) at chan_sip.c:2428
#4  0x00a1605e in transmit_invite (p=0xaed37a00, sipmethod=3, sdp=0, init=2) at chan_sip.c:7688
ASTERISK-1  0x00a47be2 in sip_poke_peer (peer=0xb1e52900) at chan_sip.c:17167
ASTERISK-2  0x00a1b39d in sip_poke_peer_s (data=0xb1e52900) at chan_sip.c:8481
ASTERISK-3  0x0810b15f in ast_sched_runq (con=0x9c248f0) at sched.c:363
ASTERISK-4  0x00a4680f in do_monitor (data=0x0) at chan_sip.c:17011
ASTERISK-5  0x0811c3d5 in dummy_start (data=0x9c2a6e0) at utils.c:856
ASTERISK-6 0x006ed5cc in start_thread () from /lib/tls/libpthread.so.0
ASTERISK-7 0x00657f0e in clone () from /lib/tls/libc.so.6



By: Kristijan Vrban (vrban) 2010-03-01 03:14:29.000-0600

i just uploaded the file options_storm.txt with a output from a "sip debug" when i was also affected by this issue. asterisk was using 100% CPU and chan_sip was firing hundreds of OPTIONS per second

I also use SIP realtime peers with res_config_mysql



By: Alan Graham (zerohalo) 2010-03-01 09:40:35.000-0600

Anything else we can post to help? The severity is marked as minor but I think this is probably better described as major or crash.

By: Marc LEURENT (lftsy) 2010-03-05 08:19:23.000-0600

I'm going to test Asterisk 1.4.30-rc3 to see if the bug is still there...

By: Leif Madsen (lmadsen) 2010-03-08 12:48:20.000-0600

I've upgraded this to major to reflect that it already has a higher priority on our list of items to get done over the coming weeks.

By: Marc LEURENT (lftsy) 2010-03-15 09:03:16

Good afternoon,
Problem is still there on Asterisk 1.4.30 final release!

If needed, I'm available and I can grant access to our servers to an Asterisk developers if needed, please simply provide your phone number or ask for mine on a private channel (email)!


I have been able to reproduce the bug in doing some stress tests, simply doing 120 calls through the following design:
SIP-Tester -> OpenSIPs Proxy -> Asterisk 1.4.30 -> Asterisk Voicemail 1.4.26

After 2 hours, Asterisk server was sending flood OPTIONS to the OpenSIPs server, using 20Mb/s of bandwidth and flooding the MySQL server too. I have the pcap file showing that the server was flooding the IP of the proxy, and "sip show channels" was showing 15000 SIP OPTIONS transaction ongoing...

Best Regards,

By: Leif Madsen (lmadsen) 2010-03-15 14:09:20

This issue is fairly high in the priority stack of issues to be resolved, but as with all issues, this will be handled by a developer as soon as time and resources allows. As it stands, we currently have over 600 open issues in the issue tracker, so we're making our way through them as best we can.

Thanks for your patience and understanding.

By: Alan Graham (zerohalo) 2010-03-22 10:42:35

Here's another one (1.4.27.1). Full chan_sip deadlock, forced a crash to get a BT. Note that the 'core show locks' is slightly different:

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: -1208841312 (do_devstate_changes started at [ 363]
devicestate.c ast_device_state_engine_init())
=== ---> Lock #0 (chan_sip.c): MUTEX 2805 find_peer &(&peerl)->_lock
0x193220 (1)
=== -------------------------------------------------------------------
===
=== Thread ID: 1924000 (do_monitor started at [17041]
chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 16987 do_monitor &monlock 0x190f60 (1)
=== ---> Lock #1 (sched.c): MUTEX 220 ast_sched_add_variable &con->lock
0x96b8a58 (1)
=== -------------------------------------------------------------------
===
=======================================================================

Thread 41 (process 6583):
#0 0x0810a26a in schedule (con=0x96b8a58, s=0xb3b73430) at sched.c:178
#1 0x0810a488 in ast_sched_add_variable (con=0x96b8a58, when=1000,
callback=0x125f60 <retrans_pkt>, data=0xb6a06e90, variable=1) at sched.c:231
#2 0x00127770 in __sip_reliable_xmit (p=0xb67168d8, seqno=102, resp=0,
data=0x1d40cc "OPTIONS sip:2XX.XX.XXX.XXX SIP/2.0\r\nVia: SIP/2.0/UDP
XX.XXX.X.XX:5060;branch=z9hG4bK1f6e4ce4;rport\r\nFrom: \"unknown\"
<sip:unknown@XX.XXX.X.XX>;tag=as51db4c55\r\nTo:
<sip:2XX.XX.XXX.XXX>\r\nContact: <sip:unkn"..., len=491, fatal=1,
sipmethod=3) at chan_sip.c:2130
#3 0x00129435 in send_request (p=0xb67168d8, req=0x1d3eb0,
reliable=XMIT_CRITICAL, seqno=102) at chan_sip.c:2428
#4 0x00140dc0 in transmit_invite (p=0xb67168d8, sipmethod=3, sdp=0,
init=2) at chan_sip.c:7683
ASTERISK-1 0x0017288e in sip_poke_peer (peer=0xb46c2f98) at chan_sip.c:17144
ASTERISK-2 0x001460ff in sip_poke_peer_s (data=0xb46c2f98) at chan_sip.c:8476
ASTERISK-3 0x0810a9b7 in ast_sched_runq (con=0x96b8a58) at sched.c:363
ASTERISK-4 0x001714bb in do_monitor (data=0x0) at chan_sip.c:16988
ASTERISK-5 0x0811bc2d in dummy_start (data=0x96ba168) at utils.c:856
ASTERISK-6 0x004265cc in start_thread () from /lib/tls/libpthread.so.0
ASTERISK-7 0x0037ef0e in clone () from /lib/tls/libc.so.6

By: Marc LEURENT (lftsy) 2010-04-13 04:02:49

Hello everybody!
Any update on this issue?

Thanks,
Best Regards

By: Leif Madsen (lmadsen) 2010-04-15 11:11:15

All issues are updated within the issue. When an issue gets moved forward, it will happen here.

---

Your issue is in queue, please be patient, and we will get to it as time permits and developer resources become available.

By: Alan Graham (zerohalo) 2010-07-19 00:02:52

we're getting this in 1.4.33rc2 - I believe that https://issues.asterisk.org/view.php?id=17643 may be similar, if not the same. I suspected perhaps a change in res_config_mysql may have caused this, but switching to odbc-mysql yields the same resulting deadlock and options DoS issue.

By: Alan Graham (zerohalo) 2010-07-26 15:45:40

From what we can tell there seems to be a race condition which rather than updating the realtime DB with the fromlocation, it gets put into the astdb - perhaps load-related?

If I patch chan_sip.c to remove the ast_db_del ast_db_get and ast_db_put, this situation doesn't seem to happen - I've not heavily load tested this.

from what I can guess, asterisk loses connection with realtime for a split second, causing the peer info to be retrieved from the astdb which causes this OPTIONS bomb.

I'm still picking through this to find a way to prove it, but that's what I've come up with so far.



By: Jeff Peeler (jpeeler) 2010-07-29 16:43:00

zerohalo - did the patch from 17643 resolve the issue here?

By: Alan Graham (zerohalo) 2010-07-29 16:47:03

jpeeler: we're still testing the patch under load - should have some information to you soon.

By: Alan Graham (zerohalo) 2010-08-05 07:48:41

while we haven't crashed using the patch in 17643, we've also not crashed without the patch on a another machine this week (both production machines). I'm hesitant to call this resolved yet, but it's looking good.

By: Jeff Peeler (jpeeler) 2010-08-05 13:45:55

Well thanks for the update. I'll check back sometime next week probably.

By: Alan Graham (zerohalo) 2010-08-07 09:54:12

Although I haven't seen the crash for 17643 so far, the OPTIONS race persists w/ the patch included there. Just had a runaway options race that pegged mysql at 2M whereas the normal is not above 3k

By: Jeff Peeler (jpeeler) 2010-08-16 14:13:32

lftsy is server access still available? I really need to reproduce this somehow and will work on doing so here, but getting server access is obviously faster.

By: Marc LEURENT (lftsy) 2010-08-17 02:36:31

Hello jpeeler!
I'm contacting you over IRC #asterisk-dev to plan a meeting! We'll keep averybody informed ASAP!
I'm currently preparing 2 Asterisk servers to reproduce the bug!

By: Jeff Peeler (jpeeler) 2010-08-17 17:50:42

I can register a realtime configured peer, and then register it elsewhere and the the option messages are stopped immediately from being sent to the previous location. I don't know how you all are reproducing this reliably.

By: Andrew Thomas (raffles) 2010-08-18 09:19:27

JFYI - I am getting a similar problem in 1.4.34 (upgrading to .35 tonight - to see if it persists).  If a SIP handset gets unplugged (for example) - then * sends out a OPTIONS flood to the last known address (with qualify=yes).  This has flooded a network with 7Mb/s of requests.

I have access to the trace if required.

EDIT : have attached 'options.JPG' - a screen shot of the Wireshark capture.



By: Jeff Peeler (jpeeler) 2010-08-18 09:35:29

Raffles does this happen every time? I see options being sent periodically with 4 retransmissions before being rescheduled. Perhaps the definition of flood being used is different. What are the timestamps for these packet transmissions?

By: Andrew Thomas (raffles) 2010-08-18 09:52:41

This happens every time * looses contact with a handset.  I am getting the full trace sent to me now.  I'll not attach it - it's massive - but I will give you a link to grab it from.

As you can see from the screen shot - I had an average of 7Mb/s in just 21 seconds.  All this traffic was the * server (196.1.1.1) sending out OPTIONS requests to the handset (192.168.0.81 in this case).  This can be verified by the Wireshark display filter.

This is causing an issue - as the WAN link is only a 2Mb/s SDSL one :)



By: Jeff Peeler (jpeeler) 2010-08-18 09:55:51

Raffles and you are using realtime peers or does this happen with normal peers as well?

By: Andrew Thomas (raffles) 2010-08-18 09:58:00

Sorry - should have said.  All my peers are realtime/mySQL based.

EDIT : www.dv-ip.com/options.pcap (approx. 20 Meg)



By: Alan Graham (zerohalo) 2010-08-18 13:52:30

I wish I could add anything additional here, but since we've patched out the read/write of peer data from ast_db in chan_sip, which granted, is a very big hammer approach, we haven't run into this again. It definitely seems like the problem is related to mysql connectivity blipping (on the local machine), but it's been very hard to reproduce reliably. It definitely takes decent load and patience for it to happen, generally pruning peers and killing connections will increase the chance of it happening, but never reliably happens when we want it to.

By: Jeff Peeler (jpeeler) 2010-08-18 13:55:20

zerohalo, did you ever see "MySQL RealTime: Failed to query database. Check debug for more info."?

By: Alan Graham (zerohalo) 2010-08-18 14:20:00

jpeeler, I can't say with confidence, as our logs have been well rotated by now - but it does sound familiar. I'll see if I can find a sacrificial production server and revert the patches - I'm never able to reproduce this in a lab environment. In our case, it's never just a peer or two that goes offline, but rather a bunch at once, as when a circuit goes down and the peers jump over to a backup circuit. 'sip show channels' will show unending numbers of OPTIONS dialogs, and as mentioned we essentially DoS the connection. Also, typically, the backup asterisk server will be the one that this manifests on. (connection A goes down, phones fail-over to connection B, pop up on asterisk server B for a split second, then immediately go back to server A, then server B DoSs connection A).

By: Jeff Peeler (jpeeler) 2010-08-18 14:26:44

Raffles, you said this happened every time for you. Can you reproduce and give access to a server for live debugging? I can't reproduce the problem at all.

By: Marc LEURENT (lftsy) 2010-08-18 14:58:36

Hello everybody,
We have spend some time with jpeeler (who I thank for its time) on one of our lab server to try finding a short procedure to reproduce the bug, but we failed to find a quick procedure without having to wait (which is going to be useful to test patches)...

The best procedure I had, was:
* Doing note above n°114758 until pinging deprecated IP/ports starts
* And wait for uptime to increase
* Then resuming doing note n°114758 above until flooding appears!

We/I really have to take some time to find a quick and 100% procedure that reproduces the bug!

Best Regards,

By: Marc LEURENT (lftsy) 2010-08-18 16:01:36

At least it has worked! We successfully re-created the flooding process on Asterisk 1.4.26, since it was a version I was able to reproduce the bug easier!

Going to check with asterisk SVN 1.4 branch ASAP to see if I can also easily recreate it!


Here is what I have done to "start the flooding", sorry I do not have the very short procedure,
but for sure, launching these automatic scripts for at least one hour and you should have some flood on 1.4.26!


What you need is:
* an Asterisk server 1.4.26 (to reproduce it quickly), 1.4.30 is broken too, 1.4.35 is tomorrow's step
* 2 real phones like 0215667818 and 0245667911 for me
* a twinkle phone also configured with 0215667818

The prune process is a way I have found to accelerate the process of flooding but it's not compulsory, since peer in production are not using this command except when changing password or codecs, and it's happening on peer that doesn't have changed password or codecs


After doing the procedure below,
I have now peer 0215667818 being pinged on (10min after unpluging all phones):
192.168.10.206:14436
192.168.10.206:10619
192.168.10.206:10749
192.168.10.206:16566
192.168.10.126:2048

And if we stop our scripts and wait for up to 2 days, the flood can grow to >4Mb/s (max I have seen was 20Mb/s on one single customer, just too bad for hime!)


############# HERE IS THE COMMAND TO SIMULATE A PEER BEHIND UNSTABLE CONNECTION ##########
# Here is the command I use to simulate an unstable xDSL line (changing connection port)
mlr@nb-ge-015:[~]$ while `sleep 1`
do sed -i "s/sip_port=.*/sip_port=$RANDOM/" .twinkle/twinkle.sys
sleep 5s
done


# Here is the command I'm using to kill Twinkle without unregistering
mlr@nb-ge-015:[~/.twinkle]$ while `sleep 1`
do killall -9 twinkle
sleep 10s
done

# Here is the command I'm using to restart Twinkle sometimes
mlr@nb-ge-015:[~/.twinkle]$ while `sleep 1`
do twinkle
sleep 13s
done

############# AND HERE IS THE ACTION IF WAS ALLOWED TO DO ON ASTERISK ##################
asterisk -rx "sip prune realtime 0215667818"
and call from 0245667911 to 0215667818

######################################################################

By: Andrew Thomas (raffles) 2010-08-23 05:52:25

jpeeler - sorry for lateness (been working 'away').

Anyway, unfortunately I can not give access to this particular site as it is not 'net' facing.

Also, I need to qualify (no pun intended) my use of 'every time'.  By 'every time' I mean 'every time on this customers site'.  Not on every one of my * switches (sorry).

This happened again this morning - and generated 3.5Mb/s this time.  It's as though the qualify's are getting queued up and then being sent out in 'burst' mode (ie. all at the same time).  Is this how the qualify works in 1.4.3x?

And to add - still in 1.4.35 :(

I think I need to read up on exactly how/often * sends out qualify OPTIONS packets so I can explain myself more clearly.

By: Jeff Peeler (jpeeler) 2010-08-31 11:20:11

The attached patch is a fix to stop qualifies once a peer is pruned. Please test and report back your results. I do not recall anybody saying that there were problems when peers were not pruned.

By: Marc LEURENT (lftsy) 2010-08-31 15:39:28

Hello,
I have to take time to test this patch ASAP, but I think that in production the flood process starts without prune because in our environment prune can only be triggered when modifying codec or password which do not happen often. And floods happens everyday on unstable peers

Since we have several Asterisk servers (2 in the environment with troubles), I was wondering if the problem can be triggered by creating a peer with an entry that is still valid but that was created on another Asterisk.

Ex: I have 2 Asterisk servers behind a proxy: ast-01 and ast-03, if I register on ast-01 with register timer of 3600s and a few minutes later, on ast-03, I try to end the call to this peer locally, it will create this peer on ast-03 with information from the database that is valid. And I think that it can start the flooding process like that without using the "sip prune realime" action.


What do you think about this?
I really have to try to confirm this!

By: Alan Graham (zerohalo) 2010-08-31 15:53:13

I can't say this happens always with a prune, though a prune does seem to help speed up the conditions that cause it or perhaps a prune at the right moment kicks it off. It seems to happen when a peer jumps on/offline rather quickly (as would be the case in severe packet loss). While browsing the bug tracker, I found ASTERISK-16455 which sounds pretty familiar. I'm going to go back through by backtraces and see if I can correlate hints being used for all of the peers in question (and test the patch in ASTERISK-16455 while I'm at it).

By: Alan Graham (zerohalo) 2010-09-17 10:01:43

patching with similar to ASTERISK-1749741 we've not run into this issue again - so far - we've been rolling this onto production servers over the course of the week, so I'll have a better update next week.



By: Alan Graham (zerohalo) 2010-09-23 12:25:03

some success - although the options flood is not gone, the patch in ASTERISK-1749741 removes the issues with increased mysql throughput to the realtime db and prevents starvation.

In a weird twist, I found that after cloning a replicating mysql slave for another machine that after several hours, the new replicating slave started DoSing a peer that had never been on it.

By: Alan Graham (zerohalo) 2010-09-23 17:29:56

the issue seems to hit more with a high qualify time (i.e., qualify=10000)

By: Jeff Peeler (jpeeler) 2010-10-07 16:53:32

Well folks, I know it's been a while since I've looked at this. I still can not reproduce this problem. Since people are saying that pruning peers are not necessary I've just tried registering a realtime peer from different locations with a simulated load (using stressapptest). I do not see how connectivity to the database would affect things since the registry check aborts for realtime. Even if it takes a long time to occur, we need to isolate the problem to one given condition such as a peer going offline over and over.

By: Jeff Peeler (jpeeler) 2010-10-11 18:11:55

lftsy, I've noticed a difference in your initial reported sip.conf and one that you commented. The option rtautoclear I believe is critical to making things break. Please confirm exactly what you have that option set to and report back. This seems to be the only way I can see which produces the correct conditions for a peer to have multiple ongoing OPTIONS transmissions without pruning a peer from the CLI.

Anybody else want to confirm that they have rtautoclear set to no (and are not pruning peers) and are having problems?

By: Alan Graham (zerohalo) 2010-10-11 18:16:21

autoclear is not set (and therefore default):

Realtime SIP Settings:
----------------------
 Realtime Peers:         Yes
 Realtime Users:         Yes
 Cache Friends:          Yes
 Update:                 Yes
 Ignore Reg. Expire:     No
 Save sys. name:         No
 Auto Clear:             120

I'll try setting it to no to see what happens.

By: Alan Graham (zerohalo) 2010-10-11 18:30:59

Interestingly, I can't set rtautoclear=no and have it reflected in 'sip show settings'. Our general sip context is included from another file, so I'll check there as well, but even setting rtautoclear=0 seems to have no effect.

By: Marc LEURENT (lftsy) 2010-10-12 01:14:21

Good morning, sorry not answering sooner but we've got jet-lag here -:) On the server we have reinstalled together, current configuration for sip.conf is:


[general]  
context=default ; Default context for incoming calls
allowguest=no ; Allow or reject guest calls (default is yes)
allowoverlap=no ; Disable overlap dialing support. (Default is yes)
realm=voip ; Realm for digest authentication
bindport=5060 ; UDP Port to bind to (SIP standard port is 5060)
bindaddr=0.0.0.0 ; IP address to bind to (0.0.0.0 binds to all)
srvlookup=no ; Enable DNS SRV lookups on outbound calls
pedantic=no ; Enable checking of tags in headers
tos_sip=cs3 ; Sets TOS for SIP packets.
tos_audio=ef ; Sets TOS for RTP audio packets.
maxexpiry=180 ; Maximum allowed time of incoming registrations
minexpiry=60 ; Minimum length of registrations/subscriptions (default 60)
defaultexpiry=120 ; Default length of incoming/outgoing registration
disallow=all ; First disallow all codecs
allow=alaw ; Allow codecs in order of preference
language=en ; Default language setting for all users/peers
useragent=voipua ; Allows you to change the user agent string
dtmfmode=rfc2833 ; Set default dtmfmode for sending DTMF. Default: rfc2833
rtptimeout=60 ; Terminate call if 60 seconds of no RTP or RTCP activity
rtpholdtimeout=300 ; Terminate call if 300 seconds of no RTP or RTCP activity
rtpkeepalive=20 ; Number of seconds, when a RTP Keepalive packet will be sent if no other RTP traffic on that connection.
notifyringing=yes ; Control whether subscriptions already INUSE get sent
nat=yes ; Global NAT settings (Affects all peers and users)
rtcachefriends=yes ; Cache realtime friends by adding them to the internal list
rtupdate=yes ; Send registry updates to database using realtime? (yes|no)
rtautoclear=60 ; Auto-Expire friends created on the fly. If yes the autoexpire will be in 120 seconds. Default yes.
qualify=yes ; Check if client is reachable. If yes, the checks occur every 60 seconds
t38pt_udptl=yes ; T.38 faxing only works in SIP to SIP calls, with no local or agent channel being used.
progressinband=never ; never|no|yes : If we should generate in-band ringing always. Default never.
 

And so rtautoclear is indeed set to 60s

Realtime SIP Settings:
----------------------
 Realtime Peers:         Yes
 Realtime Users:         Yes
 Cache Friends:          Yes
 Update:                 Yes
 Ignore Reg. Expire:     No
 Save sys. name:         No
 Auto Clear:             60


If you want me to change anything and report back, I'm available. Have a nice morning.

By: Andrew Thomas (raffles) 2010-10-12 03:20:46

"It seems to happen when a peer jumps on/offline rather quickly" - zerohalo

After a lot of further investigation - this seems to be the condition that sparks ours off.  The handsets are registered to a * box over a private WAN.  If that WAN link dies for any reason and then comes back (a bit like ADSL does) our * server starts the 'flood'.

One thing I did notice, though, is that it seems to affect Aastra 55i handsets a lot more than the Grandstream BT-20x ones.  Maybe there is something in the */handset combination that is triggering it?

Sorry I can't be of more help - time isn't on my side with this one (rather irate customer - so I just restart everything to keep the peace).

EDIT : just noticed a mention of 'hints'.  All our handsets are 'hint'ed.



By: Alan Graham (zerohalo) 2010-10-12 13:17:37

FYI - on a clean install, with default options + realtime, setting rtautoclear=no or rtautoclear=0 is not respected. I can set it to any value except no or 0. I will open a separate bug for this, but that might explain anyone who has it set to 'no' and still getting this issue.

By: Jeff Peeler (jpeeler) 2010-10-12 13:35:35

zerohalo, yes I noticed that after my last post. In this case though I think I've figured out a way to significantly reduce if not eliminate the redundant (actually stale) option transmissions and will be posting a patch for testing shortly.

By: Jeff Peeler (jpeeler) 2010-10-12 17:59:36

v2 patch uploaded, please report back

By: Alan Graham (zerohalo) 2010-10-12 20:35:13

jpeeler: I immediately get a segfault upon load:

Program terminated with signal 11, Segmentation fault.
#0  0x00f3c3a9 in __ast_pthread_mutex_lock (filename=0xf99bf4 "chan_sip.c", lineno=8598, func=0xf9f5bc "sip_poke_peer_s",
   mutex_name=0xf9f5cc "&(foundpeer)->_lock", t=0x5c) at /usr/src/asterisk/asterisk-1.4.37-rc1/include/asterisk/lock.h:360
360             int canlog = strcmp(filename, "logger.c") & t->track;

against 1.4.37-rc1, bt attached



By: Jeff Peeler (jpeeler) 2010-10-13 11:32:10

Thanks for testing, sorry about the crash. v3 should take care of that.

By: Alan Graham (zerohalo) 2010-10-13 11:41:29

I'll try out v3 this evening - thanks!

By: Alan Graham (zerohalo) 2010-10-14 08:54:38

No joy:

Core was generated by `asterisk -g'.
Program terminated with signal 11, Segmentation fault.
#0  0x00d453a9 in __ast_pthread_mutex_lock (filename=0xda2bf4 "chan_sip.c", lineno=8598, func=0xda85bc "sip_poke_peer_s",
   mutex_name=0xda85cc "&(foundpeer)->_lock", t=0x5c) at /usr/src/asterisk/asterisk-1.4.37-rc1/include/asterisk/lock.h:360
360             int canlog = strcmp(filename, "logger.c") & t->track;
#0  0x00d453a9 in __ast_pthread_mutex_lock (filename=0xda2bf4 "chan_sip.c", lineno=8598, func=0xda85bc "sip_poke_peer_s",
   mutex_name=0xda85cc "&(foundpeer)->_lock", t=0x5c) at /usr/src/asterisk/asterisk-1.4.37-rc1/include/asterisk/lock.h:360
#1  0x00d65752 in sip_poke_peer_s (data=0x9786e40) at chan_sip.c:8598
#2  0x0810f8d3 in ast_sched_runq (con=0x96a0500) at sched.c:363
#3  0x00d92993 in do_monitor (data=0x0) at chan_sip.c:17405
#4  0x081214b2 in dummy_start (data=0x96a0050) at utils.c:856
ASTERISK-1  0x00952832 in start_thread () from /lib/libpthread.so.0
ASTERISK-2  0x0033be0e in clone () from /lib/libc.so.6

By: Jeff Peeler (jpeeler) 2010-10-18 14:28:19

Zerohalo, I'm not sure what is causing your crash. Can you post your sip.conf and the relevant line from your sippeers table? In gdb, you can type "p peer->name" to see which peer is causing the crash. Maybe even run under valgrind and see what error output you get.

By: Alan Graham (zerohalo) 2010-10-20 15:31:11

jpeeler: I'll get this information for you tonight.

By: Alan Graham (zerohalo) 2010-10-21 13:25:37

jpeeler: it looks like I somehow got the patch applied twice and that was the cause of the second crash. This is now running on a couple of loaded servers to see if it takes care of the issue.

By: Jeff Peeler (jpeeler) 2010-10-25 16:23:15

What was the result?

By: Alan Graham (zerohalo) 2010-10-25 16:30:03

So far, so good - I haven't seen the flood any longer, but let's give it a couple more days to set and I'll roll it on to more servers.

By: Alan Graham (zerohalo) 2010-10-26 08:20:31

This morning we found a pretty large options bomb going - almost 10M going... I've iptable DROP'd the outgoing connection to the peers and it's available today, if someone wants access to the machine.

By: Alan Graham (zerohalo) 2010-10-26 09:17:58

this is on 1.4.37-rc1 w/ patch v3 + hardcoded rtautoclear=0

By: Jeff Peeler (jpeeler) 2010-11-01 11:58:20

Zerohalo, I'm glad you reported you hard coded rtautoclear. I assume also that autocreatepeer has not been changed from the default of off? A "sip show settings" output may be helpful in case I have further questions.

By: Alan Graham (zerohalo) 2010-11-01 14:17:05

jpeeler: attached - I also have the output of 'sip show channels' if you need it, but as it contains ip/peernames, I'd rather send it out of band

By: Jeff Peeler (jpeeler) 2010-11-02 15:48:30

Upon further examination the rtautoclear option does work properly, and really the value shouldn't ever be set to 0. I'm not sure how you hard coded it, but setting it to "no" should work. Hard coding rtautoclear to 0 would cause registrations to expire prematurely, which I think is related to the problem. So results from the hard coding removed and rtautoclear on (and off actually) would be appreciated. Leaving rtautoclear on should make the problem happen faster so definitely test that first.

By: Alan Graham (zerohalo) 2010-11-02 15:52:08

no problem - I'll see what I can do to get rtautoclear=no to work - as mentioned previously 'sip show settings' still seems to reflect the default - this was on a fresh build w/ stock configs modified only to enable realtime.

By: Jeff Peeler (jpeeler) 2010-11-02 15:53:37

Right, it isn't reflected in the sip show settings, slightly confusing I guess. Just test with yes first.

By: Jeff Peeler (jpeeler) 2010-11-04 11:27:33

Zerohalo, does the silence mean you are testing and not seeing any problems?

By: Alan Graham (zerohalo) 2010-11-04 11:50:05

So far - I have two sets of machines running rtautoclear=no and rtautoclear/default. Nothing observed yet.

By: Alan Graham (zerohalo) 2010-11-05 10:05:41

jpeeler: I've got several servers running w/ updated patches and the revert of the hardcoded rtautoclear; half running =no and half =default. I'm going to stress them this weekend to see if I can force the issue to reproduce - will let you know what I find afterward.

By: Jeff Peeler (jpeeler) 2010-11-08 16:12:32.000-0600

And the result was?

By: Alan Graham (zerohalo) 2010-11-08 16:18:14.000-0600

unable to reproduce at this time - I'd like another couple of days of testing to make sure, if that's okay.

By: Jeff Peeler (jpeeler) 2010-11-08 16:21:06.000-0600

Ok that's fine, just wanted to hear the status.

By: Alan Graham (zerohalo) 2010-11-10 17:04:05.000-0600

jpeeler - I've spent a lot of time trying to reproduce this (which was hard to reproduce in the first place) and I'm happy to report that I cannot reproduce the OPTIONS flood w/ your v3 patch. There are some curious issues with Polycom mwi resubscriptions and rtautoclear=no (another bug which I'll keep working on figuring out - seem related to various other bugs and similar connectivity issues that caused the floods in the first place), but other than that, the rtautoclear setting seems to have no effect in either direction when it comes to helping produce the issue. Thanks for all your time and help on this!

By: Jeff Peeler (jpeeler) 2010-11-11 14:32:59.000-0600

Thank you for the testing and quick feedback! Can finally merge :)

By: Digium Subversion (svnbot) 2010-11-11 15:12:29.000-0600

Repository: asterisk
Revision: 294688

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r294688 | jpeeler | 2010-11-11 15:12:28 -0600 (Thu, 11 Nov 2010) | 18 lines

Fix problem with qualify option packets for realtime peers never stopping.

The option packets not only never stopped, but if a realtime peer was not in
the peer list multiple options dialogs could accumulate over time. This
scenario has the potential to progress to the point of saturating a link just
from options packets. The fix was to ensure that the poke scheduler checks to
see if a peer is in the peer list before continuing to poke. The reason a peer
must be in the peer list to be able to properly manage an options dialog is
because otherwise the call pointer is lost when the peer is regenerated from
the database, which is how existing qualify dialogs are detected.

(closes issue ASTERISK-14500)
Reported by: lftsy
Patches:
     bug16382-3.patch uploaded by jpeeler (license 325)
Tested by: zerohalo


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294688

By: Digium Subversion (svnbot) 2010-11-11 15:57:23.000-0600

Repository: asterisk
Revision: 294733

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_sip.c

------------------------------------------------------------------------
r294733 | jpeeler | 2010-11-11 15:57:22 -0600 (Thu, 11 Nov 2010) | 25 lines

Merged revisions 294688 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r294688 | jpeeler | 2010-11-11 15:12:27 -0600 (Thu, 11 Nov 2010) | 18 lines
 
 Fix problem with qualify option packets for realtime peers never stopping.
 
 The option packets not only never stopped, but if a realtime peer was not in
 the peer list multiple options dialogs could accumulate over time. This
 scenario has the potential to progress to the point of saturating a link just
 from options packets. The fix was to ensure that the poke scheduler checks to
 see if a peer is in the peer list before continuing to poke. The reason a peer
 must be in the peer list to be able to properly manage an options dialog is
 because otherwise the call pointer is lost when the peer is regenerated from
 the database, which is how existing qualify dialogs are detected.
 
 (closes issue ASTERISK-14500)
 (closes issue ASTERISK-16490)
 Reported by: lftsy
 Patches:
       bug16382-3.patch uploaded by jpeeler (license 325)
 Tested by: zerohalo
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294733

By: Digium Subversion (svnbot) 2010-11-11 15:58:26.000-0600

Repository: asterisk
Revision: 294734

_U  branches/1.8/
U   branches/1.8/channels/chan_sip.c

------------------------------------------------------------------------
r294734 | jpeeler | 2010-11-11 15:58:26 -0600 (Thu, 11 Nov 2010) | 32 lines

Merged revisions 294733 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

................
 r294733 | jpeeler | 2010-11-11 15:57:22 -0600 (Thu, 11 Nov 2010) | 25 lines
 
 Merged revisions 294688 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r294688 | jpeeler | 2010-11-11 15:12:27 -0600 (Thu, 11 Nov 2010) | 18 lines
   
   Fix problem with qualify option packets for realtime peers never stopping.
   
   The option packets not only never stopped, but if a realtime peer was not in
   the peer list multiple options dialogs could accumulate over time. This
   scenario has the potential to progress to the point of saturating a link just
   from options packets. The fix was to ensure that the poke scheduler checks to
   see if a peer is in the peer list before continuing to poke. The reason a peer
   must be in the peer list to be able to properly manage an options dialog is
   because otherwise the call pointer is lost when the peer is regenerated from
   the database, which is how existing qualify dialogs are detected.
   
   (closes issue ASTERISK-14500)
   (closes issue ASTERISK-16490)
   Reported by: lftsy
   Patches:
         bug16382-3.patch uploaded by jpeeler (license 325)
   Tested by: zerohalo
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294734

By: Digium Subversion (svnbot) 2010-11-11 16:01:03.000-0600

Repository: asterisk
Revision: 294735

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r294735 | jpeeler | 2010-11-11 16:01:03 -0600 (Thu, 11 Nov 2010) | 39 lines

Merged revisions 294734 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

................
 r294734 | jpeeler | 2010-11-11 15:58:25 -0600 (Thu, 11 Nov 2010) | 32 lines
 
 Merged revisions 294733 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.6.2
 
 ................
   r294733 | jpeeler | 2010-11-11 15:57:22 -0600 (Thu, 11 Nov 2010) | 25 lines
   
   Merged revisions 294688 via svnmerge from
   https://origsvn.digium.com/svn/asterisk/branches/1.4
   
   ........
     r294688 | jpeeler | 2010-11-11 15:12:27 -0600 (Thu, 11 Nov 2010) | 18 lines
     
     Fix problem with qualify option packets for realtime peers never stopping.
     
     The option packets not only never stopped, but if a realtime peer was not in
     the peer list multiple options dialogs could accumulate over time. This
     scenario has the potential to progress to the point of saturating a link just
     from options packets. The fix was to ensure that the poke scheduler checks to
     see if a peer is in the peer list before continuing to poke. The reason a peer
     must be in the peer list to be able to properly manage an options dialog is
     because otherwise the call pointer is lost when the peer is regenerated from
     the database, which is how existing qualify dialogs are detected.
     
     (closes issue ASTERISK-14500)
     (closes issue ASTERISK-16490)
     Reported by: lftsy
     Patches:
           bug16382-3.patch uploaded by jpeeler (license 325)
     Tested by: zerohalo
   ........
 ................
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294735