Summary: ASTERISK-22541: Very high CPU usage on Asterisk 11.5.1
Reporter: JoshE (n8ideas)
Labels:
Date Opened: 2013-09-16 11:24:45
Date Closed: 2013-09-17 12:57:46
Priority: Major
Regression?
Status: Closed/Complete
Components:
Versions: 11.5.1
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description: Not sure exactly of the best way to reproduce this, but I upgraded Asterisk from 11.4.0 to 11.5.1 and immediately saw a 10x spike in CPU usage, with no other changes to the system itself.

System has around 800 peers and 30-50 concurrent bridged calls.  Under even moderate load, the 11.5.1 system spiked to 2 full cores, whereas on 11.4.0 it was using about 20% of one core.

Reverting the system caused everything to go back to normal.
Comments:

By: Matt Jordan (mjordan) 2013-09-16 21:53:11.058-0500

This is sort of like asking people to find a needle in a stack of needles while blindfolded. You'll need to provide a lot more information. Some obvious standard things that would help:
1) What channel technologies were involved
2) How were they bridged (native - local or remote - or core)
3) Is recording being used in any fashion

Please attach the relevant portions of your dialplan, as well as channel driver configurations. Any log information during the loaded period would also be useful.

You could also use gdb to dump out a backtrace of the threads during the loaded periods. That would tell us what the system is actually doing during that period of time.
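
As a rough illustration of the gdb suggestion above, here is a minimal Python sketch that attaches gdb to a running process in batch mode and saves a backtrace of every thread. The helper names (`build_backtrace_command`, `dump_thread_backtraces`) and the output path are hypothetical, and it assumes gdb is installed and that you have permission to attach to the process. Attaching briefly pauses the process, so it is worth trying on a test system first.

```python
import subprocess


def build_backtrace_command(pid):
    """Build a gdb command line that prints a backtrace for every thread.

    -p attaches to the running process, -batch exits once the commands
    have run, and -ex supplies the gdb command to execute.
    """
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_thread_backtraces(pid, out_path):
    """Run gdb against the process and write its output to out_path.

    Hypothetical helper: requires gdb on PATH and enough privileges
    to attach to the target process.
    """
    result = subprocess.run(
        build_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=True,
    )
    with open(out_path, "w") as handle:
        handle.write(result.stdout)
    return out_path
```

Calling `dump_thread_backtraces(pid, "/tmp/backtrace-threads.txt")` with the Asterisk process ID during the loaded period would produce a file suitable for attaching to the issue. Taking two or three dumps a few seconds apart makes it easier to see which threads stay busy.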

By: JoshE (n8ideas) 2013-09-17 08:57:15.789-0500

Just a couple quick comments.  Not sure I'll be able to repro in a production scenario very quickly, but I'm looking at ways to load test this.  The easier questions to answer:

1) 100% SIP.  No other channel drivers are being used - most are not even compiled.

2) Everything is SIP-SIP calling and largely transcoded from G.729 on one leg to G.711u on the other.  No differences in functionality between the configurations on those two versions.

3) Recording is used on < 10% of calls.  System is definitely not disk IO bound at this point.

Working on the backtrace and logs from the loaded period - but at first glance, nothing was abnormal except for the high CPU usage and complaints about very poor call quality.

By: JoshE (n8ideas) 2013-09-17 12:25:22.134-0500

Matt-

Going to recommend to close this one.  I hadn't realized that we'd been compiling with Debug Threads on.  When I went back and looked at the make diff, that was the difference between the two versions.

I'm about 98% certain this is the culprit.  Hopefully this may help someone else out.
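
For anyone wanting to check their own build for the same thing, here is a minimal Python sketch. It assumes the build's compile-time options are recorded in a `menuselect.makeopts` file with the enabled compiler flags listed on a `MENUSELECT_CFLAGS=` line; that file layout and the function names are assumptions for illustration, not something confirmed in this issue.

```python
def enabled_cflags(makeopts_text):
    """Return the set of flags found on MENUSELECT_CFLAGS lines.

    Assumes a simple KEY=VALUE layout with space-separated flags.
    """
    flags = set()
    for line in makeopts_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "MENUSELECT_CFLAGS":
            flags.update(value.split())
    return flags


def debug_threads_enabled(makeopts_text):
    """True if DEBUG_THREADS appears among the enabled compiler flags."""
    return "DEBUG_THREADS" in enabled_cflags(makeopts_text)


# Hypothetical file contents, for illustration only.
sample_with = "MENUSELECT_CFLAGS=DEBUG_THREADS LOADABLE_MODULES\n"
sample_without = "MENUSELECT_CFLAGS=LOADABLE_MODULES\n"
print(debug_threads_enabled(sample_with))     # True
print(debug_threads_enabled(sample_without))  # False
```

Running the same check against the saved build options of both versions would show whether the flag differs between them, which is the comparison described above.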


By: Rusty Newton (rnewton) 2013-09-17 12:57:46.906-0500

That was probably it. Thanks!