
Summary: ASTERISK-21916: Call hangs when FILTER function is used in dial plan
Reporter: Elizabeth Hudnott (ElizabethHudnott)
Labels:
Date Opened: 2013-06-16 15:49:51
Date Closed: 2013-09-03 18:35:53
Priority: Critical
Regression?:
Status: Closed/Complete
Components: Functions/func_strings
Versions: 11.4.0
Frequency of Occurrence: Constant
Related Issues:
Environment: Latest binaries of Asterisk 11.4.0 from Optware repository running on an Asus RT-n66u (Tomato USB based router, MIPS 74K V4.9) with firmware 3.0.0.4.270.25 (rmerlin variant) and Linux 2.6.22.19, installed on an ext3 partition on external HD
Attachments: (0) debug.log
Description: Whenever the FILTER function is used in the dial plan, the call hangs.  No further dial plan steps are executed, the ringing tone ceases, CPU utilization increases sharply, and the call remains open until Asterisk is restarted.

Simple dial plan extract to reproduce the problem:
{noformat}
exten=>**4,1,Verbose(${FILTER(0-9,123)})
same=>n,Hangup()
{noformat}

This behaviour occurs consistently every time the extension is dialled from the CSipSimple softphone.

Comments:

By: Elizabeth Hudnott (ElizabethHudnott) 2013-06-16 15:52:09.506-0500

Asterisk log file, debug level 5, verbosity level 5, all log facilities

By: Michael L. Young (elguero) 2013-06-16 21:24:12.918-0500

I tried to reproduce using the dial plan sample attached to this issue.  I was unable to reproduce the issue on 11.3 and on trunk.

I do not see the filter function being run in the debug logs.  Perhaps the levels need to be raised to 10 to get any information that could be of use.

If a deadlock is involved, we need the following.

Debugging deadlocks: Please select DEBUG_THREADS and DONT_OPTIMIZE in the Compiler Flags section of menuselect. Recompile and install Asterisk (i.e. make install).  This will then give you the console command "core show locks." When the symptoms of the deadlock present themselves again, please provide output of the deadlock via:

# asterisk -rx "core show locks" | tee /tmp/core-show-locks.txt
# gdb -se "asterisk" <pid of asterisk> | tee /tmp/backtrace.txt
gdb> bt
gdb> bt full
gdb> thread apply all bt

Then attach the core-show-locks.txt and backtrace.txt files to this issue. Thanks!



By: Walter Doekes (wdoekes) 2013-06-17 06:23:54.735-0500

{noformat}
[Jun 16 19:44:29] VERBOSE[13295][C-00000003] pbx.c: [Jun 16 19:44:29]     -- Executing [**4@admin:1] Goto("SIP/lizzy-00000001", "dialing,**4,1") in new stack
[Jun 16 19:44:29] VERBOSE[13295][C-00000003] pbx.c: [Jun 16 19:44:29]     -- Goto (dialing,**4,1)
[Jun 16 19:44:29] DEBUG[13295][C-00000003] func_strings.c: c1=48, c2=57
{noformat}

The FILTER func is being run:
{noformat}
static int filter(struct ast_channel *chan, const char *cmd, char *parse, char *buf,
                 size_t len)
...
                       ast_debug(4, "c1=%d, c2=%d\n", c1, c2);
{noformat}

So it runs through the loop, and on the next run, it stalls?

For it to loop without the next debug message ("Allowed..."), it would have to stall either here:
{noformat}
               if (ast_get_encoded_char(args.allowed, &c1, &consumed))
                       return -1;
{noformat}
here:
{noformat}
                       for (ac = (unsigned char) c1; ac != (unsigned char) c2; ac++) {
                               bitfield[ac / 32] |= 1 << (ac % 32);
                       }
{noformat}
or here:
{noformat}
       for (ac = 1; ac != 0; ac++) {
               if (bitfield[ac / 32] & (1 << (ac % 32))) {
                       allowed[allowedlen++] = ac;
               }
       }
{noformat}

Get the backtrace (thread apply all bt [full]) when it's running 100% cpu and we'll know which one it is.

By: Michael L. Young (elguero) 2013-06-17 10:09:00.280-0500

Thanks Walter.  I missed that one debug line while looking for the other.

By: Elizabeth Hudnott (ElizabethHudnott) 2013-06-19 21:30:04.074-0500

Hi,

Thank you for your responses.  I'm a newbie.  I managed to compile from source using the options you said, but in menuselect SIP was showing as XXX, so I guess my system's missing something needed to make SIP work when compiling from source.  All of my existing devices are SIP.  I tried to configure Zoiper as an IAX client but I've hit some sort of connection issue.  iptables is logging incoming packets that are accepted on port 4569 but "iax set debug on" displays nothing. Zoiper and the router's LAN IP are on the same side of the NAT.  I tried fiddling with bindaddr to force asterisk to only listen on the LAN IP but no luck.  I know this isn't a support forum but am a bit stuck on how to obtain the information that you require.

By: Walter Doekes (wdoekes) 2013-06-20 02:55:05.994-0500

In menuselect, when hovering over chan_sip, at the bottom you should see what dependencies you are missing.

You could take a look at:
{noformat}
contrib/scripts/install_prereq
{noformat}
That file will show common dev packages you require on common distros. Running it will probably fail on your Optware.

----

You're running on (relatively) uncommon hardware. That might explain issues in one of the loops above.

Can you confirm that sizeof(char) == 1 and whether the char is signed or unsigned by default?

{noformat}
int main() {
 return sizeof(char);
}

gcc the_above.c -o the_above; ./the_above; echo $?
{noformat}

and:

{noformat}
int main() {
       char b = 127;
       b += 1;
       if (b == 128) {
               return 1;
       }
       return 0;
}

gcc the_above.c -o the_above; ./the_above && echo signed || echo unsigned
{noformat}

By: Elizabeth Hudnott (ElizabethHudnott) 2013-06-21 10:55:17.702-0500

Thank you for your suggestions.

sizeof(char) = 1 and it's unsigned

install_prereq just gives an empty list of dependencies.

In menuselect the list of dependencies for chan_sip disappears underneath the "Save & Exit" "button" so I can't read it.


By: Matt Jordan (mjordan) 2013-07-07 20:07:15.487-0500

I think we're going to need to get a backtrace from Asterisk when this happens. That will point out exactly what is causing the issue.

Instructions for getting a backtrace from a running instance of Asterisk are on the wiki here:

[https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace]

Note that if {{chan_sip}} isn't loading, you're probably missing OpenSSL, as {{res_crypto}} depends on it. Install the development libraries for OpenSSL and see if that resolves the missing dependencies.


By: Elizabeth Hudnott (ElizabethHudnott) 2013-07-08 13:11:13.881-0500

The openssl-dev package was already installed.  I now realize that I could potentially trigger the bug using a call file if I can't get any further with getting SIP working from source, so probably not a problem anymore.

I started trying to follow the back trace instructions but hit upon another problem though, outside of Asterisk:

gdb -ex "thread apply all bt" --batch /mnt/sda2/usr/devel/sbin/asterisk `pidof asterisk` >/mnt/sda2/tmp/backtrace-threads.txt

Excess command line arguments ignored. (29221 ...)

warning: process 29222 is a cloned process

warning: Cannot initialize thread debugging library: generic error

(The line above is repeated several times)

Presumably that means GDB is foo-bared?  Or can I carry on regardless?  I found some discussion threads on the web with other people having problems debugging threads when GDB is built using uClibc, which I think might be the case on this platform.  Unfortunately, I couldn't find a solution.

By: Rusty Newton (rnewton) 2013-07-29 14:46:39.287-0500

Not sure what's happening with your GDB.

In the command you pasted:
{noformat}
gdb -ex "thread apply all bt" --batch /mnt/sda2/usr/devel/sbin/asterisk `pidof asterisk` >/mnt/sda2/tmp/backtrace-threads.txt
{noformat}
You don't have a space between the ">" and the final path.

Also, did you check what `pidof asterisk` evaluates to? The errors almost make it sound like there were two Asterisk processes running.

We'll leave this in feedback longer for you to obtain a backtrace. Let us know if you can't get any trace.

By: Elizabeth Hudnott (ElizabethHudnott) 2013-07-29 15:54:04.325-0500

Not sure why I'd need a space after the >.  I don't usually put one there when I redirect output in Linux and have never had any problems before.

pidof asterisk gives a list of numbers, e.g. "31272 31271"...  Yes, there are 23 processes on my system named asterisk, all created from the same "asterisk" command invocation (without the quotes).  I assumed all bar one of them were just forked background processes created by the main one?

I don't know if this helps, but I'll mention anyway that I don't think there's any timer available on this system.  (I don't know why a simple filter function would need a timer or indeed any non-local resources but...).  I now remember that when I first installed the system I had to noload res_timing_pthread because initially the asterisk process(es) were dying every time a SIP call connected.  The system's been rebooted since then.  (Is there a bug report about that somewhere?  I don't see anything obvious.  Other people in blogs and forums have described the same issue in their router firmware environments.)

By: Rusty Newton (rnewton) 2013-07-29 17:46:02.556-0500

{quote}
pidof asterisk gives a list of numbers, e.g. "31272 31271"... Yes, there are 23 processes on my system named asterisk, all created from the same "asterisk" command invocation (without the quotes). I assumed all bar one of them were just forked background processes created by the main one?
{quote}

I'm used to seeing a single process. On rare occasions I've seen multiple Asterisk processes, but never had the need to run GDB on those systems. I don't know much about how the Tomato environment is set up. It sounds like it may be using something like http://en.wikipedia.org/wiki/LinuxThreads, resulting in each Asterisk thread having its own process.

AFAIK you can only run GDB on a single process and that process should likely be the primary one.

{quote}
I don't know if this helps, but I'll mention anyway that I don't think there's any timer available on this system.
{quote}

That shouldn't affect the FILTER function in any way that I can think of.

By: Rusty Newton (rnewton) 2013-08-14 16:05:50.197-0500

To make it clear we'll need the backtrace as Walter requested to move forward with this:

"Get the backtrace (thread apply all bt [full]) when it's running 100% cpu and we'll know which one it is."

Otherwise I'm not sure what we can do, as we can't reproduce.

By: Elizabeth Hudnott (ElizabethHudnott) 2013-08-14 17:36:52.872-0500

"To make it clear we'll need the backtrace as Walter requested to move forward with this"

I'd love to provide one.  I followed the instructions on the Asterisk wiki but am not getting anywhere.  Even running a simple gdb <asterisk binary including path> <lowest asterisk pid> just produces the error "warning: Cannot initialize thread debugging library: generic error".  Googling suggests the library in question is libthread_db, which according to both the package manager (optware) and ls is installed correctly, but for some reason gdb nonetheless isn't working.

I don't know what else to try.

By: Walter Doekes (wdoekes) 2013-08-15 02:00:35.866-0500

What you could do is sprinkle the function in question with debug statements.

By: Walter Doekes (wdoekes) 2013-08-15 02:05:05.701-0500

.. but that assumes that you're able to compile asterisk at all. Which you apparently weren't.

If you can't get it to compile at all, then I agree that we're stuck. I'd advise closing the ticket until you're able to compile asterisk from source. If you need help compiling, you could try the IRC channel #asterisk on freenode.

By: Elizabeth Hudnott (ElizabethHudnott) 2013-08-15 03:23:49.319-0500

Asterisk compiles on my system, just with important functionality missing, like SIP.  I shall re-examine things when I can find some free time, particularly with regard to seeing if I can insert some debug statements and then trigger a call using a call file on the binary compiled from source.

By: Rusty Newton (rnewton) 2013-09-03 18:35:34.837-0500

Elizabeth, I'll go ahead and suspend the issue since we don't have anything to work with at this time. Feel free to comment on it again when you have additional information and we can look at opening it again. You can always ping a bug marshal on #asterisk-bugs at irc.freenode.net if you have questions.