Summary: ASTERISK-19608: Asterisk-1.8.x starts rejecting calls with cause code 44 after some time.
Reporter: Denis Alberto Martinez (dmartinez)
Labels:
Date Opened: 2012-03-30 03:49:07
Date Closed: 2013-01-16 15:39:57.000-0600
Priority: Major
Regression?
Status: Closed/Complete
Components: Channels/chan_dahdi
Versions:
Frequency of Occurrence:
Related Issues:
is related to ASTERISK-00133 Send subsequent RESTART message(s) after T316 expiry and no RESTART ACKNOWLEDGE
is related to ASTERISK-25034 chan_dahdi: Some telco switches occasionally ignore ISDN RESTART requests.
Environment: Asterisk-1.8.x LibPRI-1.4.12 DAHDI-2.6.X
Attachments:
Description:

h3. Symptoms observed by the user

- After some time in production, Asterisk starts rejecting calls on specific channels with ISDN cause code 44 (requested circuit/channel not available).

h3. Problem Description

In certain situations, such as extremely high call volume on crowded channels, a call collision can occur: the telco rejects an outgoing call (from Asterisk) because it expects to use the same channel for another call (to Asterisk). When this occurs, Asterisk invokes the Q.931 restart procedure, but the telco does not always respond to every RESTART message with a RESTART ACKNOWLEDGE.

Under these circumstances, Asterisk locks the channel and does not accept any new calls on it until the channel restart is completed. Our interpretation of the Q.931 specification (section 5.5.1) is that Asterisk's rejection of incoming calls on that channel can be considered expected behavior:

{quote}
h5. 5.5.1 Sending RESTART message

...
_If a RESTART ACKNOWLEDGE message is not received prior to the expiry of timer T316, one or more subsequent RESTART messages may be sent until a RESTART ACKNOWLEDGE message is returned. *Meanwhile, no calls shall be placed or accepted over the channel or interface by the originator of the RESTART message*. A network shall limit the number of consecutive unsuccessful restart attempts to a default limit of two. When this limit is reached, the network shall make no further restart attempts. An indication will be provided to the appropriate maintenance entity. The channel or interface is considered to be in an out-of-service condition until maintenance action has been taken._
...
{quote}
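
To make the quoted procedure easier to follow, here is a minimal Python sketch of the sending side. It is only an illustration and not how libpri or chan_dahdi implement it: the class name, method names and state names are hypothetical, and the limit of two attempts is the default the quoted text gives for a network, which the user side may handle differently.

```python
from dataclasses import dataclass

# Hypothetical model of the sender side of the Q.931 restart procedure (5.5.1).
# The limit of two attempts comes from the quoted text; the names are made up.
MAX_RESTART_ATTEMPTS = 2


@dataclass
class ChannelRestart:
    attempts: int = 0
    state: str = "idle"  # idle -> restarting -> idle or out_of_service

    def send_restart(self) -> None:
        """Send the first RESTART; timer T316 would be started here."""
        self.attempts = 1
        self.state = "restarting"

    def on_restart_ack(self) -> None:
        """RESTART ACKNOWLEDGE received: the channel is usable again."""
        if self.state == "restarting":
            self.state = "idle"

    def on_t316_expiry(self) -> None:
        """T316 expired without a RESTART ACKNOWLEDGE."""
        if self.state != "restarting":
            return
        if self.attempts < MAX_RESTART_ATTEMPTS:
            self.attempts += 1  # send RESTART again and restart T316
        else:
            self.state = "out_of_service"  # waits for maintenance action

    def can_carry_calls(self) -> bool:
        """No calls are placed or accepted while a restart is pending."""
        return self.state == "idle"
```

In this model a channel whose RESTART is never acknowledged stays unusable, which matches the symptom described in this issue: calls offered on that channel are rejected.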

In previous versions of the software (such as Asterisk-1.4.X), Asterisk tolerated this behavior, but recent changes and code improvements made it unacceptable.


h3. How to Determine if this issue is affecting your system

*1. Make sure that logging in Asterisk is enabled.*

Edit /etc/asterisk/logger.conf and uncomment the line beginning with "_;messages => ..._" (to uncomment a line, remove the leading ";" character). Change the same line so that it reads as follows, then save the file:

"_messages => notice,warning,error,verbose_"

Example:
{noformat}


;debug => debug
console => notice,warning,error
;console => notice,warning,error,debug
messages => notice,warning,error,verbose
;full => notice,warning,error,debug,verbose,dtmf,fax

{noformat}
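
If you prefer to check this from a script, the following Python sketch looks for an uncommented _messages_ line that contains the four levels listed above. The function name is hypothetical and the parsing is deliberately simple: it only covers the plain `name => level,level` form shown in the example.

```python
def messages_logging_enabled(conf_text: str) -> bool:
    """Return True if an uncommented 'messages =>' line has all four levels."""
    wanted = {"notice", "warning", "error", "verbose"}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or commented-out entry
        name, sep, levels = line.partition("=>")
        if sep and name.strip() == "messages":
            levels = levels.split(";", 1)[0]  # drop a trailing comment
            present = {lvl.strip().lower() for lvl in levels.split(",")}
            return wanted <= present
    return False
```

For example, `messages_logging_enabled(open("/etc/asterisk/logger.conf").read())` should return True once the file has been edited as described.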

*2. Rotate the logs*

Execute the command _logger rotate_ in the Asterisk CLI:

{noformat}


*CLI> logger rotate

{noformat}

*3. Restart Asterisk and DAHDI.*

This ensures that every channel in Asterisk starts in a working state.


*4. Make sure that messages output is enabled in Asterisk*

Log into the Asterisk CLI, run the command _logger show channels_, and make sure that the _messages_ file is listed as enabled:

{noformat}


*CLI> logger show channels
Channel                             Type     Status    Configuration
-------                             ----     ------    -------------
/var/log/asterisk/messages          File     Enabled    - NOTICE WARNING ERROR VERBOSE
                                   Console  Enabled    - NOTICE WARNING ERROR  

{noformat}

*5. Enable PRI Debug*

To enable PRI debugging on a specific span, run the CLI command: _pri set debug on span <SPAN NUMBER>_

Example
{noformat}


*CLI> pri set debug on span 1
Enabled debugging on span 1

{noformat}

*6. Start normal operation and watch whether any channel gets locked*

The following Linux command shows which B channels are in a non-idle state, together with the number of active channels in Asterisk.

_#watch -n1 'asterisk -rx"pri show channels" | grep -i "PRI\|Span\|Yes  No" && asterisk -rx"core show channels" | grep -i "active channels"'_

Example

{noformat}
Every 1.0s: asterisk -rx"pri show channels" | grep -i "PRI\|Span\|Yes  No" && asterisk -rx"core show channels" | grep -i "active...  Fri Mar 30 03:21:02 2012

PRI  B    Chan Call       PRI  Channel
Span Chan Chan Idle Level      Call Name
  1   19 Yes  No   Idle       No
0 active channels
{noformat}
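
The same check can be scripted. The Python sketch below parses text in the layout of the sample output above and returns the B channels reported as not idle. The function name is hypothetical, and the column meanings (third column "is a B channel", fourth column "channel is idle") are inferred from the sample and from the "Yes  No" grep pattern, so treat them as an assumption.

```python
from typing import List, Tuple


def non_idle_b_channels(output: str) -> List[Tuple[int, int]]:
    """Return (span, channel) pairs for B channels reported as not idle."""
    stuck = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with two integers: span and channel number.
        if len(fields) < 4:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        is_b_channel, is_idle = fields[2], fields[3]
        if is_b_channel == "Yes" and is_idle == "No":
            stuck.append((int(fields[0]), int(fields[1])))
    return stuck
```

The input can be the captured output of `asterisk -rx "pri show channels"`. A channel that is carrying a call is also not idle, so a result only points at a locked channel when Asterisk reports 0 active channels at the same time.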


*7. Debugging the data.*  

If Asterisk shows 0 active channels and one of the B channels is in non-idle state, it means that Asterisk is unable to receive calls on that channel.

7.1 Get a copy of the Asterisk logs (/var/log/asterisk/messages)
7.2 Look for the last time that Asterisk sent a SETUP message to the telco
7.3 Using the call reference in the SETUP message track the PRI messages between the telco and Asterisk
7.4 Determine if telco rejected the call and if Asterisk sent a RESTART (after the call rejection)
7.5 Check if the telco responded to the RESTART request.
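
Steps 7.4 and 7.5 can be partly automated. The Python sketch below is a rough aid, not a full Q.931 decoder. It ASSUMES that each message in the PRI debug output includes a line of the form `> Message Type: RESTART (70)`, where ">" marks a message sent by Asterisk and "<" a message received from the telco. Please verify this against your own log before relying on it. The function names are hypothetical, and the code does not separate channels or call references, so feed it a log excerpt already narrowed down as in step 7.3.

```python
import re
from typing import Iterable, List, Tuple

# ASSUMPTION: PRI debug output contains lines such as
#   "> Message Type: RESTART (70)"   (">" sent by Asterisk, "<" received)
MESSAGE_TYPE = re.compile(r"([<>]) Message Type: ([A-Z ]+?) \(\d+\)")


def q931_messages(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Extract (direction, message type) pairs in log order."""
    found = []
    for line in lines:
        match = MESSAGE_TYPE.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def unanswered_restarts(messages: List[Tuple[str, str]]) -> int:
    """Count sent RESTARTs since the last received RESTART ACKNOWLEDGE."""
    pending = 0
    for direction, msg_type in messages:
        if direction == ">" and msg_type == "RESTART":
            pending += 1
        elif direction == "<" and msg_type == "RESTART ACKNOWLEDGE":
            pending = 0  # the telco answered; nothing outstanding
    return pending
```

A non-zero result at the end of the excerpt suggests that the telco never acknowledged the last RESTART, which is the data to show to the telco.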



h3. Working around this issue / Solving the issue

We believe that the problem is caused by incorrect behavior of the telco's switches. Users experiencing this issue should therefore contact their telco and show them the collected data. The telco should be able to determine why their switches ignored the restart request.

To work around or alleviate this issue, we suggest:

1- Contact your telco and find out in which order their switches send calls to Asterisk. Call collisions can be minimized by allocating channels in the opposite order from the network. If the network picks lower channels first for incoming calls, then outgoing calls should pick higher channels first.

2- Increase the number of channels on your system.

3- Restart Asterisk at regular intervals (for example, once a day).
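
The idea behind the first suggestion can be sketched in a few lines of Python. This is only an illustration of the selection rule, with hypothetical names. It is not code from chan_dahdi.

```python
from typing import Iterable, Optional


def pick_outgoing_channel(idle_channels: Iterable[int],
                          network_picks_lowest_first: bool) -> Optional[int]:
    """Pick the idle channel the network is least likely to choose next."""
    idle = sorted(idle_channels)
    if not idle:
        return None  # nothing free for an outgoing call
    # Start from the opposite end of the range to the one the network uses.
    return idle[-1] if network_picks_lowest_first else idle[0]
```

With idle channels 1, 3 and 19 and a network that fills lower channels first, this picks channel 19, so both sides only meet on the same channel when the span is nearly full. In chan_dahdi the search direction is normally chosen in the dial string: to my knowledge `DAHDI/G1` searches group 1 from the highest channel down and `DAHDI/g1` from the lowest up, but please check this against your own configuration.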


{color:red}
*If you are a Digium customer with an in-warranty card, please contact Digium Tech Support or open a case using our web portal. We will do our best to provide assistance.*
{color}
Comments:

By: Richard Mudgett (rmudgett) 2012-04-25 14:58:50.551-0500

The patch associated with this issue is committed.

By: Denis Alberto Martinez (dmartinez) 2012-05-23 13:36:13.731-0500

h4. UPDATE

Digium's Engineering Department has confirmed that the patch for this issue was added in:

- Asterisk-1.8.13.0-rc1
- Asterisk-10.5.0-rc1

If you are experiencing issues that seem related to this one, please upgrade your Asterisk version as soon as possible.

By: David Lewis (dlewis7444) 2012-12-04 01:16:58.627-0600

Any idea when this will be released for the Certified Asterisk branch?

By: Richard Mudgett (rmudgett) 2012-12-04 10:03:32.141-0600

It is already in the v1.8.15-cert versions.

By: Matt Jordan (mjordan) 2013-01-16 15:39:57.212-0600

This issue is nice to have in the issue tracker as it documents clearly the symptoms of a particular problem, but at this point in time this issue should be resolved in all supported versions of Asterisk.

As such, I'm going to go ahead and close this out.

By: Denis Alberto Martinez (dmartinez) 2013-01-16 15:43:27.874-0600

It's important to mention that Asterisk 1.8.13.0 and 10.5.0 contain the fix for this issue. As always, if you are setting up an Asterisk server, please make sure to install the latest released versions of Asterisk, DAHDI and libPRI.