
Summary: ASTERISK-28615: chan_dahdi: PRI span status may stay "Down, Active" after a short alarm
Reporter: Frederic LE FOLL (flefoll)
Labels:
Date Opened: 2019-11-07 11:46:54.000-0600
Date Closed: 2019-11-21 08:48:17.000-0600
Priority: Minor
Regression?: No
Status: Closed/Complete
Components: Channels/chan_dahdi
Versions: 16.4.0
Frequency of Occurrence: Occasional
Related Issues:
Environment: Asterisk with Digium PRI card
Attachments:
( 0) CLI-ast-16.4-libpri-1.6.0.log
( 1) CLI-fix-ast-16.4-libpri-1.6.0.log
( 2) CLI-fix-ast-16.4-libpri-1.6.0-with-long-alarm.log
Description: The nominal status for an ISDN PRI span is "Up, Active" (CLI: pri show spans).
Upon a disconnection, the span status becomes "In Alarm, Down, Active".
If the disconnection is short enough, the span status then becomes, and remains,
"Down, Active".

Later, the status goes back to "Up, Active" if an incoming call is presented.

Whether the problem occurs seems to depend on the remote end.
It does happen with two Asterisk systems connected face-to-face.

Comments:
By: Asterisk Team (asteriskteam) 2019-11-07 11:46:54.822-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Frederic LE FOLL (flefoll) 2019-11-07 11:52:09.913-0600

Analysis:

This problem occurs because chan_dahdi (through sig_pri) has 2 status bits:
- DCHAN_NOTINALARM
- DCHAN_UP

sig_pri updates these bits depending on alarms and events that come from libpri:
- it clears both bits when it receives an Alarm,
- it also clears DCHAN_UP when it receives PRI_EVENT_DCHAN_DOWN event,
- it sets DCHAN_UP when it receives any event except PRI_EVENT_DCHAN_DOWN (including PRI_EVENT_DCHAN_UP, but also PRI_EVENT_RING that notifies a new call).
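
The update rules listed above can be modelled in a few lines of C. This is only a minimal sketch, not the sig_pri source: the bit values, the `model_event` enum and the `model_update()` helper are hypothetical names chosen for illustration. The rule that an end of alarm sets DCHAN_NOTINALARM again is inferred from the status strings in the description.

```c
/* Assumed bit values, for illustration only. */
#define DCHAN_NOTINALARM (1 << 0)
#define DCHAN_UP         (1 << 1)

/* Hypothetical event set standing in for alarms and libpri events. */
enum model_event {
	MODEL_ALARM,           /* alarm reported on the span */
	MODEL_NO_ALARM,        /* end of alarm */
	MODEL_DCHAN_DOWN,      /* PRI_EVENT_DCHAN_DOWN */
	MODEL_OTHER_PRI_EVENT  /* any other libpri event: PRI_EVENT_DCHAN_UP, PRI_EVENT_RING, ... */
};

/* Apply one event to the status bits, following the rules described above. */
static int model_update(int status, enum model_event ev)
{
	switch (ev) {
	case MODEL_ALARM:
		/* An alarm clears both bits. */
		return status & ~(DCHAN_NOTINALARM | DCHAN_UP);
	case MODEL_NO_ALARM:
		/* End of alarm only restores DCHAN_NOTINALARM (inferred). */
		return status | DCHAN_NOTINALARM;
	case MODEL_DCHAN_DOWN:
		return status & ~DCHAN_UP;
	case MODEL_OTHER_PRI_EVENT:
		return status | DCHAN_UP;
	}
	return status;
}
```

In this model, an alarm followed by an end of alarm leaves DCHAN_UP cleared until some other libpri event arrives.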

Due to Q.921 timers and repetitions, as implemented in libpri, libpri may not break layer 2 for every link disconnection, depending on disconnection duration and remote end behavior.
This can be observed with "pri set debug intense".

In particular, with two Asterisk systems face-to-face, layer 2 remains up through a disconnection that is short relative to T200 x N200.
As a consequence:
1) disconnection: libpri will notify an Alarm (but no PRI_EVENT_DCHAN_DOWN) => sig_pri clears DCHAN_UP.
2) reconnection: libpri will notify an End Of Alarm (but no PRI_EVENT_DCHAN_UP) => DCHAN_UP remains cleared.
Later:
3) Activity in D-Channel (SETUP for an incoming call) => sig_pri sets DCHAN_UP.
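
Steps 1) to 3) can be replayed against a small model that renders the two bits the way the CLI words them in this report. This is a sketch under stated assumptions: the bit values are assumed, `model_status_string()` and `model_short_alarm()` are hypothetical helpers rather than sig_pri functions, and the span is assumed to be the active one.

```c
#include <stdio.h>

/* Assumed bit values, for illustration only. */
#define DCHAN_NOTINALARM (1 << 0)
#define DCHAN_UP         (1 << 1)

/* Render the bits as the status strings quoted in this report. */
static void model_status_string(char *buf, size_t len, int status)
{
	snprintf(buf, len, "%s%s, Active",
		(status & DCHAN_NOTINALARM) ? "" : "In Alarm, ",
		(status & DCHAN_UP) ? "Up" : "Down");
}

/* Replay a short alarm and record the status string after each step. */
static void model_short_alarm(char out[4][32])
{
	int status = DCHAN_NOTINALARM | DCHAN_UP;

	model_status_string(out[0], 32, status);  /* nominal state */

	status &= ~(DCHAN_NOTINALARM | DCHAN_UP); /* 1) alarm, no PRI_EVENT_DCHAN_DOWN */
	model_status_string(out[1], 32, status);

	status |= DCHAN_NOTINALARM;               /* 2) end of alarm, no PRI_EVENT_DCHAN_UP */
	model_status_string(out[2], 32, status);

	status |= DCHAN_UP;                       /* 3) D-Channel activity (SETUP) */
	model_status_string(out[3], 32, status);
}
```

The four recorded strings are "Up, Active", "In Alarm, Down, Active", "Down, Active" and "Up, Active", which matches the sequence observed with "pri show spans".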

Going by the names of the status bits, DCHAN_NOTINALARM should reflect whether an alarm is present, while DCHAN_UP should reflect only the D-Channel status.
So it would seem better if alarms affected only the DCHAN_NOTINALARM bit, while PRI_EVENT_DCHAN_DOWN and PRI_EVENT_DCHAN_UP (or any PRI_EVENT_... event showing that the D-Channel is indeed 'up') affected DCHAN_UP.
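
The difference between the two alarm policies can be sketched as follows. This is an illustration of the idea only, not the patch submitted for this issue: the bit values are assumed and the three `model_alarm_*()` helpers are hypothetical.

```c
/* Assumed bit values, for illustration only. */
#define DCHAN_NOTINALARM (1 << 0)
#define DCHAN_UP         (1 << 1)

/* Current behaviour as described in this analysis: an alarm clears both bits. */
static int model_alarm_current(int status)
{
	return status & ~(DCHAN_NOTINALARM | DCHAN_UP);
}

/* Suggested behaviour: an alarm only clears DCHAN_NOTINALARM and leaves
 * DCHAN_UP to be driven by libpri D-Channel events. */
static int model_alarm_proposed(int status)
{
	return status & ~DCHAN_NOTINALARM;
}

/* End of alarm: only DCHAN_NOTINALARM is restored. */
static int model_alarm_end(int status)
{
	return status | DCHAN_NOTINALARM;
}
```

Starting from both bits set, a short alarm handled with `model_alarm_current()` ends with DCHAN_UP cleared, whereas with `model_alarm_proposed()` both bits are set again once the alarm ends.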

Because of the current alarm processing in sig_pri, chan_dahdi/sig_pri and libpri end up with different views of the D-Channel status after a short alarm that does not break the Q.921 layer:
- D-Channel is still 'up' for libpri,
- D-Channel is 'down' for chan_dahdi/sig_pri.

This is not new: sig_pri has implemented this DCHAN_UP processing since Asterisk 1.0.0.
Maybe it was necessary with former versions of libpri, but it does not seem as pertinent with current libpri (libpri 1.6.0 will send a PRI_EVENT_DCHAN_DOWN if the disconnection persists, or if the remote end resets the link upon reconnection).

Changing the sig_pri pri_event_alarm() function so that it modifies DCHAN_NOTINALARM only affects the following functions:
- sig_pri_ami_show_spans(),
- sig_pri_cli_show_spans()/sig_pri_cli_show_span() through build_status().
Through DCHAN_AVAILABLE, it also affects pri_is_up() and pri_find_dchan(), thus restarts and idle channels processing in pri_dchannel() [at first sight when reading pri_dchannel()]. Here also, the change should allow sig_pri to handle spans according to their real D-Channel status, as determined by libpri.
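
To illustrate why DCHAN_AVAILABLE matters here, the sketch below models an "is the span up?" check over per-D-Channel status words. It rests on assumptions: DCHAN_AVAILABLE is taken to mean "both bits set", and `MODEL_NUM_DCHANS` and `model_span_is_up()` are hypothetical stand-ins, not the actual pri_is_up() code.

```c
/* Assumed bit values, for illustration only. */
#define DCHAN_NOTINALARM (1 << 0)
#define DCHAN_UP         (1 << 1)
/* Assumption: a D-Channel is available only when both bits are set. */
#define DCHAN_AVAILABLE  (DCHAN_NOTINALARM | DCHAN_UP)

/* Hypothetical number of D-Channels tracked per span. */
#define MODEL_NUM_DCHANS 4

/* Return 1 if at least one D-Channel of the span is fully available. */
static int model_span_is_up(const int dchanavail[MODEL_NUM_DCHANS])
{
	int x;

	for (x = 0; x < MODEL_NUM_DCHANS; x++) {
		if (dchanavail[x] == DCHAN_AVAILABLE) {
			return 1;
		}
	}
	return 0;
}
```

With such a check, a D-Channel left at DCHAN_NOTINALARM alone after a short alarm is reported as not up, even though libpri still considers layer 2 established.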

I would like to propose a change through Gerrit, in association with this issue.

By: Frederic LE FOLL (flefoll) 2019-11-07 12:16:01.296-0600

Logs:
- CLI-ast-16.4-libpri-1.6.0.log: native behaviour of Asterisk 16.4.0 + libpri 1.6.0 on a short span disconnection.
- CLI-fix-ast-16.4-libpri-1.6.0.log: same scenario, with pri_event_alarm() modified.
- CLI-fix-ast-16.4-libpri-1.6.0-with-long-alarm.log: longer disconnection that breaks the Q.921 layer, with pri_event_alarm() modified.

By: Friendly Automation (friendly-automation) 2019-11-21 08:48:18.706-0600

Change 13244 merged by Friendly Automation:
chan_dahdi: PRI span status may stay "Down, Active" after a short alarm

[https://gerrit.asterisk.org/c/asterisk/+/13244|https://gerrit.asterisk.org/c/asterisk/+/13244]

By: Friendly Automation (friendly-automation) 2019-11-21 09:25:16.478-0600

Change 13167 merged by George Joseph:
chan_dahdi: PRI span status may stay "Down, Active" after a short alarm

[https://gerrit.asterisk.org/c/asterisk/+/13167|https://gerrit.asterisk.org/c/asterisk/+/13167]

By: Friendly Automation (friendly-automation) 2019-11-21 09:26:34.416-0600

Change 13246 merged by Friendly Automation:
chan_dahdi: PRI span status may stay "Down, Active" after a short alarm

[https://gerrit.asterisk.org/c/asterisk/+/13246|https://gerrit.asterisk.org/c/asterisk/+/13246]

By: Friendly Automation (friendly-automation) 2019-11-21 09:27:02.418-0600

Change 13245 merged by George Joseph:
chan_dahdi: PRI span status may stay "Down, Active" after a short alarm

[https://gerrit.asterisk.org/c/asterisk/+/13245|https://gerrit.asterisk.org/c/asterisk/+/13245]