Summary: ASTERISK-29782: app_queue: Queue members showing Unavailable for no reason
Reporter: Luke Escude (lukeescude)
Labels:
Date Opened: 2021-11-30 14:22:19.000-0600
Date Closed: 2022-01-19 09:17:53.000-0600
Priority: Minor
Regression?:
Status: Closed/Complete
Components: Applications/app_queue
Versions: 16.22.0
Frequency of Occurrence:
Related Issues:
duplicates ASTERISK-29658 app_queue: Multiple members in same queue with same state interface don't all reflect proper state
is related to ASTERISK-29806 app_queue: extension state incorrect
Environment:
Attachments:
Description:

Hello,

We're seeing instances where a customer may have the same members in 2 queues, but some will slowly start to show offline in the second queue over time.

Here is an example of queue show:

{code}
960 has 1 calls (max unlimited) in 'ringall' strategy (43s holdtime, 356s talktime), W:0, C:83, A:189, SL:56.6%, SL2:21.3% within 60s
  Members:
     Local/108@from-internal (ringinuse disabled) (paused was 21545 secs ago) (Not in use) has taken 4 calls (last was 22259 secs ago)
     Local/140@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/106@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/104@from-internal (ringinuse disabled) (paused was 9402 secs ago) (Not in use) has taken 9 calls (last was 10153 secs ago)
     Local/103@from-internal (ringinuse disabled) (paused was 6482 secs ago) (Not in use) has taken 5 calls (last was 12364 secs ago)
     Local/102@from-internal (ringinuse disabled) (paused was 78869 secs ago) (In use) has taken no calls yet
     Local/101@from-internal (ringinuse disabled) (paused was 78870 secs ago) (Unavailable) has taken no calls yet
     Local/165@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/130@from-internal (ringinuse disabled) (Ringing) has taken 31 calls (last was 5497 secs ago)
     Local/129@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/157@from-internal (ringinuse disabled) (paused was 78869 secs ago) (In use) has taken no calls yet
     Local/121@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/120@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Unavailable) has taken no calls yet
     Local/154@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/152@from-internal (ringinuse disabled) (Ringing) has taken 6 calls (last was 6797 secs ago)
     Local/151@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Unavailable) has taken no calls yet
     Local/150@from-internal (ringinuse disabled) (In use) has taken 15 calls (last was 41 secs ago)
     Local/114@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/148@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/111@from-internal (ringinuse disabled) (Ringing) has taken 13 calls (last was 50 secs ago)
     Local/110@from-internal (ringinuse disabled) (paused was 78869 secs ago) (In use) has taken no calls yet
  Callers:
     1. PJSIP/external-kamailio-0000489e (wait: 0:02, prio: 0)

970 has 1 calls (max unlimited) in 'ringall' strategy (49s holdtime, 45s talktime), W:0, C:32, A:128, SL:65.6%, SL2:20.0% within 60s
  Members:
     Local/108@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Ringing) has taken no calls yet
     Local/140@from-internal (ringinuse disabled) (Unavailable) has taken no calls yet
     Local/106@from-internal (ringinuse disabled) (Unavailable) has taken no calls yet
     Local/104@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/103@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/102@from-internal (ringinuse disabled) (Unavailable) has taken no calls yet
     Local/101@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Unavailable) has taken no calls yet
     Local/165@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/130@from-internal (ringinuse disabled) (In use) has taken no calls yet
     Local/129@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/157@from-internal (ringinuse disabled) (paused was 3956 secs ago) (Not in use) has taken 7 calls (last was 4630 secs ago)
     Local/121@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/120@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/154@from-internal (ringinuse disabled) (paused was 8348 secs ago) (In use) has taken no calls yet
     Local/152@from-internal (ringinuse disabled) (Not in use) has taken 6 calls (last was 6297 secs ago)
     Local/151@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Unavailable) has taken no calls yet
     Local/150@from-internal (ringinuse disabled) (In use) has taken no calls yet
     Local/114@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/148@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Not in use) has taken no calls yet
     Local/111@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Unavailable) has taken no calls yet
     Local/110@from-internal (ringinuse disabled) (Not in use) has taken 19 calls (last was 244 secs ago)
  Callers:
     1. PJSIP/external-kamailio-0000488f (wait: 0:12, prio: 0)

{code}

You can see Local/102@from-internal is perfectly fine in queue 960 (In use), but is considered Unavailable in queue 970 for some reason. The device is online, and the hint is registering just fine, so there's no reason for the second queue to think it's offline.
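
To make this kind of comparison less error-prone than reading the output by eye, here is a minimal Python sketch that parses `queue show` text and lists members that are Unavailable in one queue but not in another. It is not part of Asterisk: the function names are hypothetical, and the set of state strings is an assumption based on the output above plus the usual device state names.

```python
import re
from collections import defaultdict

# Assumed set of state strings; the first four appear in the output above.
DEVICE_STATES = {
    "Not in use", "In use", "Ringing", "Unavailable",
    "Busy", "Invalid", "Unknown", "On Hold", "Ring+Inuse",
}

QUEUE_HEADER = re.compile(r"^(\S+) has \d+ calls")
PAREN_GROUP = re.compile(r"\(([^)]*)\)")


def parse_queue_show(text):
    """Return {queue: {interface: (state, paused)}} from 'queue show' text."""
    queues = defaultdict(dict)
    current = None
    for line in text.splitlines():
        header = QUEUE_HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        if current is None or not line.startswith(" "):
            continue
        groups = PAREN_GROUP.findall(line)
        # Only member lines carry a parenthesised device state.
        state = next((g for g in groups if g in DEVICE_STATES), None)
        if state is None:
            continue
        interface = line.split()[0]
        paused = any(g.startswith("paused") for g in groups)
        queues[current][interface] = (state, paused)
    return dict(queues)


def inconsistent_unavailable(queues):
    """Members that are Unavailable in some queues but not in all of them."""
    by_member = defaultdict(dict)
    for queue, members in queues.items():
        for interface, (state, _paused) in members.items():
            by_member[interface][queue] = state
    flagged = {}
    for interface, states in by_member.items():
        values = set(states.values())
        if "Unavailable" in values and len(values) > 1:
            flagged[interface] = states
    return flagged


SAMPLE = """\
960 has 1 calls (max unlimited) in 'ringall' strategy (43s holdtime, 356s talktime), W:0, C:83, A:189, SL:56.6%, SL2:21.3% within 60s
  Members:
     Local/102@from-internal (ringinuse disabled) (paused was 78869 secs ago) (In use) has taken no calls yet
     Local/101@from-internal (ringinuse disabled) (paused was 78870 secs ago) (Unavailable) has taken no calls yet
  Callers:
     1. PJSIP/external-kamailio-0000489e (wait: 0:02, prio: 0)

970 has 1 calls (max unlimited) in 'ringall' strategy (49s holdtime, 45s talktime), W:0, C:32, A:128, SL:65.6%, SL2:20.0% within 60s
  Members:
     Local/102@from-internal (ringinuse disabled) (Unavailable) has taken no calls yet
     Local/101@from-internal (ringinuse disabled) (paused was 78869 secs ago) (Unavailable) has taken no calls yet
"""

if __name__ == "__main__":
    for member, states in inconsistent_unavailable(parse_queue_show(SAMPLE)).items():
        print(member, states)
```

On the two-queue excerpt in `SAMPLE`, only Local/102@from-internal is flagged. Local/101@from-internal is Unavailable in both queues, so it is treated as consistent. A flagged member is only a hint that the queues disagree, not proof of which state is correct.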
Comments:

By: Asterisk Team (asteriskteam) 2021-11-30 14:22:20.266-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Luke Escude (lukeescude) 2021-11-30 14:26:19.100-0600

It occurs to me that the initial (Unavailable) has to do with the login state (paused/unpaused) and the second (Unavailable) has to do with device status.

So, for some reason the queue is forgetting whether the agent is logged in or logged out. I didn't know there could even be a state other than paused or unpaused.

By: Joshua C. Colp (jcolp) 2021-11-30 14:59:59.678-0600

What is the actual queues configuration? Is there a reason you aren't actually specifying a device (or a hint) to use for monitoring if available, and relying on Local channel state instead? What does "logged in" mean? Can you also provide an actual core debug log? The app_queue module has debug logging showing when it reacts to events and changes state.

By: Luke Escude (lukeescude) 2021-11-30 15:05:27.826-0600

Queues.conf:

{code}
[general]
keepstats=yes
updatecdr=yes
setinterfacevar=yes
setqueueentryvar=yes
setqueuevar=yes
persistentmembers=yes

[960] ; Queue: 960 - Appointments
musicclass=5f4a57d24bd2d
setinterfacevar=yes
setqueueentryvar=yes
setqueuevar=yes
maxlen=0
strategy=ringall
autopause=no
wrapuptime=0
ringinuse=no
periodic-announce=https://greetingstorage/5f614476ecece.wav
periodic-announce-frequency=30
relative-periodic-announce=yes
servicelevel=60
timeoutrestart=no
timeout=0
retry=0
joinempty=paused,unavailable,invalid,unknown
leavewhenempty=paused,unavailable,invalid,unknown
announce-holdtime=yes

member => Local/101@from-internal,,,hint:101@from-internal
member => Local/102@from-internal,,,hint:102@from-internal
member => Local/103@from-internal,,,hint:103@from-internal
member => Local/104@from-internal,,,hint:104@from-internal
member => Local/106@from-internal,,,hint:106@from-internal
member => Local/108@from-internal,,,hint:108@from-internal
member => Local/110@from-internal,,,hint:110@from-internal
member => Local/111@from-internal,,,hint:111@from-internal
member => Local/114@from-internal,,,hint:114@from-internal
member => Local/120@from-internal,,,hint:120@from-internal
member => Local/121@from-internal,,,hint:121@from-internal
member => Local/129@from-internal,,,hint:129@from-internal
member => Local/130@from-internal,,,hint:130@from-internal
member => Local/140@from-internal,,,hint:140@from-internal
member => Local/148@from-internal,,,hint:148@from-internal
member => Local/150@from-internal,,,hint:150@from-internal
member => Local/151@from-internal,,,hint:151@from-internal
member => Local/152@from-internal,,,hint:152@from-internal
member => Local/154@from-internal,,,hint:154@from-internal
member => Local/157@from-internal,,,hint:157@from-internal
member => Local/165@from-internal,,,hint:165@from-internal
[970] ; Queue: 970 - Scheduling
musicclass=5f4a57d24bd2d
setinterfacevar=yes
setqueueentryvar=yes
setqueuevar=yes
maxlen=0
strategy=ringall
autopause=no
wrapuptime=0
ringinuse=no
periodic-announce=https://greetingstorage/5f614476ecece.wav
periodic-announce-frequency=30
relative-periodic-announce=yes
servicelevel=60
timeoutrestart=no
timeout=0
retry=0
joinempty=paused,unavailable,invalid,unknown
leavewhenempty=paused,unavailable,invalid,unknown
announce-holdtime=yes

member => Local/101@from-internal,,,hint:101@from-internal
member => Local/102@from-internal,,,hint:102@from-internal
member => Local/103@from-internal,,,hint:103@from-internal
member => Local/104@from-internal,,,hint:104@from-internal
member => Local/106@from-internal,,,hint:106@from-internal
member => Local/108@from-internal,,,hint:108@from-internal
member => Local/110@from-internal,,,hint:110@from-internal
member => Local/111@from-internal,,,hint:111@from-internal
member => Local/114@from-internal,,,hint:114@from-internal
member => Local/120@from-internal,,,hint:120@from-internal
member => Local/121@from-internal,,,hint:121@from-internal
member => Local/129@from-internal,,,hint:129@from-internal
member => Local/130@from-internal,,,hint:130@from-internal
member => Local/140@from-internal,,,hint:140@from-internal
member => Local/148@from-internal,,,hint:148@from-internal
member => Local/150@from-internal,,,hint:150@from-internal
member => Local/151@from-internal,,,hint:151@from-internal
member => Local/152@from-internal,,,hint:152@from-internal
member => Local/154@from-internal,,,hint:154@from-internal
member => Local/157@from-internal,,,hint:157@from-internal
member => Local/165@from-internal,,,hint:165@from-internal
{code}

So when I manually un-pause Local/102 in queue 970, it sets the state to Unavailable instead of Not In Use. When I un-pause it in queue 960, it works as intended.

Paused = Logged Out and Unpaused = Logged In; however, I will continue using the "paused" terminology.

We use Local channels because some of those agents might have extra dial plan rules, like call forwarding, or multiple devices ringing in tandem, etc.
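
Since both queues list the same members with the same hint, it can help to confirm mechanically which state interfaces are shared across queues. The following is a rough Python sketch, not anything shipped with Asterisk. The function names are hypothetical, and it assumes the `member =>` field order is interface, penalty, member name, state interface, as used in the configuration above.

```python
from collections import defaultdict


def parse_members(conf_text):
    """Return {queue: [member dict, ...]} from queues.conf-style text."""
    queues = {}
    section = None
    for raw in conf_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line:
            continue
        if line.startswith("[") and "]" in line:
            section = line[1:line.index("]")]
            continue
        if section is None or section == "general":
            continue
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "member":
            continue
        # 'member => a,b,c,d' leaves a leading '>' after splitting on '='.
        fields = [f.strip() for f in value.lstrip(">").split(",")]
        fields += [""] * (4 - len(fields))
        queues.setdefault(section, []).append({
            "interface": fields[0],
            "penalty": fields[1],
            "membername": fields[2],
            "state_interface": fields[3],
        })
    return queues


def shared_state_interfaces(queues):
    """Map each state interface used in more than one queue to those queues."""
    seen = defaultdict(list)
    for queue, members in queues.items():
        for member in members:
            # With no explicit state interface, fall back to the interface itself.
            key = member["state_interface"] or member["interface"]
            if queue not in seen[key]:
                seen[key].append(queue)
    return {k: v for k, v in seen.items() if len(v) > 1}


SAMPLE_CONF = """\
[general]
persistentmembers=yes

[960] ; Queue: 960 - Appointments
strategy=ringall
member => Local/101@from-internal,,,hint:101@from-internal
member => Local/102@from-internal,,,hint:102@from-internal
[970] ; Queue: 970 - Scheduling
strategy=ringall
member => Local/102@from-internal,,,hint:102@from-internal
"""

if __name__ == "__main__":
    for state_if, queue_names in shared_state_interfaces(parse_members(SAMPLE_CONF)).items():
        print(state_if, "is used in queues", ", ".join(queue_names))
```

In the full configuration above, every member would be reported, because both queues list the same 21 members with the same hints.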

By: Luke Escude (lukeescude) 2021-11-30 15:07:25.230-0600

Okay, here's how to fix it:

touch queues.conf
asterisk -rx "core reload"

So queues.conf has to be touched/modified in order for the state to go back to normal.

By: Joshua C. Colp (jcolp) 2021-11-30 15:48:09.689-0600

This sounds like it would be the same underlying issue as ASTERISK-29658, even if separate queues. It's also fixed the same way.

By: Joshua C. Colp (jcolp) 2021-11-30 15:49:29.717-0600

This still does need core debug though.

By: Asterisk Team (asteriskteam) 2021-12-15 12:00:01.318-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Luke Escude (lukeescude) 2022-01-19 09:01:25.791-0600

This issue is fixed by the patch in ASTERISK-29806.

We've been running it since it came out, and it solves the issue.

By: Asterisk Team (asteriskteam) 2022-01-19 09:01:26.031-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.