Summary: ASTERISK-26400: app_queue: Queue member stops being called after AMI "Redirect" action for queues with wrapuptime
Reporter: Etienne Lessard (hexanol)
Labels: regression
Date Opened: 2016-09-22 13:27:21
Date Closed: 2017-05-24 07:50:34
Priority: Major
Regression?: Yes
Status: Closed/Complete
Components: Applications/app_queue
Versions: 13.11.2
Frequency of Occurrence: Constant
Related Issues:
is duplicated by ASTERISK-26534 Queue member stuck in state Not is use and in call after channel redirect to another extension.
is duplicated by ASTERISK-26757 When a queue member transfers queue call, he remains marked as "in call"
is duplicated by ASTERISK-26975 app_queue: Non-zero wrapup time can cause agents not to receive queue calls after transfer queue call
is related to ASTERISK-26715 app_queue: Member will not receive any new calls after doing a transfer if wrapuptime = greater than 0 and using Local channel
is related to ASTERISK-26862 app_queue: Queue stops calling members with local interface after forwarding in previous call
Environment:
Attachments:
( 0) 0001-app_queue-Handle-the-caller-being-redirected-out-of-.patch
( 1) 13reviewboardtests.txt
Description:
Hello,

Given I have a queue with a *nonzero wrapuptime* and one queue member
And Alice calls this queue
And the queue member answers
When an AMI "Redirect" redirects Alice's channel to a different extension
Then the queue member won't receive any new call from the queue until Alice's channel is hung up
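
For anyone reproducing the "When" step, the redirect amounts to sending an AMI {{Redirect}} action for Alice's channel. The following is a minimal sketch of the message text only, assuming hypothetical channel, context, and extension values (none of them come from this report). It does not open a socket or log in to the manager interface.

```python
def build_redirect_action(channel, context, exten, priority=1, action_id=None):
    """Return the wire text of an AMI Redirect action.

    AMI actions are "Key: Value" lines separated by CRLF and terminated
    by an empty line.
    """
    headers = [
        ("Action", "Redirect"),
        ("Channel", channel),
        ("Context", context),
        ("Exten", exten),
        ("Priority", str(priority)),
    ]
    if action_id is not None:
        headers.append(("ActionID", action_id))
    return "".join(f"{key}: {value}\r\n" for key, value in headers) + "\r\n"


# Hypothetical values: Alice's channel name and the target dialplan location.
message = build_redirect_action("SIP/alice-00000001", "default", "1001")
print(message)
```

The resulting text would be written to an already authenticated AMI connection. After it is sent, "queue show" can be used to check whether the member is still marked "in call".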

Note that after the AMI Redirect, the queue member is available / not in use, but won't receive any new calls because the queue still thinks it's "in call", as shown by the "queue show" command.

This is especially noticeable if your queue member is a member of many queues (all with wrapup) and you have "shared_lastcall = yes" in your queues.conf.

Also, if you find yourself in a scenario similar to the one described in ASTERISK-25844, this gets worse, i.e. your queue member won't receive any new calls even after Alice's channel is hung up.

This bug (which happens to be a regression) has been introduced by commit 338a8ffed673e4c3a828c7c216575f8e3e712350 and this commit references ASTERISK-19820.

Thanks
Comments:

By: Asterisk Team (asteriskteam) 2016-09-22 13:27:21.812-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: David Brillert (aragon) 2017-01-17 10:57:17.942-0600

Since ASTERISK-19820 introduced the regression, will this be patched to fix the issue or will ASTERISK-19820 patch be reverted?

By: Joshua C. Colp (jcolp) 2017-01-17 11:04:24.727-0600

The issue itself does not have a person actively working on it yet, until that time the plan forward has not been determined.

By: David Brillert (aragon) 2017-01-31 08:40:11.210-0600

This is an intolerable regression affecting many customers and causing lost revenue. We would be happy to have the change ASTERISK-19820 reverted.
A member does not receive a call if any type of transfer is done (built in *1, *2, SIP) and wrapuptime is greater than 0.
This is very easily reproduced, for example:
queues.conf
wrapuptime = 10

By: Jeffrey S Becker (PBX-Support64) 2017-01-31 08:58:42.599-0600

We are unable to upgrade our instance of Asterisk due to this limitation.  The system is not feasible for a call center until this problem is resolved.

By: David Brillert (aragon) 2017-01-31 09:05:21.129-0600

I agree. We cannot upgrade beyond 11.16, and this means we cannot use Asterisk versions which include important security fixes if we use the older version to avoid the regression. Asterisk 13 is also affected since it includes the same commit.

By: David Brillert (aragon) 2017-01-31 12:38:20.080-0600

IMHO the commit at ASTERISK-19820 needs to be reverted until the original problem reported there can be fixed properly and thoroughly tested.  That change undermines all the work that went into state_interface

This logic does not make sense, since any parking or transfer done by the queue member will result in the (in call) flag being set.

2015-12-29 05:44 +0000 [1943cfc53c] Martin Tomec <tomec.martin@gmail.com>

   app_queue: Add member flag "in_call" to prevent reading wrong lastcall time

Member lastcall time is updated later than member status. There was chance to
check wrapuptime for available member with wrong (old) lastcall time.
New boolean flag "in_call" is set to true right before connecting call, and
reset to false after update of lastcall time. Members with "in_call" set to true
are treat as unavailable.
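
The effect of a flag that never gets reset can be illustrated with a toy model. This is a sketch only, not the app_queue code: the {{Member}} class and {{can_receive_call}} function are hypothetical names, and the rule that the flag only matters when wrapuptime is nonzero is an assumption taken from the behaviour reported in this thread.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Toy stand-in for a queue member; not the real app_queue structure."""
    lastcall: float = 0.0  # time the last queue call ended (epoch seconds)
    in_call: bool = False  # set before connecting, reset after lastcall is updated


def can_receive_call(member: Member, now: float, wrapuptime: int) -> bool:
    """Simplified availability rule for a member whose device is idle."""
    if wrapuptime > 0 and member.in_call:
        return False  # flag still set: treated as unavailable
    if wrapuptime > 0 and now - member.lastcall < wrapuptime:
        return False  # still inside the wrap-up window
    return True


# The call was connected, then the caller left in a way that never reset the flag.
stuck = Member(lastcall=1000.0, in_call=True)
print(can_receive_call(stuck, now=2000.0, wrapuptime=10))  # False
print(can_receive_call(stuck, now=2000.0, wrapuptime=0))   # True
```

In this model the member is blocked indefinitely with a nonzero wrapuptime, however long ago the last call ended, while setting wrapuptime to 0 hides the problem without clearing the flag.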

By: David Brillert (aragon) 2017-03-07 15:54:16.949-0600

Is anyone willing to commit a patch to revert 2015-12-29 05:44 +0000 [1943cfc53c] Martin Tomec <tomec.martin@gmail.com>?

This commit is causing multiple headaches and my only recourse is to downgrade Asterisk to old rpms built prior to Martin's commit.  It is a bad regression and causing lots of lost dollars.

By: Matthew Fredrickson (mattf) 2017-03-07 16:06:53.209-0600

[~matesstar] - do you have any comments on this?

By: Sean Bright (seanbright) 2017-03-09 08:07:18.920-0600

[~aragon], unfortunately that patch has been in the codebase since 12/2015, so it's not as simple as just reverting. I am looking into it and hope to have a fix in the next day or two.

By: David Brillert (aragon) 2017-03-09 08:42:04.045-0600

[~seanbright], much appreciated :)

By: Sean Bright (seanbright) 2017-03-09 09:50:00.570-0600

After review, I'm not convinced that this problem is directly related to [~matesstar]'s patch. I'm going to revert it locally to confirm, but it just appears to me that redirects out of the queue are broken. We explicitly handle blind transfers (which I would suggest as a work around for you if possible), attended transfers, and hangups, but the bridge being broken by the redirect is silently ignored, which is the source of the problem.
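
The distinction between explicitly handled and silently ignored exit paths can be sketched as follows. This is a toy illustration with hypothetical event names and a hypothetical {{QueueCallTracker}} class; it does not reflect how app_queue is actually structured.

```python
class QueueCallTracker:
    """Toy model of the bookkeeping for one answered queue call."""

    def __init__(self, handled_events):
        self.handled_events = set(handled_events)
        self.in_call = True  # the member has answered the caller

    def caller_left(self, event):
        """Process the way the caller left the bridge with the member."""
        if event in self.handled_events:
            self.in_call = False  # end-of-call bookkeeping runs
        # Any other event is silently ignored, so the flag stays set.
        return self.in_call


explicit = ["hangup", "blind_transfer", "attended_transfer"]
for event in explicit + ["redirect"]:
    still_in_call = QueueCallTracker(explicit).caller_left(event)
    print(f"{event}: in_call={still_in_call}")
```

Only the "redirect" case leaves the flag set in this model, which matches the symptom described for an AMI Redirect.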

In my testing after the customer is redirected, not only was the "in call" status still set for the agent, but the device state also showed as "In use" until the customer hung up.

Stay tuned.

By: Sean Bright (seanbright) 2017-03-09 10:10:42.156-0600

[~aragon], I rolled back [~matesstar]'s patch in my test environment and my test agent stayed "In use" until the customer hung up, so his patch being the problem was a red herring.

Save me the trouble of going back to an ancient version of Asterisk and tell me what your {{queue_log}} says when the customer is AMI Redirected out of the queue.

By: David Brillert (aragon) 2017-03-09 10:42:35.144-0600

Etienne opened the ticket specifically reporting the AMI problem.
I don't have to use AMI to reproduce.
I just have to do any type of SIP transfer or PARK with wrapuptime set to some value greater than 0, and then the "Not in use" member will receive no new calls from the queue. If wrapuptime = 0, the member will receive calls, but the 'in call' flag will still be true after the transfer and is not cleared until all transferred parties hang up.
So in any case the dynamic member will show 'in call' until the transferred parties both hang up.
My steps to reproduce are at ASTERISK-26715
Reproduction is remarkably easy.

Here is the queue show after the member has transferred call and the agent's SIP extension is idle.
What is odd is that the core show channels output shows the SIP extension 214 involved in a channel after the transfer is completed.

Call flow=
A = 216 calls queue = reception
B = queue member logged to SIP 214 answers via queue and completes an attended SIP Transfer to C = 213
C= 213 is bridged to 216
B= idle (transfer completed)

CLI output from the call flow:
{noformat}
debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 25s talktime), W:0, C:9, A:0, SL:100.0% within 60s
  Members:
     Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken 9 calls (last was 177 secs ago)
  No Callers

master88*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/debcomainbtn213- (None)               Up      AppDial((Outgoing Line))
Local/214@debcomainb s@macro-debcomainbtn Up      Dial(SIP/debcomainbtn213,20,tk
Local/214@debcomainb s@debcomainbtn-agent Up      AppQueue((Outgoing Line))
SIP/debcomainbtn216- s@debcomainbtn-appli Up      Queue(debcomainbtn-reception,t
4 active channels
2 of 512 max active calls ( 0.39% of capacity)

master88*CLI> sip show channels
Peer             User/ANR         Call ID          Format           Hold     Last Message    Expiry     Peer
192.168.192.78   (None)           6e0d1264723bc95  (nothing)        No       Rx: OPTIONS                <guest>
172.31.240.111   debcomainbtn216  0_1966151905@17  (ulaw)           No       Tx: UPDATE                 debcomainb
172.31.240.110   debcomainbtn213  0715e9a879d7be6  (ulaw)           No       Tx: UPDATE                 debcomainb
3 active SIP dialogs
{noformat}

queue_log
{noformat}
1489076485|1489076484.40|debcomainbtn-reception|NONE|ENTERQUEUE||debcomainbtn216|1
1489076486|1489076484.40|debcomainbtn-reception|Local/214@debcomainbtn-agent/n|CONNECT|1|1489076485.41|1
1489076498|1489076484.40|debcomainbtn-reception|Local/214@debcomainbtn-agent/n|COMPLETECALLER|1|12|1
{noformat}
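
For readers comparing their own logs with the one above, the entries are pipe-delimited and can be split with a few lines of Python. This is a sketch with a hypothetical {{QueueLogEntry}} type; the names given to the first five fields are an assumption based on the usual queue_log layout, and the remaining event-specific fields are kept as a raw list.

```python
from typing import NamedTuple


class QueueLogEntry(NamedTuple):
    timestamp: int  # epoch seconds
    call_id: str    # unique id of the caller's channel
    queue: str
    agent: str      # member interface, or NONE
    event: str
    data: list      # event-specific fields, left uninterpreted


def parse_queue_log_line(line):
    """Split one pipe-delimited queue_log line into named fields."""
    timestamp, call_id, queue, agent, event, *data = line.rstrip("\n").split("|")
    return QueueLogEntry(int(timestamp), call_id, queue, agent, event, data)


lines = [
    "1489076485|1489076484.40|debcomainbtn-reception|NONE|ENTERQUEUE||debcomainbtn216|1",
    "1489076486|1489076484.40|debcomainbtn-reception|Local/214@debcomainbtn-agent/n|CONNECT|1|1489076485.41|1",
    "1489076498|1489076484.40|debcomainbtn-reception|Local/214@debcomainbtn-agent/n|COMPLETECALLER|1|12|1",
]
for entry in map(parse_queue_log_line, lines):
    print(entry.timestamp, entry.event, entry.data)
```

In the sample, COMPLETECALLER is logged 12 seconds after CONNECT (1489076498 - 1489076486), so the queue did record the end of the call even though the member kept showing "in call". The meaning of the individual data fields should be checked against the queue_log documentation.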

By: David Brillert (aragon) 2017-03-09 10:47:02.246-0600

[~seanbright]
You mentioned a SIP blind transfer should be explicitly handled so I tested that scenario as well
Same call flow as above only instead of attended SIP transfer I did a Polycom Blind transfer

{noformat}
master88*CLI> queue show

debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 18s talktime), W:0, C:11, A:0, SL:100.0% within 60s
  Members:
     Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken 11 calls (last was 19 secs ago)
  No Callers

master88*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/debcomainbtn213- (None)               Up      AppDial((Outgoing Line))
SIP/debcomainbtn216- s@debcomainbtn-appli Up      Queue(debcomainbtn-reception,t
Local/214@debcomainb s@debcomainbtn-agent Up      AppQueue((Outgoing Line))
Local/214@debcomainb s@macro-debcomainbtn Up      Dial(SIP/debcomainbtn213,20,tk
4 active channels
2 of 512 max active calls ( 0.39% of capacity)

master88*CLI> sip show channels
Peer             User/ANR         Call ID          Format           Hold     Last Message    Expiry     Peer
172.31.240.111   debcomainbtn216  0_2064138430@17  (ulaw)           No       Tx: UPDATE                 debcomainb
192.168.192.78   (None)           3be2ae342e97dcb  (nothing)        No       Rx: OPTIONS                <guest>
172.31.240.110   debcomainbtn213  35c30b9a6a1c852  (ulaw)           No       Tx: ACK                    debcomainb
3 active SIP dialogs
{noformat}
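
The stuck state in these outputs is a member line carrying both "(in call)" and "(Not in use)". A small sketch for spotting it in captured "queue show" output follows. The regular expression is an assumption based only on the line layout shown in this ticket and may need adjusting for other Asterisk versions; {{stuck_members}} is a hypothetical helper name.

```python
import re

# An indented member line: interface name followed by parenthesised flags.
MEMBER_LINE = re.compile(r"^\s+(?P<interface>\S+)\s+(?P<flags>(?:\([^)]*\)\s*)+)")


def stuck_members(queue_show_output):
    """Return interfaces flagged "(in call)" whose device state is "(Not in use)"."""
    stuck = []
    for line in queue_show_output.splitlines():
        match = MEMBER_LINE.match(line)
        if not match:
            continue  # queue header, "Members:", "No Callers", blank lines
        flags = re.findall(r"\(([^)]*)\)", match.group("flags"))
        if "in call" in flags and "Not in use" in flags:
            stuck.append(match.group("interface"))
    return stuck


sample = """\
debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 18s talktime), W:0, C:11, A:0, SL:100.0% within 60s
  Members:
     Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken 11 calls (last was 19 secs ago)
  No Callers
"""
print(stuck_members(sample))
```

A member can legitimately show both flags for a short moment at the end of a call, so a check like this is only meaningful if the state persists.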

By: Sean Bright (seanbright) 2017-03-09 11:28:08.947-0600

[~aragon], please try the attached patch. You will need the latest version of Asterisk 13 (potentially from git) for it to apply cleanly.

By: David Brillert (aragon) 2017-03-09 11:53:46.739-0600

[~seanbright] Thanks!!! :D
I understand Asterisk 11 is EOL.
It will take me some time to test the patch since our builds are based on Asterisk 11, and I have to build a working Asterisk 13 environment.

Have you done any testing on your end?

By: Sean Bright (seanbright) 2017-03-09 12:40:35.064-0600

[~aragon] - yes, I've tested and it works as it should. [~PBX-Support64], is it possible for you to test this patch and report back?

By: Sean Bright (seanbright) 2017-03-09 12:42:10.391-0600

[~hexanol], can you test the attached patch and let us know if it solves your problem?

By: Jeffrey S Becker (PBX-Support64) 2017-03-09 12:53:24.374-0600

Unfortunately, I don’t have the required skill set to download a patch and recompile Asterisk, nor do I have a suitable system set up to do so.

By: Sean Bright (seanbright) 2017-03-09 15:48:31.273-0600

[This patch is up for review on gerrit|https://gerrit.asterisk.org/#/c/5149].

By: David Brillert (aragon) 2017-03-10 13:33:56.784-0600

[~seanbright] I just tested reviewboard patch set 3
I think something is still not right with SIP transfers because I still see an outgoing channel associated with an idle SIP extension after transfer.

I'm attaching the full call CLI and queue show and core show channels output in file 13reviewboardtests.txt



By: Sean Bright (seanbright) 2017-03-10 14:01:14.315-0600

[~aragon], can you please tell me what a SIP transfer is?

By: David Brillert (aragon) 2017-03-10 14:12:48.063-0600

Testing with a Polycom phone using the transfer soft key as opposed to using Asterisk DTMF *1 or *2

By: Sean Bright (seanbright) 2017-03-10 14:16:03.471-0600

[~aragon], the queue member is using a Polycom and transferring the caller to another extension, is that correct? Because an AMI Redirect is not involved I don't think it relates to this issue, but I will test regardless.

By: Sean Bright (seanbright) 2017-03-10 14:35:10.741-0600

[~aragon], I've confirmed your findings. It only appears to happen when your queue members are Local channels with the {{/n}} (don't optimize) flag set.

By: David Brillert (aragon) 2017-03-10 14:37:12.847-0600

[~seanbright] Correct, the Polycom is transferring to another extension.
Keep in mind my ticket regarding transfers ASTERISK-26715 was closed out as a duplicate. I never reported an issue using AMI or external redirect (that was Etienne).

I also run into issues with a transfer to PARK extension after retrieving the call from the parking lot.

{noformat}
debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 0s talktime), W:0, C:0, A:0, SL:0.0% within 60s
  Members:
     Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken no calls yet
  No Callers

debcomainbtn-sales has 0 calls (max unlimited) in 'rrmemory' strategy (0s holdtime, 0s talktime), W:0, C:0, A:0, SL:0.0% within 10s
  Members:
     Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken no calls yet
  No Callers

master88*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/debcomainbtn213- 701@debcomainbtn-man Up      ParkedCall(parkinglot_debcomai
SIP/debcomainbtn216- s@debcomainbtn-appli Up      Queue(debcomainbtn-reception,t
Local/214@debcomainb s@debcomainbtn-agent Up      AppQueue((Outgoing Line))
Local/214@debcomainb 214@debcomainbtn-loc Up      Dial(SIP/debcomainbtn214,,tk)
{noformat}

By: David Brillert (aragon) 2017-03-10 14:41:19.286-0600

[~seanbright] Also correct, we always use /n for agents.

By: Joshua C. Colp (jcolp) 2017-03-10 14:42:54.927-0600

[~aragon] Since [~seanbright] has determined that your issue seems to be something different feel free to comment on your closed issue. It will automatically reopen and go back into triage.

[~seanbright] Thank you for all your work on this!

By: David Brillert (aragon) 2017-03-10 14:53:18.093-0600

[~jcolp] Honestly I have no idea if I want to re-open the other issue. If you want me to I can but I would rather just watch one ticket instead of spreading debug logs all over the place.

I too give huge respect to [~seanbright] for taking this one on (mad love).

By: Etienne Lessard (hexanol) 2017-03-10 15:16:53.654-0600

As far as the scenario in this issue's description is concerned, I confirm that the bug doesn't show up anymore after applying the patch (on an Asterisk 14.3.0). I didn't do any other tests.

Thanks

By: David Brillert (aragon) 2017-03-10 15:33:27.163-0600

Since Etienne has confirmed the reviewboard patch fixes his reported issue I have reopened ASTERISK-26715

By: Andrej (tekach) 2017-03-13 08:57:21.276-0500

Hi,

If this helps: I originally opened ticket ASTERISK-26757, which was closed as a duplicate. Our situation was similar, but not the same: we are using Yealink phones with built-in blind and attended transfer functions/buttons.

I've applied the patch against Asterisk 13.14.0 and, after testing, the bug doesn't seem to be gone.

Shall I reopen my case or?

Andrej


By: Joshua C. Colp (jcolp) 2017-03-13 17:17:13.422-0500

[~tekach] Sure it's not ASTERISK-26715 which has already been reopened?

By: Andrej (tekach) 2017-03-14 06:38:48.226-0500

Hi Joshua,

You are right, issue seems the same.

Thanks,
Andrej

By: Friendly Automation (friendly-automation) 2017-03-15 20:32:05.940-0500

Change 5151 merged by zuul:
app_queue: Handle the caller being redirected out of a queue bridge

[https://gerrit.asterisk.org/5151|https://gerrit.asterisk.org/5151]

By: Friendly Automation (friendly-automation) 2017-03-15 21:19:40.307-0500

Change 5149 merged by zuul:
app_queue: Handle the caller being redirected out of a queue bridge

[https://gerrit.asterisk.org/5149|https://gerrit.asterisk.org/5149]

By: Friendly Automation (friendly-automation) 2017-03-16 05:25:29.059-0500

Change 5150 merged by Joshua Colp:
app_queue: Handle the caller being redirected out of a queue bridge

[https://gerrit.asterisk.org/5150|https://gerrit.asterisk.org/5150]

By: Sean Bright (seanbright) 2017-03-21 16:52:46.632-0500

Bad news. The fix mentioned above broke something else, so we are reverting and re-opening.

By: Luis Aguirre (laar789) 2017-04-03 08:08:58.626-0500

Hi, is there any progress? I'm having the same issue in Asterisk 11.25.1, and I can't update to Asterisk 13: my application is developed on Asterisk 11 and the update to 13 is not possible right now. I assume that the fix will work for Asterisk 11, right? Thanks in advance.

By: Gabriel Williamson (gabe.williamson) 2017-04-12 15:00:44.407-0500

I am also interested in this issue being resolved. Should the resolution have been changed from fixed back to unresolved?

By: Joshua C. Colp (jcolp) 2017-04-12 15:03:15.391-0500

JIRA does not allow the resolution to be altered. It will only be updated when the issue is closed again (it's currently open).

By: Martin Tomec (matesstar) 2017-05-11 04:05:31.274-0500

The patch for ASTERISK-19820 solved an issue where the wrapup time is sometimes ignored. It seems that the patch exposes other issues when the call is not hung up "properly": the agent remains in the "in_call" state. This is of course a bigger issue, so as a temporary workaround it makes sense to revert my patch.
But the final solution should be to correctly clear the "in_call" flag. Maybe a call to the function update_queue is missing somewhere; I hope that Sean will find it...

Sorry for late reply

By: Friendly Automation (friendly-automation) 2017-05-24 07:50:35.772-0500

Change 5640 merged by Jenkins2:
app_queue: Fix members showing as being in call when not.

[https://gerrit.asterisk.org/5640|https://gerrit.asterisk.org/5640]

By: Friendly Automation (friendly-automation) 2017-05-24 08:40:10.883-0500

Change 5639 merged by Jenkins2:
app_queue: Fix members showing as being in call when not.

[https://gerrit.asterisk.org/5639|https://gerrit.asterisk.org/5639]

By: Friendly Automation (friendly-automation) 2017-05-24 09:41:41.307-0500

Change 5641 merged by Joshua Colp:
app_queue: Fix members showing as being in call when not.

[https://gerrit.asterisk.org/5641|https://gerrit.asterisk.org/5641]

By: Matt Brown (mattbrown) 2017-06-20 07:45:26.351-0500

I can confirm we are experiencing this bug still in 13.16.0 & 11.25.1 (and 14.x branch).

We have a queue of dynamic members, call comes in and member answers call. Call is then transferred using ## (In call blind xfer) and the call is transferred - however this leaves the original member showing as (in call) and no further calls are received. However, changing wrapuptime to 0 will resolve the issue - but ideally we need wrapup time.

Therefore I would mark this bug as not resolved. The previous comment was made:

"A caller can leave the Queue() application after being bridged with a
member in a few ways:

 * Caller or member hangup
 * Caller is transferred somewhere else (blind or atx)
 * Caller is externally redirected elsewhere

The first 2 scenarios are currently handled by subscribing to stasis
messages, but the 3rd is not explicitly covered"

atx and blind still cause issues for the members of the queue.

By: Joshua C. Colp (jcolp) 2017-06-20 07:49:15.085-0500

This fix is not yet in a release. If you use the branch it should be resolved, as it has been for many others. If it's still not then please file a new issue with the precise scenario, configuration, and console output as it is different.

By: Matt Brown (mattbrown) 2017-06-20 08:13:52.733-0500

Apologies, I was reading http://downloads.asterisk.org/pub/telephony/asterisk/old-releases/ChangeLog-13.15.1 and thought this was fixed in this version. Thank you, I will checkout and retry with our lab setup.

By: Joshua C. Colp (jcolp) 2017-06-20 08:16:39.010-0500

In the future if you come across a JIRA issue you can check the "Target Release Version/s" field at the top. That automatically gets set to the release it is first in when the release happens.

By: Luis Aguirre (laar789) 2017-07-06 10:00:57.913-0500

Hi, I'm really happy to see this bug fixed, but can you please set a target release for Asterisk 11? I'm currently working with Asterisk 11 and changing to Asterisk 13 would take some time, but I need this fix in my current app. Please, or make a patch for Asterisk 11.

By: Joshua C. Colp (jcolp) 2017-07-06 10:05:03.353-0500

Asterisk 11 is no longer a bug fix supported branch. The only time it receives a release is as a result of security issues. The change would also need to be rearchitected some to work within 11.

By: Rusty Newton (rnewton) 2017-07-06 18:04:43.912-0500

[~laar789] for reference, in regards to Joshua's comment - here is the list of versions and support cycles https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions