Summary: ASTERISK-26400: app_queue: Queue member stops being called after AMI "Redirect" action for queues with wrapuptime
Reporter: Etienne Lessard (hexanol) | Labels: regression
Date Opened: 2016-09-22 13:27:21 | Date Closed: 2017-05-24 07:50:34
Priority: Major | Regression? Yes
Status: Closed/Complete | Components: Applications/app_queue
Versions: 13.11.2 | Frequency of Occurrence: Constant
Related Issues:
Environment:
Attachments: (0) 0001-app_queue-Handle-the-caller-being-redirected-out-of-.patch (1) 13reviewboardtests.txt
Description:
Hello,

Given I have a queue with a *nonzero wrapuptime* and one queue member
And Alice calls this queue
And the queue member answers
When an AMI "Redirect" redirects Alice's channel to a different extension
Then the queue member won't receive any new call from the queue until Alice's channel is hung up

Note that after the AMI Redirect, the queue member is available / not in use, but won't receive any new call because the queue still thinks it's "in call", as shown by the "queue show" command. This is especially noticeable if your queue member is a member of many queues (all with wrapup) and you have "shared_lastcall = yes" in your queues.conf.

Also, if you find yourself in a scenario similar to the one described in ASTERISK-25844, this gets worse, i.e. your queue member won't receive any new calls even after Alice's channel is hung up.

This bug (which happens to be a regression) was introduced by commit 338a8ffed673e4c3a828c7c216575f8e3e712350, which references ASTERISK-19820.

Thanks
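For anyone reproducing the scenario above, the following is a minimal sketch of how the AMI "Redirect" action could be serialized before it is written to the manager socket. The header names (Action, Channel, Context, Exten, Priority, ActionID) are the standard ones for this action; the helper name `build_redirect_action` and the channel, context and extension values are hypothetical and only illustrate the shape of the message. Logging in to the manager interface and sending the bytes are left out.

```python
from typing import Optional


def build_redirect_action(channel: str, context: str, exten: str,
                          priority: int = 1,
                          action_id: Optional[str] = None) -> bytes:
    """Serialize an AMI Redirect action as CRLF-terminated 'Key: Value' lines.

    Hypothetical helper: it only builds the message, it does not talk to Asterisk.
    """
    fields = [
        ("Action", "Redirect"),
        ("Channel", channel),      # the caller's channel (Alice in the description)
        ("Context", context),      # dialplan context to send the caller to
        ("Exten", exten),          # extension within that context
        ("Priority", str(priority)),
    ]
    if action_id is not None:
        fields.append(("ActionID", action_id))
    body = "".join(f"{key}: {value}\r\n" for key, value in fields)
    # An AMI message is terminated by an empty line.
    return (body + "\r\n").encode("utf-8")


if __name__ == "__main__":
    # Channel name, context and extension below are made-up example values.
    print(build_redirect_action("SIP/alice-00000001", "default", "1000").decode())
```

With a nonzero wrapuptime on the queue, sending such an action for the caller's channel while it is bridged to the queue member is the step that triggers the behaviour reported here.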
Comments:
By: Asterisk Team (asteriskteam) 2016-09-22 13:27:21.812-0500
Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: David Brillert (aragon) 2017-01-17 10:57:17.942-0600
Since ASTERISK-19820 introduced the regression, will this be patched to fix the issue or will the ASTERISK-19820 patch be reverted?

By: Joshua C. Colp (jcolp) 2017-01-17 11:04:24.727-0600
The issue itself does not have a person actively working on it yet; until that time the plan forward has not been determined.

By: David Brillert (aragon) 2017-01-31 08:40:11.210-0600
This is an intolerable regression affecting many customers and causing lost revenue. We would be happy to have the change ASTERISK-19820 reverted. A member does not receive a call if any type of transfer is done (built in *1, *2, SIP) and wrapuptime is greater than 0. Very easily reproduced by example:
queues.conf
wrapuptime = 10

By: Jeffrey S Becker (PBX-Support64) 2017-01-31 08:58:42.599-0600
We are unable to upgrade our instance of Asterisk due to this limitation. The system is not feasible for a call center until this problem is resolved.

By: David Brillert (aragon) 2017-01-31 09:05:21.129-0600
I agree we cannot upgrade beyond 11.16 and this means we cannot use Asterisk versions which include important security fixes if we use the older version to avoid the regression.
Asterisk 13 is also affected since it includes the same commit.

By: David Brillert (aragon) 2017-01-31 12:38:20.080-0600
IMHO the commit at ASTERISK-19820 needs to be reverted until the original problem reported there can be fixed properly and thoroughly tested. That change undermines all the work that went into state_interface. This logic does not make sense, since any parking or transfer done by the queue member will result in the (in call) flag being set.

2015-12-29 05:44 +0000 [1943cfc53c] Martin Tomec <tomec.martin@gmail.com>
app_queue: Add member flag "in_call" to prevent reading wrong lastcall time

Member lastcall time is updated later than member status. There was chance to check wrapuptime for available member with wrong (old) lastcall time. New boolean flag "in_call" is set to true right before connecting call, and reset to false after update of lastcall time. Members with "in_call" set to true are treat as unavailable.

By: David Brillert (aragon) 2017-03-07 15:54:16.949-0600
Is anyone willing to commit a patch to revert 2015-12-29 05:44 +0000 [1943cfc53c] Martin Tomec <tomec.martin@gmail.com>? This commit is causing multiple headaches and my only recourse is to downgrade Asterisk to old rpms built prior to Martin's commit. It is a bad regression and is causing lots of lost dollars.

By: Matthew Fredrickson (mattf) 2017-03-07 16:06:53.209-0600
[~matesstar] - do you have any comments on this?

By: Sean Bright (seanbright) 2017-03-09 08:07:18.920-0600
[~aragon], unfortunately that patch has been in the codebase since 12/2015, so it's not as simple as just reverting. I am looking into it and hope to have a fix in the next day or two.

By: David Brillert (aragon) 2017-03-09 08:42:04.045-0600
[~seanbright], much appreciated :)

By: Sean Bright (seanbright) 2017-03-09 09:50:00.570-0600
After review, I'm not convinced that this problem is directly related to [~matesstar]'s patch.
I'm going to revert it locally to confirm, but it just appears to me that redirects out of the queue are broken. We explicitly handle blind transfers (which I would suggest as a workaround for you if possible), attended transfers, and hangups, but the bridge being broken by the redirect is silently ignored, which is the source of the problem. In my testing after the customer is redirected, not only was the "in call" status still set for the agent, but the device state also showed as "In use" until the customer hung up. Stay tuned.

By: Sean Bright (seanbright) 2017-03-09 10:10:42.156-0600
[~aragon], I rolled back [~matesstar]'s patch in my test environment and my test agent stayed "In use" until the customer hung up, so his patch being the problem was a red herring. Save me the trouble of going back to an ancient version of Asterisk and tell me what your {{queue_log}} says when the customer is AMI Redirected out of the queue.

By: David Brillert (aragon) 2017-03-09 10:42:35.144-0600
Etienne opened the ticket specifically reporting the AMI problem. I don't have to use AMI to reproduce. I just have to do any type of SIP transfer or PARK and ensure the wrapuptime is set to some value greater than 0; then the not-in-use member will receive no new calls from the queue. If wrapuptime = 0 then the member will receive calls, but the 'in call' flag will still be true after the transfer and not cleared until all transferred parties hang up. So in any case the dynamic member will show 'in call' until the transferred parties both hang up. My steps to reproduce are at ASTERISK-26715. Reproduction is remarkably easy. Here is the queue show after the member has transferred the call and the agent's SIP extension is idle. What is odd is that the core show channels output shows the SIP extension 214 involved in a channel after the transfer is completed.
Call flow:
A = 216 calls queue = reception
B = queue member logged to SIP 214 answers via queue and completes an attended SIP Transfer to C = 213
C = 213 is bridged to 216
B = idle (transfer completed)

CLI output from the call flow:
{noformat}
debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 25s talktime), W:0, C:9, A:0, SL:100.0% within 60s
   Members:
      Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken 9 calls (last was 177 secs ago)
   No Callers

master88*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/debcomainbtn213- (None)               Up      AppDial((Outgoing Line))
Local/214@debcomainb s@macro-debcomainbtn Up      Dial(SIP/debcomainbtn213,20,tk
Local/214@debcomainb s@debcomainbtn-agent Up      AppQueue((Outgoing Line))
SIP/debcomainbtn216- s@debcomainbtn-appli Up      Queue(debcomainbtn-reception,t
4 active channels
2 of 512 max active calls ( 0.39% of capacity)

master88*CLI> sip show channels
Peer             User/ANR         Call ID          Format     Hold  Last Message     Expiry  Peer
192.168.192.78   (None)           6e0d1264723bc95  (nothing)  No    Rx: OPTIONS              <guest>
172.31.240.111   debcomainbtn216  0_1966151905@17  (ulaw)     No    Tx: UPDATE               debcomainb
172.31.240.110   debcomainbtn213  0715e9a879d7be6  (ulaw)     No    Tx: UPDATE               debcomainb
3 active SIP dialogs
{noformat}

queue_log
{noformat}
1489076485|1489076484.40|debcomainbtn-reception|NONE|ENTERQUEUE||debcomainbtn216|1
1489076486|1489076484.40|debcomainbtn-reception|Local/214@debcomainbtn-agent/n|CONNECT|1|1489076485.41|1
1489076498|1489076484.40|debcomainbtn-reception|Local/214@debcomainbtn-agent/n|COMPLETECALLER|1|12|1
{noformat}

By: David Brillert (aragon) 2017-03-09 10:47:02.246-0600
[~seanbright] You mentioned a SIP blind transfer should be explicitly handled, so I tested that scenario as well. Same call flow as above, only instead of an attended SIP transfer I did a Polycom blind transfer.
{noformat}
master88*CLI> queue show
debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 18s talktime), W:0, C:11, A:0, SL:100.0% within 60s
   Members:
      Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken 11 calls (last was 19 secs ago)
   No Callers

master88*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/debcomainbtn213- (None)               Up      AppDial((Outgoing Line))
SIP/debcomainbtn216- s@debcomainbtn-appli Up      Queue(debcomainbtn-reception,t
Local/214@debcomainb s@debcomainbtn-agent Up      AppQueue((Outgoing Line))
Local/214@debcomainb s@macro-debcomainbtn Up      Dial(SIP/debcomainbtn213,20,tk
4 active channels
2 of 512 max active calls ( 0.39% of capacity)

master88*CLI> sip show channels
Peer             User/ANR         Call ID          Format     Hold  Last Message     Expiry  Peer
172.31.240.111   debcomainbtn216  0_2064138430@17  (ulaw)     No    Tx: UPDATE               debcomainb
192.168.192.78   (None)           3be2ae342e97dcb  (nothing)  No    Rx: OPTIONS              <guest>
172.31.240.110   debcomainbtn213  35c30b9a6a1c852  (ulaw)     No    Tx: ACK                  debcomainb
3 active SIP dialogs
{noformat}

By: Sean Bright (seanbright) 2017-03-09 11:28:08.947-0600
[~aragon], please try the attached patch. You will need the latest version of Asterisk 13 (potentially from git) for it to apply cleanly.

By: David Brillert (aragon) 2017-03-09 11:53:46.739-0600
[~seanbright] Thanks!!! :D I understand Asterisk 11 is EOL. It will take me some time to test the patch since our builds are based on Asterisk 11, and I have to build a working Asterisk 13 environment. Have you done any testing on your end?

By: Sean Bright (seanbright) 2017-03-09 12:40:35.064-0600
[~aragon] - yes, I've tested and it works as it should. [~PBX-Support64], is it possible for you to test this patch and report back?

By: Sean Bright (seanbright) 2017-03-09 12:42:10.391-0600
[~hexanol], can you test the attached patch and let us know if it solves your problem?
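The symptom in the outputs above is a member line that carries both "(in call)" and "(Not in use)". As a rough aid for spotting it, here is a small sketch that parses a member line in the format pasted in this thread. The function names are hypothetical, and the parsing assumes the `queue show` member format looks exactly like the samples above, which may differ between Asterisk versions.

```python
import re

# Interface name followed by one or more parenthesised flags, as in the
# member lines pasted in this thread. This is an assumed format, not a spec.
MEMBER_RE = re.compile(r"^\s*(?P<interface>\S+)\s+(?P<flags>(?:\([^)]*\)\s*)+)")


def parse_member(line):
    """Return (interface, [flags]) for a member line, or None if it is not one."""
    match = MEMBER_RE.match(line)
    if match is None:
        return None
    flags = re.findall(r"\(([^)]*)\)", match.group("flags"))
    return match.group("interface"), flags


def looks_stuck(line):
    """True when the queue marks the member 'in call' while the device is idle."""
    parsed = parse_member(line)
    if parsed is None:
        return False
    _, flags = parsed
    return "in call" in flags and "Not in use" in flags


if __name__ == "__main__":
    sample = ("Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) "
              "(in call) (Not in use) has taken 9 calls (last was 177 secs ago)")
    print(parse_member(sample))
    print(looks_stuck(sample))
```

The check is deliberately narrow: it only compares the two flags and says nothing about why the member is in that state.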
By: Jeffrey S Becker (PBX-Support64) 2017-03-09 12:53:24.374-0600
Unfortunately, I don't have the required skill set to download a patch and recompile Asterisk, nor do I have a suitable system set up to do so.

By: Sean Bright (seanbright) 2017-03-09 15:48:31.273-0600
[This patch is up for review on gerrit|https://gerrit.asterisk.org/#/c/5149].

By: David Brillert (aragon) 2017-03-10 13:33:56.784-0600
[~seanbright] I just tested reviewboard patch set 3. I think something is still not right with SIP transfers because I still see an outgoing channel associated with an idle SIP extension after transfer. I'm attaching the full call CLI, queue show and core show channels output in the file 13reviewboardtests.txt.

By: Sean Bright (seanbright) 2017-03-10 14:01:14.315-0600
[~aragon], can you please tell me what a SIP transfer is?

By: David Brillert (aragon) 2017-03-10 14:12:48.063-0600
Testing with a Polycom phone using the transfer soft key as opposed to using Asterisk DTMF *1 or *2.

By: Sean Bright (seanbright) 2017-03-10 14:16:03.471-0600
[~aragon], the queue member is using a Polycom and transferring the caller to another extension, is that correct? Because an AMI Redirect is not involved I don't think it relates to this issue, but I will test regardless.

By: Sean Bright (seanbright) 2017-03-10 14:35:10.741-0600
[~aragon], I've confirmed your findings. It only appears to happen when your queue members are Local channels with the {{/n}} (don't optimize) flag set.

By: David Brillert (aragon) 2017-03-10 14:37:12.847-0600
[~seanbright] Correct, the Polycom is transferring to another extension. Keep in mind my ticket regarding transfers, ASTERISK-26715, was closed out as a duplicate. I never reported an issue using AMI or external redirect (that was Etienne). I also run into issues with a transfer to a PARK extension after retrieving the call from the parking lot.
{noformat}
debcomainbtn-reception has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 0s talktime), W:0, C:0, A:0, SL:0.0% within 60s
   Members:
      Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken no calls yet
   No Callers

debcomainbtn-sales has 0 calls (max unlimited) in 'rrmemory' strategy (0s holdtime, 0s talktime), W:0, C:0, A:0, SL:0.0% within 10s
   Members:
      Local/214@debcomainbtn-agent/n (ringinuse disabled) (dynamic) (in call) (Not in use) has taken no calls yet
   No Callers

master88*CLI> core show channels
Channel              Location             State   Application(Data)
SIP/debcomainbtn213- 701@debcomainbtn-man Up      ParkedCall(parkinglot_debcomai
SIP/debcomainbtn216- s@debcomainbtn-appli Up      Queue(debcomainbtn-reception,t
Local/214@debcomainb s@debcomainbtn-agent Up      AppQueue((Outgoing Line))
Local/214@debcomainb 214@debcomainbtn-loc Up      Dial(SIP/debcomainbtn214,,tk)
{noformat}

By: David Brillert (aragon) 2017-03-10 14:41:19.286-0600
[~seanbright] Also correct, we always use /n for agents.

By: Joshua C. Colp (jcolp) 2017-03-10 14:42:54.927-0600
[~aragon] Since [~seanbright] has determined that your issue seems to be something different, feel free to comment on your closed issue. It will automatically reopen and go back into triage. [~seanbright] Thank you for all your work on this!

By: David Brillert (aragon) 2017-03-10 14:53:18.093-0600
[~jcolp] Honestly I have no idea if I want to re-open the other issue. If you want me to I can, but I would rather just watch one ticket instead of spreading debug logs all over the place. I too give huge respect to [~seanbright] for taking this one on (mad love).

By: Etienne Lessard (hexanol) 2017-03-10 15:16:53.654-0600
As far as the scenario in this issue's description is concerned, I confirm that the bug doesn't show up anymore after applying the patch (on an Asterisk 14.3.0). I didn't do any other tests.
Thanks

By: David Brillert (aragon) 2017-03-10 15:33:27.163-0600
Since Etienne has confirmed the reviewboard patch fixes his reported issue, I have reopened ASTERISK-26715.

By: Andrej (tekach) 2017-03-13 08:57:21.276-0500
Hi, if this helps: I originally opened ticket ASTERISK-26757, which was closed as a duplicate; our situation was similar, but not the same - we are using Yealink phones with built-in blind and attended transfer functions/buttons. I've used the patch against Asterisk 13.14.0 and after testing, the bug doesn't seem to be gone. Shall I reopen my case or? Andrej

By: Joshua C. Colp (jcolp) 2017-03-13 17:17:13.422-0500
[~tekach] Sure it's not ASTERISK-26715, which has already been reopened?

By: Andrej (tekach) 2017-03-14 06:38:48.226-0500
Hi Joshua, you are right, the issue seems the same. Thanks, Andrej

By: Friendly Automation (friendly-automation) 2017-03-15 20:32:05.940-0500
Change 5151 merged by zuul: app_queue: Handle the caller being redirected out of a queue bridge [https://gerrit.asterisk.org/5151|https://gerrit.asterisk.org/5151]

By: Friendly Automation (friendly-automation) 2017-03-15 21:19:40.307-0500
Change 5149 merged by zuul: app_queue: Handle the caller being redirected out of a queue bridge [https://gerrit.asterisk.org/5149|https://gerrit.asterisk.org/5149]

By: Friendly Automation (friendly-automation) 2017-03-16 05:25:29.059-0500
Change 5150 merged by Joshua Colp: app_queue: Handle the caller being redirected out of a queue bridge [https://gerrit.asterisk.org/5150|https://gerrit.asterisk.org/5150]

By: Sean Bright (seanbright) 2017-03-21 16:52:46.632-0500
Bad news. The fix mentioned above broke something else, so we are reverting and re-opening.

By: Luis Aguirre (laar789) 2017-04-03 08:08:58.626-0500
Hi, is there any progress? I'm having the same issue in Asterisk 11.25.1, and I can't update to Asterisk 13 in my application; it's developed over Asterisk 11 and the update to 13 is not possible right now.
I assume that the fix will work for Asterisk 11, right? Thanks in advance

By: Gabriel Williamson (gabe.williamson) 2017-04-12 15:00:44.407-0500
I am also interested in this issue being resolved. Should the resolution have been changed from fixed back to unresolved?

By: Joshua C. Colp (jcolp) 2017-04-12 15:03:15.391-0500
JIRA does not allow the resolution to be altered. It will only be updated when the issue is closed again (it's currently open).

By: Martin Tomec (matesstar) 2017-05-11 04:05:31.274-0500
The patch for ASTERISK-19820 solved an issue where the wrapup time is sometimes ignored. It seems that the patch exposes other issues when the call is not hung up "properly" - the agent remains in the state "in_call". This is of course a bigger issue, so as a temporary workaround it makes sense to revert my patch. But the final solution should be to correctly clear the "in_call" flag. Maybe a call to the function update_queue is missing somewhere - I hope that Sean will find it... Sorry for the late reply

By: Friendly Automation (friendly-automation) 2017-05-24 07:50:35.772-0500
Change 5640 merged by Jenkins2: app_queue: Fix members showing as being in call when not. [https://gerrit.asterisk.org/5640|https://gerrit.asterisk.org/5640]

By: Friendly Automation (friendly-automation) 2017-05-24 08:40:10.883-0500
Change 5639 merged by Jenkins2: app_queue: Fix members showing as being in call when not. [https://gerrit.asterisk.org/5639|https://gerrit.asterisk.org/5639]

By: Friendly Automation (friendly-automation) 2017-05-24 09:41:41.307-0500
Change 5641 merged by Joshua Colp: app_queue: Fix members showing as being in call when not. [https://gerrit.asterisk.org/5641|https://gerrit.asterisk.org/5641]

By: Matt Brown (mattbrown) 2017-06-20 07:45:26.351-0500
I can confirm we are still experiencing this bug in 13.16.0 & 11.25.1 (and the 14.x branch). We have a queue of dynamic members; a call comes in and a member answers the call.
The call is then transferred using ## (in-call blind xfer) and the transfer completes - however this leaves the original member showing as (in call) and no further calls are received. Changing wrapuptime to 0 will resolve the issue, but ideally we need wrapup time. Therefore I would mark this bug as not resolved. The previous comment was made:

"A caller can leave the Queue() application after being bridged with a member in a few ways:
* Caller or member hangup
* Caller is transferred somewhere else (blind or atx)
* Caller is externally redirected elsewhere
The first 2 scenarios are currently handled by subscribing to stasis messages, but the 3rd is not explicitly covered"

atx and blind still cause issues for the members of the queue.

By: Joshua C. Colp (jcolp) 2017-06-20 07:49:15.085-0500
This fix is not yet in a release. If you use the branch it should be resolved, as it has been for many others. If it's still not, then please file a new issue with the precise scenario, configuration, and console output, as it is different.

By: Matt Brown (mattbrown) 2017-06-20 08:13:52.733-0500
Apologies, I was reading http://downloads.asterisk.org/pub/telephony/asterisk/old-releases/ChangeLog-13.15.1 and thought this was fixed in this version. Thank you, I will check out the branch and retry with our lab setup.

By: Joshua C. Colp (jcolp) 2017-06-20 08:16:39.010-0500
In the future if you come across a JIRA issue you can check the "Target Release Version/s" field at the top. That automatically gets set to the release it is first in when the release happens.

By: Luis Aguirre (laar789) 2017-07-06 10:00:57.913-0500
Hi, I'm really happy to see this bug fixed, but please can you set a target release for Asterisk 11. I'm currently working with Asterisk 11 and changing to Asterisk 13 would take some time, but I need this fix in my current app. Please, or make a patch for Asterisk 11.

By: Joshua C. Colp (jcolp) 2017-07-06 10:05:03.353-0500
Asterisk 11 is no longer a bug fix supported branch.
The only time it receives a release is as a result of security issues. The change would also need to be rearchitected some to work within 11.

By: Rusty Newton (rnewton) 2017-07-06 18:04:43.912-0500
[~laar789] for reference, in regards to Joshua's comment - here is the list of versions and support cycles: https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions
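
For readers trying to follow why wrapuptime matters in this thread, the following is a rough model of the interaction described in the comments: the "in_call" flag is set right before a call is connected and cleared when lastcall is updated, and with a nonzero wrapuptime a member flagged "in_call" is treated as unavailable. This is a simplified sketch based only on the descriptions above, not the app_queue code; every name in it (`Member`, `can_receive_call`, `connect`, `end_call`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical, simplified stand-in for a queue member's state."""
    in_call: bool = False
    lastcall: float = 0.0  # time the last call ended; 0.0 means no call yet


def can_receive_call(member: Member, now: float, wrapuptime: float) -> bool:
    """Model of the availability check as described in the comments."""
    if wrapuptime <= 0:
        # Reported behaviour: with wrapuptime = 0 the member still gets calls.
        return True
    if member.in_call:
        # Flag never cleared -> member looks busy to the queue indefinitely.
        return False
    if member.lastcall and now - member.lastcall < wrapuptime:
        return False  # still inside the wrap-up window
    return True


def connect(member: Member) -> None:
    """Queue bridges a caller to the member."""
    member.in_call = True


def end_call(member: Member, now: float) -> None:
    """Normal end of call: lastcall is updated, then the flag is cleared."""
    member.lastcall = now
    member.in_call = False


if __name__ == "__main__":
    agent = Member()
    connect(agent)
    # The caller is redirected away and nothing calls end_call() for the agent.
    print(can_receive_call(agent, now=100.0, wrapuptime=10))
    print(can_receive_call(agent, now=100.0, wrapuptime=0))
```

In this model the reported bug corresponds to a path out of the bridge (an external redirect) on which `end_call` is never reached, so the flag stays set until something else clears it.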