[Home]

Summary:ASTERISK-17288: [regression] Asterisk 1.8x, SIP 484 set HANGUPCAUSE to 111 instead of 28
Reporter:Mikael Carlsson (mickecarlsson)Labels:
Date Opened:2011-01-26 08:47:36.000-0600Date Closed:2011-09-09 11:17:40
Priority:MinorRegression?
Status:Closed/CompleteComponents:Channels/chan_sip/General
Versions:Frequency of
Occurrence
Related
Issues:
causesASTERISK-18702 Improvement of overlap dialling handling in chan_sip
Environment:Attachments:( 0) full.debug
( 1) full.sipdebug
( 2) full.verbose
( 3) revert-114773.patch
( 4) revert-part-295866.patch
( 5) tele2.pcap
Description:When dialing an incomplete number provider returns "SIP/2.0 484 Number incomplete" but HANGUPCASE is set to 111 instead of 28.
This is working with Asterisk 1.4
Running 1.8.3-rc1
Comments:By: Mikael Carlsson (mickecarlsson) 2011-05-17 00:18:17

Update:
Regression is from 1.4.41 -> 1.6.2.15

In Asterisk 1.6.2.14 it will fail and exit the dialplan with this:
   -- Executing [s@macro-dialout-trunk:20] Dial("SIP/2476-00000000", "SIP/ast-16/040496652,300,") in new stack
   -- Called ast-16/040496652
   -- Got SIP response 484 "Address incomplete" back from 192.168.0.221
 == Everyone is busy/congested at this time (1:0/0/1)
 == Spawn extension (macro-dialout-trunk, s, 20) exited non-zero on 'SIP/2476-00000000' in macro 'dialout-trunk'
 == Spawn extension (from-internal, 040496652, 6) exited INCOMPLETE on 'SIP/2476-00000000'

With Asterisk 1.6.2.15-rc1 this is the result:
   -- Executing [s@macro-dialout-trunk:20] Dial("SIP/2476-00000000", "SIP/ast-16/040496652,300,") in new stack
   -- Called ast-16/040496652
   -- Got SIP response 484 "Address incomplete" back from 192.168.0.221
 == Everyone is busy/congested at this time (1:0/0/1)
   -- Executing [s@macro-dialout-trunk:21] NoOp("SIP/2476-00000000", "Dial failed for some reason with DIALSTATUS = CHANUNAVAIL and HANGUPCAUSE = 111") in new stack

By: Mikael Carlsson (mickecarlsson) 2011-05-17 00:48:40

Update:
Now that I think of it, it must have been broken earlier in the chain. From 1.4.41 to 1.6-branch, as the behavior of exiting non-zero when receiving 484 is not what is happening in 1.4.41.
I will test to see if I can find the revision in 1.6 that broke it.
1.6.2.15-rc1 somewhat fixed the non-zero exit, but returns the wrong HANGUPCAUSE

By: Walter Doekes (wdoekes) 2011-05-17 01:31:02

I seem to recall that a 484 had the dialplan look for a 't' extension. (Don't know if that's relevant.)

By: Mikael Carlsson (mickecarlsson) 2011-05-18 05:58:44

Up to Asterisk 1.6.0.28 the SIP 484 is working as it should.

From version 1.6.1.0 the SIP 484 exits the dialplan and goes to the 't' extension:
   -- Executing [s@macro-dialout-trunk:20] Dial("SIP/2476-09ddda50", "SIP/ast-16/040496652,300,") in new stack
   -- Called ast-16/040496652
   -- Got SIP response 484 "Address incomplete" back from 192.168.0.221
 == Everyone is busy/congested at this time (1:0/0/1)
 == Spawn extension (macro-dialout-trunk, s, 20) exited non-zero on 'SIP/2476-09ddda50' in macro 'dialout-trunk'
 == Spawn extension (from-internal, 040496652, 6) exited INCOMPLETE on 'SIP/2476-09ddda50'
   -- Timeout on SIP/2476-09ddda50
 == CDR updated on SIP/2476-09ddda50
   -- Executing [t@from-internal:1] NoOp("SIP/2476-09ddda50", "Hangupcause 28") in new stack
   -- Executing [t@from-internal:2] Macro("SIP/2476-09ddda50", "hangupcall") in new stack

That is not a correct behavior for SIP 484.
From version 1.6.2.15-rc1 this is somewhat corrected as the dialplan goes to the next exten but sets the wrong HANGUPCAUSE.

By: Mikael Carlsson (mickecarlsson) 2011-05-18 13:05:44

This was added between 1.6.0.28 and 1.6.1.0 in app_dial.c:
+
+               /* SIP, in particular, sends back this error code to indicate an
+                * overlap dialled number needs more digits. */
+               if (chan->hangupcause == AST_CAUSE_INVALID_NUMBER_FORMAT) {
+                       res = AST_PBX_INCOMPLETE;
+               }
+

Here is the bug that is breaking functionality from previous versions causing an unnecessary timeout of three seconds before going to 't' exten.

I will now dig into

By: Mikael Carlsson (mickecarlsson) 2011-05-18 13:06:13

Sorry, more info will follow

By: Mikael Carlsson (mickecarlsson) 2011-05-18 15:24:58

OK, bear with me here.

The code from my previous note in app_dial.c breaks 484 so that it goes to a 't' exten after three second pause.
This is the first bug. It was introduced in 1.6.1.0

The second bug is when this was added: https://reviewboard.asterisk.org/r/740/
That was added in 1.6.2.15-rc1

Now we don't get a three second pause and don't go to 't' exten. But, it returns the wrong hangupcase for SIP 484, 111 instead of 28.

So, if I remove the code in note 0135104 from app_dial.c and this code from channel.c:
              if (ast_check_hangup(chan)) {
                      ast_queue_control(chan, AST_CONTROL_HANGUP);
              } else {

The 484 works again as it should.

Shall I provide those as diffs?

By: Mikael Carlsson (mickecarlsson) 2011-05-19 01:02:47

Code added in app_dial.c that broke SIP 484 is from revision 114773 based on https://issues.asterisk.org/view.php?id=12351

"Inspired by a post on the mailing list, this new application, Incomplete, permits a user to specify that an extension is incomplete and that the PBX should wait for more digits to add to the current extension."

The above is fine for extension to extension dialing, but does not work if trunk reports SIP 484 if you dial a to short number. So, I will go out on a limb and say that this implementation is bad and broken.

As for the code in channel.c, it was added in revision 295843 and fixes some regression from issue 16946, it was reviewed in https://reviewboard.asterisk.org/r/1013/

So, do you need more info? I am on irc with the same handle as I use here. Just ping me, not that my timezone is UTC+2

By: Mikael Carlsson (mickecarlsson) 2011-06-30 13:42:57.983-0500

Asterisk 1.8.5-rc1 is out, this issue is still unsolved. I just tested is on this version. Bummer. I got my hopes up as Rod Montgomery posted this on FreePBX a while ago:
*************
Hi Mikael, I discussed issue 18681 with others in the office and learned that it's likely to be marked as a blocking issue for 1.8.5.
*************

I provided some hint to where the problem lies, I have not coded anything in C for over 20 years and it would take me a lot of time just to go through the code to see how it all connects together.

By: Leif Madsen (lmadsen) 2011-06-30 16:10:44.480-0500

Can you look into this and tell me how we should move forward on it? Thanks!

By: Mikael Carlsson (mickecarlsson) 2011-07-01 14:55:35.548-0500

I just noted that relations where lost in the migration to jira

By: Mikael Carlsson (mickecarlsson) 2011-07-03 07:58:34.406-0500

So I have made some more testing to this.
If I completely revert the code from revision 114773 and part of revision 295866 the HANGUPCASE 28 work again.

The code added in revision 295866 destroys c->hangupcase somewhere in the chain and set it to 111 from 28, it is delivered OK from chan_sip.c. That I have not been able to trace down.

I have added two patches, one that completely reverts 114773 and one that partly removes some code to restore the hangupcase issue.

IMHO, the code in 114773 needs a revision as it complete breaks previous behavior of HANGUPCASE 28.
I have read the description for the function, but I failed to see any use for it, perhaps the author can shed some light to it?

The code in revision 295866 needs to check the c->hangupcase and preserve it so that it is not destroyed in the chain.




By: Mikael Carlsson (mickecarlsson) 2011-07-03 07:59:34.219-0500

Patch that completely removes revision 114773

By: Mikael Carlsson (mickecarlsson) 2011-07-03 08:00:16.192-0500

Patch that removes some code to restore functionality for HANGUPCAUSE 28

By: Mikael Carlsson (mickecarlsson) 2011-08-11 15:21:25.541-0500

Any news or update re this issue or ASTERISK-18099?

By: nicolasom (nicolasom) 2011-08-23 05:04:44.114-0500

Why the issue ASTERISK-18099 is private? I get "permission violation" when trying to enter.

By: Richard Mudgett (rmudgett) 2011-08-23 09:58:43.028-0500

ASTERISK-18099 is a clone of this issue used internally for prioritization.

By: Mikael Carlsson (mickecarlsson) 2011-08-29 00:04:27.206-0500

Is there anything you need (from me) to move this forward?
1.8.6.0-RC3 is out, still no sight of correcting this regression.


By: Matt Jordan (mjordan) 2011-08-31 13:53:42.979-0500

Hi Mikael -

I've begun investigating this issue.  As it turns out, both 114773 and 295866 were necessary changes and cannot be backed out.  However, per RFC 3398, we should be mapping an ISUP code of 28 to a SIP 484 response, so the current behavior of mapping 111 to SIP response 484 is definitely not following the recommendations in the RFC.  It also appears as if it circumvents the Incomplete application logic put into pbx and app_dial by revision 114773, which is also not optimal.  The logic that appears to be off is in our hangup / off-nominal response handling of SIP codes, so that's where we'll probably approach this issue.

If I put a patch up on SVN, would you be willing and able to test it out on a system?  It will be on the 1.8 branch, if that's okay.

Thanks

Matt

By: Mikael Carlsson (mickecarlsson) 2011-08-31 13:56:11.749-0500

Yes, I can test this as soon as it is in svn, no problems.

By: Mikael Carlsson (mickecarlsson) 2011-08-31 13:57:17.270-0500

Forgot to add:
I can test it in a test environment that I have in house and I can test it on a live setup (used for test purposes)

By: Matt Jordan (mjordan) 2011-09-01 12:40:13.767-0500

Mikael -

I have a patch for this issue on a team branch at /team/mjordan/AST_17288/1.8.  When a SIP peer returns a "484 Address Incomplete" response, the Dial application should:
(a) hangup the call
(b) return a HANGUPCAUSE of 28
(c) perform the dialplan logic for attempting to match on an Incomplete extension

The Incomplete extension logic will, if an extension does not exist that could be a potential match, direct you immediately to the 'i' extension (invalid).  If however there are potential matches, it will wait to see if any additional extension numbers are provided.  If none are provided, it will timeout and redirect to the 't' extension.  Otherwise, if will attempt to match whatever is provided.

Matt

By: Mikael Carlsson (mickecarlsson) 2011-09-01 14:11:10.227-0500

I just tested it. Hangupcause is now correct (28) but it never goes to 'i' exten. I don't have any 't' or 'e' exten:
{code}
pbx.c: Timeout, but no rule 't' or 'e' in context 'from-internal'
{code}
It is also redirecting back the 't' exten to a context where it all started, not the context that is actually making the call. That is also a regression from 1.6.0.

I also don't follow the logic for INCOMPLETE, The SIP provider returns SIP 484 and hang up at their end. The phone where I have entered the number has sent it to dialplan logic. How can anything proceed from there via the 't' exten?

{code}

 == Using SIP RTP TOS bits 184
 == Using SIP RTP CoS mark 5
   -- Called SIP/till221test/04552
   -- Got SIP response 484 "Address incomplete" back from 192.168.0.221:5060
 == Everyone is busy/congested at this time (1:0/0/1)
 == Spawn extension (macro-dialout-trunk, s, 20) exited non-zero on 'SIP/2476-0000000c' in macro 'dialout-trunk'
 == Spawn extension (from-internal, 04552, 6) exited INCOMPLETE on 'SIP/2476-0000000c'
   -- Timeout on SIP/2476-0000000c
 == CDR updated on SIP/2476-0000000c
   -- Executing [t@from-internal:1] NoOp("SIP/2476-0000000c", "Timeout from HC 28") in new stack
   -- Auto fallthrough, channel 'SIP/2476-0000000c' status is 'CHANUNAVAIL'
   -- Executing [h@from-internal:1] Hangup("SIP/2476-0000000c", "") in new stack
 == Spawn extension (from-internal, h, 1) exited non-zero on 'SIP/2476-0000000c'
{code}

This is getting from worse to unacceptable.

I have a suggestion, make the INCOMPLETE code configurable  bypassing that INCOMPLETE 'logic' and return to what it actually making sense, setting hangupcause to 28 and move to next priority in the dialplan as it was doing in 1.6.0.


By: Matt Jordan (mjordan) 2011-09-09 10:44:40.968-0500

Mikael -

I wanted to update you on the progress of this patch.  Based on our discussion on asterisk-dev, I think we've got the correct solution.  Here is how it will behave:
Scenario 1 (SIP setting is set to allowoverlap=yes):
1. SIP A calls into Asterisk and matches an extension.  The extension dials a SIP phone B.
2. SIP phone B returns 484 Address Incomplete
3. SIP phone A receives the 484 Address Incomplete, and returns to the Dialplan with a hangup code of 28
4. Dialplan takes over

Scenario 2 (SIP setting is set to allowoverlap=no):
1. SIP A calls into Asterisk and matches an extension.  The extension dials a SIP phone B.
2. SIP phone B returns 484 Address Incomplete
3. SIP phone A receives a 404 Not Found and returns to the Dialplan with a hangup code of 1 (note that this behavior is how it would appear if the address didn't match completely on the dialplan, as opposed to SIP B, and allowoverlap was set to no)
4. Dialplan takes over

I think the only difference then what we had discussed on asterisk-dev is the effect of the allowoverlap SIP flag - while in review, some comments made it clear that regardless of what the Dial application gets back from an external SIP phone, if that flag is set to no, we should return 404 instead of 484.  

As far as the Incomplete application tweaks go, it makes this a bit more fun, but is secondary to the scenario you were facing.  If the Dialplan goes into Incomplete status, and the SIP phone is answered, it will wait for more digits; otherwise it will use the logic outlined in the scenarios above.

This patch is set to be applied to 1.8, 10, and trunk.

By: Mikael Carlsson (mickecarlsson) 2011-09-09 13:48:59.681-0500

Without NO allowoverlap setting in sip.conf
-- Got SIP response 484 "Address incomplete" back from 192.168.0.221:5060
== Everyone is busy/congested at this time (1:0/0/1)
-- Executing [s@macro-dialout-trunk:21] NoOp("SIP/2476-00000000", "Dial failed for some reason with DIALSTATUS = CHANUNAVAIL and HANGUPCAUSE = 28") in new stack

The above is from my test server dialing another Asterisk that has this dialplan:
[test-custom-hangup]
exten => s,1,Wait(2)
exten => s,n,Hangup(28)

I then tested it with allowoverlap=yes and it was the same result as above.

I then tested it with allowoverlap=no and HANGUPCAUSE was set to 1

So, it is working as it should (IMHO) and removes the regression.
Thank you for listening and your coding skill. Much appreciated.