
Summary: ASTERISK-28295: chan_sip / pjsip: Non UTF-8 handling could be better
Reporter: Philip Mott (nexbridge)
Labels: pjsip
Date Opened: 2019-02-18 10:23:32.000-0600
Date Closed:
Priority: Minor
Regression?
Status: Open/New
Components: Channels/chan_sip/General pjproject/pjsip
Versions: 13.23.1
Frequency of Occurrence: Occasional
Related Issues:
Environment: Debian 9.6
Attachments:
Description: We recently received a SIP INVITE with a request line of the following form:

bq. INVITE sip:01234567890%A0@1.2.3.4

Which led to the following Asterisk errors:

bq. json.c: Error building JSON from '\{s: s, s: s\}': Invalid UTF-8 string.
bq. stasis_channels.c: Error creating message

I believe "%A0" is a URL-encoded non-breaking space, so it seems like that's not being handled correctly.

Three questions:

# Is this something that can be fixed, in the sense of not generating error messages?
# Is there a simple way for me to recreate the error?  Only I'm finding it difficult to put a non-breaking space into the Dial command, and obviously if I can replicate the issue then I can look into ways of protecting against it if need be.
# How would you specify special characters like that in the Asterisk REGEX function?

P.S. Sorry I couldn't find a component relating to json.c when filling out this form so went with 'General' - please correct if wrong!
Comments:

By: Asterisk Team (asteriskteam) 2019-02-18 10:23:35.238-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Chris Savinovich (csavinovich) 2019-02-19 10:53:05.104-0600

Hello Philip, please provide us with more information. Which channel technology are you using (chan_sip or chan_pjsip)? Some more background on how the call originated would also help: is it another phone on the same network, and if so, what brand of phone is it? Is it a local channel call?

Thanks
C. Savinovich


By: Philip Mott (nexbridge) 2019-02-19 10:57:06.837-0600

Hi Chris,

We're using chan_sip, and the call originated from outside our network, so I can't say how it was generated - I wish I knew!  I've tried replicating the issue by using various other strange unicode characters and making internal calls using the Dial command, but it's handled those fine, which again seems to imply it's this specific character that's causing problems for some reason.

Thanks,

Philip

By: Chris Savinovich (csavinovich) 2019-02-19 15:41:17.988-0600

Hi Philip,
Regrettably, starting with the 13.x versions, Asterisk is moving toward supporting only chan_pjsip for SIP calls and is leaving new chan_sip issues to its community developers. You are more than welcome to submit a chan_sip patch for this issue through our Gerrit site, or to look for a fix on the community site.
We also tested whether the same error occurs in chan_pjsip, and the good news is that we could not reproduce it there. chan_pjsip is something we will gladly support.
Thanks,
C. Savinovich


By: Sean Bright (seanbright) 2019-02-27 13:57:33.263-0600

FWIW, {{0xA0}} is not a valid UTF-8 sequence (in extended ASCII it would be an {{á}} and not a non-breaking space) and is therefore not able to be represented in JSON.
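
The point about {{0xA0}} can be checked outside Asterisk. The following is a minimal Python sketch, not Asterisk code: it percent-decodes the user part from the reported INVITE and shows how the resulting byte reads under different encodings. The helper name {{is_valid_utf8}} is hypothetical, and "extended ASCII" is assumed here to mean code page 437.

```python
from urllib.parse import unquote_to_bytes

# User part of the request URI from the report.
user_part = "01234567890%A0"

# Percent-decoding produces raw bytes; %A0 becomes the single byte 0xA0.
raw = unquote_to_bytes(user_part)


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone 0xA0 is a UTF-8 continuation byte with no lead byte, so it is invalid.
print(is_valid_utf8(raw))

# The same byte is a non-breaking space (U+00A0) in Latin-1 ...
print(repr(raw.decode("latin-1")[-1]))

# ... and the letter a-acute (U+00E1) in code page 437.
print(repr(raw.decode("cp437")[-1]))

# A non-breaking space that is valid UTF-8 takes two bytes: 0xC2 0xA0.
print("\u00a0".encode("utf-8"))
```

Under these assumptions, a sender that meant a UTF-8 non-breaking space would have put {{%C2%A0}} in the URI rather than {{%A0}}.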

By: Philip Mott (nexbridge) 2019-02-28 03:39:12.522-0600

Yeah I've now found that I can replicate the issue by inserting a non-breaking space using the latin1 character set.
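
For anyone else trying to reproduce this, here is a rough Python sketch of how the two encodings of a non-breaking space end up in a percent-encoded URI user part, plus one possible way to guard against the invalid case before building JSON. It is an illustration only and does not reflect how Asterisk handles the string internally; {{to_json_safe}} and the {{exten}} key are hypothetical names.

```python
import json
from urllib.parse import quote, unquote_to_bytes

NBSP = "\u00a0"
number = "01234567890" + NBSP

# Latin-1 encodes the non-breaking space as one byte, giving the reported form.
latin1_user = quote(number, encoding="latin-1")
print(latin1_user)

# UTF-8 encodes it as two bytes, which decodes cleanly on the receiving side.
utf8_user = quote(number, encoding="utf-8")
print(utf8_user)


def to_json_safe(raw: bytes) -> str:
    """Hypothetical guard: swap undecodable bytes for U+FFFD before JSON encoding."""
    return raw.decode("utf-8", errors="replace")


# The Latin-1 form no longer breaks JSON encoding once it has been sanitized.
safe = to_json_safe(unquote_to_bytes(latin1_user))
payload = json.dumps({"exten": safe})
print(payload)
```

Replacing the bad byte loses the original character, so whether to replace, strip, or reject such input is a design choice for whoever applies a guard like this.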