Summary: ASTERISK-19613: Multibyte characters in files are not handled properly (a signed char compared to an int gives an incorrect result if the byte is part of a multibyte character)
Reporter: LiuYan刘研 (lovetide)
Labels:
Date Opened: 2012-04-01 03:08:47
Date Closed: 2017-03-06 14:15:01.000-0600
Priority: Major
Regression? No
Status: Closed/Complete
Components: PBX/pbx_spool
Versions: 1.8.11.0
Frequency of Occurrence: Constant
Related Issues:
Environment: CentOS 5.8 x86_64 using UTF-8 encoding
Attachments: (0) asterisk-does-not-handle-multibyte-characters-properly.png.jpg
Description: I created a .call file that sets variables to Chinese characters ({{Set: CustomerName=姓名}}), but I got empty strings when retrieving the values of these variables in the dialplan. After reading the source code, I found that the bytes are in fact treated as trailing whitespace and stripped.

The cause appears to be a comparison between a signed char and an int value. The bytes of a multibyte character usually have their highest bit (0x80) set, so on platforms where char is signed their values are negative, and comparing them to 33 gives an incorrect result: every such byte counts as less than 33. Unsigned values should be used in such cases.
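To illustrate the comparison described above, here is a minimal sketch. The helper names {{is_blank_signed}} and {{is_blank_unsigned}} are hypothetical and are not part of Asterisk. The first takes a {{signed char}} explicitly so that the behaviour does not depend on whether plain {{char}} is signed on a given platform.

```c
#include <stdbool.h>

/* Hypothetical helper: the comparison as written in the reported lines,
 * with the byte held in a signed char (as plain char is on x86_64 Linux). */
static bool is_blank_signed(signed char c)
{
    return c < 33;   /* c is promoted to int, so a byte value of -27 passes */
}

/* Hypothetical helper: the same test after converting to unsigned char. */
static bool is_blank_unsigned(char c)
{
    return (unsigned char)c < 33;   /* 0xE5 becomes 229, which is not < 33 */
}
```

For the byte 0xE5 (the first UTF-8 byte of 姓), {{is_blank_signed}} sees -27 and returns true, while {{is_blank_unsigned}} sees 229 and returns false.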

Related source code lines:
{noformat}
----------
pbx/pbx_spool.c:160:            while(!ast_strlen_zero(buf) && buf[strlen(buf) - 1] < 33)
pbx/pbx_spool.c:167:                            while ((*c) && (*c < 33))
----------
{noformat}
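One possible way to make the two loops above safe for multibyte input is to cast each byte to {{unsigned char}} before comparing. The following is only a sketch under that assumption: {{trim_trailing_blanks}} and {{skip_leading_blanks}} are hypothetical standalone functions, not the actual pbx_spool.c code, and they use plain {{strlen}} instead of Asterisk helpers.

```c
#include <string.h>

/* Hypothetical stand-in for the loop at pbx_spool.c:160:
 * strip trailing bytes below 33 (control characters and space). */
static void trim_trailing_blanks(char *buf)
{
    size_t len = strlen(buf);

    /* The cast keeps bytes >= 0x80 positive, so they are not stripped. */
    while (len > 0 && (unsigned char)buf[len - 1] < 33) {
        buf[--len] = '\0';
    }
}

/* Hypothetical stand-in for the loop at pbx_spool.c:167:
 * advance past leading bytes below 33. */
static const char *skip_leading_blanks(const char *c)
{
    while (*c && (unsigned char)*c < 33) {
        c++;
    }
    return c;
}
```

With these versions, a buffer holding {{CustomerName=姓名}} followed by a space and a line ending keeps the two Chinese characters and loses only the trailing whitespace.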

Other source files also contain such comparisons, for example:
{noformat}
$ grep -inr --color -E "[<>]( )*3[0-9][^0-9]" *
----------
...
channels/chan_sip.c:3160:       return (l_name >= len && name[len] < 33 &&
channels/chan_sip.c:17212:              if (x < 31 && ast_codec_pref_index(pref, x + 1))
channels/chan_sip.c:21760:              while(*code && (*code > 32)) {  /* Search white space */
channels/chan_sip.c:21767:              while(*sep && (*sep > 32)) {    /* Search white space */
...
utils/extconf.c:1457:   while (*str && *str < 33)
utils/extconf.c:1482:           while ((work >= str) && *work < 33)
utils/extconf.c:3579:           while(*c && (*c > 32)) c++;
...
utils/muted.c:135:                      while(strlen(buf) && (buf[strlen(buf) - 1] < 33))
utils/muted.c:141:                              if (*val < 33)
utils/muted.c:148:                              while(*val && (*val < 33)) val++;
utils/muted.c:264:              while(strlen(buf) && (buf[strlen(buf) - 1] < 33))
...
----------
{noformat}
I'm not sure whether this code also produces incorrect results (*looks likely*). I hope the developers can check it and use a common method to handle such comparisons (_include/asterisk/strings.h_ ?).
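As a rough idea of what such a common method could look like, the sketch below puts the cast in one place. The names {{byte_is_blank}} and {{skip_token}} are hypothetical and do not exist in _include/asterisk/strings.h_. The second function covers the {{> 32}} pattern from the grep output: with a signed char, that scan stops at the first byte with the high bit set, because a negative value is never greater than 32.

```c
#include <stdbool.h>

/* Hypothetical shared predicate: true for control characters and space
 * (byte values 0..32), false for everything else, including bytes >= 0x80. */
static inline bool byte_is_blank(char c)
{
    return (unsigned char)c <= 32;
}

/* Hypothetical replacement for loops of the form
 *     while (*s && (*s > 32)) s++;
 * Returns a pointer to the first blank byte or to the terminating NUL. */
static const char *skip_token(const char *s)
{
    while (*s && !byte_is_blank(*s)) {
        s++;
    }
    return s;
}
```

Routing every call site through a single predicate would keep the signedness handling in one place instead of repeating the cast in each loop.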
Comments:
By: LiuYan刘研 (lovetide) 2012-04-01 04:06:30.023-0500

Asterisk does not handle multibyte characters properly due to comparing a signed char with an int

By: Sean Bright (seanbright) 2017-03-06 14:15:02.128-0600

The code that you referenced no longer exists as of Asterisk 13.14.0 and I am not able to replicate your results. If you can reproduce this in Asterisk 13 or later, please feel free to re-open this issue.