Summary: | ASTERISK-19613: Multibyte characters in files are not handled properly (comparing a signed char with an int gives an incorrect result if the byte is part of a multibyte character) | ||
Reporter: | LiuYan刘研 (lovetide) | Labels: | |
Date Opened: | 2012-04-01 03:08:47 | Date Closed: | 2017-03-06 14:15:01.000-0600 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | PBX/pbx_spool |
Versions: | 1.8.11.0 | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | CentOS 5.8 x86_64 using UTF-8 encoding | Attachments: | ( 0) asterisk-does-not-handle-multibyte-characters-properly.png.jpg |
Description: | I created a .call file that sets variables to Chinese characters ({{Set: CustomerName=姓名}}), but I got empty strings when retrieving the values of these variables in the dialplan. After reading the source code, I found that the characters are in fact treated as trailing whitespace and stripped. The cause may be the comparison of a signed char with an int value. A multibyte character usually contains bytes whose highest bit (0x80) is set, so the signed value of such a byte is negative, and comparing it with 33 gives an incorrect result. Unsigned values should be used in such cases. Related source code lines:
{noformat}
----------
pbx/pbx_spool.c:160: while(!ast_strlen_zero(buf) && buf[strlen(buf) - 1] < 33)
pbx/pbx_spool.c:167: while ((*c) && (*c < 33))
----------
{noformat}
Other source files also contain such comparisons, for example:
{noformat}
$ grep -inr --color -E "[<>]( )*3[0-9][^0-9]" *
----------
...
channels/chan_sip.c:3160: return (l_name >= len && name[len] < 33 &&
channels/chan_sip.c:17212: if (x < 31 && ast_codec_pref_index(pref, x + 1))
channels/chan_sip.c:21760: while(*code && (*code > 32)) { /* Search white space */
channels/chan_sip.c:21767: while(*sep && (*sep > 32)) { /* Search white space */
...
utils/extconf.c:1457: while (*str && *str < 33)
utils/extconf.c:1482: while ((work >= str) && *work < 33)
utils/extconf.c:3579: while(*c && (*c > 32)) c++;
...
utils/muted.c:135: while(strlen(buf) && (buf[strlen(buf) - 1] < 33))
utils/muted.c:141: if (*val < 33)
utils/muted.c:148: while(*val && (*val < 33)) val++;
utils/muted.c:264: while(strlen(buf) && (buf[strlen(buf) - 1] < 33))
...
----------
{noformat}
I'm not sure whether that code produces incorrect results too (*it looks likely*). I hope the developers can check their code and use a common method to handle such comparisons (_include/asterisk/strings.h_ ?) | ||
Comments: | By: LiuYan刘研 (lovetide) 2012-04-01 04:06:30.023-0500
Asterisk does not handle multibyte characters properly due to comparing signed char with int.
By: Sean Bright (seanbright) 2017-03-06 14:15:02.128-0600
The code that you referenced no longer exists as of Asterisk 13.14.0, and I am not able to replicate your results. If you can reproduce this in Asterisk 13 or later, please feel free to re-open this issue. |
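The signed comparison described in the ticket can be reproduced in isolation. The following is a minimal sketch, not Asterisk code: the function names `trim_trailing_signed` and `trim_trailing_unsigned` are hypothetical, and the first one only mirrors the `buf[strlen(buf) - 1] < 33` pattern quoted from pbx_spool.c, with the signedness made explicit via a cast so that the behaviour does not depend on whether plain `char` is signed on the build platform. It assumes the usual two's-complement conversion, under which a byte such as 0xE5 becomes a negative `signed char`.

```c
#include <string.h>

/* Hypothetical helper mirroring the reported pattern. A byte >= 0x80
 * becomes negative as a signed char, so "< 33" is true for it and the
 * byte is stripped as if it were trailing whitespace. */
static void trim_trailing_signed(char *buf)
{
    size_t len = strlen(buf);

    while (len > 0 && (signed char)buf[len - 1] < 33)
        buf[--len] = '\0';
}

/* Same loop, but the byte is compared as an unsigned value (0..255),
 * so only control characters and the space character are stripped. */
static void trim_trailing_unsigned(char *buf)
{
    size_t len = strlen(buf);

    while (len > 0 && (unsigned char)buf[len - 1] < 33)
        buf[--len] = '\0';
}
```

Given the line `CustomerName=姓名` encoded as UTF-8 (bytes E5 A7 93 E5 90 8D) followed by a space and a newline, the signed variant strips everything after the `=`, which matches the empty value reported above, while the unsigned variant removes only the trailing space and newline.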