[Home]

Summary:ASTERISK-29614: app_agent_pool: XML Doc: unterminated entity reference
Reporter:Alexander Traud (traud)Labels:
Date Opened:2021-08-25 04:55:53Date Closed:2021-09-02 15:09:30
Priority:MajorRegression?
Status:Closed/CompleteComponents:Applications/app_agent_pool Applications/app_skel Documentation
Versions:13.38.3 16.20.0 18.6.0 19.0.0 Frequency of
Occurrence
Frequent
Related
Issues:
Environment:Ubuntu 18.04 LTS ./configure --enable-xmldoc, which is the default make all or make full, does not matterAttachments:
Description:This issue seems not to be of constant occurrence. Furthermore, I was not able to find any reports about this except several failed builds in Gerrit. Perhaps those are related but I did not investigated those further, yet.

Currently, I am able to replicate it quite constant and was able to investigate a bit with GDB. However, I am reporting early in my analysis, perhaps somebody sees the culprit faster.

*My symptom*:
{code} Loading app_agent_pool.so.
 == Manager registered action Agents
 == Manager registered action AgentLogoff
 == Registered custom function 'AGENT'
 == Registered application 'AgentLogin'
 == Registered application 'AgentRequest'
error : unterminated entity reference            ���
 == app_agent_pool.so => (Call center agent pool applications){code}*My call stack*:
apps/app_agent_pool.c
⤷ load_module
   ⤷ load_config
      ⤷ aco_info_init
         ⤷ type = agent_type (the second type from {{app_agent_pool}}, the one after the {{general_type}})
            ⤷ xmldoc_update_config_type in file main/xml.c
               ⤷ ast_xml_set_text
                  ⤷ xmlNodeSetContent

That error is not printed, when I remove _all_ categories except the terminating NULL, in the file {{apps/app_agent_pool.c}}, in the string array {{agent_type_blacklist}}. That error is not printed, when I comment/disable at least four of the eight {{aco_option_register}} in {{load_config}}.

This looks like a memory corruption, because even when that error is not printed, on the command-line interface (CLI), I am not able to issue {{xmldoc dump <file>}} when the module {{app_agent_pool}} is loaded. I get errors like {{output error : string is not in UTF-8}} or {{xmlEscapeEntities : char out of range}}.

*Workarounds*:
a) {{./configure --disable-xmldoc}} or
b) disable the module {{app_agent_pool}} via {{make menuselect}} or
c) noload the module {{app_agent_pool}} via the configuration file {{modules.conf}} or
d) change the string array {{agent_type_blacklist}} to contain just the terminating {{NULL}} value
Comments:By: Asterisk Team (asteriskteam) 2021-08-25 04:55:56.840-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Alexander Traud (traud) 2021-08-25 07:20:11.135-0500

What a pity: The module, I have chosen to learn and understand how to do it right, {{app_skel}}, breaks the XML documentation as well. I build all modules, I was able to, and until now only those two modules seem to be affected.

By: Benjamin Keith Ford (bford) 2021-08-25 07:32:34.297-0500

Did this work properly on earlier versions of Asterisk? If it's a regression then we'll just go ahead and open up a ticket.

By: Alexander Traud (traud) 2021-08-25 08:20:28.901-0500

Yes, the symptom, the broken XML documentation when trying to dump it, was introduced in Asterisk 13.19 and Asterisk 15.2, with [GERRIT-7530 …|https://gerrit.asterisk.org/7530] (commit [bf2d359|https://github.com/asterisk/asterisk/commit/bf2d359]). Perhaps its author, [~coreyfarrell], is able to comment. However, the actual cause might be even older, when looking at that change. Furthermore, {{tests/test_config.c}} _might_ be affected too, because it uses a {{ACO_BLACKLIST_ARRAY}} as well. Cannot test that right now.

By: George Joseph (gjoseph) 2021-08-25 08:29:17.319-0500

Can you attach the config files you're using?  I tried using the sample agents.conf and am not getting the error.


By: Alexander Traud (traud) 2021-08-25 09:08:12.517-0500

This was a vanilla default installation of the Asterisk 19 branch with {{sudo make samples}}. However, whether you get that error depends on uninitialized memory. Therefore, you might face or not. However, on the CLI, you can use {{xmldoc dump <file>}} to reproduce this error quite constantly. With the latter, I am able to reproduce it in the latest releases of Asterisk 18, 16, and 13.

By: Sean Bright (seanbright) 2021-08-25 09:37:42.898-0500

I am able to reproduce this consistently. Clean install, no {{agents.conf}} file needed (Ubuntu 20.04.3 LTS).

{noformat}
Linux heisenbug 5.4.0-81-generic #91-Ubuntu SMP Thu Jul 15 19:09:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
{noformat}


By: George Joseph (gjoseph) 2021-08-25 09:54:59.691-0500

Yeah me too.  app_agent_pool and app_skel are the only two modules using the blacklist stuff.


By: Sean Bright (seanbright) 2021-08-25 10:22:39.380-0500

Tentative patch on Gerrit.

By: Alexander Traud (traud) 2021-08-25 13:04:16.877-0500

Sorry, cannot comment or help with the patch because I do not even understand what the current source code tried to achieve with a blacklist, let alone the XML thing. Does the code try to cope with a {{\[general\]}} section not at the start but between sections of different agent IDs? Even then, why does it try to cope with a {{\[agents\]}} section, although a section called {{\[agents\]}} is not expected at all.

By: Sean Bright (seanbright) 2021-08-25 13:13:56.320-0500

The blacklist (both before and after Corey's change) told the config framework to ignore sections (before Corey's change it used a regular expression to do that, and after his change you could do it will literal strings to avoid the overhead of regex).

You are correct that the {{agents}} section is not used and that can probably be dropped. It looks like a carry-over from when {{chan_agent}} was converted to {{app_agent_pool}}.

By: Alexander Traud (traud) 2021-08-26 03:09:21.981-0500

Thanks. Just to double-check whether I got [it|https://gerrit.asterisk.org/7564]:
The current code tries to ignore a section ‘general’ as agent ID. Although it could be possible, I named my agent ‘general’, this is not allowed because it is the global/generic section. In other words, such a black list contains all static/well-known sections, not to parse/confuse them with a name for a section which could have any name. Right? OK. If app_agent_pool has just one item in the blacklist, we do not need an array and can go for ACO_BLACKLIST_EXACT. Question (1): Can’t we? If not, I still have not understood the concept.

That would leave the module {{app_skel}}, which is not enabled on default. That brings me to the next question (2): What is this XML API call, what does it try to do exactly?

And that would leave the module {{test_config}}, which is part of the test framework. That is still using RegEx for {{global}}. Question (3): Was that a glitch in the [originating change|https://gerrit.asterisk.org/7530] and {{test_config}} does not need RegEx at all?

That brings me to another question (4) which I think I have an answer for already: If only two modules remain to use ACO_BLACKLIST_ARRAY, why not changing those to RegEx again and kill ACO_BLACKLIST_ARRAY? If I understand your patch correctly, you are doing RegEx again but keep the public symbol ACO_BLACKLIST_ARRAY, for those third-party modules outside of the Asterisk source tree. Right?

That brings me to my last question (5): Why is just the black but not the white list affected by this which are used by {{cel/cel.c}} and {{cdr/cdr.c}}? Looking at your patch, you fixed that as well. I wonder why modules based on that, did not trigger an issue {{xmldoc dump}} here.

That brings me to my very last question (6): Why does {{xmlNodeSetContent(.)}} not detect that issue always? Obviously, nobody ever reported in 3½ years. Is there any way to enable stricter parsing and error checking?

By: Sean Bright (seanbright) 2021-08-26 08:41:03.369-0500

Those are a lot of questions and I am by no means an expert on ACO. I just happened to figure out what was going on with this particular bug. Anyway:

# Yes, we can.
# Which XML API call are you talking about specifically?
# I assume the intent was that all of the various match types were exercised during the test, but I do not know for certain.
# That could be done. My change does not re-introduce the regex but just generates a string that is displayed by 'config show help' - that part was neglected during the change to include the blacklist array. It just looks like a regex so that the output is consistent with other categories.
# I assume because their {{type}} is set to {{ACO_IGNORE}} but I would have to test that to confirm.
# You would have to ask the libxml2 authors, I am not that familiar with the inner workings of that library

By: Alexander Traud (traud) 2021-09-01 04:25:23.750-0500

No obligation. Those questions are for everyone. I am thankful for any answer.

2. I am talking about xmldoc_update_config_type and what follows, especially xmlNodeSetContent. Where in the DOM is that node? And what shall be in that node actually?

I tested your patch and {{config help show app_agent_pool \[globals|global|general\]}}: It changed behavior for what it reacts. I am not sure whether this is intended. Furthermore, the help text seems not right either. Finally, I do not understand those three alternatives: \[globals|global|general\]. Sometimes a module uses that, sometimes it uses this. Isn’t there a way that every module reacts on all three alternatives? The Asterisk configuration file API is so confusing …

By: Sean Bright (seanbright) 2021-09-01 09:12:12.670-0500

bq. 2. I am talking about xmldoc_update_config_type and what follows, especially xmlNodeSetContent. Where in the DOM is that node? And what shall be in that node actually?

The documentation says that it "Update\[s] the XML documentation for a config type based on its registration." Asterisk stores documentation for applications, functions, etc. in an XML document and when something registers itself with the ACO framework it is able to augment that documentation at runtime.

I do not see a call to {{xmlNodeSetContent}} in {{xmldoc_update_config_type()}} (and again, {{xmlNodeSetContent}} is from libxml2, not Asterisk). I feel like your other questions can be determined just by looking at {{xmldoc_update_config_type()}} so I will leave that to you. It's also possible I don't understand your additional questions.

bq. I tested your patch and {{config help show app_agent_pool \[globals|global|general]}}: It changed behavior for what it reacts. I am not sure whether this is intended.

Before my patch:

{noformat}
*CLI> config show help app_agent_pool [globals|global|general]
Unknown configuration type [globals|global|general]
*CLI> config show help app_agent_pool globals
Unknown configuration type globals
*CLI> config show help app_agent_pool global
global: [category =~ /general/]

Unused, but reserved.

*CLI> config show help app_agent_pool general
Unknown configuration type general
{noformat}

And after my patch:

{noformat}
*CLI> config show help app_agent_pool [globals|global|general]
Unknown configuration type [globals|global|general]
*CLI> config show help app_agent_pool globals
Unknown configuration type globals
*CLI> config show help app_agent_pool global
global: [category =~ /^general$/]

Unused, but reserved.

*CLI> config show help app_agent_pool general
Unknown configuration type general
{noformat}

The only difference I see is:

{noformat}
- global: [category =~ /general/]
+ global: [category =~ /^general$/]
{noformat}

Is that what you are referring to? If not, can you be specific about what the differing behavior is you are seeing?

bq. Furthermore, the help text seems not right either.

Can you please be specific about what is not right?

bq. Finally, I do not understand those three alternatives: \[globals|global|general].

I do not know where you are getting that string ({{\[globals|global|general]}}) from. It's not in any of the output I have seen from {{config show help app_agent_pool}} but maybe I am not looking in the right place.

bq. Sometimes a module uses that, sometimes it uses this. Isn’t there a way that every module reacts on all three alternatives?

What are the "that" and the "this" that you are referring to here?

By: Alexander Traud (traud) 2021-09-01 11:38:40.143-0500

Some modules use {{\[general\]}}, some {{\[global\]}}, some {{\[globals\]}} with an s for the generic part (yet, a fourth name for that section). One of those three alternatives. And, yes, in {{app_agent_pool}}, that section is called ‘generic’. However, the help works only for ‘global’ although I expect it to be ‘generic’. And, yes, the help text was and is confusing: It mentions ‘global’ at the start and then shows that escaped style about ‘generic’ – was that Regular Expression before? So, yes, that is what I am referring to. I have no idea what was and is intended (by the original code).

bq. it is able to augment that documentation at runtime.

Got that. But I wonder, what exactly is expected to be shown to the user and what should be in the XML DOM.

bq. I do not see a call to xmlNodeSetContent in xmldoc_update_config_type()

{{xmldoc_update_config_type}} calls {{ast_xml_set_text(tmp, category)}} and that is just a wrapper for {{xmlNodeSetContent}}. That was the place in code which gave me the error, at least the error, I saw.

bq. other questions can be determined just by looking at

Puh. I am working with XML and XPath now exactly twenty years. I do not get it. Probably, because I do not get the high level idea what this code is about. Why does it need to update that documentation at runtime? Or asked differently: That variable {{category}} in {{ast_xml_set_text(.)}} should be just plain text. Nothing special, no further XPath pointer, no Regular Expression. Just text, to be shown to the user. Right? Previously, it was a C array. Therefore, the confusion. Then it was no memory corruption but just a wrong type of data in that XML string. And my initial analysis was incorrect. Was it?

By: Alexander Traud (traud) 2021-09-01 11:42:34.114-0500

Just to be clear: All error messages, I got, come directly from the libxml2 library. Those messages were not at the Asterisk level.

By: Sean Bright (seanbright) 2021-09-01 12:35:17.107-0500

I am probably not the right person to answer your questions. Everything I know about the ACO API I have learned while fixing this issue and trying to answer your questions. I do not have any extra insight into the reasons things were done the way they were.

There is no rule or policy about the name of the "general" section of a module's configuration. Different module authors chose different names.

bq. And, yes, in {{app_agent_pool}}, that section is called ‘generic’. However, the help works only for ‘global’ although I expect it to be ‘generic’.

Yes, I agree that it is confusing.

bq. And, yes, the help text was and is confusing: It mentions ‘global’ at the start and then shows that escaped style about ‘generic’ – was that Regular Expression before?

The code that prints the help text out assumes that the text from the {{category}} node is a regular expression - because everything was a regex before Corey's change - and prints it out {{/between slashes/}} to communicate that to the user. My patch turns {{ARRAY}} and {{EXACT}} matches back into regex-like syntax just for display in the {{config show help}} output.

bq. I have no idea what was and is intended (by the original code).

Neither do I.

bq. But I wonder, what exactly is expected to be shown to the user and what should be in the XML DOM.

All of this is done in {{xmldoc_update_config_type()}}. [It tries to find the {{configObject}} node associated with the type being registered|https://github.com/asterisk/asterisk/blob/6cc004dc5aa92bba2abd5bf3562c48340594ae6c/main/config_options.c#L1061]. If it can find it, [it creates a {{syntax}} node|https://github.com/asterisk/asterisk/blob/6cc004dc5aa92bba2abd5bf3562c48340594ae6c/main/config_options.c#L1071] underneath it and everything else is a child of that newly created {{syntax}} node. The generated DOM node ends up looking like (from {{app_confbridge}}):

{code:xml}
<syntax>
 <matchInfo>
   <category match="false">^general$</category>
   <field name="type">user</field>
 </matchInfo>
</syntax>
{code}

bq. Or asked differently: That variable {{category}} in {{ast_xml_set_text(.)}} should be just plain text. Nothing special, no further XPath pointer, no Regular Expression. Just text, to be shown to the user. Right?

-That is correct.- It is text that looks like a regular expression, which is why the help output uses {{=~}} and {{!\~}} which are typically used to indicate regular expression matching.

bq. Previously, it was a C array. Therefore, the confusion.

This is what this patch fixes. The code was trying to write a {{const char **}} as a string into the {{category}} node. This bug has existed since {{_ARRAY}} and {{_EXACT}} were added.

bq. Then it was no memory corruption but just a wrong type of data in that XML string.

Trying to write the {{const char **}} as a string caused the memory corruption. I don't know what the "wrong type of data in that XML string" means?

bq. And my initial analysis was incorrect. Was it?

Your initial analysis was that there was memory corruption. That is correct, it was memory corruption.

By: Sean Bright (seanbright) 2021-09-01 12:53:34.666-0500

Hop in to {{#asterisk-dev}} on [libera.chat|https://libera.chat/] if you would like to discuss more. It is easier to have a conversation like this in real-time.

By: Alexander Traud (traud) 2021-09-02 02:08:53.718-0500

bq. I don't know what the "wrong type of data in that XML string" means?

A node in XML can be anything. However, without special treatments (CDATA, special character escaping), it is text which is not allowed to violate the text encoding used as well (in our case UTF-8, I think). Therefore, a lot of restrictions apply to the final XML text. So this was a bug because converting a C string structure into a XML text node was not done. However, what confused me, the result had to be a Regular Expression *look-a-like* text to the user.

bq. The code was trying to write a const char ** as a string into the category node.

Then, I got it now. And the content is nothing but a look-a-like Regular Expression, displayed to the user but not used/executed by XML or something else.

bq. help output uses =~ and !~ which are typically used to indicate regular expression matching

Working with Regular Expressions (and matching) for 19 years now. Never seen that before. Learning never ends.

bq. Different module authors chose different names.

My Software Usability Engineer would give me a kick in the ass … never name things different if they do not differ. It is a pity that this passed code-inclusion-process. Furthermore, the console help does not cope with all the alternatives but requires knowledge about the one for that particular module. That made analysis of this issue even more confusing to me, why is this global, generic, globals, or general … what is the reason behind that … ahh, there is no reason. Glad, I posted this issue early. Would have taken me days to understand it.

By: Sean Bright (seanbright) 2021-09-02 08:34:52.603-0500

bq. Working with Regular Expressions (and matching) for 19 years now. Never seen that before. Learning never ends.

I may have overstated it's ubiquity but I did first [see it in Perl|https://perldoc.perl.org/perlop#Binding-Operators] which I consider fairly well known.

bq. My Software Usability Engineer would...

[As you know|https://issues.asterisk.org/jira/browse/ASTERISK-28844?focusedCommentId=250403&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-250403], there is no one that works on the project that fits that role. You're welcome to fill that role if it is something you are interested in doing.

I believe the patch that is up for review resolves this issue, if you feel otherwise please let me know either here or via code review comments.

By: Friendly Automation (friendly-automation) 2021-09-02 15:09:31.945-0500

Change 16433 merged by Friendly Automation:
config_options: Handle ACO arrays correctly in generated XML docs.

[https://gerrit.asterisk.org/c/asterisk/+/16433|https://gerrit.asterisk.org/c/asterisk/+/16433]

By: Friendly Automation (friendly-automation) 2021-09-02 15:13:05.971-0500

Change 16434 merged by Friendly Automation:
config_options: Handle ACO arrays correctly in generated XML docs.

[https://gerrit.asterisk.org/c/asterisk/+/16434|https://gerrit.asterisk.org/c/asterisk/+/16434]

By: Friendly Automation (friendly-automation) 2021-09-02 15:17:18.390-0500

Change 16435 merged by Kevin Harwell:
config_options: Handle ACO arrays correctly in generated XML docs.

[https://gerrit.asterisk.org/c/asterisk/+/16435|https://gerrit.asterisk.org/c/asterisk/+/16435]

By: Friendly Automation (friendly-automation) 2021-09-02 15:17:32.941-0500

Change 16401 merged by Kevin Harwell:
config_options: Handle ACO arrays correctly in generated XML docs.

[https://gerrit.asterisk.org/c/asterisk/+/16401|https://gerrit.asterisk.org/c/asterisk/+/16401]