I've determined the source of the issue.
In the message that crashed Asterisk, the Content-Type header is as follows:
The backtick (`) character before the "boundary" parameter causes PJSIP to be unable to determine the boundary for the multipart message. What PJSIP does in this instance is to try to guess the boundary by inspecting the body. It looks for the first "--" in the body and determines that whatever comes between that and the next CRLF (or just LF in your case) is the boundary. The first instance of "--" in the body is immediately followed by a newline, so PJSIP determines that the boundary is an empty string.
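To make the guessing step concrete, here is a minimal sketch of the behavior described above. It is not PJSIP's actual code: `guess_boundary` is a hypothetical helper written only for illustration, and it assumes the body is a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not PJSIP's actual code: guess a multipart boundary
 * the way the text describes. Find the first "--" in the body, then treat
 * everything up to the next CR or LF as the boundary.
 * Returns a pointer to the guessed boundary inside `body` and stores its
 * length in *len, or returns NULL if the body contains no "--". */
static const char *guess_boundary(const char *body, size_t *len)
{
    assert(body != NULL && len != NULL);

    const char *dashes = strstr(body, "--");
    if (dashes == NULL)
        return NULL;

    const char *start = dashes + 2;     /* skip the "--" marker */
    *len = strcspn(start, "\r\n");      /* stop at CR, LF, or end of string */
    return start;
}
```

With a body that begins with "--" followed directly by a newline, this sketch reports a boundary of length zero, which matches the empty-string boundary described above.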
Given that information, the first two lines of the body are
PJSIP determines that the first "--" is the opening boundary of the multipart body. The next line ("--++=AAA") starts with the boundary marker ("--") and so PJSIP considers that to be the end of the current multipart part. Therefore, the part it found is zero-length. Here's where things go wrong. The parser has two pointers: start_body and end_body. As the names imply, start_body points to the start of the part, and end_body points to the end of the part. Since the part in this case is zero-length, start_body and end_body point to the same address. However, PJSIP makes the assumption that it needs to trim the CRLF that precedes the end_body pointer and adjust the end_body pointer as necessary:
/* The newline preceeding the delimiter is conceptually part of
 * the delimiter, so trim it from the body.
 */
if (*(end_body-1) == '\n')
    --end_body;
if (*(end_body-1) == '\r')
    --end_body;
This results in end_body pointing to an address lower than start_body. PJSIP then attempts to parse the part as follows:
part = parse_multipart_part(pool, start_body, end_body - start_body,
Notice the end_body - start_body. With end_body one byte below start_body, the calculation results in a length of -1. Since this value is interpreted as unsigned, the length passed to parse_multipart_part wraps around to 18446744073709551615 (2^64 - 1 on a 64-bit build) instead. This results in the following loop running past where it should:
while (p!=end && *p!='\n') ++p;
In some cases, these invalid reads of *p cause a crash. In other cases, they do not.
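The wraparound can be reproduced in isolation. The sketch below is a hypothetical stand-in, not PJSIP's parser: `part_length_after_trim` assumes a zero-length part starting at offset `start_off` in `buf`, applies the same two trimming checks, and returns the difference converted to `size_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the trimming and length computation described
 * above. The part is assumed to be zero-length, so end_body starts out
 * equal to start_body. start_off must be at least 2 so that both reads
 * stay inside the buffer. */
static size_t part_length_after_trim(const char *buf, size_t start_off)
{
    assert(buf != NULL && start_off >= 2);

    const char *start_body = buf + start_off;
    const char *end_body = start_body;      /* zero-length part */

    if (*(end_body - 1) == '\n')
        --end_body;
    if (*(end_body - 1) == '\r')
        --end_body;

    /* end_body may now be below start_body, so the signed difference is
     * negative and wraps when converted to an unsigned type. */
    return (size_t)(end_body - start_body);
}
```

For the body "--\n--++=AAA\n" with the part starting at offset 3, the function returns SIZE_MAX, which is 18446744073709551615 when size_t is 64 bits wide.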
There are three main issues I have found with the multipart body parsing:
- If no boundary is specified in the Content-Type header, PJSIP tries to guess the boundary instead. This could result in an incorrect guess. It is probably a better idea to fail at parsing under this condition.
- When searching for delimiters between multipart body parts, PJSIP does not follow the rules set forth by RFC 2046, section 5.1.1:
The boundary may be followed by zero or more characters of
linear whitespace. It is then terminated by either another CRLF and
the header fields for the next part, or by two CRLFs, in which case
there are no header fields for the next part.
In other words, the "--++=AAA" should not have been interpreted as a delimiter since there was more than just whitespace and a CRLF after the determined boundary of "--".
- There is a logical error whenever a zero-length body part is encountered that results in the calculated length of the part being negative instead of zero. The zero-length body needs to be special-cased so that this calculation does not occur.
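As a rough illustration of how the last two issues might be addressed, the sketch below guards the trimming so that end_body never drops below start_body, and checks a candidate delimiter line against the RFC 2046 rule quoted above. Both `trimmed_part_length` and `is_delimiter_line` are hypothetical helpers, not PJSIP functions or the eventual upstream fix. Accepting a trailing "--" for the closing delimiter is an addition based on RFC 2046, not something taken from the analysis above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: trim the newline that precedes a delimiter, but
 * never move end_body below start_body. A zero-length part stays at
 * length zero instead of going negative. */
static size_t trimmed_part_length(const char *start_body, const char *end_body)
{
    assert(start_body != NULL && end_body >= start_body);

    if (end_body > start_body && *(end_body - 1) == '\n')
        --end_body;
    if (end_body > start_body && *(end_body - 1) == '\r')
        --end_body;
    return (size_t)(end_body - start_body);
}

/* Hypothetical helper: a line counts as a delimiter only if "--" plus the
 * boundary is followed by nothing but linear whitespace before the line
 * ending. The closing form with a trailing "--" is accepted as well. */
static bool is_delimiter_line(const char *line, const char *boundary)
{
    size_t blen = strlen(boundary);

    if (strncmp(line, "--", 2) != 0)
        return false;
    line += 2;
    if (strncmp(line, boundary, blen) != 0)
        return false;
    line += blen;
    if (strncmp(line, "--", 2) == 0)        /* closing delimiter */
        line += 2;
    while (*line == ' ' || *line == '\t')   /* linear whitespace */
        ++line;
    return *line == '\r' || *line == '\n' || *line == '\0';
}
```

Under these assumptions, a line such as "--++=AAA" is rejected as a delimiter for an empty boundary, and a zero-length part yields a length of 0 rather than a wrapped-around value.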
The sample INVITE that you provided exploits all three of these issues in order to make the crash occur. Issue 3 (the zero-length body part) is what actually triggers the crash, though. The following SIP message causes the same crash but only exploits issue 3:
INVITE sip:email@example.com SIP/2.0
Via: SIP/2.0/UDP sip.example.com;branch=7c337f30d7ce.1
From: "Alice, A," <sip:firstname.lastname@example.org>
To: Bob <sip:email@example.com>
CSeq: 1 INVITE
Contact: Alice <sip:firstname.lastname@example.org>
We plan to get in touch with the PJProject maintainers and at the very least get issue number 3 fixed. The behavior behind the first two issues may be needed for interoperability purposes, but we will suggest that those behaviors be changed as well.