[Home]

Summary:ASTERISK-18394: T.38 FAX passthrough does not work
Reporter:Fabian Borot (fborot)Labels:
Date Opened:2011-08-31 09:38:43Date Closed:2011-09-01 17:00:38
Priority:MajorRegression?
Status:Closed/CompleteComponents:Channels/chan_sip/T.38
Versions:1.8.3 1.8.5.0 Frequency of
Occurrence
Constant
Related
Issues:
is related toASTERISK-18402 Asterisk accepts a re-INVITE to switch from T.38 back to voice, but does not switch back
is related toASTERISK-18400 RTCP Receiver Reports are sent for idle RTP sessions
is related toASTERISK-18401 Debugging messages generated by 'udptl debug' are incomplete
Environment:Linux asterisk1-8.lab.com 2.6.18-238.el5 #1 SMP Thu Jan 13 15:51:15 EST 2011 x86_64 x86_64 x86_64 GNU/Linux using CentOS 5.6 and also CentOS 5.2 No firewalls, no iptables, using public IPs on all devicesAttachments:( 0) fax.pcap
( 1) fax-log.txt
( 2) fax-quintum-asterisk-switchvox-20110901.pcap
( 3) good-fax-call-ptime20.pcap
( 4) quintum_to_asterisk_to_img-console_log-fax_failed.txt
( 5) quintum_to_asterisk_to_img-fax_failed.pcap
( 6) quintum_to_asterisk_to_switchvox-console_log-20110901-fax_OK.txt
( 7) quintum_to_nextone_to_img-1-fax_OK.pcap
( 8) quintum_to_nextone_to_img-2-fax_OK.pcap
Description:A call is received on our asterisk and is routed to the destination UA. The call connects as a regular voice call (only audio codecs offered at this point) then the destination UA sends a Re-INVITE to switch to T.38 and the Re-INVITE is passed back to the calling UA and accepted with 200 OK with the SDP showing fax capabilities. Up to this point everything looks normal and fine. Also almost at the same time that the Re-INVITE is received we can actually hear the fax tone on the fax machine connected to the calling UA.
We started seeing a lot of udptl packets being sent from the Quintum to the Asterisk and only 1 of those packets is sent to the destination. After 20 seconds of inactivity the Quintum sends Re-INVITE with voice codecs only and our Asterisk sends a 200 OK still with T.38 information. then after around 20 seconds the Quintum sends BYE to shutdown the call.
Another thing that I noticed is that both the Quintum and the IMG keep the same port for voice and fax on their SDPs, however Asterisk chooses a port for voice on the range defined on the file rtp.conf and another port for T.38 on the range define in the file udptl.conf [I have tried also using even udptl ports too just in case any of the  remote UAs do not support UDPs on odd ports]
Both the calling (Quintum gateway) and the called (Dialogic IMG 1010) UAs support T.38 and whe we send the call directly from the Quintum to the IMG the fax works all the time.
Comments:By: Fabian Borot (fborot) 2011-08-31 09:40:25.464-0500

pcap capture showing both legs of the call and logs taken on asterisk console with sip and udptl debug on

By: Kevin P. Fleming (kpfleming) 2011-08-31 14:53:52.303-0500

Investigating packet traces and logs.

By: Kevin P. Fleming (kpfleming) 2011-08-31 15:47:25.927-0500

Alright, there are quite a few things going wrong here, but probably only one of them that is the root cause of your FAX session failures. I'll detail what I've seen in this packet capture and log:

* The Dialogic device sends "T38FaxMaxDatagram: 72", but also requests IFP redundancy. This is a common problem in many T.38 endpoints where the implementer misinterpreted the T.38 specification; this is the reason why we have the 'maxdatagram' override setting in sip.conf. I would suggest that you set 'maxdatagram=200' in the 't38_udptl' line in the sip.conf configuration for this device.
* The Quintum device doesn't send "T38FaxMaxDatagram" at all; this is acceptable, as it's optional, and Asterisk should default to 400 bytes.
* After Asterisk has successfully re-INVITEd the Quintum device over to T.38 mode, the Quintum device continues to send RTCP Sender Report (SR) packets to the port that Asterisk used to be listening on when RTP was in use on the session... the port has been closed, and this results in ICMP responses being sent back. The Quintum device should not do this.
* After the session has switched to T.38 mode on both legs, Asterisk continues to send RTCP Receiver Report (RR) packets to both the Quintum and Dialogic devices; it should not do this, as the RTP sessions were shut down.
* After some period of time the Quintum device gives up its T.38 attempt, and sends a re-INVITE to Asterisk to switch back to audio mode; Asterisk accepts this re-INVITE, but does not switch out of T.38 mode.
* Some of the UDPTL debugging messages are missing the associated channel name (they just say 'SIP/').

Now... for the culprit: after the session has switched to T.38 mode, the Quintum device sends a T.38 'no-signal' packet (which is fine, that means the T.38 session is idle on its end). Asterisk receives this, and forwards it to the Dialogic device. No response is seen from the Dialogic device, which may or may not be an issue... I don't think it is obligated to send any T.38 signals at this point in the process. After that first packet, the Quintum device continues to send these 'no-signal' packets (attempting to get the receiving endpoint to initiate the FAX negotiation), but it does not increase the sequence number in the packets (they are all sequence number zero). The UDPTL stack in Asterisk sees these as retransmissions of packets it has successfully received and forwarded already, so it drops them. This means the Dialogic device stops seeing any T.38 input packets, and as a result it never progresses to sending any T.38 signals of its own.

There isn't any facility present in Asterisk today to make it forward UDPTL packets that appear to be retransmissions of previous packets, but such an option could exist in order to work around severely broken T.38 implementations such as this.

Out of curiosity, if you could provide a packet capture of a successful T.38 transaction directly between the Quintum and Dialogic devices, I'd like to see if there are any differences in behavior in either of those devices.

By: Fabian Borot (fborot) 2011-08-31 16:39:38.276-0500

txs a lot!. I am working on providing you new logs/captures as requested.
1- Fax call directly between quintum and IMG [passing through our nextone though]
2- another set of log/captures with the maxdatagram=200 going through the asterisk

By: Fabian Borot (fborot) 2011-08-31 18:02:46.788-0500

ok, this is what I did:
1- I un-commented the line below and changed 400 to 200 as suggested (globally)
t38pt_udptl = yes,fec,maxdatagram=200
Still I get the same behavior, fax does not go through. I am attaching a pcap capture called "quintum to asterisk to img-fax failed.pcap" and a txt file with the console log with sip and udptl debug on called "quintum to asterisk to img-console log-fax failed.txt"

2- I made 2 successful fax  calls without the asterisk server as part of the call flow (quintum --> nextone softswitch --> Dialogic IMG ). I am attaching 2 pcap files called "quintum to nextone to img-1-fax OK.pcap" and "quintum to nextone to img-2-fax OK.pcap"

I noticed that even though I have maxdatagram = 200 I still see the 72 on the asterisk SDP. I made sure that I reloaded the sip module and also did  "service asterisk restart" and made a 2nd call with same result.


I will get another call later on and capture the log in the IMG as well.


By: Fabian Borot (fborot) 2011-08-31 18:11:06.546-0500

Disregard the last comment I added regarding the maxdatagram parameter being ignored by asterisk. I looked at the capture again and noticed that the asterisk is indeed sending the Re-INVITE to the quintum side with the maxdatagram=200.However, the 200 OK to the IMG as a response of the RE-INVITE is set to 1400.

By: Kevin P. Fleming (kpfleming) 2011-09-01 09:52:01.951-0500

That's right; as the sample sip.conf file says, the 'maxdatagram' setting is an override only used when the value offered by the endpoint is *lower*; it serves to provide a minimum value, but it doesn't lower values provided by the endpoint that are larger (there wouldn't be any benefit in doing that).

As I said in my long note above, changing the maxdatagram value isn't going to solve the underlying problem here, which is repeated sequence numbers. I'll look over the new captures to see if there's anything useful we can learn from them. Thanks for the quick response!

By: Kevin P. Fleming (kpfleming) 2011-09-01 09:52:33.648-0500

Also, there's no need to restrict your comments 'JIRA Users', because 'JIRA Users' is everybody :-)

By: Fabian Borot (fborot) 2011-09-01 10:08:30.255-0500

Dear Kevin, I  hope the new captures help. I have some updates that may be useful also.

1- I just made a successful fax call through our asterisk using the same calling UA [quintum gateway] but the destination was our switchvox pbx, which is another asterisk but at least our asterisk is able to "talk" t.38 with another box, so that is kind of good news. As a matter of fact, when I tested it a couple of months back it was asterisk 1.8.3 towards our switchvox pbx and it worked flawlessly. We did not have a carrier that supported T.38 at the moment. That is why we are conducting these tests now.  So, for this test the call flow is quintum --> asterisk --> switchvox.I will make a couple of more tests and provide the logs/captures to you. This switchvox has a couple of T.38 licenses that we bought a year or so ago. I could not test before because coincidentally the pbx was having fax issues as well and we had to contact the tech support and they reloaded the fax app to make it work, it was showing these errors; ERROR   pbx.c: Function FAXOPT not registered.

2- I am going to get logs on the IMG as well and see if I can get anything from it that may help also. maybe we can see something from their logs that can help us. I will also get the logs from the quintum side.

3- I may have another carrier to test with that supports T.38, so I will test that too and I will provide logs/captures

We greatly appreciate your help.
Fabian


By: Fabian Borot (fborot) 2011-09-01 10:28:55.760-0500

1- "quintum to asterisk to switchvox-console log-20110901-fax OK.txt" file contains console logs for a successful fax call [quintum --> asterisk -> switchvox pbx]
2- "fax-quintum-asterisk-switchvox-20110901.pcap" file contains the wireshark capture taken on the asterisk for the same call [quintum --> asterisk --> switchvox]

By: Kevin P. Fleming (kpfleming) 2011-09-01 14:55:19.332-0500

Alright, I've reviewed what you've posted, and I have a few observations, although I don't think they will really help the situation much.

* In the Quintum-Nextone-Dialogic scenario, the FAX transaction succeeds (apparently) because the Nextone SBC does forward the repeated 'sequence number 0' packets from the
Quintum device, and this is enough to get the Dialogic device's T.38 gateway 'moving'.

* In the Quintum-Asterisk-Switchvox case, Asterisk still doesn't forward the repeated packets, but it doesn't matter, because the Switchvox endpoint isn't a FAX gateway, it's acting as a FAX *terminal*, and it immediately starts sending T.38 packets to initiate the FAX protocol negotiation. The Quintum device receives these and things move along smoothly.

I don't really see a way to help you quickly here; if the Quintum device properly indicated sequence numbers in its T.38 'no-signal' packets the problem would disappear, or if the Dialogic device started sending T.38 'no-signal' packets of its own the problem would disappear. Failing both of those, it is possible that we could provide an option in Asterisk to allow forwarding of repeated T.38 packets under some circumstances. That wouldn't get done quickly though unless a community developer wants to work on it, and I don't think it'd be a candidate to be added to the 1.8 branch anyway since it's not fixing a bug in Asterisk.

By: Fabian Borot (fborot) 2011-09-01 15:35:59.496-0500

Dear Kevin, first of I want to thank you for dedicating a lot of time and effort to determine the cause of the problem.
Now, the good news...we just discovered why it was not working with the IMG and even better we were able to re-configure both devices [asterisk and the remote IMG] with the needed changes and already passed more than 5 fax calls.
There was something that we never talked about and I really did not noticed until today when I compared the signaling in more detail. We are supposed to use g729:60 for the voice calls to save bandwidth and all my test with the asterisk were done with this codec, but the other tests that showed that the fax worked using the nextone were not done properly, by mistake the IMG was configured to use g729 ptime 20 when receiving a call from the nextone. It was supposed to use g729:60 as well but we overlooked it.
Once I noticed that the offer from the nextone had ptime 20 and that the IMG accepted the offer like that and had no problem switching to T.38 then I changed both the asterisk and the IMG to use ptime 20ms as well and voila! it started working.
Also, when the IMG detected the fax tone on the TDM side and created the RE-INVITE with T.38, just after that point the log file shows this nasty error:

13:41:52.185 PFC (VAP) PICASSO Log Msg
                      FAILURE Returned by MSP -> 0x507
13:41:52.185 PFC (VAP) PICASSO Log Msg
                      VFSM_StartConnectionPacket: Error dev(5) devchnl(6) conn
                      (65676) req(0x14e45498) reqid(1) pstChnl(0x15ab0cd8)
13:41:52.185 PFC (W)   [m1.1.12] MCHstACTIVE_FAXR Message Processing Error (0xf
                      ffffaf9)
13:41:52.185 PFC (W)   [m1.1.12] MCHstACTIVE_FAXR state - FAILED MMONevSERVICE_
                      RESP_ACK, (1)
13:41:52.185 PFC (I)   [m1.1.12] stMCHstACTIVE - evSERVICE_RESP_ACK Received (U
                      nknown Msg FCode=x1/cmdType=x2)
13:41:52.185 CALL(IP)  (05:47283:00) RTCP STATS: TxPackets=0; RxPackets=0; lost
                      Packets=0

13:41:52.285 CALL(SIP) (05:47283:00) RCVD 100 Trying from 204.9.239.116:5060 ud
                      p CSeq:1 Msg Size:602
13:41:52.285 CALL(SIP) (05:47283:00)

I do not know how familiar you are with the IMG but PICASSO is one of the Voice Module that Dialogic uses.

Now, that error, as I said, happens just after the RE-INVITE has been created/sent and 100ms before the 100 Trying is received on the IMG, also way before the 200 OK and infamous udptl packet is received. Also, I noticed that the udptl packets from the quintum say in the UDP header length= 14, and it reality have 12 more bytes. The asterisk when is relaying the one udptl packet to the IMG says length = 14 and indeed has 14 bytes [8 from udp and 6 bytes on the payload], so it is relaying only one and it is fixing it.

Now, we still need to make the IMG work under those circumstances [voice calls with ptime 60] but that is something that I need to take with Dialogic. In the mean time, we tested with G723 ptime 30 and it appears to work fine.
I will provide a fax call using g729 ptime 20 through the asterisk to close this.
Again I value your assistance, time and effort helping us. You provided timely responses and valuable insight.

Fabian


By: Fabian Borot (fborot) 2011-09-01 15:46:19.167-0500

good fax call through the asterisk using g729 ptime 20

By: Kevin P. Fleming (kpfleming) 2011-09-01 17:00:38.286-0500

Very glad to help, and I'm happy to see you got it working. Your quick responses and complete packet captures and logs made it relatively easy to spot the problems.