Netalyzr says I have an IPv6 fragmentation problem.

Started by bicknell, January 26, 2012, 06:18:14 PM

bicknell

Quote from: kasperd on March 26, 2013, 01:08:54 PM
Quote from: bicknell on March 26, 2013, 07:49:28 AM
I believe from both a programming perspective and a security perspective things are significantly easier if the fragmented frames arrive in order, rather than any out of order sequence, including the reversed sequence Linux uses.
I think your idea about what is easier would change, if you tried to implement fragment reassembly. As for the security implications, it is a mistake to consider what ordering is easiest to deal with. Your security needs to work regardless of which order an attacker sends packets in.

I will note I am more familiar with the BSD stack, FreeBSD in particular.  It emits fragments in order.  The code to both send and receive them is quite clean, in my opinion.

I do agree that the security needs to work regardless of order.

bicknell

Quote from: kasperd on March 26, 2013, 01:44:00 PM
That's not supposed to be possible to do with a switch. I keep an old hub around just in case I need to do that sort of debugging.

The switch in question is a managed Cisco switch which supports SPAN, allowing the mirroring of traffic for monitoring.

Quote from: kasperd on March 26, 2013, 01:44:00 PM
That ID mismatch is a symptom of a quite mysterious bug on their side. Notice how the incorrect ID being received is always 25185. That packet is not DNS at all. What it contains is ASCII data. I captured one of those and found this 31 character string in the packet "bad 1496 2001:470:0:69::2 1480 ".

I'm not quite sure what to make of this problem.  It doesn't occur every time for me, perhaps for 1 in 10 queries.  I've reported it to the Netalyzr folks and they are looking into it.

bicknell

Quote from: bicknell on March 26, 2013, 08:00:05 AM
The good news here is that TunnelBroker is off the hook, the fragments are making it down my tunnel.   :D

The bad news is that they are not making it past my Time Capsule.  I'm working with the Netalyzr folks to see if there is anything we can do to get the fragments in order to see if that makes a difference before going back to update my bug report with Apple.  I suspect though that many firewalls will block all fragments (very bad), and that many will block the fragments received out of order (somewhat bad).  If people can replicate this test with different hardware it would be appreciated.

I configured one of my DNS servers on a FreeBSD box to serve an 1800-byte TXT record, and reran the tests against my own server.  I observed a few interesting details:

1) The FreeBSD/BIND combo I used emitted 1280 byte maximum length UDP.  I'm not immediately sure if this is a FreeBSD-ism, or BIND-ism.  The result is still one packet and one fragment, just of slightly different sizes.

2) The packets are emitted in order, and received down the tunnel in order.

3) Neither of the two response packets makes it past the Time Capsule.
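
For readers who want to check the "one packet and one fragment" sizes from point 1, here is a rough sketch of the arithmetic. It assumes a 40-byte IPv6 header and an 8-byte fragment header, and that every fragment except the last carries a multiple of 8 bytes. The function name and the 1900-byte datagram size are made up for illustration; the real size of the DNS response was not given.

```python
IPV6_HEADER = 40      # fixed IPv6 header size in bytes
FRAGMENT_HEADER = 8   # IPv6 fragment extension header size in bytes


def fragment_sizes(upper_len, mtu):
    """Return [(offset, length, more_fragments), ...] for a payload of
    upper_len bytes (e.g. UDP header + data) sent over a path with this MTU."""
    if mtu < 1280:
        raise ValueError("IPv6 requires an MTU of at least 1280 bytes")
    if upper_len + IPV6_HEADER <= mtu:
        return [(0, upper_len, False)]  # fits in one packet, no fragment header
    # Payload per fragment, rounded down to a multiple of 8 bytes.
    per_frag = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    frags = []
    offset = 0
    while offset < upper_len:
        length = min(per_frag, upper_len - offset)
        frags.append((offset, length, offset + length < upper_len))
        offset += length
    return frags


# Hypothetical 1900-byte UDP datagram, fragmented for a 1280-byte MTU.
print(fragment_sizes(1900, 1280))
```

With a 1280-byte limit the first fragment carries 1232 bytes of payload and the second carries the rest, so the result is two fragments either way; only the split point moves when the MTU changes.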

So, I've now shown fragments don't make it past the Time Capsule regardless of packet order.  I raised the severity of my bug report with Apple, documented all of this with them, and poked a couple of people over there I know in an attempt to nudge it to slightly higher priority.

I also pointed out to the Netalyzr folks it would be very cool if they could devise a test to send the fragments both in order and out of order, and see if it makes any difference.  While it doesn't in this case, I suspect with some firewalls and NAT implementations it will.

kasperd

Quote from: bicknell on March 26, 2013, 07:02:26 PM
It emits fragments in order.  The code to both send and receive them is quite clean, in my opinion.
Sending in order is a bit simpler than sending in reverse order. Code for receiving needs to be prepared to receive fragments in any order. So for simplicity of the code, sending in order is preferable.

But you may want to optimize the reassembly for the case where packets are not reordered by the network. Less consumption of CPU and/or RAM in the reassembly code may be desired. So you could decide to do two alternate code paths. A fast path, which is used as long as fragments are received in reverse order, and a slow path, which is used when fragments are received out of order. This would require more code on the receiving end, but with a performance improvement in the common case.
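
To make the fast path / slow path idea concrete, here is a minimal sketch. It is not taken from any real stack: the class name and the (offset, more, data) representation are assumptions, and the fast path here is tuned for in-order arrival rather than the reverse order discussed above. It also skips things a real implementation needs, such as timeouts and rejecting overlapping fragments.

```python
class Reassembler:
    """Reassemble one fragmented packet from (offset, more, data) fragments."""

    def __init__(self):
        self.buf = bytearray()  # contiguous bytes starting at offset 0
        self.pending = {}       # offset -> data for fragments that arrived early
        self.total = None       # total length, known once the last fragment is seen

    def add(self, offset, more, data):
        """Add a fragment; return the full payload once complete, else None."""
        if not more:
            self.total = offset + len(data)
        if offset == len(self.buf):
            # Fast path: this fragment extends the contiguous prefix.
            self.buf += data
            # Drain parked fragments that have now become contiguous.
            while len(self.buf) in self.pending:
                self.buf += self.pending.pop(len(self.buf))
        elif offset > len(self.buf):
            # Slow path: out-of-order fragment, park it until the gap is filled.
            self.pending.setdefault(offset, data)
        # offset < len(self.buf): duplicate or overlap, ignored in this sketch.
        if self.total is not None and len(self.buf) == self.total:
            return bytes(self.buf)
        return None


payload = bytes(range(256)) * 8
r = Reassembler()
r.add(1232, False, payload[1232:])           # last fragment first
print(r.add(0, True, payload[:1232]) == payload)
```

Either arrival order produces the same result; the ordering only decides whether the cheap append or the parking dictionary does the work.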

Something similar is found in TCP options parsing. Linux has optimized code to handle the case where the TCP options are exactly 12 bytes long, and the first four bytes are exactly 0x0101080A. This doesn't simplify the code, because it still needs code to handle arbitrary ordering of options. But the code will be faster on the large number of packets with that exact sequence of bytes. This optimization is even recommended in RFC 1323.
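
A rough illustration of that pattern, written in Python rather than kernel C. This is a sketch of the idea, not the Linux code; the function name is invented. The four bytes 0x0101080A are NOP, NOP, option kind 8 (timestamps), option length 10.

```python
import struct


def parse_timestamps(options):
    """Return (tsval, tsecr) from raw TCP option bytes, or None if absent."""
    # Fast path: exactly NOP, NOP, timestamps option and nothing else.
    if len(options) == 12 and options[:4] == b"\x01\x01\x08\x0a":
        return struct.unpack("!II", options[4:12])
    # Slow path: walk the options in whatever order they appear.
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # end of option list
            break
        if kind == 1:            # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed length
        if kind == 8 and length == 10:
            return struct.unpack("!II", options[i + 2:i + 10])
        i += length
    return None


fast = bytes.fromhex("0101080a") + struct.pack("!II", 1000, 2000)
print(parse_timestamps(fast))
```

The slow path alone would give the same answers; the first branch only saves work on the common layout.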

Quote from: bicknell on March 26, 2013, 07:04:53 PM
It doesn't occur every time for me, perhaps for 1 in 10 queries.
It depends on the PMTU information being cached on the sender. If the cache has no PMTU information for your IP, the first fragment will be 1500 bytes. That fragment is bounced by the tunnel server. At that point your IP address will be put in the cache with a PMTU of 1480 bytes. But rather than retransmitting the packet on receipt of the ICMPv6 error, the sender emits an invalid response.

From that point on, it will work until the PMTU cache entry expires.
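
The intermittent pattern can be modelled with a toy cache. This is only a sketch of the behaviour described above, not the sender's actual code: the class, the function, and the 600-second lifetime are all hypothetical.

```python
class PmtuCache:
    """Per-destination path MTU cache with expiry (all values hypothetical)."""

    def __init__(self, link_mtu=1500, ttl=600.0):
        self.link_mtu = link_mtu
        self.ttl = ttl
        self.entries = {}  # destination -> (mtu, expiry_time)

    def lookup(self, dst, now):
        entry = self.entries.get(dst)
        if entry is None or entry[1] <= now:
            return self.link_mtu  # nothing cached, or the entry has expired
        return entry[0]

    def learn(self, dst, mtu, now):
        self.entries[dst] = (mtu, now + self.ttl)


def send(cache, dst, now, path_mtu=1480):
    """Simulate one large reply; return 'delivered' or 'bounced'."""
    size = cache.lookup(dst, now)        # sender fragments to the MTU it believes
    if size > path_mtu:
        cache.learn(dst, path_mtu, now)  # Packet Too Big error updates the cache
        return "bounced"
    return "delivered"


cache = PmtuCache()
for t in (0, 1, 300, 600):
    print(t, send(cache, "client", t))
```

Only the first query after the entry is missing or expired fails, which would look like an occasional failure if queries are spread out over time.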

Quote from: bicknell on March 26, 2013, 07:10:44 PM
The FreeBSD/BIND combo I used emitted 1280 byte maximum length UDP.
That actually sounds like a very sensible default behaviour. I think more systems should do that.

Quote from: bicknell on March 26, 2013, 07:10:44 PM
The result is still one packet and one fragment, just of slightly different sizes.
I assume you mean one UDP packet fragmented in two IPv6 fragments.

Quote from: bicknell on March 26, 2013, 07:10:44 PM
I also pointed out to the Netalyzr folks it would be very cool if they could devise a test to send the fragments both in order and out of order, and see if it makes any difference.  While it doesn't in this case, I suspect with some firewalls and NAT implementations it will.
It's quite likely that it will make a difference in some cases.

kasperd

Quote from: bicknell on March 26, 2013, 07:10:44 PM
So, I've now shown fragments don't make it past the Time Capsule regardless of packet order.
I have a few more ideas for you to try.

  • Does it drop all packets with a fragment header, or can a packet with a fragment header make it through the router if there is only one fragment?
  • Does it drop fragmented packets in both directions, or is it only incoming packets that are being dropped?
  • If outgoing fragmented packets can make it through the router, does that change the handling of the replies? Maybe fragmenting the outgoing packet will make the router drop replies unless they are fragmented.

RayH

I've also been testing transmission of various IPv6 extension headers over my Linux server + he.net tunnel connection using the SI6 Networks IPv6 toolkit (http://www.si6networks.com/tools/ipv6toolkit/), and found this thread.

Packets containing hop by hop extension header, or destination extension header, or both, are all transmitted and received OK (confirmed with Wireshark running both on client and Linux server).

Packets containing any IPv6 fragment extension header appear to be getting dropped outbound by my Time Capsule to both the HE.net tunnel as well as local destinations connected on the same WAN interface as the Time Capsule.

Not even a minimal packet seems to get through, i.e. one consisting of an IPv6 header plus a single fragment header (first fragment, no following fragments), followed by a TCP SYN to port 80 with no data and a total length under 1280 bytes.
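
For anyone wanting to reproduce that kind of packet without the toolkit, the header layout can be sketched as below. This only builds the bytes; it does not send anything, and it is not how the toolkit does it. The function names, the documentation-prefix addresses, and the placeholder upper-layer bytes are assumptions for illustration (a real test needs a valid TCP header with a checksum).

```python
import ipaddress
import struct

NEXT_HEADER_TCP = 6
NEXT_HEADER_FRAGMENT = 44


def fragment_header(next_header, offset_bytes, more, ident):
    """Build the 8-byte IPv6 fragment extension header."""
    if offset_bytes % 8:
        raise ValueError("fragment offset must be a multiple of 8 bytes")
    # 13-bit offset in 8-byte units, 2 reserved bits, then the M flag.
    off_field = ((offset_bytes // 8) << 3) | (1 if more else 0)
    return struct.pack("!BBHI", next_header, 0, off_field, ident)


def single_fragment_packet(src, dst, upper, ident=1, hop_limit=64):
    """IPv6 header + fragment header (offset 0, M=0) + upper-layer bytes."""
    frag = fragment_header(NEXT_HEADER_TCP, 0, False, ident)
    payload = frag + upper
    ipv6 = struct.pack(
        "!IHBB",
        6 << 28,                # version 6, traffic class 0, flow label 0
        len(payload),           # payload length excludes the 40-byte header
        NEXT_HEADER_FRAGMENT,
        hop_limit,
    )
    ipv6 += ipaddress.IPv6Address(src).packed
    ipv6 += ipaddress.IPv6Address(dst).packed
    return ipv6 + payload


# Placeholder for a 20-byte TCP SYN header; addresses use the 2001:db8::/32 doc prefix.
pkt = single_fragment_packet("2001:db8::1", "2001:db8::2", b"\x00" * 20)
print(len(pkt), pkt[40:48].hex())
```

The interesting part for this test is that the fragment header has offset 0 and the M flag clear, so the packet is complete on its own and needs no reassembly.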

I tested with a direct cable between my client and the Linux server and everything worked exactly as expected.

I also tried disabling the firewall feature on the Time Capsule: no change.

I'd need to do some more debugging with a dumb hub and a laptop running Wireshark to examine the 6to4 packets directly to really confirm this, but it's definitely looking like the Time Capsule is blocking all packets with an IPv6 fragment extension header.

RayH

Right. I've got a dumb hub in the path between the Time Capsule and my Linux box (which acts as the tunnel endpoint for he.net), so I have simultaneous Wireshark captures at the end client behind the Time Capsule, at the hub between the Time Capsule and the Linux box, and on the Linux server.

Wireshark is seeing and decoding the 6to4 packets coming out of the Time Capsule correctly.

The Time Capsule is dropping every single IPv6 packet containing a fragment header (in any combination with or without hop by hop or destination option extension headers).

Hop-by-hop headers and destination option extension headers are being transmitted normally in any combination (except when a fragment header is present).

It's definitely the Time Capsule that is blocking the *outbound* traffic.