
HTTP/HTTPS not working inside your VM? Wait for it - jonchang
http://rachelbythebay.com/w/2016/03/22/6nat/
======
krylon
> Someone else in the world reported the problem back in September, and aside
> from some random person asking a totally useless question, nothing had
> happened on the thread.

It's a special kind of horror to find, after hours of high-end-googling, the
one thread where someone reports the same problem you are experiencing, and
it's just the question, and then one other person asking if the problem has
been solved because she/he is having the same problem.

The one thing that is worse is if the OP then makes another post that simply
says "Solved it! =D", without giving any explanation on _how_ they solved it.

~~~
Unklejoe
Or when you find a thread where someone is asking the question and the only
response is by some wise guy telling him to "search". And of course, searching
keeps pointing you back to the same thread.

~~~
CaptSpify
My favorite is "Oh, you can find the answer here:
[http://brokenlink.com](http://brokenlink.com)"

~~~
pygy_
This can often be solved by prepending

    
    
        https://web.archive.org/web/*/
    

to the now broken URL.
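
A minimal sketch of that prepending step in Python. The helper name `wayback_url` is made up for illustration; the only thing taken from the comment above is the `https://web.archive.org/web/*/` prefix.

```python
# Prefix quoted in the comment above; everything else here is illustrative.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def wayback_url(broken_url: str) -> str:
    """Return the archive listing URL for a (possibly dead) link."""
    return WAYBACK_PREFIX + broken_url.strip()


print(wayback_url("http://brokenlink.com"))
```

The result is just a string, so it can be pasted into a browser or passed to whatever HTTP client you already use.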

~~~
kbenson
Hmm, there must be some extension that detects 404 errors and/or server
unresponsive and prompts whether you want to check Google's cache or the
Internet archive. That browsers haven't already integrated this (with a
configurable list of archiving services, like search engines), is actually
rather surprising to me.

~~~
greglindahl
There are many such extensions, but yes, we're working on getting this
integrated directly into Firefox, Chrome, and Edge. If anyone has a contact at
Safari (or one of the other browsers), my email address is in my profile. Thanks.

------
matheweis
Well, considering that VMWare fired their entire Fusion dev team in January [1], it's
not surprising that this isn't fixed... I'd expect more of these kinds of
issues to crop up without traction in the future.

1\. [http://www.loopinsight.com/2016/01/28/vmware-abruptly-
fires-...](http://www.loopinsight.com/2016/01/28/vmware-abruptly-fires-fusion-
dev-team-outsources-to-china/)

------
zwp
Funky! This feels like the connect(2) is returning before it has actually done
its work, async-style.

Rachel, could you write a small sneaky program (using eg libpcap) to see if
the TCP handshake has completed by the time connect(2) returns control to your
program, before your first write(2)?
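
A lighter-weight check than libpcap, as a rough sketch: on Linux you can ask the kernel for the socket's TCP state right after connect(2) returns and before anything is written. This assumes a Linux guest (it relies on the `TCP_INFO` socket option, whose first byte is the connection state), and the function name `state_after_connect` is hypothetical. It only shows the guest kernel's view. It cannot tell whether something in between answered the handshake on the server's behalf, so a packet capture on the far side is still needed for that.

```python
import socket

# ASSUMPTION: Linux. tcpi_state is the first byte of struct tcp_info,
# and 1 means TCP_ESTABLISHED.
TCP_ESTABLISHED = 1


def state_after_connect(host, port, family=socket.AF_INET6):
    """Connect, then read the kernel's TCP state before writing any data."""
    with socket.socket(family, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        # Ask for the tcp_info struct; only the first byte is used here.
        info = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
        return info[0]
```

Calling `state_after_connect("rachelbythebay.com", 80)` inside the VM and comparing the result with `TCP_ESTABLISHED` would show whether the guest believes the handshake is done at that point.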

------
amluto
An issue a little bit like this that I've seen is overzealous admins who block
ICMPv6, creating PMTU black holes. Short web pages load, and long pages hang.
Too bad I discovered this during tax season a couple years ago, and the
affected site was eftps.gov.

~~~
mindslight
It does feel like an MTU issue. I'd grab a tcpdump at all 3 points (server,
host, vm) and see what is getting dropped.

From what I know about tc, to only delay ip6 traffic you've got to create a
root qdisc that has multiple subclasses (like tc-prio [0]), and attach tc-
netem to one while passing the other straight through. Then classify packets
between the two, although I'd do that with iptables rather than figuring out
any more of the tc workings than necessary.

[0] The default pfifo_fast has multiple subclasses, but from what I remember
it had some problem with child qdiscs?

~~~
rachelbythebay
PMTUD broken on v6 happened to me too... in 2015.

[http://rachelbythebay.com/w/2015/05/15/pmtud/](http://rachelbythebay.com/w/2015/05/15/pmtud/)

------
h43k3r
After reading this post, my understanding is that it doesn't affect normal
machines or VMs in general, only the ones which are VMWare based. Am I right?

Also, does anyone know the reason behind this peculiar behavior? A bug,
or something more fundamental?

------
mike-cardwell
"parts of the web are going IPv6-only", "Certain web servers have been going
IPv6-only of late" \- Really? Which parts of the web? Why would anyone
configure their servers that way?

~~~
geerlingguy
Some budget hosting providers offer IPv6-only hosting for a lower price, since
IPv4 addresses are getting harder to acquire en masse (and thus are more
expensive).

~~~
ultramancool
We've had vhosting for eras and SNI for years. So shared hosting is out.

And I've never seen a VPS provider that doesn't offer v4. Even a $5/mo
DigitalOcean box has it.

~~~
cosarara97
I have a 3€/yr VPS with IPv6 and natted IPv4 for port 80 at
[http://lowendspirit.com/](http://lowendspirit.com/).

~~~
ultramancool
Well, congrats, but you're getting ripped off.

[https://lowendbox.com/](https://lowendbox.com/) has many options cheaper than
that which do have real IPv4.

~~~
cosarara97
3€ _year_ , not month. I can accept being ripped off at that price. I think I
found those guys on that same page you just linked, actually. But anyway, most
stuff on [https://lowendbox.com/](https://lowendbox.com/) is quite more
expensive than that, and I don't feel like browsing the entire site.

~~~
ultramancool
Woops. Yeah, I misread that. In that case you're definitely not being ripped
off, and given how much IPs usually cost, that is probably something which can
really only be a v6 deal. Interesting. First I've seen anything like it.
Thanks.

------
keeperofdakeys
What OP really needs to do is get tcpdump (or similar) output from the vm, and
just outside the vm (host or router).

~~~
evilDagmar
Too right. This _reeks_ of something attempting to optimize traffic and
screwing up the first few packets.

------
botw
I have the same problem with a VirtualBox Linux VM. I was wondering what was
going on when this post came up. I am not sure if it is the same reason. I tried:

    tc qdisc add dev eth0 root netem delay 100ms

and:

    printf 'HEAD / HTTP/1.0\r\n\r\n' | nc -6 rachelbythebay.com 80

returns nothing whereas

    printf 'HEAD / HTTP/1.0\r\n\r\n' | nc -4 rachelbythebay.com 80

returns as expected.
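
For anyone who wants to repeat that v4-versus-v6 comparison many times, here is a rough Python equivalent of the two `nc` commands. It is a sketch, not a replacement for a packet capture. The function name `head_probe` and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket


def head_probe(host, port=80, family=socket.AF_INET6, timeout=5.0):
    """Send 'HEAD / HTTP/1.0' over the given address family.

    Returns the raw reply bytes, or b"" if nothing arrives before the
    timeout or the peer closes without answering. Errors from connect()
    itself are not caught and propagate to the caller.
    """
    request = b"HEAD / HTTP/1.0\r\n\r\n"
    chunks = []
    with socket.socket(family, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        s.connect((host, port))
        s.sendall(request)
        try:
            while True:
                data = s.recv(4096)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # treat a stalled read like nc "returning nothing"
    return b"".join(chunks)
```

Calling `head_probe("rachelbythebay.com", family=socket.AF_INET6)` and then the same with `socket.AF_INET` mirrors the `nc -6` and `nc -4` runs. An empty result on one family only would match the behaviour described above.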

In my case, Firefox has no problem if invoked from the command line, but
"sometimes" it just hangs when invoked from a script.

~~~
rachelbythebay
Try > 100 msec. Some people have been reporting they need more. It'll be
interesting to see what's causing it to vary.

------
peterwwillis
It doesn't seem like VMware is the culprit here, mainly because it has nothing
to do with anything above layer 3. Here are some points to look into and
possible fixes.

    
    
      [1] VMware's network driver does not handle TCP, or IP. It's just layer 2; it
          implements one of a couple kinds of network hardware, that's it.
      [2] VMware Guest Tools does install a para-virtualized network card driver
          - vmxnet2/vmxnet3. It communicates with the physical network device by
          communicating with the host OS, rather than emulating a network driver. That
          potentially may do something wonky with something above layer 3, even though
          it really should not be.
      [3] VMware does have a virtual network switch, which forwards frames between 
          the physical NIC and virtual NIC based on MAC address.
      [4] VMware may handle moving frames from a virtual NIC to a physical differently 
          than moving it to another virtual NIC.
      [5] VMware provides VMDirectPath I/O, which allows the guest to directly address 
          the network hardware.
      [6] TSO/LSO/LRO can have a negative impact on performance in Linux (though
          supposedly, LRO only works on vmxnet3 drivers, and from VM-to-VM, 
          for Linux).
      [7] Emulated network devices may not be able to process traffic fast enough, 
          resulting in rx errors on the virtual switch.
      [8] Promiscuous mode will make the guest OS receive network traffic from 
          everything going across the virtual switch or on the same network segment 
          (when using VLANs).
    

[1] You can try changing the VMware guest's emulated network card (vlance,
e1000) and trying your thing again, but I doubt it will change much.

[2] Try installing or uninstalling VMware Guest Tools and corresponding
drivers.

[3] Nothing to do here, really. If you have multiple guests sharing one
physical NIC, try changing it to just one?

[4] Try your test again between two VMs on the same host.

[5] Try this, or not?

[6] Try enabling or disabling LRO. Or play with all three settings and see
what happens.
[https://kb.vmware.com/selfservice/microsites/search.do?langu...](https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2055140)

[7] Try increasing buffer sizes.
[https://kb.vmware.com/selfservice/microsites/search.do?langu...](https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1010071)

[8] Disable promiscuous mode on your NIC.

Other non-VMware things to investigate:

    
    
      [1] Your guest OS may have bugs. In its emulated network drivers, in its 
          tcp/ip stack, in its applications, etc.
      [2] An intermediary piece of software may be fucking with your network 
          connection. IPtables firewall, router/firewall on your host OS, after 
          the host OS/before your internet connection, at your destination host, etc.
      [3] Sometimes, intermittent network traffic makes it look like there is a
          specific cause, when really the problem is hiding in the time it takes 
          you to test.
      [4] The Linux tcp/ip stack (and network drivers) collect statistics about 
          erroneous network traffic.
      [5] Network traffic will show missing packets, duplicate packets, unexpected
          terminations, etc.
      [6] Your host OS or network hardware may be buggin'.
    
    

[1] Try a different guest OS.

[2] Make sure you have no firewall rules on the guest, host, internet gateway,
etc. Try a different destination host.

[3] Run tests in bulk, collect lots of samples and look for patterns.

[4] Check for dropped packets, errors on the network interface, in tcp/ip
stats.

[5] Tcpdump the connection to see what happens when it succeeds or fails.

[6] Try a different host for your VM.

 _edit_ one more idea: Look at the response headers for the request to the
site. The content length is 1413 bytes. Add on the TCP and IPv6 header
overhead (and HTTP headers, etc.) and this is probably over 1500 bytes, the
typical Ethernet MTU. Try requesting a "hello world" text file and try your
test again.
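
The arithmetic behind that guess, as a small sketch. The 40-byte IPv6 header and 20-byte option-less TCP header are standard sizes. The 300-byte figure for the HTTP response headers is purely an assumption for illustration, not something measured from the site.

```python
IPV6_HEADER = 40   # fixed IPv6 header, bytes
TCP_HEADER = 20    # TCP header without options, bytes
MTU = 1500         # typical Ethernet MTU, bytes

body = 1413          # content length quoted above
http_headers = 300   # ASSUMPTION: illustrative size, not measured

# Largest TCP payload that fits in one packet at this MTU.
mss = MTU - IPV6_HEADER - TCP_HEADER
payload = body + http_headers
segments = -(-payload // mss)  # ceiling division

print(f"MSS={mss} payload={payload} segments={segments}")
```

With these numbers the body alone (1413 bytes) would still fit under the 1440-byte limit, so it is the response headers that push the reply into a second packet. TCP options such as timestamps shrink the limit further.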

------
shanemhansen
I would love to know what sort of problem has such an odd solution.

------
anabis
10 years ago, VMWare did not fragment / reassemble packets for me, so I had to
set the NFS rsize option.

Maybe I was just missing a setting somewhere, but couldn't find it then.

------
newman314
I wonder if this is because of happy eyeballs...

------
sslayer
TCP Chimney

------
ai_ja_nai
wow

------
garethadams
Is this the millennials' version of the "500 Mile Email"? -
[http://www.ibiblio.org/harris/500milemail.html](http://www.ibiblio.org/harris/500milemail.html)

~~~
swalsh
I love that story, it is one of those classics, right up there with Mel the
old school programmer.

~~~
jaytaylor
Obligatory link to Mel:
[https://www.cs.utah.edu/~elb/folklore/mel.html](https://www.cs.utah.edu/~elb/folklore/mel.html)

------
imrehg
Which is actually about IPv6 networking peculiarities/issues within VMware,
just fyi.

~~~
chris_wot
Uh... but why?!?

~~~
atemerev
Software is unreliable. Bugs happen. Always. There are bugs in avionics,
medical devices firmware, nuclear power plants monitoring software, bank
transfers backends, all places.

Once upon a time it was common to think that we can design software without
bugs, or at least almost. That didn't work at all! What did work is redundant
systems, invariant testing and fail-fast with restarts. This is how reliable
systems are written these days.

Bugs are common; we have to learn to work around them.

~~~
chris_wot
> _Bugs are common; we have to learn to work around them._

Or we could, you know, fix them.

I wasn't asking for a justification. I was just asking why this is occurring.
If you don't know, that's cool. I mean, one of the reasons I ask is because
I'd like to know if VMWare are going to fix this bug.

So thank you for explaining that software has bugs. I'm sure I'll remember
that the next time I fix a regression in LibreOffice, as I did with the issue
with EMF dashed lines not displaying correctly or when I fixed the issue where
JPEG exports didn't export the DPI value correctly...

~~~
mikeash
Just for future reference, something like "Do we know exactly what the bug in
VMWare is, and whether they're going to fix it?" would be _way_ more effective
at getting the answer you're looking for here. "Uh... but why?!?" sounds like
cursing at the sky, and gets a response appropriate for that.

~~~
chris_wot
Fair point.

