
Revisiting 64-bit-ness in Visual Studio and elsewhere (2015) - svenfaw
https://blogs.msdn.microsoft.com/ricom/2015/12/29/revisiting-64-bit-ness-in-visual-studio-and-elsewhere/
======
codeflo
At work, I have a fast machine with 32 GB of RAM, of which Visual Studio can
only access about 3 GB. About once a month, Visual Studio crashes with some
kind of out-of-memory error, and more often than that, it slows down to a
crawl because it has to garbage collect every few seconds in an attempt to
keep its memory usage below this magic limit. Even though the machine has 20
GB of free RAM that Visual Studio just can't use.

The author acknowledges that this can happen:

> So, you’re now out of address space. There are two ways you could try to
> address this.

> a) Think carefully about your data representation and encode it in a more
> compact fashion

> b) Allow the program to just use more memory

> I’m the performance guy so of course I’m going to recommend that first
> option.

As a "performance guy", he should know that using a bit more RAM is
essentially free, and much, much easier to code than doing the kind of bit-
fiddling wizardry he suggests ("encode it in a more compact fashion"). In
practice, that kind of low-level code will not get written anyway, at least
not by modern Microsoft, and much less so by most extension authors. Which
means that there's an arbitrary and very hard limit on how much you can
customize/extend Visual Studio before you hit that 3 GB wall. And again, this
is on modern workstation machines with gigabytes of unused RAM lying around.
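
To be concrete, the kind of re-encoding he means is roughly this (my own sketch, not anything actually in VS): replace heap strings and 8-byte pointers with 32-bit offsets into shared pools and pack flags into bitfields.

    #include <cstdint>
    #include <string>

    // "Obvious" representation: a heap-allocated string plus an 8-byte
    // pointer per symbol -- easy to write, but big and pointer-chasing.
    struct SymbolNaive {
        std::string name;      // separate heap allocation per symbol
        SymbolNaive* parent;   // 8-byte pointer in a 64-bit build
        int kind;
        bool isPublic;
    };

    // "Compact" representation: 12 bytes per symbol. Names live once in a
    // shared string pool, relationships are 32-bit indices, flags are packed.
    struct SymbolCompact {
        uint32_t nameOffset;   // offset into one big string pool
        uint32_t parentIndex;  // index into the symbol array
        uint32_t kind : 8, isPublic : 1, reserved : 23;
    };

    static_assert(sizeof(SymbolCompact) == 12, "three 32-bit fields");

That's a real memory win, but it's also exactly the kind of invasive rewrite that, in practice, never gets prioritized.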

In short, there may be very valid technical reasons why VS can't go 64-bit,
but to claim that this doesn't hurt the product is in my opinion not
justified.

~~~
bicubic
Considering the 64 bit support problem has been going on for over five years
now and the magical 3GB limit is increasingly absurd on modern hardware, I
wonder if they've considered just giving up on VS and doubling down on VS
Code. It also sorts out their desperate need for a UI overhaul.

~~~
brynjolf
Visual Studio Code is slow in comparison to VS, and it has far fewer features.
I hope this is not the path they take.

~~~
lsadam0
I really hope it is. I don't really find VS Code to be any slower than VS. Of
course, this depends on what you're using each for. I find I prefer VS Code
because it has so few default features. The default install does not contain
GB of bloat that I will never need. I'm able to build the experience I want,
rather than fight one thrust upon me.

~~~
jasonkostempski
"I don't really find VS Code to be any slower than VS."

Given VS Code has a fraction of the features VS has, it should be noticeably
faster. It will eventually be GB of bloat and slower than VS.

What I'd like to see instead is features being broken out into separate
components that can be used by any other editor/tool. Of course, if every
component is going to require its own dot file in my home directory to
(maybe) turn off telemetry, I won't touch it, so I'd rather see those things
developed by someone other than MS.

~~~
tracker1
You have a very different experience with VS than I have... right now, I'm
waiting half an hour, so I can check out a branch from vsts, so I can wait
another 6-8 minutes to build my solution, etc... it's painfully slow... VS
Code, edit, next...

I've only seen VS Code run slow if you're running on 2 GB of RAM or have
hundreds of competing plugins installed, and even then it's not that bad. VS
Code is way better than VS proper in my own subjective experience.

------
BinaryIdiot
I remember his original post back in 2009 and being very unsatisfied.

I wanted the 64-bit transition only so I could properly use my tools at work
(I used to work in a .NET shop). We had ReSharper and several other plugins
running in the same Visual Studio process as everything else, and it was
fairly common to hit the memory limit of a 32-bit process, at which point
Visual Studio would essentially die until it was killed and restarted.

Sure, you can say "stop using those tools" or "they should have written them
better". But at the time I was required to use most of them (it wasn't only
ReSharper, though ReSharper was pretty nice).

Today I don't have this problem anymore, but I think because of that type of
issue the 64-bit move would still likely be worth it. Honestly I feel like they could
optimize Visual Studio at the same time; it has a ton of capability but it's
also incredibly large and, from my understanding, carries a TON of legacy code
and resources throughout.

~~~
ygra
I guess his original point was that while moving to 64 bit would help the few
users who actually hit memory allocation limits, it would _also_ be a
monumental effort for a project the size of Visual Studio to do so, including
out-of-process support for 32-bit extensions to maintain compatibility,
probably rethinking designers and debugging along the way. And all that to
make things actually _slower_ and take _more_ memory for everyone. His
argument wasn't entirely »Stop using those tools while we do nothing at all«.
It was »They should have done the right thing, but we'll also work towards
making Visual Studio take less memory«. Especially considering that not
loading everything you might eventually ever need right at the start probably
has a much smaller compatibility impact than upgrading significant parts
of the IDE.

------
kabdib
This is a pretty myopic view of performance and 32-bit code.

- You really only get 3GB of address space.

- Actually, you get less than that, because of DLLs that get loaded into that
space.

- Actually actually, fragmentation becomes a problem, and making very good
use of the remaining memory gets awkward pretty quickly.

3GB is pretty crowded, especially when you're talking about (say) a game with
tens of GB of footprint. You need to page stuff into that footprint, and be
clever about the memory pressure not affecting the user experience.
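
You can see how little contiguous space is actually left by probing for the
largest block the process can still reserve; something like this (a Windows-only
sketch, assuming a 32-bit build):

    #include <windows.h>
    #include <cstdio>

    // Binary-search the largest contiguous region that can still be reserved.
    // In a loaded 32-bit process this is typically much smaller than the
    // total free address space, because DLLs and fragmentation chop it up.
    int main() {
        SIZE_T lo = 0, hi = (SIZE_T)1 << 31;      // probe up to 2 GB
        while (lo + 64 * 1024 < hi) {             // 64 KB granularity
            SIZE_T mid = lo + (hi - lo) / 2;
            void* p = VirtualAlloc(NULL, mid, MEM_RESERVE, PAGE_NOACCESS);
            if (p) { VirtualFree(p, 0, MEM_RELEASE); lo = mid; }
            else   { hi = mid; }
        }
        printf("largest contiguous free block: ~%u MB\n", (unsigned)(lo >> 20));
        return 0;
    }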

On the server side of things, we regularly run processes with > 8GB
footprints, including things like solr (at 32 to 160GB). Breaking this stuff
up would involve a lot more disk and network chatter, as well as bugs
involving OOM conditions, reducing global performance and reliability.

So while VS may be fine with 32-bit code+data (I am not convinced), real-world
applications definitely need more. I'm guessing that making a 64-bit VS is
hard for legacy reasons, and that the 32-bit space is actually holding the VS
team back (and possibly plugin makers as well).

~~~
masklinn
> So while VS may be fine with 32-bit code+data (I am not convinced)

Many people in this thread and the corresponding one on reddit report VS
regularly crashing on OOM or eating all the CPU garbage-collecting nonstop as
it gets closer and closer to the limit.

------
marktucker
In my experience, Visual Studio performance has improved significantly since
VS2010, but if you compare it to VC6, it's a joke. There's still an old PC at
the office running it, and double clicking a project opens the IDE and loads
the project in less than a second. It's beautiful.

~~~
whitefish
Couldn't you say this about the software industry in general? In the 90's I
used to have a Sun workstation on my desk. It ran the powerful Solaris
operating system, but had just 16MB of RAM! Today you need 1GB of RAM to run
an OS comfortably. My question: what does modern Linux do that Solaris from
the 90's did not, that it requires 50x more memory?

~~~
oblio
Well, I don't know, could your Solaris (mostly) seamlessly connect and
disconnect to wireless networks? I don't even think there were that many
wireless networks in the world back then :)

Anyway, this is is just 1 contrived example of something modern OSs do, and
that OSs from the 90's didn't do.

Sure, there's some bloat, but a lot of it is the "Mozilla kind": "Mozilla is
big not because it is full of useless crap. It is big because your needs are
big".

~~~
Cyph0n
Isn't a significant chunk of OS memory usage caused by the desktop UI? If
we're talking about servers, I'm guessing it would be because of drivers and
more built-in functionality, some of which is rarely used.

~~~
floatboth
Yeah — the image buffers (screen resolutions) are bigger, we run more apps,
more advanced desktop environments… also in the old days (before OS X, Compiz
(fun with desktop cubes!) and Vista Aero) people used to run without
compositing, which meant one common image for all apps to draw into.

Just booted my laptop (FreeBSD -current amd64, 1366x768 display) and started
X: only 204M "Active" memory. Of the 204M: 60M is syncthing, 39M compton, 31M
Xorg, 11M i3bar, 11M polkitd, 10M i3, 9M dunst, 5M wpa-supplicant… Not
counting syncthing that's 144 megabytes. I think that looks reasonable. You
can go lower and optimize for low memory (no polkit, no compton — 94M) or go
higher and optimize for usability and fancy UI features (install gnome :D)

------
davrosthedalek
I might be wrong, but aren't there many more registers available in 64-bit
mode on intel? That potentially outweighs any memory increase because it can
reduce cache pressure. Or is this mostly alleviated by register renaming and
other tricks?

~~~
masklinn
> I might be wrong, but aren't there many more registers available in 64-bit
> mode on intel?

It's not "intel", it's x86. x86_64 has twice the number of registers.

~~~
wolfgke
> x86_64 has twice the number of registers

The "typical ALU instructions" as add, cmp etc. can now encode 16 registers
instead of 8. But the FPU stack still has 8 entries to encode and there are
also still only has 8 MMX registers in 64 bit mode (they overlay the FPU stack
as you surely know).

On the other hand with AVX-512 there will be 32 xmm/ymm/zmm registers
available instead of 8 in 32 bit mode (4 times).

UPDATE: On the other hand in 64 bit mode there are only 2 segment registers
available instead of 6 in 32 bit mode (OK, the 4 common ones were (IMHO
unjustifiably, since there are some cool things that you can use them for if
you know what you do) not used in modern 32 bit OSes (Windows NT, Linux), so
they were left out).

TLDR: The multitude depends on the type of register that you consider.

~~~
hrydgard
The FPU stack is only 8 entries in x86-64, yes, but the 128-bit SSE float/SIMD
register set was indeed expanded to 16 entries, just like the GPRs (rax, rcx
etc). The FPU stack is legacy and is barely used anymore; instead, most
floating point operations are done with SSE instructions.

~~~
wolfgke
> The FPU stack is legacy and is barely used anymore

I know that if you implement an algorithm via SSE/AVX it is typically much
faster than if you use the FPU. But I still believe that the FPU has its uses:
for example, it supports 80-bit extended-precision floating point numbers,
while SSE/AVX only support 32- or 64-bit ones. There are applications where
this capability can be quite useful, and it is one reason why the FPU is still
supported (and sometimes used) in 64-bit mode.
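
For example (a quick sketch; note that MSVC maps long double to a plain 64-bit
double, so you need gcc or clang on x86 to actually see the 80-bit x87 format):

    #include <cfloat>
    #include <cstdio>

    int main() {
        // With gcc/clang on x86, long double is the 80-bit x87 format: a
        // 64-bit mantissa vs. 53 bits for double, hence the smaller epsilon.
        printf("double      epsilon = %g\n",  DBL_EPSILON);
        printf("long double epsilon = %Lg\n", LDBL_EPSILON);
        printf("sizeof(long double) = %u\n",  (unsigned)sizeof(long double));
        return 0;
    }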

~~~
hrydgard
Yes, it has its occasional uses and is still supported, but the point is that
x86-64 got a noticeable FP capability boost through the doubling of the SSE
registers compared to 32-bit x86. SSE is not just for SIMD; it has
single/double scalar instructions as well, which mostly replace the old FPU.

If you look at generic floating point code generated for x86 by any modern
C/C++ compiler, you won't see any FPU use, it's all SSE scalar, and
occasionally SIMD for clever compilers.

------
sherincall
I really wish MSFT would support the x32 ABI[0]. While having a single amd64
ABI, compared to 6-7 active in the x86 days, has its advantages, MSFT already
threw it away with vectorcall. And x32 is arguably more useful than
vectorcall.

[0]:
[https://en.wikipedia.org/wiki/X32_ABI](https://en.wikipedia.org/wiki/X32_ABI)

~~~
wolfgke
The gcc developers and Microsoft each consciously developed their amd64 ABI
independently of the other. Microsoft did "the obvious thing" and extended
__fastcall to define theirs.

Here are links about this topic:

[http://stackoverflow.com/a/35619528/497193](http://stackoverflow.com/a/35619528/497193)

[http://stackoverflow.com/a/4438515/497193](http://stackoverflow.com/a/4438515/497193)

[http://stackoverflow.com/a/35621290/497193](http://stackoverflow.com/a/35621290/497193)

------
oobey
Would he have made those same comments to resist going from 16 to 32 bit?

Hell, why not stick with 8 bit? We can just optimize everything to work on
that, right?

~~~
whitefish
No. With 8-bit you had to execute multiple instructions to add two numbers.
Same with 16-bit. This problem went away with 32-bit. Adding more bits beyond
32 does not bring proportional benefits because the numbers we deal with fit
in 32 bit.
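
In other words, on an 8-bit CPU a 32-bit add has to be chained byte by byte
through the carry flag, one instruction per byte instead of one instruction
total. Roughly this (a sketch in C for readability, not real 8-bit assembly):

    #include <cstdint>
    #include <cstdio>

    // Add two 32-bit values using only 8-bit operations, propagating the
    // carry by hand -- roughly what an 8-bit CPU's add-with-carry chain does.
    uint32_t add32_with_8bit_ops(uint32_t a, uint32_t b) {
        uint32_t result = 0;
        unsigned carry = 0;
        for (int i = 0; i < 4; ++i) {
            unsigned sum = ((a >> 8 * i) & 0xFF) + ((b >> 8 * i) & 0xFF) + carry;
            carry = sum >> 8;                        // carry into the next byte
            result |= (uint32_t)(sum & 0xFF) << 8 * i;
        }
        return result;
    }

    int main() {
        printf("%u\n", add32_with_8bit_ops(3000000000u, 1000000000u)); // 4000000000
        return 0;
    }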

~~~
floatboth
"the numbers we deal with fit in 32 bit"

Except when they don't. Everyone already forgot tweet number 2147483648? :)
[https://techcrunch.com/2009/06/12/all-hell-may-break-loose-on-twitter-in-2-hours/](https://techcrunch.com/2009/06/12/all-hell-may-break-loose-on-twitter-in-2-hours/)

------
blauditore
_Disclaimer: I don't know too much about this field_

Can someone explain why exactly 64-bit is generally slower than 32-bit?

I understand that more RAM will be used, and I/O to it slowed down, because
the "chunks" being pushed around are twice as wide, which ends up being a lot
of empty padding (is that correct?).
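
For example, I imagine something like this happens to a pointer-heavy
structure (a rough sketch of what I mean):

    #include <cstdio>

    // A tree node with two child pointers and a small payload: 12 bytes in a
    // 32-bit build, but 24 bytes in a 64-bit build (8-byte pointers plus
    // alignment padding), so the same tree takes about twice the cache.
    struct Node {
        Node* left;
        Node* right;
        int value;
    };

    int main() {
        printf("sizeof(Node) = %u bytes\n", (unsigned)sizeof(Node));
        return 0;
    }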

But everything _inside_ the CPU, like registers or ALUs, is 64 bits wide
anyway (right?), so computing in 64-bit mode would just make use of resources
that were unoccupied in 32-bit mode. Or am I missing something?

~~~
CountSessine
Because no one else has mentioned it: paging is more expensive in 64-bit. In
64-bit, the kernel uses a 3- or 4-level page table, requiring 3 or 4 SDRAM
accesses on a TLB miss. 32-bit kernels can get away with 2 SDRAM accesses
(PDE+PTE) if they're doing 32-to-32 translation. On a ~3GHz CPU, each uncached
SDRAM access costs about 200 cycles, which means a worst-case 32-bit walk
would be about 400 cycles while 64-bit could reach 800+ cycles.

This only applies to running a 64-bit kernel vs a 32-bit kernel; running a
32-bit process on a 64-bit kernel will incur the same page table cost as a
64-bit process.

~~~
tigershark
Exactly, and I seriously doubt that in 2017 anyone is using a 32-bit OS with
Visual Studio (I had to endure it in 2015 at work and I know very well what it
means).

~~~
CountSessine
Agreed. And, while I have no evidence of this, my intuition is that the extra
GPRs you get in 64-bit more than make up for any minuscule system-wide
performance hit 64-bit long-mode paging imposes.

------
sp332
Security can improve with 64-bit as well. It's a lot harder to break ASLR in a
64-bit address space than a 32-bit one. Though again, that might not apply to
VS where a user can get "arbitrary code execution" by design.

~~~
Cyph0n
That's assuming high entropy for address randomization of course. A 64-bit OS
that has a weak ASLR implementation could end up weaker than a 32-bit OS that
implements ASLR correctly. Your point definitely stands if both are well made.

------
tpolm
2016 follow-up from the same "I regret I did this" person

[https://blogs.msdn.microsoft.com/ricom/2016/01/04/64-bit-visual-studio-the-pro-64-argument/](https://blogs.msdn.microsoft.com/ricom/2016/01/04/64-bit-visual-studio-the-pro-64-argument/)

------
mtdewcmu
I only have VS2010, but I always felt that VS stacked up pretty well against
the competition, performance-wise. Eclipse always seemed painfully slow, and
Xcode hasn't really impressed me with its performance either. That's no reason
to rest on your laurels, surely. But at least VS shows some evidence of
restraint in abusing resources.

~~~
alkonaut
The big perf drop in VS was between 2013 and 2015 when Roslyn was introduced
(not sure if that's the reason though). It has improved somewhat in 2017 but I
still find 2013 to be significantly faster than 2017.

------
gwern
"Because virtually invariably the reason that programs are running out of
memory is that they have chosen a strategy that requires huge amounts of data
to be resident in order for them to work properly. Most of the time this is a
fundamentally poor choice in the first place. Remember good locality gives you
speed and big data structures are slow. They were slow even when they fit in
memory, because less of them fits in cache. They aren’t getting any faster by
getting bigger, they’re getting slower. Good data design includes affordances
for the kinds of searches/updates that have to be done and makes it so that in
general only a tiny fraction of the data actually needs to be resident to
perform those operations. This happens all the time in basically every
scalable system you ever encounter. Naturally I would want people to do this.

...In the VS space there are huge offenders. My favorite to complain about are
the language services, which notoriously load huge amounts of data about my
whole solution so as to provide Intellisense about a tiny fraction of it. That
doesn’t seem to have changed since 2010."

But I'm sure that by remaining 32-bit, in another 5 years maybe everyone will
magically optimize their stuff... Any second now...

------
MrBuddyCasino
The JVM has a neat trick where it uses 32-bit pointers for heaps of up to
32GB: since allocations are 8-byte aligned, only 1/8th of the addresses in the
heap can ever point at the start of an object, so a 32-bit value shifted left
by 3 bits is enough to address all of them.
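
In code the idea is roughly this (my sketch, not actual HotSpot internals;
heap_base just stands in for wherever the heap starts):

    #include <cstdint>
    #include <cassert>

    // Sketch of "compressed oops": objects are 8-byte aligned, so the low 3
    // bits of every offset are zero and (offset >> 3) fits in 32 bits for
    // heaps up to 32 GB.
    static char* heap_base;   // assume: start of one contiguous <=32 GB heap

    inline uint32_t compress(void* p) {
        uintptr_t off = (uintptr_t)((char*)p - heap_base);
        assert(off % 8 == 0 && (off >> 3) <= 0xFFFFFFFFu);
        return (uint32_t)(off >> 3);              // drop the always-zero bits
    }

    inline void* decompress(uint32_t c) {
        return heap_base + ((uintptr_t)c << 3);   // restore the real address
    }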

Is this something VS could do?

------
SadWebDeveloper
tl;dr... it's a humongous task and we are busy adding more "instant Azure
buying-options aka deployment tools", it will only benefit 1% of our users,
and we also want to keep milking this cow until the shit hits the fan.

------
rikkus
When I'm working on performance, I measure twice and cut once. I know
measuring is 'hard' as there's a lot of code optimised for x86 in VS, but
saying 'It would be slower built for 64 bit' is cutting before measuring even
once.

I have the greatest of respect for the author, but everyone needs to be
exposed to 'trust, but verify' at least occasionally, just in case they happen
to be wrong this once.

------
moogly
I don't see the point of this article. The VS team has moved more and more
stuff out-of-process. VS 2015 can easily use more than 4 GB (if you're working
in medium to large solutions), but it's divided up across 3-4 main processes.
MSBuild processes (normally 1 per thread) are also staying around for 15
minutes now, so they can be reused.

~~~
slededit
Where it falls down is if the debugger tries to load more than 3GB of symbols.
Guaranteed crash every single time. You can manually limit the symbols it
loads - but that doesn't apply to other parts like the profiler which will
still crash deterministically.

------
aussie123
The author seems to be defensive, trying to justify the status quo. I would be
interested in what the real reasons are. Why have most other Microsoft
products moved to 64-bit while VS has been hesitant from the beginning?

------
tpolm
"The one who wants - seeks ways, who does not want - seeks a reason"

~~~
TillE
That's pretty apt. There are _always_ legitimate reasons why doing something
is difficult and imperfect, but there's a really clear future here. It's
absurd to think that Visual Studio will remain 32-bit in 10 or 20 years' time.

And no, as any C++ developer will tell you, the answer is not to move to a
souped-up text editor (Visual Studio Code).

------
eb0la
I remember people favoring the 32-bit JVM over the 64-bit one for performance
reasons, but I never wondered why.

Makes sense if you're running Java with -Xmx1024m

~~~
loganmhb
When reading about Elasticsearch I was fascinated to learn that the JVM can
often use 32-bit object pointers to address about 32GB of RAM when running in
64-bit mode:
[https://wiki.openjdk.java.net/display/HotSpot/CompressedOops](https://wiki.openjdk.java.net/display/HotSpot/CompressedOops)

~~~
will_hughes
As someone who comes from the .NET world, this is something that pissed me off
about configuring Java applications like Elastic.

If I've got a box with 512GB of RAM, it seems I'm supposed to spin up multiple
instances to satisfy this, all because the JVM has a hissy fit if you go over
~30GB. This then means worrying about replication and ensuring we don't have
both the primary and replicas sitting on the same box.
both the primary and replicas sitting on the same box.

It seems insane that this is an actual issue in 2017.

~~~
wolfgke
> If I've got a box with 512GB of ram, it seems I'm supposed to spin up
> multiple instances to satisfy this, all because the JVM has a hissy fit if
> you go over ~30GB.

What is the actual technical reason why the JVM cannot (easily?) address more
than 32 GiB of RAM?

~~~
will_hughes
The ES documentation goes into more detail on this.

[https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html](https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html)

~~~
wolfgke
Thanks.

------
snnn
Right now, I'm fighting with: "fatal error LNK1248: image size (1004CB720)
exceeds maximum allowable size (FFFFFFFF)" :-(

~~~
mrich
You will have to split up your library.

There is also a 1GB limit for PDB files, which can be worked around up to 2GB,
but from the error above I think you are not hitting that limit.

~~~
snnn
Thanks for your help.

------
nailer
> In the immortal words of Sherman T. Potter: “Horse hucky!”

?

~~~
wolfgke
Sherman T. Potter:

[http://mash.wikia.com/wiki/Sherman_T._Potter](http://mash.wikia.com/wiki/Sherman_T._Potter)

Horse hockey:

[http://www.urbandictionary.com/define.php?term=horse%20hockey](http://www.urbandictionary.com/define.php?term=horse%20hockey)

Colonel Potter curses (“Horse hucky!” is at the beginning):

[https://www.youtube.com/watch?v=vhagzSEXzic](https://www.youtube.com/watch?v=vhagzSEXzic)

------
to3m
"Any packages that really need that much memory could be built in their own
64-bit process and seamlessly integrated into VS"

Wait... did you just tell me to go fuck myself? ;)

~~~
recursive
I believe he did, Bob.

------
draw_down
Hmm. We must all be a bunch of rubes for running 64-bit code in so many places
these days.

~~~
draw_down
The mob has spoken! 64-bit is actually good.

------
taspeotis
Run devenv /safemode to load the IDE without third party extensions (think
ReSharper) and the thing flies. It's not Microsoft's code that's the problem.

~~~
BinaryIdiot
We...may have different definitions of "flies". Yes, if you disable
third-party plugins it's faster at various points in its lifecycle, but even
with zero plugins installed it's not really that fast...

