
What Kind of Lithography Will Be Used at 7nm? - Lind5
http://semiengineering.com/7nm-lithography-choices/
======
asmithmd1
At 7nm spacing you could draw about 10,000 lines across the width of a human
hair. About 30 silicon atoms lying next to each other would be 7nm wide. I
know the end of Moore's law has been predicted before, but this time has to
be different.

~~~
nhaehnle
Keep in mind that most features at 7nm would be quite a bit larger than 7nm.
For example, in the 22nm technology I have worked with, a standard wire on the
lowest metal layer might be around 40nm wide (higher metal layers use even
wider wires to benefit from their lower resistance to cross larger distances).

I do agree with your fundamental point that there will definitely be an end to
Moore's law. It's just that the mere number of silicon atoms doesn't make <7nm
infeasible.

~~~
userbinator
Is it like how inkjet printers advertise huge resolutions like 5760dpi, and
the positioning can achieve that resolution, but the actual nozzles can't
produce lines thinner than e.g. 720dpi?

~~~
ghaff
No. There are process nodes that have real technical meaning but, especially
as sizes shrink, what that means exactly in terms of the dimensions of
specific features varies considerably.

~~~
avian
The "Minimum feature size" slide included in this article shows some feature
sizes for existing technologies. I think names like "14 nm" and "7 nm" are
mostly marketing at this point.

[http://www.anandtech.com/show/8367/intels-14nm-technology-
in...](http://www.anandtech.com/show/8367/intels-14nm-technology-in-detail)

------
xlayn
Interesting how many chip-makers now have the power (read: financial and
human resources) to keep the race at what seems to be the same level. Of them
all, Apple is the most impressive, as they have kept the core count down. In
my book, being able to squeeze in another 4 cores to reach 12 doesn't make
that much sense on cellphones or laptops/desktops.

~~~
derefr
Apple _have_ been squeezing more parallelism onto their SoCs—just in the form
of more continuous on-die GPGPU silicon, rather than more discrete CPU cores.
Most "apps" seem to break down into serial-bottleneck and embarrassingly-
parallel subcomponents; this fits a "CPU for the serial part + GPGPU for the
parallel part" design much better than it fits a 12-core CPU. With every
release of iOS, more is run on the GPU and less on the CPU—things are being
parallelized, just not the way we expected.
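
That split can be sketched in a few lines of Python; here a multiprocessing
pool stands in for GPGPU offload, and the "tile" workload and names are
invented purely for illustration:

```python
from multiprocessing import Pool

def render_tile(tile):
    # The embarrassingly-parallel part: every tile is independent,
    # so this is the kind of work a GPGPU could just as well absorb.
    return sum(x * x for x in tile)

def render_frame(tiles):
    # Parallel map over independent tiles...
    with Pool() as pool:
        partials = pool.map(render_tile, tiles)
    # ...then the serial bottleneck: combining the results.
    return sum(partials)

if __name__ == "__main__":
    tiles = [range(i * 100, (i + 1) * 100) for i in range(8)]
    print(render_frame(tiles))
```

Only `render_tile` needs to scale out; the serial combine step stays on one
fast core, which is exactly the CPU-plus-GPGPU shape described above.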

A tangent: really, the only place that multiple discrete CPU cores make any
sense at all is on servers, because servers are frequently made to run
multiple submitted workloads, each containing their own serial-bottleneck
parts (viz. the old concept of a "time-sharing" computer.) And even then,
these days, those cores may perhaps be useful only for the sake of serving as
a more efficient substrate for virtualization; there's not much you can do
with one four-core VM that you couldn't do _better_ by treating it as four
one-core VMs.

~~~
eloff
Database servers do much better with 4 cores than 4 VMs. In fact they do even
better without any VMs at all.

~~~
derefr
I knew someone would argue this point but I couldn't figure out how to correct
the original post to be clearer.

"Virtual machine" has come to refer to a specific isolation and security model
through, effectively, having a microkernel (the hypervisor) with hardware-
accelerated microkernel RPC (hypercalls). But the abstract concept of a
virtual machine doesn't require that isolation; it just requires partitioning
of resources so that you can treat each partition as its own independent
_virtual Von Neumann machine_.

A DB that pins each of its worker threads to a separate core and reserves a
set amount of non-pageable memory for each thread (perhaps also doing SR-IOV
to get network frames directly to the right thread) is effectively, _in
resource allocation terms_, four independent _virtual Von Neumann machines_.
Each pinned thread sees no context switches, gets no cache incoherency,
suffers no NUMA-based latencies, etc. You've got four little machines, each
with their own uncontested memory bus and network bandwidth, that just happen
to share die space (and thus have a cheap IPC fabric between them.)
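
That pinned, shared-nothing layout can be sketched in Python (a toy sketch:
`os.sched_setaffinity` is Linux-only, and the per-worker workload here is
invented for illustration):

```python
import os
from multiprocessing import Process, Pipe

def pinned_worker(core, conn):
    # Pin this process to one core (Linux-only), so it sees no
    # migrations and keeps its cache lines to itself: one small
    # "virtual Von Neumann machine" per core.
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {core % os.cpu_count()})
    # Shared-nothing work: each worker owns its own slice of the data.
    conn.send(sum(range(core * 1000, (core + 1) * 1000)))
    conn.close()

def run_partitioned(n_workers=4):
    workers = []
    for core in range(n_workers):
        parent_end, child_end = Pipe()
        p = Process(target=pinned_worker, args=(core, child_end))
        p.start()
        workers.append((p, parent_end))
    results = [conn.recv() for _, conn in workers]
    for p, _ in workers:
        p.join()
    return results

if __name__ == "__main__":
    print(run_partitioned())
```

A real DB would add non-pageable memory (`mlock`) and NIC queue steering on
top, but the partitioning idea is the same.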

I mean, you get 90% of what having a modern hypervisor-managed VM gives you
just by using processes in the first place (as opposed to the old "single
shared address space" model of the Apple II and IBM PC, where things like TSRs
made sense.) But in the decades since the invention of the "process" concept,
we've gone past the original conception of a "process" as its own virtual
machine, and optimized for multi-user, heavily-multiprocessing time-sharing
workloads where context switches are frequent, memory bandwidth is shared, IO
hardware is shared (and thus has its access linearized through the kernel),
etc. We've realized that most people don't need to be allocated a circuit, so
scheduling packets will do; that most people don't need memory reservations,
so an OOM killer will do; that most people don't write hard real-time
software, so context switching will do; etc. (And we've even given up on the
address-space isolation, with threads and other shared-memory IPCisms.)

Most of what we use hypervisor-managed VM setups for today is simply to
enforce the resource partitioning that Operating Systems gave up on, so that
we can have "processes" that actually do give us resource guarantees. If
you're an ops person and you don't know what workload you're going to be
running but want that enforcement there anyway, hypervisor-based VM management
makes perfect sense. On the other hand, if you as the developer get to design
your entire "embedded system" or "appliance" or "service" of OS+app yourself,
you can throw out the hypervisor and get your VM model from the OS itself, by
teaching your app to tweak the OS in all sorts of places, à la Snabb Switch.

Either way, the goal is to partition your machine into several smaller
independent ones that never have to wait in a queue behind one-another. You
can get that from an OS, or from a hypervisor; but what you get is _virtual
Von Neumann machines_ either way.

~~~
gpderetta
Not all parallelisable jobs can be profitably partitioned statically into
independent threads or processes. Many (i.e. anything that is not
embarrassingly parallel) require dynamic scheduling and load balancing, which
is easier and faster to do in a single address space.

~~~
derefr
I did hedge in the original comment with a "not much you can do that's faster"
rather than a "nothing you can do faster." I write control-plane software in
Erlang; I am well-aware of software that can take advantage of a 12- or 36- or
100-core CPU. :)

But most software that people "parallelize" with pthreads is _not_ that type
of software, and would be much better served by being split into
independently-partitioned, shared-nothing worker processes, à la Redis. Not
only for performance's sake, but also because that frees you from the
operational constraint of needing a single big machine to run it on. (You
still _can_ run all your shared-nothing worker processes on the same piece of
Big Iron, but you don't _have_ to, and that flexibility is important for
designing a solution in the face of unknown usage profiles.)
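
A minimal sketch of that shared-nothing partitioning, in the style of Redis
Cluster's key hashing (the dicts stand in for independent worker processes,
and all names are invented for illustration):

```python
import zlib

N_SHARDS = 4
shards = [{} for _ in range(N_SHARDS)]  # stand-ins for worker processes

def shard_for(key):
    # A stable hash maps each key to exactly one owner; the routing
    # function is the only knowledge the shards share.
    return zlib.crc32(key.encode()) % N_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)
```

Because no shard ever touches another shard's keys, moving one shard to a
different machine changes only the routing, not the data model.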

------
pnut
Machine elves, obviously.

