
Developer Preview – EC2 Instances with Programmable Hardware - jonny2112
https://aws.amazon.com/blogs/aws/developer-preview-ec2-instances-f1-with-programmable-hardware/
======
fpgaminer
These FPGAs are absolutely _massive_ (in terms of available resources). AWS
isn't messing around.

To put things into practical perspective: my company sells an FPGA-based
solution that applies our video enhancement technology in real time to any
video stream up to 1080p60 (our consumer product handles HDMI in and out).
It's a world-class algorithm with complex calculations, generating 3D
information and saliency maps on the fly. I crammed that beast into a Cyclone
IV with 40K LEs.

It's hard to translate the "System Logic Cells" metric that Xilinx uses to
measure these FPGAs, but a pessimistic calculation puts it at about 1.1
million LEs. That's over 27 times the logic my real-time video enhancement
algorithm uses. With just one of these FPGAs we could run our algorithm on 6
4K60 4:4:4 streams at once. That's insane.
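
Back-of-envelope, for anyone checking the arithmetic (my own reconstruction
from the figures above, assuming logic cost scales roughly linearly with
pixel throughput):

    \frac{1.1\times10^6\ \text{LEs}}{40\times10^3\ \text{LEs}} \approx 27.5,
    \qquad
    \frac{3840\times2160}{1920\times1080} = 4,
    \qquad
    \lfloor 27.5/4 \rfloor = 6\ \text{4K60 streams}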

For another estimation, my rough calculations show that each FPGA would be
able to do about 7 GH/s mining Bitcoin. Not an impressive figure by today's
standards, but back when FPGA mining was a thing the best I ever got out of an
FPGA was 500 MH/s per chip (on commercially viable devices).

I'm very curious what Amazon is going to charge for these instances. FPGAs of
that size are incredibly expensive (5 figures each). Xilinx no doubt gave them
a special deal, in exchange for the opportunity to participate in what could
be a very large market. AWS has the potential to push a lot of volume for
FPGAs that traditionally had very poor volume. Intel FPGA (formerly Altera)
will no doubt fight exceptionally hard to win business from Azure or Google
Cloud.

* Take all these estimates with a grain of salt. Most recent "advancements" in FPGA density are the result of tricky architectures. FPGAs today are still homogeneous logic, but they don't tend to be as fine-grained as they once were. In other words, they're basically moving from RISC to CISC. So it's always up in the air how well all the logic cells can be utilized for a given algorithm.

~~~
deafcalculus
What can each one of those 2.5 million "logic elements" do? Last time I used
an FPGA, they were mostly made up of 4-input LUTs.

How many NOT operations can this do per cycle (and per second)? I realise
FPGAs aren't the most suited for this, but the raw number is useful when
thinking about how much better the FPGA is compared to a GPU for simple ops.

~~~
fpgaminer
The 2.5 million number quoted in the article is "System Logic Cells", not
Logic Elements. Near as I can tell, since I haven't kept pace with Xilinx
since their 7 series, a "System Logic Cell" is some strange fabricated metric
which is arrived at by taking the number of LUTs in the device and multiplying
by ~2. In other words, there is no such thing as a System Logic Cell, it's
just a translucent number.

Anyway, the FPGAs being used here are, I believe, based on a 6-LUT (6 input, 2
output). So you'd get about 1.25 million 6-LUTs to work with, and some
combination of MUXes, flip-flops, distributed RAM, block RAM, DSP blocks, etc.

Supposing Xilinx isn't doing any trickery and you really can use all those
LUTs freely, you'd be able to cram ~2.5 million binary NOTs into the thing
(2 NOTs per LUT, since they're two-output LUTs). So 2.5 million NOTs per
cycle. I don't know what speed it'd run at for such a simple operation. Their
mid-range 7-series FPGAs were able to do 32-bit additions plus a little extra
logic at ~450 MHz, consuming 16 LUTs per adder.
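
For a concrete picture, here's a minimal sketch of the adder being described
(my illustration, not fpgaminer's actual design). The + operator is all you
write; the synthesizer infers the LUTs and dedicated carry logic, which the
parent says came out to roughly 16 LUTs per 32-bit adder:

    // Registered 32-bit adder; the synthesizer infers the rest.
    module adder32 (
        input  wire        clk,
        input  wire [31:0] a,
        input  wire [31:0] b,
        output reg  [31:0] sum
    );
        always @(posedge clk)
            sum <= a + b;   // maps to LUTs plus dedicated carry logic
    endmodule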

~~~
alexforencich
6-input, 1-output or 5-input, 2-output. They're implemented as a 5-input,
2-output LUT with a bypassable 2:1 mux on the output.
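
A behavioral sketch of that structure (my toy model, not vendor code): a
64-entry truth table usable as one 6-input function, or as two 5-input
functions sharing inputs, with the 2:1 mux selecting on the sixth input.

    module lut6_sketch #(
        parameter [63:0] INIT = 64'h0    // the 64-entry truth table
    ) (
        input  wire [5:0] in,
        output wire       o6,            // 6-input, 1-output mode
        output wire       o5             // second output in dual-LUT5 mode
    );
        wire [63:0] tbl = INIT;
        wire lo = tbl[{1'b0, in[4:0]}];  // lower half: first LUT5
        wire hi = tbl[{1'b1, in[4:0]}];  // upper half: second LUT5
        assign o5 = lo;                  // bypasses the mux
        assign o6 = in[5] ? hi : lo;     // the 2:1 mux on in[5]
    endmodule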

------
ranman
If you don't click through to read about this: you can write an FPGA image in
Verilog/VHDL and upload it... and then run it. To me that seems like magic.

HDK here: [https://github.com/aws/aws-fpga](https://github.com/aws/aws-fpga)

(I work for AWS)
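
To make "write it and upload it" concrete, here's the flavor of thing you'd
synthesize -- a trivial example of my own, not the actual HDK interface,
which wraps your custom logic with its own ports:

    // Toy custom logic: count set bits in a 64-bit word, one result per clock.
    module popcount64 (
        input  wire        clk,
        input  wire [63:0] din,
        output reg  [6:0]  count
    );
        integer i;
        reg [6:0] acc;
        always @(posedge clk) begin
            acc = 0;
            for (i = 0; i < 64; i = i + 1)
                acc = acc + din[i];   // loop unrolls into an adder tree
            count <= acc;
        end
    endmodule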

~~~
ranman
If you guys are curious about these announcements I'll be recapping them and
going into more detail on twitch.tv/aws at 12:30 pacific

~~~
brian-armstrong
Huh? Isn't Twitch just for gaming content?

~~~
frikk
Nope. Twitch is excellent for all kinds of live content.

~~~
brian-armstrong
[https://www.twitch.tv/p/rules-of-conduct](https://www.twitch.tv/p/rules-of-conduct)

"All content that is neither gaming-related nor permitted under the rules for
Twitch Creative Conduct is prohibited from broadcast."

~~~
Crosseye_Jack
I've seen many people programming on Twitch:
[https://www.twitch.tv/directory/game/Creative/programming](https://www.twitch.tv/directory/game/Creative/programming)

While it's mainly game dev or game-dev-related, it's not limited to that.
From their FAQ:
[https://help.twitch.tv/customer/portal/articles/2176641](https://help.twitch.tv/customer/portal/articles/2176641)

      Examples of what you can broadcast on Twitch Creative:
      ...
      Programming and coding
      Software and game development
      Web development

EDIT: It seems that re:invent is being streamed on twitch anyway.

~~~
brian-armstrong
This is a product announcement, though

~~~
Crosseye_Jack
And Twitch has had TV shows streamed on it in the past:
[https://www.twitch.tv/whoismrrobot](https://www.twitch.tv/whoismrrobot)

My guess is that was a sponsored deal or something. But per my edit above, it
seems that re:Invent is being streamed on Twitch anyway, so I'm guessing it's
all above board (and, as others have said, Amazon owns Twitch).

------
wyldfire
> Today we are launching a developer preview of the new F1 instance. In
> addition to building applications and services for your own use, you will be
> able to package them up for sale and reuse in AWS Marketplace.

Wow. An app store for FPGA IPs and the infrastructure to enable anyone to use
it. That's really cool.

~~~
baybal2
>Wow. An app store for FPGA IPs

I see people making video transcoder instances on day 1, and MPEG LA
bankrupting Amazon marketplace sellers with lawsuits on day 2.

~~~
toomuchtodo
I guess online distribution of FPGA configurations was inevitable eventually?

~~~
alfalfasprout
This was already a thing. Plenty of marketplaces exist for FPGA IPs. It's just
not that well known, because high-end FPGAs run $7k+ and complex IP cores can
be $20k+ for a license.

~~~
problems
So if you were to get a cheap license via a service like this, do you get
access to the VHDL or equivalent or could you extract it in some way?

~~~
alexforencich
Probably not. Depends on how the core is distributed. Either you'll get HDL or
netlists, and they may or may not be encrypted. Obviously the synthesis
software has to decrypt it to use it, so like all defective by design DRM it
doesn't make it impossible to get at the code, it just makes it more
difficult. However, a netlist is just a schematic, so you would have to
'decompile' that back to HDL (and lose the names, comments, etc) if you want
to modify it. It's also possible that you would only get the binary FPGA
configuration file (this marketplace seems like one more for complete
appliances and not IP cores) so you would have to back a netlist out of that
somehow and then reverse-engineer it from there.
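
To illustrate "decompiling loses the names," here's a toy before/after of my
own: the same 4-bit parity function as readable HDL and as the kind of
flattened gate netlist you'd recover.

    // Readable HDL: the intent is obvious.
    module parity_hdl (input wire [3:0] d, output wire p);
        assign p = ^d;   // reduction XOR: parity of d
    endmodule

    // Recovered from a netlist: same logic, all intent gone.
    module parity_netlist (input wire [3:0] n1, output wire n6);
        wire n4, n5;
        xor g1 (n4, n1[0], n1[1]);
        xor g2 (n5, n1[2], n1[3]);
        xor g3 (n6, n4, n5);
    endmodule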

------
Sanddancer
I'm surprised that no one has linked to OpenCores
([http://opencores.org/](http://opencores.org/)) yet. They've got a ton of
VHDL code under various open licenses. The project's been around since
forever and is probably a good place to start if you're curious about FPGA
programming.

------
irq-1
OVH is testing Altera chips: the Altera Arria 10 GX 1150 FPGA.

[https://www.runabove.com/FPGAaaS.xml](https://www.runabove.com/FPGAaaS.xml)

------
ktta
If anyone is wondering what the FPGA board looks like:

[https://imgur.com/a/wUTIp](https://imgur.com/a/wUTIp)

~~~
alexforencich
Are they actually using that one, or is that just a board that happens to have
that particular FPGA on it?

~~~
ktta
The FPGA in the image is the retail version. But it's more than likely that
Amazon is using the same one, since they don't modify the GPUs they buy even
though they purchase them at much larger scale.

~~~
alexforencich
How do we know they are using that particular board from bittware as opposed
to a board from a different manufacturer or even an in-house design? The
linked article does not mention bittware or the board part number.

------
anujdeshpande
Here's a post by Bunnie Huang from a few months ago arguing that Moore's Law
is dead and that we'll now see more of this kind of thing:
[http://spectrum.ieee.org/semiconductors/design/the-death-of-moores-law-will-spur-innovation](http://spectrum.ieee.org/semiconductors/design/the-death-of-moores-law-will-spur-innovation)

Pretty interesting read. Also, kudos to AWS!

------
prashnts
For my institute this is going to be _really_ useful for genomics data
processing, because we can't justify buying expensive hardware for undergrad
research. Using FPGA hardware in the cloud sounds almost magical!

~~~
brian-armstrong
You can't justify buying it but you can justify renting it? Has your
department heard of amortization?

~~~
op00to
Most research finance departments are absolutely horrified at OpEx because any
strange non-capital expenditure makes them look less efficient than the next
research institute. This comes in handy when two labs are up for a grant, and
they are equally qualified. The more efficient institute gets the grant. You
can imagine asking for the lab credit card for EC2 time is not met with
enthusiasm.

~~~
CamperBob2
That's interesting. So, buying a lot of expensive, soon-to-be-obsolete
hardware makes your lab _more_ attractive?

~~~
op00to
Exactly. I haven't worked at my old lab for more than 5 years, but they are
still advertising on their web page the systems I built when employed there.
Woo! 2010-era blade servers!

------
krupan
The traditional EDA tool companies (Mentor, Cadence, Synopsys) all tried
offering their tools under a cloud/SaaS model a few years back, and nobody
went for it. Chip designers are too paranoid about their source code leaking.
I wonder if that attitude will hamper adoption of this model as well?

~~~
CamperBob2
_Chip designers are too paranoid about their source code leaking._

It's more an issue of being able to reproduce an existing build later on. You
can't delegate ownership of the toolchain to the "cloud" (read: somebody
else's computer) if you think you'll ever need to maintain the design in the
future.

~~~
gricardo99
I'm not so sure that is the issue. Currently you delegate ownership of the
toolchain to the EDA vendor. Sure, you have tools installed locally on your
machines, but the tools typically have licenses that expire, so there's never
a guarantee you can build it later with the exact same toolchain. EDA vendors
also end-of-life tools at some point, so even if you pay, a given tool won't
exist forever, and the license will not be renewable.

I do think the issue with cloud is the concern over IP. There are not a lot of
EDA vendors, so the chances that your competitor is also using the same EDA
vendor are pretty high. I think companies are pretty wary of using a
cloud-hosted service where you could literally be running simulations on the
same machines as your competitors. Can you imagine some cloud/hosting snafu
resulting in your codebase being accessible to your competitors?

EDA companies also sell ASIC/FPGA IP and VIP (verification IP), so there's
also a pretty clear conflict of interest if they have access to your IP. If
you're really paranoid, imagine the EDA vendors themselves picking through
your IP and repackaging/reselling it as IP to other customers (encrypted, of
course, so you can't readily identify the source code).

------
technological
Quick question: if anyone wants to learn to program an FPGA, is learning C
the only way to go? How hard is it to learn and program in Verilog/VHDL
without an electrical background?

If anyone can suggest links or books, please do.

Thank you

~~~
ktta
I would suggest Digital Design by Morris Mano [1]. It starts from the absolute
basics, going from digital gates all the way to FPGAs, and it teaches you
Verilog along the way. You really don't need any EE background for this book.
And Verilog is used more in industry than VHDL (which is more popular in
Europe and in the US army, for some reason).

I'm curious where you got the idea of using C to program FPGAs. Are you
thinking of SystemC or OpenCL? (They're vastly different from each other.)

I'm really surprised a sibling comment recommended the book _Code_. It's
really meant to be a layman's read about tech. It's a great book, but it
won't teach you to program FPGAs.

[1]: [https://www.amazon.com/Digital-Design-Introduction-Verilog-HDL/dp/0132774208](https://www.amazon.com/Digital-Design-Introduction-Verilog-HDL/dp/0132774208)

~~~
technological
I thought FPGAs were programmed using low-level languages like C.

~~~
DigitalJack
Not really. C is considered high level in digital design. There are some tools
for high level synthesis, from languages like C but they aren't used much.

Most Fpga "programming" is a textual description of a directed graph of logic
elements in a language like vhdl or Verizon (and now systemverilog).

Synthesis engines have gotten better over the years to allow things like +,*
to decribe addition and multiplication instead of describing the logic graph
of multiplication.

And most Fpgas now have larger built in primitives like 18x18 multipliers now.

You can judiciously use for-loops for repeated structures.
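
A small sketch of both points (mine, not DigitalJack's): * infers the
multipliers, + infers the adders, and a generate-for stamps out the repeated
structure.

    module mac4 #(
        parameter W = 18                // 18x18 maps onto DSP-style primitives
    ) (
        input  wire           clk,
        input  wire [4*W-1:0] a_flat,   // four packed W-bit operands
        input  wire [4*W-1:0] b_flat,
        output reg  [2*W+1:0] acc       // sum of four WxW products
    );
        wire [2*W-1:0] prod [0:3];
        genvar i;
        generate
            for (i = 0; i < 4; i = i + 1) begin : lanes
                assign prod[i] = a_flat[i*W +: W] * b_flat[i*W +: W];
            end
        endgenerate
        always @(posedge clk)
            acc <= prod[0] + prod[1] + prod[2] + prod[3];
    endmodule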

~~~
DigitalJack
verizon was meant to be verilog. didn't catch that autocorrect.

------
brendangregg
Very interesting. I'd still like to see the JVM pick up the FPGA as a possible
compile target, that way people could run apps that seamlessly used the FPGA
where appropriate. I have mentioned this to Intel, who are promoting this
technology (and also have a team that contributes to the JVM), but so far no
one is stating publicly that they are working on such a thing.

~~~
baybal2
java hello world will not fit even into a 10 gigagate chip

~~~
CalChris
Funny thing is that bytecode is actually pretty dense, denser than x86. It's
the _everything else_ that makes Java images huge.

------
jsh3d
This is amazing! We have been developing a tool called Rigel at Stanford
([http://rigel-fpga.org](http://rigel-fpga.org)) to make it much easier to
develop image processing pipelines for FPGAs. We have seen some really
significant speedups vs CPUs/GPUs [1].

[1]
[http://www.graphics.stanford.edu/papers/rigel/](http://www.graphics.stanford.edu/papers/rigel/)

~~~
petra
Are the speedups enough to negate the much higher cost of an FPGA vs a GPU?

------
huntero
Given that the Amazon cloud is such a huge consumer of Intel's x86 processors,
even using Amazon-tailored Xeons, it's surprising that Amazon chose Xilinx
over the Intel-owned Altera.

These Xilinx 16nm Virtex FPGAs are beasts, but Altera has some compelling
choices as well. Perhaps some of the hardened IP in the Xilinx parts tipped
the scales, such as the H.265 encode/decode, 100G EMAC, and PCI-E Gen 4?

~~~
rphlx
Stratix10 (the large, Intel 14nm family) was delayed, delayed, delayed, and
delayed some more. Last I heard it was supposed to be in high-prio customer
hands by end of 2016, but unclear if that meant "more eng samples" or the
actual, final production parts. Either way Xilinx beat them to market by
approx 3-6 months AFAICT.

------
1024core
I'm a total FPGA n00b, so here's a dumb question: what _can_ you do with this
FPGA that you can't with a GPU?

OK, here's a concrete question: I have a vector of 64 floats. I want to
multiply it with a matrix of size 64xN, where N is on the order of 1 billion.
How fast can I do this multiplication, and find the top K elements of the
resulting N-dimensional array?

~~~
cheez
FPGA = Field-Programmable Gate Array.

Basically, you can create a custom "CPU" for your particular workflow. Imagine
the GPU didn't exist and you couldn't multiply vectors of floats in parallel
on your CPU. You could use an FPGA to build something that multiplies a vector
of floats in parallel without developing a GPU. It would probably not be as
fast as a GPU or the equivalent CPU, but it would be faster than doing it
serially.

Another way to put it: you can create a GPU with an FPGA, but not vice versa.
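
A toy version of that idea (my own; fixed-point rather than floating point,
since floats on an FPGA need a real FP core): four multiplies every cycle,
side by side, because the hardware really is parallel.

    module vec_mul4 (
        input  wire        clk,
        input  wire [17:0] a0, a1, a2, a3,   // four lanes of inputs
        input  wire [17:0] b0, b1, b2, b3,
        output reg  [35:0] p0, p1, p2, p3    // four products, every cycle
    );
        always @(posedge clk) begin
            p0 <= a0 * b0;   // each * can map onto an 18x18 multiplier block
            p1 <= a1 * b1;
            p2 <= a2 * b2;
            p3 <= a3 * b3;
        end
    endmodule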

~~~
1024core
Thanks. But what's the capacity of this particular FPGA? How much can it
"do"? Surely it can't emulate a dozen Xeons; so what's the upper bound on
what can be done with this FPGA?

------
adamnemecek
Does this mean that ML on FPGAs will be more common? Can someone comment on
the viability of this? Would there be a speedup, and if so, would it be large
enough to warrant rewriting it all in VHDL/Verilog?

~~~
ktta
Yes, definitely to your first and last two questions!

That said, it's not likely to result in a large-scale FPGA movement anytime
soon, since industry and academia are heavily invested in and experienced
with GPUs. The software stack for GPUs, like CUDA, TensorFlow, and other open
source libraries, is very mature and well optimized. There will have to be
equivalent libraries in Verilog. (I, for one, have been hoping to be part of
this movement for some time now, so I'd love it if anyone can point me to
anything going on.)

There are some hurdles, major and minor. Although some of them might not seem
like much [0], here they are:

1. Till now, deep learning/machine learning researchers have been okay with
learning the software stack around GPUs, and there are widespread tutorials
on how to get started, etc. Verilog/VHDL is a whole different ball game and a
very different thought process. (I will address using OpenCL later.)

2. The toolchain is not open source and not really hackable. Even though that
matters less here, since you're writing gates from scratch, there will be
problems with licensing, and bugs that get fixed at a snail's pace (if ever)
until a performant open source toolchain exists (if ever, but I have hope in
the community). If you need help, you'll be stuck with a customer service
rep, unlike open source libraries where you head to GitHub's issues page and
get help quickly from the main devs.

3. Although this move makes getting into the game a lot easier, people still
want control over their devices, and it will take time for them to realize
they have to start buying FPGAs for their data centers and using them in
production, which has to happen sometime soon. AWS won't be cost effective
for long-term usage, just like GPU instances (I don't know how the spot
instance situation is going to look for the FPGA instances).

That comes with its own slew of software problems, and good luck figuring out
what's breaking what, given the much slower compile times and terribly
unhelpful debug messages.

4. OpenCL-to-FPGA is a mess. Only a handful of FPGAs support OpenCL, which
has led to little to no open source development around OpenCL with FPGAs in
mind. And no, the OpenCL libraries written for GPUs cannot simply be reused
on FPGAs; more likely they'd need a from-scratch rewrite, or at least a LOT
of tweaking, to work. OpenCL-to-FPGA is not as seamless as one might think
and is riddled with problems. This will, again, take time and energy from
people familiar with FPGAs, who have largely been outside the OSS movement.

Although I might come off as pessimistic, I'm largely hopeful for the future
of the FPGA space. This move isn't great news just because it lowers the
barrier; it also introduces a chip that will be much more popular, one that
libraries can focus their support on, compared to before, when every dev had
a different board. So you'll want to get familiar with this one: the Virtex
UltraScale+ XCVU9P [1].

Also, what might interest you: Microsoft is doing a LOT of research on this.
The articles on MS's use of FPGAs explain it better than I can in this
comment.

Some links to get you started: MS's blog post:
[http://blogs.microsoft.com/next/2016/10/17/the_moonshot_that...](http://blogs.microsoft.com/next/2016/10/17/the_moonshot_that_succeeded/)

Papers: [https://www.microsoft.com/en-us/research/publication/accelerating-deep-convolutional-neural-networks-using-specialized-hardware/](https://www.microsoft.com/en-us/research/publication/accelerating-deep-convolutional-neural-networks-using-specialized-hardware/)

Media outlet links: [https://www.top500.org/news/microsoft-goes-all-in-for-fpgas-to-build-out-cloud-based-ai/](https://www.top500.org/news/microsoft-goes-all-in-for-fpgas-to-build-out-cloud-based-ai/) [https://www.wired.com/2016/09/microsoft-bets-future-chip-reprogram-fly/](https://www.wired.com/2016/09/microsoft-bets-future-chip-reprogram-fly/)

I'd suggest starting with the Wired article or MS's blog post. Exciting stuff.

[0]: Remember that academia adjusts to the latest and greatest software at a
much slower pace than your average developer. The reason CUDA is still so
popular, even though it's closed source and only runs on Nvidia's GPUs, is
that it got into the game first and wooed researchers with performance.
Although OpenCL is comparably performant (with some rare exceptions), I still
see CUDA regarded as the de facto language to learn in the GPGPU space.

[1]: [https://www.xilinx.com/support/documentation/selection-guides/ultrascale-plus-fpga-product-selection-guide.pdf#VUSP](https://www.xilinx.com/support/documentation/selection-guides/ultrascale-plus-fpga-product-selection-guide.pdf#VUSP)

------
klagermkii
Would love to know what that gets priced at per hour, as well as if they plan
to have smaller FPGAs available while developing.

------
majke
Bitcoin mining. WPA2 brute forcing.

Maybe someone will finally recover the Triple DES key Adobe used to encrypt
its password database.

The possibilities are endless :)

~~~
problems
Mining is unlikely, with Bitcoin at least. Bitcoin passed the FPGA stage and
moved on to ASICs many years ago. There are some altcoins that are currently
best mined on GPUs, though, and this may change that, or at least put their
claims to a real test.

~~~
rphlx
The boards used for this preview do not have enough memory bandwidth to pose
even a modest threat to the latest batch of memory-hard GPU PoW algos.

------
jakozaur
So now anyone can run their own High Frequency Trading business on the side
:-P.

So much easier than buying hardware. Deep learning sometimes works similarly:
for many use cases it's easier to play with on AWS, with its hourly billing,
than to buy hardware.

~~~
zitterbewegung
The latencies from AWS servers to the exchanges probably would make HFT
applications unfeasible.

~~~
spullara
Not when you use Amazon's new regions, us-fin-1, that is within the exchange's
datacenter. /s?

~~~
grandalf
That would actually be tremendously disruptive! Superb idea.

------
RossBencina
> Xilinx UltraScale+ VU9P fabricated using a 16 nm process.

> 64 GiB of ECC-protected memory on a 288-bit wide bus (four DDR4 channels).

> Dedicated PCIe x16 interface to the CPU.

Does anyone know whether this is likely to be a plug-in card? And can I buy
one to plug into a local machine for testing?

~~~
smilekzs
Even if it is, a card like this could easily sell for $10k+.

~~~
errordeveloper
Yeah, the point is that you shouldn't need to buy any hardware, even for
development, which is the biggest win to me!
~~~
aseipp
But having the hardware is vital. You have to test your design a lot. You're
still going to need Vivado (which isn't cheap), and you'll need instance time
to test the design on the real hardware with real workloads, along with any
synthesizable test benches you want to run on the hardware.

The pricing structure of the development AMI is going to be meaningful here,
because it clearly includes some kind of Vivado license. It might not be as
cheap as you expect, and you need to spend a lot of time with the synthesis
tool to learn it. The F1 machines themselves are certainly not going to be
cheap at all.

If you want to learn FPGA development, you can get a board for less than $50
USD one-time cost and a fully open source toolchain for it -- check my
sibling comments in this thread. Hell, if you really want, you can get a
mid-range Xilinx Artix FPGA with a Vivado Design Edition voucher, and a board
supporting all the features, for like $160, which is closer to what AWS is
offering and will still probably be quite cheap as a flat cost, if you're
serious about learning what the tools can offer:
[http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/](http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/)
-- it supports almost all of the same basic device/Vivado features as the
Virtex UltraScale, so "upgrading" to the Real Deal should be fine once you're
comfortable with the tools.

~~~
user5994461
I expect a Vivado license with all my development cards.

P.S. The few orders I had were SoC Zynq boards in the $400-1000 range.

------
krupan
For complex designs, the simulator that comes with the Vivado tools is not
going to cut it. I wonder if they are working on deals with Mentor (or
competitors Cadence and Synopsys) to license their full-featured simulators.

Even better, maybe Amazon (and others getting into this space, like Intel and
Microsoft) will put their weight behind an open source VHDL/Verilog
simulator. A few exist, but they are pretty slow and way behind the curve in
language support. Heck, maybe they can drive adoption of one of the
up-and-coming HDLs like Chisel, or create one even better. A guy can dream...

~~~
LeifCarrotson
As someone who has little experience with FPGAs beyond some experiments with a
Spartan-6 dev board that mostly involved learning to write VHDL and building a
minimal CPU, I found the simulator to be of limited use. My tiny projects were
small enough that the education simulator was plenty fast. It was nice when I
didn't have the board available, and occasionally, the logic analyzer was
useful when I didn't understand what my code was doing to a data structure.
But usually, it was just a lot easier to simply flash the board and run the
thing.

What's the use of a simulator when you can spin up an AWS instance and run
your program on a real FPGA?

~~~
krupan
Simulations give you better controllability and better visibility. In other
words, you can poke and prod every internal piece of the design in simulation
land. In real hardware, not so easy.

That being said, you are far from alone as an FPGA developer in skipping sim
and going straight to hardware. Tools like Xilinx's chipscope help with the
visibility problem in real hardware too.
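
A minimal testbench sketch of what that controllability/visibility looks
like (my_dut is a hypothetical design, just for illustration):

    module tb;
        reg         clk = 0;
        reg  [31:0] a, b;
        wire [31:0] sum;

        my_dut dut (.clk(clk), .a(a), .b(b), .sum(sum));

        always #5 clk = ~clk;              // free-running clock

        initial begin
            a = 32'd3; b = 32'd4;          // drive any input you like
            @(posedge clk); #1;
            $display("sum = %0d", sum);    // observe any output...
            $display("%0d", tb.dut.sum);   // ...or reach into the hierarchy
            $finish;
        end
    endmodule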

------
the_duke
I'd be interested in practical use cases that come to your mind (like someone
who commented about genomics data processing for a university).

What could YOU use this for professionally?

(I certainly always wanted to play around with an FPGA for fun...)

~~~
scott_wilson46
Monte Carlo sims for options pricing? I've done this before on an FPGA; I
might have a go at doing it on this instance as a fun exercise to test the
concept!

~~~
dx034
Not sure if that makes sense with the offer that Amazon has. The machines are
huge, so either you're pricing a huge amount of options at a very high speed
(which you'd probably do in-house with FPGAs that you own), or you'll be much
cheaper using a good machine locally. Never found MC sims to be a bottleneck
regarding time, but YMMV I guess?

~~~
scott_wilson46
I've heard (although admittedly never seen in practice) that some places take
a long time for this sort of thing (running over a cluster of computers
overnight). If you could do the same job on a single F1 instance in, say, an
hour, then I think that would be compelling! Bearing in mind that simple
experiments I did showed an improvement of around 100x for this sort of task
over a GPU.

------
koolba
Just wait till this gets combined with Lambda.

~~~
dx034
How would they do that? Since the FPGAs are not shared, I don't see how you
could use it for very short-lived instances.

~~~
koolba
If the spin-up time is fast enough, then they could do it. Alternatively, if
it's active enough, there would be a stream of requests processed by the
same, already-loaded FPGA.

------
mozumder
Anyone have a hardware ZLIB implementation that I can drop into my Python
toolchains as a direct replacement for ZLIB to compress web-server responses
with no latency?

Could also use a fast JPG encoder/decoder as well.

~~~
fpgaminer
Why stop there? Hack your kernel to deliver network packets directly to the
FPGA and then implement the whole server stack in the FPGA. Why settle for
response times on the order of milliseconds when you can get nanoseconds?

But seriously, I'm open to ideas for technologies that you or anyone else
needs implemented for these instances. Would make an interesting side business
for me.

EDIT: I should point out that I'm an experienced "full-stack" engineer when it
comes to FPGAs. I've implemented the FPGA code and the software to drive them.
None of this software developed by "hardware guys" garbage.

~~~
mozumder
Speaking as a hardware guy, I think that's the ultimate goal as well :)

Been planning a NIC card that directly serves web apps via HDL for a while
now...

------
kylek
I'm not totally up to date on it, but the RISC-V project has a tool (Chisel)
that "compiles" to verilog... Interesting times for sure!

~~~
emmelaich
Also check out clash-lang.org, which compiles an almost-Haskell language to
VHDL, Verilog, and others.

------
_nrvs
_NOW_ things are getting really interesting!

------
mmosta
FPGA instances are a game changer in every way.

Let this day be known as the beginning of the end of general-compute
infrastructure for internet-scale services.

------
lisper
Newbie question: What do verilog and VHDL compile down to, i.e. what is the
assembly/machine language for FPGAs?

~~~
JensSeiersen
A logic-gate/register netlist, i.e. a digital schematic of your design. This
is done by a synthesizer program. The netlist is then mapped to the available
resources of your chosen FPGA by a mapping program, giving you the logically
equivalent schematic in terms of the FPGA's resources. Then the netlist is
placed-and-routed to fit it into the FPGA. If the design is too large/complex
or the timing requirements too strict (too high a clock frequency), this
phase can fail. It can also take many hours to complete, even on fast
computers.
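
A tiny example of what that flow consumes (my illustration): the behavioral
description below synthesizes to a netlist of an 8-bit comparator, a mux,
and eight flip-flops; mapping expresses that in LUTs and registers, and
place-and-route pins it to physical sites on the die.

    module max_hold (
        input  wire       clk,
        input  wire [7:0] sample,
        output reg  [7:0] peak
    );
        initial peak = 8'd0;      // FPGA registers can take power-on values
        always @(posedge clk)
            if (sample > peak)    // netlist: 8-bit comparator
                peak <= sample;   // netlist: 2:1 mux feeding 8 flip-flops
    endmodule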

------
jordz
Azure will be next I guess. They're already using FPGA based systems to power
Bing and their Cognitive Services.

~~~
XnoiVeX
That's just anecdotal. No one has seen it. The Wired article sounded like
content marketing.

~~~
dgacmu
They've published papers about it
([https://www.microsoft.com/en-us/research/publication/configurable-cloud-acceleration/](https://www.microsoft.com/en-us/research/publication/configurable-cloud-acceleration/))
and they're giving talks about it -- Mark Russinovich was here a few weeks
ago with a very long talk. Doug Burger and Derek Chiou are leading a lot of
these efforts, and they're absolutely for real.

I'm not sure I agree with them that this is the right path forward (but
they're smart and know their stuff, so I'm probably wrong), but it's
absolutely for real.

------
brilliantcode
Wow. That's what was going through my mind reading this article, but it
quickly (and sadly) dawned on me that I probably won't be able to build
anything with it, since we aren't solving problems that require programmable
hardware. Euphoric nonetheless to see this kind of innovation coming from
AWS.

------
Ceriand
Is there direct DMA access to/from the network interface bypassing the CPU?

~~~
alexforencich
Doesn't look like it from the article. That could be very interesting, but
there could be network architecture constraints that prevent Amazon from
providing that from the get-go. And it wouldn't be used in all cases, so that
could burn a lot of switch ports. Seems like they're targeting more compute
offload and less network appliance.

------
SEJeff
Are these custom FPGAs, or Altera or Xilinx parts?

~~~
alexforencich
Looks like they are using Xilinx Ultrascale+ FPGAs.

------
jasoncchild
Oh man...this is freaking awesome!

------
n00b101
This is _huge_

