Hacker News
Altera's Secret Processor Unveiled: a Quad-Core ARM Cortex-A53 (eetimes.com)
50 points by rbanffy on Oct 31, 2013 | 25 comments

As far as I can tell, the AArch64-ness of these parts is only half of the (very exciting!) story ... the really interesting thing is that these are being manufactured on Intel's 14nm FinFET process. To date, I didn't think anyone was doing FPGAs on Intel's process except for Achronix, and they're relatively low-volume... so high-volume FPGAs (and with ARM cores, no less!) on a leading-edge Intel process is truly astounding.

These will be ghastly expensive parts, that much is for sure, but they have the potential to be really fast. I've always been a Xilinx user, but I will absolutely give credit where credit is due here: Altera seems to have pulled off something pretty remarkable. I'm excited to see what performance is like when these actually show up.

There's also Tabula. Amusingly, both Achronix and Tabula claim to be the first [FPGA] vendor on Intel's fab.

Has anybody used Tabula? What are their pricing/performance levels? Is their marketing true?

I do know that they have Conal Elliott and that he's been exploring compiling Haskell to their Spacetime architecture. Though I gather that's still a ways off from being for general consumption.

I wonder how much more successful FPGAs would be if you could actually write your own compilers for them.

As of now, as far as I can tell, none of the major FPGA vendors actually document the actual format of their programming. You have to use a compiler from them to compile a netlist down to the raw bytecode that's actually sent to the FPGA to program it.

This means that there are many potential uses of an FPGA that you just can't do. For instance, you can't write a GLSL shader compiler that compiles down to an FPGA on the fly, or compile OpenCL to an FPGA, or add a language extension that compiles certain highly parallelizable statements in your language down to an FPGA.

It seems that FPGA manufacturers are missing out on an awful lot by not actually opening up their hardware to experimentation by third parties. Anyone know why exactly they act this way?

You can write the tools you describe on top of any existing tool stack -- you'll just have to use the hardware description language as an IL. (Indeed, many tools already do this.)
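To make the "HDL as an IL" idea concrete, here is a minimal sketch of a front-end that lowers a tiny expression tree to Verilog text, which a vendor toolchain would then synthesize. The `Var`/`BinOp` types, the `emit_module` helper, the module name, and the 8-bit width are all hypothetical choices for illustration, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str          # a Verilog binary operator such as "+", "&", "|", "^"
    left: "Expr"
    right: "Expr"


Expr = Union[Var, BinOp]


def inputs_of(e):
    """Collect input names in first-seen order, without duplicates."""
    if isinstance(e, Var):
        return [e.name]
    names = inputs_of(e.left)
    for n in inputs_of(e.right):
        if n not in names:
            names.append(n)
    return names


def emit_expr(e):
    """Render the tree as a fully parenthesized Verilog expression."""
    if isinstance(e, Var):
        return e.name
    return f"({emit_expr(e.left)} {e.op} {emit_expr(e.right)})"


def emit_module(name, expr, width=8):
    """Emit a purely combinational Verilog module computing `expr` on output y."""
    ins = inputs_of(expr)
    lines = [f"module {name}({', '.join(ins + ['y'])});"]
    for n in ins:
        lines.append(f"  input  [{width - 1}:0] {n};")
    lines.append(f"  output [{width - 1}:0] y;")
    lines.append(f"  assign y = {emit_expr(expr)};")
    lines.append("endmodule")
    return "\n".join(lines)


# (a + b) ^ c, lowered to HDL text instead of to a bitstream.
verilog = emit_module("demo", BinOp("^", BinOp("+", Var("a"), Var("b")), Var("c")))
print(verilog)
```

The generated text still has to go through the vendor's synthesis and place-and-route tools, which is exactly the dependency being debated in this thread.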

You could write your own synthesis front-end and place-and-route back-end, but they're not like any compilers you've ever seen before. (A recent Coursera class [1] went into pretty good detail about how synthesis and place and route work.) I think that the unfamiliarity is probably why there are no Open Source toolchains for it.

For the Virtex-II platform, Xilinx provided a tool to edit bitfiles, called JBits [2], but it seems to be well and truly dead [3].

[1] https://www.coursera.org/course/vlsicad

[2] http://www.xilinx.com/labs/projects/jbits/

[3] http://forums.xilinx.com/t5/Virtex-Family-FPGAs/Question-abo...

Sure, you could use tools that target the HDL as an IL, if you wanted to write that code yourself and then ship it to your customers as a binary blob, as their HDL compilers need to be specifically licensed, and aren't, as far as I know, something designed to be used as a back-end component in shipping software.

But that limits its usefulness for something like, say, writing a GLSL compiler so you can use your FPGA as a GPU, and are then able to run arbitrary GLSL code on it.

I agree that it's definitely much different from writing a compiler backend for a CPU. You are definitely going to have different abstractions that will apply well to it. But that's fine; that's what I'm interested in, writing a compiler for a very different kind of platform. The problem is, as far as I can tell, they don't document their actual bitstream format, so you must use their licensed tools any time you actually want to compile something new for the FPGA. And that precludes a lot of use cases.

> [...] you can't [...] or compile OpenCL to an FPGA

I noticed that Altera's own product page (http://www.altera.com/devices/fpga/stratix-fpgas/stratix10/s...) specifically mentions that:

> C-based design entry using the Altera SDK for OpenCL, offering a design environment that is easy to implement on FPGAs

But it seems the release is very early; I was not able to find pricing or even packaging information. It does seem to pack quite a punch, judging from the performance numbers on the above page ("56 Gbps chip-to-chip/module capability for leading edge interface standards" sounds like ... a lot).

Also, that page has some super-annoying trickery that appends text when you copy and paste, to insert a link back to some Altera page. What idiocy.

So, Altera is able to write an OpenCL compiler for their platform. I am not, because they don't document the low-level bitstream format; you always need to have some licensed Altera compiler in between what you write and the actual hardware.

As the author points out in the comments, these are targeted for "high end" applications like crypto processors. I've got one of the Zedboards [1] and I really like the basic board, but damn if it isn't nearly impossible to set up with Xilinx tools. The Quartus tools from Altera are a bit better but they too have their quirks. I keep hoping pg will get a chance to fund a startup that is doing the HDL equivalent of PHP :-)

[1] http://www.zedboard.org

Didn't Xilinx release a high-level development tool for FPGAs, using a C-like language?

One problem, though: it costs around 5k, so it's not hacker friendly.

Another tool I know of comes from Convey Computer, which offers tools to develop FPGA accelerators. It's not cheap, especially since decent FPGA chips are expensive.

Btw, Convey offers more efficient memcached servers built using FPGA chips.

And I've read some academic work regarding a visual tool that lets bioinformatics guys program FPGA chips.

But since the Zedboard is priced reasonably for hackers, I wonder if it offers enough performance to be used as an accelerator?

I know these guys just applied to YC: http://tempoautomation.com/

I don't think that a robot like Tempo Automation's is the issue. We have low-cost CNCs and pick-and-place machines, and more importantly places like Sierra Circuits. The harder problem is, as ChuckMcM described it, an easy-to-use high-level abstraction over the hardware to simplify and accelerate the design process. Fabbing a board isn't the hard part.

Out of curiosity, what is difficult about setting it up with Xilinx tools?

Mostly the lack of any ability to disassociate stuff. You need to start in PlanAhead and design all the hardware then build the software stack for that hardware, and then you can maybe test some of your ideas you were trying to implement.

My goal is to put together a system for teaching people about computers. Looking back one can teach the equivalent of a 4 year CS degree (the CS classes anyway) from 20 years ago to get them the fundamentals, to pretty much anyone, except that the platforms available to do that are very complex. I was trying to teach basic concepts to kids and either the systems were too complex (PCs) to really understand them all, or too simple (AVR) to do all of what you wanted. There were a lot of MCUs but most were designed (like the Atmel ATMega 328) for "embedding" so they had not enough RAM and too much flash for the kinds of things I got to do on a PDP-10 or a PDP-11 back in school. So I wrote a PDP-11 emulator for the ATMega 32. It's not all that hard and even running it out of serial FRAMs and with SPI emulating the front panel and switches, it can be as fast as an 11/23 was. But the reason I started there was that there was already other software that had been written for it (I was targeting an RT-11 bootup, but the smallest SD Card you can get is bigger than the largest Hard Drive you could use with RT-11 (140MB)).
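To give a feel for why a PDP-11 emulator is "not all that hard", here is a minimal sketch of the fetch-decode-execute loop. It is not the emulator described above: it assumes a toy subset (MOV, ADD, HALT; register and autoincrement source modes; register destinations only), keeps memory as a list of 16-bit words, and ignores condition codes entirely. The `run` function and its layout are hypothetical.

```python
def run(program_words):
    """Execute a tiny PDP-11 subset and return the register file R0..R7."""
    mem = list(program_words) + [0] * 16   # memory as 16-bit words
    r = [0] * 8                            # R7 is the PC, held as a byte address

    def fetch():
        word = mem[r[7] // 2]
        r[7] = (r[7] + 2) & 0xFFFF
        return word

    def read_source(mode, reg):
        if mode == 0:                      # register: Rn
            return r[reg]
        if mode == 2:                      # autoincrement: (Rn)+ ; with R7 this is #immediate
            addr = r[reg]
            r[reg] = (r[reg] + 2) & 0xFFFF
            return mem[addr // 2]
        raise NotImplementedError(f"source mode {mode}")

    while True:
        instr = fetch()
        if instr == 0:                     # HALT
            return r
        opcode = instr >> 12               # double-operand opcode field
        src_mode, src_reg = (instr >> 9) & 7, (instr >> 6) & 7
        dst_mode, dst_reg = (instr >> 3) & 7, instr & 7
        if dst_mode != 0:
            raise NotImplementedError("only register destinations in this sketch")
        value = read_source(src_mode, src_reg)
        if opcode == 0o01:                 # MOV
            r[dst_reg] = value
        elif opcode == 0o06:               # ADD
            r[dst_reg] = (r[dst_reg] + value) & 0xFFFF
        else:
            raise NotImplementedError(f"opcode {opcode:o}")


# MOV #5,R0 ; MOV #7,R1 ; ADD R0,R1 ; HALT
regs = run([0o012700, 5, 0o012701, 7, 0o060001, 0])
print(regs[0], regs[1])
```

A real emulator adds the remaining addressing modes, byte instructions, the processor status word, and device emulation, but the core loop keeps this shape.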

A number of people argued (successfully) that while a PDP-11 could teach basic concepts well, it would be better if you could re-use the experience in a more 'modern' setting, so I've been using ARM Cortex-M processors in my latest iteration. The ARM-9 crosses a bit too far over the threshold but provides a great 'follow-on' after the basic concepts are covered. One huge problem though is video.

My first video card was a Dazzler card (really easy S-100 video card) and later CGA, and VGA graphics. And while you can kinda/sorta do register-level code on the RaspPi it was nothing like the C64 or Amstrad of its time. Those were machines you could understand completely. So a side project grew to build a '2D Framebuffer' for an inexpensive ARM system. I had been working the STM32F409 but just picked up an STM32F429 which has a simple frame buffer built in. Prior to that I was thinking something along the lines of the Sun CG3 frame buffer with shared DRAM connected via the FSMC port on the ARM chip to provide 16MB of memory accessible to the FrameBuffer and user programs. The Zedboard provides a nice HDMI plug connected to FPGA fabric for doing that research even if the ARM9 is overkill; once finalized, my plan was to move the Framebuffer design to a Spartan-3E series part (the one I was looking at is $10 in 1K quantities).

But what I have not been able to do is to build a system with the Zedboard and just iterate on the Framebuffer part. In the ideal scenario I'd be able to reconfigure the FPGA on a running system but that has its own issues. I'd settle for rebooting into the next generation framebuffer but all I've been able to do is full up place and route followed by full scale sysgen. Very very painful.

More than you wanted to know I'm sure but it's a sore point with me :-)

> Mostly the lack of any ability to disassociate stuff.

I think I get where you are coming from. I've never approached FPGA work in this manner. What I mean by that is that every single FPGA board I've worked on I designed from scratch for the specific narrow application we were trying to address. I haven't had the need for full or partial dynamic reconfiguration at all.

Maybe I misunderstood. Modern tool sets from Xilinx (and probably Altera, but I am not that familiar with their offerings) have the ability to do incremental compilation. That, combined with allocating an area to your framebuffer code should net you reasonably quick design iterations.

Unless you are trying to do more than the basics, a frame buffer isn't all that complex: a standard DRAM controller core to talk to memory, a set of FIFOs, and a state machine to decide who does what when. If this is strictly for painting text and some basic graphics on the screen, it's a relatively simple matter to double or triple buffer the interface so you can render text/graphics in the background and flip between the live and background buffers during the vertical blanking period.
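The double-buffering scheme can be sketched in software as a behavioral model. This is only an illustration of the idea, not HDL and not anyone's actual design; the class name, the one-byte-per-pixel layout, and the `on_vblank` hook are assumptions made for the example.

```python
class DoubleBufferedFramebuffer:
    """Behavioral model: draw into the back buffer, swap only at vertical blank."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # One byte per pixel, two full frames.
        self.buffers = [bytearray(width * height), bytearray(width * height)]
        self.front = 0              # index of the buffer being scanned out
        self.flip_pending = False

    @property
    def back(self):
        return 1 - self.front

    def put_pixel(self, x, y, value):
        # Rendering never touches the buffer the display is reading.
        self.buffers[self.back][y * self.width + x] = value

    def request_flip(self):
        # The renderer signals that the back buffer is complete.
        self.flip_pending = True

    def on_vblank(self):
        # Called once per frame while no pixels are being scanned out,
        # so the swap can never tear the visible image.
        if self.flip_pending:
            self.front = self.back
            self.flip_pending = False

    def scanout(self):
        return bytes(self.buffers[self.front])


fb = DoubleBufferedFramebuffer(4, 2)
fb.put_pixel(1, 0, 255)
fb.request_flip()
before = fb.scanout()   # still the old frame: the flip has not happened yet
fb.on_vblank()
after = fb.scanout()    # the new frame, visible only after vertical blank
```

In hardware the same role is played by a base-address register that the scanout state machine latches during vertical blanking.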

S-100. Ah. The good old days when you could actually wire-wrap your own peripheral boards. Here's a data point to date me: I ran AutoCAD on an S-100 system with a separate S-100 math co-processor card and a 640K ram-drive card. Worked like a charm. Oh, yes, interaction was through a VT-100 terminal (the real deal) and a tablet with a wired puck.

How many people on HN do you figure actually touched an IMSAI (outside a museum)?

Wow, am I mistaken here, or are we witnessing the shift to a computing landscape where FPGAs with hard cores like ARM are the de facto standard, versus the ASICs of today? And will this remove Xilinx from the throne of FPGA market leader?

This chip will probably cost $8000 or more. Equivalent ASICs will cost much less. And yes, the problem with ASICs is the higher development cost (NRE), but companies like BaySand and especially eASIC offer low-NRE chips with performance/price close(r) to ASIC.

The nice thing about those companies is that the low NRE and the possibility of doing low volumes might open ASICs to startups and niche applications.

One such example is VMC, which builds bitcoin mining chipsets using eASIC.
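The NRE trade-off above comes down to a break-even volume: the one-time ASIC cost divided by the per-unit saving over the FPGA. A rough sketch follows; the NRE figure and the ASIC unit price are purely hypothetical placeholders, and only the $8000 FPGA price echoes the guess above.

```python
import math


def break_even_units(nre_cost, asic_unit_cost, fpga_unit_cost):
    """Smallest volume at which the ASIC route is no more expensive overall.

    Returns None if the ASIC has no per-unit saving over the FPGA.
    """
    saving_per_unit = fpga_unit_cost - asic_unit_cost
    if saving_per_unit <= 0:
        return None
    return math.ceil(nre_cost / saving_per_unit)


# Hypothetical numbers: $2M NRE and a $500 ASIC versus an $8000 FPGA.
units = break_even_units(nre_cost=2_000_000, asic_unit_cost=500, fpga_unit_cost=8_000)
print(units)
```

Lowering the NRE shrinks the break-even volume proportionally, which is why low-NRE vendors matter for startups and niche products.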

Xilinx hasn't said much about their next generation so it's impossible to tell who's ahead, but I would expect the next Zynq to get plenty of design wins just based on familiarity. http://www.xilinx.com/about/generation-ahead-20nm/index.htm

I can't help but wonder how much this would have impacted Bunnie's laptop design if it were available when he began work.

How is that going? All I could find were some old blog posts and this wiki: http://www.kosagi.com/w/index.php?title=Novena_Main_Page

I don't think Bunnie would contemplate replacing a $10 CPU and a $20 FPGA with an integrated one that costs $4K - $20K. :)

Zynq, the Xilinx equivalent of this, costs from US$50 up to US$4000. Also, the Zynq was already available by the time Bunnie started his project; that's why I never liked it.

How well does a DSP mine bitcoins?

Poorly now that Bitcoin ASICs exist.
