Intel Buys Altera for $16.7B (bloomberg.com)
339 points by rbanffy on June 1, 2015 | 140 comments

I find this an interesting move. Not many remember that Lattice Semiconductor was nominally[1] spun out of Intel in '83. Back in the early 80's the "computer" companies like AMD, Intel, and National were also making chips to help connect them to the chips around them. And people had started building something called Programmable Array Logic (PALs) which was trademarked by AMD so folks also called them Gate Array Logic or GALs. The idea was that you had this microprocessor that had one interface or needed a simple peripheral, and this programmable chip could provide it.

I arrived at Intel just after the folks who started Lattice had left and the position of Intel was that programmable logic was a neutral idea at best, and generally bad or unsuitable for any sort of volume idea which is where Intel wanted to play.

But the interesting thing about the 22nm node is that you can put a whole lot of programmability into a pretty cost effective FPGA, so the pendulum that had swung to ASICs is now swinging back, and FPGAs are starting to show up in "general purpose" production products. And here we are with Intel buying an FPGA company.

[1] My memory is fuzzy on the exact details, whether it was just a bunch of engineers who decided to quit and moved over to Lattice when it was founded or if Intel helped with the founding by trading some fab capacity and engineers in exchange for some equity in the new company.

I don't know much 'bout the wider world, but FPGAs are hot right now in atomic physics and quantum computing, e.g.



results which were only really possible when those labs started switching to FPGAs from the DACs they'd been using previously.

Is "DACs" a typo for "DSPs"? If you really do mean DACs, can you elaborate on what those DACs were doing and how that was replaced by FPGAs?

The experiments use independent time-varying voltages on a set of electrodes to move a trapping potential well (and the ion it contains) along a path.

They need high-quality voltage waveforms on those electrodes to make sure they don't lose the ion or disturb its quantum state, and the DACs they had in '10 didn't have the update rates of the FPGA apparatus they built in '12 and they improved transport times by something like a factor of 100 by making the switch.

I looked at Bowler's thesis. The DAC chips set the performance, the FPGA is just a natural way of interfacing to them.


I hope Intel puts their weight into FPGA now instead of just seeing this as taking out a potential (future) competitor. IMO the technology has a lot of promise, especially in HPC where hardware often gets very specific tasks, so general purpose chips are a waste.

Edit: It looks like they've already been pursuing this, I wasn't aware of Xeon+FPGA pairings yet: http://www.extremetech.com/extreme/184828-intel-unveils-new-...

It's been a trend for years that we'll have more general purpose CPUs augmented with specialized hardware acceleration units in one single unit, e.g.,

* Intel/AMD CPUs merged with GPUs;

* ARM mobile cores merged with GPUs, GPS, Gyro sensors, voice processors, finger print processors, etc.;

* ARM CPU with FPGA (Xilinx Zynq, Altera SoC FPGA).

Their architectures are usually CPU-central, such that the CPU speed and memory bandwidth are the main bottlenecks.

For the Intel+Altera merger, I'd like to see them come up with architecture innovations from another perspective:

* Programmable logics and signal fabrics (bus) connecting general purpose CPUs and hardware acceleration units;

* Bandwidth and connections central, like the Internet, massive processing power comes from massive intelligent connected simple individual nodes (CPU/modules);

* Programming such an architecture more closely resembles how the human brain works (reconfigurable inter-connects of neurons).

If anything Intel has been moving towards "more general purpose" computing in the past few years with Xeon Phi.

They do custom extensions for large customers and announced a Xeon/FPGA combination last year.

.. which hasn't exactly paid off yet. Whether Knights Landing changes that remains to be seen. Intel's reliance on x86 for everything is IMO holding them back in the data parallel HPC segment.

What markets has Phi really made inroads into besides oil and gas legacy codes?

Any good examples where FPGAs outperform standard CPUs?


Low-latency interaction with hardware. In this case, it was possible to start outputting a TCP reply before the end of the packet it was replying to.

In general, FPGAs win for integer and boolean functions which are amenable to deep pipelining, or are capable of very high parallelism. One of my colleagues produced a neat chart of what level of parallelism was best suited to which of CPU/GPU/FPGA.

FPGAs don't have an advantage if you're memory or IO bound, and they have terrible power consumption.
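To put a rough number on the deep-pipelining point above, here is a minimal cycle-count sketch. It is a toy model, not a measurement: the stage count and workload size are hypothetical, and it assumes one item can enter the pipeline every cycle with no stalls.

```python
def cycles_unpipelined(n_items: int, depth: int) -> int:
    """Each item passes through all `depth` stages before the next one starts."""
    return n_items * depth


def cycles_pipelined(n_items: int, depth: int) -> int:
    """Fill the pipeline once (depth cycles), then finish one item per cycle."""
    if n_items == 0:
        return 0
    return depth + n_items - 1


if __name__ == "__main__":
    n, d = 1_000_000, 64  # hypothetical workload size and pipeline depth
    speedup = cycles_unpipelined(n, d) / cycles_pipelined(n, d)
    print(f"speedup from pipelining: {speedup:.1f}x")
```

For long streams the speedup approaches the pipeline depth, which is why functions that can be cut into many small stages suit this style of hardware.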

You are not quite correct about power consumption. FPGA can be quite competitive when applied right.

First, of course, are very efficient Ethernet ports and many other efficient hard blocks.

Second, you can fuse many operations into one. For example, you cannot fuse 3 FP adds into one operation on either a GPU or a CPU. It is often 2 or 4, rarely something in between or outside of those. GPU and CPU operations on vectors can be wasteful as vectors can be underutilized, and on an FPGA you can create circuitry that fits the problem rather well.

I also think that you are not quite right about I/O or memory boundness. In FPGA you can add another I/O controller and use extra device for I/O, loading spare resource in FPGA. Same for memory - DDR controller synthesized into cells won't be very optimal, but you can have many of them nevertheless.

It depends on what you're doing. The static power consumption is rather high. If your solution fits fairly exactly into the FPGA and is running most of the time, then yes.

Having two DDR controllers helps the overall memory bandwidth number if your reads are not localised. If you're doing a lot of data-dependent loads this doesn't help at all (e.g. scrypt).

In all cases it's much harder for software developers to develop for FPGA, so this cost needs to be factored in. They're very good in their niche, not a general purpose silver bullet.

I think you are not quite right about non-local data access and bandwidth. Convex (I believe, I may not be right) has memory controller that provides full bandwidth utilization for R-MAT random graph analyses.

Your example with scrypt also does not access much of RAM, especially with Salsa20/8 standard function. As far as I can see, it also has parallelization parameter, and computation within top loop can be done in parallel.

Yes, it is hard to program for FPGA. But not that much - I myself programmed a system that performed translation from a (pretty much high level) imperative description of an algorithm to synthesizable Verilog/VHDL code. In a month and a half.

In my opinion, programming for FPGA is very entertaining, especially if you do not write V* code by hand.

This question comes up often, and it's the wrong question to ask.

Here's the thing. FPGA performance has nothing to do with it. They do what CPUs can't.

You don't choose to plug an FPGA in place of a CPU, you plug it where you can't.

Tying peripherals together, glue logic, bus connection. Some FPGAs have a builtin CPU (or you can plug a soft one together with your circuit)

While true in many cases, you can bet your bottom dollar that Intel did not spend $16.7bn to enter the embedded glue logic market. It is very much the compute aspect they are after, as a weapon against AMDs HSA and other GPGPU-style solutions.

But an FPGA (as it is today) cannot compete with a GPGPU

Maybe they will go for on-the-fly reconfiguration for specific computations (as: load your specialized circuit in an FPGA and fire away)

I designed the first commercial Bitcoin mining FPGAs, and though for a while the FPGAs were not competitive with GPUs (overall, they beat them on power and usability) they eventually were (with the advent of Kintex and similar). Of course, that only lasted briefly, as the rapid growth in the market led to an influx of VC money to fund ASICs.

And that's where FPGAs shine; that small to medium volume market where small companies are doing innovative things but don't have the millions required to risk building an ASIC.

Bitcoin mining was probably close to a best-case scenario for FPGA compute though - it was computing a fixed function designed for easy hardware implementation at full capacity 24/7. And even then, actually implementing it and making it competitive was a huge colossal pain.

It was completely compute bound. Communicating with the host using just an RS232C UART was sufficient to keep the FPGA busy while computing Bitcoin's 2xSHA256 hashes.
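A small software sketch of why a slow serial link can be enough here. It shows Bitcoin's double SHA-256 over an 80-byte header whose last 4 bytes are the nonce, plus a toy nonce search. The all-zero header prefix, the one-zero-byte "difficulty", the baud rate, and the hash rate in the usage note are all assumptions for illustration, not figures from the design described above.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    """Bitcoin's 2xSHA256: SHA-256 applied to the SHA-256 digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def search_nonce(header76: bytes, zero_bytes: int = 1, max_nonce: int = 1 << 20):
    """Toy search: return the first (nonce, digest) whose digest ends in
    `zero_bytes` zero bytes, or None if none is found below max_nonce."""
    suffix = b"\x00" * zero_bytes
    for nonce in range(max_nonce):
        digest = sha256d(header76 + struct.pack("<I", nonce))
        if digest.endswith(suffix):
            return nonce, digest
    return None


def uart_seconds(n_bytes: int, baud: int = 115200) -> float:
    """Transfer time assuming 8N1 framing (10 bits on the wire per byte)."""
    return n_bytes * 10 / baud


if __name__ == "__main__":
    header76 = bytes(76)  # hypothetical placeholder for the first 76 header bytes
    print(search_nonce(header76))
    print(f"80-byte header over UART: {uart_seconds(80) * 1e3:.2f} ms")
    hypothetical_hash_rate = 100e6  # hashes per second, assumed
    print(f"time to exhaust 2^32 nonces: {2**32 / hypothetical_hash_rate:.1f} s")
```

Under those assumptions the host sends about 7 ms worth of data and the device then has roughly 43 s of hashing to do before it needs new work, so the link sits idle almost all of the time.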

> And that's where FPGAs shine; that small to medium volume market where small companies are doing innovative things but don't have the millions required to risk building an ASIC

I don't disagree with you on that count. Especially in this case (since for most cases a processor does a job with a better cost/benefit), FPGAs shine on very specialized/heavy computation tasks.

Not only that, but I'd say it (could be) the best place to learn about the very bottom of the computing stack, and experiment with chip design and wacky ideas.

> on-the-fly reconfiguration for specific computations

I always wondered, given that Intel's processors already have a pretty large gap between their instruction set and their real microcode, whether it would make sense to have a nominal "CPU" that, when fed an instruction stream, executes it normally on general-purpose cores, but also runs a tracing+profiling JIT over it to eventually generate a VHDL gate-equivalent program to jam into an on-core FPGA. "Hardware JIT", basically, with no explicit programming step needed.

Programming a CPU is becoming more and more a problem of fitting as much in the data caches as possible. Bandwidth is the problem, not the speed of the execution units. I don't see the huge benefit of an FPGA here.

> But an FPGA (as it is today) cannot compete with a GPGPU

I read an interesting quip somewhere on software/hardware development: 'civil engineering would look very different if the properties of concrete changed every 4 years.'

If at some point we stop scaling chip performance. And many-core-integration in/on a single chip/die stops making sense. Then glue logic starts to look like a key differentiator. And control over glue logic starts to look like control over profits.

Intel ate the chipset for performance reasons and so they could shape their own destiny.

If there aren't fundamental breakthroughs to preserve performance scaling as we know it, then I see this as more of the same.

But an GPGPU (as it is today) cannot compete with a FPGA.

Of course, it depends entirely on what you're doing with them. Keywords: horse, course, different.

That's the logical assumption. Of course, to make this work, the tooling would have to be wildly different; at the very least, an open bitstream format.

In addition to the benefits that others have already mentioned, when you use an FPGA, you can customize your hardware to provide task-specific features. An interesting example would be this demoscene project by LFT (Linus Åkesson):


"For this demo, I made my own CPU, ... cache, ... blitter with pixel shader support, a VGA generator, and an FM synthesizer."

In his explanation for why he wrote his own CPU in the FPGA, Linus explained "...I was able to take advantage of the added flexibility. For instance, at one point the demo was slightly larger than 16 KB, but I could fix this by adding some new instructions and a new addressing mode in order to make the code compress better."

I knew something like this had to exist. Shouldn't this approach extend to modern hardware? E.g. surely there must be cases where it is effective to use custom FPGA-based hardware fit for the job, rather than (or in addition to) CPU and GPGPU?

I heard counter-arguments to the tune of 'hardly anyone wants to program their FPGA' which sound strange to me: after all, hardly anyone wants to program their pixel shaders, either.

The real problem with FPGAs is that for 99% of use cases, CPU/GPGPU is good enough. And in the cases you really do need the extra speed, it's rare you also need the flexibility, in which case you'd make an ASIC. There is a niche for FPGAs (especially in prototyping), but it's not as big of a market as you would imagine.

Awesome demo.

Also, dude has a Symbolics Space Cadet keyboard. Respect.

FPGA's are chips, they're mostly orthogonal to CPU's. Sure there's some overlap in the margins, but the vast majority of FPGA's do things that CPU's can't do. There is some competition because an FPGA can do everything a CPU or GPU can, but the converse isn't true: there's lots of stuff that an FPGA can do that a CPU or GPU can't. To oversimplify, an FPGA could replace almost any digital chip anywhere.

HFT optimizes for low latency rather than high performance, and FPGA is currently the state of the art there to my knowledge [1].

When it comes to performance, you should look at FPGA in terms of performance per Watt. Generally speaking they outperform GPUs by an order of magnitude in FLOP/W*s [2], which in turn already have ~ 3x-5x advantage over Xeons [3]. This measure is the most important one when it comes to the question of how many chips you can put in a given rack. FPGAs are still held back in terms of performance per dollar invested, since they have been quite pricey - with Intel this could change.

[1] http://stackoverflow.com/questions/17256040/how-fast-is-stat...

[2] http://synergy.cs.vt.edu/pubs/papers/adhinarayanan-channeliz...

[3] http://streamcomputing.eu/blog/2012-08-27/processors-that-ca...

> Generally speaking they outperform GPUs by an order of magnitude in FLOPS/W*s [2]

The paper you link to is measuring MSPS/W (mega samples per second) and the algorithm they are studying relies on fixed point. It uses built in DSP blocks in the FPGA that are integer only. There is no floating point so it is incorrect to say this shows FPGAs give better FLOPS/W. It isn't all that surprising the FPGAs are doing better, the GPUs are all about floating point which isn't being used here.
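For readers unfamiliar with the distinction, here is a minimal sketch of what an integer-only, fixed-point filter looks like. It is not the polyphase channelizer from the linked paper. The Q15 format and the two-tap averaging filter are assumptions chosen to keep the example short; the multiply-accumulate in the inner loop is the kind of operation an integer DSP block performs.

```python
Q = 15  # number of fractional bits (Q15 format), an assumption for this sketch


def to_q15(x: float) -> int:
    """Convert a coefficient in [-1, 1) to a Q15 integer."""
    return int(round(x * (1 << Q)))


def fir_q15(samples: list[int], coeffs_q15: list[int]) -> list[int]:
    """Integer-only FIR filter: multiply-accumulate, then shift back down."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs_q15):
            if n - k >= 0:
                acc += c * samples[n - k]  # integer multiply-accumulate
        out.append(acc >> Q)  # drop the fractional bits
    return out


if __name__ == "__main__":
    taps = [to_q15(0.5), to_q15(0.5)]  # two-tap moving average
    print(fir_q15([1000, 3000, 5000], taps))
```

No floating-point operation appears anywhere in the filter itself, so counting FLOPS for such a workload does not make sense, which is the objection raised above.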

Their GPU implementations use floating point as well as int and short. The efficiency barely differs between them showing that this particular GPU wasn't optimising with integer power efficiency in mind (which an FPGA implementation relying on DSP48s very much is).

Altera is claiming they will have >10 TFLOPS next year. They designed floating point DSP blocks in the Arria 10 and Stratix 10 (due out 2016Q1).


It would be interesting to see the same experiment repeated using an NVidia Tesla and an Intel Xeon Phi. They used AMD GPUs not targeted at HPC so it's unsurprising the integer path is not power efficient (desktop/mobile graphics is all floating point).

Repeating the experiment with Tesla or Xeon Phi will show you the same thing: that GPUs are less efficient than FPGAs in this load. Their inferior efficiency has nothing to do with whether the polyphase channelization load is integer or floating point. A GPU consists of hundreds or thousands of microprocessors that have a traditional architecture: instruction decoding block, execution engines, registers, etc. Decoding and executing instructions is inherently less power-efficient than having this logic hard-wired as it can be in an FPGA.

> A GPU consists of hundreds or thousands of microprocessors that have a traditional architecture: instruction decoding block, execution engines, registers, etc.

Any example of a GPU with "hundreds or thousands of microprocessors"? Nvidia Titan X has 12 [1] microprocessors by your definition.

[1]: SM, Streaming Multiprocessor in Nvidia's terminology. Smallest unit that can branch, decode instructions, etc.

I am well aware of the technical details and that I used a liberal definition of "microprocessor". My wording was vague on purpose (I didn't want to delve into the details). I didn't mean to imply that each "microprocessor" had their own instruction decoding block (they don't).

An AMD Radeon R9 290X has 2816 stream processors (44 compute units of 64 stream processors) per their terminology. There is only 1 instruction decoder per compute unit, so a stream processor cannot completely branch off independently, but it can still follow a unique code path via branch predication. This is kind of comparable to an Nvidia GPU having "44 streaming multiprocessors".

But whether you call this 44 or 2816 processors is irrelevant to my main point: a processor that has to decode/execute 44 or 2816 instructions in a single cycle while supporting complex features like caching, branching, etc, is going to be less efficient than a FPGA with hard-wired logic (edit: "hard-wired" from the view point of "once the logic has been configured").

gchadwick also said integer workloads were "not power efficient" on GPUs, but that's also false. Most SP floating point and integer instructions on GPUs are optimized to execute in 1 clock cycle, so they are equally optimized. And of course integer logic needs fewer transistors than floating point logic, so an integer operation is going to consume less power than the corresponding floating point operation.
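A rough software analogy for the branch predication mentioned above, assuming a 64-lane group sharing one instruction stream. Real hardware masks lanes off rather than computing and discarding results, so this only illustrates the cost model: the group spends time on both sides of a divergent branch. The function names and the branch itself are hypothetical.

```python
LANES_PER_CU = 64   # stream processors per compute unit, per the comment above
COMPUTE_UNITS = 44  # compute units on the R9 290X, per the comment above


def scalar_reference(x: int) -> int:
    """What each lane would compute if it could branch independently."""
    return x * 2 if x % 2 == 0 else x + 100


def predicated_group(lanes: list[int]) -> list[int]:
    """One instruction stream for all lanes: evaluate the predicate, step
    through BOTH branch bodies, and let a per-lane mask pick the result."""
    mask = [x % 2 == 0 for x in lanes]    # per-lane predicate
    then_vals = [x * 2 for x in lanes]    # 'then' path issued for the group
    else_vals = [x + 100 for x in lanes]  # 'else' path issued for the group
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]


if __name__ == "__main__":
    lanes = list(range(LANES_PER_CU))
    assert predicated_group(lanes) == [scalar_reference(x) for x in lanes]
    print(COMPUTE_UNITS * LANES_PER_CU, "stream processors in total")
```

Each lane ends up with the result of its own code path, but the group as a whole pays for both paths, which is why 44 and 2816 are both defensible counts depending on what is meant by "processor".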

FPGA's don't actually have "hard-wired logic" though - they have a configurable routing fabric that takes up a substantial proportion of the die area and has much worse propagation delays than actual hardwired logic, leading to lower clocks than chips like GPUs. Being able to connect logic together into arbitrary designs at runtime is pretty expensive.

Thanks for pointing it out, I'm so used to FLOP used for benchmarks that I don't even question it anymore - mega samples didn't tip me off as being IP only.

I think AMD GPUs are much better at integer operations. NVIDIA ones are good at floating point.

> since they have been quite pricey - with Intel this could change

Why would it? The price of an FPGA is not really determined by its production cost – for medium-size Xilinx FPGAs for example, the ratio of price/chip production cost is on the order of 50.

By that you mean the margin? Well the margin is largely depending on competition (or the lack thereof), isn't it? That's why I think a big player could make a difference iff they put their weight into it.

Not necessarily margin, rather non-recurring engineering costs (fab masks, design, software development), marketing, sales, administration, other overhead. You need massively higher demand in order to lower those costs, which won't happen only because Intel is entering the market. Instead, you'd need to significantly change the principles and trade-offs of FPGAs, for which I don't see any indications.

Here's how I see FPGA in a perfect world: You have an x86 compiler that finds, through static analysis, a subset of your program that fits nicely on your FPGA, and according to performance models leads to a good speedup. This program is then automatically passed on to the FPGA compiler. So for the novice programmer the thing is just a black box, enabled by some compiler flag. Furthermore you can steer the compiler through usage of compiler directives such as OpenMP or OpenACC. We know that FPGA compilation takes a long time, so the problem here is fixing mistakes - every iteration may potentially cost you hours, which AFAIK is what makes FPGA programming so costly. Therefore the static analysis has to be of very high quality and the auto-offloading should be conservative. This sort of thing IMO could significantly change the popularity of FPGA, thus offsetting the R&D costs.

FPGAs don't outperform CPUs. That doesn't really make sense. FPGAs outperform software. Also, if you're in the chip business, FPGAs let you prove designs in-situ before moving to ASICs.

Signal processing.

FPGAs have high speed links and can perform thousands of operations in parallel. When you have a huge amount of data to process in real time, an FPGA (or ASIC) is often your only choice.

Second this. Video processing is somewhere I've used FPGAs with great success. Embedded systems especially get a huge perf/watt boost when vision processing goes (at least partly) to an FPGA.

Video works really well because it's discrete (60fps) and an FPGA just can't hit the clock speeds an ASIC will. If you already have a semantic clock requirement that is low, streaming works great.

Here is what they are really good at: deliver your custom hardware early to market (vs. ASIC or custom). This is a huge benefit if you consider that the first to market usually gets to own the market and command the best price.

Wrote a mandelbrot generator for uni that performs at several fps on a 25MHz clock using 8 multipliers. Definitely not possible on a CPU! https://instagram.com/p/2Y4CMtP95Q/?taken-by=yawn_brendan

But yeah pretty much any repetitive computation. 3DES enc/decryption is another example (although I think people normally use ASICs for that when they want to do loads of it).
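For reference, the inner loop of a Mandelbrot generator like the one mentioned above is small and multiplier-heavy. The sketch below is plain floating-point Python, not the commenter's design; how that design used its 8 multipliers is not stated. It only shows that each iteration needs three multiplications, which is the part that maps onto hardware multipliers.

```python
def escape_time(cr: float, ci: float, max_iter: int = 64) -> int:
    """Iterate z = z*z + c from z = 0 and return the iteration at which
    |z|^2 first exceeds 4, or max_iter if it never does."""
    zr = zi = 0.0
    for i in range(max_iter):
        zr2 = zr * zr  # multiply 1
        zi2 = zi * zi  # multiply 2
        if zr2 + zi2 > 4.0:
            return i
        zi = 2.0 * zr * zi + ci  # multiply 3 (the doubling is a shift in fixed point)
        zr = zr2 - zi2 + cr
    return max_iter


if __name__ == "__main__":
    # Coarse ASCII rendering over a hypothetical viewport.
    for row in range(-10, 11):
        print("".join(
            "#" if escape_time(col / 20.0, row / 10.0) == 64 else "."
            for col in range(-50, 21)
        ))
```

Every pixel runs the same short loop independently of its neighbours, which is the "repetitive computation" property that makes it easy to replicate in parallel hardware.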

A friend did some research in this area, comparing FPGAs, CPUs and GPUs. He published a paper [1] in regards to performance for several common Linear Algebra computations across a variety of input sizes. In particular, Figure 2 shows you where each of the platforms works best.

FPGAs are essentially re-programmable hardware, so they tend to outperform CPUs/GPUs when you program them for a specific task. They don't have to deal with most of the overhead that the more generalized platforms deal with which is why they dominate in the small input sizes. However, with FPGAs you're trading space (silicon) for that re-programmability so you can't have as much hardware in the same area as say a GPU. Thus, when the data sizes have saturated the available hardware of the FPGA for computation, the GPU begins to outperform. Due to the decreasing node sizes (28nm, 22nm, etc), we can fit more programmable logic into the same area, which causes the chart I mentioned above to shift more into the FPGA's favor.

[1]: http://www.researchgate.net/profile/Sam_Skalicky/publication...

I work on developing FPGA based prototyping platforms (Basically chip verification solutions). We are one area where FPGAs perform better than standard CPUs.

Bitcoin mining?

You are right, but FPGA is very old news in Bitcoin mining, about 3-4 years old; everyone has moved to custom ASICs.

Bitcoin mining. (although ASICs have now taken over, for a good part of 2010-2013 FPGAs were the king esp. for power limited mining rigs)

IIRC many mining ASICs are those same FPGA layouts etched permanently.

Think of FPGAs as having the potential to be primitive GPGPUs. They outperform CPUs in all the same areas GPUs outperform CPUs.

FPGAs are like GPUs with no floating-point, no caches, and limited local memory. But if you can implement a kernel in FPGA with comparable memory bandwidth, you'll usually outperform GPGPU while using as little as 1/50 the power.

> Think of FPGAs as having the potential to be primitive GPGPUs. They outperform CPUs in all the same areas GPUs outperform CPUs.

FPGAs have several orders of magnitude lower latency than GPGPU. GPUs have memory access latency of 1 microsecond, getting something useful out of them >1 ms. FPGAs can have state machines running at 200 MHz, or 5 ns cycle time.

> FPGAs are like GPUs with no floating-point, no caches, and limited local memory. But if you can implement a kernel in FPGA with comparable memory bandwidth, you'll usually outperform GPGPU while using as little as 1/50 the power.

Some FPGAs do have floating point hard blocks. Integrated SRAMs (syncram) can be used as caches and usually are. FPGAs usually have DRAM controllers as hard blocks, so local memory is not that limited. Unless you consider up to 8 GB (newer models up to 32 GB) limited.
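The latency comparison above is simple arithmetic, sketched here using only the figures quoted in the comment (a 200 MHz state machine, 1 microsecond GPU memory latency, 1 ms to get something useful out of a GPU). The function names are made up for this example.

```python
def cycle_time_ns(clock_hz: float) -> float:
    """Clock period in nanoseconds."""
    return 1e9 / clock_hz


def cycles_within(window_ns: float, clock_hz: float) -> float:
    """How many clock cycles fit in a time window given in nanoseconds."""
    return window_ns / cycle_time_ns(clock_hz)


if __name__ == "__main__":
    fpga_clock = 200e6  # 200 MHz, from the comment above
    print(cycle_time_ns(fpga_clock), "ns per cycle")
    print(cycles_within(1_000.0, fpga_clock), "FPGA cycles per 1 us GPU memory access")
    print(cycles_within(1_000_000.0, fpga_clock), "FPGA cycles per 1 ms GPU round trip")
```

So a state machine at that clock can take a couple of hundred steps in the time of one quoted GPU memory access, which is where the "orders of magnitude" claim comes from.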

They're used in HFT, it seems mainly for sending and receiving orders.

The FPGAs run entire algorithmic strategies. It's not just the order management that they handle.

FPGAs run the execution of strategy, everything else is done by high level software/systems. FPGAs are connected directly to the exchange and also help receive a feed of market data (in addition to other market data feeds).

I guess that depends on how close to the exchange you want to be. If your strategies were extremely latency sensitive, you might want to have not just the execution logic, but the larger algo framework on there too. This is particularly true of strategies that make use of order book dynamics.

A big advantage that FPGAs have is that they are fast at "single threaded" tasks so they can parse data in formats that aren't easy to parse quickly, like the XML-based FIX protocol.

I'm not sure FPGAs are 'suited' for parsing formats with lots of variable lengths and data dependencies like XML, but obviously they can be faster than typical Von Neumann. I know binary market data streams were one of the first uses in the financial industry.

FIX is usually sent and received as ASCII tags of the form NUM=value\x01 , not as XML.

Source: http://en.m.wikipedia.org/wiki/Financial_Information_eXchang...
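A minimal sketch of the tag=value framing described above, in software. The sample message is hypothetical and deliberately incomplete (it omits required fields such as BodyLength), and the parser does no validation. The checksum helper follows the standard FIX rule: the byte sum modulo 256, written as three digits in tag 10.

```python
SOH = b"\x01"  # field delimiter used by FIX tag=value encoding


def parse_fix(msg: bytes) -> list[tuple[int, str]]:
    """Split a FIX tag=value message into (tag, value) pairs, in order."""
    fields = []
    for chunk in msg.split(SOH):
        if not chunk:
            continue  # skip the empty piece after the final delimiter
        tag, _, value = chunk.partition(b"=")
        fields.append((int(tag), value.decode("ascii")))
    return fields


def fix_checksum(body: bytes) -> str:
    """Sum of all bytes before the checksum field, modulo 256, as 3 digits."""
    return f"{sum(body) % 256:03d}"


if __name__ == "__main__":
    # Hypothetical fragment: 8=BeginString, 35=MsgType, 55=Symbol.
    body = b"8=FIX.4.2\x0135=D\x0155=INTC\x01"
    msg = body + b"10=" + fix_checksum(body).encode("ascii") + SOH
    for tag, value in parse_fix(msg):
        print(tag, value)
```

The fields are variable length and must be scanned byte by byte for the delimiter, which is the sequential, data-dependent kind of parsing the comments above are debating.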

They can't be compared directly, although that will never convince people to stop trying.

You tell a CPU what to do. You tell an FPGA what to be.

They usually require minimal amounts of power in comparison with a CPU with a similar throughput

The next (obvious) question: What's gonna happen w/ Xilinx.

For the longest time, the FPGA market has been dominated by Xilinx and Altera. With one joining Intel, it seems logical to expect a response?

>> The next (obvious) question: What's gonna happen w/ Xilinx.

Altera will have a process advantage over Xilinx and there is nothing that can be done about it. While everyone is talking 14nm, Intel is shipping it and talking 10nm.

This deal makes more sense than any I've seen. Altera gets to use the best fabs in the world to gain an advantage over their competitors. Intel gets to deploy to a new (to them) market where the volumes are lower but the ASPs are higher - this is good given the ever increasing costs of making chips and the lower yields likely at 10nm and 7nm. It looks like a purely strategic move for both companies.

I'd be curious to see just how long it will take Altera to adopt such a fine process for FPGA cells though.

Edit: ha! Apparently, "One reason why Altera is attractive to Intel is that Altera is already using Intel's fab to create its latest generation FPGAs and SoCs; Arria 10 FPGAs and SoCs are being implemented using TSMC's 20nm process, while Stratix 10 FPGAs and SoCs are leveraging Intel’s 14 nm Tri-Gate process (see Altera & Intel to Collaborate on Multi-Die 14nm FPGAs). Furthermore, Altera is also using Intel's state-of-the-art packaging technologies."


FPGAs have a lot of very regular structures, so they're generally one of the first things you produce with a process.

Probably easier to design too, allowing you to tape out your process launch vehicle early.

I hope they develop open and cheap compilation tools to force Xilinx to do the same.

I believe the Xilinx tools are actually free of cost. Still, though, open source FPGA tools are desperately needed.

edit: even if they just opened up the bitfile format. I believe it's a similar situation as GPU instructions sets though: managers saying "no because patents".

I recently wrote a place and route tool for Lattice iCE40 FPGAs [0]. The bitstream was reverse engineered by Clifford Wolf and others as part of the IceStorm project [1]. We're using Yosys [2] for Verilog synthesis, also written by Clifford.

[0] https://github.com/cseed/arachne-pnr

[1] http://www.clifford.at/icestorm/

[2] http://www.clifford.at/yosys/

Wow, that is pretty awesome. I didn't expect this to have arrived that quickly.

Does it support user constraints on relative placement (and routing)? Self-timed logic, such as NULL Convention Logic [1], doesn't fit well with the synchronous FPGA paradigm, but can be implemented on those if the feedback loops are tightly controlled [2]. I'd like to play with that :)

[1] http://en.wikipedia.org/wiki/Asynchronous_circuit

[2] http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=543867...

Whoa, I had no idea you could do async circuits on an FPGA (although I can't get to [2] right now, the site is under maintenance). Right now, I only support IO constraints. The IceStorm project hasn't documented the timing model yet but it is in the pipeline. Is it possible to do what you want with vendor tools? Anyways, email me and we can chat about what you'd need.

Xilinx tools are only free for certain chips and feature sets. If you want tools like ChipScope (which more than pays for itself) or to use any Virtex chips you need to buy their software.

For their lower end chips (All the Spartan series, the larger of the 6 series are still quite useful) they straight up announced an end of development on the main toolchain (ISE) with no plans for support on their newer one.

Opening up the spec alone is not ideal, and it's not just for the patents. Hardware validation of a software spec is expensive. Today they don't need to do that. They can simply change the software to work around broken hardware and nobody needs to know.

I hope the same, but it's not likely in the short term. Intel seems to understand open better than a lot of hardware companies, but it's still got its share of NDAs. And a proprietary compiler AFAIK.

Hardly. They're not even giving away their optimizing compiler for their own CPUs; you have to pay for it for commercial use.

Given that FPGA dev tools are much harder to develop than FPGA chips (at that company scale), I don't think Intel will open them.

I don't understand the logic there. No FPGA vendor will ever turn a profit on their EDA-ware, and nobody would ever buy an FPGA without it. Open the stuff up and let people do some of the heavy lifting. They can still sell value-added bits and bobs if they want, and can concentrate on usability, something they sorely need to do.

In business - "the hard part" == "competitive advantage" . If you open source that , soon you'll see more competitors knocking on your doors.

For example, if Intel wanted, it could have built a decent FPGA, probably in a reasonable time (there were startups that built decent FPGAs).

But building good FPGA software? That it couldn't do, first because it's really hard and second because you need tens of thousands of designs from your customers to run on it in order to make good software. That's not something acquirable.

But whatever level of advantage they have here now is negated by the existence of Xilinx's toolset, built with roughly that level of engineering effort. So why not, at the least, consider damaging the only true competition by opening up?

Life in a duopoly is great . Why change things ?


Why does any company make open source software, then? WebKit, Android, OpenGL, and OpenCV (Intel) are certainly hard to develop.

OSS development has proven to be a useful tool as part of greater business models.

If we look at WebKit, OpenCV and, I think, OpenGL, the software is really a complement to the real business and is less hard to develop than the main products. Also, they are all in the API business, where developers demand open source and have enough power.

As for Android, again it was a strategic necessity for Google, since it needed that distribution point for its huge ad business. Operating systems are a winner-takes-all business and the most effective, lowest risk strategy to get there was open sourcing, even though it's kind of a loss leader. And even with only partially open-sourcing Android, Google is at constant risk of losing control of Android.

The situation in FPGA is different - if altera opens the tool, xilinx and other companies will probably use it , and we'll probably see multiple companies compete on the business.

"Compiling" for an FPGA is far harder than regular compiling for a processor, where you "just" translate your high-level code to machine code.

Here is a link that explains the process: http://curtis.io/others-work/open-tooling-for-fpgas

Right, I understand. Logic synthesis entails constraint solving and other big domain-specific challenges. It is a very difficult problem. (So are web rendering, OS development, graphics API design, and computer vision.)

Yes we can hope. That would be awesome

While this is obviously a win for Altera (they gain the Intel process advantage), this deal does have a number of advantages for Xilinx:

- Altera will now likely lose early access to all the non-Intel fabs. And sometimes one of the non-Intel fabs comes up with a process advantage before Intel does. Fabs love to work with FPGA vendors: the regular structure makes them good targets for new processes, and FPGA vendors are the only people that can sell a single chip for $50,000.

- Big guys are only interested in big markets: Altera may abandon some of the smaller niches they currently play in.

I'm sure there are other advantages Xilinx gains by staying independent.

>> Fabs love to work with FPGA vendors: the regular structure makes them good targets for new processes

I heard from a credible source that this has changed and mobile chips are now being used for "pipe cleaning" in fabs, but I haven't dug into the details of how.

I'm thinking it could be interesting for Nvidia, considering that they have been building up HPC as their second leg, as an insurance policy in case Intel invests heavily into tooling and process technology for FPGA. Plus they're in a partnership with IBM now, who probably have some interesting use cases for future FPGAs (i.e. deep learning).

Basically, they're fucked. Altera's next chip was going to wipe the floor with Xilinx anyways. Xilinx has better tools, but they are going to be in trouble now that they are behind in process.

They'll get acquired by someone, probably in a year or two. If they lose their market leadership in the meantime, the acquisition will be for less than 17B.

They really need to invest in their software and tool flow right now. That's where they have a strong lead.

Heard something about Avago the other day. http://seekingalpha.com/news/2530316-xilinx-up-after-new-int...

I don't really understand this deal.

Altera and Intel already have a fab agreement and Intel is desperate to get customers into their mothballed fabs.

All this talk of technology collaboration could be accomplished with strategic partnerships/licensing that would certainly be less than $16.7B.

This looks like a case of a foundry player buying their customer to prevent it from defecting to TSMC in the future. To me this is a sign of weakness.

> Altera and Intel already have a fab agreement and Intel is desperate to get customers into their mothballed fabs.

If they can't get customers into them, they can get their own products made in them.

Rumor has it that Altera was unhappy with Intel's process and was going to bail. Intel was so dependent on their cooperation that acquisition started to look good.

There is a lot of truth to this - Intel's process (and their tools!) is a FUCKING NIGHTMARE

I think it's more core to Intel's product strategy than this. Intel will build more heterogeneous compute elements on a single die (CPU, GPU and now programmable routing fabric). Intel learned the first time to buy, not build (Larrabee).

Altera has been with TSMC since forever, and in their latest chip they had disastrous results. TSMC just can't compete with Intel's process when it comes to making high quality digital chips. This is a VERY VERY good move for Altera, not sure about Intel though.

Not sure if this is entirely accurate. Intel's process tends to be optimized towards high-performance chips, since this is where the big margins are.

I'm not sure they are ready to devote production capacity (and process development resources) towards very low power products, which are also less profitable.

That was one of the reasons they failed when trying to enter the cellular market.

> I'm not sure they are ready to devote production capacity (and process development resources) towards very low power products, which are also less profitable.

Isn't that what they're already doing? What about things like the Quarks used in the Galileo and the Edison?

With Java being a sort of common API for lots of programmers to reach the hardware, I have always wondered if we would ever see hardware that directly implements the JVM bytecode, or even Java SE.

Apparently Altera has something called JVXtreme[1], which claims a 55x peak and 15x sustained performance increase.

From the pdf: "JVXtreme accelerates the actual execution of Java by executing 87 of the most commonly used Java byte codes in hardware."

[1] https://www.altera.com/en_US/pdfs/products/ip/ampp/documents...

Performance increase over a JVM _interpreter_. They admit this is slower than compiled code. It would be interesting to see data on the complete Cost/Performance/Power/Memory tradeoff relative to a state-of-the-art JIT compiler.

There were lots of people doing and trying to do this earlier in Java's development:



I've not heard so much about those projects lately.

It turns out that while Java bytecode is very nice as a compiler target, it's horrible as an instruction set. There have been quite a few experiments with hardware-accelerated Java bytecode, and all of them turned out to be slower than just targeting a simpler instruction set. (Something that's been played out before in CISC vs RISC.)

Had. I believe all modern ARM cores have totally dropped support for executing Java bytecode natively in favour of traditional JIT compilers.

This is pretty damned exciting IMO. Intel has a long history of thorough documentation and high-quality tools, available to anybody for a reasonable price if not free. Basically the polar opposite of every single player in the FPGA space. Their tools are garbage, cost a mint, and documentation is lax.

I had no idea Altera was worth that much... they always seemed like such a small-time company when we used their products in class.

There is a thesis that FPGAs will take share from ASICs as upfront costs for producing ASICs continues to climb, seemingly in an exponential manner as feature size falls. Some Altera investors think the company is worth significantly more than $54 per share, given the market could become significantly bigger and the competitive dynamic is pretty nice for the two big players.
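The thesis above comes down to a break-even volume: an ASIC only wins once its upfront (NRE) cost has been paid back by a lower per-unit price. A rough sketch of that arithmetic, assuming the FPGA has no upfront cost at all and using made-up dollar figures purely for illustration:

```python
def break_even_units(asic_nre, asic_unit_cost, fpga_unit_cost):
    """Volume at which total ASIC cost equals total FPGA cost.

    Assumes the FPGA has zero upfront cost and that unit costs do not
    change with volume; both are simplifications.
    """
    if fpga_unit_cost <= asic_unit_cost:
        raise ValueError("FPGA unit cost must exceed ASIC unit cost")
    return asic_nre / (fpga_unit_cost - asic_unit_cost)


# Hypothetical numbers, not real market data: as NRE grows tenfold per
# step, the volume needed to justify an ASIC grows tenfold too.
for nre in (1_000_000, 10_000_000, 100_000_000):
    units = break_even_units(nre, asic_unit_cost=10, fpga_unit_cost=60)
    print(f"NRE ${nre:,}: ASIC pays off above {units:,.0f} units")
```

Any product expected to ship fewer units than the break-even figure is cheaper to build on an FPGA, which is why rising ASIC upfront costs push more designs that way.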

Certainly FPGA users see them as a distant second to Xilinx, but that may be why they need to be bought. Also, Altera is making 14nm parts at Intel's foundry, which suggests they could integrate pretty easily with Intel's CPU products.

I don't know which FPGA users you're talking about. Huawei, one of the largest FPGA users in the world, has been buying from Altera for a long time now.

Altera's software is miles ahead of Xilinx's. And their hardware is pretty close, with Xilinx getting the edge.

>> Altera's software is miles ahead of Xilinx's.

Can you please expand on that?

Also, what about partial reconfiguration? Isn't Xilinx the leader there? Isn't that important for general compute?

By software I mean their place/route and synthesis tool, Quartus. The algorithms used in Quartus, based on VPR (Versatile Place and Route), are the reason Altera is in the position they are in now, relative to a decade ago when they were struggling quite a lot.

Place and route in digital circuits is an NP-hard problem. Xilinx solves this problem using analytical place and route (solving a giant system of equations), explained here a bit more: http://www.xilinx.com/products/design-tools/vivado/implement...
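To give a feel for what "solving a giant system of equations" means, here is a toy sketch of quadratic (analytical) placement on a single line. It is only an illustration of the general idea, not how Vivado works; the function name, the pad names and the tiny netlist are all made up for the example, and every movable cell is assumed to connect to at least one other thing.

```python
def quadratic_place_1d(movable, fixed, nets, iterations=200):
    """Minimise the sum of squared two-pin net lengths on a line.

    The minimum satisfies a linear system; here it is solved with plain
    Gauss-Seidel iteration instead of a real sparse solver.
    """
    pos = dict(fixed)
    for name in movable:
        pos[name] = 0.0
    neighbours = {name: [] for name in movable}
    for a, b in nets:
        if a in neighbours:
            neighbours[a].append(b)
        if b in neighbours:
            neighbours[b].append(a)
    for _ in range(iterations):
        for name in movable:
            # Setting the derivative of the quadratic cost to zero puts
            # each cell at the mean position of everything it connects to.
            linked = neighbours[name]
            pos[name] = sum(pos[n] for n in linked) / len(linked)
    return {name: pos[name] for name in movable}


# Hypothetical netlist: two fixed I/O pads with two movable cells chained
# between them.
fixed_pads = {"pad_left": 0.0, "pad_right": 3.0}
two_pin_nets = [("pad_left", "a"), ("a", "b"), ("b", "pad_right")]
placement = quadratic_place_1d(["a", "b"], fixed_pads, two_pin_nets)
print(placement)
```

The cells end up spread evenly between the pads. A real tool does this in two dimensions over millions of cells and then has to legalise the result, since nothing here stops cells from overlapping.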

Altera's tool uses simulated annealing. Historically, Altera's tool has been faster and easier to use, allowing users to implement larger designs and synthesize them more quickly. However, I don't know the state of things RIGHT NOW, but I'm pretty sure this was the case 2 years ago.
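For anyone who hasn't seen simulated annealing applied to placement, here is a minimal sketch: start from a random placement, swap pairs of blocks, always keep improvements, and sometimes keep a worse result while the "temperature" is high. This is a toy, not Quartus's or VPR's actual implementation; the grid size, the cooling schedule and the chain-shaped netlist are arbitrary assumptions, and it expects exactly one block per grid site.

```python
import math
import random


def wirelength(placement, nets):
    """Total Manhattan length of two-pin nets for a block -> (x, y) map."""
    total = 0
    for a, b in nets:
        (ax, ay), (bx, by) = placement[a], placement[b]
        total += abs(ax - bx) + abs(ay - by)
    return total


def anneal_placement(blocks, nets, width, height, seed=0,
                     start_temp=5.0, cooling=0.95, moves_per_temp=100):
    rng = random.Random(seed)
    sites = [(x, y) for x in range(width) for y in range(height)]
    rng.shuffle(sites)
    placement = dict(zip(blocks, sites))  # random legal starting placement
    cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    temp = start_temp
    while temp > 0.01:
        for _ in range(moves_per_temp):
            a, b = rng.sample(blocks, 2)
            placement[a], placement[b] = placement[b], placement[a]
            new_cost = wirelength(placement, nets)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / temp) to escape local minima.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(placement), cost
            else:
                # Rejected: undo the swap.
                placement[a], placement[b] = placement[b], placement[a]
        temp *= cooling
    return best, best_cost


# Hypothetical design: nine blocks wired in a chain on a 3x3 grid.
blocks = [f"b{i}" for i in range(9)]
nets = [(blocks[i], blocks[i + 1]) for i in range(8)]
best, best_cost = anneal_placement(blocks, nets, width=3, height=3)
print(best_cost)
```

Recomputing the full wirelength after every swap is fine for nine blocks; production tools update the cost incrementally and tune the schedule heavily, which is where a lot of the engineering effort goes.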

Altera's chip design/layout is lacklustre compared to Xilinx's though.

Also, partial reconfig is a very important feature, but it's not used by all customers. Most people just want large FPGAs so they can fit large designs on them, have really fast I/O (Transceivers) and meet timing on their circuits.

I know Xilinx created a whole new toolset (which they released a couple of years ago), and [1] says they moved away from simulated annealing. So at least according to the article, they aren't behind Altera.


I had no idea they were so widely used! My comment is very naive and unqualified, as I haven't done any digital design since my college days.

Quartus II is ahead of what Xilinx has to offer? Dear god. Xilinx's offering must be truly horrible!

It is. As is everyone else's. Nothing sucks more than FPGA toolchains. It's hard for software people to understand what FPGA users have to deal with.

Here's a good thought exercise to help visualize the situation: imagine if Intel had insisted on keeping the x86 instruction set closed and undocumented, and, furthermore, that they had succeeded in doing so. For 30 years we've had nothing but proprietary Intel compilers. How much would life suck? How far behind would we be?

Xeons with FPGAs are something that their high-end customers would consume in major volumes. Some of the investment banks have been doing FPGA acceleration for algo trading for several years.

I wonder what broadly available FPGAs would mean for systems development, though... could be pretty interesting.

I was reading about the differences between CLDCs and FPGAs and got curious: why are CLDC offerings so much simpler than FPGAs? Considering the same technology and feature sizes, I'd expect CLDCs, since they lack the reconfigurability of FPGAs, to be able to hold more complex designs.

What's a CLDC? I've never heard of them before. Do you mean CPLD?


Of course, yes. CLDC is something that collided with CPLD in my brain, probably due to a Java-induced trauma many years ago. CLDC is a Java ME profile (a very simple smartphone from another age).

I wonder if, in addition to the obvious high-end Xeon+FPGA combo, we are going to see smaller Quark+FPGA devices for the embedded market too. It would make a lot of sense, but on the other hand Intel never much cared for embedded.

I hear they are doing this to get ARM out of the FPGA+SoC space.

Yes, I'm curious about this as well. Altera has a couple of SoCs using Cortex-A9 and Cortex-A53 IP; I wonder if the whole division will get killed.

But ARM+FPGA hybrids are available from Xilinx as well.

Not so much get them out of the market, but to be a viable competitor in that market. Perhaps in much the same way as they're trying to do with Quark against regular ARM SoCs.

So, it's not like Intel is struggling really? I heard news of them laying off a large part of their workforce, while relocating to Hillsboro.

Intel should have bought Broadcom as well.

"Altera’s devices can have their function updated, even after they’ve been installed in end-devices. While they’re sold in relatively small volumes, programmable logic usually requires the latest in production technology because it’s some of the largest chips in the industry."

Somebody at Bloomberg is obviously a little confused about FPGAs.

Are dedicated video encoding / decoding chips some of those made by Altera?

I thought this said, "Intel Buys Africa for $16.7B"
