Never say never. With the benefits of Moore's law approaching zero, specialization of CPUs seems to be the way to go. Intel saw the writing on the wall and partnered with Altera, even lending them its state-of-the-art fabs: something one would have thought would never happen. http://www.altera.com/devices/fpga/stratix-fpgas/stratix10/s...
For many tasks, from games to databases, FPGAs could provide huge benefits. The only reason I can see why FPGAs weren't adopted by mainstream PCs is that improving CPUs was so much easier. But with the ever-diminishing returns from x86 improvement, I can very well imagine FPGAs becoming viable in the mainstream.
I don't see FPGA acceleration being useful in mainstream computers soon. By the way, AFAIK the one in Novena is primarily intended for data acquisition, not acceleration.
FPGAs are generally good at accelerating data-parallel applications, but we've already had SIMD and GPGPU for a while. Both technologies are used by only a small subset of the applications that could benefit from them. Why? I would say poor tools and abysmal programmer literacy. Automatic vectorization for SIMD sort of works, but it tends to miss lots of opportunities. Automatic acceleration with GPGPU is pretty much still in the research phase. Manual development for SIMD and GPGPU takes skills that most developers don't seem to have at the moment, and the trend towards high-level, highly abstracted imperative languages isn't helping.
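To make the "missed opportunities" point concrete, here is a minimal C sketch (illustrative only; exact behavior varies by compiler and flags). Without restrict, the compiler has to assume the output array may alias the inputs, so it often emits runtime alias checks or falls back to scalar code, while the hand-written SSE version sidesteps the question entirely:

    #include <xmmintrin.h>  /* SSE intrinsics */

    /* May not auto-vectorize cleanly: the compiler must assume
       a, b and c could overlap. */
    void add_scalar(float *a, const float *b, const float *c, int n) {
        for (int i = 0; i < n; i++)
            a[i] = b[i] + c[i];
    }

    /* Manual SSE version: 4 floats per iteration. Assumes n is a
       multiple of 4 and the pointers are 16-byte aligned. */
    void add_sse(float *a, const float *b, const float *c, int n) {
        for (int i = 0; i < n; i += 4) {
            __m128 vb = _mm_load_ps(b + i);
            __m128 vc = _mm_load_ps(c + i);
            _mm_store_ps(a + i, _mm_add_ps(vb, vc));
        }
    }

The irony is that simply adding restrict to the pointer parameters usually lets the auto-vectorizer generate equivalent code on its own, but knowing that is exactly the kind of literacy most developers don't have.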
I guess at this point it might seem like I'm contradicting myself: the software side of things is lagging behind for GPGPU and SIMD, yet those technologies are still mainstream. Why wouldn't the same happen for FPGA accelerators? My answer is cost. SIMD requires just a few small functional units and registers, a small chip area compared to the caches; the general-purpose processor is itself a fairly small part of a modern SoC. GPGPU didn't require significant architectural changes beyond the shader model. FPGAs large enough to be useful, on the other hand, are expensive. Economies of scale might make a smaller FPGA cheap enough to include as an accelerator, but as far as I understand, the combination of high transistor count, large area, and low yield is unavoidable.
Plenty of exciting stuff going on in research, but automatic use of accelerators would work much better if we moved to a dataflow model of programming.
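A toy sketch of what I mean (nothing like a real dataflow runtime, just the shape of the idea): once every stage is a pure function and the data dependencies are explicit, a scheduler is free to fuse stages, vectorize them, or map one onto an FPGA region without any whole-program analysis.

    #include <stdio.h>

    /* Each node in the "graph" is a pure per-element function.
       A real runtime would schedule stages onto whatever hardware
       fits best; here we just run them on the CPU. */
    typedef float (*stage_fn)(float);

    static float low_pass(float x) { return 0.5f * x; }  /* toy stage */
    static float square(float x)   { return x * x; }     /* toy stage */

    /* Apply a pipeline of stages element-wise over a stream. */
    static void run_pipeline(const stage_fn *stages, int nstages,
                             const float *in, float *out, int n) {
        for (int i = 0; i < n; i++) {
            float v = in[i];
            for (int s = 0; s < nstages; s++)
                v = stages[s](v);
            out[i] = v;
        }
    }

    int main(void) {
        stage_fn pipeline[] = { low_pass, square };
        float in[4] = { 1, 2, 3, 4 }, out[4];
        run_pipeline(pipeline, 2, in, out, 4);
        for (int i = 0; i < 4; i++)
            printf("%f\n", out[i]);
        return 0;
    }

Because the runtime sees the whole graph up front, it can make placement decisions automatically, which is exactly what's hard to do for arbitrary imperative code.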
Personal prediction: non-monotonic same-ISA heterogeneous computing (cores sharing one ISA, but with no core strictly better than the others) is going to be the next big thing, maybe in the form of reconfigurable pipelines. A bit more far-fetched: phase-change materials for very aggressive short-term DVFS to lower latency on mobile.
I would bet on hardcoded accelerators before FPGAs. (Apple's A8 looks like it is already half accelerators.) FPGAs are good for prototyping accelerators before committing them to an ASIC, or for shipping low-volume accelerators (like CAPI or Bing result ranking).
Sure, the average person isn't going to be programming the FPGA themselves, but when they can download someone else's apps for it at a moment's notice and at minimal cost . . . who knows? After all, most people can't write software themselves either.