And no, AMD doesn't count. ROCm is a mess.
Applied Brain Research has software called Nengo (www.nengo.ai) built explicitly for developing neural network models and compiling them to different backends, including CPUs, GPUs, and neuromorphic hardware (Intel's Loihi, SpiNNaker, SpiNNaker 2, Braindrop). It has been battle-tested over more than 10 years of model development, was used to build the world's largest functional brain model (https://bit.ly/2VNGgSX), and integrates deep learning with spiking neural networks. Would be interested to hear your thoughts on it.
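For a flavor of the API, a minimal model looks something like this (a sketch based on the public docs; the backend package names, e.g. nengo_ocl and nengo_loihi, are from memory and may have changed):

    import numpy as np
    import nengo

    # Represent a sine wave in a population of spiking neurons.
    with nengo.Network() as model:
        stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))  # input signal
        ens = nengo.Ensemble(n_neurons=100, dimensions=1)   # spiking population
        nengo.Connection(stim, ens)                         # stim -> ensemble
        probe = nengo.Probe(ens, synapse=0.01)              # filtered decoded output

    # Reference CPU backend; the same model object can be handed to
    # another backend's Simulator (e.g. nengo_ocl or nengo_loihi) unchanged.
    with nengo.Simulator(model) as sim:
        sim.run(1.0)  # simulate one second of model time

The model description itself is backend-agnostic, which is what lets the same code target CPUs, GPUs, or neuromorphic hardware.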
I don't have first-hand experience using it, but from people I know, it does work.
Where are the numbers for the Cerebras chip coming from?
- How do you have a TDP of 180W for an entire wafer of chips?
- Why is there a peak FP32 number when they are clearly working with FP16?
Each of these chips has a completely different architecture, so it makes no sense to compare them at this level. The only meaningful comparison is actual performance in applications, because that reflects how the entire system will be used.
15 kW power consumption for 1 chip?!?
That's just slideware at the moment.
"Ascend 910 is used for AI model training. In a typical training session based on ResNet-50, the combination of Ascend 910 and MindSpore is about two times faster at training AI models than other mainstream training cards using TensorFlow."
edit: The software framework "MindSpore will go open source in the first quarter of 2020."
I wonder how brittle that performance will be on other model types, such as transformers and deep RL, compared with CNNs like ResNet.
> I’m focusing on chips designed for training
TPU v1 was designed for inference, AFAIK.
> TPU v2: 45 TFLOPs
I think it would be great to clarify that what is commonly referred to as "TPU v2" (e.g. in GCP pricing, and what is shown in the image in this article) consists of 4 such modules with 8 cores in total, which gives the more commonly quoted value of 4 × 45 = 180 TFLOPs.
They claim to have some great features. Does anyone know if or when a consumer version is coming, or whether any release dates have been promised?
Or will they not be able to match (Nvidia) GPUs on price/performance?
Would the Nvidia Jetson count?
How to make your own AI chip
Intel has stuff made by other foundries?
Quoted from the Nvidia Turing datasheet.
I believe it is available via Elastic Inference (or maybe soon will be).
Currently, it only takes about a month of AWS time to break even on buying a consumer GPU like the RTX 2080 Ti, so renting doesn't seem to make sense for training purposes.
- Just looked up the numbers, and Google TPUs are pretty similar in terms of pricing. I think any AWS equivalent would probably be just as expensive compared to a DIY PC.
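Rough back-of-the-envelope for the break-even claim (both prices in the snippet are my assumptions, circa-2019 US rates, not quotes):

    # Both figures are assumptions; swap in current prices before trusting it.
    gpu_cost = 1200.0  # approximate street price of an RTX 2080 Ti, USD
    aws_rate = 3.06    # AWS p3.2xlarge (1x V100) on-demand, USD/hour
    breakeven_hours = gpu_cost / aws_rate  # ~392 hours
    print(breakeven_hours / 24)            # ~16 days of nonstop training

At 100% utilization that's closer to two weeks; at roughly half utilization it stretches to about the one-month figure above.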
Why should they? There is not a lot of money to be gained from renting a niche product in comparison to the enormous capital expenditure for anything hardware related.
Lots of dotcom companies burned themselves badly chasing trendsetters with custom silicon. A cookie-cutter 40nm SoC may cost "just" $10M today, but by getting into the custom-silicon game you risk losing at it. Not to mention that your operational troubles will increase n-fold.
Managing operations for a hosting business with hundreds of thousands of customers is hard enough: logistics, server lifecycle, DC management, procurement contracts with unruly OEMs... Now dock all the troubles that come with chipmakers onto that, and it becomes a nightmare.
Edit: I didn't see that the parent meant a hardware accelerator created by Amazon itself. Thx to @jsty and @paol for pointing this out. An ASIC by Amazon was announced last year and is known as 'AWS Inferentia'.