
Edge TPU – Run Inference at the Edge - obulpathi
https://cloud.google.com/edge-tpu/
======
grumpopotamus
From
[https://en.wikipedia.org/wiki/Edge_computing](https://en.wikipedia.org/wiki/Edge_computing):

Edge computing is a method of optimizing cloud computing systems "by taking
the control of computing applications, data, and services away from some
central nodes (the "core") to the other logical extreme (the "edge") of the
Internet" which makes contact with the physical world.[1] In this
architecture, data comes in from the physical world via various sensors, and
actions are taken to change physical state via various forms of output and
actuators; by performing analytics and knowledge generation at the edge,
communications bandwidth between systems under control and the central data
center is reduced. Edge Computing takes advantage of proximity to the physical
items of interest also exploiting relationships those items may have to each
other.

~~~
TeMPOraL
Basically, how such computing _should_ be done? Preferably with little to no
data reaching the central server.

Funny how computing got centralized and is now slowly getting decentralized
again. I'm happy to see the tech for that developing, but I worry that data
ownership will remain centralized.

------
obulpathi
Eagerly waiting for the dev kits - to be released in a couple of months!

------
deepnotderp
Looking at the image, it looks like the die size is around 10-36 mm², so
presumably it's meant to be an ultra-low-performance chip?

~~~
slivym
Hey. We don't say low performance. We say efficient, or low cost.

~~~
deepnotderp
Sorry, yes, I didn't mean that as a negative; I just meant it wrt its target
market.

Also, smaller != more efficient.

------
dna_polymerase
Int8 and Int16? I've never worked with quantized models - anyone mind sharing
their experience? Do such models achieve state-of-the-art performance?

~~~
ur-whale
a) That's just for inference; you don't train with that.

b) a fully float-trained model "quantized" to int16 typically loses overall
precision, but often works well enough. It's also usually faster (if
implemented properly).

c) there's a version where you go all the way down to int1 (bits) and binary
ops instead of addmuls on floats and ints. It can solve some problems. And
properly compiled, it's wicked fast.

~~~
DoofusOfDeath
> there's a version where you go all the way down to int1 (bits)

There's also a Zen version that uses just 0.5 bits. </joke>

------
knorker
It's an AI chip for mobile devices out on missions. Skynet... err... I mean
Google.. set it to read-only when out on these missions.

Inference only, no ML training. Only the Cloud has löööörning capabilities.

I bet you can just unscrew the head and flip a dip-switch, and it'll start
combining insults in no time.

------
brootstrap
instead of cloud-to-butt, now we need edge-to-butt plugin :p

