This is often harder at large companies because you very rarely make career progress playing defense, so it becomes very tricky to do it fairly. It can work wonders if you have the right teammates, but it’s almost a prisoner’s dilemma that falls apart as soon as one person opts out.
Good point. We will usually only rotate when the long-running task is done, but eventually we’ll arrive at some feature that takes more than a few weeks to build, so we will need to restructure our methods then.
This is a pretty common perspective that was introduced to me as “shifting the goalposts” in school. I have always found it a disingenuous argument because it’s applied so narrowly.
Humans are intelligent + humans play go => playing go is intelligent
Humans are intelligent + humans do algebra => doing algebra is intelligent
Meanwhile, humans in general are pretty terrible at exact, instantaneous arithmetic. But we aren’t claiming that computers are intelligent because they’re great at it.
Building a machine that does a narrowly defined task better than a human is an achievement, but it’s not intelligence.
Although, in the case of LLMs, in-context learning is the closest thing I’ve seen to breaking free from the single-purpose nature of traditional ML/AI systems. It’s been interesting to watch for the past couple of years because I still don’t think they’re “intelligent”, but it’s not just because they’re one-trick ponies anymore. (So maybe the goalposts really are shifting?) I can’t quite articulate yet what I think is missing from current AI to bridge the gap.
> Meanwhile, humans in general are pretty terrible at exact, instantaneous arithmetic. But we aren’t claiming that computers are intelligent because they’re great at it.
"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." - Edsger Dijkstra
> breaking free from the single-purpose nature of traditional ML/AI systems
is it really breaking free? so far LLMs in action seem to have a fairly limited scope -- there are a variety of purposes to which they can be applied, but it's all essentially the same underlying task
It _is_ all the same task (generating text completions), but pretraining on that task seems to be abstract enough that the model works decently well on narrower problems. An LLM certainly handles a collection of tasks like “sentiment analysis of movie ratings” and “spam classification” dramatically better than a purpose-built model for either of those tasks would if I tried to use it the way I can use an LLM.
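A rough sketch of what I mean, in Python. `llm_complete` is just a placeholder for whichever completion API you happen to use (not a real library call), and `make_classifier` is my own illustrative helper; the point is that the model stays the same across tasks and only the prompt changes:

  from typing import Callable, List

  def make_classifier(complete: Callable[[str], str], instructions: str, labels: List[str]):
      """Turn a generic text-completion function into a task-specific classifier."""
      def classify(text: str) -> str:
          prompt = (
              f"{instructions}\n"
              f"Allowed answers: {', '.join(labels)}\n"
              f"Text: {text}\n"
              f"Answer:"
          )
          answer = complete(prompt).strip().lower()
          return answer if answer in labels else labels[0]  # crude fallback if the model rambles
      return classify

  # Usage (assuming some llm_complete(prompt) -> str exists):
  #   sentiment = make_classifier(llm_complete, "Classify the movie review's sentiment.", ["positive", "negative"])
  #   spam_flag = make_classifier(llm_complete, "Decide whether this email is spam.", ["spam", "not spam"])

A purpose-built sentiment model gives you nothing like that second line.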
Someday, if we have computers that are capable of doing 100% of the cognitive tasks humans do, better than any human can, we might still say they’re “just” doing X or Y. It might even be disappointing that there isn't a “special sauce” to intelligence. But at the end of the day, the mechanism isn’t important.
We are already playing with some incredible ingredients: machines that can instantly recall information, (in principle) connect to any electronic device, calculate millions of times faster than brains, and perfectly self-replicate. Just using these abilities in a simple way is already pretty darn powerful.
Interesting, I’ve always run Pis from wall power. Is the Pi hardware incapable of power optimizations similar to the Coral’s, or is it a lack of software support for power management on the Pi? (I assume from your mention of external hardware to manage power that it’s not just a software issue.)
A typical battery-powered IoT device like the Ring video doorbell will last ~2 months on a ~6000mAh, ~3.3V battery, which works out to an average power draw of about 14 milliwatts.
This is quite low - a power LED can use more than 14 milliwatts. Of course some products have power consumption even lower than that, right down to the tens-of-microwatts level.
Meanwhile a Raspberry Pi, when idle, consumes ~3 watts [1]. That's about 200x more than the video doorbell.
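Back-of-the-envelope arithmetic behind those two figures (treating "2 months" as roughly 60 days):

  battery_mah = 6000
  battery_v = 3.3
  energy_wh = battery_mah / 1000 * battery_v   # ~19.8 Wh in the pack
  lifetime_h = 60 * 24                         # ~2 months of runtime
  doorbell_w = energy_wh / lifetime_h          # ~0.014 W

  pi_idle_w = 3.0                              # idle Raspberry Pi, per [1]

  print(f"doorbell average draw: {doorbell_w * 1000:.1f} mW")   # ~13.8 mW
  print(f"idle Pi draws ~{pi_idle_w / doorbell_w:.0f}x more")   # ~200x (218 with these numbers)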
Getting the power consumption down requires (a) that your hardware draws very little power when it's in sleep mode, and (b) that it spends as much time as possible in that sleep mode. Hardware and software have to work together to achieve this, and the software changes can be extensive.
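To put numbers on (b): average power is basically a duty-cycle calculation. The figures below are invented, but they're the right order of magnitude for a sleepy IoT device:

  sleep_w = 50e-6          # 50 uW in deep sleep (hypothetical)
  active_w = 1.5           # 1.5 W awake with radio/camera running (hypothetical)
  awake_s_per_hour = 30    # wake for ~30 seconds every hour (hypothetical)

  duty = awake_s_per_hour / 3600
  avg_w = duty * active_w + (1 - duty) * sleep_w
  print(f"average draw: {avg_w * 1000:.1f} mW")   # ~12.5 mW with these numbers

Note that the sleep current barely matters here; what matters is how rarely and how briefly you wake up, which is exactly the part that needs software cooperation.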
I'm really mixing two things and wasn't very clear about it.
The Pico can kinda deep sleep, but it requires an external wakeup trigger; it can't wake from deep sleep on its own clock. Even so, its deep sleep draws quite a lot of power compared to most embedded chips.
The Zero (W) and Zero 2 W don't have the equivalent of suspend-to-RAM with a low sleep current. I'm not sure if that's a limitation of the SoC, the driver, or both, but rpi was fairly clear it wasn't in the cards: https://github.com/raspberrypi/linux/issues/1281
I'm pretty sure it's the hardware. The SoCs they use in the mainline Pi products are generally targeted at applications that aren't battery powered, so they don't focus on fast sleep support and similar power optimisations (which make the system design much more complex). Unfortunately, if you want that, you generally need to go with phone SoCs, which tend to be incredibly NDA-bound, very hard to buy even if you're as big as the rpi company, and available for only short windows, which runs counter to the requirements of large parts of the SBC market segment.
A knowledge graph is really just a projection of structured data from disparate sources into a common schema.
Take a bunch of tables and convert each row into a set of (rowkey, columnName, value) tuples, one per column. Now take the union of all the tables.
^ knowledge graph
That’s it…but it’s not very useful yet. It becomes more useful if you apply a shared ontology during the import, i.e. translate all the columns into the same namespace. Suppose we had a “contacts” table with columns {“first name”, “last name”, …} and an “events” table with columns {“participant given name”, “participant family name”, …}. Basically, you need to unify the term you use for the concept “first name”/“given name”/whatever across all sources.
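A toy version of that import step in Python, with made-up tables and a made-up ontology mapping, just to show the mechanics:

  contacts = [{"id": "c1", "first name": "Ada", "last name": "Lovelace"}]
  events = [{"id": "e1", "participant given name": "Ada", "participant family name": "Lovelace"}]

  # Shared ontology: translate every source's column names into one namespace.
  ontology = {
      "first name": "givenName",
      "participant given name": "givenName",
      "last name": "familyName",
      "participant family name": "familyName",
  }

  def to_triples(table_name, rows):
      for row in rows:
          row_key = f"{table_name}/{row['id']}"
          for col, value in row.items():
              if col != "id":
                  yield (row_key, ontology.get(col, col), value)

  knowledge_graph = list(to_triples("contacts", contacts)) + list(to_triples("events", events))
  # [('contacts/c1', 'givenName', 'Ada'), ('contacts/c1', 'familyName', 'Lovelace'),
  #  ('events/e1', 'givenName', 'Ada'), ('events/e1', 'familyName', 'Lovelace')]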
This can be cool/useful because you now only need one table (of triples) to describe all your structured data, but it’s also a pain because you may need lots of self-joins or recursive queries to recover your data before you can do useful things with it. The final table has a very simple “meta” schema; the schema of each individual source is erased and pushed down into the data itself.
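And the flip side, sketched with the same made-up data: to get a row-shaped view back out you have to re-pivot the triples (in SQL over a single triples table, that's a self-join per attribute you want back as a column).

  from collections import defaultdict

  triples = [
      ("contacts/c1", "givenName", "Ada"),
      ("contacts/c1", "familyName", "Lovelace"),
      ("events/e1", "givenName", "Ada"),
      ("events/e1", "familyName", "Lovelace"),
  ]

  def pivot(prefix, triples):
      """Group triples by subject to rebuild something row-like."""
      rows = defaultdict(dict)
      for subject, predicate, value in triples:
          if subject.startswith(prefix):
              rows[subject][predicate] = value
      return dict(rows)

  print(pivot("contacts/", triples))
  # {'contacts/c1': {'givenName': 'Ada', 'familyName': 'Lovelace'}}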
I got my kid a raspi 400 a few years ago. She’s enjoyed it tremendously, but I don’t know how it holds up to more recent alternatives. She’s still using it today, though.
Even if they’re plateauing, there’s still a lot of value to be had in what they already do. I think the mistake so far has been aiming too high or too low: products that require AGI-like LLMs, or unimaginative “low-hanging fruit” ideas that follow obviously from the function of an LLM. The former have been wishful thinking, and the latter have no moat. The Goldilocks zone is understanding what current LLMs can actually do, well enough that you can either build something complicated that we couldn’t do reliably without them or something simple that wasn’t worth doing without them. In both cases the product needs to be built in a way that naturally accommodates the expected failure modes of the tech. (For example, I don’t need it to write all my code; there’s a lot of value in just using ChatGPT to help me write one-off bash scripts.)
[1] applied AlphaZero-style search with LLMs to achieve performance comparable to GPT-4 Turbo with a llama3-8B base model. However, what's missing entirely from the paper (and the subject article in this thread) is that tree search is massively computationally expensive. It works well when the value function enables cutting out large portions of the search space, but the fact that the LLM version was limited to only 8 rollouts (I think it was 800 for AlphaZero) suggests to me that the added complexity is not yet optimized or favorable for LLMs.
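To make the cost concrete, a rough sketch; every number except the 8 rollouts is an assumption I'm making for illustration, not something from the paper:

  tokens_per_step = 200      # tokens generated per reasoning step (assumed)
  steps_per_rollout = 10     # depth of one rollout (assumed)
  rollouts = 8               # the paper's LLM setting

  search_tokens = tokens_per_step * steps_per_rollout * rollouts
  greedy_tokens = tokens_per_step * steps_per_rollout
  print(f"search: ~{search_tokens:,} tokens per answer vs ~{greedy_tokens:,} for a single greedy pass")
  # The search multiplies inference cost by roughly the rollout count,
  # before even counting the value-model evaluations at each node.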
I also found it confusing the first time I saw it. I believe it is sometimes used because the techniques for quantizing DL models are very similar (in some cases identical) to algorithms that were developed for color palette quantization (sometimes shortened to "palettization"). [1] At this point my understanding is that the term is used to be more specific about the type of quantization being performed.
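For intuition, here's a toy version of that palette-style quantization: a 1-D k-means picks a small codebook of weight values, and every weight is replaced by the index of its nearest centroid, exactly like reducing an image to a 16-color palette. This is my own illustrative sketch, not code from [1]:

  import numpy as np

  def palettize(weights: np.ndarray, n_colors: int = 16, iters: int = 20):
      """Cluster weights into a small codebook (1-D k-means); return codebook + per-weight indices."""
      flat = weights.reshape(-1)
      centroids = np.linspace(flat.min(), flat.max(), n_colors)
      for _ in range(iters):
          idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
          for k in range(n_colors):
              if np.any(idx == k):
                  centroids[k] = flat[idx == k].mean()
      return centroids, idx.reshape(weights.shape)

  w = np.random.randn(64, 64).astype(np.float32)
  codebook, indices = palettize(w)      # store 16 floats plus a 4-bit index per weight
  reconstructed = codebook[indices]
  print("mean abs error:", np.abs(w - reconstructed).mean())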