* Spotify’s Layoff Impact: CEO Daniel Ek was surprised by the significant operational challenges following the layoff of 1,500 employees.
* Financial Performance: Despite achieving record profits of €168 million and revenue growth, Spotify missed its targets for profitability and user growth.
* Investor Reaction: Shares increased by over 8% after the earnings report, reflecting investor confidence.
* Long-Term Strategy: Despite short-term issues, Spotify believes the layoffs were necessary for long-term profitability.
I think even for the new APIs added in non-public betas, it's all on the main website now.
Also, there are docs that aren't on the dev.apple site for things like Apple Pay, but they're not a perk of paying $99 a year. I suspect the actual hardware docs for the Vision headset are private too (they were for the Apple Silicon DTK).
No abstractions turns to spaghetti
Too many abstractions turns to spaghetti
I don't think the answer is one or the other at all, as the blog post implies. I think his previous conclusion of a threshold makes more sense. Perhaps, though, they have chosen the wrong abstractions, hence the confusion.
The thing is, even if it's spaghetti, I can follow the individual noodles. I can carefully pull them out of the pile and lay them out all neat and separated.
With "cleverely abstracted" code, I cannot do that. Because there are no noodles any more. There are small pieces of noodles, all in different sizes, spread out over 200 plates, that reside on 140 tables in 40 different rooms, some of which are in another building. And there are now 20 types of sauce, and 14 types of parmesan, but those are not neatly separated, but spread evenly all over everything, and so it all tastes awful. And each individual plate still manages to be spaghetti.
Okay, yes, I may have stretched the food analogy a bit.
But that's pretty much how I feel when digging through over-engineered codebases.
I loved the “I can follow the individual noodles”. Yes, sometimes even simple spaghetti is better than a spaghetti-tower lasagna, but it's still too slippery to reason through. But I do get the part about being able to at least grasp it, slippery as it is…
Laying the spaghetti out neatly is, I think, the correct level of abstraction. Identify each noodle, separate it, put it in the right place and make sure they’re all in the correct order. You don’t want noodles zigzagging and crossing over each other everywhere. You want nice clean lines of noodles you can easily follow from start to end, and as few of them as possible.
Yes, it's just the new style of spaghetti, this time with lipstick and a bow tie.
One notable difference is that now the GOTOs are implicit and only in your head, so it's arguably even harder to comprehend than the old style of spaghetti.
It also often adds some code hidden in comments ('annotations'), some squirreled away in complex configuration files, some where what actually gets executed depends on concatenated string variables, and, for good measure, some places where which code gets executed (and how) implicitly depends on the file path it lives under. Now we have a mess even a debugger can't help with.
It's nice to see more people finally getting disillusioned with these trends.
> No abstractions turns to spaghetti
> Too many abstractions turns to spaghetti
That's because abstraction is a myth perpetuated by the likes of Sussman (ie academics that never wrote real code) and in reality the only thing that matters is PSPACE ⊆ EXPTIME.
A little confusing, because "vector" here (largely) refers to two different things. "Vector search" is the ANN thing, while the "Vector API" is about SIMD. SIMD provides CPU operations on a bunch of data at a time: instead of one instruction per 32-bit float, you operate on 128, 256, or 512 bits' worth of floats at once, depending on the CPU. So, over scalar code, SIMD here could get maybe a 4-16x improvement (give or take a lot - things here are pretty complicated). So, while definitely a significant change, I wouldn't say it's at the make-or-break level.
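For anyone who hasn't used it, here's a minimal sketch of the kind of thing the Vector API enables (my own illustration, not Lucene's actual code; it needs --add-modules jdk.incubator.vector to compile and run):

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class DotProductSketch {
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    static float dot(float[] a, float[] b) {
        FloatVector acc = FloatVector.zero(SPECIES);
        int i = 0;
        int bound = SPECIES.loopBound(a.length);
        // Each iteration handles SPECIES.length() floats (e.g. 8 with AVX2, 16 with AVX-512).
        for (; i < bound; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            acc = acc.add(va.mul(vb));
        }
        // Horizontal reduction of the lanes, then a scalar tail for the leftovers.
        float sum = acc.reduceLanes(VectorOperators.ADD);
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f, 5f};
        float[] b = {5f, 4f, 3f, 2f, 1f};
        System.out.println(dot(a, b)); // 35.0
    }
}
```

The per-lane accumulator is exactly the kind of thing you can't express in a plain scalar loop and reliably expect the JIT to produce for you.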
As add-on to this comment: There's another Lucene issue from 2 weeks ago that provides some more details on different approaches that were considered: https://github.com/apache/lucene/issues/12302
Great explanation. But to be clear to those who don't follow:
Java's JIT can already auto-vectorize code out of the box, but the optimizer might miss opportunities. With this API it is far more likely that SIMD will be used if it's available, and from the first compilation, so performance should improve.
Lucene here is just dealing with plain float[]s, so Valhalla at least shouldn't affect it much. It seems the limiting thing here is that it has sum accumulators, which the optimizer can't reorder because floating-point addition isn't associative.
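To make the associativity point concrete: with floats, the grouping of additions changes the result, so the JIT isn't allowed to split one running sum into the per-lane partial sums that vectorization needs. A tiny standalone demo (not Lucene code):

```java
// The grouping of float additions changes the answer, so the JIT can't legally
// reassociate `sum += a[i] * b[i]` into independent per-lane partial sums.
public class FloatAssoc {
    public static void main(String[] args) {
        float x = 1e8f, y = -1e8f, z = 1f;
        System.out.println((x + y) + z); // 1.0
        System.out.println(x + (y + z)); // 0.0 -- the 1f is absorbed when added to -1e8f
    }
}
```

With the explicit Vector API, the programmer opts into that reassociation, so the question of whether it's "safe" is taken out of the optimizer's hands.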
>> How did people do this before Lucene supported it?
By performing query expansion based on features of documents within the search results. Very efficient and effective if you have indexed the right features.
"This work is part of the third pillar of our approach to alignment research: we want to automate the alignment research work itself. A promising aspect of this approach is that it scales with the pace of AI development. As future models become increasingly intelligent and helpful as assistants, we will find better explanations."
On first look this is genius but it seems pretty tautological in a way. How do we know if the explainer is good?... Kinda leads to thinking about who watches the watchers...
The paper explains this in detail, but here is a summary: an explanation is good if you can recover actual neuron behavior from the explanation. They ask GPT-4 to guess neuron activation given an explanation and an input (the paper includes the full prompt used). And then they calculate correlation of actual neuron activation and simulated neuron activation.
They discuss two issues with this methodology. First, explanations are ultimately for humans, so using GPT-4 to simulate humans, while necessary in practice, may cause divergence. They guard against this by asking humans whether they agree with the explanation, and showing that humans agree more with an explanation that scores high in correlation.
Second, correlation is an imperfect measure of how faithfully neuron behavior is reproduced. To guard against this, they run the neural network with activation of the neuron replaced with simulated activation, and show that the neural network output is closer (measured in Jensen-Shannon divergence) if correlation is higher.
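As a rough sketch of that scoring step (my own illustration in plain Java, not OpenAI's code, with made-up activation values): the explanation's score is just the correlation between the real activations and the ones simulated from the explanation.

```java
// Illustrative only: score an explanation by correlating real neuron activations
// with the activations a model simulated from that explanation.
public class ExplanationScore {
    static double pearson(double[] actual, double[] simulated) {
        int n = actual.length;
        double meanA = 0, meanS = 0;
        for (int i = 0; i < n; i++) { meanA += actual[i]; meanS += simulated[i]; }
        meanA /= n; meanS /= n;
        double cov = 0, varA = 0, varS = 0;
        for (int i = 0; i < n; i++) {
            double da = actual[i] - meanA, ds = simulated[i] - meanS;
            cov += da * ds;
            varA += da * da;
            varS += ds * ds;
        }
        return cov / Math.sqrt(varA * varS);
    }

    public static void main(String[] args) {
        // Hypothetical activations over a few tokens.
        double[] actual    = {0.0, 0.1, 2.3, 0.0, 1.8};
        double[] simulated = {0.1, 0.0, 2.0, 0.2, 1.5};
        System.out.println(pearson(actual, simulated)); // close to 1.0 => good explanation
    }
}
```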
> The paper explains this in detail, but here is a summary: an explanation is good if you can recover actual neuron behavior from the explanation.
To be clear, this is only neuron activation strength for text inputs. We aren't doing any mechanistic modeling of whether our explanation of what the neuron does predicts any role the neuron might play within the internals of the network, despite most neurons likely having a role that can only be succinctly summarized in relation to the rest of the network.
It seems very easy to end up with explanations that correlate well with a neuron, but do not actually meaningfully explain what the neuron is doing.
Why is this genius? It's just the NN equivalent of making a new programming language and getting it to the point where its compiler can be written in itself.
The reliability question is of course the main issue. If you don't know how the system works, you can't assign a trust value to anything it comes up with, even if it seems like what it comes up with makes sense.
I love the epistemology related discussions AI inevitably surfaces. How can we know anything that isn't empirically evident and all that.
It seems NN output could be trusted in scenarios where a test exists. For example: "ChatGPT design a house using [APP] and make sure the compiled plans comply with structural/electrical/design/etc codes for area [X]".
But how is any information that isn't testable trusted? I'm open to the idea ChatGPT is as credible as experts in the dismal sciences given that information cannot be proven or falsified and legitimacy is assigned by stringing together words that "makes sense".
> But how is any information that isn't testable trusted? I'm open to the idea ChatGPT is as credible as experts in the dismal sciences given that information cannot be proven or falsified and legitimacy is assigned by stringing together words that "makes sense".
I understand that back in the 1980s or so, the dream was that people could express knowledge in something like Prolog, including the test case, which could then be deterministically evaluated. This does really work, but surprisingly many things cannot be represented in terms of “facts”, which really limits its applicability.
I didn’t opt for Prolog electives in school (I did Haskell instead) so I honestly don’t know why so many “things” are unrepresentable as “facts”.
There is a longer-term problem of trusting the explainer system, but in the near term that isn't really a concern.
The bigger value here in the near term is _explicability_ rather than alignment per se. Good explicability might provide insights into the design and architecture of LLMs in general, and that in turn may enable better design of alignment schemes.
It doesn't have to lag, though. You could ask gpt-2 to explain gpt-2. The weights are just input data. The reason this wasn't done on gpt-3 or gpt-4 is just because a) they're much bigger, and b) they're deeper, so the roles of individual neurons are more attenuated.
I had similar thoughts about the general concept of using AI to automate AI Safety.
I really like their approach and I think it’s valuable. And in this particular case, they do have a way to score the explainer model.
And I think it could be very valuable for various AI Safety issues.
However, I don’t yet see how it can help with the potentially biggest danger where a super intelligent AGI is created that is not aligned with humans.
The newly created AGI might be 10x more intelligent than the explainer model, to such an extent that the explainer model is not capable of understanding any tactics deployed by the super intelligent AGI. The same way ants are most probably not capable of explaining the tactics deployed by humans, even if we gave them 100 years to figure it out.
You're correct to have a suspicion here. Hypothetically the explainer could omit a neuron or give a wrong explanation for the role of a neuron.
Imagine you're trying to understand a neural network, and you spend an enormous amount of time generating hypotheses and validating them.
Well, if the explainer gives you 90% correct hypotheses, you have 10 times less hypothesis-generating work to do yourself.
So if you have a solid way of testing an explanation, even if the explainer is evil, it's still useful.
Using 'I'm feeling lucky' in the neuron viewer is a really cool way to explore different neurons. And then you can navigate up and down through the net to related neurons.
I'm quite curious how much of the improvement on text rendering is from the switch to pixel-space diffusion vs. the switch to a much larger pretrained text encoder. I'm leaning towards the latter, which then raises the question of what happens when you try training Stable Diffusion with T5-XXL-1.1 as the text encoder instead of CLIP — does it gain the ability to do text well?
DeepFloyd IF is effectively the same architecture/text encoder as Imagen (https://imagen.research.google/), although that paper doesn't hypothesize why text works out a lot better.
Right, I'm aware of the Imagen architecture, just curious to see further research determining which aspect of it is responsible for the improved text rendering.
EDIT: According to the figure in the Imagen paper FL33TW00D's response referred me to, it looks like the text encoder size is the biggest factor in the improved model performance all-around.
The CLIP text encoder is trained to align with the pooled image embedding (a single vector), which is why most of its token embeddings are not very meaningful on their own (though they still convey the overall semantics of the text). With T5, every token embedding is important.
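A shape-level way to see the difference (illustrative Java only; the dimensions below are ballpark figures, not exact model configs):

```java
// Shape-level illustration only: what kind of text conditioning each encoder supervises.
public class TextConditioning {
    public static void main(String[] args) {
        // CLIP: the contrastive objective only ever looks at this pooled vector,
        // so the per-token states are a by-product rather than the training target.
        float[] clipPooled = new float[768];
        // T5: the training objective supervises every token position, so each of
        // these embeddings carries meaning the diffusion model can cross-attend to.
        float[][] t5PerToken = new float[77][4096];
        System.out.println("CLIP pooled:  1 x " + clipPooled.length);
        System.out.println("T5 per-token: " + t5PerToken.length + " x " + t5PerToken[0].length);
    }
}
```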