Would you mind sharing your company name?
I'm a master's student in AI, and after finishing my master's thesis at IBM this summer, I'll be looking for jobs.
Well, it's just like stochastic gradient descent, if you think about it. Normal gradient descent computes the gradient over the whole training set. Stochastic gradient descent computes it on a batch (a subset of the training set), and in the distributed case we compute two batches at once by doing the gradient on each in parallel.
The intuition works IMO, but indeed, applying the first batch's update and then the second's is not the same as applying the mean update.
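To make that concrete, here's a tiny numpy sketch (my own toy example, not anyone's actual training code): at fixed weights, averaging the two batches' gradients gives exactly the full-batch gradient, while applying the two updates sequentially gives something slightly different.

    import numpy as np

    # Toy linear-regression loss: L(w) = mean((X @ w - y)**2)
    def grad(w, X, y):
        return 2 * X.T @ (X @ w - y) / len(y)

    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(8, 3)), rng.normal(size=8)
    X1, y1, X2, y2 = X[:4], y[:4], X[4:], y[4:]
    w, lr = np.zeros(3), 0.1

    # Data-parallel step: gradient on each batch in parallel, then averaged.
    g_avg = (grad(w, X1, y1) + grad(w, X2, y2)) / 2
    w_parallel = w - lr * g_avg

    # Sequential SGD: update on batch 1, then batch 2 from the updated weights.
    w_seq = w - lr * grad(w, X1, y1)
    w_seq = w_seq - lr * grad(w_seq, X2, y2)

    print(np.allclose(g_avg, grad(w, X, y)))   # True: mean of batch grads == full-batch grad
    print(np.allclose(w_parallel, w_seq))      # generally False: sequential updates differ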
Does anyone actually use the 'normal gradient descent' with the whole training set? I only ever see it as a sort of straw man to make explanation easier.
Generally yes, vanilla gradient descent gets plenty of use. But for LLMs: no, it’s not really used, and stochastic gradient descent provides a form of regularization, so it probably works better in addition to being more practical.
Just try it. I tried and the launch was so smooth that I'm keeping it a few days to test it. My current biggest problem is that I launch my terminal using the Spotlight shortcut (⌘-space on Mac), and while iTerm2 is found when I search for "term", Ghostty isn't.
I'm not complaining about the indexing (you can kick off indexing by toggling Spotlight off and on). I'm complaining about having to search for "ghos..." instead of "term...", because I don't like having to remember the specific app name. But this complaint applies to everything; when searching for Excel, I would like "Numbers" to also show up in the results.
When looking for "vscode" I would like "Visual Studio Code" to also show up, but I need to type either "code", "visual", or "studio".
Huh, you've got me curious, and the Windows start menu does in fact launch Visual Studio Code when searching for "vscode." Color me pleasantly surprised!
He works on CNNs (Convolutional Neural Networks) while everyone else uses transformers. Furthermore, he wasn't in the OpenAI scandal with Ilya. Third, they don't publish papers at the same rate.
Ilya has 585,182 citations, while Alex has 286,082 (source: Google Scholar).
> I also feel like guarding their consumer product against bad-faith-bad-use is basically pointless. There will always be ways to get bomb-making instructions
With that argument, we should not restrict firearms because there will always be a way to get access to them (the black market, for example).
Even if it's not a perfect solution, it helps steer the problem in the right direction, and that should already be enough.
Furthermore, this research is also a way to better understand LLMs' inner workings and behaviors. Even if it never yielded results like being able to block bad behaviors, that's cool and interesting by itself imo.
No, the argument is that restricting physical access to objects that can be used in a harmful way is exactly how to handle such cases. Restricting access to information is not really doing much at all.
Access to weapons, chemicals, critical infrastructure etc. is restricted everywhere. Even if the degree of access restriction varies.
> Restricting access to information is not really doing much at all.
Why not? Restricting access to information is of course harder but that's no argument for it not doing anything. Governments restrict access to "state secrets" all the time. Depending on the topic, it's hard but may still be effective and worth it.
For example, you seem to agree that restricting access to weapons makes sense. What to do about 3D-printed guns? Do you give up? Restrict access to 3D printers? Not try to restrict access to designs of 3D printed guns because "restricting it won't work anyway"?
Meh, 3D-printed guns are a stupid example that gets trotted out just because it sounds futuristic. In WW2 you had many examples of machinists in occupied Europe who produced workable submachine guns - far better than any 3D-printed firearm - right under the nose of the Nazis. Literally when armed soldiers could enter your house and inspect it at any time. Our machining tools today are much better, but no one is concerned about homemade SMGs.
The overlap between "people competent enough to manufacture dangerous things" and "people who want to hurt innocent people" is essentially zero. That's the primary reason why society does not degrade into a Mad Max world. AI won't change this meaningfully.
People actually are concerned about homemade pistols and SMGs being used by criminals, though. It comes up quite often in Europe these days, especially in the UK.
And, yes, in principle, 3D printing doesn't really bring anything new to the table since you could always machine a gun, and the tools to do so are all available. The difference is ease of use - 3D printing lowered the bar for "people competent enough to manufacture dangerous things" enough that your latter argument no longer applies.
FWIW I don't know the answer to OP's question even so. I don't think we should be banning 3D printed gun designs, or, for that matter, that even if we did, such a ban would be meaningfully enforceable. I don't think 3D printers should be banned, either. This feels like one of those cases where you have to accept that new technology has some unfortunate side effects.
There's very little competence (and also money) required to buy a 3D printer, download a design and print it. A lot less competence than "being a machinist".
The point is that making dangerous things is becoming a lot easier over time.
You can 3D-print parts of a gun, but the important parts are still metal, which you need to machine. I'm not sure how much easier you've just made it… if someone's making a gun in their basement, are you really concerned whether it takes 20 hours or 10?
What you should really be concerned about is the cost of milling machines coming down, which is happening. Quick, make them illegal!
I've ended up with this viewpoint too. I've settled on the idea of informed ethics: the model should comply, but inform you of the ethics of actually using the information.
> the model should comply, but inform you of the ethics of actually using the information.
How can it “inform” you of something subjective? Ethics are something the user needs to supply. (The model could, conceptually, be trained to supply additional contextual information that may be relevant to ethical evaluation based on a pre-trained ethical framework and/or the ethical framework evidenced by the user through interactions with the model, I suppose, but either of those are likely to be far more error prone in the best case than actually providing the directly-requested information.)
Well, that's the actual issue, isn't it? If we can't get a model to refuse to give dangerous information, how are we going to get it to refuse to give dangerous information without a warning label?
> Even if it’s not a perfect solution, it help steer the problem in the right direction
Yeah, tbf my wording was a bit strong when I said "pointless". I agree that perfect is the enemy of good, etc. And I'm very glad that they're doing _something_.
Not sure what you mean by the "that" when you say "if that's true", but there is nothing in this thread or from Google that is anywhere close to breaking encryption.
Because the "something" in question is not decryption. It's actually specifically something with no useful result, just a benchmark.
Decryption with quantum computers is still likely decades away, as others have pointed out.
To be specific, the best known quantum factoring demonstration did 15 = 3×5, and 35 could not be factored when it was attempted. Most experimental demonstrations have stopped in recent years because of how pointless it currently is.
No, this quantum computer cannot factorize the large composite numbers that we use for modern RSA. Even for the numbers that it can factor, I don't think it will be faster than a decent classical computer.
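For a sense of scale, a toy example of mine: classical trial division handles the numbers quantum demos have managed to factor essentially instantly.

    # Naive trial division: factors the numbers used in quantum factoring
    # demos (15, 35, ...) in well under a millisecond on any laptop.
    def factor(n: int):
        d = 2
        while d * d <= n:
            if n % d == 0:
                return d, n // d
            d += 1
        return n, 1  # n is prime

    print(factor(15))  # (3, 5)
    print(factor(35))  # (5, 7)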
Increasing the dimension causes a lot more computation; this is one of the main reasons. You can see evidence of this in multi-head attention, where the dimension is reduced via a linear projection.
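Roughly what I mean, as a minimal numpy sketch (illustrative only, not any particular library's implementation): with d_model = 512 and 8 heads, each head's linear projection drops the working dimension to 64 before attention is computed.

    import numpy as np

    d_model, num_heads = 512, 8
    head_dim = d_model // num_heads        # 64: each head works in a reduced dimension

    rng = np.random.default_rng(0)
    x = rng.normal(size=(10, d_model))     # a sequence of 10 tokens

    # One projection matrix per head (in practice a single big W split across heads).
    W_q = rng.normal(size=(num_heads, d_model, head_dim))
    W_k = rng.normal(size=(num_heads, d_model, head_dim))
    W_v = rng.normal(size=(num_heads, d_model, head_dim))

    def softmax(a):
        a = a - a.max(axis=-1, keepdims=True)
        e = np.exp(a)
        return e / e.sum(axis=-1, keepdims=True)

    heads = []
    for h in range(num_heads):
        q, k, v = x @ W_q[h], x @ W_k[h], x @ W_v[h]     # each (10, 64), not (10, 512)
        scores = softmax(q @ k.T / np.sqrt(head_dim))    # attention computed in the reduced dimension
        heads.append(scores @ v)

    out = np.concatenate(heads, axis=-1)   # back to (10, 512); normally followed by an output projection
    print(out.shape)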
I don't agree.
We can learn to apply thought processes.
We learn patterns by heart and recognize them in new situations.
Learning that a -> b doesn't mean that b -> a.
I would be able to say that intuitively for a few cases, but knowing this pattern exists, I'm able to apply it to every construct, not just a few cases.
It also allows me to formally prove or disprove something I have an intuition for.
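A concrete instance of that last point (toy example of mine):

    # a -> b does not imply b -> a:
    # "x is divisible by 4" implies "x is even", but not the other way around.
    a = lambda x: x % 4 == 0
    b = lambda x: x % 2 == 0

    xs = range(1, 100)
    print(all(b(x) for x in xs if a(x)))   # True:  a -> b holds everywhere
    print(all(a(x) for x in xs if b(x)))   # False: b -> a fails (e.g. x = 2)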