
A lot of tech people seem to have changed their minds on this. Some ways to reason about that are:

1) They are doing it for opportunistic reasons. They can't afford to be enemies with Trump.

2) They legitimately changed their opinion about a wide array of things they used to believe strongly enough to be outspoken about.

3) They believe that while Trump's core beliefs are not aligned with theirs, the alternative is worse. And potentially they believe there is a need for some form of over-correction to fix what has happened over the last 4 years.


I think a fourth is that quite a few people spotted what appeared to be happening in big tech, where with the right activists in only a few companies you could cover and (to some extent) control a lot of digital communication. The moment an issue du jour came up where they did not align with the big tech decision, they noticed and started to distrust what they were seeing. It's possible that being a benevolent gatekeeper only works if you are extremely nonpartisan.


"We picked the latter, which also gave us our performance metric - percentage of generated comments that the author actually addresses."

This metric would go up if you left almost no comments. Would it not be better to find a metric that rewards you for generating many comments that are addressed, not just for having high relevance?

You even mention this challenge yourselves: "Sadly, even with all kinds of prompting tricks, we simply could not get the LLM to produce fewer nits without also producing fewer critical comments."

If that were happening, it doesn't sound like it would be reflected in your performance metric.


Good criticism that we should pay closer attention to. Someone else pointed this out too, and since then we’ve started tracking addressed comments per file changed as well.


You could probably modify the metric to addressed comments per 1000 lines of code.
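
A minimal sketch of what that could look like (the data shape here is hypothetical, just to make the computation concrete):

    # Hypothetical sketch: normalize addressed review comments by diff size,
    # so leaving fewer (but precise) comments no longer inflates the metric.

    def addressed_per_kloc(comments: list[dict], lines_changed: int) -> float:
        """comments: e.g. [{"addressed": True}, ...] for one review run."""
        if lines_changed <= 0:
            return 0.0
        addressed = sum(1 for c in comments if c["addressed"])
        return addressed / (lines_changed / 1000)

    # Example: 6 addressed comments on a 1,500-line diff -> 4.0 per kLOC.
    comments = [{"addressed": True}] * 6 + [{"addressed": False}] * 2
    print(addressed_per_kloc(comments, 1500))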


Here is what I got from o1:

To find two prime numbers that sum to 123, let’s consider the properties of prime numbers and the nature of their sums:

1. Prime Numbers Greater Than 2 Are Odd: Except for 2, all prime numbers are odd.

2. Sum of Two Odd Numbers Is Even: Adding two odd numbers results in an even number.

3. Sum of an Even and an Odd Number Is Odd: Adding an even number (like 2) and an odd number results in an odd sum.

Given that 123 is an odd number, the only way two primes can sum to 123 is if one of them is 2 (the only even prime) and the other is an odd prime. Let’s test this possibility:

• 2 + x = 123 ⇒ x = 121

However, 121 is not a prime number (since 121 = 11 × 11).

Next, consider the possibility of adding two odd primes:

• Odd Prime + Odd Prime = Even Number

Since adding two odd primes yields an even number, it’s impossible for two odd primes to sum to 123 (an odd number).

Conclusion: There are no two prime numbers that sum to 123.

Answer: There are no two prime numbers whose sum is 123; such primes do not exist.
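
A quick brute-force check (my own sketch, not part of the o1 output) confirms the parity argument:

    # Verify by brute force that no pair of primes sums to 123.
    def is_prime(n: int) -> bool:
        if n < 2:
            return False
        return all(n % d for d in range(2, int(n ** 0.5) + 1))

    pairs = [(p, 123 - p) for p in range(2, 62)
             if is_prime(p) and is_prime(123 - p)]
    print(pairs)  # [] -- only 2 + 121 could give an odd sum, and 121 = 11 * 11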


I think maybe I am not using the right model; the reasoning looks so good here. Maybe the default model in the ChatGPT free tier is not the one you used.


qwq went through two screenfuls of tokens trying all primes under 123 twice, but eventually gave the right answer.


Well explained!


I found the ratio of rice to water to be the main issue - and the rice cooker didn't fix that. I stumbled upon a recipe with a different approach that I use now.

Bring plenty of water to a boil. Add any amount of rice to the boiling water and let it boil for 8 minutes. Then drain the water, remove the pot from the stove, and put the lid on for a few minutes. This removes the need to measure quantities.


I just use the included rice cup and fill to the correct line in the rice cooker pot...


I always use a ratio of water to rice (by weight) of 4 to 1 (e.g. 500 g of water for 125 g of rice).

I cook the rice in a microwave oven in a covered glass vessel (preventing the escape of water) and it is very good with this ratio.

For most other cereals that are either coarsely ground or whole grains, e.g. cornflour, semolina or wheat grains, the same 4 to 1 by weight ratio works fine when boiled in a closed vessel.

Most other kinds of starchy seeds absorb less water when boiled, so the ratio must be lower.

I assume that a dedicated rice cooker is useful only for large families who might want to cook large quantities of rice at the same time.

For smaller quantities, e.g. suitable for a couple of people, a microwave oven is very fast and reproducible, so there is no need for dedicated equipment.


What wattage is your microwave on?


It is controllable, but for rice and other cereals I use the maximum of 1000 W. Small powers are needed for things like meat or eggs, but very seldom for vegetables.

The time depends on the quantity. For around 125 g of rice + 500 g of water, the time is between 12 and 15 minutes.

An advantage of using a microwave oven is that no stirring is needed during cooking, unlike when boiling rice or other cereals on a traditional stove.


> Small powers are needed for things like meat or eggs, but very seldom for vegetables.

I understand why you would use low power for eggs, but meat? I always applied power in proportion to the product's water content, since water has a high specific heat and absorbs microwaves readily. Meat is largely water, so high power it is.


This may depend on the kind of meat. I cook mostly turkey, chicken or fish.

At high powers over 500 W my meat would explode.

Moreover, when cooking meat at a lower power for a longer time (e.g. 20 to 30 minutes), the cooked meat is much more tender than when cooked faster.

If I boiled the meat, then maximum power could be used. However, I do not boil it; I roast it in the microwave oven in a covered glass vessel, with nothing added except salt and condiments. Thus it is much tastier than boiled.


Works great indeed. Another useful trick is to use a microwave. Simply put the rice and water in an open container in the microwave and let it go at full blast for 14-15 minutes, until the water is gone. Let it sit for a few minutes and you're done. You get perfectly fluffy rice every time. It works great for small portions and won't boil over.

I learned about this trick only a few years ago and when I tried it, I basically got a perfect result. Which was not what I was expecting.


How about the seasoning? Years ago I used to do this microwave rice but I threw in some powdered flavoring and it would boil & stick everywhere :/


The seasoning is better added and mixed in immediately after taking the rice out of the oven, while it is hot and soft.

If you want to add some kind of oil, that is also better added immediately after taking the rice out of the oven.

If you add salt or sugar, then that should be added before cooking in the oven.


How do you prevent the rice from sticking to the bottom of the pot? Do you stir it a few times?


"I’ve keep taking screenshots of everyone who’s asked about acquiring my account. One interesting pattern: the majority of these requests come from profiles without any photos. I find it so weird that people are so eager to get a username when they don’t even share content!"

If I had an account with a huge number of followers, I would also not initially reach out from that main account in order to negotiate the price.


The same schemes for account promotion likely come from insiders too. As far as I've observed, the messages I get on pretty much every platform are too well organized to come from outsiders; some even make it through spam filters. I'm so over social media, to be honest... It's a haven for fake vanity points that don't guarantee any sort of real-life achievement.


Feels like a lot of commenters here miss the difference between just doing chain-of-thought prompting and what is happening here, which is learning a good chain-of-thought strategy using reinforcement learning.

"Through reinforcement learning, o1 learns to hone its chain of thought and refine the strategies it uses."

When looking at the chain of thought (CoT) in the examples, you can see that the model employs different CoT strategies depending on which problem it is trying to solve.


I'd be curious how this compares against "regular" CoT experiments. E.g., were the GPT-4o results done zero-shot, or was it asked to explain its solution step by step?


It was asked to explain step by step.


It’s basically a scaled-up Tree of Thoughts.


In the primary CoT research paper they discuss figuring out how to train models using formal languages instead of just natural ones. I'm guessing this is one piece of how the model learns tree-like reasoning.

Based on some quick searching, it seems like they are using RL to provide positive/negative feedback on which "paths" to choose when performing CoT.
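
Purely as an illustration (none of these names or details are from OpenAI), that feedback loop might look something like: sample a reasoning path under some strategy, reward paths that reach a verifiable answer, and upweight the strategies that produced them.

    import random

    # Illustrative sketch of RL-style feedback over CoT "paths".
    # sample_chain and extract_answer are hypothetical helpers.
    strategies = {"decompose": 1.0, "work_backwards": 1.0, "case_split": 1.0}

    def run_episode(problem, target, sample_chain, extract_answer, lr=0.1):
        # Sample a strategy proportionally to its current weight.
        names = list(strategies)
        weights = [strategies[n] for n in names]
        chosen = random.choices(names, weights=weights)[0]
        chain = sample_chain(problem, chosen)      # generate one CoT path
        reward = 1.0 if extract_answer(chain) == target else -0.5
        strategies[chosen] = max(0.01, strategies[chosen] + lr * reward)
        return chain, reward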


This seems most likely, with some special tokens thrown in to kick off different streams of thought.


To me it looks like they paired two instances of the model to feed off of each other's outputs with some sort of "contribute to reasoning out this problem" prompt. In the prior demos of 4o they did several similar demonstrations of that with audio.


To create the training data? Almost certainly something like that (likely more than two), but I think they then trained on the synthetic data created by this "conversation". There is no reason a model can't learn to do all of that, especially if you insert special tokens (like think, reflect, etc.) that have already been shown to be useful.


No, I'm referring to how the chain-of-thought transcript seems like the output of two instances talking to each other.


Right - I don't think it's doing that. I think it has likely been fine-tuned to transition between roles. But maybe you are right.


Reminds me of how Google's AlphaGo learned to play the best Go ever seen. This somewhat seems like a generalization of that.


Back in my early university days I did a short project (https://www.dropbox.com/scl/fi/ch33p2xaq7xavgu9uk0qv/index.p...) on how to generate "hard" mazes. While there are many algorithms to create mazes, there is no real literature (maybe for good reason) on how to create mazes that are hard for humans to solve.
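
For anyone curious, a common generator is the recursive backtracker (randomized DFS); one crude "hardness" proxy (my own illustration here, not from that project) is the length of the shortest solution path:

    import random
    from collections import deque

    # Recursive-backtracker maze: a random spanning tree over a w x h grid.
    def make_maze(w: int, h: int) -> dict:
        open_to = {(x, y): set() for x in range(w) for y in range(h)}
        stack, seen = [(0, 0)], {(0, 0)}
        while stack:
            x, y = stack[-1]
            nbrs = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if (x + dx, y + dy) in open_to and (x + dx, y + dy) not in seen]
            if not nbrs:
                stack.pop()
                continue
            nxt = random.choice(nbrs)
            open_to[(x, y)].add(nxt)   # carve a passage both ways
            open_to[nxt].add((x, y))
            seen.add(nxt)
            stack.append(nxt)
        return open_to

    # BFS shortest path; longer solutions are (very roughly) "harder".
    def solution_length(maze, start, goal):
        dist, q = {start: 0}, deque([start])
        while q:
            cell = q.popleft()
            if cell == goal:
                return dist[cell]
            for nxt in maze[cell]:
                if nxt not in dist:
                    dist[nxt] = dist[cell] + 1
                    q.append(nxt)
        return None

    maze = make_maze(15, 15)
    print(solution_length(maze, (0, 0), (14, 14)))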


But humans have already trained on an incredible number of games (including reality) when they play No Man's Sky for the first time. What they say here is that training on N-1 games makes you better at the Nth game. So you just continue to scale this up.


"An incredible number of games"? You're saying a kid can't pick up and play No Man's Sky if it's the first time they ever played a video game? Or that they can't get good at it if it's the first game they play?


Definitely feels like we are advancing towards AGI quite rapidly. As another commenter mentioned (https://news.ycombinator.com/item?id=39693035), the OpenAI DotA game was a big milestone for me.

If you think about it abstractly, humans are basically models that take input from our senses, do some internal processing of that, and then take actions with our bodies. SIMA is the same - it takes input from video and acts through keyboard actions. There is nothing preventing the introduction of additional types of input or different kinds of actions.

The ability to train on one game and transfer that knowledge to a different game should allow future models like this to train in games, by reading text, watching videos etc, and then transfer all of that knowledge to the real world.


Having a simple, clear focus is hard to overvalue. Since doing YC 6 years ago, there has never been a second of doubt about the priorities (imo).

YC is there to make money for its investors. And the way to do that is to invest in the best startups and make them huge.

The only source of conflict then comes from YC vs. the startups - for example, whether to enforce strict legal structures on all companies. And here the dominant priority is of course YC itself. But this is expected, and given how rational it is, also easy to work with.

