Hacker News | Donald's comments

I was curious what the commenter's business was, and found this post about HTTP protocol latency: https://jacquesmattheij.com/the-several-million-dollar-bug/


>FREEDRUPALWEBSITEHOSTING.COM

Yeah that's not gonna work nowadays.

>DOWNLOADWEBCAM.COM

Is that like Download More RAM?

>BROWSEHN.COM

Hey, I'm browsing that place right now!

>MUZICBRAINZ.COM

This sounds 100% legit no virus softpedia guaranteed.


Gemini 3 Pro Preview gets 96.8% on the same benchmark? That's impressive


And it performs very well on the latest 100 puzzles too, so it isn't just learning the data set (unless I guess they routinely index this repo).

I wonder how well AIs would do at bracket city. I tried gemini on it and was underwhelmed. It made a lot of terrible connections and often bled data from one level into the next.


> unless I guess they routinely index this repo

This sounds like exactly the kind of thing any tech company would do when confronted with a competitive benchmark.


I mean, the repo has <200 stars, it's not like it's so mainstream that you'd expect LLM makers to be watching it actively. If they wanted to game it, they could more easily do that in RL with synthetic data anyway.


Belated update on this. Gemini reasoning did much better than quick on bracket city today (an easy puzzle, but still). It failed to solve only one clue outright. It got another wrong, but that was due to ambiguity in the expression referenced, and its answer still fit the next level down, so the final answer was solved fairly cleanly. It still clearly has a harder time with this than with the connections puzzle.


GPT-5.2 might be Google's best Gemini advertisement yet.


Especially when you see the price


My account is just a week apart from yours. Neocities is an excellent project btw, glad to see your passion for the web shine through it.


Talk to federal contractors about their experience with DOGE. They’re literally cancelling contracts and helping their friends recapture the money as new federal contracts, since the money needs to be spent anyway.

The congressional hearings over this are going to be excellent CSPAN viewing.


As someone in NLP who lived through this experience: there's something uniquely ironic and cruel about building the wave that washes yourself away.


45^2 = 2025

Happy perfect square year, everyone. The previous one was 1936 and the next one will be 2116.
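
For anyone who wants to check the neighbouring years, here is a minimal sketch. The helper name `neighbouring_square_years` is made up for illustration, and it assumes a positive integer year.

```python
import math

def neighbouring_square_years(year):
    """Return the perfect squares strictly below and strictly above `year`."""
    root = math.isqrt(year)  # floor of the square root
    # If `year` is itself a square, the previous square uses root - 1.
    prev_root = root - 1 if root * root == year else root
    next_root = root + 1
    return prev_root ** 2, next_root ** 2

print(neighbouring_square_years(2025))  # (1936, 2116)
```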


2025:

1) is a square: 45²

2) is the product of two squares: 9² x 5²

3) is the sum of 3 squares: 40² + 20² + 5²

4) is the sum of cubes of all the single digits: 1³+2³+3³+4³+5³+6³+7³+8³+9³


5) is the square of the sum of the single digits: (1+2+3+4+5+6+7+8+9)²


Some more (thanks to chatgpt-o1)

6) sum of the first 45 odd numbers: 1+3+5+...+89

7) is a Harshad number: https://en.m.wikipedia.org/wiki/Harshad_number
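
The seven properties listed so far can be checked mechanically. This is only a quick sanity check in Python, not a proof of anything; the dictionary labels are just shorthand for the items above.

```python
n = 2025
digits = range(1, 10)  # the single digits 1..9

checks = {
    "1) 45^2": 45 ** 2,
    "2) 9^2 * 5^2": 9 ** 2 * 5 ** 2,
    "3) 40^2 + 20^2 + 5^2": 40 ** 2 + 20 ** 2 + 5 ** 2,
    "4) sum of cubes of 1..9": sum(d ** 3 for d in digits),
    "5) (1+...+9)^2": sum(digits) ** 2,
    "6) first 45 odd numbers": sum(range(1, 90, 2)),
}
for label, value in checks.items():
    print(label, value == n)

# 7) Harshad in base 10: divisible by its own digit sum (2+0+2+5 = 9).
digit_sum = sum(int(c) for c in str(n))
print("7) Harshad (base 10)", n % digit_sum == 0)
```

Every line should print True.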


6 is kind of cheating. It's a restatement of 45^2.

3^2 is the sum of the first three odd numbers. 4^2 is the sum of the first four odd numbers. 5^2 is the sum of the first five odd numbers.

Edit: sorry, don't mean to be a pill.


I don't consider it cheating, I bet most of these rules have an internal relation.


They do indeed have an internal relation - they all add up to 2025.

Obviously all the formulas will be equivalent to each other. They are, by construction, all restatements of each other.


I guess that means that every number is equivalent to a formula? Is there some sort of metric of how many formulas produce the same number?


You’d have to at least exclude subtraction and division (and zero) to not have infinitely many formulas for every number.


I would say that a rule is "cheating" iff it is implied by another rule for any arbitrary N.


I think that it is a nice observation. Some people complain that explaining the formation of a rainbow scientifically makes it lose its "aweness" but I think it even deepens it.

Actually, property 5) trivially implies 1) but also 2), as `(1+2+...+n)² = n²(n+1)²/4` and either n or n+1 must be divisible by 2 hence one of the squares divisible by 4 hence it is a product of squares. But also property 4) as `(1+2+...+n)² = 1³+2³+...+n³` (easy to show by induction).


4 and 5 too


How so? I'm too dumb to see it.


The sum of the first n cubes is always the square of the sum of numbers from 1 to n. For example 1³+2³+3³+4³=(1+2+3+4)².

You can prove it by induction; just expand (n(n+1)/2)² – (n(n-1)/2)², the result is n³.
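
If you'd rather see it numerically than expand by hand, here is a small sketch that checks both the induction step and the identity for n up to 100. The helper name `triangular` is mine; n(n-1)/2 in the comment above is just `triangular(n - 1)`.

```python
def triangular(n):
    """1 + 2 + ... + n, i.e. n(n+1)/2."""
    return n * (n + 1) // 2

for n in range(1, 101):
    # Induction step: (n(n+1)/2)^2 - (n(n-1)/2)^2 == n^3
    assert triangular(n) ** 2 - triangular(n - 1) ** 2 == n ** 3
    # The identity itself: sum of the first n cubes == square of the sum 1..n
    assert sum(k ** 3 for k in range(1, n + 1)) == triangular(n) ** 2

print(triangular(9) ** 2)  # 2025
```

A finite check is of course not a proof, but the first assertion is exactly the step the induction needs.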


89 isn't 9^2, 81 is.


Huh? 89 is the 45th odd number.


(just reading wikipedia here, I didn't know about Harshad numbers)

There is no such thing as a Harshad number, there is a _Harshad number in a given base_. All integers between zero and n are n-harshad numbers.

Which is a pity, because apparently it means the `joy-giver`. I think humankind could use a joy-giver year.
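
To make the base dependence concrete, here is a rough sketch. The function names `digits_in_base` and `is_harshad` are made up for illustration, and it only handles positive integers.

```python
def digits_in_base(n, base):
    """Digits of a positive integer n in the given base, most significant first."""
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(r)
    return out[::-1]

def is_harshad(n, base=10):
    """True if n is divisible by the sum of its digits written in `base`."""
    return n % sum(digits_in_base(n, base)) == 0

print(is_harshad(2025))          # True: 2+0+2+5 = 9 and 2025 = 9 * 225
print(is_harshad(2025, base=2))  # False: eight 1-bits, and 2025 is odd
# Every k from 1 to n is Harshad in base n (shown here for n = 12).
print(all(is_harshad(k, base=12) for k in range(1, 13)))  # True
```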


8) the sum of 2024 + 1 also


Oh I like these two.


How do people find these kinds of things out without idly brute forcing things?


Also, (20 + 25)^2 = 2,025! Happy New Year :)


Python:

    [x**2 for x in range(32,100) if x**2 // 100 + x**2 % 100 == x]
    [2025, 3025, 9801]


This decomposition is especially fun!


Great to know that someone else too keeps track of squares.

The perfect-square ages are when we all cross or achieve significant milestones in our lives as children, students, (young) adults, spouses, parents, grandparents, senior citizens of society and so on.

This year being a perfect square, I wish that it will be as much or more special as it was for everyone at those ages.


My youngest is fascinated by squares at the moment. Luckily for him, he is 4 years old, his older brother is 9, while I just turned 36. He will be delighted when I tell him that we are entering 45 squared!


Also, if you add your ages together… 7^2

If you multiply your ages… 36^2


This is the most Hacker News comment I’ve seen today. Well played.


Up there with Putnam and Dropbox.


Thank you for the Putnam; I did not know about it. For anyone else that did not understand this reference; https://news.ycombinator.com/item?id=35015. Legendary.


That's an epic thread. Thank you very much for sharing it :)

And now, looking back 17 years later, I'd say he succeeded. It's the tarsnap founder.


Indeed! And it was so much fun when dhouston popped up in the thread :-)


And "Less space than a Nomad"


Putnam? Of math competition fame?


Yep, specifically this comment: https://news.ycombinator.com/item?id=35083


Totally nothing bad happened in the decade following the last perfect square year in 1936. :')


Well things have already been a tad rough around this square, so if we follow the trend, the next square might turn bad even sooner. So maybe around, I dunno, 2101?


Unless something equivalent happened in 1849, 1764, 1681... I think we're OK :)


1225: ten years earlier, Magna Carta starting to limit monarchs and the seed of individual freedom

1681: eight years later was glorious revolution with a bill of rights, marking individual freedoms

1764: ten years later, beginning of American Revolution and being free of monarchs

1849: ten years-ish later, start of US civil war; was the time of an attempt by the British to end slavery around the world

1936: ten years later, colonial empires were being dismantled, UN established to attempt global cooperation, US in the ascendancy with a seed of ties being established more by economics than military force, great economic upswing lifting people out of poverty (60% in poverty then, 10% now) while the global population blossoms

2035: Majority of the global population in middle class or better, triumph of individuals over technocrats, bureaucrats, and corporatists :)


I love this! Haha, I was hoping someone would do that. :)


2025 = 515 (palindromic in base 20)
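
A quick way to check this one is to convert by repeated division. This is a minimal sketch; the helper name `digits_in_base` is just for illustration.

```python
def digits_in_base(n, base):
    """Digits of a positive integer n in the given base, most significant first."""
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(r)
    return out[::-1]

d = digits_in_base(2025, 20)
print(d)             # [5, 1, 5], i.e. 5*400 + 1*20 + 5
print(d == d[::-1])  # True: reads the same in both directions
```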



And

    (20+25)^(20/(2*5))
as well.


US President number 45 returns, kind of seems like squaring applies.


Donald = Donald E. Knuth? ;-)


Here’s to all being alive and well, floating in amniotic liquid, living in VR paradise.


The algorithm behind this has to be super fun


My guess is that they host a version of the model locally on the iPhone.


Even if they don’t (or if it’s partially networked as some recent rumors suggest), it’ll be rolled into one or both of two predictable costs (to the consumer):

1. The device sale itself, either raising the ASP or offsetting some other cost (to Apple) savings

2. Recurring payments for iCloud (or any rebranding it might undergo along with the feature)

Apple’s pricing model, if not totally predictable, is exceedingly formulaic. If they deviate from these into some sort of nickel and diming on “AI” features alone, that would almost certainly be a clear sign that they’re betting against it as a long term selling point.


This indeed seems to have been a heavy focus of their research team in the past year, e.g. "Efficient Large Language Model Inference with Limited Memory" [1] and OpenELM [2]

[1] https://arxiv.org/pdf/2312.11514

[2] https://arxiv.org/pdf/2404.14619 (with 1.1B parameters, this appears to be their attempt at building a lightweight LLM)


Maybe a very cut down version - any of the more recent and capable OpenAI models are surely far too large to put on an iPhone, and far too large to run (in terms of both memory available, and processing power).

This would maybe align with the 'limited abilities are free' approach.


Do you have any examples of your prompts? I've found Opus to be vastly superior to ChatGPT for intricate tasks.


Sure.

---

In the following, Opus bombed hard by ignoring the "when" component, replying with "MemoryStream"; where ChatGPT (I think correctly) said "no":

> In C#, is there some kind of class in the standard library which implements Stream but which lets me precisely control when and what the Read call returns?

---

In the following, Opus bombed hard by inventing `Task.WaitUntilCanceled`, which simply doesn't exist; ChatGPT said "no", which actually isn't true (I could `.ContinueWith` to set a `TaskCompletionSource`, or there's probably a way to do it with an await in a try-catch and a subsequent check for the task's status) but does at least immediately make me think about how to do it rather than going through a loop of trying a wrong answer.

> In C#, can I wait for a Task to become cancelled?

---

In the following exchange, Opus and ChatGPT both bombed (the correct answer turns out to be "this is undefined behaviour under the POSIX standard, and .NET guarantees nothing under those conditions"), but Opus got into a terrible mess whereas ChatGPT did not:

> In .NET, what happens when you read from stdin from a process which has its stdin closed? For example, when it was started with { ./bin/Debug/net7.0/app; } <&-

(both engines reply "the call immediately returns with EOF" or similar)

> I am observing instead the call to Console.Read() hangs. Riddle me that!

ChatGPT replies with basically "I can't explain this" and gives a list of common I/O problems related to file handles; Opus replies with word salad and recommends checking whether stdin has been redirected (which is simply a bad answer: that check has all the false positives in the world).

---

> In Neovim, how might I be able to detect whether the user has opened Neovim by invoking Ctrl+X Ctrl+E from the terminal? Normally I have CHADtree open automatically in Neovim, but when the user has just invoked $EDITOR to edit a command line, I don't want that.

Claude invents `if v:progname != '-e'`; ChatGPT (I think correctly) says "you can't do that, try setting env vars in your shell to detect this condition instead"


Now I come to think of it, maybe the problem is that I only ask these engines questions whose answers are "what you ask is impossible", and ChatGPT copes well with that condition but Opus does not.


(Nope, this hypothesis doesn't seem to be it. Maybe it just confidently doesn't know anything about Neovim?)


Do you have an example implementation of reimplementing the core of these?


It's literally what I did at work last week, which is why I found this submission timely. I'd have to check with my employer if it can be made public. I don't see any reason why not, there's not much to it.


What did you use to implement the regularization of the trend breakpoints? Prophet by default uses a regular grid and thins them out with Stan. I couldn't find a quick regularization replacement in numpy/scipy/statsmodels with equivalent performance. (I don't want to drag in another huge dependency with Torch or TF.)

