
I'm comparing Gemma3 12B (https://ollama.com/library/gemma3; running fully on my 3060 12GB) and Mistral Small 3 24B (https://ollama.com/library/mistral-small; 10% offloaded to the CPU).

- Gemma3 12B: ~100 t/s on prompt eval; 15 t/s on eval

- Mistral Small 3 24B: ~500 t/s on prompt eval; 10 t/s on eval
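
(For reference, Ollama prints these stats when a model is run with the --verbose flag, e.g. "ollama run gemma3:12b --verbose", which reports the prompt eval rate and eval rate in tokens/s after each response.)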

Do you know what difference in architecture could make the prompt eval (prefill) so much slower on the 2x smaller Gemma3 model?


Thank you for the report! We are working with the Ollama team directly and will look into it.

Restic has a mount subcommand that exposes all backups through a FUSE filesystem, no?
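
Something like "restic -r /srv/restic-repo mount /mnt/restic" (the repo and mount paths here are just examples); every snapshot then appears as a plain directory tree under /mnt/restic/snapshots/.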


There is also an alternative more lightweight self-hosted server in Rust, compatible with the official clients: https://github.com/cpg314/ltapiserv-rs


It's to access the tuple's single element, but the author could have used #[repr(transparent)].
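
For reference, a minimal sketch of both the .0 access and the attribute (names are made up):

  // A tuple struct ("newtype"): .0 accesses its single element.
  #[repr(transparent)] // guarantees the same memory layout as the inner type
  struct Meters(f64);

  fn main() {
      let m = Meters(3.5);
      println!("{}", m.0); // prints 3.5
  }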


(2020). See also https://news.ycombinator.com/item?id=9870408, which I found more interesting (and that the author should probably have referenced).


A very long time ago (around 2000) I came up with the idea of recording my car's engine sound, running an FFT on it, tracking how the dominant frequency changes over time, and then using the car's gear ratio / wheel size to recover the power curve. I'd drive my car both ways on an empty street, in 2nd gear, from idle to the rev limiter. Then I'd compute the two curves and average them.

It was fascinating: the plotted curve matched the manufacturer's power curve almost exactly.

I think it's the first Java program I wrote. I copy/pasted some discrete FFT code I found somewhere.

It was a nice little project, as it's way simpler than trying to fingerprint audio with an FFT like Shazam does, yet I got to learn a bit about harmonics, FFT / amplitude over time vs. amplitude over frequency, car drag coefficients, etc.
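
The core of it was roughly the following (a sketch in Rust rather than the original Java, using the rustfft crate; all the constants are made up for illustration):

  use rustfft::{FftPlanner, num_complex::Complex};

  // Illustrative constants, not from any real car:
  const SAMPLE_RATE: f32 = 44_100.0;
  const WINDOW: usize = 8192;       // FFT window size
  const FIRES_PER_REV: f32 = 2.0;   // 4-cylinder four-stroke engine
  const TOTAL_RATIO: f32 = 8.0;     // 2nd gear ratio * final drive
  const WHEEL_CIRC_M: f32 = 1.9;    // wheel circumference in metres
  const MASS_KG: f32 = 1200.0;      // car mass

  // Dominant frequency (Hz) in one window of audio samples.
  fn dominant_freq(samples: &[f32]) -> f32 {
      assert_eq!(samples.len(), WINDOW);
      let mut buf: Vec<Complex<f32>> =
          samples.iter().map(|&s| Complex::new(s, 0.0)).collect();
      let mut planner = FftPlanner::<f32>::new();
      planner.plan_fft_forward(WINDOW).process(&mut buf);
      // Loudest bin in the positive-frequency half, skipping DC.
      let bin = (1..WINDOW / 2)
          .max_by(|&a, &b| buf[a].norm().partial_cmp(&buf[b].norm()).unwrap())
          .unwrap();
      bin as f32 * SAMPLE_RATE / WINDOW as f32
  }

  // Firing frequency (Hz) -> engine rpm.
  fn rpm(freq_hz: f32) -> f32 {
      freq_hz / FIRES_PER_REV * 60.0
  }

  // Engine rpm -> road speed (m/s) through the drivetrain.
  fn speed_ms(rpm: f32) -> f32 {
      rpm / TOTAL_RATIO / 60.0 * WHEEL_CIRC_M
  }

  // Power (W) between two windows dt seconds apart:
  // P = F * v = m * a * v, ignoring drag and rolling resistance.
  fn power_w(rpm0: f32, rpm1: f32, dt: f32) -> f32 {
      let (v0, v1) = (speed_ms(rpm0), speed_ms(rpm1));
      MASS_KG * (v1 - v0) / dt * (v0 + v1) / 2.0
  }

  fn main() {
      // Example: a window containing a pure 200 Hz tone -> ~6000 rpm.
      let samples: Vec<f32> = (0..WINDOW)
          .map(|i| (2.0 * std::f32::consts::PI * 200.0 * i as f32 / SAMPLE_RATE).sin())
          .collect();
      println!("{} rpm", rpm(dominant_freq(&samples)));
  }

Plotting power_w against rpm over the run gives the power curve; averaging the two directions cancels out wind and road slope.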

As I never throw anything away, I probably still have that crappy code somewhere!


Cool stuff. You might enjoy this video[0] where Ben K. from Applied Science measures the RPM of his car engine via the cigarette lighter by measuring the inductive ignition spikes with an FFT on his scope.

[0] https://youtu.be/t0ToYhjYV9I


Thanks for sharing that! Aside from the excessive fawning over the scope, that was a great video. I think it's awesome that the result differed from expectations because of the (spoiler alert!) DMC-12's uneven firing pattern, which made the whole thing more interesting.


You could also use the accelerometer in the phone to measure power transfer.


Commenting as a mathematician who sometimes uses the Bourbaki books as reference for research: this is probably not a good idea. These books have a very dry style and in particular rarely contain detailed examples, motivation for definitions, or intuition for theorems.

They are great for reference if you already know the subject, but if you want to learn something, you'd be better off using introductory textbooks for the various fields you're interested in. I'm happy to make recommendations, and you can also look up syllabi for math undergrad at Stanford, Princeton, etc.


This page appears as if it was written by the authors or the PR department of the university...


The praise from the popular press and the promotion by the authors should be put into the context of what mathematicians think of it.

Two blog posts by a professor at U. Chicago, describing it as intellectual fraud:

https://www.galoisrepresentations.com/2019/07/17/the-ramanuj...

https://www.galoisrepresentations.com/2019/07/07/en-passant-...


Man, this is an example of how difficult it is to know what is BS or not if you're not an expert on the subject. On one hand, this article was published in Nature, which I thought was trustworthy. On the other, there's this comment on a social media platform that links to a blog that also seems legit. No wonder misinformation spreads so fast. Even after reading both, I don't know what to make of it. The reaction and comments here just confuse me more.


Nature has published some very questionable papers in AI/ML that are filled with methodological malpractice. Another bogus paper that comes to mind claimed to predict earthquakes with a deep (read: huge) neural network; it appears to have had information leakage and was fuelled by DL hype, when a simple logistic regression (i.e. a single neuron) could perform just as well [1,2,3].

[1] https://www.reddit.com/r/MachineLearning/comments/c4ylga/d_m...

[2] https://www.reddit.com/r/MachineLearning/comments/c8zf14/d_w...

[3] https://www.nature.com/articles/s41586-019-1582-8 / https://arxiv.org/pdf/1904.01983.pdf


This is a frighteningly common practice in DL research. Baselines are rarely taken with respect to alternate techniques, largely due to publication bias.

On one hand, papers about DL applications are of interest to the DL community and useful for seeing whether there is promise in the technique. On the other hand, they may not be particularly useful to industry, or to advancing broader research goals.


A good rule of thumb is to be slightly more suspicious of "DL for X" unless X was part of the AI/ML umbrella in the 2000s. If no one was publishing about X in AAAI/NIPS/ICML before 2013 or so, then there's a pretty good chance that "DL for X" is ignoring 30+ years of work on X. This is less true if one of the paper's senior authors comes from the field where X is traditionally studied.

Another good rule of thumb is that physicists writing DL papers about "DL for X" where X is not physics are especially terrible about arrogantly ignoring 30+ years of deeply related research. I don't quite understand why, but there's an epidemic of physicists dabbling in CS/AI and hyping it way the hell up.


Anecdotally, having come from a physics background myself - DL is more similar to the math that physicists are used to than traditional ML techniques or even standard comp-sci approaches are. In combination with the universal approximation proofs of DL, it's easy to get carried away and think that DL should be the supervised ML technique.

Curiously, having also spent a lot of time on traditional data structures and algorithms gave me an appreciation for how stupendously inefficient a neural net is, and part of me cringes whenever I see a one-hot encoding starting point...


Re: similar to the math they know, this makes sense.

I don't understand why over-hyping and over-selling is so common with AI/ML/DL work (to be fair, over-hyping is more related to AI than physicists in particular. But people from non-CS fields get themselves into extra trouble perhaps because they don't realize there are old-ish subfields dedicated to very similar problems to the ones they're working on.)


I think the answer is related to startup culture, where the point is to gather funding.


That's the confusing thing. Appealing to rich Twitter users/journalists/the public doesn't strike me as a particularly good strategy for raising research funds!

Random rich people rarely fund individual researchers. It's more common for them to fund an institute (perhaps even by starting a new one). The institute then awards grants based on recommendations from a panel of experts. This was true before the Epstein scandals, and now I cannot imagine a decent university signing off on random one-off funding streams from rich people.

All gov funding goes through panels of experts.

Listening to random rich people or journalists or the public just isn't how those panels of experts work. Over-hyping work by eg tweeting at rich/famous people or getting a bunch of news articles published is in fact a good way to turn off exactly the people who decide which scientists get money.

Maybe a particularly clueless/hapless PR person at the relevant university (or at Nature) is creating a mess for the authors?


>Random rich people rarely fund individual researchers.

Yes and no. There are private foundations where, if someone donates a reasonably large amount, say at least the amount of their typical grant, they will match the donor with a particular researcher, and the researcher will have lunch, give a tour, and later send them a letter about the conclusions (more research is needed).

That doesn't mean the donor gets input into which proposals are accepted; that is indeed done by a panel of experts as far as I know. It's more of a thing to keep them engaged and relating to where the money goes when there are emotional reasons for supporting e.g. medical research.


Even less than that now. The ability of specific donors to direct funds to specific academic groups for specific research is WAY more constrained now than it was pre-Epstein. Institutions want that extra layer of indirection.


While I agree with you, the Nature paper that I linked above was published by the folks at Google, of all places. I think a valid hypothesis is that work done during internships (or even residencies) may not be on par with what NeurIPS/ICLR/etc. require, but it gives publicity, and thus the PR teams push for that kind of paper.

However, it still does not explain why this kind of sloppy work is done and published by publicly funded research labs, except perhaps as a form of advertisement.


Well, yeah, corporate research is what it is. A lot of the value add is marketing.


I agree with the sibling comment that whenever ML has not been used before in a field, and DL, and especially DRL (deep reinforcement learning), are used, it is likely that the authors are ignoring decades of good research in ML.

After a very theoretical grad course in ML, I have come to appreciate other tools that come with many theoretical guarantees and even built-in regularization, and that involve less Grad Student Descent and more understanding of the field.

I think that the hype that was used to gather funding in DL is getting projected onto other fields, if only to gather more funding.


The articles don't contradict each other when it comes to cited facts - you can believe both!

I suppose it's all in the implications though, which are contradictory, as the Nature article implies it is a big deal. The Nature article doesn't give any examples of interesting conjectures, or of interesting consequences should any of the conjectures be true. They talk a lot about alternate formulae to calculate things we already know how to calculate. Why would we care? Do they have a smaller big-O? Nature references theoretical links to other areas of math; if true that's great, but then surely they would have mentioned an example of such a link? Anyway, I lean towards this not being that interesting, even based just on what the Nature article said.


Re: why would we care: this is a search algorithm for numerical coincidences. Most numerical coincidences are trivial; for example, they can be derived from a hypergeometric function relation that was already known to Gauss. In fact it would be interesting to automatically filter out formulae which can be derived from the hypergeometric function relation... On the other hand, numerical coincidences can lead to deep theory; monstrous moonshine is a prime example. The hope is that by searching for numerical coincidences, we can discover one that leads to deep theory without already knowing that deep theory. This seems reasonable.


That's a very good point - and it really motivates these kinds of computer searches.

The Nature paper has quite a lot of detail in its supplementary material:

https://static-content.springer.com/esm/art%3A10.1038%2Fs415...

Table 3 inside also shows new conjectures for constants such as Catalan's and zeta(3). These results do not seem to arise trivially from known results.


FWIW, that blog is written by one of the leading number theorists in the US today. Of course, his opinions are his and you're free to form your own, but I just wanted to clarify that the blog is very much legit.


It seems like Calegari is chasing after PR and may be angry at computer scientists getting into his field.

His criticism was discussed and found incorrect by the peer review process:

https://static-content.springer.com/esm/art%3A10.1038%2Fs415...


This is an incredibly strange slate of reviewers. Only the first seems to really understand the mathematical context. It's odd to appeal to the peer review process when the "peers" are not suited to complete the review.

I assure you that Calegari knows more about number theory than any of those referees, and the reasons why the paper is bad are well-explained on his blog (cf. the two links above) and by referee #1. Speaking of "peer review," look at how all the excellent mathematicians commenting on that blog agree with him!


I agree. Without being an expert in the field, the 2nd and 3rd reviews "smell funny": they clearly lack depth, are very enthusiastic, and don't make a good, comprehensive case as to _why_ this paper is supposed to be as great as they claim it is. In a strong community, any editor should consider these reviews disappointing, and any author should at least have mixed feelings about them.


Calegari gets to cherry-pick comments he approves or rejects on his blog, so calling it "peer review" is taking the concept out of context =).


True! I tried several times to comment on his blog, but Calegari never approved my comments.

It's hypocritical to criticize while avoiding criticism...


The authors tweeted at Elon Musk and Yuri Milner, so it's obvious who is chasing PR (and "dumb money").

Meanwhile, the blog author congratulated Mathematica for being good at solving continued fractions.

I'd ask you where the criticism was "found to be incorrect", but I know that's absurd (aka, not even wrong), as peer review comments are not in the business of "finding criticism to be incorrect".


Assuming that it's the PI/senior author doing all of this shameless promotional work, I feel really REALLY bad for the grad student(s) on this paper... what a way to start your academic career :(

The paper is actually really nice work, but holy jesus someone on that author list is making a complete ass out of themselves.

Academia isn't startup world. The community is small, people have long memories, and I've rarely seen the strategy being deployed here work out. It does work sometimes, but more often it backfires. Especially for folks who aren't yet on a tenure track.


Science/Nature are prestigious, but the quality of their articles is often questionable. Part of the problem is the short format, which makes it difficult to include a lot of context and sanity-checking. Another issue is that they prioritize the “sexiness” of the research over pretty much everything else.


I'll never understand why Science/Nature carry any currency in CS and Math. TBH I consider them negative signals in these fields, and I encourage others to do the same when hiring -- the same way that a prestigious newspaper or magazine would treat someone with a bunch of (non-news) Buzzfeed bylines.

There are some exceptions. E.g., a Science/Nature paper summarizing several years' worth of papers published in "real" venues. Truly novel work that's reported on for the first time in Nature/Science is almost universally garbage. At least in CS/Math.


In my experience, at least in pure math, publishing in Nature/Science doesn’t carry any weight. The most prestigious journals for a given subfield usually specialize in that subfield (with a name like Journal of Geometric Topology), with a few exceptions like Annals and JAMS. Even those are still focused heavily on pure math; I can’t think of any which are cross-disciplinary outside of math.


Nature in particular seems vulnerable to the academic equivalent of click-bait articles. I think the top journals within a specific field are more reliable.


This is why I have empathy for conspiracy believers. From their perspective, their understanding of the world is accurate.

This is also why I think social media platforms will inevitably fail at regulating truth vs. non-truth.


As a thorough non-expert, I don't take headlines in the style of The Register seriously, even if the article is in Nature.

Although, if it was really from The Register it probably would have said "boffins" rather than "humans".


Sometimes popular science is itself the misinformation. The authors stretch the findings to land in prestigious journals. The news stretches the findings further to sell clicks (cf. the Gell-Mann Amnesia effect). People on the internet selectively quote some articles and ignore others. The algorithm tries to show you only content that you like.

The truth doesn't have a chance.


This phenomenon has a name: epistemic learned helplessness.

https://slatestarcodex.com/2019/06/03/repost-epistemic-learn...


Full disclosure: I am one of the authors of the paper.

Note that the blog you're citing was written a year and a half ago. It refers to a select few conjectures, and naturally has no references to the developments of the past year and a half (which were the main reason the paper got published).

Furthermore, the author of the blog didn't respond to multiple emails we sent him, attempting to discuss the actual mathematics.

So basically, the vast majority of the criticism here is based on a single, outdated blog post by a professor (respected as he may be) who has not revisited the issues and new results since first posting it, and who has not given any mathematical argument as to why the results shown in the paper (the actual updated paper that was published) are supposedly unimportant.

Would appreciate your opinions on the matter.


Not the person you're replying to, but I admit to characterizing your paper as "garbage" in another comment thread. Since you're inviting discourse, which I greatly appreciate, I'm compelled to reply.

1) To anyone who's studied algebra, it is clear that identities of the form LHS = RHS can be obtained by a nested application of transformations and substitutions in a consistent manner.

2) Of course, arriving at a new, insightful result often involves taking mundane steps. However, in this case, the new mathematical discoveries based on the output tableaus of your algorithm are hypothetical, whereas the manuscript (and the authors) have already pocketed one of the premium accolades in the sciences in the form of a Nature publication.

3) To drive the point above home, do you think the resulting mathematical insights themselves, without riding on the "AI" novelty aspect, would clear the bar for a Nature (or similar high-impact) publication? To be clear, I'm not a mathematician, but I believe the answer would be no. Contrast this with another AI/ML advance published in Nature quite recently: AlphaGo. Note how the gist of their paper, superhuman performance in Go, is a self-standing achievement that merely makes use of machine learning techniques.


"garbage" and "fraud" are really strong words.

I would give the actual work behind this paper a "strong accept" if the claims were properly scoped, perhaps with a weak/borderline score on "significance/impact", since I'm not really sure why anyone cares about discovering these sorts of identities. Probably a Conditional Accept in its current form, because of the mismatch between the actual results + reasonable expectation of potential vs. what's claimed.

So, "over-hyped" and "claims wildly out of line with actual results" are definitely more than fair statements. "Fraud" or "garbage" are way too strong.

Re: Nature, I don't really understand it or care. I can say that in my own input to hiring committees I tend to treat Nature papers in CS/Math as red flags unless they're consolidations of a bunch of other work published in top sub-field journals/conferences.

For some reason Nature really loves these "automated discovery of random mathematical facts" type of papers. I don't understand it. I tend to assume it's click-through-rate-driven editorial decision making.


I appreciate your views otherwise, but I did not use or imply "fraud" in my comment(s).


No, I know. I was referring to other comments on this story. I think garbage is also strong.


I think the vast majority of criticism here does not target the research per se, but rather the way the results are "hyped" and presented as a massive breakthrough. I agree with this criticism, and also think that the two positive Nature reviews seem rather shallow, at least from a non-expert's perspective (this is not your fault, of course). When it comes to long-term impact, I'd find it interesting to discuss how your work can (ideally) interact with proof assistants like Lean. Also, the work around Lean is a good example of a "hyped" topic that is presented by its contributors with caution and modesty.


I don't really see much value in debating the procedural aspects (Nature review process etc). I see a lot of value discussing the research and its content. We think the results shown in the paper are significant and of some importance, and so do others who reviewed our work. This is where I think the focus should be.

Please read our paper and not only the blogs criticizing it :) There is a link to access it here: https://rdcu.be/ceH4i


Then don't take offense at the discussion here, because it's mostly about some "meta" aspects of science communication, and you are probably not responsible for any of the aspects that have been criticized.

Regarding the research itself, I am not an expert, but I am curious to learn how this line of research (automated conjecture generation) intersects with proof automation/proof assistants, and in particular with the work that the Lean community is doing (creating an "executable" collection of mathematical knowledge). Perhaps there are some works you can point to.


Ah yes, the good old "SV culture disrupts X! Revolution at 8 o'clock!"

There's an arms race:

* People are evolving memetic resistance to the incessant BS, ads and bombastic headlines.

* The SV/startup culture is evolving to inject authenticity to overcome people's BS defenses and convince them they need a change.

Honestly, do you still get excited when you read "AI solves X!"?

Probably another huckster peddling empty air, cutting corners, externalizing costs. The whole game is tired, and people are taking note. Not everything that exists requires a radical change.


Look at the comments:

> The paper is amazingly bad. None of the authors are mathematicians as far as I can see. I think the word “new” appears 50+ times in the paper. Looks like they updated the paper to include your observation from last time about the Gauss continued fraction without mentioning the source (the authors admit here they read your blog: http://www.ramanujanmachine.com/idea/some-well-known-results...). Classy!

Just some light plagiarism/academic misconduct!


> Well … OK I guess? But, pretty much exactly as pointed out last time, not only is the proof only one line, but the nature of the proof makes clear exactly how unoriginal this is to mathematics

This is what I was wondering about while reading the article. If the AI only generates formulas whose proofs involve only a few trivial steps back to something that is known, then it doesn't feel useful. But I feel like the question “what makes a good conjecture?” in its own right makes for a very interesting discussion.


Mathematical physicist Robbert Dijkgraaf has got you covered:

https://www.quantamagazine.org/the-subtle-art-of-the-mathema...


Wouldn't a good conjecture be anything that's interesting if true? Bonus points if intuitively it seems like it should be obviously true (or false) yet is hard to prove, or if proving it true would allow you to prove lots of other interesting statements.


Sure. Define "interesting".

What's interesting to me is probably in a standard textbook already.


If it is in a standard textbook, then it's almost certainly interesting (although probably not a conjecture, unless it's a pretty advanced textbook).


I think one interesting lesson from this qualification is that, at the moment, ML methods to learn mathematics may look trivial to a professional mathematician (i.e. the results are unoriginal or trivial), but perhaps the target audience of this method is non-professional mathematicians or students training to be mathematicians. I could still see this ML tool as a way to automate some of the more “trivial” (from the POV of an expert) mathematics, although not the work of professional mathematicians.

The knowledge gap in mathematics between professional mathematicians and non-professionals is vast, and this tool could narrow the gap.

I would bet the majority of readers of Nature would not be able to point out that the outputs of the ML tool were trivial. So there is a need to narrow this gap.


Simple results in almost any specialist field would stump most readers of Nature. That's not a reason to publish in Nature.


This is shocking stuff. I encourage everyone to read these two links.


I agree with the article you linked. Mathematical knowledge is about compression. Most if not all of these formulae are just specializations of known formulae. So the value of this approach is questionable. Generating these forms can possibly be done in a much simpler way.


Who else holds the compression view?


I understand his frustration. But calling it a fraud is a little bit too much.


Looking at his post, the main criticism is "that the program has not yet generated anything new", but the post does not refer to the actual results (like the formulas for Catalan's and Apéry's constants).


Irrelevant.


The Nature paper presents several new conjectures related to the Catalan constant, pi^2, and zeta(3) (Apéry's constant):

http://www.ramanujanmachine.com/wp-content/uploads/2020/06/c... http://www.ramanujanmachine.com/wp-content/uploads/2020/06/p... http://www.ramanujanmachine.com/wp-content/uploads/2020/06/z...

The main criticism of the blog is "that the program has not yet generated anything new", but the post does not refer to these results. So it seems that this blog post is currently irrelevant and outdated compared to the Nature publication.


Discussion on /r/MachineLearning with caveats here: https://www.reddit.com/r/MachineLearning/comments/jkrzlt/d_a...


The last author and his company were/are involved in scandals about faking results on a large scale:

https://www.reddit.com/r/MachineLearning/comments/8zm4kl/d_l...

http://sadeghi.com/dr-iman-sadeghi-v-pinscreen-inc-et-al/


Thanks, I think this deserves an HN post of its own. Some of the things that were done to entice Sadeghi to join send shivers down my spine.

Edit: I've been looking at the details on the rear view of some of the "Single-View Reconstruction" examples, and I'm starting to worry that this may actually not be reproducible.


I have followed this case since the beginning. I am surprised that the academic community (which is fully aware of this case) continues reviewing his work without questioning his research ethics, as though nothing happened. TL;DR:

Dr. Iman Sadeghi is the man behind hair rendering tech for Disney and Dreamworks. He left his job at Google to join Hao Li's company Pinscreen (which, by the way, is funded by big names like Softbank).

When Sadeghi saw red flags inside the company, he raised the issue from within, and finally wanted out. On the day he was leaving, Li and his colleagues violently assaulted Sadeghi to make him give up his company laptop. This, by the way, was recorded on CCTV cameras and can be viewed online.

The fraud case is about falsified results in their SIGGRAPH 2017 Technical Papers submission. They claimed to generate avatar hair shapes automatically in their paper, and when a reviewer asked them for results on many faces, the company paid artists as much as $100 apiece to generate them manually. Of course, they later claimed to have made it fully automatic AFTER publishing the paper, but that doesn't change the fact that they published false results at one of the biggest computer graphics conferences. Hell, even at their public demo, they showed pre-cached avatars and claimed they were generated in real time.

