Hacker News | iandanforth's comments

Why does it weigh 100lbs?

nit: I find the writing in this post very distracting. (Grammar and style pet peeves)

Luckily, it is now trivial to drop the post into Claude and say "Re-write this without <list of things that bother me>"

So, just in case you also felt like you were driving over a road filled with potholes trying to read this post, don't just click away, have your handy LLM take a pass at it. There's good stuff to be found.


While this article presents a system, it doesn't present any results. Does this modification to SRS help? Which type of student does it help? How large is the effect, if there is one?

I haven't read the larger pdf of which this is a part so perhaps someone who has can provide a pointer to some results.


The main idea is that this approach makes spaced repetition feasible in something like mathematics. Without this approach, spaced repetition wouldn't even be feasible because after a short while, you'd be continually overloaded with too many reviews to really make any progress learning new material.

Moreover, in addition to making spaced repetition feasible, it minimizes the amount of review (subject to the condition that you're getting sufficient review) which allows you to make really fast progress.

We (Math Academy) don't have any official academic studies out at the moment, but if you want some kind of more concrete evidence of learning efficiency, you can read more online about our original in-school program in Pasadena where we have 6th graders start in Prealgebra, and then learn the entirety of high school math (Algebra 1, Geometry, Algebra 2, Precalculus) by 8th grade, and then in 8th grade they learn AP Calculus BC and take the AP exam.

The AP scores started off decent while doing manual teaching, but the year we started using our automated system (of which the SRS described here is a component), the AP Calculus BC exam scores rose, with most students passing the exam and most students who passed receiving the maximum score possible (5 out of 5). Four other students took AP Calculus BC on our system that year, unaffiliated with our Pasadena school program, completely independent of a classroom, and all but one of them scored a perfect 5 on the AP exam (the other one received a 4).

Even some seemingly impossible things started happening like some highly motivated 6th graders (who started midway through Prealgebra) completing all of what is typically high school math (Algebra I, Geometry, Algebra II, Precalculus) and starting AP Calculus BC within a single school year. Funny enough, the first time Jason & Sandy (MA founders) saw a 6th grader receiving AP Calculus BC tasks, Jason's reaction was "WTF is happening with the model, why is this kid getting calculus tasks, he placed into Prealgebra last fall, this doesn't make any sense," but I looked into it only to find that it was legit -- this kid completed all of what is typically high school math (Algebra I, Geometry, Algebra II, Precalculus) within a single school year.

Anyway, some links if you're interested:

* https://www.mathacademy.us/press

* https://www.reddit.com/r/homeschool/comments/16hn9f5/comment...

* https://x.com/justinskycak/status/1810482435940913502

* https://x.com/justinskycak/status/1812557234028839193

Again, I realize these are not official academic studies, but we're completely overloaded in startup grind mode right now and have so many fish to fry with the product that we just don't have the time for academic pursuits at the moment, let alone much sleep. Happy to answer any follow-up questions that you might have, though.


I'm excited about learning DAG + spaced repetition - fully sold this is an improvement over Anki

> Without this approach, spaced repetition wouldn't even be feasible because after a short while, you'd be continually overloaded with too many reviews to really make any progress learning new material.

I think this may be an overstatement. I've used Anki to learn and retain a lot of math, including linear algebra (e.g., I worked through several chapters of Strang's book and its exercises). While it's not perfect, and I would love to have a DAG, what ends up happening in my experience is that the more basic topics, and the topics related to them, end up proven well enough understood that they are backed off from review for a long time. So I might not be asked about a basic topic / problem for a year. This seems to prevent being overloaded with duplicate / repeat cards.

However, if I do find that I am faced with a more advanced card that I have forgotten some of the foundations on, it is frustrating to not be able to easily be challenged smartly up through its ancestors. If I keep up with Anki every day (10-20 minutes) this doesn't happen, but if I take 2-3 months off and come back to my deck, I can be faced with this problem and sometimes have to go manually digging for relevant background topics. So that's why I would love to have all this stuff get smarter, and am now reading / following your and Math Academy's work.
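For illustration only, here is a minimal Python sketch of what "being challenged up through a card's ancestors" could look like if prerequisites were stored as a DAG. The topic names and the `PREREQS` mapping are hypothetical, and this is not a description of how Anki or Math Academy actually works.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite graph: topic -> set of direct prerequisites.
PREREQS = {
    "eigenvalues": {"determinants", "linear_maps"},
    "determinants": {"matrix_multiplication"},
    "linear_maps": {"matrix_multiplication"},
    "matrix_multiplication": {"vectors"},
    "vectors": set(),
}


def ancestors(topic, prereqs):
    """Return every direct or indirect prerequisite of `topic`."""
    seen = set()
    stack = list(prereqs.get(topic, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(prereqs.get(current, ()))
    return seen


def review_order(topic, prereqs):
    """List the ancestors of `topic`, most basic first."""
    anc = ancestors(topic, prereqs)
    # Restrict the graph to the ancestors so the sorter only sees them.
    subgraph = {t: prereqs.get(t, set()) & anc for t in anc}
    return list(TopologicalSorter(subgraph).static_order())


order = review_order("eigenvalues", PREREQS)
print(order)
```

After a long break, a forgotten "eigenvalues" card would then come with a refresher path that starts at "vectors" and works upward, instead of requiring a manual search through the deck.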


These are fair points; I guess the feasibility of spaced repetition comes down to what you consider a repetition. If you're considering a single quick problem on a topic to be a repetition, and you don't have too many topics, and you're not plowing through them too quickly, then I can see it being feasible as you're saying.

Math Academy is built with a different context in mind:

-- 1. we have tons of different topics (for instance, over 300 in our AP Calculus BC course -- and that's just one course; students who stick with us past their first course and continue taking more courses on our system can easily accumulate thousands of topics on which they need to maintain their knowledge)

-- 2. each repetition consists of multiple questions on a topic (after an initial lesson task consisting of somewhere in the ballpark of 10 questions, future repetitions are review tasks in the range of 3-5 review questions)

-- 3. students are getting through a new topic every 20 minutes or so on average (our AP Calculus BC course is estimated to take about 6000 minutes, and that includes time spent on review / quizzes / etc., and 6000 minutes / 300 topics = 20 minutes/topic)

Just some back-of-the-envelope math in our context: say you learn 3 new topics per day (an hour-long session), and each review takes you about 4 questions. Then, as a rough estimate, pretty quickly you reach a point where you've got

12 review questions based on yesterday's lessons,

+ another 12 review questions on topics you reviewed for the first time last week,

+ another 12 review questions on topics you reviewed for the second time a couple weeks ago,

+ another 12 review questions on topics you reviewed for the third time a month ago,

+ another 12 review questions on topics you reviewed for the fourth time a couple months ago,

...

That's already 60 review questions/day, already past the point where you’re spending every hour-long session entirely on review, which means that your progress grinds to a halt in terms of learning new material.
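A rough Python sketch of the estimate above. The 3 topics/day, 4 questions/review, and 6000 minutes / 300 topics figures come from the comment. The function name and the simplification that every listed review stage has one day's worth of topics coming due each day are assumptions for illustration, not Math Academy's actual scheduler.

```python
def daily_review_questions(topics_per_day, questions_per_review, stages_due):
    """Review questions per day once `stages_due` review stages are in flight.

    Assumes each stage has one day's worth of topics coming due daily.
    """
    return topics_per_day * questions_per_review * stages_due


# Figures quoted in the comment.
minutes_per_topic = 6000 / 300
per_stage = daily_review_questions(3, 4, stages_due=1)
five_stages = daily_review_questions(3, 4, stages_due=5)

print(minutes_per_topic)  # 20.0
print(per_stage)          # 12
print(five_stages)        # 60
```

Each additional review stage adds another 12 questions per day under these assumptions, so the load keeps growing with the number of stages in flight unless something cuts it down.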

So, in our context at least, raw spaced repetition just doesn't work out in a way that’s feasible – students will get absolutely crushed by a tsunami of review unless we take measures to drastically cut down the amount of review (i.e., fractional implicit repetition + repetition compression).

I hear your point about needing some refresher if you take 2-3 months off and come back, though. Similar thing happens with Math Academy students if they take time off and come back, but the way we solve that problem is we just recommend they take a new diagnostic to refresh their knowledge profile. Basically, it just peels back their "knowledge frontier" to a point where they can pick up and continue learning smoothly with our standard approach.

(In my experience, that’s what you’d ideally do as a teacher after summer vacation – you know your students have forgotten a lot of material, so you have them take a beginning-of-the-year knowledge evaluation and go from there.)


Cool, thanks for these details - perhaps I'm underestimating how much faster I could be learning with a tool like this - I also think how much time one is spending makes a difference - I would be chipping away a few hours a week max, had I spent more like full time I probably would have run into what you are describing.

Thanks for the detailed response! Fwiw I bet you could get some free academic labor just by offering to let a local PhD student have access to your internal data. (Obviously still not zero effort)

And yet the consequence of letting people like this run your security org is that it takes a JIRA ticket and multiple days, weeks, or forever to be able to install 'unapproved' software on your laptop.

Then if you've got the software you need to do your job you're stuck in endless cycles of "pause and think" trying to create the mythical "secure by design" software which does not exist. And then you get hacked anyway because someone got an email (with no attachments) telling them to call the CISO right away, who then helpfully walks them through a security "upgrade" on their machine.

Caveats: Yes there is a balance and log anomaly detection followed by actual human inspection is a good idea!


Open source is great and new things are great and pursuing your passion is great. The rhetoric here however is lacking. Specifically the argument is "google money bad" but the authors don't provide specific examples where google money has caused a technical decision they disagree with.


> No "default search deals", crypto tokens, or other forms of user monetization, ever.

Is avoiding those sorts of things not supposed to be reason enough for them?

Also the page does a good job of specifically mentioning Google and making general statements about what any source of funding can impact. If Google wanted to give an unrestricted donation it's not clear from this page they would decline it.


The semantic equivalence of possible outputs is already encoded in the model. While it is not necessarily recoverable from the logits of a particular sampling rollout it exists throughout prior layers.

So this is basically saying we shouldn't try to estimate entropy over logits, but should be able to learn a function from activations earlier in the network to a degree of uncertainty that would signal (aka be classifiable as) confabulation.
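To make the distinction concrete, here is a toy Python sketch of why entropy over surface outputs can overstate uncertainty when the outputs mean the same thing. The sampled answers, their probabilities, and the meaning labels are made up for illustration. In practice the grouping by meaning would have to come from a model or a learned probe on activations, which is not shown here.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in nats of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical sampled answers: (text, probability, hand-assigned meaning label).
samples = [
    ("Paris", 0.40, "paris"),
    ("It's Paris.", 0.35, "paris"),
    ("The capital is Paris", 0.25, "paris"),
]

# Entropy over distinct strings: looks uncertain.
surface_entropy = entropy(p for _, p, _ in samples)

# Entropy after pooling probability mass by meaning: no uncertainty left.
by_meaning = defaultdict(float)
for _, p, meaning in samples:
    by_meaning[meaning] += p
semantic_entropy = entropy(by_meaning.values())

print(round(surface_entropy, 3), round(abs(semantic_entropy), 3))
```

The three strings disagree on wording but not on content, so the pooled entropy collapses to zero while the string-level entropy stays above 1 nat.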


Are you a lawyer? Can you point to the relevant statutes? I don't ask this flippantly as there exist many forms (private / public / academic) of library which allow for both physical and digital lending of owned assets and are not subject to lawsuits like this. It's not at all obvious that IA's interpretation of the law is in error.


The relevant statute provision is 17 U.S.C. § 501.

On March 24, 2023, the Internet Archive was found liable for copyright infringement under that section by a federal court, in an order granting a motion for a summary judgment.[0] A summary judgment means that there is no genuine dispute about facts, and the plaintiffs (the people suing the Internet Archive) are entitled to a judgment as a matter of law.[1]

It is your prerogative to feel that you're better qualified to interpret federal law than a federal court is, but it is fairly misleading to say that it is not at all clear what the law is here, when a court decision exists on these exact facts.

Should the law be changed? Yes, in my opinion. Is there much dispute over what the law is? No, not really.

[0] https://storage.courtlistener.com/recap/gov.uscourts.nysd.53...

[1] Other common law jurisdictions use clearer language to describe summary judgments: in the UK and Australia, for example, a summary judgment is granted when a party has "no reasonable prospects of success" and there is no point in going to trial. These exact words aren't used in the US, but they give a reasonable indication of how summary judgments are used in practice.

None of this is legal advice.


You really don't need to be a lawyer to understand the extent to which Internet Archive really screwed up here. For starters they lost on Summary Judgment, which means they couldn't come up with a single issue of fact that the judge thought deserved a trial. Read the Order, the Judge obviously has it straight. Then check archive.org's Form 990s and see how little money they run on, how much they pissed away on legal fees so far, and also infer the amount they had to pay in damages, which was obviously very tiny.

You are welcome to argue it as a matter of culture (and I'm inclined to agree and cheer you on) but from a legal perspective, Brewster should be removed and they need to find competent people to put on the board because they really did put the entire organization at risk over an idiotic decision. And the ramifications continue.

Internet Archive's "we own a copy of a book, we scan it and loan out one digital copy" policy was already on shaky ground. When Covid hit and everyone lost their minds, letting homeless people sleep on the stairs of their building apparently wasn't enough so they just turned into The Pirate Bay and loaned out infinite copies of everything.

In discovery for the case, it turned out they weren't even tracking the "we own one copy" part to begin with correctly. None of this should be surprising to anyone who actually attempts to use the site. The whole thing is duct tape and string.

They have a tiny budget and they do amazing things with it, but it really deserves to be treated like a business and not be run like an art project. If they wanna stick their neck out and push for CDL reform, great. Just do it under a different LLC so they don't tank the 50 other important things they've got going on. And it's time for Brewster to move on. In any other non-profit he'd be gone by now.


I disagree. Brewster demonstrated real courage.

The world locked down unnecessarily precisely because the population has not been getting smarter the past 50 years because copyright has poisoned our information wells.

This is a fight worth having.

It's time to abolish copyright.


There's absolutely no courage in putting something that's been built and used for almost 2 decades on the line because of the wave of "we must do something" that was so common in March and April 2020, even when in most cases doing nothing would've been more helpful.

It would be like if Mozilla decided to stop taking any google funding overnight because they felt like their values didn't fit with Google's. Sure, it would be well intentioned. But then you also guaranteed that Firefox won't be able to exist for more than a few weeks. Deciding to unilaterally reproduce (by way of unlimited lending without the copies to back said lending) books is the legal equivalent to doing that. It would be a nice thing to have, but it's not something you get by just getting wrecked in an open and shut case in court.


> It would be like if Mozilla decided to stop taking any google funding overnight because they felt like their values didn't fit with Google's.

That would be the best thing that could happen to Mozilla because then the vultures would move on and the project could get back to its mission without the MAJOR conflict of interest fucking up their incentives.

> But then you also guaranteed that Firefox won't be able to exist for more than a few weeks.

Firefox doesn't need Mozilla to continue existing. But even if it died that would at least make space for an alternative that isn't just controlled opposition doing the bare minimum to protect Google from antitrust lawsuits while continuously disrespecting user choices and preferences.


> This is a fight worth having.

Literally everyone here has already stated that they agree with you on this. There's no controversy upthread about whether copyright law needs to be reformed, the controversy is over whether it made sense to risk the entire Internet Archive (whose most important contribution to the knowledge of mankind has nothing to do with online lending) or if it should have been fought by an organization that was built to fight it.

Their instinct to help is admirable, but their lack of restraint shows a major lack of judgment and very well could have put their archives of the internet at risk. Not every nonprofit organization can do everything.


To put a finer point on it, the archive is there to be an archive -- there is a lot to be done there. The EFF is there to fight battles.


When a CEO pulls some nonsense and gets his ass handed to him in court, it's not an act of bravery, it's hubris. He set the result both you and I want back ten years by running his organization in such a sloppy manner. Even before the lawsuit they didn't keep accurate records of what they were lending, full stop.


It may be a fight worth having, but that wasn’t the way to fight it. There was zero chance this was going to end with a favorable court decision. It only carried risk for the organization - no upside at all.


If you want to fight that fight, instead of putting hundreds of thousands of books on the internet, you start with a few books and get sued over that. That way, if you lose, the actual damages are minimal. Right now, the Internet Archive could go bankrupt because of the millions of dollars in potential damages now that they've lost.

Finding representative cases to fight over and create precedent is a common strategy that exposes you to less risk if you lose.


> from a legal perspective, Brewster should be removed and they need to find competent people to put on the board

I don't like that the IA risked the actual internet archive with this or that they chose to engage in DRM at all, but let's be real: "competent people" would have sold the IA to the advertisement moloch or another horror of modern civilization long ago. The IA does need a leader that puts principles above financial security.

> it really deserves to be treated like a business

That would be the absolute worst thing that could happen to the IA.

> And it's time for Brewster to move on. In any other non-profit he'd be gone by now.

Consider doing something worthwhile of your own instead of trying those who have but aren't perfect enough in your opinion.


Stockpiling assets on an active fault line, opening a Credit Union, failed activism, obviously shitty tech... it's time to put a grownup in charge of the thing. They're an archive. It needs consistent funding and boring tech and to focus on three things, max.


Are you aware of a form of library subject to US law that lends copyrighted digital assets without either a license from the copyright holder or legal trouble from the copyright holder?


>Can you point to the relevant statutes?

IANAL, but:

https://www.law.cornell.edu/uscode/text/17/107

https://www.law.cornell.edu/uscode/text/17/108

https://www.law.cornell.edu/uscode/text/17/117

What the Internet Archive did (loan many digital copies based on one physical copy) is illegal as the law stands today.


> Can you point to the relevant statutes?

This is about sound recordings rather than books, but one of the more insane features of US copyright law is that the copyright status of sound recordings made before 1972 is governed by state rather than federal law, and many states are thought to have no applicable statute. Some of these states determine copyright status for these works by deferring to federal law, which does not cover them.


If you enjoy this topic I also highly recommend "A Brief History of Intelligence" which goes into quite a bit of detail, is very readable, and ties in directly to the near term future of what intelligence will mean in our world. Really a very good book!


I don't envy Australians most things, but this is one worthy of it.


Out of curiosity, as an Australian, what is unenviable?


The total lack of free speech is pretty messed up. Am I correct that you can’t even have blood come out when someone is hurt in a video game marketed to adults?


I am not a free speech absolutist, but I don't know that free speech in Australia is lacking very much compared to any other English speaking country, including America: I don't really believe that your constitutional right protects you the way that you might think it does, assuming you're American.

As to the latter point, no, that's incorrect. There were some issues in the past with getting a game approved at a 15+ age rating instead of 18+, but those have gone by the wayside as videogames in general have become more mainstream & accepted.


Your pro-coal politicians perhaps. And you don’t drive on the right side of the road.


> And you don’t drive on the right side of the road.

Well. There is nothing left to discuss then.


your gigantic spiders


Australia's most venomous spiders are actually small.

But yes, pretty much everything in Australia will try to kill you: it has the world's most venomous snakes, the world's most venomous spiders, saltwater crocodiles, sneaky dropbears, dengue fever carrying mosquitoes, the world's most venomous jellyfish and sea snakes, and, of course, the IT consultants who will eat one alive.


One point not covered but clearly understood tacitly by the author is that the representativeness of a fact degrades over time. While it may be true that someone did something 10 years ago, that fact may not be a true reflection of their character today. Thus a statement may be technically true, but practically false, and thus should be considered defamatory (under the modern American definition).


I like this notion (that any ancient past actions must carry facts of subsequent actions) better than trying to push a "right to be forgotten."

