How do you know you haven't unknowingly broken something when you made a change?
I think if:
- the code base implements many code paths depending on options and user inputs, such that a fix for code path A may break code path B
- it takes a great deal of time to run in production such that issues may only be caught weeks or months down the line when it becomes difficult to pinpoint their cause (not all software is real-time or web)
- any given developer does not have it all in their head such that they can anticipate issues codebase-wide
then it becomes useful to have (automated) tests that check a change in function A didn't break functionality in function B that relies on A in some way(s): tests just thorough enough to catch edge cases, but that don't take prod levels of resources to run (something like the sketch below).
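To illustrate, a minimal sketch of the kind of cheap cross-function regression test I have in mind (the functions and options here are invented purely for illustration, not from any real code base):

```python
# Hypothetical: parse_options is "function A", plan_run is "function B" that
# relies on A's output. A "fix" to A should be exercised through B as well.

def parse_options(raw):
    opts = {"threads": 1, "fast": False}   # defaults
    opts.update(raw)                       # user-supplied options win
    return opts

def plan_run(raw):
    opts = parse_options(raw)
    step = ["approximate"] if opts["fast"] else ["exact"]
    return ["setup"] + step + ["solve"]

def test_default_path_stays_exact():
    # Edge case worth pinning down: no options given -> exact code path.
    assert plan_run({}) == ["setup", "exact", "solve"]

def test_fast_path_not_flipped_by_unrelated_options():
    # A change to parse_options (A) must not silently change the path B picks.
    assert plan_run({"fast": True, "threads": 8}) == ["setup", "approximate", "solve"]
```

Runs in milliseconds under pytest, yet it pins down exactly the A-breaks-B coupling described above.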
Now I agree some things might not need testing beyond implementation. Things that don't depend on other program behavior, or that check their inputs thoroughly, and are never touched again once merged, don't really justify keeping unit tests around. But I'm not sure these are ever guarantees (especially the never touched again).
Numerical simulation is a well explored field; we know how to do all sorts of things, and the issues lie in the tooling and the robustness of it all put together (from geometry to numerical results) rather than in conceptual barriers. Finite differences have existed since the 1700s! What hadn't existed, for the longest time, was the computational power to crunch billions of operations per simulation.
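For reference, the textbook building block I'm alluding to, the central difference, with its standard truncation error:

```latex
\[
  f'(x) \approx \frac{f(x+h) - f(x-h)}{2h},
  \qquad
  \text{error} = \frac{h^{2}}{6}\, f'''(\xi) = O(h^{2}).
\]
```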
A nice thing about numerical simulation from first principles is that it innately supports trading speed for precision to an arbitrary degree; that trade-off is in fact the backbone of the mathematical analysis of why it works.
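A minimal numerical check of that dial, using the central difference above on sin(x), whose derivative is known exactly: shrink h and the error drops at the predicted quadratic rate, until floating-point round-off eventually takes over.

```python
import math

x = 1.0
exact = math.cos(x)                     # true derivative of sin at x
for k in range(1, 7):
    h = 10.0 ** (-k)
    approx = (math.sin(x + h) - math.sin(x - h)) / (2.0 * h)
    print(f"h = 1e-{k}:  |error| = {abs(approx - exact):.3e}")
# Each tenfold refinement of h buys roughly two more digits, i.e. O(h^2).
```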
In some cases, as is the case for CFD, we're actually mathematically screwed, because you just have to resolve the small scales to get the macro dynamics right. So the standard remains a kind of hack, which is to introduce additional equations (turbulence models) that steer the dynamics in place of the small (unresolved) scales. We know how to do better (DNS), but it costs an arm and a leg (think years to millennia on a supercomputer).
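To put numbers on "an arm and a leg": the classical back-of-the-envelope estimate from Kolmogorov scaling (textbook exponents, not a claim about any particular solver) says the smallest scales shrink relative to the largest like Re^{-3/4}, so resolving everything in 3D space and time costs roughly

```latex
\[
  N_{\text{grid}} \sim \mathrm{Re}^{9/4},
  \qquad
  N_{\text{steps}} \sim \mathrm{Re}^{3/4},
  \qquad
  \text{total work} \sim \mathrm{Re}^{3},
\]
```

which is why DNS at industrially relevant Reynolds numbers stays out of reach.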
I can't help but find this comment a little insulting. It's very similar to saying "if, while, else, malloc. The LLM can figure the rest out!" as if CS were a solved thing and the whole challenge weren't assembling those elementary bricks together in computationally efficient and robust ways.
Also, more to the point, I doubt you'll have much success with local optimization on general surfaces if you don't have some kind of tessellation or other spatial structure to globalize things a bit, because you can very easily get stuck in local optima even while doing something as trivial as projecting a point onto a surface. Think of anything that "folds", like a U-shape: a point can be very close to one of the branches, but Newton might still find its projection on the other branch if you seeded the optimizer closer to there. It doesn't matter whether you use vanilla Newton or Newton with tricks up to the gills. And anything to do with matrices will only help with local work as well because, well, these are non-linear problems.
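Here's a toy version of that fold, just to make the failure concrete (the parabola and the point are made up; nothing here is specific to any real projection routine):

```python
# Project p onto the U-shaped curve C(t) = (t, t^2) by running plain Newton on
# the stationarity condition D'(t) = 0 of the squared distance D(t).
# p = (0.5, 3) sits inside the U, slightly closer to the right branch, yet the
# seed alone decides which branch Newton lands on.

def newton_project(px, py, t, iters=20):
    for _ in range(iters):
        g  = 4*t**3 + (2 - 4*py)*t - 2*px   # D'(t)
        dg = 12*t**2 + (2 - 4*py)           # D''(t)
        t -= g / dg
    return t

px, py = 0.5, 3.0
for seed in (+2.0, -2.0):
    t = newton_project(px, py, seed)
    dist = ((t - px)**2 + (t**2 - py)**2) ** 0.5
    print(f"seed {seed:+.1f} -> foot point at t = {t:+.3f}, distance = {dist:.3f}")
# seed +2.0 converges to the true closest point on the right branch (~1.18 away);
# seed -2.0 converges to a merely local minimum on the left branch (~2.13 away).
```

No amount of Hessian massaging fixes this; only some global structure (tessellation, subdivision, seeding heuristics) tells you which basin to start in.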
"Just work in parameter space" is hardly a solution either, considering many mappings encountered in BREPs are outright degenerate in places or stretch the limits floating point stability. And the same issue with local minima will arise, even though the domain is now convex.
So I might even reduce your list to: Taylor expansion, linear solver. You probably don't need much more than that, the difficulty is everything else you're not thinking of.
And remember, this has to be fast, perfectly robust, and ideally keep its error under a specified tolerance (something most CAD shops don't even promise).
If anyone is interested, you can try EngineeringSketchPad (https://acdl.mit.edu/ESP/), which is very similar but much more mature. It supports simple geometric primitives and boolean operations via a scripting language, as well as more general rational curves and surfaces (i.e. BREPs). It has other nice features like differentiation, application-specific views (think structural vs CFD), and an attention to water-tightness/correctness.
I have a terrible memory for details, so I'll admit that an LLM I can just tell "Find that paper by X's group on Method That Does This And That", and that actually finds me the paper, is enticing. I say this because I abandoned Zotero once the list of refs became large enough that I could never find anything quickly.
Speaking of conferences, might this not be the way to judge this work? You could imagine only orally defended work being publishable, or at least carrying the prestige of vetting, in a bit of an old-school science revival.
If I may be the Devil's advocate, I'm not sure I fully agree with "The hard part always has been, and always will be, understanding the research context (what's been published before) and producing novel and interesting work (the underlying research)".
Plenty of researchers hate writing and will only do it at gunpoint. Or rather, delegate it all to their underlings.
I don't see an issue with generative writing in principle. The Devil is in the details, but I don't see this as much different from "hey grad student, write me this paper". And generative writing already exists as copy-paste, which makes up like 90% of any random paper given the incrementality of it all.
I was initially a little indignant at the "find me some plausible refs and stick them in the paper" section of the video but, then again, isn't this what most people already do? Just copy-paste the background refs from the introduction of a colleague's last paper, maybe add one from a talk they saw in the meantime, plus whatever the group & friends produced since then.
My experience is most likely skewed (as all are), but I haven't met a permanent researcher that wrote their own papers yet, and most grad students and postdocs hate writing. Literally the only times I saw someone motivated to write papers (in a masochistic way) were just before applying to a permanent position or while wrapping up their PhD.
Onto your point, though, I agree this is somewhat worrisome in that, by reaction, the barrier to entry might rise by way of discriminating based on credentials.
I'm also not sure why so many people are vehemently against this. I would bet that at least 90% of researchers would agree that the writing up is definitely not the part of the work they prefer (to stay polite). As you mentioned, the work is usually delegated to students, and those students already had access to LLMs if they wanted to generate it.
In my opinion, most of these tools become problematic when people use them without caution. Unfortunately, even in the sciences, people are not as careful and pragmatic as we would like to imagine, and a lot of people cut corners, especially in those "lesser" areas like writing up and presenting the work.
Overall, I think this has the potential to reshape the publication system, which is long overdue.
I am a rather slow writer who certainly might benefit from something like Prism.
A good tool would encourage me, help me while I am writing, and maybe set up barriers that keep me from taking shortcuts (e.g. pushing me to re-read the relevant paragraphs of a paper that I cite).
Prism does none of these things - instead it pushes me towards sloppy practices, such as sprinkling citations between claims.
Why won't ChatGPT tell me how to build a bomb but Prism will happily fabricate fake experimental results for me?
LaTeX is already standard in fields that have math notation, perhaps others as well. I guess the promise is that "formatting is automatic" (asterisk), so its popularity probably extends beyond math-heavy disciplines.
The representation might not need to explicitly encode "meaning", if it does so implicitly by preserving essential properties of how things relate to each other.
For instance, a CAD object has no notion of what an airplane wing or a car wheel is, but it can represent them in a way such that how a wing relates to a fuselage is captured in numerical simulations. This is because it doesn't mangle the geometry the user wanted to represent ("what it means", in a geometric sense), although it does make it differ in certain ways that are "meaningless" (e.g. spurious small features, errors under tolerances), much like this representation might do with words.
Back to words, how do you define meaning anyway? I believe I was taught what words "mean" by having objects pointed to as a word was repeated: "cat", says the parent as they point to a cat, "bird", as they point to a bird. Isn't this also equality/correspondence by relative frequency?
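A toy version of that intuition, with invented counts purely for illustration: summarize each word by how often it co-occurs with a handful of pointed-at contexts, and "cat" ends up far closer to "dog" than to "car" without anyone ever defining what any of them mean.

```python
from math import sqrt

# Invented co-occurrence counts: word -> counts over pointed-at contexts
contexts = ["fur", "meow", "bark", "wheel", "engine"]
counts = {
    "cat": [9, 8, 0, 0, 0],
    "dog": [9, 0, 8, 0, 0],
    "car": [0, 0, 0, 9, 8],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

print("cat ~ dog:", round(cosine(counts["cat"], counts["dog"]), 2))  # ~0.56
print("cat ~ car:", round(cosine(counts["cat"], counts["car"]), 2))  # 0.0
```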