samgriesemer's comments (Hacker News)

The md-to-html demo is a good one, but it's worth mentioning that the Markdown parser[1] being used may not be suitable for more complex documents. From the README:

> "...it is not recommended to use this parser where correctness is important. The main goal for this parser is to provide syntactical information for syntax highlighting..."

There are also separate block-level and inline parsers; I'm not sure how `tbsp` handles nested or multi-stage parsing.

[1]: https://github.com/tree-sitter-grammars/tree-sitter-markdown
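For anyone unfamiliar with the split, here's a toy sketch (hypothetical code, not `tbsp` or tree-sitter-markdown) of the two-stage design: a block pass carves the document into block nodes, then an inline pass runs separately over each block's text:

```python
import re

def parse_blocks(doc: str):
    """Stage 1: split the document into (kind, text) block nodes."""
    blocks = []
    for chunk in doc.split("\n\n"):
        chunk = chunk.strip()
        if not chunk:
            continue
        if chunk.startswith("#"):
            blocks.append(("heading", chunk.lstrip("# ")))
        else:
            blocks.append(("paragraph", chunk))
    return blocks

def parse_inline(text: str):
    """Stage 2: mark up emphasis spans within a single block."""
    return re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)

doc = "# Title\n\nSome *emphasized* text."
tree = [(kind, parse_inline(text)) for kind, text in parse_blocks(doc)]
```

The question for `tbsp` is whether its queries see one unified tree or have to be run against each stage's output separately.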


Even worse, the README implies tree-sitter simply isn't going to work for Markdown at all[1]; this isn't a matter of a little polish and bug fixing:

> These stem from restricting a complex format such as markdown to the quite restricting tree-sitter parsing rules.

[1]: Outside of something like a tree-sitter v2 with much richer grammar support. And frankly, I personally don't think building more complex grammars in JavaScript+C is a good way forward.


Small thing, but the blurb on the README says

> While the system cannot produce publication-ready articles that often require a significant number of edits, experienced Wikipedia editors have found it helpful in their pre-writing stage.

So it can't produce articles that require many edits? Meaning it can produce publication-ready articles that don't need many edits? Or can it not produce publication-ready articles at all, because the articles it produces require many edits? I can't make sense of this statement.


It gives you a draft that you should keep working on, for example by fact-checking it.


It's explained more in the "read paper" link, where they provide the actual prompts:

https://openaipublic.blob.core.windows.net/neuron-explainer/...


Very cool work but I'm a bit perplexed by their first example/diagram from the blog post (which is presumably cherry-picked?). The event "The dogs are waiting." overlapping with the event "The dogs are pulling the sled." seems like a poor joint labeling of the events. The two obviously cannot co-occur, and this feels like a pretty easy opportunity for the model to demonstrate its understanding of event disentanglement.

The remaining examples from the paper don't do much to convince me this is a one-off issue. The recognition of multiple events globally is good, but perhaps extra care should be taken at overlapping event boundaries (e.g. additional local constraints in the loss/regularization scheme that encourage event splitting, or time boundaries "snapping to grid" when the confidence of co-occurrence is low).
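To make the snapping idea concrete, here's a hypothetical post-processing sketch (my own illustration, not anything from the paper): when two predicted events overlap but the model's co-occurrence confidence is below a threshold, split the overlapping region at its midpoint so the events become disjoint:

```python
def snap_boundaries(a, b, co_occur_conf, threshold=0.5):
    """a, b: (start, end) time intervals with a starting no later than b.
    If they overlap and co-occurrence confidence is low, force them to
    be disjoint by snapping both boundaries to the overlap midpoint."""
    (a0, a1), (b0, b1) = a, b
    if a1 <= b0 or co_occur_conf >= threshold:
        return a, b  # no overlap, or the overlap is plausible
    mid = (b0 + a1) / 2  # midpoint of the overlapping region
    return (a0, mid), (mid, b1)

# "The dogs are waiting." vs "The dogs are pulling the sled.":
# low co-occurrence confidence, so the intervals get separated.
ev1, ev2 = snap_boundaries((0.0, 6.0), (4.0, 10.0), co_occur_conf=0.1)
```

A learned version of this could sit in the loss as a penalty on overlap weighted by (1 − co-occurrence confidence), rather than as a hard post-hoc rule.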


You might take a look at The Bitter Lesson [1]; it's referenced by the article and linked elsewhere in this thread.

> One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.

> The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries.

[1]: http://www.incompleteideas.net/IncIdeas/BitterLesson.html


I did take a look at the article before writing my comment, and I disagree with its premise as well as its conclusion. The human mind has a variety of specialized functions. Our visual cortex performs a sort of convolution over a 2D field sampled by three types of color-sensitive cones and luminance-sensitive rods. If specialization were less powerful than generality, why isn't our brain one giant lobe with no diversity in neuron topography?

The idea that specialization is not as powerful as computation fails the most basic test of a proactive, rather than retroactive, theory. Can you make proactive claims about what works in any given domain? Is the solution to take the most compute-hungry algorithm and apply it? What about feature engineering, cleaning, parameter tuning, analysis, etc.? Is the most power-hungry solution still the most effective? In my opinion, part of the reason humans aren't just giant computation blobs is that we thrive on constraints (physical, sexual, emotional).

