Hacker News new | past | comments | ask | show | jobs | submit login

I can't believe this. I had the same idea and I've been working on a demo. The abstract and introduction of this paper could be describing my project, verbatim.

Well, there's nothing more powerful than an idea whose time has come. This is going to shake up anything that operates on hierarchical data in passes, because traditional structure layouts have everyone doing a lot of pointer chasing. That approach got old when CPUs got fast.




Now that I've read the full paper: they've have taken all the research novelty out of my sails. I'm amazed how parallel our designs are. I guess once you've gone down the rabbit hole of committing to operating on a dense serialized format, things like decoupling application data types from concrete representations suggest themselves. I'd be sunk if I were doing this for Science.

But I'm doing this for Engineering, and I like my implementation choices. I have achieved some properties that I think are important for practical integration with other code. The working name of my project is Re, because the core data structure is reified traversals and operating on them involves reifying some other things that aren't normally reified.

Edit to add: my elevator pitch (needs condensing, I haven't pitched it to anyone but my mom):

It used to be common practice to use tree structures to represent linear data. As CPUs have become faster than RAM the pointer-heaviness of this approach has made the constant factors awful, and it's now avoided when possible. It's now actually efficient to do the opposite and represent trees as linear data when you can. This can be done explicitly, e.g. in the WASM format, but such structures can't be operated on directly unless it's done "in the raw", outside the type system and other facilities of the language. What's missing is a generic abstraction layer that maps between operations on conceptual application objects and linearized trees. I've found that once you have that, you can also do things with linearized data that would be impractical with trees.

(Actually, it happened the other way around. Some time ago I prototyped a regex-like engine for finding and replacing fragments of ASTs [https://blog.lambdaverse.org/comacro/], and I discovered operating directly on serialized representation enabled that. I haven't had a chance to make a production implementation yet. Then recently I was working on tree-structured template inference to turn collections of webpages back into the databases that generated them with minimal interaction, and I realized... the comacro construction is a perfect fit here. So it became an excuse to develop that idea ostensibly as part of a project I could actually make money with.)


FWIW I've also gone down a similar rabbit hole, and written a bunch of code too in that direction. Resources here:

https://github.com/oilshell/oil/wiki/Compact-AST-Representat...

I wouldn't be surprised that somebody else thought of the same idea -- that happens all the time. The canonical example is that two people independently invented calculus :)

https://en.wikipedia.org/wiki/Multiple_discovery

I also found that the academic projects Morbig/Colis and Smoosh independently did some of the same things I'm doing Oil, at pretty much the exact same time (2016-2017).

Links here: https://github.com/oilshell/oil/wiki/ExternalResources

Some things are just "in the air". I think there are probably a lot more precursors to what you were doing than you were aware of (some of which may be on that wiki page -- I also have a private wiki page called "Avoiding Pointers" with some more academic papers, if you're interested).

I mention Ken Thompson's encodings of regexes, and that was in the 60's. He didn't motivate it by "avoiding pointers", and it's specific to a particular application rather than a general purpose language, but the idea is the same.

The code I wrote was inspired by capnproto to a large extent, and the author chimed in here with the similarities to this language. protobufs/capnproto are basically a language-independent type system, and it's not that big a jump to go from a type system to a general programming language.


Funny, because I been kicking around an idea to remove pointers entirely form a language, the compiler and the implementation. Maybe even the ISA. So heavy, literal, concrete, absolute. Pointers do not go with the flow.

Pointers are a junk-food abstraction that needs to go away.


It sounds like your work would make a great Show HN once it's ready for people to try. If you want to do that, email hn@ycombinator.com when you're ready and we'll give you some tips.


@firethief -- glad to hear you're interested in this idea too. Do you have an implementation for Re that we could learn more about? Maybe we could swap some ideas.




Applications are open for YC Summer 2021

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: