We Built a C++ Rendering Engine for the Web (opendesign.dev)
108 points by PetrBrzyBrzek 6 months ago | 45 comments

Parsing proprietary, undocumented formats, turning them into a universal format, and rendering it? That sounds like the kind of ambition that is never realized, but they did it. And this kind of project tends to fail because it's filled with boring tasks with unreasonable corner cases. So congrats!

And according to their page [1], they've been doing this for a decade. This is such an accomplishment.

[1] https://www.notion.so/The-Story-of-Avocode-d96c279b270443f88...

> Parsing

But why? I thought PSD was all-binary...

I love reading about projects like this. They are in a class of projects that are 100% slog and not very sexy but make things a lot better for a lot of people.

Better outputs for more inputs in better time. Beautiful.

Reminds me of Photopea... Oh, I wonder whether they are in competition? Probably not since Photopea seems to be more about image editing and Avocode is more about design and handoff.

"the notoriously opaque Photoshop format"

What? It's been well documented forever. 2nd hit on Google.


It's also the one with the famous rant, "PSD is not my favourite file format": https://github.com/gco/xee/blob/7aec0d65f776fa59c58eb6cf163b...

Well, this doc is from 2019; a lot of things may have changed.

As a Linux user, I find there is plenty of software that can render it perfectly but cannot edit it, and I really don't get it. As a web developer I have to use Photopea to extract assets from PSD files. It's really strange that a web application can edit PSD while a lot of powerful software like GIMP and Krita has the exact same problem with layers and masks in modern .psd files, so my guess is that it's because Photopea is designed to be compatible with PSD files. I suspect the format is incompatible with other software by design, and not because it has a poor specification.

you suspect wrong. I've written Photoshop readers since 1993. They all still work. Photoshop has never broken their format

If I remember correctly, PSD files have an optional header (when files are saved in "Compatibility Mode", which I think is the default) in which is encoded the rasterized image. It's then "easy" to display the image, but very difficult to edit it, as you then need to implement the whole renderer.

PDF is also well documented by that measure, i.e. the complexity of said specification does matter.

Isn't that just... A browser?

Title is misleading: it's a PSD renderer for web browsers, not a web browser's renderer.

I really don't think it was a good idea to write this in C++, for security reasons. It accepts external inputs in some complex formats, and they also have a web server version. This makes it highly vulnerable to all kinds of attacks, such as buffer overflows.

As a professional C++ programmer I feel we, as a group, are constantly under-estimating the complexity of tasks, and over-estimating what can get done with the standard library.

C++ is a very powerful, unopinionated language, that gives you a lot of freedom to attack your problem domain the way you best see fit.

If you're writing a networked application, don't use POSIX sockets, which have an API designed for C, go and find a higher level library. If you're parsing complex text formats, don't iterate over buffers with char*'s, go pick up PEGTL[0]. If you're working on graphs, or need to properly index in-memory data, go pick up Boost[1][2]. If you need a GUI, go pick up Qt.

It's extremely common in C++, due to the lack of a universal package management solution, for people to try and "muddle through" and do shit themselves when it's far outside their core competency.

At one of my last employers, the core product was parsing JSON with std::regex, simply because they couldn't be bothered to integrate a JSON library (which can be done header-only).

[0] https://github.com/taocpp/PEGTL

[1] https://www.boost.org/doc/libs/1_76_0/libs/graph/

[2] https://www.boost.org/doc/libs/1_76_0/libs/multi_index/doc/i...

What's so wrong with POSIX sockets? It's maybe not an elegant API, but not because "it's designed for C". Its problems are, for example: 1) it's doing too much abstraction (sockets); 2) you have to deal with some arcane data structures and ancient formats (endian conversions).

But, it does its job well enough: allowing the user to send and receive packets from the network.

If you're writing a networked application that works, chances are you either don't give a sh*t what API to use as long as it lets you send and receive packets (and thus you go with the relatively portable POSIX sockets (at least for Linux / WinSock2 on Windows)), or you use a lower-level API (probably proprietary) to reduce syscall overhead / get more control.

If you're parsing text, chances are all you need is fread() to read in the next chunk from a file, and from this you'll build a "next_byte()" function and then a "next_token()" function on top.
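A minimal sketch of the layering described above: fread() refills a chunk, next_byte() hands out bytes from it, and next_token() is built on top. The `ByteReader` type, the 4096-byte chunk size, and the whitespace-delimited token rule are assumptions made for illustration, not anything from the thread.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical reader: refills a fixed chunk with fread() and hands out bytes.
struct ByteReader {
    std::FILE* file;            // not owned; the caller opens and closes it
    unsigned char chunk[4096];
    std::size_t len = 0;        // number of valid bytes in chunk
    std::size_t pos = 0;        // index of the next unread byte

    explicit ByteReader(std::FILE* f) : file(f) { assert(f != nullptr); }

    // Returns the next byte (0..255), or -1 once the input is exhausted.
    int next_byte() {
        if (pos == len) {
            len = std::fread(chunk, 1, sizeof chunk, file);
            pos = 0;
            if (len == 0) return -1;
        }
        return chunk[pos++];
    }
};

// Reads one whitespace-delimited token into `token`.
// Returns false when no further token is available.
bool next_token(ByteReader& in, std::string& token) {
    token.clear();
    int c = in.next_byte();
    while (c != -1 && std::isspace(c)) c = in.next_byte();   // skip leading whitespace
    while (c != -1 && !std::isspace(c)) {
        token.push_back(static_cast<char>(c));
        c = in.next_byte();
    }
    return !token.empty();
}
```

The whitespace byte that ends a token is consumed along with it, which is fine for this token rule but would need a one-byte pushback for grammars where the delimiter is itself meaningful.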

(I've done a lot of network code as well as parsing code, and the I/O API is among the least of my concerns).

All these fancy bottom-up kitchen sink libraries implementing "proper abstractions" or whatever do not provide any value past being able to be combined to form barely working and un-fixable applications where you will pull your hair out when you actually need some control over what's happening.

For something better, you'll need exactly this from external libraries: a clean programmatic (function call) interface that gives you control at a reasonable level of abstraction.

It does the job so well that in later releases of Apple, Google and Microsoft OSes, the new networking features aren't available to classical POSIX sockets.

Not disagreeing, that's kind of what I said. However it's still chugging along just fine and works well enough for basic applications (like 99% of applications?).

What features do you allude to if I may ask?

In any case, one can't solve any of these "problems" by abstracting over this API ;-)

POSIX sockets aren't portable, period.

Putting aside the super obvious problem that there's no common way to use them asynchronously across platforms, and that file descriptors are the wrong abstraction for TCP connections, they are riddled with more obscure issues:

- Linger behavior varies by platform

- Even simple non-blocking behavior varies by platform.

- Common options like enabling TCP keep-alives, or setting buffer sizes, varies by platform.

- More often than not, in modern times, you also want TLS... and that's not available portably across platforms either, and is a whole new awful API to learn (if you choose to use OpenSSL directly).

- No RAII, means resource leaks (in C++).

Using the raw BSD sockets APIs as a starting point for any portable application in 2021 is fucking insane. There's a reason why Python has the 'asyncio' module now and Go has the net module and goroutines.
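For the RAII item in the list above, a minimal sketch of what a C++ owner for a socket descriptor could look like on a POSIX system. `UniqueFd` and `open_tcp_socket` are hypothetical names introduced here; only socket() and close() are real API calls.

```cpp
#include <utility>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical move-only owner for a socket (or any other) file descriptor.
class UniqueFd {
public:
    UniqueFd() = default;
    explicit UniqueFd(int fd) : fd_(fd) {}
    ~UniqueFd() { reset(); }

    UniqueFd(const UniqueFd&) = delete;
    UniqueFd& operator=(const UniqueFd&) = delete;

    UniqueFd(UniqueFd&& other) noexcept : fd_(std::exchange(other.fd_, -1)) {}
    UniqueFd& operator=(UniqueFd&& other) noexcept {
        if (this != &other) {
            reset();
            fd_ = std::exchange(other.fd_, -1);
        }
        return *this;
    }

    int get() const { return fd_; }
    bool valid() const { return fd_ >= 0; }

    // Closes the descriptor, if any, and marks the wrapper empty.
    void reset() {
        if (fd_ >= 0) {
            ::close(fd_);
            fd_ = -1;
        }
    }

private:
    int fd_ = -1;
};

// Opens a TCP socket; the returned wrapper is invalid if socket() fails.
UniqueFd open_tcp_socket() {
    return UniqueFd(::socket(AF_INET, SOCK_STREAM, 0));
}
```

With this, an early return or an exception between creating the socket and the intended close no longer leaks the descriptor, which is the failure mode the list item refers to.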

Between Windows and Unix/Linux, the general approach is portable at least. Across various Unix flavors, I'd expect some more portability. Can't say much more since I've never tried to use the same code on multiple platforms.

I'd expect you can easily code one backend per supported platform since the backend specific code can start out (and most likely, stay) fairly minimal, like 100 lines or so.

> Using the raw BSD sockets APIs as a starting point for any portable application in 2021 is fucking insane

I started a Linux POSIX sockets "embedded" server project in 2019 using BSD sockets API (TCP) that is rock-solid even though it has some critical low-latency components in the data path (~10ms).

I also worked on a Windows GUI project in 2020 using WinSock2 (TCP). Then I did several experimental projects on Linux POSIX sockets in 2021, building reliable streams on top of UDP. The platform is not that important, I used non-blocking sockets and moved from recvmsg()/sendmsg() to recvmmsg()/sendmmsg() as an optimization, which is maybe 20 lines more code on the backend.

I wasted several months with the wrong approaches on Windows first. I used WinSock2 with IOCP (asynchronous completion ports) and tried to be super clever with multi-threaded designs (roughly thread-per-connection models) and lots of synchronization, even going into "Fiber" approaches with custom scheduling.

That's all wrong, and I/O is very simple. You place buffers at the connections, then you pump data to/from the buffers on a regular basis. You write plain, simple, procedural code, no threading or any other cleverness needed. All you have to do, just like with files or any other I/O, is get rid of the expectation that you can write "nice" non-blocking code in any way. You just don't do that, it won't work out (except for scripts / batch programs).
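A rough sketch of the "buffer at the connection, pumped on a regular basis" idea, assuming a POSIX system and a descriptor switched to non-blocking mode. `set_nonblocking`, `pump_in`, and `PumpResult` are hypothetical names; the calls underneath (fcntl, read) are the standard ones.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Switches a descriptor to non-blocking mode; returns false on failure.
bool set_nonblocking(int fd) {
    int flags = ::fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

enum class PumpResult { Open, Closed, Error };

// Drains whatever is currently readable from fd into inbox without blocking.
// Intended to be called once per loop iteration for each connection.
PumpResult pump_in(int fd, std::string& inbox) {
    assert(fd >= 0);
    char tmp[4096];
    for (;;) {
        ssize_t n = ::read(fd, tmp, sizeof tmp);
        if (n > 0) {
            inbox.append(tmp, static_cast<std::size_t>(n));
        } else if (n == 0) {
            return PumpResult::Closed;   // peer closed its end
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return PumpResult::Open;     // nothing more to read right now
        } else if (errno != EINTR) {
            return PumpResult::Error;    // EINTR simply retries the read
        }
    }
}
```

The application code then parses messages out of `inbox` at its own pace; a mirror-image `pump_out` draining an outbox with write() would complete the picture.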

I don't see a reason why the story with TLS should be any different (never tried though). It should just be a component that you put between the network buffers and your application code. Something arrives from the network, you shove it to the TLS module. Something arrives from the TLS module, you shove it to the network.

> No RAII, means resource leaks (in C++).

Don't worry - it's just the same as with file descriptors or most other resources. If you're declaring them inline on the stack, something is wrong. Usually there should be exactly one place in the codebase where you're creating / accepting sockets, and one place where you're closing them. There's really nothing to worry about. There's so much C++ RAII zealotry and resource leaking FUD in the wild, but with a systematic approach there's little that can go wrong, plus the code will be so much better structured for it.

> You write plain, simple, procedural code, no threading or any other cleverness needed

Using sockets in a synchronous fashion is one way to block for an indefinite period of time. Once a TCP connection is established, there are failure modes where nothing will notify you that the connection has been lost until you try to write(), and even then after minutes in the worst case. Using sockets without timeouts is nuts. The BSD sockets API doesn't give you timeouts.

>I wasted several months with the wrong approaches on Windows first. I used WinSock2 with IOCP (asynchronous completion ports)

If you'd used Boost ASIO you'd have gotten Windows IOCP under the covers for free.

I honestly don't see an argument here. Defaulting to these low-level primitive APIs is an act of hubris. Boost has HTTP, TLS and WebSockets as well, all under the same async I/O model. Even HTTP/2 is available under ASIO via nghttp2.

> there are failure modes where nothing will notify you that the connection has been lost until you try to write()

You need to either read() or write() on a connection to be informed that the connection was terminated or half-closed. My server application works perfectly, it reacts immediately to any state change. It did not require any special code, just monitoring the read and write ends, which is what one does anyway. (Yep, this is API-specific behaviour of course, but it's the only sane approach IMO, since the termination event must be sent in synchronization with the actual channel interaction.)

Of course, if you're not checking for updates on both directions (read + write) because you're blocked on some blocking interface (either on the same socket or different I/O port or computation), your server won't react. The API is not to fault. The mistake was to write blocking code.

That is the difference between dirty batch scripts and systems programming.

Calling read() won't tell you anything if someone cut the wire or unplugged an Ethernet cable that wasn't directly connected to your machine.

write() won't fail until after a bunch of TCP re-transmit timeouts have passed.

TCP keepalives can help but you have to enable them and, as I said before, doing so is different on different platforms.
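To make the platform difference concrete, here is a small sketch of enabling keep-alives on a POSIX socket. SO_KEEPALIVE is the common switch; the idle-time option is spelled TCP_KEEPIDLE on Linux and TCP_KEEPALIVE on macOS, and Windows uses a different mechanism that is not shown. The function name `enable_keepalive` is made up for this example.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enables TCP keep-alive probes on a TCP socket.
// idle_seconds is how long the connection may sit idle before probing starts.
// Returns false if any setsockopt() call fails.
bool enable_keepalive(int fd, int idle_seconds) {
    assert(fd >= 0 && idle_seconds > 0);
    int on = 1;
    if (::setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0) {
        return false;
    }
#if defined(TCP_KEEPIDLE)       // Linux and some BSDs
    return ::setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                        &idle_seconds, sizeof idle_seconds) == 0;
#elif defined(TCP_KEEPALIVE)    // macOS names the same knob differently
    return ::setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE,
                        &idle_seconds, sizeof idle_seconds) == 0;
#else
    (void)idle_seconds;         // no known idle-time option on this platform
    return true;
#endif
}
```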

Honestly, if you're doing anything remotely interactive or latency sensitive on the same thread as network I/O you need to go async.

For the record, I am of course doing async (Or rather non-blocking) I/O. I explicitly said that it's a mistake to write blocking code. (I made an error somewhere though where I wrote "non-blocking" instead of "blocking").

(And as I described, the "green threads" kind of blocking code is a mistake just as well. And the event-driven kind (i.e. callbacks, like in JavaScript) leads to messes as well; the only way I see it not becoming a mess is pushing stuff to buffers to process messages later in a separate message processing loop with some better context etc.)

> The BSD sockets API doesn't give you timeouts.

Of course you can get timeouts (using select() or any other standard event notification mechanism), and most importantly you can easily get non-blocking socket reads/writes, I did just that.
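A short sketch of the select()-based timeout mentioned above, assuming a POSIX system. `wait_readable`, `read_with_timeout`, and the `kTimedOut` sentinel are hypothetical names; select() and read() are used with their standard signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Waits until fd is readable or timeout_ms elapses.
// Returns 1 if readable, 0 on timeout, -1 on error (see errno).
int wait_readable(int fd, int timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE);   // select() cannot watch larger descriptors
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = ::select(fd + 1, &readable, nullptr, nullptr, &tv);
    if (rc > 0) return 1;
    return rc;   // 0 = timed out, -1 = error
}

constexpr ssize_t kTimedOut = -2;   // sentinel chosen for this sketch

// Reads up to cap bytes, giving up after timeout_ms.
// Returns bytes read (0 = peer closed), -1 on error, or kTimedOut.
ssize_t read_with_timeout(int fd, void* buf, std::size_t cap, int timeout_ms) {
    int rc = wait_readable(fd, timeout_ms);
    if (rc == 0) return kTimedOut;
    if (rc < 0) return -1;
    return ::read(fd, buf, cap);
}
```

This only bounds how long the caller waits for data to arrive locally; it does not by itself detect a silently dead peer sooner than TCP would.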

> If you'd used Boost ASIO you'd have gotten Windows IOCP under the covers for free.

Well, I got Windows IOCP without the covers. Even better, since now I can integrate all IOCP parts in my application, and don't have to separate the ones that are covered (or might be? hard to see when covered, right?) by library A from those that are covered by library B.

But I'd like to see first whether IOCP is strictly needed anyway, synchronous non-blocking reads/writes might give you more than enough performance for most cases.

> Boost has HTTP, TLS and Websockets as well

I don't use Boost on principle. Maybe some of these libraries are usable, but boost is a community of architecture astronauts. Another reason is that I avoid C++ if possible.

> Defaulting to these low level primitive APIs is an act of hubris.

BSD sockets is not low level, if anything it is too high-level. As said, it allows you to send and receive packets. What more could you want? Anything else is snakeoil.

Update: Yep, this seems to be some overarchitected junk that leads to unmaintainable messes: https://www.boost.org/doc/libs/1_75_0/doc/html/boost_asio/ov... The basic primitive, receiving new updates, is not readily available. Instead, you're encouraged to do callback handlers, leading to temporal coupling and ravioli code.

All in the name of optimizing for short syntax in toy examples. Look how much you can do in just 5 lines with automatically inferred types, and praise the RAII! (Never mind that anything moderately complex will require twice the normal amount of code just to unwrap all the insanity.)

> Of course you can get timeouts (using select() or any other standard event notification mechanism)

That's just it, there's no such thing as a 'standard event notification system'. select() is terrible for performance, and all the best options are different on every single platform.

> Instead, you're encouraged to do callback handlers, leading to temporal coupling and ravioli code.

Callbacks are the simplest primitive for async code. If you're not comfortable with them then you're not going to go far with async I/O. Not to mention, ASIO also supports futures and coroutines.

By `standard` I mean a system that works on this platform for any kind of fd or handle.

select() is just fine for simple cases, but of course it has some known problems, such as the FD_SETSIZE limit. There are better APIs, and ultimately it was learned that ring buffers between the user process and the kernel (that remove the need for system calls) as an implementation of asynchronous I/O are a good idea. I.e. IOCP, io_uring, etc.

Often you don't need any of these APIs at all - in a system with a constant ("stochastic") load you don't really need any kind of event waiting system. Instead, you can process all incoming messages every N milliseconds or so.

> Callbacks are the simplest primitive for async code.

No, the simplest thing is to just use a plain old buffer. See e.g. the IOCP API or just any regular buffer code. One side pushes the message to a buffer. Some (arbitrary) time later, the other side (potentially, but not necessarily a different thread) pulls the message from the buffer and handles it.

It's just buffers, buffers are all that is needed, and buffers plainly are the best way to solve all issues related to event handling. No fancy abstract template insanity, no weird generic resource handling systems, no complex scheduling systems, not even a need to declare any kind of event handling function or interface. Just place a few statically allocated buffers at the connection points where threads of execution (OS threads, but also hardware / network etc.) meet.
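One way to read "a statically allocated buffer where two threads of execution meet" is a fixed-capacity single-producer / single-consumer ring. The sketch below is an illustration of that general technique, not the commenter's code; `SpscRing` is a hypothetical name, and it assumes exactly one pushing thread and one popping thread.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Fixed-capacity single-producer / single-consumer queue.
// Exactly one thread may call push() and exactly one may call pop().
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2, "one slot stays empty to tell full from empty");

public:
    // Producer side. Returns false if the ring is full; msg is not stored.
    bool push(const T& msg) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;   // full
        slots_[head] = msg;
        head_.store(next, std::memory_order_release);   // publish the slot
        return true;
    }

    // Consumer side. Returns false if nothing is queued; out is left untouched.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;   // empty
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);     // free the slot
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};   // next slot to write, advanced by the producer
    std::atomic<std::size_t> tail_{0};   // next slot to read, advanced by the consumer
};
```

A ring declared as `SpscRing<Message, 64>` holds at most 63 messages, since one slot is kept free. The consumer can drain it whenever it likes, which is the decoupling in time that the comment argues for.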

Callbacks are of course theoretically equivalent, since they can be made to do the same thing as buffers. You can trivially write a callback that only pushes the message to a buffer. In practice, the difference is significant because lots of callback boilerplate is created and temporal coupling (i.e. same thread, same code path) between enqueuing a message and handling the message is encouraged. This results in a lot of overly complex code, including custom green thread runtimes. I've seen it, I've tried to do the same, I've seen others try to do the same. It turns out to be a very, very bad idea, resulting in the creation of a whole parallel universe with separate green threads I/O implementations.

This is what the term "Callback Hell" was invented for.

Look at Windows Fibers API, it's widely recognized to be a dead end. You will find some good post-mortem material on that topic on the internet.

> Using sockets in a synchronous fashion is one way to block for an indefinite period of time.

I'm not saying to use sockets in a synchronous fashion (i.e. blocking I/O). That would, of course, potentially block the thread indefinitely.

"Plain, simple, procedural" does not imply "blocking I/O". What I mean is to use no fancy types, no callbacks, no crazy automatic scheduling magic. Very simply, there is nothing special required to handle events. Just a buffer.

For example Network.framework in Apple platforms.



See slide 17 on the WWDC 2017 session.

Similarly on the UWP/WinRT based APIs, and on Android the NDK doesn't see the network APIs that are only exposed via Java APIs.

The C++ code is running on WebAssembly, which is no more at risk of buffer overflow attacks than vanilla JavaScript. Worst-case scenario, you could have memory leaks in the C++ code or in the wasm-JavaScript interface (since JavaScript doesn't support finalizers for WebAssembly objects). But this is a usability issue, not a security issue.

> which is no more at risk of buffer overflow attacks than vanilla javascript.

This is quite wrong: a Wasm program can overflow internal buffers due to a missing bounds check and access unrelated data as a result. See Heartbleed for a case where this created a very real vulnerability. The Wasm safe sandbox only protects the boundary with the rest of the system.
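To illustrate the class of bug being described, here is a simplified sketch of a Heartbleed-style "echo" handler. The `EchoRequest` type and `build_echo_reply` function are invented for this example and are not taken from OpenSSL or from the project under discussion. The point is the single length check: without it, a reply built from a raw buffer would include whatever sits next to the payload in the same memory, regardless of whether that memory is a native heap or Wasm linear memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "echo" request: the sender claims how many payload bytes to echo.
struct EchoRequest {
    std::uint16_t claimed_len;           // attacker-controlled length field
    std::vector<std::uint8_t> payload;   // bytes that actually arrived
};

// Copies the echoed bytes into out. Returns false (and leaves out empty)
// when the claimed length exceeds what was actually received.
bool build_echo_reply(const EchoRequest& req, std::vector<std::uint8_t>& out) {
    out.clear();
    if (req.claimed_len > req.payload.size()) {
        return false;   // the bounds check whose absence leaks neighbouring data
    }
    out.assign(req.payload.begin(), req.payload.begin() + req.claimed_len);
    return true;
}
```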

You're right that wasm programs can overflow internal buffers in linear memory, which can be dangerous. However, wasm does a lot more than only protect the boundary with the rest of the system, including

* Safe call stack (opaque / managed by the VM, and so uncorruptible).

* Safe control flow (no jumps to unexpected places).

* Safe(r) indirect calls (only methods in the table can be called, and the signature is verified).

However, wasm also lacks a few things, like the ability to write-protect static data (see "Everything Old is New Again: Binary Security of WebAssembly"). Future wasm proposals will hopefully address those things.

OK, that got down-voted. I thought the basic idea behind HN is that we can have some interesting discussions. Simply down-voting without adding a reply why you disagree with an opinion does not really help.

I have been interested in security for a long time. The number of security vulnerabilities that have been caused by insecure memory management is really huge. Some people will probably claim that this is not a problem with modern C++ because it can remedy these problems. But this assumes that the programmers know all the possible pitfalls. With respect to security, the problem is that when there is only a single weakness in your system it might become a point of attack. With a language like C++ there are many possible weaknesses that simply do not exist in memory-safe languages.

> "Simply down-voting without adding a reply why you disagree with an opinion does not really help."

This looks like a general criticism of using C++ which has nothing to do with the topic of this post. You're free to criticize, but this criticism alone brings absolutely nothing constructive to the conversation and only serves to incite more useless "have you considered writing this in Rust?" conversations. You're not even suggesting what you think they should've used instead.

> "I have been interested in security for a long time"

Here's a tip for you then: security is not an absolute, and things usually aren't as black or white as you might think. Take a moment to consider the fact that C++ is one of only a small handful of languages with which everything around you has been built for the last 30+ years. Do you know something all of those other engineers don't already know? Otherwise, humility goes a long way.

The article title literally has 'C++' in it.

And yet it is more specific.

According to the article, the C++ code is compiled via Emscripten (presumably to WASM, or maybe to asm.js), so it's running sandboxed either in the WASM or JS runtime. Any potential memory corruption caused by unsafe C++ code is contained within the sandbox (which is the whole point of JS and WASM really).

The security implications are exactly the same as writing the code in any other language (including JavaScript or Rust). If the sandbox is buggy, then a "safe" language wouldn't help either.

Just because the attack is contained inside the sandbox doesn't mean it can't do anything, so no, "it's in a sandbox" does not remove all risk automatically.

You're right that it doesn't remove all risk automatically. You can still corrupt data inside the sandbox.

However, wasm has a very clear sandboxing boundary. The ability of an exploit to escape the sandbox is very small if you are careful there.

IIUC the task here is a user that wants to parse their own files. For that, I think wasm's sandboxing (if used properly) is very useful. Especially since in this case it runs on the web and so we also have the browser's additional isolation (a sandboxed process).

Memory safety is incredibly important, but there isn't a simple answer in the space of tradeoffs, at least not for tasks like this. (For things like running an executable on bare metal that parses arbitrary inputs, obviously things are very different!)

A WASM module basically is like an OS process, from security point of view.

So now think what might happen, when not used properly.

Some form of bounds checking should have been part of the design, like memory tagging.

Yes, exactly, otherwise buggy applications wouldn't be a big deal because we could run them on their own dedicated computers.

Section 2.5 of this paper has a good discussion on this: https://cr.yp.to/qmail/qmailsec-20071101.pdf

In a browser environment, all addressable memory accessible to WASM is more-or-less just a javascript ArrayBuffer object. If you can unintentionally break the browser sandbox with buggy C++ code, someone else has almost certainly already compromised your system with malicious plain ol' javascript.

Well, you're not starting off too great by randomly claiming that it's a security issue that it's written in C++. It's not really productive. A lot of software is, for better or worse, written in C or C++, and that's not going to change.

For years people have been yelling: "It's broken because it's written in C/C++". The same "attack" was made to promote Java 20 years ago.

Sure, maybe they could have picked a memory safe language, but they didn't. Perhaps because they know C++ and doing the same project in a language they're just learning would result in a ton of other bugs. They even write that they hired a few brilliant C++ programmers, so chances are that they know how to safely handle memory in C++.

From my point of view, better C++ than C. However, speaking in general terms rather than about this project, adopting best practices for secure coding in C++ still seems to be an uphill battle. I say this as a C++ aficionado.


I disagree with the commenters who are saying your comment was just "C++ bad". The point I took from your comment is that this particular class of application, which is parsing many complex formats from external sources, several of which probably have extensive warts and edge cases, and rendering the output to a browser, is the type of application that may be the most vulnerable to the downsides of C++ when it comes to security.

I think it's a reasonable point, and a step or two above "just use Rust".

You write better than I do.

You're not trying to have a discussion, your post is essentially 'why didn't they use Rust'. Sadly predictable.

There is no-one on HN who doesn't know that C++ has historic memory management difficulties.
