
Let’s build a browser engine (2014) - chiefofgxbxl
https://limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html
======
algorithm314
Another very lightweight engine written in C is Dillo. Unfortunately, its
development has now stopped.

~~~
userbinator
There's also NetSurf.

As much as saying this is probably going to get me a lot of hate from web
developers, the world needs more browser engines. Simpler ones, maybe HTML+CSS
only with no scripting. The idea of the Web as a flexible hyperlinked document
system and not an application platform needs to gain more support. IMHO if
your site is information-centric, and it's not readable in these "document-
only" browsers, you're doing it wrong.

~~~
vedantroy
I don't understand this hatred of Javascript. The only websites I've felt were
actually bloated are news sites with a lot of ads, but that's not a problem
with Javascript as much as it is a problem with excessive ads.

What counts as information-centric? A lot of basic things (commenting,
searching, liking a post) require Javascript. If you want to use pretty
animations, there's a high probability you need Javascript.

Making information-centric sites only use HTML/CSS would significantly
decrease the capabilities and attractiveness of the sites.

~~~
naasking
> The only websites I've felt were actually bloated are news sites with a lot
> of ads, but that's not a problem with Javascript as much as it is a problem
> with excessive ads.

The problem is JS has too much power in the browser, and too little
consideration for security. It can effectively take control away from the
user; there's virtually no way to know what it's doing without source code
audits, which are prohibitive; and the security vulnerabilities are legion.

> What counts as information-centric? A lot of basic things (commenting,
> searching, liking a post) require Javascript.

None of these actually require JS.

~~~
lmcarreiro
> > What counts as information-centric? A lot of basic things (commenting,
> > searching, liking a post) require Javascript.

> None of these actually require JS.

Can you imagine Facebook doing a page reload/refresh every time you click to
like a post? Or not loading more content on demand every time you scroll
the page?

~~~
naasking
> Can you imagine a facebook doing a page reload/refresh every time you click
> to like a post?

You're stuck thinking about Facebook as if it still had long lists of posts
with infinite scroll. The UX would be completely different when the design
constraints are different.

For instance, instead of infinite scrolling, you might show one post at a time
with clickable previews of the last and next posts. A like doing a full
postback isn't a big deal with this approach, particularly with judicious use
of anchors. Certainly not as slick, but perfectly usable.
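
A rough sketch of that "full postback plus anchor" idea, with no script on the page. Everything named here is an assumption for illustration (the `/like` and `/feed` URLs, the in-memory `LIKES` store, the `post-<id>` anchor scheme): the like button is a plain HTML form, and the server answers the POST with a redirect to the liked post's anchor so the browser lands back where the user was.

```python
from html import escape
from urllib.parse import parse_qs, quote

LIKES = {}  # hypothetical in-memory store: post id -> like count


def render_post(post_id: str, text: str) -> str:
    """Render one post whose like button is an ordinary form (no script)."""
    pid = escape(post_id, quote=True)
    count = LIKES.get(post_id, 0)
    return (
        f'<article id="post-{pid}">'
        f"<p>{escape(text)}</p>"
        f'<form method="post" action="/like">'
        f'<input type="hidden" name="post" value="{pid}">'
        f"<button>Like ({count})</button>"
        f"</form></article>"
    )


def handle_like(form_body: str):
    """Apply the like from a urlencoded POST body, then redirect to the anchor."""
    post_id = parse_qs(form_body)["post"][0]
    LIKES[post_id] = LIKES.get(post_id, 0) + 1
    # 303 See Other makes the browser re-fetch the page with GET;
    # the fragment scrolls it back to the post that was just liked.
    return 303, {"Location": f"/feed#post-{quote(post_id)}"}
```

Wiring `handle_like` into an actual HTTP server is left out; the point is only that the round trip is one form submit and one redirect.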

~~~
simplify
Usable, with a worse user experience, to what end? It would be nice to have
more non-js built-in browser behavior, though.

~~~
naasking
A slightly worse user experience on Facebook, for a significantly better and
more secure user experience on the web overall. I'm not sure that's such a
terrible tradeoff.

Agreed on built-in browser behaviour though. Chrome pushing more input types a
few years ago was a great thing.

------
codezero
I loved this walkthrough back when it came out and even made a ham-handed
attempt to write the example code in Go
[https://github.com/radiofreejohn/gobinson](https://github.com/radiofreejohn/gobinson)

It is a bummer it ended with a "to be continued" that never happened; I
remember checking up on it from time to time.

~~~
emmanueloga_
So the blog post ends with a promise of sections about:

* inline layout

* text rendering.

* networking.

* scripting.

LAYOUT

I think here it could be useful to take a look at something like Facebook's
Yoga [1], although maybe it is not a great place to start. There's this little
layout demo that could be interesting to investigate [2].

TEXT

Regarding text, probably the keyword to learn about is "rasterization". I found
a few tutorials back in the day [3], [4]. In general rendering text is a PITA
and you don't want to do it no matter what :-) The only lightweight solution I
found was stb's truetype [5].

So if you want to render text yourself you'll need:

- a parser for a font format (e.g. TTF, PITA) to have a way to get the
curve definitions for each font.

- a rasterizer (could be a fun project that gets you something reasonable,
though probably slow)

- a curve-filling algorithm plugged into your rasterizer, which on its own will
probably only be able to do humble things like drawing filled triangles and
other simple shapes (PITA! :-p)

- a word and paragraph layout engine (PITA again :-)

Overall I'm happy half understanding the steps involved and not having to deal
with all the messy steps in the middle.
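
For a feel of the "humble" end of that list, here is a minimal sketch of filling a triangle by testing each pixel center against the three edges. It is not a glyph rasterizer (no curves, no anti-aliasing), and the function names and the '#'/'.' text output are my own choices for illustration.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: tells which side of the edge a->b the point p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def fill_triangle(width, height, a, b, c):
    """Return text rows marking pixels whose centers lie inside triangle abc."""
    area = edge(*a, *b, *c)  # sign encodes the winding order
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*b, *c, px, py)
            w1 = edge(*c, *a, px, py)
            w2 = edge(*a, *b, px, py)
            # Inside when all three edge values share the sign of the area,
            # so both clockwise and counter-clockwise triangles work.
            inside = area != 0 and all(w * area >= 0 for w in (w0, w1, w2))
            row += "#" if inside else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(fill_triangle(4, 4, (0, 0), (4, 0), (0, 4))))
```

A font rasterizer would first flatten the glyph's curves into line segments and then fill the resulting outline, which is where most of the pain mentioned above comes from.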

NETWORKING

I'm super lazy in this area, and I would rather learn something like ZeroMQ
than POSIX sockets... Back in the day I followed Beej's tutorial, which was
nice [6]. Then you can connect your sockets to some pre-existing HTTP parser,
which are a dime a dozen these days [7], and you are all set! :-p
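
To make the "socket plus HTTP parser" step concrete, here is a minimal sketch using only Python's standard `socket` module. The `fetch` and `parse_http_response` names are mine, and the parser is deliberately naive: plain HTTP only (no TLS), `Content-Length` only (no chunked encoding), no redirects.

```python
import socket


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Send a bare HTTP/1.0 GET and read until the server closes the socket."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def parse_http_response(raw: bytes):
    """Split a raw HTTP/1.x response into (status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, *reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Simplification: trust Content-Length when present, ignore chunking.
    if "content-length" in headers:
        body = body[: int(headers["content-length"])]
    return int(status), (reason[0] if reason else ""), headers, body
```

Usage would look like `parse_http_response(fetch("example.com"))`; a real browser would hand the body to the HTML parser from there.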

SCRIPTING

Not much to say here... hundreds of languages available. Probably Lua is a
nice alternative to JavaScript, but there are also plenty of lightweight
JavaScript implementations these days (I think there is a nice little one
called V7). If you want to learn how to actually implement an interpreter
there are plenty of resources too. If I went this route I would start by
implementing some version of lisp or scheme for simplicity.
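
As an idea of how small such a starting point can be, here is a sketch of a toy Lisp evaluator. It only knows integers, symbols, `if`, `define`, `lambda` and a handful of arithmetic builtins that I picked arbitrarily; it has no error handling and is nowhere near a real Scheme.

```python
import operator


def tokenize(src: str):
    """Split source text into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens and build nested lists; atoms become ints or symbols."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    try:
        return int(tok)
    except ValueError:
        return tok  # anything that is not an integer is a symbol


GLOBAL_ENV = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "<": operator.lt, "=": operator.eq,
}


def evaluate(x, env):
    """Evaluate a parsed expression in the given environment (a dict)."""
    if isinstance(x, str):
        return env[x]  # symbol lookup
    if not isinstance(x, list):
        return x  # integer literal
    head = x[0]
    if head == "if":
        _, test, then, alt = x
        return evaluate(then if evaluate(test, env) else alt, env)
    if head == "define":
        _, name, expr = x
        env[name] = evaluate(expr, env)
        return None
    if head == "lambda":
        _, params, body = x
        # Each call gets a copy of the defining environment plus its arguments.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    fn = evaluate(head, env)
    return fn(*[evaluate(arg, env) for arg in x[1:]])


def run(src: str, env):
    return evaluate(parse(tokenize(src)), env)
```

For example, after `run("(define fact (lambda (n) (if (< n 2) 1 (* n (fact (- n 1))))))", env)` with `env = dict(GLOBAL_ENV)`, calling `run("(fact 5)", env)` walks the recursion through the same evaluator.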

1: [https://yogalayout.com/](https://yogalayout.com/)

2: [https://github.com/randrew/layout](https://github.com/randrew/layout)

3:
[https://github.com/NotCamelCase/RasterizationInOneWeekend](https://github.com/NotCamelCase/RasterizationInOneWeekend)

4:
[http://www.scratchapixel.com/index.php](http://www.scratchapixel.com/index.php)

5:
[https://github.com/nothings/stb/blob/master/stb_truetype.h](https://github.com/nothings/stb/blob/master/stb_truetype.h)

6: [https://beej.us/guide/bgnet/](https://beej.us/guide/bgnet/)

7:
[https://github.com/search?q=http+parser&ref=opensearch](https://github.com/search?q=http+parser&ref=opensearch)

~~~
krapp
>So if you want to render text yourself you'll need (...)

Or possibly use SDL or a similar graphics library meant for game development.

You would still probably have to write a lot of code but you would get a
rasterizer, font loading, simple primitives and image rendering in a cross-
platform compatible API.

~~~
swiley
If you’re going to use a library just use straight FreeType (what SDL_ttf
wraps). It’s very easy to use.

~~~
krapp
Sure, but SDL gives you more on top of just font handling.

I'm just saying, it's not necessary to reinvent all of the wheels.

------
crimsonalucard
I bet it's possible to achieve layout completeness (in the same vein as Turing
completeness) with a much simpler markup and styling language.

I'm actually more interested in implementing a toy OS or a toy compiler for a
toy language because the browser seems to be more of a very specific, complex
mess of mistakes and legacy choices rather than a serious theoretical concept.
Although I'm sure you'll still learn a lot if you implement one.

~~~
emmanueloga_
I don't know about the "completeness" part, but the idea of making a
browser-like rendering system without the burden of having to support old
standards and data formats _is_ appealing. I think this is what projects like
QML [1],
JavaFX [2], Avalonia [3] and more recently Flutter [4] have done (also...
Flash! does anybody remember it anymore? One minute of silence please...).

It is too bad none of these projects decouple the layout and rendering parts
well enough. They are tightly coupled to their respective environments, and
all of them are super idiosyncratic and make cross-language development
almost impossible or impractical (except maybe Qt, but it is not quite the
same model as QML), as opposed to the openness of the web platform. I think
maybe all of them use Skia on the back end though?

Some of these projects are similar to the browser in that they allow
controlling the content using a scripting language (JavaScript for QML). What
do you get when you mix a rasterizer plus a scripting system or VM plus a
layout system plus some IO primitives? Macromedia Flash! Just kidding, you get
a browser. Or either one :-).

1:
[http://doc.qt.io/qt-5/qmlapplications.html](http://doc.qt.io/qt-5/qmlapplications.html)

2: [https://openjfx.io/](https://openjfx.io/)

3: [http://avaloniaui.net/](http://avaloniaui.net/)

4:
[https://flutter.io/docs/resources/technical-overview](https://flutter.io/docs/resources/technical-overview)

With regard to browser specific problems, I came across these articles from
this "Meyerovich" guy talking about some opportunities to optimize and
parallelize the layout process. Honestly a lot of this stuff flies over my
head (formulating the layout problem with attribute grammars? I don't even
know how to use attribute grammars for the "normal" things... :-p). I have the
idea of maybe coming back to these papers some day with more time and patience
and going over the references, etc:

[https://lmeyerov.github.io/projects/pbrowser/pubfiles/paper....](https://lmeyerov.github.io/projects/pbrowser/pubfiles/paper.pdf)

[https://lmeyerov.github.io/projects/pbrowser/pubfiles/login....](https://lmeyerov.github.io/projects/pbrowser/pubfiles/login.pdf)

[https://lmeyerov.github.io/projects/pbrowser/hotpar09/paper....](https://lmeyerov.github.io/projects/pbrowser/hotpar09/paper.pdf)

~~~
Yoric
Another problem would be the delivery. Yes, writing something that behaves
like a browser would be nice, but it took 10-15 years for browsers to be
present on enough computers to make a clear difference, and there is currently
no reason to assume that a new technology that is the-same-but-simpler would
have nearly as much success.

Of course, the solution would probably be to compile that code to something
that works in actual browsers, which would be somewhat ironic :)

~~~
emmanueloga_
I would say, if you managed to cross-compile your new browser into a
statically linked binary executable weighing a few hundred kilobytes, it
would become trivial to distribute and a very attractive “platform”.

~~~
naasking
You could probably easily render it in a regular browser using JS as well.
Instant universal deployment.

------
jpfed
This was a very useful guide for me a couple years ago. I had to print html
documents (thankfully, just a reasonably-well-defined subset) from a .NET
Winforms app, but driving a browser to do it was too slow and flaky. So I used
the OP as the starting point for a C# layout engine.

~~~
aliswe
Is it open source? Thinking about the logic for auto boxing text nodes
adjacent to elements...

~~~
emmanueloga_
Can you elaborate on "auto boxing text nodes" ? Not sure what you mean. Are
you trying to shape paragraphs of text? (like [1]).

By the way, anything having to do with text is more complicated to handle than
one would imagine [2].

1:
[https://github.com/bramstein/typeset#variable-line-width](https://github.com/bramstein/typeset#variable-line-width)

2: [https://xxyxyz.org/line-breaking/](https://xxyxyz.org/line-breaking/)
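
To show where the complication starts, here is a sketch of the simplest strategy, greedy first-fit line breaking. It measures width in characters, which assumes a monospace font; a real engine would sum glyph advances and deal with hyphenation, justification and so on. The function name is my own.

```python
def greedy_wrap(text: str, width: int):
    """Break text into lines, putting as many words on each line as fit."""
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # A word longer than the width still gets a line of its own.
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

Greedy breaking never revisits an earlier line, so it can leave very ragged right edges; doing better means optimizing over the whole paragraph.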

~~~
aliswe
This is what I'm referring to:

[https://www.google.com/search?q=css+anonymous+block+box](https://www.google.com/search?q=css+anonymous+block+box)

I.e., when text is adjacent to a block element, that text is automatically
wrapped in a box which can't be styled or referred to, yet still helps to
lay out the document.

    
    
        <div>
          I'm in a box
          <div></div>
          Me too
        </div>
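
A minimal sketch of that wrapping logic, assuming a simplified model where every node is just "block", "inline" or "text" (the `Node` and `Box` classes and `build_box` are hypothetical names, not from any engine). When a block container has at least one block child, each run of adjacent inline or text children gets wrapped in an anonymous box; blocks nested inside inline elements are not handled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str  # "block", "inline", or "text"
    label: str = ""
    children: list = field(default_factory=list)


@dataclass
class Box:
    kind: str  # "block", "inline", or "anonymous"
    label: str = ""
    children: list = field(default_factory=list)


def build_box(node: Node) -> Box:
    """Build a box tree, adding anonymous boxes around stray inline runs."""
    box = Box("block" if node.kind == "block" else "inline", node.label)
    has_block_child = any(c.kind == "block" for c in node.children)
    for child in node.children:
        child_box = build_box(child)
        if child.kind != "block" and has_block_child:
            # Mixed content: reuse the anonymous box of the current inline
            # run, or start a new one after a block sibling.
            last = box.children[-1] if box.children else None
            if last is None or last.kind != "anonymous":
                last = Box("anonymous")
                box.children.append(last)
            last.children.append(child_box)
        else:
            box.children.append(child_box)
    return box
```

For the snippet above, the outer div ends up with three children: an anonymous box holding "I'm in a box", the inner div's block box, and a second anonymous box holding "Me too".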

------
ape4
Just implementing the <video> tag is a career.

------
reinhardt1053
What happened to Matt Brubeck?

~~~
bugmen0t
He still works on browsers, but blogs less. You can see his contributions on
GitHub.

------
favorited
Should probably say [2014]

~~~
ejanus
Yes.

