
Oga: a new XML/HTML parser for Ruby (2014) - redman25
http://yorickpeterse.com/articles/oga-a-new-xml-and-html-parser-for-ruby/
======
gsnedders
Kinda sadly, a brand new HTML parser that doesn't even follow the HTML spec
(which nowadays defines how to parse any character stream, therefore including
all non-conforming documents). :(

~~~
YorickPeterse
Has this actually posed any problems for you? If so, could you report these as
issues if you haven't already done so?

------
languagehacker
Nokogiri's problems are compounded in that a lot of major Ruby gems still use
it, so sometimes you can't get around the long build time. A new parser is
only one teeny tiny step towards fixing that problem.

These are libraries other businesses now rely upon. Should the maintainers of
such a library risk the stability of the gem (and thus the reputation of the
project) by swapping out such an important dependency? It's incumbent on the
designers of the replacement and the community behind it to make such a case.

------
hackerboos
How does this compare to Ox?

[https://github.com/ohler55/ox](https://github.com/ohler55/ox)

~~~
YorickPeterse
This is briefly covered in
[https://github.com/YorickPeterse/oga#why-another-htmlxml-par...](https://github.com/YorickPeterse/oga#why-another-htmlxml-parser):

> Ox looks very promising but it lacks a rather crucial feature: parsing HTML
> (without using a SAX API). It's also again a C extension making debugging
> more of a pain (at least for me).

Performance-wise, Ox is generally a tad faster.

------
bretthopper
Needs 2014 in the title. Oga has been around for a while now and seems to be
pretty stable.

