Improved Doxygen documentation and search for C++ projects (magnum.graphics)
54 points by mosra on Jan 3, 2019 | 32 comments

I have installed Doxygen on a handful of projects. In hindsight, it was a waste of time. No one reads generated codedoc; everyone just reads the code.

Doxygen should be replaced with rendered comments in IDEs/code editors. That's the real solution.

Exactly. Doxygen implicitly generates useless info about what file is included by what file, list of places from where a function gets called, what line was this and that function declaration in (and where is the definition), huge class inheritance diagrams, entangled monstrous file dependency diagrams, alphabetical file index, alphabetical symbol index, including every possible undocumented symbol and file it can find and tons and tons of other stuff that has a total value of 0.

The user should visit docs to get to know a high-level overview of a library or human-readable explanation of an algorithm. Not to see stuff that's already explained by the code itself.

So here I threw away all this noise, and the theme actively forces library authors to focus on the important stuff in the docs, explicitly excluding useless things that could be auto-generated. "Installing Doxygen on a project" achieves nothing; one has to write the actual docs first.

The result? See for yourself: https://doc.magnum.graphics/magnum/namespaceMagnum_1_1Animat...

I find the caller and callee stuff kind of useful: if I'm handed a legacy project that is reasonably well structured, I can get an overview of the code flow.

Doxygen can generate a lot of data most projects don't need (and may not have the best defaults). Especially for external docs.

There's a lot of value in shaping/limiting options and choices to make Doxygen easier to use for focused, high-quality external docs, but I'm not sure it follows that the other information it can generate is useless.

For the record, nobody complained when I removed all the class diagrams and other things mentioned above, so my takeaway was that ... yes, those features were useless :) On the other hand, people complained about lack of essential features that Doxygen didn't have, like proper search or #include information for non-class members.

Sure. You're doing useful work shaping something that is a bit too free-form for most. I didn't question your curation. Nor do I think your project needs to support them.

But the fact that you haven't heard from the inevitably short list of projects that have looked for new documentation tools in the past ~14 months and also happen to be complex enough to benefit from niche features like inheritance diagrams doesn't mean they don't exist.

The problem isn't that no one reads generated codedoc, the problem is that doxygen is a pretty abysmal code generator.

If you look at other languages, automatically-generated code documentation has become the main way people ingest documentation for software projects. Javadoc was the first major such tool; if I'm trying to use a Java library, I don't read the source code, I start by reading the generated documentation. If you've ever used https://docs.oracle.com/javase/8/docs/api/, you've used the results of Javadoc. Similarly, Rust also has a widely-used builtin documentation tool (and again, the official standard library documentation is entirely the result of rustdoc).

C/C++ of course don't have a builtin tool. Doxygen attempts to fill that same gap, but it also suffers from trying to support every language under the sun, as well as the inherent issues of C/C++ (the use of the preprocessor to generate module boundaries does not make for fun processing). I also find that the default settings tend to make too many graphs that aren't really useful but take up a lot of visual space (I've never found collaboration graphs to be helpful), and its approach to trying to figure out what things are types gets pretty sketchy (why does it think that unsigned is a class?). To top it off, I was never impressed with its design aesthetic.

In short, the fault of doxygen is that it's not designed in close integration with the language parser to ensure a highly correct understanding of source code, which makes it tend to fall over if anything is moderately unusual or confusing. And since the general experience with its output tends to be negative, there isn't much effort in trying to maintain quality documentation comments in the underlying code in the first place.

I agree that the default settings create a lot of useless output, but disagree with the rest of your comment.

There's nothing inherently wrong with the code Doxygen generates or with its C++ parser. I've worked on large projects using crazy Boost pre-processor macros and plenty of C++14 features and Doxygen handled it all fine.

IME the number one problem with Doxygen is that the people writing the doc comments do a poor job. It's garbage in, garbage out.

The Java documentation is great because the people who write it put in a ton of effort to actually explain how to use their API, with examples, hyperlinks to other modules, version information, compatibility issues, and everything else. They're not adding trivial @param and @return comments as an afterthought.

Unfortunately, a lot of Doxygen documentation seems to be the trivial kind that tells you "@param foo The integer parameter foo" but doesn't give any clue how to use the library, class, or method.
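To make the contrast concrete, here is a minimal sketch of the two kinds of doc comment on the same function. The `downsample()` and `downsampleTrivial()` functions are hypothetical, made up purely for illustration; only the Doxygen commands (`@brief`, `@param`, `@return`, `@code`, `@see`) are real.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/**
 * @brief Keeps every `factor`-th sample, starting with the first one.
 *
 * Intended for quick previews of large signals. No low-pass filtering is
 * applied, so anything above the new Nyquist frequency will alias.
 *
 * @param samples Input signal. May be empty, in which case the result is
 *                empty as well.
 * @param factor  Decimation step. Must be at least 1; a value of 1 returns
 *                a copy of the input.
 * @return A new vector with `ceil(samples.size() / factor)` elements.
 *
 * @code
 * std::vector<float> preview = downsample({0.f, 1.f, 2.f, 3.f, 4.f}, 2);
 * // preview == {0.f, 2.f, 4.f}
 * @endcode
 *
 * @see downsampleTrivial() for the same behavior with uninformative docs
 */
inline std::vector<float> downsample(const std::vector<float>& samples,
                                     std::size_t factor) {
    assert(factor >= 1 && "factor must be at least 1");
    std::vector<float> result;
    result.reserve((samples.size() + factor - 1) / factor);
    for (std::size_t i = 0; i < samples.size(); i += factor)
        result.push_back(samples[i]);
    return result;
}

/**
 * @brief Downsamples.
 * @param samples The samples.
 * @param factor The integer parameter factor.
 * @return The result.
 */
inline std::vector<float> downsampleTrivial(const std::vector<float>& samples,
                                            std::size_t factor) {
    // Same behavior; the comment above just restates the signature.
    return downsample(samples, factor);
}
```

Both comments produce a page in the generated docs, but only the first answers the questions a caller actually has: what happens with empty input, which values of `factor` are valid, and what the result looks like.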

Look at LLVM's doxygen, say, http://www.llvm.org/doxygen/classllvm_1_1MachineRegisterInfo.... You see how things like const, bool, and unsigned are all believed to be classes? Go further down, and you can find a definition of a global variable Reg, which is from a case where Doxygen managed to think that the arguments of a function are global variables.

That's just the stuff I have readily available; there have been several times where I tried to use Doxygen to figure something out, only to discover that it was horribly confused and worse than useless (since it looks like it can tell you the information, but it doesn't figure it out).

I used Doxygen for a CUDA project, and it had quite a hard time with it. Due to the syntax differences, it would spit out warnings or errors. Some could be ignored, and some couldn't. I believe it's gotten better over the last year, though.

> In short, the fault of doxygen is that it's not designed in close integration with the language parser to ensure a highly correct understanding of source code...

I'm fairly certain that doxygen uses libclang[0] for a correct parse of the code these days. 0: http://clang-developers.42468.n3.nabble.com/Doxygen-1-8-4-us...

It's an optional feature and isn't built by default. See http://www.doxygen.nl/manual/config.html#cfg_clang_assisted_...

I have found Doxygen useful, but only in the context where you don't have the source code. Qt has a Doxygen-like tool, which means I don't have to understand how the source is structured to know how to use it. I know of other closed-source projects (which I'm not allowed to name) that similarly use Doxygen-like things so that you don't need the source code.

Note that for it to be useful, someone needs to make it their full-time job to edit it and make sure there are good examples.

Definitely. Maintaining the Magnum documentation is the most time-consuming portion of the development (tests are second, the actual code takes the least amount of time), but so far it was very worth it.

This says more about C++ and Doxygen specifically than about code documentation in general. In Python, Ruby, Java, Rust, Go, etc. the documentation is where people tend to go first.

Generated documentation can be really really useful, but in my experience and opinion it needs to be a custom documentation generator. That's the approach I used with Autumn [1], where the generated HTML on the website is the same as in the app, so you can get a feel for it without trying the app out. The docs are concise, clear, good looking, organized in an easily navigable and understandable hierarchy, and with clickable references. I think what I came up with beats out commented code by far, although I did start to wonder if I should render the English portion of each node in my doc tree to be styled as a code-comment, just to draw more attention to the actual code signature itself.

[1] https://sephware.com

I've been writing a lot of medium-level (highly technical, internal) documentation lately, and it seems that Doxygen could make a good "glue" between this and the code itself. When describing a process, for example, being able to link to the entry point would be handy; when creating an enum cheat sheet, being able to link to the definition (to make sure it's correct) would be nice. I think the value is there rather than in pure API documentation, which is only really useful for actual APIs and not internal projects. It's a somewhat similar approach to literate programming using tools like Emacs org mode.

I haven't converted much of my documentation to doxygen yet, so I'm not sure how well it works in practice but so far it seems like a promising approach.

Also, when you work with OO junkies, features like INLINE_INHERITED_MEMB create some useful and much more navigable projections.
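As a rough illustration of what that option changes, consider a small hierarchy like the one below. The `Node` and `Group` classes are hypothetical, invented for this sketch. With `INLINE_INHERITED_MEMB = YES` in the Doxyfile, Doxygen shows inherited members in the documentation of the derived class as if they were its own (constructors, destructors and assignment operators of the base are not shown), so the page for `Group` would also list `name()`.

```cpp
#include <cassert>
#include <string>
#include <utility>

/** @brief Common interface for anything that can appear in a scene tree. */
class Node {
public:
    explicit Node(std::string name) : _name(std::move(name)) {}
    virtual ~Node() = default;

    /** @brief Human-readable name shown in the scene tree. */
    const std::string& name() const { return _name; }

    /** @brief Number of direct children; leaf nodes report zero. */
    virtual int childCount() const { return 0; }

private:
    std::string _name;
};

/**
 * @brief A node that owns a fixed number of children.
 *
 * Doxyfile assumption for this sketch: INLINE_INHERITED_MEMB = YES, so
 * name() from Node is listed on this class's page as well, and a reader
 * does not have to click through to the base class to find it.
 */
class Group : public Node {
public:
    Group(std::string name, int children)
        : Node(std::move(name)), _children(children) {}

    /** @brief Number of children this group was created with. */
    int childCount() const override { return _children; }

private:
    int _children;
};
```

In a deep hierarchy this flattening is what makes the per-class page usable on its own, at the cost of longer pages.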

Especially in Python applications I tend to try to write code that you can read top to bottom like a text and figure out what it does. This is largely destroyed by using an external documentation tool (whether that's Doxygen or Sphinx is irrelevant), because those only extract a very tiny and fixed amount of information from the code. Some docgens even reorder entities alphabetically (by default), which is basically heresy.

I think it would be very interesting if e.g. Sphinx were able to generate pages where the highlighted and crossreferenced code is interleaved with the processed markup from those parts it extracts today (docstrings etc.). Heck, even being able to just tell it to only generate the source code pages and crossref to those instead would be better for documenting applications.

It's slightly off topic, but I used Doxygen for my Hexagon Library written in C# [0] and was quite content with it, aside from generic classes with different parameters ( GenericClass<T> and GenericClass<T1,T2> ) not working properly. Maybe I was just too stupid. I liked the ease of use compared to the other code doc generators I tried. However, rendered comments inside the IDE would be so much better; it's kind of crazy when I think about it that this is not a feature of Visual Studio (but a 3D editor is...)

[0] https://aurelwu.github.io/

That's a well-written and nicely organized documentation, congrats :)

Btw., I'm now working on C# support for this documentation theme, but by using XMLDoc directly instead of Doxygen (and thus also with none of its bugs). See here: https://github.com/mosra/m.css/issues/76

For the IDE docs integration, I know about QHP (by Qt), which is supported by QtCreator and KDevelop and has VS support via the Qt VS Tools plugin. But that's C++. Weird that VS doesn't have something similar builtin for C#...

FYI: Resharper can render XML comments (place the cursor on the symbol of interest and hit CTRL+SHIFT+F1). This includes the entire comment, not just the <summary> section.

I agree. Doxygen adds zero value. If I need to know what methods a class has, I have Intellisense/Roslyn for that, without even leaving Emacs or my IDE.

What needs to be in comments (and frequently isn't IME) is intent.

With Doxygen or other code generation tools you can add images, though, which can help users understand your API more easily. Not saying images are always helpful, but there are definitely use cases. Natively adding images into comments would be a super useful feature for every IDE, I think.

edit: actually there seems to be an extension for Visual Studio for example: https://marketplace.visualstudio.com/items?itemName=MsBishop...

I feel silly for never searching for that ... but then, unlike with Doxygen, everyone needs to have that extension.

I have my own gripes and struggles with Doxygen, but I think it's useful to realize that it's a documentation-generation toolkit that can serve many purposes.

I agree it's usually not the best tool for generating outward-facing API documentation for languages with native documentation tools (though mosra is demonstrating that it's not incapable).

With several different configs, however, Doxygen can generate different docsets that can be tailored to multiple audiences that need to understand different sections and facets of a sprawling codebase.

In FOSS libraries maybe, on commercial ones, not so much.

Magnum truly has the most beautiful/polished doc I've ever seen.

Even more impressive considering it's a C++ project.

I never heard of it before but boy do these docs look and read well on a first skim. Thoroughly impressed in just a few seconds.

Sphinx (http://www.sphinx-doc.org) used by Python and some C++ projects like LLVM provides a modern high-quality alternative to Doxygen.

Not really an alternative, since C++ projects often use Breathe to "transpile" Doxygen to Sphinx. I even contributed a few patches to Doxygen from which the Breathe/Sphinx project directly benefited, but after trying it out a bit and hitting a few walls, I realized I would achieve better flexibility by having my own frontend.

Especially because (if I can dare to say) I ended up with a much better search than what Sphinx (or Doxygen) has -- try it out: https://doc.magnum.graphics/magnum/?q=max#search ;)

Yes, the search functionality in Sphinx leaves a lot to be desired. Even if you are already writing your own Sphinx plugin, it is fairly difficult to add stuff to the search indices in a meaningful way. It's also slow. Overall, Sphinx suffers a lot from hardcoding the reST data model and directly working with that.

You don't have to use Doxygen with Sphinx, although you can.

The DPDK documentation uses Sphinx, and it's great. For a massive codebase like that, it's a nice addition compared to what an IDE would do.
