Firefox 56 supports headless mode on Windows (bugzilla.mozilla.org)
345 points by sohkamyung on June 22, 2017 | 85 comments

I just finished writing an article that explains how to connect WebDriver to Firefox running in the new headless mode on Windows if anybody's interested: https://intoli.com/blog/running-selenium-with-headless-firef...

It should be pretty similar on Linux, and probably macOS when it comes around.

Great article! I'm the product manager for the headless feature, and I'm unsure why `binary.add_command_line_options('-headless')` doesn't work for you, as the flag does work when invoking Firefox directly:

> "c:\Program Files\Nightly\firefox.exe" -headless

(It also works when Firefox is invoked from a Node.js script in a test suite for a Node.js program that drives Firefox.)

Perhaps this is a Selenium issue?

Thanks! That was my guess as well, since running the command you posted from cmd does work. I dug around selenium's source code a bit, but decided to take the pragmatic path and use the MOZ_HEADLESS environment variable instead.
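For anyone who wants to try the environment-variable route, here is a minimal sketch, assuming the Selenium Python bindings and geckodriver are installed and on the PATH. The helper names `headless_env` and `start_headless_firefox` are made up for illustration; only `MOZ_HEADLESS` comes from this thread.

```python
import os


def headless_env(base=None):
    """Return a copy of an environment mapping with MOZ_HEADLESS=1 added."""
    env = dict(os.environ if base is None else base)
    env["MOZ_HEADLESS"] = "1"
    return env


def start_headless_firefox():
    """Start Firefox through Selenium with headless mode requested via the environment.

    Assumes selenium and geckodriver are installed; the import is lazy so the
    helper above can be used without them.
    """
    from selenium import webdriver

    # geckodriver and Firefox are child processes, so they inherit this variable.
    os.environ.update(headless_env())
    return webdriver.Firefox()
```

Setting the variable in the parent process sidesteps the question of how the client passes command-line options to the binary.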

Is there a linux headless mode already? Selenium chrome driver headless is pretty much useless in linux because important methods like `send_keys` need X anyway for keyboard mappings or something like that.


Support headless flag on Linux, RESOLVED FIXED in Firefox 55

There is a bug filed against the chrome driver send_keys issue, and a workaround...one that requires recompiling though.


recompiling does solve the issue though

If you are blocked it may be worth a look

Genuine noob question: could you not use Xvfb?

The whole point of headless, from what I can tell, is that you don't _have_ to!

You can, and it worked prior to the availability of headless options (Phantom sucks). However, xvfb is yet another dependency that could cause breakage.

It also takes a bit longer to start up IIRC, but xvfb + chrome driver on Jenkins definitely works for running tests.
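A rough sketch of the Xvfb approach, assuming the `xvfb-run` wrapper script is installed (it ships with the Xvfb package on most Linux distributions). The function names and the example test command are hypothetical.

```python
import subprocess


def xvfb_command(test_command, xvfb_run="xvfb-run"):
    """Prefix a command so it runs inside a throwaway virtual X display."""
    # -a asks xvfb-run to pick a free display number automatically.
    return [xvfb_run, "-a", *test_command]


def run_under_xvfb(test_command):
    """Run the command under Xvfb and return its exit code."""
    return subprocess.run(xvfb_command(test_command)).returncode
```

For example, `run_under_xvfb(["pytest", "tests/ui"])` would run a test suite with a virtual display available to whatever browser the tests start.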

You could also install X server and run tests with an entire graphical stack. But if you don't have to....

The ticket is about the Windows version so xvfb is not an option.

It's not very clear how to actually use this in a test environment like Selenium etc. At least with headless Chrome there are libraries now to drive it via the remote debug protocol like https://github.com/LucianoGanga/simple-headless-chrome

This feels a little nicer than Selenium as it's one less layer of abstraction.

EDIT: guess from other comments WebDriver is the right method to access.

SlimerJS [0] and Selenium WebDriver are the main APIs supported at the moment. Setting MOZ_HEADLESS=1 or providing the --headless flag to Firefox will launch in headless mode.

It should be mentioned that this is an area of active development. I don't believe headless mode has been "officially" announced yet; documentation should follow when that happens. Optimization, more APIs, more platforms, and easy deployment are in the pipeline. The meta bug for tracking is here: https://bugzilla.mozilla.org/show_bug.cgi?id=1338004

(Disclaimer: I'm an intern at Mozilla, working with 'brendandahl on headless right now.)

[0] https://slimerjs.org/

The Chrome Devtools API is really useful. However, if you're wanting to use Selenium, a coworker of mine wrote this up to show how to use headless Chrome with Python + Selenium https://duo.com/blog/driving-headless-chrome-with-python

I would imagine the same approach could be used here with minimal changes.

Would people find it helpful if I add a `headless` capability to `moz:firefoxOptions` in geckodriver so that passing in `{capabilities: {alwaysMatch: {moz:firefoxOptions:{headless: true}}}}` when starting a session will start the browser in headless mode, if supported? This doesn't seem like something that's complex enough to warrant a whole blogpost to explain how to get it working.

Oh, apparently there's already a flag you can pass to the binary to enable headless mode, and geckodriver already supports setting those, so adding a separate option seems unnecessary. I thought it was just an environment variable.
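As a sketch of what passing the flag through geckodriver could look like, the new-session body below uses the `args` entry of `moz:firefoxOptions` to hand `-headless` to the binary. The function name is invented for illustration, and whether a given geckodriver version accepts this exact shape is an assumption to check against its documentation.

```python
import json


def headless_session_payload(extra_args=()):
    """Build a W3C WebDriver new-session body that passes -headless to Firefox."""
    args = ["-headless", *extra_args]
    return {
        "capabilities": {
            "alwaysMatch": {
                "browserName": "firefox",
                "moz:firefoxOptions": {"args": args},
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON that a client would POST to the /session endpoint.
    print(json.dumps(headless_session_payload(), indent=2))
```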

FWIW, from the Chrome side, we think there's more power via the protocol rather than WebDriver.

Well, yes, a low level protocol is going to have at least the same features as a higher level protocol implemented in terms of it. However I think you should mention the significant disadvantages of using a nonstandard API:

* Doesn't generalise to driving other browsers. Using a standardised API gives you the possibility to run your automation against any top tier browser. If you are testing a website this ought to be a top consideration.

* Uncertain compatibility story. I assume that the devtools protocol can change at the whim of the Chrome team. Using something standardised means that your tests and client are more likely to continue working.

* More limited selection of clients. WebDriver in particular has a large number of production-grade clients that are actively maintained.

I'm not claiming that WebDriver is perfect; certainly if I designed it from scratch it would look very different. And yes, there is a period of churn as it moves toward being a standard. But I think the advantages of cross-browser support are a compelling reason for it to be the default tool for remote automation tasks that are within the scope of its featureset. By all means, if you have something that cannot be done with a standardized technology look at the proprietary solutions. But I don't think it should be your first port of call.

All of these look like advantages if you're building a monopoly and have a contempt for anything that slows that down.

(Note: I'm the product manager for the headless browsing feature of Firefox.)

I chose to prioritize WebDriver support because it's a popular way to drive headed Firefox, and I wanted to make it as easy as possible for existing WebDriver users to use headless.

For use cases that are not well-served by WebDriver, I've considered supporting the Chrome DevTools Protocol, as @brendandahl noted, although I haven't yet made a decision to do so.

I'm interested to hear more about the use cases you've considered for which WebDriver is a suboptimal solution. Like @brendandahl, I'm in the Bay Area and would be happy to meet in SF or MV.

On the Firefox side, we are considering supporting the protocol or a subset of it. It seems many people have had lots of trouble with WebDriver/Selenium in the past and are hesitant to use it. However, it is nice that WebDriver has a W3C standard which could provide a nice path forward for headless cross browser use. This would probably require browser vendors to make it a first class citizen and work out some of the kinks of the spec though.

BTW, I'm in Mozilla San Francisco office, so I'd be happy to chat about the headless cross browser future if anyone from the headless Chrome team is around.

for automated testing in a non windowed env, this can be used.

Do you need to use selenium to control Firefox in headless mode or does it have something lower level like Chrome's devtools protocol?

1. You can use the WebDriver protocol[0] without selenium

2. Firefox's WebDriver endpoint geckodriver[1] is a layer above Firefox/Gecko's native Marionette protocol[2]

So you doubly don't need to use selenium, you can write your own WebDriver client (which should be cross-browser assuming the browsers either support WebDriver natively or have a WD layer of some sort installed) or you can use raw Marionette (either a hand-rolled client or an existing client)

[0] https://w3c.github.io/webdriver/webdriver-spec.html#protocol

[1] https://github.com/mozilla/geckodriver

[2] https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionet...
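To make the "write your own WebDriver client" point concrete, here is a minimal sketch using only the Python standard library. It assumes geckodriver is already running on its default port (4444) and speaks the W3C request/response shapes; the helper names are invented, and older driver builds may answer with slightly different JSON.

```python
import json
import urllib.request

GECKODRIVER_URL = "http://localhost:4444"  # assumption: geckodriver's default port


def webdriver_request(method, path, body=None, base=GECKODRIVER_URL):
    """Build (but do not send) an HTTP request for a WebDriver endpoint."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        base + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def send(request):
    """Send a request and return the 'value' field of the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["value"]


def fetch_title(url):
    """Open a session, load a page, read its title, and close the session."""
    new_session = {"capabilities": {"alwaysMatch": {"browserName": "firefox"}}}
    session_id = send(webdriver_request("POST", "/session", new_session))["sessionId"]
    try:
        send(webdriver_request("POST", f"/session/{session_id}/url", {"url": url}))
        return send(webdriver_request("GET", f"/session/{session_id}/title"))
    finally:
        # Always end the session so the browser process does not linger.
        send(webdriver_request("DELETE", f"/session/{session_id}"))
```

Because it only uses standard WebDriver endpoints, the same client should in principle work against any driver that implements them, which is the cross-browser argument made above.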

You can use WebDriver/GeckoDriver[0], which is how Selenium actually runs, but most people like the abstraction layer.

[0] https://github.com/mozilla/geckodriver

Patch author here: To add to the other comments, you could also use SlimerJS[1]. Though I don't know if I'd consider that lower level. We're open to supporting other ways of controlling Firefox and have been gauging feedback on what to support.


I was wondering if I could use this (using a plugin) or Chrome to generate PDF files on the web server. Most of the PDF generation software out there is quite expensive.

Maybe it's possible with the print option Chrome offers. Regular Chrome can print to PDF files. Maybe headless Chrome supports that.

Edit: It's possible. It's on the introduction page of Chrome Headless


Look at 'Create a PDF'
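A small sketch of driving that from a server-side script. The `--headless` and `--print-to-pdf` flags are the ones described on that page; the binary name `google-chrome` and the function names are assumptions that will differ per install, and `--disable-gpu` is included because, as far as I recall, the page listed it as temporarily required.

```python
import subprocess


def chrome_pdf_command(url, output_path, chrome_binary="google-chrome"):
    """Build the command line that asks headless Chrome to print a page to PDF."""
    return [
        chrome_binary,
        "--headless",
        "--disable-gpu",  # assumption: needed on some platforms at the time
        f"--print-to-pdf={output_path}",
        url,
    ]


def render_pdf(url, output_path, chrome_binary="google-chrome"):
    """Run Chrome and raise CalledProcessError if it exits with an error."""
    subprocess.run(chrome_pdf_command(url, output_path, chrome_binary), check=True)
```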

Here's a simple library to do that: https://github.com/LucianoGanga/simple-headless-chrome

You can. In my experience, very few systems produce renders as accurate as Chrome's. There are a few Chrome/webkit based options.

PhantomJS made the process pretty easy, but that's probably not a good option these days as it's on the way out.

NightmareJS is pretty much the same and can do it (though I haven't tested myself) [0]

If you want to get a little deeper you can use CEF (Chromium Embedded Framework) [1]. It basically turns Chrome into a library and there are a bunch of projects that build on top of it (eg Electron, which is what NightmareJS uses underneath).

Here's a project that just uses CEF for converting to PDF [2].

[0] https://github.com/segmentio/nightmare#pdfpath-options

[1] https://bitbucket.org/chromiumembedded/cef/overview

[2] https://github.com/spajak/cef-pdf

For C#, there is also CEFSharp. This is used, for example, by the image(recognition)-driven Kantu automation: https://a9t9.com/kantu/web-automation

We use wkhtmltopdf to do this server side. It has a few problems and quirks, but it works well enough for us.

it has a lot of quirks, and, frankly, webkit is pretty bad at printing in general as it lacks a ton of printing specific features like re-printing table headers across page boundaries.

There's also only very rudimentary support for page headers and footers.

wkhtmltopdf is probably ok if you just need a quick single page invoice printout, but the moment you need more (in our case, we definitely do), I can highly recommend paying for https://www.princexml.com/ which has support for a lot of printing specific features.

One of the issues we had with wkhtmltopdf was that we use Angular for our views, and we have a view for an invoice, and the ability to print it.

We wanted the same view to be passed into wkhtmltopdf to download it as PDF (with a couple of things omitted).

Unfortunately, the JS didn't execute properly, so you get a borked output.

End result was we render it, capture the HTML after render, and pipe that through to wkhtmltopdf.

Using chrome _should_ resolve that issue and let us just pass the page through.
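The capture-then-pipe workaround described above could look roughly like this, assuming `wkhtmltopdf` is on the PATH. It relies on wkhtmltopdf accepting `-` as the input to read HTML from stdin; the function names and the `rendered_html` parameter are illustrative only.

```python
import subprocess


def wkhtmltopdf_command(output_path, binary="wkhtmltopdf"):
    """Build a command that reads HTML from stdin ('-') and writes a PDF file."""
    return [binary, "-", output_path]


def html_to_pdf(rendered_html, output_path, binary="wkhtmltopdf"):
    """Pipe already-rendered HTML into wkhtmltopdf.

    `rendered_html` is the markup captured after the client-side framework
    has finished rendering, so wkhtmltopdf does not need to run that JavaScript.
    """
    subprocess.run(
        wkhtmltopdf_command(output_path, binary),
        input=rendered_html.encode("utf-8"),
        check=True,
    )
```

Relative asset URLs in the captured markup will not resolve from stdin, so stylesheets and images need absolute URLs or inlining.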

wkhtmltopdf works quite well for us, even with modern web technologies.

Here's an example fully rendered in React: https://s3.amazonaws.com/ud-reports/comps-report-86.pdf

And one with svg charts: https://s3.amazonaws.com/ud-reports/chart-report-89.pdf

With -webkit- prefixes you can even use flexbox, and with js polyfills you'll have all you need.

Of course if I would start a project now, I would use chrome headless, but I don't feel wkhtmltopdf is that bad.

We used node-html-pdf[1] for this, but it depends on phantomjs, which uses an ancient version of webkit. We migrated to electron and its contents.printToPDF(options, callback), but that has the drawback that we have to start an X server[2] alongside our backend code. I have hopes that with headless chrome we can ditch the xvfb server and have one less dependency. The setup is a pain, but the results are very good and look very decent.

[1] https://github.com/marcbachmann/node-html-pdf

[2] https://en.wikipedia.org/wiki/Xvfb

What about ReportLab (free version)? http://www.reportlab.com/opensource/

I currently use phantomjs for this, I haven't switched to headless chrome yet.

Latex is free.

Really not the same thing (e.g. pandoc for html -> pdf via LaTeX) as having a webkit based renderer. They do not look the same at all; sometimes it is good to have a LaTeX output but most of the time you are going to want a more accurate representation of your HTML.

The original question didn't state that he was starting from HTML. If you want to generate an invoice or such from plain data, then LaTeX seems like a good fit. Rendering HTML is a whole different bag.

That is true. TeX is a typesetting engine, and allows you exact control over the final rendering, while HTML is a markup language, the rendering of which depends on very many factors.

My time isn't. Sometimes the most cost effective option isn't the one with the lowest price tag.

What are the biggest challenges when implementing headless mode? I'm asking cause this feature took some time to be delivered both in Firefox and Chrome and I always assumed that it should be 'pretty' straightforward to implement. Is it that both engines were coupled with GUI libs?

There are a bunch of things which are tied to native GUI libs in ways that make headless a little annoying to implement. Some examples, from memory and looking at the patches/dependencies in the Firefox bugs on this:

1) Theming of form controls to make them look like native widgets.

2) Clipboard.

3) Printing.

4) Fonts and font rendering.

5) General "now it's time to repaint" machinery.

This is in no way an exhaustive list. In practice what you have to do for headless is create the relevant GUI bits anyway, but not actually show them on screen, then deal with whatever quirks your GUI library has when its bits are not shown.

Patch author here. bzbarsky's comment covered most of it. It hasn't turned out to be that difficult. On linux, it was mostly playing whack-a-mole of avoiding calls in to gtk/x11 and then creating headless implementations of the platform specific code.

One slight complication of headless has been wiring up the headless "widgets" (in Firefox, we refer to most of the platform specific code as widgets e.g. there's gtk widgets, cocoa widgets, windows widgets). Usually the widget type is defined statically at build time, but in the case of headless we wanted Firefox to either use the headless widgets or the platform specific widgets at runtime. Luckily, some of the work on multi-process Firefox work also added another type of widget and made it much easier to support multiple types of widgets.

Overall, the hardest part has been trying to replicate all the events that would normally be triggered by the platform gui code. I've also found that these events and order can vary per platform which makes it hard to do in a non-platform specific way. We're still working out some issues here.
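The runtime widget selection described above can be illustrated with a toy sketch. This is not Firefox's code (which is C++ and far more involved); the class names are invented, and treating any non-empty `MOZ_HEADLESS` value as "enabled" is an assumption of this example.

```python
import os
from abc import ABC, abstractmethod


class Widget(ABC):
    """Stand-in for the platform layer a browser window talks to."""

    @abstractmethod
    def show(self):
        ...


class NativeWidget(Widget):
    def show(self):
        return "window mapped on screen"


class HeadlessWidget(Widget):
    def show(self):
        # Nothing is displayed; state is only tracked in memory.
        return "window tracked in memory only"


def create_widget(environ=os.environ):
    """Choose the widget implementation at run time rather than at build time."""
    if environ.get("MOZ_HEADLESS"):
        return HeadlessWidget()
    return NativeWidget()
```

The rest of the program only sees the `Widget` interface, so a single binary can serve both modes.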

Probably just good old software engineering. Getting a 15-20 million SLOC codebase to work in an environment it wasn't designed to handle takes time.

My money would be on it being a low priority feature rather than being difficult to implement.

How is it controlled when in headless mode? Still via Webdriver or does Firefox support Chrome Debugger Protocol?

> Still via Webdriver

WebDriver or Marionette.

So I can navigate by cardinal directions now (like scrolling south a bit)?

Joke aside: can someone explain "headless" to me here?

When you run the browser, but it doesn't actually display anything on screen.

Uses include rendering a screenshot (e.g. a preview thumbnail) of some website on a server, or running automated UI tests that click around a web app and fill out forms to verify its functionality.
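As a sketch of the screenshot use case: the helper below takes an already-started Selenium WebDriver session (for example, a headless Firefox) and saves a PNG of a page. The function name and the default size are arbitrary choices for illustration.

```python
def capture_thumbnail(driver, url, path, size=(1280, 800)):
    """Load a page in an already-started WebDriver session and save a PNG."""
    width, height = size
    driver.set_window_size(width, height)
    driver.get(url)
    # save_screenshot returns False if the file could not be written.
    if not driver.save_screenshot(path):
        raise IOError(f"could not write screenshot to {path}")
    return path
```

Passing the driver in keeps the helper independent of which browser is used and of how headless mode was switched on.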

Thank you for taking the time to explain. I had the same question.

A headless browser is a web browser without a GUI. Extremely useful in web testing/scraping.

Blog / forum "SEO" spam bots will also be a big use case.

Sure but they most likely already run off xvfb.

The same reasons as Chrome headless [1] but with Gecko engine.

[1] https://developers.google.com/web/updates/2017/04/headless-c...

Anyone know if this supports audio playback?

Firefox audio developer here. I haven't heard from the folks doing this, and I would have been the person doing the code reviews so I suppose it's left untouched.

Edit: looking at the code, it will work just fine, and that means I can even use it myself to run Firefox's unit and integration tests, I usually use xvfb on Linux.

Firefox 55 introduced major changes in the process model. My scrolling on the Mac is slow and not smooth. Others have reported the same. Maybe focusing on stabilizing this big change over one or two releases would be a good thing?

Of course headless is also needed as the PhantomJS solution is not maintained anymore since April. See discussion here: https://news.ycombinator.com/item?id=14105489

It is good to see some competition, I would hate to see automated tests being done purely in Chrome simply because it is the simplest browser to setup in headless mode.

But can you really run windows headless? I thought even the Server Core and IoT versions wanted some sort of GPU even just to display a blue/blank screen.

IoT Core has a headless mode. I don't think it can run Firefox though... UWP apps only.

Does that mean we'll soon have an alternative to ElectronJS / NWJS based on Firefox ? (Positron :p ?)

I don't see how headless mode would be related to that?

Checked. Discontinued. It used to be prism or something but that was discontinued years ago.

They sometimes really make it hard for us fanboys (prism, Brendan and uncertainty about the future of extensions) :-|

Well, we did have XULRunner...

Are there some examples available for this? I would love to replace some of my PhantomJS stuff

Great! This will help us run selenium tests much faster.

Please don't forget about Linux


Poke me when there are no X11/Qt/GTK/whatever dependencies at all - that is headless.


I think the goal is to have a browser that renders web pages and is fully interactive, but just doesn't render to screen and is interactive through an API rather than input devices.

So it's going to need a UI rendering and interaction library.

You don't actually need to have X11 etc running to use headless mode, the libraries are just necessary for linking as there aren't separate headless and non-headless Firefox binaries shipped. The possibility of dynamically loading these libraries to remove this dependency is being looked into.

Upvote the bug! https://bugzilla.mozilla.org/show_bug.cgi?id=1372998

I doubt it'll happen anytime too soon, but Mozilla should see the community interest.

As minor counterpoint: Firefox is not strictly speaking developed "by Mozilla" but by the community. Mozilla's the curator, and has a fair bit of staff also working on it, but vast swathes of the codebase are community contributions. If you need FF to do a thing, and you think you have good enough ideas to contribute the solution there, then rather than upvoting a bug, or bumping it to ask what's happening, be the person to make the happening happen =)

Of course, but bug upvotes don't hurt that cause, either. I have a few reasons that I work on open source projects. First, to fix a bug or add a feature that affects me directly. Second, to learn something new. If I'm just doing the latter, I'd prefer to tackle a bug with more interest/votes, to help others out while I'm at it and increase the odds of finding a willing reviewer.

If you're running this on a server, sure, great. On desktop that is next to pointless due to the need for those dependencies for everything besides core apps.

>If you're running this on a server, sure, great.

Which is certainly the obvious primary use case for headless browsers.

I actually have to run it on my desktop due to non-primary use cases.

This got downvoted a lot, but GTK is really a pretty painful dependency in many situations (especially gtk3).

Not that Firefox itself is particularly lightweight.


Well - you will get downvotes for making a technical claim, but not explaining the reason...
