This is fantastic. I'm using a combination of Chrome and PhantomJS for karma testing right now, for https://github.com/paypal/paypal-checkout and https://github.com/krakenjs/xcomponent. There are hundreds of tests opening up hundreds of iframes and popup windows, and sending a lot of cross-window messages, and that ends up being really memory hungry.
Chrome deals pretty well with garbage collection, so long as I'm careful to de-reference closed windows properly¹, and only uses a maximum of 150MB. PhantomJS eats up almost 6GB of memory before it's done, which makes it almost unusable on machines with less memory, or on CI boxes. Travis is a no-go.
I'm hoping running Chrome in headless mode will give a nice speedup for our tests.
-----
¹ Turns out even a closed popup window or iframe keeps a huge amount of memory hanging around. Who knew.
The DevTools Protocol is the primary API for headless Chrome, but we are excited for higher-level abstractions like PhantomJS's and NightmareJS's APIs to manipulate the browser as well. Plenty of details to work out, but hopefully sometime this year you'll get a drop-in solution for some of your testing, to upgrade from Phantom's older QtWebKit to the latest Chromium.
> > Is there interest on your side in adopting Chromium as a runtime? There's some existing documentation [2] around the API and embedding, but admittedly, this would be some work.
> We are interested. But I am afraid not in the current state. Currently, PhantomJS relies heavily on Qt and QtWebKit. It's not that easy to adopt Chrome as a new runtime.
> But I think we could implement PhantomJS as a completely new (with the same API) project that will use Chrome - Phantomium!
It was my understanding that Google's crawler literally is Chrome. Does Google have any plans to open source those parts of the browser to make integration easier? Maybe I was mistaken.
That seems unlikely... It would seem they have two sets of crawlers: one that does typical/advanced scraping, and another that runs the JS (Chromium) and takes DOM snapshots. This is reflected by changing certain properties (window title, etc.) and seeing them reflected in Google's search results. A couple of years ago, content rendered via JS lagged several days to a week behind content served from the server.
Interesting! I find it puzzling that insider knowledge of how Google search works never seems to leak into the public domain. Is Google suppressing leaks about its search algorithms, or do they pay the search team such ungodly amounts of cash that nobody has ever left?
Well, the point of their engine is to make it harder to game the system... every time someone figures out a trick, someone else takes advantage of it... that alone would discourage leaks.
We laypeople can only find out things via word of mouth or observational tests on assumptions.
We're using Xvfb and Selenium* for testing, and proper headless support would be a hundred times more stable than the self-restarting framebuffer. Can't wait to move to headless Chrome.
*Yeah, I know PhantomJS is cooler these days, but Phantom doesn't support setting the window height, so there's that.
We also did a POC with PhantomJS and found similar issues, as well as general flakiness causing too many false negatives. We ended up not using it; I'm hoping this can simplify things and give us something more solid to build on.
- Making sure all promises are fulfilled or rejected, so window objects don't get caught indefinitely in the closure scope of any .then() or .catch() handler functions.
- Using WeakMaps as much as possible when we have things that are tied to a particular window, like message listeners or response handlers in post-robot (see the sketch after this list).
- Manually clearing any global references to windows when we destroy an xcomponent instance.
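A minimal sketch of that WeakMap idea (the names here are hypothetical, not post-robot's actual internals): keying per-window state on the window object itself means the entry becomes collectable once nothing else references the window.

    // Hypothetical sketch: per-window state keyed on the window object itself.
    // Unlike a plain Map, a WeakMap doesn't pin the window in memory, so a
    // closed window's listeners can be garbage collected along with it.
    type Listener = (data: unknown) => void;

    const listenersByWindow = new WeakMap<Window, Listener[]>();

    function addListener(win: Window, listener: Listener): void {
      const existing = listenersByWindow.get(win) ?? [];
      existing.push(listener);
      listenersByWindow.set(win, existing);
    }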
Finding the references was the tricky bit. A lot of the effort was finding a leaky test case, running it 100 times in succession, and deleting code until the memory graph was flat -- then figuring out what I'd just deleted that caused the leak.
The problem started manifesting as I added more and more tests -- so now I'm actually checking my tests' memory usage on the fly and failing if they cross a threshold. Hopefully that should avoid getting into this kind of sticky situation ever again.
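In case it's useful, one way to do that kind of on-the-fly check in Chrome-only test runs is the non-standard performance.memory API -- a sketch, with an arbitrary example threshold:

    // Chrome-only: performance.memory is non-standard, so guard for it.
    // Fail the run if the JS heap creeps past a chosen budget between tests.
    const HEAP_BUDGET_BYTES = 200 * 1024 * 1024; // arbitrary example threshold

    afterEach(() => {
      const memory = (performance as any).memory;
      if (memory && memory.usedJSHeapSize > HEAP_BUDGET_BYTES) {
        throw new Error(`JS heap at ${memory.usedJSHeapSize} bytes -- possible leak`);
      }
    });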
I've been using post-robot and xcomponent a lot recently as I'm solving a bunch of similar challenges. Just wanted to say thank you for sharing your solutions and expertise.
I've been testing Chrome headless extensively for the past few months, and while it's a good step, it's not yet stable for high-volume use or even a diverse set of webpages.
Memory usage is pretty high, a lot of heavy webpages cause crashes/hangs, there are many inconsistencies between the features available in the full version and in headless, the debugging protocol's APIs behave differently between headless/non-headless and between Linux and Windows, and so on.
Of the bugs I've submitted, some have been fixed in the upcoming M59, but other critical ones may take longer due to their backlog. I suppose for now (maybe until M61-62), full Chrome with Xvfb, or even PhantomJS, are better options. When you realize that Chrome is about the same size (by LoC) as the Linux kernel [1], you can't help but wish for a leaner and faster headless browser.
There seems to be some work going on toward a pure headless Firefox as well. Great overall, as long as all the browsers try to follow the RemoteDebug initiative [2].
I've been successfully using Chrome headless in a 500MB Docker container to dump the DOM for https://www.prerender.cloud/ for months (rendering a large variety of sites without restarts for weeks at a time).
Run it with:

    --js-flags="--max_old_space_size=500"

to force the VM to keep its heap GC'd below 500MB.
Chrome v55 brought a 30% memory savings; before that I used 1GB containers.
It's not perfect, but I am definitely pushing high volume (multiple tabs, concurrent activity) across diverse sets of webpages, and I am not having any significant stability issues.
What's the best strategy that worked for you at high volume: multiple tabs, or multiple Docker instances? I am wondering if multiple tabs are as efficient as multiple windows/instances.
I'm using more than one Chrome process so I can kill the processes every so often (e.g. after a timeout, or when they get stuck). Inside each Chrome instance I use 16 tabs. There are a number of factors at play:
- Are you worried about same-origin pollution if you run multiple tabs from the same origin in the same process? If so -> Extra process
- Do you have to take screenshots? You can only take screenshots of the tab that's in the foreground, so you have to activate it first to take the screenshot. This might fail if you have lots of tabs which all trigger at roughly the same time.
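For what it's worth, the activate-then-capture dance looks roughly like this over the debugging protocol (a sketch using the chrome-remote-interface Node package; error handling omitted):

    import CDP from 'chrome-remote-interface';

    // Screenshots only work on the foreground tab, so bring the target
    // forward first, then capture. Serializing this across tabs avoids
    // the races mentioned above.
    async function screenshotTab(targetId: string): Promise<Buffer> {
      await CDP.Activate({ id: targetId });            // foreground the tab
      const client = await CDP({ target: targetId });
      const { data } = await client.Page.captureScreenshot();
      await client.close();
      return Buffer.from(data, 'base64');
    }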
This is correct: it does accept both, and the help lists it with underscores, not hyphens. So the comment that it ought to be hyphens rather than underscores is incorrect. Either is fine.
When I went digging through the code for V8, I believe I saw examples of it both ways. I'm not dead sure, though, because I don't know very much C. I just checked because I was hoping they weren't actually using an invalid flag for all that time.
Yes, this exactly. I wrote Crabby [1] a few months back to schedule automated page testing using Chrome and webdriver but doing anything automated with Chrome is really atrocious. You can't expect to load more than one page every 10-15 seconds on the average 8GB instance and it occasionally crashes or otherwise stops working completely.
I ended up writing a simple check using Go's net/http library to do basic performance profiling, but it doesn't measure DOM loading like the Chrome checks do. Such a bummer.
What I really want is an easy, cross-platform way to collect the network timings for each object like you get in Chrome's dev tools network waterfall graphs.
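In case it helps anyone, the debugging protocol's Network domain exposes per-request timing data that roughly corresponds to the waterfall. A sketch with the chrome-remote-interface Node package (assumes Chrome is listening on the default debugging port):

    import CDP from 'chrome-remote-interface';

    async function collectTimings(url: string) {
      const client = await CDP(); // connects to localhost:9222 by default
      const { Network, Page } = client;

      // response.timing carries the per-request breakdown (DNS, connect,
      // send, TTFB, ...) that feeds the DevTools waterfall view
      Network.responseReceived(({ response }) => {
        console.log(response.url, response.timing);
      });

      await Network.enable();
      await Page.enable();
      await Page.navigate({ url });
      await Page.loadEventFired();
      await client.close();
    }

    collectTimings('https://example.com').catch(console.error);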
Thanks for that information. I've been putting off moving over to Chrome headless for https://urlscan.io and haven't had the time to do extensive testing yet. Right now Xvfb works fine for me. Still, I'm running 3 Chrome instances with 16 tabs each in parallel and have to kill the processes every so often because they get "stuck".
Unfortunately, I have to agree, though they have been very helpful and vigilant on the bugs that have been filed. Most of mine have been fixed in a few days.
I've been working on a fully open source Windows/macOS library (via Chromium Embedded Framework) that allows you to render pages to memory (and then of course to bitmaps, textures etc.) as well as inject synthesized mouse/keyboard/JavaScript events. It currently uses (what amounts to) Chrome 57.
Thanks keville - I've poured my heart into this on nights and weekends, after I'm done with my day job, for a long time now. But what you say is so true - hopefully we'll get access to a more robust solution compared to my modest hacks.
It's a PhantomJS (and other headless browsers) web service. Using the site, you can quickly create different tests, scheduled tests, and chained tests, keep screenshots, create videos of multi-step tests, and keep historical information on it all.
Can't say enough good things about the site.
Edit: there's also a great Chrome extension that will record your mouse clicks and keyboard commands to make creating a test that much simpler.
+1 to GhostInspector; I used them at a previous company a few years ago, and it was very useful.
They were just starting out, but the service was rather reliable, and their tech support was excellent (maybe because we were early customers). We used to run a bunch of automated tests for monitoring and compliance, archiving hourly screenshots over different builds for later comparison.
I'm the co-founder of a startup that provides an enterprise solution in this space (https://functionize.com).
I agree with your sentiment regarding these types of tools. It has been a long time coming but this release and the tools available now are things I wish I had years ago when I was building my first company.
WOW. That's fantastic. As I noted in a comment above, I wrote a tool called Crabby that uses Selenium and Chrome headless to do automated page testing and to report results back to metrics engines like Graphite, Datadog, Prometheus, and Riemann. The biggest problem I have is the unreliability of chromedriver and the extreme resource consumption of Chrome + Selenium. It's really too much for your average public cloud instance if you want to test any more than, say, one or two pages a minute.
Do you know if chromedp can access any of the timing measurements?
Yes, you can access everything via the underlying APIs. chromedp is a relatively new project (only about 4 months old), so there isn't much yet in the way of high-level timing/profiling, but we hope to add that to the code base when we have some bandwidth to do so.
With either Chrome in headless mode, or "headless_shell" (a minimal Chrome app that's part of the Chromium source tree), you first enable the remote debugging port (via --remote-debugging-port=9222), and then you can simply browse to http://localhost:9222/. That web page lists the various Chrome "pages" (i.e., tabs), which you can then click on. Clicking on a tab opens the Chrome DevTools inside of Chrome, as a web app served from http://localhost:9222/.
This is the internal API that DevTools uses, and is what is referred to as the "Chrome Debugging Protocol" (hence the name chromedp). Since Chrome 57, the built-in DevTools UI displays whatever the active viewport Chrome "sees" using the screencast APIs. It's just a PNG that's updated every couple hundred milliseconds with the output of Chrome's headless renderer.
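As a concrete starting point, the same port also serves a JSON list of those targets, which is what most client libraries use for discovery. A sketch (assumes Chrome was started with --remote-debugging-port=9222):

    import http from 'http';

    // Each entry has an id, title, url, and a webSocketDebuggerUrl that a
    // protocol client can attach to directly.
    http.get('http://localhost:9222/json', (res) => {
      let body = '';
      res.on('data', (chunk) => (body += chunk));
      res.on('end', () => {
        for (const target of JSON.parse(body)) {
          console.log(target.id, target.url, target.webSocketDebuggerUrl);
        }
      });
    });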
Mostly, I'd like to know how control of the virtual time system would be exposed. Would it be through the C++ API, or could it be made available through the debugging protocol?
If you're using Go, you might want to check out https://github.com/knq/chromedp ... There is also a similar package for NodeJS. Otherwise, you can use Selenium in other languages.
I've been trying to test audio and video with headless browsers (namely PhantomJS) but have experienced extreme difficulty. I wonder if headless Chrome supports, or will be able to support, HTMLAudioElement, HTMLVideoElement, or any media interface that would make testing YouTube or SoundCloud embeds easier, for example.
Related: I want to take screenshots of a few news websites for a little fake news project of mine, and most approaches return something completely different than what I'm seeing when I open Chrome.
- Limited height would be better/OK (something like the first 3000 pixels).
- Low volume / can be slow (30 seconds would be OK).
- Those news websites often have infinite scrolling.
I've tried:
- PhantomJS (rendering sucked; I tried every technique I could find to wait for JS to load)
- wkhtmltopdf (almost OK, but it generates a huge 30MB image with the full page height, and there's no antialiasing, it seems)
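For the record, headless Chrome can do the limited-height capture directly: override the device metrics to the viewport you want, then screenshot it. A sketch with the chrome-remote-interface Node package (the 1280x3000 size just matches the "first 3000 pixels" idea above):

    import CDP from 'chrome-remote-interface';
    import { writeFileSync } from 'fs';

    async function capture(url: string) {
      const client = await CDP();
      const { Emulation, Page } = client;
      await Page.enable();

      // Pin the viewport to the first 3000px rather than chasing the
      // full height of an infinite-scrolling page.
      await Emulation.setDeviceMetricsOverride({
        width: 1280, height: 3000, deviceScaleFactor: 1, mobile: false,
      });

      await Page.navigate({ url });
      await Page.loadEventFired();
      const { data } = await Page.captureScreenshot();
      writeFileSync('shot.png', Buffer.from(data, 'base64'));
      await client.close();
    }

    capture('https://example.com').catch(console.error);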
Thanks - I tried Selenium (on the desktop) with geckodriver now and it rendered well. The only thing is that long screenshots didn't work, but there is probably a workaround for that.
WebGL is supposedly a first class citizen of the browser and is used in a ton of pages, but it gets left out or deferred in many tooling packages. There are many tools that are nearly useless to us because they don't support WebGL. It's disappointing.
Oh my goodness, I have been waiting for this day for a while - we ran into PhantomJS problems with keyboard/mouse eventing and the HTMLVideoElement during testing. This sounds like it should be the cure for our woes of having to hack around PhantomJS's deficiencies.
It's already possible. I'm using WebRTC and node-electron to connect to a golang server for my MMO project. I have a farm of 4 Node.js processes running under tmux acting as proxies for unreliable communications.
I am using electron-webrtc. You need Xvfb installed. I was able to get this running under node, but only with the processes running under tmux. You will also need to install some random shared libraries that Chromium needs but which node-electron doesn't install; these are obvious from the error messages.
Then I'm using simple-peer on top of that. There's also a library for UDP communications from the node process to the golang process.
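For reference, the basic wiring of those two packages looks something like this (a sketch based on their documented APIs; the signaling transport is up to you):

    // Sketch: electron-webrtc supplies a wrtc implementation by spawning a
    // hidden Electron/Chromium process; simple-peer runs on top of it.
    const wrtc = require('electron-webrtc')();
    const Peer = require('simple-peer');

    const peer = new Peer({ initiator: true, wrtc });

    peer.on('signal', (data: unknown) => {
      // deliver `data` to the remote peer over your own signaling channel
    });

    peer.on('connect', () => {
      peer.send('hello from node');
    });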
It should be possible, but as a browser rather than a driver, so you'd still need Chromedriver to glue Chrome and Selenium together. In Karma, with karma-chrome-launcher [1], you can pass options to the browser using the flags option.
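For example, something like this custom-launcher sketch (the exact flags are illustrative, not a confirmed recipe):

    // karma.conf.js -- hypothetical custom launcher passing headless flags
    module.exports = function (config) {
      config.set({
        browsers: ['ChromeHeadless'],
        customLaunchers: {
          ChromeHeadless: {
            base: 'Chrome',
            flags: ['--headless', '--disable-gpu', '--remote-debugging-port=9222'],
          },
        },
      });
    };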
The Chrome browser spends a decent amount of time on other steps, such as parsing HTML. I wonder how much time could be saved by not rendering pages into pixels.
For those interested, Firefox is also going to support a headless mode. The current nightly supports headless SlimerJS on Linux, and more platforms will come soon.
Have people found many issues that come up in Chrome but aren't found in PhantomJS? We used to use a headless browser but switched to PhantomJS and haven't had any real issues.
(We should probably run under the real IE but just haven't been bothered.)
Phantom does not support more recent JS syntax/features. We have an app that is aimed only at recent browsers, so it can use the latest ES6 features, and we had to move from Phantom to in-browser tests (the alternative would have been to use Babel for transpilation, but then we wouldn't be testing the code that is actually released to users).
I've run into some problems with some of the more ... creative jasmine tests I've done. Mostly, it's been around object mocks. I've found places where I can do things with Object.defineProperty() in Chrome that throw exceptions in PhantomJS :(
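As an illustration of the kind of mock that can diverge between engines (a hypothetical example; whether it throws depends on how configurable the property is in each engine):

    // Redefining a built-in getter like this works in Chrome, but can throw
    // in engines where the property is non-configurable.
    Object.defineProperty(window.navigator, 'userAgent', {
      get: () => 'my-fake-agent',
      configurable: true,
    });

    console.log(window.navigator.userAgent); // 'my-fake-agent' in Chrome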
Can anyone confirm - would this work with a Flash/SWF application? i.e. could I use the headless mode to interact with the Flash Application to run some commands and retrieve the output?
I tried Googling around but didn't find much to say either way...
How fast is the debugging mode? I tried the first debugging protocol when Chrome added it and it was very difficult to use. I assume this time is different?
Looking at https://developer.chrome.com/devtools/docs/debugging-clients, it seems we are missing a Ruby client. Last time I tried, it gave me some headaches trying to talk to the websockets, but hopefully someone smarter than me can pick it up.
I started using it when they announced headless mode on Linux, and built thumbnail generation from captured screenshots of websites, plus detection of the technologies used on websites.
> Headless mode allows running Chromium in a headless/server environment. Expected use cases include loading web pages, extracting metadata (e.g., the DOM) and generating bitmaps from page contents -- using all the modern web platform features provided by Chromium and Blink.
Practically speaking, software developers will use headless Chrome to automate testing of product functionality. Today, developers use systems like Selenium or PhantomJS to accomplish this, but it's a painful process to maintain these headless browser execution engines. Adding headless support to Chrome means that developers can count on how their application is presented by a given version of the Blink engine running within Chrome.
The print-to-PDF command currently only supports default print settings. Adding support for customized page size, headers and footers, DPI, etc. is in progress.
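For reference, a minimal sketch of driving that command over the debugging protocol (using the chrome-remote-interface Node package; no options are passed, since only defaults are supported right now):

    import CDP from 'chrome-remote-interface';
    import { writeFileSync } from 'fs';

    async function printToPdf(url: string) {
      const client = await CDP();
      const { Page } = client;
      await Page.enable();
      await Page.navigate({ url });
      await Page.loadEventFired();

      // Returns base64-encoded PDF data; only default settings for now
      const { data } = await Page.printToPDF();
      writeFileSync('out.pdf', Buffer.from(data, 'base64'));
      await client.close();
    }

    printToPdf('https://example.com').catch(console.error);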
I'm the maintainer of wkhtmltopdf, and it's hopelessly out of date. There are still some bugs in the Chrome print-to-PDF support, as support was added just a few days ago:
Not sure if all of wkhtmltopdf's functionality can be ported; it had patches to Qt/WebKit to enable that, so it will probably need API enhancements in Chrome. I don't have the time right now, but I registered http://crhtmltopdf.org a while ago hoping that I'd get around to it.
I think it just means that the site is poorly designed. I think websites should work correctly everywhere CSS2.1 and ES3 are supported. Otherwise some users won't be able to view those sites.
If you're willing to wait until Electron releases a Chrome 59-based build, I'll be updating https://github.com/mixu/electroshot, which handles screenshots and print-to-PDF along with a bunch of other niceties.
Extracting text from PDF is not hard, though PDF only contains low-level formatting instructions so the result might not be nice, especially if the original PDF has any non-trivial formatting, like pull quotes, multi-column text, etc. If you don't care about that or the correct "flow" of text, it should be easy enough to just find all the Tj and TJ operators and extract their operands. You might also need to reverse some ligatures though.
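A deliberately naive sketch of that Tj-scraping idea (it only works on uncompressed content streams; real-world PDFs usually deflate them, so you'd have to inflate first):

    import { readFileSync } from 'fs';

    // Naive: scan a raw PDF for `(...) Tj` show-text operators.
    // Only meaningful when the content streams are uncompressed, and it
    // ignores TJ arrays, encoding maps, and ligatures entirely.
    const raw = readFileSync('doc.pdf', 'latin1');
    const pieces: string[] = [];

    for (const m of raw.matchAll(/\(((?:[^()\\]|\\.)*)\)\s*Tj/g)) {
      pieces.push(m[1]);
    }

    console.log(pieces.join(' '));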
Producing nice semantic HTML is much harder, though also easy if you don't mind every word in a separate absolutely positioned div.
Many PDF readers already contain empirically tuned routines to infer the text flow and generate text files (because the software needs to handle Select All and Copy), but they often produce bad results.
But if you just want to read a PDF on a remote machine over ssh, the easiest solution might be just transferring the file and opening it locally, or using X forwarding and opening the PDF with a graphical reader.
Easy? Not so much, depending on exactly what you want to get out of it. I did a project with this once: https://www.idrsolutions.com/jpdf2html5/. Last I checked, they only offered it as a Java library, but it could generate rather nice-looking and complete HTML pages from PDF documents. The output was great, but it was kinda pricey and difficult to work with.
On the opposite side of the complexity scale, I have also used http://www.pdfsharp.com/PDFsharp/ to extract bits of text from PDFs. It's free, but you only get access to the raw PDF text with formatting codes. It works fine if you just want to grab a short string, but you've got your work cut out for you if you want to do anything more sophisticated.
It's a useful tool (and huge thanks to those who built it -- and SlimerJS, for that matter), but whenever I've reached for it, there's always been some issue to resolve, generally relating to the exact version and/or set of APIs supported. Headless mode (with PDF support -- which it looks like the latest version of the Chromium remote protocol does indeed have) built into a mainstream browser is nearly guaranteed to be a smoother experience.
And, if you need much higher fidelity and control of HTML/CSS -> PDF, there's the fantastic Prince library, http://princexml.com/ (nonfree)
(I've been using Prince for over a decade, rendering everything from prescription labels, packing slips, and receipts to resumes, books, and more. It's great.)