
Show HN: Chromeless – Headless Chrome Automation on AWS Lambda - schickling
https://github.com/graphcool/chromeless
======
schickling
I'm really excited to finally open-source Chromeless. We've used NightmareJS
and similar tools before to run integration tests but these basically added
~20min to each build. With Chromeless we were able to reduce this time to
under a minute!

By the way, here is a demo playground to try it out:
[https://chromeless.netlify.com/](https://chromeless.netlify.com/)

Let me know if you have any questions :)

~~~
ivan_ah
How difficult would it be to also support google cloud functions and the azure
offerings? This seems like a really useful standard tool that lots of people
might want to use. CI jobs on pull requests that take seconds instead of
minutes = big win!

~~~
jpalomaki
Azure Container Instances might be good for this:
[https://azure.microsoft.com/en-us/blog/announcing-azure-cont...](https://azure.microsoft.com/en-us/blog/announcing-azure-container-instances/)

------
gowan
shameless plug: i've also written a high-level api on top of the chrome remote
debugger, chrominator [1]

similar idea. chrominator uses promises instead of a fluent api. it also
follows the selenium w3c spec where possible. it does cool stuff with evaluate
and evaluateAsync where it resolves the remote object to something usable.

to be fair there are a few other projects i know about that wrap chrome remote
debugger with a high level api:

* autogcd [2]

* ghostjs [3]

[1] [https://github.com/jesg/chrominator](https://github.com/jesg/chrominator)

[2] [https://github.com/wirepair/autogcd](https://github.com/wirepair/autogcd)

[3] [https://github.com/KevinGrandon/ghostjs](https://github.com/KevinGrandon/ghostjs)

~~~
smithclay
Shameless plug: I've been hacking on headless chrome in AWS Lambda but with
selenium webdriver support [1], also using the binaries from the
serverless-chrome [2] project.

To echo another comment on this thread, headless chrome seems well-positioned
to shake up the automated browser testing market. The price—especially with
the AWS Lambda free tier—is very, very compelling for a number of projects.

[1] [https://github.com/smithclay/lambdium](https://github.com/smithclay/lambdium)

[2] [https://github.com/adieuadieu/serverless-chrome](https://github.com/adieuadieu/serverless-chrome)

------
cjr
This looks great! As the developer of an automated screenshot solution
([https://urlbox.io](https://urlbox.io)), one of the major pain points when
taking screenshots is font-rendering. I wonder how you could install/configure
fonts on lambda?

------
titel
@schickling - When will the PDF support arrive?
[https://github.com/graphcool/chromeless/blob/master/docs/api...](https://github.com/graphcool/chromeless/blob/master/docs/api.md#api-pdf)

~~~
pkilgore
Also interested; this would be an excellent fit for a use case I have:
archiving certain important government websites.

Btw, does the .viewport() option not work in the demo? I'm seeing a
`TypeError: Failed to fetch` when I set one.

~~~
sorenbs
You need to do it like this:

[https://chromeless.netlify.com/#src=const%20chromeless%20=%2...](https://chromeless.netlify.com/#src=const%20chromeless%20=%20new%20Chromeless\(%7B%20remote:%20true,%20viewport:%20%7Bwidth:%201024,%20height:%20800%7D%20%7D\)%0A%0Aconst%20screenshot%20=%20await%20chromeless%0A%20%20.goto\('https://www.graph.cool'\)%0A%20%20.scrollTo\(0,%202000\)%0A%20%20.screenshot)
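
For readers who would rather not click through: the playground link carries its source in a URL-encoded `src` fragment. The sketch below simply decodes that fragment as it appears in the link above (with the markdown backslashes before the parentheses dropped); the variable names are made up for illustration, and nothing is added to the snippet itself.

```javascript
// The part of the playground link after "#src=", copied from the link above
// (markdown backslash escapes before the parentheses removed).
const encodedSrc =
  "const%20chromeless%20=%20new%20Chromeless(%7B%20remote:%20true,%20viewport:%20%7Bwidth:%201024,%20height:%20800%7D%20%7D)" +
  "%0A%0Aconst%20screenshot%20=%20await%20chromeless" +
  "%0A%20%20.goto('https://www.graph.cool')" +
  "%0A%20%20.scrollTo(0,%202000)" +
  "%0A%20%20.screenshot";

// decodeURIComponent turns the percent-escapes back into readable source.
const decodedSrc = decodeURIComponent(encodedSrc);
console.log(decodedSrc);
```

In the decoded text, `viewport` is passed as an option to the `Chromeless` constructor rather than called as `.viewport()`. The link as captured here ends at `.screenshot`, so the tail of the snippet is shown exactly as found.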

~~~
pkilgore
Thanks!

------
shortj
This would have been awesome to have back when I was heavily into the UI test
automation game. Our best option at that point was a spot instance EC2 fleet
and analyzing commits to determine which tests would be the most valuable to
run. It's awesome being able to easily run hundreds or thousands of tests in
parallel, completely segmented, and pay only on demand. A fantastic use of AWS
Lambda! It suddenly becomes reasonable to do full integration tests on every
merge request, or even commit, and get feedback to the developer in seconds.

~~~
toomuchtodo
Do the math. If you're suggesting running _all the tests_, drastically
increasing your compute volume, it's probably cheaper to use a dedicated
instance running at max CPU utilization.

~~~
schickling
It's of course based on what's more important for you: Running tests non-stop
and perfectly utilizing a compute instance vs having the tests executed when
you need the results as soon as possible. I might argue that in most cases the
latter is what you'd actually want. Especially given that you're getting
billed on a millisecond basis.

~~~
sorenbs
I would love to have something like this for my scala API tests. We have 3k
tests running on 10 ec2 instances and the entire test-suite takes 10 minutes.
If some infrastructure would allow me to pay a little more and run all tests
in parallel, that would be a game changer for our team.

------
kensoh
This is a really cool project, but looking closer at the API and issues raised
it seems that the features are being over-promised.

- "Do pretty much everything you've used PhantomJS, NightmareJS or Selenium
for before".

The main features of those tools plus their ability to handle a large range of
edge cases are built up over the years in production use and do not seem to be
already in Chromeless. Also, Lambda costs can be a significant point of
consideration for professional test automation with large volume.

Nevertheless, there's no turning back as flood gates have been opened and many
developers are noticing Chromeless. I believe, with enough dedication from
Chromeless maintainers, they may be able to channel the attention and
contributions to shape Chromeless to be the main challenger to existing test
automation approaches. That will really be a blessing to the open-source
community!

The only catch, I believe, is that it may be easier for those existing tools to be
made to work in Lambda or to implement a similar form of parallelism while still
having their mature API, than for Chromeless to catch up to the state of
maturity of those tools. But as they say, growth solves almost every problem,
so issues like these may be ironed out through collaborative efforts from
contributors/maintainers.

~~~
schickling
Hi kensoh, thanks a lot for your great comment.

I totally agree with you! It took years for these tools to mature and so it
will be the case for Chromeless. There is probably a range of edge cases that
have yet to be solved, but like you said, I'm very optimistic that together
with our great community, we'll be able to handle all of these cases.

The big incentive for us to create Chromeless instead of using Nightmare or
similar (which I've done for years) was the fact that you can now use headless
Chrome (which provides a way more stable foundation) + the ability to execute
the code on AWS Lambda which solves the parallelisation question. I hope this
makes sense to you :)

~~~
kensoh
I'm really excited for you guys; it looks like your implementation through AWS
Lambda has struck a chord. I really can't wait to see how Chromeless
maintainers & contributors give back to the open-source community by
challenging existing ways of doing things and introducing new innovations in
this space.

Yes, perfectly. I believe anyone very serious about test automation or
browser automation in general will make their own tool and bring it to
market if there isn't already one that meets their needs. :)

------
yeldarb
I've been using serverless-chrome for the past few weeks and this looks like a
big improvement in usability!

[https://github.com/adieuadieu/serverless-chrome](https://github.com/adieuadieu/serverless-chrome)

~~~
schickling
Chromeless is actually built on top of serverless-chrome and was developed
together with the author of serverless-chrome :)

~~~
adieuadieu
;-)

~~~
kensoh
Really cool collaboration and project! :)

------
afandian
I've got the impression that lots of sites block AWS IP addresses. I wonder if
this would hamper the practical use of this on Lambda.

I'm doing something similar, and this concern was one motivation for running
in our datacentre vs EC2.

Does anyone have concrete info on rates of bots blocked from AWS IPs?

~~~
extra88
I assume the number one use of this would be test automation for one's own
sites so blocking would not be an issue.

What are sites' motivations for blocking AWS IPs? I bet there are some reasons
I would agree with, even though the somewhat crude method of blocking IP ranges
would have some unintended consequences (e.g. blocking people running a
personal VPN).

~~~
pravda
>What are sites' motivations for blocking AWS IPs?

I block AWS. So many crawlers up to so much nonsense! I don't block by IP, but
by hostname.

    
    
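      // Assumption: $rh is the visitor's reverse-DNS hostname; the original
      // snippet did not show where it is set.
      $rh = gethostbyaddr($_SERVER['REMOTE_ADDR']);
      $block = '.amazonaws.com';
      $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
      $block_visitor = false;

      // Block hosts under amazonaws.com unless the user agent mentions
      // Silk or Safari.
      if ($rh !== false && stripos($rh, $block) !== false &&
          stripos($ua, 'Silk') === false &&
          stripos($ua, 'Safari') === false) {

          $block_visitor = true;
          $message = "Blocked Host:Amazon Web Services";
      }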

~~~
hackits
Just curious what have you seen crawlers do to make you conclude they're up to
nonsense?

~~~
pravda
Well, from amazonaws.com, there are so many requests for wp-login.php!

And then all the off-brand scraping companies use amazonaws.com.

~~~
afandian
What do you mean by off-brand scraping? You mean search engines that you
haven't heard of, or copyright violating orgs?

~~~
pravda
Some AWS visitors:

Cliqzbot, VidibleScraper/1.0, CheckMarkNetwork, CCBot/2.0
([http://commoncrawl.org/faq/](http://commoncrawl.org/faq/)), linkdexbot/2.2;
+[http://www.linkdex.com/bots/](http://www.linkdex.com/bots/)

That last one is a "SEO platform".

------
1024core
How do cookies work in Chromeless? Can I specify a cookiejar to use? Can I
keep cookies separated?

------
pedrocls
How is this different from using a container/vm image which has chrome
pre-installed and, on request, launching it in headless mode, accessing the
instance via chrome-launcher and manipulating the browser with
chrome-remote-interface?

You can then use the vm/container as a function to match AWS lambda.

Is it that the API is more user-friendly, or that it is Selenium W3C compliant?

Genuinely curious, don't know much about this project.

------
biscarch
I've been super excited about Chrome headless but haven't had a chance to dig
into using it yet. The api here looks amazing for getting started without
getting lost in the weeds. It'd be fairly trivial to hook this up to a
Slackbot and to get on-demand screenshots of various pages on my websites,
etc.

~~~
mabbo
A month ago I was trying to migrate some tests from phantomjs to chrome
headless. The big blocker I ran into was that chromedriver automation has
certain functionality that only works with a special chrome extension that it
installs, but headless chrome doesn't support extensions.

There's work being done to fix this, but it's still in progress, last I
checked. Until then, resizing windows, taking screenshots and a handful of
other things simply don't work.

------
coredog64
The last time I tried headless Chrome, file downloads were a PITA. Has anyone
tried downloads with Chromeless?

~~~
adieuadieu
Other than taking a screenshot and evaluating JS code in the context of Chrome
and returning JSON, we haven't yet implemented any file-download features. But
it might be possible for us to implement something. Would you mind creating an
issue here describing your use case so we can discuss it further?
[https://github.com/graphcool/chromeless/issues/new](https://github.com/graphcool/chromeless/issues/new)

~~~
kensoh
By design, headless Chrome disables file downloads. This is being tracked at
this issue to offer a way to enable that, and the issue seems to be moving
along =)
[https://bugs.chromium.org/p/chromium/issues/detail?id=696481](https://bugs.chromium.org/p/chromium/issues/detail?id=696481)

EDIT - above is assuming downloading a file by simulating a click event to
perform the download. there may be other workarounds by script injection etc
to use XMLHttpRequest() for downloading a resource directly.

------
inertial
I'll just add some related projects I've used / tried in the past.

The promise of fast execution time in parallel is tempting with Chromeless.
Thanks for sharing.

- [https://github.com/webdriverio/webdriverio](https://github.com/webdriverio/webdriverio)

- [https://github.com/nightwatchjs/nightwatch](https://github.com/nightwatchjs/nightwatch)

- [https://github.com/assaf/zombie](https://github.com/assaf/zombie)

- [https://github.com/dhamaniasad/HeadlessBrowsers](https://github.com/dhamaniasad/HeadlessBrowsers)

~~~
schickling
Thanks a lot for bringing this up. We've tried all of the projects listed
above before we began to implement Chromeless.

Ultimately it was the combination of using headless Chrome and the ability to
execute code in parallel on Lambda, which made us invest in Chromeless.

~~~
kough
How hard do you think it would be to update Nightmare/etc. to use headless
Chrome under the hood? Asking as someone with some interest in the space but
little experience with the codebases.

------
shimon_e
Another platform that supports headless chrome:
[https://devexpress.github.io/testcafe/](https://devexpress.github.io/testcafe/)

The error reports it produces are some of the best.

------
Vuneu
This is pretty nifty! I've been keeping an eye on Phantomium for a while, I
wonder what's come out of it.

~~~
kensoh
Looks like Google releasing headless Chrome is really shaking up the test
automation domain. This is a really cool project :) The other day I saw
another interesting and refreshing implementation using GraphQL -
[https://github.com/joelgriffith/navalia](https://github.com/joelgriffith/navalia)
(not affiliated with the project, just got to know from a PhantomJS issue).

------
yamafaktory
Looks super promising, the API is really neat and running the tests in
parallel is a big plus! Awesome work!

------
jotto
(Shameless promotion): for an API version of the prerendering functionality,
with no warm-up latency, we're running a large cluster of Chrome headless
instances here: [https://www.prerender.cloud/](https://www.prerender.cloud/)

~~~
schickling
Sounds like a great service and very useful for frontend developers.

We've built something similar internally and will shortly migrate it to
Chromeless. We basically use it to pre-render our websites and docs:
[https://github.com/graphcool/prep](https://github.com/graphcool/prep)

------
languagehacker
We've been using chrome-remote-interface for test automation in a project that
makes heavy use of Lambda for a distributed event processing infrastructure.
I'm looking forward to seeing whether we can implement this for running our
test automation suite!

~~~
schickling
Would love to hear how that goes. Please reach out if you have any problems or
questions. The easiest way is to ping me on Slack:
[https://slack.graph.cool](https://slack.graph.cool)

------
iokevins
One minor housekeeping comment:

The first two examples seem to return 404, for me:

[https://github.com/graphcool/chromeless#examples](https://github.com/graphcool/chromeless#examples)

~~~
schickling
Thanks a lot. This is fixed now!

------
victorhooi
So to clarify - this is basically a Node.JS wrapper around Chrome headless,
right? =_)

Seems pretty awesome.

My use case is to take screenshots of various pages - the docs don't mention
the default viewport dimensions, btw.

~~~
schickling
Yes exactly + this can run (in parallel) on AWS Lambda, so you don't need to
worry about provisioning & running servers. That's actually the part I'm most
excited about :)

~~~
purvis
It's parallel because AWS Lambda is inherently parallel? Or are you referring
to within JavaScript?

~~~
adieuadieu
Parallel in that you can invoke many Lambda functions at the same time and
have them run independently of each other
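
A minimal sketch of that fan-out pattern, for illustration only: `runTask` and `runAll` are hypothetical names, and each task is simulated with a timer rather than a real Lambda invocation. It shows that when independent tasks start together, the total wall time tracks the slowest task instead of the sum.

```javascript
// Hypothetical stand-in for one remote run; a real setup would invoke a
// Lambda function here instead of waiting on a timer.
function runTask(id, durationMs) {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ id, durationMs }), durationMs)
  );
}

async function runAll(durations) {
  const started = Date.now();
  // Start every task at once; Promise.all settles when the slowest finishes.
  const results = await Promise.all(durations.map((ms, i) => runTask(i, ms)));
  return { results, elapsedMs: Date.now() - started };
}

const durations = [100, 200, 150, 300];
const totalMs = durations.reduce((a, b) => a + b, 0);

const done = runAll(durations).then(({ results, elapsedMs }) => {
  console.log(
    `ran ${results.length} tasks in about ${elapsedMs} ms; sequentially they would take ${totalMs} ms`
  );
  return { results, elapsedMs };
});
```

Whether real invocations behave this cleanly depends on concurrency limits and cold starts, which this simulation does not model.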

~~~
dlisboa
So basically if I had 200 automations I could run them on 200 lambdas and have
them finish by the time the slowest one finishes? That's pretty awesome,
especially for testing. For many cases this would also fall under the free tier
since it's not that many requests/usage... it kinda seems too good to be true. Am
I missing something?

~~~
bald
> For many cases this would also fall under the free tier since it's not that
> many requests/usage... it kinda seems too good to be true

(I ran into the same trap a couple of weeks ago and ended up with USD
650 of unanticipated charges.) The free Lambda tier includes

* 1M requests

* 400k GB-seconds

--> the GBsec can be a serious bottleneck. Imagine you're running each Lambda
instance with 512MB RAM and each instance takes 2 mins to complete your test.
This means that one Lambda instance uses ~60 GBsec (0.5 GB x 120 s), meaning
you can execute ~6,600 of these instances per month to remain in the free tier.

Depending on how extensive your tests are/how often you run them, you might
run out of free GBsec well before you'd run out of the requests quota.
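
The same budget arithmetic as a small sketch, using the two free-tier quotas quoted above. The function names are hypothetical, and the code assumes a GB-second is metered as memory in MB divided by 1024, times duration in seconds.

```javascript
// Monthly free-tier quotas as quoted in this comment.
const FREE_GB_SECONDS = 400000;
const FREE_REQUESTS = 1000000;

// GB-seconds consumed by one invocation (assumes 1 GB = 1024 MB).
function gbSeconds(memoryMb, durationSec) {
  return (memoryMb / 1024) * durationSec;
}

// Invocations that fit in the free tier, limited by whichever quota runs out first.
function freeInvocations(memoryMb, durationSec) {
  const byCompute = Math.floor(FREE_GB_SECONDS / gbSeconds(memoryMb, durationSec));
  return Math.min(byCompute, FREE_REQUESTS);
}

const perRun = gbSeconds(512, 120);       // 60 GB-seconds per 2-minute run
const runs = freeInvocations(512, 120);   // 6666 runs per month
console.log(perRun, runs);
```

For a 512MB, 2-minute run the compute quota is exhausted after a few thousand invocations, long before the 1M request quota matters, which is the point being made here.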

~~~
penagwin
Granted, that means running ~220 instances per day, or 9 instances per hour
(taking 18 minutes per hour to run total). Now you're right, if you're running
a screenshot service then this will kill you real fast.

However, assuming an 8-hour work day you then get ~27 instances per hour. Each
test takes two minutes to run, so for a single user testing, assuming a code -
test - code - test routine, you'd be able to do that nearly continuously, for
8 hours a day, every day of the month (no weekends or days off). Seems safe to
assume that wouldn't occur.

------
polskibus
Can I use this somehow in a container? Farm out headless chrome + selenium on
a local data center? I would be grateful for any hints.

~~~
schickling
We haven't done so yet, but it shouldn't be a problem at all to set up
Chromeless in a container environment. Would you mind creating an issue here
to discuss this further?
[https://github.com/graphcool/chromeless/issues/new](https://github.com/graphcool/chromeless/issues/new)

------
hartator
Really awesome. Will definitely be useful for a project I am working on.

Any plans to support other languages besides JS?

------
colordrops
Can any headless browser solution be considered complete without GPU support?
A large percentage of sites use GPU enabled and accelerated features and
without GPU support, headless options are worthless to many applications.

------
pknerd
Wish it had a python wrapper.

~~~
schickling
There is no reason why you shouldn't be able to build one :) You should even
be able to reuse the existing Lambda backend.

------
fizixer
naming gore.

------
craptocurrency
Nice tool

Some API documentation says:

pdf() - Not implemented yet

Not implemented yet

How about you deprecate the API for now but reveal the purpose please.

------
megamindbrian
Cool. Now can you use it with webdriver functions instead of re-inventing the
API?

