
How to make Selenium tests reliable, scalable, and maintainable - mjswensen
https://www.lucidchart.com/techblog/2015/07/21/selenium-7-things-you-need-to-know-2/
======
stevebmark
Selenium tests are inherently slow, unreliable, and flaky. They have been the
bane of developers for every employer I've had. Do yourself a favor and write
React and test your components without a browser driver in good ol' JS with
the occasional JSDom shim. It removes almost the entire need for Selenium,
which should be reserved for only the faintest of smoke tests. And please, if
you have to use Selenium, use headless Firefox, because PhantomJS is very bad
software.

~~~
patio11
I had a Rails consultancy (Makandra) recently work on a JS-heavy application
that I happen to own, and they got Selenium singing on it, which had been
beyond my capabilities for years. One of their tricks, which you can inspect
the implementation of in their (public) utilities library [+], is using
basically a vendored Firefox per project and VNCing into that Firefox to drive
things around. It is thus off-screen and out of the way when you're using it,
but apparently is more true-to-reality than headless.

The test suite they wrote has about 600 tests, and while they're slower than
I'd like (2-3 minutes) they've been bulletproof since we got my dev
environment configured properly. It includes some fairly complicated
interactions, most relevantly around our calendar interface.

[+] [https://github.com/makandra/geordi](https://github.com/makandra/geordi)

~~~
logn
I had been using Firefox driver in Xvfb but wasn't happy with the
performance/stability. So I built a Selenium driver out of Java only (using
JavaFX's embedded WebKit) and used a headless JRE windowing toolkit (Monocle).
My project is still pre-release, but the headless capability, Java-only
system requirement, and AJAX handling might make it useful to some people:
[https://github.com/MachinePublishers/jBrowserDriver](https://github.com/MachinePublishers/jBrowserDriver)

~~~
boundlessdreamz
This looks very interesting!

1. How does it compare with PhantomJS?

2. What's the current WebKit version?

3. How often does the JavaFX WebKit update?

~~~
logn
1. Not quite sure. I've only used PhantomJS via Selenium's GhostDriver. From
that usage they're similar. The main difference is that my driver uses only
Java, so under the hood the JRE is launching WebKit through JNI and everything
runs in the same JRE process.

2. The current WebKit version depends on the JRE used. Oracle Java 1.8.0_45 has
WebKit version 537.44.

3. The Java maintainers update WebKit periodically, including within a major
version. E.g., here they update WebKit for the 1.8.0_60 JRE:
[http://openjdk.java.net/jeps/239](http://openjdk.java.net/jeps/239) ... Other
than that I'm not sure.

------
t0mbstone
I currently manage a rather large test suite (around 700 different tests)
using Selenium, which is all written in Ruby and Rspec (although I've also
used Cucumber), and uses the gems Capybara (an abstraction layer for querying
and manipulating the web browser via the Selenium driver) and SitePrism (for
managing page objects and organizing re-usable sections).

The entire suite runs in around 10 minutes on CircleCI, using 8 parallel
threads (each running an instance of the Firefox Selenium driver), and it is
rock solid stable.

It took us a while to get to this point, though.

The hard part is handling timing due to JavaScript race conditions on the
front-end. I had to write my own helper methods like "wait_for_ajax" that I
sprinkle in various page object methods to wait for any jQuery AJAX requests
to complete. I also use a "wait_until_true" method that can evaluate a block
of code over and over until a time limit has been reached before throwing an
exception. Once you figure out ways to solve those types of issues, testing
things with Selenium becomes a lot more stable and easy.
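
A rough Python sketch of those two helpers (the commenter's stack is Ruby/Capybara; these names mirror the ones above, and `wait_for_ajax` assumes the page loads jQuery, whose `jQuery.active` counter tracks in-flight requests):

```python
import time

def wait_until_true(predicate, timeout=10, interval=0.25):
    """Poll `predicate` until it returns a truthy value; raise if `timeout` elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(interval)

def wait_for_ajax(driver, timeout=10):
    """Wait until all in-flight jQuery AJAX requests have completed."""
    wait_until_true(
        lambda: driver.execute_script("return jQuery.active === 0"),
        timeout=timeout)
```

Sprinkled into page-object methods, helpers like these turn front-end race conditions into explicit, bounded waits instead of intermittent failures.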

I have also used the exact same techniques (page objects, custom waiter
methods for race conditions, etc) to test mobile apps on iOS and Android with
Selenium.

It can be a challenge, but once you have a system down and you know what you
are doing, it's not so bad.

------
paulddraper
The most annoying thing I found with Selenium was that it wouldn't wait for
the browser to respond to click events and rerender.

The approach in the blog post (and I think elsewhere ... not sure) is to poll
the DOM with a timeout.

Is there a better solution to be had with something like `executeScript`? You
could run `requestAnimationFrame`, and then poll for an indicator that the
click, etc. handler has indeed finished. That way if it fails, you know about
it pretty soon, without the need for long timeouts. This is all just a guess
though.

~~~
crdoconnor
>Is there a better solution

Yes. And it's pretty simple:

    
    
        WebDriver driver = new FirefoxDriver();
        driver.get("http://somedomain/url_that_delays_loading");
        WebElement myDynamicElement = (new WebDriverWait(driver, 10))
            .until(ExpectedConditions.presenceOfElementLocated(By.id("myDynamicElement")));
    

From:
[http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp](http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp)

~~~
drothlis
I'm not sure this satisfies your parent poster's requirement of: "if it fails,
you know about it pretty soon, without the need for long timeouts."

~~~
crdoconnor
Well, you need to have _a_ timeout.

You can make the timeout shorter when running the test on a dev environment,
though, so you get quicker feedback about errors.
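
One way to sketch that idea (assuming a hypothetical `TEST_ENV` environment variable; the name and values are illustrative, not a Selenium convention):

```python
import os

def wait_timeout(default=30):
    """Use a short explicit-wait timeout in dev so failures surface quickly."""
    if os.environ.get("TEST_ENV") == "dev":
        return 5  # fail fast on a developer machine
    return default  # generous limit for CI, where machines are slower

# With Selenium's Python bindings this would be used as, e.g.:
#   WebDriverWait(driver, wait_timeout()).until(...)
```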

------
bcjordan
Nice rundown, wish I had read this a year ago!

> One developer designed a way to take a screenshot of our main drawing canvas
> and store it in Amazon’s S3 service. This was then integrated with a
> screenshot comparison tool to do image comparison tests.

I would also take a look at Applitools
[https://applitools.com/](https://applitools.com/) — they have Selenium
webdriver-compatible libraries that do this screenshot taking/upload and offer
a nice interface for comparing screenshot differences (and for adding ignore
areas). Way fewer false failures than typical pdiff/imagemagick comparisons.

~~~
drothlis
If using Selenium's Python bindings, you can take a screenshot from Selenium
and convert it to OpenCV format like this:

    
    
        import base64
        import cv2
        import numpy

        png_bytes = base64.b64decode(driver.get_screenshot_as_base64())
        frame = cv2.imdecode(
            numpy.frombuffer(png_bytes, dtype=numpy.uint8),
            cv2.CV_LOAD_IMAGE_UNCHANGED)  # cv2.IMREAD_UNCHANGED in OpenCV 3+
    

(where `driver` is your WebDriver object, e.g. `webdriver.Chrome()`).

Then to match that frame against a previously-captured "template" image, you
can use stb-tester's[1] "match" function[2] which allows you to specify things
like the region to ignore and tweak the matching sensitivity.

[1] [http://stb-tester.com](http://stb-tester.com) [2] [http://stb-tester.com/stb-tester-one/rev2015.1/python-api#stbt.match](http://stb-tester.com/stb-tester-one/rev2015.1/python-api#stbt.match)

------
novocaine
Everyone in the blogosphere (and at my own company) writing non-app-specific
layers on top of Selenium suggests that there is scope for a higher-level
framework that can be used on top of Selenium, or that the Selenium API is too
thin a layer over WebDriver.

Does anyone know of such a project?

~~~
crdoconnor
I did the exact opposite. I ripped out some Robot Framework tests and replaced
the code with Python using Selenium WebDriver. Works great.

~~~
shicky
Can I ask why you decided to do this? Were the tests just flaky while in
Robot Framework world?

~~~
crdoconnor
I absolutely hated Robot Framework. The DSL was just horrible to use. It had
weird, unnecessary syntax quirks and gave you minimal information when
something failed (it wouldn't tell you which line number it failed on, for
instance).

The tests were also flaky as hell but that was more to do with poor
environment management. That, admittedly, was also easier to fix in python.

------
sjansen
Here's the presentation the post is based on:
[https://www.youtube.com/watch?v=5K6bwikZulI](https://www.youtube.com/watch?v=5K6bwikZulI)

------
bobm_kite9
The PageObjects tip is a really good one. Previously, using Selenium, you
would end up with a complete maintainability nightmare.

I used Geb on a recent project, and I actually felt that the tests I built
demonstrated a passable level of engineering discipline. However, Geb was
really hard to learn (partly because the error messages were confusing or
missing), and you're still on top of Selenium, so you still get wacky
exceptions and edge cases.

------
karlosmid
Switching from the Java to the Ruby ecosystem is one way to improve your
Selenium tests. For a start, use the watir-webdriver and page-object gems.

~~~
EdwardDiego
Improve them how though? Speed? Reliability? If it's just a nicer API, that's
all well and good, but until the key problems I face with Selenium are solved
(slow and non-deterministic tests) then a nicer API to it is just rearranging
deck-chairs on the Titanic.

~~~
karlosmid
It seems that you're trying to use Selenium 2 (WebDriver) to run your unit
tests. Selenium is for browser tests, and by its nature it cannot run in
milliseconds; its execution time is measured in seconds, even with the
PhantomJS WebDriver. It is an integration-testing approach, because it
combines the execution of several JavaScript modules that run in a real
browser. Selenium has its purpose, but fast test execution is not one of them.

~~~
EdwardDiego
> It seems that you try to use selenium 2, or webdriver, in order to run your
> unit tests.

Nope. Integration tests. But integration tests that start a Firefox instance
from scratch and have to be rerun multiple times to pass due to non-
determinism are slow.

~~~
karlosmid
Could you please provide one example of non-determinism? I would like to
understand what exactly YOU mean by that term.

------
defied
Some very good information in this article. It is true that Selenium has its
quirks; retrying a failed test can sometimes result in a passing test.

Disclaimer: I work for [https://testingbot.com](https://testingbot.com): at
my work we offer our customers automatic retries when a test fails. Writing a
Selenium test does take time, but once you run it in parallel across hundreds
of browser and OS combinations, it's worth it.

~~~
crdoconnor
>retrying a failed test can sometimes result in a passing test.

This is usually a sign of either a buggy test or buggy code.

------
labianchin
I wonder if there are stories about running Selenium tests in production.
Something along the lines of semantic monitoring
([http://www.thoughtworks.com/radar/techniques/semantic-monitoring](http://www.thoughtworks.com/radar/techniques/semantic-monitoring)).

------
shicky
Great to see a HN post on testing, they seem few and far between to me!

------
marktangotango
BrowserMob, that was a sweet service (based on selenium). Does anyone know
what happened to those guys after they sold? I've always wanted to learn more
about their story.

~~~
nirvdrum
I don't know about the entire team, but Patrick and Ivan were at Neustar for a
while. They're both at NewRelic right now.

------
derricki
I do find Selenium overly complicated, so thanks for the post.

~~~
mjswensen
There is a nice presentation at the bottom with some code examples, too.

------
gowan
tl;dr: have developers help maintain automation tests

~~~
NegativeK
Summarizing away a technical article makes for a not very useful summary.

------
ilovefood
Using it right now for my latest project, and it is a nightmare. I have 1,100
tests that have to run every night. I'm using PhantomJS. It is such a mess!

------
drothlis

      > getWithRetry takes a function with a return value
      > 
      >   def numberOfChildren(implicit user: LucidUser): Int = {
      >    getWithRetry() {
      >      user.driver.getCssElement(visibleCss).children.size
      >    }
      >   }
      > 
      > predicateWithRetry takes function that returns a boolean and will retry on any false values
      > 
      >   def onPage(implicit user: LucidUser): Boolean = {
      >    predicateWithRetry() {
      >      user.driver.getCurrentUrl.contains(pageUrl)
      >    }
      >   }
    

At first I didn't get the difference between `getWithRetry` and
`predicateWithRetry`, but then I noticed that the former throws an exception
whereas the latter returns false. I infer that `getWithRetry` will handle
exceptions thrown by the retried function.
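
My reading of those two helpers, sketched in Python (the article's code is Scala, and these timeout defaults are guesses rather than the article's actual implementation):

```python
import time

def get_with_retry(fn, timeout=10, interval=0.5):
    """Retry `fn` until it returns without raising; re-raise the last error on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return fn()
        except Exception:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)

def predicate_with_retry(fn, timeout=10, interval=0.5):
    """Retry `fn` until it returns True; give up and return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if fn():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```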

In stb-tester[1] (a UI tool/framework targeted more at consumer electronics
devices where the only access you have to the system-under-test is an HDMI
output) after a few years we've settled on a `wait_until` function, which
waits until the retried function returns a "truthy" value. `wait_until`
returns whatever the retried function returns:

    
    
      def miniguide_is_up():
          return match("miniguide.png")
    
      press(Key.INFO)
      assert wait_until(miniguide_is_up)
      # or:
      if wait_until(miniguide_is_up): ...
    

(This is Python code.)

Since we use `assert` instead of throwing exceptions in our retried function,
`wait_until` seems to fill both the roles of `getWithRetry` and
`predicateWithRetry`. I suppose that you've chosen to go with 2 separate
functions because so many of the APIs provided by Selenium throw exceptions
instead of returning true/false.

    
    
      > doWithRetry takes a function with no return type
      >
      >   def clickFillColorWell(implicit user: LucidUser) {
      >    doWithRetry() {
      >      user.clickElementByCss("#fill-colorwell-color-well-wrapper")
      >    }
    

Unlike Selenium, when testing the UI of an external device we have no way of
noticing whether an action failed, other than by checking the device's video
output. For example we have `press` to send an infrared signal ("press a
button on the remote control"), but that will never throw unless you've
forgotten to plug in your infrared emitter. I haven't come up with a really
natural way of specifying the retry of actions. We have `press_until_match`,
but that's not very general. The best I have come up with is `do_until`, which
takes two functions: the action to do, and the predicate to say whether the
action succeeded.

    
    
      do_until(
          lambda: press(Key.INFO),
          miniguide_is_up)
    

It's not ideal, given the limitations around Python's lambdas (anonymous
functions). Using Python's normal looping constructs is also not ideal:

    
    
      # Could get into an infinite loop if the system-under-test fails
      while not miniguide_is_up():
          press(Key.INFO)
    
      # This is very verbose, and it uses an obscure Python feature: `for...else`[2]
      for _ in range(10):
          press(Key.INFO)
          if miniguide_is_up():
              break
      else:
          assert False, "Miniguide didn't appear after pressing INFO 10 times"
    

Thanks for the article, I enjoyed it and it has reminded me to write up more
of my experiences with UI testing. I take it that the article's sample code is
Scala? I like its syntax for anonymous functions.

[1] [http://stb-tester.com](http://stb-tester.com) [2]
[https://docs.python.org/2/reference/compound_stmts.html#the-for-statement](https://docs.python.org/2/reference/compound_stmts.html#the-for-statement)

~~~
jarr416
Thanks for the comment. We actually originally had a waitUntil function that
was basically used for all three of the cases I mentioned above. In some
sections of the code, it was just there to eat errors, other sections get some
text, and yet others it was wrapped in an assert and needed to return a
boolean. This led to chronic misuse around the code (I found 4-5 tests that
simply forgot to wrap it in an assert, effectively rendering the test
completely worthless). The main benefit we got from splitting the methods out
was making it clear to developers what it did. Catching all the exceptions
thrown by Selenium instead of returning booleans was just an added benefit.

And you are correct, we are using Scala. There are some really cool things
about the language: case classes, pattern matching, first-class functions, and
traits, just to name a few.

~~~
drothlis
> This led to chronic misuse around the code (I found 4-5 tests that simply
> forgot to wrap it in an assert, effectively rendering the test completely
> worthless).

Yes, I've been bitten by that too -- it's too easy to forget the "assert".
This morning it occurred to me that I could write a pylint (static analysis)
checker to catch that, so I've done just that:
[https://github.com/stb-tester/stb-tester/commit/5e5bdbb](https://github.com/stb-tester/stb-tester/commit/5e5bdbb)
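
Not that pylint checker itself, but the core idea can be illustrated with Python's `ast` module: flag any `wait_until(...)` call used as a bare expression statement, i.e. one whose return value is discarded instead of asserted:

```python
import ast

def find_unasserted_wait_until(source):
    """Return line numbers where wait_until(...) is a bare statement."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        # An ast.Expr node is a statement consisting only of an expression,
        # so a Call here means the return value is thrown away.
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            func = node.value.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == "wait_until":
                offenders.append(node.lineno)
    return offenders
```

Calls wrapped in `assert` or bound to a variable are not `ast.Expr` statements, so only the forgotten ones are reported.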

------
mherrmann
I'm working for a startup that addresses this by means of a simple wrapper
API: [http://heliumhq.com](http://heliumhq.com). Human-readable tests with no
more HTML IDs, CSS selectors, XPaths, or other implementation details.

