
Consistent Selenium Testing in Python - cdubzzz
https://chrxs.net/articles/2017/09/01/consistent-selenium-testing/
======
stephen
FWIW the most consistent (as in non-flaky) selenium tests I've worked with
used hidden DOM elements to synchronize the selenium driver with the JS code:

[http://www.draconianoverlord.com/2011/10/14/sane-selenium-
te...](http://www.draconianoverlord.com/2011/10/14/sane-selenium-testing.html)

Basically every JS ajax call (or JS promise, though we didn't use any of those
at the time) would update a hidden DOM element's inner text, e.g. the
`#pendingAjaxRequests` inner text would go to 1, or 2, then back to 0 when the
AJAX response came back (ideally you do this instrumentation in a single place
in your app).

Then on the selenium side, after every button click/other action, we'd wait
until those counters went back down to zero.
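
In Python Selenium terms, that post-action wait might look something like this (a sketch only; it assumes the app maintains a hidden `#pendingAjaxRequests` element as described, and duck-types the driver so the polling logic is plain Python — selenium 4's `find_element` accepts the `"css selector"` strategy string directly):

```python
import time

def wait_for_ajax_settled(driver, timeout=10.0, poll=0.1):
    """Poll the hidden #pendingAjaxRequests counter until it reads "0".

    `driver` only needs a find_element(by, selector) method, so this works
    with a real selenium WebDriver or any stand-in.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        counter = driver.find_element("css selector", "#pendingAjaxRequests")
        if counter.text.strip() == "0":
            return True  # app has settled; safe to continue
        time.sleep(poll)
    raise TimeoutError("pending AJAX requests did not settle in %.1fs" % timeout)
```

You'd call this after every click or other action that might kick off a request.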

This post-action waiting (waiting for the app to "settle" after invoking
actions) vs. pre-action waiting (...well, is the element I want to click on
available yet? Hm...no...) was very reliable.

~~~
yebyen
What do you think of this approach instead:

[http://cabeca.github.io/blog/2013/06/16/waiting-for-
complete...](http://cabeca.github.io/blog/2013/06/16/waiting-for-completed-
ajax-in-capybara-and-other-tricks/)

When I did this type of workaround (I have not needed to write one of these
for a while), I found it was a problem for control flow that I needed to know
whether the action had actually started and finished, not just whether it was
currently in progress.

I used an .ajax-processing class that I added at the start of the request, and
removed when the request came back with a callback.

While I can see the advantage of pendingAjaxRequests instead (you could have
more than one going at a given time, and my .ajax-processing breaks down
there), it still seems like you could have a problem as the browser JS is not
running in the same process as your server or your test executor.

Your next line of code might run before the browser increments the inner text,
or starts the XHR. You could pass control flow on thinking that the event has
already completed, when in reality it hasn't even started yet. The 2013
article I linked seems to have a more robust model for tracking requests, it's
just two variables, but it would seem to handle this problem with perfect
reliability.
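
On the test side, the two-counter model can be used like this (a sketch; the counter names `ajaxEvents.started` / `ajaxEvents.finished` are hypothetical stand-ins for whatever the page's instrumentation actually defines). Snapshotting the counters before the action closes the race described above, where the test checks before the browser has even begun the XHR:

```python
import time

def ajax_counts(driver):
    # Counter names are an assumption; substitute whatever the page's
    # JS instrumentation really exposes.
    return driver.execute_script(
        "return [window.ajaxEvents.started, window.ajaxEvents.finished];"
    )

def do_and_wait_for_ajax(driver, action, timeout=10.0, poll=0.05):
    """Run `action`, then wait until at least one new request has started
    AND everything started has finished."""
    started_before, _ = ajax_counts(driver)
    action()
    deadline = time.time() + timeout
    while time.time() < deadline:
        started, finished = ajax_counts(driver)
        if started > started_before and started == finished:
            return True
        time.sleep(poll)
    raise TimeoutError("AJAX triggered by action did not complete")
```

Because both counters only ever increase, "finished caught up with started" is unambiguous, unlike a single class or flag that is added and then removed.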

~~~
stephen
The approach you linked to looks good to me. I hadn't seen that before; thanks
for sharing it!

> Your next line of code might run before the browser increments the inner
> text, or starts the XHR.

Yeah, that is true in theory, but we never had a problem with it--my mental
model was that if the Selenium process issued a click, and it got sent to the
browser, any "onclick" behavior would fire within that immediate event loop
(and for our app that's where we issued any AJAX calls, immediately within
that event loop) and tick up the hidden DOM count _before_ the webdriver code
sitting within the browser sends its response back to the Selenium client.

I've not examined the webdriver impls of the various browsers, but AFAICT my
theory matches what we saw at the time (in Chrome/Firefox with native events
enabled).

I'd be very interested to hear from someone authoritatively on whether this
theory is actually correct.

Also, if you have a JS-side persistence layer that does AJAX calls in separate
event loops, e.g. with setTimeout, then yeah I could see this being an issue
and would necessitate a more deliberate approach (like your linked post, which
I like).

~~~
yebyen
If your client talks to server, which talks to API, and your server is running
in a different process/machine than your client integration tests (in our
case, with Capybara.run_server=false) you will find this happening all the
time.

The next server operation isn't done until both the client-server and
server-API communications have finished. And in any case the API server is
likely to need to confer with a database server before it can return its
response, so it will take longer.

There are many good reasons not to run this way, but so far we have not found
any better way for our Windows developers to do tests locally than with
Capybara.run_server=false set in their environments. (They are using Vagrant
and a process from 3+ years ago, and our developer envs need a serious refresh
for 2017 imho...) And it helps for the argument that in real life, your
servers and your users will not be in the same thread.

If your stack is tightly coupled and your tests run strictly with both client
and server locked against a GIL, you will in all likelihood absolutely never
hit this case. Even if your server does actually hit remote services, but the
server and client thread in testing are joined by a lock, your client thread
will wait for the server to return and you will still probably never hit this
issue.

It is not a production-facing issue; the kinds of errors you can hit when your
testing infrastructure is arranged in a way that exposes these problems are
not errors you will ever see a user complaining about in production. In other
words, if you don't have this problem, don't go looking for it, because it's a
pain! It's been a pain for us, but by no means impossible to work past.

Our ruby developers on Windows found this happened a lot more frequently than
when I ran my tests on macOS (where I did not need run_server=false).

I'm running against a local selenium webdriver that was spawned by Capybara's
selenium driver natively. (In other words, it's cool when everything is housed
in a single process-thread. The way that any sane person would do their
testing deployment. But for my Windows users with Vagrant boxes, another way
is needed of course...)

We set our Jenkins server up to run similarly, decoupling the stack from the
selenium/test client; by applying resource constraints to make the right
parts slow, the issue was absolutely 100% reproducible for us.

------
gowan
you should skip this post. it is full of anti-patterns.

anti-patterns

(1) implicit wait. this will create subtle differences in behavior between
drivers. it will also create long pauses when you test for negative
conditions.

(2) clear. selenium has built in support for clear[1]. in addition to clear
you can send the null key[2] if you want to clear the input midway through a
sequence of characters.

(3) time wait. this does not make any sense to me. seems like a clever way to
add time.sleep.

[1] [https://www.w3.org/TR/webdriver/#element-
clear](https://www.w3.org/TR/webdriver/#element-clear)

[2] [https://www.w3.org/TR/webdriver/#element-send-
keys](https://www.w3.org/TR/webdriver/#element-send-keys)

~~~
cdubzzz
I agree that implicit wait is something of a time waster.

Regarding clear, that does appear to be exactly what the Python implementation
does[0], but in my experience it just seemed to fail at random. If I remember
correctly it worked most of the time with geckodriver, and then hardly ever
with chromedriver. Really not sure what exactly the deal is with that...

And yes the time wait is a little pointless (:

[0]
[https://github.com/SeleniumHQ/selenium/blob/master/py/seleni...](https://github.com/SeleniumHQ/selenium/blob/master/py/selenium/webdriver/remote/remote_connection.py#L266)

~~~
Elhana
clear() works for me consistently. At the same time, I had cases before where
send_keys was not sending the complete string, so I had to compare it with
get_attribute('value') to be sure.
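
A verified variant of that workaround might look like this (a sketch; the element is duck-typed so it matches selenium's `clear()`/`send_keys()`/`get_attribute()` interface, and the retry count is arbitrary):

```python
def send_keys_verified(element, text, retries=3):
    """Type `text` into an input, retrying until get_attribute('value')
    confirms the whole string actually landed."""
    for _ in range(retries):
        element.clear()
        element.send_keys(text)
        if element.get_attribute("value") == text:
            return True
    raise AssertionError("input never reached expected value %r" % text)
```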

------
alistproducer2
I've been working with selenium a lot at work recently. There are lots of
little nuances to grok if you want your tests to run completely
deterministically, especially on a JavaScript-heavy site.

~~~
danidiaz
I try to use explicit waits whenever possible. Luckily, the Java client
library provides a rich set of combinators for declaring them.

You can also define your own; they are basically functions from a WebDriver
object to some other thing (usually a Boolean).

[http://seleniumhq.github.io/selenium/docs/api/java/org/openq...](http://seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/support/ui/ExpectedConditions.html)
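
The Python client works the same way: an expected condition is just a callable that takes the driver and returns something truthy (or False). A custom one might look like this (a sketch; the second function is a minimal stand-in for `WebDriverWait(...).until(...)` polling so the example runs anywhere):

```python
import time

def element_count_at_least(css, n):
    """Custom expected condition: truthy once the page has at least
    `n` elements matching `css` (driver needs find_elements)."""
    def condition(driver):
        found = driver.find_elements("css selector", css)
        return found if len(found) >= n else False
    return condition

def wait_until(driver, condition, timeout=10.0, poll=0.1):
    """Minimal stand-in for selenium's WebDriverWait(...).until(...)."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = condition(driver)
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError("condition not met within %.1fs" % timeout)
```

Returning the found elements (rather than just True) mirrors how the built-in conditions hand you the element once it appears.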

~~~
cdubzzz
Oh wow, Java gives you a lot! Python's are actually pretty decent as well, I
just came to rely almost entirely on "presence_of_element_located" with our
Javascript-heavy app.

------
yebyen
The best thing you can do if you are developing Selenium tests in ruby (but
these lessons are probably applicable no matter what environment you're using)
is learn how they changed Capybara. There was a big shift in Capybara around
the time that "should" went out of the vocabulary.

What that means syntactically for developers is less important than the
semantic change that came with this shift. Tons of the old cheat sheets are
still based on the old syntax, and many of them, if you dig deep, will also
give bad advice!

So many people try browser testing and give up because in some ways it's hard
to make the testing consistent. Especially if you use onClick events that load
another page, it's so easy to forget that you might or might not have loaded
that page yet when the next line of code runs! That browser-side JavaScript
has now made your integration multi-threaded.

The hardest one to figure out is when you grab a reference to something on the
current page that looks like the element you want, but it's on the previous
page... it's not what you want.

Then, when you get around to finding nodes within it or clicking on it,
you've loaded the page you want, but your reference is pointing to a node
that is no longer present in the active DOM, because you're on the new page
now!

If you're composing reusable steps (like in Cucumber), always find a way to
make sure you've already loaded the DOM of the page you think you're on
before you get any references to nodes on the page. Even if you have to put
#some-target-page-node on the target page so you can

expect(page).to have_selector('#some-target-page-node')

That will prove you've landed on the target page, and if you find it sometimes
takes a long time, or if it's longer than the timeout, set 'wait' to a longer
number of seconds:

expect(page).to have_selector('.some-node', wait: 90)

If your page has waits that long of course, it's most likely going to start
affecting your conversion rates, so do something about that...
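
For anyone following the OP's article rather than Capybara, the same guard translates to the Python client roughly like this (a sketch; `#some-target-page-node` is the hypothetical marker from above, and driver/button are duck-typed so the logic is plain Python):

```python
import time

def click_and_await_page(driver, button, marker_css, timeout=10.0, poll=0.1):
    """Click, then refuse to touch the DOM until a node unique to the
    target page exists -- only then grab fresh references.  Any reference
    taken before the click may still point into the old page's DOM."""
    button.click()
    deadline = time.time() + timeout
    while time.time() < deadline:
        if driver.find_elements("css selector", marker_css):
            return driver.find_element("css selector", marker_css)
        time.sleep(poll)
    raise TimeoutError("target page marker %r never appeared" % marker_css)
```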

~~~
yebyen
To clarify...

    
    
        set 'wait' to a longer number of seconds:
        expect(page).to have_selector('.some-node', wait: 90)
        If your page has waits that long of course
    

Wait is different than sleep. Wait retries until the condition is met. This
makes it a little bit confusing for "negative presence" tests where you are
testing for the absence of a thing, but it ensures that you are not waiting a
full 90 seconds for a thing to happen, when that thing has already happened.

This is the greatest, most noticeable effect of the "shift." No more "until".

This is something like the explicit/implicit waiting described by the OP's
article. Although Capybara is now careful to avoid "wait until", you can
still get this behavior using expect statements as I showed above. It's much
cleaner in capybara[1] nowadays, and I'm actually surprised people work with
Selenium WebDriver directly in Python by comparison.

You can also wait explicitly for the AJAX events to complete, although I never
do this[2] because it seems to introduce a new source of error: the AJAX
completes before we get around to observing that it had started.

This article seems to provide a bullet-proof way to do it though, upon review,
because it's simply counting with JS variables in the DOM. You can even start
multiple AJAX requests at once and wait for all of them to complete, or just
go ahead and move on if you can see they already completed when you first
check. No waiting. This is more reliable than adding and removing the ".ajax-
processing" class to a hidden element, which you must be careful to observe so
you know your AJAX has started, and then observe again as it is removed so you
know it has completed. The method in [2] is a bit less "semaphore-ish" and
looks more reliable than ways I've used.

Another nice trick is to remove another source of threaded racing: your
animations. If you have a test that triggers an animation and you want to
disable animations, run once you get to the page:

page.execute_script("$.fx.off=true") # or jquery.fx.off=true

This runs on the DOM though, so it must be run on every page that loads if it
is going to be effective. There are also ways[3][4] to get this script to run
on every page rather than running it manually wherever needed; I haven't
tried this, but I'd recommend it if you don't want to spend time tracing down
the source of each delay by hand.
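
In Python the same one-liner would be something like this (a sketch; the guard covers pages that don't load jQuery at all):

```python
def disable_jquery_animations(driver):
    """Turn off jQuery animations for the currently loaded DOM.
    Must be re-run after every page load to stay effective."""
    driver.execute_script("if (window.jQuery) { jQuery.fx.off = true; }")
```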

I feel like there were about 5 things I had to learn to make my tests
reliable, and I am still applying them daily. But once I had been through
"tracking down the Heisenbug" enough times, I could avoid writing 98% of the
bad tests; and for the remaining 2% that inevitably show up in my test suites
as "failed once, then the failure went away with no code changes", it simply
hasn't been hard to track down the source of the failure anymore, as I now
know really well what I'm looking for and what causes this type of failure.

I personally got a bit out of this article, but I'm not sure anyone is going
to read this article and just know better how to do browser testing. I think
you have to go through this experience of accidentally writing sometimes-
unreliable tests, and then figuring out how to fix them. I did not learn
without the help of articles like this, though.

The last great advice I found was to learn to mock API responses if your
application uses external APIs, but don't over-rely on mocks. We have lots of
tests that do hit our APIs and will fail if they are down. But we also have
tests that depend on some difficult to replicate condition being met by the
API servers. Those mock tests in some cases are more dangerously brittle than
the tests that actually hit the API.

My favorite was added recently: the test that shows what happens if the API
goes down in the middle of your overnight job. Hint: You want the job to stop
running at the first sign that the API has gone down, and trigger alert mail
or otherwise signal an error that you can find later.

With WebMock, this test was easy to write! Just send back a 500 error at a
time when you know you weren't expecting it.
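
The equivalent idea in pure Python, without WebMock, could be sketched like this (all the names here -- `run_overnight_job`, `fetch_page`, the stub -- are hypothetical; the point is that the stub API starts returning 500 partway through, at a moment the job wasn't expecting):

```python
class ApiDown(Exception):
    """Raised at the first sign the API has gone down."""

def run_overnight_job(fetch_page, page_ids, alert):
    """Process pages until a non-200 response, then alert and stop
    immediately instead of grinding through the rest of the job."""
    done = []
    for page_id in page_ids:
        status, body = fetch_page(page_id)
        if status != 200:
            alert("API returned %d on page %s" % (status, page_id))
            raise ApiDown(page_id)
        done.append(body)
    return done
```

Injecting `fetch_page` as a parameter is what makes the mid-job failure trivial to simulate: the test just hands in a stub that goes down after N calls.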

You can usually use factories to replicate other less unusual conditions, but
when your app depends heavily on external APIs, this trick can help greatly to
simplify your tests. This particular test would have been impossible to write
without mocks. I would not mock more than 50% of API tests though, because it
is equally valuable to find out when an unrelated change, deployed through
your CI environment to one of your APIs, will break some part of your
application unexpectedly. This is why I do not mock 100% of API-related tests.

[1]: [https://www.varvet.com/blog/why-wait_until-was-removed-
from-...](https://www.varvet.com/blog/why-wait_until-was-removed-from-
capybara/)

[2]: [http://cabeca.github.io/blog/2013/06/16/waiting-for-
complete...](http://cabeca.github.io/blog/2013/06/16/waiting-for-completed-
ajax-in-capybara-and-other-tricks/)

[3]: [https://makandracards.com/coffeeandcode/7503-disable-
jquery-...](https://makandracards.com/coffeeandcode/7503-disable-jquery-
animations-during-rails-tests)

[4]:
[https://gist.github.com/keithtom/8763169](https://gist.github.com/keithtom/8763169)

------
_pRwn_
... and when you're done with this beginner stuff ... use the de facto
industry standard [http://robotframework.org/](http://robotframework.org/)

------
noir_lord
I wrote a shim layer over selenium using Python's test infrastructure so I
can do

    
    
        self.clickPseudoButton('foo')
        
        self.setInputById('fizz', 'buzz')
    

It also maps common operations against the structure of my legacy app; it's
way better than using the lower level API.

Tests look like pseudo code and are way more grokkable.

~~~
cdubzzz
Your code available anywhere? (:

~~~
noir_lord
It's nothing that impressive but I'll post it on Monday when I'm at work;
it's really just a bunch of helper methods to reduce boilerplate.

~~~
cdubzzz
Yeah, I would just be curious to see how you organized it. I kept meaning to
step back and work on abstracting things but always got sidetracked with
other stuff. It gets real long and messy without that.

