Show HN: Helium – Lighter web automation based on Selenium (github.com/mherrmann)
191 points by mherrmann on Feb 25, 2020 | 62 comments



I really appreciate the frankness in the paragraph that starts with this: "I have too little spare time to maintain this project for free. If you'd like my help, please go to my web site to ask about my consulting rates."

I hope drawing a clear boundary like that reduces some of the entitled nonsense that can happen around open-source projects.


> I hope drawing a clear boundary like that reduces some of the entitled nonsense that can happen around open-source projects.

Case in point: Rust's actix web

https://www.theregister.co.uk/2020/01/21/rust_actix_web_fram...

https://www.reddit.com/r/rust/comments/epoloy/comment/feksq2...

Regardless of where you stand in the soundness telenovela, it boggles the mind that so many consumers of the FLOSS project felt entitled to demand access to a well-maintained project for free, and complained that even forking it is not a valid option, because they expected someone else to spend their personal time for their personal benefit.


I also think this is a good idea. There are companies willing to pay millions for extremely ineffective development; asking the designer/creator of the product for a couple of hours or days of help has a much higher chance of success in terms of return on investment.


@mherrmann What would be the comparison / difference with SeleniumBase https://github.com/seleniumbase/SeleniumBase ?

I'm always on the lookout for browser automation using Python so excited to try this framework. I was a big fan of https://github.com/miyakogi/pyppeteer but it doesn't seem to be maintained anymore.

There is another project by Microsoft called Playwright, though it's JS-only for now. They have no plans for Python bindings at the moment: https://github.com/microsoft/playwright/issues/1043


Has anyone had rousing success with automation tests like this? I almost always find the ratio of maintenance to issues caught to be unacceptably high, especially when compared to simple unit tests.


As much as I love unit tests (being simpler and running more than 1000 times faster than web tests) there is a place for basically all types of tests in a CRUD application. 100% unit test coverage isn't going to help you if you can't actually spin up the whole system and serve requests over high-level protocols.

It's all about ROI, and ROI is huge for simple high-level smoke tests.


My experience with Selenium is that the tests broke more often than the UI, whether through timing issues, DOM changes, or framework changes. The binding is on a layer that humans don't see, and that makes it fragile.


This was my experience to a T, but then we hired a guy to solely manage the tests instead of having the developers do it. His tests caught a few false positives at the start, but over time we were able to eliminate a 2-day+ crawl over the site by QA and UAT. My takeaway from this (which was the same argument I made back when we were failing to produce anything even usable, let alone reliable) is: it needs to be someone's sole goal. I'll go a step further and say they should be outside the team developing the main app. Being too close to the underlying code makes for a bad QA/automated tester, IMHO. I have no problem even being the person who manages/writes the tests, but it's very hard to do both in my experience.


As someone who's worked in the UI automation field for approaching 10 years, getting it to not break takes a little of (1) knowing what actual user interactions look like, (2) knowing how UIs commonly break, (3) knowing how the underlying UI framework commonly breaks / what shortcuts programmers are apt to take (e.g. simulated async in VB6), (4) knowing how events and code interact in a running system, & (5) striking the proper balance between performance and reliability.

Honestly, though, 75% of automation errors are timing missteps -- automation dispatching events at a specific millisecond that a human never would.


I was recently involved in my first "integration testing" project involving Selenium, Winium, and ssh (doing cross-platform testing of an installer).

One of the things that really struck me as I was reading through examples and such was how common it seems to be to have test code like "Click x; Wait 5 seconds; Click y; Type 'abc'; Click Z". My experience with unit testing tells me that as tests break 'randomly' (because something occasionally takes a few milliseconds too long to load) those delays get inserted, and eventually you end up with slow tests that are harder to maintain and still flaky.

In my project, we were installing on a variety of OSes and versions, and based on what dependencies had to be downloaded+installed it could take anywhere from a few seconds to a couple minutes. I ended up writing a lot of helper functions like `WaitForFile(filename, absoluteTimeout)` that basically run a retry loop, checking every couple seconds (to allow fast-as-possible pass/fail), but frankly was surprised that this was non-standard -- I kind of expected any "integration test assertion library" to be chock full of helpers like this.
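A minimal sketch of that kind of retry-loop helper in Python (the name and signature mirror the commenter's hypothetical `WaitForFile`, not any real library API):

```python
import os
import time

def wait_for_file(path, timeout_seconds, poll_interval=2.0):
    """Poll for a file's existence, returning as soon as it appears.

    Passes as fast as possible (no fixed sleep on success) while still
    bounding the total wait by an absolute timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        # Sleep for the poll interval, but never past the deadline.
        time.sleep(min(poll_interval, max(0.0, deadline - time.monotonic())))
    return os.path.exists(path)
```

The same shape works for any slow external condition (a service port opening, a registry key appearing): poll cheaply, pass early, fail only at the absolute timeout.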

I think there's a lot of ground to be covered in improving integration testing in general, and especially bringing in some more modern software design and engineering principles. Unit and integration tests are still code after all -- just because it doesn't ship with the product doesn't mean it shouldn't be maintained at the same quality.


> WaitForFile(filename, absoluteTimeout)

The Java bindings have support for this in the form of FluentWait#until http://javadox.com/org.seleniumhq.selenium/selenium-support/...

Using explicit waits for expected conditions (instead of static waits or implicit waits) helps with making the tests more robust. It's a bit more work though.
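For readers outside the Java ecosystem, the `until` pattern boils down to: re-evaluate a condition on a polling interval, treat falsy results (and optionally certain exceptions) as "not yet", and raise once an absolute timeout elapses. A plain-Python sketch of that idea (an illustration, not Selenium's actual implementation; all names are hypothetical):

```python
import time

class WaitTimeoutError(Exception):
    """Raised when the condition is not met within the timeout."""

def until(condition, timeout_seconds=10.0, poll_interval=0.5,
          ignored_exceptions=()):
    """Re-evaluate `condition` until it returns a truthy value.

    A truthy result is returned immediately; falsy results and ignored
    exceptions trigger another poll; WaitTimeoutError is raised once the
    absolute timeout elapses.
    """
    deadline = time.monotonic() + timeout_seconds
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored_exceptions as exc:
            last_error = exc  # condition not ready yet; keep polling
        if time.monotonic() >= deadline:
            raise WaitTimeoutError(
                f"condition not met in {timeout_seconds}s"
                f" (last error: {last_error!r})")
        time.sleep(poll_interval)
```

This is more work per assertion than a static sleep, but the test passes as soon as the condition holds and only pays the full timeout on genuine failure.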


Wait for five seconds?

That's the first mistake.

You should always wait for some kind of state change, you're just building in fragility.


This. If your framework only does timing through pausing, find a new framework.

That's 20 years out of date.


Cypress is pretty good for this. Every command is queued up and sort of deferred. You write tests in a synchronous manner, and it has retryability built in, so it will wait for an assertion to be truthy before continuing (with a max timeout, of course).

It's been working pretty well for me so far.


I had the same experience on projects where actual testing was not a priority, just "dumb" unit testing (testing dummy functions just to hit 100% coverage...).

But I worked on a project where testing with Selenium/Cypress was an important requirement, so the whole codebase was designed to be testable, and this made all the difference. Once you are careful to design your code to be automatically testable (which, most of the time, is just applying good practice anyway), it yields very good results and catches a lot of regressions/bugs.

Unit tests, on the other hand, were letting a lot of problems pass through...


> The binding is on a layer that humans don't see, and it makes it fragile.

And that seems to be part of the point of this project. From the description:

> In Selenium, you need to use HTML IDs, XPaths and CSS selectors to identify web page elements. Helium on the other hand lets you refer to elements by their user-visible labels.

The example script uses a bunch of user-visible labels, not classes, IDs, ancestors, etc. It uses things like click('Sign in'), the way you would talk a user through an interface.


This works fine until it does not, for example when dealing with custom UI controls, which is when reality sets in and you find yourself wrangling HTML IDs, XPaths and CSS selectors again.


In the worst cases, I always tell people to bind to user-visible elements, then only navigate the local DOM from there.

Global XPath type traversal is fragile, unless you're plugged in enough that the dev team lets you know when they break it.


There is a learning curve. Your devs need to learn how to make things easier for the testers and with experience the testers can work around most of the issues you are describing.


DOM changes should break the test (although you can write the tests not to be so tightly bound to the DOM), just like API changes should break tests of the API.


At this point I might do it for greenfield work.


Like others, I've had them be very useful as high-level smoke tests. For making sure the details work, I test at lower layers. But without some external-perspective smoke tests, I find that I'll just end up doing them manually to make sure things really still work.


We had them, then stopped maintaining them when the development speed was too high.

Now (years later) we have a subset of the original tests to ensure the most crucial processes are regression tested.

My experience is that it's a lot harder for these tests to find the right balance and depth than for unit or even other integration tests.

As already said, it's good to consistently have ids on relevant elements, as otherwise the test code complexity grows significantly.


"As already said, it's good to consistently have ids on relevant elements, as otherwise the test code complexity grows significantly."

Life is definitely much easier if the devs set up their pages to be automation friendly. I view this as a prerequisite. Otherwise tests can get nightmarish complexity.


Here's looking at you, Material UI. (And almost any vendor-GUI-framework-in-a-box solution) :/


Ever heard of the testing pyramid?

https://martinfowler.com/articles/practical-test-pyramid.htm...

A few UI tests at the top of the pyramid do help.

We moved from Selenium to Cypress because of flakiness in Selenium, and because most browsers run the same engine under the hood (Cypress is Chrome-only).


Cypress actually _just_ released Firefox and Edge support a couple weeks ago:

https://www.cypress.io/blog/2020/02/06/introducing-firefox-a...

The Edge support kinda came for free thanks to its switch to the Blink engine, but the FF support had been under development for a long time.


Yes, we have a Selenium-based UI test suite at Mixpanel that's overall quite stable and runs all tests across several browsers. It tests strictly the front-end code, with API calls stubbed out, which is the biggest factor in maintaining fast deterministic tests. By way of contrast we have an older end-to-end Selenium suite that spins up the entire service cluster, and that is the usual flaming pile of slow flakiness associated with Selenium. One of our engineers did a writeup on it a few years ago (now somewhat outdated but the overall picture is still accurate): https://engineering.mixpanel.com/2018/10/31/the-state-of-ui-...


I work in medical devices, and we have to do UI testing whether we like it or not. Automated UI testing can be very hard and has a steep learning curve, but the payoff is huge. I can see how maintaining such tests in a very fast-moving environment would be a pain. Thank God we are much slower.


I've had limited success. IME it's critical to make them as deterministic as possible without being too specific. Non-random test data coupled with 'id' attributes on interacted elements help a lot. Beyond that each webdriver abstraction has quirks one must learn.

Animations and rendering races are sticking points too. Usually it's possible to put in some strategic waits to get consistent results.

Lastly getting them running in parallel can be tough, but the payoff is usually worth it if the number of tests must grow beyond a few dozen.


If your UX designers can stomach it, simply ensuring async actions have a defined signpost to signify completion is the most helpful. E.g. an element that shows up when results are finished loading on the page.

"Spinner that disappears" isn't terrible, but isn't great either.

"Everything looks the same, and then some results load, and more results might load at some point, maybe" is sadly not that rare.


I have found that they're especially good at spotting embarrassing mistakes -- where you've broken something really obvious to every user, such as outright crashes in common use cases. Such things make users really angry: "How could you not have spotted this? How can I trust your product at all?"

The ratio of maintenance to issues remains very high, but the kinds of issues that it can catch put a thumb on that ratio.


I've always been of the opinion that GUI tests are the best, except that the expense of running them means you can't cover as many cases as you can with unit tests.

At any rate there are of course issues that can be caught in GUI tests that cannot be caught in unit tests.


I work at Testim.io, we record this sort of thing. It works pretty well but it's pretty expensive to do automation well. We work with Microsoft, Wix, JP Morgan and a bunch of other large companies successfully for years now.


We have, but I work in banking (basically), and we have a group solely dedicated to testing. So they're consistently maintaining them.


Very cool project OP: it looks like it gets rid of the major pain points I've had with Selenium. In particular, smarter handling of iframes and the better explicit waits are huge wins over vanilla Selenium. I'm glad you decided to release this, especially since you chose to use a permissive license.


I don't see people talk about TestCafe, which is open source and is an actual alternative, without being dependent on WebDriver at all.


I haven't personally used it yet, but my team has been really happy with TestCafe. The tests seem much less flaky than Selenium.


What’s the relationship between Helium and HeliumHQ? They look like similar products.

https://heliumhq.com/


HeliumHQ used to be the commercial offering of Helium. We shut down that company at the end of 2019. The GitHub link you see here now is the same code, modernized and open sourced. See the history section at the bottom of the GitHub README.


Why didn’t the commercial offering succeed?


Developers don't like to pay for software libraries so it was an uphill battle. My co-founder (our "COO") moved on because he got a very lucrative offer. I wanted to buy out his 50% of the business, he asked for a 40x multiple of yearly revenue. (For context, online businesses usually go for 2-3x.) In this stalemate, I did not want to continue growing the business with him owning 50% of it. Now that revenue has, through years of inactivity, died down to essentially 0, he agreed to let me open source it.


I've been using SikuliX for GUI testing... It has GUI and OCR support; I wonder if Helium could be used in SikuliX scripts.


That reminds me of this: https://news.ycombinator.com/item?id=22374991

Those tools will perhaps be useful for automated testing, accessibility, etc.


I created a blog post on this:

https://kevintuck.co.uk/browser-automation-with-hellium/

Great little tool!


Awesome, thanks!


This appears to be very similar to Capybara in Ruby, which I've used extensively. I was under the impression that implementations of Selenium were as high-level as this in every language.


The nice thing about Capybara and other "in stack" tools is you have tight integration with the database, so setting up test data and resetting it is easy. And for the tests that don't require JavaScript, the tests can still be made to run quite fast.


Has anyone managed to log in to Facebook or Instagram using it?


AFAIK Selenium is easily recognizable as it inserts itself in the DOM, so I would guess that they will block you pretty quickly.

If you're interested: https://stackoverflow.com/questions/33225947/can-a-website-d...


OK, so I did a few tests with FB using existing account data and managed to change my status without getting banned. I will continue testing, as it would be a very convenient way to post status updates without having to actually use FB.


Pretty soon people are going to run out of elements to name their products after. Naming things after letters in the alphabet is a better strategy.


Oh, hey, I basically started writing something exactly like this when I started using Selenium to script some test cases.


Same here.

A lot of the Selenium API is quite low-level (speaking of the Java variety).

It would be wonderful if this and similar efforts ended up in the core Selenium project, keeping the low-level options but providing an easier-to-use layer on top.



+1, I'm in the same boat with php-webdriver (née Facebook WebDriver), where waiting, some sendKeys magic, and clickAndWait should always be present.


This is great news, especially for Selenium users. Helium is a great tool.


Your "thanks" is really just a cheap way to advertise your own solution ;-)

(Edit: zabil removed the link to his project.)


Awww, straight to the point and bullseye :)

I'm glad someone wrote what I had thought :D


Why can't it directly use chrome devtools protocol?


The reason why it doesn't use the Chrome DevTools Protocol is that it (Helium) is a wrapper around Selenium WebDriver [0].

One advantage of wrapping WebDriver, rather than using something like the Chrome DevTools Protocol, is that WebDriver has an interface specified by a W3C standard [1] [2] and can be implemented for any browser. The Chrome DevTools Protocol (obviously) only works with browsers based on Chrome.

[0] https://www.selenium.dev/documentation/en/webdriver/

[1] https://www.w3.org/TR/2018/REC-webdriver1-20180605/

[2] https://github.com/w3c/webdriver


Apparently the reason one would keep using Selenium over headless Chrome (perhaps via a wrapper like Puppeteer [1]) is that Selenium works across many different browsers, not just Chrome.

"WebDriver can be used with all major browsers. Automate real user interactions in Firefox, Safari, Edge, Chrome, Internet Explorer and more!" [2]

1: https://github.com/puppeteer/puppeteer

2: https://www.selenium.dev/projects/



