
Remote-Browser – A browser automation framework based on the Web Extensions API - foob
https://github.com/intoli/remote-browser/
======
mixologic
"The central idea behind Remote Browser is that there's no need to reinvent
the wheel when modern browsers already ship with an extremely powerful cross-
browser compatible API that's suitable for automation tasks."

That's already what the WebDriver API accomplishes, and it's a W3C standard.
[https://www.w3.org/TR/webdriver/](https://www.w3.org/TR/webdriver/)

People still call it 'Selenium' for whatever reason, but
chromedriver/geckodriver/edgedriver are all implementations of the protocol
for each existing browser (Safari's might be built in?).

Granted the web extensions API would likely allow for some additional powerful
options, but at the cost of some missing features.

~~~
celerity
Selenium is just a tool built on top of the WebDriver API. One of its main
disadvantages is needing to run a complicated proxy program (like geckodriver,
ChromeDriver, etc.) built individually for each browser in order to drive your
instance. As a result, users sometimes suffer from hard-to-debug edge cases
and other pain points.

They also make interacting with JavaScript on the page a bit painful. For
example, injecting JavaScript into the browser with Selenium can be quite an
ordeal [1], so you're somewhat limited in what you can do by what Selenium's
developers decided to focus on. It also complicates deployments by adding
another moving part to the overall equation.

In contrast, the Web Extension API is now part of all major browsers, and
makes interacting with different page contexts effortless. To give a sense of
the project, we wrote an interactive tour of Remote Browser which runs browser
instances on our backend [2].

[1]: [https://intoli.com/blog/javascript-injection/](https://intoli.com/blog/javascript-injection/)

[2]: [https://intoli.com/tour/1](https://intoli.com/tour/1)
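
For readers unfamiliar with the driver proxy described in this comment, here is a minimal sketch of what a WebDriver client sends to a program like geckodriver. It only builds the HTTP requests for the W3C "New Session" and "Execute Script" commands and never sends them. The local address, port, and session id are assumptions for illustration, and the helper names are hypothetical, not part of Selenium or Remote Browser.

```python
import json
import urllib.request

# Assumption: a driver such as geckodriver is listening locally on this port.
DRIVER_URL = "http://127.0.0.1:4444"


def webdriver_request(path, payload):
    """Build (but do not send) a JSON POST request for a WebDriver endpoint."""
    return urllib.request.Request(
        DRIVER_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def new_session(browser_name):
    """New Session command: ask the driver to start a browser instance."""
    return webdriver_request(
        "/session",
        {"capabilities": {"alwaysMatch": {"browserName": browser_name}}},
    )


def execute_script(session_id, script, args=()):
    """Execute Script command: run JavaScript in the current page."""
    return webdriver_request(
        f"/session/{session_id}/execute/sync",
        {"script": script, "args": list(args)},
    )


if __name__ == "__main__":
    # The session id here is a placeholder; a real one comes from New Session.
    req = execute_script("hypothetical-session-id", "return document.title;")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Every command goes through the driver process as an HTTP round trip, which is the extra moving part the comment refers to.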

~~~
alceta
You're writing that users suffer from edge cases from the individual webdriver
implementations. My experience as someone working with gecko and chromedriver
on a daily basis is that the number of edge cases stemming from browser
behavior (such as moving to, clicking, and focusing on elements) is a much
more frequent pain than differences in the webdriver implementation.

------
scottfr
The Web Extension API doesn't seem too suitable for browser automation as it
doesn't have support for simulating key presses or mouse movements (other than
HTMLElement's click method).

Yeah, you can simulate events, but that can be a lot of work (e.g. typing an
'a' key might require you to simulate all of keydown, keypress, and keyup and
set various non-standardized properties on them). And that won't even work in
a standard text input as isTrusted is set to false on events you generate.
Simulating something like a Tab key press will require you to write code that
tries to replicate your browser's logic in determining what the next element
should be.

Why would this be preferable over something like Selenium?

~~~
timwis
But we can just create/use a JS library that simulates typing. It would work
for an app that needs it, or on your automation environment because it's just
using the DOM API. Is Selenium doing something special beyond triggering those
events?

~~~
scottfr
The point is you can't accurately simulate typing without something like
Selenium. Neither the standard DOM API nor the additional features provided by
the Web Extension API allow you to do so.

You can simulate some parts of typing by mimicking browser behavior on a case
by case basis, but there are places where this will be strictly impossible.
For example, if you are working with anything that checks the isTrusted event
bit, you're out of luck as there is no mechanism for you to set that to
true.

Selenium on the other hand is actually triggering events as if a person
physically triggered them. So, for instance, the isTrusted bit will be set to
true when you use the send_keys method in Selenium.

~~~
timwis
Interesting. Do you know how Selenium does that?

~~~
jarvuschris
Looks like it's a feature of the WebDriver API:
[https://www.w3.org/TR/webdriver/#element-interaction](https://www.w3.org/TR/webdriver/#element-interaction)
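
As a rough illustration of that part of the specification, the sketch below builds the request bodies for the W3C "Element Send Keys" and "Perform Actions" commands. The function names, the session and element ids, and the input source id "keyboard" are hypothetical; only the endpoint paths and JSON shapes follow the spec. Nothing is sent over the network.

```python
import json


def element_send_keys(session_id, element_id, text):
    """Element Send Keys: type text into a specific element."""
    path = f"/session/{session_id}/element/{element_id}/value"
    return "POST", path, {"text": text}


def key_actions(session_id, text):
    """Perform Actions: a low-level keyDown/keyUp pair for each character."""
    steps = []
    for char in text:
        steps.append({"type": "keyDown", "value": char})
        steps.append({"type": "keyUp", "value": char})
    # "keyboard" is an arbitrary id chosen here for the key input source.
    body = {"actions": [{"type": "key", "id": "keyboard", "actions": steps}]}
    return "POST", f"/session/{session_id}/actions", body


if __name__ == "__main__":
    # Both ids are placeholders; real ones are returned by the driver.
    method, path, body = key_actions("hypothetical-session-id", "a")
    print(method, path)
    print(json.dumps(body, indent=2))
```

In both cases the client only describes the input. The driver and browser generate the actual key events, rather than script running in the page, which fits the earlier point about isTrusted.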

------
tal_berzniz
Cool idea - the Web Extensions API is very powerful.

How would you go about installing the extension in a CI/CD setup? Can it be
installed on headless Chrome?

~~~
foob
Thanks! You can use the extension in a CI setup by first installing the
_remote-browser_ extension, and then using it in your tests. You can check out
_remote-browser_ 's own tests for an example of integrating the project with
CircleCI [1].

[1] - [https://github.com/intoli/remote-browser/blob/master/.circle...](https://github.com/intoli/remote-browser/blob/master/.circleci/config.yml)

