
RobotJS – Node.js Desktop Automation - octalmage
https://github.com/octalmage/robotjs
======
STRML
Cool project! Seems like its capabilities are somewhat similar to Hammerspoon,
an OS X project where you can script almost anything in the OS with Lua. I
believe it was forked from Mjolnir.

I use it for window management (which it is awesome at, especially considering
I have a very complex arrangement with a 4K monitor + laptop screen),
automatic mute when not home, remapping shortcuts, and more.

The config syntax is pretty simple and works great. There are some really cool
ones all over the internet. Here's mine:
[https://github.com/STRML/init/blob/master/hammerspoon/init.l...](https://github.com/STRML/init/blob/master/hammerspoon/init.lua)

Hammerspoon Docs:
[http://www.hammerspoon.org/docs/](http://www.hammerspoon.org/docs/)

~~~
heptal
Came to post the same thing. Here's mine:
[https://github.com/heptal/dotfiles/blob/master/roles/hammers...](https://github.com/heptal/dotfiles/blob/master/roles/hammerspoon/files/init.lua)

~~~
STRML
That's a nice one, especially the paste block workaround. That is my #1 least
favorite security fad/flaw.

------
mwagstaff
Despite its many syntax quirks, AutoHotkey is an amazingly powerful tool for
automating Windows keyboard and mouse input - I use it daily at work.

Having an equivalent tool on the Mac would be awesome, as I don't think
anything in AutoHotkey's class exists right now...

~~~
macobo
Out of curiosity, have you looked at Sikuli [1]?

[1]: [http://www.sikuli.org/](http://www.sikuli.org/)

~~~
mwagstaff
I hadn't heard of it before, and it does look potentially very powerful.

My only initial concern from reading the quick start
([http://www.sikulix.com/quickstart.html](http://www.sikulix.com/quickstart.html))
is the apparent requirement to use the "SikuliX IDE" for scripting...

~~~
hugs
The magic of Sikuli is its use of the OpenCV library under the hood (and
Tesseract for OCR). You could skip the Sikuli part and just use OpenCV and
Tesseract directly. (Not easy, but theoretically possible.)

~~~
skimmas
I once used sikuli to extract a table from a pdf. Pretty funny way of hacking
a fast solution for a problem. It's a pretty powerful tool but not very
stable. IDE is also not so exciting but gets the job done. Pretty excited to
see how far this project will go.

------
jbrooksuk
AutoIt MVP member here (James on the forums).

I'm really glad to see a replacement for AutoIt and AHK and potentially, if
not already, cross-platform too.

I can't speak for AHK but AutoIt has made huge progress over the last couple
of years. Jon has been working on new features, especially improved COM
support.

Maybe RobotJS will have a community too? It'd be great to see UDFs and a
thriving ecosystem.

~~~
octalmage
Awesome! AutoIt is great. AutoHotkey wouldn't exist if it wasn't for AutoIt.

AutoHotkey has also improved a bunch recently. Lexikos picked up development
and he's done a killer job. But unfortunately it (and AutoIt) will never be
cross platform. That's why I made RobotJS.

I honestly never thought about a community but that's an amazing idea. A
classic forum would be great, I've spent so much time on the AutoHotkey/AutoIt
forums. I'd love for this to happen.

~~~
bmh100
Why won't it ever be cross-platform? Would the effort just be too gargantuan?

~~~
striking
It's too directly tied to the Windows API. The vast majority of non-language
code would have to be rewritten to run correctly, unless they emulate the
Windows API instead.

------
phleet
I've been looking for something like this forever! I loved AutoIt v3 on
Windows, and osascript always felt super crippled in comparison (as well as
impossible to look up documentation for). I've long since forgotten what I
wanted to use this for, but I'll be sure to remember this for later!

~~~
noinsight
> AutoIt v3

AutoHotkey is available and open source but the custom scripting language is
off-putting, I would much rather have something like standard JavaScript for
it. If this project moves forward in that direction it could be great.

[http://ahkscript.org/](http://ahkscript.org/)

~~~
octalmage
I grew up on AutoHotkey but yeah, the syntax is very strange. It's actually
based on AutoIt v2. The closest language (syntax wise) is Assembly, and that's
silly.

AutoHotkey is available, but only on Windows. I don't think I would have made
this if AutoHotkey was cross platform.

------
rolux

        //Type "Hello World".
        robot.typeString("Hello World");
    
        //Press enter. 
        robot.keyTap("enter");
    

Whoever designed this interface didn't think much about consistency.

~~~
octalmage
It's very temporary! There's discussion about it here:

[https://github.com/octalmage/robotjs/issues/4](https://github.com/octalmage/robotjs/issues/4)

~~~
rolux
Ah, thanks for the pointer!

    
    
        keyboard.type('foo');
        keyboard.press('fn');
    

... makes much more sense.

~~~
hugs
I'd go for as short (but still readable) as possible.

    
    
      > robot.keys("Type a string")
    
      > robot.keys(ENTER)   // ENTER is an integer key value
    

(I've been thinking about this while trying to make an idiomatic node client
for Selenium WebDriver...
[https://github.com/hugs/34#api](https://github.com/hugs/34#api))

~~~
smilekzs
Importing enum values into global/function/module local namespace has always
been a PITA for javascript environments. Any suggestion on how to do this
cleanly?

~~~
hugs
I'm sure it's possible, but one way to avoid it -- turn the constant into a
method:

    
    
      > robot.keys.ENTER()
    

(Not great for key combos, though...)
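
If you do want plain constants, one possible approach is a frozen object plus destructuring. This is only a sketch: `Keys`, its numeric values, and `describe` are made up for illustration and are not part of RobotJS.

```javascript
// Hypothetical key table; the names and numeric values are illustrative only.
const Keys = Object.freeze({
  ENTER: 13,
  TAB: 9,
  ESCAPE: 27,
});

// Pull only the constants you need into the local scope.
const { ENTER, TAB } = Keys;

// Hypothetical dispatcher: strings are typed, numbers are treated as key codes.
function describe(input) {
  return typeof input === "number" ? `tap key ${input}` : `type "${input}"`;
}

console.log(describe("Type a string")); // type "Type a string"
console.log(describe(ENTER));           // tap key 13
```

Nothing leaks into the global namespace, and each file names exactly the keys it uses.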

~~~
nunull
I'd then go with

    
    
      robot.keys.enter()
    

(since enter is a function, not a constant) and make it chainable by returning
`robot.keys`.

    
    
      robot.keys.ctrl().enter()
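
A minimal sketch of how that chaining could work, assuming a hypothetical `createKeys` helper (not RobotJS's actual API). It only records the taps instead of sending real input, so it runs without the native module:

```javascript
// Hypothetical chainable key builder; it records taps rather than
// sending real keyboard input.
function createKeys(names) {
  const log = [];
  const keys = { log };
  for (const name of names) {
    // Each key method records a tap and returns `keys` so calls can be chained.
    keys[name] = () => {
      log.push(name);
      return keys;
    };
  }
  return keys;
}

const keys = createKeys(["ctrl", "enter"]);
keys.ctrl().enter();
console.log(keys.log); // [ 'ctrl', 'enter' ]
```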

~~~
swsieber
You'd probably want something slightly different than keys - at least some way
to differentiate between separate key presses and holding keys down at the
same time.
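
To make the difference concrete, here is a rough sketch with two hypothetical helpers, `tap` and `chord`. They return a list of low-level down/up events rather than sending input, and neither name comes from RobotJS:

```javascript
// Hypothetical helpers: they return low-level events instead of sending input.
function tap(key) {
  return [{ key, action: "down" }, { key, action: "up" }];
}

// Hold every modifier down, tap the key, then release modifiers in reverse order.
function chord(modifiers, key) {
  const down = modifiers.map((m) => ({ key: m, action: "down" }));
  const up = [...modifiers].reverse().map((m) => ({ key: m, action: "up" }));
  return [...down, ...tap(key), ...up];
}

// Two separate presses: ctrl goes down and up before enter is touched.
const sequence = [...tap("ctrl"), ...tap("enter")];
// One combination: ctrl stays down while enter is tapped.
const combo = chord(["ctrl"], "enter");

const fmt = (events) => events.map((e) => `${e.key}:${e.action}`).join(" ");
console.log(fmt(sequence)); // ctrl:down ctrl:up enter:down enter:up
console.log(fmt(combo));    // ctrl:down enter:down enter:up ctrl:up
```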

------
striking
See also:
[http://docs.oracle.com/javase/7/docs/api/java/awt/Robot.html](http://docs.oracle.com/javase/7/docs/api/java/awt/Robot.html)

Java had this, like, forever. Long live "Write Once Run Anywhere"!

~~~
Cyph0n
Nice stuff. Java provides so much out of the box that it becomes confusing.

------
dc2
Love how simple and solid this project is. You can use it to do a ton of new
things in Node.

~~~
octalmage
Thanks! Node.js can do anything!

------
nickstefan12
RobotJS + chrome drivers (and therefore CSS selectors for mouse/keyboard
actions) would be the holy grail for integrations testing. webdriver.io kind
of does that now, but it's a bit finicky... Any idea how I'd set that up?

~~~
SchizoDuckie
I think it would be awesome if we can somehow hook this into some API that
grabs

- the process list
- list of the window positions
- text under a cursor
- provides an interface to create specialized keyboard/mouse actions for
  specific apps (because why stop at chrome?)

I'm sure this project is going to be quite popular :)

------
krat0sprakhar
Looks damn neat! I've not used AutoIT (and the likes) previously so I can't
think of compelling use-cases yet. Can someone suggest some possible ideas on
what to automate in my desktop with this Node library? Thanks!

------
novaleaf
Is there anything like this for browsers? I know there are things like casperjs
but that seems more "browser test" specific, not "browser automation".

~~~
noinsight
Selenium can be used for browser automation.

~~~
Abundnce10
I've used Selenium a bunch the last couple months to automate daily/hourly
jobs to pull data from 3rd party UIs that don't offer an API. I couldn't
imagine not having a tool like Selenium at my disposal!

~~~
SchizoDuckie
Pro tip: create a chrome extension with permissions on all http:// and https://
sites, or run it through Node-Webkit/nw.io, then you
can use the generic DOMParser and querySelectorAll with a pretty fluent
interface on any site you can imagine.

example here:
[https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/uti...](https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/utility.js#L45)

~~~
vaviloff
> or run it through Node-Webkit/nw.io

That's quite interesting! I thought node-webkit isn't suitable (yet) for such
a purpose. Could you go into more detail on how to do parsing/automation of
external sites with it?

~~~
SchizoDuckie
It's very suitable! (I'm using it in DuckieTV in production, works like a
charm!)

Basically, you can use xmlhttp to fetch any webpage because of relaxed
restrictions. Then use DOMParser (a built-in browser component, that you can
even shim) to create a virtual DOM of that xmlhttp result, and execute regular
querySelector and querySelectorAll queries on that :)
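
A minimal sketch of the parsing half, assuming the HTML string has already been fetched (for example from an XHR response). `selectText` is a hypothetical helper name; `DOMParser`, `parseFromString` and `querySelectorAll` are the standard browser APIs:

```javascript
// Parse an HTML string and return the trimmed text of every element that
// matches `selector`. `Parser` defaults to the browser's DOMParser; it is a
// parameter only so that a shim can be passed in where none is built in.
function selectText(html, selector, Parser = globalThis.DOMParser) {
  const doc = new Parser().parseFromString(html, "text/html");
  return Array.from(doc.querySelectorAll(selector), (el) =>
    el.textContent.trim()
  );
}

// In a browser, extension page, or NW.js window:
// selectText("<ul><li> a </li><li>b</li></ul>", "li");
```

The parsed document is detached from the page, so scripts in the fetched HTML are not executed.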

------
franciscop
robot.js, what will happen when we want to publish a js library for actual
robots in something like Tessel? (;

------
DonHopkins
This is great!

I wrote up some ideas about "aQuery -- Like jQuery for Accessibility", which
RobotJS would be very useful for implementing. It refers to the Mac
accessibility API but it could work with any platform, and even abstract the
differences between platforms just like jQuery does.

[http://www.donhopkins.com/mediawiki/index.php/AQuery](http://www.donhopkins.com/mediawiki/index.php/AQuery)

Also, Morgan Dixon did some wonderful stuff with Prefab: The Pixel-Based
Reverse Engineering Toolkit, which would be great to integrate into RobotJS.

[http://homes.cs.washington.edu/~mdixon/research/prefab/](http://homes.cs.washington.edu/~mdixon/research/prefab/)

aQuery -- like jQuery, but for selecting, querying and manipulating Mac app
user interfaces via the Accessibility framework and protocols.

So you can write jQuery-like selectors that search for and select
Accessibility objects, and then it provides a convenient high level API for
doing all kinds of stuff with them. So you can write higher level plugin
widgets with aQuery that use HTML with jQuery, or even other types of user
interfaces like voice recognition/synthesis, video tracking, augmented
reality, web services, etc!

For example, I want to click on a window and it will dynamically configure
jQuery Pie Menus with the commands in the menu of a live Mac app. Or make a
hypercard-like user interface builder that lets people drag buttons or
commands out of Mac apps into their own stacks, and make special purpose
simplified guis for controlling and integrating Mac apps.

[...]

aQuery could apply the DOM tree searching and traversal and data association
stuff to the Accessibility Tree, which is similar in a lot of ways to a DOM
tree, and describes all the widgets and user accessible affordances and
commands in an app, as well as non-tree-like relationships between them (this
label describes that widget, this tab represents that panel, this icon
represents that view, this editor manipulates that object, etc).
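
As a rough illustration of the tree-searching part only, here is a sketch over a made-up tree of plain objects. The `role`/`title`/`children` shape and the `query` function are assumptions for this example; a real tree would come from the platform's accessibility API:

```javascript
// Hypothetical accessibility tree as plain objects.
const appTree = {
  role: "window",
  title: "Editor",
  children: [
    { role: "button", title: "Save", children: [] },
    {
      role: "group",
      title: "Toolbar",
      children: [{ role: "button", title: "Undo", children: [] }],
    },
  ],
};

// Depth-first search returning every node for which `matches` is true.
function query(node, matches, found = []) {
  if (matches(node)) found.push(node);
  for (const child of node.children || []) query(child, matches, found);
  return found;
}

const buttons = query(appTree, (n) => n.role === "button");
console.log(buttons.map((b) => b.title)); // [ 'Save', 'Undo' ]
```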

[...]

aQuery should provide ways of registering patterns and calling handlers when
user interface items that match them are created and destroyed. jQuery doesn't
directly provide a way to do that (handling page onload events and XHR request
responses is usually sufficient), but of course there is a jQuery plug-in that
does it: [https://code.google.com/p/mutation-summary/](https://code.google.com/p/mutation-summary/) .

So when some user interface objects you're interested in controlling come into
existence, you can wrap them with your own "widget" to glue them into whatever
other user interface you want to provide. (pie menus, hyperlook, ar, speech
recognition, etc).

[...]

I think aQuery should be independent of jQuery, but I like to use jQuery as a
metaphor for how it works, even though that might suggest that it's tied to
jQuery, or even HTML, which it shouldn't be.

------
anon3_
Thank you OP!

> 95.1% C

Any intention on making this available for other languages?

~~~
tjallingt
I think you are misunderstanding; robotjs is written in C but it is a module
for Node.js which means you implement it using Javascript.

Node.js modules can be written in C or Javascript but implementing new
features like this requires you to use C so there is no "making this available
for other languages".

~~~
anon3_
I stand corrected. I looked closer.

I never knew node modules could be written in C.

~~~
deckar01
I was searching around for a js wrapper, but found that even the JS API was
implemented in C [1].

I would only implement the low level "hardware" primitives in C, then
implement the high level API in JS like Chromium's Blink-in-JS initiative [2].
Once they start expanding the high level functionality, they will lose
potential contributors by sticking with pure C.

[1]
[https://github.com/octalmage/robotjs/blob/master/src/robotjs...](https://github.com/octalmage/robotjs/blob/master/src/robotjs.cc)

[2] [http://www.chromium.org/blink/blink-in-js](http://www.chromium.org/blink/blink-in-js)
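
A sketch of the split I have in mind, with a fake object standing in for the native layer. `keyDown`, `keyUp`, `createFakeNative` and `createKeyboard` are hypothetical names for illustration, not functions from robotjs:

```javascript
// Stand-in for the native layer. In a real addon these two primitives would
// be implemented in C/C++; here they just record calls.
function createFakeNative() {
  const calls = [];
  return {
    calls,
    keyDown: (key) => calls.push(`down ${key}`),
    keyUp: (key) => calls.push(`up ${key}`),
  };
}

// High-level API written in plain JavaScript on top of the primitives.
function createKeyboard(native) {
  const press = (key) => {
    native.keyDown(key);
    native.keyUp(key);
  };
  return {
    press,
    // Typing a string is just a sequence of presses, so it needs no native code.
    type: (text) => {
      for (const ch of text) press(ch);
    },
  };
}

const native = createFakeNative();
createKeyboard(native).type("hi");
console.log(native.calls); // [ 'down h', 'up h', 'down i', 'up i' ]
```

Anything built from `press` and `type` could then be contributed without touching the C side.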

~~~
octalmage
You'd be surprised by how many C/C++ programmers there are out there! I've
already been surprised by the number of contributions. But yeah, using C
wasn't a choice, it was the only option. Luckily we already have all planned
features implemented in C.

