
Sikuli Script - llamataboot
http://www.sikuli.org/
======
krilnon
I used Sikuli back in 2011-12 for some random automation tasks, and often wish
I would remember to use it more.

\- Advisor wanted a one button way to run a convoluted research prototype I
had made, and I didn't want to have to dig into Cocoa to figure out how to
programmatically click/select options in a few desktop apps.

\- Worked at a company and there was a silly employee training slideshow+quiz,
so I had Sikuli wait for the next arrow to show up once the audio was finished
and click it.

\- Wanted to heat up my GPU to warm a brownie, so I opened one of those WebGL
water demos and had Sikuli repeatedly pick up a ball and drop it in the water.

~~~
AceJohnny2
> \- Wanted to heat up my GPU to warm a brownie

Please elaborate.

~~~
krilnon
Sure thing. I was an intern at Adobe working on a new programming language.
The office has a cafeteria, and I would often buy lunch there and get a
brownie to save for later.

I like brownies better when they're warm, so I came up with a few ways to warm
the pre-packaged treats while still sitting in my office. I first tried
sitting the brownie atop my laptop charger:
[http://images.reclipse.net/warmed_brownie.jpg](http://images.reclipse.net/warmed_brownie.jpg)

This worked fairly well, but the charger was less hot when my laptop was at
100% charge, so I sought alternate methods for heating up the brownie. For
some reason, I had stumbled across some recent WebGL demos at the time, and
noticed that the fan on my laptop would spin up on that particular demo.
([http://madebyevan.com/webgl-water/](http://madebyevan.com/webgl-water/))

This provided a more consistent heat than the charger, since usually my
laptops would be completely charged by the end of lunchtime. I had my own
laptop and a company-provided laptop, so it wasn't hard to set one of them
aside as a brownie warmer.

~~~
habosa
Wow. I thought the "to warm a brownie" thing was a joke about WebGL
performance. That's a hilarious story.

~~~
krilnon
Nah, it's real. Adobe had a WebGL competitor, Stage3D, at the time, but I
didn't work on that at all. As an intern, I would run a few useful benchmarks
per day, but it was more reliable to run things I knew would heat up a laptop
instantly, so I used WebGL demos.

Brownie-wise, I would wait a while because lunch would fill me up. But at 3 or
4 pm, I'd be hungry again and something the size of a brownie would really hit
the spot.

------
cdr
I used to use AHK a lot for interacting with a game. Sikuli seemed neat when I
heard about it, but it turns out that it only supports pretty much a single
method of capture and only a single method of relaying mouse events. This made
it completely unusable for that purpose. AHK supports pretty much every method
available to Windows. The simplicity of use was attractive, but it needs a ton
of work under the hood before it'd be viable as a general purpose automation
tool.

~~~
bcaine
I also experimented with it a few years ago trying to build an automated GUI
testing framework, and it turns out that it's way too fragile and non-portable
to be usable for that use case.

I remember running into issues as soon as anything changed regarding
resolution, scaling, graphic settings, color scheme etc.
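
To illustrate why a change in scaling or colour scheme breaks this kind of matching, here is a toy sketch in plain Python. It is not Sikuli's actual implementation (which relies on a computer-vision library); the `similarity` and `find` helpers, the grid-of-integers "screen", and the 0.7 threshold are all assumptions made for illustration.

```python
def similarity(screen, template, top, left):
    """Fraction of template pixels equal to the screen pixels at (top, left)."""
    rows, cols = len(template), len(template[0])
    same = sum(
        screen[top + r][left + c] == template[r][c]
        for r in range(rows)
        for c in range(cols)
    )
    return same / (rows * cols)


def find(screen, template, min_similarity=0.7):
    """Return (top, left, score) of the best match, or None if below threshold."""
    rows, cols = len(template), len(template[0])
    best = None
    # Slide the template over every position where it fits entirely on screen.
    for top in range(len(screen) - rows + 1):
        for left in range(len(screen[0]) - cols + 1):
            score = similarity(screen, template, top, left)
            if best is None or score > best[2]:
                best = (top, left, score)
    if best is not None and best[2] >= min_similarity:
        return best
    return None


# A 4x4 "screen" containing a 2x2 "button" at row 1, column 2.
screen = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
    [0, 0, 0, 0],
]
button = [[1, 2], [3, 4]]

# The same button captured at 2x scaling, and in a different colour scheme.
button_scaled = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
button_recoloured = [[5, 6], [7, 8]]
```

With these inputs, `find(screen, button)` locates the button, while the scaled and recoloured captures fall below the threshold and return `None`, which is roughly the failure mode described above.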

Pretty fun to make toy programs in to automate stuff with though.

------
Alex_MJ
Sikuli is freaking great, though not sure how this is news, it's been around.

Super useful for automating things that are easy to handle by looking for
patterns/things on screen and hard to handle with APIs (or lack thereof)

------
YZF
We used Sikuli for test automation in a pretty large project with a Windows
UI. Got kicked off by a TeamCity agent for every build and worked really
nicely.

Thumbs up.

You do need to be careful about timing and getting the right images so tuning
things to work under all conditions is a bit of an art. Also being able to
recover from a failure so you can continue testing is another bit of art.
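
A minimal sketch of both ideas in plain Python, not tied to Sikuli's API: `wait_until` polls a condition with a timeout instead of sleeping for a fixed time, and `run_all` records a failure and moves on to the next test. Both helper names and the default timings are hypothetical.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(interval)


def run_all(steps):
    """Run every (name, callable) step; record failures instead of stopping."""
    results = {}
    for name, step in steps:
        try:
            step()
            results[name] = "ok"
        except Exception as exc:  # keep going so later tests still run
            results[name] = "failed: %s" % exc
    return results
```

In a real suite the `condition` would be something like "is this image on screen yet?", and each step would probably also reset the application to a known state before the next one runs.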

As to why this is news there seems to be a new release out (or soon?) 1.1.0
... Sikuli development seems to have almost died a few years back but it's
made a comeback over the last ~2.5 years which is nice.

~~~
drothlis
I do a similar style of UI test automation for set-top boxes / smart TVs, with
stb-tester[1].

We've found that the "Got kicked off for every build" continuous integration
process you mention is the crucial part to achieving success with this type of
test automation -- if you're going to invest the effort in writing reliable
tests, you want to be getting value out of them by running them as often as
possible and as early as possible.

[1] [http://stb-tester.com/](http://stb-tester.com/)

------
gowan
Nice to see sikuli on the front page. I'm currently using it to test a legacy
application.

Sikuli is good at image matching. For me sikuli broke when I started to take
images of text. The font would render differently in gnome and the vm
(vncserver/twm) jenkins ran the tests on. I ended up creating docker images of
the test environment so the docker image would be the same on jenkins and the
testers' machines.

Debian has a sikuli package libsikuli-script-java and sikuli-ide. I've also
written a docker file for sikuli on debian wheezy [1].

[1] [https://github.com/jesg/sikuli](https://github.com/jesg/sikuli)

------
jamesgagan
Wrote a WoW fishing bot with Sikuli a few years back - it worked pretty well.

------
nogridbag
I was a big fan of Sikuli back in the day, but I found it a bit unreliable for
automation. No matter how much I tweaked it, it seemed to be a bit
unpredictable.

I did find some use for it. My girlfriend got addicted to some online flash
Mahjong game and no matter how hard I tried, I could not post a better score
than her. With a bit of Sikuli scripting, I was posting top scores in no time!

------
woutervdb
Reminds me a bit of Scratch[1], a tool that became pretty popular when the
Raspberry Pi came out. Very simple programming interfaces that work with simple
graphics, but can do a lot of things.

[1]: [http://scratch.mit.edu/](http://scratch.mit.edu/)

------
whitten
Sikuli is being used to create test plans for the VistA system (documented at
[http://www.osehra.org](http://www.osehra.org) ) It works with GUI stand-alone
executables and with web pages from a browser.

------
cwt
Does anyone use this for web scraping dynamic links created by javascript,
pulled from dev tools "network" tab?

------
fiatjaf
Works with the browser, right? So this is the ultimate visual scraping tool,
import.io and ParseHub are useless now?

~~~
tsergiu
One of the founders of ParseHub here.

Not quite. Sikuli tries to figure out where things are by doing a visual
match. This works very well for things like automating applications or sites
where page elements are fixed (e.g. finding an option in a menu or using a
search engine). But it works terribly when trying to overlay semantic
structure on dynamically-generated data. For example, it has no way of knowing
that a list of movies is split up on multiple pages, with each movie having
multiple genres, a cast, and multiple reviews, each of which has a rating and
an author.

There's also the additional drawback that it is hard to parallelize things in
Sikuli (you would need heavyweight vms, and there are no obvious "breaks" in
the flow). So doing something at scale is not feasible.

With ParseHub, one of the goals is to make it easy to express relationships
(and we think we've done a really good job). We also automatically figure out
how to split a job up across an entire fleet of servers.

Hope that offers some insight. Email me at serge@parsehub.com if you have any
other questions.

------
bart3r
What are its capabilities with OCR?

~~~
YZF
It has OCR but wasn't working so great. It uses Tesseract. I'm not absolutely
sure why it wasn't working well in the past, possibly something to do with
different fonts/display rendering (e.g. ClearType and such). It "almost"
worked so maybe it got better or maybe there's some tuning you can do. Didn't
spend too much time on it.

~~~
drothlis
OCR is never perfect. I do a lot of automated UI testing in a way similar to
Sikuli, and while we do rely on OCR a lot, you have to use certain workarounds
(like fuzzy matching instead of looking for a perfect match of your expected
text).
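
As a rough sketch of what fuzzy matching can look like, using only Python's standard `difflib`: the `fuzzy_match` helper and the 0.8 threshold are assumptions for illustration, not what stb-tester or Sikuli actually do.

```python
from difflib import SequenceMatcher


def fuzzy_match(expected, ocr_text, threshold=0.8):
    """Return True when `ocr_text` is close enough to `expected`."""
    # Ignore case and collapse runs of whitespace before comparing.
    a = " ".join(expected.lower().split())
    b = " ".join(ocr_text.lower().split())
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

For example, `fuzzy_match("Settings", "Sett1ngs")` accepts a single misread character, while `fuzzy_match("Settings", "Network")` is rejected. The threshold would need tuning per application: too low and different menu items start matching each other.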

Ultimately Tesseract was primarily designed to operate on text which had been
printed and then scanned, whereas the text on screen is lower resolution,
anti-aliased, on a coloured background, etc etc.

Some further details of our OCR investigations here:
[http://stb-tester.com/blog/2014/04/14/improving-ocr-accuracy...](http://stb-tester.com/blog/2014/04/14/improving-ocr-accuracy.html)

The TLDR version is: Training Tesseract on your font doesn't help; scaling up
the text 3x before passing it to tesseract gives a massive improvement (I
don't know if Sikuli does this); normalising ligatures & punctuation gives an
additional slight improvement.
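
The ligature and punctuation step could look something like the following sketch, applied to both the OCR output and the expected text before comparing them. The `normalise` helper and the particular punctuation mapping are assumptions for illustration; the linked post may do it differently.

```python
import unicodedata

# Hypothetical mapping of typographic punctuation to plain ASCII equivalents.
PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",  # curly single quotes
    "\u201c": '"', "\u201d": '"',  # curly double quotes
    "\u2013": "-", "\u2014": "-",  # en and em dashes
}
_TABLE = str.maketrans(PUNCTUATION)


def normalise(text):
    """Expand ligatures via NFKC, then map typographic punctuation to ASCII."""
    return unicodedata.normalize("NFKC", text).translate(_TABLE)
```

NFKC normalisation replaces compatibility characters such as the single-glyph "fi" ligature (U+FB01) with the plain letters "fi"; the translation table then handles quotes and dashes, which NFKC leaves alone.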

------
qwerta
Great project, used with great success to automate testing of legacy app
across multiple virtual machines.

------
faldore
How is this news? Sikuli has been around for many years.

~~~
zimbu668
[http://xkcd.com/1053/](http://xkcd.com/1053/)

