
Self Driving Desktop - verdverm
https://github.com/hofstadter-io/self-driving-desktop
======
mcklaw
One of the least-known tools (around for >10 years):
[http://sikulix.com/](http://sikulix.com/). It lets you play back
mouse/keyboard event scripts, but it can also find components (coordinates)
on screen via OCR, so you can make your scripts resolution- and
desktop-independent. Also, it's Java-based, so you can run it on multiple OSes.

~~~
flarg
This is an excellent tool! But you forgot to mention that the user codes it in
Python, it comes with a purpose-built IDE, and it recognises both text and
images, the latter with approximate matching.

~~~
nitrogen
There are a number of other commercial and free desktop automation tools that
exist, some of which I've used to automate GUI testing in the past.

[https://en.m.wikipedia.org/wiki/Comparison_of_GUI_testing_to...](https://en.m.wikipedia.org/wiki/Comparison_of_GUI_testing_tools)

My favorite on the Windows side was vTask Studio, but it looks like the domain
is down and the link was removed from that wiki page.

~~~
Thoth0
You can still grab vTask Studio via the WBM. Thanks for that, I shall be
trying it out:

[https://web.archive.org/web/20170927151003/http://www.vtasks...](https://web.archive.org/web/20170927151003/http://www.vtaskstudio.com/index.php)

------
majewsky
In 2008, we were at CeBIT showing off the then-brand-new KDE 4 desktop. (The
booth was sponsored by a Linux-focused media company.) The biggest attention
magnet was a script that we hacked together the evening before, that clicked
through the application menu and demoed various desktop features in a loop.
For a booth, it's absolutely vital to have something that moves, not just
static posters and people standing around waiting.

------
gwbas1c
What is it? The page just says "Desktop Automation framework" and then lists a
bunch of commands and switches.

Perhaps 2-3 paragraphs describing what it does?

~~~
zapzupnz
At a glance, macros. Or maybe the "System Events" portion of Applescript, for
Linux. Something like that. Indeed, the page would benefit from an explanation
and maybe rationale.

------
mirkonasato
Seems like a small wrapper around PyAutoGUI, which I've used before and is
great: [https://pyautogui.readthedocs.io/](https://pyautogui.readthedocs.io/)

~~~
Macuyiko
Or an alternative to Automagica:
[https://github.com/OakwoodAI/Automagica](https://github.com/OakwoodAI/Automagica)

~~~
mirkonasato
That one also depends on PyAutoGUI
[https://github.com/OakwoodAI/Automagica/blob/master/setup.py...](https://github.com/OakwoodAI/Automagica/blob/master/setup.py#L20)

------
michaelmrose
What's different about this compared to a shell script that invokes xdotool,
save for being much more verbose?

------
reilly3000
I wish this had a ‘Record’ feature. That kind of logging could be incredibly
useful. I use tools like Katalon on the web and they are great for making a
first pass at test development. It doesn’t need to be entirely visual but if
it can capture the flow visually it can be refactored in code and be much more
accessible and usable.

~~~
verdverm
I use OBS for recording and Flowblade for editing. Got sick of editing my
mistakes out, so then this repo came to be. Planning to add some playlists to
start that up, set file names, begin/end recording.

self-driving-desktop will be part of a demo automation framework that is in
progress.

~~~
verdverm
I did have a recording function around, to track mouse movement. The issue is
that the mouse movement gets verbose, and you would have to clean that up
somehow.
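
One way that cleanup could look, as a rough sketch rather than anything in the repo: thin the recorded path with Ramer-Douglas-Peucker, keeping only the points where the mouse actually changes direction. The `simplify` name, the `(x, y)` tuple format, and the pixel `tolerance` are all assumptions made for illustration.

```python
import math


def point_line_distance(p, a, b):
    """Distance from p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: drop points closer than `tolerance` to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first -> last.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Everything in between is close enough to a straight line.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


# Hypothetical recording: right along the top edge, then straight down.
recorded = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
waypoints = simplify(recorded, tolerance=0.5)
```

Each surviving waypoint could then become a single move command. The recursion is fine for short recordings; a very long path would want an iterative version.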

~~~
hateful
Sounds like a candidate for machine learning - and an excuse to learn it.

~~~
semi-extrinsic
I was going to say "sounds like a candidate for a Kalman filter".

------
thepete2
There is also xnee (Xnee is Not an Event Emulator).

[https://xnee.wordpress.com/](https://xnee.wordpress.com/)

Worked well last I tried it.

------
flukus
> mv x y s;: move the mouse to x,y in s seconds

The problem with tools like this is that they create an API that the
developers don't know about and have no intention of supporting. I broke one
recently by having the app maximize on startup, but anything from adding UI
elements to rearranging them to timing differences can introduce breakages.

Considering it's scripting anyway, an actual API would be easier.

------
laythea
It would have been cool to have screenshots on the front page. They give a
much better sense of what the thing on GitHub actually is, because I didn't
understand it (without spending more time) from the GitHub page alone.

------
Adamantcheese
So it's basically AutoHotKey?

------
keerthiko
I think I have been looking for a framework this simple and straightforward
for about...12 years now? Ever since I got my own personal computer as a
college student, pretty much.

I can't _wait_ to completely go off the wrong quadrant of this chart with it.

[https://xkcd.com/1205/](https://xkcd.com/1205/)

~~~
albertshin
re: xkcd, sometimes, it's not just about the time in minutes you save in
aggregate. I often find routines especially helpful during flow states --
maximizing time for more creative work.

There's also just something satisfying about using something like Alfred to
launch a complex sequence of things that would have taken many mouse clicks
and hand movement. Or using keyboard shortcuts to resize and move multiple
windows around monitors. It feels almost... powerful? Not sure why.

~~~
marcosdumay
It mostly does not matter. The main goal for automating something is rarely to
save time nowadays (the low hanging fruit are much rarer). It is to document
procedures, prevent defects, or to test before running.

------
imjustsaying
Is it normal for devs to be able to read and understand GitHub repos without
any explanations, introductions or context beyond the title? I remember much
more of this in GitHub's early days and always wondered if this doesn't faze
the talented devs reading it.

~~~
lejar
I think it would be fair to say that you shouldn't expect anyone to be able to
understand a bare repo with just a glance, but if you're well versed with the
technologies that the repo uses and you know of similar products, then I think
you can guess it.

Here's how my thought process went on this one:

# I open the repo on github and look at the readme

1\. Okay it's doing something automatic

2\. It uses python

3\. Okay there's this playlist thing which has a bunch of commands in it.
Looks kind of like an autohotkey script.

# I look at the file list

4\. Okay I know lark. Looks like the author wrote a domain specific language
parser for their input files. They probably get those commands out as a nested
list from the parser.

# I look in test.txt

5\. Okay that doesn't tell me much new

# I look in main.py

6\. Oh there aren't any comments in here...

7\. Alright the main function parses the commands from the input file and runs
"do" on them.

8\. Okay this is just like autohotkey
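
To make steps 4-7 concrete, here's a minimal sketch of that parse-then-dispatch shape. It is a guess at the structure, not the repo's actual code: it uses plain string splitting instead of lark, and the `click` command, the `handlers` table, and the logging stand-ins are hypothetical (only `mv x y s;` comes from the readme).

```python
def parse(text):
    """Split a playlist into commands, one token list per ';'-terminated statement."""
    commands = []
    for statement in text.split(";"):
        tokens = statement.split()
        if tokens:  # skip the empty chunk after the final ';'
            commands.append(tokens)
    return commands


def do(commands, handlers):
    """Run each parsed command through the handler registered for its name."""
    for name, *args in commands:
        if name not in handlers:
            raise ValueError(f"unknown command: {name}")
        handlers[name](*args)


# Stand-in handlers that log instead of touching the real mouse.
log = []
handlers = {
    "mv": lambda x, y, s: log.append(("move", int(x), int(y), float(s))),
    "click": lambda: log.append(("click",)),
}

do(parse("mv 100 200 1.5; click;"), handlers)
```

Swapping the logging lambdas for real input calls would give you the autohotkey-like behaviour; the dispatch loop itself wouldn't need to change.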

------
dwiel
For Mac there is also talonvoice.com which allows a lot of similar
functionality along with methods for connecting to keyboard shortcuts,
voice/dictation control and noise control.

------
satyanash
Ruby would've been well suited to the DSL this project is trying to implement.

------
Aeolun
I really was hoping for a desktop computer on wheels :(

------
westmeal
Autohotkey: Xdotool edition

------
rhizome
Kind of like Kixtart IIRC.

------
BeatLeJuce
It's "Grammar"

------
softgrow
The title is a bit misleading, which leads to disappointment. I was expecting
something like a self driving car. You just give the desktop an objective and
it figures out how to get there and then gets you there.

