

I wrote this article with one mouse click - jawns
http://coding.pressbin.com/60/I-wrote-this-article-with-one-mouse-click/

======
patio11
You can do an awful, awful lot for your business by taking this idea one or
two iterations further:

1) Identify data source

2) Extract value from data source

3) Spit out templated content pieces extracted from data source

4) Farm out templated articles to freelancers for thickening up, Demand Media
style
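
A minimal PHP sketch of steps 1 through 3, with a hardcoded HTML snippet standing in for a fetched results page (the markup, regexes, and numbers are hypothetical placeholders, not the article's actual source):

```php
<?php
// Step 1: the data source -- in real use you'd fetch this HTML from
// the results page; here a literal string stands in for it.
$html = '<div class="result">Jackpot: $25,000,000; Winning numbers: 4 8 15 16 23 42</div>';

// Step 2: extract the values you care about.
preg_match('/Jackpot: \$([\d,]+)/', $html, $jackpot);
preg_match('/Winning numbers: ([\d ]+)/', $html, $numbers);

// Step 3: drop the extracted values into a sentence template.
printf("The winning numbers were %s, for a jackpot of \$%s.\n",
       trim($numbers[1]), $jackpot[1]);
```

Step 4 is then just handing those drafts to freelancers to flesh out.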

My client from this summer who paid me to do it for the average value of
particular college degrees is launching sometime in the next week or so. I'll
happily play show-and-tell with the non-proprietary parts if folks want, after
it launches.

~~~
jawns
I tried to do almost exactly what you're describing about a year ago --
scraping structured data about mutual funds and constructing articles about
each fund, which I submitted to Associated Content.

Unfortunately, I should have spent more time "thickening them up" -- after the
first half dozen or so, AC began rejecting them for being too similar to each
other.

But yeah, the potential is definitely there.

~~~
patio11
I'm curious: why did you submit them to Associated Content instead of building
your own site, where you'd have total control and keep most of the value you
created? A deep backbench of semi-automated articles about funds plus
relatively fewer pillar content pieces for linkability strikes me as a
potentially very viable business in an industry which is quite literally awash
in cash to spend on marketing.

~~~
jawns
At the time, AC was paying about $3 up front, plus pay-per-view incentives.
And it has a good Google PR. So, I figured I could either be lazy and submit
to AC or build my own site and spend a lot of time trying to get a decent
PageRank. I chose lazy.

~~~
patio11
Just in the spirit of introducing you to other options: there exist people who
already have profitable affiliate sites in the space who you could pitch on
the idea of "bolt this onto your site and get X more inventory which will rank
on the strength of your existing brand/trust/etc." I'd be thinking more in the
five figure range than the $3 up front range.

~~~
eru
Good suggestion!

And $3 per article can also add up, if you have a scalable solution.

------
gkoberger
Personally, I'm not a fan. For generic content like this, I'd rather read it
in a table or chart. The data is being encoded into natural language, and then
when we read it we have to parse the important information back out.

This is a weird example, but look at Groupon. One of the main reasons it's so
big is the custom, humorous descriptions that go along with each item.

If newspapers want to survive, they shouldn't be automating their content; it
just makes it more generic and forgettable. Nobody wants to read an article
that a computer wrote.

~~~
LiveTheDream
> One of the main reasons it's so big is the custom, humorous descriptions
> that go along with each item.

Personally, I never read those. I skim the headline, and look at the deal
details if the topic interests me and is a really good deal.

There's a difference in content though; with deals I just want the hard facts.
With many news articles, I want some well-written copy to add context to
otherwise meaningless/bland data.

For the Powerball results, the sentence format is a bit more engaging and I
can skim it as fast as I would skim a table. If nothing else, it makes me feel
like the publisher cares more about the reader.

------
ajays
FTA: "But the result — being able to automate what used to be a monotonous
task — is totally worth it."

Some of the best hackers and coders I've met were the laziest folks around;
but they'd spend hours working to automate a 1-minute monotonous task.

~~~
patio11
You can get _staggering_ productivity wins by automating enough of the right
1, 5, 15 minute tasks, especially when you consider how terrible people's
schedulers are. If your computer lost an hour of productive work every time it
context switched, you'd figure out ways to eliminate its list of small
recurring chores, too. Happily, your computer has very, very efficient context
switching relative to you.

I've been keeping a running count of time spent on various activities this
month, for giggles. Total support time for BCC in the last two weeks: eight
minutes. The machine has been humming so efficiently I burned some time
yesterday just to check that the whole thing hadn't been hit with a meteor or
something.

~~~
varaon
A second point would be that the time you spend automating tasks has other
payoffs, in the form of learning and inspiration.

For instance, I suspect that part of the inspiration for Appointment Reminder
came from Patrick's own realization of how valuable the automation of small
tasks can be (chasing down missed appointments, in this case).

---

To go off on another tangent, this is akin to eliminating technical debt from
your workflow. By taking time to "refactor" certain tasks by doing them in a
more efficient way, you get a net savings going forward. You can increase your
ability to take on new tasks by increasing the efficiency of existing ones.

In keeping with the refactoring theme, if you have tasks that don't scale
well, you may need to spend more time on them in a crisis. For example, in the
event of a site outage you might suddenly have a deluge of support emails.
Having support tickets be automatically created would save you n*60 seconds of
copy/pasting.

------
kljensen
Nice article. It could be improved if the author collected numerous, diverse
lottery result articles and used these to create a script that outputs
randomized articles.

This is already done in the sports area, which is significantly more
complicated: <http://mediadecoder.blogs.nytimes.com/2009/10/19/the-robots-are-
coming-oh-theyre-here/>

------
jerf
This task _screams_ for DSLs, especially on the generation end. Doing this in
PHP directly (or most other general purpose languages) encourages too little
variation because adding alternatives is relatively heavyweight. Writing a
good DSL that makes it easy to offer more alternatives in a single template
will make it much easier to produce something even less distinguishable from a
human report.
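
For instance, one lightweight DSL for the generation end is "spintax", where
`{a|b|c}` marks interchangeable alternatives and each expansion picks one at
random. A minimal, illustrative expander in PHP (a sketch of the technique, not
what the article's author wrote):

```php
<?php
// Expand a spintax template: repeatedly find the innermost {a|b|c}
// group and replace it with one randomly chosen alternative.
function spin(string $template): string {
    while (preg_match('/\{([^{}]*)\}/', $template, $m, PREG_OFFSET_CAPTURE)) {
        $options  = explode('|', $m[1][0]);
        $choice   = $options[array_rand($options)];
        $template = substr_replace($template, $choice, $m[0][1], strlen($m[0][0]));
    }
    return $template;
}

$template = '{The winning numbers were|Saturday\'s draw produced} '
          . '4 8 15 16 23 42, {netting|yielding} a $25m jackpot.';
echo spin($template), "\n";
```

Adding a new phrasing is then one more `|alternative` in the template rather
than new branching code, which is exactly what makes the output harder to spot
as machine-written.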

~~~
aristus
I think you are missing the point. What's remarkable is that this person is
combining basic hacking skills with a completely different career, not the
cleverness of the design. Not everyone has to be a poet, but amazing societal
changes happen when everyone can read and write.

~~~
jerf
I don't think I was missing the point. I think I was _making another one_.

~~~
jerf
Further thought: the author not being a programmer reinforces my original DSL
point rather than contradicting it. Part of the purpose of a DSL is to reduce as much
as possible the "programming" part of the task so the domain expert can
concentrate on what needs to be done.

I never said "this guy should have written a DSL instead", which I didn't say
because it would be an asshole thing to say. I said that this task screams for
a DSL, and that's only _more_ true if this guy isn't a programmer.

------
troels
No need to mess around with cURL bindings directly in PHP. `file_get_contents`
will accept a URL.

~~~
jawns
Hey, I'm the blog author. You're right -- I'm just so used to using cURL for
more complicated requests that the simpler solution slipped my mind. I'll
update.

~~~
troels
You may know already, but a lesser-known feature of PHP is that you can pass a
[stream context](<http://php.net/manual/en/function.stream-context-
create.php>) as an optional argument to most file operations. This lets you
exercise fine-grained HTTP control (POST, headers, etc.) while still using
`file_get_contents` and friends.
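
For example, a POST with a custom header could be set up like this (the URL
and form fields are hypothetical):

```php
<?php
// The request details go into an 'http' context options array.
$postBody = http_build_query(['game' => 'powerball', 'draw' => '2011-01-15']);
$context  = stream_context_create([
    'http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => $postBody,
        'timeout' => 10,
    ],
]);

// The context slots in as the third argument to file_get_contents():
// $html = file_get_contents('http://example.com/results', false, $context);
```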

------
Aaronontheweb
This idea has a lot of potential, and thanks for including code samples!

------
thehodge
Wow, that's pretty funny; it's almost exactly what we do for
<http://www.saturdaylotteryresults.co.uk>.

~~~
prs
You might want to add a space between the year and 'Lottery Results' in your
titles.

    
    
      Old: 2010Lottery Results
      New: 2010 Lottery Results

------
ddemchuk
This is a classic SEO content-generation move, called "mad lib" sites. You
create a templated article, with variables for each piece of dynamic content.
Usually, you will also create "spun" content so that each article created with
the mad-lib template is even more unique.

Then, you can scrape or find large databases of consistent information and
deploy very large sites.

The trouble is getting Google to fully index these sites. It requires a good
amount of link building both to the madlib pages and the home page to get
enough juice for the crawlers to spend time on the site and get things
indexed.

They can be very useful sites to build for a variety of reasons, and can
actually add some value, depending on the data you're publishing.

------
klbarry
Will someone please make a start-up where a non-technical person can plug in
information and do something like this? It would be great!

~~~
commanda
Do you mean for general workflow tasks, or for creating documents given a set
of data?

For workflow tasks, Automator.app on OS X is pretty great, even for
non-programmers. There's probably something analogous on Windows/other OSes.

For document generation... that seems like a fun weekend hack. I'm imagining
some kind of more configurable madlibs-style app.

~~~
roryokane
Microsoft Word can already do the document generation thing from an Excel
spreadsheet. Of course, it’s more complicated than a purpose-built app would
be, but probably also more powerful.

~~~
rg
Here's a description of how MS Word and MS Excel were actually used to create
4,600 parameterized web pages, complete with samples of the Excel and Word
documents used:

<http://www.horniman.info/DOCUMNTS/HOWTO.HTM>

Scroll down to section 6, "Webpages", for an explanation of this topic.

The generated website is also live online at

<http://www.horniman.info/>

so the generated HTML can be easily inspected.

------
to
<http://pastebin.com/NmtPBDgK>

~~~
to
<http://pastebin.com/8gMi4iAU>

