
Simple Haskell webapp: Generate random tweets using Markov chains - jaspervdj
http://jaspervdj.be/tweetov
======
mootothemax
Using Markov chains is fun until it all goes wrong. My Twitter bot sadly seems
to have become infatuated with Justin Bieber _sigh_

<http://twitter.com/markov_chains>

~~~
mithaler
Not surprising, considering Justin Bieber uses up 3% of Twitter's
infrastructure.[1]

[1] <http://mashable.com/2010/09/07/justin-bieber-twitter/>

------
petercooper
Running @FoxNews into it:

    #UN Security Council, General Assembly evacuated due to hold joint military exercises

_"New Tax Bill: DADT"_ was another good'un.

~~~
ssp
_"TSA releases all 24 hostages after Kim Jong Il introduced son as House
member"_

------
ma2rten
Reminds me of Mark V Shaney.

"I spent an interesting evening recently with a grain of salt."
<http://en.wikipedia.org/wiki/Mark_V_Shaney>

------
Mongoose
This is a lot like an app I wrote a few years ago for a programming contest.
It uses the same technique to generate random invention descriptions using
patent application abstracts: <http://eurekaapp.com/> (Yes, I realize how
unreasonably slow it is)

~~~
unoti
Hey that's cool! I imagine that you probably caught yourself, a time or two,
actually trying to read and understand what the "invention" is. As I just did.

~~~
ma2rten
I usually get a headache from this sort of thing, because I almost understand
what it says, but not quite.

------
oniTony
> Maple says nlognlogn, which could solve NP party?

And you are all invited.

------
jackowayed
It seems that it needs to sample more tweets. I've done 5 tweets, and 2 of
them were exact tweets that I sent (they were only a few words long, and I
guess they had words that I rarely tweet, so it had few or no other ways to go
once it started repeating the tweet).

Edit: Then again since there are only a few words in a tweet, you'd have to go
a really long way back to really ensure that won't happen. Possibly farther
back than Twitter will let you.

~~~
wmil
It's more fun if you use it on politicians or celebrities. You get strange
alternative world announcements.

~~~
pjscott
It generates downright disturbing results when you apply it to
<http://twitter.com/Othar>

_Oslaka died 2 kilometers in order to come from seafaring folk, so I do it. I
spend twenty minutes left! He is hideously scarred. My host removes his face a
lot in college._

------
Natsu
This reminds me of an old Perl script I made with Markov chains for a very
similar sort of random nonsense text generation.

I think I fed it text from a few Usenet kooks/conspiracy theorists and
something like Alice in Wonderland, and got quite a few laughs a long time
ago, though it was made to let you combine arbitrary texts into a single
chain.

------
jallmann
@jennyholzer:

 _OFTEN AS OFTEN AS OFTEN AS OFTEN AS POSSIBLE_

Very apt, but what n-gram length is being used? n=1 is my guess, since "as
often as" is a common English construct. Obvious feature request: tweakable
lengths.

edit: I'd make the fix myself and send a pull request, but I don't know
Haskell and am too lazy to figure it out.
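
For the curious, a minimal sketch of what a tweakable-order chain could look
like (hypothetical names throughout, not the actual tweetov code):

    -- Sketch of an order-n word-level Markov chain (hypothetical names,
    -- not tweetov's source).
    import qualified Data.Map as M
    import System.Random (randomRIO)

    type Chain = M.Map [String] [String]

    -- Map every n-word window in the corpus to the words that follow it.
    buildChain :: Int -> [String] -> Chain
    buildChain n tweets = M.fromListWith (++)
        [ (take n (drop i ws), [ws !! (i + n)])
        | ws <- map words tweets
        , i  <- [0 .. length ws - n - 1] ]

    -- Pick a uniformly random successor of the current state, if any.
    step :: Chain -> [String] -> IO (Maybe String)
    step chain state = case M.lookup state chain of
        Nothing    -> return Nothing
        Just nexts -> do
            i <- randomRIO (0, length nexts - 1)
            return (Just (nexts !! i))

    -- Walk the chain from a start state, stopping at a dead end
    -- or after a word limit (tweets are short anyway).
    generate :: Chain -> [String] -> Int -> IO [String]
    generate chain start limit = go limit (reverse start) start
      where
        go 0 acc _     = return (reverse acc)
        go k acc state = do
            mw <- step chain state
            case mw of
                Nothing -> return (reverse acc)
                Just w  -> go (k - 1) (w : acc) (drop 1 state ++ [w])

Seeding it with a random key from the map and calling generate from that
state gives you a fake tweet; the order n is just a parameter.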

~~~
jaspervdj
n=1 is indeed being used. The problem with a larger n is that you get original
tweets really often, because the dataset is limited (as many tweets as you can
get in one request).
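
A cheap guard (just a sketch, not necessarily what tweetov does) is to
resample until the output isn't a verbatim tweet from the dataset:

    import qualified Data.Set as S

    -- Resample until the output is not a verbatim corpus tweet;
    -- give up after a few tries so a tiny dataset can't loop forever.
    freshTweet :: [String] -> IO String -> IO String
    freshTweet corpus gen = go (10 :: Int)
      where
        seen = S.fromList corpus
        go 0 = gen
        go k = do
            t <- gen
            if t `S.member` seen then go (k - 1) else return t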

~~~
dododo
Are you using smoothing for large n? Kneser-Ney smoothing seems to give the
best results.

<http://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf>
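
For bigrams, the interpolated Kneser-Ney estimate from that tutorial looks
roughly like this (a naive sketch with O(V) scans, just to show the shape;
d is the discount, typically around 0.75):

    import qualified Data.Map as M
    import qualified Data.Set as S

    -- Interpolated Kneser-Ney probability of w following u, sketched
    -- from the tutorial's formulas (bigram-only, naive and slow).
    knProb :: Double -> [(String, String)] -> String -> String -> Double
    knProb d bigrams u w
        | cU == 0   = pCont   -- unseen context: fall back to continuation prob.
        | otherwise = max (cUW - d) 0 / cU + lambda * pCont
      where
        counts    = M.fromListWith (+) [ (bg, 1 :: Double) | bg <- bigrams ]
        types     = S.fromList bigrams                      -- distinct bigram types
        cUW       = M.findWithDefault 0 (u, w) counts
        cU        = sum [ c | ((a, _), c) <- M.toList counts, a == u ]
        followsU  = S.size (S.filter ((== u) . fst) types)  -- distinct words after u
        precedesW = S.size (S.filter ((== w) . snd) types)  -- distinct words before w
        lambda    = d * fromIntegral followsU / cU
        pCont     = fromIntegral precedesW / fromIntegral (S.size types)

The discounted term gives probability mass back via lambda, and the
continuation probability rewards words that appear after many different
contexts rather than words that are merely frequent.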

~~~
ma2rten
I did not read the paper, but what does more accurate mean in this case?
Likelihood of some unseen data? Seems pretty hard to define or measure to me,
if the goal is sheer amusement.

------
adimitrov
I'll leave this here, since it's loosely on-topic (Markov chains):
<http://www.joshmillard.com/garkov/> is Garfield strips with MC-generated text
instead of the original. I've had countless hours of fun with them.

------
ryan-allen
This is hilarious! Thanks for putting it up, I'm laughing my arse off at what
it's generating!

------
Sniffnoy
Unfortunately, if you do as I did and tweet what it generated, you can't use
it again, as it would be eating its own output; and generating nonsense from
nonsense is not so entertaining.

------
pavel_lishin
> And a new every day, I'll bring The first bill, since that was a beer Me
> too!. I don't, but some PHP servers aren't set up 64 pixels.

Pretty accurate stuff.

------
chrismealy
Cool! It's great to see real-world Haskell in action.

------
klenwell
<http://twitter.com/chimpinson> (PHP)

------
sz
How long until the link expires?

~~~
jaspervdj
Until any of the following events occurs:

- the machine runs out of memory (the tweets are stored in a Redis backend);

- someone (I) accidentally clears the database (this has happened before);

- zombies attack our datacenter.

But I'll try to keep them up as long as possible.

------
DrBaud
@allah

So, you've a direct connect to the One?

