
Analyze Your HN Posts with Watson User Modeling - Jonovono
http://hn.mybluemix.net/
======
Kronopath
It's worth keeping in mind that with things like these you get what you put
in.

When the service first came up on the front page, I punched some personal
writings from my journal into it. Comparing my results from that to this HN
assessment is like night and day. The HN assessment shows me as highly
intellectual, imaginative, and adventurous, with a strong value for self-
enhancement. The personal stuff showed me as being much more emotional, with a
huge emotional range, because—surprise, surprise—I tend to write more about
emotional parts of my life in my journal than on HN comments.

While this service might be useful to get a broad sense of people for
marketing purposes, using it on an individual basis is like talking to a
fortune teller—it could tell you nearly anything about yourself, and you'd be
able to come up with an explanation to justify it.

~~~
eksith
Exactly. We adopt a different tone based on the venue and, since no one shows
the same face everywhere, even with the same name, these are going to be
wildly different.

Ironically, this really did feel like a machine-based cold reading.

------
jcomis
Seems to be down for me: 404 Not Found: Requested route ('hn.mybluemix.net')
does not exist.

~~~
Jonovono
Ah, it's just hosted with IBM right now, so they must limit usage.

------
aselzer
Did this with Twitter:

    twurl "/1.1/statuses/user_timeline.json?count=200" | jq -r ".[] | .text" | pbcopy

Then paste it into
[http://watson-um-demo.mybluemix.net/demo](http://watson-um-demo.mybluemix.net/demo)
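If jq isn't available, the text-extraction step of that pipeline could be sketched in Python roughly as follows. This is only an approximation of `jq -r ".[] | .text"`; it assumes the timeline JSON is already held in a string, and `extract_texts` is a hypothetical helper name, not part of twurl or any Twitter library.

```python
import json


def extract_texts(timeline_json):
    """Approximate `jq -r '.[] | .text'`: join each tweet's text, one per line."""
    tweets = json.loads(timeline_json)
    # Assumption: every element is an object with a "text" field.
    return "\n".join(tweet["text"] for tweet in tweets)


if __name__ == "__main__":
    # Made-up sample data standing in for a real timeline response.
    sample = '[{"text": "first tweet"}, {"text": "second tweet"}]'
    print(extract_texts(sample))
```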

87% Openness, 5% Agreeableness, that's funny.

~~~
Tiksi
This inspired me to do the same with reddit, so I threw together this function
in zsh (should work with bash too):

    
    
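    # Usage: redcom <user> [after-token]
    # Appends every comment body of <user> to <user>-redcom.txt, page by page.
    function redcom() {
        local after="" data next
        [ -n "$2" ] && after="&after=$2"
        # limit=100 requests the largest page reddit allows.
        data=$(curl -s "https://www.reddit.com/user/${1}/comments.json?limit=100${after}")
        # Token for the next page; "null" on the last page.
        next=$(jq -r '.data.after' <<<"$data")
        echo "$next"
        jq -r '.data.children[].data.body' <<<"$data" >>"${1}-redcom.txt"
        # Rough progress indicator: lines in the response once escapes are expanded.
        echo "$(echo -e "$data" | wc -l) lines"
        # Recurse until reddit stops returning a next-page token.
        if [ -n "$next" ] && [ "$next" != "null" ]; then
            redcom "$1" "$next"
        fi
    }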
    

It's a bit more complicated due to the reddit API, but if you run that and
then run redcom <user>, it should throw all your comments into <user>-redcom.txt
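The complication is reddit's cursor-style paging: each listing response carries an `after` token that has to be passed back to get the next page, until the token comes back null. A minimal Python sketch of that loop is below. The `fetch_page` callable, `FAKE_PAGES`, and `fake_fetch` are hypothetical stand-ins so the example runs without network access; only the `data.after` and `data.children[].data.body` field paths are taken from the shell function.

```python
def collect_comment_bodies(fetch_page):
    """Follow the `after` cursor until it runs out and return all comment bodies.

    `fetch_page(after)` is a hypothetical callable returning the parsed JSON of
    one listing page; it is called with None for the first page.
    """
    bodies = []
    after = None
    while True:
        page = fetch_page(after)
        data = page["data"]
        bodies.extend(child["data"]["body"] for child in data["children"])
        after = data.get("after")
        if not after:  # a null or missing token means this was the last page
            return bodies


# Canned two-page listing standing in for real responses (made-up data).
FAKE_PAGES = {
    None: {"data": {"after": "t1_abc", "children": [
        {"data": {"body": "first"}},
        {"data": {"body": "second"}},
    ]}},
    "t1_abc": {"data": {"after": None, "children": [
        {"data": {"body": "third"}},
    ]}},
}


def fake_fetch(after):
    """Return the canned page for the given cursor."""
    return FAKE_PAGES[after]


if __name__ == "__main__":
    print(collect_comment_bodies(fake_fetch))
```

Using a loop rather than recursion avoids deep call stacks for accounts with many pages of comments.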

Never used/heard of jq before, it's a pretty nice tool.

------
minimaxir
Is there any insight as to how this works? I got an Openness rating of 97% and
a Harmony rating of 100%, both of which I know are not true. (I also received
a Love rating of 1% under my Needs, although that's pretty accurate.)

~~~
bane
A problem I've observed with these kinds of black-box systems is that the
process from input to output _really_ is a mystery.

When the results are right, they're just "right", so you should accept them.
When they're wrong, they're actually also right according to whatever magical
hamster wheel is operating inside of the thing, and you just don't "get it".

The problem is that humans like to have some clue as to how the results were
derived, something easy to explain that gets the gist across. Something like
"Watson counted all the words you use and compared them to different reference
lexicons to arrive at the score". This provides a little bit of context so we
understand the semantics of the result and how to consider them and reason
with them.
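As an illustration of that kind of easy-to-explain method (and emphatically not a description of how Watson actually works), here is a minimal Python sketch of lexicon-based scoring. The lexicons, the trait names used as keys, and the `score_text` function are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical reference lexicons; real ones would be far larger and
# derived empirically rather than picked by hand.
LEXICONS = {
    "openness": {"idea", "art", "imagine", "curious", "novel"},
    "agreeableness": {"thanks", "please", "agree", "help", "kind"},
}


def score_text(text, lexicons=LEXICONS):
    """Return, per trait, the fraction of words in `text` found in its lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {trait: 0.0 for trait in lexicons}
    return {
        trait: sum(counts[word] for word in vocab) / total
        for trait, vocab in lexicons.items()
    }


if __name__ == "__main__":
    print(score_text("Thanks, I agree: a curious idea."))
```

A score like this is at least inspectable: you can list exactly which words pushed a trait up, which is the kind of context the paragraph above asks for.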

But for all we know the results we're seeing are from some arbitrary
stochastic method:

    openness = rand(90, 99)
    harmony = rand(90, 100)

etc.

For things like this to be accepted by the users (humans) there needs to be a
quick explanation for how this works otherwise we get head scratchers.

~~~
keelyw
Please see my other 2 responses in this thread for some insight. I think I
posted them about the same time you posted this.

------
ultimoo
This is very cool. Although I don't know how accurate this is, if calibrated
and tuned to yield a certain degree of accuracy it will have a variety of use
cases.

For example, when interviewing someone, you could run their GitHub username
(if known, of course) through it to analyze their commit messages, comments,
and discussions. Or even their HN, reddit, or Twitter usernames (if the
usernames are linked with their first names, nothing creepy). It could
potentially help to identify candidates who are downright rude, arrogant, etc.

Or analyze internal mailing lists and HipChat/Slack channels for coworkers who
are potentially burnt out.

~~~
smtddr
_> It will potentially help to identify candidates that are downright rude,
arrogant etc._

This sounds very dangerous to me. I assume when recruiters and/or lead
engineers decide to reach out to me via LinkedIn, they did their homework on
me. I purposely link to enough stuff for them to realize "smtddr" is my
handle. The same blog & youtube channel in my HN profile is also in my
LinkedIn. But I expect a _human_ to look, not some computer judging me. I can
totally see people getting lazy and just doing stuff like only filtering for
people who rate 80% on openness or something. Then everyone will start
grooming their posts simply to get positive results... then someone will
create a social website that claims to block those scanners so people can say
whatever they want.

It just forces people underground and the filters won't work anymore since at
that point you might as well assume everyone is gaming the system.

_(fwiw, I'm also against standardized tests. Anything that forces a whole
group of people to start grooming themselves for a very specific measurement
kills diversity, imho. Since the very term "standardized" kinda goes against
the concept of diverse... and people become lazy and just rely on such tests
to make or break the deal)_

------
flatline
Would love to see something like this for reddit, where I'm a more active
poster on a wider variety of issues.

~~~
ljk
something like this?

[http://www.redditinvestigator.com/](http://www.redditinvestigator.com/)

~~~
Jonovono
Interesting! I like the fun guessed data. I am from Canada, my game of choice
is ping pong, and I don't have children.

"""

Probably from: Canada

Support OWS: Probably no or doesn't care.

Children: I do not think so...

Gamer: Only pong probably...

Like trees: Must be in a really good mood.

Behavior: Candidate as replace for Good Guy Greg

"""

------
Jonovono
For those who like this, there is a similar thing I saw a while ago (on here,
I am guessing) that analyzes Facebook posts and compares you:

[http://labs.five.com/](http://labs.five.com/)

------
EGreg
Gregariousness 8%

Given my name, I found that funny. Also, I know it's wildly inaccurate on
several characteristics, but maybe my persona here is like that! Interesting.

~~~
waterlesscloud
Yeah, I think it's worth keeping in mind it's scoring you based on your
comments on a particular kind of site which tends towards certain kinds of
interactions.

------
thegeomaster
Does Watson internally use observations drawn from this study[1]?

And another question---is this applicable to non-native English speakers? Do
they acquire the same language habits as if English was their mother tongue?

[1]:
[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783449/](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783449/)

------
ColinWright
Now I'm consistently getting this:

    
    
        502 Bad Gateway: Registered endpoint
                         failed to handle the
                         request.
    

So, still not working for me.

_Edit: In fact, entering someone else's username worked, and another, but
mine continues to fail. Is it just me?_

~~~
aameek
I believe you are talking about the hn.mybluemix.net app. That is not an IBM
app; an HN user put it together. It appears to crash often.

~~~
ColinWright
Well, yes, because that's the link in this submission.

------
Houshalter
If you picked random English words and put random numbers next to them, I
wouldn't be able to tell.

~~~
Sven7
But that's how the stock market works...

------
chuckcode
Not a lot of documentation on the IBM page about what the characteristics mean
or how they learn a mapping from text to these categories.

Curious to know what the average hacker news scores look like? I'm imagining
it is a pretty small segment of "normal" society.

~~~
keelyw
See my response to minimaxir above about additional documentation coming soon,
and a link with good background info on the technology. Meanwhile, here are
brief descriptions of the Big 5 Personality traits:

- Openness: associated with curiosity, intellect, and an appreciation for art
  and adventure
- Conscientiousness: associated with organization and industriousness
- Extraversion: associated with positive and outgoing attitudes toward other
  people
- Agreeableness: associated with compassion and cooperation toward other
  people
- Emotional Range: associated with a sensitivity to negative emotions

For more information on systematic associations between personality and
individual differences in word use, please refer to studies like Tal Yarkoni,
"Personality in 100,000 words: A large-scale analysis of personality and word
use among bloggers", 2010.

------
ommunist
It worked from the UK IP for me, but not from the US one. Anyway, it returned
502 after the request.

~~~
xchaotic
I bet Watson now thinks you like UKIP. And now me, so meta.

~~~
ommunist
I cheated it from Russia. UKIP is not my political fav.

------
jason_slack
Interesting, I analyzed myself and patio11.

any insight on the tech stack and how it was implemented?

~~~
elyrly
It would be interesting to see the code.

~~~
Jonovono
I'll throw it on GitHub right away. I basically just used IBM's Node sample
code, which had it all there; I just hooked it up to the HN API :p.
[https://www.ng.bluemix.net/docs/#starters/nodejs/index.html#...](https://www.ng.bluemix.net/docs/#starters/nodejs/index.html#downloadapp)

------
hoopism
100% Challenge?

Not sure what that means... but maybe this post is contributing to it?

------
spindritf
_Conscientiousness 23%_

Ouch! How did it know?

Some are completely off, with plenty of 1% cop-outs but a few — spot on.
What's the methodology? What does the need for practicality mean for example?

------
throwaway344
I would be curious to see if some of these characteristics are correlated with
higher average karma/total karma.

------
ColinWright
Hmm.

    
    
        404 Not Found:
          Requested route ('hn.mybluemix.net')
            does not exist.

~~~
xchaotic
Shall we rename the Slashdot effect?

------
dwd
I wonder how well it detects sarcasm?

------
arikrak
I analyzed my HN posts with the Watson API on an Oculus Rift and the
singularity happened.

