Show HN: London Feels – Sentiment-analysis of Londoners Tweets on a Map (feels.website)
58 points by radiodario on Oct 27, 2014 | 38 comments

I once did a map of the UK using sentiment analysis of the text of geotagged Flickr photos, hoping to find the areas which were happier than others. It turned out there was no geographical pattern in that data.

Geographical analysis tools should be used in these types of analyses, rather than just looking at blobs on a map. I used k-means based cluster analysis to find groups of happy and sad areas, but again the groups turned out to be nothing conclusive.

The web GIS company I ended up working for used sentiment analysis of tweets by aggregating them into regions, so as to find positive and negative areas during a specific timeframe (for example, US elections). The regions had demographics which could be used statistically, and in general some interesting patterns were observed.
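For anyone wanting to try the aggregation idea, here is a minimal sketch. It assumes each tweet already has a latitude, a longitude and a numeric sentiment score. The square grid, the `cellSize` parameter and the name `aggregateByCell` are my own illustrative choices, not what that company did; real regions such as wards or counties would need a point-in-polygon lookup instead.

```javascript
// Hypothetical sketch: bin scored tweets into square lat/lon cells and
// average the sentiment per cell.
function aggregateByCell(tweets, cellSize) {
  const cells = new Map();
  for (const { lat, lon, score } of tweets) {
    // Math.floor keeps negative longitudes in consistent cells.
    const key = `${Math.floor(lat / cellSize)}:${Math.floor(lon / cellSize)}`;
    const cell = cells.get(key) || { count: 0, total: 0 };
    cell.count += 1;
    cell.total += score;
    cells.set(key, cell);
  }
  // Convert running totals into a count and a mean per cell.
  const result = {};
  for (const [key, { count, total }] of cells) {
    result[key] = { count, mean: total / count };
  }
  return result;
}

// Made-up sample points, roughly in London.
const sample = [
  { lat: 51.51, lon: -0.13, score: 3 },
  { lat: 51.52, lon: -0.12, score: -1 },
  { lat: 51.61, lon: -0.02, score: 2 },
];
const byCell = aggregateByCell(sample, 0.1);
```

Keeping the count next to the mean matters: a cell with two tweets should not be read the same way as a cell with two thousand.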

What sort of accuracy did you have in your sentiment analysis algorithms? I'm curious because I find the error rate in such algorithms is typically higher than any sort of variance you are seeking, which causes significant problems in terms of any sort of pattern recognition.

When you're using things as short as Tweets, and as broad as "general sentiment", you're probably making accuracy even worse, to the point that simpler demographic analysis or bag-of-words clustering (i.e., cluster areas by diction rather than by sentiment) yields more reliable results, even for sentiment.

Accuracy was pretty bad in sentiment analysis, especially given how people talk in tweets. Loads of false positives.

Hi HN.

I've built a map that takes a geofenced stream of tweets and runs AFINN-111 sentiment analysis on them, and then displays them in real time on a map of London.

Negative sentiments are displayed as Red tweets, happy tweets are Blue.

The whole thing is built on node.js using node-tweet-stream, node-sentiment and socket.io. The frontend map is Leaflet with Stamen Design's Toner tiles.

It's quite fun to watch, especially when there's a football match or a concert. If you click on the "follow tweets" checkbox, new tweets pop up as they arrive, although currently that makes the map pan north.
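For readers curious what the scoring step roughly looks like, here is a standalone sketch of AFINN-style scoring plus a geofence check. It does not use node-sentiment or node-tweet-stream. The lexicon entries, the bounding box coordinates, the neutral colour and all function names are assumptions made for illustration.

```javascript
// Illustrative word scores in the AFINN style (integers from -5 to 5).
// These few entries are placeholders, not the real AFINN-111 list.
const LEXICON = new Map([
  ["happy", 3], ["love", 3], ["great", 3],
  ["sad", -2], ["hate", -3], ["awful", -3],
]);

// Rough bounding box for London; the exact coordinates are an assumption.
const LONDON = { south: 51.28, north: 51.69, west: -0.51, east: 0.33 };

function inGeofence(lat, lon, box) {
  return lat >= box.south && lat <= box.north &&
         lon >= box.west && lon <= box.east;
}

// Sum the scores of every known word; unknown words contribute 0.
function scoreText(text) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  return tokens.reduce((sum, t) => sum + (LEXICON.get(t) || 0), 0);
}

// Map a score onto the red/blue scheme described in the post.
function colourFor(score) {
  if (score < 0) return "red";
  if (score > 0) return "blue";
  return "grey"; // neutral colour is a guess; the post does not name one
}

// Turn a raw tweet into a map marker, or null if it is outside the fence.
function toMarker(tweet) {
  if (!inGeofence(tweet.lat, tweet.lon, LONDON)) return null;
  const score = scoreText(tweet.text);
  return { lat: tweet.lat, lon: tweet.lon, score, colour: colourFor(score) };
}
```

In a live setup each marker would then be pushed to the browser (the post mentions socket.io for that), which is left out here.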



Very nice. It also gives me the impression that people in the West End are certainly more angry (or have free time to be angry on twitter) during office hours than those in the City.

Very cool but before clicking on some dots I was wondering why everyone feels the same. The colors are not ideal for red/green colorblind people (is it blue and purple?)

Maybe include a feature to select the colors for happy/sad/average with a button to return to defaults?

Black for sad, light grey for neutral, something like a medium bright green for happy would be my picks.

cool idea - i tried to pick perceptually separate colours and thought blue and red would work, but it turns out it's confusing for some people. I might add "colorblind" mode and use your colour range.

"colorblind mode" is an antipattern. Using more contrasting colors and using other distinguishing features such as shape and texture benefits all users.

How about having numbers in the circles?

Cool! Hey, that's very funny....

We did the same with Tweets and Surfing. http://devwax.herokuapp.com/ from the meetup: http://www.meetup.com/DevWax/. It was all done in a weekend with some drinking and surfing, so it's a bit rough. The trouble with surfing was that the locations are very disparate and hard to guess. Fun to have a go at though...

Are you in London? (We are)

hey sweet orthographic projection! Yeah, i'm in London.

I'm only using tweets on which users chose to publish their location, so it isn't all of the tweets, but a good chunk of them.

Given that most people tweet close to home, and that most people work close to home, you can get the home location of the user from their profile, and geocode it to assign a location to the tweet.

This approach only works when aggregating tweets for a larger area, e.g. comparing 10,000 tweets in each UK county, or perhaps in each city.

For even larger areas (think regions / countries) you could look through the user bios, or previous tweets to pull out any names or locations and do some analysis to work out which broader region they are in.

Cool, we are in Southwark, doing a bunch of graph visualisation stuff (http://blog.stitched.io/), so come by for a beer at some point, would be good to chat.

Could perhaps be more accurately retitled 'London claims to feel on social media' map. There's a lot of literature examining how people present themselves in such venues and how it's often an intentional communication (even if subconscious) to create a certain impression.

Neat site. As others have pointed out, the sentiment analysis is off in many cases. I'd be quite impressed if you managed to correctly classify this one, though: http://i.imgur.com/wmWUitu.jpg

Nice site, although it does have issues with the analysis as outlined by others here.

One suggestion I would have is some sort of filtering based on the content of the tweets. This tweet returned "feeling good":

"New post featuring: @NewLookPRTeam @nextofficial @Matalan @hmunitedkingdom @uoeurope @Accessorize @ASOS http://t.co/nGsFj306xT #fbloggers"

This one returned "feeling average":

"@dannykobe17 @DanielRacheter @Khuds_ @shangambling @Umar_Wilshere19 @_mikenewell_ @BlueKay10 is that ollie?"

Whereas there's absolutely no real sentiment to derive from this sort of thing.

thanks! that's definitely a good idea, how would you go about doing this? counting @mentions vs "tokens"/words and setting a threshold ratio to remove tweets?
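The ratio idea in that question could be sketched roughly like this. Counting hashtags and links as noise alongside @mentions is my own extension, and the 0.5 threshold and the function names are arbitrary assumptions that would need tuning on real data.

```javascript
// Share of whitespace-separated tokens that are @mentions, #hashtags or links.
function noiseRatio(text) {
  const tokens = text.trim().split(/\s+/).filter(Boolean);
  if (tokens.length === 0) return 1; // empty text carries nothing to score
  const noise = tokens.filter((t) => /^(@|#|https?:\/\/)/i.test(t));
  return noise.length / tokens.length;
}

// Keep a tweet only if at most `maxNoiseRatio` of its tokens are noise.
function hasEnoughText(text, maxNoiseRatio = 0.5) {
  return noiseRatio(text) <= maxNoiseRatio;
}

// The two tweets quoted above are rejected; an ordinary one is kept.
const promo = "New post featuring: @NewLookPRTeam @nextofficial @Matalan " +
  "@hmunitedkingdom @uoeurope @Accessorize @ASOS http://t.co/nGsFj306xT #fbloggers";
const mentions = "@dannykobe17 @DanielRacheter @Khuds_ @shangambling " +
  "@Umar_Wilshere19 @_mikenewell_ @BlueKay10 is that ollie?";
const sunny = "Sunny day in London :) @ Green Park http://t.co/S71RQR2luU";
const kept = [promo, mentions, sunny].filter((t) => hasEnoughText(t));
```

With these inputs the promo tweet has 9 noise tokens out of 12 and the mention chain 7 out of 10, so both fall above the threshold.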

Good job, except I find the colors pretty unintuitive - why does red mean negative? It's a color definitely connected with love, anger, war etc. So blue is for cold, or maybe not showing emotions. I would definitely rethink that. And sentiment analysis doesn't always work - I got a tweet rated as "sweet" that ended with ":(".

well, i have to say the mapping of colours to concepts is highly cultural, so in some places it means anger, or war, or love or joy.

I chose red and blue because they're quite perceptually separated, so it's easy to point them out on the map.

I personally feel the colors should be ones that don't affect color blind individuals (e.g. light blue and orange). But great job overall!

Hey thanks for the feedback, I'll change the "bad" colour to orange - should be easy enough to do!

Very cool, thanks!

Some of these are unintentionally hilarious without context. Here's a real gem: https://imgur.com/c7Ly6Qm

All in all, though, impressive. Sure some are misclassified but it seems like a significant majority are not, including a lot of the hard ones. Good work!

I always have a look at CityDashboard : http://citydashboard.org/london/

It's got a bit more than just twitter feelz. Mostly Boris Bike usage rate :D

Clicked through ~20 of them and the analysis was completely off in most cases.

Yep, although in some cases understandably. One tweet listed as ‘Not good’ had the text ‘Killed it! 🔫’ and a location given as a comedy club. I’m assuming someone had had a good gig, but I’m not surprised a classifier algorithm got that wrong.

Some cases are just difficult, but the overall accuracy could probably be improved considerably if the sentiment analyzer were calibrated for the domain. AFINN (linked above) is calibrated from newspaper articles, which almost certainly have a different distribution of word/sentiment correlations than Tweets do. It's not hard for me to imagine that "killed" is a better predictor of negative sentiment when classifying news articles.

Sentiment.js (which is what we used in devwax, and I am guessing the same here) is just AFINN-based sentiment: https://www.npmjs.org/package/sentiment, which you can customize. So in our case we added things like {"barreling": 2} etc. What would be better is a bigram/trigram based approach so you could score "Going Off" and "Killed It" etc, but I am not sure if there is a js library that does that?
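A phrase-aware scorer along those lines is not much code to hand-roll. The sketch below is not an existing library: the function name and the phrase scores are invented for illustration, and only the {"barreling": 2} entry comes from the comment above. It tries the longest n-gram first and falls back to single words.

```javascript
// Hypothetical phrase table; scores are made up for illustration.
const PHRASES = new Map([
  ["killed it", 3],
  ["going off", 3],
]);
// Single-word fallback scores (also illustrative).
const WORDS = new Map([
  ["killed", -3],
  ["barreling", 2],
]);

function scoreWithPhrases(text, maxN = 3) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  let score = 0;
  let i = 0;
  while (i < tokens.length) {
    let matched = false;
    // Try the longest n-gram first so "killed it" wins over "killed".
    for (let n = Math.min(maxN, tokens.length - i); n >= 2; n--) {
      const phrase = tokens.slice(i, i + n).join(" ");
      if (PHRASES.has(phrase)) {
        score += PHRASES.get(phrase);
        i += n; // consume the whole phrase
        matched = true;
        break;
      }
    }
    if (!matched) {
      score += WORDS.get(tokens[i]) || 0;
      i += 1;
    }
  }
  return score;
}
```

The hard part is building the phrase table for the domain, not the matching itself.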

Makes sense, since AFINN is just a tab-separated list of words and [-5, 5] valence scores. [1]
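Loading a list in that shape takes only a few lines. This is a sketch assuming one "term<TAB>score" pair per line, with the sample entries below being illustrative rather than copied from the real file. Splitting on the last tab keeps multi-word terms intact.

```javascript
// Parse AFINN-style text into a Map of term -> integer score.
function parseAfinn(tsv) {
  const lexicon = new Map();
  for (const line of tsv.split(/\r?\n/)) {
    if (!line.trim()) continue; // skip blank lines
    const tab = line.lastIndexOf("\t");
    if (tab === -1) continue; // skip lines without a score column
    const term = line.slice(0, tab);
    const score = Number.parseInt(line.slice(tab + 1), 10);
    // Only accept scores in the documented [-5, 5] range.
    if (Number.isInteger(score) && score >= -5 && score <= 5) {
      lexicon.set(term, score);
    }
  }
  return lexicon;
}

// Inline sample in the same shape; the entries are illustrative.
const sampleLexicon = parseAfinn("abandon\t-2\nawesome\t4\ncan't stand\t-3\n");
```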

The impressive part of this for me is the visualization. Really nice.


Cool idea. Unfortunately I've yet to see sentiment analysis even really come close to providing any useful insights. It's just not accurate enough on 140 character tweets.

>It's just not accurate enough on 140 character tweets.

I would disagree but apply the caveat that there has to be heavy filtering of what is being analysed to derive anything of value from it.

A tweet like:

"“@BarkhamTaylor: #beatcameronathisowngame @aliceehoughton” let's get it trending #adrian"

Has no value for sentiment analysis, yet is lumped in with the rest of them, while something like:

"Sunny day in London :) @ Green Park http://t.co/S71RQR2luU"

Clearly has value regarding sentiment analysis. The current problem being that all of the junk gets marked as "average" or similar because sentiment can't be derived from it, which in the overall set skews things greatly.
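One way to stop the junk from skewing the overall picture is to leave tweets with no matched sentiment words out of the aggregate, instead of counting them as "average". A rough sketch, assuming each scored tweet carries a `score` and a `matchedWords` count (both field names are hypothetical):

```javascript
// Summarise only tweets where at least one lexicon word matched.
function summarise(scored) {
  const informative = scored.filter((t) => t.matchedWords > 0);
  const total = informative.reduce((sum, t) => sum + t.score, 0);
  return {
    informative: informative.length,
    ignored: scored.length - informative.length,
    // null rather than 0 when nothing informative was seen
    mean: informative.length ? total / informative.length : null,
  };
}

const summary = summarise([
  { score: 3, matchedWords: 1 },
  { score: 0, matchedWords: 0 }, // e.g. a chain of @mentions
  { score: -1, matchedWords: 2 },
  { score: 0, matchedWords: 0 },
]);
```

Here the mean over the two informative tweets is 1, whereas averaging over all four would have halved it.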

yep you're right - this is just a cool toy to have a look at what Londoners are up to - the focus isn't scientific, more like "hey what are all those people there tweeting about".

Yeah, the realisation of it is nice enough to look at, but the sentiment analysis (as always for these kinds of things) is pretty wonky.

Yep, i realised that - but node-sentiment was pretty quick to implement and gives alright results, so it's a bit of a tradeoff. Do you know of more accurate sentiment analysis services?

The dots all look exactly the same to me. I'm colorblind, which I'm sure is the reason for this.

A legend for what the colors mean would be extremely helpful.

Please make one for Manchester!

Yes! should be really easy.

i also want to do nyc and sf.
