
There's a lot of discussion here of the 'western lens' as you bring up, but I'm not sure that's fair criticism. The creator(s) aggregated data and built something very interesting. To complain that the data they used isn't universal doesn't seem fair. I think Wikipedia is a reasonable starting place, but yes, Wikipedia skews geographically.

All datasets have bias. It's okay to acknowledge that and still find insights in the data.

Honestly curious: what highly accessible dataset that allows for the simple creation of 'fame metrics' would be better? I'm not aware of any.



It wasn't a criticism. Of course something like this is limited by the data available, and that's not the fault of the author. I was just musing on what might be a side effect of using what's available.

There wouldn't be a 'total complete and true set' of data for this task, since not all countries use wikis to the same extent, and languages don't map cleanly onto countries (e.g., Spanish Wikipedia is not exclusively the view of people from Spain, nor is English Wikipedia exclusively the view of people from England).


Sorry, I wasn't complaining about your post specifically, just a general tenor that kind of shits on this work because of an unsolvable problem: as you say there's actually no comprehensive dataset that would make it possible.

Your original comment was valid and insightful. I replied to it because it triggered a lot of secondary criticism about sample bias, and those are the comments I was most trying to respond to.


Ah I see, thanks for the clarification. It's a fair defense, and this is still definitely an awesome project!



