Of those who responded to the 'what is your blog and why should I read it' thread. HN is large. Likely larger than many of its visitors realize because the participants are a relatively small fraction of the readership.
So HN readers are not necessarily contributors. And not all contributors would plug their blogs in a thread asking them to do so.
If you want to get an idea of the HN readership rather than of the HN contributors, you may want to start off by scraping all the profile pages instead; that will give you a much larger set of sample data to work with.
> you may want to start off by scraping all the profile pages instead
Oddly enough, I'm in the middle of this project right now! There are over 600,000 users, and it turns out that many of them use their profiles to share links to things other than personal blogs. I've done some scripting to automate deciding if a link is a personal site or not, but the whole endeavor has been significantly less trivial than I had hoped.
Regardless, I plan on sharing results soon, so be on the lookout!
That’s a fair point! I would have to read HN’s terms of use, though; I'm not sure whether that’s allowed. I felt good scraping the comments section since everyone there “opted in” to share their website with the broader community.
You can take it as read that people who write blog posts do that to share them with the broader community.
Could one of you downvoters please explain why you think people would post blog posts if they do not want to share them? I fail to see how that would be possible.
And you never foresee the ripples. I wanted to buy books online and trade stocks when I was a student. MasterCard/Visa in Algiers, Algeria was a rarity, and banks didn't communicate.
I called every bank in the country that was listed by the central bank, I knocked on their doors until I found one that offered students a card.
I even had to write a document because they didn't have a form for that case. I finally got my card after a week of trips to the bank, city hall, administration, providing documents that were unlisted, etc.
I figured that this wasted time, multiplied by the number of people who would eventually go through the process, plus the fact that almost nobody in the country had the card, warranted detailing the steps. It would save time and help reduce the "underbanking". I wrote a blog post explaining how to do it step by step, as in: provide the following documents, take these amounts in these currencies. I attached the document/form I wrote so people could fill it in, take it directly to the bank, and save a trip. That asymmetry in information bothered me.
That 2013 post is read hundreds of times per day. It had more than a thousand comments, although it only lists 700+. People asking me questions, then getting their own cards, then themselves answering other people's questions, then updating me on what has changed.
People I would meet in real life would tell me I looked familiar, and then they'd put it together and tell me they followed the post to get their card. Sometimes people referred me to my own post in case I wanted to get a card.
I received emails from people who freelanced online and wanted to bring money back here. One person even sent me credentials to their online account with thousands of dollars in it and asked me if I could find a way to transfer it here (I told that person not to do that again, and she said she felt she could trust me).
Many times I'd receive a phone call from a friend who'd say they wanted to get a card, looked online, found a great post, then saw my name and laughed out loud because they knew me.
Yes, indeed. I have a few like that and I always wonder why those were the ones that got legs. This one for instance generates a couple of emails per week still years after:
Not a downvoter, but I did have a blog that I didn't promote, which I used mostly to get practice with my writing. I enjoyed it being public but with zero traffic, because writing that is 100% private can easily turn into a private rant. If there is the chance it will be seen, I hold myself to a higher standard, and hopefully that practice comes across when I engage in written communications with others, whether on HN, other sites, or at work.
Ultimately, though, all the sharing of our blog URLs and this related discussion made me realize that I didn't really want an audience, so I killed off my domain.
Direct personal experience is why I think someone would write a blog post and not have much interest in sharing. I’m not an enthusiastic self promoter. Turning up my lizard brain volume gets in the way of living in ways that I find more fulfilling.
I occasionally blog about interesting technical problems I encountered and how I solved them. Someone who has the same problem might happen upon my posts through a search engine and find them helpful, but I don't see the point of "promoting my blog" to people who're not looking to solve those problems. So the answer to "what is your blog and why should I read it" is that you probably should not, until you find my blog while researching a problem.
One reason is they don't actually want the "broader" community to see them. More than once I've seen people lament the fact that the "orange site" picked up one of their posts. There are Twitter accounts dedicated to making fun of things people say in the comments here. There are people who wrote scripts that do special things when the referrer to their site is HN.
Or maybe they've seen the thread and decided not to participate. For instance myself - my blog is nothing special, so I didn't include it, even though it's linked in my bio.
There is a difference between acknowledging publicly who you are and doxxing, which I consider to be publicly releasing your phone number and physical address (at least).
If you participate under a handle here and use your real name on your blog, it may effectively amount to the same thing.
People can have a variety of personal concerns, from a nutcase stalker ex to "I work for BigCo and want to spout off online without getting fired" to "I happen to have some bizarrely unique name, so using my real name anywhere amounts to doxxing myself."
Lots of people on HN use a throwaway email just for HN and don't want general HN readership to know much about their lives.
People use a wide variety of approaches to having an online life while looking out for their own specific privacy concerns. Please note that most people with privacy concerns will not chime in to this discussion to explain to you why they make the specific choices they make as that would tend to be counterproductive and undermine their goals of protecting their privacy.
Linking to your website has this effect. For example, one of the few pieces of information on my website is my amateur radio callsign. You can take that to the FCC's helpful database and then get my home address. I have it there because I think crime is low enough that it's worth having a Google result for "who is that KD2DTW guy that I just heard?"
> "Doxing" is a neologism that has evolved over its brief history. It comes from a spelling alteration of the abbreviation "docs" (for "documents") and refers to "compiling and releasing a dossier of personal information on someone".[9] Essentially, doxing is revealing and publicizing records of an individual, which were previously private or difficult to obtain.
> The term dox derives from the slang "dropping dox" which, according to Wired writer Mat Honan, was "an old-school revenge tactic that emerged from hacker culture in 1990s". Hackers operating outside the law in that era used the breach of an opponent's anonymity as a means to expose opponents to harassment or legal repercussions.[9]
Every time I see a privacy outrage thread over here I think about how many readers/commenters on this site work for advertising/tracking companies or companies whose products include tracking code. (Full disclosure: I don’t.)
Probably because there isn’t a good free analytics service that is easy to use (no need for self-hosting) and is able to collect the info that one wants. GA is free, and easy.
This was sourced from a posting asking people to promote their blog. People that promote their blogs are likely going to have some kind of analytics tool installed. Also, lots of blog platform sites come with analytics built in and some of those use Google.
I've wanted to dump GA forever, but it was only a year or so ago I finally took the plunge and set up Fathom analytics on my sites. Now I'm happier with the faster load times and the ability to control my own analytics data (though some of it is more limited since no cookies are used)... but I do have one more small VPS I'm maintaining pretty much in perpetuity (eventually it'll move to a personal K8s cluster but resources still cost a little).
The friction is just great enough that most people still stick with GA, since it's already pretty much everywhere.
I like the analysis. However, I am curious why entries are not sorted by counts. As a rule of thumb, sorting alphabetically makes no sense!
Also, 382 sounds like a very small number, given the size of HN. I did try to find many blogs I read, but couldn't find one. So, crucially - was it a random sample? Or sample from top-liked, or from a particular month?
Some findings (e.g. the prevalence of Wordpress) may depend on this procedure.
When I saw the thread, it had like 500 comments. My comment wouldn't have made a blip and like five people would have seen it, so I just didn't post it.
Thanks for the advice! I went in and sorted the tables. I love Markdown for its simplicity and because it forces me to focus on the content, but sometimes simple things like sorting a table are a pain. Such is life.
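For what it's worth, sorting a Markdown table by a count column can be scripted in a few lines. This is only a sketch: it assumes a plain pipe table with a header row, a divider row, and integer counts, and the function name `sort_markdown_table` and the sample rows are made up for illustration (they are not figures from the analysis).

```python
def sort_markdown_table(table: str, column: int, descending: bool = True) -> str:
    """Sort the body rows of a pipe-delimited Markdown table by an integer column."""
    lines = table.strip().splitlines()
    header, divider, rows = lines[0], lines[1], lines[2:]

    def count_of(row: str) -> int:
        # Drop the outer pipes, split into cells, and read the chosen cell as an int.
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        return int(cells[column])

    ordered = sorted(rows, key=count_of, reverse=descending)
    return "\n".join([header, divider] + ordered)


# Hypothetical sample data, not taken from the post.
SAMPLE = """
| Tool | Count |
|------|-------|
| Hugo | 12 |
| Jekyll | 30 |
| Gatsby | 7 |
"""

if __name__ == "__main__":
    print(sort_markdown_table(SAMPLE, column=1))
```

It would break on tables with escaped pipes inside cells or non-numeric counts, so treat it as a starting point.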
[off topic] FYI, I found this comment by subscribing to the RSS feed of your HN comments on Fraidycat (by linking to https://edavis.github.io/hnrss/ for your username)
Only 22% had a static site generator detected. Beyond that, 33% had a CMS detected.
That leaves nearly half of the analyzed sites as unknowns! Speculation on what might be effective aside, the real answer wrt OP is basically "it wasn't".
I generate my blog with a static site generator I wrote, but I don't see how anyone would be able to tell by inspecting the output.
A heuristic that might work would be to add a cachebuster query parameter to the page url (?cb=$RANDOM) and see how long it takes to respond. The idea is that the three most common setups are:
* static site served with apache/nginx/etc, which will just ignore the query param
* dynamic site, which will regenerate the page
* dynamic site behind a cache where the cache doesn't know that the query parameter isn't needed, and so the cachebuster will cause the page to be regenerated
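A rough sketch of that heuristic in Python, in case it helps. The thresholds (`slow_s`, `ratio`), the function names, and the three labels are my own guesses rather than anything measured; real timings are noisy, so this takes the fastest of a few samples and should be read as a hint, not a verdict.

```python
import random
import time
from typing import Callable
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def fetch_seconds(url: str) -> float:
    """Time one full GET of `url` over the network."""
    start = time.perf_counter()
    with urlopen(url, timeout=10) as response:
        response.read()
    return time.perf_counter() - start


def add_cachebuster(url: str) -> str:
    """Append a random cb=... query parameter, keeping any existing query."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("cb", str(random.randrange(10**9))))
    return urlunsplit(parts._replace(query=urlencode(query)))


def classify(
    url: str,
    timer: Callable[[str], float] = fetch_seconds,
    samples: int = 3,
    slow_s: float = 0.3,   # assumed cutoff for "the page was regenerated"
    ratio: float = 2.0,    # assumed slowdown that counts as a cache miss
) -> str:
    # Fastest of several tries, to dampen network jitter.
    plain = min(timer(url) for _ in range(samples))
    # Every busted request gets a fresh random value, so each one misses the cache.
    busted = min(timer(add_cachebuster(url)) for _ in range(samples))
    if busted >= slow_s and busted > plain * ratio:
        return "dynamic behind cache"
    if plain >= slow_s:
        return "dynamic"
    return "static or param-insensitive cache"
```

Calling `classify("https://example.com/")` hits the network; passing a fake `timer` makes it easy to try the logic offline. A slow static host or a fast dynamic one will fool it, which is why the cutoffs are parameters.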
Or look for a link to the website source on github, gitlab etc and go from there. But that's more or less manual work given how many custom made SSGs there could be.
It’s artisanally hand crafted HTML with a little VanillaJS on a few pages. No static generator used. Also hosted on Netlify. Although I use BunnyCDN for large media. I post very infrequently.
When I announced my own analytics thing here a few months ago I got quite a few signups from the HN thread, and based on the bug reports, support requests, etc quite a few of the users are just running it on their personal blog. I'm aware of some other solutions as well that aren't mentioned at all in that table.
Personally I found Matomo rather hard to use. I tried it for a while but decided that writing my own was easier than figuring out how to get Matomo to do what I want (also: cheaper and easier). I think it's a good alternative for some use cases, but far from all. Related comment I left on Lobsters a few days ago: https://lobste.rs/s/cdrrty/why_you_should_stop_using_google#...
I suspect that the larger-than-one-might-expect representation here for Erlang and Cowboy (an Erlang web server) is caused by sites hosted on Heroku, which would return Cowboy/Vegur in the Server header, rather than because many HN users are actually maintaining Erlang application servers to serve their blogs.
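If someone wants to check this, the detection step could separate "Cowboy because of the hosting platform" from "Cowboy because the author runs Erlang" by looking at the response headers together. A minimal sketch: the function name and labels are hypothetical, and the `Via: 1.1 vegur` check is my assumption about how Heroku's router identified itself, since the comment above only mentions the Server header.

```python
from typing import Mapping


def likely_origin(headers: Mapping[str, str]) -> str:
    """Guess whether a Cowboy response came from a platform router or a self-run server."""
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    server = lowered.get("server", "")
    via = lowered.get("via", "")

    mentions_cowboy = "cowboy" in server
    mentions_vegur = "vegur" in server or "vegur" in via

    if mentions_cowboy and mentions_vegur:
        return "probably a Heroku-hosted app"
    if mentions_cowboy:
        return "Cowboy without router hints; could be a self-run Erlang server"
    return "no Cowboy header"
```

For example, `likely_origin({"Server": "Cowboy", "Via": "1.1 vegur"})` falls into the first bucket, so those sites could be counted under hosting rather than under Erlang.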
We run our whole website on ASP.NET Core, but we have a forum and blog which basically automatically mandate MySQL. We played around with PHP on IIS [0] for many years before giving up. We also tried the opposite and hosted our app on Mono before giving up on that due to bugs that would randomly cause compilation errors [1].
Long story short, two completely separate backends, each running on the most reliable platform for the stack. And nginx is in front of it all. Ping the root domain and you’ll think you are on a bog-standard Linux/nginx config, even though it definitely is not.