Oh, and by the way, responding to these kinds of issues with “if you wanted your voice to be heard, you should have turned on analytics” is inexcusable.
Then again most browsers have a print button.
Every time a new rule was added, you had to wait up to 24 hours for the css servers to propagate.
You have to delete something every once in a while to make room for something new.
If you want to read the data behind many of the decisions, read these papers.
Just curious, what's so inexcusable about that?
Exactly. The ergonomics of much software are pretty damn terrible, with users struggling to get done what they want to do. Thus, developers who rely only on telemetry just cement in the existing problems.
It's for this and privacy reasons I always turn telemetry off.
It's very valuable feedback to hear, "wait, it does that?" Yeah, we spent a quarter million dollars on a feature that our users don't even know exists.
Sure it is. It's not a perfect indicator all the time, but can definitely pinpoint a common workaround users perform due to a lacking or unfriendly interface. It's just a helpful data point though, not a silver bullet.
I'm with you on the trust issues though when 90% of software companies abuse our data. Just another reason why we can't have nice things.
> In practice, I’m just doing workarounds to accomplish things the telemetry won’t explain on its own
This is one of the most obvious things people are looking for in analytics. People will always use your product in unpredictable ways to get whatever functionality it is that they really wanted. This is one of the things product managers are most interested in knowing about because they either want to properly implement that functionality, or they already have it and they want to know if they’re doing a bad job of getting their customers to adopt it.
It’s incredibly unlikely that you’re such an ultimate power user that what you want out of the product is so unique that you’re inventing your own usage patterns that others aren’t also following.
I completely agree that you should have the right to not share your information with anybody you choose. But if you choose not to share any information about your use case with the company that makes the product, then you don’t have much right to complain about it not doing what you want it to.
Please don't cross into personal attack. You don't need to, it poisons discussion, and it evokes worse from others. Your post would be fine without that bit.
Watching individual user sessions from session recordings provides more depth, but you are still missing out on great swathes of information.
Analytics and session recordings don't give anything near as much concrete data as taking the time to engage with your users. You then use analytics to confirm behaviour.
Hmm... In real life you would quickly discover how vulnerable all of your instruments are to both random and ill-intentioned interference.
You can apply the same phrase as a metaphor for software.
Well no, commercial pilots have to be rated IFR. Flying via instruments only is completely normal and part of the training. Visual is perhaps easier for many people, but you lose the use of visual navigation in all kinds of circumstances, like at night, over the ocean, or in a storm.
I generally disable the analytics in software I buy, but I do participate in surveys they send my way. Would that count for you?
Forums can also be a rich source of use case info for software developers who really care.
What you wrote here sounds very naive to me.
I might be completely wrong, but I assume if you have a large enough user base you would use telemetry only in an aggregated way, and then prioritize the majority of the user base. So I might totally be underrepresented but still have the same loss of privacy as everyone else.
I also feel I always have the right to complain about a product, publicly and vocally (it's just a different channel for analytics/telemetry, imo). Much like product owners have the right to just change it or remove features that break my workflow.
> I might be completely wrong, but I assume if you have a large enough user base you would use telemetry only in an aggregated way, and then prioritize the majority of the user base.
In some cases yes, but you need to get that aggregate data from somewhere. It’s also naive (and probably a bit arrogant) to assume that your requirements aren’t in fact well aligned with the majority of the user base.
This is however a somewhat immature way of using analytics data. There’s plenty of incredibly valuable data you can get from peculiar and unpredicted usage patterns. Any product manager will tell you that most customers are quite bad at expressing what they actually want. A good product manager can gather a fair amount of insight from what they choose to complain about, but analysing the most immediately obvious trends really only gets you so far. You’ll often get the most insight from the rare customers who are able to concisely articulate their feedback, and those who are frustrated and ingenious enough to subvert your application to their will.
For any motivated organisation, there are countless ways to get insight out of their analytics. You’ve just described the lowest hanging fruit.
> I also feel I always have the right to complain about a product
Depending on where you live, you likely have the right to complain about anything you like. But if you want to prevent an organisation from analysing your usage of their service, then complaining about them not meeting the needs of your use case is hypocritical.
No. In any high-dimensional space (in this case the dimensions are features used by a given user), the vast majority of the elements will be outliers in at least one dimension. Everyone has idiosyncrasies. It is arrogant to assume that everyone is the same as oneself.
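To put rough numbers on the high-dimensional point: a minimal sketch, assuming (my simplification, not something measured) that feature dimensions are independent and that a user is an "outlier" on any single dimension with the same small probability `p`. Under those assumptions the chance of being an outlier on at least one of `dims` dimensions is `1 - (1 - p) ** dims`. The function name is made up for illustration.

```python
def share_outlier_somewhere(p: float, dims: int) -> float:
    """Probability of being an outlier in at least one of `dims` dimensions.

    Assumes (hypothetically) independent dimensions that all share the
    same per-dimension outlier probability `p`.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if dims < 0:
        raise ValueError("dims must be non-negative")
    # Complement of "typical on every single dimension".
    return 1.0 - (1.0 - p) ** dims


if __name__ == "__main__":
    for dims in (1, 10, 50, 200):
        print(dims, round(share_outlier_somewhere(0.05, dims), 3))
```

With `p = 0.05`, more than 90% of users are atypical somewhere by the time you track 50 features. Real features are correlated, so treat this as an intuition pump rather than an estimate.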
The problem is that those companies view it as their data, which you have no right to modify, let alone to choose whether to transmit at all.
Before anyone replies this would invalidate telemetry, ask yourself this question: "If you see a spike in data that makes no sense, isn't that an indication that somehow your data collection policies are pissing off people?"
I really don't have the time or malice to do this. I can and do move on to something else.
Giving a business your data is an act of charity. It should never be an expectation.
Yes, it costs more to collect the data manually. It usually costs more to behave ethically when you aren't legally required to.
> Yes, it costs more to collect the data manually.
Not just that, but you also get data biased to people willing to spend two hours answering questions for a $25 Amazon gift card.
Sure, but the whole point of this article was that the telemetry data is also biased.
There is no magic bullet. Your options are to get biased data the cheap way or the ethical way. Either way, the data is biased and you have to consider that when making decisions.
It isn't just about ethics, and I think it should be "first, do no harm", as in, first, be ethical.
Btw, opt-in metrics are ethical and cheap afaik. As for biased? All metrics are biased...
A business is not obligated to improve a service just because you donated your telemetry data.
If they stop improving, I'll stop paying them as well as stop giving them my data.
Alternatively the business could pay us to do that but obviously they resist the idea.
Not all software telemetry is that creepy, but things like Google Analytics definitely are.
It’s not as big of a deal now that most users upload everything anyway.
Are you suggesting companies spend 10x on manual user research? Are you willing to pay more for a product because of increased costs to understand what users actually want?
Besides, when we bought magazines the editors didn’t know what stories people were reading or what words you were spending more time reading. When you buy a tool the manufacturer doesn’t know how you use it. Why software gets a free pass at getting all that data for free without asking is beyond me. What I expect is for regulations to eventually hit this industry.
this is the reality of how businesses operate.
> product pricing and labor aren’t that tightly correlated.
If you're the staff accountant, sure. You're technically correct.
If you're the CFO and I'm trying to convince you we need to spend $1M on user research instead of $100K, you can be sure this is taken into account when modeling monetization strategies to recoup R&D costs.
It's the reality of how a subset of industry operates, which doesn't make it less wrong.
Move your analytics to the server and stop tracking my every mouse move and page scroll. Needing to know that stuff suggests you're either creepy or that your organisation is saturated with marketing and SEO wonks.
User research and analytics are two entirely different things. Your product will suffer if you neglect the first.
If they want to make their product better, that's what they should do, instead of drawing wrong conclusions from the limited data that telemetry gives them.
> Are you willing to pay more for a product because of increased costs to understand what users actually want?
It's your business. Add telemetry, and have it turned off by a significant number of your users, so that the ones left might not be representative of your user base.
So you go ahead and optimize for a biased subset, maybe even interpret some of that data wrong, and before you know it, four iterations later some fancy startup is stealing the show, because they simply read all the detailed complaints about your software on reddit and HN instead of jacking off to analytics data.
It tracked this, of course, by assuming that the mail would be opened in a web browser that would make requests for images.
He cancelled that decision:
"Logged opens for each newsletter are between 53% and 60% -- but an experiment a while back revealed that hundreds of you aren't logged, for various reasons around email security and that one guy who reads everything on a 1980's greenscreen monitor. Clicking of links depends massively on what's in the newsletter, plus the note in the previous sentence, but has gone as high as 41% of readers on one week this year."
Some folks seem to think analytics is the only way to get feedback from users.
Just ask me. Survey, email me directly, anything but siphoning all my data off to Google in the process.
Authentic question: how would a business know this otherwise in an actionable and effective way?
Maybe two cohorts. The people we're really sure about, and the ones we're pretty sure about. If they diverge, ask why.
Note that reCAPTCHA thinks you’re a bot if you block analytics…
The Textarea Cache Firefox extension https://addons.mozilla.org/en-US/firefox/addon/textarea-cach... is incredibly useful, but I keep forgetting to install it when I get to a new machine. I forget only once.
If you're running anything with an API, then unless something's horribly wrong it's even easier: look at the number of requests being made to an API endpoint and spot check a few of the user identifiers (tokens, keys, whatever you're using) to see the variety of users.
All of this is assuming you're trying to merely investigate the volume of use of a feature, not trying to diagnose demographics. If you're trying to extract more fine-grained detail, I don't have as many answers; I hope others will chime in with constructive ways to get things like geographic demographics via server logs.
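As a minimal sketch of that volume check, assuming your access logs have already been parsed into `(endpoint, token)` pairs (the record shape and the function name are hypothetical, not any particular server's format), you could tally requests and distinct identifiers per endpoint like this:

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def endpoint_usage(records: Iterable[Tuple[str, str]]) -> Dict[str, Tuple[int, int]]:
    """Map each endpoint to (request count, number of distinct user tokens).

    `records` is an assumed, already-parsed stream of (endpoint, token)
    pairs taken from server-side logs.
    """
    hits: Dict[str, int] = defaultdict(int)
    users: Dict[str, Set[str]] = defaultdict(set)
    for endpoint, token in records:
        hits[endpoint] += 1
        users[endpoint].add(token)
    return {endpoint: (hits[endpoint], len(users[endpoint])) for endpoint in hits}


if __name__ == "__main__":
    sample = [("/export", "a"), ("/export", "a"), ("/export", "b"), ("/search", "c")]
    for endpoint, (requests, distinct) in endpoint_usage(sample).items():
        print(endpoint, requests, distinct)
```

A high request count with only a handful of distinct tokens usually means one integration hammering the endpoint rather than broad adoption, which is why the spot check on identifiers matters.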
I don't use Google analytics but I have seen time and again vocal users who seem ignorant of their own usage of an application...and very much not representative of the majority.
Plus, people often don't give valuable feedback when asked questions about features they want and use... people are poor judges of what they actually want, and will list things they think they care about and then end up never using or which don't affect their choices.
We do. We have real customers do real surveys of our real web sites. In person. We even do shadowing to see how real people use our sites in their daily work.
It's uncommon in SV due to laziness and an unwillingness to talk to actual human beings. Which is dumb because there are companies that will handle this for you.
But SV is stuck in this mindset that everything can be solved by an algorithm. It can't. The tech echo chamber really needs to get over itself.
Our customers certainly do. We get excellent results from asking a few simple questions now and then, providing both a good source of actionable feedback on feature requests and any current problems, and often some encouraging comments that reassure us we are basically doing things that our customers like.
It doesn't just have to be surveys with lots of participants, though. For example, we've known for decades that a simple observational study with just a handful of people is often enough to identify most of the serious usability problems with an interface.
The idea that everything important must be reduced to automated analytics and number-crunching is a very strange disease. Even if the numbers don't lie -- and as we see here, that is far from guaranteed -- you still need to be asking the right questions and comparing useful alternatives for the results to be valuable.
Sorry, your comment just reminded me of that. Are surveys perfect? No, but they have their uses, and plenty of companies find real value by making use of them.
Just searched "nginx 1st party analytics"
You can't just follow analytics. You have to understand your users.
It's true though. If you want to be recognized, you can't be incognito. That's like refusing to vote and then complaining about politics.
Chrome's analytics ought to say nobody uses Incognito Mode, but they'd be dumb to remove it.
Inform your users honestly and do not hide things. Make promises about tracking data usage and stick to them. Tell them the truth and give them a choice. Stop acting like you got something to hide and do your tracking yourself.
1. Number of visitors as recorded in Google Analytics
2. Number of loads of a 1x1 pixel served on a different domain
They see higher numbers for (2) than (1), and attribute the difference to users blocking Google Analytics.
I don't see them describing how they excluded bot traffic, however, and for my sites the majority of hits I get are from bots. Only some bots run JS, so I suspect their numbers for blocking users are thoroughly diluted by these bots.
(Disclosure: I work for Google, speaking only for myself)
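To show how much bots can move the number, here is a rough sketch of the arithmetic. It assumes (my assumptions, not the article's stated method) that every human visitor loads the pixel, that bots loading the pixel never show up in Google Analytics, and that you have some separate estimate of bot pixel loads. The function and parameter names are made up.

```python
def blocking_rate(ga_visits: int, pixel_loads: int, bot_pixel_loads: int = 0) -> float:
    """Estimate the share of human visitors who block Google Analytics.

    Assumes every human loads the pixel and that bots counted in
    `bot_pixel_loads` are never recorded by Google Analytics.
    """
    human_loads = pixel_loads - bot_pixel_loads
    if human_loads <= 0:
        raise ValueError("no human pixel loads left after removing bots")
    rate = 1.0 - ga_visits / human_loads
    # If GA reports more visits than the corrected pixel count, the bot
    # estimate was too high; clamp at zero instead of going negative.
    return max(0.0, rate)


if __name__ == "__main__":
    print(blocking_rate(600, 1000))       # ignoring bots
    print(blocking_rate(600, 1000, 250))  # after removing assumed bot loads
```

With 600 GA visits against 1000 pixel loads, the naive estimate is about 40% blocking; if 250 of those loads were bots, it drops to about 20%.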
The author extracted the browser information from the server logs (presumably from the User-Agent header, I guess?). If they were able to do this, I'd assume they also filtered out bots from the tally :)
And I don't know what you mean by "serious". The most common crawlers (Google, Baidu, Yandex, etc) identify themselves as bots in the User-Agent very clearly. Personally, those are the ones that I'd call the most "serious". And also the ones which I've seen generating the most traffic on servers.
But, is your experience that these kinds of bots cause much traffic? Because, from what I've seen, they can make a mess with fake accounts, fake content, fake clicks, etc, but as far as traffic goes, they were completely dwarfed by search engine crawlers and real users' traffic.
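For the self-identifying crawlers, filtering can be as simple as a substring match on the User-Agent. A minimal sketch, where the pattern list is illustrative and nowhere near exhaustive, and the function name is hypothetical:

```python
import re

# Substrings commonly found in User-Agent strings of crawlers that
# announce themselves. Illustrative only; extend for your own traffic.
_DECLARED_BOT = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def is_declared_bot(user_agent: str) -> bool:
    """Return True if the User-Agent openly identifies itself as a bot."""
    return bool(_DECLARED_BOT.search(user_agent or ""))


if __name__ == "__main__":
    agents = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    ]
    for agent in agents:
        print(is_declared_bot(agent), agent)
```

This only catches bots that are honest about it. Anything spoofing a browser User-Agent passes straight through, and a crude substring match can also misfire on a legitimate client whose name happens to contain "bot".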
Thanks for the clarification :D
Took me a while to realize that most of my readers block analytics since they're super privacy-savvy. I shut the tool down, it's useless for some crowds.
Nowadays I also doubt if it's ethical for any crowd.
Any company that does machine images for employees should ship them with adblockers preinstalled. There are zero downsides.
On the other hand, that’s a fairly entitled viewpoint.
Users are using infrastructure I finance, to do things on the website(s) I created. If you want privacy from the services you're using, create or host your own.
I keep the data secure and don’t give it to 3rd parties. That information is used to fix bugs, improve / build services, etc.
I’m not “subverting” privacy, users are coming to my house and playing with my things, so to speak.
That does not matter, at least in the EU where citizens are protected by the GDPR. Article 6 of the GDPR:
Processing shall be lawful only if and to the extent that at least one of the following applies:
(a) the data subject has given consent to the processing of his or her personal data for one or more specific purposes;
However, OP does not understand why you are so entitled as to want to use their website for free, and then also want to tell them that they are not allowed to make note of you having done so.
"his or her personal data"
Analytics is not personal. No one shares their name or social security number. And I strongly agree with the people above who mentioned that the visitor is the requester. We provide services, they use them, and we want to understand what we are doing by checking analytics. Nothing immoral about this.
‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
In the legal opinions that the EU provides with the GDPR, random tokens that are associated with a person are PD. An IP address is also considered personal data:
> Nothing immoral about this.
What is ethical is a personal opinion, but collecting personal data (following the definition above) is simply not legal without consent in the EU.
(f) processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.
General analytics on a website are probably not covered under f, since it is not necessary and not what you’d expect when you visit a website to which you have no customer relation.
There are clear cases where one has to collect data, even without consent, such as fraud prevention.
IANAL, but this is not legal under the GDPR. The GDPR requires opt-in. Moreover, you cannot make non-anonymous data collection mandatory to use a product, unless it is necessary for the product to function (in a strict sense).
Anonymous does not seem to include pseudonyms like a random identifier in the GDPR, since pseudonyms could be linked to real identifiers in the future.
As long as you stay away from the EU, they'll probably not touch you over GDPR.
I do the opposite, my webserver sends back some information unsolicited about my production environment that you can track if you want.
Not everyone does smart things with the data.
I'm not on Twitter…
> It comes across as smug, because you assume you're an authority on what a "good look" is, that is, what the correct thinking is. But you're doing this while the reader is noting that you failed to express a real argument, so they're not sure you even know what the correct thinking is, let alone why it ought to be correct.
I think it's fairly obvious what my argument is: people who are blocking client-side analytics have expressed a clear interest in not being tracked. Saying that you're tracking them anyway on your server shows that you don't really care about their wishes–and as others have mentioned, it might fall afoul of active consent legal requirements. Considering that my comment is currently at the top of this thread, I think it would be fairly difficult to not know what my argument is.
Now, Ghostery says I have "zero trackers" on my personal site.
Fathom doesn't collect personal info about my visitors. They just show me my aggregate metrics (popular pages, top referrers) which is all I need anyway.
Don't forget about all those smartasses spoofing their user agent just to make your life even harder.
Sidenote: ads on sites degrade the experience so badly, whenever I browse without adblock I am honestly shocked at how bad it is, and I am amazed others don't seem to care.
This is also missing the part where you can run Android without the Play Store using microG and F-Droid + Aurora. It's not a huge number, but there are still a lot of people blocking ads in one form or another--even if it's as simple as using Firefox or Brave for basic browsing.
It saved the mobile surfing experience for me (I was using Chrome on mobile before).
Because I don't think the amount of bloat on their phones is even unusual. Yet, Google keeps singling them out.
Recommend uBlock Origin instead. "Free. Open source. For users by users. No donations sought." https://github.com/gorhill/uBlock#installation
You'll have to get it from the F-Droid app store, but this doesn't require a rooted phone. It uses a local VPN to block ads at the network level, hence no ads in your apps either.
I've been using this for a month and it works wonderfully
There's always one
I wonder if this is actual data?
I do log some stuff sent to the server, as well as some stuff about the response (such as the HTTP status code, data size, timing, etc), although I do not sell this data to anyone else, and this logged data is reduced further if the client sends a "DNT:1" header.
But decisions about how to make something would normally be based on actual comments by the users, rather than analytics, I should think.
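The DNT-aware logging described above could look something like this sketch. Which fields get dropped when the client sends `DNT: 1` is my own guess for illustration (the comment doesn't say), and the function and field names are hypothetical.

```python
from typing import Dict, Mapping


def build_log_record(
    headers: Mapping[str, str], status: int, size: int, elapsed_ms: float
) -> Dict[str, object]:
    """Build a server-side log record, reduced when the client sends DNT: 1.

    The split between "always logged" and "dropped under DNT" is an
    assumption made for this example.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    # Response-side facts that say nothing about who the visitor is.
    record: Dict[str, object] = {
        "status": status,
        "size": size,
        "elapsed_ms": elapsed_ms,
    }

    # Only keep client-describing fields when tracking was not refused.
    if normalized.get("dnt") != "1":
        record["user_agent"] = normalized.get("user-agent", "")
        record["referer"] = normalized.get("referer", "")
    return record


if __name__ == "__main__":
    print(build_log_record({"DNT": "1", "User-Agent": "ExampleBrowser/1.0"}, 200, 5120, 12.5))
    print(build_log_record({"User-Agent": "ExampleBrowser/1.0"}, 200, 5120, 12.5))
```

Keeping the status code, size, and timing regardless of DNT still leaves enough to spot errors and slow responses.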
I'd imagine most users who block GA/ads on desktop would also want to on mobile, but can't just because it's so difficult to set up an adblocker on mobile.
A HOSTS file or blocking DNS server will easily do that, my whole network in fact has GA and a bunch of other crap blocked this way. On the other hand, setting up a MITM proxy/VPN is much much harder on mobile. However I am surprised at the 0% for Android and 17% for iOS blocking GA --- I was expecting it to be the opposite, with the former being historically much less of a walled-garden than the latter.
On the other hand, perhaps everyone who blocks GA and uses Android is in the aforementioned situation of not saying that they're using Android --- they may be reporting a Linux or some other user-agent.
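For anyone who hasn't done the HOSTS-file approach: each blocked hostname just gets mapped to an unroutable address. A small sketch that generates the lines, where the function name is made up and the domain in the example is only one hostname you might pick, not a complete blocklist:

```python
from typing import Iterable, List


def hosts_entries(domains: Iterable[str], sink: str = "0.0.0.0") -> List[str]:
    """Return hosts-file lines that point each domain at `sink`.

    Blank entries are skipped and duplicates (case-insensitive) are
    emitted once, in first-seen order.
    """
    seen = set()
    lines: List[str] = []
    for domain in domains:
        name = domain.strip().lower()
        if name and name not in seen:
            seen.add(name)
            lines.append(f"{sink} {name}")
    return lines


if __name__ == "__main__":
    for line in hosts_entries(["www.google-analytics.com"]):
        print(line)
```

Hosts files match exact hostnames only, with no wildcards, so every subdomain needs its own line. That is one reason people move to a blocking DNS server once the list grows.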
> From now on uBO will CNAME-uncloak network requests.
If you want to help me out, you are very welcome!!!
As a side note, those users were useless anyway as all of them were bots: https://www.reddit.com/r/marketing/comments/4smisl/facebook_...
Why would you hand over the data to a third party when you don't have to?
I can't believe how ignorant people that say "use server logs" are. They clearly haven't done any online marketing or run an online business, yet they want better and cheaper products - even free if possible.
How do you think a company gets to improve and optimize their product? By surveys? I think many assume the entire analytics required for a business is just reading a few GET requests from the server logs and categorizing them by user agent.
As far as I'm concerned online marketing is cancer. It acts against me, wastes my time, etc. Marketing should be about presenting your product in the best light possible, and stop there. You shouldn't be allowed to track or waste other people's time to promote your product - being in business is not a right after all.
> How do you think a company gets to improve and optimize their product? By surveys?
Yes exactly - companies were in business just fine for over a century and they didn't have analytics, why should it suddenly be required?
It's one thing to imagine how a business should be run in a utopian world where people respond to surveys and know exactly what they need to solve their issues (Henry Ford said it perfectly: "If I had asked people what they wanted, they would have said faster horses") and another thing to see how it works in the real world.
> Yes exactly - companies were in business just fine for over a century and they didn't have analytics, why should it suddenly be required?
This is a very ignorant response so I assume all your answers are the same.
> As far as I'm concerned online marketing is cancer.
I think you meant "advertising"... if yes, in some cases you're right.
If that is ignorant then I guess most companies founded in the 20th century (without analytics, stalking and targeted advertising) must not be real then? In fact I'd argue most of the money that funds the cancer that is modern advertising, marketing, etc was made before such things were actually invented.
> I think you meant "advertising"
I kind of agree, and this is why I clarified my definition of marketing. For me, marketing is about putting your product in the best light possible on your website (or physical space if you're into retail), so that people who stumble upon your product (randomly or by searching for it) will be encouraged to buy it.
The modern definition of marketing however seems to be stalking (aka analytics), spam (newsletters, push notifications), creepy targeted advertising (often relying on the previous items) and so on, essentially forcing your product onto people who haven't asked anything. I consider this modern marketing to be cancer.
As far as I am aware, Hacker News has never used Google Analytics.
I'm totally against ad tracking where a company can identify what I do on multiple websites and link my browsing behavior. But I have no problem to let a company track my clicks and what I do on their website if it's used to improve their product and ultimately to benefit from it.
And yet you have no problem using it when it's included as a third-party script, because you want to give the first-party that information. Why do you suddenly forget about the third-party (Google) in this framing?
My experience tells me that most "data-driven" products are absolute trash. The best products I've used were developed in the good, old-school way of someone having a concept of how the product should be, common sense, user research and feedback.
Nowadays I see shit changing all the time for no good reason because I guess somewhere there must be a 0.1% short-term increase in some metric, while annoying a bunch of people and eroding their goodwill.
I also don't feel "free" using a product with analytics because I don't always want my usage pattern to skew the data and make the product change in any way. There are times when I do weird things for good reasons, but it doesn't mean I want the weird thing to become the primary way to interact with the product because someone wanted to make a "data driven" decision.
> I'm totally against ad tracking where a company can identify what I do on multiple websites and link my browsing behavior
You do understand that third-party analytics services can do exactly that? And those provided by advertising companies like Google probably do use the data for their own purposes as well.
That doesn't mean you have a right to collect it from me, and if your business model fails as a result, well that's a shame for you.
It is impossible not to use it at this point. If you don't use it, you will be forced to add it for another 3rd party working with you or with your client.
Maybe the reason is that a lot of products are just useless trash and don't actually fill a legitimate need? They only sell because of the advertising exploiting a weakness in people's mind, but wouldn't sell on their own because it turns out people don't actually need this product?
If you make a product that truly solves a problem, it should sell itself. A good example of this is Monzo - they founded a new modern bank that addresses all the problems of the legacy banking industry. The product sold itself and they reached a million customers without any marketing.
In any case, whether you need ads and tracking to be in business is one thing, but being in business and profitable is not a right, so if you can't do so while respecting the law such as the GDPR then it shouldn't be our problem as users.
That is literally its purpose, yes.
> Seems that you don't make a difference between ad related tracking and product based tracking.
Now that's an interesting question - does Google combine data from GA into their other tracking, or does that data stay separate?
You can hide third party tools behind a proxy for example or write your own custom tracking software.
YC news tech gurus will come up with all sorts of geeky server-side solutions, but people need to accept Google Ads is a must for most companies out there, and one will use Google Analytics one way or another. No escape.
I want good things, I have them, and now I want greedy, needy people to stop pushing their mediocre, poorly understood imitations to drown out the really good things we might use and nurture instead, hurting us all and even themselves in the process.
If what you make serves a purpose other than generating needs to make a profit from, then you'll probably be fine with mostly simply paying attention to what you're making, using it yourself, and occasionally making surveys and collecting metrics from volunteers to see if there's anything you missed.