I don't know if this calculator was good or bad, but the rationale sounds superficially ridiculous.
> Visitors who didn't see the calculator were 16% more likely to sign up and 90% more likely to contact us than those who saw it. There was no increase in support tickets about pricing, which suggests users are overall less confused and happier.
Of course if you hide the fact that your product might cost a lot of money from your users, more of them will sign up. Whether they are better off depends on whether they end up getting a bill they are unhappy with later at some unspecified future date, or not. That's not something you will figure out from a short-term A/B test on the signup page. So this seems like totally useless evidence to me.
I see this dynamic frequently with A/B tests. For example, one of my coworkers implemented a change that removed information from search result snippets. They then ran an A/B test that showed that after removing the information, people clicked through to the search result page more often. Well, obviously, it makes sense that they might click through more often, if information they wanted which was previously in the snippet, now requires them to click through. The question of which is actually better seemed to have been totally forgotten.
> Of course if you hide the fact that your product might cost a lot of money from your users, more of them will sign up
The problem with their calculator was that users entered slightly wrong data, or misunderstood what some metric meant, and suddenly a price 1,000x the real one was shown. Their dilemma was "how do we fix those cases", and the solution was "get rid of the messy calculator".
But they are not hiding a 1,000x cost, they are avoiding losing users who get a wrong 1,000x quote.
Being a complete cynical bastard here but I sometimes feel like these calculators are actually meant to obfuscate and confuse and the result is that a startup worried about scale is going to pay over the odds and then deal with ‘rightsizing’ after the costs get out of control.
I felt like that with elastic serverless’ pricing calculator which on the surface looks perhaps cheaper or more efficient than a normal managed cluster, because you think it would be like lambda. Except there are so many caveats and unintuitive hidden costs and you’ll likely pay more than you think.
Can't speak for everywhere of course, but at the places I have worked nobody likes spikes or overcommitments. The customer is shouting at your people, salespeople and support spend time and get stressed dealing with them, leadership gets bogged down approving bill reductions. Even if those are granted, customers remember the bad experience and are probably more likely to churn.
My cynical take: I make things that look hard to make in order to impress you, but if you make them for me, I feel my money is going into the calculator rather than the product.
In my mind Pinecone is an exemplar of modern "social media marketing" for a technology company.
They started on vector search at a time when RAG in its current form wasn't a thing; there were just a few search products based on vector search (like a document embedding-based search engine for patents that I whipped into shape to get in front of customers), and if you were going to use vector search you'd need to develop your own indexing system in house or just do a primitive full scan (sounds silly, but it's a best-case scenario for full scan, and vector indexes do not work as well as 1-d indexes).
They blogged frequently and consistently about the problem they were working on, with heart, which I found fascinating because I'd done a lot of reading about the problem in the mid-aughts. Thus Pinecone had a lot of visibility for me, although I don't know if I am really their market. (No budget for a cloud system, full scan is fine for my 5M document collection right now, I'd probably try FAISS if it wasn't.)
Today their blog looks more polished than it used to, which makes it a little harder for me to point out how awesome their blog was in the beginning, but this post is definitely the kind of post that they made when they were starting out. I'm sure it has been a big help in finding employees, customers and other allies.
Thank you. :) I don’t think of it as social media marketing but more of helping our target audience learn useful things. Yes that requires they actually find the articles which means sharing it on social, being mindful of SEO, and so on.
Probably our learning center is what you’re thinking of. https://www.pinecone.io/learn/ … The blog is more of a news ticker for product and company news.
Why not fix the calculator in a way that avoids/mitigates scenarios where users arrive at wrong quotes, and then do an A/B test? This setup seemingly tilts towards some sort of a dark pattern IMO.
Because the results were probably wrong because the inputs were wrong (exaggerated by over-cautious users). There is no automated way to avoid that in a calculator; only a conversation with a real person (sales, tech support) will reveal the bad inputs.
I wonder if some of that could have been automated. Have a field to indicate if you are an individual, small business, or large business, and then at least flag fields that seem unusually high (or low, don’t want to provide too-rosy estimates) for that part of the market.
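Something like this rough sketch is what I have in mind; the segments, field names, and thresholds here are entirely made up:

    # Hypothetical sanity check for a pricing calculator: flag inputs that look
    # implausible for the customer's self-reported segment. All thresholds invented.
    TYPICAL_RANGES = {
        # segment: {field: (low, high)}
        "individual":     {"vectors": (1_000, 5_000_000),          "queries_per_sec": (1, 50)},
        "small_business": {"vectors": (100_000, 50_000_000),       "queries_per_sec": (1, 500)},
        "enterprise":     {"vectors": (1_000_000, 10_000_000_000), "queries_per_sec": (10, 10_000)},
    }

    def flag_unusual_inputs(segment: str, inputs: dict) -> list[str]:
        """Return warnings for fields outside the typical range for this segment."""
        warnings = []
        for field, value in inputs.items():
            low, high = TYPICAL_RANGES[segment].get(field, (None, None))
            if low is None:
                continue
            if value < low:
                warnings.append(f"{field} looks unusually low for a {segment}")
            elif value > high:
                warnings.append(f"{field} looks unusually high for a {segment}")
        return warnings

    # An individual claiming 2 billion vectors gets a gentle nudge instead of a 1,000x quote.
    print(flag_unusual_inputs("individual", {"vectors": 2_000_000_000, "queries_per_sec": 5}))

It wouldn't catch everything, but it would catch the 1,000x cases the article describes.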
Relatedly, there seemed to be no acknowledgement of the possibility of dark incentives: many businesses have found they can increase sales by removing pricing details so that prospective customers get deeper into the funnel and end up buying because of sunk time costs even though they would have preferred a competitor. Example: car dealerships make it a nightmare to get pricing information online, and instead cajole you to email them or come in in person. In other words, a calculator makes it easier to comparison shop, which many businesses don't like.
I have no idea if that's a conscious or unconscious motivation for this business, but even if it's not conscious it needs to be considered.
To be fair, the pricing is still available in this case: https://www.pinecone.io/pricing/ (though the "Unlimited XXX" with the actual price below in gray might be considered misleading)
Only if this were untrue, i.e., I was motivated by the fact that it made my customers believe my product was better than a worse product.
For me the principle is based on not exploiting the gap between the consumer and a better informed version of themselves. (“What would they want if they knew?”) There’s a principle of double effect: I don’t have to expend unlimited resources to educate them, but I shouldn’t take active steps to reduce information, and I shouldn’t leave them worse off compared to me not being in the market.
This is a blind spot for pretty much the entire industry, and arguably spreads beyond tech, into industrial design and product engineering in general. Of course being transparent with your users is going to be more confusing: the baseline everyone's measuring against is treating users like dumb cattle that can be nudged to slaughter. Against this standard, any feature that treats the user as a thinking person is going to introduce confusion and compromise conversions.
> treating users like dumb cattle that can be nudged
Essentially the failure is that we do treat users like this by relying on mass collection of data instead of personal stories. To be human is to communicate face to face with words and emotions. That's how you can get the nuanced conclusions. Data is important but it's far from the whole story.
Absolutely true! An A/B test enthusiast in our team once significantly reduced padding on the pricing page to bring the signup button above the fold and used the increase in signup button clicks as a proof of success of the experiment. Of course, the pricing page became plain ugly, but that didn't matter, because "Signups are increasing!!"
In this case, I do agree that the calculator is a bit daunting if you're not used to all the terms, but what should be done with it should have been an intuitive decision ("what can we do to simplify the calculator?") Not a fan of A/B testing culture that everything needs to be statistically analyzed and proved.
>> Of course, the pricing page became plain ugly, but that didn't matter, because "Signups are increasing!!"
I'm not sure I'm following you here, so perhaps you'd care to elaborate?
The GP critique was that it was perhaps just creating a problem elsewhere later on. I'm not seeing the similarity to your case where the change is cosmetic not functional.
The issue of whitespace (padding) is subjective (see the recent conversation about the old and new Windows control panels), but "scrolling down" does seem to be something that should potentially be avoided.
If sign-ups are increasing is that not the goal of the page? Is there reason to believe that the lack of padding is going to be a problem for those users?
I think one problem is that a better design would move the button above the fold without ruining the spacing, and therefore achieve a better result with even higher lift, but someone focused on just the numbers wouldn't understand this. The fact that the A/B test has a big green number next to it doesn't mean you should stop iterating after one improvement.
A/B testing has something to say about (ideally) a single choice.
It has nothing to say about optimal solutions. Nothing about A/B testing suggests that you have reached an optimal point, or that you should lock in what you have as being the "best".
Now that the button position (or tighter layout) has been noted to have a material effect, more tests can be run to determine any more improvements.
The issue is that A/B testing only looks at outcomes, not reasons. There is a possibility that having the sign-up button above the fold wasn't the contributing factor here, or that it was only a contributing factor. Having to scroll through an estimate may lead a potential customer to believe that pricing is too complex or, worse, that the vendor is trying to hide something. Perhaps there are other reasons. The problem is that A/B testing will only tell you the what and not the why.
If it was your average modern startup page where you have to scroll down 3 screens to see anything besides generic marketing drivel and licensed stock photos of smiling corporate drones, of course reducing whitespace so people can see what the product is about will increase signups.
A/B tests suck because you are testing two cases which are probably both not the best case. If you take your learnings from the A/B test and iterate on your design, that's a viable strategy, but proposing a shit design and insisting on deploying it is wrong.
I'm going to assume the 90% number was simply hyperbole, because it's trivially false in any number of ways:
Firstly, many businesses have never heard of A/B testing, much less applied it rigorously to proposed changes.
Secondly, many businesses have found their niche and don't change anything. There's a reason "that's not how we do it here" is a cliche.
Thirdly, a whole slew of businesses are changing things all the time. My supermarket can't seem to help themselves iterating on product placement in the store.
Blaming testing in general, or A/B testing specifically for some companies being unwilling to change, or iterate, seems to be missing the actual problem.
Frankly, with regard to websites and software I'd prefer a little -less- change. I just get used to something and whoops, there's a "redesign" so I can learn it all again.
I feel like that example is missing some context - if signups did increase then their experiment was successful - we aren’t here to make pretty pages, we’re here to make money.
The problem is that it's easy to prove that signups are increasing, and a lot harder to prove that there was a measurable increase in the number of paying users. Most A/B tests focus on the former, very few on the latter. We had a free plan, and most users who signed up never made a single API request. So, assuming that the increase in signups is driving more business is just foolhardy.
> The problem is that it's easy to prove that signups are increasing, and a lot harder to prove that there was a measurable increase in the number of paying users.
Okay? The A/B test sought to measure which of two options A and B led to more signups.
> So, assuming that the increase in signups is driving more business is just foolhardy.
Your "A/B test enthusiast" was not testing for or trying to prove a causal relationship between increased signups and more business.
If he made the claim separately, then that is the context that is missing from now multiple comments.
You can always track the signup/paying-users ratio. The purpose of the landing/pricing page is to get users to sign up. Unless some dark pattern or misinformation is used to confuse users into signing up, more signups is a positive thing.
If I had a choice between ugly and rich and pretty and poor I'd be sorely tempted by ugly, particularly if I was making the decision for an organization.
> Whether they are better off depends on whether they end up getting a bill they are unhappy with later at some unspecified future date, or not.
How is that a function of the overly-simplified and often-wrong calculator?
If the user is never there to be happy/unhappy about it in the first place, then how would you test this anyway?
By closing the loop and increasing engagement, you are increasing the chance that you can make the customer happy and properly educated through future interactions.
The author was very careful with their words: they didn't say the calculator was wrong. They said it was confusing and sensitive to small adjustments. It's likely that the same confusion and variable sensitivity exists during usage. IMHO they should have bitten the bullet and revised the pricing model.
Fair point. The author commented elsewhere here and stated that it's not the usage but the understanding of the variables in the calculator which are often wrong by more than 10x. From the response, it seems like the only way to know how much something will cost is to actually run a workload.
Edit: if the customer is getting a wrong-answer because of wrong-inputs, IMO, it's still a wrong-answer.
> IMHO they should have bit the bullet and revised the pricing model
I don't know enough to agree/disagree because they may be offering close to at-cost pricing, which might give better overall pricing than competitors. It's a complex game :)
> so we dug into it and realized the calculator was far more confusing and sensitive than we thought. One slight misinterpretation and wrong input and you'd get an estimate that's overstated by as much as 1,000x.
They should have looked into this to see how to make it more obvious or more reflective of "regular use case"
Their sliders there are not detailed enough. For example, what are namespaces, and how many would a typical user need? Is 100 too much or too little? And if this is one of the variables that is too sensitive, they would need to represent it in a different way.
There are multiple companies on my blacklist that definitely got me to sign up. But as there was a hook that anybody acting as a trustworthy partner would have mentioned, I parted with them — potentially for life. You know, things like "click here to sign up, sacrifice your newborn on a fullmoon night while reciting the last 3 digits of pi to cancel"
I don't particular care whether their A/B test captures that potential aspect of customer (dis)satisfaction, but I am not sure how it would.
I designed an internal system that optimises for long term outcomes. We do nothing based on whether you click “upgrade”. We look at the net change over time, including impact to engagement and calls to support months later and whether you leave 6 months after upgrading. Most of the nudges are purely for the customer’s benefit because it’ll improve lifetime value.
That's the only thing I was thinking with their A/B test. The calculator might immunize against unhappy customers later on. I think they could've looked at something like the percentage of customers who leave one or two billing cycles later.
Unfortunately, there's never enough time to run a proper experiment - we want answers now! Who cares if they're the right answers. Short-termism can't wait two months.
You could only be measuring in aggregate, no? Overall signal could be positive but one element happens to be negative while another is overly positive.
Well, adjusting nudges in aggregate but diced in various ways. Measured very much not in aggregate. We’d see positive and negative outcomes roll in over multiple years and want it per identifier (an individual). I’ve heard of companies generating a model per person but we didn’t.
A silly amount of work but honestly lots of value. Experimentation optimising for short term goals (eg upgrade) is such a bad version of this, it’s just all that is possible with most datasets.
That’s why you need domain experts and clear explanations and hypotheses before you experiment, otherwise you’re throwing shit at a wall to see what sticks.
Companies can continue to monitor cohorts to compare retention to check the potential outcomes you highlighted.
Also clicking through is not a good thing if it doesn’t result in revenue! Why do I want to render eight pages for someone who will never give us money if I can find that out in three?
Upvoted to spread the immense wisdom in this post. But golly I must say the line can get blurry quickly.
> Would anything of value be lost if this or that chunk of it was removed?
In early stage projects I’ve seen this mentality backfire occasionally because it’s tough to estimate future value, especially for code and data.
For example, one time in a greenfield project I created the initial SQL schema which had some extra metadata columns, essentially storing tags for posts. The next week, a more senior engineer removed all those columns and associated code, citing the YAGNI principle (“you aren’t gonna need it”). He was technically correct, there was no requirement for it on our roadmap yet.
But the original work had taken me maybe an hour. And the cost of keeping the data around was approximately zero. It seemed he didn’t consider that.
Guess who needed the columns to build a feature a year later? Yeah me, so I found myself repeating the work, with the additional overhead of prod DB migrations etc now that the product had users.
I guess my point is, sometimes it’s also wise to consider the opposite:
Would anything of value be gained if this or that chunk of it was removed?
In this article the evidence is clearly affirmative, but in my case, well it wasn’t so clear cut.
I'm sympathetic to your situation, but it's possible that the senior was still right to remove it at the time, even if you were eventually right that the product would need it in the end.
If I recall correctly they have a measure at SpaceX that captures this idea: The ratio of features added back a second time to total removed features. If every removed feature was added back, a 100% 'feature recidivism' (if you grant some wordsmithing liberty), then obviously you're cutting features too often. 70% is too much, even 30%. But critically 0% feature recidivism is bad too because it means that you're not trying hard enough to remove unneeded features and you will accumulate bloat as a result. I guess you'd want this ratio to run higher early in a product's lifecycle and eventually asymptote down to a low non-zero percentage as the product matures.
From this perspective the exact set of features required to make the best product are an unknown in the present, so it's fine to take a stochastic approach to removing features to make sure you cut unneeded ones. And if you need to add them back that's fine, but that shouldn't cast doubt on the decision to remove it in the first place unless it's happening too often.
Alternatively you could spend 6 months in meetings agonizing over hypotheticals and endlessly quibbling over proxies for unarticulated priors instead of just trying both in the real world and seeing what happens...
At one time Musk stated that SpaceX data suggested that needing to add back 15% of what was removed was a useful target. He suggested that some of the biggest problems came from failure to keep requirements simple enough due to smart people adding requirements, because they offered the most credible and seemingly well-reasoned bad ideas.
Thanks I couldn't remember the exact number. 15% seems like a reasonable r&d overhead to reduce inefficiencies in the product. But I suspect the optimal number would change depending on the product's lifecycle stage.
How would you measure / calculate something like that? It seems like adding some amount back is the right outcome, and not too much either, but putting a number on it is just arrogance.
Either accepting or dismissing the number without understanding its purpose or source can also be arrogance, but I agree that throwing a number out without any additional data is of limited, but not zero, usefulness.
When I want to know more about a number, I sometimes seek to test the assumption that an order of magnitude more or less (1.5%, 150%) is well outside the bounds of usefulness—trying to get a sense of what range the number exists within
I think we're getting hung up on the concept of dismissing. To question skeptically, to ask if there is evidence or useful context, to seek to learn more is different than to dismiss.
The 5 Step Design Process emphasizes making requirements "less dumb," deleting unnecessary parts or processes, simplifying and optimizing design, accelerating cycle time, and automating only when necessary.
Musk suggests that if you're not adding requirements back at least 10%-15% of the time, you're not deleting enough initially. The percentage is an estimate initially based on experience, and now for several years based on estimates from manufacturing practice.
> How would you measure / calculate something like that?
SpaceX probably has stricter processes than your average IT shop, so it isn't hard to calculate stats like that. When you have the number, you tune it until you are happy, and now that number is your target. They arrived at 15%. This process is no different from test coverage numbers etc.; it's just a useful tool, not arrogance.
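For instance, if removals and re-adds are tagged in whatever tracker you use, the ratio itself is trivial; the data below is invented:

    # Sketch: compute a "feature recidivism" ratio from a removal log, assuming
    # each removal is later marked if it had to be added back. Data is invented.
    removals = [
        {"feature": "csv-export",         "re_added": True},
        {"feature": "pricing-calculator", "re_added": False},
        {"feature": "legacy-api-v1",      "re_added": False},
        {"feature": "dark-mode",          "re_added": True},
    ]

    def recidivism_ratio(log):
        """Fraction of removed features that had to be added back."""
        if not log:
            return 0.0
        return sum(1 for entry in log if entry["re_added"]) / len(log)

    print(f"{recidivism_ratio(removals):.0%}")  # 50% here; the quoted target is ~10-15%

The hard part is the discipline of tagging, not the math.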
I have no clue what you are saying. "They did it somehow"? How? Maybe they did not measure it, but Elon just imagined it. How can we tell the difference?
Yeah, not sure how you'd measure this apart from asking people to tag feature re-adds. And all that will happen is that people will decide that something is actually a new feature rather than a re-add because the threshold has already been hit this quarter. Literally adding work for no benefit.
Maybe it's just down to the way the comment was written and it actually played out differently, but the only thing I'd be a bit miffed about is someone more senior just coming in and nuking everything because YAGNI, like the senior guy who approves a more junior engineer's code and then spends their time rewriting it all after the fact.
Taking that situation as read, the bare minimum I'd like from someone in a senior position is to:
a) invite the original committer to roll it back, providing the given context (there isn't a requirement, ain't gonna need it, nobody asked for it, whatever). At the bare minimum this might still create some tension, but nowhere near as much as having someone higher up the food chain taking a fairly simple task into their own hands.
b) question why the team isn't on the same page on the requirements such that this code got merged and presumably deployed.
You don't have to be a micromanager to have your finger on the pulse with your team and the surrounding organisational context. And as a senior being someone who passes down knowledge to the more junior people on the team, there are easy teaching moments there.
The main issue with adding something you might need in the future is that people leave, people forget, and then, one year later, there's some metadata column but no one remembers whether it was already used for something. Can we use it? Should we delete it? Someone remembers Knight Capital and the spectacular failure when an old field was reused. So it's always safer to keep the existing field, and then you end up with metadata and metadata_1. Next year no one remembers why there are two metadata fields and is very confused about which one should be used.
Easy solution: Add a comment in your schema-creation SQL script explaining what the purpose of the column is. Or some other equivalent documentation. Stuff like that should be documented in any case.
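For example, if the schema lives in Python rather than raw SQL, SQLAlchemy lets you attach the note to the column itself so it ends up in the database too (the table, column, and ticket reference here are hypothetical):

    from sqlalchemy import Column, Integer, String, Table, MetaData

    metadata = MetaData()

    posts = Table(
        "posts",
        metadata,
        Column("id", Integer, primary_key=True),
        Column(
            "metadata_tags",
            String,
            nullable=True,
            # Emitted as a column comment by database backends that support them.
            comment="Free-form tags for posts, reserved for the planned tagging "
                    "feature (hypothetical TICKET-123). Not yet read by application code.",
        ),
    )

Then "can we use it? should we delete it?" at least has a paper trail.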
So, every db-column gets an "in_use_boolean" assigned? It gets reset every year, set on first use by a query, and auto-purged after a year and a day. Self-pruning database..
This would break if you need something after two or three years. It happens.
My point is - it's relatively easy to tell if something is used. Usually, a quick search will find a place where it's referenced. It's much harder to 100% confirm that some field is not used at all. You'll have to search through everything, column names can be constructed by concatenating strings, even if it's strongly typed language there could be reflection, there could be scripts manually run in special cases... It's a hard problem. So everyone leaves "unknown" fields.
In most cases though, anticipating requirements results in building things you don't need. And if you do need them, you usually end up needing a very different form of it. The worst codebase I have worked on is the one that was designed with some complex future-use in mind. In your example as well, the codebase only required columns a year later. So I think removing all chunks of code that anticipate a future need sets the right precedent, even if you end up needing it eventually.
Thanks for this balanced (and IMO necessary) reminder.
It's all too easy to get caught either doing way too much defensive/speculative stuff, or way too little.
one person's "premature optimisation" is another person's "I've seen roughly this pattern before, I'll add this thing I wished I had last time" - and AFAICT there's no _reliably_ helpful way of distinguishing which is correct.
Would you have remembered to write this comment if the fields had never been added back?
In the case you describe, there are three possible outcomes, in broad categories:
1) The fields do turn out to be useful, in exactly the way you implemented them first.
2) The feature is implemented, but using a different set of fields or implementation.
3) The feature is not implemented.
Even if we assign equal probability to all the options, creating them in the beginning still only results in a win 1/3 of the time.
How much extra mental effort would have been spent making sure that all the other features implemented in the mean time work correctly with the metadata columns if they had not been removed?
Of course, you turned out to be correct in this case and that shows you certainly had excellent insight and understanding of the project, but understanding whether a decision was right or wrong, should be done based on information available at the time, not with full hindsight.
What was the DB migration required for those particular columns?
I presume this was not the only time such a migration needed to be done to add columns. Is it possible that as new features emerge, new columns and migrations will need to be done anyway and one more or less migration would make less of a difference on the grander scale?
Well, in your case it wasn't clear cut, but YAGNI is still a good default approach, I'd suspect maybe even in your case.
First of all, it wasn't guaranteed that this feature of yours would come. Even in this case, where the feature did come, you could probably add it without too much effort; sure, maybe a bit more than otherwise, but on a large project it's hard to guess the future. What if someone else had taken that task? Maybe they wouldn't even recognize why those columns are there, and they could have just reimplemented it anyway.
Also, a year is a long time, and who knows how many times it would have caused additional work and confusion.
> A: hey why this thing here, it doesn't do anything / never actually used?
> B: I dunno, Alex added it because 'one day we might need it' (eyeroll), will get very touchy about if you try to remove it, and will explain how he/she can predict the future and we will definitely need this feature.
> A: and this thing over there?
> B: Same... just move on, please, I can't keep re-discussing these things every month...
And, if your team hadn't applied YAGNI, you would have 1 feature that was needed a year later, and probably around 20 that were never needed, yet caused maintenance burden for years down the road.
I dunno, YAGNI is one of the most valuable principles in software development, in my opinion.
Writing what down? A bullet list of 30 items that "we added something that currently doesn't do anything or not used, but in case two years down the road, you need something, it might be close to what you need, so don't delete any of these unused items"? YAGNI is much simpler.
The thing is that such features can also become a burden quickly. E.g. people know it is not used, so nobody bothers to write tests or do the other things that would usually be done, and that can come back to haunt you once it is used; boom, there's a good source of catastrophic failure. Also, when you implement it later, you can shape it without having to rework a ton of code.
Don't get me wrong, I understand your point here — and if the feature is something that you know for sure will be needed later on, adding it in from the start instead of bolting it on later is absolutely the smart choice. You wouldn't pour a concrete foundation for a building after you built two floors — especially if you know from the start it is going to be a skyscraper you are building.
In fact my experience with software development is that good foundations make everything easier. The question is just whether the thing we're talking about is truly part of the fundament or rather some ornament that can easily be added in later.
I get what you mean. However, other comments raised some valid points too: it's just a few hours of work. What matters a lot more is when these kinds of anticipated features do not add up to constant gains (which in your case, setting aside the fact that this senior engineer rolled back your tactful additions, would be the time it takes to run the prod DB migrations, since this is what differs between your early and late implementation). More often than not, an architectural choice will impact n features orthogonally with respect to other potential features. If it takes you 2 days of work to implement that particular transversal feature, the time it will take to implement it across all "base" features will be 2n days of work. Stuff like class inheritance will allow you to factor that into, say, 2 days + 10n minutes. I witnessed disasters because of situations like this, where stuff that would have taken me days (or even hours) to implement took 6 months. The main reason was that another team was tasked with doing it, and they didn't know that part of the code base well. Because everything had to be super "simple" ("easy" would fit better), no class hierarchy, everything super flat, each base feature taking up thousands of lines across several microservices in the hope that anybody could take over, it was a slow, dreadful, soul-crushing job. Members of the previous team said it would have taken them 2 weeks (because they had long experience with the spaghetti dish). I re-implemented the program at home on weekends (it took 3 months to code): the required change would have taken me an hour to code, and a few days of integration to change those n base features (but only because I had a leetcode-complex, very expressive architecture and knew the domain very well). It took the new team 6 months. 6 months! And I think they only managed to implement one feature, not all of them.
Result: disgruntled senior employees quit and were replaced with juniors. Three months later, overtime was distributed across the tech department, killing the laid-back 4-day-week culture that had been put in place to attract talent; the employees unionized, some more quit, and about a year later, upon failing to hire back these productive people, the COO was fired.
The big difference here is something the user sees versus the developers as in your example. For users I think the answer is almost always, less is better. For the developers also, but there’s a bit more leeway.
> But the original work had taken me maybe an hour. And the cost of keeping the data around was approximately zero. It seemed he didn’t consider that.
I guess this is very illuminating: you have to predict the cost of a speculative (YAGNI) feature before adding it. A low-cost YAGNI feature might actually serendipitously become useful.
I feel this is the same principle as random side projects, not done out of necessity but out of curiosity and/or exploration.
Wouldn't using a VCS help in this case? So you could go back in time before this column was removed, and copy paste the code you already wrote and maybe change some things (as likely some things have changed since you wrote the code the first time)
In this case you were right, and the removal of the additional fields turned out to be a mistake. More often, though, I see this going the other way: you end up carrying around tables and fields that aren't needed and just muddle the picture. Even worse is "Well, we added this field, but we're not using it, so let's just stuff something else in there". It can get really ugly really fast.
The article is a little different, because "YES, they do need the calculator, people should know what your service costs up front". If they want to do away with the calculator, then they should restructure their pricing plans so that it will no longer be required. That's just not an IT problem.
I think when you have a mindset of removing the useless (which your columns were at at the time) you have to be prepared to sometimes add things back in. Yes, it is painful, but it is not a signal that the process didn't work. You should expect to sometimes have to add things back in.
We cannot perfectly predict the future, so when removing things there will always be false positives. The only way to get a false positive rate of zero is to never remove anything at all.
The real question is what level of false positives we find acceptable. If you can only cite this one case where it was painful, then I'd say that's evidence your colleague's approach worked very well!
(I think Elon Musk recommends a deletion false positive rate of 10 % as appropriate in general, but it will vary with industry.)
> the original work had taken me maybe an hour. And the cost of keeping the data around was approximately zero. It seemed he didn’t consider that.
You're missing the time other developers will spend trying to figure out why that code is there. When investigating a bug, upgrading, or refactoring, people will encounter that code and need to spend time and mental energy figuring it out.
Recently, I've been modernizing a few projects to run in containers. This work involves reading a lot of code, refactoring, and fixing bugs. Dead code—either code that is no longer called or code that never was—is one of the things that most wastes my time and energy. Figuring out why it's there and wondering if changing or deleting it will affect anything is just tiresome.
Answering "Why" is usually the hardest question. It becomes especially challenging when the original developer is no longer with the team.
> In an internal poll, 7 of every 10 people in the company thought the version with the calculator would do better.
An interesting and pretty classic dynamic - I liked the article overall but I think this point didn't get the highlighting it deserved. If 30% of the people involved think that the calculator is a bad idea that signals a potentially huge problem even if the majority think it is fine.
Be alert to the politics here. Although it seems otherwise, people generally don't like to criticise other teams in the business unless there is a political gain to be had. By extension, if I polled the company and asked "is this thing my team did a net positive?" I'd expect the default position to be "yes" as people wouldn't want to stir the pot for no reason. 30% of people signalling that it might be value-destructive is much more significant than it seems because of that. It should trigger some fairly thoughtful consideration of why exactly they thought that.
In this case they seem to have indeed been alert to all that and the story has a happy ending, which is nice. But this poll result was always evidence of a serious problem.
I agree in principle, but I am struggling on how you could quantifiably evaluate the contentiousness of a change. No feature will ever get 100% consensus. 30% does not seem great, but is it meaningfully different from 20%?
Even better if you have mixed incentives: sales wants any and all dark patterns enabled, customer support is sick of issuing refunds because the cart auto adds extended warranty to the purchase, etc
I smiled in the article when they claimed that removing the calculator might be better for users because more sales are completed. Ignoring that maybe the users were getting the appropriate amount of sticker shock, and bailing was the correct choice.
> No feature will ever get 100% consensus. 30% does not seem great, but is it meaningfully different from 20%?
Nobody is saying a feature should be automatically removed when it has 30 % detractors, just that it is a useful signal to investigate further.
The specific % threshold doesn't matter. Pick one that makes you chase false positives rarely enough. The exact number will vary from organisation to organisation.
Another problem with internal polls is that you will have the point of view of those who make the feature, not the point of view of those who use it.
Imagine the calculator code was a mess compared to the rest of the project, it uses outdated libraries and updating would break it, it may have some security vulnerabilities, uses an unreasonable amount of resources and breaks the build system. No one wants to work with it. Ask whether it is a good idea and most people will say "no", hoping to get rid of that mess. In that context 70% is a very good number. If on the other hand, it is a feature people love working on, then 70% would be very bad indeed.
The article just says those 30% of people weren't convinced the version with the calculator "would do better", not that it "is a bad idea". Granted, they might have thought that, but it seems quite a leap. They could just have easily thought it would make no difference, or assumed the version with calculator was underperforming because of the cases where it gave the wrong answer.
> One slight misinterpretation and wrong input and you'd get an estimate that's overstated by as much as 1,000x.
Does it also mean that in real-world usage, one slight misinterpretation or misevaluation of your metrics and you're liable for 1,000x more than you planned?
I totally see this as a reality of online billing systems. I've misconfigured GCP prototypes and ended up with 100+ bills where I thought it would be 2 or 3 at most and didn't care to watch for a few days.
But I'd understand a client bailing out when they realize slight changes to the sliders result in wild increases in the planned pricing. And removing the tool would sure help for registration, but not help the customer if they hit these kind of issues down the line.
> Does it also mean that in real world usage, one slight misinterpretation or misevaluation of your metrics and you're liable to 1000x more than you planned to?
Unlikely. You can see why in these two examples that really happened:
One user I spoke with said they assumed "queries per second" is calculated by (number of searches) x (top-k for each search), where "top-k" is the number of results they want back. I don't remember their top-k but let's say it's 10 -- so they were entering a value for "queries per second" that was 10x higher than it should be and they'd see an estimate around 10x higher than they'd really be charged.
Another user thought you get "number of vectors" by multiplying the number of embeddings by the embedding dimensionality (1,536 is a common one). So they were entering a value literally 1,536x higher than they should've. Their actual usage would be calculated (by Pinecone) correctly and not be that high.
Vector dimensionality is a basic concept for AI engineers and QPS is a basic metric for DB admins, but Pinecone sees lots of users who are either new to AI or new to managing DBs or both.
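To put rough numbers on those two mix-ups (the figures below are made up; only the ratios matter):

    # Illustrating the two misunderstandings above. All figures are invented.
    searches_per_sec = 5
    top_k = 10                    # number of results requested per search
    embeddings = 1_000_000
    dimensionality = 1536         # a common embedding size

    # Mix-up 1: multiplying searches by top-k when entering "queries per second"
    correct_qps  = searches_per_sec             # 5
    mistaken_qps = searches_per_sec * top_k     # 50 -> estimate inflated 10x

    # Mix-up 2: multiplying embedding count by dimensionality for "number of vectors"
    correct_vectors  = embeddings                    # 1,000,000
    mistaken_vectors = embeddings * dimensionality   # 1,536,000,000 -> inflated 1,536x

    print(mistaken_qps / correct_qps, mistaken_vectors / correct_vectors)  # 10.0 1536.0

The billing itself uses the real numbers, so the damage is only to the estimate -- but that's enough to scare people off.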
> where "top-k" is the number of results they want back.
Some systems will do that, so I get the confusion. I think the YouTube API for instance has a quota system that takes internal operations into account, so getting back 10 results in a query effectively weights 10+ credits.
I better understand the kind of issues you are facing, as these kinds of subtleties are inherently hard to explain.
For better or worse, that's another advantage of starting with a small trial account and actually seeing how typical operations are billed.
Here’s a followup for the author: follow your own advice. Remove the “Psst... Get the next post in your inbox” interruption midway through your post. Dare to remove the stupid button that follows you around as you scroll.
I counted five obvious ways to subscribe on that page alone. Five! Do you really need to shove them in our faces and in the middle of the freaking content? Do you think you get more subscribers by interrupting people and being annoying? Are those the kind of subscribers you want?
Removing stuff is often obvious, you just have to take your mind out of the “more more more, make money, attract customers, more more” gutter mindset and think “what is respectful to users, how can I help them while simultaneously treating them like human beings instead of wallets to be milked”.
I agree with you, but the data does not. These annoyances serve the business's goals really well. It's good to remember that most businesses exist to make money, not to be pleasant to us HN readers.
> It's good to remember that most businesses exist to make money, not to be pleasant to us HN readers.
HN does not have a monopoly on discerning users. We are not special. It would be unrealistic to think the number of people who care about this is but a subset of people who frequent one website.
>> It's good to remember that most businesses exist to make money, not to be pleasant to us HN readers.
> HN does not have a monopoly on discerning users
What he said (or what I understood) is the opposite:
Businesses can and will take advantage of less discerning users and they are in their right to do so because making money is their reason to exist. That's a terrible mindset that dominates the (big)tech/startup sector and the reason for the Great Enshittification. Let's see how far this can go before it collapses.
IIUC the lesson of blog SEO is that, if you want to grow your readership, copious attention-co-opting calls to action are unambiguously worth the annoyance foisted on discerning readers.
What’s respectful to users is a separate (but not wholly unrelated) question…
Even if that’s true (I’m not convinced it’s unambiguous), my points still stand: Are undiscerning readers the kind of subscribers you really want? Perhaps so if you are, as per my last paragraph, the kind of person concerned with profit over quality. If you are, it shouldn’t come as a surprise that you don’t see removing stuff as obvious and your mind only thinks of adding more junk.
> In my post history, I asked about newsletters and newsletter popups, and a few people confirmed that they work really well.
Ignoring for now that’s 100% anecdotal and that “a few people” is far from enough to make definitive claims, what post history are you referring to? Do you have a link?
> The goal is to get paying customers. We discerning readers are the high effort, low reward cohort that aren't worth losing sleep over.
I understand that. I’m lamenting we live in a world where we find it acceptable to purposefully produce shit to exploit others.
You're being overly pessimistic. It's not exploitation. You ask people if they want a thing and they say yes. It works predictably better than not shooting your shot.
Call it anecdotal evidence if you will. The fact of the matter is that it seems to work well enough for people to keep doing it.
> You ask people if they want a thing and they say yes.
You call it asking, I call it nagging. And “yes” isn’t the only answer, there’s also “you annoyed me so much I’ll actively avoid you”. Have you never seen one-star reviews saying “the app keeps nagging me to review”? These have business consequences, it’s far from all positives as your responses imply.
> The matter of fact is that it seems to work well enough for people to keep doing it.
That’s like the old maxim that “nobody gets fired for buying IBM”. Just because “everybody does it” does not mean it’s the optimal approach. Things change and people get wise to common bullshit, even as this kind of “knowledge” and “best practices” keeps being shared by money-hungry pariahs. No one really tests these assumptions in depth, they just share them uncritically. If you’re so sure it’s the best approach, let’s see the data. Otherwise let’s just be honest and say we don’t know.
> Before long, a dedicated Slack channel was created, which accrued over 550+ messages representing opinions from every corner of the company. Another few thousand words and dozens of hours were spent in meetings discussing what we should add to the calculator to fix it.
This is a symptom of over hiring. Too many people removes agency.
When people lose sight of what's actually important and feel that they must reach consensus by committee then there are too many people.
True, but at least the communication overhead between 2 people, and the time for them to either agree, compromise, or punt, can be a lot lower, which is a significant win for getting things done.
First: A lot of time was spent building consensus.
No individual felt they had unilateral power to remove the calculator. Instead the behavior was to seek approval. That's probably because of unclear ownership, which often happens because of too many people.
Second: Too many cooks in the kitchen.
At any stage of a company there's a limited amount of important work. Work that is mission critical and provides outsized value to the company.
When people feel their own work is not important they seek other work that appears to be important. So, you get the behavior of a lot of opinions on a pricing calculator.
The company I work for has cut staff by 60%, and workload has not decreased. This is standard corporate shittery, however my point here is that we still have the issue of unclear ownership even on basically skeleton staff. It's not a question of individual ownership, it's about what team owns it in my case.
The article states that the biggest factor was user misunderstanding of the options, not so much the number of different options. In other words, if they offer option A at $x and option B at 10*$x, if most users mistakenly think they need option B, the calculator is misleading.
Also, I'm a big fan of "contact us for pricing." It's annoying for users who are window-shopping and want a quickie ballpark, but it helps you identify cases where standard pricing (or pricing which can't easily be described online) can be negotiated, and which the user would have otherwise overlooked. This doesn't work for things like most ecommerce, of course.
My biggest issue with these: when introducing a new tool/solution, we often don't know how much we want to use it. In particular, it will usually be introduced in a minor application first, and if it feels reliable and useful it moves to bigger systems and more critical roles.
Contact for pricing requires us to explain all our internal politics, budget management, which systems we have etc. upfront to random strangers, who are also incentivized to just onboard us first and push the sunk cost fallacy button from there.
I kinda feel this works best for companies that don't really care about comparing products and will buy whatever gives them the best numbers on the contract (which is common for enterprise software; it's just not my personal cup of tea as a small fish in the game).
Many users will see "contact us for pricing" and assume that means you can't afford it. That's fine if your customers are enterprises but definitely not for consumer products that middle class people might actually buy.
A lot of time when there’s something like that, I’m fine not having a firm number, but it’s nice to have at least a ballpark idea of cost. (I found this particularly egregious with musical instrument pricing where I didn’t know if I was looking at a $1000 instrument or a $20,000 instrument, so I generally assumed that these would be cases where I clearly couldn’t afford it so best not to wonder—not to mention the cases where the list price was often as much as double the actual street price for an instrument).
>>> I'm a big fan of "contact us for pricing." <<<
I have the opposite feeling about them. They are like an open invitation giving the sales guy a window of opportunity to look up your company website and mark up the price accordingly.
Exactly. It's an invitation to the used-car-salesman circus. I don't have time to play games with a salesman for the next two weeks. If a company doesn't have at least ballpark pricing available upfront, they never hear from me and don't even know they lost a potential customer. Only huge entrenched companies can get away with that long term. That, and single suppliers.
I make most of the buying decisions for tech tools for my company. And it is exceptionally rare for me to ever contact somebody for pricing. I usually move on to the next vendor with transparent pricing.
You can get away with it, if you are targeting a very small market with your product and none of your competitors offer transparent pricing. My own company does not offer transparent pricing and we can get away with it for the above reasons.
I would never entertain any "contact us for pricing" offer. It means that they are looking to rip you off. If you can't give a fixed price for bespoke solutions, you should still publish prices for standard solutions, so that customers can get an idea of your rates. Then they will contact you for bespoke solutions.
> Also, I'm a big fan of "contact us for pricing." It's annoying for users who are window-shopping and want a quickie ballpark
Don't underestimate those kinds of customers.
For example, an ad for custom tile showers showed up in my feed. I just wanted to "window shop" the price, so I could get an idea if it was something I wanted to do, and plan when to do it.
I filled in the form with a "I'm just looking for a ballpark number, please don't call me."
No response.
Salespeople just don't understand how irritating phone calls are when you're collecting data: whatever I'm doing at any given moment is significantly more important than dropping what I'm doing to answer the phone. This is especially important if all I need to know is a ballpark number to decide whether I'm interested in having such a phone call.
>> Perhaps removing a pricing scheme so complicated that it literally can't be modelled usefully by the customer would be even better?
> The article states that the biggest factor was user misunderstanding of the options, not so much the number of different options.
(Emphasis mine)
It seems to me that you are in agreement with the GP :-/
When a significant portion of the target userbase cannot understand something presented by the software, the problem is rarely the users.
More developers should read Donald E. Norman's "The Design of Everyday Things"; even if you forget specifics in that book, the general takeaway is that the default position must be "It is not the users' fault!".
There must be significant evidence that the user is to blame before the user actually is blamed. The more users that have that problem, the larger the body of evidence required to prove that this is a user problem.
More than 10% of target users have problem $FOO? Then you better have a mountain of rock-solid, beyond a shadow of a doubt evidence that the software/interface is correct!
The article stating it doesn't mean it's correct. The users misunderstood because the pricing model is poor, and of course users won't understand it, because Pinecone isn't as well known as MySQL or Postgres yet.
My employer has something like 250 products. Five of them are responsible for 80% of the revenue.
Dev teams of those five are stretched thin, barely able to keep up with bug fixes, and struggling to add new, important features -- so much so that getting anything on the roadmap is an impossible battle, no matter who is asking.
There are thousands of devs at the company, most of them attached to the products that contribute relatively nothing to the revenue stream.
I don't know if it's not obvious -- it must be obvious -- that to move forward the right thing is to cut most of the products and refactor the teams to drive the remaining money-makers. Yet, this has not happened, and I see no indications, no murmurs of it happening. Corp politics are wild.
I experienced something similar. We had a website with several products that seemed fairly similar, so we were concerned that people might have trouble deciding which one to get, and not buy as a result.
So we made a product advisor applet where the user would answer a few questions and it would suggest one or two products that would be best for them. Getting the applet right took a bit of work, but once it was done it worked very well.
We put it live on the site and.... conversions dropped precipitously. We A/B tested it and yep, it definitely hurt conversions. I still don't know why it hurt conversions, but it did. So we moved it from the homepage to a FAQ section of the site, and hardly anyone ever used it at all.
So maybe your advisor applet actually helped people: given all the additional information you were now giving them, perhaps not buying was the best choice for them?
> I still don't know why it hurt conversions, but it did.
Maybe people were indecisive and you just saved them trouble of trying stuff to find out for themselves.
If it was a service, then maybe they were signing up just to try because they weren't sure, and then taking advantage of sunk cost fallacy by pulling out an Amazon Prime strategy. Or, maybe without the advisor applet they might sign up thinking they can get away with the cheapest version (e.g. like people who purchase 3-5 EUR/month VPS), but with the applet you deny those hopes right away.
If it was physical products, then you might have helped them out of a bad purchase.
> So we moved it from the homepage to a FAQ section of the site, and hardly anyone ever used it at all.
Yeah I wouldn't expect to find it there. In the footer, maybe. But not in the F.A.Q., so there's that.
Interesting case study, but I'm skeptical of the broader implications. Pinecone is notoriously expensive compared to other vector database services. A horizontal price comparison reveals several better options in the market.
Removing the calculator doesn't solve the core issue - it just obfuscates costs and makes it harder for users to compare options upfront. In my view, this approach merely reduces the comparison stage, potentially leading more uninformed users to upload data without fully understanding the pricing implications.
While simplification can be valuable, in this case it seems to benefit the company more than the users. A better approach might be to improve the calculator's accuracy and usability rather than removing it entirely. Transparency in pricing is crucial, especially for B2B services where costs can scale quickly.
Not only is it often better, but it can literally enable you to get to market an order of magnitude faster (and with a higher probability of success).
I'm working on a tool that makes running wire inside residential homes easier. It requires carving a channel on the back side of the baseboard.
My original idea required a mechanical tool with a motor [2]. We prototyped a working version of this but always felt that the manufacturing challenge would be large.
We ended up adapting the system to existing routers. That meant our product was just a series of extruded parts with almost no moving parts [1].
Wireshark is one thing, but considering it's added to a router, the collisions in language here are funny to me. Most of your customers won't care, however.
Yeah - I didn't think there would be confusion with the networking tool but based on feedback we're receiving...I was wrong. We're considering options including changing the name.
You're in a niche(ish) space, that doesn't have a lot of general overlap with HN.
I can say "I wish someone made a cheaper alternative to the Domino" and anyone in the space will understand what you mean instantly. But based on an outside analysis, others might have told Festool that it was a bad name that would confuse people.
The HN feedback is likely heavily biased… even if you are deploying your product in Silicon Valley. Most people who will be fishing wire through their home have never heard of the networking tool. I.e.: you might consider not changing it too :)
I don’t get it. Presumably the pricing model didn’t change, so all you’ve done is push the burden of doing the math onto the user (or more realistically, hope they just don’t even bother?) If users are frequently estimating costs that are off by orders of magnitude, surely the correct response is to change the pricing model so it’s easier for customers to understand?
Once they’re using the product they can see their actual usage and cost metering. So they can either extrapolate that to larger scale or test it at scale for a short time to see hourly/daily cost and then extrapolate for the month or year.
In other words it’s not much of a burden and they get much more reliable information.
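To make that concrete, here is a back-of-the-envelope sketch in Python of the kind of extrapolation described above; the window length, costs, and scale factor are made up for illustration, and real figures would come from the provider's usage metering.

    # Extrapolating a short metered test run to a monthly bill.
    # All numbers here are hypothetical.

    HOURS_PER_MONTH = 730  # average hours in a month

    def extrapolate_monthly_cost(observed_cost, observed_hours, expected_scale=1.0):
        """Scale the cost of a short test window up to a full month."""
        hourly_rate = observed_cost / observed_hours
        return hourly_rate * HOURS_PER_MONTH * expected_scale

    # Example: a 6-hour test at realistic load cost $1.80, and production
    # traffic is expected to be roughly 4x the test load.
    print(f"${extrapolate_monthly_cost(1.80, 6, 4):.2f}/month")  # -> $876.00/month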
But they can still do that even if there's a cost calculator upfront. Removing the calculator simply obfuscates the ability to estimate cost with the happy justification that fault lies with the customers who all seem to be bad at estimation.
We, as humans, always carry a few types of things with us on a journey.
Things we like, things we don't like, things that are totally useful, things that are somewhat useful etc.
There comes a point where we need to let go of the things we like - I call them the precious payload, the things we are most reluctant to part with - and in this case the 'calculator' was the precious payload: so many people in the company were unwilling to remove this feature, except for one person.
In business, adding a completely new feature or subtracting an age-old feature is extremely difficult but oftentimes, this is where growth comes from.
> We assume that if something exists then it exists for a good reason
I suspect that this assumption often exists because people have tried removing things, and been summarily burnt by either politics or some form of Chesterton's Fence.
Which leads to the question of: how and when do we discern the difference? How do we design our teams and engineering to do things like,
- ensuring the psychological or political safety to suggest slimming down systems without having every other team come down on your head.
- incentivise “svelte” systems at a product level, without falling into Google-levels-of-“lmao this is going away now thanks bye”
- engineer for slimmer systems. There’s lots of ink spilled on the topic of making things extensible, or able to have stuff added to it, but seemingly little about the opposite. Is it the same engineering practices, or are there other concerns you’d need to take into account, if so, what are they?
- would you as a customer pay for something that was better at its purpose but didn’t have every single feature under the sun? I sure would, how many other people/orgs would, if given the option? I semi-controversially think that too many tools providing too much flexibility mostly encourages orgs to paint themselves into wacky processes, just because they can. I doubt this entirely goes away, but if given less feature/flexibility bloat, how much “fat” (process/overhead/friction/etc) could be trimmed out?
Software Engineers (a large portion of HN community) inherently have a hard time with this. We're always taught to consider edge cases. Oftentimes handling the edge cases can be more work than the rest of the project.
But when we handle them, we give ourselves a pat on the back, without ever asking: could I just ignore the edge case and shrug if someone runs into it?
In the case of the OP, the edge case is the occasional user who might want to project their exact costs.
Reminds me of the advice that, if you need to join two mismatched pieces together, you have two options.
1) Add an adapter that’s customized on both ends, or 2) subtract the differences so they mesh directly.
Always look for opportunities to subtract, rather than add. Your system gets easier to understand over time as it becomes the purest representation of itself, instead of a collection of gap-fillers.
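A toy sketch of the two options in Python, just to make the contrast concrete; none of these class names come from the comment above, they are purely hypothetical.

    # Option 1: keep both mismatched shapes and add an adapter between them.
    class LegacySource:
        def fetch_rows(self):
            return [{"amount_cents": 1250}, {"amount_cents": 250}]

    class Billing:
        def total_dollars(self, amounts):
            return sum(amounts)

    class LegacyToBillingAdapter:
        """Glue that exists only to translate cents-rows into dollar amounts."""
        def __init__(self, source):
            self.source = source
        def amounts(self):
            return [row["amount_cents"] / 100 for row in self.source.fetch_rows()]

    # Option 2: subtract the difference - make the source speak dollars
    # directly, and the adapter (plus its maintenance cost) disappears.
    class Source:
        def amounts(self):
            return [12.50, 2.50]

    print(Billing().total_dollars(LegacyToBillingAdapter(LegacySource()).amounts()))  # 15.0
    print(Billing().total_dollars(Source().amounts()))                                # 15.0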
i think of this lesson often (i don't remember who told me about it): elon building something at spacex and ruthlessly cutting out pieces they initially thought were needed. less complexity meant cheaper and faster construction, which meant more tests.
i use this in day-to-day life: making plans, buying things, managing others - reducing has led to getting more accomplished.
Simply removing a feature without addressing the root cause of confusion will lead to other problems down the line, such as increased customer support demands or user dissatisfaction when actual costs differ from expectations.
I love how this lesson applies to AI software products: it might not be obvious at first, but removing that dedicated vector database (which is only there because everybody's using one in the first place) often improves things :^)
Is it standard to use percents of percents in conversion tracking? Going from a 20 to 23% conversion rate is not a 15% increase in conversions, it is 3 percentage points. If that is the kind of shenanigans being played, there is something else to remove.
It is. To simplify: if every conversion makes the company $1 and 100 prospects enter this funnel, going from 20 to 23% means they make $23 instead of $20.
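For what it's worth, here is the arithmetic behind both ways of describing the same change, using the example numbers from this thread:

    # The same change, quoted two ways (example numbers from the thread).
    before, after = 0.20, 0.23

    absolute = (after - before) * 100            # 3 percentage points
    relative = (after - before) / before * 100   # 15% more conversions

    # With 100 prospects and $1 per conversion:
    print(f"revenue: ${100 * before:.0f} -> ${100 * after:.0f}")         # $20 -> $23
    print(f"{absolute:.0f} pp absolute, {relative:.0f}% relative lift")  # 3 pp, 15%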
How common is dropping a feature because of A/B testing results?
I feel like A/B results should carry less weight in business decisions like this. My guess is that the calculator removal was approved before the A/B testing had even started.
I just finished Elon Musk’s biography by Walter Isaacson and I was struck by how often this actually works. Whether you’re actually sending humans to the ISS, or building cars removing requirements and complexity is a surprisingly effective tactic.
I love his “algorithm” for getting shit done. The final step is, “If you don’t put back at least 10% of what you take out, you aren’t removing enough.”
Why stop there? Why not just remove all pricing completely, and let your clients contact sales for the shake down? That model seems to work great for many SaaS companies.
There are too many interesting ideas in this framework to list, but according to TRIZ, one of the first steps in the algorithm for solving a problem is to "formulate the ideal final result".
The "ideal final result" has a specific definition: the part of the system no longer exists, but its function is still performed.
I'm having a lot of fun with this and other tools coming from TRIZ when solving problems every day. You might like it as well!
As for A/B testing and getting unexpected results: TRIZ has an explanation for why this works - it's called "psychological inertia". When an engineer gets a problem, it is usually already formulated in a certain way, and the engineer has all kinds of assumptions before they even start solving it. This leads to thinking along specific "rails" and never getting out of the box. Once you have an algorithm like TRIZ, it lets you break through psychological inertia and look at the problem with clear eyes.
Some other tricks one might use to find interesting solutions to the problem from the post:
"Make problem more difficult". I.e. instead of how to make calculator simple and unrestandable, formulate it in a different way: "how to make calculator simple and unrestandable, visual, fun to use and interact with, wanting to share with your collegues?"
"Turn harm into benefit". calculator in the post is treated as a necessary evil. Flip it. Now we have a calculator, but we could show some extra info next to prices, which our competitors can't do. We can compare with competitors and show that our prices are better, calculator can serve as a demo of how customer is always in control of their spending as the same interface is available after they become customer to control their spend etc.
Formulating it this way already gave me some ideas for what could be added to the calculator to make it work.
> Except for one person, it never occurred to this very smart group of people that removing the source of confusion could be a good option.
Reminds me of the quote: "It is difficult to get a man to understand something when his salary depends on his not understanding it."
That applies to us as software engineers too: our salary depends on having projects to write code for, so it's not so surprising that a very smart group of people don't often consider that doing nothing, or removing something, is a valid solution. I like the author's observation on this too: it would be nice if removing things were rewarded. I wonder if the employee who questioned the calculator's necessity got any reward.
Removing features is deeply unpopular. The millisecond it is released, someone is going to make it a load bearing part of their workflow. A removal is now breaking the contract with the customer.
Which may or may not matter, but it requires more politics and negotiation for the feature to suddenly be dropped.
It feels redundant to agree with this comment. I will anyway.
"Any fool can make something complicated. It takes a genius to make it simple."
"I apologize for such a long letter - I didn't have time to write a short one."
--
As for the calculator, I think it points to a bigger problem. Customers need to know what the platform will charge and a way to compare platforms in general. If the only way to truly know how much something will cost is to run their code on it, then maybe that's the thing that someone needs to implement.
There are big issues with this in the most naive implementation, in that people can easily abuse the ability to run code. That suggests that perhaps we need a benchmark-only environment where the benchmarks themselves are the only thing allowed out of the environment. This may require a fair amount of engineering/standards effort but could be a game-changer in the space.
A framework for being able to run this on many platforms to compare performance and pricing would lead to customers generating packages for vendors to compete. Though, I suppose it could also hide some devilish details like step-changes in rates.
This same framework would be useful for other things too, like testing how implementation changes affect future bills, or how pricing between vendors might become more advantageous over time.
Of course, the sales folks might balk because they would rather have a conversation with everyone they can. Maybe I'm just advocating for a more advanced and complex calculator? ¯\_(ツ)_/¯
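For the sake of argument, a minimal sketch of what the "more advanced calculator" half of that idea might look like in Python; every vendor name, rate, and workload number below is invented, not taken from any real price sheet.

    # Hypothetical cross-vendor cost comparison harness; all rates are made up.
    from dataclasses import dataclass

    @dataclass
    class Workload:
        vectors: int            # total vectors stored
        queries_per_month: int
        writes_per_month: int

    @dataclass
    class VendorPricing:
        name: str
        storage_per_million: float  # $/month per million vectors stored
        per_1k_queries: float
        per_1k_writes: float

        def monthly_cost(self, w):
            return (w.vectors / 1_000_000 * self.storage_per_million
                    + w.queries_per_month / 1000 * self.per_1k_queries
                    + w.writes_per_month / 1000 * self.per_1k_writes)

    workload = Workload(vectors=5_000_000, queries_per_month=2_000_000,
                        writes_per_month=500_000)
    vendors = [
        VendorPricing("vendor-a", 3.0, 0.01, 0.02),
        VendorPricing("vendor-b", 5.0, 0.005, 0.01),
    ]
    for v in sorted(vendors, key=lambda v: v.monthly_cost(workload)):
        print(f"{v.name}: ${v.monthly_cost(workload):.2f}/month")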
> The calculator also gave users a false sense of confidence, which meant they were unlikely to double-check the estimate by reading the docs, contacting the team, or trying it for themselves.
How dare they think that a thing called a calculator is accurate and not double-check!
Up-front pricing is only good for the buyer and never good for the seller. And while being an advocate for the user is a good thing in general, it's not a good thing if the result is less revenue. And if you want to present a pricing estimate your incentive is always to lowball and underestimate.
Businesses that are successful tend to exploit information asymmetry. This is frustrating as a consumer but if you can guarantee that the user doesn't know their final cost you can always sell them a sweeter picture than reality and play off the cost of switching to a competitor, if one exists.
Total aside, but this is why housing prices are insane: at the most micro level, no buyer has any idea of the final cost until the last possible moment, at which point the cost of undoing the transaction probably outweighs the risk of continuing - psychologically, at least.
(I may or may not have been the victim of this recently and thought about it from the seller's side)