In this particular example, my "reset password" functionality could be sending me an email, and I have to reset the password manually and email you a temporary one that you can use. Bad solution? Yes. Doesn't scale? Of course. But if you have 5 customers it's not a big deal and you can use your time on something else.
The key here is to have good people on the team. A decent senior engineer or architect would know that 'Reset password' will be needed. That is why a very technically solid person is gold in the first round: they can design the signup in such a way that it's prepared for future requirements. They would of course also know that you can't store passwords in plain text, that they should be hashed and salted, that you might need a secrets vault, that access to the "user" table should eventually be restricted and it should never, ever be exposed through an endpoint a user can reach, and that you need to protect against XSS and SQL injection.
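To make the hashing-and-salting point concrete, here's a minimal sketch using only Python's standard library (PBKDF2 stands in for whatever your framework provides; in practice you'd reach for bcrypt or argon2 rather than writing even this much yourself):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Hash a password with a per-user random salt; never store plain text."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)
```

The salt is random per user, so two users with identical passwords get different digests, and `hmac.compare_digest` avoids leaking information through comparison timing.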
Which of the items above come off the list? There is a balance in getting the right solution without scope creep, but also not implementing something so silly that it's going to get you into trouble later. The genius of a competent person is being able to tell what needs to be solved now vs. later.
People spend so much time trying to preemptively solve scaling problems and proactively design around scale. I am being 100% serious when I say that this is a total mistake. Scaling problems are just the absolute best problems to have; you shouldn't run away screaming from them before they happen. Twitter had the fail whale for years, and you know what it didn't do? Tank the business. With scaling problems:
• You have money flowing in, and confidence that the thing is working.
• The load on the system has been dynamically adjusted down for you as your users have a suboptimal experience, giving you smaller messes to clean up.
• Often there is an expensive short-term solution to remove the egg on your face if so desired. You have one Really Important client who is pissed, you give them their own private app with twice the hardware until you can get this resolved, and now they love you for life because they see that you did backflips for them.
• You can profile the system. Complex systems, under instrumentation, literally tell you where to look for what is wrong. “Why is that taking 400ms?!”
• You can run side-by-side tests, clone the input from the stressed system to the new system and verify that the load has decreased while the underlying model is the same.
• Very often you will find logical bugs that you needed to fix anyway, they just become much more apparent when the system has high contention. Two things were implicitly ordered by time and thus did not break unless requests were concurrent, so they were doomed to break eventually, but now they break almost immediately.
• Your corporate overlords immediately understand the business value in getting the situation resolved “correctly” whereas convincing them of refactors usually takes time. “We got the system limping along for now but if we don’t fix the underlying issue soon then this will happen again.” “Well we are really happy with how explosively this product is going, I guess we can delay those other two apps while you fix up this one.”
• At the same time, that license is not a license to dither, as such speeches often become. You feel committed to spending that time investigating, refactoring, and fixing, because the uncertainty of the system is a sort of abstract menace exerting deadline pressure. It is legitimately exhilarating to figure out and fix the actual problems, and it gives you a good feeling about your contribution.
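On the profiling point above: even crude instrumentation answers "why is that taking 400ms?!". A minimal sketch in Python, with a hypothetical `lookup_user` standing in for a real handler:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)

def timed(fn):
    """Log how long each call takes, so slow paths show up under load."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logging.info("%s took %.1fms", fn.__name__, elapsed_ms)
    return wrapper

@timed
def lookup_user(user_id):
    time.sleep(0.01)  # stand-in for a slow query
    return {"id": user_id}
```

Once every hot function logs its timings, "profile the system" stops being an abstract recommendation and becomes grep.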
This is why I would generally tell people to not start out a project with Kubernetes, for example. You are trying to up-front the cost of a scaling problem but that scaling problem is gonna be really nice to have. Do it later! Get some egg on your face.
There is an abstract reason too, which I should mention. Lots of systems die not because they have a "revenue problem" (long term they will make enough money to cover their costs) but because they have a "cashflow problem" (short term they overcommit and run out of money). You fail to pay your employees on time one month, and by the next month you find yourself needing to spend time and money hiring new employees to replace the ones you just terrified. Or you fail to deliver a feature on time and a new client chooses a competitor. Stuff like that. It can happen simply because, if you have N payments coming in, the uncertainty of those payments can grow like √N while you keep only a constant buffer of cash on hand; it can also happen in more elaborate ways. Almost always you can ease the pressure by deferring payments as late as possible.
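The √N point can be illustrated with a quick simulation (the payment amounts are made up; only the scaling behavior matters): the spread of your monthly total grows like the square root of the number of payments, while a fixed cash buffer doesn't grow at all.

```python
import random
import statistics

def total_payment_stddev(n_payments, mean=1000, sd=300, trials=2000):
    """Simulate n uncertain incoming payments; measure the spread of the total."""
    totals = [sum(random.gauss(mean, sd) for _ in range(n_payments))
              for _ in range(trials)]
    return statistics.stdev(totals)

random.seed(0)
small = total_payment_stddev(4)
large = total_payment_stddev(64)  # 16x the payments
# The spread grows roughly 4x (sqrt of 16), not 16x -- but it does grow,
# so a constant buffer covers a shrinking share of the uncertainty.
```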
When I was planning my wedding, my wife wanted to pay everything as soon as possible to save on our mental load; I resisted. And that was clutch because there were times during this when we were saying “ok we got $100 left to cover food and gas this week before I get paid on Friday, what can we do with that?” where if we had up-fronted those costs, presumably we would have unnecessarily eaten ramen or less for the three weeks before. Like it is easy to say “here is what we can save per month, so this is a reasonable budget for the wedding that won’t kill us”—solving the revenue problem. But cashflow problems still exist even when you know that you will be able to cover the cost eventually. Solving a scaling problem is up-fronting a cost that can easily be delayed and if you don’t delay it then there is a very real chance that your project can be canned way before you run into that scaling problem.
Especially let me link @jackdied’s talk “Stop Writing Classes” where this is a theme,
Something like: "They subclassed a dict. Because they might need to add some functionality to it later. You know what? You can just do that later!" Same with scaling problems. This won't scale, but we'll worry about that later. And then when we worry about it, we will know by measurement whether our shiny new cloud machines need to be RAM-optimized or CPU-optimized or cheap-as-possible-but-ten-times-as-many-of-them. Whereas if you up-front this cost, you are literally making all of your scaling decisions based on zero data and hunches.
Understanding when the inevitable problems will pop up, and being able to solve them in reasonable enough time not to lose your users, is, to me, not a nice-to-have. Lacking that ability is a showstopper that will turn egg-on-your-face problems into fatalities.
A few things I would recommend:
* Instrumentation - You don't need to go full ELK before you find product-market fit. Hell, with two users you can log everything and still read all the logs. But having something is key.
* Up and down scripts - How long will it take you to set up a new copy of the product? Do you have to trawl bash logs and have a team meeting to find out all the configs that need to be set up?
* Load-testing - Again, nothing fancy. Just put curl in a bash loop and see where the system starts to hiccup. Once you have real users with complex data on real hardware, you'll wish you had profiled things, even slightly.
* Instrumentation - I say it again because the sheer number of systems I've found on the verge of silent failure and user frustration, because errors weren't being propagated (be it from client side to server, or from the logs to an alert), is massive. With modern SPAs it's easy to think everything works when all your users see is a blank page.
* Please don't roll your own crypto. Unless you're working in a language that is 2 days old, you really shouldn't need to, and you shouldn't be doing it.
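The load-testing suggestion, translated from curl-in-a-bash-loop into Python so the sketch is self-contained (a throwaway local server stands in for your real endpoint):

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep per-request noise out of the output

# Throwaway local server; in a real test you'd point at your staging URL.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# Crude load test: hammer the endpoint and record latencies.
latencies = []
for _ in range(50):
    start = time.perf_counter()
    urllib.request.urlopen(url).read()
    latencies.append(time.perf_counter() - start)

print(f"p50={sorted(latencies)[len(latencies) // 2] * 1000:.1f}ms "
      f"max={max(latencies) * 1000:.1f}ms")
server.shutdown()
```

Even this little tells you where the hiccups start; graduate to a real tool (wrk, locust, k6) only when this stops being enough.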
In short, if you're using HPAs and Kubernetes you've gone too far. But don't choose your instance size at random, be able to set up a new instance in an hour, and have at least 60% confidence that if there is an error you will know about it.
In 2020, with the number of managed solutions out there for simple CRUD functionality, the default should be just using the managed thing until it doesn't suit you, not ignoring critical yet non-unique pieces of infrastructure altogether.
A more practical example: Start/stop/change subscription plans. Until 12 days ago, Stripe had no customer portal so you had to implement this yourself. You can do it with support email, and "easy to switch plans" is not on any prospective customer's checklist.
"Too many change plan emails" is a good problem to have - deprioritize the feature. And if you're lucky, by the time you really need it, your billing service will have implemented the feature for you.
A more pragmatic approach involves examining X and Y before deciding not to pay any money. (They're also different for everybody, and depend on your financial situation.) If X is less than the cost of one coffee from a local coffee shop, I would personally pay it rather than spend a month building my own pub/sub bus just so the monthly line item reads $0.
(Time to implement is a third variable, Z, though I'll note that estimating software projects is notoriously difficult to get right.)
That's exactly what I just did at the soon-to-launch startup I'm currently working at. There is a change password function, but no password reset (or other features that would require generating stateful links to send by email). As an added bonus, if you try to sign up with an email address that is already in use, the automated email you receive (from an address monitored by a human) includes something to the effect of "If you aren't sure of your password, reply to this email and we'll get it straightened out."
Wrong. The correct technical decision is to bundle in a library that solves this for you - login with Google, or Facebook, or GitHub, or OIDC, pick one according to the context. Get password reset, MFA, password security, etc. for free.
Why reinvent the wheel, poorly?
Any of these things may or may not be relevant to your decision, and I know that authentication is a specific example within a larger point. What I'm trying to say is that it's never correct to say "<x> is the right approach for all <y> trying to do <z>". Every decision has consequences, and what really matters is your ability to foresee those consequences and weigh them against the broader strategic picture of what you're trying to accomplish.
And between 0 and that many users you could have all sorts of other reasons for rethinking your authentication architecture, never mind whether it's home-grown or a third-party library.
1. This is a blocking issue now.
2. This is not a blocking issue now, but postponing it will cause more pain than we would save by not doing it now.
3. This is not a blocking issue and doesn't create future pain.
The third bucket is natural to postpone. Finding the balance between the first two is one of the primary challenges of engineering.
Later on you might realize that you've bound your user ID to email and that you need to use UUID, where email is just a contact address.
You might even want to have users have multiple accounts that use the same email address.
You might decide to bind the user's account to their phone number instead.
It's not about One True Strategy, but about looking further into what you will need in the future. You might decide not to develop something right away, but you can avoid much refactoring later by making a smart decision now that did not involve any more work than the less smart decision.
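A sketch of the UUID-vs-email point: keying identity on a UUID from day one costs no more work than keying on email, but leaves email as a mutable contact attribute (the names here are illustrative, not from any particular codebase):

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class User:
    """Identity is an opaque UUID; email is just a contact attribute."""
    email: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)

# Two accounts can share a contact email, and changing the email
# doesn't disturb identity, foreign keys, or session ownership.
a = User(email="pat@example.com")
b = User(email="pat@example.com")
a.email = "pat@newjob.example.com"
```

The "smart decision now" is exactly this kind of thing: one field choice, zero extra effort, and a whole class of painful migrations avoided.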
Then you'll have password recovery, 2FA, sign in with FB/Goog/LinkedIn etc., loads of insightful management screens and even SAML support for enterprise SSO - all free, out of the box, battle-hardened and already in wide use with many eyes on it for security.
Then when your customer waves some IT checklist in front of you asking you "do you support blah blah password complexity" the answer is just YES.
The TFA has a solid point, but has picked a bad example to illustrate it. These days if you find yourself even thinking about building your own password recovery, even way off in the future, you should really reexamine your core decisions.
I believe the point of GP was to say "use an auth framework". If you're rolling your own auth, then your approach is fine.
People who think their problems are small or simple write their own libraries, and then every change in requirements gets incorporated into the library or blamed on the victim. By the time the system is profitable you've reimplemented half of a robust library, badly.
As an engineer who has worked at many startups, this is a sad and painful reality of working under product-driven ideas and "first to market" pressure: basically anything that doesn't (usually) involve life or death, or federal regulation, gets rushed out the door when it can be.
As a user, I absolutely 100% hate this mentality. You skip features you don't consider necessary, do things in ways that make them difficult to return to later and throw it in the "technical debt" pile.
But once things stabilize and there's no immediate danger of running out of runway, priorities shift, and the medium and long term become much more significant. Many startups fail to realize this; they've become too accustomed to the fast-paced, ship-ASAP mentality. Breaking from it feels to them like a decline in productivity, when it's actually just a shift in priorities. There are a lot of contributing factors to this oversight, and they vary from company to company, but continuing to prioritize the short term is a dangerous act that usually serves neither the end user nor the business.
One problem in these discussions is that people quickly start using absolutist language, which makes it hard to find balance.
The proper course of action is to work with the leadership to figure out when the expected critical points are for the business. This includes fundraising, IPOs, acquisition, or any other time when the company receives a valuation. The goal, for both the business and the dev team, is to make sure that the company has the best possible value at those exact points. You don't want to peak too early or too late. The proper course of action then is to prioritize things according to how they impact those points in time. If one's coming up soon, prioritize the short term almost exclusively. If it's years out, plan ahead as a marathon, not a sprint.
For example, it would be a bad idea for me to say "I am not going to write any automated tests" for a 3-week MVP. That decision would lead to me producing nothing.
"My customer data isn't backed up", is a nice problem to have when you're just launching a new service and don't have any customers. But it's still a problem you'd probably like to solve relatively early. There won't be a moment where missing backups are an immediate problem but also still easily solvable; they're either a future problem or you're unavoidably about to lose a large number of your customers.
As a less on-the-nose example, sometimes when I'm building software, I can anticipate that something will be a problem two or three months from now. In those cases, I have a decision to make. If the code is disposable, I can ignore the problem and solve it later. But if I'm working on core architecture, spending an extra 2-3 days to solve the problem now can save me 3 weeks of refactoring in the future.
Many businesses start agile and become slow and cumbersome over time precisely because of this reason. They can't keep up the pace they started at because the technical debt, compliance issues, and future problems end up weighing them down over time.
I think there's just a balance to be had here.
"My customer data isn't backed up" is a stupid problem to have, and if your friends and coworkers aren't stopping you, then you need a better class of friends.
Make your money and burn down the building on your way out!
While I'm all for building minimum viable products, I would argue that a login process that does not have a password reset function is too minimal to be viable. If you're so strapped for time that you can't build a password reset form while you're building your user sign-up form, I would ask why you're bothering with building your own sign-up functionality at all. Why not use OpenID or OAuth based logins that bypass the need for building a sign-up system at all?
As a user, I would be hesitant to sign up for a product that didn't have password-reset functionality, and, if I discovered that the service I just signed up for has no way to reset a password, I would probably stop using it (since I lost my password and have no way of resetting it) and disrecommend it to people who ask me about it.
When those are done, reset password might literally be the next item on your list. You might end up building and shipping it before anyone signs up anyway, but in the meantime you've at least got something out there and had the opportunity of getting any users at all.
To go back to your point, was "lack of export functionality" a "nice problem to have"? Arguably, yes. You don't have to worry about exporting data until you have users entering data, right? But, as Microsoft found out way back when it was building Word, users won't come to your system unless they know they can get their data back out again to share with others. So just like Word's market share only really took off when it was able to write WordPerfect files, I predict that Roam will remain, at best, a curiosity until it implements an export system that allows users to take their notes and use them in a different application.
Instead of thinking about problems in terms of what prevents users from signing up, or what prevents users from using the product, it's better to think about problems in terms of what prevents users from accomplishing the tasks the product is supposed to help them accomplish. In the case of Roam, the lack of export functionality (which at first might seem like a triviality) is actually a big deal, because the utility of a personal knowledgebase is significantly diminished if there isn't an easy way to get all the knowledge back out of it again. As a result, lots of products in that space (not just Roam) suffer from marginal adoption: while it's easy for a user to sign up, it's much less obvious that the product is worth investing any significant time or effort into learning and using.
This is a great problem to have - lots of pent up demand. Nevertheless, it is a problem, and we must actively strive to solve it. It's nice to have lots of demand, yet if the problem is not solved, we will be giving up future revenue. A larger team will be able to support more revenue than a smaller team.
There is no part of this situation where it is a good idea to slow down on hiring, or to compromise on hiring standards.
Ninja edit: this is in contrast to things that aren't a problem yet. E.g., in the article, the author discusses not having password reset functionality. That is not yet a problem in the absence of users. It's not a nice problem to have. It is not a problem.
I think you misunderstand his word usage, though you're seemingly following the same process.
A "nice problem to have" is one that results from having success. Yes, he doesn't have the problem now, because he's got no users, so indeed, having to reset the passwords of existing users would be a "nice to have" problem.
You're already in the "nice to have" zone (clients with problems to solve!), and you want to fix those problems brought on by popularity.
I am focusing (perhaps too much) on the tense issues. Without tense, "no password reset mechanism", to continue using the same example, represents a problem state.
If something is a problem, it exists in the present.
If something will be a problem, with any contingency or timeline, then by definition it is not yet a problem.
Even in your own phrasing, the tense issue becomes clear:
> Yes, he doesn't have the problem now, because he's got no users, so indeed, having to reset the passwords of existing users would be a "nice to have" problem.
Specifically "... would be a "nice to have" problem." This indicates that it is not a problem, rather that it can become a problem.
I fully agree with everything GP said, by their understanding of what I meant. But yeah, to be useful the question has to be forward-looking. Would things have to change for the better for this problem to actually manifest?
That's the crux of it. The headline is catchy, but the author isn't talking about nice-to-have problems. He's talking about recognizing if something is a problem right now, or only a problem at some point in the future.
IMO, the article is click-bait.
The full question I ask is "From my perspective right now, is this a nice problem to have?" - emphasis on the first part.
I certainly didn't mean for this to be click bait, just wanted to share something that's been valuable for me with others in the hope it might be useful. I might work on the title.
Therefore I don't value the approach as presented here. But a sane variant could be interesting: I suspect it would need domain-specific judgment anyway, which prevents it from being thrown to the world as a nice universal rule of thumb.
A 'good problem to have' is 'we need to scale to handle another 1 million users', not missing basic functionality.
For many "nice problems to have" I really disagree.
A nice problem to have can be a real problem. It's nice you are in that situation but it's still a problem.
Your site offline because of demand is a nice problem to have. It's absolutely one to solve right now.
OTOH, I think it's still a bit wrong to say don't solve them now: if you have a problem that is potentially a business-ending disaster, and you won't have time to fix it before it destroys your business once it manifests, it doesn't matter that the conditions which would cause it to manifest are otherwise an improvement over the status quo. It still needs to be fixed before that happens.
Otherwise, yeah, YAGNI applies.
On your second paragraph, 100% agreed. This is really just a useful prioritization heuristic, not a rule, and in some cases other considerations should and will override it.
It's mainly useful just to deliberately and routinely check that instinct to go deep on whatever you're working on when there's a lot of breadth to cover that's more important to get done first.
"Is this feature important, or a nice to have" is the question they want, not "is this problem nice to have?".
> nice problems to have relative to the status quo, so problems that are really only potential problems right now, but could become actual problems if good things happened
But that's not what this very common phrase means.
What they want to say is "nice to haves".
I've made this mistake a lot. I'm trying to get better at being strict about saying "not now" to new problems that I discover in the middle of an iteration, and avoid getting sidetracked from shipping by working on things that can be left for later. I'd like to share a tactic I use to do this that has worked well for me. It's to ask a simple question about each new problem I encounter: "From my perspective right now, is this a nice problem to have?". If yes, I skip it. If no, I work on it immediately.
Opinion: There is great value in letting problems persist, even ones that could be solved instantly or quickly, if only for the simple reason that they can be thought about, meditated upon, and other/better/more novel/more creative/more elegant solutions devised than the one your mind first lands on...
In fact, if in the future, I built a large company, and solved all known business problems (well, in that business's industry!), then I'd probably want to get rid of that company and start over again with nothing!
Well to again rethink/reengineer better solutions -- to the problem set that I once faced as barriers!
You will always think more creatively the second (and nth) times you revisit problems and problem sets!
This might sound crazy to most normal people -- but I value problem solving skills over money.
With problem solving skills, you can always get money -- but the reverse is not always true...
Anyway, excellent article!
It's a TERRIBLE problem to have not addressed in advance, because now you're losing money and people may wander off.
Almost everything folks call "a nice problem to have" is not really "nice". It's frequently business-hurting and sometimes business-killing.
I understand that we all want to build products which others use and pay for but why not also be building things for yourself?
Is it really such a bad thing to spend some time over-optimizing pieces of a system for your own pleasure and possibly learning things along the way?
Work on the most important features first but find the middle ground between some spaghetti that just works and an over engineered system.
Exactly the mechanism that this question aims to address!