
The single biggest value add of feature flags is that they de-risk deployment. They make it less frightening and difficult to turn features on and off, which means you'll do it more often. This means you can build more confidently and learn faster from what you build. That's worth a lot.

I think there's a reasonable middle ground between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature-flags-as-a-service platform: roll your own simple system.

A relational database lookup against primary keys in a table with a dozen records is effectively free. Heck, load the entire collection at the start of each request - through a short-lived cache if your profiling says that would help.
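For illustration, a minimal sketch of that approach in Python with SQLite; the table name, columns, and cache TTL are assumptions, not anything from a particular codebase:

    import sqlite3, time

    CACHE_TTL = 30.0                        # seconds; tune based on profiling
    _cache = {"flags": {}, "loaded_at": 0.0}

    def get_flags(conn):
        # Reload the whole (tiny) table at most once per TTL window.
        now = time.time()
        if now - _cache["loaded_at"] > CACHE_TTL:
            rows = conn.execute("SELECT name, enabled FROM feature_flags").fetchall()
            _cache["flags"] = {name: bool(enabled) for name, enabled in rows}
            _cache["loaded_at"] = now
        return _cache["flags"]

    def flag_enabled(conn, name, default=False):
        return get_flags(conn).get(name, default)

    # Usage: conn = sqlite3.connect("app.db"); flag_enabled(conn, "new_checkout")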

Once you start getting more complicated (flags enabled for specific users etc) you should consider build-vs-buy more seriously, but for the most basic version you really can have no-deploy-changes at minimal cost with minimal effort.

There are probably good open source libraries you can use here too, though I haven't gone looking for any in the last five years.




Seriously. This is one of those cases where rolling your own really does make sense. Flags in a DB table, flags in a JSON file, all super simple to build and maintain, and 100x faster and more reliable than making the critical paths of your application's request cycle depend on an external provider.


You know what I would find worse than telling my customers that they can't access the application they paid for, and that otherwise works, because I farmed my auth out to a 3rd party that is having an outage?

Telling them that my auth provider isn't out, but the thing I use to show them a blue button vs a red button is.

Oof.


Has this actually been a problem? We've been using LaunchDarkly for years and if they do have an outage (which is really really rare) the flag will be set to the default value. It's also very very cheap, maybe $500 a month.


$500 a month and it has had many major outages in 2024 alone. lol

https://status.launchdarkly.com/uptime?page=5


What about ConfigCat?


I don't know, why don't you do the research and let us know?


We did this. Two tables. One for feature flags, with name, desc, id, enum (none, defaultToEnabled, overrideToDisabled). One for user flag overrides, with flagId, userId, enum (enabled, disabled).

The combination of these two has been all we've ever needed. User segmentation, A/B testing, pilot soft launch etc are all easy.
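A sketch of what that might look like, following the two-table shape described above; the SQL and the precedence logic (per-user override wins unless the flag is force-disabled) are my reading of the comment, not the original implementation:

    import sqlite3

    SCHEMA = """
    CREATE TABLE feature_flags (
        id          INTEGER PRIMARY KEY,
        name        TEXT UNIQUE NOT NULL,
        description TEXT,
        state       TEXT NOT NULL DEFAULT 'none'  -- none | defaultToEnabled | overrideToDisabled
    );
    CREATE TABLE user_flag_overrides (
        flag_id INTEGER NOT NULL REFERENCES feature_flags(id),
        user_id INTEGER NOT NULL,
        state   TEXT NOT NULL,                    -- enabled | disabled
        PRIMARY KEY (flag_id, user_id)
    );
    """

    def is_enabled(conn, flag_name, user_id):
        row = conn.execute(
            """SELECT f.state, o.state
                 FROM feature_flags f
                 LEFT JOIN user_flag_overrides o
                        ON o.flag_id = f.id AND o.user_id = ?
                WHERE f.name = ?""",
            (user_id, flag_name),
        ).fetchone()
        if row is None:
            return False                        # unknown flag: fail closed
        flag_state, override = row
        if flag_state == "overrideToDisabled":  # kill switch: off for everyone
            return False
        if override is not None:                # per-user override wins otherwise
            return override == "enabled"
        return flag_state == "defaultToEnabled"

    # conn = sqlite3.connect(":memory:"); conn.executescript(SCHEMA)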


Would you mind expanding on the usage of enums for the feature flags table? Why not use a boolean?


We actually did use booleans, I just found it easier to explain using enums, and the code would have been simpler if we'd done it that way.


In years of trying to sell things I've found that one of the best selling points to management is "susceptible to vendor lock-in", "you don't own your customer database", etc.

I have no idea why that is.


I'm confused. Are you saying this ironically or have you literally pitched management with the risks of using your product?


I developed an open source "user management system" circa 2001 which I used on several sites, including one that had 400,000+ users, a famous preprint archive and the web site of a county-level green party. It was patterned on what sites like Yahoo and Amazon had at the time, did email verification and numerous things that were a hassle to implement, had great screens for the administrators, all of that.

I couldn't get anybody else to adopt this software despite putting a lot of work into making it easy to pick up and install.

10 years later competitors popped up like mushrooms and were adopted quickly. The thing they all had in common was somebody else owned your user database. So yeah I feel pretty cynical.


There's such a thing as being too early.

My university had a great shared browser bookmark management system, even with basic discussion support. In 1998. It was not super popular because people just didn't have that many links to share; eventually it fell offline and got accidentally deleted in 2001.


> using an (often expensive) feature flags as a service platform

I have no idea why anyone would actually do that in real life. Feature flags are something so trivial that you can implement them from scratch in a few hours, tops — and that includes some management UI.


Often these 3rd party offerings are feature flags PLUS experimentation with user segmenting. Depending on the style of software you build, this can be extremely valuable; it’s very popular in the SaaS market for a reason.

Early on at Notion we used simple percent rollout in Redis, then we built our own flag & experimentation system, but as our needs got more complex we ended up switching to a 3rd party rather than dedicating a team to keep building out the internal system.

We will probably hit a scale in a few years where it makes sense to bring this back in house but there’s certainly a sweet spot for the 3rd party version between the 50-500 engineer mark for SaaS companies.
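For readers unfamiliar with the term, "percent rollout" can be as simple as a stable hash of the user ID against a threshold. This is a generic sketch of the idea, not Notion's actual Redis implementation:

    import hashlib

    def in_rollout(flag_name, user_id, percent):
        # Hash flag name + user id so each user gets a stable decision per flag,
        # and different flags carve out different slices of the user base.
        digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100 < percent

    # in_rollout("new-editor", 42, 10) returns the same answer on every call;
    # raising percent from 10 to 50 only ever adds users, never removes them.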


That's a reasonable path! You probably learned to appreciate and value the complexity, but you wouldn't have from the start. Which service do you use?


That path sounds very sensible to me.


Happens when you do the flags wrong :)

We have a FF as a service platform and a big "value add" is that we can turn on and off features at the client level with it.

But, unfortunately, it's not the only mechanism for this, and it's also being used for actual feature flags, not just client-specific configuration.

I'm personally a MUCH bigger fan of putting feature flags in a configuration file that you deploy either with the application or through some mechanism like Kubernetes configs. It's faster, easier to manage, and really easy to quickly answer the question "What's turned on, why, and for how long?". Because a core part of managing feature flags is deleting them and the old code path once you are confident things are "right".

The biggest headache of our FF-as-a-service platform is that this is really not clear, and we OFTEN end up with years-old feature flags that are still on, with the old code path still existing even though it's unexercised.


You'll still be building management UI over their system (it doesn't understand or validate actor types, tenants, etc, so you have to do that.).

But at high throughput, you might want something with dedicated professional love. Ten thousand feature flags, being checked at around 2 (or 200) million RPS from multiple deployments... I don't want to be the team with that as their side project. And once you're talking a team of three to six engineers to build all this out, maybe it makes sense to just buy something for half a million a year. Assuming it can actually fit your model.


But it's not a side project; in most implementations it's part of the app itself.


Side project means it’s not that team’s primary focus.


The scale is easy in practice, because you outsource it to a CDN. But everything takes time and has opportunity cost.


Maybe we've worked with different FF systems, but anything that involves a call more expensive than an RPC would be lethal to request latency. Calling out to a CDN forty-five times per inbound request would be... infeasible.


A background thread long polls the CDN, updating a local hashmap on change.
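Roughly like this, though plain short-interval polling stands in for true long-polling here; the flags URL is hypothetical and a real version would add ETag handling and backoff:

    import json, threading, time, urllib.request

    FLAGS_URL = "https://cdn.example.com/flags.json"  # hypothetical endpoint
    _flags = {}                                       # read by request handlers
    _lock = threading.Lock()

    def _poll_forever(interval=15):
        global _flags
        while True:
            try:
                with urllib.request.urlopen(FLAGS_URL, timeout=10) as resp:
                    fresh = json.load(resp)
                with _lock:
                    _flags = fresh
            except Exception:
                pass  # keep serving the last known-good flags on any fetch error
            time.sleep(interval)

    threading.Thread(target=_poll_forever, daemon=True).start()

    def flag(name, default=False):
        with _lock:
            return _flags.get(name, default)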


If I was a bootstrapped startup, I'd do a json file and then when I've outgrown, I'd hand write something that long-polls a CDN for updates, with a tiny rails or react app behind the CDN.

But these approaches are insane for companies above a certain size, where individuals are being hired and fired regularly, security matters, and feature flags are in the critical path of revenue.

Last time I looked at LaunchDarkly Enterprise licensing, it started at $50k/year, and included SAML.

Now that sounds like a lot, but if you're well past the startup stage, you need a tiny team to manage your homegrown platform. Maybe you have other things for them to do as well, but you probably need 3 people devoting at least 25% of their time to this in order to maintain it. So that's at least $175k/year in the USA, and if your company is growing, then the opportunity cost is probably higher.


Add to that: ideally, feature flags should be removed after the feature is released. Ideally you also shouldn't have more than a handful of feature flags.

Permanent per-customer configuration is not a feature flag. It's also best not to have too many per-customer configurations.


Feature flags often start out gating a feature and are later left in as dependency flags. Even within a large organization, individual components and services owned by other teams will have outages.


That sounds like bad engineering.

Having all kinds of flags makes the system prone to misconfiguration.

As the number grows you get flags depending on other flags, etc.

Gets insane rather quickly. Unless someone purges flags relentlessly.


> build-vs-buy

Roll your own. Seriously.

Feature flags are such an easy thing that there should be a robust and completely open source offering not tied to B2B SaaS. Until then, do it in house.

My team built a five nines feature flag system that handled 200k QPS from thousands of services, active-active, local client caching, a robust predicate DSL for matching various conditions, percent rollout, control plane, ACLs, history, everything. It was super robust and took half an engineer to maintain.

We ultimately got roped into the "build vs buy" / "anti-weirdware" crosshairs from above. Being tasked with migrating to LaunchDarkly caused more outages, more headache, and more engineering hours spent. We were submitting fixes to LaunchDarkly's code, fixing the various language client integrations, and writing our own Ruby batching and multiprocessing. And they charged us way more for the pleasure.

Huge failure of management.

I've been out of this space for some years now, but someone should "Envoy" this whole problem and be done with it. One service, optional sidecars, all the language integrations. Durable failure and recovery behavior. Solid UX. This shouldn't be something you pay for. It should be a core competency and part of your main tooling.


I don't understand what a dedicated "completely open source offering" provides or what your "five nines feature flag system" provides. If you're running on a simple system architecture, then you can sync some text files around, and if you have a more scalable distributed architecture, then you're probably already handling some kind of slowly-changing, centrally-managed system state at runtime (e.g. authentication/authorization, or in-app news updates, ...) where you can easily add another slowly-changing, centrally-managed bit of data to be synchronised. How do you measure the nines on a feature flag system, if you're not just listing the nines on your system as a whole?


> If you're running on a simple system architecture,

His point was that even a feature flag system in a complex environment with substantial functional and system requirements is worth building vs buying. If your needs are even simpler, then this statement is even more true!

I'm having a hard time making sense out of the rest of your comment, but in larger businesses the kinds of things you're dealing with are:

- low latency / staleness: You flip a flag, and you'll want to see the results "immediately", across all of the services in all of your datacenters. Think on the order of one second vs, say 60s.

- scalability: Every service in your entire business will want to check many feature flags on every single request. For a naive architecture this would trivially turn into ungodly QPS. Even if you took a simple caching approach (say cache and flush on the staleness window), you could be talking hundreds of thousands of QPS across all of your services. You'll probably want some combination of pull and push. You'll also need the service to be able to opt into the specific sets of flags that it cares about. Some services will need to be more promiscuous and won't know exactly which flags they need to know in advance.

- high availability: You want to use these flags everywhere, including your highest availability services. The best architecture for this is that there's not a hard dependency on a live service.

- supports complex rules: Many flags will have fairly complicated rules requiring local context from the currently executing service call. Something like: "If this customer's preferred language code is ja-JP, and they're using one of the following devices (Samsung Android blah, iPhone blargh), and they're running versions 1.1-1.4 of our app, then disable this feature". You don't want to duplicate this logic in every individual service, and you don't want to make an outgoing service call (remember, H/A), so you'll be shipping these rules down to the microservices, and you'll need a rules engine that they can execute locally (a toy sketch follows this list).

- supports per-customer overrides: You'll often want to manually flip flags for specific customers regardless of the rules you have in place. These exclusion lists can get "large" when your customer base is very large, e.g. thousands of manual overrides for every single flag.

- access controls: You'll want to dictate who can modify these flags. For example, some eng teams will want to allow their PMs to flip certain flags, while others will want certain flags hands off.

- auditing: When something goes wrong, you'll want to know who changed which flags and why.

- tracking/reporting: You'll want to see which feature flags are being actively used so you can help teams track down "dead" feature flags.

This list isn't exhaustive (just what I could remember off the top of my head), but you can start to see why they're an endeavor in and of themselves and why products like LaunchDarkly exist.
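Here is the toy sketch of locally-evaluated rules referenced in the "supports complex rules" item above; the rule format and attribute names are invented for illustration, not any vendor's DSL:

    def rule_matches(rule, ctx):
        # A rule is a dict of attribute -> allowed values; ctx is the
        # request-local context (language, device, app version, ...).
        return all(ctx.get(attr) in allowed for attr, allowed in rule.items())

    DISABLE_FEATURE_X = {
        "language":    ["ja-JP"],
        "device":      ["samsung-android-blah", "iphone-blargh"],
        "app_version": ["1.1", "1.2", "1.3", "1.4"],
    }

    ctx = {"language": "ja-JP", "device": "iphone-blargh", "app_version": "1.3"}
    feature_x_enabled = not rule_matches(DISABLE_FEATURE_X, ctx)  # False for this ctx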


> if you're not just listing the nines on your system as a whole

At scale the nines of your feature flagging system become the nines of your company.

We have a massive distributed systems architecture handling billions in daily payment volume, and flags are critical infra.

Teams use flags for different things. Feature rollout, beta test groups, migration/backfill states, or even critical control plane gates. The more central a team's services are as common platform infrastructure, the more important it is that they handle their flags appropriately, as the blast radius of outages can spiral outwards.

Teams have to be able to competently handle their own flags. You can't be sure what downstream teams are doing: if they're being safe, practicing good flag hygiene, failing closed/open, keeping sane defaults up to date, etc.

Mistakes with flags can cause undefined downstream behavior. Sometimes state corruption (eg. with complicated multi-stage migrations) or even thundering herds that take down systems all at once. You hope that teams take measures to prevent this, but you also have to help protect them from themselves.

> slowly-changing, centrally-managed system state at runtime

With flags being so essential, we have to be able to service them with near-perfect uptime. We must be able to handle application / cluster restart and make sure that downstream services come back up with the correct flag states for every app that uses flags. In the case of rolling restarts with a feature flag outage, the entire infrastructure could go hard down if you can't do this robustly. You're never given the luxury of knowing when the need might arise, so you have to engineer for resiliency.

An app can't start serving traffic with the wrong flags, or things could go wrong. So it's a hard critical dependency to make sure you're always available.

Feature flags sit so closely to your overall infrastructure shape that it's really not a great idea to outsource it. When you have traffic routing and service discovery listening to flags, do you really want LaunchDarkly managing that?


> I think there's a reasonable middle ground-point between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature flags as a service platform: roll your own simple system.

The middle ground is a JSON file that is copied up and periodically refreshed. We (Sentry) moved from managed software to just a YAML file with feature flags that is pushed to all containers.

The benefit of just changing a file is that you have a lot of freedom in how you deal with it (e.g. leave comments) and you have a history of who flipped it and for what reason.


How do you push the files to all of your containers? I’ve done this in the past with app specific endpoints but never found a solution I liked with containers.


We currently persist the feature flag config in a database where the containers pull it from. Not the optimal solution but that was a natural evolution from a system we already had in place.


We keep a JSON blob in Google Secret Manager for our flags. The service running in the container will reload the secret anytime it changes


Ah, that's a super nice feature. I'm mostly familiar with AWS, which doesn't have a neat way of doing this; you end up with a bespoke solution, either with Lambdas pushing to shared volumes or just polling S3 for updates.


If you are using kubernetes, you can mount the secret/ConfigMap as a volume and it will be updated automatically when changes occur. Then your application merely watches the file for updates.
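A minimal sketch of the application side, assuming the ConfigMap is mounted at a hypothetical /etc/flags/flags.json; polling the mtime is the simplest portable approach (inotify works too):

    import json, os, threading, time

    FLAGS_PATH = "/etc/flags/flags.json"  # hypothetical ConfigMap mount path
    _flags = {}
    _last_mtime = 0.0

    def _watch(interval=5):
        global _flags, _last_mtime
        while True:
            try:
                mtime = os.path.getmtime(FLAGS_PATH)
                if mtime != _last_mtime:
                    with open(FLAGS_PATH) as f:
                        _flags = json.load(f)
                    _last_mtime = mtime
            except (OSError, ValueError):
                pass  # keep the previous flags if the file is missing or mid-update
            time.sleep(interval)

    threading.Thread(target=_watch, daemon=True).start()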


Being on AWS, using EKS feels like overkill when you're talking $75/month just for having it managed by AWS. This doesn't work with ECS, unfortunately, or if you're just running docker on EC2.


AWS has a native service for this called AppConfig and has agents that can pull and cache flag values so your services only need to make localhost requests.
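For reference, the agent pattern looks roughly like this; the default local port (2772) and path shape are from the AppConfig agent docs as I recall them, so treat the details as an assumption and check the current documentation:

    import json, urllib.request

    def get_appconfig_flags(app, env, profile):
        # The AppConfig agent sidecar caches the configuration locally,
        # so this request never leaves the host.
        url = (f"http://localhost:2772/applications/{app}"
               f"/environments/{env}/configurations/{profile}")
        with urllib.request.urlopen(url, timeout=2) as resp:
            return json.load(resp)

    # get_appconfig_flags("my-app", "prod", "feature-flags")  # placeholder names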


AH nice, I was not aware of this. Thanks (It is expensive, though...)


Expensive?? Really? One of the cheapest services around.

$0.0000002 per configuration request


Depending on when you're evaluating, it's a per-request overhead. You might (probably will) have multiple flags per request. Compared to a Lambda invocation that pushes a config file to every container if it changes, it's expensive.


I've been doing this a long time and seen a few different apps keep config in the database. There are different levels of config you're talking about here, but general app config should generally not go in a DB.

No one ever changes the bloody things and it's just an extra thing to go wrong. If it only loads on startup, it achieves nothing over a bog-standard config file. If it loads on every request you've just incurred a 5% overhead on every call.

And it ALWAYS ends up filled with crap that doesn't work anymore. Because unlike config files, no one ever clears it up.

Worse still is when people haven't made it injectable and then it means unit tests rely on a real database, or it blocks getting a proper CI/CD pipeline working.

I end up having to pick the damn thing out of the app.

Use a config file like everyone else that's probably built into the framework you're using.

To be honest, most of the time I've seen it, it was when people who clearly did not know their language/framework wrote the app.

I'm not saying it's you, but that's been my honest experience of config in the db, it's generally been a serious code smell that the whole app will be bad.


There are differences in what kind of configuration you'd want to have in a config file (or environment variables, or some other "system level" management tooling) versus a feature flagging system.

In my experience, feature flagging is more application-level than system-level. What I mean by that is, feature flagging is for stuff like: roll this feature out to 10% of users, or to users in North America, or to users who have opted into beta features; enable this feature and report conversion metrics (aka A/B testing); enable this experimental speedup for 15 minutes so we can measure the performance increase. It's stuff that you want to change at runtime, through centralized tooling with e.g. auditing and alerting, without restarting all of your application servers. It's a bit different than config for like "what's the database host and user", stuff that you don't want to change after initialization (generally).

Regarding the article though, early on your deployment pipeline should be fast enough that updating a hardcoded JSON file and redeploying is just as easy as updating a feature flag, so I agree it's not something to invest in if you're still trying to get your first 1000 users.


For some kinds of software, another call to the DB is the best way to add bog-standard functionality without adding complexity and failure modes.

Granted, not for all software. And there's something to be said about a config file that you can just replace at deployment. But that's something that varies a lot from one environment to another.


> feature flags in a JSON file that you have to redeploy to change

Our config files are stored in their own repo. Pushes to the master branch trigger a Jenkins job that copies the config files to a GCP bucket.

On startup, each machine pulls this config from GCS and everything just works.

It's not a 'redeployment' in the sense that we don't push new images on each config change.


We do the same thing but slightly differently. If a new Docker image is built, we deploy that image. If the config changes, an Ansible job moves that config to the target host and the service is restarted with that new config file. Configs are mounted inside containers. It all runs on GitLab CI/CD.


Great summary.

Just starting with them and learning to improve your application of them is the best way to learn, too.

There is one book on feature flags that was written a while back; some of the independently published books by experienced tech folks out there are a goldmine.

Feature Flags by Ben Nadel is one such book for me. There is an online version that is free as well. Happy to learn about others.

https://featureflagsbook.com/


Heck, if your user system is just a Users table, you don't even really need to consider build vs buy for them either.

If you start doing it for sub-groups, hard agree, but this is a space where it almost always pays dividends to roll your own first. The size of company that needs to consider adding feature flags (versus one that already has them) is typically one where building your own is quicker, cheaper, and most importantly: simpler.


Why aren't you just using environment variables for feature flags?

Have people still not bought into the whole 12-factor config thing?


When your app starts to get bigger and more complex, the idea of needing to restart a process to pick up any new kind of data starts to seem silly.

Have seen the pattern many times:

Hard-code values in code -> configure via env -> configure slow things via env and fast things via redis -> configure almost everything via a config management system

I do not want to reboot every instance in a fleet of 2000 nodes just to enable a new feature for a new batch of beta testers. How do I express that in an env var anyways? What if I have 100s of flags I need to control?

In other cases I need some set of nodes to behave one way, and some set of nodes to behave another way - say the nodes in us-west-2 vs the nodes in eu-central-1. Do I really want to teach my deploy system the exhaustive differences of configuration between environments? No I want my orchestration and deploy layer to be as similar as possible between regions, and push almost everything besides region & environment identification into the app layer - those two can be env vars because they basically never change for the life of the cluster.


I would add two things:

It's often important that flag changes be atomic. Having subsequent requests get different flag values because they got routed to different backend nodes while a change is rolling out could cause some nasty bugs. A big part of the value of feature flags is to help avoid those kind of problems with rolling out config changes; if your flags implementation suffers from the same problem, it's not very useful.

Second, config changes are notorious as the cause of incidents. It's hard to "unit test" config changes to the production environment the same way you can with application code. Having people editing a config every time they want to change a flag setting (we're a tiny company and we change our flags multiple times per day) seems like a recipe for disaster.


Making changes atomic is literally impossible; it's easier to just assume they won't be than to chase down something computer science tells us is impossible. I assume you are saying "every node sees the same change at the same time" when you say "atomic."

As for unit testing flags, you better unit test them! Just mock out your feature flag provider/whatever and test your feature in isolation; like everything else.


It seems like you've kind of missed both of my points.

If you're doing canary deploys to a fleet of 2000 nodes, it might take hours for the config to make it to all of them (I've seen systems where a fleet upgrade can take a week to make it all the way out). If your feature flags are configured that way, there's a long time that the state of a flag will be in that in-between state. We put feature flags in the database not config/environment so that we can turn a feature on or off more or less atomically. Ie, an admin goes into the management interface, flips a flag from off to on and then every single request that the system serves after that reflects that state. As long as you're using a database that supports transactions, you absolutely can have a clear point in time that delineates before/after that change. Rolling out a config change to a large fleet, you don't get that.
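A sketch of that "flip it in the database" point, reusing the table shape from earlier in the thread; with a transactional database there is a single commit that delineates before and after:

    import sqlite3

    def flip_flag(conn, flag_name, new_state):
        # One transactional UPDATE: every request that reads the table after
        # COMMIT sees the new state, with no fleet-wide rollout window.
        with conn:  # sqlite3 connection as a context manager = one transaction
            conn.execute(
                "UPDATE feature_flags SET state = ? WHERE name = ?",
                (new_state, flag_name),
            )

    # flip_flag(sqlite3.connect("app.db"), "new_checkout", "defaultToEnabled")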

On the second point, what I'm saying is that (talk to your friendly local SRE if you don't believe me), a large percentage of production incidents in large systems are because of configuration changes, not application changes. This is because those things are significantly harder to really test than application code. Eg, if someone sets an environment variable for the production environment like `REDIS_IP=10.0.0.13` how do you know that's the correct IP address in that environment? You can add a ton of linting, you can do reviews, etc, but ultimately, it's a common vector for mistakes and it's one of the hardest areas to completely prevent human error from creating a disaster. One of the best strategies we have is to structure the system so you don't have to make manual environment/config changes that often. If you implement your feature flag system with environment variables/config, you'll be massively increasing the frequency that people are editing and changing that part of the system, which increases the chances of somebody making a typo, forgetting to close a quote, missing a trailing comma in a json file, etc.

Where I work we make production config changes maybe once a week or so and it's done by people who know the infrastructure very well, there's a bunch of linting and validation, and the change is rolled out with a canary system. In contrast, feature flags are in the database and we have a nice, very safe custom UI so folks on the Product and Support teams can manage the flags themselves, turning them on/off for different customers without having to go through an engineer; they might toggle flags a dozen times a day.


How do you do software upgrades if you don't have a good system for handling process restarts without downtime?


Then again, speed and performance.

At my last job, updating a production game server cluster took an hour or so with minimal to no customer interruption. Though you could still see and measure how the systems needed another hour or two to get their JITs, database caches, code caches and all of these things back on track. Maybe you can just say "then architect better" or "just use Rust instead of Java", but the system was as it was and honestly, it performed very, very well.

On the other hand, the game servers checked the marketing backend once a minute for which promotion events should be active, and reacted to it without major caching/performance impacts.

Similar things at my current place. Teams have stable and reliable deployment mechanisms that can bring code to Prod in 10 - 15 minutes, including rollbacks if necessary. It's still both safer to gate new features behind feature toggles, and faster to turn feature toggles on and off. Currently, such per-customer configs apply in 30 - 60 seconds across however many applications deem it relevant.

I would have to think quite a bit to bring binaries to servers that quickly, as well as coordinate restarts properly. The latter would dominate the time easily.


Software updates happen once every two hours; config changes happen once every 5 minutes or faster.

A few days ago I was tuning performance parameters for a low-latency stream processing system; I could iterate in 90 seconds by twiddling some config management bits for 30s in the CLI, watching the graphs for 60s, then repeating.


I mean, isn't that even worse?

If I have 100 servers and I'm doing rolling deploys then I'm going to be in a circumstance where some ratio of my services are in one state and some ratio are in another state.

If I am reading per-request from redis (even with a server cache) I have finer-grained control.

For me it is a question of "is the config valid for the life of this process" vs. "is this config something that might change while this process is alive".


How do environment variables help? You still need something that knows what values to set the env vars to.


I think openfeature.dev is an attractive proposition these days - start off with an env-based provider or roll your own and if you get to a point where you need to buy, you only need to swap over a provider (or use a multi-provider).


I like OpenFeature. Initially I thought that it was overengineered and, quite honestly, I never grew to a size where I had to use anything but an env-provider paired with our CI/CD pipeline. But it gave me security that A/B testing would be possible if needed, and, more importantly, we had a unified API for feature flags and they were all defined in one place.


I agree with a lot of this, except for the part about de-risking deployments. That should not be a reason to adopt a feature flag platform - that is a symptom of a bad deployment pipeline that should be fixed, which is a whole other story.


I disagree that using feature flags to de-risk deployments is a symptom of bad deployment pipelines.

There are several aspects of deployments that are in contention with each other - safety, deployment latency, and engineering overhead is how I'd break it down. Every deployment process is a tradeoff between these factors.

What I (maybe naively) think you're advocating is writing more end-to-end tests, which moves the needle towards safety at the expense of the other factors. In particular, having end to end tests that are materially better than well-written k8s health checks (which you already have, right?) is pretty hard. They might be flakey, they might depend on a lot of specifics of the application that's subject to change, and they might just not be prioritized. In my experience, the highest value end-to-end tests are based on learned experiences of what someone already saw go wrong once. Writing comprehensive testing before the feature is even out results in many low quality tests, which is an enormous drain on productivity to write them, to maintain them, and to deal with the flakey tests. It is better, I think, to have non-comprehensive end-to-end tests that provides as much value for the lowest overhead on human resources. And the safety tradeoff we make there can be mitigated by having the feature behind a flag.

My whole thesis, really, is that by using feature flags you can make better tradeoffs between these than you otherwise could.


> That should not be a reason why to adopt a feature flag platform

It's one of the two big reasons. First is the ability to rollout features gradually and separate deployments from feature release, and second is the ability to turn new features off when something goes wrong. Even part of the motivation of A/B testing is de-risking.


The risk of deployments isn’t entirely technical. Depending on your business and customer base it might be necessary for some groups to have access to the feature earlier or later than others.


Strong disagree here: my whole org does not roll out changes without feature flags, and whenever someone doesn't follow this policy they cause large-scale incidents. Feature flags are actually a sign the deployment pipeline is very sane and mature, because people understand that any new code comes with unexpected risks and we should prevent those risks from taking down systems.


Sometimes the only way to try out a distributed system is to run it in prod and see what happens. Having the tools to flip behaviour within 1 second globally can be a useful escape hatch. When you get to large enough scales “just roll back” is not always good enough. I deploy systems with tens of thousands of nodes and we specifically have to rate limit how fast we deploy so we don’t cause thundering herds.


Very few teams have instant deployments. Even fast systems take a few minutes to run. If you can turn off a flag faster (because it’s a DB record), then you should do that.



