Hacker News new | past | comments | ask | show | jobs | submit login
SQL Is No Excuse to Avoid DevOps (acm.org)
263 points by YesThatTom2 6 months ago | hide | past | web | favorite | 118 comments

I really enjoyed Thomas Limoncelli's Time Management for System Administrators. I've also got The Practice of System and Network Administration on my shelf, and hopefully one of these years I'll read it---maybe once my time management is better. :-) I like that Tom was practicing and advocating system-administration-by-writing-code way before DevOps, and to me he represents a tradition of automation that far predates the buzzwords.

I've lived in the world of Rails and Django so long, I sort of take schema migrations for granted. It's interesting to see such a high-level person covering this ground; it reminds me of how much these frameworks have given us, and how so many organizations are still practically emailing around zip files of source code.

His Technique 2 feels harder to me: write the code to work on both old & new schemas, and deploy the schema change after the code change. I know this is what Google does though, and I think it's the only safe way to deploy rollback-able code changes. I'm curious who else is doing it, and especially how you are scheduling it. Especially with a CI/CD workflow, I'm used to getting things integrated and letting them get deployed "whenever". So how do you make sure the schema changes go out "later", and what gating is required first, and how do you get buy-in to spend time on that after the feature is already live? I'm not asking about the technical how-to, but the organizational/political how-to. For instance if you're using Pivotal Tracker, do you make a separate PT story? Leave the old one open until the schema changes are deployed? What does it concretely look like?

> I've lived in the world of Rails and Django so long, I sort of take schema migrations for granted.

And there's lots more room for improvement over Rails and Django migrations.

Some of the problems:

- Annoying version number management

- Management of long "chain" of migration files

- Unnecessary coupling between ORM and migration tool

- Heavyweight process to generate and run each migration, which slows down local development

- Bad testability

Autosyncing local dev databases, and automatically generating migration scripts with a diff tool is a better approach, imho.

Don't disagree with the rest of these points, but I do want to point out that Django lets you squash long "chain" migration files just as you'd do with commits.

Totally agree -- I wrote some similar thoughts elsewhere in the thread. It might just take some time for more developers to gain experience with the declarative/diff-based approach, to become comfortable with it.

Like the feature flag approach for all feature rollout(with or without schema change). If the flag is set the new feature kicks in so its easy to do a rollback if need be and we do this all the time. We don't close the feature ticket until the flag(old schema too) is removed and made as default in production after the smooth transition.

Think of Schema changes much like a two-phase commit in a database (Which has more than two phases - I hate that term.)

First, propose the new schema. You should check in a version of your schema that contains both the fields from your new and old code. You can't do certain changes in this phase - If you're deleting fields, you can't do that yet. If you're changing column types, you should be careful; Hopefully your DAO already abstracts this, but you may need to roll that code change out first.

Then, make your application use the new schema. Start writing to the new columns as well as the old. Prepare a migration that will copy any data from old columns that needs to be. Your application should use the old data as the source-of-truth during this time, but when writing, it needs to write both versions of those fields. At this point, you can still easily roll back, and may want to, if you decide to proceed a different way for development.

Next, flip the source of truth. Your new fields should be the primary ones, but the old ones still need to be valid. You should be prepared to roll back still - The previous release should still be usable.

Once you're certain you're not going to roll back, stop reading/writing the old data. This is your hard flip, but once it's started, you can proceed leisurely. It doesn't matter - You've entered cleanup phase. Once that software change is finished, you can do a second schema change to delete old fields. (This is where the significant value of Schemaless is - You don't need to do anything in this second change, just let old data naturally clear itself out. Schemaless actually requires more attention to schema than a Schema-driven database, for this reason, but the iteration time can be quite fast if you have a good schema-control system, like, say, storing protobuf definitions in your version control and following their update rules[1])

All in all, this requires at least two periods of schema change and one period that involves multiple software rollouts. It's not simple, but nor is it that complex - And if done right, it's easy to roll other software changes at the same time, never losing velocity (aside from that which is spent reaching this target)

All of this requires that your application is well formed. Your application needs to be properly modularized, use reasonable DAOs or otherwise have complete understanding of how the data is accessed. It may require coordination between two projects that move, while not in lockstep, with gates on the other.


All good points. It's worth mentioning that this is exactly why protobuf doesn't by default fail when it gets a field it doesn't recognize (there might be a flag to control strict validation, but there is a reason that "required" fields were taken out in proto3). From the source code:

https://github.com/protocolbuffers/protobuf/blob/master/src/... // An UnknownFieldSet contains fields that were encountered while parsing a // message but were not defined by its type. Keeping track of these can be // useful, especially in that they may be written if the message is serialized // again without being cleared in between. This means that software which // simply receives messages and forwards them to other servers does not need // to be updated every time a new field is added to the message definition.

A bit more detail:

For those who don't work with protos, they're the thing that Google created to use instead of XML before JSON was a thing. To use them, you define a message schema consisting of fields, where every field has (1) a number that will uniquely identify that field in the message, (2) a type for the field, and (3) a name for the field. The protoc compiler then generates code that decodes and encodes that specific message type for you.

Now think about what happens when a new field is added to that message type and the code is re-generated. That change is probably going to be in at least three different binaries (and it's often more): the client, the frontend, and the backend. There's no way to guarantee that every single client, frontend, and backend picks up that change at the same time - it has to go out incrementally. So if client v2 talks to frontend v1 and gets routed to backend v2, frontend v1 damn well better not crash if it sees a new field.

For historical context, there were of course similar things that eixsted before protobufs. Some were even widely standardized (eg ASN.1). Protobufs apparently was a kind of reboot of these concepts inside Google.

Indeed. It's doable, but requires a reasonably fast rollout tempo and a manageable level of technical debt.

It is utterly impractical when you're deploying to individual customer sites, for customers who agree to an upgrade on timescales ranging from three months to five years...

[edit] Part 1, the automated schema updates, I'm right alongside. I can't fathom how people can operate with manual schema updates all the time. But part 2 is IMO wishful thinking outside of a deployed environment your own team can upgrade pretty much at will.

Absolutely. It only works for Managed or SAAS solutions. You need to be prepared to deal with much wider ranges of schema problems on embedded devices - Which is a good argument for Schemaless and a good version control document, though you can absolutely take it too far in terms of compatibility.

DevOps is about removing a secondary team that you throw something over the wall to for deployment. Instead the build pipeline (be it Circle, GitabCI, Jenk, Go, etc...) can control ALL environments and things have been optimized, in such a way, that a single click updates an environment. (including prod). The pipeline "code" and all its related scripts are the actual devops work products. They CAN be done by a single team that supports multiple other teams, but truthfully it should be a shared responsibility. Perhaps one team provides base scripts and answers the questions about where logs and monitoring data go (and of course those questions are answered with scripts essentially). So "ops" can still exist if your organization is large enough, but I definitely say there's no "thing" you throw over the wall anymore.

So it's a secondary team to replace the secondary team.

Always. The question with any form of process optimization is how much smaller does the human presence need to be? The idea that we would replace people entirely is a sales pitch.

The difference is an automation of a contract to make a new release in production: the process of delivering a new code should make all the necessary checks such as CI, regression tests, monitoring endpoints and so on up to deployment itself, in a reproducible manner.

I always thought that DevOps are developers who can do ops. But in recent years I see more and more people calling themselves DevOps, even all they do is ops with simple scripting.

I think we need a new name for developers who run their code in prod, as DevOps is diluted.

So, I’m this light name of the article is utter BS. As devs we deal with a lot of different databases, be is SQL, NO SQL or graph and dealing with them is part of our job. If all you do is run stuff and release - you don’t have Dec component, so you just Ops.

I have been doing DevOps since before it had a name.

Been running on AWS for the better part of 8 years now and running multiple production environments for clients and as an employee.

Over the years, I have seen the term "DevOps" slapped on so many things I don't even know what it means anymore.

For some, it's a culture, for some, it's a team, for some, it's a role.

SRE is now getting the same treatment. No idea what it means most of the time.

IMO, the only places where it's done successfully view it as a culture and neither of the other two.

That said, the easiest way to make yourself more 'in-demand' right now is to make it your title.

As someone who does ops and development, when I look around in the market and see companies hiring full time devops roles, I usually see big red flags. The only jobs that I tend to want lately are on development side at small companies or the operations side at larger ones.

I tend to see Dev ops as mostly ops but a background in dev. They can step in and understand the development environment, but generally they are going to make life easier for the devs. it's an important role, but having some developer background is nice while they kick ass getting things done. Your standard sys admin will do the job but generally get a big eyes look on their face when you ask about something outside of their wheelhouse.

Most of the time I see the word people are talking about deployment automations. But DepAuts doesn't sound as good.

A person who specializes in deployment automation, a company that has a culture in investing in deployment, a team that specializes in deployment automations.

Deployment automation is an engineering task that can be designed, implemented, deployed by an engineering team. It could be an internal Software Delivery product

Yup, our ops call themselves DevOps and SREs in the same meeting, not doing any of it.

I agree that DevOps becoming a job title is a perversion of the term. That applies equally to an operator or a developer calling themselves "DevOps".

As the article mentions, DevOps is a methodology. To quote the author, "The most concise definition of DevOps is that it is applying agile/lean methods from source code all the way to production."

There is, of course, a mirror of the situation you're describing. With people that are relatively competent developers "also doing Ops", but doing it badly.

The article is a description of how to make schema changes less painful and less risky.

How is your comment relevant to that?

I have a problem with the title. They could’ve named it: SLQ schema management.

Dev that do Ops is usually a terrible idea.

It depends on the person and mindset.

I switched from dev to ops fulltime (as in not "DevOps") and I'm perfectly at home here. If anything, I'm the one bridging the gap between us and the development teams. My team thinks I'm an ops person who does development, but I don't really find that to be true.

Why do you say that?

I've worked at a couple places that had developers build their own environments before I got there (I'm more of a systems engineer).

There was no thought to high availability, security, scalability, reliable backups, etc.

I'm not saying all devs are like this, but that's just been my experience.

This. Honestly Ops is a different mindset that devs, the tldr version is devs don't really know / care about running things in production. I don't blame them it's just different set of skills.

A developer who does not know/care about running things in production is not a developer. These are not different skills. Security, high availability, reliability, these are concepts that hold as much weight at the application side as they do on the systems side. Two sides to the same coin.

> A developer who does not know/care about running things in production is not a developer.

Behaviour depends on how incentives and constraints are structured.

Traditionally, operations had to keep things running and cared little about new functionality. Developers had to deliver new functionality and cared little about keeping things running.

Requiring developers attend 3am outage calls changes incentives, as does requiring operations to own (or get out of the way of) software delivery pipelines.

> devs don't really know / care about running things in production

I disagree with this entirely. Devs _you have workd with_ might not have cared but it doesn't make the statement universally true. As a counter argument, some of the best software developers I've worked with were also very good at operating and debugging production software and the reverse has also been true.

I also don't buy that the two are a different mindset (at least in the domains I work in and care about). In my experience the very best people (whether they're working "dev" or "ops" roles or something in the middle care about the entire development and deployment lifecycle of the software they work on. Building a good experience in software includes thinking about reliability and availability and planning a reliable and performant deployment also requires that to be thought of in the application layer at some point.

No idea how this made front-page. Sounds like the author met one crazy person and wrote an article about it.

Totally irrelevant to me, I've never seen a company like this in practice in my long career.

If we're talking about "a company that only deploys rarely and requires many manual steps with many people involved to do it", then I've worked with a few.

Typically they're shops that got very big very fast. They deal with problems of scale where schema changes are very expensive and data migrations can require weeks without heavy optimization.

They typically have tried to automate some stuff, but the crush of feature work to support the massively expanding business always came first. They usually have CI, but never enough tests (unit, integration, or functional) to deploy with any real confidence.

They used to deploy (by hand) every couple of days because it was easy. Then they just kept getting bigger and it became every week. Then maybe once a month.

Typically they'll talk about why they don't do continuous delivery or deployment because deploys are RISKY and the business depends on deploys being perfect.

I've seen this situation in video games (twice), a mobile app that had reached the 100 million user threshold, and an old school data marketing company.

It definitely is a thing that exists.

What I would say, if I could rephrase, is this:

Sure some shockingly-behind-the-times companies exist, and may even be a non-negligible percentage. But I'm surprised this is is front-page HN because I assume that the engineers on this website know that you don't need downtime for a DB migration. It feels very common-knowledge among this community.

The author, Tom Limoncelli, is seen as an expert in Systems Administration circles, and has written several books The most famous two:

  Time Management for System Administrators
  The Practice of System and Network Administration
He also regularly gives talks at conferences, etc on the subject.

He is most likely paraphrasing many conversations he has had with people over the years


I've seen plenty of companies function pretty much exactly as described. Just because it is irrelevant to you does not mean it's not interesting to others and thus front-page worthy.

Have you considered that your experience is not universal? I voted it up because I've had surprisingly common conversations with people at large corporations, academic institutions, and the U.S. federal government where sclerotic processes made every deployment a tense, error-prone ordeal which people went to great lengths to avoid.

> Totally irrelevant to me, I've never seen a company like this in practice in my long career.

Never worked in government, I assume...

Two private sector industry leader companies that I worked for had never heard of a staging environment before I joined, and the process to implement one was long and painful. The argument against it was: we wouldn't need a staging environment if we had competent developers ...

Niche industry, so still not very big corporations, probably explains it.

I've worked in several organizations where installations are as described in the not CI/CD fashion. It's not fun. Right now I'm working on a green project, and at least have CI/CD working, by once we start deploying our application to clients, it will get messy because we aren't following the practices set out in the article.

It's been so bad, I've completely distanced myself as a developer because of those that have far more political influence as to the use of the database and the effects it has on me. I can work on the API, Database and other services. I simply don't, I was hired as a UI/UX/JS Expert. That said, it's not always the most pleasant way of separating the people working on an application, compared to full-stack development.

I've been at this for over two decades, and tend to find it in lots of business development. Not so much in typical SV startups, but there's far more business development units out there than SV startups.

I've seen paragons of "Devops" with precisely this practice in significant chunks of the company. You've gotten lucky - or possibly just buried your head in the sand.

Probably less to do with luck/head-in-sand and more to do with filtering out potential employers.

> Sounds like the author met one crazy person

He either met a straw man or a troll.

In the old methodology, any schema change requires the entire application to be shut down while a team of experts (or one very overworked DBA) modifies the schema manually.


I'm working pretty much in an environment described in the article. Big bank, complicated database setup with multiple replicates and databases up to a size of ~25TB.

On a (pre defined) release weekend there are essentially two entities to be released: The application (obviously) and the database changes.

For any DB change a patch is created and tested through an environment of half a dozen instances. If bugs are found on already deployed db patches they are never fixed in the patch, but a subsequent patch (also tested through all environments) is created.

Needles to say that anything provided, which applies database changes is version controlled.

The idea to apply database patches manually is so damn outlandish to beggar believe.

Anybody doing this on a production database without a strictly controlled (and tested!) break-fix should be fired on the spot.

There are some banks that won't have DBAs pretty soon if you're in charge, for better or worse.

Almost every DBA (both administrator and architect) I’ve worked with was opposed to the workflow that most developers like to use - schema migration tools like Liquibase and FlywayDB along with continuous deployment. I have problems with this approach only if there’s poor testing around database changes and this is the default in most companies. For example, adding an index to a column can write lock an entire table and cause a site outage as a result. On most developer test databases this is accomplished within a few seconds usually in which time most test suites won’t notice the issue. And because most development environments don’t look anything like production for cost reasons, you’re incurring a risk there.

Instead of fixing the technical root of why there’s so much risk, most companies add the most possibly expensive resource out there to mitigate or understand it - a person. The more dysfunctional, the more people get piled on while nothing actually is fixed (this is a larger cultural enterprise problem that goes beyond database changes). Most decent data tier engineers I know don’t like to babysit DB changes yet almost everyone I’ve worked with had that as a large percentage of their time spent. Such a waste of time and talent.

But then there’s a few DBAs I’ve seen that basically don’t do anything productive by making everything that goes wrong a developer problem, refusing to help troubleshoot (even when the DB in question is not accessible to developers), and other not very devops-y behaviors (hostility-first operations that is). It’d be one thing if the DBs had some huge changes happen, but usually I’ve seen these folks hardly do anything visible to me as another infrastructure guy and the troubleshooting I wind up doing on behalf of powerless developers is something that would have taken someone more knowledgeable less than 5 minutes with a quick trace of DB queries in flight and some historical profiling.

But being change review Charlie is something that a lot of people do that, while necessary to some degree, becomes their entire job. That’s just frighteningly dull work to me.

And medical records software

In fact, the entire hospital had plans for multi-hour downtimes due to releases, not to mention department-specific contingency plans if the downtime took longer than expected.

In my first job out of college I was required to do the after hours prod releases with another guy (we were the 2 youngest on the team...). We did quarterly releases most of the time and did everything manually from building/packaging the web app to running the DB change scripts, etc. Needless to say, when the other guy left for a different job I was left alone to do these releases and left a month or two later.

Maybe training rather than termination?

I'd consider this such a gross case of malpractice on the level of a surgeon forgetting some instruments in patient, before sewing her shut.

Errors and mistakes happen, no question about that. And any ol' mistake certainly shouldn't be a firing offense.

But this behaviour is so grossly incompetent that I don't believe that it can't be helped with any training.

Surgeons are generally trained with a grueling multi-year academic and practical program about how to conduct surgery. DBAs aren't. If you're lucky you have someone who has been trained to be a DBA professionally - but I've never worked with one, all the SQL I've interacted with has been run by people with unrelated job titles who learned how to operate databases by Googling when necessary.

You're welcome to fire people if you think it's going to help. I'm just curious how you think a skilled, trained DBA is going to show up at your desk when that happens.

You pay them money :-) though think the guy that was our DBA at BT has now retired - his first boss was Diskaja at NPL labs

At a bank, yes. However, for a smaller startup firm, no. (The truth is, for most of this stuff, most of the time, it just doesn't matter.)

It just doesn't matter until it does.

Which is fine if failing is cheap. Depends on the risk profile.

Maybe training, but definitely not keeping the position. And, to be frank, if they had any experience, I'd consider that a sign of negligence.

They should have never been put in that position in the first place.

> Anybody doing this on a production database without a strictly controlled (and tested!) break-fix should be fired on the spot.

I agree with almost everything you said except this line. Usually, people with production access are (or should be) sensible enough not to touch the prod db with anything untested that might mutate things.

If they're not sensible enough for that, they shouldn't have had access in the first place, and instead of placing blame and firing people, I'd start thinking about /why/ that happened. Points to a more fundamental flaw in the Engineering. Firing won't fix anything.

Also note that the statement in question talks about an old methodology.

As for the article, good stuff. I enjoyed reading it.

> In the old methodology, any schema change requires the entire application to be shut down

MySQL perhaps, when even the addition of nullable columns without defaults or constraints requires a full table lock and copy?

That seems to be a common reason for stating online schema migrations are difficult (not impossible), but not a problem with many other RDBMS engines.

While it's clear that the author of the linked article isn't familiar with MySQL administration, it's worth noting that the most recent versions of MySQL finally support instant column adds: https://mysqlserverteam.com/mysql-8-0-innodb-now-supports-in...

That won't save us from needing extra tooling like pt-online-schema-change or gh-ost for other types of schema changes in high availability environments, but it's a step in the right direction.

MySQL has supported algorithm=online since the release of MySQL 5.6 in 2013.

MySQL builds a shadow copy off the new table in the background, then swaps the new table in for the old table once the copy is complete.

My lame attempt at clarifying DevOps confusion goes something like this:

DevOps is an analogue of Agile and Lean. Lean, which came from the Toyota Production System, doesn't just exist in one department and in one position. Saying that "DevOps is <x position> doing <x thing>" is like saying "Lean is a factory floor worker pulling an Andon cord when a product is defective". It massively understates all of the aspects of the method, and all the different roles involved, and how they get work done.

Nothing about these methods prescribes that you have to use X technology, or that there is just one position that does work in one way. They prescribe a method, and you choose how you implement that method, and lots of different kinds of people in an organization are needed to make it all work.

I started a Wiki recently to try to give some more in-depth understanding of DevOps (https://devops.yoga/overview.html). It's still pretty bare-bones, but I think you can quickly get the idea that DevOps isn't one particular thing, or one particular team, or one set of technologies.

Agile set about to deliver working software frequently.

Devops set about to deliver working software frequently.

Both optimise for frequent delivery. The theory and practice of how working is measured is where they differ.

Even though I've given in to using the "automated schema updates", or migrations, approach (Technique 1 in the article), I'm still uncomfortable with the idea of the application tier having permission/ability to manage database schema and execute DDL. I feel that it removes a key layer of defense-in-depth, and opens up the possibility for bugs and exploited vulnerabilities in application code to have hugely negative impacts.

Even ORMs make me nervous for similar reasons, since they essentially enable the application tier to pass into the database any/all queries and DML. But like migrations, I allow ORMs for the convenience they provide during development.

The live application serving traffic does not need to have DDL permissions. The migration code can live in the same code base/application, but can be executed as a separate task with escalated privileges.

Ideally, the same build should be used in every environment and all the configurations such as db endpoint, username and password must be external to the built and fed from the environment. The migration command can be a part of application, but only invoked automatically in dev environment. Most of ORMs allow to configure whether schema initialization happens at the startup or not.

During deploy, the migration part can be executed separately in prod environment, before the actual deploy. Different tools provide different "hooks": e.g. heroku has release phase(https://devcenter.heroku.com/articles/release-phase), in spinnker you can add a separate stage for db migrations: https://blog.spinnaker.io/deploying-database-migrations-with..., which can use a different db user. In this case, migration functionality is disabled at startup in production environment and runtime db user need not have DDL privilege, because by the time it starts up, migration phase must have been finished.

Treat a database as any other service. SQL is its API. Don't break the API contract of a view or procedure, create a new one and deprecate the old one. Not rocket science.

Actually, running integration tests against SQL RDBMS works really well. It might be a tad slow but the CI can easily handle that. We have Docker and all of that for this, and it works x-platform.

Running the RBDMS with `eatmydata` (https://www.flamingspork.com/projects/libeatmydata/) can significantly speed up tests in my experience. All of you're data is gone as soon as the process exits, but that doesn't matter for tests.

You can even do SQL DDL ALTER TABLEs in a Transaction on production, do the change, run smoke tests in a nested transaction that are all rolled back when complete, and if anything fails you roll back the DDL changes and are back where you started clean.

Support for DDL in transactions seems to vary quit a bit between DB servers.

In mysql (and I think also mariadb) ALTER TABLE cannot be rolled back, and also implicitly does a COMMIT on any in-progress transaction before altering the table.

In PostgreSQL you generally can do ALTER TABLE in transactions, but there are dome restrictions or exceptions. If the alteration is to add a value to an enum, it cannot be done in a transaction.

MS SQL Server seems to be OK with ALTER TABLE in the midst of a transaction, according to this article [1]. Oracle seems to be similar to mysql.

Even the ones that support this well, like MS SQL Server, have restrictions on other DDL, such as creating indexes.

I think I'd rather just assume no DDL in transactions and design my schema update procedure accordingly, rather than asking whoever is designing the new schema to try to limit themselves to changes from the old that avoid whatever statements whatever DB server we are using doesn't allow.

[1] https://www.mssqltips.com/sqlservertip/4591/ddl-commands-in-...

Docker isn't really required for this. Any in memory db will provide good contract verifications. And a simple alpha/beta/gamma setup with the end database can get you the rest of the way there.

Docker can be a bit of a better option... Unless you're using the same database software for the in-memory operations as the actual deployment. Or aren't using many specific features the rdbms you have chosen (stored procedures, triggers, virtual fields, etc). Not that I'm a fan of triggers and prefer direct table access over SP, but virtual fields are GREAT during transitional changes.

It also generally requires an abstraction between the DB variants.

Agreed. In the old world, this is what an alpha/beta/gamma environment could help test against. The in memory could be used to make sure your units and overall components were correctly specified. The actual installs of the DB software to isolated schemas were where you made sure you integrated with the DB correctly.

That all said, I did not mean this to be that Docker doesn't provide some value. Just pointing out it is not a prerequisite. If that makes sense.

"rather than using SELECT . If you do use SELECT , don't assume that the fields are in a particular order."

There are professional developers that use Select * EVER!

Maybe add write a simple Select, Insert and Update Statement to the FizBuzz test.

You should never use "SELECT * FROM SomeTable" to retrieve data because the column order isn't guaranteed. Also, you should always use column names to access data.

However, "SELECT * FROM SomeTable" can be used to determine if a table had data ( for "EXISTS" clause ) or to "COUNT" the rows of the table. It's industry standard.

You could also use something like "SELECT 1 FROM SomeTable" but it isn't natural.

> You should never use "SELECT * FROM SomeTable" to retrieve data because the column order isn't guaranteed. Also, you should always use column names to access data.

If you always use column names, why does column order matter?

You’re sending more data than necessary across the wire In general and specifically if someone adds a blob data type it can get really hairy.

Another time when you don’t want to do select * is if you are using a database that does columnar storage like Redshift.

It's not just column order. Sometimes the columns you select affect performance of the query.

More precisely, if you only select indexed columns you can read only from the index rather than read from index then read the rest of columns from table. Usually the indices are entirely in memory as well, so it's buttery smooth, whereas a read from the table will probably require a disk read.

Generally for schema bound objects you will want the SELECT 1 anyway, and I find it fairly natural for expression a boolean.

If you want to change it to SELECT 'EXISTS' then I wont be mad :)

Don't throw out the baby with the bath water. Just be aware of invariants, and respect them.

There is nothing wrong with a select * when working with an unfamiliar data model for initial rough development to get your head around the data path. Just remember to refine your query to meet your performance requirements.

This advice always struck me as funny, because an ORM will select every column in a table too, even ones you aren't going to use. So the corollary to this is: never use an ORM.

One would catch a lot of debate (never use an ORM), the other mostly doesn't (never use select *). Maybe most people that use an ORM don't connect these dots though.

IMHO if you are using a scripted language, I find it easier to NOT use an ORM/ODM layer.

    function lookupFoo(baz) {
      const sql = await db.init();
      const result = await sql.query`
        SELECT x, y, z
        FROM foo
        WHERE bar = ${baz}
      return result.records.map(mapResultsToFoo)[0];

I really hope that sql.query does proper escaping, because otherwise, you're going to have a jolly time, when Bobby Drop Tables signs up for your service.

Which ORMs please? I'd like to avoid them!

Hibernate doesn't for starters.

Django's ORM doesn't select *, but it will select every column unless you tell it otherwise.

> There are professional developers that use Select * EVER!

Old code from the 2000 PHP era is full of this. Never underestimate the horrors of "legacy" code bases, and never underestimate what people do in short lived "maintenance scripts" or "helpers" that are used well beyond the intended life span. Or "proofs of concept" aka "unbelievably dirty hacks to get a MVP" that end up being extended and extended to being the "real product" in the end.

For an MVP I will always use "select *" and friends simply for speed of development, but clearly communicate that this is in no way intended for production usage.

> clearly communicate that this is in no way intended for production usage.

But it likely will be anyway.

I use a variety of ORMs and hand-rolled SQL, and often will do select . The times when I know there may be a performance impact, or I'm debugging an existing performance impact, hand-rolled SQL always is on the table, but often* the issue is either missing/poor indexes or simply table design that didn't foresee the current use case. There's been a couple of times where a stray select * was, really, pulling back unintended very large blobs and causing a performance issue, but it's been infrequent compared to index issues.

Sometimes, devops, admins, DBAs, etc are there for QA. It's not that they don't know how to automate everything, it's that they know if you permit someone to do a dumb thing, they'll probably do.

It's one thing to talk about good practices and another to build a large development team that can understand and follow them. By moving the responsibility to devs, you tax your senior devs, who have to monitor middle and junior devs more closely and sometimes micromanage them as a consequence.

A few schema changes a year are more easy to test, justify and —most importantly— track than many small ones going on at the same time. How do you even manage the schema version when there are multiple branches that require changes and you don't know the order they will be merged in? Instead of a version number, you start to use flags. How about deprecated features? How and when do you clean up? Something else that might be important, are performance optimizations.

I agree on the suggestions of the article and it's the way I try to write code that uses a SQL store, just saying that sometimes in real life you have to make things harder in order to stay safe.

Everything about this article is really weirdly written. It's glib in the way that preachers are glib from the pulpit and trying to get you to just believe something while glossing over your particular concerns.

Is "doing DevOps" impossible? "No! It's simply a difficult transition."

Oh, I see now. That makes it all better. Never mind that the author's "friend" never claimed it was impossible. Rather that it was fundamentally a bad decision.

It's like a parable, but a really bad one. I wonder when the author is going to bring up Agile and just assume we're all true believers and that anything that incorporates Agile must be good without question. Oh, there it is. Right there.

So the difficult transition of DevOps is presented here as necessary and appropriate because 1. Agile, 2. Faster, and 3. Because it avoids some incredibly horrible practices the friend's DBA team does.

1. I won't get into here. That's a different flame-war for a different time. But people should know better than to write articles with the unquestioning premise that Agile automatically justifies any and every difficult transition. 2. I'm highly allergic to justifications that involve moving faster. Most companies do not have a problem delivering value fast enough. They have a problem delivering any value at all, and delivering a reliable product at any pace. Faster time to value is the worst possible justification if you're trying to convince me to experience some pain.

I am a big proponent of DevOps, and I bake it in to all my work, both at my work-work and in my side projects. It's worth the difficulty to me because it allows me to reliably get my code from point a to point b. That it's perhaps faster (is it really faster if you add up the person-hours? I don't know. I don't really care) is a little extra niceness if it's there at all.

3. DevOps has basically nothing to do with the pain that's causing the situation described in this parable. DevOps won't fix bad DBAs or sloppy Devs. DevOps won't fix a manager that allows this kind of situation to happen in the first place. DevOps won't fix a company culture that allows Management to run this kind of a dev shop. It's like the "friend" here is dying of dehydration in the desert and this dude rolls up in his car and says, "Hey man, you really need to eat more vegetables and less red meat and exercise at least 30 minutes a day. It will totally change your life! Also, I have these essential oils that will cleans toxins from your spirit aura and help you focus on your life goals. Good luck!"

This is interesting. For the very first application I had to create from scratch, the first thing I learned was about SQLAlchemy and the second was Alembic. It's almost alien to me to see that there are large organizations that have never heard of migrations.

For scripting database changes so they are deployed with app changes... or so they are automated and auto-tested upon deployment and in version control, I've had good success with open source Liquibase. Worth looking at Flyway also.

For those using Microsoft SQL Server, we've had success using SQL Server Data Tools (SSDT), to automate database deployments for a few years now. Prior to this, developers would check-in 'delta scripts' - which were just all the sql changes that were manually applied. Now, the entire schema for multiple products is stored in source control as the source of truth, and deployments themselves generate and apply the deltas.

Great example of blaming the tool [SQL] to try and bring in hip technology when the people are to actually blame.

We use Liquibase, which is event-sourcing for database schema changes, and a database that supports online schema changes.

> There is always some fixed cost to deploy. You should always, in principle, be driving down the fixed cost of deployment toward zero. Increasing deployment frequency without driving down that fixed cost is detrimental to the business and irresponsible.

Surely it's the incremental cost you want to drive down. If there were only a fixed cost, it wouldn't matter how often you did it.

Microsoft isn’t given enough credit for what they have done investing in ssdt it really is a game changer for sql developers

Yep; discovering - and using - SSDT to replace a 3-day manual migration with a half-hour automated one is still one of my "most proud moment" stories.

The premise of this article is just too bizarre for me - even the most inexperienced software engineer today surely can see the fallacy of claiming a SQL database prevents use of DevOps style practices.

Either the author A. Met a crazy person, or B. Invented the anecdote for sake of what they beilieved to be a snappy title.

context: I work as a DBA.

It is a bizarre premise, and its alive and well at an insane amount of shops today.

When I walked in the door at my last gig(and a few others), they were hand running all deployments, and often manually troubleshooting in production when it fucked up because they didn't have a QA environment.

Everyone has a QA environment. Some people have separate production environments.

I'm stealing this quote.

You and I may be defining "everyone" differently.

What the parent was getting at is if you don't have an environment devoted to QA specifically, your production environment is your QA environment.

I think you might have missed the joke...

No, it's the "QA environment" that you define differently.

There are a lot of crazy people in industry, unfortunately. Not as much at the big flashy tech startups or places you hear about in the news, but there are still plenty of 'work a day' businesses in corners of industry that do not have expertise in these areas that absolutely should.

I helped my current employer make this transition and it was similar to the description in the article. Controlling database change in source was foreign to them.

Absolutely B to sell the article

The technique described in the article would fail if you'd have instances of different versions of software each expecting different version of database to connect to in a distributed database environment. So let me tell you how I did it. So my client was running multiple venues each having a database with local records and the client (meaning the CEO, the company board, and middle managers too, so in total more then 100 people running the software on their laptop) would connect to one venue or even to multiple venues at once to get reports of what is going on. So downtime of the venues was out of the question.

Also some of them would have some other problems to solve and would fire up the software on their laptop even a month later, meaning the venues would already have different DB schema. So to solve this I've done the following: -each venue would have a table for SQL statements to upgrade the software local database. Also this would specify what schema No is current one. -each venue would have a table for software upgrades, that's it inside a blob field the software was there for that specific version. The software on client laptops would also have a local database, used for aggregating reports from multiple venues mainly, but also to automatically perform a venue DB schema upgrade if required: -a table with SQL statements to upgrade the connected venue schema DB to latest one. -a table the software upgrades, the same blob as described above.

So, when the CTO (usually he was the one receiving the latest and the greatest) would get a new software version from DevOps, he would basically receive it by connecting his software to my DevOps DB. The software on his laptop would see that it has lower number than the connected DB schema No, access the upgrading fields, get the new software version in a temporary file and restart the software. The restart process would upgrade the software from that temporary file, run the necessary SQL commands to modify the local DB as well, if it was the case and launch the software with the latest upgrade. Now the CTO would connect to multiple venues, usually by just asking for the daily report on those and for each connected venue the software would see the connected venues schema No are older. Would run the necessary SQL commands to upgrade those venues, pushing the software latest blob as well and then get the records to compile the actual report. When the other members would connect to those upgraded venues they would receive the latest software as well same as the CTO received from my DevOps one. This ripple effect of upgrading software and schema was done without a single second of downtime from those venues.

Also the venues did ran a local software on their own, which was also restarted and upgraded as soon as it was detecting the changes, in the same fashion as the software on the client laptops was upgraded. In practice, since the venues were scattered around US and Mexico from Boston all the way down to California and Mexico City, took around 30 seconds each, as they were all connected through VPN. So the most downtime was experienced by CTO for like 2 hours (~100 venues), and usually the guy would do this during lunch leaving the laptop to do this while he was attending other matters. The rest were only seeing the message "Upgrade in progress, please wait" for only 30 seconds and was very acceptable. Also the local venue software was less then 5 seconds with this message up, since it was connecting to a local DB, not over VPN.

I did this in 2010, and while I departed in 2015 from that company this methodology is still in use without hiccups today by them.

Ignoring the rather bonkers anecdote at the start of the article, I'd also argue against one of the other recommendations here, even though it is quite common in practice:

  > The command in array index n upgrades the schema from version n-1 to n.
  > Thus, no matter which version is found, the software can bring the
  > database to the required schema version. In fact, if an uninitialized
  > database is found (for example, in a testing environment),
  > it might loop through dozens of schema changes until it gets to the
  > newest version.
There are 3 major problems with traditional incremental migration strategies:

1. Once you've accumulated years of migrations, it's unnecessarily slow to bring up a new dev/test db. The tool is potentially running many ALTERs against each new table, vs a declarative approach which just runs a single CREATE with the desired final state.

2. The migration tool does not have a way of handling databases that aren't in one of the well-defined versioned states. This can happen easily on dev databases, where a developer just wants to experiment before landing on the final desired migration.

3. Most of these tools require a developer to code (in a specific programming language, or a proprietary DSL) the migration logic, as well as its rollback. There's plenty of room for human error there, especially since most of these tools don't automatically validate that the rollback actually correctly reverts the migration.

In contrast, a declarative approach simply tracks the final desired state (a repo containing a bunch of CREATE statements), and the tool figures out how to properly transition from any arbitrary state to the new desired state. This permits the state to be expressed in pure SQL, with no need for a DSL or custom programming, nor any need to define rollback logic.

A declarative tool can also be used in an automated reconciliation loop, which can handle situations like a master failing mid-schema-change in a sharded environment. This is analogous to other declarative tools, such as k8s.

There are several recent open source projects using the declarative approach for schema management. I'm the author of one, Skeema [1], for MySQL/MariaDB. Others include migra [2] for Postgres, and sqldef [3] for MySQL or Postgres.

Although declarative schema management is only recently picking up adoption, it isn't a new approach. Facebook has been handling their schema changes using a declarative approach for over 8 years now.

[1] https://github.com/skeema/skeema

[2] https://github.com/djrobstep/migra

[3] https://github.com/k0kubun/sqldef

1. This burden is only for new ground-up environments... you can turn to snapshots/backups at specific points and start from that instead of running all migrations from the start.

2. This is where having a ground-up script is helpful. If you can destroy/recreate a local dockerized db (as an example, if not specifically) that's far less of an issue.

3. Developers writing code against a SQL database, should understand the variant of SQL being used. This is also where things like code reviews can minimize the issues. I'd also suggest that issues would be caught in QA or UAT before production providing decent test coverage prior to production release.

As for your contrasting approach, what about the example in TFA where you go from a composite Name field into separate FirstName, LastName fields?

1. True, but it depends on the size of your data set. If you just want the tables but not the full prod data, backups may not be usable as-is for this purpose.

2. What if you want to capture the set of local dev changes, rather then destroying and starting over? Most migration tools that I've seen don't have diff logic built-in, so they have no way of saying "take whatever I did to the dev db, and capture that state in the filesystem, so that I can turn it into a pull request". Whereas in Skeema you can simply run `skeema pull development` and then `git diff` / commit / etc.

3. Agreed; but the more automated tooling, the better IMO. Otherwise it's easy for a code review to overlook something like the wrong table name or column name being used in a migration rollback, due to bad copypasta. And in an environment where schema changes occur quite frequently, I'd be surprised if QA always involves testing and confirming every single rollback.

re: your last question, that operation is a mix of DDL and DML... I'd agree that the declarative approach doesn't natively handle it, since DML isn't declarative. My personal advice as a database professional is to treat that example as a data migration, managed separately from pure schema changes. Migrations of this sort should be less common -- as well as riskier to execute due to deploy-order considerations.

FWIW, I'm approaching all this from a background of consumer-facing web/mobile tech companies, where schema changes need to occur quite often, and making them automated/self-service is game-changing. Processes/requirements will certainly be a bit different for traditional enterprise, finance, etc.

On a side note, it's really good to run into you here tracker1! It's been a long time. Are you still involved in the BBS scene?

On you fwiw, I understand. Different environments can have different challenges. Right now, I'm working on enterprise/corporate style software that is deployed into government organizations, where the versioning between Agency A and Agency B may be at different deployments and wanting to bring both to the current version prior to the next election cycle. Currently using SQL project in VS, which I'm not a big fan of, as I think upgrading could be very challenging. Fortunately there's a long window between necessary releases, and other challenges with legal requirements get interesting.

On BBSing, on several BBS groups via FaceBook... I've been pretty hands off, would like to get a board up. I put bbs.io on freedns.afraid.org for dyndns support, and have BBS.land if I ever get around to doing anything with it. It's hard to find motivation after hours to work on stuff. Also been wanting to get my board back up.

Didn't realize you open-sourced tournament trivia. Was recently thinking of reaching out recently to try and get a bigger doormud instance running.

Awesome, I'll send you an email soon! Don't want to go too far off-topic in this thread :)

I can describe how Django addresses the issues that you raise:

1. You can squash the migrations, or delete them all and start afresh. I have had to do this just once on a project after it was 4 years old, and it was easy to do (ie. it is relatively infrequent).

2. You can experiment by creating new migrations, and, when you’re done, rollback and create the one you want.

The third point I cannot address, as you were very much using Python/Django and I cannot imagine an easy way to use the database models cross-project. I would probably argue that you shouldn’t do this.

However, Django’s approach was largely declarative, but in Python not SQL. Further, the migrations were automatically generated as Python which made them easy to modify.

Sure, but what happens when your company gets big enough to have backend systems/projects/tools in multiple languages?

I've seen this at countless companies, and in my experience it gets ugly fast. You can either shoehorn all schema changes into Django (for example) even for non-Python projects, or you can have different processes for each language, but either way that's a losing proposition vs having a unified, language-agnostic tool.

The migration systems built into frameworks like Django and Rails are truly great for rapid development, but they can become a quite a burden later on.


Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact