
Has anyone ever actually moved a mature application from one database to another and found that the code around calling the DB was a major pain point?

I'm all for the unit-testing argument, but in an active SaaS business I've never seen the hypothetical database change where a well-architected app makes it smooth. I have certainly moved databases before, but the changes in performance and semantics dwarf the call sites that need updating. Especially in Rust, where refactoring is quite straightforward thanks to the type system.




I strongly believe in this principle, but I've also seen colleagues try to future-proof the database (via interfaces) in the wrong place.

If your DBUserStore happens to know directly about SQL, that class is the wrong place to try to introduce flexibility around different DBs, ORMs, SQL dialects, etc. Just hard-code it to PostgresUserStore and be done with it.

Instead, put the interface one level up. Your PostgresUserStore is just a place to store and retrieve users, so it can implement a more general UserStore interface. Then UserStore could just be a HashMap or something for the purposes of unit tests.
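
To make that concrete, a minimal sketch (User and UserId are hypothetical domain types, assumed to exist elsewhere):

  import java.util.HashMap;
  import java.util.Map;
  import java.util.Optional;

  interface UserStore {
    void save(User user);
    Optional<User> findById(UserId id);
  }

  // Production adapter: all the SQL lives in here and nowhere else.
  // class PostgresUserStore implements UserStore { ... }

  // Unit-test double: the "database" is just a map.
  class InMemoryUserStore implements UserStore {
    private final Map<UserId, User> users = new HashMap<>();
    public void save(User user) { users.put(user.id(), user); }
    public Optional<User> findById(UserId id) {
      return Optional.ofNullable(users.get(id));
    }
  }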

Also, if you have some autowiring nonsense that's configured to assume "one service has one database", that's bad. Singletons used to be an anti-pattern, and now they've been promoted to an annotation and/or a basic building block of Java microservices.

When it comes time to live-migrate between databases, your service will need to stand up connections to both at once - two configurations, not one, so architect accordingly.
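
For example, a rough sketch of that wiring with plain JDBC (store classes and env var names illustrative):

  import java.sql.Connection;
  import java.sql.DriverManager;

  // Two fully independent configurations, both live at once.
  Connection oldDb = DriverManager.getConnection(System.getenv("OLD_DB_URL"));
  Connection newDb = DriverManager.getConnection(System.getenv("NEW_DB_URL"));

  UserStore primary = new PostgresUserStore(oldDb);  // still serving traffic
  UserStore shadow  = new MysqlUserStore(newDb);     // dual-write / backfill target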


> Singletons used to be an anti-pattern

Singleton was never the anti-pattern; it was the GoF implementation, where the class manages its own singleton instance, that everyone eventually ran from. Object lifecycles like Singleton are managed these days by a module system or a DI container.
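
Roughly, the shift is from the first shape below to the second (a minimal sketch):

  // GoF-style: the class polices its own instance count. This is the part
  // that aged badly (hidden global state, hard to test, hard to replace).
  class LegacyClock {
    private static final LegacyClock INSTANCE = new LegacyClock();
    private LegacyClock() {}
    static LegacyClock getInstance() { return INSTANCE; }
  }

  // Modern style: the class is ordinary; the composition root (or a DI
  // container scope) is what makes it a singleton.
  class Clock {}

  class Main {
    public static void main(String[] args) {
      Clock clock = new Clock();  // constructed exactly once, passed to whatever needs it
    }
  }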


My anecdotal experience tells me that it never works in a high-scale production environment. I've managed and led two teams that maintained legacy systems with hex-arch, and we had to move DBs in both. We ended up rewriting most of the application, as it was not suitable for the new DB schema and requirements.


Thanks for sharing. It matches my experience.

After many years on a lean team serving high-scale traffic (> 1 million monthly active users per engineer), most abstractions between customer and data seem to turn into performance liabilities. Any major change to either client behavior or the data model is very likely to require changes on the other side.

There's a lot to be said for just embracing the DB and putting it front and center. A favorite system we built was basically just Client -> RPC -> SQL. One client screen == one SQL query.
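
As a hypothetical sketch of one such handler (assumed orders table; the enclosing class is elided):

  import java.sql.*;
  import java.util.*;

  // One screen == one query: the entire "orders" page comes back in one round trip.
  List<Map<String, Object>> ordersScreen(Connection db, long userId) throws SQLException {
    String sql = "SELECT id, total, created_at FROM orders "
               + "WHERE user_id = ? ORDER BY created_at DESC LIMIT 50";
    try (PreparedStatement ps = db.prepareStatement(sql)) {
      ps.setLong(1, userId);
      try (ResultSet rs = ps.executeQuery()) {
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
          Map<String, Object> row = new HashMap<>();
          row.put("id", rs.getLong("id"));
          row.put("total", rs.getBigDecimal("total"));
          row.put("created_at", rs.getTimestamp("created_at"));
          rows.add(row);
        }
        return rows;
      }
    }
  }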


I have, back when we were selling a CRM product in the dotcom wave.

We could do AIX, HP-UX, Solaris, Windows NT/2000, or Red Hat Linux, with Oracle, Informix, DB2, Sybase SQL Server, Microsoft SQL Server, or Access (if you were feeling crazy, just for local dev).

It wasn't that the customers would switch database, or OS, rather the flexibility allowed us to easily adapt the product to customer needs, regardless of their setup.


That's a subtly different situation, as you've presented it here. In that case you knew up front which databases you needed to support, so you could explicitly design for them. One promise of Hexagonal Architecture is that you get the benefit of being able to move underlying stores without knowing in advance the precise products you might want to move to.

Depending on the early history of your product, that might be the same; or it might not. If you know from day one that you need to support two databases rather than one, that would be enough to cause design choices that you wouldn't otherwise make.


It was still a product, a different kind of product, but still a product being developed and sold in boxes (back when that was a thing).

Also, it wasn't as if we developed all those OS and database backends for version 1.0 and then did nothing else afterwards.

Which OSes and RDBMSes to support grew with the customer base, and each new one had to be plugged into the product in some fashion.


> If you know from day one that you need to support two databases rather than one, that would be enough to cause design choices that you wouldn't otherwise make.

I disagree (strongly in favour of DI / ports-and-adapters / hexagonal).

I don't want my tax-calculation logic to know about one database, let alone two!

Bad design:

  class TaxCalculator {
    PGConnection conn;
    TaxResult calculate(UserId userId) {..}
  }
Hypothetical straw-man "future-proof" design:

  class TaxCalculator {
    MagicalAbstractAnyDatabaseInterface conn;
    TaxResult calculate(UserId userId) {..}
  }
Actual better design:

  class TaxCalculator {
    UserFetcher userFetcher;
    PurchasesFetcher purchasesFetcher;
    TaxResult calculate(UserId userId) {..}
  }
I think a lot of commenters are looking at this interface stuff as writing more code paths to support more possible databases, per the middle example above. But I do the work to keep the database out of the TaxCalculator.
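
Spelled out, those fetchers are just narrow ports, and one adapter can implement several of them (a sketch; User, UserId, and Purchase are assumed domain types):

  import java.util.List;

  interface UserFetcher {
    User fetchUser(UserId userId);
  }

  interface PurchasesFetcher {
    List<Purchase> fetchPurchases(UserId userId);
  }

  // One adapter can satisfy both ports; swap in an in-memory version for
  // tests, or a different database later, without touching TaxCalculator.
  // class PostgresStore implements UserFetcher, PurchasesFetcher { ... }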


That sounds like a codebase that doesn't contain a single JOIN..


And that's fine.

If it's really big data, I imagine I'd be in some kind of scalable NoSQL situation.

If it's not so big, Postgres will comfortably handle my access patterns even without joins.

If it's in the sweet spot where I need Postgres JOINs but I don't need NoSQL, well then, refactoring will be a breeze. I'll turn:

  class TaxCalculator {
    UserFetcher userFetcher;
    PurchasesFetcher purchasesFetcher;
    TaxResult calculate(UserId userId) {..}
  }
into:

  class TaxCalculator {
    UserPurchasesFetcher userPurchasesFetcher;
    TaxResult calculate(UserId userId) {..}
  }
which is backed by JOINs inside. And I can do this refactoring in two mutually independent steps: I can make my Postgres class implement UserPurchasesFetcher without thinking about TaxCalculator, and vice versa.
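
A sketch of what that joined adapter might look like (the schema, the UserPurchases type, and the row mapper are all assumed):

  import java.sql.*;

  interface UserPurchasesFetcher {
    UserPurchases fetch(UserId userId);
  }

  // The JOIN lives in here, invisible to TaxCalculator.
  class PostgresUserPurchasesFetcher implements UserPurchasesFetcher {
    private final Connection db;
    PostgresUserPurchasesFetcher(Connection db) { this.db = db; }

    public UserPurchases fetch(UserId userId) {
      String sql = "SELECT u.id, u.name, p.amount FROM users u "
                 + "JOIN purchases p ON p.user_id = u.id WHERE u.id = ?";
      try (PreparedStatement ps = db.prepareStatement(sql)) {
        ps.setLong(1, userId.value());        // assumes a numeric UserId
        try (ResultSet rs = ps.executeQuery()) {
          return UserPurchases.fromRows(rs);  // hypothetical row mapper
        }
      } catch (SQLException e) {
        throw new RuntimeException(e);        // keep SQL out of the port's signature
      }
    }
  }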

And if it's about the data integrity that JOINs could notionally provide, I no longer believe in doing things that way. The universe doesn't begin and end within my Postgres instance. I need to be transacting across boundaries, using event sourcing, idempotency, eventual consistency and so forth.


Not advocating for or against, but having worked on systems like this: the joins here would happen inside the Fetchers.

That is, User is the domain object, which could be built from one or more database tables inside the Fetcher.
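
For instance (hypothetical schema and constructor):

  import java.sql.*;

  // The Fetcher hides that a domain User happens to span two tables.
  class JoiningUserFetcher implements UserFetcher {
    private final Connection db;
    JoiningUserFetcher(Connection db) { this.db = db; }

    public User fetchUser(UserId userId) {
      String sql = "SELECT u.name, a.city FROM users u "
                 + "JOIN addresses a ON a.user_id = u.id WHERE u.id = ?";
      try (PreparedStatement ps = db.prepareStatement(sql)) {
        ps.setLong(1, userId.value());
        try (ResultSet rs = ps.executeQuery()) {
          rs.next();
          return new User(rs.getString("name"), rs.getString("city"));  // hypothetical ctor
        }
      } catch (SQLException e) {
        throw new RuntimeException(e);
      }
    }
  }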



