
Data-Oriented Architecture - Eyas
https://blog.eyas.sh/2020/03/data-oriented-architecture/
======
jayd16
We jump through so many hoops to avoid this terrible pattern. It puts more
load on your DB and leaves you fewer scaling options. If you need to upgrade
how something is stored, you need to touch every service. And they didn't
really solve their problem.

If you can't keep track of what services call each other, what makes you think
you'll be able to keep track of who is writing out what data?

Reading how they did SOA (a spaghetti of services with highly coupled
interactions), I don't think putting that at the data layer is useful in the
long run. They even suggest many services operating on the same data as a way
to decorate the data as it haphazardly makes its way through this mess of an
app. Their original problem was a poorly planned separation of concerns
leading to cross-service coupling. In DOA they still have it. It's just
coupled directly through the datastore, with no hope of cleaning it up without
a massive migration.

~~~
kasey_junk
The simple answer to this question is that all mature data stores have
extremely sophisticated tooling for multiple concurrent access. Namespacing,
views, access rights, concurrency/locking, migrations, stored procedures, etc
are all things you end up poorly implementing at the app layer that you get
out of the box at the data layer.
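As a toy illustration of that out-of-the-box tooling (SQLite via Python, with made-up table and column names), a view can act as the kind of access-control and namespacing primitive described above: consumers get a stable, restricted interface while the underlying table stays private and free to change.

```python
import sqlite3

# A view as a data-layer "public API": the sensitive column stays private
# and the physical table can change without touching any consumer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, pw_hash TEXT)")
conn.execute("INSERT INTO users (email, pw_hash) VALUES ('a@example.com', 'secret')")
conn.execute("CREATE VIEW users_public AS SELECT id, email FROM users")

# Consumers query the view and never see pw_hash.
rows = conn.execute("SELECT * FROM users_public").fetchall()
print(rows)  # [(1, 'a@example.com')]
```

In a real RDBMS you would combine this with GRANTs so consumers can only read the view, not the base table.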

Every rdbms left on the stage was built with the assumption that lots of
disparate systems would be coordinating within them.

I’ve yet to see a SOA system with half the tooling to support this. That’s
before you get into the performance advantages.

If you’ve done a bad job coupling your data tier it’s because you didn’t
know/follow best practices we’ve had for at least 30 years. Don’t blame the
architecture for that.

~~~
zie
Agreed with kasey_junk.

Also, assuming you are sane and give every service/user/etc their own user
account in the DB, your DB logging will easily show that user X did action Y.

Databases these days scale very well. PostgreSQL, which is a great OSS
database, can scale very well out of the box on any single system, and x86
boxes are getting pretty giant these days (not to mention other platforms).
Plus there are loads of 3rd-party, but well supported, options for scaling
past a single instance.

There are other systems like FoundationDB, etc. that scale well out of the
box, but I'd argue most people don't actually need to scale that large. 99% of
us will never get to Google size, and one can get very far on a single DB
instance. Especially for new projects, scaling should be near the bottom of
your todo list until it starts to hurt, and then the general answer is: just
throw money at the problem. If you are having scaling problems, you'd better
not also be having money problems, or you likely have larger problems than
scaling your DB.

~~~
bcrosby95
You can accomplish this with a monolith too though. The service part is doing
no work. And a stateless monolith is even easier to scale than a database.

Monoliths certainly can reach their breaking point for other reasons. But they
can take you very, very far too. Depending upon your exact use case, a
stateless monolith, Postgresql, and something like Redis can handle millions
of daily users.

It's been a while since I've had to do this, about 10 years. The breaking
point for our use case was around 6 million daily users. I would imagine you
can take it further today, but haven't had the need to so I don't really know.

~~~
machinecoffee
I would say nowadays, from a scale-up perspective we've never had it better.

Multi-core CPUs are cheaper than ever, and terabytes of RAM are not uncommon.

The only problem is that to have an HA solution with these monster machines,
you need at least 2 of them, and maybe a similarly sized test system.

------
atomicity
I liked the clarity of the article and the reasoning is solid. Still, I'm not
too convinced:

\- A key reason for splitting up a monolith into services is that
collaboration becomes too costly with 100s to 1000s of developers working on
the same code. Your build & test system can't handle the number of commits.
Your code takes too long to build. Would the data-access layer not then become
the development bottleneck in such scenarios, meaning that it doesn't scale as
well as SOA?

\- What are the advantages and disadvantages of centralizing in the data-
oriented layer over centralizing through a network-related tool like Envoy,
Kubernetes, or Istio?

\- The database layer itself often becomes a performance bottleneck, which
requires us to run a sharded database. Some companies go the extra distance by
running in-memory databases, document databases, and time-series databases. In
such cases, wouldn't the data access layer need to support federation, which
is a hard problem according to database research?

\- Is O(N^2) really that big of a problem? It seems like the problem can be
reduced to something simpler: developers cannot easily understand which
services communicate with one another. If that is the case, would a
visualization tool be sufficient?

~~~
jeffffff
yes, at scale this turns the database into the bottleneck. this is not
strictly a downside though, as it means you can centralize ownership of your
database to one team of experts who can handle optimization, capacity
planning, sharding, multi-tenancy, security, monitoring, etc for everyone.
most product teams do not and should not need people with that expertise, so
if you have multiple products or services running into scalability issues this
approach can be a far more cost effective way of solving them than having each
product or service handle these issues independently.

while they do not use the term "data-oriented architecture", many of the
largest web companies use what is effectively this approach and have teams
dedicated to building and maintaining a shared data layer. some examples:

google - spanner

youtube - vitess, migrated to spanner

facebook - tao

uber - schemaless

dropbox - edgestore

twitter - manhattan

linkedin - espresso

notably absent is amazon. amazon has taken the full blown microservices
approach where anyone can do whatever they want. worth noting is that amazon
is in a very sad place when it comes to data warehousing and analyzing data
across teams/products/etc. while the shared database approach is strictly
intended for OLTP use cases and explicitly not meant for OLAP use cases,
having a common interface to all data and something approaching a data model
makes it extremely easy to replicate all your data out into a data warehouse
or data lake or whatever you want to call your system for your OLAP workloads.
with the 'every service has its own database' model, each team has to be
responsible for replicating their data to analytics systems, and that is
usually not super high on their priority list relative to product features.
this problem is magnified when people from a different team want to consume
data from that team's product/service but the team producing the data has no
incentive to make it available. in large organizations (including amazon) this
is a huge issue for teams who mostly do analysis, reporting, marketing, and
other activities where they primarily consume data produced by others.

~~~
pm90
Choosing an architectural design simply because it makes data warehousing
easier doesn’t seem like a good enough reason to me.

You give examples of all the Big Tech having such shared DBs but that seems
like more of a reason to not use that pattern. Good DBAs are hard to find and
not many people choose to become DBAs anymore. Big Tech can hire the
experienced ones since they can compensate them pretty well; most companies
can’t. The shared DB therefore becomes a critical bottleneck to the business.

~~~
jeffffff
beyond some fairly large size of company it's less that it makes data
warehousing easier and more that it makes centralized data warehousing
possible.

fortunately this type of environment is available today as a managed service
in a few different offerings. gcp has spanner and vitess is available as a
managed service on multiple cloud providers from planetscale.

~~~
closeparen
Centralized data warehousing is possible as long as you constrain the number
of distinct database _engines_ and provide connectors for those. Services
having private databases doesn't preclude data warehousing. It's why we _have_
data warehousing! To enable joins across data from different silos.

------
apalmer
This was the predominant architectural design about 20 years ago.

There are many strong points for this; the general challenges are...

Databases don't/didn't have too much in the way of integration primitives...

Databases can become a performance bottleneck that can only be scaled
vertically...

SQL language is not that great as an application programming language

EDIT: forgot the biggest one which is, it is completely up to discipline to
produce any kind of separation between implementation details and public api
since everything lives in the database. That's really the biggest challenge.

~~~
marcosdumay
> Databases don't/didn't have too much in the way of integration primitives...

Hum... They have the best and most diverse set of integration primitives
available. Services architectures (micro, SOA, and whatever) never actually
reached parity with them.

Your other points are good (DBMSes do scale horizontally, but it's neither
easy nor nice), agreed on everything. But they still don't beat the capacity
DBMSes have for integrating stuff in most applications, so this is still a
good paradigm.

~~~
apalmer
I am quite serious when I ask for more specifics on the types of integration
primitives you are talking about.
My experience has been the opposite here are some examples:

Number of times I had to implement or maintain hand rolled queues in the
database.

Number of times I had to implement a web service whose only purpose was to
expose database data to the world.

Number of ETL processes I wrote just to handle some daily data input from a
third party.

Number of times I had to use comparatively complex SQL techniques to iterate
over a list of rows, because SQL is intended for set-based operations, not
iterative processing.

Number of times I had to do tedious text templating to generate HTML or XML or
JSON or any kind of hierarchical data format that is easily consumable by a
non-database system.

That's what I am talking about.

Which isn't to say this is a bad architecture. Just that database as
integration platform has its challenges, a lot of them.
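For what it's worth, the hand-rolled-queue pattern mentioned above can be sketched roughly like this (a simplified SQLite stand-in with invented job names; production versions typically lean on engine-specific locking such as Postgres's `SELECT ... FOR UPDATE SKIP LOCKED` to let many workers claim jobs concurrently).

```python
import sqlite3

# Hand-rolled job queue in the database (simplified): workers atomically
# claim the oldest unclaimed row and mark it as theirs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, claimed INTEGER DEFAULT 0)")
conn.execute("INSERT INTO jobs (payload) VALUES ('send-email'), ('resize-image')")

def claim_next_job():
    # One transaction: select the oldest unclaimed job, then mark it claimed.
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE claimed = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE jobs SET claimed = 1 WHERE id = ?", (row[0],))
        return row

first, second, third = claim_next_job(), claim_next_job(), claim_next_job()
print(first, second, third)  # (1, 'send-email') (2, 'resize-image') None
```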

------
dathinab
It's in a certain way a very roundabout way to describe an event-based SOA
system.

Many of the problems described are problems from SOA systems which use a
number of common anti-patterns for SOA systems.

Also, the way this person describes DOA is prone to end up with a monolith in
_data_, where just the logic is not monolithic but might accidentally end up
quite tightly coupled. (Though it does have some benefits.)

Also, even with DOA you can end up with internal state coupling between
services if you do it wrong. It's harder than in some badly designed SOA
systems, but IMHO roughly as likely as in an SOA system communicating with
events.

\---

\- So use events if you do SOA (for inter-service communication)!

\- Never rely on the internal state of another service.

\- Make sure you don't send events to a specific service; instead "just" send
events, then all services interested _in that event_ will receive it. (Make
sure to subscribe to events independent of their source, not to services; e.g.
use an appropriate event broker or service mesh. Sometimes just broadcasting
is fine. Oh, and naturally, storing the event and making that trigger other
services works too, at which point we are back at DOA.)
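A minimal sketch of that broker style (all names hypothetical): publishers emit typed events, and subscribers register against the event type, never against a producing service.

```python
from collections import defaultdict

# Publishers emit events by type; subscribers register per event type.
# Neither side ever names the other, so there is no service-to-service coupling.
class EventBroker:
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)

broker = EventBroker()
received = []
broker.subscribe("order_created", lambda e: received.append(("billing", e)))
broker.subscribe("order_created", lambda e: received.append(("shipping", e)))
broker.publish("order_created", {"order_id": 42})
print(received)
```

The publisher of `order_created` has no idea (and no way to know) that billing and shipping are listening.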

\----

DOA is not bad, just IMHO misleading. If you want to use it, look at common
problems with event-based systems for vectors of potential problems wrt.
accidental internal state coupling, as many of these will apply to DOA, too
(if your system becomes complex enough).

Note that I don't mean all the problems caused by combining eventual
consistency + high horizontal scaling with event systems. Sadly this is just
very often mixed up.

~~~
meowface
What are some good articles/resources on event-based SOA? I've been hearing
about event-driven architectures a lot in the past few years.

~~~
dathinab
Honestly I'm not sure. I listened to some great talks on YouTube about event
sourcing, mainly with a scalability focus, but I don't remember by whom they
were. Most good articles I read about event sourcing for
traceability/replayability were pretty old even when I read them a few years
ago. Also, many articles I read were pretty messy wrt. which applications of
event sourcing help with which problems and have which consequences :=(

Give me a minute, I will try to find at least some of the sources, but don't
get your hopes up.

~~~
dathinab
I think that one was good:

\-
[https://www.youtube.com/watch?v=STKCRSUsyP0](https://www.youtube.com/watch?v=STKCRSUsyP0)

I wasn't able to find any other talk I watched back then or any of the stuff I
read, but the last time I looked into talks and reading material about this
was ~2.5 years ago, and while a lot of new software and tooling has been built
since then, the principles didn't change. Try some of the other GOTO; talks
about it if you like listening to talks, they tend to be quite good.

This was interesting as far as I remember but not what I was looking for:

\-
[https://www.youtube.com/watch?v=CZ3wIuvmHeM](https://www.youtube.com/watch?v=CZ3wIuvmHeM)

------
zmmmmm
I like this architecture, but I worry that in sufficiently complex systems it
can lead to significant complexity if there are entities within the database
that have a lot of contention around them. For example, it may end up
translating to deadlocks etc. when two services that are naive to each other
start locking tables in interleaved sequences of interactions. It still seems
wise, even if you follow this pattern, to assign areas of the database to the
management of particular services or consumers, and then have those present
APIs or message-passing interfaces to each other. Which leads you halfway back
to SOA or microservices.

~~~
Eyas
Yeah, totally. The way I've seen this work is that a single service will "own"
a record, so you never run into a multi-writer situation for a given record.

While this seems closer to SOA, the key difference here is that a single Type
or Table can still have multiple producers (of non-overlapping records). In a
trading system, you'd have producers of RFQs from marketplace A, B, C, etc.,
but for each single row in that table, the same service "owns" it. So you
still get the benefits of not caring about the DAG/callgraph or knowing about
the individual service that calls you.

Locking might be hard if you're doing some transactional change. Those become
harder to do. But the half-good news is that shifting to an "event-based"
programming mindset might mean you run into fewer of these.

But yeah, there's a whole new set of drawbacks.
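A rough sketch of that single-writer-per-record convention (SQLite, with an assumed `marketplace` column as the ownership tag): multiple producers write non-overlapping rows into one shared table, and consumers read it without addressing any producer.

```python
import sqlite3

# One shared table, many producers: each producer only ever writes rows
# tagged with its own marketplace, so every row has exactly one writer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rfqs (id INTEGER PRIMARY KEY, marketplace TEXT, symbol TEXT)")

def produce_rfq(marketplace, symbol):
    conn.execute("INSERT INTO rfqs (marketplace, symbol) VALUES (?, ?)", (marketplace, symbol))

produce_rfq("A", "XYZ")  # written by the service owning marketplace A
produce_rfq("B", "XYZ")  # written by the service owning marketplace B

# Consumers read the whole table without knowing which service wrote what.
rows = conn.execute("SELECT marketplace, symbol FROM rfqs ORDER BY id").fetchall()
print(rows)  # [('A', 'XYZ'), ('B', 'XYZ')]
```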

~~~
nine_k
But what's the point of keeping them in the same table then?

Won't having several independent (and maybe physically remote) tables, one per
service, solve the problem better? You can still `union all` them for
analytical purposes.

~~~
Eyas
Same schema, and the service doing the query doesn't need to worry about which
tables to union. (Problem with union all is that it reintroduces
addressability and a form of direct component interaction)

But sure, you can probably implement that with views etc. also.
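A sketch of the views approach (SQLite, invented table names): per-producer tables behind a union-all view, so consumers keep a single query target and never address an individual producer's table.

```python
import sqlite3

# Per-service tables behind a single view: the view keeps the "one query
# target, no producer addressing" property while storage stays separate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rfqs_a (id INTEGER, symbol TEXT)")
conn.execute("CREATE TABLE rfqs_b (id INTEGER, symbol TEXT)")
conn.execute("INSERT INTO rfqs_a VALUES (1, 'XYZ')")
conn.execute("INSERT INTO rfqs_b VALUES (2, 'ABC')")
conn.execute("CREATE VIEW rfqs AS SELECT * FROM rfqs_a UNION ALL SELECT * FROM rfqs_b")

# A consumer queries the view and cannot tell which table a row came from.
rows = conn.execute("SELECT * FROM rfqs ORDER BY id").fetchall()
print(rows)  # [(1, 'XYZ'), (2, 'ABC')]
```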

------
bencollier49
Hmm. Does this not turn the data access layer into a service bus?

This could get very messy at scale.

~~~
jayd16
This is the correct take. They just reinvented the ESB.

~~~
james_s_tayler
Sub that database graphic out for the Kafka logo and it sounds exactly like
everyone else's recommendations lately.

------
sadness2
Yikes. You haven't reduced inter-component communication. You've basically got
an interface distributed across all your components, and defined as a data
schema. Any change to this can now potentially break communication between any
two or more processes.

------
awinter-py
using schemas rather than APIs to share information between components seems
right

IMO one of the reasons CRUD is so hard now is that the schema is different at
every layer of the product

Slight differences between layers are necessary for permissions / privacy, but
there are probably better ways to get that done than to reimplement the schema
at every layer.

~~~
dathinab
I was thinking about this a lot recently, as in my experience the difference
in the API is often a major source of overhead (not necessarily complexity).

\- I think a major problem is the difference between the way you can lay out
things in the storage layer and in your application.

\- Another is where to evaluate correctness (e.g. in-service checks + DB-
system constraints, etc.).

\- Another major thing is that different actions/tasks take different slices
of the same data. One area where dynamically typed languages can have a clear
benefit.

\- Different actions/tasks in the same system work better with different
representations.

\---

What I currently think is helpful is to:

\- learn from Entity-Component Systems (from Games) for slicing data of the
same entity. (but can lead to problems with consistency across slices,
transactions can help if doable).

\- I would love to have a DB which can somehow do algebraic data types/sum
types/tagged union/rust enum (all different words for roughly the same
concept).

\- Be _very_ strict about preventing coupling of internal state, mix-up of
service responsibilities in logic/endpoints _and_ mix-up of these
responsibilities in _data_. Which e.g. means that you avoid any FK between
schemas owned by different services, even though it often seems useful at the
beginning.

\- Have a _system-wide_ schema for all entities, split up into small slices
which combined together form the entity, and only use those schemas in the
system, no service-specific schemas.

\- Specify this schema somewhere, preferably generate data types and similar
from it, preferably have some opt-in strict schema validation (enabled during
part of testing).

\- Use events for communication, use the schemas from above here, too.

\- Have a well-defined way to represent "patch"/"update" queries. I have run
too often into the JSON problem of "resetting/deleting" a value vs. just not
changing it (null value vs. field not given) and/or nested optionality.
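To illustrate that last point, one common workaround (Python, with hypothetical field names) is a sentinel that distinguishes "field not given in the patch" from "explicitly set to null":

```python
# A sentinel distinguishes "field absent from the patch" (leave it alone)
# from an explicit None (reset/clear the value) -- plain JSON-to-dict
# decoding cannot tell the two apart.
UNSET = object()

def apply_patch(record, *, email=UNSET, nickname=UNSET):
    patched = dict(record)
    if email is not UNSET:
        patched["email"] = email        # an explicit None means "clear it"
    if nickname is not UNSET:
        patched["nickname"] = nickname
    return patched

user = {"email": "a@example.com", "nickname": "al"}
cleared = apply_patch(user, nickname=None)        # clears nickname, keeps email
moved = apply_patch(user, email="b@example.com")  # changes email only
print(cleared, moved)
```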

~~~
dathinab
Also, depending on what you do, using event sourcing can be helpful, too. Or
it can be a big chunk of unneeded additional work. Be aware that there are two
ways to do event sourcing: one focused on traceability and maybe even
replayability, which has a choke point in that you have a single sequential
log, i.e. it doesn't scale horizontally. The other uses it to achieve huge
horizontal scalability, but at the cost of problems with (more) eventual
consistency and a very, very hard time getting system-wide transactions right.
(Best of both worlds is if you can shard your system into _independent_
subsystems and have a use case where you can rely on not needing too much
throughput per shard, so that you can go with sequential. E.g. a chat system
which can shard all messages per channel and user management per
workspace/group.)

------
tyingq
DOA isn't a great acronym :)

~~~
dathinab
Yes, it's _Domain_-oriented architecture, which has been a less well-known
thing for, I think, 10+ years ;=)

~~~
james_s_tayler
Dead On Arrival

------
hestefisk
Congrats, you just reinvented the monolithic database with a bunch of
functions on top. Everything old is new again.

------
janci
Would it be possible to make interactive systems with this architecture? The
requester needs to be notified about result availability, so some
communication must be targeted at a specific component (the requester), going
against the foundational idea of this pattern.

~~~
Eyas
I didn't cover it in the article, but the key to get interactivity would be to
switch to a datastore layer that supports subscribable queries.

I'm aware of Esper[1] which tries to do this. And maybe/arguably Firebase (?)

[1]: [http://www.espertech.com/esper/](http://www.espertech.com/esper/)
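As a toy sketch of what a subscribable query buys you (a hypothetical API, not Esper's): the requester registers a predicate with the datastore and is called back on matching writes, so no producer ever addresses it directly.

```python
# The requester subscribes to a query (a predicate over records), not to a
# service; the store calls back whenever a matching record is written.
class SubscribableStore:
    def __init__(self):
        self._records = []
        self._subscriptions = []  # (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self._subscriptions.append((predicate, callback))

    def write(self, record):
        self._records.append(record)
        for predicate, callback in self._subscriptions:
            if predicate(record):
                callback(record)

store = SubscribableStore()
results = []
# The requester waits for its result by subscribing to a query.
store.subscribe(lambda r: r.get("request_id") == 7, results.append)
store.write({"request_id": 7, "status": "done"})  # some producer writes the result
print(results)  # [{'request_id': 7, 'status': 'done'}]
```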

~~~
bradleyankrom
I believe RethinkDB also has this functionality, but I’m not 100% certain.
Will check the docs later and update...

------
ChicagoDave
This is nothing new and is generally how SOA began in the 2000s.

One of the critical improvements to SOA is DDD (Domain-Driven Design), where
context matters and boundaries should include the service and its data.

Data-oriented architecture is a bad idea. Period.

~~~
stilisstuk
Any good sources on DDD?

~~~
stevetodd
“Domain-Driven Design: Tackling Complexity in the Heart of Software“ by Eric
Evans seems to be the preeminent book on the subject.

~~~
dzikimarian
It's valuable, but very hard to read. I would recommend _Implementing
Domain-Driven Design_ by Vaughn Vernon first. Also, understanding event
storming helps a lot.

------
WesternStar
So in this scenario are we essentially reimplementing Apache Kafka? Producers
who create events that are then read by consumers and persisted on a database
layer. What am I missing?

~~~
Eyas
Not really reimplementing; Kafka is a great example of a technology you can
structure a DOA around.

~~~
james_s_tayler
Reinventing a less scalable version of Kafka/ESB is basically what this is.

------
discreteevent
A central message broker that intermediates between services also solves the
problems described in the "Problems of scale" paragraph. It might be worth
mentioning this.

------
fulafel
If you put your data in CSV files on your file system and had different
programs on the same server access those files.. would it be a data oriented
architecture?

------
dksidana
Another interesting point of this architecture is that you can plan your cache
really well.

------
throwawaygo
Cache rules everything around me.

------
thecleaner
So on a high level redux but for backend (Store-reducers-components) ?

------
lincpa
Industrial-grade Data-Oriented Architecture:

[The Pure Function Pipeline Data Flow v3.0 with Warehouse / Workshop
Model]([https://github.com/linpengcheng/PurefunctionPipelineDataflow](https://github.com/linpengcheng/PurefunctionPipelineDataflow))

1\. Perfectly defeat other messy and complex software engineering
methodologies in a simple and unified way.

2\. Realize the unification of software and hardware on the logical model.

3\. Achieve a leap in software production theory from the era of manual
workshops to the era of standardized production in large industries.

4\. The basics and the only way to `Software Design Automation (SDA)`, just
like `Electronic Design Automation (EDA)`.

