The domain is really complex - credit card transactions have many variables that can make endpoints unhappy. Add internet payments (like PayPal) to the mix and you are in for a madhouse.
External systems are full of surprises - you get an error response to your payment request, but then the money suddenly appears in the merchant account. Or the API changes without notice. Or the test environment behaves differently from production. Or any other surprise that can come up when integrating with a third-party service.
Every transaction counts - you need to carefully consider which errors you can ignore and move on from, which ones mean you should try to cancel the transaction, and which are just fatal failures.
Logging, logging, logging - having good information in logs is half the battle in disputes (which come up regardless of how fault tolerant your system is).
Your clients come from all walks of life - you don't have any guarantees about the user agents - IE8 support anyone? Odd mobile browsers? Paranoid browser settings? All of that can cause havoc in your presentation layer and cause endless mysteries.
I am not sure if Erlang (or any platform for that matter) could help with such issues. The key is having a development team of technically competent people who care about the project. The rest is implementation details.
That is very true and is hard to do. I have seen people (including me in the past) pick the fun, easy to understand, easy to iterate on pieces first, and often push the harder things further down the road.
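To make the "every transaction counts" point above a bit more concrete, here is a minimal sketch in Python of an explicit error policy. The error codes and the `decide` function are made up for illustration; a real gateway defines its own codes, and the right action for each one is a business decision.

```python
from enum import Enum


class Action(Enum):
    IGNORE = "ignore"  # log it and move on
    CANCEL = "cancel"  # try to void or reverse the transaction
    FATAL = "fatal"    # stop and escalate to a human


# Hypothetical error codes, not taken from any real payment API.
POLICY = {
    "duplicate_notification": Action.IGNORE,
    "timeout_after_submit": Action.CANCEL,
    "invalid_merchant_config": Action.FATAL,
}


def decide(error_code: str) -> Action:
    """Look up the action for an error code.

    Unknown codes are treated as fatal, so a surprise from the
    external system is never silently ignored.
    """
    return POLICY.get(error_code, Action.FATAL)
```

The point of writing the table down is that the default is explicit: anything nobody thought about ends up in front of a person instead of being swallowed.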
> The problem with this method is that you probably don't know what the most difficult problem is when you start.
Often in distributed systems it is the fault scenarios: how the system handles partitioning, network errors and other funkiness like that.
Security is like that as well. Bolting those on at the end / later often doesn't work very well.
> Erlang was designed for building fault-tolerant systems
I often like to say that highly concurrent and distributed systems without fault tolerance become useless as the concurrency or scale increases.
If some service segfaults while it is handling millions of transactions, all those transactions stop and the service becomes unavailable. You don't need Erlang for that of course; you can use containers, monitoring systems, consistent key-value stores and such.
The use case for Erlang is also a business use case -- it means saving money by having a smaller ops team and a smaller development team. Developers are expensive, 24/7 ops are expensive. If your code is lines and lines of low-level code, in a language not designed for fault tolerance, that means weeks and months of debugging and working through issues, where in Erlang the problems can be expressed a lot more concisely.
I have seen subsystems crash and restart without affecting the core of the service. Nobody had to get up at 4am; the fix could wait till morning.
Hot code updates may sound like a gimmick, but combined with dynamic tracing they can save the day and help find and fix problems quicker, often with 0 downtime (I have seen it happen).
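For anyone who has not seen the crash-and-restart idea in practice, here is a rough Python analogy. It is not how Erlang/OTP supervisors are implemented (those restart isolated processes and have several restart strategies); `supervise` and `make_flaky` are hypothetical names, and the sketch only shows the basic loop of restarting a failed worker up to a limit.

```python
from typing import Callable


def supervise(worker: Callable[[], None], max_restarts: int = 3) -> int:
    """Run worker, restarting it when it raises.

    Returns the number of restarts that were needed. Gives up and
    raises once the restart limit is exceeded.
    """
    restarts = 0
    while True:
        try:
            worker()
            return restarts
        except Exception as exc:
            if restarts >= max_restarts:
                raise RuntimeError("restart limit reached") from exc
            restarts += 1


def make_flaky(failures: int) -> Callable[[], None]:
    """Build a worker that crashes a fixed number of times, then succeeds."""
    state = {"left": failures}

    def worker() -> None:
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("simulated crash")

    return worker
```

A worker that fails a couple of times gets restarted quietly; one that keeps failing hits the limit and the failure is passed up, which is the moment somebody actually needs to look at it.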
A little over a year ago, at Erlang Factory SF, Visa was poking around looking to hire Erlang developers too. Not sure what ultimately happened with that though.
The guy from Visa was at Erlang Factory again this year and, from what he told me, the project is chugging along.
Reconciling those external transactions is just one bar of the overall piece, but it's a doozy. As another post points out, you need a lot in place to work out what is wrong when things go wrong: logging, reporting, lots of fancy SQL queries to show what is happening.
Unless you have done one of these, you would think that once you have a fault-tolerant stack that implements a fast switch and pattern matching you are up and running. DK strikes again. Focussing on that stuff means you have not realised the hard bits yet.
Music is a good example. You play the piano well, and when you face a challenging piece you don't even know enough to understand what bits are hard. The master says, "look at this bit one hundred times over and you might get a feeling for what you don't know".
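As a toy illustration of what reconciliation means at its simplest, here is a Python sketch that compares your own records with what the external provider reports. The `reconcile` function and the record shape (transaction id mapped to an amount in cents) are assumptions for the example; real reconciliation also has to deal with timing, currencies, partial refunds and so on.

```python
def reconcile(ours: dict[str, int], theirs: dict[str, int]) -> dict[str, list[str]]:
    """Compare amounts in cents, keyed by transaction id.

    ours   -- what our own system recorded
    theirs -- what the external provider reports
    """
    shared = ours.keys() & theirs.keys()
    return {
        # We recorded it, the provider has never heard of it.
        "missing_externally": sorted(ours.keys() - theirs.keys()),
        # The provider has it, we do not (e.g. we got an error response
        # but the money showed up anyway).
        "missing_internally": sorted(theirs.keys() - ours.keys()),
        # Both sides have it but disagree on the amount.
        "amount_mismatch": sorted(tx for tx in shared if ours[tx] != theirs[tx]),
    }
```

Each of the three buckets is its own investigation, which is where the logs and reports come in.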
His observations at the end of the article are very important to anyone who wants to grow professionally, and even more important if you are responsible for assembling (and keeping) top teams.
In modern team building you need to strike a delicate balance between praise and criticism, not only with millennials as one would immediately assume, but also with very experienced engineers who may not be flexible enough to embrace change.
By embracing agility, rapid iteration and small tasks with a full feedback loop, it becomes easier to accept criticism, and you get many more opportunities to improve.