People always say this guy has just had bad luck with his employers, but I live in Melbourne, work in data, and reckon the whole industry is a scam.
Like why didn't anyone catch the issue with the logs? Because it doesn't matter, every data team is a cost-centre that unscrupulous managers use to launch their careers by saying they're big on AI. So nothing works, no-one cares it doesn't work, most of the data engineers are incapable of coding FizzBuzz, but it doesn't matter.
People always wonder why banks etc. use old mainframes. There's like a 0% success rate for new data projects. And that 0% includes projects which had launch parties etc. but where no-one ever used the data or noticed how broken it was. I don't think a lot of orgs which use data as core-infra could modernize; the industry is just so broken at this point I don't think we can do what we did 30 years ago.
Author here. I now know some places in Melbourne that have a good success rate on projects. Some of them are so small as to be invisible and rarely hire. One uses two specific independent recruiters or internal referrals. As far as I know, they are extremely profitable because the competition is a joke.
For many organizations, the success rate is indeed 0%. A Group of Eight university (our top 8 universities nationwide), for example, sent me a job description a few months ago where they misspelled the word "engineer" and left change tracking on in the Word document. This allowed me to walk through the profiles of the people running their data projects, and it was super obvious that many of the people involved weren't going to do a good job. They could have saved millions by having a random HN person eyeball the CVs of their chosen leadership team.
I think it all goes deeper, into the overall culture/attitude there.
I was in Melbourne in 2012, with the idea to relocate wholesale (for the 2nd time). Worked 2 months at some "startup" that fired me when I finished the task given. Seems it was cheaper to hire someone "permanent" and then fire them, rather than take someone on a 2-month contract. So that's one red light on the dashboard. There were other red lights from the overall "society", a feeling that something was wrong, but I ignored them for quite a while: people have become evil, etc.
Then I started going around places and mailing my CV here or there (with 22 years of experience making software by that time): IBM, Ernst & Young, you name it. To no avail, and more red lights flashed at me. And one day I visited some kind of meetup, organised/held at some well-known company. Seemingly it was a kind of "hiring" event: we grouped into teams of 3-5 people, half from the company and the other half outsiders, and went about solving some problems of theirs. Or that was the label. Any solution that any of the outsiders suggested was shot down, with somewhat vague reasons that in the end started to sound like "if we solve this there'll be no job tomorrow". And smiles :) Lots of smiles. Empty ones.
That was one of the last red lights on my dashboard. Whether it was a financial bubble pressing on everyone so they only smiled and did nothing, in order to pay the mortgage, or something else, I don't know. The next day I watched Sacrifice/1986/Offret by Tarkovsky, and bought a ticket out. Discontinued my Oz dream. For good.
Quoting myself, from 2007-8:
"with time, places change people. Other way happens noticeably only while coming in - or switching on."
I led a team on a large data project at an enormous bank, with hundreds of devs on the project across 3 continents. My team took care of the integration and automation of the SDLC process. We moved from several generations of ETL applications (9 applications), across Netezza/Teradata/mainframes/Hive MapReduce, all to Spark. The project was a huge cost savings and a great success, with massive risk reduction from getting these systems all under one roof. We found a lot of issues with the original data.

We automated the lineage generation, data quality, data integrity, etc. We developed a framework that made everything batteries-included. Transformations were done as a linear set of SQL steps, or a DAG of SQL steps if required. You could do more complicated things in reusable plugins if needed. We had a rock-solid old-school scheduler application too, and thousands of these jobs. We had an automated data comparison tool that catalogued old data and ran the original code vs the new code on the same set of data.

I don't think it's impossible to pull off, but it was a hard project for sure. Grew my career a ton.
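The comparison tool described above can be sketched roughly like this. This is a hypothetical illustration, not the bank's actual code; `compare_outputs` and the row format are invented for the example.

```python
# Hypothetical sketch of an old-vs-new data comparison harness:
# run both implementations against the same input snapshot,
# then diff the result sets by a business key.

def compare_outputs(old_rows, new_rows, key):
    """Index both result sets by `key` and report mismatches."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}

    missing = sorted(old_by_key.keys() - new_by_key.keys())  # rows the rewrite dropped
    extra = sorted(new_by_key.keys() - old_by_key.keys())    # rows the rewrite invented
    changed = {
        k: (old_by_key[k], new_by_key[k])
        for k in old_by_key.keys() & new_by_key.keys()
        if old_by_key[k] != new_by_key[k]
    }
    return {"missing": missing, "extra": extra, "changed": changed}
```

In a migration like this, every reported difference then has to be classified as either a bug in the old system (an agreed-upon change) or a bug in the new one.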
I know startups that hired data engineers, deployed warehouses, DBT, and a BI tool, and churned out hundreds of reports; in one case their DBT project has hundreds of files. No one in that company knew why any of it was used.
All said and done, the business users wanted three reports.
More often than not, data teams are self-serving more than anything else.
I think the difference is that technical and business leadership at a bank understand that data is lifeblood. Bad data will get you on the front page of WSJ and a phone call from a regulator in Luxembourg.
For a lot of smaller Internet companies, data is just a fluffer. The real business is in image and which VC bbq you get invited to.
Can you define fluffer as you use it here, and maybe mention where you picked it up? I haven't heard it used much outside of a specific and very notorious Sankey diagram.
Not the person you’re replying to, but I would expect that a near universal answer to this across all kinds of projects (not just software) is effective collaboration and communication between stakeholders and teams.
Despite no shortage of technical talent on large projects they can still often fail, and it’s because building a technically impressive thing doesn’t matter if it doesn’t do what business needs.
So it’s about making sure you’re building the “right” thing that delivers on business’s actual needs, and the only way to find out what those are is through constant and ongoing good communication between technical and business people.
The downside is that a lot of the work the business is doing is running around with wheelbarrows, and they actively sabotage it when someone wants to build a conveyor belt.
The flip side of this is that the stakeholder has to actually care enough to invest in collaboration and have enough bandwidth to be able to follow through.
The kind of communication that lets cross-functional projects be effective is time consuming, and competent people tend to be overworked, no matter what part of the business they’re in.
Specifically for the financial sector and especially banks and government tax departments, they’re on a clock.
As time moves on, there are fewer COBOL engineers. Hell, sometimes their systems have been written in a bespoke language. There is less and less understanding of why something is set up the way it is, due to lack of documentation. Updates/changes to the code sometimes have to wait 2-3 years because the system isn't flexible enough (literally, not as in "this change will take 2-3 dev-years"). Even code that old contains bugs, but due to the age of the code they're inscrutable.
However, whichever new system gets tooled up has to be 99.999% flawless, or it could cause serious damage to the bank and even its regional market.
When there is that kind of pressure, dev teams are no longer considered a cost sink, money flows, and anything is possible.
A large project where the end goal is replicating (and possibly correcting) existing data outputs is much more likely to succeed than one that is integrating new data sources or building new data models. For the latter type of project, it's very common to find that the team is disconnected from the business users and original motivation for the project, with poorly defined success criteria.
There were clear, large cost savings and risk improvements with the project. The project was actually easy on the requirements front: they put all new non-critical features on hold for 2 years, and there was no question about the requirements. The new system's data must match the old system's data, except for any bug fixes or agreed-upon changes.
Reluctantly worked on AU data projects for maybe the past decade. I don't classify myself as a data engineer; in fact I hate data engineering and data-related work, which is glorified ETL and SQL most of the time. They are the worst, most broken projects I've done, not software engineering in the software engineering sense. I don't think I've worked on a good one yet, despite the potential for them to be really interesting projects. I prefer general software projects, doing a bit of everything as a generalist; data pays the bills in AU though.
Not seen/heard of this person before but reading this specific blog post it all sounds very familiar, it's depressing.
The "CTO" getting on stage taking a bunch of credit while everything is a mess, incomplete, or a lie is very familiar. Maybe not the CTO, but higher management. It's all smoke, mirrors, optics, and self-promotion. It works: these people end up making their way up the ladder while the lonely dev trying to do better work is just a dispensable cog in the wheel.
> Not seen/heard of this person before but reading this specific blog post it all sounds very familiar, it's depressing.
Someone once told me that, as a form of therapy, he rewrote the app of the company he worked at in a few weekends. He never mentioned it to his coworkers; it was strictly a therapeutic effort. They had apparently spent years "fixing" things without making any progress.
Most apps are trivial for a decent dev to reproduce, I'd wager the root problem is rarely the codebase: the org is rotting. Years of 'fixes' with no progress is like blaming the water for sinking a ship.
Success attracts deadweight who (un)intentionally sandbag efforts to reverse this downward trend for their own self-preservation. I don't blame them, doubt there's a fix when the system requires most people work bullshit jobs instead of collecting UBI.
Bingo. The #1 thing I learned in consulting is that you can't build good software if the processes and structures are wrong in the first place. Ditto with off-the-shelf software.
Something that takes a week in company 1 can take a year in company 2 purely because of organizational issues.
Rotting organizations will produce rotting software.
> I don't think a lot of orgs which use data as core-infra could modernize
I argue this is a happy conclusion, not a problem to be solved.
What would "modern" bring to a bank except even more pain & suffering? Database technology invented in the 80-90s is more than sufficient for tracking information at the scale that 99% of financial institutions operate at today.
Virtually every core conversion project I've ever heard of has been a failure or is currently a burning wreck on its way to the bottom.
The only new bank projects that touch data and seem to succeed are LOB apps with highly curated experiences that are tightly integrated with the actual front/back office business. Having buy-in from staff regarding your UX is way more important than spinning out a 20 page AWS architecture diagram. The CTO can only take you so far through the vendor approval process at a bank. Retail operations (i.e. the people who are responsible for the brick & mortar branches) typically have substantially more pull in these organizations.
> What would "modern" bring to a bank except even more pain & suffering?
In the most simple term, a future.
Unless your bank is literally too big to fail, at some point you have to either move on from 80s technology or at least bring in an adaptation layer, because your profit centers have also moved on, or you're facing harder competition.
A typical example is banks getting merged: there will be a fight to see which system stays and which one disappears. If you froze your technology 4 decades ago it won't be your stack winning. [0]
Another is the evolution of legal frameworks: EU countries passed laws requiring interoperable APIs for standard banking operations. Being a customer of a decent bank versus a fossilized one made a huge difference, and the market grew a lot more competitive. People would start hedging their bets when legacy banks looked too far behind.
[0] The most interesting and recent example of this is Mizuho bank just miserably failing at that task to the point the gov. intervened and anyone not married to them probably moved out.
> A typical example is banks getting merged: there will be a fight to see which system stays and which one disappears. If you froze your technology 4 decades ago it won't be your stack winning. [0]
In my experience (small/mid-size US banks), the institution with more assets or branches usually wins. It rarely has anything to do with technology. If a 6 region, 200 branch monster comes in and wants to buy some 4 branch relic in the West Texas desert, it doesn't matter if the smaller institution has achieved AGI and an intergalactic core platform. They're almost inevitably gonna be merging their records into some old boring IBM system.
The landscape is a little different over in Australia. Most of the Big Four are closing as many branches as they can. Branches are no longer a mover or shaker, because most Australians never touch cash anymore. [0] Most transactions are digital.
Almost as many people pay with card as with phone.
Faster record systems, faster transfers, actually do win people over here.
I welcome the day when the US stops devoting enormous amounts of useful real estate to bank branches. They are a sad simulacrum of actual street life, taking up tons of space to advertise a bank and contributing to high rents that preclude less-profitable small businesses. One step up from billboards.
I think it depends on why they're merging. If the goal is just to increase size, as you point out doing it at the lowest cost will be the only POV.
If they're doing it for more strategic purposes, the calculation becomes more complex and there will be more "reverse" acquisitions where the entity closer to the target is prioritized.
I tossed out my credit card because the UX was bad. At this point most of the CC services are utilities or commodities. Just get another at a bank with better apps and website.
Yup, I'm currently moving to a new primary checking account because I'm sick of my local credit union, which is apparently so incompetent they can't handle sending email alerts correctly. Also, any bank or credit card that won't support Plaid seems not even worth considering at this point.
Had to look up what Plaid was. Think I'd prefer FedNow support and/or Aus/NZ-style modern banking; that's future-proof. I see no reason for a third party to be involved.
It sucks, but it's the only service anyone ever uses. Doesn't really matter what I prefer when every financial service I want to use offers Plaid or nothing.
Mizuho is doing great, they're probably the least awful of the big Japanese banks. Everywhere is like this, and "old" technology doesn't seem to make the places that use it appreciably worse.
Mizuho Finance as a group is doing fine, partly because their main business is not consumers, and companies can't just leave their main bank on a whim.
And also partly because it's the third biggest bank of Japan and it wouldn't be allowed to not be doing fine ("too big to fail" doesn't even start to describe the impact of a group this size starting to go down)
Do they do "great"? Arguably no. They shut down a number of consumer-facing locations, had a hard time recruiting, and compared to Mitsubishi and Mitsui the gap has kept widening. In their small/middle-customer businesses they're starting to face the rise of GMO and Rakuten, where the other two are too far ahead to even need to care about it.
> Do they do "great"? Arguably no. They shut down a number of consumer-facing locations, had a hard time recruiting, and compared to Mitsubishi and Mitsui the gap has kept widening. In their small/middle-customer businesses they're starting to face the rise of GMO and Rakuten, where the other two are too far ahead to even need to care about it.
The last year I can find figures for has Mitsubishi's assets at -5.46%, SM at -4.14%, and Mizuho at -2.03% (so yes, a decline in absolute terms, but that's the best performance in the top 10, and it sounds like closing the gap with Mitsubishi and SM rather than widening it). I can't find a branch count, but Mizuho has vastly more ATMs (around 4x as many as Mitsubishi) and that number is increasing. They've continued to make consumer-facing improvements recently, like their wallet app allowing electronic money payment directly from your bank account, and English support in their main app. Of course, like all of the big banks, they're facing competition from the rise of the net banks, but as far as I can see they're doing as well as any of the big four, perhaps better.
Yes, please, fix the UX. That was my biggest gripe, working as an FSR at the bank.
One particular thing was we had to convert transit #s into branch numbers regularly. We did this by looking at a sheet of paper of course. Eventually I got fed up and wrote a web app so you could just punch in the numbers and have it instantly convert. I checked and people are still using it 10 years after I quit, which means nothing has changed and they're still using the same god awful software.
They did move some data at some point. I know this because they screwed that up too and partially merged my mom's and my bank accounts, which is a pretty bad error. Would be worse if it was some rando. Speaking of... That's exactly what AT&T did.
>What would "modern" bring to a bank except even more pain & suffering?
It probably depends on what "modern" means here.
If it's updating from tons of COBOL to {Julia, Python, Rust, or some other well-known language}, with a move to an SQLite backend (or perhaps Postgres for very specific scenarios), that is likely a good choice: you get to fix old cruft and add maintainability for the future.
If it's a switch to some NoSQL database backend, with everything rewritten in some Cypher-based lang or anything that touches JavaScript in any way, it's probably a mistake.
Ten years ago data engineering was another discipline in software engineering, like backend or frontend. Somewhere along the line the term was co-opted by “I can maybe barely string together some untested airflow pipelines” and it means something much different now.
It is said that every major, still-living COBOL program contains a bespoke (sometimes poorly optimized) database engine with no standard query language; the only query tool was more program code. Perhaps the longevity of mainframes points to some wizardry/safety we had when databases were defined entirely as COBOL internals, and lost in standardizing them: standard databases gave people the impression that data itself was standard, plus too many tools to footgun data into foot pain.
(Not that we haven't gained a lot from modern database tools, just something to think about that maybe the data siloes were good sometimes, too.)
This jibes with my experience at a financial services company. I once sat next to the "big data team", and the company's 5-year plan was all about delivering analytics and AI to customers using the data the company housed.
The team consisted of one guy (who had a business degree) and a lot of empty cubes they were trying to fill. A year later the company had been acquired and the big data initiative had evaporated.
I have felt exactly this on regular full stack teams many, many times, so it’s also not just limited to data teams.
IMO a major factor here is that software engineering is both opaque and esoteric - at least with physical engineering, there’s something people can look at and think they understand.
My theory is that data is worse again because at least if you're making a website you're expected to end up with a website. The process is opaque and esoteric, but the end-product is somewhat tangible.
A lot of data projects are moving and transforming data no-one cares about. They can fail completely silently: a manager can lie and say 'we've successfully built the data platform which is going to enable AI analytics' when the reality is a misconfigured S3 bucket or something. No-one's checking the end-product or even understands what it's meant to be.
Excellent point… and one I should know. I spent about 6 months as a data eng (the only one at the startup), and long after, found out no one ever had a clue what I was talking about in standup. (To be fair, I was self-teaching, and no one else knew anything, so.)
I have seen data work well, but it only worked well in a situation where we had management focusing on two very tangible things that even the CEO could verify (since the CEO did know the product). Those tangible things were:
1. Accurate, auditable billing down to the per-chargeable entity/event
2. Dashboards for each customer that reflect THE SAME numbers as we generate in #1, so customers could see relevant info quicker than just waiting for the bill from #1
The only reason those things were valued and made a focus though was because a HUGE customer threatened to completely drop our company because that customer did an audit and noticed that we had overcharged them 3% because we were actually billing them on estimated numbers. That led to our CEO being personally yelled at by a much larger CEO, and our CEO (to his credit) didn't blame us (we'd raised the alarms that the bills were estimates and not auditable) but did say "this can never happen again, I trust you, do whatever needs to happen to make sure this never happens again."
And once we had #1 solid and tight, we were able to leverage that solid auditable data to generate solid dashboard numbers that always squared with what showed on the bill.
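The relationship described above (auditable billing as the source of truth, dashboards that always square with it) can be sketched as a reconciliation check. All names here are hypothetical; this shows the shape of the idea, not the actual system:

```python
from collections import defaultdict

def billing_totals(events):
    """Aggregate auditable billing events per customer (the #1 numbers)."""
    totals = defaultdict(float)
    for e in events:
        totals[e["customer"]] += e["amount"]
    return dict(totals)

def reconcile(billing, dashboard):
    """Return customers whose dashboard figure diverges from the bill.

    An empty result means the dashboard (#2) squares with billing (#1).
    """
    return {
        c: (billing.get(c, 0.0), dashboard.get(c, 0.0))
        for c in billing.keys() | dashboard.keys()
        if abs(billing.get(c, 0.0) - dashboard.get(c, 0.0)) > 1e-9
    }
```

The design point is that the dashboard numbers are derived from the same audited records as the bill, and the reconciliation runs as a safety net so any divergence is caught before a customer audit does.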
The problem is that most orgs seem to do the wrong thing, because the incentives of the higher-ups don't align with what is good for the org.
E.g. if you are a bank, ideally you'd like all your processes automated and streamlined, with extremely transparent data flows etc., and you want as many of the bank's employees as possible to be proficient in these systems and constantly working on improving them within a controlled environment.
In practice this is not the kind of thing that allows single managers to come across as heroes, so it doesn't happen that way, and you get island solutions with duct-taped connections between them.
I also get to spend a lot of time with executives thanks to the blog's success, and part of it isn't just incentives, it's pure confusion. People have no idea what they're buying.
I also get invitations to "sponsor events" now, since people see "director" on LinkedIn and think I have way more money than I do. Their business model seems to be flattering executives by inviting them to events where they can network with other rich people, then asking me for "sponsorship" money so that I can go into the room and brainwash them with my marketing material. I might even try it at some point to see if that's an accurate read.
>I also get to spend a lot of time with executives thanks to the blog's success, and part of it isn't just incentives, it's pure confusion. People have no idea what they're buying.
So, 43 years later and Putt's law is alive and well.
Putt's Law: "Technology is dominated by two types of people, those who understand what they do not manage and those who manage what they do not understand."
From the book Putt's Law and the Successful Technocrat, published in 1981. An updated edition, subtitled How to Win in the Information Age, was published by Wiley-IEEE Press in 2006.
We're bootstrapped so we unfortunately can't fling money at sponsorships. Or rather, we can, but it would cut into our runway quite a bit, and we have more promising avenues to pursue. If we acquire bad clients, the type that would let themselves be brainwashed and who status-seek by attending these events, that's just going to be as dumb as a regular job but without the luxuries afforded to employees.
100% agree with most of this. Most large organisations in Australia are just clueless about their needs and ride hype cycles with execs acting on FOMO. And they bought the lie that you need all these separate, distinct engineers who are professionals in a niche. They think they are building a high-end kitchen by having specialists, and think they are hiring sous-chefs etc., but most of the time they are hiring line cooks, and in fact most of the time they should just be hiring short-order cooks.
Most teams and businesses I know of that are doing great things with data are mostly smaller companies and tech-driven start-ups, and they hire one of two types of people: generalists who can fill the gaps and are interested in everything, and staff software engineers who are amazing developers and are asked to put together data pipelines with consideration of the whole infrastructure.
When I was looking for work, all the good companies were not hiring data engineers or machine learning engineers. They were hiring Senior Software Engineers with a remit to build data infrastructure, or to build machine learning models. And that immediately removes 90% of the noise applicants.
Melbourne is easily the worst city in the country for this. Most of the tech sector is in the very large enterprise space led by the banks, and as a result it's who you know, and whether you went to Melbourne Grammar or Geelong Grammar, that determines which company you work for once you reach a certain level. Sydney is better just because there's more smaller stuff going on, and because CBA is better than NAB and ANZ combined on tech. (I hate Sydney otherwise and am based out of Melbourne.)
Some places in Melbourne get real work done, even in the data sector. They're hard to find, but they exist.
> it doesn't matter, every data team is a cost-centre that unscrupulous managers use to launch their careers by saying they're big on AI. So nothing works, no-one cares it doesn't work
Yes. Lots of times the most important asset for these companies is actually contractual obligations in terms of exclusive access to data or customers. It doesn’t matter if the product works, you’ll have to buy the company to build a different one that does. But the (broken or nonsensical) product pushes up the value of mergers and acquisitions. If leadership completely makes shit up then they might go to jail, so, they burn X million on “work” and cloud spend as part of an elaborate argument that it should sell for 10X.
> the industry is just so broken at this point I don't think we can do what we did 30 years ago.
Well no, it’s never been easier to do high quality engineering, but mbas are in charge. They don’t think like philosophers or scientists and don’t traffic in common sense.
For anyone questioning their life / career choices because of this, it’s not about you. An individual working in an environment like this can still be a craftsman of integrity if they focus on small problems and solve them well, but you need to be able to get satisfaction from that, not from some overall mission (which again, is probably fake). If you’re most motivated to work directly on architecture, unification, etc, and want to change lots of things then you will probably be miserable.
But if you’re feeling shitty about the whole thing, it might help to realize that the actual nuts and bolts of adtech/martech data pipelines are much the same as the ones for cancer research or particle physics or climate science, so one can at least try and get transferable skills if circumstances are currently holding you hostage. Data isn’t a bullshit job. Leadership and management that just want to play games is the problem.
Agree. I can tell you at least one Melbourne-based Flybuys retailer calculates your points with an unholy daily-scheduled stored procedure in Snowflake SQL, because big-business dysfunction reasons led to the data team being assigned the job, and the data team didn't actually have any software engineer roles in it.
>Because it doesn't matter, every data team is a cost-centre that unscrupulous managers use to launch their careers by saying they're big on AI.
If you have lots of data flowing and have full teams "working" on it 24/7, does it really matter if that data is junk and is not processed in a meaningful way? You can still ask AI to generate some nice-looking charts with big numbers to show to investors. Investors like nice charts and big numbers. Or so some business people think.
But in all reality, the investors will ask questions like: how will this solve problems for customers, how do you intend to sell this to customers, how much does it cost, how will this make me money? Unless those investors plan an early exit by finding other investors more gullible than them, kind of like knowingly investing early in a Ponzi scheme.
Do you think perhaps the problem is rooted in people being dishonest, and honest people are driven mad by it all and self-select out? The dead-sea effect?
It's so many things. Dishonesty, lack of technical competence, political pressure, hype, organization structure, and incentives.
If I had to summarize though, it's that the median performance in any field will be at much lower levels than outsiders expect, and some fields with hazier results have this level set very, very, very low, especially when they're hyped up. But also that the market is actually at least a little bit efficient, but over long time scales. I think there's a 50%+ chance that the role of Chief Data Officer begins to die off, but also that it'll be replaced by something silly.
I do not use this term to refer to myself. I respect those who do and respect the meaning behind it but am just old enough that it feels alien to me 99% of the time.
But I am SO triggered by this piece. I had that intrusive feeling you sometimes get when driving, where you think, "I could just close my eyes and see what happens", or "That cliff is so close and the guardrail doesn't really extend far enough"
Only for my career. Like I should just not show up on Monday. I should get in the car and drive far away and change my name and work at a nice retail joint in a mid-sized town.
I'm going to need to sit and stare into the distance for an hour or three.
It's an almost exact copy of my last few months, right down to the 10am start.
Except that all our other senior engineers got laid off and there's nobody to pair with. I don't give two fucks about bullying, because at this point the entire company knows I'll quit on the spot if they try, and our problems are mostly that the remaining team cannot understand the terrifying eldritch decision-making process that led to fun little patterns like "wrap every API call in a try/catch and then ignore the errors".
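For illustration, the pattern described above looks something like this, sketched in Python. The original codebase isn't shown here; these function names are invented:

```python
import logging

# The antipattern: every call wrapped, every error silently discarded.
# The caller can no longer distinguish "the call failed" from "no result".
def fetch_user_swallowed(api_call, user_id):
    try:
        return api_call(user_id)
    except Exception:
        return None  # error vanishes; debugging this later is misery

# A minimally better shape: catch only the failure you can actually
# handle, record it, and let everything else propagate to the caller.
def fetch_user(api_call, user_id):
    try:
        return api_call(user_id)
    except ConnectionError:
        logging.warning("transient failure for user %s, retrying once", user_id)
        return api_call(user_id)
```

The second version still isn't great error handling, but at least failures leave a trace instead of disappearing.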
I am seriously considering doing a TAFE course and becoming an electrician.
I wish the abominations software engineers create were as regulated and fixable as a bad wiring job. I would feel absolutely chuffed to work in an industry with licensed inspectors and standards bodies.
I am currently dealing with a system involving four separate serverless functions that call each other. There's no reason whatsoever why any of them need to be network calls. The fourth function just calls the first function again. One is in a different region for no discernible reason.
> There has been a point in my life where I ended every day in the dark, staring at a wall for an hour or two straight, trying to figure out why everything felt awful.
From his post about burnout and mental health. Also worth a read.
It took me about six months off to start feeling normal, and I think I got out much earlier than most people do. And if you read that post, I still clearly let it get pretty bad before I left.
i don't buy that any situation is so hopeless, you're powerless to improve it. at least in the context of this field and its line(s) of work.
sounds a lot more like learned hopelessness making it harder to respond to stress with radical change because of (normal and human) fears of the unknown.
at some point though responsibility for the circumstances, the feelings, the stress -- the good, bad, and ugly or easy, hard, and nearly impossible -- has to be taken.
there's only one life to live. we owe it to ourselves and others to do more than -- to try not to -- just "roll over and play dead", so to speak.
humans have survived a lot and have adapted to just as much if not more.
if i had ever allowed myself to stay at any of my former jobs, back when i was paycheck to paycheck, not making rent, or just flat-out broke and homeless, i would not have progressed my career, or life, in any meaningful way, and would just have fed the negative feedback loop behind what feels like a miserable existence (even privileged as it were).
can't hold myself hostage. and also, i can't hold those around me hostage as consequence of my non-action, either.
> I had that intrusive feeling you sometimes get when driving where you think, "I could just close my eyes and see what happens", "Or that cliff is so close and the guardrail doesn't really extend far enough"
Does the mention of such concepts, or acknowledging it is real ... put some listeners (if they work in certain professions) under an obligation to refer the person for a mental health assessment?
Seriously, quit then. It’s not worth it. You get one life. How many hours on this earth do you want to spend suicidally depressed? If you have a really high pain tolerance, maybe you can do that for years. How lucky would that be?
There’s a Polish restaurant near where I live that makes amazing food. The owner is always out and about, chatting with customers and making sly jokes. Turns out he used to be an Oracle SQL consultant of some sort, and he turned it all in to run his restaurant. You can tell he’s thriving. I think he’s got the right idea.
I hear you. But also, ... if you're literally feeling suicidal because of work, in a sense it really is that simple. You aren't doing anyone any favours - not your coworkers, your family or yourself - by living like that.
> I've even degraded team morale because I've convinced some of the engineers that things should be better, but not management, so now some of the engineers are upset.
I have this illusion in my head that I stayed so long at my last company that almost all of my favorite people left, but one of my coworkers had my number.
After a person I liaised with on another team left, I asked his superior if there was someone else I should build bridges with. We started talking about one of the team members and he said, “I don’t want you to talk to him. We like him, and if you talk to him he’ll leave.”
This was on Slack so I don’t know if this was a jest or he was serious/mad. But it’s entirely true. I’ve convinced at least half a dozen people that we should expect better from a team environment and ourselves, and that this org (not the whole company, just this division) is a cult of stupidity.
I was trying to recruit collaborators to fix the bullshit but apparently they decided it would be much easier to just start over.
There are some finer points to parse in your comment that I'm not 100% on (ex. whether "this org" is your division or the partner division), so I'm out on a ledge a little bit here; this might not relate to what you meant.
I was lucky enough to get ~6 years running my own tech company after 6 years as a waiter. Then I sold it, yadda yadda, went to Google 6 months later, got ~7 years in there.
It really, really, really disturbed me how approximately every situation, in every division, with any people, ended up boiling down to "how do we muddle through one more day without challenging anyone's preconceptions". 95% of the time it was tribal antisocial stuff, and no one would speak up about it.
Direct example, for posterity.
I don't wanna speak too directly to it, so lets imagine Google Division A (hereafter, dApps).
New division lead (ex-dApps) joins dBytes with an apparent bias against the partnering division (dConsumer). Despite the project being previously framed as top priority, the new lead consistently undermines dConsumer in meetings and shows little interest in understanding their work. The team adopts the leader's negative attitude, becoming obstructive and uncooperative. I ended up carrying a critical launch, virtually alone, for 6 months. At performance review time, my boss questions why I didn't get more team involvement - despite the hostile environment that prevented exactly that - and speaks glowingly about how we need to support a peer going for promotion based on their excellent job on part Y...which they didn't do. They spent 2 days on it, then said it was impossible. And they were definitively the most cooperative, because at least they tried, and wouldn't actively be aggressive in meetings with the outgroup.
Everything, always, came down to: A) don't cause conflict at all, at home, or you will be buried B) we'll bend over backwards to accommodate conflict you invent, as long as we can clearly define them as an out-group with 0 ability to affect us day to day.
At prior jobs we had an escape hatch for this: go to a fancy coffee shop with the dissenters and have all of our bitchfests out of earshot of the muggles.
But it’s trickier to coax people still on the fence to come out for multiple coffees.
The good news is that, since the others are also looking for work elsewhere, there will be more engineers out in gen pop that actually think tests are useful, hah.
> At two of the four businesses I've worked at, the most highly-performing engineers have resorted to something that I think of as Pain Zone navigation. It's the practice of never working unless pair programming [...] The fear and dread comes from a culture where people feel bad that they can't work quickly enough in the terrible codebase
Exactly why I burned out at work, worked at most 2 hours per day on a good day, and finally was ejected from the project after a PM who graduated from school last year noticed and went after my head. The author is a wizard for describing the situation this well.
It's been 3 days since I got free from the tyranny of Jira and project managers, and I've already worked more on my personal projects than I did in a week at my former workplace.
But on the other hand, the same sentence could be written about software deployed to traditional servers. "Because of course, how can you hurt yourself without the joys of badly configured servers?".
You can hurt yourself with a badly held butter knife, and you can hurt yourself juggling katanas. Which situation would get people saying you're crazy?
Well, if you narrow down the metaphor to just knives, a dull knife is more dangerous to a chef than a sharp knife, because you need to apply more pressure and you get less control over the cutting action.
Dull knives are dangerous to most people, not just chefs. Most beginner's cooking books/lessons will tell you to keep your knives very sharp, because dull knives are dangerous (for the reasons explained by grandparent comment).
This affects amateurs just as much (if not more) as experienced chefs.
Which of the two, "going serverless" or "managing your own servers" would you say is unequivocally like juggling katanas?
I don't think the analogy is very good, since juggling katanas is always a crazy idea, while choosing whether to go serverless or not is always a respectable discussion.
I understood the pun about the "cutting-edge" cutting you, I just went deeper than the joke to note many hurt themselves by not going serverless when they should have, and that server maintenance/configuration often becomes a mismanaged nightmare.
Do you mean figuratively that OP is replacing Cobol? Because I don't see that in the article. It mentions other technology that I would not associate with a super-conservative stack - like Databricks, JSON, Postgres and Google Analytics. So I'm a bit confused by your comment. And by all the downvotes, honestly.
I just pointed out that personally I would not consider Lambda - which has been a stable and popular technology for 10 years - to be cutting-edge. It's not old but also not cutting-edge imo. I would reserve that term to newer technology. Apparently a controversial view on HN, which is interesting.
To respond to your question, I did work for a bank in 2017 on moving certain burst-type processing to a set of Lambdas.
I worked for a company that went all in on Lambda as well. The knots they had to twist themselves into so that everything ran nice and smooth in the Lambda environment were mind-boggling. We had certain actions, like orders, that would pass through 8 Lambdas before completion: execution time limits, or just the big codebase producing 7-second startup times (Node), meant things kept getting broken down. If any of them failed, and it felt like they failed a ton due to Amazon backend stuff, it was a nightmare to resolve.
All of it could probably have been handled by a larger Node application in a Docker container somewhere, but AUTO SCALING, FAILOVER, SERVERLESS!
Once I started as an SRE for a new team, we built a larger monolith using Node and Docker on EC2. We would get massive compliments for our uptime and reliability, but some architects were extremely unhappy when I revealed in a division presentation that it was just Docker + m4.xlarge running Ubuntu 18.04. When I left, more and more Lambdas were being broken down into Docker running on EC2. They are probably on some managed container solution now.
I’m going to read the rest of this. I’m enjoying it. But, simultaneously, part II has me so triggered - it bears striking resemblance to repeated situations I’ve encountered where the meaning and content of columns in a relational database were overloaded in varying degrees of heaviness (which is a practice I absolutely detest) - that I need to take a short break.
I manage a database for a small local charity. I have set it up so that only I can add, delete or change the column structure. If someone wants a change, they have to email me and convince me (they are fine about this BTW). I'm sure the database would be an utter disaster zone by now if everyone was allowed to change it.
I think database schemas deserve to be protected with one’s life as the holy ground of the system. If the schema is fucked, everything else will be fucked too.
Schemas require domain knowledge. When domain knowledge is unclear or lacks ownership, it can lead to a range of issues that impact both data integrity and system functionality. Things that screw this up in the financial world include: working in different countries, acquiring new branches, new hires, and leavers. And people who think they can insist the database schema be protected somehow. A manager told me to add the last reason; it wasn't my idea and makes little sense.
With a database you can lock down the schema. In reality though, many data systems are composed mainly of people emailing in Excel spreadsheets. Good luck enforcing any sort of schema there.
My day job is writing a desktop/file-based ETL system. I have just added a schema version feature to cover these sorts of issues. It was one of the most requested features, because most people aren't able to control the schemas of the data they receive.
We can automatically handle some schema drift if columns are renamed or reordered, or columns added or deleted. But if they are both renamed AND re-ordered, you are out of luck!
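A minimal Python sketch of that rename-or-reorder logic (illustrative only, not the product's actual code; the headers would come from the file's first row):

```python
def align_columns(expected, received):
    """Map expected column names to indices in a received header row.

    Reorders, additions, and deletions are matched by name; a pure
    rename (same column count, no names in common) falls back to
    positional matching. Renamed AND re-ordered together is ambiguous,
    so we give up rather than guess.
    """
    by_name = {col: received.index(col) for col in expected if col in received}
    if by_name:
        # At least some names survive: reorder / addition / deletion case.
        # Deleted columns simply drop out of the mapping.
        return by_name
    if len(received) == len(expected):
        # No names in common but same width: assume a pure rename and
        # match positionally.
        return dict(zip(expected, range(len(expected))))
    raise ValueError("renamed and re-ordered at once: cannot auto-resolve")
```

Once the columns are renamed as well as moved, there's no deterministic signal left to match on, hence the "out of luck" case.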
If you detect a level of drift that you can't handle, this is the perfect opportunity to delegate that bit of work to an LLM, if it's a problem that you deal with regularly enough to feel the cost of it to your business.
The latest generation of LLMs are pretty, actually very, good at this kind of situation, where somebody has renamed something - but kept some semblance of the meaning - and also moved it so a basic, or even a fuzzy, comparison might not be able to make a good match.
But a model like GPT-4o-mini will eat a problem like this for breakfast, and it's now incredibly cheap to use it for this kind of thing as well.
And it's almost impossible to get them to stop: the hostname should either be a random UUID or a random name from a pronounceable list depending on scale (or a syllabic UUID thing).
Because every other factor has one answer: you look up the other data you need in your CMDB. If that's too hard, you fix that so it's easy (DNS TXT records can be surprisingly useful here).
I've seen some "data engineering" scripts that were complete messes and beyond crazy. Some examples: massively over-engineered "pipelines" that process a few hundred rows a day but somehow manage to take forever to run. Developers who didn't know SQL beyond "select * from table", so they do all their summarization in Python. Or, worse, a Python script calling a shell script calling R calling something else, several more layers deep, when the same result could've been done in SQL with a few temporary tables.
Oh, then I'm asked to "give this a code review before so-and-so does a deployment tomorrow." Uh, it's a little late to address any of the fundamentals, but there are hard coded paths everywhere...
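To illustrate the "could've been done in SQL" point, here's a minimal sketch with sqlite3 standing in for the real database (table and data are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("north", 5.0), ("south", 7.5)],
)

# The summarization that sometimes gets done as a nest of Python
# loops across several scripts: one GROUP BY and it's done.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('north', 15.0), ('south', 7.5)]
```

The database already knows how to aggregate; re-implementing that in application code just adds layers to debug.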
I recently got a bit of a shocked reaction when I proposed to directly load daily files into temporary SQL tables and then use merge commands within the database to load the final tables. My use of code is essentially a shim between an SFTP client and SQL Server in this scenario. Maybe ~200 lines to connect, locate the files, run the bulk load operation, and then invoke the merge commands. Most of the fun bits are in the actual merge scripts.
Once your data is safely inside the database (temporary load tables or otherwise), there really isn't a good excuse for pulling it out and playing a bunch of circus tricks on it. Moving and transforming data within the RDBMS is infinitely more reliable than doing it with external tooling. Your ETL code should be entirely about getting the data safely into the RDBMS. It shouldn't even be responsible for testing new/deleted/modified records. You really want to use SQL for that.
You'll also be able to recruit more help if everything is neatly contained within the SQL tooling. In my scenario, business analysts can look at the merge commands and quickly iterate on the data pipeline if certain customers have weird quirks. They cannot do the same with some elaborate set of codebases, microservices, etc.
One specific thing that really sold me on this path was seeing how CTEs and views can make the T part of ETL 10000000x easier than even the fanciest code helpers like LINQ.
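The load-then-merge flow described above, sketched in Python with sqlite3's upsert standing in for SQL Server's MERGE (table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TEMP TABLE load_customers (id INTEGER, name TEXT)")

MERGE = """
    INSERT INTO customers (id, name)
    SELECT id, name FROM load_customers WHERE true
    ON CONFLICT(id) DO UPDATE SET name = excluded.name
"""

def load_daily_file(rows):
    # Step 1: bulk-load the day's file into the temp table, untouched.
    conn.execute("DELETE FROM load_customers")
    conn.executemany("INSERT INTO load_customers VALUES (?, ?)", rows)
    # Step 2: merge inside the database. SQL Server would use MERGE;
    # SQLite's equivalent is INSERT ... ON CONFLICT (the "WHERE true"
    # is SQLite's documented workaround for a parsing ambiguity).
    conn.execute(MERGE)

load_daily_file([(1, "Acme"), (2, "Globex")])
load_daily_file([(2, "Globex Corp"), (3, "Initech")])  # update + insert
```

The interesting logic lives in the merge statement, which an analyst can read and tweak without ever touching the loader code.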
The architecture is sound - typically called ELT these days. Dump contents of upstream straight into a database and apply stateless and deterministic operations to achieve the final result tables.
SQL Server is where this breaks, though. You'll get yelled at by DBAs for bad db practices like storing wide text fields without casting them to varchar(32) or varchar(12), primary keys on strings or no indexes at all, and most importantly for taking the majority of storage on the db host for these raw dumps. SQL Server, like any traditional database, scales by adding machines where compute and storage come together, so you end up paying compute costs for your storage.
If you use a shared disk system with decoupled compute scaling from storage, then your system is the way to go. Ideally these days dump your files into a file storage like s3 and slap a table abstraction over it with some catalog and now you have 100x less storage costs and about 5-10x increased compute power with things like duckdb. Happy data engineering!
It amazes me how many DBAs think the limit on a varchar column impacts the disk space. The "on disk" size for `varchar(12)` and `varchar(32)` and `varchar(MAX)` are roughly the same and depends on the data itself more than the schema. That's what the "var" in "varchar" means: variable storage size. The limits like (32) were added for compatibility with `char` and for type-based "common sense" validation. Sure, it helps prevent footguns like accidental DDoS of ingesting too much data too quickly, but there are other ways to do that basic top-level validation of "is this too much data to insert?".
Five varchar(12) columns is more storage overhead than one varchar(60). There's a lot of great use cases for varchar(MAX) and everyone I ever had tell me that varchar(MAX) wasn't allowed didn't understand the internals of DB storage that they thought they did and somehow still believe in their internal model of the DB that varchar is just spicy char and fixed column size allocation.
> With Postgres, we mostly just use `text` everywhere, unless there is an actual reason to have a size limit.
Yeah, there's still the very rare need to performance engineer out a fixed char field "to the left" of the table to speed up common table scans, but also so many of the reasons you might table scan strings have moved into proper full text search indexes or now all the rage is in vector embeddings.
> In other news, I haven't seen a dedicated "DBA" at a company in over a decade.
Yeah, anecdotally from LinkedIn and other sources it does seem like all the dedicated DBAs that have stayed that way have stuck to very specific niches and/or Oracle Products (including MySQL and derivatives these days; the "Oracle Effect" is strong). Especially in Amazon RDS and Azure SQL Server/Cosmos DB today, Postgres and Microsoft's SQL Server mostly run themselves and day-to-day administration is minor/trivial.
My experience with delta was that the catalog, being stored in s3 itself, was unacceptably slow, and for our data volume, Airflow was prohibitively expensive. We spent a lot of engineering time working around both problems. Which is funny because the consultants who advised us to do this told us it was the best possible solution; tailor made for our application, foolproof in every way. After that we proceeded to pay for their “data” “science” “services,” which went about as well as my scare quotes would suggest.
You're basically describing the Lakehouse Tables architecture.
Store your data as tabular data in Iceberg/Hudi/Delta on S3. Save a bucket on storage. Query with whatever engine you like (Snowflake, Redshift, BQ, DuckDB, etc).
Yes, this is the vast majority of my data work at Google as well. Spanner + Files on disk (Placer) + distributed query engine (F1) which can read anything and everything (even google sheets) and join it all.
It’s amazingly productive and incredibly cheap to operate.
Some of my colleagues use Microsoft Power BI, and indeed, they upload a few hundred rows of data (with a few hundred columns, which get unpivoted in Power BI to, say, 40k rows). When they upload it, the Power BI instance overloads and people get timeouts and such. That can last up to 20 minutes. I stay away from that as far as I can.
This is what bothers me with MS SQL related tools - they all seem horrendously brittle. Everything seems prone to deadlocks, has weird edge-cases, and incomplete coverage of the API of the next tool they're talking to so you keep having to break open the abstraction and manually tinker in the next level.
If by that you mean, “knows the commands to create, fill, and select from a table,” then yes. If you mean, “knows how to create a performant schema and queries that will serve them well into the future,” then no, absolutely not.
OTOH, IME data folk are much more cheerful and willing to change things than devs when I point out the innumerable ways their DB choices are choking them. Devs more often fall on the side of “we don’t have time for that on our roadmap; can’t you just fix it?”
It is often not the devs telling you about the roadmap, but the management. Devs are more willing to fix things, if they are broken, but are not given the time.
I've worked with people where "knows SQL" meant "knows the Access query builder UI, sort of, and demands the ability to query Production databases with the awful SQL auto-generated from the Access UI".
As a data engineer I have seen absolutely bullshit pass for production, but it doesn't seem that different from all the other bullshit I have seen people deploy in my life.
It is one of the few types of jobs I have worked where someone credulously offering to add five more layers to fix a latency issue is normal operating procedure, though.
What’s the solution to wrangling these data projects?
The author’s experience is not far off from my own.
1. Any solution in place can only be understood by the person who created it
2. ”No, we can’t change that because then we’d have to validate everything from scratch again”
And therefore, as the author says:
> ”we'll continue with the work instead of fixing the critical production error”
I’m honestly not sure how to address it either. With traditional software dev we’d write tests, incorporate those into CI/CD, and start to course correct. We can use sample data to validate the code does what we think it does and that we didn’t break it.
But in these data projects, it’s not only the code that’s changing, but the data is also a moving target. You can write a test with sample data, but tomorrow your data might change because someone in sales added a custom field to the CRM, or IT upgraded the accounting software and all of the unique IDs changed, or someone upgraded their Excel version, or whatever.
And your code that works on the sample data needs to handle all of this, which obviously it can’t. You can try to validate the data somehow, check the schema, check if the number of rows hasn’t doubled or halved, and so forth, and then stop it from importing until you look into it, but also you can’t stop inbound data because an exec has a meeting in a few hours and expects their report to be updated.
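The kinds of checks described above (schema match, row counts not doubling or halving) can at least be sketched; the thresholds and names here are illustrative:

```python
def validate_batch(rows, expected_columns, previous_row_count):
    """Pre-import sanity checks on a batch of dict-shaped records.

    Raises instead of importing when the feed looks wrong. The
    halved/doubled bounds are an arbitrary example threshold.
    """
    if rows and set(rows[0].keys()) != set(expected_columns):
        raise ValueError(f"schema changed: got {sorted(rows[0].keys())}")
    n = len(rows)
    if previous_row_count and not (previous_row_count / 2 <= n <= previous_row_count * 2):
        raise ValueError(f"row count jumped from {previous_row_count} to {n}")
    return n  # becomes previous_row_count for tomorrow's run
```

Whether you're actually allowed to halt the import when this raises is, as noted, a political question rather than a technical one.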
I heard something about “data contracts” that’s supposed to address this, but it sounds like the next in a long line of buzz words intended to get management to buy another data product.
Has anyone worked in this kind of project that went well?
Author here, and also executive director at Hermit Tech now where we do things like this. Your approach has the core of how I'd go about it. The contract stuff is legit, though you don't need to buy a product for it.
The thing that is hard at big organizations isn't that executives need the data for meetings. The issue right now is that many organizations are already 3-4 years into building their analytics platforms, and Chief Data Officers worldwide are trying to prevent their role from disappearing. They're already very much "Mom, we have CTO at home" in many companies, as evidenced by the fact they're usually reporting to the CTO or CFO.
So at this stage, they've already told the business that the platform is "ready", and they are onboarding data sources. With no way to measure data quality, the only thing visible at the organization level is the number of data sources onboarded. The fastest way to onboard data sources is to have good CI/CD and a solid developer environment, but getting there would probably mean a slowdown of 1 - 2 months even if you had executive backing to bulldoze all objections from the IT department.
That's the sort of thing I can commit my team to as a business owner, but most executives don't have the nerve to slow delivery down and aren't losing money out of their pocket due to the inefficiency - I get to talk with a lot of them due to the blog's success these days, and many of them really are just employees with more status, with the same incentives. And to make it worse, the loss of nerve is actually understandable, because the type of team that would build something this bad will also waste those two months then still deliver slowly! But most people aren't thinking in terms this complex, and yes, I know it isn't that complex.
I'm expecting to pick up some work in this area at larger orgs in a few years when these leaders rotate out and new leaders rotate in and go "what the hell IS this?", but for now we're mostly aimed at helping smaller places do it right from day one.
The boring answer is that it's context dependent, but fundamentally "do data engineering the same way you do high-performance software engineering". Have tests that run fast, where fast means "a few seconds when you start" and "refactor as you go so the tests keep taking a tolerable amount of time". I think Kent Beck suggested 10 minutes in Extreme Programming Explained.
We're gradually forming our own, complex opinions on this. In the consulting context, this is essentially our product. A fascinating realization moving from software to marketing is that a sales pitch or marketing strategy can be built in a way that isn't entirely dissimilar to code, and that it has second-order effects. They aren't the same because... they aren't the same, but there's an artistry, combined with principles, to doing it "right".
And then as a consultant there's additional complexity, as each team is different. Some are high-performing and need a bit of an external jolt. Others need the help the most, but are in politicized environments so they're almost inaccessible until a new executive comes in who can admit there are problems (or indeed, even see that there are problems).
Joe Reis has some great stuff in Fundamentals of Data Engineering, which includes advice on early objectives when rolling out a new practice.
Disclaimer: Joe has hosted me on his podcast, and we are in the mutual-marketing whirlpool together. But I've been recommending his book long before I met him.
> Like why didn't anyone catch the issue with the logs?
I see questions like these a lot and every time I feel that people immensely underestimate the effort required for curating data. In my experience data can only ever be as good as what it's being used for and in this story the logs haven't been used for this purpose before so they're not going to be any good.
It's some sort of data variation on the second law of thermodynamics - entropy is winning. Going in with the expectation that things should be better will only lead to frustration.
The observability world still regards itself as a system for monitoring, but reading about (and sometimes seeing) how badly these systems go wrong continues to drive a conviction that perhaps their strategies and tools should become bigger. That they should converge with business pipelines.
We shouldn't just have wide events/big spans emitted... We should have those spans drive the pipeline. Rather than observability being a passive monitoring system, if we write code that reacts to events we are capturing, then we shuffle towards event sourcing.
Given how badly coupled together with shoestring glue & good wishes so many systems are, how opaque these pain zones are, it feels like the centralization upon existing industry standard protocols to capture events (which imo include traces) is a clear win.
(Obvious downside, these systems become mission critical, business process & monitoring both.)
Totally agree. Observability is just another dataset and should be modeled, managed and governed like other datasets. Data quality controls should be of equal or higher standard than for regular datasets.
Monitoring, dashboarding and alerting should leverage other BI-class tooling.
Hah, yes, I'm not in cybersecurity but am very close to a few people that are. The incompetence is not evenly distributed and not as bad as it is in data, but some companies are in terrible states, and the stakes are much higher.
I worked with an Aussie who was in the US on H1B because the aerospace industry was even worse than the status quo in Australia. Last I heard he went back. I sincerely hope he switched industry verticals.
I’m working on an SDLC app that will end up with inspirational sayings as interstitials once I run out of bigger features/desperately want to procrastinate.
I’ll stick heavy hitters from Goldratt, Fowler, Feynman et al in there, but there’s going to be a “dark humor” and “snarky” category and this will definitely go into one of them, along with some Ambrose Bierce.
This blog post rescheduled all my appointments, tucked me in, sang me a lullaby, then woke me up with coffee and breakfast late the next morning. I am healed.
For real, a fun and refreshing read (if also a little haunting).
Great piece of writing from someone who truly cares about craft and suffers from the feeling that this craft is not what they are paid for.
Add: for people who share the feeling -- you can work in a place where velocity isn't everything, managers are not assholes, and you can dedicate yourself to craft.
"…what they actually needed to do was fire most of the staff in every team, leaving behind the two people who actually had good domain knowledge, then allow them to collaborate with good engineering teams to build sensible processes and systems.
Instead, they hired a bunch of Big Firm Consultants. You can see where this is going already."
I’m on month ten of my tech sabbatical and it’s been great. I’m no closer to wanting to return to the industry (former FAANG data jockey), not in a position where I can never go back but am in that middle-state where I want to use tech to contribute to the betterment of society, the community, nature and sustainability. I have a couple more months to try and figure out how.
I wonder what company they’re describing here. It sounds like so many self-inflicted problems that you could undo or set right in a couple of weeks if you had the time and latitude to make changes across the system, instead of being confined to a small area of team ownership.
I worked somewhere that had a lot of this sort of thing going on once. You cannot overestimate how hard it is to get anything done: politics and organisational dysfunction, not to mention that you probably don’t have access to half of what you need to in order to fix any given problem and are even more unlikely to be able to get it, mean there are just huge scads of problems that, on the face of them, look relatively straightforward to solve but which, in practice, are organisationally impossible to solve.
> you probably don’t have access to half of what you need to in order to fix any given problem
I’ve found this to be one of my largest day-to-day problems even in a relatively functional organisation. Particularly when it involves something I can’t run on my own machine, like an AWS service.
In a previous role I often found myself constructing elaborate hypotheses about what was going on inside systems I couldn’t see into. I’d then need to try to verify it with someone on another team, in another timezone, who had the access but not necessarily the development background. Which usually meant getting on a screen share and asking them to click various things I wasn’t allowed to. If I was wrong, back to the drawing board and start again.
A framing/question I like to use is: "Look for a root problem that ought to be fixed through a change in policy, politics, or incentives, and the wasteful use of time/money is how the company tries to avoid or defer facing it."
For example, Operations might demand that Engineering develops an increasingly-byzantine approvals process, to stop Sales from over-promising impossible or unprofitable projects.
I inherited a pipeline like this. It is as if everything is a global variable. You cannot "just fix" one thing in isolation, because some spooky action at a distance of which you were unaware relies upon this insane behavior. Each and every hack is the expected input somewhere else in the chain. You have to carefully inspect everything downstream of any kind of minor adjustment because your cleanup is quite likely to break something else.
Immensely frustrating and draining where you can have accomplished ~nothing in a full day of work to fix what should have been a five minute change.
I'm the author. You are exactly correct. Everything was so heavily interwoven that it was impossible to tell what would happen downstream without making an edit and then tracking the changes through dozens of steps through the architecture diagram.
In this situation I’d make a second copy of a bad thing and then try to make the copy good, instead of changing anything in place. But yeah, I totally get the huge upstream fight it must be and I’m not trying to backseat drive… just marveling at how the organizational fuckups & constraints make it hard to fix obvious problems
You're right. I actually did propose this, but the other difficulty is getting stuff onto the Jira board. My belief, yet to be verified, is that it was possible to deliver everything the executives promised AND perform the refactoring without any drop in service, but this would have required convincing management that a well-run team could 3x productivity. Eventually everyone got tired of talking to people who were just used to bad performance; they couldn't envision smooth CI/CD and happy workers (as opposed to workers contented to get paid to hang around, which is what "happy" means in many cases).
"I would like to make this systemically better by addressing our second order problems that are causing our very visible first order problems"
You will be told that we absolutely do not have time for that. The only actions you're allowed to take are fighting the fires closest to you, not turning off the pumps that spray the gasoline everywhere.
Typically this is only even possible because nothing you're doing is actually used or scrutinized, since if it were, someone would have immediately noticed that nothing works. Usually this is at places running on varying levels of investment-dollar three-card monte.
I wrote a piece last year on how they were running up a $500K Snowflake bill by typing one number in wrong, and how I noticed within my first few weeks there by literally eyeballing the settings and going "queries take milliseconds, so you really have to justify everywhere a minute appears in the configuration".
The logs are stupid (sorta) but imagine how many other issues can exist if we uncover something like this every time we open something.
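To make the "a minute in the configuration" point concrete: a warehouse that auto-suspends after minutes, while queries finish in milliseconds, bills for idle compute on every wake-up. A rough back-of-the-envelope model, with entirely illustrative numbers (the credit rate and price below are assumptions, not real pricing):

```python
# Rough cost model for an oversized warehouse auto-suspend window.
# The warehouse bills from wake-up until it suspends, so the idle
# tail after each burst of queries is pure waste.
def wasted_dollars_per_year(wakeups_per_day: float,
                            auto_suspend_seconds: float,
                            avg_busy_seconds: float,
                            credits_per_hour: float = 8.0,    # assumed rate
                            dollars_per_credit: float = 3.0   # assumed price
                            ) -> float:
    """Annualized cost of the idle window paid for after each wake-up."""
    idle = max(auto_suspend_seconds - avg_busy_seconds, 0.0)
    idle_hours_per_year = wakeups_per_day * idle * 365 / 3600
    return idle_hours_per_year * credits_per_hour * dollars_per_credit

# Queries take ~50ms, but suspend is set to 10 minutes:
slow = wasted_dollars_per_year(200, 600, 0.05)
# ...versus a 60-second window:
fast = wasted_dollars_per_year(200, 60, 0.05)
print(f"${slow:,.0f}/yr vs ${fast:,.0f}/yr")
```

Under these made-up assumptions the difference is hundreds of thousands of dollars a year from one number in a config, which is the shape of the bug the author describes.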
It all seems easy to fix until you end up in a place where management process (and thus politics) has become more important than outcomes.
The people with experience and knowledge get pushed aside in favor of someone who can talk management-speak and (often) looks the role. Suddenly meetings where Things Get Decided only include managers who don’t know what they’re doing.
The best is when they make a mess and then get tremendous accolades for half-fixing it.
I’ve even seen it happen within relatively small startups. It’s a sign of rotting culture, but sometimes you have a mortgage to pay and have to get comfortable with the situation until you can find a better gig.
Irreducible complexity and load-bearing bugs. Kludge built upon kludge built upon kludge. An engineer with the go ahead and all the support required would still struggle because the knock on effects of any one change cascade out in unpredictable ways. Not to mention working in an active environment where, although the other engineers support the goals in theory, they still need to deliver business requirements for a fickle management that doesn't truly understand what lurks beneath the facade - they don't have time to do it the right way, and if they try they'll just break both the old system and the new.
Why are we so crummy? Is it because our talent finds it generally easy to migrate to the states for better opportunities, because of no language barrier and the ‘coalition of the willing’ visa (iirc)?
This must be one reason (not Australian, but have friends there), but more importantly it seems to me the "better opportunities" are much, much better. Australia seems to have worker rights that make most of the US look decent by comparison while wages remain far below the latter---and I don't mean just US hot spots: an Australian friend went from Sydney to Salt Lake City (!) to get a much better job with much higher compensation and better rights/environment. In Utah!
> Australia seems to have worker rights that make most of the US look decent by comparison
This is the first I've heard anyone say that. Can you elaborate? We've got more leave (separate holiday and sick leave), none of the "at-will" stuff, right to disconnect, fewer hours, etc.
US software engineer. I have 24 days of PTO, 15 company holidays, and 9 sick days. 10 weeks for parental leave (16 for moms). $240K salary, $400-600K in annual vesting equity. That’s private paper equity, but I’ve already been able to cash out $700K and buy a house with cash.
Fully remote. I can expense $120/month for phone and internet, and a few lunches each month, too. I can get a new laptop and/or monitor sent to me just by asking.
When I do visit the office, the trip is fully expensed. Free daily lunch. Coffee and drinks and snacks everywhere, free. Private desks in a semi-open office with couches scattered around. Lounges with hundreds of board games, nearly all of which have seen table time during work hours.
Primary projects are tracked in a knowledge-sharing system, but I can mostly work on what I want to. I’m encouraged to merge small fixes and refactors without any ticket-pushing at all. Yelling by managers or anyone would not be tolerated.
“At-will” is more FUD than reality in my experience. Most companies, when firing or laying someone off, give something like 2 weeks of severance for every year of service.
That compensation is better isn't in dispute. My question is about worker rights, how much of the stuff you posted is just your company vs US law? Are you protected better by law than someone here would be?
What is more important, the law, or the typical employee experience? If you're in a strong market where workers are in demand (evidenced somewhat by higher compensation), then workers will tend to be treated better for fear that they will leave for another opportunity (remember: "at will"). If workers have fewer opportunities, then how much can the law really help?
Are there specific protections that are lacking in the US that you would expect to result in worse employee outcomes?
I'm not sure what any of that has to do with my question. The person I replied to said that from what he knew worker rights were somewhat better in the US so I asked for examples. How an employer treats you is of course important, but can't really be considered worker rights if they can decide to change that depending on market conditions.
Again, I'm not arguing about who has it better. I am asking a very narrow question about worker rights differences between the two countries. If you want me to say that US software workers have it better, then sure. But that's not really what I'm asking.
I work at a more typical software job for an AUS company operating in the US ... AUS workers make less money than the US office, but they can roll over PTO indefinitely, have the right to log off, etc. I think the main reason people come from AUS to work in the US is that the cost of housing anywhere near the cities is exorbitant and the Australian version of the American dream is unbelievably dead.
I wish I could suggest something for you. My path was moving to the Bay Area 13 years ago to work for a small startup, helping to grow it, then going remote after a few years. Startup is a B2B with an ethical technical founder, and it had a credible business model from day 1.
Looks like the classic mistake of every data team. Every single office worker touches data in one way or another, so having a team called 'data' just opens a blank cheque for anyone in the organization to dump every issue and every piece of garbage onto this team, as long as they can identify it as data.
That's why you build data platforms and name your team accordingly. That's a much easier position to defend: you and your team have a mandate to build tools for others to be efficient with data.
If upstream provides funky logs or JSON where you expect strings, that's for your downstream to worry about. They need the data, so they need to chase down the right people in the org to resolve it. Your responsibility should only be to provide unified access to that external data, ideally with some governance around the access like logging and lineage.
tl;dr: Open your 'data' mandate too wide and vague and you won't survive as a team. Build data platforms instead.
As a regular old “platform engineer” I fight to ignore “data platform” tasks. There’s no target to hit, it’s just moving sand around a sand box.
If you want an answer to a specific question, we can spin up a read replica and a Metabase and write a query in an afternoon, cool. I’ll get you a chart, we’ll move on. If you want “a data analytics platform to enable blah blah blah” I’m out, I can’t do it. My eyes won’t focus, my hands stop moving.
Developers sometimes tell me stuff like "Kubernetes is too complex", "jeez, React is a pain". I send those quotes to my friends stuck writing 195-step DAGs to transform log files from S3 into S3 so they can eventually land in S3 - ah yes, but they're parquet somewhere in between, and that matters for some reason. We laugh together, but I can see it hurts them more than I intended.
Life is too short to faff about doing nothing. Go join a company with less than 100 engineers and learn to be happy again. Let the enterprises burn, we’ll all be better for it.
Anyways this was a fantastic piece, I hope this person writes their book after all.
>>> pretending that any of this is more important than hiring competent people and treating them well. I could build something superior to this with an ancient laptop, an internet connection, and spreadsheets.
Author here. This is one of my favourite books of all time. In fact, my two favourite books are ZAMM and The Black Swan, both of which I hated on my first read when I was 19.
I recently re-read it for the third time while on holiday and was taking notes on pages worth including in a book review. I ended up with about 1/3rd of all pages logged and decided no review was better than just telling people to read the book, but I might write a blog post on gumption traps.
Understanding that IT projects are difficult gives us more empathy. Gartner says that 80% of corporate IT projects are considered failures. McKinsey says that 17% of large projects fail so badly that the company's existence is threatened. The Standish Group says only 10% of projects succeed.
Bullshit jobs once again. I don’t know. These companies are complex systems.
He writes as if the engineers all knew how to fix the systems but were just powerless to do so. But I've also seen projects led by engineers that only added to the overall complexity.
There is a paradox in this - the people who seem the most confident about fixing the systems usually only make things worse. Chesterton's fence and all that.
This article triggers me because everybody who reads it will always believe that they would fix the mess if only they got the power, but in practice when they get power they would only add new complexity to the whole mess.
People generally do have some idea how to fix complicated systems that have endemic problems. The reason they don't is that the company considers the capitalism going up faster NOW to be way more important than it going up faster in the future.
I had read and very much enjoyed the "AI silence or you'll be Piledriven into next week" post, without clicking on anything else on the author's blog. It was click link, read whole thing, love it, send it to one or two people, move on.
Very happy to see this here, realise it's the same person and that this is "a thing", and then to rollick in the author's backlog. A joy! Raucous real-life laughter has exploded from me on numerous occasions along with most articles. I think I've read 5 in a row there, and my brain is buzzing happily.
Thank you to the author for having the courage to write about real experiences. A breath of fresh air. I look forward to future books and articles, and reading more previous work, and cross my proverbial fingers hoping they can keep it real in the face of what will presumably be an avalanche of grifters looking to leech off the attention.
Nice to see someone having a similar reaction to me upon reading it. I too found it astoundingly good. I couldn't say what exactly about ludic's blog reminded me of it so strongly, but in any case, thrilled to share Caroline Busta's work with someone.
I've worked for like a dozen companies full-time. Most people don't know what they're doing. I always thought 'impostor syndrome' was a projection of general insecurity. But I've started to think it's actually the subconscious saying "I'm not sure what's right or wrong, please consult an expert."
I have a fantasy of quitting my job to write books on the [modern] theory and practice of information systems engineering. Not 'how to write software', that's been done; I mean all the forms of engineering around software/information systems. In my dream, I write the books, everyone reads them, and starts doing their jobs right.
But then I remember, I, a person arrogant enough to believe he knows how to do things right, still can't get shit done right. Maybe if I were a one-man company, I could 'do everything right', and feel good about the result. But I depend on an entire company of people to do the right thing, in the right way, at the right time. That's hard even with the best people. No company is made up of the best people. It's always a mix of the best, worst, and in-between.
Strangely, a company can put out a decent product, despite the company being a tire fire. This is some comfort when you get older. You realize that everything being shit is okay, as long as the bills are paid. I have PTSD from when the thing that paid the bills was on fire, every week, for years. Lately at every job I have, I internally panic and scream at how horrible everything is. Because I'm haunted by what might happen. But it's not happening yet. So I muffle the screams, smile and nod along with the stand-up-meeting-cum-status-update.
The sad thing is, I forget that it's okay that the stand-up is shit. I forget that I'm still getting a fat paycheck just to sit in meetings that could have been an e-mail. I forget that, despite the company bleeding cloud costs [no savings plans, RIs, serverless, right-sizing, etc], we seem to be making a profit. Despite the terrible designs, bad process, ineffective leadership, absentee management, lack of security, and all the rest, the bottom line is fine. The shit is fine. Currently, and probably for the unforeseeable future.
I get craftsmanship. I'm a crappy woodworker. I enjoy making things well, and getting better at it. But our jobs are not fine woodworking. Our jobs are construction. We are banging rusted nails into shitty, twisted, racked, cupped, knotty-ass studs. If we're lucky. Yeah, this building is going to be shit. But somebody's still going to pay for it. And there'll be another job after. If we really wanted fine woodworking, we never would have taken this job, and we know it. We'd be struggling to sell a cabinet that took us two 80+ hour weeks, too tired to appreciate its beauty, too defeated by flaws only we notice.
So let's stop beating ourselves up. Let's stop beating each other up. We don't, can't, won't, find meaning in this monument to mediocrity. No comfort from the pain zone. No pride to take home. But we are paying the bills, with more left over than most have. No broken backs and long hours. No lack of health care, no abuse from customers or the public. Not even that big a worry about job security. We are the lucky ones. We are blessed with a golden shovel. So let's do like those blue collar laborers we often idolize, and get to this annoying, bloody awful work that we are blessed with.
I just wonder what would happen to society if everyone worked like software engineers do.
For example, what would happen if carpenters just eyeballed every measurement, and just shrugged their shoulders when the walls didn't line up. If they found out the wiring as planned would not actually end up connecting where they thought it would, and just shrugged and did it anyway.
What if they had hour long meetings about how to drive in screws and their workdays consisted of putting up a beam, and then going home because they think they've done enough for the day.
If all of society were run like that, it wouldn't be running for long.
That seems a rather defeatist attitude. I was fed up with working for other people 20 years ago. I started my own one-man software company and write software how I think it should be written. The software isn't perfect, but I'm proud of it. I'm not rich, but I do fine financially. It isn't a viable path for everyone, but it is something to consider if you hate your job.
I'm the author. This is the track we're on. We also named our company "Hermit" so there's something extremely serendipitous about this comment, hah.
I definitely think it isn't viable for many people, but I've also met people who could totally do it that are scared because they're around a lot of people with ability/financial constraints that they simply don't have.
I am defeated. I think a lot of us are. It's wonderful that you have your own company, but I don't have the grit and self-discipline for it, and I'm still trying to scrape up enough for retirement. A younger me might've liked to try your path, if I'd heard more of those success stories growing up.
>> I've worked for like a dozen companies full-time. Most people don't know what they're doing.
This is why I reflexively chafe when I see "Software|Data Engineering".
I've never witnessed it in practice anywhere I've worked.
Sure, embedded systems for a pacemaker/defense industry or something is probably proper engineering, or graphics programming where you're concerned with creating approximations of real light transport with a time budget per frame or whatever, but most of this over-architected junk involving putting strings in databases and sending clients 10MB of JS to render a list is the farthest thing from "engineering".
A decade+ of inflated compensation really went to people's heads with these silly titles.
The other aspect of this is that it's easier to be the cog when you're tasked merely with implementing shitty decisions. You can just disregard the rationale and focus on the list of things you're told to do.
But as you (hopefully) progress in your career, the expectation is that you are increasingly the one making and justifying those lists, which is much more soul crushing and harder to "background" and pretend that it doesn't matter.
100% this. It’s still sad to experience this though, day after day. I’ve seen fundamental things done so badly, no time to fix them, we’re given no choice but to continue “construction” on top of a smoking garbage pile. Nobody wants to hear “start over.”
You can’t as it will immediately draw out the leeches who will hurl “best practices”, strategic partnerships with scam software vendors and compliance check ticking bullshit at you until you back off.
Author here. This is essentially the case for many clients, especially government. But we're bootstrapped, so we can afford to consult and use what we've learned to build tooling that only targets a small handful of sane clients.
I've sought advice from various people on this, some who are famous-ish or quiet sales powerhouses in the US. My question was "What are executives buying when they hire consultants?", and the answer is consistently "Comfort". No one is actually comforted by Deloitte, KPMG, whoever.
The moment I had confidence in an ethical consulting practice is the moment someone said "I don't even know where I would hire good consultants". This was someone with 30 years or something absurd of industry experience, including mentoring people that went on to become staff engineers. After processing that, I realized I don't know where to hire a consultancy that isn't going to bait-and-switch me with mediocre talent. They obviously exist, but they probably can only support something like 1 to 10 clients each.
Even Thoughtworks, a place that I used to hear mentioned positively, was flagged by the CTO of a >$1B company over lunch last week as "shifting to bait-and-switch" tactics.
tl;dr Pretty sure you can absolutely do better and make a living off that, but you have to think very carefully, do research, read a bunch of sales/marketing books, have great communication skills and at least adequate engineering skills. I still don't know if I have some of those, but if I fail it'll be a skill issue, not because the problem is not tractable.
In Australia, data jobs have a high floor but a low-ish ceiling at most places. Salaries start at around A$100K if you know how to interview, but 30 years of diligent practice only nets you around A$200K, and this would be considered an extreme success.
If you're already good, just have coffee with executives/leads at places that take engineering seriously and you'll earn well, but there's no need to enter data hell as an employee at the typical company.
I'm also based in Melbourne. I would add that all the above is true of maybe 95% of companies, but there are a few exceptions: big US-based tech companies. There are roles for devs that pay AU$250-550K/year, including the data engineers. Unfortunately, there appears to be zero correlation between pay and expertise, and we have the exact same pain-zone experience that I had when I was earning a third of this salary at an AU-headquartered company.
"Suffice it to say that while people are sincerely trying their best, our leaders are not even remotely equipped to handle the volume of people just outright lying to them about IT."
I've tried to come up with some heuristic to determine whether a team is competent, good, or doomed. I've been exposed to all of them over the last 8-10 years, and one of the key things I've noticed is that the ratio of competent/skilled developers to unskilled ones is a big... indicator(?). Predictor?
Colleague of mine has been working with a team - dev team has ranged from 5-8 people over the last few years. Few people seem to have any grasp of programming at all. Only two people - my colleague and one other - have ever taken projects from ideas to delivery, or even taken features from requests to successful rollout of already functioning software.
The arguments people get into there - days or weeks of people 'researching' whether or not OAuth 'really' requires 'refresh tokens', or whether it's really supposed to be a JWT. Management has some notion of 'every voice is legitimate and should be heard - we don't support bullying' and so on.
If you have a team of 10, and 1 or 2 people are simply bad at having the ability to think somewhat abstractly, you can survive.
If that number hits, say, 4-5... the team will struggle. A lot. You can keep things going, but it will be slow. And everything becomes a battle.
If that number becomes 7 or 8, and you only really have 1-2 developers who are actually competent developers... things will continue to spiral downward.
On the other side - I worked with a team of about 8-10 people on a 6 month contract. The larger org had another 40 or so folks, handling other projects, and support. Onboarding was great - I pushed production code in the first week. Everyone on the team was competent, including the juniors. I had more development experience, but they had more company experience, and it was really a relatively enjoyable engagement overall.
It was refreshing to be able to ask anyone on the team questions and either get a workable answer or an "I'm not sure, let's check with XYZ" that led to one. The "oh, yeah, it's ABC" when ABC is clearly not the answer never happened. People committing code and pushing to production without ever having run it at all - I've experienced that - didn't happen there, though it's happening to my colleague.
The problem with a plurality of tech-incompetent folks in a tech group is that they honestly cannot tell that they aren't competent. The only examples of competence are in the minority, and tend not to be trusted (even though that minority is the only portion that turns out working/functional code).
Leaving ends up being the only option in those cases. My colleague is only at his place part time, and has hung around because they've gone through some restructuring where new folks were brought in, and... you hope that things might get better in a few months, then realize they don't.
Hah, well, it was easy in some ways. Performance expectations were so low that I could have cruised for years if I didn't care about my job. Just not easy in other ways.
When someone leaves, have you ever had code dumped on you that you've never read before and that has no tests? If you don't even know what the code is supposed to do, you can't write tests for it.
The way I handled it is: when people come to me with a bug, I fix it and add a test for it.
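That bug-driven approach — pin the reported behaviour with a test at the moment you fix it — might look something like this. A hedged sketch: `parse_amount` is a stand-in for whatever inherited, untested function the bug was filed against, not anything from the thread:

```python
import unittest

def parse_amount(s: str) -> int:
    """Inherited code (hypothetical): price string like '$1.50' -> cents."""
    dollars, _, cents = s.strip().lstrip("$").partition(".")
    # .ljust(2, "0") was the fix for the single-digit-cents bug below:
    # without it, "$1.5" parsed as 105 cents instead of 150.
    return int(dollars) * 100 + int(cents.ljust(2, "0"))

class RegressionTests(unittest.TestCase):
    # Each test exists because someone reported the case it covers;
    # the suite grows one bug at a time instead of from a spec nobody has.
    def test_whole_dollars(self):
        self.assertEqual(parse_amount("$12"), 1200)

    def test_reported_bug_single_digit_cents(self):
        # Filed as a bug: "$1.5" must mean $1.50, not $1.05.
        self.assertEqual(parse_amount("$1.5"), 150)
```

The upside is that you never have to reverse-engineer the whole module's intent up front: the test suite becomes a record of every behaviour anyone actually cared enough about to report.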
I'm excited for LLM applications that can set up, monitor/validate, and optimize data pipelines at scale. It seems possible soon, given that SQL and most data records aren't intended to be human-friendly.
When LLMs can do the following, they might be able to fix data hell:
- Negotiate with different teams to figure out what a field means
- Be told that a field should be converted from one format to another, but oh wait it's causing errors somewhere downstream because it was told the wrong instructions
- Dig into an issue someone raises about the code you maintain and realize the root cause is in another team's code
I was offered a book deal and turned it down, hah. I read in spectacular volumes and books have a very special place in my heart. Growing up in Malaysia, most of my English was initially mastered from a gigantic pile of Enid Blyton books my family had lying around after the British occupation. We even give everyone at Hermit Tech a day off per week to study non-IT things and let the subconscious do some processing on work. Not because I'm a weirdo (though I am), but because I genuinely believe this produces a higher-quality experience for our clients, and no one can stop me from testing things like "five day work weeks are too long".
In any case, the book deal had constraints like "no swearing", and it was implied they'd find their own artist for the cover. I didn't even intend to swear, but I care too much to let them assign an editor and impose arbitrary constraints. I ended up chatting a bit with Ed Zitron after the AI rant article went super viral, and he told me that I have the audience to just publish my own books.
So I'm doing just that! I'm starting with ten short stories, at the advice of someone in the local Melbourne scene that has helped many people publish. Then I'm aiming to write one book containing a series of essays on IT work, and one fantasy/fiction book. I'm a reasonably good judge of popularity, and because neither will be particularly angry, I will probably only sell a few hundred copies. But I'll have a book with my name on it, which is very special to me.
If this was a job with a non-abstract input and output process they would have OSHA saying it was genuinely unsafe (and definitely stupid.)
Many of us build systems to automate exactly this kind of thing. If I had an engineer who had to take 100+ steps to get something done, I would definitely be asking 1) what the hell am I paying for, and 2) why the hell am I paying for it?
Quick question: how many years of industry experience do you have? I thought the same way until this year, when the burnout got to me too. I thought I was too much of a high achiever to get burnout, and yet I'm in the same boat as the author.
Also, once the people who speak up about a problem leave, all you are left with are idiot yes-men in management, old timers doing their job as minimally as possible to not to be noticed by management, and fresh new engineering grads ready to be grist for the mill. When those sorts of people are writing all of the code around you, no matter how good you are, you will be driven insane.