Hacker News
Today’s Top Tech Skills (hiringlab.org)
356 points by netcyrax 60 days ago | 357 comments

It remains weird and disturbing to me that "AWS" is treated as a skill. Not "cloud system management", but "AWS".

Edit: I do understand why it can be lucrative to have expertise with a specific cloud provider. It's just the status quo of these services being their own unique silos that disturbs me: I would personally be wary of making "ability to effectively use Amazon's servers" a pillar of my career.

I am a mechanical engineer. Mechanical engineering jobs ask for SolidWorks/AutoCAD/CATIA/Creo/etc experience, not just computer-aided design; ANSYS/ABAQUS/NASTRAN/etc expertise, not just finite element method; Fluent/CFX/OpenFOAM/etc knowledge, not just computational fluid dynamics.

With the exception of OpenFOAM, all of this software is proprietary, largely non-interoperable, and expensive—thousands to tens of thousands of dollars for the cheapest option with basic functionality; expensive enough that I can never imagine opening my own independent consultancy, because of what the basic tools cost.

Outside of programming, this is the norm. The number of free and/or open-source tools available to a mechanical or electrical engineer is vanishingly small, and it is almost impossible to do a serious engineering project with only free tools.

Since this is quite lucrative for the companies making those tools, I can only imagine tech companies would be more than happy to lead us to a future where software engineering is similar in this regard to other engineering disciplines. Perhaps this is what happens when a field matures, and since software is maturing, it will become more and more like this as time goes on.

>Outside of programming, this is the norm. The number of free and/or open-source tools available to a mechanical or electrical engineer is vanishingly small, and it is almost impossible to do a serious engineering project with only free tools.

>Since this is quite lucrative for the companies making those tools, I can only imagine tech companies would be more than happy to lead us to a future where software engineering is similar in this regard to other engineering disciplines. Perhaps this is what happens when a field matures, and since software is maturing, it will become more and more like this as time goes on.

That's a really great insight. (Sr EE here.) I wonder sometimes about the fact that SW engineers are capable of making their own tools in a way that we are not. Do you think this helps to combat some of the super high pricing associated with their tooling?

For example: it'd be pretty hard for me as an EE to say "Fuck it! I'm fucking sick of Orcad! I'll just go write my own fucking Orcad!" (OK, well, easy to say. Hard to do. I've spent plenty of days cursing Orcad, but none building a suitable replacement.)

> That's a really great insight. (Sr EE here.) I wonder sometimes about the fact that SW engineers are capable of making their own tools in a way that we are not. Do you think this helps to combat some of the super high pricing associated with their tooling?

That's part of it. There's also the element that in many engineering fields, there is a floor on project pricing. Think Civil Engineering, Chemical Engineering, etc., most of those projects start at 6-figures, and often go into 7 or 8-figures. A per-engineer cost of $Xk/yr for tools is really not that much, in the scheme of things. I think you're seeing a similar increase with web designer tools. Adobe doesn't want you to think of a Creative Suite license as $500 or $1000, they want you to think of it as 1% of your employee's salary.

I think if you are a machinist/mechanical engineer you possess this trait: you have almost all the tools to make machines that make new machines. But this is kind of limited to that one discipline, whereas with software we can build almost anything. In a previous profession I learnt this great lesson: we would occasionally build specific tools for specific jobs if we didn't have the correct machines, to save sending parts out to be machined, or even for doing backyard jobs.

> I wonder sometimes about the fact that SW engineers are capable of making their own tools in a way that we are not. Do you think this helps to combat some of the super high pricing associated with their tooling?

I can only speculate, since I am not a software engineer, but I think part of maturing is that things get more complex, which makes building them more difficult and expensive. Thirty years ago, someone could say "Fuck it! I'm fucking sick of Unix! I'll just go write my own fucking operating system!" and actually do it. Today, operating systems are so complex and large that this is not feasible anymore. Maybe you can write your own AWS replacement today, but would you be able to write a replacement for the AWS of 2050 in 2050?

It is one of those things that I would be happy to be proven wrong about, but as of now, I am not super optimistic that the current state of freedom in software engineering tools will continue indefinitely.

I don't think it's about software getting more complex. It does become more complex, but you don't actually write more code or handle a more complex architecture, you just build it from more complex blocks. Those blocks are packaged so their own complexity is hidden from you.

And you can write software in your spare time, and distribute it, as was done for Linux, gcc, git, and many other open-source projects. The other reason we have free software is that many software companies write their own tools and distribute them as open source, because those are just tools, not what makes them money.

AWS, on the other hand, is not just software but infrastructure: servers that are running, maintained, and so on. There are already open-source alternatives to AWS that you can install on your own servers and maintain, but that's a very different experience from using the AWS tools on Amazon's servers.

>I think part of maturing is things get more complex and making things more difficult and expensive.

I agree. I think this is a great, succinct description of the underlying "why". It's a moat, to use PG's term for it.

I could make a better Altium. It's just that Altium has a few decades head start on me. XD

Fellow Sr. EE here, currently pivoting into SW. The core problem with our fields (EE/ME) is that our respective tool vendors are set in the old ways of building, shipping, and supporting the tools that are fundamental to our jobs. No open APIs to customize the functionality, modify the GUI, or extend it for automation (wouldn't you love to automate updates to the CIS database every time one of your colleagues adds a new component to it?). I mean, the basic infrastructure is there, but there's just no motivation, as they know their core user base is not really growing and doesn't mind putting up with this crap. Hence all valid open-source efforts like KiCad have been a sparrow fart. No real dent. But the SW industry is driven by innovation that's happening in the FOSS domain. Even the mighty MSFT and AAPL have to a certain extent embraced OSS. It's not in the best interests of Cadence and SolidWorks to do that, because the market for their tools is minute compared to, say, Visual Studio Code, which is free and highly customizable.

> I can only imagine tech companies would be more than happy to lead us to a future where software engineering is similar in this regard to other engineering disciplines. Perhaps this is what happens when a field matures, and since software is maturing, it will become more and more like this as the time goes on.

Actually, software engineering was like that for years in the (distant) past. Most operating systems and compilers (the ones that mattered for businesses) cost significant amounts. Not to mention access to computers.

It is said that Bill Gates' access to shared computers at his private high school [1] gave him an advantage that very few people at that time could have had.

[1] https://www.cnbc.com/2018/05/24/bill-gates-got-what-he-neede...

That's something that's really bothered me about engineers and engineering students. I feel like if you and your team think ahead a little, you can avoid a lot of it. Plus a lot of software is pretty similar (AutoCAD Electrical vs XCircuit, for example).

It's not just about choosing the right tool from the beginning. The open-source options are often not adequate at all.

I'm more comfortable talking about mechanical CAD, and in that space FreeCAD (while amazing for a free tool) doesn't come close to professional tools like Solidworks or Catia. Even for personal 3d printing projects it's really lacking, let alone for professional work.

I'm much more familiar with electrical stuff personally, and my experience there is that a lot of the commercial stuff is overdone without any useful thing to show for it (AutoCAD Electrical vs XCircuit is my favorite example). I use OpenSCAD for drawing 3D stuff because it's more intuitive to me, and I haven't used either FreeCAD or Solidworks for more than a few minutes each. Is it basic drawing stuff that FreeCAD lacks? Or all the other extra features?

My understanding (which may be wrong) is that FreeCAD just does drawing; if you want FEM, you go get some FEM software. That's not FreeCAD being incomplete, that's good software design, kind of like the Visual Studio vs vim/gcc/gdb/make/git/valgrind/cscope debate.

> Or does FreeCAD lack basic drawing features?

Yes, basically.

FreeCAD only does 3d modelling and 2d drawings, that's right. That's not what I "complain" about, this is normal, that's a CAD package. FEA is another software category.

But it's really lacking in modelling compared to professional tools like Solidworks.

“I can never imagine opening my independent consultancy business because how expensive the basic tools are kind of expensive.”

I’m in the same boat as an EE, however some of these tool vendors will cut you a break as a consultant. The same PCB tool I pay $10k/seat at work I bought for $5k at home.

You could say something like, “weird to see Linux admin, instead of OS admin”. But that doesn't make sense in a business context if you want a Linux admin.

The difference is that Linux isn't owned by a single company. It runs on virtually every platform, including all of these hosting providers.

Bingo! There isn't a cloud OS standard and then standards-ish implementations of that standard. All of the clouds are bespoke and closed source. That said, there's nothing weird about needing specific knowledge for running on a specific thing.

If I run a motor pool that is all Priuses, then I'll want Prius mechanics.

At some point, I think I hoped that Kubernetes would be that "cloud OS" but as time passes, I'm not sure that's going to happen and I continue to make sure I stay abreast of the various closed source cloud ecosystems, including AWS which has served me reliably for over a decade.

Windows Admin vs OS Admin then.

So if it's owned by a single company you can't list it as a skill? Huh?

There are a lot of details to learn before you can manage a complex AWS setup. Sure, someone with "cloud system management" skills can ramp up on those details. But if you need someone to be productive right away, you want to hire someone who already has them at his or her fingertips. And in a world where people jump ship every year or two, being productive right away is pretty important.

If you need productivity right away then you don't need an employee, you need a consultant.

Sure, but the consulting firms need employees right away. With specific skills.

That just means they don't know how to read demand trends or are overextended.

There were people who made entire careers being experts in Cisco. And Windows. And Solaris. And RedHat.

Heck, you can still make a career on being an expert in IBM 360.

It's pretty common to go deep on a particular vendor's technology and make a career of that.

"AWS" seems like a much more useful delineation over "cloud system management" than "Java" does over "Programming"

I could be super wrong but I'd expect a Java programmer to become useful on a C# project faster than I'd expect an AWS person to become useful building out other cloud infrastructure.

> I could be super wrong but I'd expect a Java programmer to become useful on a C# project faster than I'd expect an AWS person to become useful building out other cloud infrastructure.

I think that is indeed wrong. The major clouds are all pretty similar in terms of the commodity IaaS services - VMs, disks, networks, load balancers, etc., right up to things like managed Kubernetes clusters (AWS EKS, Google GKE, etc.) The basic correspondence between those various components is obvious, and the underlying infrastructure is the same - you still end up with servers with disks and TCP network access running whatever OS you chose. The differences shrink even further if you use something like Kubernetes, since your interaction with clusters then largely relies on Kubernetes tooling instead of the cloud provider's.

Someone very familiar with AWS should easily be able to find corresponding services in e.g. Google Cloud - I know because I just went through that, having years of AWS experience as a software developer and architect, and now contracting with a company that uses Gcloud. There are differences, but prior knowledge of what to expect tends to make it easy to find what you need.

The only real exception is when you get to proprietary SaaS products, like AWS DynamoDB or Google BigQuery. But these are managed services which require virtually zero maintenance, so it's really the software developers who have to deal with those differences.

The problem with switching from something like Java to C# is that, even aside from the differences in language syntax and semantics, which one can pick up fairly quickly, there's a huge ecosystem of libraries and tools that all change, and the details matter much more - what collection of classes/functions/methods a library has, what their names are, what they do, which ones are better than which other ones, etc.

The time to get up to speed on all that is going to be greater no matter what, just because of the sheer volume of detail involved. Not that a Java programmer couldn't get up to basic speed on C# relatively quickly, but to reach expertise with it and not have to consult docs for many little things, i.e. to get efficient and effective at it, takes much longer. I've done that too - I spent nearly two years at a company that used C#, and can't consider myself an expert at all, unlike with Java.

All the consulting shops I have worked at so far require experience with both the Java and .NET stacks, as most customers use both, and depending on the project setup you might even write Java (or another JVM language) in the morning and some .NET language in the afternoon.

For example, Spring for the backend with a Xamarin/WPF frontend.

It's a big industry. In a long career in consulting and contracting, I've never come across anything like your description.

The skill sets needed for the three big cloud providers are very different. At a high level it seems the same... VMs, VPCs, firewalls, etc., but they are all configured and managed differently on each platform.

You need the experience as an AWS, GCP or Azure expert.

Not sure I buy it entirely. I've found that once you become an expert in a particular vendor's technology - to the point where you have a reasonably complete mental model of how it works, how to solve all kinds of problems with it, and the specific solutions tailored to specific problems - then learning another vendor's specific technology comes at a deep, deep discount.

Tbh if you are an expert in X and I need you for Y and Y is analogous to X and there is time allowed for ramp up, then it's all the same to me.

Then it's disturbing that they're so non-standardized. I personally wouldn't want to spend a bunch of time becoming an expert in a skill that only applies to one company's physical servers.

Why would they standardise? They see their way of doing things as their competitive advantage - they each think their approach is better.

Maybe you don’t understand how much people spend on these services? If you’re spending a million dollars a month on AWS then it pays to have a few people who focus on AWS full time and know it intimately.

I really don’t think you need to let yourself get disturbed by this.

If we can learn anything from history, I wouldn't hold my breath waiting for the standardization. Just look at the situation on the operating system side: different flavors of Unix, VMS, z/OS, Windows. Sure, there are concepts that apply to all of them, but if you want to be an expert you really need to go deep on the vendor-specific things.

Anecdotal: my cousin is a sysadmin and he recently took a certification class for z/OS. He says it was probably the best career-enhancing step he's ever made.


If you keep breaking the site guidelines, we're going to have to ban you.


You know what? Fuck it. Ban me. If you really think the above comment is egregious, fuck you. Dumb motherfucker.

I don't want to ban you! I'd rather try to persuade you to use the site as intended. It's in your interest to do so, because if HN becomes a flamewar wreck, it won't remain a good place to discover new things, or read interesting articles, or whatever else it's good for.

Having spent some time in AWS, it's not weird to me. I get it. You can “learn AWS”.

Strictly talking policies, there is a lot to know there. Gotchas abound. Things that don't transfer at all outside their ecosystem, and I assume vice versa for Azure, etc.

It's the same as calling PostgreSQL a skill. You can know SQL in general, and you can know all Postgres's little ins and outs. It's not the same thing.

Personally I wouldn't make knowing all Postgres's ins and outs the "pillar of my career" but some people do. I don't see how doing that with AWS is any different.

Realistically, skills in one are transferable as long as you're thinking of the components in terms of abstractions. For instance: use a deployment pipeline to deploy your microservice with a frontend API for queries against your scalable database; subscribe to your event source and create your FaaS consumers to push data to your next bounded, isolated component.

This is something you can do with any provider, the names are irrelevant.

Some features are lock in, but this lock in can be seen as tech debt prior to delivery.

Being able to architect cloud systems is just being able to architect distributed systems. Specializing in one isn't a huge problem.

Twas ever thus

For as long as I've been working, job adverts have been asking for vendor-specific skills:

Oracle vs SQL, Visual C vs C, SAP vs ...

It can be risky, and in my experience skills are less important than general technical experience, as it’s easier to teach someone specific skills rather than teach them general aptitude.

Having said that, if you are moving to AWS and have no one with AWS experience in your organisation, then it may make sense to hire someone with relevant experience.

Different clouds operate in different ways - people probably want the skill of knowing how the various parts of AWS best integrate and I’m sure that there is a lot of subtlety and non-obvious things to this. Seems reasonable to me? I don’t think it’s disturbing to seek a specific practical skill that you know you need. Not everyone has to be a master of everything.

I think the biggest problem with hiring an AWS or Azure consultant, is you might not get someone who actually does infrastructure as code (at least this happens a fair bit in London).

I'd rather hire someone who knows terraform and another cloud, than somebody who'll click on the GUI for my preferred cloud, and make excuses around not being able to follow process.

This is insane -- how do these people pass through an interview?

It depends on the person interviewing you.

It's common enough for them not to be technical at all (a scrum master or project manager, say), and it's also common enough for the first person on site (those working for really low remuneration and a lottery ticket) to just not be that good.

This is crazy to me. Any reasonable certification to qualify someone for an AWS specific role will cover IaC in depth. I guess it explains a lot about the number of cloud antipatterns I see on a day to day basis though.

I've seen people with certs who are fairly crazy too, and heard comments like "$foo is good in theory, but it's not really possible in practice".

I'm afraid it's a large part of the market here, but you're more exposed to the different parts of the market when you're contracting and thus moving around on a semi-regular basis.

In my experience there's much more transferability of skills laterally between cloud platforms than there is within a platform itself, i.e. knowing how to design a high-volume transaction platform on Azure or AWS is going to leverage the same kinds of technologies with the same kinds of behaviors and features; the real change is going to be in details that, in my opinion, are quick to figure out.

But arguably, if you are specialized in building internal tooling on AWS, the gap to designing an effective HVT platform on AWS will be much harder to fill. The same goes if your specialization is more on the infra side of AWS, or on application development using PaaS.

Do you find it equally disturbing that Oracle DBA (or Postgres DBA or Mysql DBA) are specific skills (and job positions)?

> I would personally be wary of making "ability to effectively use Amazon's servers" a pillar of my career.

It is SEO for consultants to effectively catch customers/agencies with specific platform needs: AWS Architect, SAP Developer, Tibco Developer, Wordpress Programmer...

It's not new. I remember being turned down if I didn't have specific experience in some flavor of Unix (HPUX, AIX, Unicos, Solaris, etc). Similar for specific databases, cluster software, messaging software, etc.

And, yes, as dumb then as it is now.

No weirder than putting "Java" instead of "Object Oriented Programming"

Yes, it is. I can build anything in Java and run it on any computer owned by anybody. AWS isn't a technology, it's just Amazon's computers and the constraints and interfaces they wrap around them.

One could then argue that Java is owned by Oracle, and could be called Oracle Java, an Oracle technology. I tend to agree with you though, AWS is more of a company's interpretation of the cloud hosting technologies. It's too close for comfort, to be honest. I prefer to self host, but I tend to host in the cloud due to financial constraints.

I don't have any problem with cloud hosting, as long as there's portability. I wouldn't want to design my architecture to be completely specific to one hosting provider.

It's nothing to do with hosting or portability, and everything to do with speaking the language and navigating the interface. You can set up a bare VM, install your own OS, manage everything yourself, and avoid any kind of lock-in, but if you don't know how to navigate AWS, your entire operation collapses with bad IAM policies, world-readable S3 buckets, and VPCs that let the entire world in.
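For example, a world-readable bucket is usually just one over-broad policy away. This is an illustrative fragment (the bucket name is hypothetical) of the kind of bucket-policy statement that exposes every object to anonymous readers:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DangerouslyPublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

Spotting that `"Principal": "*"` on a data bucket is exactly the kind of platform-specific fluency being described here.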

On the other hand, enterprises don't care as much about vendor lock-in or portability as people on HN do. As a consultant working with many Fortune 500 enterprises, "we're an IBM shop" or "we're a Dell shop" or "we're a Windows shop" is more common than you might think. Enterprise companies want relationships, stability, and a business partner. They don't care about anything else. If they're an IBM shop, your default choice is IBM and any exceptions have to be run up through management. If they're a Windows shop, any request for Mac desktops or Linux servers has to be an exception as well. I've seen clients ditch million-dollar products they've used for years just because their digital transformation was AWS-only and that product didn't support running in AWS.

Enterprise companies WANT to be locked to one vendor, as it makes choices easier, makes the relationship easier, and provides stability to their IT operations. Cost and portability doesn't even factor into it. Lock-in is just a part of enterprise IT, has been for decades.

This is the strange dichotomy on HN, which started as a portal for doing business (startups), not FOSS advocacy and getting everything for free.

It feels strange that a community trying to start businesses has so little understanding of how B2B works.

All true. The point is that it used to be that a key piece of infrastructure was based on "open", general tech, and now it's not, and that's regrettable.

I'm not sure enterprise computing was ever based on open/general tech. In the 80s everything ran on closed-source, for-pay Unix systems and big-iron proprietary mainframes. In the 90s that shifted more to Linux, but increasingly companies would buy "appliances", which are pieces of general-purpose hardware combined with specific software. Those appliances were never meant to be opened or toyed with, and the hardware lifecycle was based on the software lifecycle: if the software was discontinued, the hardware was thrown out as well. Appliances are still very popular today because, again, enterprise companies like stability and predictability.

An appliance is the ultimate lock-in and enterprise companies LOVE them.

Not just regrettable in a philosophical sense, but in a real, break-fix, operational-speed, multi-million-dollars-per-night, getting-shit-done-for-real, bottom-line sense. There is a direct difference in observability between the open and closed tech components of a stack. If you have the tech talent to dig into open stacks, break-fix resolution is faster with more granular traceability (and the observability depth is unmatched), your team's understanding of the data and process models is more thorough, you gain a better understanding of where the design limits lie (or at least their directional value), and you have a better overall sense of control over the solution.

That "if you have the tech talent..." qualifier is the big one, though. I see very few companies who recognize how to attract and retain that talent level. IMHO, out of the more than 200 companies I've consulted at or sold deeply enough into in the sales cycle to see enough of how their sausage gets made to form an opinion, probably no more than 1:100 do this. So we end up here, where most companies are content to hire well below that level, buy closed tech stacks, and then play support-tag while users sit unhappy. Plays well politically, but it's a mess for delivery service levels.

As a business you need to decide what your core competency is and play to that. If you're a food distribution company do you really want to hire staff to "dig into open stacks" or do you want to just call the vendor and let them sort it out?

The reason enterprises can't hire that kind of talent is because they don't want to. The reason they can't retain that talent if they do accidentally hire it is because that kind of break-fix resolution

1) doesn't happen that often so the people get bored

2) is frowned upon by management because those people have actual jobs to do


3) is the entire reason they pay a support contract to their vendor

And $50k/yr in support costs is a lot cheaper than hiring the world's best DBA, world's best Linux resource, world's best Java programmer, world's best infosec analyst, world's best architect, world's best SAP resource, the list goes on and on.

The job of enterprise IT isn't to be the best at anything, it's to be stable and predictable.

Having dealt with computers since the mid-80s, I never remember big corporations being big into any kind of "open", beyond standard specifications like X/Open and similar.

Even Linux's adoption was more about not paying UNIX license fees than about caring for any kind of openness.

Not really. Using Java doesn't come with having to have a relationship with a company (beyond the JDK EULA nobody reads or cares about anyway).

In Java I'm a big fan of using SQL directly versus an ORM. To that end I highly recommend jOOQ, a library that allows for object-oriented SQL. It's not an ORM, so you still compose SQL; you just get a lot of objects and methods that allow you to construct your SQL in an OO, and functional, programming style.

It's also a DB-first approach: you construct your DB schemas and tables first, then use jOOQ to generate objects that match your DB schemas. These objects let you reference the tables and fields of your DB in an OO way. From that point of view it may seem like an ORM, but you use SQL to interact with your DB.

Two of the things I really like about jOOQ: first, you stay in tune with SQL, which as we all know is very powerful and popular. Second, your SQL can be constructed in a procedural way based on input, so you can add to it based on, for example, whether a query param was provided.

The creator of jOOQ, Lukas, has done good videos comparing SQL to ORMs. What's cool about those videos is that he doesn't mention jOOQ, except maybe at the end. The videos just compare ORMs to SQL and make the case for SQL being far more powerful.
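To make the "constructed based on input" point concrete, here's a plain-Java sketch of the idea - not the actual jOOQ API, and the table/column names are made up - showing a WHERE clause that grows only from the filters actually supplied:

```java
import java.util.ArrayList;
import java.util.List;

public class DynamicQuery {
    // Hypothetical helper: appends a condition (and its bind value) only when
    // the corresponding filter is present, mirroring how a jOOQ-style DSL
    // lets you compose SQL procedurally.
    static String booksQuery(String author, Integer yearFrom) {
        List<String> conditions = new ArrayList<>();
        if (author != null) {
            conditions.add("author = ?");
        }
        if (yearFrom != null) {
            conditions.add("published_year >= ?");
        }
        String where = conditions.isEmpty()
                ? ""
                : " WHERE " + String.join(" AND ", conditions);
        return "SELECT id, title FROM books" + where;
    }

    public static void main(String[] args) {
        System.out.println(booksQuery(null, null));
        System.out.println(booksQuery("Knuth", 1990));
    }
}
```

With no filters you get the bare SELECT; with both you get both conditions ANDed together - in real jOOQ this composition happens on typed `Condition` objects rather than strings.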

My team (Spring Boot / Angular greenfield app in an enterprise) has a guy that transitioned over from .NET and is totally new to the Java ecosystem. He has 30+ years experience with SQL and can't stand entity frameworks or make them do the best-practice stuff he knows we should be doing, but with JOOQ he's had a lot of success bringing his extensive knowledge base to bear on our project.

I fully understand him.

Dapper or ADO.NET, and with my Java hat on, myBatis.

With jOOQ you still don't write SQL, you write the DSL and still spend a lot of time looking up docs for how to write the query in the DSL, when all you really want is to write the SQL query without the ugliness of JDBC. To this end I highly recommend JDBI.

If the Java proposal for multi-line strings ever gets done it will make writing SQL queries a lot nicer.

I've had people tell me they learned a ton about SQL constructs they were unaware of, because they discovered them just by auto-completing on the jOOQ API.

If you know SQL, you won't spend much time looking up how the jOOQ API thinks about it, and if something is not possible, you can always use jOOQ's templating feature: https://www.jooq.org/doc/latest/manual/sql-building/plain-sq...

In any case, I always recommend people build a ton of view libraries directly in their database, and then query them with jOOQ when they need dynamic queries on top...

Please note the DSL is a one-to-one mapping to SQL. So it's still SQL; you just have to identify the correct method that maps to your SQL element.

Soooorta. It can actually translate between SQL dialects, so it's more like a 1:10 mapping.

I like that line of thought. Given that we have a parser too, it's more like a 27:27 mapping: https://www.jooq.org/translate

You're here!

I asked you a bunch of questions and even filed a few issues a few months back.

You were incredibly patient and helpful as I tried to sort out the performance characteristics of a few bulk-insert approaches.

I want to publicly thank you again for all your help. You've set a uniquely high bar with all your work on and around jOOQ.

Thanks for your nice words, I really appreciate it! :)

> If the Java proposal for multi-line strings ever gets done it will make writing SQL queries a lot nicer.

Groovy's multiline strings are really nice that way. Groovy essentially bundles its own version of JDBI as well (but better because it does automatic variable interpolation).

Never heard of JDBI before; will check it out. It's true there is a lot of DSL look-up. It decreases over time, but then ramps back up as you do more complicated SQL... lol.

It is already accepted for Java 14.
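For the curious: a text block (previewed in Java 13/14, finalized in Java 15) lets an embedded query read the way you'd type it into a SQL console. A quick sketch against a hypothetical `book` table:

```java
public class TextBlockSql {
    public static void main(String[] args) {
        // Classic Java: the query is chopped up by escapes and concatenation.
        String old = "SELECT author, COUNT(*) AS books\n"
                   + "FROM book\n"
                   + "GROUP BY author";
        // Text block: the same string, written the way you'd write it in a console.
        String sql = """
                SELECT author, COUNT(*) AS books
                FROM book
                GROUP BY author""";
        System.out.println(old.equals(sql)); // true
    }
}
```

Incidental leading whitespace is stripped automatically, so the block can stay indented with the surrounding code.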

Other alternative worth mentioning is myBatis.

Not a 100% compatible substitute, but Spark can be even more OO and easier to manage.

I've tried several times to switch the URL to https://www.hiringlab.org/2019/11/19/today's-top-tech-skills..., which is the source that this one is copying from, but our software insists on escaping the apostrophe to %27 and hiringlab.org won't unescape it, resulting in a 404. I don't think I've seen that before.

Edit: never mind, taking out the apostrophe works! URL changed from https://spectrum.ieee.org/view-from-the-valley/at-work/tech-..., which points to this.

The escape rules vary in small maddening ways and trying to figure out if your implementation is doing the right thing from specs will make you give up forum moderating and join the merchant marine, in the best case.

Fortunately, that's the sort of thing Google can afford so it's safest to just check what their RFC-compliant utility libraries do:


The apostrophe is not escaped in URI path segments.
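For illustration, the mismatch is easy to reproduce with Java's stdlib: `java.net.URLEncoder` implements `application/x-www-form-urlencoded` rules, which do escape the apostrophe, while RFC 3986 lists `'` among the sub-delims allowed verbatim in a path segment:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ApostropheEscaping {
    public static void main(String[] args) {
        // Form encoding escapes the apostrophe to %27...
        String encoded = URLEncoder.encode("today's-top-tech-skills", StandardCharsets.UTF_8);
        System.out.println(encoded); // today%27s-top-tech-skills
        // ...so a server that compares path segments literally, without
        // percent-decoding first, will 404 on the escaped form.
    }
}
```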

Thanks; I've added non-apostrophe-escaping to our list.

I've always been frustrated by the promotion of this kind of list. Yes, it matters what the demand for a given skill is but that knowledge is virtually meaningless without also having information about the supply side!

If 1,000,000 companies desire talent $foo which 2,000,000 workers have, it's not going to lead to as many opportunities or as much leverage as talent $bar that 300,000 companies desire but only 50,000 workers have.

There's also a question of skill liquidity: if there's a very low supply and a low demand - even though the demand might be multiples of the supply - it puts everyone in a tough place. The employee doesn't have the leverage they might hope for because finding a compatible employer is very hard.

Yeah, this is a huge issue. Even in cases where there's high demand on all sides, the difficulty of evaluation, the difficulty of firing, etc. lead to a lot of poor outcomes for employees and employers both.

I believe that for many years, that kind of relative supply/demand comparison typically determined that COBOL was the skill to have :)

I've been on projects with COBOL (yay!:), and there's always a complete dearth of skilled people...

That would suggest that a better metric would be to look at the supply demand imbalance and also look at the growth rates for future demand.

For anyone looking at this list it’s probably the places where demand is growing fastest that you want to target.

From those you could split things into stuff that’s mainstream and upcoming.

Mainstream would be Python, AWS and JavaScript as per this list.

The fastest growing areas as per this list are Machine Learning, Azure and Docker.

A common pattern here is that the in demand skills cluster around front end + full stack (including ops) or data engineering + ML

> ...and also look at the growth rates for future demand.

I would say that looking at the growth rate of future demand is a poor way to go about it unless you're also looking at the growth rate of future supply, for the same reasons mentioned in my original comment.

Actually, they won't even test you on Skill X, if they are hiring for Skill X. They'll just make you take tests on Leetcode.

You are just better off skipping all this, and focusing on Leetcode interviews instead.

You might be able to determine desirability by aggregating how long jobs are posted for or how many times a posting is extended. That would equate to some correlation.

I got back into Java recently and am appreciating how you can declare a reference with an interface type and then swap in different data-structure implementations: e.g. make a Queue refer to either a PriorityQueue or a LinkedList, or make a Map a HashMap or a TreeMap. It provides an "aha!" moment, a deeper understanding of why there are so many data structures and why it actually matters to pick one over another even when they both implement the same interface. With Python/JS, yes, those data structures are still there, but they are mostly hidden for your convenience (which is great sometimes, when you just want to pump out an idea). The accidental joy of Java (with its verbosity) is that by pushing the data-structure choice to the point where the reference is declared, it makes me actually think about the choice and what it means.
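A minimal sketch of that swap, using only stdlib types (toy values for illustration):

```java
import java.util.LinkedList;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.TreeMap;

public class InterfaceRefs {
    public static void main(String[] args) {
        // Same Queue interface, different behavior per implementation.
        Queue<Integer> fifo = new LinkedList<>();    // insertion order
        Queue<Integer> heap = new PriorityQueue<>(); // smallest-first order
        for (int n : new int[]{3, 1, 2}) {
            fifo.add(n);
            heap.add(n);
        }
        System.out.println(fifo.poll()); // 3 (first element inserted)
        System.out.println(heap.poll()); // 1 (natural ordering)

        // Same Map interface: TreeMap guarantees sorted keys, HashMap doesn't.
        Map<String, Integer> sorted = new TreeMap<>(Map.of("b", 2, "a", 1));
        System.out.println(sorted.keySet()); // [a, b]
    }
}
```

Only the constructor call changes; every call site written against Queue or Map keeps working.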

If you're looking to go lower (systems) level, or even if you're not, rust is worth checking out then too. It's the best experience I've had letting you require only the interface that does what you need, and it's also great at letting you make a type "from" something. e.g. you could require something be a HashMap, or merely that it can be converted `Into` a HashMap (a free action if it's already a HashMap).

Yep, sum and linear types are two of the biggest improvements of Rust over, say, C++. The third would probably be that it's a lot smaller than C++.

That's my contender for both biggest pro, and biggest con, of Java. Because where it breaks down is where the implementations are not clear, the tradeoffs are not clear, and the interfaces are not abstract enough to switch between (for instance, this useful library gives you a Future, and this other one requires a CompletableFuture).

Java is still undisputed in the enterprise. Node.js was a contender but lost steam. Maybe .NET, once it becomes truly platform-neutral, has a chance to take on Java. Until then, Java will most probably remain widely popular.

Is it really your experience that Node was ever a contender? In what context or application? Also, the enterprise is almost all Windows on the client, and Windows still has lots of traction on the server; I don't see why .NET needs platform neutrality to take a big piece of this.

I find the numbers reported here pretty flawed; ".net" is down while "C#" is up about the same amount. I realize this data is based on keywords, so maybe this just means clueless HR posts are finally stopping the practice of saying "must have .net, C# and ASP.NET"

Stating that "SQL" is the top skill is also pretty pointless; I've rarely met a developer who will say "I don't know SQL", because most think this means "can write a simple SELECT statement", which a lot of the time is enough.

Yes, it was. That's why IBM, Microsoft, etc. rushed to adopt or sponsor Node.js. IBM acquired StrongLoop, a Node.js-based enterprise framework. Microsoft extended their Chakra engine to support Node.js, but the fact that it's still a work in progress shows it lost steam.

Enterprises do not work with a brilliant language runtime alone; they need large ecosystem support from large vendors (read: enterprise tools) and a large available workforce, both of which Node.js lacks, especially compared to Java. It's unfortunate, as TypeScript has now elevated Node.js into a mature platform and should have made stronger inroads into enterprises.

I figured that was mostly due to legacy code at this point; is it undisputed for new code too?

Yes, there is a lot of new code being written for the JVM. E.g. Kotlin is a fine choice as a language for a new backend project these days with excellent support in frameworks like Spring Boot. Also it is a drop in solution in pretty much any Java project (as in you can mix the two and they interoperate without much fuss). Modern Spring Boot is pretty nice and getting nicer with each release. In terms of feature set it pretty much runs circles around anything in the javascript ecosystem. And with a language like Kotlin it's not a bad programming experience either.

As a language Kotlin is sort of similar to typescript with a few nice features added, a bit more elegant syntax in some places, a bit richer in some places, and of course none of the madness that comes from javascript compatibility. I've seen typescript developers pick up Kotlin and liking it. Other languages it is similar to are C# and Swift. Swift is so far not used a lot for backend stuff but I could see that change. All of these can already compile to web assembly and the tooling for that is likely going to mature in the next few years to the point where you can target browsers, node.js, and native with a wide variety of languages.

I was specifically talking about Java, not the JVM in general. The latter is obviously still a great technology and it hosts some of the best up and coming languages.

You cannot meaningfully differentiate between Java and the JVM, because all popular JVM languages (Scala, Clojure, Kotlin, Jython) can interact with Java libraries. In fact, they even use this to market themselves.

Java libraries are the way to program for the JVM and all its (popular) languages.

I've worked for a large national ISP where they were heavily invested in creating a new series of network engineering tools (device discovery, adding new ARs with ease, upgrading existing devices, programmatically changing throughput between regions based on load, etc.).

Most of the backends for these types of tools would explicitly be done in Java. Why did they choose Java? Mostly because they would staff entire teams with H-1Bs and dump them after 5 years. The directors of these projects would only hear about the "buzzwords" surrounding the latest tech if they themselves went to conferences, or they happened to luck out if the project managers they hired had varied experience.

Oddly enough, there's a lot of greenfield work being done using Scala at Verizon and Comcast. But from my direct experience, it's entirely dependent on the team. The more the team doesn't rely on contractors the more likely they are to use niche tech.

Second that. Even the much-advertised LinkedIn employs far more Java engineers than Node.js ones.

If your systems are Java, you'll hire people who can work in Java, and if the only thing your devs have in common is experience with Java, your new code will be in Java too.

Sad but true, for the obvious business reasons.

I was pleased (somewhat) 20 years ago when Java displaced C++, but now I just wish it would fade away.

Thanks for the garbage collector, Java, now retire.

This said as someone who is convinced that most of what I want to do could be done with FP vs OOP techniques, and FAR less of it.

Except that .NET is the only actual alternative regarding ecosystem libraries, IDE capabilities, and tooling for monitoring production systems.

There are few competitors that have strong, static typing, great test tooling and broad library support.

I guess "great test tooling" and "broad library support" are subjective, but there are several contenders these days that I would say might fit those criteria

>>there are several contenders these days that I would say might fit those criteria

Even if you get the basic 'Batteries Included' part right, you still won't match Java's ubiquity in terms of being able to hire developers (in big numbers), its common knowledge base, or the maintainability of its code bases. Java code bases tend to be around for a long time. And people of all skill levels can get started with minimal handholding.

Also note that not all companies have the revenue streams of the top internet companies. They can't afford to flush hundreds of millions of dollars and man-years of personnel time down the sewer to arrive at hiring the perfect candidate for the job. Almost always, companies need candidates who can get the job done for whatever salary they can offer. And they can't afford to rewrite their code every two years, so they care deeply about things like easy hiring and code maintainability over the long run.

Apart from these things Java itself is a great piece of tech, and has passed the test of time over so many technology trends.

If you are starting a backend project, Java is more or less the best and the top tech choice at this point, and has been for long now.

> "great test tooling" and "broad library support" are subjective

Not really the case when it comes to Java (and the JVM in general). It objectively has great test tooling, and a very diverse library ecosystem, not to mention monitoring, introspection, management, tunability, etc. are second to none. The only other ecosystem that arguably comes close is .NET.

The point is the words "great" and "broad" are relative.

I can only speak from experience, but I know two large corporate entities that used Java and have over the years dropped it in favor of Go and websites.

Edit: I like how my first-hand experience is disagreeable and should be hidden. This site gets STRANGE with facts that users just don’t want to see.

You'd have more success with the vocal minority around here if you replaced Go with Rust.

Highly recommend doing that experiment a week from now. You'll see exactly what I'm talking about.

Same for this comment: https://news.ycombinator.com/item?id=21621738

I’m sure you’re right.

The issue is I can’t change my first-hand anecdote: the two companies I’ve specifically been at switched to Go and web apps from their older Java applications.

Basically fad driven development.

"Comments should get more thoughtful and substantive, not less, as a topic gets more divisive."


Same for another comment of yours:

"Yep. golang is one big appeal to authority fallacy at work." - https://news.ycombinator.com/item?id=21618006

Java is hot again.

>Node.js was a contender but lost steam. May be .net once it becomes truly platform neutral have a chance to take on Java.

C# and the CLR lost that chance a few years ago; GraalVM is taking over this space really rapidly now.

I'm in opposition to most comments here hating Java for whatever reason and hoping JS or C# will destabilize this system. I absolutely love Java and its ecosystem. Super mature libraries, a library for everything, great tools, literally for everything. A lot of tutorials, massive investment from several corporations contributing to OpenJDK. Java is super hot right now.

As someone who loves kotlin on android, I have been searching for companies experimenting with kotlin jvm or kotlin node backends and have been surprised to see that companies are not making the jump. Swift is remarkably similar to kotlin. With a kotlin backend you have the benefits of the JVM or node ecosystem and a beloved language that a bunch of your frontend devs will understand.

Yes I also note this lack of uptake of Kotlin in Java environments but on reflection it's not surprising because organisations which have chosen Java tend to be conservative. Enterprise software is predominantly Java and .Net and probably will be for a long time to come.

To me, Node never was a contender. The lack of types is painful in JS. Then, out of the box, Node.js just dies on error by default. Without tons of packages and tooling around it, it is a horrible application server.

I was hoping Kotlin might challenge Java outside Android but it seems unlikely to make a dent in Java's dominance of enterprise computing.

Personally, as a JVM dev, I see no reason to choose Kotlin over Java unless I'm stuck in an environment that's running a JVM < 8.

For years people have been using stuff like Lombok as a half-baked Java dialect with more usable defaults: non-nullable types, immutable objects with getters and builders with named setters, sneaky throws, meaningful #equals and #toString. Kotlin just concisely supports what we’re already doing. And for everyone who’s been waiting for Java fibers, well, Kotlin coroutines are out of beta.

I do like Scala, I wish we worked in an industry that could expect everyone to master it, but it’s probably too much to ask.

> Node.js was a contender but lost steam

My guess is that Node.js either saturated the market of web developers looking to do backend, or people realized that type safety is a good thing, and JS doesn't have a good story for it.

It does: TypeScript. And I'd argue it's a better type system than Java's.

But it's an extra step in the build process, you still have to decide on TypeScript, Flow, CoffeeScript, etc., and I'm not sure if the Node runtime library gets the benefits of typing.

Depends on the enterprise.

I pretty much keep using Java and .NET alongside each other, because all the customers have mixed systems.

I don't really understand the desire for skills to be in high demand.

Obviously, you want your skills to be in demand in some capacity.

But putting so much emphasis on whether some tech skill is a winner on a scoreboard can give people the wrong impression about other languages and technologies that appear to either be unpopular or dying. It's reassuring that Java and SQL are still in high demand after all these years, but newbies looking at these charts might make the assumption that they might as well not bother learning other things like Elixir, PHP, Rust, Ruby, etc., because they're not that popular, even though those jobs exist, will continue to exist, and pay well.

Also, it's disturbing that Scum is the 13th most demanded "skill".

Library depth and growth are side effects of a language ecosystem's popularity.

It doesn’t mean that all libraries are good ones, but remember the open-source rallying cry “many eyes make all bugs shallow”? Yeah, that tends to be true.

Also, in-demand technologies tend to have a larger talent-pool to choose from. Not everyone is good, but you have a much better time finding the right skills and cultural fit if the talent pool is larger.

A great counterexample is Elixir... a technology with a lot of advocacy but almost no demand. Slow library growth, almost no programmers with production experience, and almost no market demand... which is a self-perpetuating cycle. Choosing a technology like that for your stack is like buying an obsolete car with impossible-to-find parts.

This sounds horrible but maybe the people who would take these lists seriously are precisely the kinds of people who are careerist programmers. And they really just want the maximum value from their “investment” in learning programming.

In college I didn’t learn Java at all because it didn’t seem very fun. Python did, so I learned that. Then Go came along, which was easy to pick up. But once you know a general-purpose language well, it doesn’t take long to learn another one (maybe you won't write the most performant code right away, but that’s what code reviews by peers are for).

> This sounds horrible but maybe the people who would take these lists seriously are precisely the kinds of people who are careerist programmers.

While I agree with the sentiment, I think there's a flipside to this. Certain communities cough cough ahem have the luxury of not caring at all about skills in broad demand, hopping from technology to technology because the market allows it.

I've seen a lot of "expert beginnerism" from that group of people, and a little bit of disdain for the people who do schlep work day in and day out to make a living. In my experience, a lot of the people who work with the boring technology have more seniority and depth because they're not relearning the window dressing that is syntax/framework/language constantly.

If anything, I think these reports (from Indeed... real data, not some meaningless top ten) just indicate that employers don't really care about whatever's hot right now, and that a surprising amount of the economy is powered by unglamorous tech.

Is there anything inherently wrong with that? They know where the butter for their bread comes from, make it work for them, and reap the rewards.

As far as individual career choices go, seems like a winning move.

This is a good question. I think I don’t want to work with careerists. Programming is a kind of labor that seems to have different properties than other kinds of labor, which can be quantified effectively and used to make accurate predictions. The sheer complexity of the field makes promising things a fool’s errand. The Agile manifesto is an acknowledgment of this unpredictability, and the approach it takes is to quantify only the immediate next goals instead of predicting months ahead.

Careerists tend to like predictable slices of work which they can accomplish during work hours and then go home and do whatever. Unfortunately, most high-impact projects require a bit more involvement: not giving up all free time, but being more flexible; a commitment to do the right thing (it’s very easy to cut corners and ignore best practices) while still delivering on time.

Careerists tend to prioritize their quantum of work and how to get it done during work hours rather than focusing on getting it done well. They have to be coached into best practices. They need to be asked to meet deadlines, or to push back on deadlines they feel are inadequate. For these reasons, I personally do not like working with careerists.

Ok, well what if you had picked up VBasic and Pascal instead of Python and Go? I would say your choices were particularly convenient.

Is that a freudian typo?

>> Scum

I don't know if it was an intentional typo, but the `pun` might actually apply to many projects, unfortunately, I must say.

Because price (aka salary/fee) is a function of supply and demand, and it grows the more the skill is in demand. As simple as that.

Let’s say you want to start a career as a freelancer, for example: what skill do you put forward? You may say "but jobs requiring Elixir are probably more interesting than SQL/Java". I agree, but only if there’s a position open at the time you’re looking, in the area you’re looking, and with other qualifications matching.

It isn't that simple. Most "Java + SQL" jobs are run-of-the-mill cost-center-type jobs at companies that aren't tech companies. Their salaries, benefits, and locales are generally poor relative to other software engineering jobs. The demand is high, and so is the supply of labor, but that isn't the whole picture.

I would agree with that picture. In other comments on this thread, I've indicated Java & SQL are the #1 demands on any project I've ever worked on.

At the same time though, none of those people have earned anything remotely close to what I tend to see on HN in general, and the SF area in particular. From my limited experience, it feels like a good, experienced, hardened, senior Java developer in Toronto, Canada (not a tiny/cheap place), makes about half what a hot-startup-technology intermediate makes in SF :-\

Tech "hubs" like the Bay Area are rare: most software developers earn far less, and our salaries shouldn't be taken as typical.

The vast majority of software developers will never see six figures (adjusted for inflation) unless they move to management, just like most of their white-collar peers.


Software engineer and senior software engineer are 100k+, among many others.

The average wealth of ten penniless homeless people and Bill Gates is in the billions.

Canada is particularly bleak in terms of SDE salary though. My offers in South Africa were higher than average salaries in Vancouver.

Yep. I live in the Sacramento area. Half the salary. Half the price for a house. Half the commute.

Seems an even trade.

I thought that, too, before I had the income. Being able to max out my 401k and IRA contribution on 5% of my salary is a huge deal, and more than makes up for the difference in housing cost. I also live 5 minutes from work. People overestimate how much the COL difference really is and underestimate the significant savings advantages that come with living here.

I think you underestimate the COL difference because you are settling for inferior housing. Do you at least have a low-end house, such as 2000 square feet on a 0.25 acre lot that is all your own? Around here in the Space Coast area, that goes for $150,000. It can be done with a 5-minute commute.

Software developers get paid better than typical, so they can get fancier housing if they don't have expensive hobbies. See if you can afford the things my coworkers have gotten:

1. a house with canal access to a lake

2. a new McMansion (looked like 3000 to 6000 square feet)

3. 11 acres, building the house as desired (he got sheep and chickens for fun)

4. beach condo

5. a house with a boat dock on a large waterway leading to the ocean

I'm going to guess that at least 4 of those are impossible for you. The cost of 11 acres is particularly entertaining. I went for something a bit more modest, about 3500 square feet on 0.39 acres, but it is just 0.9 mile from work and I'm supporting a family of 14 on a single income.

We live in a 1500 square foot townhouse. It's smaller than the 2200 square foot beast we left behind near the Space Coast, in fact. But my cash salary is 3x what I was earning there, the benefits are better, the equity is a nice bonus on top of that, and the CoL is just about 2x. I save about what my net take-home used to be. I'd be saving more, but one of my kids need therapy that is quite costly and not covered by insurance.

So while I can't afford that stuff now, out here, I'm going to be in a much better position in retirement than most of your friends. And I'll be able to retire sooner.

There is no "2200 square foot beast". That is just a median home. It's a bit cramped, but adequate. Right now you could get 4400 square feet on a 1 acre lot for $400,000 in Melbourne or Palm Bay.

Your CoL difference is only 2x because you are settling for inferior housing. People can do that. You have. Let's not pretend it is good housing, however. To get that early retirement, you forgo decades of living in decent housing. The CoL difference is well over 10x with equivalent housing.

That housing deficiency can have an everlasting impact on your family. People in costly and cramped spaces have smaller families. You might not be aware of the effect it has on you, simply thinking that you didn't want that many kids. In retirement, it will be too late to have more kids.

He trolled you good :-)

I have always found it quite amusing that every company I have ever applied to has SQL in its required skills, and yet not a single one of them has ever interviewed me on it.

It's because they have a database on the back end and you need to be able to write "SELECT * FROM [foo]" to get data.

I'd actually be really pissed if a company asked me to write a query like "count these orders by month in a single row" not because I couldn't do it but because I'm confident I'd screw up the syntax a couple of times.

I think they should ask more conceptual questions around set theory or storage structure, especially since they all want "SQL, SQL Server, Oracle, Postgres, etc." Nobody can stay on top of all the platform specifics, but someone who knows their stuff can answer the underlying problems.

I always get the same single question: “explain the difference between an inner join and an outer join”. They seem to figure if I can answer that, I know everything I need to know.
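For what it's worth, that stock question boils down to one difference, sketched here over made-up toy data with plain Java collections rather than a database:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class Joins {
    public static void main(String[] args) {
        // Toy "tables": customers and orders, keyed by customer id.
        Map<Integer, String> customers = new TreeMap<>(Map.of(1, "Ada", 2, "Grace"));
        Map<Integer, String> orders = Map.of(1, "Book");

        // INNER JOIN: a row appears only when both sides match on the key.
        List<String> inner = new ArrayList<>();
        for (var c : customers.entrySet())
            if (orders.containsKey(c.getKey()))
                inner.add(c.getValue() + " -> " + orders.get(c.getKey()));

        // LEFT OUTER JOIN: every left row appears; a missing right side becomes NULL.
        List<String> leftOuter = new ArrayList<>();
        for (var c : customers.entrySet())
            leftOuter.add(c.getValue() + " -> " + orders.getOrDefault(c.getKey(), "NULL"));

        System.out.println(inner);     // [Ada -> Book]
        System.out.println(leftOuter); // [Ada -> Book, Grace -> NULL]
    }
}
```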

I've had the same experience. Furthermore, I haven't worked at a company who actually uses SQL since the early 2000s. Since then, every company has used some kind of ORM.

Python's growth is staggering and I'm happy to see that it's challenging Java. There are certainly some use cases where Java has some big advantages but there are so many use cases where Python can get the job done with much less time and code. Plus I personally enjoy writing python more than Java.

I'm also curious where Go stacks up in this. In Stack Overflow's 2019 Survey [https://insights.stackoverflow.com/survey/2019#technology] Go and Ruby were neck and neck at 8-9%. With Kubernetes and microservices getting so popular I'd expect Go's popularity to grow and maybe pass Ruby.

Seems to me that Java is also still one of the most supplied tech skills. So this does not necessarily make it a market you want to be in. Good SQL skills don't tend to be that common though, but I think a lot of the demand is more for CRUD-level operations than for complex analytics or performance-sensitive stuff.

Most people don't know Java as well as they think they do.

Do they need that though? Will they be paid for that expert Java knowledge?

What kind of question is this? Expert engineers will be paid for that knowledge in any language. Tons of complex, high performance, and high scale applications are written in Java. Half of FAANG runs on Java. Contrary to what HN thinks the world doesn't only run on python, go, and rust.

> Half of FAANG runs on Java.

This is interesting and indeed not what I expected. Care to elaborate? I’m curious which components end up being built in Java.

In the last two or so years I was surprised to see how many people don't _really_ know SQL. Sure, they can write a simple insert and update query and maybe a join, assuming they can just ignore that there actually are different types of joins. But if you ask them about other _still fundamental_ knowledge, like a _rough_ idea of what the different transaction isolation levels are/mean/imply for you, they fail hard.

Realizing this allowed me to understand why:

1. There are so many widely used ORM(ish) libraries, which in practice make it harder to access the database for anything but trivial queries.

2. NoSQL-ish databases seem to have become so successful even in applications where none of their "benefits" (wrt. scalability and similar) matter, and where you would normally prefer to avoid their drawbacks (e.g. eventual consistency, no "system-wide-ish" transactions, no enforcement of correct schemas or of some parts of data consistency). (Sure, there are other reasons for their success, too, like being fancy and modern, or no clear requirements analysis and therefore no idea about scalability requirements...)

I mean, if SQL and relational databases are for you just structs with a few more basic types than JSON, but flat, requiring annoying foreign-key references and a bunch of ceremony around them with very little added benefit, then yes, it makes much more sense to just use a NoSQL database and be done with it.

(PS: Yes, I'm aware that for certain use cases the resulting databases can be very complex and hard to use. What I mean is that SQL is a skill you have to learn to use efficiently, and a scarily large number of people I came in contact with in recent years not only don't have that skill but are not aware that they will have to learn it if they don't want to mess up the larger databases they work on.)
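For reference, the four SQL-standard isolation levels as JDBC names them, weakest to strongest (the commented-out call at the end shows how you'd pick one on a real connection instead of trusting the driver's default):

```java
import java.sql.Connection;

public class IsolationLevels {
    public static void main(String[] args) {
        // Each stronger level rules out one more read anomaly.
        int[] levels = {
            Connection.TRANSACTION_READ_UNCOMMITTED, // dirty reads possible
            Connection.TRANSACTION_READ_COMMITTED,   // no dirty reads
            Connection.TRANSACTION_REPEATABLE_READ,  // also no non-repeatable reads
            Connection.TRANSACTION_SERIALIZABLE      // also no phantom reads
        };
        for (int level : levels) {
            System.out.println(level);
        }
        // On a real connection:
        // conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
    }
}
```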

What would you say is the special thing about this particular bit of knowledge that made you point it out? To be perfectly honest, it seems quite arbitrary to me.

I could say I'm surprised to see how many people don't know anything about the browser rendering pipeline and consequently produce badly performing interfaces. Someone else would complain about people's inability to do even the most basic Linux administration tasks.

But at the end of the day any single bit of knowledge doesn't say much about your general ability to be a developer. The term "web development" covers an extremely wide array of topics. Luckily, there are many wonderful pieces of software out there that abstract things away and allow you to produce a finished product even if you aren't an expert in everything involved.

It's not arbitrary because screwing up your data integrity is far and away one of the most painful and business critical mistakes you can make as an engineer. Your product isn't finished if it looks like it works, but is either storing data improperly at rest or doing the wrong thing with it.

I think the point to be made was that if you screw up your browser rendering to the point of it being unusable on old devices, that could be just as business-critical as the data integrity issue.

Making either of those your line in the sand for what a developer should and shouldn’t know is arbitrary. I don’t know a ‘Scala’ full-stack developer that could do advanced optimisation on React render times, or a ‘React’ developer that knows transaction isolation.

Not knowing either of these does not make you an unworthy developer and unless you have a specific need for these skills it probably does more harm than good expecting everyone to know them.

The difference is that if you screw up your browser rendering, you're not messing up your data integrity at rest and creating compound technical and potential business debt. If anything in the critical path from pricing to payments that talks to your data store makes a mistake, you can risk anything from not recording your transactions properly to charging the wrong amount or not enough at all.

It's very expensive to acquire customers. It gets even more expensive when you have dirty data from incomplete domain modeling.

Again, you just pick data as the "one" core business-critical technology. I guess you are an SQL expert? There are dozens of critical points of failure: losing your HA gateway during an expensive ad campaign, designing your network/data center wrong so that your SQL cluster goes down, and so on. Even JavaScript that hides the "checkout" feature. I didn't even go into security (think your domain gets hijacked, or XSS injection on your site, etc.).

I never said there was only "one" core part of business critical technical competency. What I'd say is that there are several of them in high priority that have an out-sized effect on long term project success. One of them (security) probably has a similarly outsize effect, but everything else you're describing is operations. It doesn't have an effect on how you design your software, and whether it exhibits correct behavior.

At the most basic conceptual level, the business of software rests on reading inputs correctly, processing them, and writing the right outputs with strong guarantees. SQL is historically the most sophisticated, powerful, common and cross platform way to do that in a structured manner that is easy to reason about for a vast majority of use cases. You can layer an application on top of the foundation of an appropriate data model, but the appropriate data model is a root requirement to even get to a passable prototype. That is why gradual mastery of it yields such outsize payoffs compared to mastering other skills.

This argument is entirely arbitrary.

What about security? Your business won't succeed if you're hacked daily.

What about monitoring? If you're losing data and don't even know it, you're creating compound technical and potential business debt.

What about performance? Once you have a perfect data model, no amount of trying to optimize it will improve your business further. Optimizing other parts of the application can.

Basically everything is important, saying that just because someone isn't an expert in one phase that they're not as good of a developer is myopic.

I get the sense that you and other commentators are putting a certain binary in place, and I'm not sure it's in good faith. Nobody ever said that you have to pick one particular place to be an expert in, and if you're not an expert there, you're not a good engineer. That's an absurd argument, but it's also not what is being said.

The truth is, you need to be good enough at all of the high priority parts of technical competency. Security, monitoring, performance -- sure, all of those are also extremely important. They're all places where unforced errors can be introduced that can and do hurt the business -- sometimes catastrophically or fatally.

With that said, to not recognize the evergreen utility of domain design skills is ignorant. You will never get the luxury to worry about security, monitoring, or performance if you don't build the state machine that makes the right outputs out of the right inputs because you will either never sell it to a customer, or lose that customer when they choose a competing solution which actually does what it's supposed to do.

It's not true that everything is equally important. I think that's a myopic way to look at software development without considering the business impact of key crucial areas where software design and maintenance intersect with stakeholders.

> Your business won't succeed if you're hacked daily.

Which can be the consequence of setting the wrong isolation level: http://www.bailis.org/papers/acidrain-sigmod2017.pdf

Even if you are using an RDBMS it doesn’t help to protect you against many business rule violations. Your data may be relationally correct, but that’s about it.

We're having this conversation as if everyone making SQL queries is one fat finger away from irrevocable data loss. I know very little about SQL but have managed to build literally dozens of systems on top of Postgres without ever causing a data loss event as a result of my poor SQL skills...

I think we are seeing the difference between software development and software engineering laid bare.

The distinction is similar to a property developer and a civil engineer. Both create buildings, but one does it at scale by offloading functions to known entities and prepackaged solutions, while the other understands one domain in depth.

Both are needed in any team or organization, because not every solution needs to be "engineered" (a Dockerized Redis instance without SSL or auth behind a corporate firewall may survive untouched for a decade), but sometimes you have to engineer something that withstands gale-force winds at 1000 ft height.

Again, arbitrary. You could also argue a JavaScript bug on your site preventing a shop checkout is the worst, because you lose cash by the minute.

While a front end bug that gets in the way of sales is bad for a business, a poorly designed or buggy data layer can cause issues throughout the rest of the application. It can lead to things like lack of isolation between users, loss of data, data inconsistency, and scalability issues.

You have got to be kidding. Corrupting data for a sale already made is far worse than losing a sale because of an outage.

> Corrupting data for a sale already made is far worse than losing a sale because of an outage.

I mean, this is still arbitrary. There are millions of small businesses worldwide that do fine with terrible records, but making on-going sales.

If you don't make any sales because of constant outages, you could argue that's worse than trying to deal with corrupted data, especially if you have log files or some other method to recover corrupted data anyways.

You're getting to the crux of the issue now. If you don't have log files to recover corrupted data, or rather never saved it in the first place, you CANNOT recover it. If you risk irreparably corrupting old data and new data, what is the value of your software system in the first place? Why not use Google Forms and Google Spreadsheets, data entry analysts and a call center, and be done with the whole thing anyway?

My point is that the very root of what gives software value is something deeply industrial -- it is a machine that can do something over and over correctly, reliably and without regressions. If you take away that reliability to be depended upon by a business, you are no longer building an industrial strength solution -- you're building a toy.

Not when you’re corrupting only a handful of sales, but losing all of them. It still feels arbitrary to say one is more important for developers to be aware of than the other here.

As long as you have both covered to a reasonable degree within your company/team it’s a waste of time arguing who’s got the most important info in their brains.

One is more important because presumably one is harder and thus more expensive to recover from. Typically the deeper the rot in the foundation, the more expensive the repair.

Data corruption might also be impossible to recover from. You might only notice it after months of it happening, e.g. if your users spend months collecting data that is only analysed at the end of the year.

You are potentially corrupting ALL of your data, not just a handful. Sure, in the two scenarios you chose to provide (speaking of arbitrary), the JS error is clearly worse. If you take the worst case scenario in both cases, one is much more likely to cause an existential threat to your business.

...that's assuming there's some sort of meaningful liability in corrupting the data, either legal or from PR backlash, but sometimes it feels like many companies have discovered that's not really the case.

I think you're proving exactly my point.

SQL is intrinsically very close to the metal of your business logic, especially when it comes to general business logic. General business logic is typically most critical and risky around the surface of payments. A JavaScript bug on your site around your shop preventing checkouts is very bad because you are indeed losing cash by the minute. A poor domain model that doesn't properly associate active promotions to cart order lines could result in an incorrect charge amount and an irate customer.

>> many people don't know anything about the browser rendering pipeline

If you can recommend a good source for learning the browser rendering pipeline, please post it here. Everyone I have asked just shrugs it off and says that no one really knows.

I'd be interested in other resources for this too. The best I can think of is Udacity's Web Performance Optimization class[0]. It's by Ilya Grigorik, chair of the W3C Web Performance working group.

Other than that there are some older docs for the Blink engine which drives Chromium. It's much more low level and hard to follow though[1][2].

The Navigation Timing spec is good for building an understanding of the major events that go into a page loading and creating HTML elements[3]. It's not the whole picture but gives the timings for navigation and DOM element events.

[0] https://www.udacity.com/course/website-performance-optimizat...

[1] https://www.chromium.org/developers/the-rendering-critical-p...

[2] https://docs.google.com/document/d/1wYNK2q_8vQuhVSWyUHZMVPGE...

[3] https://www.w3.org/TR/navigation-timing/#processing-model

This one is quite good. I posted it to HN 6 years or so ago but it didn’t get upvotes:


How much of it is relevant 8 years later?

Nearly all of it

This series is also quite good, a bit more modern:


> I could say I'm surprised to see how many people don't know anything about the browser rendering pipeline and consequently produce badly performing interfaces.

I mean, if you're supposed to be a senior front-end dev then that wouldn't be an unreasonable expectation. Meanwhile there's plenty of senior back-end developers out there working on SQL-backed applications who don't know SQL and refuse to learn it.

by this reasoning all knowledge is arbitrary so this comment provides no value beyond arbitrarily shitting on someone

many people don't know anything about the browser rendering pipeline and consequently produce badly performing interfaces

No one cares about that because the website will be redone in the hottest new framework every few months anyway. Whereas serious databases are around for decades.

My go to interview question to test basic SQL knowledge is use of “HAVING”.

It’s a surprisingly good filter.

EDIT: To clarify since this got more attention than I expected...I don’t disqualify any candidate based on the answer to a single question. I just use that question to assess their actual SQL experience. If you’ve spent any amount of time writing raw queries by hand for reports or just to pass through to a web service, you’re going to have run into HAVING. It’s a simple part of the basic SELECT syntax without needing to involve joins, subqueries, index optimizations or specific knowledge of database internals.

It’s a fair question and the only way you wouldn’t have run into it is from working purely with ORMs or NoSQL. There’s a lot you can do without it, but it goes a long way in determining my ability to ask you to open up a query analyzer in different environments. It’s also going to tell me a lot about the way you think about data problems and where you are going to gravitate for certain types of logic.

Eh, HAVING is something that can be learned in 5 minutes, so filtering on that is probably generating a lot of false negatives. I quiz them on indexes and table design. Those can be done without writing a single SQL query.

I look up HAVING every time I write it.

I just don't use it enough to keep it in memory as a first class concept. A quick google search with an example is enough to jog my memory.

If I got asked about it on an interview, I'd be completely blank.


"I use [x] all the time and know it well, so [x] will be a good filter for evaluating someone's technical skills."

No. People do different work tasks depending on the project or company.

Sometimes I'm knee-deep in SQL. Others in React. I'm often forced to do PHP. My familiarity with specific SQL syntax ebbs & flows with what I'm doing.

There's no point in keeping something in your brain if you're not going to use it every day. Just remember the general concept and Google the specifics when you need it.

Just one example why interviewing is terrible and interviewers really should be trained on what not to do.

Exactly the same with me.

Need me to write an ETL? I probably can't crank it out in an hour like a data scientist, but I can get it done in a reasonable amount of time. I just need to do some major context shifting.

Have me write only ETLs for a week, and I'll be cranking them out after the first few.

I'm curious now. What's there to look up? It's just like WHERE, but you use it on aggregate values.

WHERE filters the data before you aggregate it, HAVING afterwards.

That's all there is to it.

Simply remembering that fact is language trivia that is easy to refresh when needed.
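For concreteness, here is a minimal sketch of that before/after distinction, using Python's built-in sqlite3 and a made-up orders table (the table and values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 50), ("alice", 200), ("bob", 30), ("bob", 40), ("carol", 500)],
)

# WHERE filters rows BEFORE aggregation: only orders over 35 are counted.
# HAVING filters groups AFTER aggregation: keep customers with 2+ such orders.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n, SUM(amount) AS total
    FROM orders
    WHERE amount > 35
    GROUP BY customer
    HAVING COUNT(*) >= 2
    """
).fetchall()
print(rows)  # only alice has two orders over 35
```

Bob has two orders but only one over 35, so the WHERE clause removes him before the HAVING clause ever sees his group.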

It’s like asking if Intel x86 is big-endian or little-endian. It’s trivial to check, but if I haven’t had to use that fact in a while, I likely won’t bother to remember it because it’s trivial to search for.

My use of SQL beyond the very basic is sporadic (I do remember HAVING, though). If you asked me out of the blue, I would most likely fail your test. From time to time, however, I have to design the database for a particular product, and in a matter of a few days my knowledge goes from basic to very good. Only to fade again shortly after the database design, implementation and testing completes.

That does not mean I don't know the basic concepts of how different databases work, including the underlying storage engines etc. To the point that for one product I personally designed an actual NoSQL (EAV, to be precise) database engine, along with a query engine largely resembling SQL.

So would you fail me in your test?

Exactly. You don't use it, you lose it. That doesn't mean you don't have the capacity to reacquire it in a short period should you need it again. Most humans are not encyclopedias.

I can describe for you in detail the different index types on MySQL and Postgres, and go into when/why you would use each. But it's been so long since I crafted a complex join by hand I would definitely have to google it. However I don't doubt that I could write the query in a short time. One of the most realistic interviews I ever had, they gave me a laptop and allowed me to google during my answer. I took the job.

I like to structure my questions such that they could use HAVING or a CTE or subquery, and not test the piece of knowledge directly, but the application of knowledge.

Like in Python: if you asked lots of questions with a list of numbers ... and then generalized it to infinite lists of numbers. In the first case I'd like to see list comprehensions, and then generator expressions or possibly the use of itertools.

That tends to be how it leads in. We’ll talk through building a simple CRUD application with a single table and then ask questions about how to get specific data for somebody in marketing.

There are a lot of ways to answer it, but HAVING is the simplest.

My experience with interviewers is that they all believe that their pet question is "a surprisingly good filter". Considering that there is rarely a data-driven approach to recruiting, usually this statement has a heavy confirmation bias.

Usually when you add data to the mix, you'll find that most questions are not significantly correlated with candidate success/failure once you condition on "ability to write any code at all".

Completely agree. Early in my career I had what I thought was "a surprisingly good filter" question I would ask candidates. For a couple years I did this. We hired one guy who had failed my question and then fired him a couple months later, which heavily reinforced my belief in the value of this question.

A couple years later I found the stack of resumes, some we hired, some we didn't. I realized that some of our best engineers had failed my question, and I had subsequently voted "no" (I was outvoted, fortunately, by my fellow panelists. Another reason I strongly advocate panel interviews but that's a separate discussion).

I soon realized that my question was good at filtering out people who didn't think like me, not at filtering out people who would make good engineers.

My takeaway was that everyone, especially me, could use a healthy dose of humility and self-skepticism. Not advocating swinging the pendulum into Impostor Syndrome (which I now struggle with sometimes), but somewhere in the middle is good and healthy IMHO.

That is my experience too. I had people swear by "multiply two numbers without using a multiplication sign" as their one and only coding challenge.

I knew a guy, a really good engineer, who used to ask people about a specific thing you would see in a log file from a particular open source server. The SQL "having" question isn't quite as bad, but neither question is good at assessing one's ability to solve problems.

Is the pun of saying knowing “HAVING” is a good filter intended?

It is now.

I can barely get candidates to join two tables.

I’ve hand written thousands of sql queries, including a year spent writing nothing but PL/SQL stored procedures. I’d fail your test, not because I’ve never used it but because I’ve not had to use it in years and, importantly, I rely on Google to remind me of syntax trivia when moving back to a language I haven’t actively worked in in a while.

I HATE trivia questions and I’m convinced they are a sign of a lazy interviewer. If you want to test someone’s knowledge of sql, give them a sample schema, an internet connected computer and five minutes to write an appropriate sql query for your use case.

It’s not a trivia question. It’s directly related to GROUP BY and is a core part of standard SELECT syntax.

I’ve never met somebody who has spent any amount of time writing raw SQL who didn’t know it.

I’ve met a lot who overly rely on their ORM but “have done a lot of SQL” who don’t.

Brightball, how many years of coding/sql/developer experience do you have?

Going on 20

How much time do you think the (+)average programmer, say with 10 years of experience, needs to learn _basic_ SQL, along with the HAVING clause, assuming they have never learned SQL during their entire working career? (We are talking RDBMS concepts from scratch: primary keys, normalization etc., and finally SQL.)

(+) average= meaning the programmer should be able to say 'solve' the FizzBuzz under comfortable conditions. Or say add the first n items in an integer array etc...

Largely depends why you ask? I get the impression that most people have taken this to be a pass fail type question in an interview and that’s where the contention is coming from.

My interviews are conversations to get to know you, your background, your professional interests, how you think about problems and how closely your resume lines up with those conversations. I like to get people talking about their work to see where their energy level goes.

When we start talking through a hypothetical data problem there are people who will describe the problem from the UX perspective, the app code perspective and the database perspective. The question prompts that portion of the conversation.

>I get the impression that most people have taken this to be a pass fail type question in an interview and that’s where the contention is coming from

Precisely. And why not? I've had fellow developers do exactly that (ask trivia and label the interviewee incompetent if he cannot answer it), and most people who have reacted to your post have probably seen that too.

Anyway humor me and tell me how much time it would take?

Learning the concepts? A few hours

Using the concepts in how you naturally think about problems?

That will only come from experience. I can't say exactly how much but I'd imagine something in the realm of 6 months minimum of applied usage to different problems.

I went to a job interview recently where the interviewer was trying to phrase the question properly: "I'm trying to figure out how to ask this next one, so if you have a lot of Customers, and... umm..."

And I interjected during a long pause and said "It sounds like you're trying to ask a question which could be answered by using a HAVING clause. So something like, only those Customers who have two or more orders, which would be 'GROUP BY Customers HAVING count(Orders) >= 2'."

I was right, he was looking for that. Still didn't get the job.

How would you know it’s a good filter? What position are you interviewing for?

It's about as good a filter as asking a candidate what a "gyascutus" is.

ROFL. You've made my day.


Some months ago I sat in a meeting to discuss design and implementation details for a medicine delivery system. Since it was about medicine, I thought it would be obvious to everyone that we would have to use a relational database with ACID properties. The last thing we want is to deliver medicine at best effort, so I assumed everyone would be on the same page about the need for transactions, foreign keys and all the other data integrity mechanisms baked into relational databases. I was taken aback when the app developers really could not understand why we needed a relational database; a document database was going to be fine. Needless to say, it was a long meeting and I wasn't invited back.
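For what it's worth, the two mechanisms in question can be sketched in a few lines against SQLite (the table names here are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this opt-in
conn.executescript("""
CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE deliveries (
    id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patients(id),
    drug TEXT NOT NULL
);
INSERT INTO patients VALUES (1, 'Ann');
""")

# A foreign key rejects a delivery for a patient that does not exist.
try:
    conn.execute("INSERT INTO deliveries (patient_id, drug) VALUES (999, 'aspirin')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

# A transaction keeps multi-step changes all-or-nothing.
try:
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO deliveries (patient_id, drug) VALUES (1, 'aspirin')")
        raise RuntimeError("simulated failure mid-transaction")
except RuntimeError:
    pass

count = conn.execute("SELECT COUNT(*) FROM deliveries").fetchone()[0]
print(count)  # the partial insert was rolled back
```

Replicating either guarantee by hand on top of a document store is exactly the avoidable work being discussed.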

Blech. The worst thing I've noticed after moving to document databases, even when scale does matter, is the loss of the relational data model. A nice ERD that shows the relationships between your entities makes everything so simple. You could have one for a document database as well, but people stare at you strangely. So now it's just a random collection of collections of stuff, and it takes 10x more words to describe your design.

This is a really good read: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never...

It shows what happens when you have exactly the misunderstandings being discussed here. Also note that the author still has absolutely no idea how to do a simple ER diagram.

My favourite line:

> Once we figured out that we had accidentally chosen a cache for our database, what did we do about it?

I'm going to use that one to win imaginary arguments in my head with my pet NoSQL strawman.

So they went with a non-relational database?

I wasn't invited back so I don't know. The back story is there had been some initial discussions before the meeting I attended. I was invited because my area is data analytics so the project owner wanted to keep me in the loop. It appears I upset the apple cart with my opinions. I was polite but firm. I don't regret it because I spend my life cleaning up messy data from systems not using core relational principles to get analytics out. Very few systems use foreign keys.

There are lots of ways to interact with folks who don't understand something that get your point across without making them feel bad. I'm guessing you didn't employ any of those strategies.

The truth is probably somewhere in between. I was firm but polite. I was called in because I was going to be analyzing the data: building a data warehouse and importing other data. I had to be firm because I know how much time it takes to clean up data before you can analyse it.

I never learnt SQL "properly" until I took one of the first MOOC courses, Introduction to Databases by Jennifer Widom. Really good for getting a better understanding. Also, for a fantastic source on database transactions and isolation levels, check out "Designing Data-Intensive Applications" (chapter 7) by Martin Kleppmann. Really a great book!

I have written about both these here:



Regarding your 2nd point: commercial-grade NoSQL DBs can do all of these things now. I think the notion that they cannot is from the v1-2 era of Mongo about 7 years ago. You should read the Dynamo white paper and subsequent documentation on write consistency, transactions etc., and also look at the feature set available in Mongo 4.2.

Commercial grade databases have done all these things since the 1980's. We just hadn't invented the name "NoSQL" for them, because we hadn't yet realized we were supposed to hate SQL. We called them Object Databases, because it was the decade when objects were cool. It also gave us C++ and Objective-C and Object Pascal. Basically we were just gluing objects onto everything we could.

I've used one. It was great. They (mostly) failed because of business and politics and a whole lot of other reasons unrelated to the technology. I'm sure by now everybody here knows that systems don't succeed or fail based purely on the technical qualities of their design and implementation.

We all still get a good laugh at the "MongoDB is webscale" video.

There are still serious concerns in terms of speed for anything involving analytic applications.

I agree. Use the right abstraction and the right number of them: I’ve yet to find an ORM that didn’t kick you in the face in terms of performance while only holding your hand with queries.

SQL queries can be done in code in a sane way to avoid SQL injection, or you could implement stored procedures and functions to make operations more ORM-like without paying any of the ORM tax.
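A minimal sketch of the parameterized-query approach, using Python's sqlite3 and a made-up users table, contrasted with naive string concatenation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 1), ('bob', 0)")

hostile = "alice' OR '1'='1"  # classic injection payload

# Placeholders send the value separately from the SQL text, so the
# payload is treated as a literal string, not as SQL.
rows = conn.execute("SELECT name FROM users WHERE name = ?", (hostile,)).fetchall()
print(rows)  # no user is literally named that, so no rows

# String concatenation, by contrast, lets the payload rewrite the query
# and match every user.
unsafe = conn.execute("SELECT name FROM users WHERE name = '" + hostile + "'").fetchall()
print(unsafe)
```

The placeholder syntax varies by driver (`?`, `%s`, `:name`), but every serious database library provides one.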

It’s just annoying to see the DB treated so flippantly by devs who are Hussein Bolt quick in grabbing an ORM when it’s not always the right tool or even necessary. They’re great for prototyping but not for production imo

I think HugSQL[1] (Clojure library) has really opened my eyes to what DB interaction should be like. It’s not exactly an ORM, but it protects you from SQL injection and makes it convenient to leverage the capabilities of your SQL database.

Also, his name is Usain Bolt[2].

[1] https://github.com/layerware/hugsql

[2] https://en.m.wikipedia.org/wiki/Usain_Bolt

Sure, I am not saying crafting queries explicitly and preventing SQL injection in code is elegant, but it can be done, and you can write super-thin abstractions over the boring bits and still maintain a ton of power over your queries and know exactly what is going on, without having to add another dependency.

The one huge win for ORMs in my mind is standing up and iterating on schemas: defining a schema and then summoning a database, in Alembic vernacular, is really neat, whereas otherwise one would have to manually manage a set of init scripts and migrations oneself.

In the end there's no really clean way to do DB maintenance; it's just work and has to be done, imo.

I've found MyBatis[1] to be a better alternative to Hibernate. At least with the version I was using, you had to do complex mapping in XML instead of annotations, which isn't great, but I'll make that trade-off to avoid the ORM. Not sure if this has improved in 3.5.

[1] https://github.com/mybatis/mybatis-3

> which make it in practice harder to access the database for anything but trivial queries.

I did my time. I understand normalization. I've got a vague understanding of the various normal forms (except 6NF which nobody seems to understand ;-)

I can use a query profiler and I've occasionally read up on the different index types and their performance profiles.

But I still prefer to use an ORM. SQL syntax just never fitted my brain.

And the ORM I use can handle aggregation, annotation and a bunch of stuff that you couldn't really describe as "trivial queries"

(Actually - just read up my Codd again - I think it's 5NF that always had me stumped)

I’ve used both relational DBs and NoSQL professionally. While NoSQL does shine in super-specific use cases, it’s a giant mistake to use it as a generic, general-purpose data store like a traditional SQL database. You end up doing a lot more avoidable work in the long run, especially if you’re doing reporting, i.e. something SQL can easily handle; you end up having to replicate it in your own code base.

Why is making a materialized view for reporting hard? Shouldn't components be aware of what they need to enable analytics?

I don't see how adding logic to a service at runtime is any different than fiddling with queries and optimizing them

With SQL, the logic of filtering and joining is already mostly done for you. You’re basically reimplementing this logic if you’re not using it. In most cases it isn’t difficult, but it’s still time-consuming and more prone to bugs than using SQL.

I don’t see the point of reinventing wheels unless you have to, unless you have lots of time and energy to waste
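A small illustration of the point, with a made-up sales table: the GROUP BY version is one declarative statement, while the hand-rolled version reimplements the same grouping in application code:

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("west", 5), ("east", 7), ("west", 20)])

# One declarative statement: the database does the grouping and summing.
by_sql = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))

# Reimplementing the aggregation yourself: more code, more room for bugs,
# and all the rows have to be shipped to the application first.
by_hand = defaultdict(float)
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    by_hand[region] += amount

print(by_sql)  # {'east': 17.0, 'west': 25.0}
assert by_sql == dict(by_hand)
```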

This is a side effect of the extremism that happens in our industry. A decade and a half ago there was a hard push on ORMs, to the point that people would refuse to work on a code base that didn’t use one. So a lot of developers simply “grew up” not learning SQL (properly).

We can easily enumerate dozens of other technologies that developers hard-pushed on, only to realize years later that they'd swung too far.

You really don’t need to know transaction isolation to do the vast majority of work on a typical web app.

It can seem that way, but the consequence of not knowing them is lots of subtle race conditions. Granted, most webapps probably don't have enough users to trigger them too often, and the impact is often limited e.g. because the kinds of inconsistencies you can introduce don't actually matter that much in many cases.

For the majority of webapps, I'm not sure the default isolation level of MySQL and Postgres is an appropriate tradeoff, since I'd rather have correctness by default than performance by default.

The vast majority of work on web apps is probably writing CRUD controller methods, proxies to upstream services, or React components. It’s like saying CSS transitions are vital; they aren’t, but probably affect the user experience for the average Joe just as much if not more than a phantom read.

At that level of depth you just hand it over to the CSS champ or the DB champ.

>At that level of depth you just hand it over to the CSS champ or the DB champ.

Developers are blaming other developers for not knowing everything they do because businesses forced them to learn everything (Fullstack), but that's actually a bad arrangement and somehow the solution is to just punish the developers more rather than hiring DBAs.

> >At that level of depth you just hand it over to the CSS champ or the DB champ.

> Developers are blaming other developers for not knowing everything they do because businesses forced them to learn everything (Fullstack), but that's actually a bad arrangement and somehow the solution is to just punish the developers more rather than hiring DBAs.

If the organization views developers as cogs or something like a software assembly line, the desire for interchangeable, unspecialized workers seems like a natural consequence.

Sure, as long as the developer knows what they don't know. Race conditions usually don't fall into that category (and neither do potential UX improvements actually).

The advice I've received is to just set the isolation level to serializable and stick with that until you get performance problems, which you likely won't. If you stick to that advice, I doubt you're going to get "lots of subtle race conditions".

That is good advice that is unfortunately not widely followed in most of the apps I've seen. There's also the caveat that you should be prepared to retry transactions, but for small apps it's probably fine if you don't.

> Granted, most webapps probably don't have enough users to trigger them too often

It isn't purely a matter of number of users. You can have a billion users, but if your access pattern is embarrassingly isolated you are still probably okay. (And sharding is going to work great to boot.)

True, though even a single user triggering two actions quickly can cause issues if unlucky.

I wouldn't say so. Concurrency bugs, like lost updates, can be very subtle and hard to debug, and common databases don't prevent these kinds of bugs by default.
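A lost update is easy to demonstrate even single-threaded; this SQLite sketch interleaves two read-modify-write "requests" by hand (table and column names are made up for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE counters (id INTEGER PRIMARY KEY, n INTEGER)")
db.execute("INSERT INTO counters VALUES (1, 0)")

# Lost update: two interleaved "requests" both read n=0, then both write n=1.
a = db.execute("SELECT n FROM counters WHERE id = 1").fetchone()[0]
b = db.execute("SELECT n FROM counters WHERE id = 1").fetchone()[0]
db.execute("UPDATE counters SET n = ? WHERE id = 1", (a + 1,))
db.execute("UPDATE counters SET n = ? WHERE id = 1", (b + 1,))
after_naive = db.execute("SELECT n FROM counters WHERE id = 1").fetchone()[0]  # 1, not 2

# Atomic increment: the database performs the read-modify-write itself.
db.execute("UPDATE counters SET n = n + 1 WHERE id = 1")
db.execute("UPDATE counters SET n = n + 1 WHERE id = 1")
after_atomic = db.execute("SELECT n FROM counters WHERE id = 1").fetchone()[0]  # 3
```

Nothing crashes and no error is raised in the naive version, which is exactly why these bugs stay subtle in production.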

If someone on my team was writing a web app and spending time meddling around with transactions outside of whatever abstraction layer we were using on top of the DB without an incredibly good reason, we'd definitely be having a coaching moment, and if it continued, I'd likely help them find a role with another org where they might be a better fit.

Any quality database abstraction provides fairly simple means to perform transactions.

>If someone on my team was writing a web app and spending time meddling around with transactions outside of whatever abstraction layer we were using on top of the DB without an incredibly good reason, we'd definitely be having a coaching moment, and if it continued, I'd likely help them find a role with another org where they might be a better fit.

And most people with knowledge of DBs should run, not walk, away from a team that operates like that...

Because if you are writing a fairly standard web app, you don't need this. There isn't the scale to make it important. Software is a very large field, with a wide variety of use cases. Obviously an exchange trading platform is going to have much more focus on these DB issues than a SaaS web app for business that expects at most 5,000-10,000 users at a time. With very few of those users modifying the same records.

>Because if you are writing a fairly standard web app, you don't need this.

Does the app handle user accounts? Payments? Data editing? Multiple logins into the same account? Then it should handle races and transactions properly...

That said, it's true that they can be rare, and might be less worth it (engineering or time wise) than just letting it go.

For an app with a few thousand users, or for internal use, the odds might be small.

But a fairly standard web app might serve 1M, or 10M or 100s of millions of people, and they deserve better.

> deserve better

So less features and slower performance to account for the hypothetical case where someone tries to edit their user account from 50 devices at once over and over again?

> But a fairly standard web app might serve 1M, or 10M or 100s of millions of people

You are talking about less than 0.5% of revenue generating SaaS apps.

Transactions are hard. You can't just abstract them away. Those different isolation levels are there for a reason.

Just slap READ UNCOMMITTED on everything to get your speed back


I'd argue you should be aware that transactions and isolation levels exist, like many things, but having to recite them in an interview situation is a different ballgame.

Otherwise, you end up in the situation where every second person has a different set of pet skills and you can't hire anyone because they need to know all the skills perfectly.

But it depends on what you're doing and your market. I've worked with people who took a pay cut to go work for FAANG after several interviews they'd studied for, but I've decided that's not really needed when you're just writing a little in-house CRUD webapp.

With SQL, I started out using an ORM at my first internship and I've been using an ORM ever since. Over time, I've grown to dislike ORMs for anything more than the trivial (a simple SELECT statement with a handful of WHERE clauses, an insert, or a delete). Most ORMs try to generalize the behavior between various RDBMS (Postgres, MySQL, SQLite, Oracle, MS SQL, etc.) and end up being exceptionally leaky abstractions.

When combined with tools that create a schema from code (which are in and of themselves quite leaky), the resulting schema is frequently atrocious and lacking in both safety and speed.

Could you describe what you mean by “leaky” in your experience?

For ORMs, the ORM is theoretically supposed to hide all details of the underlying RDBMS from the developer and only expose a set of common APIs. In actuality, you have to be acutely aware of the subtle differences between the various RDBMS and how they interact with the ORM. Generally the most frequently used functions are fine. But the less frequently used functions (which tend to be some of the most useful functions) may be dramatically different in behavior depending on the RDBMS [0].

For tools to generate a schema from code, the common data types are simple since the tooling is usually good enough to handle those common cases. But once you get into the non-trivial types (notably numerical types with custom scale and precision and timestamps), things go haywire very quickly. Another example is that all of the migration tools I've ever used have required the developer to define the model separately from the actual migration code which inevitably causes drift in the schema/migration code at some point.

Anyone that tries to define non-trivial relationships between tables/entities using non-SQL code is usually doomed to strange relationships in the RDBMS that'll make it arduous/near-impossible to ever use a different migration tool/ORM. I've seen some horrifying schema designs due to a near religious dependency on ORMs/automatic migration tools (foreign key references to the same table, join tables lacking foreign key constraints, columns intended to be foreign keys lacking a foreign key constraint, tables lacking in primary keys, etc.).

[0]: Notice how deferrable constraints and comments on columns are only available on certain RDBMS: https://sequelize.org/v5/manual/models-definition.html
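Upsert syntax is another concrete example of the divergence described above: SQLite and Postgres use ON CONFLICT, MySQL uses ON DUPLICATE KEY UPDATE, and MS SQL wants MERGE, so an ORM either exposes the difference or hides part of the feature. A small SQLite sketch (table and column names are made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

# ON CONFLICT upsert: shared by SQLite (3.24+) and Postgres,
# but not by MySQL or MS SQL, which each have their own syntax.
upsert = "INSERT INTO kv (k, v) VALUES (?, ?) ON CONFLICT(k) DO UPDATE SET v = excluded.v"
db.execute(upsert, ("color", "red"))
db.execute(upsert, ("color", "blue"))  # second call updates instead of failing

row = db.execute("SELECT v FROM kv WHERE k = 'color'").fetchone()
```

Sequelize's `upsert()` papers over this, but the generated SQL (and its edge-case semantics) still differ per dialect.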

What would you recommend to pass the level where you can use SQL to get the data you need and move on to a more advanced level? Any particular book/course?

Itzik Ben-Gan has excellent books, and SQL for Smarties by Celko is great as well.

Tons of replies to your comment already, but for point #1, this was absolutely the case for me. ActiveRecord is incredibly, incredibly good for pretty much all basic use cases. However, it comes with drawbacks like you said. It took me too long to realize I was behind on SQL knowledge needed to get to the next level.

To the point where I'm working today to fix slowness in a client's app where it relies way too heavily on the ORM and plain Ruby code instead of going more in depth with SQL.

Knowing SQL in greater depth is the number one skill I'd tell backend devs to learn.

Also, I'm mainly a python guy, and if we're talking about ORMs, I will say that SQLAlchemy doesn't live up to ActiveRecord, to the point where I built my own db connection code for queries in some of the other flask / general python apps I'm working on.

There is nothing about NoSQL databases that forces you to have eventual consistency or lack transactions. As far as enforcing the correct schema, if you are using a statically typed language, it can enforce a "schema". For instance with C#, LINQ and the standard Mongo driver, you work with strongly typed MongoCollection&lt;T&gt; and perform LINQ queries with IMongoQueryable&lt;T&gt;.

Amazon’s DynamoDB for instance supports transactions and optionally consistent reads. With Mongo you can set your WriteConcern to force strong consistency. Mongo also supports transactions.

A simpler explanation is that some people will just "learn" (via googling) the basic SQL they need to get something done. This is not uncommon across all domains in our industry.

> But if you ask them about other _still fundamental_ knowledge like a _rough_ idea about what the different transaction isolation level are/mean/imply for you, they fail hard.

More a property of concurrency than of SQL, I believe; your point applies for programming languages too.

It's possible to get by as a programmer with no real understanding of how concurrency works in the relevant languages.

If concurrency bugs were always obvious, doubtless things would be different.

My company is in the midst of a modest hiring blitz and it is really eye-opening how few applicants are actually writing SQL in the coding exercise as opposed to using a highly-opinionated ORM. More damning - in my opinion - is that they mostly have blinders to how much time they are spending working around ORM limitations on a day-to-day basis.

Not everyone is a DBA?

You mean that only DBAs need to know SQL?

That's because a different database may have different transaction isolation than another one. Fairly complex SQL queries cannot be used between different major databases.

I am one of those who don't care about SQL; I assume it will fail and in general avoid transactions as much as possible.

TBF, SQL queries are something you _don't_ want to have in your code stack. They should be abstracted away in some kind of DAO, and no one should be worrying about SQL queries when they're developing new features.

Fascinating statement which I feel is probably agreed upon by many. I can see how that can be true; but once again, I feel, it demonstrates that HN has a specific subset of "developers"/"techies"/"enthusiasts".

In any "Enterprise" environment I've ever been in - core back-office enterprise resource planning apps or similar - SQL is right there front and center. "Should" it be? I don't know, I'm not a theoretician; but all developers around me for 20 years have had near-DBA level of SQL skills. Their core skills really are a) understanding business requirements and b) understanding SQL. For any complex business requirement, we find that, time-wise, code takes 1-10% of a transaction and database activity takes 90%. So the skill of optimizing your code's database activity vastly outweighs the skill of optimizing the code itself. As much as possible, we push everything to the database - queries, looping, even business logic if we can - because it's a robust, mature, optimized product and we don't have to invent optimizations from scratch...

Edit: If it may prove illuminating, even team leads & management on any of the 30+ projects I've been on, understands runstats, index reorg, transaction isolation, database maintenance basics, and a few basic optimizer dos and don'ts. They just wouldn't be able to survive without that knowledge - it's 90%+ of both good stuff we develop, and bad stuff that goes Bang in the night (e.g. a process which worked for 10 minutes for 2 years but suddenly decided to asymptotically explode and not finish before heat-death of the universe :P)

I think the phenomenon you're describing is exactly what led to the rise of no-sql dbs.

SQL allows you to do too many things, a lot of which could happen in application code instead. From a design point of view, that's not a good thing: you want a data store to be just what it is - a data store; all joining and such can happen somewhere else - just make sure it's in the same transaction.

I see many of the problems stemming from an undesirable design of the tables - not splitting them when you can, having too many foreign keys, and generally being an interconnected mess. Having dealt with NoSQL stores, I can attest that it is much better in that it forces you to design things in an atomic way, preventing from the beginning many of the potential problems you might have down the line.

>>From a design point of view, that's not a good thing, you want a data store to do just what it is - a data store, all joining and such can happen somewhere else - just make sure it's in the same transaction.

See, that's exactly why I indicated I'm not a theoretician and do not feel comfortable making such a sweeping "Should" statement (much as I would've made an opposite one:).

Immediately my question is "Why?" and "What data do you have to back it up?". I understand model-view-controller etc is a design pattern, but are we certain that it's always the right one?

From my trenches perspective: the relational RDBMS has been around for nigh on half a century. It's an INCREDIBLY mature, optimized, understood (by experts), common, standardized (for all the individual RDBMS differences), safe technology. I have team members who have been doing hard-core SQL for 20+ years, and a market full of similar, serious, hardened experts. And it'll survive changes in the higher-up stacks - in my own meager career of ~20 years, the relational RDBMS has been the most stable part of the stack.

I can get a 3-5 year 'expert' on a particular programming stack, or a 20-year expert on SQL, who has seen things and will safeguard customer's data and business priorities with their life. Again, a very personal experience, but the dozens of clients I've been with, have broadly similar priorities, objectives and concerns on database level; and occasionally vastly different ones on layers above. It's a brilliant unifying common denominator.

Without a fun discussion over a drink and whiteboard, and it could be my inability to see forest for the trees, but I'm just not convinced that "all joining and such should happen somewhere else" :-\

[note, I didn't downvote your comment - I don't necessarily share the same conclusions, but I think it contributes to discussion :-]

I think your point is perfectly valid; I put "should" when I might have said "I believe it should". But nevertheless I stand by my point and am not quite convinced, both of us bringing anecdotes from our own experience. It is always good to hear different opinions, though.

I strongly disagree. For many backends/purposes, SQL is really the thing doing all the real work and the backend's just some fancy authentication and validation layer.

Stick close to the technology doing the heavy lifting.

If people abstract away SQL too much they just start reimplementing things in buggy, underperforming code, using 1,000 lines of Go or Java or whatever to do what should be done in 20 lines of SQL.

Also, don't get me started on people thinking they can abstract away the database for testing. The automated test suite should always include the real database you are working against, with all queries executed. I will never write a backend again without testing against the real database (whether SQL or NoSQL).

Some people justify abstracting away the database saying "one needs to be able to switch to an arbitrary storage engine". Well... storing things is probably the main purpose of your backend. I sometimes turn that around and say, "I want to write things in SQL so that I can easily change what language the backend is written in".

Testing against the real database? Won't keeping a copy of the database just for that be expensive?

Also, should devs really be seeing potentially sensitive data?

On a side note, "I use SQL so I can change the backend language" is the hottest take I've heard in some time heh.

With database I meant an instance of whatever database engine you are using, not the actual data.

My current project uses mssql. Each test run probably spins up and destroys 100 databases (inside one mssql docker container that is spun up for each test run). Each test function populates a DB from scratch (using the same sql migration scripts that we have used in prod), runs the test, drops the database.

We can do that quite a few times within the minute it takes to run the full suite.

Point is to actually execute the SQL (or whatever NoSQL you use) as part of the test.
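That per-test pattern can be sketched in miniature. Here SQLite in-memory stands in for the real engine (in the parent's setup it would be a per-run MSSQL container), and the migration list and table names are invented for illustration:

```python
import sqlite3

# Hypothetical migration scripts; in a real project these would be the
# same files applied in production, run in order.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)",
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id))",
]

def fresh_db():
    """Build a throwaway database from scratch for one test."""
    db = sqlite3.connect(":memory:")
    for migration in MIGRATIONS:
        db.execute(migration)
    return db

def test_unique_email():
    db = fresh_db()
    db.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    try:
        db.execute("INSERT INTO users (email) VALUES ('a@example.com')")
        duplicate_allowed = True
    except sqlite3.IntegrityError:
        duplicate_allowed = False
    assert not duplicate_allowed  # the real constraint, actually executed
```

The point is the shape: each test gets its own database built from the production migrations, so the constraint behavior you test is the one you ship.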

And that's a problem - my opinion remains that individual SQL queries are better left for analysts and data engineers to craft so they can discover whatever correlation they might be interested in. But for a full-scale app it shouldn't be in the forefront, at least not in the same package as other logic.

The entire idea of programming is that you abstract away things and then abstract away the thing you're currently writing. I agree that it has gone too far in some places, but abstracting SQL away is one of the first and best abstractions you can make.

Analysts and data engineers these days mostly work on separate data warehouses. Do they get involved in the OLTP work in your case?

I was talking about using SQL specifically for the OLTP workload, not analysis. If a query that is necessary for a backend response in some REST API ends up taking 10 minutes, but could have taken 100ms if the backend developer just knew what they were doing over in SQL land... the backend developer will probably waste a lot of time doing silly things with the tools available to them (perhaps introduce Redis to cache things there is no need to cache, build their own fragile homegrown index table or aggregation table manually using 100 to 1,000 lines of backend code plus tests, and so on).

I mean, no one ships a totally broken application that no one can use because it is so slow and then asks the SQL expert to optimize it once it has shipped! Instead, any lack of knowledge of the underlying database will just mean reinvention and needless, fragile cruft in the backend code.

Note: I am against abstracting away the specific database technology you sit on (whether some SQL or not). If you are on SQL, know about indexes and materialized views (and their limitations) and so on and use them to implement efficient API endpoints, let your model in SQL directly handle any questions about idempotency or races in your API, and so on.

If you are talking about just abstracting away the SQL syntax, not the DB technology as such, I am "meh" about such abstractions: not very against them, but they are usually very leaky, and they don't seem to bring anything fundamental, just syntax candy (which got in the way in my case).
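The point about knowing indexes can be made concrete with SQLite's EXPLAIN QUERY PLAN (Postgres's EXPLAIN plays the same role; table names here are invented):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")

# Without an index, filtering on user_id scans the whole table.
plan = db.execute("EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42").fetchall()
scan_plan = plan[0][3]   # e.g. 'SCAN events'

# With an index, the planner switches to an index search.
db.execute("CREATE INDEX idx_events_user ON events (user_id)")
plan = db.execute("EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42").fetchall()
index_plan = plan[0][3]  # e.g. 'SEARCH events USING INDEX idx_events_user (user_id=?)'
```

Reading the plan before and after an index (or materialized view) change is often the whole difference between the 10-minute and 100ms versions of the same endpoint.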

You can't optimize well enough if your abstraction layer is too far away from your application layer.

I really don't get all the hoopla about learning SQL. There's really not much to it. It can be learned in an afternoon. It's by far the easiest language I've ever learned.

To all the naysayers and downvoters: There's a difference between knowing SQL and being an SQL ninja or a master of database architecture. Those are three different things. The former, and what my original comment concerned, was just learning the language itself.

I concede that learning to be able to craft complicated, performant queries and design complex databases with any desired property you might dream of is not quick or easy, but I still maintain that learning the language itself (sans extensions) is.

Basic SQL sure, but the moment you start getting into complex operations, upserts, extensions, functions, custom aggregate operators, arrays, HStores, etc it gets complicated. At least for me.

I would not be able to do this super quickly: https://stackoverflow.com/a/42939280
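Custom aggregates are a good example of where it stops being an afternoon's work. As a sketch, SQLite lets you register one from Python; Postgres would use CREATE AGGREGATE instead (table and function names are made up):

```python
import sqlite3

class Median:
    """A custom aggregate: SQLite calls step() per row, then finalize()."""
    def __init__(self):
        self.values = []

    def step(self, value):
        self.values.append(value)

    def finalize(self):
        vs = sorted(self.values)
        n = len(vs)
        mid = n // 2
        return vs[mid] if n % 2 else (vs[mid - 1] + vs[mid]) / 2

db = sqlite3.connect(":memory:")
db.create_aggregate("median", 1, Median)  # name, arg count, aggregate class
db.execute("CREATE TABLE samples (x REAL)")
db.executemany("INSERT INTO samples VALUES (?)", [(1,), (2,), (10,)])
result = db.execute("SELECT median(x) FROM samples").fetchone()[0]  # 2.0
```

Basic SELECTs are an afternoon; features like this, window functions, and upsert edge cases are where the longer learning curve lives.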

The finer points probably require a bit more than that.

If you're already technically proficient in general programming (or Excel), you're likely to pick up SQL quickly.

There is immense, unwarranted filtering of people from jobs for relatively minor things.
