We're focused on getting things done, and I think that's a good focus; it has always been more satisfying for engineers to see things ship.
But... there are extremes at both ends of that scale that make for very bad places to work, and really bad product and code.
Getting things done whilst not applying any focus to debt avoidance, maintainability, observability, supportability... is a recipe for disaster that leads to constant fire-fighting and the burning out of engineers and others.
Yet building ivory towers of near-perfect code is equally a bad signal, one that can kill a company (incurring costs while not bringing in revenue). Not all code is equal, and some code requires higher quality than other code.
I've found that what experience has given me over 20 years is pragmatism: is it constructive to add this nice-to-have, or would shipping sooner bring in more $$$ at an acceptable TCO?
The best engineers I know all strike a balance: keeping things simpler than less experienced engineers would expect, and shipping sooner.
Because experienced engineers know that there is an exponentially growing cost associated with complexity. So what may look like "clean quality code" at first glance may in fact require entanglement with frameworks that may or may not age well. And when the business needs start diverging from the vision of the framework makers, there's a world of painful molasses as the team struggles to either hack the requirements into the current codebase or perform a costly rewrite.
Everything is Minimum Viable Product; the trouble is those MVPs end up in production and never actually get rewritten, so all the code is horrendous quality-wise. Result: even what should be insignificant changes take forever, and you never know if you're introducing bugs because there are zero unit tests. Work that should take a day or so takes weeks.
I've seen programs that are just hundreds of lines of code in Main(), variable names like `var varvar`, no DI, code that is very susceptible to SQL injection, things like
`var sql = "select * from " + tblName + " where Foo = " + searchString;`
And other horrendous things. No one seems to care though. If it causes an issue in the future (and it will!), that's future developers' problem.
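For what it's worth, the safe alternative is a parameterized query, where the driver keeps user input out of the SQL text entirely. A minimal sketch in Python with sqlite3 (the `widgets` table and `foo` column are made up for illustration):

```python
import sqlite3

def find_rows(conn, search_string):
    # Placeholders (?) let the driver bind user input safely;
    # the input is never spliced into the SQL text itself.
    cur = conn.execute(
        "SELECT id, foo FROM widgets WHERE foo = ?",
        (search_string,),
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widgets (id INTEGER PRIMARY KEY, foo TEXT)")
conn.execute("INSERT INTO widgets (foo) VALUES ('bar')")

assert find_rows(conn, "bar") == [(1, "bar")]
# A classic injection payload is treated as a literal string, not SQL.
assert find_rows(conn, "' OR '1'='1") == []
```

Note that table names can't be parameterized, so if `tblName` really has to be dynamic, it needs to be validated against an allowlist instead.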
The other issue I find with this is that you end up going home each day as a developer feeling completely unsatisfied; you can't take any pride in your work because it's all just a hack job to get it working ASAP. It's like training to be a chef and then getting stuck in a kitchen spending all day just heating stuff up in a microwave.
I can relate to the last bit too; it also feels bad when you know people are looking down at your "lack of productivity" when you're just trying to slow down a bit to build maintainable software.
I love resharper personally and have my own licence.
I mean, I've seen WPF apps where everything was in App.xaml.cs because it was just easier. Need a function? Stick it in App.xaml.cs. Need a class? Stick it in App.xaml.cs. At this point I think App.xaml.cs in one of our solutions is around 130k LOC, because one of the devs thought just having a whole bunch of global functions was easier than worrying about nonsense like classes, encapsulation, testability, etc. You can do this when there are no code reviews. (Who has time to do code reviews? That would slow everyone down!)
I've seen console apps written entirely in Main because the attitude was: start writing it, debug as you go, almost like writing C# interactively, moving the debug breakpoint up as you make mistakes and want to rerun stuff. When you can get to the end without it blowing up, you're done. Deploy it.
Or web projects where they say: we want you to gather data from some DB and display it in a nice way. It's just a PoC, so knock it up in half a day. It actually ends up pretty complicated, but you get it done in a day or so. Still, the code is a mess, but it's a PoC. Then the client likes it, so this mess gets deployed. Then they want a feature added, but you don't have time to rewrite it, so you just hack around the initial mess to get it working. Then more features and more features, until you have an unmanageable mess that you hope you never have to touch again.
When you're working to half a day / a day for what's a non-trivial project sometimes even naming variables takes a back seat. So you end up with stuff like `var varvar` or `string s1`, `string s2`, `string s3` ending up in your source code.
It's not great. Sometimes it makes me laugh, but mostly it makes me think: why do I bother? That's the reality for some places, though.
The result of all this is that even if a team wants to just get things done, they’re going to be nudged in the direction of better quality simply by using our tools.
As for my team, I work in Ads. We’re a group of just over half a dozen people operating a handful of revenue-critical systems, and our code quality standards are raised accordingly. From a management perspective, it doesn’t make sense to hire a group almost entirely composed of PhDs and then have them spend their time putting out fires caused by sloppy implementations. We put a lot of effort into getting the code right at the outset and then running it for years without need for massive intervention.
In my experience this involves a few things. First, we tolerate a fair bit of deadline slip. Google isn’t exactly gonna go broke without our newfangled product, so if we need an extra half a quarter to make it happen right, then we take it. Secondly, we write up thorough design documents before breaking ground. This means few surprises at implementation time, letting us focus on getting the code right. Finally, we spend a lot of time refactoring and cleaning things up. Systems balloon over time, and sometimes it’s more cost effective to clean house and make otherwise complex changes simple. Doubly so if subsequent changes are also made easier.
I can’t speak for the rest of the company, but I think our team’s standards are pretty high.
But if a new startup tries to replicate a process whose aim is to produce code that "[runs for] years without need for massive intervention" before they've established product/market fit, they will run out of cash.
Google has very specific needs. Code quality requirements for banking software are different from those for a CRUD app.
IMHO, the best path is to be honest with where you are as a company. Some times you need to just see what works, and multiple cheap iterations that you have no plan to support in the future can be critical to success. Other times you're making billions of dollars and the most important thing is to not screw up. With every variation in between.
So at my company: our subscription and payment handling is designed to run for years. Not screwing up is the most important. Certain product features are pushed as quickly to market as we can, with an eye towards future stability. "Good enough" being good enough.
It doesn't need to be "fun" to be enjoyable.
Design documents? Sounds scarily like waterfall to me.
Moving fast and breaking things isn't the only way to write code.
Waterfall assumes you can know too many things at the start and have everything go right. It fails when it encounters the real world but is still pushed by program offices across the DOD despite extreme cost and schedule overruns and quality issues.
I’m out now and on mobile. But I can write in more detail later if you want about the problems of the pervasive waterfall method in DOD projects. Even the DOD systems acquisition documents have endorsed iterative/incremental (a slow motion agile/lean when you dig into it) development models since 1985 but still Waterfall lives on.
There's a lot of options between "moving fast and breaking things" and waterfall. And almost every one of them is better than waterfall.
Let's be clear on terms. True Waterfall is a single sequential pass: Analysis, design, development, testing.
There is no feedback; there is no room for error. You might think no room for error sounds good, until your project is years late and billions over budget. Because no time was allowed for feedback and correcting errors, they're discovered late (in testing) and either:
1) you loop back then and fix it 
2) you "descope" requirements and whittle it down to a functioning, but incomplete, system
3) you push through and finish it on time (because in most Waterfall shops the estimated schedule is a commitment) and deliver a buggy product (that in the USAF and others can, and has, literally meant death).
Waterfall as actually implemented allows for a somewhat greater degree of feedback. Maybe they even do unit tests on software, so they have some degree of effective testing during development. More often, they keep the Waterfall idea of big testing at the end. By this I mean a 1+ month event where a comprehensive test suite is executed, preceded by a preliminary execution which also takes 1+ months. So on, say, a 15-month project, they literally schedule 2+ months for executing these tests, and this is the only time they will execute all of them. Between the PFQT and FQT they deal with any rework. They still treat estimates as commitments and still rush crappy code out to the customer. So every defect is discovered as late as possible, rather than as early as possible. Any error in logic, any error in design (the most critical cause of lack of safety and reliability in systems) will not be found until month 12 or 13 of a 15-month project, if it's found in this cycle at all.
There is no place in this world for Waterfall in any variation, except for small projects (less than 30k or so lines of C) or projects that have been done a hundred times before by the same team that is going to do it now (they are literally the only ones who can pull this off). But most companies will still try to do Waterfall with teams of novices (to the domain, and often to the profession too, as they prefer cheap new grads to experienced developers and experts).
The DoD has, since 1985, endorsed a model based on iterative and incremental development. This means that you scope your initial necessary features, and over a series of projects you add in the extra ones (including ones you may not have thought of to start with, due to either an oversight or something that changed). Within each project cycle the iterative model has you doing a rough equivalent of the "sprints" in Agile (big-A) shops, though often longer (1-3 months rather than aiming for days or weeks). However, since there's always a lag, this wasn't really picked up by leadership until the 1990s, so most of the present leadership still started their careers in the Waterfall world of the 1980s (civil service tends to stick around for 40 years) and still insist on peddling that bullshit rather than adopting more sensible development models.
There is no feasible way that anyone working on the F-35, originally conceived in 1992!!!, could have possibly laid out the work for that project properly in a Waterfall approach. Which is also probably why it was years late, billions over budget, and failed to meet some operational requirements when it was finally delivered.
 Too late to be effective because everyone who did the development has been laid off or reassigned or simply slept and forgotten what they did a year earlier.
 The initial developers may, but they will strip this out before handing it off to the maintenance programmers who are in a different company. Can't help the competition, customer be damned.
In our own company, getting things done is important, and sometimes we do prefer that over writing high quality code. But that decision is not made lightly and it's usually still good code compared to what it's replacing.
We've been working on replacing a large real-estate website (millions of visitors) with a new version that has been rewritten from the ground up. We did this in 6 months and are going to be releasing on time and within budget. The code is in great shape and we got things done.
If you're writing a minimum viable product, something that you know isn't likely to be expanded on, or code you know is only going to be live for a few months, obsessing over code quality can actually be a really bad thing. For example, I've been on short-timescale projects (you won't always have control of this) where continuous integration, code coverage, code reviews and TDD were insisted upon, and the extra time invested in these just didn't make any sense.
It's a different story for projects where mistakes can be very expensive or when you know code is going to require maintenance for a long time but I think it's important to learn when there are other priorities. Recognising when you should just "get things done" is actually an important skill that shouldn't be looked down upon because projects only have a finite amount of resources.
It is definitely a balancing act and a lot of people here are going to disagree with me but launching and getting a viable product out into the world has to take precedence over having a pretty release pipeline or 100% test coverage. On the flip side, it's the engineering managers' responsibility to make sure the tech side gets fleshed out once you have product market fit, otherwise like @LandR alluded to, you'll end up building a house of cards inside a wind tunnel.
Almost nobody understands or cares about good engineering, tests and the like. That's not where they are making money from, at least in the near future. However I did have some pleasant experiences.
- After many years of a getting-shit-done mindset, maintenance costs are now significant, mainly because we can never reject bug reports from customers, since we don't know whether stuff is really broken. So most of the time, one of the maintenance guys spends hours or days and finds it's a problem with the customer's network, weird proxies, unsupported hardware, whatever.
- As a result, the company frantically started building extensive test suites. However, with most things done in a frantic manner, this doesn't work so well because it was not engineered properly. We have/had hours-long test suites, hard to reproduce failures, devops problems for the test machines, etc. etc. It is, however, much, much better than the previous approach (i.e. release and pray to God).
- Now, after the dust settled somehow, people are starting to realize that the only way to improve the situation is by focusing on quality from the beginning. We spend more time designing architecture with testing in mind, try to be somewhat TDD-ish, look at coverage early on. Now, when a manager is asked to provide estimates for a feature, they are asked if they can fulfill the requirements on time, within budget and in quality.
Typically I find that code quality isn't too vital, as the majority of big-co systems tend to get re-written from the ground up every few years, rather than modifying and evolving the existing system.
But, the real question is why does a support person have global root access to every server in a large enterprise?
So one of the old-timers came by (really fun and totally random guy) and was like... "ah, there's this one thing I think you could help me with. I'm trying to find a solution for these tracking error reports. Wait, I'm giving you access."
A few minutes later my email arrived, together with a password over MS Lync, and nothing else. And I waited for him to come by with some requirements.
Which of course didn't happen. What happened instead was that the newly minted Head of IT came by, who had been tasked with cleaning up the mess after the acquisition and introducing a proper process.
He looked at me very earnestly and just when I tried to think hard what I did wrong he blurted out "WTF ARE YOU DOING WITH YOUR LIVE DB PRODUCTION CREDENTIALS!?"
By the time I realized what that guy had given me, it was unfortunately already gone.
So long story short, as time goes by and companies grow, there are very different approaches at "getting things done" ;-)
To us, code quality matters insofar as it affects us, whether that's performance (rarely a problem), maintenance (often a problem in our older code), or something else. I think this is a good compromise. Don't burn mental cycles optimizing (whether that's for readability, CPU, memory, whatever) until you have to, but really take time to do it when you have to. Of course, write the best code you can in the mean time to minimize the need to rework old code.
Curious, what kind of games you work on that still "get out"? I used to work on mobile, now switched to Steam "indie", and in both cases the public release is only the beginning of ongoing tweaking and support.
Not if you have to support it and build on top of the features implemented with such an ad-hoc approach.
I've heard this is true of other games as well: Street Fighter (cancels, I believe?) and Smash (wavedashing). Sequels reproduced bugs from earlier versions as they came to be seen as features of the gameplay.
Only some aspects of 'code quality' even incur any dilemma: the ones that have a cost, immediate and/or ongoing upkeep (e.g., tests).
There are many people who don't write that one line of code that throws an immediate exception with a proper error message (and instead just return null or something).
I also see every few months objects and classes with code copied (as in ctrl-c ctrl-v) instead of generalized.
To them, this is getting things done (vs spending time writing descriptive exception strings or isolating the repetitive part in a separate function).
So, your point of view is of course correct, but your definition of "getting things done" already takes the next few weeks into account, which, frankly, I don't see that often.
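The "one line of code" being skipped is tiny, which is what makes skipping it so frustrating. A toy sketch of the contrast (the config-loading scenario and names are hypothetical):

```python
def load_config_lazy(settings, key):
    # The "getting things done" version: silently returns None, and the
    # failure surfaces far away as a confusing NoneType error.
    return settings.get(key)

def load_config(settings, key):
    # One extra line: fail immediately, with a message that names the
    # missing key, instead of blowing up three calls later.
    if key not in settings:
        raise KeyError(f"missing required config key: {key!r}")
    return settings[key]

settings = {"timeout": 30}
assert load_config(settings, "timeout") == 30
try:
    load_config(settings, "retries")
    assert False, "expected a KeyError"
except KeyError as e:
    assert "retries" in str(e)  # the error names the missing key
```

The time saved by the lazy version is seconds; the time lost chasing the eventual `None` downstream is hours.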
What is easier to generalise -
1. the point where you realise you need to do basically the same thing in another location with a few small changes
2. the point where you have a few cases of basically the same thing with their own small differences in context?
The kicker is that 2 is much cheaper at all steps.
I think the biggest problem with this idea is setting a time limit to allow these duplications to grow - if you allow for it indefinitely, you'll end up with duplicated code building on top of duplicated code. In that scenario, the refactoring becomes harder not easier.
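Option 2 (waiting for a few concrete cases, the "rule of three" idea) is cheaper partly because the duplicates show you which parts actually vary. A toy sketch with made-up report functions:

```python
# Two near-duplicates with small contextual differences (case 2 above):
def report_orders(orders):
    lines = [f"{o['id']}: {o['total']:.2f}" for o in orders]
    return "ORDERS\n" + "\n".join(lines)

def report_refunds(refunds):
    lines = [f"{r['id']}: {r['amount']:.2f}" for r in refunds]
    return "REFUNDS\n" + "\n".join(lines)

# Once the real axes of variation are visible (title and value field),
# the generalization is obvious and small:
def report(title, rows, value_key):
    lines = [f"{row['id']}: {row[value_key]:.2f}" for row in rows]
    return title + "\n" + "\n".join(lines)

sample = [{"id": 1, "total": 9.5}]
assert report("ORDERS", sample, "total") == report_orders(sample)
```

Generalizing at point 1, before the second case exists, means guessing the axes of variation, and a wrong guess is an abstraction you then have to fight.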
So far, this approach has paid off tremendously both for me and for my supervisor, as the reuse value is quite high and performance very competitive, resulting in many projects both within the former chair and with external collaborations.
Unfortunately closed-source (though open for collaborations), otherwise there’d be a link here.
Edit: Just to be clear, nobody sets out intentionally and says "let's favour delivery over quality" or vice versa. It's an approach manoeuvred by the personalities in charge and the team as a whole.
I put value on those things, because that's my job. Their job is business. They want features that will enable their job. I expect nothing less. I want them to tell me exactly what they need to get to the next level. How to achieve that is what I do.
If they ever tell me how to do my job, that's when I will hand in my notice.
The company or "business guys" should at least hire a CTO that ensures a baseline level of quality instead of letting the developers roam wild and free :)
Unless you're the CTO, then it makes perfect sense.
Re type of tests, I think the most useful type tend to be end-to-end, that is testing the end usage.
Also, anecdotally, engineers who focus too much on code quality tend to miss the big picture.
I've seen horrible over-engineered sins committed due to this "focus on code quality" for code quality's sake.
Prototypes are meant for gathering information. They are not simply disposable; they are supposed to be destroyed, like toxic waste. We crank speed all the way up. No documentation, etc.
Tracer bullets are meant to hit, but not overly planned. They are completely maintainable, fully documented, refactored whenever possible. The name comes from the idea that instead of making calculations and plans, you just point and shoot. After shooting, you adjust the machine to hit the target better.
With the first manager, everything was over-planned, but the estimates were completely out of control.
In particular, one software engineer said he needed one month for a simple reporting page, one week to add a button to a webpage, or a full day just to add a label to a dropdown.
Every user was unhappy and the company was completely stuck because no new tools were released. We were neither allowed to run queries by hand nor to just write our own CLI tools. Combine that with a narcissistic, manipulative boss and you have a toxic environment.
After some years, management changed: a new manager, focused on "getting things done quick", took charge, and the situation got a little better (at least in the short term).
But after some months it was impossible to survive: there was no plan, no quality review, no architectural decisions. Projects were built as fast as possible, without thinking about scalability or side effects. Refactoring or adding new features was impossible, or took weeks.
I left, and the company has since gone bankrupt.
The lesson that I learnt is "In medio stat virtus".
The baseline may vary depending on how experimental the feature is (proofs of concept are written to be rewritten; upgrades of existing major and heavily used components are written with the long term in mind) and how business critical it is (we put a lot more time and effort into e-commerce related code than into "share this listing on twitter" code).
As our codebase and feature set stabilize, and as the drag of tech debt has become more obvious, things like automated testing have transitioned from being discouraged to optional to encouraged and now required for significant new development. We don't enforce code coverage standards, but our team is now fully on board with the idea that tests help us move faster with higher quality.
In short, be pragmatic. Quality and speed are always a balance, and the right balance depends on company and development stage as well as the particular piece of code you're working on.
Perhaps that could be useful when you have to write PEP 8-compliant code but don't want to format everything manually.
I think of code quality as practices that reduce the risk of common or damaging errors. As a toy example, if you have a lot of off-by-one errors from loops, using list comprehensions instead reduces the risk of those errors, and would therefore be a code quality improvement. A larger example is writing code with well-defined interfaces that's easy to delete, in order to make it easy to replace bad decisions as our understanding of the problem improves (because there will always be some bad code.)
I tend to place a very high value on code quality that affects errors I'm reasonably sure are common, and less value on practices where I can't see how they result in fewer errors.
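The off-by-one point above is worth making concrete: the quality practice pays off precisely because it removes the thing you keep getting wrong. A toy version:

```python
def squares_loop(values):
    # Index-based loop: the bounds are easy to get wrong
    # (e.g. range(len(values) - 1) silently drops the last element).
    out = []
    for i in range(len(values)):
        out.append(values[i] ** 2)
    return out

def squares(values):
    # List comprehension: there is no index to get wrong at all.
    return [v ** 2 for v in values]

assert squares([1, 2, 3]) == squares_loop([1, 2, 3]) == [1, 4, 9]
```

The comprehension isn't "better style" in the abstract; it's better because it eliminates a specific class of error the team actually makes.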
Plain, "assembly"-grade discipline - because you don't get anywhere without it when feeding the CPU bare operations - is not widespread enough among developers, but I think that at least some of that discipline will be vital in the long run. Code review of other people's code is so much easier if the patches are mostly unobstructed by out-of-context style changes.
I've had situations where I talked to managers about a situation that would escalate in the coming months, only to have those same managers mad at me when it finally did asking me why I didn't warn them.
I always wonder who these developers are that get time to write tests and develop new necessary skills. But I imagine these developers work on in-house projects, not for clients.
In a previous startup that I worked in code quality was outright ignored and I saw first hand the kind of technical debt that builds up because of this.
This is of course exactly the notion of technical debt, and how the debt can compound (that is, you will struggle to get things done more and more the longer you leave your debt unpaid).
1. Deviation from original estimate as a percentage
2. Time spent on bug fixing as a percentage of time spent on new development
Still new, so this could go either way...
These kinds of priorities will vary over time, as they have inside my own little company. I generally like to invest more in quality aspects earlier in the project so that the amount of entropy is limited. As the deadlines loom, corners are cut and compromises made to get the thing out the door.
Then once I decide how we want to store credentials securely, we can rip it out, change one method and everyone is updated through the shared library.
Another real-world example for us is logging. I knew in advance that we ultimately wanted a better logging framework for searching than the straight text log you get with log4net/log4j; we wanted structured, searchable logging, eventually using Elasticsearch and Kibana. I didn't have the bandwidth to do that at the time, but I still insisted on Serilog, which would give us structured logging, and just used a text-file sink. Later I added a Mongo sink for searchability just by making a configuration change, and when I get around to it, I'll add an Elasticsearch sink. We then get easily searchable logs with application-specific properties to search, and a nice front end, without any code changes.
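The key idea, keeping structured events at the call sites so only the sink changes later, isn't Serilog-specific. A rough stdlib sketch of the same seam in Python (the `JsonFormatter` class and `props` field are made-up names for illustration):

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    # Emits each record as a structured JSON event. Call sites don't
    # change when the sink does (text file today, something searchable later).
    def format(self, record):
        return json.dumps({
            "msg": record.getMessage(),
            "level": record.levelname,
            **getattr(record, "props", {}),
        })

log = logging.getLogger("app")
log.setLevel(logging.INFO)

buf = io.StringIO()          # stand-in sink; swap for a file/HTTP handler
handler = logging.StreamHandler(buf)
handler.setFormatter(JsonFormatter())
log.addHandler(handler)

# Application-specific properties travel with the event, not inside the text.
log.info("order placed", extra={"props": {"order_id": 42}})

event = json.loads(buf.getvalue())
assert event["msg"] == "order placed"
assert event["order_id"] == 42
```

Swapping `StreamHandler` for another handler is a one-line configuration change; none of the `log.info(...)` call sites need touching.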
Even if I skip unit testing, I'm still going to insist code be structured in a way that separates business code from persistence so when we get a chance to write unit tests (if ever) it makes it a lot easier.
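That separation can be as lightweight as keeping the business calculation a pure function and putting persistence behind a thin seam. A minimal sketch (the repository names and order shape are hypothetical):

```python
# Business logic takes the data it needs; it doesn't know or care
# where that data came from, which makes it trivially unit-testable later.
def total_due(line_items, tax_rate):
    subtotal = sum(qty * price for qty, price in line_items)
    return round(subtotal * (1 + tax_rate), 2)

# Persistence lives behind a small interface; the real one might hit a DB.
class OrderRepository:
    def line_items_for(self, order_id):
        raise NotImplementedError

class InMemoryOrderRepository(OrderRepository):
    def __init__(self, data):
        self.data = data

    def line_items_for(self, order_id):
        return self.data[order_id]

repo = InMemoryOrderRepository({1: [(2, 10.0)]})  # 2 units at 10.0 each
assert total_due(repo.line_items_for(1), 0.1) == 22.0
```

Even with zero tests today, `total_due` can be tested the day someone finds the time, with no database setup required.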
Yes, it takes years of experience to know when you are taking shortcuts (taking out a loan: technical debt), and to know how to get that loan at the lowest interest rate so it's less onerous to pay back.
It's so bad that, even though I've been here going on 6 years, it will never appear on my resume. I don't want my name sullied by the decisions of others.
For some context, almost exclusively I work on ground software, e.g. integration and test (I&T) and mission operations, mostly the former recently. We're nearing our delivery deadline for our completed flight model hardware, so since the summer we've been in the I&T phase, integrating various spacecraft subsystems and running test campaigns on the lab bench, out in the field, or in our vacuum chamber.
I lead the development of our (Python 3 and PyQt4 based) general "Ground Support Equipment" (GSE) software, which the engineers use to communicate with the spacecraft over both a physical/umbilical serial UART interface and through our radio. Getting back to the point of the question though, during this time the software side has been focused solely to get the job done, sometimes in regretfully ad hoc ways.
In some sense, working closely with the engineers building the spacecraft almost necessitates this, since as capabilities are added to the spacecraft, the GSE must be extended to accommodate those in the way of new command and telemetry abilities. Some examples of more significant additions made within the last 8 months or so include adding radio interface capabilities (in our case, sending data through a TNC for uplink and connecting to a chain of SDR software for downlink), allowing for more complex command abilities (and the requisite handling of telemetry data), and creating sequences of commands for use in hardware tests. The more mundane day-to-day work mostly involves adding new commands for the spacecraft and making telemetry easily visible/accessible to engineers.
Throughout all of this though, there's been a great deal of seat-of-the-pants/build-it-as-you-need-it style development, even during test campaigns, because things break and issues (in the GSE software or on the spacecraft) arise and have to be debugged. And though from the start I desperately wanted to have a (software) test architecture (I set up a Jenkins server and everything!), the time and the will to write tests were often not available. From the outset I tried to make the architecture of the application amenable to easy extension, but many of the engineers, at least to begin with, weren't familiar enough with Python to do so, though this has gotten better with time. When time permits, I refactor and redesign (or sometimes actually design something that started as a kludge) the back end, internals, and GUI portions. In some instances I was lucky enough to have previously designed something with enough foresight to make it easy. Sometimes not.
With all that said, the end of the hardware build and test phase is near, so we're at a transition point regarding our coding practices and standards. There's a lot of "dead time" between delivering the satellite to the launch provider and the actual launch, so we've waited on building much of our in-flight mission operations software until now. Because this is the actual mission-critical ground software, though, we're taking the design/construction/devops a lot more seriously. Just this weekend I set up a new server to host a Jenkins instance and future documentation. We'll be running unit tests, coverage checks, and a linter, and also have regular code reviews.
Most of the mission operations software design was finished long ago, but since we're using the GSE as a base, and assumptions/plans may have changed since then, there will definitely be further refactoring and design. I basically look at this software as our actual "production" code as compared to our internal design and test tooling.
I'm quite glad that we'll be taking a more systemized approach now, not only because of the importance of the in-flight ground software, but also because we've recently expanded our ops development team and will have many more people contributing simultaneously. Our external reviewers will of course be happy as well.
I didn't intend to write a novel here, but wanted to share some insight from a less-common place, even if we're not in industry. In the three years I've worked on this project, I've learned a huge amount, and have gotten to experiment in small ways with different practices, but I know I still have a lot more to learn. I'd love to hear perspectives from within academia (where code quality is notoriously neglected) and in the aerospace industry, where the stakes require a more rigorous approach.