
Ask HN: How to transition from academic programming to software engineering? - fdsvnsmvas
I taught myself how to code and ended up doing a PhD in a computational discipline.  Programming has been a big part of my life for at least the last decade, during which I've written code almost every day, but always by myself.  After graduating I joined a medium-sized company (~10^2 developers) as a machine learning engineer and realized how much I don't know about software engineering.  I feel very comfortable with programming in the small, but programming in the large still feels mostly opaque to me.  Issues like testing / mocking, code review, management of dev / stage / prod workflows and, most importantly, the judgment / taste required to make maintainable changes to a million LOC repository, are areas where I can tell I need to improve.

Former academics who moved into software engineering, which resources did you find most useful as you made the transition?  Python-specific or language-agnostic books would be most helpful, but all advice would be welcome.
======
ChuckMcM
I have hired some PhDs in your situation and worked with others; I personally
just went to work after I got my BSEE.

My observation is that you're halfway there when you realize that you need to
improve. Of the folks I saw who did poorly, it was because they didn't realize
that you could be both the smartest person in the room and the least capable
at the same time.

Right now, on your first job experience, even a kid who never went to college
is better at programming than you are because they've been experiencing all
the pitfalls that can happen and have built up a knowledge from that which is
perhaps more intuitive than formal but serves them well. What you have over
that person is that you've trained yourself to consume, classify, organize,
and distill massive amounts of information in a short amount of time.

Use that training to consume everything you can on the art of writing
programs. Read "Test Driven Development", read "Code Complete", read "Design
Patterns", read "The Design of the UNIX Operating System", read "TCP/IP
Illustrated Volume 1", and a book or two on databases. With your training you
should be able to skim through those to identify the various "tricky bits" and
get a feel for what is and what is not important in this new field of yours.

Soak in as much as you can and ask a lot of questions. Pretty soon no one will
know you haven't been doing this your whole life.

~~~
stedalus
This is pretty good advice overall. One small change I suggest is to take it
easy on Design Patterns and the like. I’ve seen people in OP’s position
(general smarts but limited production experience) turn into architecture
astronauts and start overengineering everything. The book can still be useful
if you’re working on a legacy codebase and need to understand the jargon that
can appear in [possibly overengineered] existing codebases.

~~~
fsloth
I've written software for over a decade and I _loathe_ the design patterns book.

I would not give it to a beginner as it would corrupt their mind with useless
drivel.

Its authors exhibit themselves as morons who celebrate renaming existing
computer science concepts while occasionally mixing and matching them.

I would not be this uncharitable towards it if it was not such a famous (and
hence harmful) book.

It's harmful because software engineering is really hard, and they just add
to the load by trying to have the reader memorize their junk instead of
doing something that would actually make them a better programmer. And, if you
try to use their concepts _while_ programming - shudders, oh god help you.

I'm not going to iterate over every pattern. Here's an example:

_Flyweight_? You assholes, just call memoization what it is. Why don't
you rename existing data structures as well? Good thing those come out of the
box, otherwise you would have begun the book by renaming array, linked list
and dictionary. Maybe you would have called linked list "the chaingang" or
something.

You don't simplify things by giving things cuddlier names. You stunt people's
growth that way.

The Gang of Four book is a malignant offshoot of the practice of attempting to
make software engineering better by legalizing and dogmatizing it by
decomposition into trivial details that hurt your brain. It's a branch of
'consultancy driven software development' where people attempt to acquire a
halo of professionalism by calling things by fancy names and over-complex
descriptions (while skipping the practical things with equally complex names
but at least those weren't made up by a bunch of idiots).

~~~
voltooid
Your comment is very interesting. I recently took a course on Design Patterns.
I sat squirming during the lectures because I didn't like what was being said,
but couldn't put my finger on what exactly I disliked.

What I understand from your comment is that you dislike the Gang of four book
because it renames concepts that don't need the cutesy names that they give
them. Do you have a problem with the _concept_ of design patterns? Or just the
names they are given? Are the concepts themselves sound and worth paying
attention to?

~~~
fsloth
I wrote a long rant as a response to another comment above. To quote myself:

"It reads like it was written by a clever, verbose, and over-eager novice."

Design patterns are a really useful concept. The GoF book just totally
botches up that concept by delving deep into trivial language details while
missing the big picture.

Christopher Alexander's "The Timeless Way of Building" and "Notes on the
Synthesis of Form" are the books in architecture that prompted a lot of
discussion in software design circles, and from which I presume GoF got their
idea of software design patterns.

What are good design patterns in software? IMO they are composed from the
programming models exposed in a basic CS book like Aho and Ullman's
"Foundations of Computer Science" and further developed in books like
"Structure and Interpretation of Computer Programs".

GoF is an ok anecdotal reference after those, but it really is not suitable as
a didactic resource.

Peter Norvig wryly commented that 16 of the 23 patterns are either invisible or
simpler in Lisp [0].

[0] [http://www.norvig.com/design-patterns/design-patterns.pdf](http://www.norvig.com/design-patterns/design-patterns.pdf)

------
crackerjackmack
Some general advice I've given to multiple junior developers over the years. You
probably aren't a junior, but it is most likely applicable to what you are
seeking. These points were passed down to me by other developers. Other HN folk
will have links to literature, but hopefully my advice will give you a starting
point.

* testing - write your functions small enough to be readable, but not so small their abstractions are meaningless (because you have to test them all)

* testing - don't reach into your code's modules and mock. Instead use dependency injection with non-testing defaults

* code review - It shouldn't be personal. If it feels personal, ask whether you are reading it wrong or whether they are actually attacking you.

* code review - when someone raises a style complaint, ask for reference material. Don't get caught in a cyclic, pedantic style war between lead devs.

* code - your code should be environment agnostic. If you have environment/context specific things to do, pass along an environment/configuration dict or make a global config singleton. As long as your code depends only on that, you can write each piece in isolation.

* code - personal preference, but try not to nest your loops too deeply; use itertools when you can.

* code - if you can help it, try not to mutate dicts/objects in place while in a loop. It makes testing difficult.

* code - exit early if possible, test for failures instead of nesting your entire function inside a single `if`. Helps identify the bad inputs faster as well.
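
A minimal sketch of two of the points above (exiting early and flattening nested loops with itertools). The function names `mean_ratio` and `grid_points` are made up for illustration; they are not from any codebase mentioned in this thread.

```python
from itertools import product


def mean_ratio(values, divisor):
    """Return the mean of `values` divided by `divisor`."""
    # Guard clauses: reject bad inputs up front instead of nesting the
    # whole body inside a single `if`.
    if not values:
        raise ValueError("values must not be empty")
    if divisor == 0:
        raise ValueError("divisor must not be zero")
    return sum(values) / len(values) / divisor


def grid_points(xs, ys, zs):
    """Enumerate every (x, y, z) combination without three nested loops."""
    return list(product(xs, ys, zs))
```

The guard clauses also make the failure cases easy to test one at a time, since each bad input raises before any real work happens.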

Above all, remember code isn't perfect. It's a tool to get to an end goal. If
you aren't solving for the end goal you aren't solving the right problem. At
the end of the day, you are employed to build a product and that product needs
to perform its job. (That isn't a pass to write super shitty code.)

edit: formatting

~~~
fraudsyndrome
> testing - don't reach into your code's modules and mock. Instead use
> dependency injection with non-testing defaults

Could you please go into more depth with this?

~~~
crackerjackmack
As an example: NoopTelemetry would be some type of empty class that does not
depend on mock (I've used a metaclass singleton in this case, but a module
would work just the same). To test, you'd pass a mock object in as telemetry
and check that timer_start and timer_stop are both called with the correct
function name.

In your main or context, you set up your application with the needed pieces.

    
    
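      from unittest.mock import MagicMock
    
      class NoopTelemetry:
          """Do-nothing default, so algo_5 works without any mock."""
          # Static methods let the class itself act as the default object;
          # a metaclass singleton or a plain module would work just as well.
          @staticmethod
          def timer_start(name):
              pass
    
          @staticmethod
          def timer_stop(name):
              pass
    
      def main():
          # StatsClient stands for the real telemetry client; arguments elided.
          context = {'telemetry': StatsClient(...)}
          start(context)
    
      def start(context):
          algo_5(1, 4, context['telemetry'])
    
      def algo_5(param1, param2, telemetry=NoopTelemetry):
          telemetry.timer_start('algo_5')
          ret = param1 / param2 ** param2  # whatever
          telemetry.timer_stop('algo_5')
          return ret
    
      def test_start():
          context = {'telemetry': MagicMock()}
          start(context)
          # more testing.
    
      def test_algo_5_math():
          ret = algo_5(4, 5)
          assert ret == 4 / 5 ** 5
    
      def test_algo_5_telemetry():
          mm = MagicMock()
          algo_5(1, 1, mm)
          # assert_called_once_with fails loudly if the call did not happen.
          mm.timer_start.assert_called_once_with('algo_5')
          mm.timer_stop.assert_called_once_with('algo_5')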

------
fdsvnsmvas
Thanks everyone, the comments are much appreciated. Here's a list of books and
other media resources recommended so far in the thread:

Robert C. Martin, Clean code: [https://www.amazon.com/Clean-Code-Handbook-Software-Craftsma...](https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882)

Vaughn Vernon, various:
[https://vaughnvernon.co/?page_id=168](https://vaughnvernon.co/?page_id=168)

Steve McConnell, Code Complete: [https://www.amazon.com/Code-Complete-Practical-Handbook-Cons...](https://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670) 2

Clean coder: [https://cleancoders.com/](https://cleancoders.com/) videos

Hunt and Thomas, The Pragmatic Programmer: [https://www.amazon.com/Pragmatic-Programmer-Journeyman-Maste...](https://www.amazon.com/Pragmatic-Programmer-Journeyman-Master/dp/020161622X)

Hitchhiker's Guide to Python: [https://docs.python-guide.org/](https://docs.python-guide.org/)

Dustin Boswell, The Art of Readable Code: [https://www.amazon.com/Art-Readable-Code-Practical-Technique...](https://www.amazon.com/Art-Readable-Code-Practical-Techniques/dp/0596802293)

John Ousterhout, A Philosophy of Software Design:
[https://www.amazon.com/Philosophy-Software-Design-John-Ouste...](https://www.amazon.com/Philosophy-Software-Design-John-Ousterhout/dp/1732102201)
This one looks particularly interesting, thanks AlexCoventry!

Kent Beck, Test Driven Development: [https://www.amazon.com/Test-Driven-Development-Kent-Beck/dp/...](https://www.amazon.com/Test-Driven-Development-Kent-Beck/dp/0321146530)

Dan Bader, Python Tricks: The Book: [https://dbader.org/](https://dbader.org/)

Ian Sommerville, Software Engineering: [https://www.amazon.com/Software-Engineering-10th-Ian-Sommerv...](https://www.amazon.com/Software-Engineering-10th-Ian-Sommerville/dp/9332582696)

Svilen Dobrev, various:
[http://www.svilendobrev.com/rabota/](http://www.svilendobrev.com/rabota/)

~~~
sanderjd
There are a lot of good recommendations here, and I certainly relate to the
instinct to go to books when you're looking to level up a skill set, but I
really think what you need is not a bunch of books to read, but a few _people_
to watch do the work. The only real way to do that is to get a job alongside
them. You can read the books at the same time; you can ask your new coworkers
which recommendations they agree with and read those ones first.

~~~
fsloth
Yeah, software engineering is a craft, and generally the only way to learn
those fast is to learn from others.

~~~
blub
It's not a craft, in its purest form it's an engineering discipline with
specific rules, procedures and standards.

The crucial point is that most of us are doing programming, and not software
engineering. Learning from others is hit or miss. One can certainly learn to
program from others, but that's not enough to be able to do software
engineering.

~~~
fsloth
"It's not a craft, in its purest form it's an engineering discipline with
specific rules, procedures and standards."

Sorry, but I have to strongly disagree. In its purest form the core of
software engineering - i.e. programming - is a craft. The other parts are mostly
about creating processes so that craftsmen can create something together
without stumbling into each other.

The differences between a craft and engineering are numerous.

\- engineers generally need a license

\- engineering is about repeatability and creating dependable cost estimates

\- engineers are required to study for years for a very good reason. You can
be a rockstar programmer out of highschool.

Just having a bunch of cargo cult gibberish bound into a book does not make a
craft into an engineering discipline.

It's harmful to call programming engineering. Engineers have curriculums that
can teach them pretty well what is expected of them once employed.

Not so for programmers - or, well, software engineers. If there was even one
curriculum that could churn out good programmers dependably, don't you think
this model would be copied instantly elsewhere? If such a curriculum
existed, do you think software interviews would be filled with
whiteboarding just to check that the candidates understand even the
basics?

I think this incapability to create a curriculum for actually creating good
programmers is the best evidence that programming is a craft. It's such a
complex topic that you can't create a mass curriculum that would serve all
equally. Not with our current understanding, anyway. Maybe if we could teach
everyone assembly, and Haskell, and have them implement compilers and
languages as a standard things would be different.

The second best way to learn programming without being born a programmer
savant is to learn from others while doing. Apprenticeship is the traditional
way to train craftsmen.

Programming is so much more like a craft than engineering that it's best to
call it a craft.

Craft is not a derogatory term. It just means we don't understand it
theoretically well enough to teach it properly.

~~~
blub
Software development as practiced now by a huge number of individuals and
companies is closer to a craft, but it can be and must be more than that if we
want to be able to tackle the growing complexity of software and improve its
overall barely adequate quality.

Crafts don't scale and are a poor fit for highly complex domains.

The curse of software development is its huge financial success, anemic
legislative specification and the observed reality that customers will still
buy poor quality software.

These are preventing the craft-like programming from turning into software
engineering, but the craft is already failing to reach expectations: countless
security disasters, unethical programmers enabling spying on millions,
software literally killing users. This stuff will only get worse.

And finally, we do understand software engineering well enough to teach it
properly. It's just not done, because it's not considered necessary when one
can get by with a computer science degree, no degree or a bootcamp
certificate.

~~~
fsloth
"And finally, we do understand software engineering well enough to teach it
properly."

This is news to me. I would very much like a citation, please. Or do you mean
applying formal proof verification to everything?

~~~
blub
Engineering doesn't mean using formal methods or specific fancy proofs; it's a
systematic, disciplined, quantifiable approach to software. It's described in
an ISO standard and the more approachable IEEE SWEBOK.

The above is neither widely known (I only found out about it after many years
of doing professional programming), nor is it necessary in order to be
successful in the profession and/or make a lot of money.

Commercial software development is mostly a wild west and we're calling that
craftsmanship.

------
tensor
In my experience, it's easy to just learn on the job. Some basic points
though:

* Follow whatever formatting and style rules your workplace uses. Style is religion and not worth arguing about; as long as everyone uses the same style it's a win.

* Dev/stage/prod is also workplace specific. Just go with the flow and avoid time wasting arguments on these topics, it's not usually worth it.

* Try to break your work into small commits. This is both easier to review and easier to estimate time on.

* Architect your code so that you can add unit tests. Make sure all your commits have this.

* Prefer longer simpler code to clever code, you're optimizing for newcomers to your code reading it.

* When a one line comment explains it to you, you'll probably need a paragraph at least for someone outside the field to get started understanding it.

* Think about how you'll respond to someone coming to you and saying "something something prod something something your code is buggy." How will you get enough information to determine if this is true, and to debug it when it is? Logging is one good tool here, so consider what you log carefully.
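
To make the logging point concrete, here is a rough sketch using the standard `logging` module. The function `apply_discount`, the logger name `pricing`, and the log fields are hypothetical; the idea is only that each log line carries enough context (which order, which inputs) to answer "is this my bug?" without reproducing the problem first.

```python
import logging

logger = logging.getLogger("pricing")  # hypothetical module name


def apply_discount(order_id, total, rate):
    """Return the discounted total, logging the inputs that led to it."""
    if not 0 <= rate <= 1:
        # Log the offending value itself, not just "bad rate".
        logger.error("order=%s rejected: rate=%r outside [0, 1]", order_id, rate)
        raise ValueError(f"rate must be in [0, 1], got {rate!r}")
    discounted = total * (1 - rate)
    logger.info(
        "order=%s total=%.2f rate=%.2f discounted=%.2f",
        order_id, total, rate, discounted,
    )
    return discounted
```

Passing the values as arguments rather than pre-formatting the string means the message is only built when that log level is actually enabled.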

Finally, don't be too surprised if you find people talk down to you. Unless
you are in a FAANG company, which it sounds like you are not, developers can
be very condescending towards academics (and people from other fields).

~~~
khitchdee
That is a wonderful point. There is no replacement for an on-the-job
experience as the understudy of a more experienced professional. I have
experienced this first hand and can vouch for it.

------
sbussard
\- Most of your job is to make people happy. Communicate well. Coming from
pure research, it might feel a little uncomfortable at first. Remember, you're
there to consult, and that happens to involve writing code.

\- Go to hackathons to learn to ship code fast and get used to building
"skateboards". Learn how to make tradeoffs that optimize for development
speed. It's not about writing crappy code, it's about optimizing for the right
variables at the right time. There are now a lot of real world variables to
consider.

\- Practice Kanban. Divide and conquer your projects. Make small and focused
pull requests. You will naturally start strategizing on how to do things right
while you're doing things quickly.

\- Using category theory and functional programming in your code, but being
practical about it so others can read it, will really help when it comes to
writing unit tests. Unguided polymorphism is from the devil.

------
pjmorris
I'd suggest "The Pragmatic Programmer" by Hunt and Thomas. It's a compendium
of advice on being an effective programmer compiled from experience. Also,
take a look at "The Practice of Programming" by Kernighan and Pike. It's a bit
more narrowly-focused, but Kernighan and Pike are models for clarity in
programming and in writing.

------
navinsylvester
It is critical for a company to have a concrete on-boarding process. If your
present company doesn't have a good one take this as an opportunity to design
one. You will learn a lot and also help others in the process.

Here are some of the guidelines:

    
    
      # Code style/guidelines
      # Git/version control workflow
      # Testing methodologies & tools used
      # Agile/project management tools used and best practices
      # Read the wiki about infra/services used in production/dev/staging and its workflow
      # Release guidelines & workflow
      # Mentoring process
      # Engineering style/culture

------
cnees
Your coworkers are your best resource.

\- Ask them to review your code and suggest changes

\- Look for questions of taste and ask more. It may feel intuitive to them,
but if you dig in you can often find a good reason/principle behind it.

\- Read your coworkers' code

\- Read the comments people leave on others' code

~~~
mitchellst
This is the best answer here. You've come to the conclusion that you're good
at coding alone, but you don't know how to do it well in a team or company—
your team and company. Other answers frame this as a technology problem
(patterns and practices) but you'll hack it faster as an acculturation task.
Get mentors. Plural. Grab one person in each department where you feel shaky—
QA's, solutions architects, operations, maybe product, etc. Tell them you're
new at this and you want to ask them questions and work closely with them to
get better. (This will not offend them and it will not make them look down on
you. If it does, you don't want what they're selling anyway.) Two months into
asking them for code reviews and just taking them to lunch and asking about
things you know they care about in their areas of responsibility, you'll
notice results in terms of your own thinking and output. 1 year in, you'll be
very, very good at this.

~~~
mertd
I was in the same boat as the op. Dove in head first into a software
engineering role. Ended up working with the only true 10x'er I have known to
this day. Nothing improves you faster than getting feedback from someone like
that.

------
currymj
I have made the jump from writing academic code to working on a product where
actual software engineering was encouraged. Although I did jump back to
academia pretty quickly.

Hitchhiker's Guide to Python is a very good book (freely available online, or
get a copy from O'Reilly); some of it may be obvious but some might not be.

It is true IMO that making your code testable will also make it better
designed. It might even be worthwhile to do completely dogmatic test-driven
development (i.e. always write tests first, then stub out everything with
NotImplementedError, then write actual code until all tests pass) for a while
to get used to it, and force yourself to become familiar with tools for
dependency injection/mocks/etc.

This is complicated by the fact that unit-testing machine learning code can be
unusually tricky; normal unit-testing practices and metrics (e.g. code
coverage) may not be very effective.
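
A tiny sketch of that dogmatic loop, with a made-up `normalize` function. In a real project the stub and the implementation would be the same function edited in place; here the name is simply rebound so the whole cycle fits in one snippet.

```python
def normalize(values):
    """Step 2: stub only, so the test fails for the right reason."""
    raise NotImplementedError


def test_normalize():
    """Step 1: written first; it pins down the behaviour we want."""
    result = normalize([1.0, 1.0, 2.0])
    assert result == [0.25, 0.25, 0.5]


def fails_against_stub():
    """Run the test against the stub and report whether it failed."""
    try:
        test_normalize()
    except NotImplementedError:
        return True
    return False


stub_failed = fails_against_stub()  # the "red" phase


def normalize(values):
    """Step 3: the real implementation; the same test now passes."""
    total = sum(values)
    return [value / total for value in values]
```

For numerical or ML code the exact equality in the test usually has to become a tolerance check, which is part of why testing that kind of code is trickier.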

~~~
currymj
Oh and I don't think you'll be as hopeless a coder as some other posters might
think, because you did your PhD in a computational discipline and know Python
and have heard of unit testing.

There are, say, physics PhDs who only write numerical Fortran or C++ routines
(in one big file, sometimes even in one big function), who really might want
to attend a boot camp or something but it doesn't sound like you're in that
boat.

------
blt
I went back for a PhD after a few years in industry.

As a PhD student, my code and habits do not meet the standard of industry.
This is because I'm constantly changing the whole architecture to try new
ideas, so I optimize for small and simple code at the expense of
testability, modularization, robustness, etc.

It's important to recognize this. You will need to change your style.

I can't recommend any one book. I feel like I mostly learned these lessons
through random articles, lecture videos, conversations, and personal
experience.

IMO, some of the most important principles:

* Implement as much as possible with pure functions (but don't contort the code to achieve this).

* Make your commits as small as possible. Well structured version control history is valuable.

* Spend lots of time thinking about how data flows through your program, more than how the code is organized.

* Strongly prefer DAG dependency structure. Write a set of libraries and then a top level program that uses them.
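
As a rough illustration of the pure-function point, compare an in-place update with a version that returns new data. Both functions and the `score` field are invented for this sketch.

```python
def add_bias_in_place(rows, bias):
    """Impure: mutates the caller's rows, so results depend on call history."""
    for row in rows:
        row["score"] += bias


def with_bias(rows, bias):
    """Pure: builds new rows and leaves the input untouched."""
    return [{**row, "score": row["score"] + bias} for row in rows]
```

The pure version can be tested with nothing but inputs and an expected output, and calling it twice on the same data gives the same answer.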

------
svilen_dobrev
Read "Software Engineering" by Ian Sommerville. Any edition (maybe from 6
onwards, though they are slightly different.. pick the latest u can get). Maybe
skim/skip the (technical) parts u think u know, and read the rest. Most will
not make sense initially.. does not matter, keep reading. u need to get all
that "uploaded" in brain in order to be able to grasp it one day.

It took me 10 years to be able to skip all the technicals. And another 10+
years to understand why u may ever need the rest..

for judgment etc... Maybe pick some big-enough open-source project in a domain
u know well and follow it - how and when they change what. Don't worry, it
does take years to really form your own judgment.

btw u will need some philosophy/methodology/human-side too.. there's not much
of it in the above book.

For more, see the recommended readings on www.svilendobrev.com/rabota/

have fun

------
ttalviste
First of all, SW engineering is a practice with a lot of responsibility. The
main responsibility lies in writing code that is easy to understand. For
example, if you think you write well-written code, then try reading code that
you wrote a couple of months ago. Usually a very painful experience :D

So try to write code for an audience. This has been the trigger for me. Also I
encourage code reviews and TDD.

The main learning resources for me have been the Clean Coder videos by Robert C.
Martin, aka Uncle Bob. They are pure gold. They can feel awkward, but after a
while they make sense.

Also DDD domain driven design is a key topic to tackle.

Books:

\- Clean code

\- DDD by Vaughn Vernon

Videos:

\- Clean coder E1-E52

With these two books and videos you are on a good track! These worked for me.

~~~
Lyren
I can vouch for Clean Coder. We watched them in our company. It's a small dev
team so we took the time together. Afterwards we implemented a 4-line rule
amongst other things.

We don't always hold ourselves to it, sometimes 5-6 line functions make sense,
but we strive toward 4. Sometimes it's as easy as breaking code out into a new
function, but sometimes you just simply have to create a class for it. That
way a lot of complicated code suddenly becomes very easy without much effort.

~~~
JanisL
This honestly sounds extremely limiting. I do get why you'd want to make
functions short in general but I think there's a tipping point where making
the functions shorter actually increases overall complexity and 4 lines seems
to be past that tipping point in my experience.

~~~
Lyren
It's honestly not as limiting as you might think. Readability has definitely
improved a ton since switching.

Of course we don't count every bracket or blank lines. Only the rows with any
logic or assignment.

And yes, I agree, there are occasions where the complexity goes up. If there
is a good argument for that, then we of course go with it.

But so far, almost everything anyone in our team has made, has been improved
by rewriting it to something that works well in 4 lines, be it using
polymorphism, object oriented, or functional.

------
grigjd3
Be patient with yourself. You have talents that are quite useful, but good
code design and architecture are rarely thought about in academics. Realize
that while you spent 4-6 years getting your PhD, your coworkers were becoming
better engineers. That doesn't mean you can't do good work, but for a while
you'll mostly be learning from others.

~~~
tensor
The exception is if you are in the area of study whose entire existence is to
understand what is good code design and architecture.

------
Arnie0426
Agree with a lot of the comments here. I went through this very issue a couple
of years back and I did end up reading a few of the books suggested in that
thread and while they were good reads, I think I learned the most from my
colleagues’ harsh code reviews and developing a slightly thicker skin to those
review comments, and not getting triggered at every single slight
disagreement. I used to write a lot of grad student code at my current work
and got rightly flamed for it (when appropriate)…

These days, I do try and think about the software engineering side of things
first just so that I get quicker +1s from my team, and honestly, all the
_quick and dirty prototypes_ I used to write (I still do, but a lot less)
ended up requiring me to do a lot more debugging/redo-ing/thinking about
scaling up etc later on anyway.

Books can get you a decent idea of what to do, but I think I found reading
code (and especially code reviews of my colleagues for other people’s code)
much more useful. I think reading a few 800-page books to improve your
software engineering skills is a very grad student thing to do. :p. I admit I
did that way too much.

------
baq
whenever you want or need to do something more than shuffling bits between
buckets with different names, do some research. most likely someone already
did it and published a library for it.

test third-party libraries. it's uncommon to find bugs, but it's not so rare
that it happens only to others.

don't forget to leave comments. a lot of my code review questions could be
answered (hence could be not asked in the first place) by a well placed
comment.

sometimes people say that code is documentation or code should read like
documentation. this is false. code can explain (usually poorly) what it does
but it can't give a rationale why it does it the way it's been written, can't
say what it doesn't do, etc. always write some documentation - comments and
commit messages at least. this should be enforced in code review.

i'd say engineering is about not writing code unless absolutely necessary.
code is an asset, but it's also a liability. you really don't want more than
you need.

------
sevensor
I made a very similar transition four years ago. Finished a Ph.D. Started a
job in a related field writing lots of Python. My advice is to take advantage
of your analytical and abstract reasoning skills. Your peers may have more
concrete experience writing software, but you did a Ph.D., which means you
have the patience and tenacity to follow all the threads until you figure out
where they go. That means that where other people might give up, you can
actually figure out how the system works and where a new feature fits in it.
Or why it doesn't work the way anybody thinks it does. Think of reading other
people's code like doing a lit review -- multiple authors, different schools
of thought, arguments about how to do things right -- these have all played
out in the code base and they're there for you to read. As a Ph.D., you have
the ability to pull this all together into something that makes sense.

------
sanderjd
My two cents: you don't need to read anything at this point, you need to
_apprentice_. Go work somewhere where there are experienced developers. Spend
your first weeks there sussing out who is highly respected among your
coworkers and choose one or more of those people that you click with. Then
just brazenly copy all their techniques and opinions for a while. Pretty soon
you'll find yourself disagreeing with some of what they're doing or thinking.
That's natural, but you should resist the urge for a while; some of that stuff
comes from hard-won experience that is hard to explain. Eventually you'll
start going your own way more and more. Sometimes that will blow up in your
face, and that will give you your own hard-won experiences. Before you know
it, you'll be one of the highly respected engineers that the newbies are
cribbing from.

------
anonytrary
Code review, dev/stage/prod workflows all vary on a team-by-team basis. If you
already know what they are and why they exist, there isn't a better way to
"prepare" for these than to just roll up your sleeves and look at how your
current team implements these things.

Good testing practices:

1. Minimize mocking as much as you can -- as a rule of thumb, mocking is
inversely proportional to test confidence.

2. Don't test implementation details, test public-facing APIs. This way, your
implementation can change. Mocking makes this harder. Don't test _how_ you get
things done -- test that they are done.

3. Make sure your API is well defined before you start writing tests, or you
will waste time.

You can find loads of Python testing guides on Google on the first two points.
There will be times when you have to break some of those rules, but knowing
when will come with experience.
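
A minimal sketch of points 1 and 2, assuming a hypothetical `ShoppingCart` class (not from any real library). The tests use a real in-memory object instead of a mock and assert only on the public `add`/`total` methods, so the internal storage could change without breaking them:

```python
import unittest


class ShoppingCart:
    """Hypothetical example class; the internal dict is an implementation detail."""

    def __init__(self):
        self._items = {}  # name -> (unit_price, quantity)

    def add(self, name, unit_price, quantity=1):
        # Keep the newest unit price and accumulate the quantity.
        _, existing = self._items.get(name, (unit_price, 0))
        self._items[name] = (unit_price, existing + quantity)

    def total(self):
        return sum(price * qty for price, qty in self._items.values())


class ShoppingCartTest(unittest.TestCase):
    def test_total_reflects_added_items(self):
        cart = ShoppingCart()
        cart.add("apple", 2, quantity=3)
        cart.add("pear", 5)
        # Assert on observable behaviour, never on cart._items.
        self.assertEqual(cart.total(), 11)

    def test_adding_same_item_twice_accumulates(self):
        cart = ShoppingCart()
        cart.add("apple", 2)
        cart.add("apple", 2)
        self.assertEqual(cart.total(), 4)
```

Saved in a file, this can be run with `python -m unittest`. Replacing the dict with a list or a database row later should leave both tests untouched, which is the point of testing the public API.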

------
AlexCoventry
> the judgment / taste required to make maintainable changes to a million LOC
> repository

Try _The Art of Readable Code_ (a pair of Google authors, IIRC), and
Ousterhout's _A Philosophy of Software Design_.

------
JanisL
Recently I've been involved in transitioning an academic software piece to an
open source library. One of the most noticeable things is the different
priorities and emphasis on what is driving value in these different
environments. The people who were making the code before had priorities mostly
to do with research, the main artifacts were papers and research, the software
_itself_ was not the main artifact. The interesting thing is that they had
good software and research skills so it wasn't a matter of bad skills muddying
the waters and hence gave a great spotlight into how different people can have
different priorities with code. So when we were making it into a library which
others could base their work off there was a big shift in priorities because
the code became an artifact worthy of _directly_ spending more time/money on.
You may find what we wrote about this process interesting as it highlights the
things from a software engineering/open source perspective that were now
important and had to be done to make the project a standalone library useful
for consumption by other developers:
[https://www.customprogrammingsolutions.com/blog/2018-02-25/P...](https://www.customprogrammingsolutions.com/blog/2018-02-25/Persephone-project/)

------
truth_be_told
"Software Engineering" is merely a collection of principles, techniques,
heuristics, structure and practice all validated by trial and error. As such
you have to read a variety of books to get the overall picture. Specifically
books with sizable code for various problems. You may find the following
helpful to get started (many of these can be bought used and cheap):

* Fundamentals of Software Engineering by Ghezzi, Jazayeri and Mandrioli

* The Practice of Programming by Kernighan & Pike.

* Code Complete by Steve McConnell.

* The Unix Programming Environment by Kernighan & Pike

* Advanced Unix Programming by Marc Rochkind.

* C Interfaces and Implementations by David Hanson.

* Large-Scale C++ Software Design by John Lakos

* Unix Network Programming by Richard Stevens.

The key is that while reading the above you need to "get" how the code is
"structured" rather than the details. For example, what does the code for a TCP
server and client "look like"? It is a kind of spatial knowledge which you can
then consider as one "module" of functionality and reproduce as needed. Large
Systems consist of a bunch of layered and well partitioned modules exposing
simple and clean interfaces. There will also be modules which cross-cut all
the functional modules like "Error-Handling", "Logging" etc. This is the core
of "Software Engineering", everything else is details.
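
As a rough illustration of that "shape", here is a minimal sketch of a TCP echo server and client using Python's standard `socket` and `socketserver` modules. The names `EchoHandler`, `start_server` and `echo_once` are made up for this example; the point is the structure (bind and serve on one side, connect, send and receive on the other), not the details:

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Server side: handle one connection by echoing each line back."""

    def handle(self):
        for line in self.rfile:
            self.wfile.write(line)


def start_server(host="127.0.0.1", port=0):
    """Bind and serve in a background thread; port 0 lets the OS pick a free port."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def echo_once(address, message):
    """Client side: connect, send one line, read one line back."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(message.encode("utf-8") + b"\n")
        with sock.makefile("rb") as reader:
            return reader.readline().decode("utf-8").rstrip("\n")
```

To try it, call `server = start_server()`, then `echo_once(server.server_address, "hello")`, and finish with `server.shutdown()` and `server.server_close()`. Each of the three pieces is a small module behind a simple interface, which is the layering the comment describes.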

Finally, you would also need to read a book/source where you can see all of
the above principles put into practice while building a non-trivial (initially
not overly complex) system.

------
da_murvel
I read systems science at university. While we did some technical stuff like
basic Java programming and database design, we mostly focused on WHAT a system
is and how to design one from certain requirements. So when I got my first
job, as a web developer, I hit the ground quite hard. I hadn't really
programmed in my spare time either so I didn't have that backbone experience.
(You might wonder why I got hired in the first place; in hindsight, so do
I.)

Nowadays, some X years later, I identify myself as a backend developer and I
tend to stay out of those "up in the cloud" discussions about what a system
should look like. So how did I get here? First of all, I think I was pretty
lucky having a boss at my first job who wasn't interested in me being really
productive during my first time there, but rather wanted me to learn and grow
with the company. I also had great colleagues, especially my then team lead,
who really took the time and showed me the ropes so to speak. I bought one
book, which I didn't really read. I did some online classes, but I mostly
learned programming, problem solving, TDD, etc. by working.

------
issa
I've worked in a variety of companies as employee and consultant and I have
some counterintuitive advice that applies to building things on the web: In
almost all cases, things like code maintainability, coding "standards", and
TDD, go out the window during actual development. I'm not saying this is good,
just that it happens (this is more a management problem than a software
development one). There are deadlines to meet, changes to make, surprise
features to add, etc. And usually you're just throwing things away and
building new things before any of this comes back to bite you. Being able to
go with the flow and handle chaos --flexibility-- is probably the most
important skill to have. The job ends up being a lot of communication. If
you're lucky, you'll get to code some cool stuff. But you'll also have to
hardcode something clunky and ugly because there was no time to do it right.
Be OK with that.

~~~
not_kurt_godel
> things like code maintainability, coding "standards", and TDD, go out the
> window during actual development

As someone with experience in both academic and professional programming, I'd
say that the difference is that while in the professional world those
principles may be sacrificed at times, in the academic world you are lucky to
work with someone who even knows what they are, much less how to implement
them.

------
bitwize
To quote another great academic, Ray Stantz: "I've worked in the private
sector. They expect results."

Just remember that the business world is going to expect results driven by
their current business needs, not by solving interesting problems. So you want
the shortest path that'll get you from here to there, which means bone up on
the libraries or frameworks that are germane to your company's needs. Learn
what your company's coding standards are from developers who are in the know,
and apply them to your code.

Also, might I suggest finding a company that's at least tangentially related
to what you did your Ph.D. in. That doctorate is going to look great, and your
expertise is going to be super valuable, giving you a much-needed opportunity
to strengthen yourself in the areas of industrial development where you are
weak on the job, while still contributing value.

------
dasmoth
A bunch of comments on this post give pretty good advice about what to expect,
and if you're in your first commercial job it's probably worth going with the
flow. However, one thing I'll add is that it's worth watching out for the
"academic bad, commercial practices good" mindset. Keep your eyes open, form
your own opinions about what is and isn't working. Don't necessarily kick up a
fuss about the things you don't like right now but _do_ file them away for the
future.

[https://yosefk.com/blog/why-bad-scientific-code-beats-code-f...](https://yosefk.com/blog/why-bad-scientific-code-beats-code-following-best-practices.html)
is an interesting counter-point to some of the
usual commercial-vs-academic thinking.

------
WheelsAtLarge
Your best bet is to look into a programming boot camp training program. As you
have seen, Ph.D. academic programming deals more with research, while commercial
programming deals more with delivering a product as fast as possible. They are
two different mindsets.

First, decide on what area you want to specialize in and then look at a reputable
boot camp that fits your goals. You can do it on your own but it's going to
take you a lot longer and it's hard to focus on what you need to learn. Also,
if you could do it, you would have done it already. The advantage is that
you're already used to the scholastic environment and you'll be able to do
very well and be even well ahead of everyone if you challenge yourself rather
than strictly following the curriculum.

------
itronitron
I'm also self taught, have done a lot of research code and product code, have
worked on 'production' teams for over ten years and have had several PhD
students as contributing team members.

The number one thing you can do is read through other people's code. If your
colleagues are very good then you will learn a lot and pick up good habits, if
they are so-so then you will build your self-esteem and sharpen your critical
thinking skills. Some developers are shifty, and others love to talk about
what they are doing and share insights. Spend time with the latter type.

Don't try to be an expert at everything, most teams should have self-selected
individuals that choose to specialize in different areas that the team depends
on.

------
pietro
You won't be alone. In any real organization, you'll be part of an experienced
team that's working towards the same goal as you, and if you're curious, open
and appropriately humble, they'll teach you everything you need to know.

~~~
lsh
In fake organisations you'll be alone, surrounded by passive-aggressive office
culture and every personality on the spectrum from always-hostile to Vulcan-
autistic. If you're able, shop around for a job until you find one of these
real organisations. They do exist.

------
enitihas
If you are OK with reading books, I will recommend Code Complete.

------
marmaduke
I made this move (and back): treat the goal of software engineering like
another topic to do a lit search on, map out the domain, implement a few
papers, etc. Instead of journals, you have Stack Overflow, coworkers, Google. I
got out-coded more times than I can count, but as a PhD you can catch up
quickly by treating it as a domain and problem to analyze and solve.

For Python, the built in docs are already very good, but I use devdocs.io a
lot.

------
thom
Does your institution have a Research Software Engineering group? I think
increasingly universities acknowledge the gap between how academics use
software and how industry approaches it, and I think that would be a fantastic
first step if you were looking for a change.

[https://rse.ac.uk/](https://rse.ac.uk/)

------
pvorb
If your code is reviewed by colleagues and if you review code of your
colleagues, if you do some occasional pair programming you probably don't need
to read books about programming. Concentrate on books that help you with
things like estimations and communication, e.g. "The Clean Coder".

------
sixhobbits
This really depends on what specifically you're struggling with, so going to
take some shots in the dark:

* "Refactoring" by Martin Fowler would probably help with writing good quality code and doing code reviews (or understanding the reasons for changes requested in others' code reviews).

* In my experience, "academic" code tends to be far more prone to very long functions. Understanding the Single Responsibility principle was a very important part of the transition from academic scripts to software engineering for me. If you regularly write functions of 30+ lines, start looking at breaking these down better -- what are you actually doing in each chunk of code? Can it be broken down further?

* In Academia, building software is 90% coding. Now, reading code will be far more important, and you might even spend more time reading code than writing code. Relatedly, the _readability_ of your code is now the most important thing to optimize for (sometimes even at the expense of computational efficiency, you should aim to reduce the number of developer-hours spent wherever possible). This means writing more readable code (good variable names, learning and following style guides and other conventions, following a process even if it feels like a waste of time), using better tooling, more focus on testing, more time documenting, and more time communicating with programmers than actual programming.

* At times this will be frustrating because you'll remember when you could just go into the zone for several days straight and produce something fairly significant. Remember that a lot of the stuff that feels like a waste of time isn't - it's necessary to get out of the local maximum of what was possible when you did everything yourself and could keep the entire project scope in your mental model at all times.

* Whenever you have written code and are 99% sure that it will run fine and not break any other parts of the system ("it was such a small change"), rather assume that it is very likely to break something else. I still find I constantly need to recalibrate my confidence that what I'm changing has a limited area of effect.

* Spend a decent amount of time getting more familiar with things like source control. Assuming you use git, go down this rabbit hole to read about the different git workflows and get as comfortable as possible with flows that your team uses, dealing with conflicts, reversing mistakes, etc.

* Most of what you specifically asked for can only be learned through experience. Experience is gained the most quickly by doing, even if you make mistakes. For software quality, the best (only?) way to learn is to have someone more experienced review your code. The more pedantic they are, the more you will learn. If it takes you 3-4 attempts to get a code review through, then you have a team that will accelerate your learning. If you're getting mainly LGTMs, you might be super competent and had no need to ask this question, but more likely you need to try to find people to push you harder to learn new standards.
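
The Single Responsibility point above can be sketched roughly as follows. This assumes a made-up analysis script that parses measurements, drops outliers and summarizes the rest; all function names are hypothetical. What would typically be one long function in a research script becomes three small, separately testable steps plus a short pipeline:

```python
import statistics


def parse_measurements(lines):
    """Turn raw text lines into floats, skipping blanks and '#' comments."""
    values = []
    for line in lines:
        text = line.strip()
        if text and not text.startswith("#"):
            values.append(float(text))
    return values


def remove_outliers(values, max_deviations=2.0):
    """Drop values more than max_deviations standard deviations from the mean."""
    if len(values) < 2:
        return list(values)
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return list(values)
    return [v for v in values if abs(v - mean) <= max_deviations * spread]


def summarize(values):
    """Reduce cleaned values to the numbers a report needs."""
    return {"count": len(values), "mean": statistics.mean(values)}


def analyze(lines):
    """The former long script body, now a readable pipeline of named steps."""
    return summarize(remove_outliers(parse_measurements(lines)))
```

Each function answers the question "what is this chunk actually doing?" in its name, and a reviewer can read `analyze` without reading the details of any step.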

------
fsloth
Code Complete 2 is kinda the bible on disciplined software engineering
practice. That's a good start.

Other than that - there is not that much _theoretical_ basis. The core
principle is that software engineering is about dealing with a situation where
you have far too many variables to fit into a single person's working memory,
and how to organize a group of people in a way that they can co-operate
without turning the thing into a mess, really fast.

It's more about having a set of understood processes, rather than _what_ those
processes really are, so that people can communicate about and co-ordinate
their work effectively. Of course the processes need to make sense, but
there's no "silver bullet" process that either would fix everything, or,
conversely not following would lead to the end of the world.

"Issues like testing / mocking"

Testing has sadly developed bookish dogma around itself. But it's extremely
practical. The most important automated test is the high-level integration
testing - will this and that work when the customer uses it.

Unit tests are about creating enforceable rules for the production system, which
makes things break sooner and, hence, faster to fix.

You don't want someone else to accidentally break your code - especially
that kinda weird corner case? Write a test - now the rule becomes enforced as a
part of your domain model.
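
A small sketch of "write a test so the rule becomes enforced", assuming a hypothetical helper `split_into_batches` (invented for this example). The second and third tests pin down corner cases that a later, well-meaning change could silently alter:

```python
import unittest


def split_into_batches(items, batch_size):
    """Hypothetical helper: split items into lists of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


class SplitIntoBatchesTest(unittest.TestCase):
    def test_last_batch_may_be_short(self):
        self.assertEqual(
            split_into_batches([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]]
        )

    def test_empty_input_gives_no_batches(self):
        # The corner case callers rely on: no items means no batches, not [[]].
        self.assertEqual(split_into_batches([], 3), [])

    def test_non_positive_batch_size_is_rejected(self):
        with self.assertRaises(ValueError):
            split_into_batches([1], 0)
```

If someone rewrites the helper and the empty-input behaviour changes, the failure shows up in the test run rather than in production.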

" code review,"

Same principle as in writing text. Having someone proofread the things you
write generally improves the quality. No need to be dogmatic.

The second aspect is the zenlike increase in code quality. People know that
their work will be looked at by _someone_, hence they have a higher intrinsic
motivation not to fudge things.

"management of dev / stage / prod workflows"

The only thing that matters there is that there is one agreed process inhouse.
Otherwise things turn really messy, real fast. It's kinda tricky to wrap one's
head around the ramifications of the chosen rules the first time, so that's
why there are a lot of published ways of working.

"the judgment / taste required to make maintainable changes to a million LOC
repository,"

"Working effectively with legacy code" by Michael C. Feathers is a good start.
Now, if the corpus of code has a thorough integration and unit test suite, you
can change things, and if you accidentally break something, the tests will
tell you.

If there are no tests, then, better start writing them. You can't do any large
scale modifications - especially to production code - without them.

Have some tool that automatically tells you the test coverage.

------
9dev
I personally came from the other side of the table: I got a German
"Ausbildung" as a sysadmin, started developing software as a hobby, and have
now made the switch to professional software engineer. While I feel I'm pretty
good at what I do, I'm lacking the algorithm education, the basic concepts and
especially team related skills. Always something new to learn, I guess.

------
Hbthegreat
Get into a startup where a lot of these practices/ideas aren't yet fully
ingrained/adhered to and grow with your team. This will also let you learn
more skills than "just coding" as you will have to wear multiple hats.

Once you are confident you can move on to bigger engineering shops. Or just
stay and have a great time building new things in startup world. :)

------
LifeQuestioner
code design patterns look at 'testing'

------
sonnyblarney
If you're smart enough to do a PhD, you're smart enough to figure out all of
the scaling and operational bits. It's not rocket science. It's just
operations.

Simply by asking relevant questions, practicing some things a bit, you'll get
used to it.

And consider that it's different in every organization, and nobody does it
really well frankly.

Given the pace of change, the varying technologies and flavour-of-the-month
processes, it may always seem a little unwieldy and opaque: the feeling of 'not
knowing everything' never goes away.

And I concur with itronitron: read other people's code on the team, who
are known to 'code well'.

It does not mean that algorithmically it's genius or even good - it just means
that those are the styles/patterns that may be expected of you. It's like
learning to say certain words a certain way. You'll get it soon and then
forget you're doing it.

Don't fret, you'll get all of this quickly.

~~~
drewmassey
This. I transitioned from a (non-STEM) PhD and am now working as a "real"
engineer. It is called a "practice" for a reason, you will get better with
time as long as you are self-reflective about it.

One practical tip - take a look at Dan Bader's book if you are deep in Python,
it has a lot of good stuff.

One philosophical tip - depending on your organization, remember that slow is
fast in engineering. This is somewhat different from more academic computing
environments (at least that I know of). So take the time to get it right and
really deeply understand your solution.

~~~
tamcap
> remember that slow is fast in engineering

Coming from an academic field myself... be careful with this one. Depending on
your personal habits in academia, you might have to learn the opposite - your
code doesn't have to be perfect, it has to work.

Be careful not to end up overthinking the code/design and under-delivering on
the timeline. Missing a time estimate once in a while is usually OK, missing it
consistently and by a lot might become a problem.

