
Ask HN: What overlooked class of tools should a self-taught programmer look into - nathanasmith
15 years ago I learned Python by studying some O'Reilly books and I have been a hobbyist programmer ever since.

The books went into detail, and since reading them I've felt confident writing scripts I needed to scratch an itch. Over time, I grew comfortable believing I had a strong grasp of the practical details, and that anything I hadn't seen was likely either a minor quibble, domain specific, or impractically theoretical.

That was until last year, when I started working on a trading bot. I felt there should be two distinct parts to the bot: one script getting data, then passing that data along to the other script for action. This seemed correct, as later I might want multiple scripts serving both roles and passing data all around. Realizing the scripts would need to communicate over a network with minimal latency, I considered named pipes, Unix domain sockets, even writing files to /dev/shm, but none of these solutions really fit.

Googling, I encountered something I hadn't heard of called a message queue, and more specifically the ZMQ messaging library. Seeing some examples, I realized this was important. Plowing through the docs was nothing short of revelatory: every chapter introduced another brilliant pattern. While grokking Pub/Sub, Req/Rep, Push/Pull and the rest, I kept breaking away, staring into space, struck by how the thing I had just read could have deftly solved some fiendish, memorable problem I'd previously struggled against.

Later, I pondered what it meant that I was only now stumbling on something so powerful, so fundamental, so hidden in plain sight, as messaging middleware.
What other great tools remain invisible to me for lack of even knowing what to look for?

My question: in the spirit of generally yet ridiculously useful things like messaging middleware, what non-obvious tools and classes of tools would you suggest a hobbyist investigate that they may otherwise never encounter?
======
aequitas
Makefiles. I always dismissed them as a C compiler thing, something that could
never be useful for Python programming. But nowadays every project I create
has a Makefile to bind together all the tasks involved in that project:
bootstrapping the dev environment, running checks/tests, starting a dev server,
building releases and container images. Makefiles are just such a nice place
to put scripts for these common tasks compared to normal shell scripts. The
functional approach and dependency resolution of Make let you express
them with minimal boilerplate, and you get tab completion for free. Sometimes I
try more native solutions (e.g. Tox, Docker), but I always end up
wrapping those tools in a Makefile somewhere down the road, since there are
always missing links. And because Make is ubiquitous on nearly every Linux and
macOS system, it is all you need to get a project started.

Example: [https://gitlab.com/internet-cleanup-foundation/web-security-map/blob/master/Makefile](https://gitlab.com/internet-cleanup-foundation/web-security-map/blob/master/Makefile)

Running the 'make' command in this repo will set up all requirements and then
run code checks and tests. Running it again will skip the setup part, unless
one of the dependency files has changed (or the setup env is removed).

~~~
shavenwarthog2
In 2019, Makefiles are a useful tool for automating project-level things. Too
often webapps require you to install X to install Y to produce
artifact Z. Since Make is old, baked, and everywhere, specifying "make Z" is
a useful project-level development process. It's not tied to a language (e.g.
Tox) nor a huge runtime (Docker). Make is small enough in scope to be easy,
and large enough to be capable without a lot of incantations.

The big downside of Make, alas, is Windows compatibility.

~~~
pickdenis
> The big downside of Make, alas, is Windows compatibility.

You'd have to give me a _very_ compelling reason to support developers who use
Windows, when Windows lacks these essential tools. Besides, don't people who
develop on Windows live in WSL?

~~~
ChristianGeek
Nope. I develop in Python, Java, and Kotlin on Windows and never touch WSL.
Make is available natively through Chocolatey (a Windows package installer),
but I prefer Gradle.

(I also write code to run on Linux, but still prefer Gradle.)

~~~
OJFord
Why don't you use WSL?

I can barely understand why you'd want to develop on Windows (ok, for non-
Windows-only products) with it, but without it...

~~~
bunderbunder
If you're already using a Vagrant or Docker-based development workflow, WSL
doesn't really add much, and takes some things away. I/O performance, for
example.

~~~
nickjj
> If you're already using a Vagrant or Docker-based development workflow, WSL
> doesn't really add much, and takes some things away. I/O performance, for
> example.

I've been actively using WSL for over a year along with Docker and set up the
Docker CLI in WSL to talk to the Docker for Windows daemon.

Performance in that scenario is no different than running the Docker CLI in
PowerShell. Or do you just mean I/O performance in general in WSL? In that
case, once you turn off Windows Defender it's very usable. WSL 2 will also
apparently make I/O performance 2-20x faster depending on what you're doing.

WSL adds a lot if you're using Docker IMO. Suddenly if you want, you can run
tmux and terminal Vim along with ranger while your apps run nice and
efficiently in Docker. Before you know it, you're spending almost all of your
time on the command line but can still reach into the Windows cookie jar for
gaming and other GUI apps that aren't available on Linux and can't be run in a
Windows VM.

~~~
bunderbunder
I find that it depends a lot on what you're doing. The real problem with WSL
is I/O latency.

It's acceptable for relatively infrequent file access, but will eat you alive
if you're doing anything that involves lots of random file access, or batch
processing of large sets of small files, or stuff like that.

~~~
nickjj
I just haven't seen that as a problem in my day to day as a developer working
with Flask, Rails, Phoenix and Webpack.

That's dealing with 10k+ line projects spread across dozens of files quite
often, and even transforming ~100 small JS / SCSS files through Webpack. It's
all really fast even on 5 year old hardware (my source code isn't even on an
SSD either).

Fast as in, Webpack CSS recompiles often take 250ms to about 1.5 seconds
depending on how big the project is, and all of the web framework code is close
to instant to reload on change. Hundreds of Phoenix controller tests run in 3
seconds, etc.

------
Robin_Message
Read the curriculum of an undergraduate computer science course and read up on
the things you haven't heard of. Some courses will even have lecture notes
available.

E.g. these four pages are the University of Cambridge computer science
course:

[https://www.cl.cam.ac.uk/teaching/1819/part1a-75.html](https://www.cl.cam.ac.uk/teaching/1819/part1a-75.html)

[https://www.cl.cam.ac.uk/teaching/1819/part1b-75.html](https://www.cl.cam.ac.uk/teaching/1819/part1b-75.html)

[https://www.cl.cam.ac.uk/teaching/1819/part2-75.html](https://www.cl.cam.ac.uk/teaching/1819/part2-75.html)

[https://www.cl.cam.ac.uk/teaching/1819/part3.html](https://www.cl.cam.ac.uk/teaching/1819/part3.html)

(Or a MOOC, but the links above are easy to browse text, syllabuses and
lecture notes, not a load of videos.)

~~~
samrat
Do Cambridge courses not have labs/projects? I looked at the course materials
on a few of the courses and couldn't find any. Or are they given out to
students separately?

~~~
longer_arms
There are hardware and software labs, which are administered on paper by PhD
students. These include(d): ML (the functional programming language),
FPGA/soft-core development, Java tasks, breadboarding some logic, Prolog, and
probably some different ones now (looks like some machine learning tasks?).
Some of them are referenced and described in the links above. There's also a
group project in year 2, an individual dissertation project in year 3, and a
small holiday project between years 1 and 2. Overall, a few students get
through it without being able to properly program, but most basically self-teach.

------
jtolmar
1. Profiler. There's a standard tool that tells you what part of your code is
slow. Over half the time it'll find something dumb and easy to fix instead of
whatever you expected.

2. SQL / relational database schemas. Persistence opens up a lot of
capabilities. And databases themselves are very well optimized; if you do any
nontrivial data manipulation, it's likely that whatever the query planner comes
up with will be faster than your first idea of how to do it by hand.

3. Graph searches. An awful lot of problems can be solved by knowing how to
turn a problem into a graph search. Make sure not to fall into the trap of
thinking a graph search is limited to paths through space: you can solve
problems like "get through this dungeon with keys and doors" by adding
duplicate nodes for the different states.

4. Sequential Bayesian filters. These are almost as useful as graphs, but
aren't in a standard CS curriculum, so you'll look like a wizard. They solve
the problem of "I want to know a thing and I know how it changes over time,
but I can only get rough estimates of its current state." Kalman filters are
simple and give great results when applicable. Particle filters are lower
quality but applicable to more problems, and dirt simple to code.
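
The keys-and-doors trick in point 3 can be sketched in a few lines of Python. The dungeon here is made up for illustration (four rooms in a row, with the key in room 0 unlocking the door between rooms 2 and 3); the important part is that the search state is `(room, has_key)`, not just the room:

```python
from collections import deque

def shortest_path(start, goal, neighbors):
    """Generic BFS: `neighbors(state)` yields adjacent states."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if state == goal:
            return dist
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal unreachable

# State = (room, has_key), so the same room appears as two distinct
# nodes depending on whether we hold the key.
def neighbors(state):
    room, has_key = state
    if room == 0:
        has_key = True  # picking up the key in room 0 is free
    for nxt in (room - 1, room + 1):
        if not 0 <= nxt <= 3:
            continue
        if {room, nxt} == {2, 3} and not has_key:
            continue  # locked door between rooms 2 and 3
        yield (nxt, has_key)

print(shortest_path((1, False), (3, True), neighbors))  # 4: detour via room 0 for the key
```

The same generic `shortest_path` works for any state space you can enumerate; only `neighbors` encodes the problem.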

~~~
rassibassi
Support for 4! Yet my understanding is that particle filters are superior but
computationally more demanding. For nonlinear problems, the extended Kalman
filter linearizes the task, whereas particle filters don't, and work with many
point estimates instead.

I loved this book:
[https://users.aalto.fi/~ssarkka/pub/cup_book_online_20131111...](https://users.aalto.fi/~ssarkka/pub/cup_book_online_20131111.pdf)

and also Thomas Schoen group does great work on Sequential Monte Carlo (SMC),
MCMC for sequential data :)
[http://user.it.uu.se/~thosc112/index.html](http://user.it.uu.se/~thosc112/index.html)

They are also building a probabilistic programming language for sequential
data! [https://github.com/lawmurray/Birch](https://github.com/lawmurray/Birch)

~~~
jtolmar
Regular old Kalman filters are the best (literally optimal) when your problem
fits all their requirements. They also have a lot of nice properties if you're
dealing with a problem that mostly fits their requirements. But the
linear-Gaussian requirement is pretty steep; they don't always work.

I don't like the EKF much and prefer the UKF. The core filtering code is a
little more complex but they're much easier to actually work with; you can
give them arbitrary functions like a particle filter.

Particle filters have the advantage of being able to handle arbitrarily wacky
distributions. But they _are_ random and do some wacky things in edge cases.
They'll behave much more poorly in low-evidence situations than other filters
will. And they fall over spectacularly if you switch from low-evidence to
high-evidence (there's a workaround for this but it's still counterintuitive).
Finally they're just more computationally expensive than the others.
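
For the curious, here is a minimal sketch of the simplest possible Kalman filter: a scalar filter tracking a random-walk state. The noise settings and measurements are invented for illustration; real uses involve vectors, matrices, and a proper motion model:

```python
def kalman_1d(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state x_k = x_{k-1} + noise.

    q: process-noise variance, r: measurement-noise variance,
    x0/p0: initial state estimate and its variance.
    Returns one filtered estimate per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the random-walk model just inflates our uncertainty.
        p += q
        # Update: blend prediction and measurement by the Kalman gain.
        k = p / (p + r)
        x += k * (z - x)
        p *= (1 - k)
        estimates.append(x)
    return estimates

noisy = [5.2, 4.8, 5.1, 4.9, 5.3, 5.0]
print(kalman_1d(noisy, x0=noisy[0]))  # smoothed values hovering near 5
```

The whole filter is two lines of prediction and three of update; the multivariate version replaces the scalars with matrices but keeps exactly this shape.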

Birch sounds interesting, I'll take a look.

------
gwbas1c
Unit testing, mocking, and various other testing techniques.

Why? Any project of sufficient complexity is very hard to test. If all you're
doing is code -> build -> run to debug your code, you can very easily break
something that's not in your immediate attention.

The problem is that good unit testing is hard and time-consuming. It can be
so time-consuming that, unless you can really plan in advance how you test,
you could spend more time writing test code than real application code. (This
is what happens when writing professional, industrial-strength code.)

So, when a hobby project becomes sufficiently interesting, such that the code
is so complicated that your code -> build -> run loop won't hit most of it,
you should think about how to have automated tests. They don't have to be
"pure, by the book" unit tests, but they should be an approach that can hit
most of your program in an automated manner.

You don't need to do "pure" mocking either. If you're writing something that
calls a webservice, you could write a mock webserver and redirect your program
to it. If you're writing something that works with pipes, you could have a set
of known files with known results, and always compare them.

The goal is that you should cover most of your program with code -> build ->
tests; and only do code -> build -> run for experimentation.
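
As a concrete sketch of the mock-webserver idea above (all the names here are hypothetical), a test can spin up a throwaway HTTP server from the Python standard library and point the code under test at it:

```python
import http.server
import json
import threading
import urllib.request

# Code under test: a hypothetical client fetching a price from a web service.
# `base_url` is injectable precisely so a test can redirect it.
def fetch_price(base_url, symbol):
    with urllib.request.urlopen(f"{base_url}/price/{symbol}") as resp:
        return json.load(resp)["price"]

class StubHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Always answer with a canned JSON body.
        body = json.dumps({"price": 42.0}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), StubHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

price = fetch_price(f"http://127.0.0.1:{server.server_port}", "XYZ")
server.shutdown()
assert price == 42.0
```

No mocking framework involved: the program exercises its real HTTP code path, just against a server the test controls.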

~~~
crimsonalucard
You are completely wrong.

Mocking is a huge design smell. The more mocks or integration tests your
project requires to get full coverage, the less modular your program is. A
program that uses many mocks is a sign of very, very poor design. You will find
the code more complex to reason about and much harder to reuse without
a lot of glue code to make things work together. Without proper
knowledge you won't even know the program is poorly designed.

I will grant you that 90% of programmers out there don't know how to design
programs in a truly modular way, so most engineering projects will require
extensive mocking. In fact, most engineers can go through their entire career
without knowing that they are making their programs more complex and less
modular than they need to be. Following certain design principles, I have seen
incredibly complex projects require nearly zero mocking (very, very rare
though).

Mocking indicates a module is dependent on something. Dependency is different
from composition.

    
    
         Dependencies                                Composition
    
    
               C                                        C
     +---------------------+
     |                     |       +----------------+       +-----------------+
     |     A               |       |                |       |                 |
     |                     |       |                |       |                 |
     |        +----------+ |       |                |       |                 |
     |        |          | |    in |                |       |                 |  out
     |        |          | |    -->+       A        +------>+         B       +-->
     |        |    B     | |       |                |       |                 |
     |        |          | |       |                |       |                 |
     |        |          | |       |                |       |                 |
     |        |          | |       |                |       |                 |
     |        +----------+ |       |                |       |                 |
     +---------------------+       +----------------+       +-----------------+
    

What's going on here? Both examples involve the creation of module C from A
and B.

left: 'A' exists as wrapper code around B and is useless on its own. To unit
test A you must mock B.

right: every module is reusable on its own. Nothing needs to be mocked during
unit testing. No dependencies.

The only exception to the right example, where you MUST mock, is a function
that does IO. IO functions cannot be unit tested, period; they can only be
tested with integration tests.

There's a name for the left approach: object-oriented programming using
inheritance or composition (the OOP version of composition, not functional
composition) as a design pattern. (Both are bad.)

There's also a name for the right approach: functional programming using
function composition.

I don't advocate that you strictly follow either style. Just know that when
you go left you lose modularity and when you go right you gain it. All
functional programming does is force your entire program to be modular down to
the smallest primitive unit. Extensive mocking in your program means you went
too far to the left.

Tangent: another irony is that a lot of functional programmers (JavaScript and
React developers especially) don't even know about the primary benefit of
functional programming. They harp about things like "immutability" or how it's
more convenient to write a map-reduce rather than a for loop, without ever
truly knowing the real benefits of the style. They're just following the
latest buzzword.

~~~
tra3
Forgive me if I'm being dense, but doesn't either of these cases depend on
how the composed objects are being used?

In your functional example, A is an input to B (or vice versa?); how do you
propose testing one of the modules without first instantiating the other one?

~~~
crimsonalucard
I'll give you two examples. One functional and the other OOP. Both programs
aim to simulate driving given an input of 10 energy units to find the final
output energy.

    
    
      # oop

      class Engine:
          def __init__(self, energy):
              self.energy = energy

      class Car:
          def __init__(self, engine):
              self.engine = engine

          def ignite(self):
              self.engine.energy -= 1

          def run(self):
              self.engine.energy -= 1

          def drive(self):
              self.ignite()
              self.run()
              return self.engine.energy

      engine = Engine(10)
      car = Car(engine)
      car.drive()  # result 8

      # ignite not testable without an engine
      # run not testable without an engine
      # drive not testable without an engine and a car
      # ignite, run, and drive are not modular; they cannot be used without an engine
      # Engine testable with any integer
      # Car useless without Engine


      # functional

      def composeAnyFunctions(a, b):  # returns function C from A and B; see illustration above
          return lambda x: a(b(x))

      def ignite(total_energy):
          return total_energy - 1

      def run(total_energy):
          return total_energy - 1

      drive = composeAnyFunctions(run, ignite)
      drive(10)  # result 8

      # composeAnyFunctions testable with any pair of functions
      # run testable with any integer
      # ignite testable with any integer
      # drive testable with any integer
      # all functions importable and reusable with zero dependencies
      # input_energy -> ignite -> run -> output_energy

"I think the lack of reusability comes in object-oriented languages, not
functional languages. Because the problem with object-oriented languages is
they’ve got all this implicit environment that they carry around with them.
You wanted a banana but what you got was a gorilla holding the banana and the
entire jungle." \- Joe Armstrong

You don't necessarily need the car or the engine to simulate the energy output
of driving.

~~~
profalseidol
I've been using static methods in Java that follow the pure-function way, and
it proved very easy to maintain, even for those who inherited my code later on.

~~~
crimsonalucard
That's mainly just namespacing. The only point of using an object in
object-oriented programming is to combine behavior and state, to unify them
into a single primitive. This combination breaks composability.

Static functions avoid state. You put them in an object in Java because Java
has to have everything in an object. In any other language these would just be
top-level functions namespaced into a package or something. You are basically
using Java in a more functional way, which is fine.

------
kyllo
Learning your various options for persisting data in depth is very useful,
since most applications have to deal with persistence in some form, and
increasingly in a distributed manner. Go beyond simply skimming the surface of
SQL vs. NoSQL and the marketing claims different databases make about their
scalability and consistency. Learn what ACID and CAP stand for and the
tradeoffs involved in different persistence strategies. Learn SQL really well.
Learn how to read a query plan, which is the algorithm your SQL query gets
compiled into. Learn about the tradeoffs of row-based vs column-based storage.
Learn how indexes work, and what a B-tree is. Learn the MapReduce pattern.
Think about the tradeoffs between sending code to run everywhere your data is
stored vs. moving your data to where your code is running.
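
As a small taste of reading query plans, SQLite (which ships with Python) will show you how a query gets executed and how adding an index changes that; the table and data below are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT, qty INTEGER)")
con.executemany("INSERT INTO trades (symbol, qty) VALUES (?, ?)",
                [("ABC", 10), ("XYZ", 5), ("ABC", 7)])

query = "SELECT qty FROM trades WHERE symbol = ?"

# Without an index the planner has no choice but a full table scan.
before = con.execute("EXPLAIN QUERY PLAN " + query, ("ABC",)).fetchall()
print(before)   # detail column typically mentions a SCAN of trades

con.execute("CREATE INDEX idx_trades_symbol ON trades (symbol)")

# With the index, the plan switches to a B-tree index search instead.
after = con.execute("EXPLAIN QUERY PLAN " + query, ("ABC",)).fetchall()
print(after)    # detail column typically mentions SEARCH ... USING INDEX
```

On three rows the difference is invisible; on millions of rows it is the difference between milliseconds and minutes, which is exactly what reading plans teaches you to spot.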

~~~
theossuary
Two great resources I've been going through are

- [https://dataintensive.net](https://dataintensive.net) - Really deep dives
into different types of data storage solutions, their history, and how they
_actually_ work.

- [http://www.cattell.net/datastores/Datastores.pdf](http://www.cattell.net/datastores/Datastores.pdf)
- A good paper that helps differentiate similar-but-different datastores.
Really helpful when you're trying to pick a modern data solution.

~~~
barbecue_sauce
Designing Data-Intensive Applications is probably the best O'Reilly (if not
overall technology) book of the past decade.

~~~
btown
The talk on "Turning the database inside-out" [0][1] by the author, Martin
Kleppmann, is a fantastic intro to these dynamics, and it's something I'll
always recommend to both experienced and inexperienced data modelers and
backend developers.

It goes pedagogically through the way things are typically done in a
relational database so clearly that, word for word, it's one of the best
tutorials I've seen... but it also weaves a narrative of "how can this be
done better/more scalably/more reliably/more flexibly-to-business-needs,"
pointing to a streaming/event-sourcing architecture. You may or may not need
the latter right away, but it's a fantastic tool to have in your toolbox, to
be able to say "ah, this new requirement feels like it would benefit hugely
from this architecture."

Especially for OP who's starting to think about the "why" of messaging queues,
this could be a fantastically valuable first step.

[0]
[https://www.youtube.com/watch?v=fU9hR3kiOK0](https://www.youtube.com/watch?v=fU9hR3kiOK0)

[1] [https://www.confluent.io/blog/turning-the-database-inside-
ou...](https://www.confluent.io/blog/turning-the-database-inside-out-with-
apache-samza/)

------
jimpudar
Learning how to use dtrace / bpftrace [0] is very valuable if you ever need to
get into serious systems profiling.

There are some really cool data structures out there you might not know about.
One of my favorite basic ones that I get a lot of use out of is the trie [1]
(a.k.a. prefix tree). Very useful for IP calculations.

Also look into probabilistic data structures [2], very amazing things can be
done with them.

[0]
[https://en.wikipedia.org/wiki/DTrace](https://en.wikipedia.org/wiki/DTrace)

[1] [https://en.wikipedia.org/wiki/Trie](https://en.wikipedia.org/wiki/Trie)

[2]
[https://en.wikipedia.org/wiki/Category:Probabilistic_data_st...](https://en.wikipedia.org/wiki/Category:Probabilistic_data_structures)
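
As a sketch of why tries fit IP calculations: longest-prefix matching (what a routing table does) falls out naturally from walking address bits down a binary trie. The routes below are invented for illustration:

```python
import ipaddress

class PrefixTrie:
    """Binary trie over address bits; each node may carry a route value."""

    def __init__(self):
        self.root = {}  # nested dicts keyed by '0'/'1'; "value" marks a route

    def insert(self, cidr, value):
        net = ipaddress.ip_network(cidr)
        node = self.root
        # Only the first `prefixlen` bits of the network address matter.
        for bit in format(int(net.network_address), "032b")[: net.prefixlen]:
            node = node.setdefault(bit, {})
        node["value"] = value

    def longest_match(self, addr):
        node, best = self.root, None
        for bit in format(int(ipaddress.ip_address(addr)), "032b"):
            if "value" in node:
                best = node["value"]  # remember the deepest route seen so far
            if bit not in node:
                break
            node = node[bit]
        else:
            if "value" in node:
                best = node["value"]
        return best

routes = PrefixTrie()
routes.insert("10.0.0.0/8", "corp")
routes.insert("10.1.0.0/16", "lab")
routes.insert("0.0.0.0/0", "default")

print(routes.longest_match("10.1.2.3"))  # 'lab': the /16 beats the /8
print(routes.longest_match("10.9.9.9"))  # 'corp'
print(routes.longest_match("8.8.8.8"))   # 'default'
```

Each lookup is at most 32 steps regardless of how many routes are loaded, which is the property that makes tries attractive here.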

~~~
Const-me
The approximate Windows equivalent of DTrace is Sysinternals Process Monitor,
freeware. Very useful sometimes.

~~~
Birch-san
The Windows equivalent of DTrace is... DTrace. [0] DTrace is about far, far
more than snooping the filesystem. At best, Process Monitor is an equivalent
of Brendan Gregg's DTrace utility opensnoop. The true power of DTrace is to
correlate events across subsystem boundaries, like graphing the top quartile
of latencies from network accesses initiated via a given function in your
application.

[0] [https://techcommunity.microsoft.com/t5/Windows-Kernel-
Intern...](https://techcommunity.microsoft.com/t5/Windows-Kernel-
Internals/DTrace-on-Windows/ba-p/362902)

------
nickjj
Shell scripting for processing text. You can often get so much done with so
little code and effort.

Also on a semi-related note, I think as a self taught programmer, it's easy to
get stuck on things that seem cool but are just procrastination enablers (I
know, I've been guilty of it for 20 years). Like, if you're about to start a
new project and you want to flesh out what it's about, you really don't need
to spend 5 hours researching which mind map tool to use. Just open a text
document and start writing, or get a piece of paper and a pen. It won't even
take that long.

I spent about 1.5 hours the other day planning a substantially sized web app.
All I did was open a text file and type what came into my head. For fun I
decided to record the whole process too[0]. I wish more people recorded their
process for things like that, because I find the journey more interesting than
the destination most of the time. Your journey of eventually finding
message queues must have been quite fun, and you probably learned a ton (after
all, it led you to message queues, so it was certainly time well spent).

[0]: [https://nickjanetakis.com/blog/live-demo-of-planning-a-
real-...](https://nickjanetakis.com/blog/live-demo-of-planning-a-real-world-
web-application-from-scratch)

~~~
slightwinder
These days it might be better to just learn Python. It's cleaner and scales
better to complex code. And it's available out of the box on most modern
systems where shells are available too. Shells are still good for simple
one-liners and tying multiple processes together, but text processing
involves so many different commands, each with their own quirks, that a
consistent, simple language is IMHO superior.
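
As a rough illustration of the trade-off, here is a Python equivalent of a classic `grep | cut | sort | uniq -c` pipeline, counting error messages in a made-up log:

```python
from collections import Counter

log = """\
2019-05-02 12:00:01 INFO starting up
2019-05-02 12:00:05 ERROR timeout talking to broker
2019-05-02 12:00:09 ERROR timeout talking to broker
2019-05-02 12:00:12 WARN slow response
2019-05-02 12:00:15 ERROR disk full
"""

# Filter, project, count -- the whole pipeline in one consistent language.
errors = Counter(
    line.split(maxsplit=3)[3]   # drop date, time, level; keep the message
    for line in log.splitlines()
    if " ERROR " in line
)
print(errors.most_common())  # [('timeout talking to broker', 2), ('disk full', 1)]
```

The shell version is shorter to type interactively; the Python version is easier to extend once you need real parsing, dedup rules, or output formats.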

~~~
chaostheory
Honestly, this applies to Ruby and JavaScript/TypeScript as well, not just
Python. I really don't see the value of learning shell scripting anymore when
the newer languages are just as easy to learn and terse, and you can adapt
better to changing conditions when needed with libraries.

~~~
latexr
I often find multi-line Python scripts with `import os` and others that could
be a fraction of the size (and just as clear) in bash. Even more ridiculous
are the times I find a node script (published to npm, even) that is little
more than a wrapper around a shell script.

Inevitably someone will read these arguments and think “those are just bad
programmers”, but your point was that you “don't see the value of learning
shell scripting”. The value is in not spewing absurd code like that. Shell
commands are fast and efficient. There isn’t an emphasis on libraries because
instead you use _tools_. Is `grep` not enough? Try `the silver searcher`[1] or
`ripgrep`[2].

Are shell scripts the best instrument for every job? No, but no tool is.

[1]:
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

[2]:
[https://github.com/BurntSushi/ripgrep](https://github.com/BurntSushi/ripgrep)

~~~
chaostheory
Just because someone knows shell scripting, I'm not going to consider them a
"bad programmer". I know it myself, and I used it extensively before I learned
Python and Ruby.

My point is that the cost of learning and using shell scripts is just too
high compared to using a modern language that's just as terse and a lot
more powerful and flexible. Context switching from one language to another
isn't free either.

IMO the only time shell scripting was practical was when the only major
programming languages were C, C++, and Java. Even Perl 5 is more practical
than shell scripting.

Also, I doubt that the Python program you mentioned was that much bigger than
a shell script.

------
m_fayer
[https://dataintensive.net/](https://dataintensive.net/)

I can't recommend this book enough. I have a CS background, and still had
quite a few "I can't believe this thing has been hiding in plain sight!"
moments while reading it.

~~~
copperx
I'm now torn between reading this one first or the Architecture of Enterprise
Applications.

~~~
jb3689
I loved Designing Data-Intensive Applications. It gives you the reasons why
NoSQL databases exist and the problems they solve. Moreover it gives you
reasons to select one over another. It's really excellent, and one of my top
two CS books.

~~~
mleonard
Your other top CS book out of interest?

------
nerpderp82
Debuggers and property-based testing. It is a select few people who can
actually productively (not by their own metrics) use print statements for
debugging. Learning how to craft repro scenarios and adequately capture
state in a debugging session can enable junior devs to easily surpass senior
devs.

Property-based testing isn't quite formal methods, but I think it is a good
stepping stone. It also somewhat forces your code into an
algebraic/functional style, which makes it amenable to refactoring and better
testing, and easier to understand.
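
To show the idea without any particular library (Hypothesis is the usual choice in Python), a property-based test asserts an invariant over many generated inputs rather than a few hand-picked ones. The run-length codec below is invented purely for illustration:

```python
import random

# A deliberately simple implementation under test.
def run_length_encode(s):
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1][1] += 1
        else:
            out.append([ch, 1])
    return out

def run_length_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

# Property: decode(encode(s)) == s for *any* string, not just a few
# examples. A library like Hypothesis generates and shrinks failing
# cases for you; here we just sample randomly to show the idea.
rng = random.Random(0)
for _ in range(1000):
    s = "".join(rng.choice("ab") for _ in range(rng.randrange(20)))
    assert run_length_decode(run_length_encode(s)) == s
```

Stating the round-trip invariant forces you to think algebraically about the code, which is exactly the stepping stone toward formal methods mentioned above.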

Design tools like Swagger can help one think through services without diving
into code. Code itself is a liability and should be thought of as "spending,"
not creating. Code is a debt.

Refactoring and code-understanding tools: if you use PyCharm (you should, it
is free in all senses), learn how to navigate into your libraries. Read your
libraries.

~~~
marktangotango
This x1000. Debuggers are seriously undervalued by many developers. It's like
a superpower.

~~~
thetwentyone
What are some good resources to learn about debugging patterns and
tips/tricks? My preferred language, Julia, recently introduced a nice set of
tools related to debugging. I feel like there are probably things that would
make me more productive, and I think the techniques would be more broadly
applicable than a specific language.

~~~
drainyard
One thing I always do these days is step through any new code I've written
the first time I run it. This usually weeds out bugs that might otherwise
take a while to find, because they are easy to miss. It also ensures that you
actually go through each line of code you write, doing a forced code review
on yourself early on.

------
enkiv2
I highly recommend learning Prolog and understanding how to write your own
simple planner system. The hairiest real problems are hairy because they're
best suited to a declarative style (and programs written declaratively can be
made much more efficient through more clever solvers -- given naive code, a
clever solver has a much bigger efficiency boost over a dumb solver than an
optimizing compiler does over a non-optimizing one -- although Prolog itself
leaks too much abstraction for many of these techniques to be viable in it).

I also recommend understanding the message routing systems used in file
sharing, like Chord.

If you don't have a strong background in the math behind theoretical computer
science, you might benefit a lot from an understanding of the formal rules
around boolean logic, symbolic logic, and state machines -- especially rules
about when certain kinds of things are equivalent (since things like De
Morgan's laws are used a lot for simplifying and optimizing expressions, and
rules for state machines are used to prove upper limits on resource usage).
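
The equivalence rules mentioned above can even be machine-checked by brute force for small expressions; here is a quick sketch verifying De Morgan's laws over every truth assignment:

```python
from itertools import product

# Exhaustively compare two boolean functions over all assignments.
# The same brute-force equivalence check works for any small expression,
# which is what justifies an optimizer replacing one form with a cheaper one.
def equivalent(f, g, nvars):
    return all(f(*v) == g(*v) for v in product([False, True], repeat=nvars))

assert equivalent(lambda a, b: not (a and b),
                  lambda a, b: (not a) or (not b), 2)
assert equivalent(lambda a, b: not (a or b),
                  lambda a, b: (not a) and (not b), 2)
print("De Morgan's laws hold for all assignments")
```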

If you don't already, learn to use awk. It's a much more powerful language
than it seems, and it fits extremely well into the gap between prototyping
shell one-liners on the command line and porting a prototyped tool to Python
or Perl, so it's a huge time saver: it is faster to write many kinds of tools
in a mix of shell and awk and then rewrite them in Python than it is to write
them in Python in the first place.

~~~
bdamm
I've never used Prolog in the 15 years since I learned it in college. It's an
interesting take on programming, for sure, and I appreciated the
mind-expanding exercise, but it hasn't helped me in my career at all.

Totally agree on awk. I use it almost every day for quick little one-liners.
Big time saver.

Also agree on state machines, because from there it is a short hop to
understanding formal grammars and the foundation of compilers and languages,
which has been immensely useful in my career.

~~~
doboyy
Learning about automata theory was one of the most mind-expanding experiences
I had in college. It was certainly not something I would've stumbled upon
without guidance. I believe that describes most of the value I derived from a
degree, being nudged in the right directions toward solutions and problems
that a lot of smart people have thought about for a while.

Understanding how different languages, or inputs more generally, can be
transformed into meaningful outputs is pretty satisfying. It's a topic that
almost seems to transcend the realm of computer science.

------
agentultra
Formal methods.

It took me nearly a decade of working in distributed systems to be introduced
to TLA+ and other tools in this space. Until then my knowledge had been built
from textbooks describing the fundamental data structures, algorithms, and
protocols... but those texts take an informal approach with the mathematics
involved. And since I was self-taught I was reading those texts with an eye
more for practical applications than for theoretical understanding. I had no idea
that a tool existed that would let me specify a design or find potential flaws
in systems and protocols, especially concurrent or parallel systems, with such
ease.

I think type theory and category theory have also been great tools to have...
but I think mathematics in general is probably the more useful tool. Being
able to think abstractly about systems in a rigorous way has been the single-
biggest booster for me as a practitioner.

~~~
pickdenis
> the single-biggest booster for me as a practitioner.

Can you instantiate this claim with an example? I'm somewhat knowledgeable in
both math and computer science theory but have yet to feel as though my math
background has helped me in practical CS.

~~~
agentultra
About 3-4 years ago I was working on an open source cloud platform for a
company deploying a public cloud. There was a particular error that
_sometimes_ happened in our production environment where a rebooted VM would
come up but couldn't connect to the network.

We tracked it down to a race between two services in the data plane. It turns
out the VM controller wouldn't wait for the network controller to unplug a
virtual interface before requesting the interface to be plugged back in. There
was a lack of co-ordination happening. However it only happened when the
network component was under heavy enough load that it would take too long to
respond before the VM service finished rebooting the VM -- usually it was fast
enough and the error wouldn't appear.

I managed to model this interaction at a high level in TLA+. From there I had
suspected that the error was in the mutex locking code in the async library
this system depended on so we refined the model pretty close to that
implementation. As I recall we found that the mutex code wasn't the culprit --
a fine result. We ended up implementing some light-weight co-ordination
mechanism to ensure that the VM service waits to acknowledge the progress of
the network service.

Since then I've continued to use TLA+. I find that programming languages are
not expressive enough to describe high-level interactions between
other processes, network events, and humans.

~~~
ilaksh
I'm sorry but I don't see why the TLA+ modeling was necessary. You said that
you noticed the lack of coordination before that. From your description, it
seems like the mutex thing was a diversion. Anyway, a mutex would not
necessarily be adequate for coordination (and many types of mutexes will not
work between processes at all).

So to me it seems that you could have gone straight to the lightweight
coordination mechanism without the TLA+ model. And anyway, if there was a
problem with the mutex, you could test that theory by doing additional logging
or an experiment around the mutex functionality.

~~~
agentultra
Sorry for what? It was my first time using such a tool. I found it useful for
understanding the system. I've improved my understanding of TLA+ since then
and it has been valuable.

------
ilaksh
The premise that self-taught programmers necessarily have core holes in their
knowledge and skills that would have been filled if they had a CS degree is
entirely false.

Start with the example you gave of messaging middleware. There are many BS CS
curricula that do not address this at all. Also he mentioned that he had
already learned about named pipes on his own. For many applications, named
pipes could be a perfectly valid alternative to some external message queue
system.

Looking at the items submitted, the vast majority are core skills that would
necessarily be picked up by people who need to work with them. The idea that
someone would not know about Makefiles or debugging or profiling or SQL just
because they were a hobbyist or self-taught is ludicrous. If you are serious
about C programming, whether it's a job or your hobby, you are going to learn
about Makefiles. Likewise anyone seriously working on a data-centric
application is likely to become well-versed in some database technology, up
until a few years ago that would have automatically been relational.

And one other thing. Some of the most important skills in programming are in
the domain of software engineering. Software engineering is very poorly
addressed by many BS CS programs. So again, whether they have good SE skills
is often not going to be determined by whether they have a BS degree or not.
It's not even necessarily determined by whether they are working in a
professional environment. It's mainly going to be a factor of their motivation
to self learn and above all practical experience.

~~~
djhaskin987
My experience has been quite different. True that some of the most technically
skilled programmers I know of had no degree, but the polished ones, the ones I
find easier to work with, tend to have one. Further it's pretty easy for me to
tell if a person's degree was a CS degree or not just by talking to someone
about the problems they have and how such problems might be solved with code.

That's not to say it's required; some of the best professionals I know have
non-CS degrees (one in fine art -- painting) or no degree. But if you're still
young, I submit that a CS degree is totally worth your time.

~~~
ilaksh
The argument that I was making was clearly stated at the beginning of my
comment. It is significantly different from the supposed argument that you
seem to be refuting.

Notice how I did not say that "the developers that certain degree holders find
to be most 'polished' and easiest to work with will on average not have a
degree".

Notice how I also did not say that a CS degree was not worth people's time.

------
new4thaccount
SQL (even if just SQLite), as databases open up a lot of power.
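Python's standard library ships SQLite, so trying SQL costs nothing. A small sketch with an invented table:

```python
import sqlite3

# An in-memory database: nothing to install, nothing to clean up.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trades (symbol TEXT, qty INTEGER, price REAL)")
con.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("AAPL", 10, 150.0), ("AAPL", 5, 152.0), ("GOOG", 2, 1200.0)],
)

# An aggregation that would take a loop and a dict in plain Python:
for symbol, total in con.execute(
    "SELECT symbol, SUM(qty * price) FROM trades GROUP BY symbol ORDER BY symbol"
):
    print(symbol, total)
```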

Vim or Emacs for powerful text editing.

A low level language. Sometimes Python doesn't cut it, or it is pretty
suboptimal. If you're writing a trading bot, is speed of execution not
important?

Operating System knowledge can be helpful at times. I bought one of the No
Starch books "How Linux Works" and it is very helpful.

The command line and by that I guess you should know the common Linux commands
(cat, grep, sort, uniq, head, tail, ls, top) if you use Linux and how to chain
them together via pipes. To give some context, I can write one command which
would require 8 lines of Python (saves you valuable time). If you use Windows,
learn enough Powershell to be comfortable with it. On occasion I'll use
Powershell over Python even though it is dirt slow for reading files.

~~~
afarrell
I thought this course did a good job at getting SQL to stick in my brain,
largely due to the relational algebra section.
[https://lagunita.stanford.edu/courses/DB/2014/SelfPaced/abou...](https://lagunita.stanford.edu/courses/DB/2014/SelfPaced/about)

~~~
new4thaccount
The W3Schools site is what helped me

[https://www.w3schools.com/sql/](https://www.w3schools.com/sql/)

It has a database you can query and add to and what not.

------
mnemotronic
* Regular expressions. The most valuable "not a language" that I know.

* SQL. For me, it was a trip coming from a procedural language background. I
kept trying to figure out how to do loops.

* Command line - DOS and unix

* Batch languages for those.

* HTML / CSS / Java (please don't kill me) Script and the DOM

~~~
JamesBarney
Second regular expressions. I use them all the time to edit large bundles of
code.
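For example, a one-line `re.sub` in Python can rewrite a whole batch of code at once (pattern and snippet invented for illustration):

```python
import re

src = "x = obj.get_name(); y = obj.get_size()"
# Rewrite get_foo() calls into plain attribute access: obj.get_name() -> obj.name
out = re.sub(r"\.get_(\w+)\(\)", r".\1", src)
print(out)  # x = obj.name; y = obj.size
```

The same pattern works in any editor with regex search-and-replace; the `\1` backreference is what does the heavy lifting.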

------
AaronFriel
Value/message/actor/event oriented programming like you mentioned is useful
for building distributed systems. I am a huge fan of going a step further and
learning this model:

Imperative shell, functional core.

The external shell of code in a project is responsible for network
connections, console IO, etc. But the internal guts of a program should be
largely functional, that is, instead of mutating (changing) values, consider
returning different forms of the same value.

Decisions (branches, logic) are made at one level, data dependencies at
another.

The talk Boundaries by Gary Bernhardt describes this model in detail:
[https://www.destroyallsoftware.com/talks/boundaries](https://www.destroyallsoftware.com/talks/boundaries)
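A minimal Python sketch of the split (all names invented): the core is a pure function you can test without mocks; the shell owns the IO.

```python
def apply_discount(order: dict, percent: float) -> dict:
    """Functional core: pure, returns a new value instead of mutating."""
    return {**order, "total": order["total"] * (1 - percent / 100)}

def main() -> None:
    # Imperative shell: all IO lives at the edge.
    order = {"id": 1, "total": 100.0}   # in real life: read from a request or db
    updated = apply_discount(order, 10)
    print(updated["total"])             # in real life: write a response

if __name__ == "__main__":
    main()
```

Because the core never touches the outside world, its tests are just plain function calls, and the original value is still intact after the call.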

~~~
btown
I'm also a fan of "imperative shell, functional core, imperative
implementation of that core," where within a function that you promise to have
no side effects or very specific side effects (say, a component's render
function, or a transformation from one complex data representation to
another), you should still feel free to use loops, imperative-style control
flow, even network requests, etc. I find this puts folks used to "imperative
everything" at ease, while still maintaining almost all of the benefits of
functional-core provided that execution is scheduled properly.

~~~
lenticular
It's really easy to do this with pure functional programming in impure
languages like Scala. You can use arbitrary impure code within lazy IO values.
Outside, the functional purity makes reasoning easy. If, as recommended, there
is only a single point in your program where the IO values are actually
executed, then the execution order can be reasoned about statically.

------
cameronbrown
I'm surprised nobody's mentioned spreadsheets - specifically Google Sheets or
something scriptable and hosted. Recently I've built up a small system which
sucks in data from a few places (fitness, task management, calendar,..) and
analyses it against several goals I've set. This means I can see how I'm
progressing towards what I want without even touching it anymore.

They're a really nice UI for bootstrapping projects, and even some small
databases. A current project of mine for a client uses sheets as a backing
store for an email collection list. Since there's only a few hundred rows, it
makes sense at this scale and works really well for non-technical users.

~~~
Forge36
Similarly Microsoft Access. I learned Access for my first job (100% of my job
was moving tasks into Access, 3 years later I helped some interns transition
to a SQL server+web application). 10 years later I'm still using Access for
some aspects of my new job. The ability to quickly prototype something and
receive almost immediate value is vastly underestimated (I know of one company
which runs all of their tasks out of Access).

~~~
cameronbrown
I suppose what I've described is just a little more of an unstructured version
of this. Mixing data and computation in such a way is definitely a tech debt
tradeoff though.

------
vosper
Your IDE, its refactoring tools, but especially its debugger.

VSCode or PyCharm (assuming you are still a Python developer) could be a good
place to start. I'm always surprised when I see professional developers coding
in Sublime Text and debugging with print statements (or their equivalent).
Usually you have better options than that, especially for statically-typed
languages - but even for JS and Python.

~~~
collyw
I don't know how people could work on large scale projects without a step
through debugger.

------
augbog
Honestly I still find a lot of engineers don't know git properly. Like they
know enough to commit and push but that's about it. It really helps to
understand everything git has to offer.

~~~
333c
The three most important basic git operations to know (in my opinion):

    git checkout -b
    git log
    git rebase -i

~~~
btilly
I strongly prefer git merge over git rebase.

Using rebase results in a cleaner history and simplified workflow in many
cases. However it also means that when you have a disaster, it can be truly
unrecoverable. I hope you have an old backup because you told your source
control system to scramble its history, and you don't have any good way to
back it out later.

For those who don't know what I mean, the funny commit ids that git produces
are a hash signifying the current state of the repository AND the complete
history of how you got there. Every time you rebase you take the other
repository, and its history, and then replay your new commits as happening
now, one after the other. Now suppose that you rebased off of a repository.
Then the repository is rebased by someone else. Now there is no way to merge
your code back except to --force it. And that means that if your codebase is
messed up, you're now screwed up with history screwed up and no good way to
sort it out.

That result is impossible if you're using a merge-based workflow. The cost is,
though, that the history is accurately complicated. And the existence of a
complex history is a huge problem for useful tools like git bisect.

~~~
james_s_tayler
I don't know. Since git uses content-based addressing, you can't actually
alter any commit, only create new ones. And orphaned commits don't get garbage
collected for like 30 days even if you explicitly tell git to clean things up.
So, the original stuff will still be there. It might just not be obvious how
to access it. Part of the commit is the reference to zero or more parent
commit object ids. So, if you find the old commit, it still has its history
intact. `git log -g` is a handy command to see a composite git log that
travels across branch changes.

I do get what you mean though. You effectively create new commits with an
alternate view of history. I don't get quite why/how that causes a situation
in which the code can't be merged? I don't rebase much, I prefer merging. Is
there any resource that can explain why rebasing might be dangerous like that?

In general if branches diverge too far then you have difficulties merging no
matter which strategy you use and sometimes if it diverges too far it just
becomes hopeless. Mostly though if you are working in a team, commit daily and
merging/rebasing frequently it should present fairly few problems.

I find I never run the actual command git bisect. I just do `git log
--decorate --oneline --graph` and eyeball a good commit to start from and then
basically do it by hand using commit messages to aid in making reasonable
guesses as to where to try but following the basic binary search philosophy.
Works well enough even with a complex history.

~~~
btilly
Here is an example of how to create a problem.

You rebase your private branch off of a shared master and pull in other
people's commits. Someone else pushes out their rebased version using force.
More commits are made on top of the other people's commits, including
reversing some bad commits. You try to rebase off of the shared master.

In your last rebase, you are trying to replay all of the commits in your
history that are not in the remote history. However git does not understand
which local commits are from you and not pulled in on the previous rebase. It
therefore tries to play them on top of the remote master if it can make sense
of them. Which means that you bring back the reversed commits. You might find
conflicts in code that you have not touched. You resolve them as best you may.
And now you've got the definitive version of what happens, and no way with the
screwed up history to figure out why it is going to go wrong. Then you force
commit because that is how a rebase flow works... and everyone is screwed.

I agree on branches diverging too far. Merge early, merge often.

If you never run the command to git bisect, you should try it. What it's for
is finding the random commit that recently broke a piece of functionality that
nobody realized would break. Because nobody realized it, the log messages will
say nothing useful. And you don't need to figure out where the change is -
just write a test program for the breakage, run git bisect, and look at the
offending commit.

~~~
shandor
> Then you force commit because that is how a rebase flow works

Absolutely not. Force pushing a shared master is probably the worst sin one
can commit with git. I guess you already have come upon the 'why' of it.

A "Rebase workflow" works so that devs use rebase to 'move' their work on an
updated master after a pull/remote update, resolve potential conflicts
locally, and do a fast-forward push to origin/master. This also works on
copying work between different feature branches just as well.

------
funkjunky
When I went from amateur Python programmer to Google Cloud developer support,
I remember being completely blown away by the technology and design patterns I'd
never heard of that go into modern web/enterprise architecture. I had to
learn it all the hard way, but these days there are great free (or mostly
free) courses you can take to learn this stuff.

For Google, check out the study guides for their certifications, specifically
Cloud Architect (basic overview), Cloud Developer, and Data Engineer.

[https://cloud.google.com/certification/](https://cloud.google.com/certification/)

Be sure to follow along with the recommended Coursera and Qwiklabs tutorials
and do the exercises. You'll learn about all kinds of neat stuff, like
scalable application design, container technology, monitoring+metrics, various
types of database technologies, data pipelines (including Pub Sub messaging),
SRE best practices, networking+security, and machine learning.

I currently work on AWS, and don't find it a good starting point for diving in
to these things quickly, but most companies use it so it wouldn't hurt to
learn I guess. I still recommend GCP over AWS to start with, as their
technology is far more interesting and focused, and quicker/easier to work
with.

------
hsitz
A new version of "The Pragmatic Programmer" recently came out. [EDIT: not
available yet, only preorder at amazon, beta version available at
pragprog.com.] That book is all about tools and methods that a self-taught
programmer should look into:

[https://www.amazon.com/Pragmatic-Programmer-journey-
mastery-...](https://www.amazon.com/Pragmatic-Programmer-journey-mastery-
Anniversary/dp/0135957052)

~~~
colomon
For me, that Amazon page is listing it as a pre-order, without any release
date. And all the other versions (Kindle, Paperback) are the 1st edition
instead of the 2nd.

Very frustrating, as I considered the first edition to be essential and upon
reading your comment, instantly went to purchase the 2nd edition.

Edited to add: Found a date, Amazon is listing it as October 21, 2019.

~~~
hsitz
Sorry, I thought I'd read a review of it already, so just didn't look closely
to see it wasn't available yet.

It looks like you can get a DRM-free beta version of the ebook on their
website, with free upgrades to published version once it's finalized:
[https://pragprog.com/book/tpp20/the-pragmatic-
programmer-20t...](https://pragprog.com/book/tpp20/the-pragmatic-
programmer-20th-anniversary-edition)

------
ww520
Learn Emacs. Stick with it to get comfortable. Text editing is one
transferable skill that you can reuse again and again in different projects
and with different languages.

Org-mode in Emacs. My poor man's project management for a number of projects
consists of a todo-<project>.org file listing all the planned features, the
pending TODO items, the DONE items for the current working release, and
release notes describing features and changes for each of the released
version. In one place, I have the future features, immediate todo items,
current completed item, and the history of all past releases in an org file,
making things simple to access and manage.

For theoretical stuff, learn about transactions.

------
glangdale
Satisfiability Modulo Theory (SMT) and a nice friendly implementation like Z3.
When you have a difficult algorithmic problem and can't be bothered to dream
up a good solution, why not try chucking it into a solver and see if a
computer can solve it for you? Surprising what it can find - not just Sudoku
puzzles and "Mr Green lives next door to Mr Black, the baker lives across the
street from the butcher, ... " type stuff either.
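You can taste the declarative flavor in plain Python before installing Z3: state the constraints and let the machine search. The toy puzzle below is invented; Z3's Python API reads similarly but searches cleverly instead of by brute force.

```python
from itertools import permutations

# Toy puzzle: Green lives next door to Black, and White lives in house 1.
# We only declare the constraints; the search loop does the "solving".
people = ["Green", "Black", "White"]
solutions = [
    pos
    for houses in permutations([1, 2, 3])
    for pos in [dict(zip(people, houses))]
    if abs(pos["Green"] - pos["Black"]) == 1   # next-door constraint
    and pos["White"] == 1                      # fixed-house constraint
]
print(solutions)
```

An SMT solver gives you the same "constraints in, models out" shape, plus theories (integers, bit-vectors, arrays) and search strategies that scale far beyond what enumeration ever could.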

~~~
travisjungroth
Here’s a great talk by Raymond Hettinger on those things and more:
[https://m.youtube.com/watch?v=_GP9OpZPUYc](https://m.youtube.com/watch?v=_GP9OpZPUYc)

------
throwaway55554
Develop a project that has two local processes that have to communicate, one
local database, and one remote service that the two processes have to
communicate with.

Doesn't matter what the project is. Could just be tossing around a string of
data that eventually gets dumped into the db and sent to the server.

Research different methods of architecting such a system. Code up a few.

By doing this, you will actually find the answer to the question you posed.
And, arguably, have fun doing it.

~~~
lioeters
I love it. The best way to learn is by doing, and the suggested project idea
is simple enough as a starting point, and complex enough to require deep
thinking and exploration in various aspects to figure out a good solution.

Many of the recommendations in other comments could tie into architecting such
a study project too: data structures, algorithms, communication among
distributed processes..

------
niklasjansson
Excel. Seriously. For more things than you imagine.

Let's say you want to make a property assignment for some class:

    this.a = x.a
    this.b = x.b
    ...

While you probably would want to do this some other way to start with, and
while you of course can solve it using some emacs wizardry, I can whip that up
in Excel using some formulas in a matter of a minute. Moreover, I can keep
adding to it when I realize something was missing.

I can make a diff, join or union of the results of two queries from different
databases (or even database engines). Not to mention calculation and design
mockups.

~~~
hsitz
I'm not following. What good is it to have equivalent of "this.a=x.a
this.b=x.b ..." in Excel, when what I need is to have this in code in my
editor? Are you saying you use Excel to create the code, then copy/paste it
into your editor? Or what?

~~~
JonathonW
I think the parent's using Excel essentially to do repeated template
expansion: e.g. for a given set of member variable names ([a, b, c...]), give
me the assignment statements I'd need to use those in a constructor.

Which I could do pretty trivially in Excel... but could also do trivially in
about two lines of Python:

    
    
        vars = ["a", "b", "c"]
        statements = ["this.{0} = x.{0}".format(var) for var in vars]
        print(statements)
    

Spreadsheets can be an incredible tool -- as an interactive environment that
allows non-programmers to express domain knowledge and quickly automate parts
of their workflow, they're really unparalleled. But this isn't a very good
example of something that Excel is particularly well-suited for.

~~~
UnFleshedOne
Those can often be done in a decent text editor with regexp-like search and
replace.

~~~
kragen
Took me 21 seconds with Emacs keyboard macros; no search and replace involved.

------
fancy_pantser
The next obvious step after message queues and distributing work would be
streams.

Here is an excellent introduction to unified logs and stream processing by the
author of Kafka at LinkedIn:

[https://engineering.linkedin.com/distributed-systems/log-
wha...](https://engineering.linkedin.com/distributed-systems/log-what-every-
software-engineer-should-know-about-real-time-datas-unifying)

------
pankajk1
Here is a list of technologies/tools that I have found quite enabling (in
addition to what has already been covered):

1. Earlier use of VMs and now Docker containers -- It was around 2004 that I
got introduced to VMware tools to create, configure and run VMs. Life was
never the same after that. No more fretting about installing pre-beta software
for the fear of hosing my perfectly working system. There was a time when I
had multiple servers running VMWare hypervisor, each running multiple systems.
Then around 2014 I switched to using Docker containers for similar purposes
and haven't had a need to use VMs.

2. Jupyter notebooks/pandas/plotly -- Can't imagine how one can explore data
without this.

3. SQLite -- Perfect for writing unit tests for code that deals with SQL
databases.

------
mattnewport
If you're a self taught hobbyist you may not have had much structured exposure
to fundamental data structures and algorithms and complexity analysis. I think
that type of thing is easier to learn when you already have some experience so
you can relate it back to real world problems you have encountered as you
describe doing here. Now might be a good point in your development to dig into
some of those fundamentals if you have not done so much in the past.

------
Jach
If you're strictly asking about things a self-taught hobbyist programmer may
have missed, then I second the suggestions here to skim through a CS degree
curriculum and dig into anything that's unfamiliar. As one possible example,
maybe you're solving a problem trying to parse some text, and you're in over
your head with ad-hoc regexes and conditionals and type casts and exceptions
all over the place. If you've seen the concept of a grammar (likely to come up
in CS programs, though not all) and of generating parsers / validators for
them, you can eliminate a whole lot of programming by specifying a grammar in
some common format (for instance, ABNF) and running it through a program that
generates a parsing program for you. The general category of programs writing
programs is worthwhile to look into, and playing with languages where such a
feature is first-class (like Common Lisp, which apart from having macros also
has a compile-file function you can invoke at runtime) can be enlightening.
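To make the grammar point concrete, here is a hand-rolled recursive-descent parser for a toy grammar (invented for illustration; a generator would emit something similar from an ABNF spec):

```python
import re

def parse(text: str) -> int:
    # Grammar:  expr = term (("+" | "-") term)*
    #           term = number | "(" expr ")"
    tokens = re.findall(r"\d+|[()+-]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def term() -> int:
        if peek() == "(":
            eat()                                  # consume "("
            val = expr()
            assert eat() == ")", "expected closing paren"
            return val
        return int(eat())

    def expr() -> int:
        val = term()
        while peek() in ("+", "-"):
            val = val + term() if eat() == "+" else val - term()
        return val

    return expr()

print(parse("1 + (2 - 3) + 10"))  # 10
```

Each grammar rule becomes one function, which is why, once you have written the grammar down, the code (or a generator) follows almost mechanically.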

A lot of the comments here are highlighting things that not even most CS
degree holders will know about, nor many professional programmers. Those can
be useful for a hobbyist too, so maybe the lesson is that regardless of your
current level of knowledge it's important to keep in mind: "There are more
things in heaven and earth, Horatio, than are dreamt of in your philosophy."

Your comment about feeling comfortable in the practical details reminded me of
how I used to feel as a fresh self-taught PHP-wielding teenager after a few
years of it. As an "expert" PHP coder, I could do anything! Even an OS if I
wanted! Well I learned better eventually. :)

~~~
stormking
"If you've seen the concept of a grammar (likely to come up in CS programs,
though not all)"

Really? That was one of the very first topics in my first semester. How do you
teach CS without grammars and the Chomsky hierarchy?

~~~
Jach
I wondered if someone would ask about that... It's just my own recollection
from years ago when I looked at a bunch of CS curricula from public and
private schools. There was a lot of patchy variation. If the topics were
covered at all they'd typically be done in optional electives like a compilers
course, or "theory of computation" course, taken in junior or senior years.
Also depending on the school such an elective might not actually be available
anymore (I looked at current-semester/last-semester offerings to try and
identify those) and is only part of the catalog for historical reasons. Maybe
it was offered once, but not anymore, in part because students having the
choice would rather take the new data science course or Advanced Networking or
whatever to fulfill that elective instead.

I do think the problem is mostly a function of having lots of choice.
Everyone's got their own list of what a CS degree absolutely must cover, but
there are going to be differences, and schools are incentivized to let
students carve their own path. Let's look at this list:
[http://matt.might.net/articles/what-cs-majors-should-
know/](http://matt.might.net/articles/what-cs-majors-should-know/) I've looked
at a lot of intern resumes over the past few years and broadly they've been
impressive from a hire-to-BigCo perspective but I would be surprised if many
of them knew much if anything about a large majority of topics from that list.
Even narrowing to the "better" schools. I also don't even agree with that
list, I just don't think it's realistic to cram all that into a 4 year program
on top of all the other STEM courses, humanities courses, and project
courses...

Going back further there's the continuing problem of "dumbing down". It ties
into a school's incentive to offer more choices to fulfill the graduation
requirements (creating an "easy" path), but as a problem in itself it's worth
considering. [https://www.joelonsoftware.com/2005/12/29/the-perils-of-
java...](https://www.joelonsoftware.com/2005/12/29/the-perils-of-
javaschools-2/) is a classic rant about it, not a very good one but it points
at one consequence of the problem. I would bet there's still a sizable bunch
of current-year CS graduates who never had to deal with pointers. Maybe I'll
do some data exploring some time with the resumes I still have to see what
fraction put C or C++ somewhere to make a weak proxy measure.

~~~
stormking
"they'd typically be done in optional electives like a compilers course, or
"theory of computation" course"

Okay, maybe it's because here in Germany, we have a separate line of eduction
for "mere" programmers and system administrators as well as a separate branch
of higher education for more "practical" skills. But if you study CS at an
university, both of these course would be mandatory, together with a lot of
math and some physics (to know what actually happens in a circuit).

------
WoodenChair
The best tools you can teach yourself are arguably more fundamental than any
specific language or library, but instead computer science problem solving
techniques. I've written books on this subject specifically targeted towards
self-taught programmers:
[https://classicproblems.com/](https://classicproblems.com/)

Sorry for the self-plug, but it's super relevant I think.

------
nprateem
I used to read High Scalability [1] which has write ups of real-life
architectures of lots of the largest web sites. I found it great for learning
about different architectures. I'd definitely recommend it as a way to find
out about things like message queues, asynchronous vs synchronous processing,
etc.

On a similar note but more theoretical, my life changed when I learned about
design patterns. Martin Fowler's books/site are a good place to start [2].

[1] [http://highscalability.com](http://highscalability.com)

[2]
[https://www.martinfowler.com/eaaCatalog/index.html](https://www.martinfowler.com/eaaCatalog/index.html)

------
joker3
[https://ocw.mit.edu/courses/electrical-engineering-and-
compu...](https://ocw.mit.edu/courses/electrical-engineering-and-computer-
science/6-005-software-construction-spring-2016/)

In my experience, most hobbyist programmers are fine writing scripts and other
small programs but have no exposure to the sort of ideas that are necessary
for making large programs. This course at least exposes you to a lot of those
concepts.

~~~
metahost
The latest course is available here:
[http://web.mit.edu/6.031/www/sp19/](http://web.mit.edu/6.031/www/sp19/)

------
paulgb
State machines. They can be a helpful abstraction for UI or business logic,
and they also make regular expressions make a lot more sense.
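For instance, the regex `ab*c` is literally this small machine, sketched by hand in Python:

```python
# A hand-coded DFA equivalent to the regex "ab*c":
#   start --a--> saw_a --b--> saw_a (loop) --c--> done
TRANSITIONS = {
    ("start", "a"): "saw_a",
    ("saw_a", "b"): "saw_a",
    ("saw_a", "c"): "done",
}

def matches(s: str) -> bool:
    state = "start"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False        # no transition: reject
    return state == "done"      # accept only in the final state

print(matches("abbbc"))  # True
print(matches("abcb"))   # False
```

The same transition-table shape works for UI flows and business logic: states and events in a table, instead of nested if/else.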

~~~
jimpudar
Yeah, far too often people create horrific spaghetti code of if/else
statements which could be neatly coded as a state machine. This is a very
important concept to grasp.

------
lixtra
Learn to use a profiler for your programming language. It will tell you what’s
worth optimizing in your programs. I.e. if your program spends only 10% of the
time talking to others than that may be the maximum you can optimize by
choosing a fancy communication pattern.
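With Python's stdlib, a minimal sketch looks like this (the `slow_sum` function is just a stand-in workload):

```python
# Profile a function with the stdlib profiler to see where time actually
# goes before deciding what to optimize.
import cProfile
import io
import pstats

def slow_sum(n):
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())  # top entries by cumulative time
```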

------
CalChris
_Godbolt Compiler Explorer:_ [https://godbolt.org/](https://godbolt.org/)

 _DTrace:_ [http://dtrace.org/](http://dtrace.org/)

------
new_guy
'The Imposters Handbook' is good for foundational knowledge, it was written
for exactly your use case.

[https://bigmachine.io/products/the-imposters-
handbook/](https://bigmachine.io/products/the-imposters-handbook/)

~~~
LeonB
Came here to recommend this too. Its goal is very much aligned with the OP.

------
narag
A few decades ago, a new programming environment exploded: the web. Looking
for a ridiculously useful tech stack? Look no further: HTML, HTTP
architecture, SQL backend... a guide written at the time:

[http://philip.greenspun.com/panda/](http://philip.greenspun.com/panda/)

Paul Graham also wrote about why the web was such a deal, IIRC in the _Beating
the Averages_ essay. In particular: you can use whatever tools you want and
avoid deploying to client machines.

~~~
cryptonector
PostgreSQL + PostgREST + react-admin == fantastic stack.

You can write an entire application in SQL and PL/pgSQL, using an HTTP JSON
API as the interface with a static and responsive BUI.

This allows you to be extremely agile in development and ops (because, e.g.,
you get to use logical replication).

I can't say enough good things about this approach.

~~~
pandemic_region
Indeed, if your app is about managing your personal DVD collection, or
variants thereof, such anemic UI-to-database tools work very well. Not when
there is complicated domain logic involved.

~~~
cryptonector
Having built such an application (w/ complex business logic), I have to
disagree. But I'd like to hear what problems you've run into.

I would agree that react-admin is a bit too simple, and this stack really
calls for a BUI to be built specifically to fit the PostgREST model.

------
quanticle
I would recommend learning the SOLID [1] design principles. I've found them to
be a very helpful guide when designing software components.

[1] [https://en.wikipedia.org/wiki/SOLID](https://en.wikipedia.org/wiki/SOLID)

~~~
deathanatos
Particularly the S: the Single Responsibility Principle. So much messy,
convoluted code is convoluted because it lacks a singular, clear purpose, and
bundles up multiple responsibilities into one section of code, be that a
module, class, or whatever is appropriate to your language.

They're all good, and you'll get good insight from them all, but I think that
first one is more important and has provided me more value than all the rest.
I think Liskov substitutability would be my second pick.
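As a toy illustration (the report classes are invented for this example), splitting one class that both computed and persisted a report into two focused pieces:

```python
# Single Responsibility sketch: one class renders, another persists.
# Either can now change, or be swapped out, without touching the other.
class ReportGenerator:
    """Knows only how to turn data into report text."""
    def render(self, rows):
        return "\n".join(f"{name}: {value}" for name, value in rows)

class ReportWriter:
    """Knows only how to persist text; swap for email, S3, etc."""
    def write(self, text, path):
        with open(path, "w") as f:
            f.write(text)

text = ReportGenerator().render([("apples", 3), ("pears", 5)])
print(text)
```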

~~~
collyw
Agree completely.

------
jacksnipe
Not nearly as big a deal as some of the other tools and techniques being
mentioned, but tmux/screen are lifesavers when you need them.

------
anonu
> Googling, I encountered something I hadn't heard of called a message queue.

Former high frequency trader here. Messaging middleware is a god-send for
distributed systems. It used to be quite commonplace in the late 1990s/early
2000s for creating trading systems.

By the mid-to-late 2000s, and especially post-GFC (the great financial
crisis), volatility dropped, technology improved, and middleware systems
were no longer put in the critical path.

The important measure here is "tick-to-trade": from the moment a market-data
message comes off the wire to the moment you send an action to
buy/sell/cancel back on the wire. A middleware system slows this down
considerably. As a result, you want "tick-to-trade" handled in the same
process, preferably single-threaded, with thread affinity pinned to minimize
context switching.

To answer your actual question: I would say learn about the concepts that the
big cloud providers are pushing. If you open the AWS app drop down - there are
dozens of concepts that have been encapsulated in managed or serverless
frameworks. They are all worth learning IMHO - as they define the current and
next generation of computing.

~~~
tynpeddler
Message queues are really cool, but to paraphrase an old joke: some people
see a distributed systems problem and think, "I know, I'll use a message
queue." Now they have two problems.

------
brianpgordon
Don't go too overboard with message queues. There's nonzero development and
operational overhead incurred when part of your application takes its input in
a weird binary format, and when the data in your queue is thrown away after
processing, and when you need to think about scaling of workers and
concurrency. If you're not working with real "big data" – and, let's be
honest, almost nobody is – I would advise using an HTTP-based service (REST,
SOAP, whatever, take your pick) for communication and a SQL/NoSQL/NewSQL
database for state.

~~~
jrockway
ZeroMQ is not really a message queue, it's more of a networking library. It
takes TCP sockets and adds other concepts on top, like request/reply or
publish/subscribe.
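For a sense of what that looks like, here is a minimal request/reply sketch with pyzmq (this assumes the `pyzmq` package is installed); both ends run in one process over the inproc transport purely for illustration:

```python
# ZeroMQ REQ/REP sketch: a "server" thread replies to a request.
# The inproc transport keeps everything in-process for the demo;
# swap the address for tcp://... to go over the network.
import threading
import zmq  # pip install pyzmq

ctx = zmq.Context.instance()
ready = threading.Event()

def server():
    rep = ctx.socket(zmq.REP)
    rep.bind("inproc://demo")
    ready.set()                   # signal that the REP end is bound
    request = rep.recv_string()   # block until a request arrives
    rep.send_string(request.upper())
    rep.close()

t = threading.Thread(target=server)
t.start()
ready.wait()                      # ensure bind happens before connect

req = ctx.socket(zmq.REQ)
req.connect("inproc://demo")
req.send_string("ping")
reply = req.recv_string()
req.close()
t.join()
print(reply)  # -> PING
```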

------
Const-me
I don't think there are many ridiculously useful things that are universally
useful.

The fact that something is domain-specific or theory-heavy doesn't mean it
isn't considered a fundamental concept in some areas of work.

And the fact that something was very useful for you doesn't mean it's
universally applicable. For example, if you had worked on a high-frequency
trading bot, I don't think you would have used Python, ZMQ or other general-
purpose messaging middleware, or even the OS-provided TCP/IP stack: they all
cause too much latency.

I've been programming for a living since 2000, and have worked on a lot of
different stuff, from web dev and enterprise to videogames, embedded,
robotics and GPGPU. Yet I can name many huge areas I haven't seen up close,
or at all, along with the libraries and tools used by people working there.

Every time I start working in a new area, or when I resume working in an area
after a long (years) pause, I read a lot of relevant stuff. Continuous
learning is the key to stay good, IMO.

------
mntmoss
Some simple things with good ROI:

1. Ergonomics tooling: Set up your screen at an appropriate height and your
keyboard at an appropriate distance, so that you aren't in pain. If you use a
laptop often, try setting it on a shoebox. Use a work timer to take breaks
(I've been using Workrave on its default settings lately, which is quite
harsh but has worked for my productivity).

2. Diary keeping: Note what you did, what you plan to do, and the date and
time. Note as often as you feel necessary. Record notes both in source
comments and in general-purpose diaries. Use the date to eliminate or
rewrite notes that are stale.

3. File management, process management and editing. Take a little time to
learn things about your operating system and editor. You don't have to
master them, or do elaborate configuration (in fact, having a complex config
makes it hard to transfer to other environments). What you want are little
things like knowing a few handy shortcut keys or a few built-in tools.

4. Gaining familiarity with writing simple "skeleton" or mock-up code, and
waiting patiently for it to mature. When first building a system it may be
tempting to apply the biggest algorithmic hammer or design pattern you know
of; that is the kind of trick you are learning in learning about message
queues. But the most likely outcome of reaching for a trick, no matter how
well intentioned, is that it will get you a wrong result more slowly. The
full shape of any problem tends to come into view gradually, progressing
from blurry and unclear, with a rapidly changing design, into something
sharp, with well-defined boundaries. As such, every new system demands a
beginner's mindset, and some willingness to refuse in-depth engineering
until you absolutely must do it to progress. Done properly, you build a
system that leverages existing tools well and has some kind of value now
(even if it's limited or lacking a critical feature), and you can then
harvest the lessons learned into a more complete form later. Trying to get
all the features in one pass creates a messy soup; iterating on a subset of
them naturally leads toward tricks like the message queuing architecture,
without any prompting.

(edit)

5. Read the c2 wiki when bored; it covers a lot of recurring discussions in
programming: [http://wiki.c2.com](http://wiki.c2.com) If you know the stuff
in it you'll be reasonably prepared to think about unusual ideas and compare
them with existing examples.

------
amrox
Learning the basics about how programming languages work - parsers,
interpreters/compilers. I've heard good things about Writing An Interpreter In
Go [1]. Related, I've enjoyed Martin Fowler's DSL book [2].

    
    
      - [1] https://interpreterbook.com
      - [2] https://www.martinfowler.com/books/dsl.html

~~~
O_H_E
Just noting that the URLs weren't formatted as links but code

------
SonOfLilit
More important than any specific technique or tool is that you get an
experienced mentor to look at your work once in a while and point out the
obvious things relevant to the specific project that you don't know you don't
know. Seriously, you'll probably save a few months of work in the first ten
minutes of talking to a senior developer about your hobby project.

Talk to someone with many years of experience, though, otherwise you'll most
likely get sent on a quest for beauty of implementation that has nothing to do
with the goals you're trying to achieve. Sadly, that is a lesson that takes a
_long_ time to internalize.

~~~
SonOfLilit
(Learning all the things recommended in this thread would take a few years,
and maybe half of them would be useful once you knew them; a lot fewer would
be useful enough to justify the price of learning them.)

------
gaze
Get good at math. It'll serve you well and never go out of style.

~~~
ThrustVectoring
Math is a bit too broad. From personal experience, the most relevant topics
for programmers are linear algebra and discrete/combinatorial mathematics.

~~~
gaze
Eh I mean, I think there's a certain discipline that comes with studying any
branch of math to a certain degree of rigor. But yes, linear algebra and
discrete math are probably the most useful. I think control theory and
optimization are under appreciated amongst programmers though.

------
neilv
At some point, it helps to broaden your programming language exposure, even if
you stick to mostly one language for most of your work. You'll find ways to
apply ideas from other languages/communities to your work.

Try to spend some time learning idiomatic programming from one of the Lisp
family (Scheme, Racket, CL, Emacs Lisp, and Clojure all have different
thinking, but a lot of overlap). Play a bit with Smalltalk or a similar
descendant, even if you're already doing OOP elsewhere. At some point you
should learn a textual expansion language, like one of the Unix shell
scripting ones, or Tcl (and learning basic Bash scripting will probably be
useful in tech work). Try a logic programming language, like Prolog, or one
that's a minilanguage within another, like Mini-Kanren. Maybe buckle down for
hardcore functional programming (e.g., Haskell, OCaml, or discipline yourself
to do it in a Lisp?). You should also get comfortable with C or at least an
assembly language at some point, to have a better idea of what other languages
are and aren't giving you, and also C is just a really useful thing to know
when you need to write a little fast code, FFI to a native library, or get
into languages/IRs for newer target architectures.

(Disclosure: I've been especially involved with Racket, an energetic close
descendant of Scheme, and have some interest in promoting it, but I'd list a
Lisp as one of the first in any case.)

~~~
jimpudar
Racket is a very good Lisp for beginners, DrRacket makes it easy to get up and
running with little effort.

~~~
neilv
Agreed. Racket is from a particular school of thought, and you won't get all
Lisp family ideas from it, but it's great.

You can start up DrRacket (a simple IDE for students that can also be used for
professional work, and has some powerful features in it), or just use your
favorite editor and the `racket` command-line program and REPL. There's way
too much documentation at: [https://docs.racket-
lang.org/](https://docs.racket-lang.org/)

You can also do the old MIT introductory CS textbook, SICP, using Racket.

~~~
snazz
> You can also do the old MIT introductory CS textbook, SICP, using Racket.
    
    
        #lang sicp
    

[https://github.com/sicp-lang/sicp](https://github.com/sicp-lang/sicp)

------
zoomablemind
I'm not sure if 'hoarding' such knowledge would be practical.

Sometimes, having a limited toolset better focuses you on the problem at
hand. Then, once the challenge is clarified, the search for alternative ways
to architect and implement it becomes practical.

If it's just for the fun of exploring something new, pick whatever interests
you: a language, a framework, a domain. Async processing and its idioms?

------
wpietri
For me a lot of programming involves some amount of data wrangling. E.g.,
getting input data ready. Or generating and understanding results of some
technical experiment. I recently came across VisiData and adore it. It has a
steep learning curve, but I've found it very much worth it:
[http://visidata.org/](http://visidata.org/)

------
GuB-42
Not one in particular besides the obvious. If it is really fundamental, it is
not overlooked.

Every programmer has their favorite tools. Some will use debuggers, others
will prefer logging. Some will use class diagrams, others will grep. Some will
use IDEs and GUIs, others will use text editors and shells.

Same thing for programming languages, techniques, libraries, frameworks,
etc...

So you are very likely to get a lot of answers. It seems that you had an
epiphany when you learned about message passing. All programmers have had a
similar experience when they discovered the one thing they really needed.

In reality it depends on the project. Great programmers simply have a lot of
experience with many, many things.

My suggestion: continue what you are doing.

Try new things; don't blindly follow other people's lists. In the process
that led you to that messaging library, you learned about named pipes, Unix
sockets and shared memory, all useful, and maybe they will be essential to
your next project. Had someone else served you that library on a platter,
you would have missed that insight.

------
caust1c
Heh, funny that you mention ZMQ as overlooked. While often thought of as a
message-passing library or brokerless queue, ZeroMQ has some pretty severe
limitations that implementers fail to consider.

To be clear, I think it's an amazing library which is unmatched in its
performance, but it comes at a cost: reduced reliability.

ZeroMQ will drop messages in a number of situations. The library does not
provide delivery guarantees, which means the application must handle them
itself. Whether that works for you is an application-level concern. However,
having used it at two companies now, both times it ended up being thrown out
for a more reliable queue (Kafka).

[http://zguide.zeromq.org/py:all#Missing-Message-Problem-
Solv...](http://zguide.zeromq.org/py:all#Missing-Message-Problem-Solver)

So maybe the reason you haven't stumbled upon it sooner is that it's
overhyped? Definitely useful, but take it with a grain of salt.

~~~
wallstprog
The "MQ" part of the name is unfortunate, but apparently came about because of
the original idea to come up with a "better" implementation of AMQP
([http://zeromq.org/docs:welcome-from-amqp](http://zeromq.org/docs:welcome-
from-amqp)).

But you're right -- ZeroMQ doesn't do queueing (except in some very limited
circumstances), and if you need reliable delivery you must implement that
yourself "on top of" ZeroMQ. I've done that, and while it's not a simple task,
it is certainly possible.

You can get reliable delivery "out of the box" with other software that in
fact does do queueing (Kafka may be one, but I don't know enough to say).

But what you give up when you do that is performance -- ZeroMQ can easily be
orders of magnitude faster than those other solutions, and for some
applications (e.g., real-time market data) the work to provide a custom
reliability solution on top of ZeroMQ is worthwhile.

------
hammerbrostime
One minute ago I discovered that HN comment threads are collapsible by
tapping/clicking the "[-]" on the _right side_ of a comment header. I think
that could fall under the class of non-obvious tools (maybe not for a
hobbyist, but as a long-time HN reader I'm shocked I never noticed it
before).

------
shiloa
An IT automation framework like Ansible (my favorite), Chef, Puppet, Salt.
Highly recommended if your work includes doing repetitive remote server tasks
or any sort of infrastructure / devops stuff.

------
RangerScience
At the job I'm at, I've picked up three tools either for the first time, or in
a very new way:

1. Makefiles. See @aequitas' comment for more.

2. Terraform. Seriously, just _using_ this tool taught me [a lot of] devops.
It's fantastic!

3. Docker (as a tool!)

I'm going to go into the third one a bit - I feel like Docker is mostly
thought of as useful for deploying things to the 'net (kubernetes, ECS, etc),
but I think it's also amazing for local development and build pipelines. I
actually have no idea one way or the other how much other people use it this
way as well, so maybe it's just me that's finding it unexpectedly awesome for
this.

Put together the right CLI command + Dockerfile, and you can hand someone a
repo and they can launch a complete, reliable development environment in a
single command without any other system prep. No more worrying about which
dependencies need to be installed in what way; it's like `rvm` + `bundle exec`
but for EVERYTHING. No more dealing with whatever custom system modifications
someone has going on. `git clone`, `make dev`, move on with life.

And then you can also have Dockerfiles that are specifically for producing
your build artifacts, and then completely ignore the container for execution.
This is how I'm using both AWS EMR and Lambda.

~~~
akx
I'm also occasionally using Docker to generate build artifacts (so +1 for
that). How do you pull the built blob out of the image? I've used `docker
exec` plus `docker cp`, but it feels a little clunky.

~~~
victornomad
You can just mount a volume and write the blob into it; very easy and
convenient!

~~~
akx
Ooh, right, you can mount in `docker build`? That'd solve it, thanks!

~~~
victornomad
You mount it when you run it, using `-v hostfolder:containerfolder`.

I'd recommend using Docker Compose, though; somehow everything becomes much
saner than using the plain `docker` executable.

------
OliverJones
Event-driven, asynchronous, programming

Here's a valuable addition to your toolkit of mental models for programming:
Event-driven, asynchronous, programming (in the style of ES6 Javascript or a
similar language. )

Some suggestions for learning the basics: tutorials on...

-- building a so-called "single-page web app" with a framework like Vue.js
or even jQuery

-- Node.js, to build a complex back-end server without using a threading
model

-- React, to build an interactive program to run in a browser
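The comment suggests JavaScript, but the same event-loop idea is available in Python's stdlib; a minimal sketch (the `fetch` coroutine is a made-up stand-in for real I/O):

```python
# Event-driven, asynchronous style with asyncio: two "requests" run
# concurrently on a single thread, interleaved by the event loop.
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)   # stand-in for a network call
    return f"{name} done"

async def main():
    # gather() schedules both coroutines and waits for both results,
    # preserving argument order in the returned list.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

results = asyncio.run(main())
print(results)  # -> ['a done', 'b done']
```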

------
quickthrower2
Tools that have changed how I think about things, and yes some are hyped
things you have probably heard of. But the opposite effect of people ignoring
it because it's too hyped might come into play:

Docker - figuring out what you really need to get something running and having
a reproducible way of doing it. I did a blog post on it:
[https://stackabuse.com/how-docker-can-make-your-life-
easier-...](https://stackabuse.com/how-docker-can-make-your-life-easier-as-a-
developer/)

Node JS - much derided, but nothing compares in terms of the speed with
which you can hack up a tool in Node JS, and the ecosystem you have access
to. I don't use it as a production server, just as a quick way to hack up
tools.

Pandoc - I've been using this tool to convert markdown to PDF, and it does a
nice job. It uses LaTeX & friends, so you can find a nice template somewhere
and base off that.

Markdown - I love using markdown formats. Especially with the tooling, things
like Pandoc and Github are enough to justify learning a bit of MD.

Touch typing - makes typing a bit nicer and a bit faster. I could type
without looking at the keyboard before, but now I don't lose my hand
position; I just feel the bumps!

------
jdavis703
I’m not sure if I count as self taught or not since I first learned
programming outside of formal education, only later getting a degree in IT
(for which I really didn’t learn anything new except graph theory). But we
never talked about queues in school, in fact I only learned about them when I
started working professionally. If you want to be exposed to all kinds of
interesting problems, I would suggest working at some mid-stage startup (i.e.
around the B or C rounds).

They’ll be starting to get in good people who are fixing up the mess the early
employees made* and can help you learn why certain patterns are anti-patterns
and how to fix them. At this stage they're still small enough that you can
learn a lot just by paying attention to what everyone else is working on.

* Before you downvote me I’ve contracted for pre-seed startups, am currently working at a seed stage startup and have been at companies all the way from a series A to a series E. So yeah I’ve been the one making a mess (because of the whole minimum part of MVP) and cleaning up said mess.

------
i_feel_great
Tools you can use to recover from the loss or destruction of your
laptop/desktop as fast as possible. Or having to hand over your password to
the authorities. This involves compressing all of the state and data and
removing and/or sending it somewhere, and then being able to recover it
quickly. And it must be simple, robust and automated.

------
alphaoide
This was recommended in a blog post I read. I haven't read it myself but the
table of content looks promising

[https://www.educative.io/collection/5668639101419520/5649050...](https://www.educative.io/collection/5668639101419520/5649050225344512)

Designing a URL Shortening service like TinyURL

Designing Pastebin

Designing Instagram

Designing Dropbox

...

Key Characteristics of Distributed Systems

Load Balancing

Caching

Sharding or Data Partitioning

...

------
namank
What you want to do is ponder how common functionality around you must work.
If you can't come up with a reasonable solution, it's time to google it and
learn it. If you can come up with a reasonable solution, google to check
whether you're correct.

This is the most pragmatic way I've found to learn and stay in touch.

------
xs83
DEBUGGING! And I don't just mean echoing out statements to check values, I
mean an actual debugger attached to your code.

GDB for C and other compiled languages, Xdebug for PHP, etc.

The act of stepping through your code and looking at the values, datatypes
and their transitions massively increases your productivity when something
is tricky!

------
malvosenior
If you're coming from Python you should start looking into how other languages
handle concurrency. Python has a GIL (global interpreter lock) that only
allows for single threaded execution under normal circumstances. Learn about
threads, locking, mutexes, semaphores, green threads, race conditions...
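As a minimal sketch of why those primitives matter (the counts and thread numbers are arbitrary), here is a shared counter protected by a lock:

```python
# Even with the GIL, a read-modify-write like `counter += 1` is not
# atomic across threads; the lock makes the update safe.
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:          # remove the lock to watch updates get lost
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # -> 40000 with the lock held
```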

------
danesparza
Caching, logging, centralized configuration, Security, and Design Patterns are
all probably easy to overlook.
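As a tiny taste of the first item, caching: Python's stdlib memoization decorator turns an exponential recursion into a cheap lookup.

```python
# Memoize an expensive pure function; repeated subcalls hit the cache.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(90))  # instant; the uncached recursion would take ages
```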

------
smoe
Monitoring

From the beginning of my career in web development, I worked in companies
that had good-to-great monitoring in place to log and measure performance,
resource usage, availability, errors, etc. in all parts of the system, so I
took it as self-evident that you would add monitoring first thing, before
deploying a new project.

Only recently, as I started mentoring people in other companies, did I
realize that many developers are not aware of what exists in this space.

In two cases I was asked to help deal with performance issues, and in
neither case did the team have any idea what was causing them; they had
pretty much no tooling to track it down, so they resorted to speculation. We
installed an application monitoring solution and they were able to fix the
problem in no time once it was identifiable.
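The core idea can be sketched in a few lines (the in-memory dict here is a stand-in for a real metrics backend such as Prometheus, and the `handler` function is invented):

```python
# Record per-call latency so slow spots are measured, not guessed at.
import time
from collections import defaultdict
from functools import wraps

timings = defaultdict(list)

def monitored(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            timings[fn.__name__].append(time.perf_counter() - start)
    return wrapper

@monitored
def handler():
    time.sleep(0.01)   # stand-in for real work

handler()
print(max(timings["handler"]))  # worst observed latency, in seconds
```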

~~~
alphaoide
Can you recommend books/tools? Thanks!

~~~
smoe
This chapter of the Google SRE book gives you a good overview.

[https://landing.google.com/sre/sre-
book/chapters/monitoring-...](https://landing.google.com/sre/sre-
book/chapters/monitoring-distributed-systems/)

But I think it is less about what specific tool to use, and more about just
getting started with one and learning how to understand your production
system's behavior and dependencies from metrics and graphs.

An easy way to get going with application monitoring is a hosted solution
like newrelic.com. Their free version should be sufficient for a very long
time.

If you want to run it yourself, there are open-source solutions like
prometheus.io or riemann.io, among many others.

------
Khelavaster
It's absolutely critical to learn object-oriented programming in a mature
language like C#. (Java's OO model is slightly broken, as well as
fundamentally underdeveloped; it's better tackled after learning a correct,
fully functional object-oriented language like C#.)

------
asdffdsa
Books: Operating Systems/Database/Networking/Computer Security/Computer
Architecture textbooks, Software Engineering textbooks (Clean Code, Design
Patterns, Designing Data Intensive Applications, Domain Driven Design as a
short list off the top of my head)

~~~
lbrindze
"Designing Data Intensive Applications" is an absolute goldmine for things
like message queues, and it goes beyond them to the full implications of
database selection and other common distributed-systems engineering
decisions modern software engineers may come across.

------
bradford
I've used diff tools extensively as an analytical aid across many different
domains. They have uses far beyond code change tracking!

Specifically, I like vim's interactive diff'ing capabilities (although any
interactive diff tool with sufficiently powerful text-editing capabilities
should suffice).

So much of the troubleshooting that we do in programming is asking "this thing
used to work, what changed?". Don't rely on your eyes to find the changes, let
the computer do the work for you. The ability to load up two different log
files in a diff session, regex-substitute the noisy portions away (dates,
process/thread id), and view the key functional changes really helps me go
from noise to signal in an optimal way.
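That workflow can even be scripted with Python's stdlib (the log lines and the timestamp pattern below are invented for illustration):

```python
# Strip the noisy parts (timestamps here) with a regex, then diff what
# remains, so only the functional changes stand out.
import difflib
import re

def denoise(lines):
    return [re.sub(r"^\d{2}:\d{2}:\d{2} ", "", line) for line in lines]

old = ["12:00:01 service started", "12:00:02 cache warm"]
new = ["13:45:10 service started", "13:45:11 cache cold"]

diff = list(difflib.unified_diff(denoise(old), denoise(new), lineterm=""))
print("\n".join(diff))
```

With the timestamps removed, only the `cache warm` → `cache cold` change shows up in the diff; the restart time difference is gone.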

~~~
reificator
You touched on it briefly but I'd like to highlight regular expressions in day
to day editing and log delving. It's a massive time saver in my experience at
least.

Coworkers often come to me to help them write a quick regex for something, or
to have me double check their work.

If you need a playground to get comfortable,
[https://regex101.com/](https://regex101.com/) is a great resource. Dump some
examples you'd like to match and some you don't in the bottom section, and try
to write a regex that matches in the top. It will dynamically match as you
type, and the right side shows a token by token breakdown of what your regex
does.
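A small example of the kind of log delving meant here (the log format is invented):

```python
# Pull HTTP status codes out of free-form log lines and count them.
import re
from collections import Counter

log = """\
2019-08-01 GET /api/users 200
2019-08-01 GET /api/orders 500
2019-08-01 POST /api/orders 500
2019-08-01 GET /health 200
"""

# \s(\d{3})$ captures the three-digit code at the end of each line.
codes = Counter(re.findall(r"\s(\d{3})$", log, flags=re.MULTILINE))
print(sorted(codes.items()))  # -> [('200', 2), ('500', 2)]
```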

------
theonemind
* Parser generator tools like ANTLR [https://www.antlr.org/](https://www.antlr.org/), or even lex and yacc, useful for parsing languages/config files and probably generating a better/more robust parser than you'd cobble together by hand

* Dynamic programming [https://en.wikipedia.org/wiki/](https://en.wikipedia.org/wiki/) -- great for relatively quickly (in computation time) coming up with good-enough solutions to some hard (like NP-hard) recursive problem that would take forever (almost literally) to technically find the absolute best solution to, but not something you'll use every day.
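A classic instance is coin change (the coin values and amount below are arbitrary), where each subproblem is solved once and reused:

```python
# Dynamic programming: minimum number of coins to make an amount.
# best[a] holds the optimum for amount a, built up from smaller amounts,
# giving the exact answer in O(amount * len(coins)).
def min_coins(coins, amount):
    best = [0] + [float("inf")] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                best[a] = min(best[a], best[a - c] + 1)
    return best[amount]

print(min_coins([1, 5, 12], 15))  # -> 3 (5+5+5); greedy 12+1+1+1 needs 4
```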

~~~
sgillen
I was under the impression DP was usually used for getting the exact answer to
certain classes of problems faster than exhaustive search. I usually see
approximations compared to the ground truth dynamic programming answer.

Do you see the reverse in your work?

------
gamegod
IDEs - A proper work-grade IDE like Visual Studio will have some tools built
into it that will save you tons of time. Features like "Edit & Continue" have
saved my bacon many times by making intractably difficult bugs in algorithms
much easier to understand because you can experiment with your code on the
fly.

There's also some more common features in most IDEs like being able to jump
between symbols that saves a lot of time day-to-day.

If I'm interviewing you and you say you don't like using an IDE or a debugger,
that speaks to your work experience, your productivity, your self awareness
about your productivity, and really puts an upper limit on the difficulty of
the problems you've had to solve.

~~~
skummetmaelk
I find that it is usually the opposite. Devs who don't use big IDEs like
Visual Studio are much more likely to know why things fail in a build
pipeline, etc.

If you can only build a project by bumbling through menus and pressing a big
green button at the end, that is worrying.

If you can only debug by immediately jumping into the debugger and single
stepping that is also worrying.

Devs who reach for the tools appropriate to a given situation inspire much
more confidence in their abilities.

Knowing the time saving features of your editor is a huge boost. However, the
old editors have much more of these ;)

~~~
snazz
IDEs aren’t useless, you just don’t want them to be a crutch. For this reason,
I usually recommend new developers use Nano or a notepad-esque editor until
they understand why they might want vi keybindings, then use Vim until they
understand why they might want an IDE or something like Emacs. Starting with
the IDE hides layers and layers of both junk and useful tools, while
experienced developers know which layer of the stack to work on at which time.

------
mooreds
I would say studying any of the AWS PaaS offerings. Not saying you have to
use them, but they cover a large segment of the system-component space.

They can help you answer questions like "When would I use an in memory cache
vs a rdbms vs a key value store?"

------
hannob
If you happen to program in C/C++ you should absolutely familiarize yourself
with the sanitizers of gcc and clang, most notably address sanitizer. (However
there are good reasons why you shouldn't program in C/C++ to begin with.)

------
nojvek
As a programmer, learning how to use a user analytics tool really shifted my
perspective on things. Profilers and debuggers tell you where to fix the
code; a user analytics tool tells you where to fix the product.

Being able to hypothesize, measure conversion, engagement and retention, run
A/B tests, slice by cohorts, and check whether an experiment validates your
hypothesis makes you tackle problems very systematically.

Being able to measure both impact and effort. Always evaluating whether the
impact was worth the effort and fine tuning it.

The product version of the tree-falling-in-the-forest joke: if you build
something and no one uses it, does it really exist?

------
whalesalad
Low level unix tools. Learning how to read a file, tail a file as it grows
(like a logfile), or look at just the beginning, or just the number of lines
in the file, or merge a bunch of files into one, etc...

How to get a file from one machine to another with scp. How to use more
advanced features of SSH like agents, forwarding, your config file, SOCKS
proxying, etc..., How to debug system issues, find where config files should
be, find out why your app won't compile. Learn how to install code from source,
using configure and make. Learn how to operate your own basic network services
like HTTP servers, mail servers, local file sharing with NFS or SMB, etc...

------
AaronFriel
DevOps: being able to stand up your entire stack in an automated way. Some
programmers dismiss this as "yaml/bash engineering", and that's fair, but it
also challenges all of the assumptions baked into your stack. If you can't
stand up a complete duplicate programmatically, you almost certainly have
implicit assumptions that you haven't verified, which will make it much harder
to recover from disaster or to scale.

Put another way, devops is a mix of declarative programming (YAML) and writing
idempotent, imperative code in the presence of large side effects (bash).
Learning to handle both is very educational.
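
The idempotency idea translates to any language. As a toy sketch (in Python
rather than bash, with a hypothetical JSON config file): a provisioning step
that converges toward a desired state, touching disk only when the current
state differs, so running it twice is safe.

```python
import json
import os
import tempfile

def ensure_config(path, desired):
    """Idempotent provisioning step: converge the file at `path` toward
    `desired`, writing only when the current contents differ."""
    current = None
    if os.path.exists(path):
        with open(path) as f:
            current = json.load(f)
    if current != desired:
        with open(path, "w") as f:
            json.dump(desired, f)
        return "changed"
    return "unchanged"

# Running the same step twice: only the first run does any work.
path = os.path.join(tempfile.mkdtemp(), "app.json")
first = ensure_config(path, {"port": 8080})
second = ensure_config(path, {"port": 8080})
```

Reporting "changed" vs. "unchanged" mirrors how tools like Ansible summarize
a run.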

------
crimsonalucard
Nobody mentions what self taught programmers miss the most.

Theory. And not just algorithm theory.

~~~
Zealotux
Could you expand on that if you don't mind? I'm a self-taught programmer
currently in the process of realising just how much I miss from theory, my
list so far on theory fundamentals includes:

• Mathematics and probabilities, and their applications to CS (e.g. formal
methods)

• Design patterns (OOP, functional programming)

• Data structures

• Algorithms, time complexity

• System architectures

• Software strategies: CI, CD

• Database principles: SQL

It may sound naive, but I'm kind of overwhelmed by all this and it's not
helping my impostor syndrome. I may make a repo of the list with links to
resources I've identified for learning, as it seems like a common struggle;
maybe it'll help someone.

~~~
crimsonalucard
None of what you mentioned is theory except for mathematics, probability,
formal methods, and algorithms.

Stuff like design patterns, system architectures, and software strategies are
flavor-of-the-week material: opinions, basically. Patterns like microservices
are good or bad depending on who you ask, but theory is always correct. Theory
gives less bang for the buck, but it's what many programmers, especially
self-taught ones, are missing.

Theory is abstract enough that it can be hard to see its applicability until
you're a more seasoned programmer. Many seasoned programmers get by without
ever knowing theory. But you will be a better programmer if you know it.

If I were to recommend one theory to study, it would be category theory. If
there is any true axiomatic theory for design patterns or how to design
programs, categories are it. The study of morphisms is the study of the
simplest form of a composable module. Knowing this theory, you will begin to
understand why some design patterns don't work and why it's sometimes hard to
reuse patterns in code that was not properly designed. Theory doesn't answer
all questions, but for the questions it does answer you will get a definitive
answer, not an opinionated one.

------
croo
Did you know there are document storage/management systems out there? They're
useful when you need to save a lot of documents (duh) or other
too-large-for-an-RDBMS content with metadata. Examples are FileNet and Alfresco.

Did you know there is a protocol called CMIS [1] which you can use to query
content from these systems? It's similar to a subset of SQL.

[1]
[https://en.wikipedia.org/wiki/Content_Management_Interoperab...](https://en.wikipedia.org/wiki/Content_Management_Interoperability_Services)

------
azhenley
Usability engineering:
[https://en.wikipedia.org/wiki/Usability_engineering](https://en.wikipedia.org/wiki/Usability_engineering)

------
nmca
SQL

~~~
mfatica
You think a self-taught programmer might never encounter SQL?

~~~
politician
Speaking as a technical development manager, I can say that many people have
far less than adequate exposure to SQL, and that training people to use SQL
effectively and safely is all too often a common requirement before letting
them loose on the database.

There are so many ways people misunderstand and misuse SQL and relational
databases, it's honestly staggering.

So, I second the OP. Learn SQL, and you'll stand out.

~~~
beat
Yeah, learning real SQL is a very useful skill. At one employer, I became the
"SQL expert" because I knew the difference between inner and outer joins.
Which, if you know SQL, means you know next to nothing. But knowing next to
nothing was better than knowing nothing at all, so...
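
The inner-vs-outer distinction can be seen in a few lines with Python's
built-in sqlite3 module (schema and data here are made up for illustration):

```python
import sqlite3

# In-memory database: users and (optionally) their orders.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users  VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 9.99);
""")

# INNER JOIN: only users who actually have a matching order.
inner = con.execute(
    "SELECT u.name FROM users u JOIN orders o ON o.user_id = u.id ORDER BY u.id"
).fetchall()

# LEFT OUTER JOIN: every user, with NULL where no order matches.
outer = con.execute(
    "SELECT u.name, o.id FROM users u LEFT JOIN orders o ON o.user_id = u.id ORDER BY u.id"
).fetchall()
```

The inner join drops bob entirely; the outer join keeps him with a NULL order,
which is usually what you want for "show all users and their orders, if any".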

------
greenyouse
This might not be for a hobbyist, since the setup is difficult, but validating
your frontend work with automation is very interesting. I didn't realize for a
long time that I could use visual regression tools and Selenium at the end of
development to catch bugs. I'm surprised there aren't more tools in this
space targeted at developers. Selenium Grid with WebdriverIO is incredibly
helpful. If you put together a visual regression system for validating your
daily work, you could save some serious time.

------
achenatx
There are so many specialized areas that are useful, like common design
patterns, basic data structures, various algorithms, transactions,
multithreading, parallel processing, etc.

Too many things to list, but I agree with a poster below that suggested
looking at the course offerings for a computer science degree. You might be
able to find outlines of the courses to get specific topics.

On a day to day basis, code coverage is something that more people should use
when writing unit tests to ensure they have tests that execute all their code.

------
geirman
Learn to use Proxy Servers for Troubleshooting

Whenever your web app, mobile app, etc depends on a network response, Charles
Proxy ([https://www.charlesproxy.com/](https://www.charlesproxy.com/)) can be
super helpful. It sits in between and captures all the HTTP request/responses
and allows you to manipulate them. So, you could capture an API response (or
request), manipulate it, then let it continue. It also lets you interrogate
the requests easily.

------
gitgud
Design patterns were enlightening to me. The idea of expressing reusable
concepts in software design is what got me out of the stone age of
programming. You can describe Pub/Sub or the Builder pattern, quickly
implement it in a project, and know its advantages and limitations.
Definitely worth checking out if you haven't already.

[http://wiki.c2.com/?CategoryPattern](http://wiki.c2.com/?CategoryPattern)
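
As a taste of what the catalog buys you, here's a toy in-process Pub/Sub in
Python (all names are hypothetical); the real value is recognizing when
decoupling publishers from subscribers fits your problem:

```python
class PubSub:
    """Minimal in-process publish/subscribe broker."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Deliver to every subscriber of this topic; others never see it.
        for callback in self._subscribers.get(topic, []):
            callback(message)

bus = PubSub()
received = []
bus.subscribe("ticks", received.append)
bus.publish("ticks", {"symbol": "BTC", "price": 42000})
bus.publish("logs", "ignored by the ticks subscriber")
```

Libraries like ZMQ implement the same shape across processes and machines;
the pattern itself is the transferable part.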

------
jpmelos
If understanding how Linux works interests you:
[http://www.linuxfromscratch.org/](http://www.linuxfromscratch.org/)

------
avip
The concept of monitoring. This simple, obvious concept changed how I think
about software maintenance a great deal. I usually use Slack, but that's an
implementation detail.

------
justinholmes
Use GitHub Trending for each language to look at new things each month.

Build something, think about how you would scale the project to X capacity,
then rinse and repeat.

------
Khelavaster
Visual Studio, especially its visual UI designer, build & reference
management, and scaffolded website development (ASP.Net MVC with Entity
Framework from scaffolding). Night and freaking day.

Also, WCF for all your inter-computer communication needs. I wouldn't be
developing software today if I weren't introduced to low-trouble application
development with Visual Studio.

------
eranation
Not tools but more of a mindset: application security, the OWASP Top 10, web
security. Learn what XSS and CSRF are, what a CSP is, why HSTS exists, and why
a CORS policy allowing * can be dangerous. Also what OAuth, OIDC, SAML, etc.
are; how to store passwords securely; and how to add security to your devops
cycle. Having a good security mindset can be a great asset to any team.

------
vast
Automated deployment, even for smaller projects. Ansible, for example, is a
Python product.

Online schema change tools if you use RDBMS in production.

------
olingern
Just a few that I've found liberating over the past few years:

1\. In memory caches for small projects / one-offs.

Redis and memcached are sometimes not needed. Wrapping your cache in a class
whose storage is dependency-injected is a nice way to keep moving along.
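
That wrapper can be as small as this (a sketch; the class and method names are
hypothetical):

```python
class Cache:
    """Cache facade; the storage backend is dependency-injected."""

    def __init__(self, storage):
        # Anything dict-like works: a plain dict, a Redis-backed mapping, etc.
        self._storage = storage

    def get(self, key, default=None):
        return self._storage.get(key, default)

    def set(self, key, value):
        self._storage[key] = value

# For a small project, a plain dict is plenty; swap in a Redis-backed
# mapping later without touching any calling code.
cache = Cache(storage={})
cache.set("answer", 42)
```

The point is that the callers depend on the facade, not the backend, so
upgrading from a dict to Redis is a one-line change at construction time.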

\---

2\. grep

Not much to say here other than it's a great search tool. Admittedly, I'm
still working on wielding it even better.

\---

3\. Redirecting output: [CMD] > myoutput.txt

Lifesaver for large output.

------
mettamage
It depends on what you want to achieve. I can tell you to read about security
but that is only relevant if you want to secure your software. I can tell you
about pub/sub, but that is only relevant when you need it.

The things you’ve learned were mostly covered in my computer science
curriculum. So I am going to second Robin_messsage’s suggestion.

Skim through a CS curriculum.

------
6nf
SQL stored procedures and triggers. It's some of the most useful stuff you may
not have considered if you're just a hobbyist. It's not for every project, but
if you do need a proper database for some reason, there are often great ways
to use stored procs or database triggers that greatly simplify your system.
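
SQLite (bundled with Python's standard library) supports triggers too, so the
idea can be tried without running a server. A toy audit-log trigger (schema
invented for illustration): the database records every balance change itself,
with no application code involved.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE accounts  (id INTEGER PRIMARY KEY, balance REAL);
    CREATE TABLE audit_log (account_id INTEGER, old REAL, new REAL);

    -- Trigger: every balance update is logged automatically.
    CREATE TRIGGER log_balance AFTER UPDATE OF balance ON accounts
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;
""")

con.execute("INSERT INTO accounts VALUES (1, 100.0)")
con.execute("UPDATE accounts SET balance = 150.0 WHERE id = 1")
log = con.execute("SELECT * FROM audit_log").fetchall()
```

No matter which script or tool issues the UPDATE, the log entry appears; that
"the database enforces it" property is the simplification being described.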

------
jondubois
Corporations have been pushing hard to market their vendor-lock-in-as-a-service
solutions, so you don't hear about simple, free, open source tools anymore;
all developer channels are saturated with corporate marketing. So much so that
you don't even realize it's marketing.

------
lixtra
For learning what’s going on in compiled programs, _strace_ can be very
helpful.

[https://www.tecmint.com/strace-commands-for-
troubleshooting-...](https://www.tecmint.com/strace-commands-for-
troubleshooting-and-debugging-linux/)

------
keyle
_The Design of Everyday Things_. It's a design book that will make you a
better programmer too.

------
sycdan
I use ack ([https://beyondgrep.com/](https://beyondgrep.com/)) many times per
day, and it has saved me a great deal of time. That said, if you use IDEs (I
don't), it may be somewhat redundant.

------
tosh
* relational databases (e.g. sqlite, postgres)

* regular expressions

* lisp / scheme / clojure

* emacs

* mini kanren / datalog / prolog

* neural networks and deep learning

* state machines

* caches

------
jedberg
If you're into python, look up decorators and coroutines. They will blow your
mind.
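
For a quick taste of both (toy examples, not from any particular tutorial):

```python
import functools
import time

def timed(func):
    """Decorator: wrap a function and record how long each call takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper

@timed
def slow_add(a, b):
    return a + b

result = slow_add(2, 3)  # timing recorded on slow_add.last_elapsed

# A generator-based coroutine: send values in, it keeps a running total.
def running_total():
    total = 0
    while True:
        total += yield total

acc = running_total()
next(acc)            # prime the coroutine
acc.send(3)          # → 3
value = acc.send(4)  # → 7
```

Decorators let you bolt cross-cutting behavior (timing, caching, retries) onto
functions without editing them; coroutines let you suspend and resume code,
which is the seed of async frameworks.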

~~~
Jach
This is still my favorite mind-blowing introduction to them, from 2009:
[http://www.dabeaz.com/coroutines/Coroutines.pdf](http://www.dabeaz.com/coroutines/Coroutines.pdf)
Don't have hardware interrupts or threads but still want a multi-tasking
operating system? Not A Problem.

------
codr7
Macros, interpreters and embedded languages [0].

[0]
[https://github.com/codr7/g-fu/tree/master/v1](https://github.com/codr7/g-fu/tree/master/v1)

------
LeonB
So many of the problems are “people problems”, so in addition to the excellent
technical suggestions in this thread I’d add books like “Thanks for the
Feedback”, “How to Win Friends and Influence People”, and “Getting to Yes”.

~~~
yyx
"Start with No"

------
contingencies
Anything that automatically generates pictures from text: _mscgen_ ,
_imagemagick_ , _graphviz_ , _SVG_ , _povray_ , _gnuplot_ , _R_ , etc.

------
HocusLocus
Perl one-liners, the ones that overflow to several lines of code in your
terminal.

------
jasonhansel
Shell scripting (trust me, this is _way_ more useful than you might think).

------
westonplatter0
SQL

------
fa
Is this trading bot you’re writing available online?

------
mosalarynolife
Design Patterns

~~~
sesser
What are some resources outside of gang of four?

~~~
pernambucano
[https://sourcemaking.com/design_patterns](https://sourcemaking.com/design_patterns)

------
ajflores1604
tldr: redis

Hey man, I went down an eerily similar path as you. Self-taught and building a
trading system. Wading through sockets and pipes before discovering zmq and
having my whole programming paradigm completely shifted. Absolutely love zmq,
and enjoy thinking of new projects to use it with.

More towards your actual question: I don't think it's as ground-shifting a
discovery as zmq, but Redis has also helped me a lot with the trading bot. I
use Interactive Brokers, and given how their API examples are set up, I
haven't figured out how to design my system using only message requests like
you see in microservice-style programs.

Also, when requesting a price for any given contract, I don't like the idea of
having to request it only once I need it and then waiting on the network for
the brokerage to receive my request and send the price back. By that point,
the price or whatever data you're requesting can be stale, depending on the
speed and timeframe you're playing. So what I did was request streaming data
ahead of time for the contracts I'm interested in, and write that continuously
to a localhost Redis server. Then whenever any of my bots need information,
they check for the latest value on that Redis server first before going out to
the brokerage platform to request the data directly. This basically cuts the
full round trip time in half whenever the data is found locally. I believe
this is a similar programming paradigm to using Redux, if you have experience
with that, but I've never personally used Redux, so don't quote me on that.
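
The read path described here is essentially the cache-aside pattern. A sketch
of the shape, with a plain dict standing in for the localhost Redis server and
a placeholder for the broker round trip (all names hypothetical):

```python
import time

local_cache = {}  # stand-in for the localhost Redis server

def fetch_from_brokerage(contract):
    """Placeholder for the slow network round trip to the broker."""
    return {"contract": contract, "price": 123.45, "ts": time.time()}

def stream_writer(contract, price):
    """The streaming subscription writes every tick here as it arrives."""
    local_cache[contract] = {"contract": contract, "price": price,
                             "ts": time.time()}

def get_price(contract, max_age=1.0):
    """Cache-aside read: use the local copy if it's fresh, else hit the broker."""
    entry = local_cache.get(contract)
    if entry is not None and time.time() - entry["ts"] <= max_age:
        return entry                            # fast path: no round trip
    entry = fetch_from_brokerage(contract)      # slow path
    local_cache[contract] = entry
    return entry

stream_writer("ES", 5000.25)   # streaming data keeps the cache warm
quote = get_price("ES")        # served locally while the tick is fresh
```

The `max_age` knob is the key trade-off: how stale a price you are willing to
act on versus how often you pay the network round trip.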

~~~
nathanasmith
Absolutely agreed. Redis is a godsend for small-time traders like us. I've
scaled up to where I'm trading on about 20 crypto exchanges in addition to
dabbling in stocks. Prior to Redis I was using flat JSON files to store all my
data, so every time a new price came over the wire the bot would read the
file, parse the JSON, append the price, then write it all back out. That was
quick and easy to put together and it worked great at first, but later, with
all the exchanges going, even with an 8700K and an Optane drive the computer
would just fall over when volume got high. Got sick of that, converted it all
over to Redis with ReJSON, added another 32 GB of RAM, and the problem is
utterly solved. Redis' built-in queue has been extremely useful as a job
system too.

