
Ask HN: What's the dumbest thing you've done on the job? - exolymph
Could be a programming mistake, shooting yourself in the foot with a boss, scoping a project terribly, etc. Bonus points if it's something you knew was dumb before doing it anyway.
======
karakanb
I was trying to deploy a CI/CD pipeline for a small project and testing some
changes in the test step before deploying. I wanted to collect env variables
into a global location in the pipeline and put some defaults for the tests,
but I used the same variable names for both, so the global ones that were
for production deployment were being used for the tests as well, and the tests
were starting by wiping out the connected database.

As if this was not enough, I didn't understand what was going on because tests
that used to take 5 seconds were taking 3 minutes at that point, so I ran them
at least 10 times to make sure this was not a pipeline glitch. This also
helped me to make sure that the database was completely wiped.
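
One way to make this kind of collision fail loudly is sketched below, assuming the tests read their own variable and check the host before touching any data. The names `TEST_DATABASE_URL`, `SAFE_TEST_HOSTS` and `resolve_test_db_url` are hypothetical and not taken from the pipeline described above.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts a test run may wipe.
SAFE_TEST_HOSTS = {"localhost", "127.0.0.1", "test-db"}


def resolve_test_db_url(env):
    """Return the database URL for tests, or raise if it looks unsafe.

    Reads TEST_DATABASE_URL (an assumed, test-only name) instead of the
    DATABASE_URL used for deployment, so the two cannot shadow each other.
    """
    url = env.get("TEST_DATABASE_URL", "postgresql://localhost/app_test")
    host = urlparse(url).hostname
    if host not in SAFE_TEST_HOSTS:
        raise RuntimeError(f"refusing to run destructive tests against {host!r}")
    return url
```

Called as `resolve_test_db_url(os.environ)` at the top of the test setup, a production URL leaking into the test step would abort the run before any table is wiped.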

~~~
shoo
how could we structure things differently to prevent anything remotely like
this from ever happening?

~~~
gtsteve
When we migrate databases, we execute a Docker container with a specific
environment variable to let it know that it's time to migrate. The CI
environment doesn't actually have a network connection to the DB; it just
updates the Docker images and then when the deploy task executes it starts
this Docker task remotely and polls for success/failure.

I think Kubernetes has some nicer features that could make this easier but I
haven't had a chance to try this out yet.

In fact, the CI environment and production environments are on different AWS
accounts. This way if you accidentally execute tests with the wrong connection
strings (and we have done this before), they just fail because they can't
connect.

~~~
shoo
> the CI environment and production environments are on different AWS accounts

yep, that's one way to structurally enforce isolation (unless you go out of
your way to configure cross account permissions)

there's probably a bit of a tradeoff between things being safe and isolated,
and things being seamlessly automated.

~~~
gtsteve
Indeed, I didn't find it all that hard to configure a role in the production
account that has the permission to launch containers that the CI environment
can assume. I locked it down as much as possible but if the CI environment
gets hacked someone could launch a container in the production environment.

I did think of a way to make this more secure but I haven't done it yet - you
could write a script that waits for input from a website where you have to log
in via SAML. The script then passes the AWS credentials for the role back to
the deployment task. Effectively, it's a second step authentication process
where you'd need to be authenticated using a security key for the deployment
to proceed.

I haven't quite figured out all the details yet but that might help you get
the assurance you need.

------
photonios
We just did a large migration, transitioning our deployment pipeline and
front-facing web server to a new setup. I stayed at work till like 4 am. Everything
went smoothly.

The next day I woke up and realized I forgot to enable password protection on
the staging environment. I opened up my laptop, navigated to the dashboard,
added the flag and reloaded the config. 5 seconds later, my phone started
beeping. Tons of alerts that the production website was down and responding
with "403 access denied". Immediately tens of people jumped on Slack: "website
down?!".

Turns out, with my sleepy head, I added the flag to the production website
instead of the staging website, and thus enabled password protection on the
live website. As soon as I noticed, I undid it and in the end, downtime was
roughly a minute.

Suffice it to say, I got slapped for this fuck up.

~~~
exolymph
Oh noooo! Can't believe you had to stay that late at work. Poor planning
leading up to the migration, or just unexpected issues?

~~~
photonios
No other time to pull it off. We run a large website in a different part of
the world, so we're several time zones behind. We could only do this migration
during low traffic, which for us meant the middle of the night.

Sounds worse than it is. Got to work in the afternoon, got pizza and took a
couple of extra free days in exchange.

------
xtagon
The longer I think about it, the dumber the things I will probably
remember, but one that comes to mind is unintentionally configuring a test
suite to use a live SMTP connection instead of a mock one. When the tests ran,
thousands of e-mails to invalid e-mail addresses tried to send for real,
bounced, and temporarily affected the "reputation" score on the transactional
e-mail service.
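
A minimal sketch of one way to keep a test run from reaching a live SMTP server, assuming the application gets its mailer from a single factory function. `InMemoryMailer`, `make_mailer` and the `environment` values are hypothetical names for illustration, not part of any particular framework.

```python
class InMemoryMailer:
    """Collects messages in a list instead of sending them."""

    def __init__(self):
        self.outbox = []

    def send(self, to, subject, body):
        self.outbox.append((to, subject, body))


def make_mailer(environment, smtp_factory=None):
    """Return the mailer for the given environment.

    In "test" the real SMTP factory is never called, even if one is
    passed in, so a misconfigured suite cannot send real mail.
    """
    if environment == "test":
        return InMemoryMailer()
    if smtp_factory is None:
        raise ValueError("a real SMTP factory is required outside tests")
    return smtp_factory()
```

Tests can then assert on `mailer.outbox` rather than on anything that leaves the machine.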

------
Raed667
While onboarding, I was given a shell script that is supposed to take a
copy of the data in production, anonymize it, and set it up locally on my
machine.

It took 2 parameters, the first one is the IP address of the production
database and the second one my IP address.

I guess it was inevitable for those to get mixed up, and I guess no one
bothered to prevent write access to production.
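
A rough sketch of how named arguments could make the mix-up harder, assuming the script were rewritten around `argparse`. The flag names `--source` and `--dest` and the function `parse_args` are made up for illustration. This does not replace read-only credentials for production; it only removes the reliance on argument order.

```python
import argparse
import ipaddress


def parse_args(argv):
    """Parse named --source/--dest flags instead of two positional IPs."""
    parser = argparse.ArgumentParser(
        description="Copy anonymized data to a local machine"
    )
    parser.add_argument("--source", required=True,
                        help="IP of the database to read from")
    parser.add_argument("--dest", required=True,
                        help="IP of the machine to write to")
    args = parser.parse_args(argv)

    # ip_address() raises ValueError on anything that is not a valid IP.
    src = ipaddress.ip_address(args.source)
    dst = ipaddress.ip_address(args.dest)
    if src == dst:
        parser.error("--source and --dest must differ")
    return args
```

Swapping the two values is still possible, but it now has to be typed out as `--dest <production IP>`, which is much easier to spot in review or shell history.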

------
Torgo
I stored an entire 30GB PostgreSQL production database on an AWS ephemeral
volume (gets automatically wiped on restart) because I mistook the device ID
for that of a regular volume I thought I had attached.

Found my mistake months later, was able to fix it before disaster struck.

------
Adamantcheese
One time I tried to use Visual Studio's git integration and managed to delete
the entire remote repository and all commit history. Good thing I kept
religious zip backups because I was bad at git back then.

------
AnimalMuppet
Bringing up an embedded system that was its own development environment. I
messed up and deleted the boot file.

The system still booted, because the boot sector pointed to the sectors that
the deleted file had occupied, and the contents of those sectors had not been
over-written. I ran that way for two days until I got it to a point that I
could restore the boot file. I spent those two days afraid that any step would
over-write the wrong sectors and the system would be dead.

------
sethammons
My very first project at my new job. First "real" programming job. Due to bad
assumptions about dates in the db, I got a user stuck in a loop. We emailed them
some 400k times: "Your credits are low, upgrade your account!" I felt
terrible. Our only way to reach out? Email :(

~~~
exolymph
400k emails?! Oh my god. Did you ever hear back from the user?

------
elken
Not my screw up but working for a fairly large e-commerce site we had a cron
job to pick up 'stuck' orders and push them down to the warehouse every 30
minutes.

For a combination of reasons there was one order which was picked up every
time the job ran.

This went on for a very long time and nobody in the company realised. The
warehouse staff were all agency workers who dispatched B2B and B2C so sending
a pallet wasn't unusual for them. It was only when the customer got in touch
asking us to please stop sending the same thing that we were alerted.

Apparently she had ordered the items to her friend's office and they were
getting very frustrated having to deal with them.

Even worse the customer was in China and the company had to order a shipping
container to retrieve the stock.

------
ice303
I spent many hours trying to convince the IT director to do this at an
off-peak time, with proper planning, but he just wouldn't listen to me.

Hyper-V live migration from a HP EVA4400 to a 3PAR Storage. 45 hosts migrated
without a hitch. The one that couldn't fail, a SAP production server, failed
hard. The EVA crashed hard, both controllers went offline in the middle of the
migration. A couple of seconds later, one of the PSUs shut off. The
other one was waiting for a replacement part. My face turned white. Huge
downtime to recover everything from a Tape backup. A couple of days later, I
had a major burn out.

It was a really bad day at work :(

------
droptablemain
I once crashed a production server by accidentally running an infinite loop.

------
heelix
I misspelled pharmaceutical, which was part of the company name, and also
appeared on the splash screen of the PowerBuilder application. Was not
noticed by anyone before several cases of floppy disks were delivered to us.
The app required a dozen floppies or so per set... so we had a lot of 'spares'
lying around. My personal cone of shame.

------
AwesomeFaic
Formatting a USB stick from Terminal only to realize I referenced the wrong
drive. Killed the process too late, the MacBook crashed and failed to boot.
Apple couldn't save the drive or recover anything.

Thankfully all but the last day's work was on Git, but I was using the laptop
as a temporary photo storage (only device with an SD card reader at the time)
and lost 9 months of important photos.

------
potta_coffee
Transitioning from military to civilian life is difficult, especially right
after a deployment. I shot myself in the foot a few times because I just
didn't fit in. To be honest, I was a real asshole. For instance, shredding
someone to pieces like a PFC who's just done something really stupid - it just
doesn't work in the "real world".

------
jimrhods23
DELETE * FROM <table>

Yeah, I forgot the WHERE.

This was 15 years ago at my first job and I haven't done it since. I was lucky
that we had backups.

~~~
photonios
I wonder whether tooling that is commonly used to query a database should
detect these common mistakes and warn you. `psql` could easily pop up a
warning "Are you sure you wanna drop this entire table?".

An alternative would be for databases to add an option to prevent accidental
deletion. When enabled, it would make it impossible to truncate or delete entire
tables. I would enable such a setting on my production databases.
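
To illustrate the idea, here is a deliberately naive sketch of the check such a tool might run before executing a statement. It uses regular expressions rather than a real SQL parser, so it can be fooled (for example by the word "where" inside a string literal). The function name `needs_confirmation` is hypothetical and this is not how `psql` or any existing client works.

```python
import re

_DELETE_OR_UPDATE = re.compile(r"^\s*(delete|update)\b", re.IGNORECASE)
_TRUNCATE = re.compile(r"^\s*truncate\b", re.IGNORECASE)
_WHERE = re.compile(r"\bwhere\b", re.IGNORECASE)


def needs_confirmation(statement):
    """Return True if the statement would touch every row of a table.

    Naive heuristic: any TRUNCATE, or a DELETE/UPDATE with no WHERE.
    """
    if _TRUNCATE.search(statement):
        return True
    if _DELETE_OR_UPDATE.search(statement):
        return _WHERE.search(statement) is None
    return False
```

A client could call this on each statement and ask "Are you sure?" whenever it returns `True`.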

~~~
Torgo
JetBrains DataGrip does put up a warning like this before you can do a DELETE
on a table, unadorned by a WHERE clause.

~~~
jimrhods23
DataGrip is my #1 DB tool for my team for reasons like this.

------
turtlegrids
Not taking the 83(b) election...

