Ask HN: Weirdest hack that you ever saw in production? - jxub
======
Spooky23
I worked at a place that had a large, distributed terminal network running on
something like OSF or DEC Unix.

It was 2001 and the thing was on life support while PCs were being rolled out.
I was helping to rack my new database servers, which were next to this big lab
table/shelf combo with like 16 terminals on it. When pulling a cable I banged
my head on the table, then this big book fell on my hand.

About 10 minutes later, a bunch of graybeards come around the corner yelling
“WTF are you doing!”

Turns out that the dictionary that hit my hand was perched across two
keyboards, holding down the “enter” keys of two terminals. Turns out that for
reasons unknown, those terminals had to be repeatedly hitting the enter key in
order for the logins and print jobs of about 40,000 people to work.

~~~
neverminder
I've got one of those from the "Only on Windows" series. One of my colleagues
uses Windows and tends to remote in from home. Unfortunately when a cleaner cleans
the office after hours she sometimes accidentally hits the CAPS key and he
can't log in anymore remotely. His solution was to rip out the CAPS key and
cover the hole with duct tape.

~~~
Odenwaelder
I do this, too. The first thing I do when I get a new keyboard is to remove
the caps lock key. I never used it in my 20 years of PC usage and I don't get
why it's still there.

Back in my CounterStrike gaming days I would also remove the Windows key
because it would crash the game when accidentally pressed.

~~~
ssalka
DON’T remove caps lock!!! Use it as an application switcher!

Some of my most used shortcuts:

Caps + C = Chrome

Caps + S = Spotify

Caps + A = Atom

Caps + T = Terminal

+1 for removing windows key if gaming

~~~
cookingrobot
On Windows, Win+1 launches the first app in the taskbar, Win+2 launches the
second, etc.

------
etxm
I worked for a chain of medical clinics in the early 2000s in Florida.

Everything was going digital. We had a “remote” office across the street that
didn’t have internet access and it was really expensive to have lines
installed.

I don’t remember what the device was called but it was some sort of satellite
dish. Our network admin had ordered them and installed them on the roof tops
of both buildings and we could provide access via this link.

It was pretty rad to me. But then the intermittent bugs started creeping in.
The remote site used some windows 2000 dumb terminals and throughout the day
all of them would disconnect at the same time - every day.

It was very random and they’d automatically reconnect after a few seconds to a
few minutes - always.

Well, I was the new guy / intern grad. And so after a few weeks of debugging
different things inside and moving the dishes around to make sure they were
exactly pointed at each other my boss rolls into work with a beach chair and
an umbrella.

I looked at him and asked if he was taking off early and hitting the beach and
he goes “no you’re debugging”.

We set up the chair and umbrella on the roof. I sat up there with a
walkie-talkie and the remote site had the other one. My job was to sit until
they radioed, to see if anything was obviously interrupting the connection.

Then a semi truck stopped at the red light.

~~~
dijit
Sounds like a point-to-point directional wifi radio. They're far from a hack
and have been used to great success upwards of 2 miles away from each other
with 1Gbps throughput, even for small ISPs.

~~~
odammit
Yeah I can’t remember what it was. Someone said laser link above, I feel like
it was microwave - I do remember being concerned we’d cook birds.

~~~
ahoka
Yeah, it was probably microwave.

------
Benjamin_Dobell
Whilst this isn't really in production, I was porting AOSP to an Android TV I
owned. However, I wanted to use the latest version of Android, but I had some
closed-source graphics composition binary blobs to interface with; they were
for an older Android API. Naturally, this meant writing a wrapper from the new
API back to the old API.

For some reason, every so often (sporadically) I'd get a segfault inside the
closed-source binary blob. To get things "working", in my wrapper, I captured
the stack before calling the occasionally segfaulting function, and set up a
segfault handler that would simply restore the stack to its state prior to the
crash.

Unfortunately, after restoring the stack, a subsequent call to the closed-
source function would hang. I did some preliminary reverse engineering of the
binary blobs and found that it was segfaulting whilst having retained a mutex.
So I did, ah... the obvious thing(?) and just grabbed the raw memory address
of the mutex, and released it myself when a segfault was encountered.

Surprisingly this all "worked". In the end I had a TV that thought it was a
65" phone, lock screen and all. Umm, yay!

Here's the code:

[https://gist.github.com/Benjamin-Dobell/bb13f6169aaa48625453...](https://gist.github.com/Benjamin-Dobell/bb13f6169aaa486254537c5e973e53ec)

~~~
gandreani
Wow. Thanks for sharing your story.

The way you casually explained this to me really threw me off. I remember
trying to do some AOSP hacking on some older phones and always gave up, for
different reasons:

Locked Bootloaders

Huge sources to download

Confusing device configuration (all those xml files were and still are magic
to me)

The farthest I ever got was to compile a kernel and flash it onto a phone. I
was so happy for so little. The only difference was the change to the build
version / name.

I remember back in the pre Ice cream era graphics drivers were the most often
quoted reason for older devices getting "stuck" on older versions of Android.
It never occurred to me that you can write a graphics wrapper.

------
astockwell
Right after WinXP came out, I was a sys/win admin at a manufacturing co. Their
ERP system was an old VB app that sat on a share drive and everyone opened the
same .exe from their respective workstations.

The app required a ton of scheduled database and ERP tasks (it used a legacy
flat-file db), so the vendor wrapped them all up in a secondary executable
that was effectively a non-headless (headful?) daemon (this was expensive,
niche industry software btw). The first instance of the application that
opened would also trigger the daemon to open too, _on whichever PC it was
executed on_ (it was supposed to be opened on the server 1st). It was provided
by the vendor this way, as part of the COTS application.

As a result of this daemon hack, every couple days (after the application
crashed on the server, as it did frequently) I would run around the building
to dozens and dozens of workstations until I found the user’s workstation that
had been the first to run the ERP after the server process crashed, and thus
had the daemon running on their workstation. Then I would kill it, and sprint
back to the server closet to reopen the daemon before any other users would
run the ERP and grab the daemon (later would just RDP after we got off NT).

It was awesome.

~~~
davb
Oh wow. Yeah, that sort of thing seems to have been so common in ERP software.

My first professional programming job was working on a bespoke ERP and
industrial process control suite (written in PowerBuilder). The program had a
huge number of sub programs (dynamically loaded modules, each an MDI window
accessed using a “program name”, something like a SAP transaction code).

We had a number of background services that would have to run, however writing
Windows services in PowerBuilder was anything but easy. And we were reluctant
to use anything else - the whole benefit of using a 4GL was a well integrated
ORM and report generating functionality.

So we’d implement our background services as regular modules (with their
little MDI window) within the main thick client app. Clients would have a
number of workstations dedicated to running a single one of these processes.
Nothing headless, each outputting its status or logs to the connected display.
If the power, network or database ever dropped, each of these machines would
have to be restarted and have its allocated sub program reopened.

For example, the despatch label printing program would monitor a database queue
table for new rows, bring up a report associated with the specified despatch
note, print the report to the label printer then delete the row.
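
The queue-table pattern described here can be sketched roughly as follows. This is Python with SQLite rather than PowerBuilder, and the table name `label_queue`, its columns, and the `print_label` callback are all assumptions made up for illustration, not the original schema.

```python
import sqlite3


def drain_queue(conn: sqlite3.Connection, print_label) -> int:
    """Process every queued row once, oldest first; return how many were handled."""
    rows = conn.execute(
        "SELECT id, despatch_note FROM label_queue ORDER BY id"
    ).fetchall()
    for row_id, despatch_note in rows:
        print_label(despatch_note)  # stand-in for "bring up the report and print it"
        # Delete only after printing returned, so a crash leaves the row queued.
        conn.execute("DELETE FROM label_queue WHERE id = ?", (row_id,))
        conn.commit()
    return len(rows)


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory queue table with two hypothetical despatch notes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE label_queue (id INTEGER PRIMARY KEY, despatch_note TEXT)"
    )
    conn.executemany(
        "INSERT INTO label_queue (despatch_note) VALUES (?)",
        [("DN-1001",), ("DN-1002",)],
    )
    conn.commit()
    return conn
```

A dedicated workstation would presumably call something like `drain_queue` in a loop with a short sleep between passes. Because the row is deleted only after the print call returns, a power cut mid-job means the label is printed again on restart rather than lost.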

It seems so hackish but it worked incredibly well. Our clients were all food
or paper manufacturers, running 24/7. Operations were rarely disrupted. Having
a single screen per function to monitor for status changes was something
operators were accustomed to.

This was over a decade ago, but I’ve never worked with a more productive team
since. The constraints of the system let us focus on solving business
problems. I can’t imagine writing anything of this scale in a modern
environment. I’d love to see 4GLs like this make a comeback. The first class
GUI, ORM, report generation were a huge productivity boost. And the simple
programming language (with a very simple object model) put the focus on
problem solving and not API acrobatics.

Simpler times.

~~~
astockwell
No kidding. Everything got clear queueing and back-pressure for free!

------
mattzito
In my startup days, we were working on a proof of concept with a really big
bank. Because of their security rules, we couldn't have direct access to their
systems - so if we wanted to do something remotely, we would have to start a
webex, they would join and share their screen, and give us remote control.

This worked great, except if we wanted to work over the weekend, since if we
left the screen alone for more than a few minutes, the screen lock would kick
in and we'd lose the session.

Our solution? We purchased a small fan with an oscillation mode, and tied a
mouse to it. We then had the fan drag the mouse ever so slightly back and
forth whenever we wanted to step away from the remote session. Kept it going
for weeks.

~~~
macromaniac
I use a PowerShell script like this when I don't want the computer to lock.

$ws = New-Object -COM wscript.shell;

while($true){ $ws.SendKeys("j"); Sleep 60;}

I've used it for demos so that the computer doesn't lock before the demo
starts; it's pretty short and easy to remember. Also, on Windows, if you spam
SendKeys("{Left}") everything you type is backwards, and when you hit the
Windows key it freezes the computer in an interesting way. Pretty fun.

~~~
ghkbrew
I used to use autohotkey to send an F13 key press every couple minutes to
avoid lockout on a work computer where I wasn't allowed to increase the
timeout

~~~
unixhero
Wait, are function keys above F12 still addressable?

~~~
striking
Certainly, my keyboard still has those buttons and I still use them

~~~
jscholes
Which keyboard is this? I'm a keyboard-only user and would love to have an
extra row of keys for shortcuts.

~~~
duskwuff
Can't speak for the parent, but Apple USB keyboards have F13-F19.

------
kieckerjan
In the mid-nineties I was hired to improve the performance of (and eventually
rewrite) a custom in-house search engine. I dipped into the software and there
were some quick wins, but I couldn’t get the damn thing to reply quicker than
100 ms. In desperation I just grepped for the number 100 and sure enough I
found a 100 ms sleep in the routines handling the connections. Turned out the
author had made a mess of his socket handling and by trial and error had found
out he could get the thing to work reliably only by waiting for a while.
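
A fixed sleep like this often papers over the fact that a single `recv()` call may return only part of a message. A minimal sketch of the usual alternative follows, in Python rather than whatever the original engine was written in. The length-prefixed framing and the function names are assumptions for illustration, not anything taken from the original code.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Block until exactly n bytes have been read, or raise if the peer closes."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # an empty read means the peer closed the connection
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send a 4-byte big-endian length prefix followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock: socket.socket) -> bytes:
    """Read one length-prefixed message, however many recv() calls it takes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With framing like this the reader waits exactly as long as the data takes to arrive, so there is no fixed floor on the response time.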

~~~
krylon
An SAP consultant once told me that his preferred technique of averting long-
winded and pointless discussions about irrelevant details was to insert random
delays into his code. That way, instead of discussing irrelevant details,
people would get upset about the performance. He would then sigh,
dramatically, and say he would see what he could do, remove a few lines, spend
the rest of the day reading the news, and - importantly - billing the
customer.

~~~
jhanschoo
I'm pretty sure I've seen this before, is it one of the BOFH stories?

~~~
codfrantic
close:

[https://thedailywtf.com/articles/The-Speedup-Loop](https://thedailywtf.com/articles/The-Speedup-Loop)

~~~
shoo
there's also the memory-optimization variant of this, from the land of game
dev. see "the programming anti-hero":
[https://www.gamasutra.com/view/feature/132500/dirty_coding_t...](https://www.gamasutra.com/view/feature/132500/dirty_coding_tricks.php?page=4)

most of the other game dev dirty tricks on gamasutra are good value -- the
"(s)elf-exploitation" one from Jonathan Garrett, Insomniac Games is
particularly awfully clever:

[https://www.gamasutra.com/view/feature/194772/dirty_game_dev...](https://www.gamasutra.com/view/feature/194772/dirty_game_development_tricks.php?page=1)

------
burlesona
As a consultant I got a job from a major public company to fix a new
touchscreen based in-car dashboard they had built. It was a web app running on
a cheap android tablet full screen. The thing worked well, they said, except
that it would get stuck in demo mode, and you couldn’t switch out of it.
They’d paid an overseas contractor a significant sum to build this and
eventually fired them when they got stuck at this point.

Upon opening the code I discovered the entire program was a carefully
constructed slide show with hundreds of jpgs in a jQuery carousel and some
magic click areas coded in to jump the user between slides. Other than this
code to jump to specific slides, there was no code at all. Even the text on
screen was in the images.

I should note that their git repo consisted of about a hundred folders whose
names were dates, and one folder named “current.” That was actually my first
warning of just what I was getting in to.

~~~
fsloth
"Upon opening the code I discovered the entire program was a carefully
constructed slide show with hundreds of jpgs in a jQuery carousel and some
magic click areas coded in to jump the user between slides."

Based on my narrow understanding of "standard issue practices" in car
dashboard UI:s workflows, this was a common pattern at least at one major
German automobile company. Static views and transition rules between them.

I was a bit involved in dashboard software a decade ago and was really
surprised to learn this.

------
chrissnell
In 1997, I worked with FedEx to build an integrated order management system
for the e-commerce company that I ran with my dad. Orders would come into my
Perl-based order management system and the pickers would use the web interface
to print a packing slip. A bash script would generate a Postscript barcode and
then my Perl would generate a packing slip in LaTeX that included that
barcode. The LaTeX-generated PS file was sent over a private T1 from the
datacenter to the warehouse printer. The order would get picked and put in a
crate with the packing slip. The barcode was then scanned at a FedEx shipping
station by our shipping guys. That would trigger a script on the FedEx machine
(written in Visual Basic, I think) that would make a call to PostgreSQL over
Windows ODBC to pull the shipping address and shipping method. As soon as the
workstation populated this info, it would spit out a FedEx shipping label and
the VB script would then trigger an INSERT back into Postgres with the
tracking number. This triggered another Perl job to mark the order as
"shipped" and would send an email to the customer with the tracking number.

TLDR: we had real-time order tracking with full shipping and billing
integration in a tiny mail order bicycle parts business in 1997.

~~~
mabbo
You should have sold books, and expanded to bicycles later.

------
crispinb
Webhost circa 2002. Lots of carrots from Microsoft for us to go big on ASP.NET
hosting. Fat boss made a deal which involved us rewriting our customer
interface in ASP.NET from the existing ColdFusion morass. His eyes popping at
our estimates of how long this would take, he came up with a solution: rename
our *.cfm files to .aspx, and map IIS to pass .aspx files to the CF server.
Job done.

~~~
danieltillett
Genius. Now if we could convince the kids of today this is the solution to
rewriting everything using the latest fad framework.

------
bongilla
Years ago, in the 70's, we came across a bug where some program would skip
every other input line. When asked to fix it, the responsible programmer went
away and within a few minutes reported back that she fixed it. When we told
her it was still broken she referred us to the updated documentation, which
now said "the input should be double spaced". The said program was used this
way for years after.

------
mi3law
In a consumer app, I would say Snapchat's early camera hack on Android takes
the cake.

To be brief, their app ran the Android native camera app in the background and
took a screenshot of the resulting feed for the image, bypassing actual
integration with Android's camera apps. Having worked on an Android smartphone
from the ground up, I can understand their reluctance to commit dev time to
having to support so many Android versions and other variations on all the
devices out there, but still a lazy weird hack.

[https://android.gadgethacks.com/how-to/fyi-why-androids-snap...](https://android.gadgethacks.com/how-to/fyi-why-androids-snapchat-app-takes-inferior-photos-0174597/)

[https://www.reddit.com/r/GooglePixel/comments/64xqv0/snapcha...](https://www.reddit.com/r/GooglePixel/comments/64xqv0/snapchat_recording_quality/)

~~~
pg_bot
This is hilarious as Snap Inc. bills itself as a "Camera Company"

~~~
jbob2000
This isn't a comment about Snapchat, more so about how shitty the camera API
is for Android. Yes, it is faster and more reliable to take a screenshot of
the camera app.

------
theslugger
First job out of university and I had to fix a terrible crash that happened to
our prod application every few hours running on Windows NT boxes. After lots
of debugging and asking all the “senior” devs, no one knew a solution since it
was all super old code. What I did notice, though, was that the app I was
debugging didn’t crash until I stopped debugging it.

Turns out that it was a memory issue and every time I minimized and maximized
the app, part of the memory got cleaned up. So as a temporary fix, I just
wrote a script to auto minimize/maximize the apps on all boxes until we found
the memory leak.

Note: we never found the leak.

------
mikelevins
I once worked maintenance on a large C++ program used in production by a lot
of customers. It was odd in several ways, but the feature that stands out in
my mind was the numerous classes that were not defined anywhere in the source
code or libraries.

If that sounds unlikely to you, it sounded unlikely to me, too. I wasted a lot
of time trying to figure out where they were defined. I couldn't ask the
original author of the code; he had moved on.

Eventually, I found them, sort of. They were being defined by a sed script
that ran during the build process. It read the sources before they got to the
compiler, constructed class definitions on the fly, and injected them into the
code before it was fed to the compiler. So the definitions were right there in
the code that the compiler saw; they just weren't anywhere in the code that
humans could see.

Why was it done that way? I have no idea.
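
For readers who have not seen this kind of build step, here is a rough sketch of the idea in Python instead of sed. The `// @generate-class` marker syntax is entirely hypothetical (the original script's input format is not described), and for brevity it emits a plain `struct` rather than a full class.

```python
import re

# Hypothetical marker: "// @generate-class Name: type field, type field"
MARKER = re.compile(r"^// @generate-class (\w+): (.+)$", re.MULTILINE)


def expand_markers(source: str) -> str:
    """Replace each marker comment with a generated C++ struct definition."""

    def build(match):
        name = match.group(1)
        fields = [field.strip() for field in match.group(2).split(",")]
        body = "\n".join(f"    {field};" for field in fields)
        return f"struct {name} {{\n{body}\n}};"

    return MARKER.sub(build, source)
```

If the expanded text is piped straight into the compiler and never written back to disk, the definitions exist only at build time, which matches the situation described above.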

~~~
C4stor
Honestly, that sounds pretty normal, or at least ok, to me. Auto-generation of
code is one of my primary daily tools, and I think it's just right for whole
categories of problems. I currently generate a good third of my compiled
sources, and also generate a rudimentary typescript library out of my source
code to be used by my colleagues.

That being said, I usually work in Scala which provides language-based tools
for that, so it definitely helps with avoiding the "dark magic" sentiment you
may have had.

But why was it done that way? It reduces boilerplate, copy-paste errors, code
duplication, in ways sometimes not made possible by inheritance or
composition.

~~~
ups101
Did you consider the point about code being defined only at build-time, not
being available for inspection by the developer? That sets it apart from most
auto-generating code scenarios I've seen.

------
closeparen
The lingua franca of theatrical lighting control is a physical-layer protocol
designed for custom cabling called DMX. Light boards emit an array of 512
values in the range [0,255] and dimmers, or lighting instruments themselves,
interpret these values as parameters like intensity. For various reasons it's
useful to carry this signal over an IP network, and proprietary standards to
do so have proliferated.

Light boards these days are just computers with some domain-specific IO. Tired
of our ancient ETC Expression console, my colleagues and I wanted to start
using ETC's new Nomad control software on our laptops. Our venue's dimmers
only understood ETCNet2, while Nomad could only speak the newer ETCNet3 (and a
few other open standards we couldn't use). Attempting a software upgrade on
the dimmers themselves seemed incredibly risky. To bring Nomad's output to DMX
would have required an additional $500 hardware purchase on top of the
already-not-cheap software license.

On the message boards, I discovered a strange fact. The ETC-branded DMX<->Net2
interfaces we owned were actually white-label manufactured by a company called
Pathport. Pathport boxes spoke a much wider array of protocols using the same
hardware. These things handled firmware updates by flashing themselves with
whatever was served to them over BOOTP. Pathport firmware images were free to
download straight from the manufacturer.

Net3->Net2 was too much to ask for, but they _could_ do ArtNet (an open
standard) to DMX. Nomad could also emit ArtNet. So I flashed and configured
one node to operate as ArtNet -> DMX, and plugged it into another node
configured for DMX -> Net2.

So now, locked in a closet, there is a very strange loop of switch -> hacked
ETC box -> normal ETC box -> switch which seems bizarrely redundant, but
actually makes the world go 'round. And I could run lights and sound from any
network drop in the building.
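
To make the "array of 512 values carried over IP" idea concrete, here is a sketch of packing one DMX universe into an ArtNet `ArtDmx` packet. The field layout follows the published Art-Net specification as best understood here; the function name and defaults are assumptions for illustration, and this is not a description of what Nomad or the Pathport firmware does internally.

```python
import struct

ARTNET_PORT = 6454      # UDP port used by Art-Net
OP_DMX = 0x5000         # ArtDmx opcode, transmitted low byte first
PROTOCOL_VERSION = 14   # transmitted high byte first


def build_artdmx(universe: int, channels: bytes, sequence: int = 0) -> bytes:
    """Pack up to 512 DMX channel levels (0-255 each) into one ArtDmx packet."""
    if not 0 <= universe <= 0x7FFF:
        raise ValueError("universe must fit in 15 bits")
    if not 1 <= len(channels) <= 512:
        raise ValueError("a DMX universe carries 1 to 512 channels")
    if len(channels) % 2:  # the data length is expected to be even
        channels += b"\x00"
    # 8-byte ID string, then the opcode in little-endian order.
    header = struct.pack("<8sH", b"Art-Net\x00", OP_DMX)
    # Version, sequence, physical port (0), universe low byte (SubUni),
    # universe high bits (Net), then the data length in big-endian order.
    header += struct.pack(
        ">HBBBBH",
        PROTOCOL_VERSION,
        sequence,
        0,
        universe & 0xFF,
        universe >> 8,
        len(channels),
    )
    return header + channels
```

Sending it would be a single UDP datagram to the node on port 6454, for example with `socket.sendto(packet, (node_ip, ARTNET_PORT))`, where `node_ip` is whatever address the ArtNet -> DMX box was configured with.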

~~~
magmastonealex
Wow, this brings back memories of running the theater in my school. Definitely
a different situation, but I like to think we did a good job given what we
had.

We didn't really have a budget, just some hand-me-down equipment that came
from above sometimes. I and others on my team put together so many hacks to
make things work. One memorable time, our light board had broken, but we still
needed to run shows.

We didn't have enough time to wait for shipping on a real USB->DMX adapter,
nor budget for a new board, so I created a hacked together DMX adapter with a
serial to USB adapter and a NAND gate (I put schematics together here, if
anyone's interested:
[https://github.com/magmastonealex/DMXAdapters](https://github.com/magmastonealex/DMXAdapters)).

It worked remarkably well for being a bit of a hack, but paired with software
like QLC+, had more features than our old light board! It was still in use for
controlling special effect lighting when I left, though thankfully not for
main lighting and day-to-day use.

------
sokoloff
Sony PSX (original playstation) port of a PC title that I worked on, we needed
to have a physics thread run at a predictable and consistent rate regardless
of what the rest of the game was doing. Sony Japan said pre-emptive multi-
tasking wasn't possible.

Found a way to hook the vertical blank interrupt (shades of old Atari 8-bit
programming), push all the registers onto the stack to create a
setjmp/longjmp-ish way to call our physics thread at a consistent 30Hz. (OK,
29.97, but close enough)

~~~
jai_
Do you have any more interesting stories of working on game development for
the PSX?

~~~
sokoloff
I only worked on the one title (NASCAR Racing) and so my war stories are
limited, but I'll give you what I recall.

Original dev boards were 3 full length ISA bus (IIRC) boards and were a PITA
to get installed, all the IRQ conflicts resolved, etc. Later dev environment
was a "blue PSX" (basically a production PSX with blue plastic that could run
non-copy protected discs). I think the ISA boards had more memory than the
production boxes; I'm not sure if the blue had extra RAM or not.

We were always RAM constrained (may have been less of an issue for a ground-up
game, but we were porting a PC title), and we wanted to use a common codebase
with the PC title, so we had a _LOT_ of complex C macros to bridge between the
PC world and PSX world. (As just one example, we could have used filenames on
the PSX, but there was no reason to waste the RAM, so I wrote macros to turn
PC-file-based accesses into PSX-sector-byte accesses. I also wrote the macros
such that they'd break the PC compile/runtime [depending on the macro] to
prevent the PC teams from writing code that would only work on their platform.
It wasn't hugely popular with some of the "old-timers", who viewed the
consoles as a distraction.)

Compiler was gcc; we used Emacs as our editor (me and the other main
programmer were MIT alums) and in order to get a better emacs experience, we
installed OS/2 Warp as our desktop OS (so we could get subshell compilation
working, which didn't work, or didn't work well on a DOS boot [this was 1995
and prior to NT-based flavors of Windows]). Debugging was primarily via printf
or small graphical blocks on the corner of the screen.

Documentation was fairly terrible and Sony CA had to escalate many
clarification questions to Japan. Docs would say things like, “It’s critical
to never fail/forget that initializing this system must happen strictly before
the lack of initialization of that system.” It sometimes felt like the Ed
Asner water-in-nuclear-reactor sketch.

Sony QC to approve the golden master was very strict. We shipped with over a
dozen tracks and they seemed like they drove every square inch of them and
complained about graphics glitches in many places that were far enough off the
racing line that we never noticed (or never cared).

In terms of graphics "flair", the PC title had a flat colored track, which
wasn't as appealing as the PSX titles of the day (Ridge Racer and the like).
We didn't have a huge art budget for the title, but we created an artificial
racing "line" of darker track which we placed by repurposing the position and
acceleration data used for the PC AI drivers' algorithm. Where the AI cars
were accelerating (including laterally) was darker than where they were just
driving was darker than where they rarely drove.

Because the PC title was heavily focused on realism (which means it's not as
easily accessible or "fun" for the casual gamer), I created an "arcade
physics" mode where the car would slide and rotate more, had higher absolute
cornering and braking ability, but the same forward acceleration. I also added
"double click to burnout/do donuts" in normal mode as both a fun way to screw
around but also a way to more easily exit a tight pit box. This had the
unfortunate effect of giving much better acceleration from a standing start.
So, when it found its way into the PC multiplayer title, standing start races
became a sea of tire smoke and cars running into players who hadn't learned
that burnouts gave faster acceleration. (We properly modeled the horsepower as
a function of RPM. Burnouts raised the RPM. My hack didn't model the tire slip
under acceleration, so burnouts brought the car up into the power band and you
would walk away from a car who was accelerating from a lower RPM.)

We had another team working on a Sega Saturn version at the same time; that
title never shipped, in small part because of the technical hurdles of getting
the title to run on the platform, but also because the limited commercial
success of the Saturn was becoming obvious during development.

Other memories were working with some of the most talented programmers and
artists I'd worked with up to that point in my career (both on my immediate
team and elsewhere in the company), meeting Ken and Roberta Williams (Sierra
bought us), and going to racing school to get a better hands-on feel for auto
racing. Fun times and I sometimes wonder how my career would have gone
differently if I'd stayed in games. (I left because each successive merger or
acquisition by non-gamers made the company worse and worse to work for. Sierra
and the Williams were great; subsequent MBA-types were each progressively
worse, including substantial securities/accounting fraud so I was glad to get
out when I did.)

Random tidbit: it was a single player game. If you pressed a button on the P2
controller during boot, you got a simple Tron light-cycles-style game that we
had embedded as a small Easter egg.

~~~
ashleyn
Funny you mention NASCAR Racing. Was this the 1994 MS-DOS title? How were
threads even done in an OS like DOS? I'm guessing this was something DOS/4GW
gave or along those lines.

~~~
sokoloff
Yes. This one:
[https://en.wikipedia.org/wiki/NASCAR_Racing](https://en.wikipedia.org/wiki/NASCAR_Racing)

I was tech lead on the PSX title and contributed to the PC NASCAR Racing 2 and
Grand Prix Legends titles. Many of the core programmers from Papyrus went on to
form iRacing.

I seem to recall that the DOS titles were 4GW. We ran the physics and joystick
read (time a capacitor charge through a variable resistor in the controller)
together (and maybe the sound synthesis as well)

(We had hacks to detect running under Win95 and then walk the app, touching
each page periodically to keep Windows from paging us out.)

------
davidmr
This isn’t even top 20 in this thread, but here’s mine: maybe 17 years ago, we
upgraded our department server from a big old Sun 4/690 running SunOS to a
shiny new Ultra80 running Solaris.

One of the many functions this server had was to host a bunch of black and
white x terminals. Probably only a few people here ever used those (although
more than most other online forums!), but basically the idea is that they plug
into the network, at power on they tftp down the image for the x server, they
boot and allow you to run x client apps on the server, an 80s thin client
implementation. So we upgrade the server and things are working pretty well,
especially for such a major upgrade.

My boss/mentor at the time is truly brilliant, so we really had most
everything thought of. The only thing that was off was that all of a sudden
the xterms all stopped booting. We couldn't figure it out. Network sniffers
didn't show anything useful--we were just baffled.

On a whim, we decided to take the tftp server out of inetd control and truss
it (Solaris equivalent of strace). The first time? Worked perfectly--our test
xterm booted just fine. Eventually we figured out that the new server was so
fast that the speed of the tftp transfer was triggering a problem on the
Ethernet card firmware of the xterms and by using truss, it slowed down the
transfer and bypassed the bug.

Solution: In inetd.conf, we just spawned in.tftpd with "truss -o /dev/null".
Never saw the issue again.

~~~
ysleepy
That reminds me of this:
[https://github.com/strace/strace/issues/14](https://github.com/strace/strace/issues/14)

------
tyingq
Early 90's. Big fortune 500 website running on a single Pentium 90 desktop PC.
We had to remove the case on the computer to allow for more cooling and put a
consumer grade house type fan next to it. Otherwise, it constantly overheated
and would reboot.

So real data center, racks, etc. But this cheap ass, caseless, P90 on a shelf
with a household fan blowing on it, while making millions of dollars.

I was mystified why there was no budget to use a real 1U server. The internet
was pretty new at this time, but it was driving revenue.

Also, side info, this caseless P90 still exists. Sitting in my friend's
cubicle, naked and caseless. Pure glory. It's a hero. Tech stack was NCSA
webserver, C, and Ingres plus daily updates with a 1.44MB floppy disk.

------
Alex3917
"Don't delete this comment or the production server will crash." Tried it, did
as advertised. Apparently the website went through a proxy that reflected over
the code, using that comment as a hook to inject some sort of functionality.

~~~
qrohlf
I've personally set this kind of thing up. Inherited an old PHP site for a
webdev contract, and a couple weeks into development (before any of my code
had made it to prod), the server starts hanging randomly, or spitting out
seemingly random errors on every request.

I was told in no uncertain terms by the client that I had to fix the server
hangs within 48hrs or I'd lose the contract. This was in a million+ LOC custom
WordPress nightmare.

I wound up writing a script that ran on a little EC2.micro instance that would
ping the homepage every 60s looking for the HTML comment `<!-- if you delete
this comment, the server will reboot forever -->`, and if the request timed
out or the text wasn't found it would hit their hosting API and reboot the
server the site was running on.

I deployed the "fix", finished the contract without incident, and subsequently
fired the client.
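
A minimal sketch of that kind of watchdog, assuming Python. `fetch_homepage` and `reboot_server` are hypothetical callables standing in for the HTTP request and the hosting API call, neither of which is specified in the comment above.

```python
import time
from typing import Callable

# Marker text quoted in the comment above; the page counts as healthy only if it is present.
MARKER = "if you delete this comment, the server will reboot forever"


def site_is_healthy(fetch_homepage: Callable[[], str]) -> bool:
    """Return True if the homepage loads and still contains the marker."""
    try:
        html = fetch_homepage()
    except Exception:
        # Timeouts and connection errors count as unhealthy.
        return False
    # The HTML may wrap the comment across lines, so normalise whitespace first.
    return MARKER in " ".join(html.split())


def watch(fetch_homepage, reboot_server, interval_s=60, max_checks=None, sleep=time.sleep):
    """Check every interval_s seconds and reboot whenever the check fails.

    max_checks=None loops forever; a number makes the loop finite (useful for testing).
    Returns the number of reboots triggered.
    """
    checks = 0
    reboots = 0
    while max_checks is None or checks < max_checks:
        if not site_is_healthy(fetch_homepage):
            reboot_server()
            reboots += 1
        checks += 1
        sleep(interval_s)
    return reboots
```

Passing the fetch and reboot steps in as callables keeps the check itself testable without touching a real server.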

~~~
earthboundkid
The Atlantic has a magic PAGE_COMPLETED comment that we used when I was there
to tell the CDN whether to cache a page or not. I imagine that’s common.

------
jboggan
Using the Google Sheets API to store session history and metadata for a
nightly backfill job instead of, you know, a database. The program broke after
the creator left and no one could figure out how to bring it back up. The
engineer assigned to fix it pulled their hair out looking for the database
creds, local SQLite3 records, anything that would initialize the backfill.
Finally realized it wasn't just printing out metadata to a Google Sheet but
actually relying on that as a persistence layer. Root cause of the breakage
was automatically adding every Hadoop counter from the job as its own column
in the Sheet, which eventually exceeded the dimension limits.

~~~
sothym
Using Google Sheets instead of a proper database brings me nightmares.

~~~
mehwoot
I did this once. A client wanted a website that mimicked the functionality of
a complicated spreadsheet that he had created to calculate quotes for
customers and didn't have enough money to pay to have all the logic rewritten
in a webserver (not to mention continually updated).

I imported the spreadsheet into google sheets, gave him access, and had the
webserver paste the values in the spreadsheet via the google sheets api and
read them back out.

~~~
franciscop
I created a whole project for this!

[https://github.com/franciscop/drive-db](https://github.com/franciscop/drive-db)

Since you can also hook a Google Form to a spreadsheet, you can do
surprisingly advanced things over there.

~~~
Xorlev
Not to detract from this project, but this is a good opportunity to mention
Apps Script.

You can do all sorts of interesting things between a form, spreadsheet, and
other services including your own. Nobody seems to use it, but internally we
do all sorts of gloriously hacky workflows with it.

You can easily script forms/sheets/calendar/Gmail together to create pretty
much anything you need.

I use it to send daily email reports of data fed into a spreadsheet.

~~~
franciscop
Ah sure, no problem, feel free to do a PR mentioning Apps Script as well if
you'd like.

From a quick overview it seems like if you need serious work with
spreadsheets/GDocs then Apps Script is a good choice. However drive-db is more
like a (very) quick way of putting a Spreadsheet into your Node.js backend as
an array/db. I purposefully didn't even allow edit since that'd require API
keys from users and defeat the _quick_ part of it.

------
jballanc
Not sure if this is still the case, but back when I was at Apple the program
that triggered when you pressed a button on an Apple remote pointed at a
Macintosh was just a giant AppleScript file that, at the top level, was a
giant `if...else if...` statement to try and determine which application had
the foreground so that the appropriate action (e.g. next track for iTunes,
next slide for Keynote, next chapter marker for QuickTime, etc.) could be
triggered.

~~~
ollin
Interesting! The only .scpt files I can find in /System are Automator actions
or in Script Utility itself, and of those the only one that seems relevant is
Library/Automator/Initiate Remote
Broadcast.action/Contents/Resources/Scripts/main.scpt, which seems like
something else. Hopefully that means they fixed it (if someone with an apple
remote wants to run opensnoop and double-check that would be cool!)

~~~
rcarmo
I think I remember this from the days of Front Row. It’s long gone now.

------
theyinwhy
Years ago, I was playing Prince of Persia: The Sands of Time, a lot. As the
game was quite hard, I died often and every death resulted in huge loading
times. After hours of game play I found out that every load was showing the
exact same animation and took about the same time to load. I browsed the game
folder and found a video file with the exact same animation. Replaced it with
a 1 second video file and guess what, it worked. Never felt more like a hacker
again.

~~~
kleer001
Heh, nice. Reminds me of 13 y/o me and hacking a copy of a shareware CGA strip
poker on a 5.25" floppy disk. What'd I do? The revealing images of the digital
strippers were numbered 0.bmp up to 5.bmp. This was in DOS before we had
Windows. I renamed the files so their numbers were backwards: 5 to 0, 4 to 1, and
so on. Instant reverse-strip poker and a viscerally satisfied teen :)

------
lettergram
The music for our call waiting at my first job was an old Windows machine
blasting music in our server room with a phone on speaker... You ever wonder
why music on call waiting sounds so fuzzy?

~~~
chris_wot
Which I guess means you had to be super silent when doing physical hardware
maintenance?

~~~
Paperweight
"I just heard a bunch of swearing when I was on hold!"

------
andywood
Here's a late version of Encarta.

[https://goo.gl/6uX4Qu](https://goo.gl/6uX4Qu)

Do you see that plain-looking dropdown menu with the rounded orange
highlights? That is Internet Explorer. Just this one menu. It's an in-process
instance of Trident, IE's old HTML rendering engine. So that little window is
the equivalent of something like Chromium Embedded. I don't know why that menu
is an instance of IE's HTML renderer. Someone wanted to style it with CSS, I
think. So they embedded IE. That flyout to the right is probably another
Trident window. In order to meet accessibility requirements, I had to grab the
running instance of the root IE COM interface, and route keyboard events into
it. With raw C++ COM. There were other hooks going in the opposite direction
so the menu / browser window could tell the app about clicks.

~~~
birdiesanders
That is just insane. Separate rendering engines for each tab?

~~~
andywood
I can't remember how the flyouts worked. They might have shared a window, or
they might not have been HTML windows (but I think they were). What I know for
sure is that the one main dropdown was IE.

------
cagataygurturk
Year 2006, we had a very high-traffic website running with 1 MySQL server and
1 web server (PHP). Maybe the terms high availability or resilience were not
coined yet, which is why we were comfortable with having one server per
function. The web server had two ethernet cards, one with a private IP and one
with a publicly accessible IP. After a while, the platform started to crash and I
would be called by my loyal users before Pingdom alerts reached me, then I
would call the datacenter technicians to press the restart button of the web
server. Obviously it was a lengthy recovery process, with a lot of humans
involved.

After a while, I discovered that the issue was with the web server's ethernet
card that was attached to the internal network and used to connect to the MySQL
server. When that ethernet card stopped working, the platform would crash. On
the other hand, it was also possible to connect to MySQL using the other
ethernet card via the public IP. It would reduce the performance of the
platform, since all the bandwidth of that card (100 mbps!) was already eaten by
HTTP traffic, but at least it would keep it running.

I ended up writing a script on my home computer that checked whether the
platform was up. Once the faulty ethernet card failed, it would connect over
FTP, change the PHP configuration to use the other ethernet interface to
connect to the MySQL server, and send an e-mail to the datacenter technicians
to press the restart button.

This script successfully did its job for 3 months, until I eventually
replaced the faulty ethernet card and fixed the issue.

Isn't it "Invent and Simplify" like Jeff Bezos says?

------
rcarmo
The other day a former colleague pinged me with a screenshot from one of our
secondary RADIUS servers, asking if he could remove my former user account
from a bit of Perl code (we used Radiator).

That ‘if’ block exempted me, the CEO and the CMO from traffic limits (which at
the time would forcibly disconnect you) and make sure we had 24/7 access (I
had set it up during testing because they kept calling us to remove the
blocks, and one night I couldn’t log in either).

We found out during that exchange that another former colleague had left a
cron script downloading Dilbert and User Friendly comics that had filled up
the hard disk since 2008 (the machine had nearly 12 years of uptime).

~~~
Asooka
Hm, 10 years at 200KB/day (about average for a Dilbert strip) would come out
to 730000KB or 713MB. That seems rather quaint compared to the ~50GB git repo
we have :)
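
Checking that arithmetic, with the 200KB/day figure and a 365-day year (leap days ignored) as the only inputs:

```python
KB_PER_DAY = 200      # average strip size quoted above
DAYS = 10 * 365       # ten years, ignoring leap days

total_kb = KB_PER_DAY * DAYS
total_mb = total_kb / 1024

print(total_kb)          # 730000
print(round(total_mb))   # 713
```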

~~~
rcarmo
I didn't have access to the machine, but I gather the cron job grabbed more
stuff :)

------
orf
Finance needed to do end-of-year stuff a couple of days past end-of-year. The
system couldn't handle this, bad things would happen and data would change
once end-of-year passes.

Solution? A bash script that does:

    
    
       while true:
           set date to 4pm end-of-year
           sleep 1
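
The original was described as bash; the loop above could be sketched in Python roughly as follows. This assumes a GNU/Linux box where `date -s` sets the system clock (root required). The target timestamp in the usage note is a made-up placeholder, and the `run` and `sleep` parameters are illustrative hooks that exist only so the loop can be exercised without changing a real clock.

```python
import subprocess
import time


def pin_clock(target, iterations=None, run=subprocess.run, sleep=time.sleep):
    """Reset the system clock to `target` once per second.

    iterations=None loops forever, like the pseudocode above.
    Returns the number of times the clock was set.
    """
    count = 0
    while iterations is None or count < iterations:
        # GNU date: -s sets the system time from the given string.
        run(["date", "-s", target], check=True)
        sleep(1)
        count += 1
    return count
```

Something like `pin_clock("2000-12-31 16:00:00")` would hold the clock at 4pm; resetting every second is what defeats both NTP resyncs and the clock's own forward drift.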

~~~
nailer
Why the loop, vs setting the date once?

~~~
orf
NTP would resync it, and obviously after X hours it would no longer be end of
year even if you set the date in the past.

It needed to be end of year day for 2 or 3 days.

~~~
nailer
yeah I get that but `while(true)` seems excessive. Why not just do it once and
disable NTP?

~~~
orf
because time moves forwards, and if you set it to 4pm then in 1.5 hours it
will be 5.30pm, and the end-of-year stuff will kick in?

Three lines of bash seemed simpler. It's a hack, yes, and there are better
ways. But really who gives a damn.

------
Kubi
Big C codebase. To be more precise, they said it's C++, but as far as I could
see, it was C compiled with g++.

Some code read XML data. Instead of choosing one of the XML parsers available,
the author decided to write another one. Instead of using C++ features, atoi()
was used. For empty strings, atoi() got NULL and segfaulted. Signal 11 was
handled and suppressed in order to avoid crashes. Certainly, the code had
other segmentation faults too, which could not be discovered this way. :)

~~~
ComputerGuru
You mean, instead of fixing _just_ the atoi() crash, that developer fixed
_all_ crashes with his patch? Quite the clever bastard!

------
scandox
I installed 65 cash registers all over Ireland. Each one had only an RS232
serial port. I had to read and aggregate their daily reads between 5am and
10am (only time these outlets were not running).

It was not possible to read this particular cash register when it was in
operation mode OR if it was in OFF mode. Also we did not have access to GSM
sims and there was no WIFI at stores.

SO:

We installed 56K modems and plugged them into the regular PSTN lines.

BUT:

That would interfere with customers calling to order out.

SO:

We plugged them into analog plug timers and only had the modems switch on
between 5AM and 10AM.

AND:

The store owners kept switching Off the tills. So we had to disable the off
position for all the tills.

The backend was a VB6 app running a 56K modem that read each till in turn and
then processed all the results.

Ran for 11 years with not much bother.

~~~
johnflan
CBE?

~~~
scandox
Nah. EPOS wasn’t our main business - or even competence! Got dragged into it
because we could do the backend...

------
emilsedgh
Reminds me of this brilliant The Daily WTF submission. [0]

[0]
[https://thedailywtf.com/articles/ITAPPMONROBOT](https://thedailywtf.com/articles/ITAPPMONROBOT)

~~~
chris_wot
Even better:

[https://thedailywtf.com/articles/Open-Sesame](https://thedailywtf.com/articles/Open-Sesame)

------
ss248
5F3759DF a.k.a. fast inverse square root [1] in graphics programming.

[1] -
[https://en.wikipedia.org/wiki/Fast_inverse_square_root](https://en.wikipedia.org/wiki/Fast_inverse_square_root)

~~~
lgregg
I'd love to know who left those comments for Quake III Arena in your
referenced Wikipedia article. I had a good laugh.

~~~
ss248
>I'd love to know who left those comments

The legend himself, John Carmack.

Fast inverse square root is really the perfect example of black magic in
programming.

~~~
lscharen
I always thought of this bit of code as a great example of applied numerical
methods techniques, rather than “Black Magic”. The magic constant is derivable
from standard methods and one can even choose to optimize other measures of
error.

[http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf](http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf)
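
For readers who have not seen it, the trick can be sketched in Python. This is a transliteration of the idea rather than the Quake III source: `struct` is used to reinterpret the float's bits, and the Newton step runs in double precision here, so low-order digits will differ from the C original.

```python
import struct


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the 0x5F3759DF trick."""
    # Reinterpret the bits of the 32-bit float as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern yields a rough first guess.
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration refines the guess.
    return y * (1.5 - 0.5 * x * y * y)
```

With a single Newton step the result should land within roughly 0.2% of the true value, which is the "numerical methods" reading of the constant: a good initial guess followed by a standard refinement.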

~~~
ss248
Isn't it what black magic is all about?

Unorthodox technique, that you can explain if you try hard enough (in a sense,
everything that reliably works could be explained and someone had to come up
with it in the first place), used by people who don't really understand it.

What do you think is a better example?

~~~
lscharen
I would consider “black magic” to be something that works reliably due to some
specific and idiosyncratic property of the environment that it operates
within. Basically, something that is exceptionally tightly coupled. I think
the novel FPGA solutions that genetic algorithms can create fall into this
category; they often didn’t work on different boards, or even when the same
board was plugged into a different power supply because the solution was
overfit.

“A Story About Magic” is black magic in action.
[http://catb.org/jargon/html/magic-story.html](http://catb.org/jargon/html/magic-story.html)

“The Story of Mel” is not black magic even though no one else understood his
program. [http://www.catb.org/jargon/html/story-of-mel.html](http://www.catb.org/jargon/html/story-of-mel.html)

------
gwbas1c
I work on a .NET application that runs multiple download HTTP requests at the
same time. We recently added support for client-side certificates to
authenticate to a customer controlled server.

When Windows is configured in a very high security manner, the user needs to
manually give our application permission to use the certificate once during
the lifetime of our process.

We hit a bug in .NET where, if we start multiple HTTP requests at the same
time that use the same certificate, and the user needs to approve our use of
the certificate, the user will get multiple request dialogues.

The fix is a very convoluted lock statement, because if the user says no, the
other HTTP requests that would be started at the same time need to be aborted.

What makes the lock statement more complicated is that we essentially need to
lock right before the HTTP request starts, but then unlock when we are reading
the stream. This means that the first time we use a client-side certificate,
we have to disable multi-threading until we know that the client-side
certificate is approved by the user.
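
The general shape of that lock, sketched in Python rather than .NET and heavily simplified: `FirstUseGate`, `ask_user`, and `request` are hypothetical names, and `ask_user` stands in for "start the first request and wait until the approval outcome is known", which in the real application is tangled up with the HTTP call itself.

```python
import threading


class FirstUseGate:
    """Serialise the first use of a certificate until the approval outcome is known."""

    def __init__(self):
        self._lock = threading.Lock()
        self._approved = None  # None means the user has not been asked yet

    def run(self, ask_user, request):
        # Only one thread may trigger the prompt; the rest block here until it resolves.
        with self._lock:
            if self._approved is None:
                self._approved = ask_user()
            approved = self._approved
        if not approved:
            # The user said no: every queued request is aborted.
            raise PermissionError("certificate use was denied")
        # After approval, requests run concurrently again.
        return request()
```

The lock is held only while the outcome is unknown, so the cost is a brief loss of parallelism on first use instead of one dialog per concurrent request.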

------
time0ut
An external vendor delivered a new static marketing site written in PHP. Info
sec team wouldn't let us install mod_php on the publicly facing servers and
the vendor needed more time/money than the budget and timeline allowed to
change it. A coworker stood up a local server and wrote a script to
periodically crawl it and push changes out to the publicly facing apache
servers. It might still be there for all I know.

~~~
nikisweeting
That's actually not so bad imo, it's fairly common to build a site in WordPress
just for the CMS, then convert it to static HTML periodically and rehost it
straight from CDN edge servers.

~~~
time0ut
Definitely not as crazy as some of the other posts here, but it felt like a
hacky work around for something that probably wasn't really a problem.

------
n8rb_
I used to support a warehouse management system (RedPrairie) that our company
had customized per business rules. At some point, a bug was introduced which
locked up a very important table and brought multiple warehouses to a stand-
still. The decision makers weren't interested in fixing the bug, so after
months of waking up at 2am to kill these locks, my coworker and I wrote a
script which monitored for locks on this table from SQL with a certain pattern
and killed any lock that persisted for longer than N seconds, then sent an
email to anyone and everyone. This really messed with the integrity of the
data in the system, but the decision makers loved the decrease in downtime and
it stayed in place for a year before the bug was finally fixed.
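
The selection logic of such a lock killer might look like the sketch below. Everything here is hypothetical: the `Lock` record, its field names, and the substring match are assumptions for illustration, and the actual query against the database's lock views (and the kill itself) is left out.

```python
from dataclasses import dataclass


@dataclass
class Lock:
    session_id: int
    table: str
    sql: str
    held_seconds: float


def locks_to_kill(locks, table, sql_pattern, max_seconds):
    """Pick locks on `table` whose SQL contains `sql_pattern` and that outlived the limit."""
    return [
        lock for lock in locks
        if lock.table == table
        and sql_pattern in lock.sql
        and lock.held_seconds > max_seconds
    ]
```

Matching on both table and SQL pattern keeps the blast radius small, although, as noted above, killing a session mid-transaction can still hurt data integrity.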

~~~
kup0
Wow that name brings back memories. Worked for a company years ago that moved
to RedPrairie from an older system and was not happy about the "upgrade". This
was around the same time they changed their interface for customer service to
enter orders and customer information (to BlueMartini- same company that owns
RP maybe?), which was also painful (old-IE-only, bug-ridden, etc)

------
charred_elf
Mid-90s, a very large cell phone company at the time, working on a car phone
(the head units had their own microcontrollers).... Discovered that the
interrupt service routine was calling the reset vector instead of returning
from the keypad interrupt. The original author apparently didn't understand
how it was supposed to work, and a simple RTS would eventually overflow the
stack....so on every keypad interrupt the whole code flow started over from
reset. Worked surprisingly well.

------
himom
Circa 2006, I saw some crazy ass undetectable worm Windows shit on a
production oracle database server that was insanely connected directly to the
internet for a couple of years. Winternals, Symantec and Microsoft folks
couldn’t find a forensic smoking gun for what was there, but it fit the profile
of an advanced persistent threat (APT) that was so aggressive it was
NIDS detectable but neither HIDS nor clean-boot HIDS detectable. The only
solution would’ve been to reimage the machine, but of course that wasn’t
allowed, so it was just locked down brutally and left to do whatever.

------
plantain
I had a piece of legacy, proprietary software that communicated to another
machine by wvdial over an SSH TTY (?!). It inexplicably stopped working when
migrated to a newer machine, but.... worked just fine while running under
strace. It appeared to be some kind of timing/race condition brought on by the
new machine being faster to bring up a connection. strace slowed it down JUST
enough to work.

So naturally, it now runs strace >/dev/null in production, probably to this
day.

------
koonsolo
CTO had to make a deadline for the next morning, so he decided to override the
safety system of an Autonomous Guided Vehicle of a few tons. Just bridged over
it electrically.

Not only this, but he failed to inform the onsite crew that was going to put
it in production.

AGV went somewhere it wasn't supposed to go, first employee pushed the red
safety stop. AGV kept on riding, and thank god there was another person at the
electronics panel to make it stop.

I was outraged about this, said that someone could have died, and that could
mean jail time for the CTO. They said I was overreacting, and continued on
their latest project that involved an AGV carrying people in an amusement
park. I quit after that.

~~~
theyinwhy
Now I am curious! What deadline needs you to override the safety system of an
Autonomous Guided Vehicle?

~~~
koonsolo
Cashflow. Customer paid a part on the "delivery" milestone. This means the
thing had to drive a bit on-site.

Problem was that it didn't want to move because of wrong interfacing of the
safety system. So the solution was quick.

------
sokoloff
Gas boiler in an old building with a safety mechanism that would trip and
latch/lockout sometimes in high wind conditions, requiring a manual power
cycle to restore heat.

Solution: Arduino and a relay in-line to cut the power for 1 minute every 120
minutes.

Planned to add a DS1820 to only cut power when required, but never got around
to it.

------
roywiggins
A website that was initially written in VB6 on the back end (I know) had
transitioned to C# for new development. So new functionality would be written
in C#, but all the old stuff had to still work. On a page built with the new
framework, the bulk of the page was going to be C#, but the "chrome" around
it- the header, footer, page nav- needed to be generated with the old VB code.
It turned out to be nearly impossible to begin in VB6 and call out to C# to
generate the page contents, so the solution was to

1. Generate the page chrome
2. Post that HTML into the database
3. Call the new framework
4. Pull the page chrome out of the database
5. Render the page

The worst piece of the VB6 code I ever had the misfortune of working on was an
implementation of a questionnaire, which would guide you through several pages
of questions and keep track of what the state of the questionnaire was. A
particular sequence of actions could crash it consistently, but the code was
so incredibly stateful that it was impossible to work out what was going on.
To add insult to injury, it didn't happen when debugging, and the VB6 debugger
left a lot to be desired anyway.

------
Cerium
After more than a year at a small company the lead developer decided to take
some vacation. He revealed to me that at one point an important customer
requested a daily summary feature for their account. I was horrified to find
out that since there were some tricky bits he didn't actually implement the
feature. Instead he signed on every night right at 9pm and hyped up the
summary. I refused to take over and helped implement the actual automatic
summary feature.

~~~
PaulStatezny
> I refused to take over and helped implement the actual automatic summary
> feature.

What do you mean you refused to take over? You refused to continue “hyping”
the undeveloped feature?

~~~
Cerium
"Hyping" was an auto correct typo, I didn't notice until it was too late. I
meant to type "typing". My coworker was typing up a fake automated email every
night as close to 9pm as possible. I didn't want to do that, since it is
invasive to my schedule and would take place on my own time. Instead, we
solved the problems and implemented the actual feature.

------
odammit
Oh boy.

I had just joined a “subscription based social site” as a backend developer. I
think it was PHP 4 and MySQL 3 at the time.

Someone had realized that it would be great to have database “migrations” in
code instead of using phpMyAdmin to modify tables.

Thumbs up, forward thinker!

So there was an ONE BIG migrations.php file that had our DDL in it with _if_
statements around each one to see if it had already been run.

The statements didn’t have a version number or anything super simple and
trivial. It was a combination of all the worst ways you could figure out if a
column had been added to a table in MySQL.

When code was deployed, someone would visit
[http://example.com/migrations.php](http://example.com/migrations.php) real
quick to migrate the site.

Oh man, classic.
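
For contrast, the "version number" approach mentioned above can be very small. This sketch uses SQLite and its `PRAGMA user_version` as a stand-in for the MySQL setup in the story; the table and column names are invented for illustration.

```python
import sqlite3

# Ordered list of DDL statements; the list index + 1 is the schema version.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply any migrations newer than the stored version; return the final version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, ddl in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(ddl)
        # Record progress so a rerun skips everything already applied.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

A single integer replaces every per-statement "has this column been added yet?" check, and running `migrate` twice is a no-op.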

------
wink
Not claiming "weirdest" by reading the other comments - but this is one off
the top of my head.

Application A is multi-threaded, huge, hardly maintained. It also constantly
crashes. Would've probably taken a few ace C++ programmers quite a while to
iron them all out.

Solution? Write a custom watch-dog application that restarts it once it
crashed. (This was on Windows, and a while ago. Being a *nix guy myself, I've
no clue if something -available- like daemontools could've been used back
then).
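
A bare-bones version of such a watchdog, sketched in Python for illustration (the original ran on Windows and its implementation is not described). `supervise` and its parameters are hypothetical names; the restart cap exists so the sketch terminates.

```python
import subprocess
import sys
import time

# Example command that always "crashes" (exits non-zero), for demonstration only.
CRASHING_CMD = [sys.executable, "-c", "raise SystemExit(1)"]


def supervise(cmd, max_restarts=3, delay_s=0.0):
    """Run cmd and restart it whenever it exits non-zero.

    Stops after a clean exit or after max_restarts restarts.
    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay_s)
```

For example, `supervise(CRASHING_CMD, max_restarts=2)` launches the process three times in total before giving up.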

Let's say I didn't know whether to facepalm or stand in awe.

~~~
ashleyn
Sounds ridiculous until you realise they probably weren't given the proper
time to ever fix it. What's better at that point? Restart it until it doesn't
crash at a significant performance penalty, or just leave it broken?

------
norbertnorris
my dad put a rubberband on an ibm mainframe printer in 75 to make it work
while ibm showed up to fix it. They came in and didn't get it to work. So my
dad as he needed to print millions of water bills on time put the rubberband
back in and ran his job.

~~~
Dr_Jefyll
Glad to see confirmation of using a rubber band to fix an IBM mainframe. Been
there done that, but the story is so loony I wondered if anyone would believe
me.
[http://laughtonelectronics.com/oldsite/comm_mfg/commercial_i...](http://laughtonelectronics.com/oldsite/comm_mfg/commercial_ibm1130.html)

~~~
eitland
Love that you've updated your page with a reference to this discussion already
:-)

~~~
Dr_Jefyll
I was delighted at being able to cite an independent source saying such a
thing was possible!

In the 2016 HN thread "Strange bug workarounds" I posted a much gnarlier
problem (and oddball solution):
[https://news.ycombinator.com/item?id=12485921](https://news.ycombinator.com/item?id=12485921)

~~~
eitland
Read it yesterday as I was reading your previous posts :-)

(I often check the posting history when someone posts something interesting.)

------
dzhiurgis
Email validation by inserting a record into database and rolling back from
savepoint if succeeded - Salesforce does not publish their official code/regex
and architects pushed it to match it perfectly with the platform. Project was
done in 2017 btw.
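
The pattern itself (let the database be the validator, then undo the write) can be sketched with SQLite. This is not Salesforce code: the `contact` table and its `CHECK` constraint are invented stand-ins for the platform's unpublished validation rule.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # isolation_level=None: no implicit transactions, savepoints are managed by hand.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE contact (email TEXT NOT NULL CHECK (email LIKE '_%@_%._%'))"
    )
    return conn


def is_valid_email(conn: sqlite3.Connection, email: str) -> bool:
    """Try the insert inside a savepoint, then roll it back whatever the outcome."""
    conn.execute("SAVEPOINT validate_email")
    try:
        conn.execute("INSERT INTO contact (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The database rejected the value, so it is invalid by definition.
        return False
    finally:
        # Undo the trial insert and discard the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT validate_email")
        conn.execute("RELEASE SAVEPOINT validate_email")
```

The appeal is that the check can never drift from what the platform actually accepts; the price is a write attempt per validation.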

------
jonstaab
Just recently I had to put together a web-based UI for editing vector graphics
(clothing tags) and export them to pdf for printing. In order to avoid writing
and maintaining two separate codebases, I re-used the client side code as my
rendering code.

So I ended up with a server serving the editor frontend, as well as api
endpoints that would use puppeteer and headless chrome to make a request (to
itself) to load the frontend, import a given label file, render it, take a
screenshot and save it to a file, then reply to the api request with the
contents. So kind of a recursive api.

I had so much fun with this but I still can't believe I had to write my own
svg editor and bake my own pdf generation and merging code to get the job
done. It should not require two languages (node and python) and a headless
browser to get the job done.

~~~
mlevental
wut. why did you do this? why didn't you just handle all of it in the
frontend? jspdf makes PDFs in browser.

~~~
jonstaab
I would have loved to, but one of the sticking points was that the label file
had to be serialized, saved, and later interpolated with data from a bunch of
different records, which is why I used svg over canvas, wrote my own editor,
etc. jsPDF seems great for imperatively generating a pdf, but not for
editing/loading template files. Plus, since the pdf generation had to happen
server side, I would have needed a headless browser anyway.

It's super weird, and I've been over it from every direction but I don't know
how I could have done it differently.

~~~
mlevental
>Plus, since the pdf generation had to happen server side

this is exactly my question: why did the pdf generation need to be server
side?

~~~
jonstaab
1. Client creates a custom label and we store it in s3
2. They select x inventory items + click print
3. We retrieve and send the label file + inventory data to a service that
interpolates the data into each template, renders to pdf, merges them, and
forwards it to our printing service.

We certainly could do the work on the client, but they're making this request
in a context where neither the label nor the libraries need to be loaded; in
fact, in our server-side implementation, it could be done with a single api
call. Doing it all client side would strongly couple the client to the
technology.

Also, we re-used this api in two applications, one of which never loaded the
editor (along with its dependencies).

TLDR; a weird hack was better than violating separation of concerns.

~~~
mlevental
>forwards it to our printing service.

is the printing service a real physical service?

anyway i'm doing basically exactly this same thing except all client side (i
send the serialized "label file" and data to the client) but printing is being
done using the user's printer

~~~
jonstaab
Yep, the printing service is a third-party api (PrintNode) that routes the
request to one of any number of desktop clients.

The reason we did it this way is 1. we have a web app so we can't print
directly to their printer without a print dialog, and 2. we want to print to
potentially a different device than the user is on.

------
Smushman
I worked at an international oil and gas company that put a high price on
security in the early 2000's.

I was hired because the regular firewall/security sysadmins resisted
installing tooling the director wanted that would allow them to be effectively
'monitored' doing their work on the firewalls distributed worldwide. In
particular the director wanted to use Tripwire to alert when files were
changed on the firewalls. He had tried to push this through 3 times before and
each time it was rebuffed/scrapped one way or another.

As I went through the testing phase I took careful note of the security issues
I found. When all was done I had 2 big holes I could not easily close. The
first one was that from the management server (a simple Java app) you could
click file/open and using the explorer window you could 'run' explorer.exe
thus opening the Windows shell/GUI (as well as run command.com, notepad.exe).
I closed all these with file permissions and other settings.

The final one was much harder though. You could, using the same file/open
explorer window, open the log files of the GUI (with a notepad like
functionality), and alter and save the logs again without notification (non-
repudiation violation). The user account had to be able to write to that log
for the entries to be created.

My solution was to create a long-running script in the background that would
cat and empty all of the log entries every 2 seconds to another file location
further limited by tight permissions only to the script account.
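
One pass of such a drain might look like this in Python (the original script's language is not stated). `drain_log` and the paths are hypothetical; the real script would call something like this in a loop with a 2 second sleep, running under the restricted script account.

```python
def drain_log(src_path: str, dst_path: str) -> int:
    """Append the contents of src to dst, then empty src. Returns bytes moved."""
    with open(src_path, "r+b") as src:
        data = src.read()
        if data:
            with open(dst_path, "ab") as dst:
                dst.write(data)
            # Truncate the source only after the copy has been written.
            src.seek(0)
            src.truncate()
    return len(data)
```

Anything written to the source between the read and the truncate is lost in this sketch, and an edit made within the 2 second window would still be copied as-is, so this narrows the tampering window but does not close it.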

I deployed this in production for over a year and to my surprise it never
stopped running (or more likely I thought, overflow and/or lock up the
system).

My worst kludge...

------
twodave
The entire billing system at my first job was written as a console exe by a
(brilliant) guy during a one-day session, working almost completely from live
debugging breakpoints (against the production database). The result was less
than beautiful, but there were rarely any problems with it. Needless to say
anytime a billing inquiry came up it was his problem (which he was okay with
since he owned the place).

~~~
contingencies
Haha, literally writing a multicurrency accounting system from scratch at the
moment. Same deal: my company, my rules. Trying to be a bit forward-looking
though[0], and already tonnes of features you can't buy. Live third party
market platform scraping, machine translation, github integration, beginnings
of live mainland Chinese bank API integration, etc.

[0] [https://github.com/globalcitizen/ifex-protocol/](https://github.com/globalcitizen/ifex-protocol/)

------
andy_ppp
A function called dig(key) that iterated over a map to find the value. It even
had a comment above it that complained about the slowness of the function, and
that a C version was available to improve the performance. Facepalm.
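
The original language is not given, but the anti-pattern translates directly to Python. `dig` below mirrors the linear scan described; `lookup` is the one-liner it could have been.

```python
def dig(mapping: dict, key):
    """Linear scan over every entry: O(n) per lookup."""
    for k, v in mapping.items():
        if k == key:
            return v
    return None


def lookup(mapping: dict, key):
    """Direct hash lookup: O(1) on average, same result."""
    return mapping.get(key)
```

Both return the same value for any key; only the second one uses the map as a map.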

~~~
ashleyn
Couldn't they just....use the map as a map? They probably didn't know how to
use it.
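
To illustrate the difference (the original language isn't stated, so this is just a Python sketch with made-up names and data):

```python
def dig(mapping, key):
    """Linear scan over every entry, roughly the shape described above."""
    for k, v in mapping.items():
        if k == key:
            return v
    return None


def lookup(mapping, key):
    """Direct lookup: let the map do its own hashing."""
    return mapping.get(key)


# Hypothetical sample data.
prices = {"apple": 3, "pear": 5}
```

Both return the same value, but `dig` touches every entry on a miss while `lookup` is a single hash probe.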

------
natoliniak
running a 3rd party program called PTFB (press the fckn button) as a
background process on a farm of 20 Prod web servers to automatically press
"OK" on a dialog window that would pop up as a result of an unhandled exception
in one of our hacky conversion processes that the company didn't allow us to
fix.

------
ocdtrekkie
The one I removed a ways back that really irked me was a poor man's backup
circuit: Equipment room had two electrical circuits. A consumer grade desk UPS
was plugged into one, and a homemade male to male power cord went from the
output of the UPS to a wall outlet for the other circuit. So if the power went
out, the UPS started feeding power into the other circuit.

------
shakna
Memory management under PL/I.

It was a highly optimised sorting algorithm for client names, and used
everywhere. It occasionally dropped a name, and I was asked to fix it.

PL/I allows you to free part of allocated memory. In this case, two names
would be pulled out of the array, that part of the array would be freed, and
turned into a new array instance. Pair got sorted and reinserted. Repeat until
array becomes array of single arrays, then bitshift edge of array allocations
to recreate array in place.

------
protomyth
Had a server that was overheating badly and the numerous fans just weren’t
working. Needed to keep it running to transfer all the data to a newer machine.
Went to Walmart and bought flexible dryer ducting and duct tape. Vented the AC
directly into the case. Got everything transferred without having to pull the
backup (which is slow). Never did figure out the fan thing but something must
have gone wrong on the motherboard because all the fans were good.

~~~
lgregg
I had a similar set up for a computer I overclocked to play games as a kid.

------
jaggederest
Production system that would chain-ssh through various bastion hosts to get
around asinine firewall systems, eventually _telnetting_ in _as root_ to a
non-standard port to run a script and dump data out of the files on disk of a
MySQL instance backing a virtualization system. Billed based on how many of
the output files that instance appeared in ("hourly billing!"), and who was
the "owner" of that instance.

------
cs702
A long time ago (early 1990's, IIRC), I came across a Novell Netware LAN in
which none of the Mac OS clients could print to a then-very-costly laser
printer shared on the network if a particular Windows 3.0 client elsewhere on
the LAN was powered off.

It turns out that when the costly printer was purchased, there wasn't a Mac OS
driver available for it, so the people who installed the LAN created their own
homemade print queue management system, relying on a batch file running
permanently in that one Windows client. The batch file would take .eps files,
saved on a shared folder by the Mac clients, and send them to the network
printer using Windows printer drivers. Whenever that Windows _client_ froze,
was reset, or was powered off, printing to that one printer would stop working
for all the Macs at the same time.

~~~
chris_wot
That’s how most RIPs work.

~~~
cs702
The batch file was running on a Windows _client_ -- i.e., on someone else's
desktop machine, and the user of that machine was apparently unaware.

------
geforce
Once we had a problem with Microsoft's LDAP library not handling referrals
correctly on a Big Corp AD forest with domains on each site. Real headache as
we were already late for meeting our deadline...

Backstory: we produced custom software that used Windows Embedded's LDAP
library to handle the LDAP part (Winldap32 library with the Winldap.h
headers). The machine running our software didn't join the domain, so it only
authenticated the users with the ldap_bind function.

If I recall correctly, we found the ldap32 library referral problem when we
used AdInsight (by Mark Russinovich) and saw the library was poking all around
the place (the other forest DCs) and never completed any of the requests. I
think we confirmed with Wireshark.

The hack was in 2 parts:

1) We made a DLL that offered the same ldap_* functions as those we used in
our software. The library then redirected the LDAP calls to a python script
that used a native ASN1/LDAP implementation which relied on nothing but pure
python code.

2) Then we made injection software that injected the DLL into our software at
startup and replaced the Winldap32 functions with our DLL functions.

We then were able to bypass the MSFT LDAP library problem, and I think we came
pretty close to our initial deadline in the end. Apart from the (very small)
added latency on LDAP code, everything was fine in the end.

------
kennu
A company I worked for in the early 2000's produced a reality tv series which
was shot on an island about 1.5km out from the city shore. It needed a decent
(but rather temporary) Internet connection and our company headquarters
happened to be nearby, so my colleagues set up a directional WiFi antenna
pointed from the building's rooftop to the island. The total distance of the
line-of-sight WiFi connection was about 2.5km and apparently it worked fine
when the weather was okay.

~~~
theyinwhy
Now you need to match your experience with the post above about the van
stopping in the link's line of sight and tell us what WiFi you had and what
the bandwidth was.

We need to solve the riddle if it was wifi or laser link.

~~~
kennu
Really can't remember many details, but I suppose at the time all WiFi
equipment in Europe was based on 802.11b (max 11 Mbit/s).

~~~
PaulStatezny
2.5km seems ridiculously far for WiFi to be effective.

Does 802.11b support this magnitude of distance?

------
api
Qemu emulating Ultrix at AWS to run software from the early 90s for dialing
out to weather stations. No source code, so could not be ported. Modems were
local on site and accessed via TCP tunnels over an OpenVPN tunnel from AWS.

------
Tade0
Not _the_ weirdest, but weird nonetheless:
[https://github.com/ckeditor/ckeditor-
dev/blob/major/core/too...](https://github.com/ckeditor/ckeditor-
dev/blob/major/core/tools.js#L902)

Interestingly enough as recently as in 2016 none of the browsers' devtools
were prepared for such a usage and would hang if you set a breakpoint inside
any of the "tried" functions.

------
csours
Here's the weirdest hack that no one thought of as a weird hack:

I worked in a vehicle assembly plant from the mid to late 2000s, and our real-
time monitoring was performed with a commercial off the shelf system called
Cimplicity. Cimplicity was an excellent systems integrator... for the 1990s.

Cimplicity's main problem was that it wasn't enterprise scale; updating the
content of screen objects was laborious.

Someone's solution was to add a database call to every object on the screen -
some screens had about 100 objects.

So on each and every screen startup, on each and every computer, about 100
database calls were made.

\---

I guess that wasn't too hacky, but when absolutely everything is half hacked
together like that, it's hard to tell what is a hack and what is normal.

------
stephengillie
Using a Remote Access Object to proxy calls between 32-bit and 64-bit code.
Instead of rewriting any of the original ASP code, the lead architect made a
DLL that would provide .NET 2.0 functionality to the ASP code. This was in
2014. The DLL was 32-bit and predictably ran out of memory every 30 hours. We
created a scheduled task to restart the service every 24 hours, what a band-
aid.

~~~
mgamache
Tons of asp (and asp.net) sites are restarted on a regular basis to prevent
crashes. The source of most of these problems is never known.

~~~
ComputerGuru
Actually, IIS automatically restarts the worker process once a day by default;
meaning most developers are unaware of memory/resource leaks that don't end up
causing a crash within 24 hours. Got called in to look at quite a few "random"
issues that turned out to be resource leaks that only showed when the site was
under enough stress for it to hit the brick wall within the 24 hour recycle
period.

~~~
mikehotel
The default IIS app pool elapsed time based recycle interval is 1740min, or
29hrs. 29 being the smallest prime over 24. Seriously.

~~~
ComputerGuru
Yup, and for good reason. It means your site won’t be down at the same time
two days in a row.
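
A quick Python check of that arithmetic, as a sketch only. The 1740-minute figure is the one quoted above; the midnight start and the helper name are assumptions for the example:

```python
RECYCLE_MINUTES = 1740       # interval quoted above (29 hours)
MINUTES_PER_DAY = 24 * 60


def recycle_times(count, start_minute=0):
    """Clock time (HH:MM) of the first `count` recycles.

    start_minute is the minute of day the pool first started
    (assumed to be midnight here).
    """
    times = []
    for n in range(1, count + 1):
        minute_of_day = (start_minute + n * RECYCLE_MINUTES) % MINUTES_PER_DAY
        times.append(f"{minute_of_day // 60:02d}:{minute_of_day % 60:02d}")
    return times


print(recycle_times(5))  # ['05:00', '10:00', '15:00', '20:00', '01:00']
```

Because 29 and 24 share no common factor, the recycle lands on every hour of the day once before the pattern repeats after 24 recycles.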

------
harunurhan
Maybe not the weirdest but;

I was building the client library (wrapper) of an old external third party
service we were running, for our new system (re-write).

By the way, this old third party supposedly has a REST interface which was
added later, when REST was getting popular. So I needed to get all ABCs but
couldn't figure out how because there wasn't any `GET /abc` for ABC, only
`GET /abc/<id>` in the docs. So I just checked how it was done in legacy code.

Then I found a for loop 0 to 100 which was doing 100 `GET /abc/<index>`. It
was working because ids were incremental and there were fewer than 50 ABC
records. Unfortunately there was no caching or `break;` after `404`. :(
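
For what it's worth, the missing `break` might look like this in a Python sketch. `fetch` is a hypothetical callable standing in for the real HTTP call, and stopping at the first `404` assumes ids start at 0 with no gaps, as the legacy loop implied:

```python
def fetch_all(fetch, max_id=100):
    """Collect records for ids 0..max_id-1, stopping at the first 404.

    `fetch` is a hypothetical callable: fetch(record_id) -> (status, body).
    """
    records = []
    for record_id in range(max_id):
        status, body = fetch(record_id)
        if status == 404:
            break  # assumes contiguous ids, so nothing exists past this point
        records.append(body)
    return records
```

With fewer than 50 records this would roughly halve the number of requests per call, though it silently truncates the result if an id is ever deleted from the middle.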

------
chris_wot
I did some work for a major Australian Federal government department. They had
a very secure environment and couldn’t (well, wouldn’t) allow webservices or
anything else of that nature between themselves and other government
departments. However, they did allow for email.

Luckily the ITSM system I had took incoming and outgoing email. They had a
stored procedure “hook” for email. I reimplemented the SLA functionality by
inserting the relevant data into the database, calculated the SLAs via the
procedure.

~~~
krallja
Email: the original asynchronous work queue!

~~~
davb
And it’s comparatively one of the simplest and most reliable transports for
business data exchange.

In my first professional job, we used PKI-encrypted email as a transport for
ebXML messages (order, despatch notifications, etc) between our customers' ERP
system and their suppliers. And in my most recent role, we received retail
transaction data and returned analytics reports via email.

The great thing about protocols like this is that virtually all enterprise
systems support them in one form or another. And it’s always been easier to
get clients to set up exchanging data via automated emails vs SOAP or REST
APIs.

------
werber
I had a fairly large client that literally had an entire hacked together "CMS"
they refused to move away from because it was a memorial to the late lead Dev

------
INTPenis
This was probably not considered a hack when it was implemented but at some
point in the early 2000s someone deployed two binary programs on an old Compaq
rack server running RedHat Linux of some version.

I think they're called lpr and lpd. But I've tried to find their source and
it's not in any major printer packages. By running strings and hexdump on them
I think my co-worker finally managed to trace it to a package of software made
for a firewall. And embedded in there were these two programs for printouts.

So that became well used in a major government branch that will go unnamed.

Fast forward to 2013 and it's my job to upgrade it. Replace its hardware and
its OS.

Luckily the two binaries could be run from RHEL 6 but it was all on faith. I
had no way to replace them or upgrade them because I had no idea what they
did, what protocol they spoke.

So they're running to this day. I'm sure someone here who is more versed in
printing might tell me that they implement some standard protocol that can be
replaced by CUPS or something well known. I'd be thankful but tbh they work so
they won't be replaced in my time with these systems.

~~~
krallja
[https://en.wikipedia.org/wiki/Line_Printer_Daemon_protocol](https://en.wikipedia.org/wiki/Line_Printer_Daemon_protocol)

------
drinchev
Not an actual bug, but I found this a couple of days ago. And it seems like an
elegant hack, allowing `/` in directory, files, etc.

Typing `mkdir foo:bar` in my macOS's terminal leads to a directory, that
Finder shows as `foo/bar`.

~~~
acous
Classic Mac OS (<10) used : as a directory separator, so it's probably related
to that.

------
GlenTheMachine
This wasn't production, exactly, but it was - in my own humble opinion - an
awesome hack, so I'm going to post it.

Was the lead software _and_ computer engineer on a new robot. We had decided
to use compact PCI, which was at that time (late 1990s) a brand new form
factor, so a lot of specialized cards weren't available for it yet. But that
was OK, because the manufacturers were selling bridge cards that adapted
smaller form factor cards to the compact PCI standard.

One of the cards we needed was a motor driver card. The particular card that
was available to us used an LM629, a very old, widely used, digital servo
controller chip. Now, whichever guy or girl designed the LM629 was an anal
bastard. The chip had memory-mapped read-only and write-only registers, which
was fine. But if you ever read or wrote anything out of order, or tried to
read a write-only register or write to a read-only register, the chip would
drop into an error state. So the device driver had to be right on the nose,
the chip wasn't going to cut you any slack.

Because the cPCI standard was so new, I ended up writing the device drivers
myself. Which was OK, I had just graduated with a CS degree, knew metal-level
C pretty well, things should have been fine.

Except I couldn't get this motor driver card to work. Every time I tried to
command the motor, the chip would generate an error. I stripped the code down
more and more and more, to the point that I was only sending a single 8 bit
write and then a single 8 bit read, and still the chip generated an error.

After two weeks of banging my head against it I was at the end of my rope. The
manufacturer of the LM629 card took pity on us and let us bundle the robot up
and bring it to their site, which was in Minneapolis. They gave us a small
lab, where we proceeded to bang our heads for another three days with no
progress. Eventually he took more pity on us and assigned us his digital logic
expert for the afternoon.

Dude rolled in the most badass digital logic analyzer setup I had ever seen at
that point. The thing took up a full-sized equipment rack. He hooked up to our
board and asked me to issue an 8-bit write. That looked fine. Then an 8-bit
read. He raised his eyebrows at that one - asked me if I was sure my code was
correct, as he had seen a 16 bit read. The top byte of which was mapped to
chip memory space we weren't supposed to be tickling.

I told him I was 100% positive that the code was not asking for a 16 bit read.

Eventually we ended up on the phone with Intel, which made the bridge card
chip. They told us that their chip had a known bug; it couldn't translate an 8
bit read on the cPCI bus to an 8 bit read on the daughter card bus. Instead,
it issued a 16 bit read and threw the most significant byte away. "This is
documented behavior," they said. And sure enough it was, in a footnote in 10
point font on page 52 of the manual.

That left us in a quandary, since there didn't seem to be any way to fix
things. Then we had a brainstorm. The fix was to cut the address lines on the
cPCI side of the daughter card, shift them all one line to the left, and tie the
least significant address pin of the bridge chip to the most significant
address pin of the LM629 address decoder logic. That way any attempt to access
odd addresses got mapped into nullspace, memory space somewhere way above what
the chip actually had, and the decoder logic just rejected the access request.
The chip would never see it. Then we rewrote the code to make the new
addresses line up with the chip's newly re-jiggered address space.
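
A small Python model of the rewiring as I understand the description. The 8-bit decoder width and the function names are assumptions for illustration, not details of the real board:

```python
ADDR_BITS = 8                    # hypothetical decoder address width
TOP_BIT = 1 << (ADDR_BITS - 1)   # most significant decoder address pin


def decoder_address(bus_addr):
    """Address the decoder sees for a given bus address after the rewiring.

    Bus line 0 drives the decoder's top pin; every other bus line
    drives the decoder pin one position lower.
    """
    lsb = bus_addr & 1
    shifted = (bus_addr >> 1) & (TOP_BIT - 1)
    return (TOP_BIT if lsb else 0) | shifted


def bus_address(chip_register):
    """Bus address software must now use to reach a chip register."""
    return chip_register << 1


for reg in range(4):
    even = bus_address(reg)
    # The wanted byte reaches the register; the extra byte of the
    # forced 16-bit read lands at or above TOP_BIT, outside the chip.
    print(reg, decoder_address(even), hex(decoder_address(even + 1)))
```

Every register now sits on an even bus address, so the unwanted odd half of each 16-bit access decodes to an address the chip never sees.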

Worked like a charm. We treated ourselves to a high-end steak house that night
and flew home. As far as I know the robot worked in that fashion for close to
a decade, until they retired it.

------
vermaden
I know an ISP in Poland that used/uses 169.254.0.0/16 for their whole client
network. This way anytime the client does not get an IP address from DHCP (for
example because the DHCP server is down) the client would, after a somewhat
longer period of time, get a working IP address anyway :)
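
For context, 169.254.0.0/16 is the IPv4 link-local range that hosts assign themselves when DHCP fails. A tiny Python check using the standard `ipaddress` module (the helper name and sample addresses are made up):

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def is_self_assigned(addr):
    """True if addr falls in the range a host picks without DHCP."""
    return ipaddress.ip_address(addr) in LINK_LOCAL


print(is_self_assigned("169.254.10.20"))  # True
print(is_self_assigned("192.168.1.5"))    # False
```

So a client that gives up on DHCP and self-assigns an address ends up inside the ISP's subnet anyway.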

------
setheron
At Amazon it was common to just restart the servers every 100k requests to
deal with unsolved memory leaks.

------
lunaticlabs
I worked on games for the original PlayStation, and one of the requirements
for release was that your title would be able to run for 48 hours without
crashing (the soak test). A big issue we ran into as developers is that we
only had 2 MB of memory, and unlike cartridges, you had to load all your data
into RAM. Since we dynamically allocated memory, you would quickly run into
memory fragmentation issues. If I have 2 MB of memory, I could have 1.5MB
free, but I would be unable to allocate a 750k block because it could be laid
out with 500KB free, a small block of used memory, another 500k, another
small used block, and then the rest of memory. If you end up in this position,
there aren’t a huge number of good options, and several engineers started
thinking about how to carefully allocate the memory so that we wouldn’t end up
in that position. Instead, I found a way to fix it with a scorched earth
policy. I could soft reboot the PlayStation which resets it, but avoids the
Sony logo at startup. I would write the current player data into a special
save game, do the soft reboot, which would launch the game. The first thing
the game would do is look for the special save, and use that to skip various
menus and the front end and just drop you into the game. A simple soft reboot
per level increased our load times by a second or two, but safely reset our
memory. Years later, after talking with other programmers, it turns out that
many people had discovered this trick and used it to get through the soak
test.
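
The fragmentation example above, as a toy Python model. The block sizes are the ones from the story rounded to three 500 KB free runs; the helper is hypothetical and ignores allocator overhead:

```python
def can_allocate(free_blocks_kb, request_kb):
    """A contiguous request only fits if one single free run is big enough."""
    return any(block >= request_kb for block in free_blocks_kb)


# Three free runs separated by small used blocks.
fragmented = [500, 500, 500]
# The same amount of free memory right after a soft reboot.
fresh = [1500]

print(sum(fragmented), can_allocate(fragmented, 750))  # 1500 False
print(sum(fresh), can_allocate(fresh, 750))            # 1500 True
```

Total free memory is identical in both cases; only the layout differs, which is why resetting the heap works where careful bookkeeping is hard.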

As a side note, memory issues were a huge problem that is largely invisible to
most software engineering. To give you an idea on how complex the solutions
are, check out how Naughty Dog solved this on Crash Bandicoot:
[https://news.ycombinator.com/item?id=9737156](https://news.ycombinator.com/item?id=9737156)

------
slipwalker
We were located in Brazil, moving some applications to RHEL servers located in
the USA. My junior sysadmin, logged in as root, accidentally removed some /lib
libraries (I don't remember which exactly) and we would be locked out of the
servers if we closed the ssh shell that was already open. No scp, no ftp, no
kind of file transfer was possible (due to the lack of the libraries on the
destination server), and we would have had to call the admins at the remote
site to clean up the mess, but I decided to try something: I base64-encoded the
proper binaries on a local server (same distro) and copy/pasted them via the
terminal to the remote server. Luckily I had busybox compiled statically on the
remote server, so the base64 tooling was available... and then we had our
server accepting connections again, and the other admins never knew it ever
happened.
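
The idea, sketched in Python rather than the shell tools actually used (function names and the 76-column width are my own choices): turn the binary into plain text that survives a terminal paste, then decode it on the other side.

```python
import base64


def to_pasteable(data: bytes, width: int = 76) -> str:
    """Encode bytes as base64 text wrapped to terminal-friendly lines."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "\n".join(lines)


def from_pasted(text: str) -> bytes:
    """Decode pasted base64 text, ignoring line breaks and stray spaces."""
    return base64.b64decode("".join(text.split()))


blob = bytes(range(256))          # stand-in for a shared library
assert from_pasted(to_pasteable(blob)) == blob
```

Comparing a checksum on both ends before moving the file into place would be a sensible extra step, since a single dropped character corrupts the result.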

------
vermaden
In a private cloud EHC (EMC Hybrid Cloud) implementation there was a need
that deployed hosts needed to have, well a hostname with a domain, like
hostname01.example.com but the problem was that EHC does not support that in
that version (as stupid as it sounds) so the developer put a script in all
Linux templates called /root/hostmame.sh (yes with an error) which was run by
cron every minute and the only thing it did was to put the
hostname01.example.com into the /etc/sysconfig/network file
(RHEL/CentOS/Oracle Linux) and to invoke the hostname command with
hostname01.example.com as an argument.

Every minute of every host on the private cloud ...

... but I have come upon many strange problems in various EMC products so that
does not surprise me a lot.

------
amiga-workbench
Wordpress being used in a regulated software project.

------
itronitron
the X Windows system (in use in a production system, post 2015)

~~~
fao_
Is there a full, complete compatibility layer between X programs and Wayland
now? I know quite a lot of people who use X Windows still. It's not that
unusual or weird.

------
cbanek
Making a wrapper function that locked/serialized access to printf. There was a
crash that would happen occasionally in a server which always happened in
printf and outputting logs and debug data. While you'd think it was a software
bug in calling printf, the underlying issue was that the app wasn't linked
with the multithreaded stdlib on Windows, so therefore the printf we were
calling was not reentrant.
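
The original was C on Windows; as a rough analogue only, here is the same wrapper idea in Python, with a hypothetical `locked_print` that funnels every call through one lock:

```python
import io
import threading

_print_lock = threading.Lock()


def locked_print(*args, stream=None, **kwargs):
    """Serialize output so only one thread is inside print() at a time.

    stream=None falls back to sys.stdout, as with plain print().
    """
    with _print_lock:
        print(*args, file=stream, **kwargs)


def worker(n, stream):
    for _ in range(50):
        locked_print(f"worker {n}", stream=stream)


buf = io.StringIO()
threads = [threading.Thread(target=worker, args=(n, buf)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock hides the symptom at the cost of serializing all logging; linking against the multithreaded runtime would be the actual fix.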

