
Ask HN: What's the hardest problem you've ever solved? - jamestimmins
Could be engineering, interpersonal, strategic, etc. This is purposefully open-ended.
======
herodotus
When I was a CS prof, many, many years ago, our undergraduate lab had Macs
with floppy disks. I asked the University to pay for installing 10MB Hard
Drives in the Macs. I was asked to present my case to the deans council. At
the meeting, I said that the students used the floppy to load their
development environment. I said that, with a hard drive, it took 10 secs to
load and be ready. With the floppy, I said it took 30 seconds. Then I said,
"That does not sound like much difference, but this is 10 seconds ...." I
paused and looked at my watch for 10 seconds. Then I said "And this is 30
seconds" - again I looked at my watch. At the 20 second mark, the VP Academic
(chair of the meeting) said "You have your money".

~~~
zerr
I guess that trick would be harder to reproduce for my problem - bringing down
compile times from 4 hours to 1 :)

~~~
m_ransing
In a 4-hour stretch, you can do some (or many) other things. In 30 seconds, you
cannot; those 30 seconds are pure waste.

~~~
zerr
Yes, I procrastinate.

------
yholio
I somehow decided I needed to cheat to pass a certain exam because I was
basically crap at memorizing stuff. So I used an analogue wireless headphone,
an induction loop around my neck and a mobile phone. Since I lacked an
accomplice to dictate, I read aloud the hundreds of pages and recorded myself,
careful to preserve and properly serialize things like complex formulas.

This was before the era of iPods and SDCard players, so I had my mobile phone
in a setup where I would call back another phone connected to my Pentium MMX
233MHz at home, that ran a sort of audio directory that would playback a
certain lecture recording I would select from the menu, using DTMF tones.

I had a small keyboard sewn into my sleeve that connected to the customized
mobile phone via a DB9 plug and then to the numeric keyboard, allowing DTMF
codes to be sent by gently and invisibly moving my fingers. The whole setup
was hideous and had its own dedicated jacket, with wires, phone, keyboard,
audio amplifier, neckloop, the earphone... a complete cyborg for academic
fraud.

Back on the PC side, I wrote a C++ application from scratch that would capture
audio via the soundcard using Windows Wave API, decode the DTMF pulses using a
couple of IIR filters then navigate the menu and playback the required file to
the mobile phone connected to the soundcard. The C++ program and menu system
was scripted using an .INI file that defined the structure with links to
various ADPCM-compressed .wav files that represented the menu headings or the
leaf content itself (a good structure was necessary to quickly access the
correct lecture after receiving the exam subjects).
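
(For readers curious about the DTMF step: one common way to detect the tones with small IIR filters is the Goertzel algorithm, a second-order recurrence tuned to a single frequency. The sketch below is in Python rather than C++ and is not the program described above. The 8000 Hz sample rate, the 50 ms window and the names `goertzel_power`, `detect_key` and `make_tone` are assumptions for illustration. A real decoder would also need thresholds and silence detection, which are left out here.)

```python
import math

# Standard DTMF grid (Hz): each key is one row tone plus one column tone.
ROWS = (697, 770, 852, 941)
COLS = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Relative power of `freq` in `samples` via a second-order IIR recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the row tone and column tone with the most power; no thresholding."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COLS[i], sample_rate))
    return KEYS[row][col]


def make_tone(key, sample_rate=8000, duration=0.05):
    """Synthesize a clean two-tone signal for `key` (hypothetical test helper)."""
    for r, line in enumerate(KEYS):
        if key in line:
            f_row, f_col = ROWS[r], COLS[line.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(sample_rate * duration)
    return [
        math.sin(2 * math.pi * f_row * t / sample_rate)
        + math.sin(2 * math.pi * f_col * t / sample_rate)
        for t in range(n)
    ]
```

Feeding `make_tone("5")` into `detect_key` should give back `"5"`; with noisy phone-line audio the bare "largest power wins" rule would need to be hardened.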

Work-wise, it was a lot more difficult than putting the effort into memorizing
the stuff, but I rejected memorisation on principle; that's not what a
university should be about. The whole thing turned out to be a massive
learning project, but I obviously could not speak about it at interviews. It's
the first time I mention it to anybody.

I used the setup for two exams that I aced, was never caught but it was nerve-
wracking to use in close proximity to a teacher.

~~~
yesenadam
You rejected ethical behaviour 'on principle', you even seem proud of that.
Have you kept doing that in life? Do you feel like an impostor, presumably
with your degree that you didn't think you could acquire without cheating, or
couldn't be bothered?

A surprising number of the replies to your comment also seem to think cheating
like that at university is perfectly fine.

~~~
quadcore
My humble opinion. University is about learning (not, for example, status).
Learning is something strictly personal. There is nothing fundamentally
universally wrong to cheat at an exam beside the fact you may be harming
yourself.

Now since we need a little order for various secondary reasons, we promoted
exam cheating to illegal and that's ok. We need order. But there is nothing
someone should feel guilt for imho, assuming the person knows she may be
harming herself.

~~~
yesenadam
I wasn't talking about law, I was talking about ethics.

"There is nothing fundamentally universally wrong to cheat at an exam beside
the fact you may be harming yourself."

I don't understand that, I have no idea where you got that. Or what those
words "fundamentally" and "universally" add. I say it's wrong, you say "Oh,
but it's not fundamentally, universally wrong".. As if it's clear what that
means.

For example: You may have harmed the people who didn't get good enough marks
because you cheated your way into higher marks. Then you may harm people in
your career that you're not qualified for, besides stopping properly qualified
people from doing their jobs. I don't want an airline pilot or doctor that
bought their degree or cheated in exams, thanks. Anyway, it seems ridiculous
that I have to explain to people why cheating's bad. Well, I don't know, maybe
you are in a country where it's normal, perfectly fine, accepted, everyone
does it. Where I come from, people don't have to have it explained to them why
it's bad.

~~~
quadcore
I guess you are right. It's ethically wrong but not morally wrong.

~~~
yesenadam
_I guess you are right. It's ethically wrong but not morally wrong._

Huh? I don't see a significant difference between 'ethically wrong' and
'morally wrong', no idea why you would say that.

~~~
quadcore
'Cheating at an exam' is transgressing a rule that doesn't "exist" in nature,
it only exists in our social (if that's the right word) system. So if I cheat
at an exam, I'm breaking a rule that's in the system that we made up
(together), and therefore I can judge that, in some circumstances, I can break
the rule without feeling morally wrong, without feeling guilt, because I, in
some sense, made the rule myself. I'm breaking my own-making rule.

Another example of that could be: I want my kid to go to bed at 9pm. Sometimes
I will break that rule. To some extent, because of the reasons I've advanced,
I claim that cheating at an exam follows the same characteristic as the "kid
go to bed at 9pm". Just not in the same magnitude if you will.

I then guessed that this may draw the line between what's ethical and what's moral.

~~~
yesenadam
Hi again. I don't see how that draws any limit/distinction - it was 2 examples
of rules that can be broken, not sure how that helps explain the difference,
or why you said that. I wouldn't say bedtime is a moral rule/principle, or
that breaking it is unethical. Maybe you could make it clearer for me which one
was supposed to illustrate what, if one was meant to be ethical, one moral, or
something, I don't know. I really have never heard the words used with much or
any difference. (I'm no expert, but have read dozens of ethics books, studied
ethics/moral philosophy at uni etc)

I'm just guessing here, but maybe you have a religious value system, with
absolute moral commandments or something? All I have (as an atheist) are
ethical/moral principles exactly like 'cheating is wrong'.

~~~
quadcore
You say 'cheating is wrong'. Fair enough, you can see 'right or wrong' as
binary. Or you can live in the real world and understand that things are a
little more complicated than that. With all the information you have out here
(more than your hypothesis) you can make a fairer judgement. And you don't do
judgement without introducing the living anyway because only what's living can
judge and be judge-able. It's a social thing to judge right or wrong. So you
have to take into account all the system. The living.

Now I'm not going to do the math for you.

------
lettergram
This was more of an introspection, but I used to be rather depressed. When I
started reflecting on life and all I could contribute or succeed at (this was
actually years ago, not specifically this question), I realized that if
someone 1000x more intelligent / capable / knowledgeable came along, everything
I could add or solve from an intellectual perspective would be useless.

After meditating on / contemplating this for quite some time (we are talking on the
order of months), I realized that even if someone knows 1000x more, they still
may not know your niche. More than that, everyone has niche expertise from
their lives, which no one else can access (from their experiences).

I don't know how I got to that conclusion, or how to explain it. It was the
hardest problem I solved, because it's something that had to do with a change
in perspective and personality. It also is what helps me listen to others in
debates and to be much more open minded. I wouldn't say it's that big of a
deal, as I felt it was part of growing as a person, however many around me
don't seem to recognize that. When I hear "they are naturally gifted" or "I'm
not smart enough", I feel those are people who haven't had the same
realization (i.e. experience).

Anyway, that was probably the most difficult and profound problem I solved for
myself. What motivates me.

~~~
projektir
> When I hear "they are naturally gifted" or "I'm not smart enough", I feel
> those are people who haven't had the same realization (i.e. experience).

They may simply not agree with the conclusion, or the end. It's pretty hard to
have this discussion without crashing into the nature-nurture debate, but one
could be interested in a particular result, not merely _any_ result, and they
may perceive a lack of something blocking their way towards that.

Not to mention, we do not really live in a world / society that recognizes
this, and social approval + hireability dramatically affects one's psyche,
more so than mere rational thinking.

------
jjjensen90
I once told a room mate in college I would repair their laptop's charging port
if they were willing to pay more per month on 50 megabit internet (this was a
lot of bandwidth back then).

I ordered a new port online, waited for it to come, then spent like 12 hours
trying to get the factory solder out of the original port. I ended up
accidentally frying part of the power board. By this time it was the middle of
the night and I had class the next day, and my room mate expected a working
laptop in the morning.

So I started trawling Craigslist for similar laptop models, emailed every
single one of them that I would buy the laptop at 6am (enough time before
class). I found one with a broken screen, and got a decent deal, I think I
still spent $150 or something (almost a whole paycheck at the time). I brought
it home and tore it open, took the entire power connector module out of it and
into the new one, it fit thankfully.

I handed my room mate her working laptop and never mentioned the ordeal... It
was still working 2 years later when I moved out of that house. The problem
wasn't so much technical as a problem of desperation and saving face :)

~~~
quickthrower2
A geek trying to save face is the best worker. Well done.

~~~
kchoudhu
A geek trying to get _more bandwidth_ is the best worker.

~~~
ebcode
But a geek trying to save face and get more bandwidth _at the same time_ is
the very best worker.

------
simonbarker87
Fixing a SRAM 11 speed GX rear shifter on my mountain bike. I was on a
downhill MTB holiday and fell off, snapping the thumb trigger on my rear gear
shifter.

Replacing the shifter is usually like a £30 part, but I was in an alpine
resort in summer so the shops that stocked them were charging nearly £100!

I had my tools and some super glue so sat down to repair it. The snapped part
fixed in two minutes but getting that part in place required the complete
disassembly of the internal mechanics of the shifter, which was all tensioned
with multiple hidden springs. The internals literally burst apart halfway
through carefully taking it apart.

We had no instructions, no YouTube video (since it is usually cheap enough
just to swap out) and no diagrams. It took me and a random guy staying in the
chalet 3.5 hours to put it back together. We essentially had to spread
everything out and think from first principles about how a shifter works, the
specific features of that shifter (rapid fire, multiple downshifts with one
lever arc) and build up how those pieces would match how it operates.

Must have been 30 parts, all small and all tensioned with three(?) springs.

6 months later still working fine but man did we get a mental workout that
day.

Next day I had to fix my 3 axis gimbal but that was a lot easier.

~~~
ArthurN
I empathize! These kinds of shifters are SUPER hard to put back together.
Quite literally, so many moving parts in such a tight space. If you
inadvertently bend the springs or coils too much, you could end up ruining the
whole thing.

I had a similar situation happen to me on a Shimano Deore shifter, which is even
less complex than the SRAM you mention.

For others who have never had this "joy", it looks like there's at least one
video at least partially showing what the OP had to go through:
[https://www.youtube.com/watch?v=nrfKQfXJgd0](https://www.youtube.com/watch?v=nrfKQfXJgd0)
Jump to ~2:00 to get a sense of how finicky this stuff is.

~~~
simonbarker87
That video was released a month after I fixed mine! Could have done with that.

------
archagon
I started with the glimmer of a hope that perhaps network sync could be made
stateless, went down a months-long rabbit-hole of research, and ended up
writing a novella-length article about CRDTs:
[http://archagon.net/blog/2018/03/24/data-laced-with-
history/](http://archagon.net/blog/2018/03/24/data-laced-with-history/)

So many days spent thinking, sketching, trying to swallow a concept that
seemed far too big for my jaws, only to suddenly find myself on the other side,
with this arcane knowledge fully internalized.

Going from a few wayward thoughts to a working proof-of-concept was the most
professionally satisfying thing I've done. It felt like alchemy.

~~~
raphlinus
That might be mine as well, [https://medium.com/@raphlinus/towards-a-unified-
theory-of-op...](https://medium.com/@raphlinus/towards-a-unified-theory-of-
operational-transformation-and-crdt-70485876f72f) . It's very much unfinished
work though. Certainly wrapping my head around the OT and CRDT literature took
huge amounts of brainpower exertion, comparable to what I did for my PhD.

Also, I have your essay in my queue to read more thoroughly, it's high on my
list of stuff to study as we possibly rethink the way this stuff works in xi-
editor.

~~~
archagon
Hey Raph, thanks for the vote of confidence! I spent a lot of time looking
through your research on OT and CRDTs (plus your Rope Science series) in the
course of writing my article, especially as documented and implemented in xi.
Really fascinating stuff. I love how many different ways there are to approach
this problem.

------
triviatise
Don't know if this is the hardest, but it is one of the most impactful. During
the 2010 recession I had a client that had about 700K in receivables past due
and we were doing about 120K/month in additional work. I started reading their
10Q and 10K and decided that they were at risk for bankruptcy as their cash
position was not great.

I told my sales guy that we had to collect our money somehow. He was like they
are a big public company, there is no way they will go bankrupt. I insisted.

I told him we needed to start working their procurement and AP hard. We bought
gifts, took them out to golf, bars, lunch, dinner, and told a ton of sob
stories about how we needed the money. We managed to get our AR down to about
120K which we decided was acceptable losses.

They went into bankruptcy and I was contacted by a receivables purchase
company who offered to buy our 120K for 40K, so I sold them immediately.

Many of the small companies that we worked with there went out of business as
none of us could take such a huge receivables loss.

To this day I try to get my finance team to send gifts to the AP team of our
clients.

~~~
dctoedt
> _To this day I try to get my finance team to send gifts to the AP team of
> our clients._

1. Many, many big companies require vendors to agree to codes of conduct that
explicitly prohibit such gifts.

2. If the customer were to file for bankruptcy protection, the trustee (or
debtor-in-possession) likely would characterize recent payments as "avoidable
preference" payments and seek to claw them back — with claw-back being the
presumption and the vendors having the burden of proving that they're entitled
to keep the payments (which is somewhat of a PITA). [0]

3. If the recipient of such a gift happens to be a foreign "official," then
the criminal penalties of the (U.S.) Foreign Corrupt Practices Act might
become salient. [1]

[0] [https://www.nolo.com/legal-encyclopedia/pre-bankruptcy-
payme...](https://www.nolo.com/legal-encyclopedia/pre-bankruptcy-payments-
creditors-can-the-trustee-get-the-money-back.html)

[1]
[https://en.wikipedia.org/wiki/Foreign_Corrupt_Practices_Act](https://en.wikipedia.org/wiki/Foreign_Corrupt_Practices_Act)

~~~
sonnyblarney
In my experience, the vast majority of companies will not be reporting
'dinner' as a gift, probably not even golf. Maybe 'material gifts' i.e. things
that are directly gifted that have value, but even a small basket might not
get reported.

------
sadok
When I was 14 or so, my mom installed one of those "telephone locks" that
blocked outgoing calls and only allowed incoming calls, she did this to limit
my dial-up internet. I wanted to see if I could bypass it without my mom
noticing.

I noticed that the case for the lock could be pried open easily because it was
just a plastic cover with 2 tabs. I examined the circuit and saw that if I
could bridge the cables around the lock via a toggle switch, I could have an
open phone line during the day and then toggle it at night when my mom came
home and she wouldn't know. And so I did. I felt like Kevin Mitnick. I
remember documenting the process and posting it to hackaday.com

Funny thing was that my dad found out but he didn't tell my mom. He would just
go "hey can you make the phone work, I need to make a call". I later found out
he was on a phone call restriction too.

~~~
calabin
That's pretty good. With the device, is there some way to bypass the lock in
an emergency? It would be unfortunate to be unable to make outgoing calls if
you needed to call 911.

------
shaklee3
I eventually solved a bug that took about 1.5 years to figure out, since we
were not able to reproduce it, and it only happened on a customer's system.
Long story short, it ended up being a by-product of sending a (256*N)+1 byte
packet through the system, and the fpga asserted a signal 2 clocks later that
did not assert in time on those sizes. This resulted in a single buffer
leaking, but eventually it built up exponentially. Many days and nights of a
logic analyzer and other equipment led to a single accident in the lab where I
was able to reproduce it.

It's hard to describe the feeling of reproducing something like that, after
haunting me for so long.

~~~
ComputerGuru
I feel you. A bug in our recovery software would result in the live CD being
unable to open the Windows kernel file after having located it on disk. No
amount of logging we added to the code made sense. In the end, after more than
a year, there was a customer in the UK that could reproduce the problem each
time on a clean Windows 10 install (so no private info). I mailed him a new
SSD to replace his old HDD and paid for him to overnight me his hard disk.
Problem solved the very next day.

~~~
shaklee3
That's awesome!

------
antisugar
I’m surprised and slightly disappointed that my memories don’t bring up a
clear answer to this.

The hardest problems I’ve faced were never exactly solved, just moved past or
muddled through. Things like loss of close friends and family, acknowledging
my own limitations, and accepting the inertia of flawed institutions.

Any problem that eventually found a solution I remember as feeling relatively
simple in retrospect.

~~~
galfarragem
Finding a solvable hard problem is itself very difficult. If the problem is
within your comfort zone, you will never feel that it is a hard problem. If it
is too far from your comfort zone, you will not be able to solve it.

~~~
clarry
And if you obtain the knowledge required to solve it, the problem will start
to look easy in hindsight.

------
mooreds
I was working on a calendaring system. We supported recurring events, and each
event could also have a piece of equipment associated with it (one or more).
Events could also be changed, either the current event or this and every
event. Event start/stop times were stored in UTC, but the event was displayed
in the user's time zone, and could cross daylight saving time boundaries. This
was done with a mix of JavaScript on the front end and Ruby on the backend.

When first implemented, the equipment was reserved for the entire time of the
event. A feature request came in to allow for partial reservations of
equipment (we want to reserve space A for four hours, but equipment B for only
the last hour and equipment C for the first two hours).

Solving it on initial event creation was pretty easy. The UX was difficult.
But handling updates, especially across recurring events, in a way that was
maintainable and correct was the hardest technical work I've done. I wrote a
lot of tests.

The difficulty was compounded by the fact that this was a startup and I had no
technical peers to discuss the issue with. The code implementing the work
wasn't the cleanest either. I could have reached out to friends, but they
wouldn't have had the understanding of the issue.

~~~
whack
How did you handle the database storage for recurring events? Did you have
just one entry that represents the recurring event as an abstract whole, or a
database entry for each instance of the recurring event?

The former seems "better", but you also run into a whole lot of complications:

1) The user can delete specific instances of a recurring event. Eg: delete the
event for thanksgiving Thursday but leave it intact for all other Thursdays

2) The user can make instance-specific edits. Eg: edit the event description
with custom meeting-notes that's specific to that week's instance

3) The user can invite/dis-invite people for specific instances. Eg: invite
Dave only for the event this Thursday, but not for the following weeks'

Creating a database entry for each instance avoids the above complications,
but comes with its own drawbacks: if a user creates a recurring event with no
end-date, you have to populate a large number of instances, all the way until
some arbitrary MAX_DATE.

I was asked this question in an interview once and I recommended the latter
option as the lesser evil, and the interviewer was visibly displeased with my
recommendation. I'm wondering what the better solution would be.

~~~
was_boring
I have worked with recurring events, and the answer that I've seen is
generally both. You have one record which is the "template" and then populate
the database out with specific instances to some arbitrary date. Every so
often, a cron job runs to populate further and further out.
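
A minimal sketch of that "template plus materialized instances" idea, using in-memory Python objects instead of a real database. Everything here is an assumption for illustration: the names `Template`, `Instance` and `materialize`, the weekly-only recurrence, and the example dates. It is not the schema of any system mentioned in this thread.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Template:
    """The single 'template' record for a weekly recurring event."""
    title: str
    first_day: date
    # Last date for which an instance has already been generated.
    materialized_through: Optional[date] = None


@dataclass
class Instance:
    """One concrete occurrence; it can be edited or deleted on its own."""
    day: date
    title: str


def materialize(template: Template, instances: List[Instance], horizon: date) -> List[Instance]:
    """Append any not-yet-generated weekly instances up to and including `horizon`.

    Intended to be called periodically with a later horizon each time.
    Dates generated on an earlier run are never regenerated, so per-instance
    edits and deletions survive.
    """
    if template.materialized_through is None:
        day = template.first_day
    else:
        day = template.materialized_through + timedelta(weeks=1)
    while day <= horizon:
        instances.append(Instance(day, template.title))
        template.materialized_through = day
        day += timedelta(weeks=1)
    return instances


# Example: a Thursday meeting, populated in two passes.
tmpl = Template("Standup", date(2024, 1, 4))
rows = materialize(tmpl, [], date(2024, 1, 31))  # Jan 4, 11, 18, 25
rows.pop(1)                                      # user deletes the Jan 11 instance
materialize(tmpl, rows, date(2024, 2, 15))       # adds Feb 1, 8, 15 only
```

The design choice is that the template only remembers how far it has been expanded, so the periodic job never has to diff against existing rows.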

~~~
mooreds
That's pretty much what I did, but the original record wasn't stored in the
DB, just the specific instances. Our use case was such that folks were moving
around events periodically (it was a scheduling app), so we rarely had anyone
"run off the end" of their repeated events.

------
iamgopal
Once, in the IE6 era, I created a vertically and horizontally centered login
page that also worked in Mozilla.

~~~
lostgame
You, sir, are a God.

------
guidovranken
Years ago I wrote an emulator for the Intel 8086 processor in C++. It's
deceptively difficult because the instruction encoding is complex and the
emulation of each instruction has to have a very high fidelity. In a sense,
software at the CPU instruction level is a chaotic system, as each instruction
can influence the system state to a critical degree, so if there is a slight
deviation from the spec/hardware, a snowball effect of deviations happens that
leaves the system in a completely botched state where your emulator won't boot
at all. Because the deviation can happen anywhere within an execution path of
millions of instructions, it's very hard to debug, too.

Eventually I got it working and I could boot DOS and play games.

The aim of my project was to create a programmable emulator that could be used
for the semi-automated analysis of malware, and sell it. Eventually this
didn't really go anywhere, as this was too ambitious a goal to take on all by
myself. See here [1] for a demonstration where I load tweets from Twitter and
send them to the DOS text editor by triggering a keyboard interrupt via a
Python API. Fun...

Later, the Unicorn CPU emulator framework [2] implemented my idea much more
effectively by creating Python bindings to an already mature emulator (QEMU).

[1]
[https://www.youtube.com/watch?v=XwPZH8LAVIY](https://www.youtube.com/watch?v=XwPZH8LAVIY)

[2] [https://www.unicorn-engine.org/](https://www.unicorn-engine.org/)

~~~
toutouast
Trying to do the same project but for MIPS architecture as a bachelor thesis
project!

------
aaronbrethorst
How to be happy.

For me it turned out to be jettisoning all of the stuff from my life that
didn’t feel meaningful and reorienting my life such that all of my major time
commitments positively impact things that I consider to be personal values.

This includes: hobbies, non-profits, and my job.

This does not mean I live an ascetic life. Instead, I live one that I
personally find meaningful.

~~~
cortesoft
What were some of the things you jettisoned?

------
matheist
Probably not hardest, but one that sticks in my mind was that I downloaded an
old copy of Jedi Knight II onto my computer and was excited to play it
again... but it crashed on launch. Or rather it would throw a modal dialog
(complaining about graphics something-or-other) and go into a loop of system
beeps.

So I attached to it with gdb... and started going down a rabbit hole of stack
traces and assembly code. I don't remember ultimately what the problem was (I
think the code had been (partially?) stripped of symbols) but I do remember
ultimately flipping a bit in some conditional that finally got it to launch!

~~~
Tempest1981
You reminded me of this: "How I Fixed a 10 Year Old Guitar Hero Bug Without
the Source Code"

[https://youtu.be/A9U5wK_boYM](https://youtu.be/A9U5wK_boYM)

~~~
matheist
Thanks for sharing that! It does look very similar — I wish I had documented
my process as neatly as the author of that video.

------
bringking
One of the hardest challenges I have had in my career is convincing our
company to move to Continuous Delivery. 90% of the challenges weren't
technical, but emotional. Shipping software comes with lots of feelings, fear,
politics, etc. I had to personally work with various leaders across the org to
help them through these feelings and perceived blockers.

We aren't 100% there yet, but we are shipping numerous times per day across 20
or so services and quality has gone _up_, not down.

~~~
ozim
That is interesting because I am still trying to figure out how people do CD.
Of course I know CI, we have it all set up. But we still work in sprints with
manual testing and release every 2 weeks.

I could spend time marking features as 'frontend only, low impact', which we
could deploy pretty much the same day. Still, there are quite a few features
that need a bigger amount of work, where they might be 'done' by the dev but I
am sure they are not actually done, because of security, because of error
checking. Of course, it usually happens that one dev has an untested feature
merged to develop while some other dev has a production-ready one; then, to
ship the production-ready feature, I would also have to spend time making a
release that picks only the accepted changes. I am not sure that the
additional work of checking what we can release 'right now' will pay off vs.
just taking time for fixes on acceptance by the people who were working on the
code and then releasing it (after 2 weeks or 1, depending on how fast it is
done in the sprint).

So are you having people who work only on picking changes that are low impact,
or working on making stuff production ready by picking from develop? Maybe you
pick changes by yourself, or you just defer manual testing to end users and
rely on automated tests unit/integration?

p.s. Funny thing with automated tests is they are good at keeping old stuff
working but not for testing newly developed features where actual tester can
test new GUI/ new features. If you have a lot of GUI changes you cannot
automate first tests...

~~~
delecti
A couple obvious things stand out from your situation. First, only allow
merges of code that's been reviewed and has enough tests along with it.
Second, have a pipeline that automatically runs all the tests whenever you
merge, and if they pass, then goes on to automatically deploy. It's really
that simple. It's not _easy_ to get to that point, but it's simple.

Mostly it comes down to organizational changes and everybody getting used to
what constitutes "enough" tests.

------
sxp62000
For my thesis project back in 2011, I wanted to create an interactive
installation, the kind you see at museums. Since multitouch tables were
insanely expensive to buy or rent, I decided to build one myself. Learned the
basics of woodworking over a semester, and built a table after two failed
attempts. Then I bought an LCD TV, took it apart, connected it to a
PlayStation camera with a modified infrared lens and developed an Adobe AIR
application. The thing was so unresponsive to touches, I decided to ditch the
whole multitouch table idea. Started from scratch again ... learned
Objective-C (iOS 5 or was it 6), and created an iPad app for my thesis. Don't
think I've ever worked so hard in my life.

~~~
iliis
There's actually a quite nifty principle that makes it relatively easy to
build your own multitouch table: Frustrated total internal reflection.

The idea is based on internal reflection: Whenever light hits the boundary
between two materials with different optical density (refractive index) it
gets refracted. If it goes from dense to less dense and the angle is flat
enough it will reflect back.
This is the principle behind fibre optics. It's also the reason why the
water's surface looks "silvery" (like a mirror) when you're submerged and look
up at an angle (i.e. not straight up).
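
To put a rough number on "flat enough": Snell's law gives a critical angle, measured from the surface normal, beyond which all light is reflected. The short Python sketch below assumes approximate textbook refractive indices (about 1.5 for glass, 1.33 for water, 1.0 for air); the function name `critical_angle_deg` is made up for this example.

```python
import math


def critical_angle_deg(n_inside, n_outside):
    """Angle from the surface normal beyond which light is totally reflected.

    From Snell's law: sin(theta_c) = n_outside / n_inside. Returns None when
    n_inside <= n_outside, because total internal reflection cannot occur then.
    """
    if n_inside <= n_outside:
        return None
    return math.degrees(math.asin(n_outside / n_inside))


glass_to_air = critical_angle_deg(1.5, 1.0)   # roughly 41.8 degrees
water_to_air = critical_angle_deg(1.33, 1.0)  # roughly 48.8 degrees
```

So light travelling inside glass that hits the surface at more than about 42 degrees from the normal stays trapped, which is why edge-lit light can bounce along inside a sheet.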

Now imagine you have a pane of glass and you put LEDs around the edges that
shine into it from the side. The light will zig-zag through the glass and come
out at the other edge. However, if you touch the glass you inhibit that total
internal reflection because your finger is a lot denser than the air and so the
light will "leak" out the glass where you touch it, illuminating your finger.
If you look at the glass from behind you'll see a bright spot.

Use a camera to detect that spot and you basically have a touchpad. To make it
a screen you can put a translucent sheet behind the glass and project an image
onto it (and use infrared LEDs).
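
The "detect that spot" step can be sketched very simply, assuming the camera frame is already available as a grid of brightness values. This is a toy threshold-and-flood-fill in plain Python, not what real blob-detection packages do internally; the name `find_touches` and the threshold of 200 are assumptions for illustration.

```python
def find_touches(frame, threshold=200):
    """Return (row, col) centroids of bright 4-connected blobs in `frame`.

    `frame` is a list of equal-length rows of brightness values (0-255).
    Blobs are reported in the order a row-by-row scan first reaches them.
    """
    height, width = len(frame), len(frame[0])
    seen = [[False] * width for _ in range(height)]
    touches = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or frame[r][c] < threshold:
                continue
            # Flood-fill one blob, collecting every pixel that belongs to it.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and frame[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # The blob's centroid is the estimated touch position.
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            touches.append((cy, cx))
    return touches
```

Each centroid is one finger; tracking the same finger from frame to frame (for drags and gestures) is a separate problem not covered here.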

See e.g. [http://wiki.nuigroup.com/FTIR](http://wiki.nuigroup.com/FTIR) for
some helpful images. Just google "FTIR touch screen" or similar for build
instructions and blob-detection software.

------
sanjamia
The hardest thing I’ve ever solved is finding 30 years of motivation. I co-
founded a family business, or it co-founded me, because I was only 14. From
inception to exit, it was an interesting intellectual and emotional challenge
to keep 50-100 people motivated at any one time, including me. Coding was fun
but talking to an angry customer was less so. Eventually, survival instincts
kicked in and taught me that creativity and learning was the solution.
Everything happened for a reason, and that reason was usually hidden under
multiple layers—for staff, customers, or me. I found I became motivated by
avoiding reaction, and instead seeing that there was already a motivation
behind every interaction, like boredom, anger, and ambivalence. Understanding
these individual motivations provided clarity as to what needed to change to
maximize team motivation. This made hard problems palatable (and motivating)
for an old-school techie, like me.

~~~
phkahler
>> Everything happened for a reason, and that reason was usually hidden under
multiple layers—for staff, customers, or me. I found I became motivated by
avoiding reaction, and instead seeing that there was already a motivation
behind every interaction, like boredom, anger, and ambivalence. Understanding
these individual motivations provided clarity as to what needed to change to
maximize team motivation.

At the risk of overgeneralizing, it sounds like you learned to empathize.
Sounds like you found that to be a key to success!

~~~
sanjamia
That’s insightful.

------
throwaway000021
To learn to listen to other people, instead of just waiting to talk - that's
an ongoing problem.

To understand what "controlling behaviour" means.

To empathise, and see oneself from the eyes of others.

~~~
ratsimihah
If only we were all as self-aware of this as you are.

------
TomVDB
I had designed a PCI chip. Our driver team reported that the PC would hard
hang under certain conditions. They suspected a bug in the chip.

Armed with a PCI analyzer, I figured out that it was a race condition: the
firmware on the chip would raise an interrupt to the PC, and the driver would
clear the interrupt right at the moment where the firmware wanted to issue a
new interrupt.

If performed in the wrong order, the PCI interrupt stayed high even after the
PC thought it was already serviced.

The solution was to just switch around 2 lines of code in the firmware, but it
took two weeks to figure that out.

Not that long, but it was an insidiously subtle mistake.

------
basementcat
This isn’t the hardest problem I’ve solved but I was damned proud of solving
it when I did...

Was about to tape out a chip when we got a last minute change order... a
100kohm resistor needed to be added. This was an already laid-out, routed and
optimized chip design that was ready to be fabbed.

A few hours later (including at least an hour of shouted obscenities) I found
that I was able to snake through 100kohms of poly and diffusion resistor into
our design without moving anything, violating any design rules or changing any
operating points even after parasitic extraction.

------
lisper
I don't know if it's _the_ hardest, but very recently I had a problem that
drove me nuts: I was doing some consulting for a company that designs chips,
and they assigned me a project to take a nearly completed design and augment
it with additional wires to route extra power to some parts of the chip that
were running on the hairy edge of not having enough current. So I had to read
in the design spec, search it for available space, and generate new wire
geometries for the power grid. This sounds straightforward, but there were two
things that made it incredibly hard. First, the design that I was reading in
was the output of a routing tool that was provided by a third-party vendor,
and it was producing some really crazy shit. Not even the senior design
engineers could understand some of the things that the router was doing. And
second, there are hundreds of design rules that the new geometries had to
conform to. Most of them are not relevant, but about a dozen or so are, and
some of them involve incredibly complicated interactions of multiple
geometries, occasionally across multiple metal layers. Trying to keep track of
all that and make it all run in a reasonable amount of time was quite
challenging, to say the least. But the worst part was that the development
cycle involved a manual step where the output of my code had to be fed into yet
another third-party tool to see if I had violated any of the design rules, and
the output of that tool was _graphical_. And I was not able to use the tool
because of licensing restrictions! So I had to take my output, give it to
someone else, have _them_ run the tool, look at the results, and send me a
report of how many rule violations there were, where they were, and what rules
were being violated. That took several hours, sometimes multiple days.

------
akavel
1\. Debugging a non-deterministic crash in a huge and complex C++ app. It took
me a few weeks of super-focused hard work (kudos to my then team & management
at Sabre office in Kraków for giving me this time, and understanding and
appreciating what I was doing) to solve, with lots of low level debugging down
to raw disassembly, as well as code reading and intense thinking and thinking
and thinking and thinking. Eventually I managed to narrow it down to a memory
race in advanced template code. Some time later I realized it scarred me to
C++ (my first love in programming) for life, also opening my eyes to what
horror Undefined Behaviors in C++ mean and how they loom ominously over every
single seemingly simple character of C++ source code anyone writes. I'm now
working in Go and appreciate it so much, as well as Rust etc. It also was one
of the things that made me appreciate what good management can mean to an
employee.

2\. Debugging Go runtime in pre-1.0 era, to find out why it was not working
stably on Windows. It involved a lot of digging through Go internals (also in
Go's old C compiler & assembly), as well as WinDbg disassembly-level
debugging. I eventually found out that the Windows call convention was not
preserved in some aspect, in that one of the registers was being mangled on
return. I see it as my most important contribution to Go, though formally I
was not listed as a Contributor through it, though I got a Thank You mention
in a commit message I think. But some time later I did some much simpler
contribution which got me a then coveted by me (and still valued) entry on the
Go contributors list.

------
ComputerGuru
* Creating a prototype of a low-cost alternative to braille displays for PCs for my thesis (without access to modern machining resources): [https://www.youtube.com/watch?v=bPwgkf1aZ9I](https://www.youtube.com/watch?v=bPwgkf1aZ9I) then filing for a patent (that I didn't get): [https://patents.google.com/patent/US20130203022A1/en](https://patents.google.com/patent/US20130203022A1/en)

* Improving that design to make it work reliably and conveniently enough for mass production and deployment in 3rd world countries (still haven't figured it out :(

* Reverse engineering the APIs and roadblocks to port iMessage to Windows: [https://neosmart.net/blog/2018/imessage-for-windows/](https://neosmart.net/blog/2018/imessage-for-windows/)

* Porting an entire automated Windows PC system repair suite from Windows to Linux (still to fix Windows PCs) in two weeks when Microsoft pulled the rug out from under us and abruptly informed me that they would no longer be licensing Windows PE to ISVs: [https://neosmart.net/EasyRE/](https://neosmart.net/EasyRE/) (now powered by FreeBSD)

~~~
adroitboss
This is really amazing stuff! How do you keep morale and focus when tackling
harder projects?

~~~
ComputerGuru
Thank you. It's an ongoing struggle that I haven't solved yet. It can be very
tempting and fun to just switch to a new project when you reach the point
where you know what needs to be done but it's an insane mountain of work that
you either dread (e.g. you need to rewrite an entire project from scratch to
take a different approach) or can't even do at the moment (e.g. you need
precision machining beyond what is available to you).

I get a thrill out of besting myself, and find that helps extremely well when
I've all but given up hope but conversely that makes it really hard to
motivate yourself when it's something you know you can do (perhaps you've done
it before) but just aren't looking forward to. Also it's really hard to do
something that takes so much insane time/effort (literally years) when you
know there's a good chance the world just won't care.

------
blendo
When you're younger, the problems are harder, and the success sweeter.

I earned my spurs (well, perhaps an enlisted stripe) back in the 1980's when I
inherited a non-functional mess of an HP-85 Basic program to control an
antenna test system. Many days in the Hanscom AFB RADC EE Propagation shack to
get the RS-232 mag tape and IEEE HP-IB/GPIB/IEEE 488 custom hardware working.
We kept bumping up against the HP-85's program storage capacity, so I had to
remove most program comments, and actually shorten the variable names to get
it all loaded into memory.

Then we got to fly that sucker on a pair of C-141B's from Wright Pat AFB, to
Elmendorf AFB, then a flight over the North Pole, then to Thule AB, Greenland
(9 days in November - do not recommend), then to RAF Lakenheath.

30 years on, I still make a living fixing crappy code.

------
enraged_camel
Joining my current company and saving a relationship with a key client from
the brink of termination.

The previous engineer on the account had overpromised and under delivered...
in a rather big way. Not only that, but he also managed to misconfigure the
enterprise automation software we use and accidentally delete the customer’s
entire network share, _then_ managed to get one of those cryptolocker viruses
on the share after it was restored from backups, prompting a second restore.
Regardless, the customer was pissed as hell and was on the verge of firing us.

After I was hired and put on the account (which I knew going in), I spent two
days taking stock of all the shitty stuff the previous engineer had built,
deciding what could be salvaged and what had to be discarded. Then, for the
next two months, I went over the original contracts and built (or rebuilt) the
originally promised system. I would basically work for 10 hours everyday, 7 am
to 5 pm, then write an extremely detailed report to the customer explaining
what I had done that day, which would take another 2 hours.

For the first week they ignored my reports. Then the CFO started taking
interest, first by nitpicking, then asking questions, then praising. The rest
of the customer’s team slowly came on board as well, especially once I started
delivering the bits that were user-facing.

Today this customer is our flagship account, and has helped us land many other
large accounts. But those two months were probably the most stressful of my
life, not just because of the technical challenges but also due to having to
simultaneously navigate the tricky political situation.

------
ph0rque
I solved some interesting technical problems this year. One was optimizing a
script that took over eight hours to run, to just twelve seconds (validation
code brought that back up to several minutes, but still a lot better than
eight hours). Another was taking several map pins whose coordinates were based
on the same physical address (just different suites/floors, and having the
same exact coordinates returned from the API) and staggering them. I ended up
using a spiral pattern to do the staggering.
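
For anyone curious, the staggering can be sketched roughly like this. The original only says "a spiral pattern"; the golden-angle (sunflower) spiral, the `stagger_pins` name and the `step` spacing below are my own assumptions for illustration.

```python
import math
from collections import defaultdict

# Golden angle in radians; successive points never line up on a spoke.
GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))


def stagger_pins(pins, step=0.00005):
    """Spread pins that share exact coordinates along a sunflower spiral.

    pins is a list of (lat, lng) tuples. step is an illustrative spacing
    in degrees. The first pin at each location stays where it is.
    """
    groups = defaultdict(list)
    for index, coord in enumerate(pins):
        groups[coord].append(index)

    result = list(pins)
    for (lat, lng), indexes in groups.items():
        for k, index in enumerate(indexes):
            if k == 0:
                continue  # keep the first pin on the real coordinates
            angle = k * GOLDEN_ANGLE
            radius = step * math.sqrt(k)
            result[index] = (lat + radius * math.sin(angle),
                             lng + radius * math.cos(angle))
    return result
```

Pins with unique coordinates pass through untouched, and the output order matches the input order so it can be zipped back onto the original records.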

An ongoing soft-skill problem I am learning to solve is effective customer
development.

An interesting UI/UX problem I am currently thinking about is how to allow
people to draw sun/shade patterns, wind current patterns, and essentially
contour lines that show elevation on their property (the last is to ultimately
learn how water flows on their property, and where there are drainage
problems) in a fun, minimal-effort way. If anyone has any suggestions on this,
please feel free to share.

The last two types of problems are for my side project, AutoMicroFarm.

~~~
abraae
IMO the most fun way to enter elevation data for your property would be to
ride around on your farm bike with an altimeter lashed up to a smartphone,
gathering data as you go.

~~~
ph0rque
Thanks for the idea! I found a variety of altimeter apps for the phone; I'll
have to try just walking around my yard with one (or several), then see if
it's possible to get elevation data on a map from such an app.

~~~
brianpgordon
I'd suggest getting an app like Sensors Multitool for Android which gives you
the raw output of your phone's sensors. I tried one of the altimeter apps a
few years ago to see how tall a big hill was, and the value that the app
reported turned out to be far off the actual value. There are various
techniques for mixing and cleaning data from the GPS and barometric pressure
sensors and a premade altimeter app (if it's not open source) obfuscates
what's really going on with the values you'll actually end up seeing if you
create an app for your microfarming project.

Also be aware that different phones have different levels of support for non-
GPS location services. I was surprised to learn recently that my Pixel 3 can
use data from the US, EU, and Russian satellite geolocation systems for better
coverage/precision.
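
As a rough sketch of what sits between the raw barometer value and an altitude figure, this uses the standard international barometric formula. The function names are made up for illustration, and the 1013.25 hPa sea-level default is the textbook reference pressure, not a calibrated value for any location.

```python
def pressure_to_altitude(pressure_hpa, sea_level_hpa=1013.25):
    """Approximate altitude in metres from pressure in hPa.

    Uses the international barometric formula. Absolute values drift
    with the weather because sea_level_hpa is only a reference.
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


def relative_elevations(pressures_hpa):
    """Elevation of each reading relative to the first one, in metres."""
    base = pressure_to_altitude(pressures_hpa[0])
    return [pressure_to_altitude(p) - base for p in pressures_hpa]
```

Working relative to a reference reading taken on the same walk sidesteps most of the absolute error, as long as the weather does not change much while you collect data.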

~~~
ph0rque
Thanks for the tips! I don't need accurate readings, just precise ones (so the
map can be internally consistent; here's what I drew manually for my property:
[https://i.imgur.com/2ZGmB1M.png](https://i.imgur.com/2ZGmB1M.png)). I'll keep
in mind what you suggest when I am ready to develop that part of the app.

------
tnolet
I was working at an IT agency as an ops guy. One Saturday morning our alerting
went off for one of our customer's sites (a somewhat high profile public sector
company). Turned out we were being DDOS-ed.

The hosting company was trying to mitigate, but the attack was "real traffic"
so hard to just black hole. I dove into one of the Apache logs and was looking
for some way to sift out the bad traffic and keep the real traffic.

Then I noticed a lot of Nintendo Wii user agent headers. That was suspicious.
Who in hell is using this service 100 times per second from their Wii from
Pakistan and Russia?

I wrote a quick regex that blocked all IPs from requests with Wii user agent
headers. This took care of 99% of the DDOS traffic. After 2 hours the whole
attack stopped.
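
From memory, the filtering looked something like the sketch below. It assumes Apache's combined log format, where the user agent is the last quoted field; the `suspicious_ips` name and the exact patterns are illustrative, not the regex I actually used.

```python
import re

# Client IP is the first field, user agent is the last quoted field
# (assumes Apache "combined" log format).
LOG_LINE = re.compile(r'^(?P<ip>\S+) .* "(?P<agent>[^"]*)"$')


def suspicious_ips(log_lines, agent_pattern=r"Nintendo Wii"):
    """Return the set of client IPs whose user agent matches agent_pattern."""
    agent_re = re.compile(agent_pattern, re.IGNORECASE)
    ips = set()
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if match and agent_re.search(match.group("agent")):
            ips.add(match.group("ip"))
    return ips
```

The resulting set can then be fed to whatever does the blocking (firewall rules, a deny list in the web server config, and so on).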

------
DoreenMichele
Getting well with a genetic disorder while the world harangues me and accuses
me of making that up.

Next up: Figuring out how to talk to the world about it effectively. I'm not
sure that's solvable. "Curing CF" seems easier.

~~~
hycaria
It's just a hypothesis, but I think saying CF when you actually have an
atypical form of it is a bit dishonest. I have no doubt that diet and
lifestyle do wonders for many conditions including yours, but the CF everyone
thinks about is the one discovered in infancy and the full set of symptoms,
not the one with late onset (at an age which many CF patients do not even
reach) and milder symptoms.

If you were willing to lift up that imprecision, I bet you wouldn't have such
a hard welcome.

~~~
DoreenMichele
It's a case of damned if I do, damned if I don't. People with CF don't like me
saying "mild" CF. They have a cow about that. I spent years on CF forums and
got schooled on that detail quite thoroughly. There are more than 1600 alleles
that can lead to a diagnosis of CF and the exact presentation of the condition
varies from one person to the next in part because of that.

My condition is not late onset. I have had it my whole life. Being diagnosed
late doesn't mean it "came on" late. I spent years being treated by people
like I was some kind of hypochondriac and asking doctors "Can we test me for
something? My body doesn't seem to work normally." and being blown off.

It's not imprecision. I'm as precise as I know how to be or you wouldn't know
that my actual diagnosis is _atypical cystic fibrosis._ I in no way hide that.

So from where I sit, that just looks like yet another BS excuse or
justification for people on the internet to be jerks to me.

~~~
hycaria
I knew it because I looked up your bio for the first time. You don't hide it,
but I don't remember reading it in your posts here, not as much as the
complaints about other people's behavior at least.

>My condition is not late onset.

And the only person I knew with CF had been diagnosed as a toddler and had a
20 y. life expectancy because of genetic bad luck. That you could live 30
years without heavy treatment sure says that your condition was not as bad as
that person, and maybe there's a state of affliction that cannot be
conservatively managed.

------
_asummers
I wrote a tool[0] that converts Erlang style Dialyzer messages to enough of
Elixir to where the Elixir formatter can pretty print it. This isn't an
intractable problem, but the source tool does really bizarre things to the
output, so it wound up being really annoying to deal with the quirks. Still not
quite done, but it's in a workable state to where messages can be dealt with
individually when the errors are poor.

[0] [https://github.com/asummers/erlex](https://github.com/asummers/erlex) [1]
[https://github.com/jeremyjh/dialyxir](https://github.com/jeremyjh/dialyxir)

~~~
jxub
Hey, I'm also working on something similar, namely automatic Elixir to Erlang
code transformation at the AST level [0].

Despite it being really interesting, some features like scope analysis for
variables detection and conversion, the possibility of multiple modules in an
Elixir file and the Elixir pipes which need to be translated to sequential
function calls are honestly a pain in the ass.

Keep at it and I'm sure you'll crack it!

[0]
[https://github.com/jxub/doppelganger](https://github.com/jxub/doppelganger)

------
rassibassi
I'm about to finish my PhD in the field of optical communications and applied
machine learning to increase the data rate. More specifically, I used an
autoencoder with an embedded fiber channel model, to learn something called a
geometric constellation shape or modulation format. (if you ever heard of QAM
then you're on the right track, you can think of it as a modern Morse code for
transmitting bits) Thereafter, we took the learned constellations to the lab
and conducted an experiment, transmitting the learned constellations through
actual fiber. I've learned so much during the whole process.
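
For readers who haven't met QAM: a constellation is just a fixed set of points in the complex plane, and each group of bits picks one point to transmit. A minimal sketch of the standard square 16-QAM (not one of the learned constellations, and with made-up function names) looks like this:

```python
import itertools
import math


def qam16():
    """Standard square 16-QAM points, scaled to unit average power."""
    levels = (-3, -1, 1, 3)
    points = [complex(i, q) for i, q in itertools.product(levels, repeat=2)]
    scale = math.sqrt(sum(abs(p) ** 2 for p in points) / len(points))
    return [p / scale for p in points]


def modulate(bits, constellation):
    """Map a flat list of 0/1 bits onto constellation points."""
    bits_per_symbol = len(constellation).bit_length() - 1  # 4 for 16 points
    symbols = []
    for start in range(0, len(bits), bits_per_symbol):
        group = bits[start:start + bits_per_symbol]
        index = int("".join(str(b) for b in group), 2)
        symbols.append(constellation[index])
    return symbols
```

Geometric shaping keeps the same bits-to-points idea but moves the points off the regular grid, which is the part the autoencoder learns.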

Although the gains are marginal, I like the method a lot since it combines
physics (nonlinear Schrödinger equation derived fiber channel model),
information theory (optimizing for mutual information) and machine learning.
It wouldn't have been possible without the people who published the fiber
channel model, my colleagues and in particular the colleagues who could help
me in the lab.

The debugging was hell, there are so many dimensions where stuff can go wrong
(besides the usual bugs): physical parameters with the wrong unit, the
implementation of the fiber model in TensorFlow, the machine learning parts
with its training process. Plus the things that can go wrong in the lab.

There are still pieces where I'm not 100% sure, and would love to speak to
someone with some background in autoencoders.

I've open sourced the autoencoder and the fiber channel model, check out my
GitHub with the same username as here!

------
thisacctforreal
Editing a Minecraft save with sh and dd to resurrect my girlfriend's wolf.

~~~
BeefySwain
Oh man, if Minecraft counts, I would say implementing a fully automated
resource processing and storage system in FTB pre-Applied Energistics with Red
Power and Buildcraft.

~~~
brianpgordon
I made a PRNG with a 13-bit linear feedback shift register out of redstone
with no mods. :D
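
In software the same idea is a few lines. This is a sketch of a 13-bit Fibonacci LFSR; the tap positions (polynomial x^13 + x^12 + x^11 + x^8 + 1, a commonly tabulated maximal-length choice) are an assumption for illustration and not necessarily the taps wired up in redstone.

```python
def lfsr13(seed=1):
    """Yield successive 13-bit states of a Fibonacci LFSR.

    Tap positions are an assumed maximal-length choice; the all-zero
    state is excluded because the register would get stuck there.
    """
    if not 0 < seed < (1 << 13):
        raise ValueError("seed must be a non-zero 13-bit value")
    state = seed
    while True:
        yield state
        # XOR the tapped bits to form the new bit shifted in at the top.
        bit = (state ^ (state >> 1) ^ (state >> 2) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 12)


def period(seed=1):
    """Count steps until the register returns to its starting state."""
    states = lfsr13(seed)
    first = next(states)
    count = 1
    for state in states:
        if state == first:
            return count
        count += 1
```

With maximal-length taps the register visits every non-zero 13-bit state once before repeating, so `period()` should come out as 2^13 - 1 = 8191.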

------
IloveHN84
Recognizing facial expressions of babies born before the expected date and put
in incubators with oxygen masks. The problem is really hard and the limited
amount of video material available prevents the usage of machine learning
techniques, so old school algorithms perform much much better.

~~~
smadge
Can you elaborate more as to why you possibly would want to automate the
facial recognition of prematurely born babies?

~~~
cimmanom
Some guesses: to alert nursing staff in a preemie ward immediately (instead of
waiting for a scheduled round of checks) when a baby is in pain or hungry or
needs a diaper change, etc.

~~~
IloveHN84
Exactly, preventing stress situations which can lead to complications

------
quickthrower2
Without going into details, doing things where someone else you know will be
devastated by your decision, and may even beg you to change your mind.
Technically easier than opening a door. Emotionally like climbing a mountain.
Can take a decade to get over it.

------
zwischenzug
How to reduce time spent on incidents that were somewhat similar when managing
an SRE team. It took seven months of full-time work writing documentation to
get off the ground.

I wrote about it here:

[https://zwischenzugs.com/2017/04/04/things-i-learned-
managin...](https://zwischenzugs.com/2017/04/04/things-i-learned-managing-
site-reliability-for-some-of-the-worlds-busiest-gambling-sites/)

The hard bit of it was the leap of faith - I had to stop working on live
issues to invest the time in getting the documentation to critical mass. Then
we had to get the process right to maintain their usefulness. It resulted in
_massive_ savings.

It also inspired this, which I've just started (so very early stages):

[https://therunbooks.com/doku.php](https://therunbooks.com/doku.php)

eg

[https://therunbooks.com/doku.php?id=networking:dns-lookup-
fa...](https://therunbooks.com/doku.php?id=networking:dns-lookup-failure)

------
franze
Creating a user customizable webportal in ACT3 as my first real programming
job. An undocumented template engine that grew into a full fledged programming
language that compiled down into a perl-dialect which in turn compiled down
into native C. (ACT3 was used by just 2 companies, APA and DPA.)

Nearly every (but not all) instruction began and ended like HTML comments. It
mixed HTML with code logic and code logic with HTML. Oh, I mean DHTML. Target
platforms: IE 5.5, IE 6 and Netscape.

No stack traces, no debugging tools other than alert().

In the end I had a whole meeting room as my office as every wall was full of
printed sheets and handwritten dependency & inheritance graphs. It was a
procedural language, nearly functional but every function had mandatory
side-effects (as it started as a template engine) with object oriented
approaches sprinkled in.

The project was a major success, but as the aspect of maintenance came up we
switched to something sane: PHP 3 at that time.

These 9 months, it changed me & my brain.

------
etrautmann
Stable two photon brain imaging at single-cell resolution. Biological systems
can be hard in unexpected ways.

------
pugworthy
Get along with my wife in a positive way for both of us. Been working on it
for almost 30 years. I'm serious.

~~~
sgc
I sometimes feel I am far from even being able to start on that problem. I
think it is often life's biggest difficulty, especially if you have great
loves (for activities, places, etc) that do not coincide. Unfortunately, it is
easy for it to become a situation where nobody wins and everyone loses.

------
crb002
Proving there was a bug in a vendor's device driver on an ARM touchscreen by
stripping down X-Win to intercept system calls around glitches.

Using a mock libc to intercept everything since there was no strace.

Porting Ruby to the IBM Blue Gene/L.

Updating a DAG stored in a SQL table.

------
fitba1969
Very interesting question because once you solve it, it doesn't seem hard
anymore.

~~~
seanmcdirmid
Ya, I hate this question as an interview one. I’ve solved lots of hard
problems, but only the ones I haven’t solved yet stick around in my head. It’s
even worse if they aren’t interested in design problems, but rather technical
ones.

~~~
jamestimmins
Ha, this is actually why I thought to ask it. Sadly, "I read the docs and
wrote sample code for days until it became obvious" isn't what interviewers
are looking for.

~~~
seanmcdirmid
I know, but I debugged a lot of use cases, tried to narrow the scope of the
bug, thought really hard about the execution in my head, and then fixed it all
using an explicit semicolon....

Asking an experienced dev this question is like asking someone what was the
hardest food they had to eat and how did they manage to swallow it. What is
the interviewer hoping to learn by that?

~~~
yetihehe
It's a simple task to check that you really are an experienced developer and not
someone who stuffed his CV.

~~~
seanmcdirmid
It’s one of those questions you should definitely ask yourself before you ask
others, to check if a reasonable answer is possible or not.

------
astrostl
Hardest, IDK. Most clever, the first that springs to mind is a power supply
unit (PSU) replacement on an old HP server. It went down over the weekend, and
HAD to be back up. It also had a proprietary pin setup with more than the
standard ATX plugs. I popped in an ATX replacement, and it wouldn't go.

If you've never seen a PSU plug, it just has a bunch of cables placed into a
plastic block for socket standardization - you're plugging in a chunk of
plastic, which is physically assuring that each cable will go in place. Since
I had little to lose, I severed the plastic block on the OEM PSU so as to create
two plugs: one ATX/standard, one proprietary. I then tried firing up the
system with a standard ATX PSU in the ATX sockets, and the proprietary plugs
(still attached to the dead OEM PSU, which wasn't even connected to power) in
the proprietary sockets. And it worked! I let the system run like this with a
dead OEM PSU
plugged into it with an open case as I waited for a proper part replacement.

I'm still not sure what those cables actually did. I posted about this on
Usenet, and got 1-2 emails over the years about how it had saved someone
else's bacon. Sadly, I can't even find it after the Deja News -> Google Groups
-> offline transition of Usenet search.

------
Aweorih
My company is building a device which collects data from a car while you
drive. We wanted to group those in "trips" on a very simple logic. If the
device sends any data for the first time, start a trip and if there's none for
more than 5 minutes, close the last trip. Problem was that the source of the
time is coming from the device which had a lot of bugs (e.g. time was in the
future, time is from year 1970, time is not in order...). Management also
decided that it's too hard to fix them on the device. In the end it took me 3
full weeks +/\- to make it work. At the end I was able to convince the
management that some bugs were too serious and required fixing.
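
The basic grouping rule, before all the workarounds, can be sketched like this. It is a simplified illustration and not our production code: the `group_trips` name, the `earliest` cutoff and the use of a trusted server-side `now` to reject future timestamps are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def group_trips(timestamps, now,
                gap=timedelta(minutes=5),
                earliest=datetime(2000, 1, 1, tzinfo=timezone.utc)):
    """Group device timestamps into trips separated by more than `gap`.

    Timestamps before `earliest` (e.g. a clock reset to 1970) or after
    `now` (clock in the future) are dropped; the rest are sorted because
    the device may deliver them out of order.
    """
    plausible = sorted(t for t in timestamps if earliest <= t <= now)
    trips = []
    for t in plausible:
        if trips and t - trips[-1][-1] <= gap:
            trips[-1].append(t)   # still within the current trip
        else:
            trips.append([t])     # silence longer than gap: new trip
    return trips
```

Dropping implausible points is the crude part; it only works when enough good timestamps remain, which is one reason fixing the device clock was the better option.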

~~~
mod
> At the end I was able to convince the management that some bugs were too
> serious and required fixing

This was the hard problem you solved, right? How did you go about it?

~~~
Aweorih
A guy from the device developers and me talked with them about it. We argued
that it's probably far easier and faster to fix the problems in the device
code than to make some workarounds on server side. I think what convinced
them was that it's faster and that there were two of us, since I had already
tried it with the easier part before. On other parts we decided to not process
the data.

------
jv22222
When our education startup built out its first product in 2012 (real time
student content delivery and live chat etc on iPads) the internet in many
schools was terrible.

To counter that awful bandwidth issue we created a local server version of our
cloud based application.

In each classroom we created an intranet with one web server and a wireless
access point.

When students used the system their iPad directly connected to that local
access point and intranet (ie for chatting via node/socket.io etc.)

It did not matter if the internet went down the full system still worked
perfectly well as all the kids were operating 100% locally.

The hard part was making an entire system that could be online or offline at
any time sync with the mothership cloud. Especially hard was syncing, well,
everything. Node chat, Rabbit, MySQL, Linux updates etc.

The app allowed the teacher to drive the content via node, live chatting
between students, live answers/questions/assignments etc.

We needed to write a lot of software from the ground up that allowed for 100%
asynchronous provisioning of machines, content syncing, live chat syncing,
message bus routing etc.
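
The core pattern underneath all of that is store-and-forward: record every change locally, and push to the cloud whenever the link happens to be up. A toy sketch of that idea (nothing like our real code; `Outbox` and the `upload` callable are hypothetical names) would be:

```python
from collections import deque


class Outbox:
    """Queue local changes and forward them in order when online."""

    def __init__(self, upload):
        # upload is a callable taking one change; it is assumed to raise
        # ConnectionError while the classroom server is offline.
        self._upload = upload
        self._pending = deque()

    def record(self, change):
        """Always succeeds locally, regardless of connectivity."""
        self._pending.append(change)

    def flush(self):
        """Try to send pending changes; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._upload(self._pending[0])
            except ConnectionError:
                break  # still offline: keep the change for next time
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)
```

A change is only removed from the queue after the upload succeeds, so a plug pulled mid-flush loses nothing that was queued (a real system would also persist the queue to disk and make uploads idempotent).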

Crazy thing was, a kid could be playing and accidentally pull the plug out! We
needed to account for all sorts of shenanigans.

Essentially, it was a bit like creating dropbox that was also a full stack
architecture and a ridiculously distributed data center with servers that
could switch on or off at any time.

It worked pretty well.

The teachers couldn't understand why our app always worked lightning fast when
all the other internet stuff they did was so slow!

As time moved on, we realized that single point solutions would not solve the
main goal we were targeting in education which is to modernize all aspects of
districts.

So, we built out new software that helps districts take the 5 year journey of
moving from a traditional district to a modern learning environment.
Ironically the platform that does that is pretty simple by comparison.
Essentially it's systematized consulting.

I think we still have one district still using that old software (5+ years old
now) and we are about to sunset it.

We did that whole thing with a team of 3 devs in 12 months.

Honestly, it was the most fun I ever had coding!

------
klyrs
It was the third semester of grad complex analysis. I dread analysis to this
day... but it was worse then. I didn't know my stuff. I hadn't done well the
previous semesters; my homework scores were continuing to droop... no way that
I could pass this final. I didn't have a clear idea of what I wanted out of my
next action: to cry in my professor's office. He had some questions for me.
Asked me how I'd gotten into grad school. Not in a mean way, just, we all have
our own things that work. He gave some nice words of encouragement, and a
kleenex, but he was right to send me back to my own toolshed.

My first real math class was undergraduate differential equations. The teacher
was legendary; gave a speech on the first day saying it's normal to repeat the
class and you can drop without penalty, even after the final. I bricked his
first exam; didn't even smudge the back page with a sweaty fingerprint -- he
tells me with a tsk, my handwriting is too sloppy: I should be deliberate in
my thoughts, not rushed. So I did speed drills before subsequent exams. Later
in life, I'd write sloppier answer keys for my own students.

But you can't speed drill for a graduate course in complex analysis -- proving
an unfamiliar theorem is quite a bit different than applying a standard method
for a familiar diff.eq problem with carefully-chosen coefficients.

The problem wasn't just believing in myself -- my prof boosted me over that
fence. It was believing in abstractions hidden even from mathematical lore.
Back to undergrad: I wrote a note sheet that night. I summarized every
important theorem from the class into the most compact statement possible,
along with a cliffnote about the proof. And then came the drills, copying the
notesheet again and again, with as few peeks as possible.

The actual hard problem that I grappled with on the exam is memorable only for
its beauty. The 24h transition from paralyzed fear, to justified confidence,
was where the real work happened.

------
austincheney
A simplified universal parsing format (AST) that equally describes all
languages. It needs to allow seamless switching between different languages in
the same file and also allow recursive nesting of different languages within
each other without deviating from the simplified parse format.

The solution is written and it appears correct and viable. As in all things
related to parsers the devil's in the details in attempts to prove edge cases
with exotic tests and use cases.

~~~
bordercases
That sounds unbelievably useful, is it open source?

~~~
austincheney
Yes.

[https://github.com/unibeautify/parse-framework](https://github.com/unibeautify/parse-framework)

The demo is at
[http://prettydiff.com/parse-framework/runtimes/browsertest.x...](http://prettydiff.com/parse-framework/runtimes/browsertest.xhtml)

The backstory is at
[https://github.com/Unibeautify/parse-framework/issues/54](https://github.com/Unibeautify/parse-framework/issues/54)

Please keep in mind this is a single person's pet project, so it doesn't cure
world hunger or eliminate all disease just yet. Drop in some JSX or HTML with
embedded JavaScript to see it go.

------
golergka
Multiplayer engine for an open-world RPG that doesn't require the developer to
run a dedicated server. Think Skyrim, GTA or Kingdom Come, but with a much
smaller budget for content - aside from the very fast movement and
physics-based mechanics of GTA, this engine could've been used for any of
these games. Previous developers did the single-player logic first, and then
added very ad-hoc multiplayer sync, hoping that everything would be fine - but
if you played long enough, you inevitably ran into desyncs.

So, I had to first write the network engine itself (which was used both for
networking and load/save), and then rewrite the whole game logic to take
advantage of this mechanism. Now, if you want to write any game mechanic, you
have to define it in proper MVC terms (although, due to the amount of legacy
code, I didn't separate the view and controller as well as I should have).
Dozens, even hundreds, of small scripts for all the unique quest content and
game mechanics work on top of a unified platform, relying on it to sync their
data and one-off events (think of a "play sound" command that doesn't change
any data but has to be replicated), without even knowing which of the
connected clients runs a particular object with authority while all the
others run it in "slave" mode.
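
To make the authority/"slave" split concrete, here is a rough Python sketch.
It is only a guess at the general shape, not the engine's actual code: the
class name, the `send` callback and the message tuples are all made up for
illustration.

```python
class ReplicatedObject:
    """One game object; a single peer holds authority, the others mirror it."""

    def __init__(self, has_authority, send):
        self.has_authority = has_authority
        self.send = send          # hypothetical callable delivering a message to peers
        self.state = {}
        self.events = []

    def set(self, key, value):
        # Gameplay scripts call this without knowing who is authoritative;
        # on a non-authoritative peer the write is simply ignored.
        if self.has_authority:
            self.state[key] = value
            self.send(("state", key, value))

    def fire(self, event):
        # One-off events change no data but still have to be replicated.
        if self.has_authority:
            self.events.append(event)
            self.send(("event", event))

    def receive(self, message):
        # Replica side: apply whatever the authoritative peer sent.
        if message[0] == "state":
            _, key, value = message
            self.state[key] = value
        else:
            self.events.append(message[1])


# Two peers wired directly together instead of over a network.
replica = ReplicatedObject(False, lambda message: None)
owner = ReplicatedObject(True, replica.receive)
owner.set("hp", 10)
owner.fire("play_sound")
replica.set("hp", 99)  # ignored: this peer has no authority over the object
```

The same message stream could in principle be written to disk, which is one
way a single mechanism might serve both networking and load/save.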

I honestly tried to research whether any other game developer has used a
similar solution, but found nothing - it seems as if all open-world games that
have multiplayer rely on dedicated authoritative servers, operated by the
developer, to run. Here's the project itself, although this update is not yet
released:
[https://store.steampowered.com/app/526160/The_Wild_Eight/](https://store.steampowered.com/app/526160/The_Wild_Eight/)

Too bad that at the end of this project I got the worst case of burn out in my
career, spiraled into the worst depression episode that I've ever experienced,
and now quit the company while struggling to put my life back together. I have
no idea if I'll be able to publish anything on this tech, but I still feel
that it's been worth it.

------
rorykoehler
I solved a performance bug that was probably going to cost my company the
majority of our customers if we didn't sort it out as it made the app
unusable. It had been going on for months. None of the dev team had any idea.
APM was throwing out a bunch of red herrings. I hadn't written a line of code
in the app or a line of production Java (the codebase language) in my life. In
the end I tracked it down to a single line of configuration code. Performance
went from being atrocious to being one of the best performing apps I've ever
seen.

This was not technically "hard" as in complicated but more so hard in the
sense that it was very high stakes and the time pressure was high.

~~~
Aeolun
Sometimes selectively commenting stuff out really is the best way to figure
out what is wrong.

~~~
rorykoehler
I do this a lot but it wouldn't have helped in this case unfortunately. Also
the code base is over 700k lines which added another parameter to the mix.

------
nbclark
I once wrote an app for Windows Mobile that implemented the L2CAP bluetooth
protocol, and implemented the HID spec. This allowed phones to appear as
bluetooth mouse/keyboards and be paired. It used the accelerometer and
touchscreen to allow for cursor manipulation and had an onscreen keyboard for
typing. Took me months to figure out the Bluetooth driver and to debug the HID
issues, but when it worked, it was pretty exciting. Just dug up an old
analysis of it too:
[https://forum.xda-developers.com/showthread.php?t=504730](https://forum.xda-developers.com/showthread.php?t=504730)

------
leowoo91
Learning to throw years of work in the trash with no hard feelings.

~~~
LeonB
Hey Leo, You may be interested in Dan Ariely’s short book on Motivation, which
is based on his Ted talk here:
[https://www.ted.com/talks/dan_ariely_what_makes_us_feel_good...](https://www.ted.com/talks/dan_ariely_what_makes_us_feel_good_about_our_work/up-
next?language=en#t-13690) Part of it covers the motivational impact of having
years of work cancelled. If you’ve struggled with this, I just want to wish
you the best.

~~~
leowoo91
Just watched, great motivational content, thank you for sharing! Cake example
was really weird and still making me think.

I don't struggle as I used to, since I've learned it is about the environment
we are in (meaning, as in the video) and tomorrow is uncertain (because many
people are involved).

------
btiown442
I recovered the redacted text in this image:

[https://btcblockchain.files.wordpress.com/2015/02/e2.png](https://btcblockchain.files.wordpress.com/2015/02/e2.png)

~~~
uxcolumbo
How did you approach this problem and solve it?

~~~
Cyph0n
Not OP, but it seems like Electrum maps the wallet seed to standard English
words. The other thing that helps is that the redaction isn't complete.

So what I would do is:

1\. Estimate the possible locations of spaces (e.g., using OpenCV).

2\. "Guess" certain letters based on what is visible above and below the black
bar; these are your constraints.

3\. Look up dictionary words that meet the constraints found in (2).

4\. Loop through steps 1-3 to build a list of possible seeds.

5\. Run these seed candidates through Electrum.
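
Steps 2-4 can be sketched in a few lines of Python. This is only an
illustration of the idea: `WORDLIST` is a tiny made-up stand-in for the real
word list, and using `?` for an unreadable letter is my own convention.

```python
import re
from itertools import product

# Hypothetical stand-in for the wallet's word list; the real one is far larger.
WORDLIST = ["apple", "angle", "ankle", "amble", "table", "cable"]


def candidates(pattern, words=WORDLIST):
    """Return the words matching `pattern`, where '?' marks an unreadable letter."""
    regex = re.compile("".join("." if ch == "?" else re.escape(ch) for ch in pattern))
    return [word for word in words if regex.fullmatch(word)]


def seed_candidates(patterns, words=WORDLIST):
    """Combine the per-word candidates into every seed phrase worth trying."""
    per_word = [candidates(pattern, words) for pattern in patterns]
    return [" ".join(combo) for combo in product(*per_word)]


# Two partially readable words: "an?le" and "?able".
seeds = seed_candidates(["an?le", "?able"])
```

The number of candidate seeds is the product of the per-word candidate counts,
so every extra letter you can read off the image shrinks the search a lot.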

~~~
btiown442
Yes this is generally the idea. Didn't know about OpenCV though. Was a font-
matching and trial and error by hand in MS Notepad + some scripts to help
brute force the thing kind of solution... Turns out there are actually at
least a couple dozen possible solutions based on what is visible so some small
amount of information was lost through the redaction, but not tons.

------
chrisbennet
A few years ago I solved a computer vision problem involving getting the spin
of a golf ball as it came off the tee; a _fuzzy_ image of a ball going up to
150mph.

Because the client didn’t have a system that could truly “freeze” the ball
(“make it work with our crappy images”) I had to jump through a lot of hoops
to get an accurate spin. In the end I was getting rotational accuracy of
around 1/2 a degree I think. I remember calculating that it was 1/2 the
thickness of a credit card as measured on the surface of the ball.

------
marktangotango
I came up with a recursive layout (i.e. graph drawing) algorithm for directed
acyclic graphs (DAGs). I was trying to draw program flowcharts, and the
hierarchical layouts Graphviz creates are very ugly and unbalanced. If you
reverse the back edges of while and for loops, a flowchart is a DAG. I did a
Python implementation.
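
The back-edge trick on its own (not the layout algorithm, which isn't shown
here) can be sketched like this. It assumes the flowchart is a dict mapping
each node to a list of successors, that every successor is also a key, and
that there are no self-loops; the node names are made up.

```python
def reverse_back_edges(graph):
    """Return a copy of `graph` in which every DFS back edge is reversed."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}
    result = {node: [] for node in graph}

    def visit(node):
        color[node] = GREY
        for succ in graph[node]:
            if color[succ] == GREY:
                # Back edge: it points at an ancestor still on the stack,
                # so store it in the opposite direction.
                result[succ].append(node)
            else:
                result[node].append(succ)
                if color[succ] == WHITE:
                    visit(succ)
        color[node] = BLACK

    for node in graph:
        if color[node] == WHITE:
            visit(node)
    return result


# A while loop: cond -> body -> incr -> cond, with cond -> end as the exit.
flowchart = {
    "start": ["cond"],
    "cond": ["body", "end"],
    "body": ["incr"],
    "incr": ["cond"],
    "end": [],
}
dag = reverse_back_edges(flowchart)
```

After the reversal every edge runs from a node that finishes later in the DFS
to one that finishes earlier, which is why no cycle can remain.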

I researched graph drawing for a while and the closest algorithm to mine was a
“delta drawing”. I also found a paper about delta with force reduction steps.

Then I stopped working on program static analysis and didn’t do anything with
it.

~~~
lixtra
> If you reverse the back edges of while and for loops a flowchart is a DAG.

Depending on language there could be also end recursions that never return
or - the horror - GOTOs.

------
yuchi
Not _the_ hardest, but among them. Back to 2011~12, cross-platform mobile
application development with Titanium SDK (JavaScript).

We had this ~50K SLOC application which had several performance bottlenecks.
At the time connecting the Chrome Inspector was not a thing. There was a
half-functional profiler in the paid version of the platform, but it was
clunky and the UI simply unusable (the whole IDE was a customized Eclipse
distro).

Since at the time ES6 definitely was not a thing yet, everyone rolled their
own class system in projects. In our case the OOP library was more or less
Backbone’s, with the typical Child = Parent.extend({ ... }).

This was a godsend: I had one place for every class definition in our
codebase, and there I placed a hook which wrapped every damn method of every
class with profiling instrumentation.
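
The original was JavaScript and is long gone, but the idea translates to a
short Python sketch. Here `extend` is a made-up stand-in for the
`Parent.extend({ ... })` helper, and `totals` / `calls` are hypothetical names
for the collected data.

```python
import functools
import time
from collections import defaultdict

# Accumulated wall-clock seconds and call counts, keyed by "Class.method".
totals = defaultdict(float)
calls = defaultdict(int)


def profiled(name, fn):
    """Wrap `fn` so that every call is timed and counted under `name`."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            totals[name] += time.perf_counter() - start
            calls[name] += 1
    return wrapper


def extend(parent, class_name, members):
    """The single place where classes get defined, so the hook lives here."""
    wrapped = {
        key: profiled(f"{class_name}.{key}", value) if callable(value) else value
        for key, value in members.items()
    }
    return type(class_name, (parent,), wrapped)


Animal = extend(object, "Animal", {"speak": lambda self: "..."})
Dog = extend(Animal, "Dog", {"speak": lambda self: "woof"})
sound = Dog().speak()
```

Note that this records total time per method, callees included; separating
out self time needs a call stack on top of it.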

I then built a small profiling UI (shake the device to start/stop collecting +
profiling chart in the logs) and we were good to go.

The funny thing is that I had never actually used a profiler before that, so I
literally re-invented it from the ground up, with a colleague of mine staring
at the results and saying «what a nice profiler you built overnight!».

Only then did I start looking at real profilers and introducing “correct”
features in my own: real time, self time, flame chart.

A nice memory of simpler times.

------
afranchuk
As far as technical problems, there are plenty of optimization sorts of things
that come to mind. But I'll go with a different sort.

A while back I was working on an embedded 3D scanner at school. Part of this
was using a Freescale ARM processor, and a USB interface to transfer scanned
models back to the computer. But a weird problem was happening: only when
trying to use the USB, every once in a while the processor would have a fault,
and the fault was often different and seemingly unrelated.

After a week of constantly banging my head on the problem, I came upon a
stroke of luck. We were smart enough to have built in a programmer on the
board that supported gdb debugging (we built the circuit and populated the
PCB). While in a gdb session, I read from the code flash section and noticed
that there was a zeroed-out word (4-bytes) that definitely shouldn't have
been. More intriguing, a subsequent read was correct, or had a different word
zeroed! So the random faults happened when one of these zeroed words occurred
while code was being read (with a bad effect that depended on the
instructions).

Now that I had found the problem, the solution wasn't too far, but still took
some fiddling. Eventually I found that reducing the flash memory clock fixed
the issue. We had all the clocks set to a valid configuration according to the
documentation, and they did work fine when the USB module wasn't enabled. But
as soon as it was enabled, it apparently interacted with the flash in weird
ways.

I was glad to find a fix, and luckily we didn't necessarily need the flash to
be fast. Embedded systems (and other complex black boxes) can cause some real
frustration at times. But it's always fun, and I still found it rewarding :)

------
boyband6666
Registered just to respond to this one :-)

Working as a junior analyst at a large pharmaceutical company we had a tender
document arrive due in 2 weeks for a _massive_ amount of money. Being
generally super keen I was given the brief 'can we do anything in this time
period? If so, do it!' \- usually our models take 4 months to build, and cost
~$100k.

I used meta-analysis of our trials vs the main competitors for each age group
(children, adults, elderly), cross-referenced to the age structure of the
population, i.e. the number at risk in each group. This showed that, because
of the numbers at risk, our drug looked better.

I then included data on the harms and consequences in each group, using past
data, and added in Monte Carlo analysis around the uncertainty, which gave a
massive chance of ours being the optimum strategy.

It was only a part of the case we made, but was delivered in 10 days at a cost
of $600 (software licenses), and really helped show what I could do. People
still mention it now (it was 10 years ago). My boss at the time was also
amazing in giving me freedom, but then helping communicate it all.

I've worked on some really cool projects in my career, but that's probably the
one I'm proudest of.

------
AsyncAwait
Not the hardest and the exact technical details are a bit murky now, but super
annoying nonetheless.

I had an external disk for extra data with an APFS volume as well as an NTFS
partition for data when I was in Windows. Once I plugged it in under Windows,
it must have nuked the partition table because of an 'unrecognized' partition
or something, as I then rebooted back into macOS and noticed my APFS container
wouldn't show up.

I opened Disk Utility and saw that it was marked as NTFS too. Not my data! I
contemplated what to do and realized that it might be just the partition table
that was gone and my data might still be intact, since I hadn't actually
formatted the drive myself, or anything similar.

The problem was, this was an APFS container, not a simple partition and the
tooling was not very good in the Sierra days. So I had to manually calculate
the sector at which I then created a new APFS container, but without
formatting and indeed my data was there.

That episode ended up reducing my already little time spent in Windows to
practically nothing these days. (As primarily a Linux user, I don't spend much
time in macOS either, but Windows is forbidden on real-hardware since that
fiasco).

------
iliis
What a fun question :) And there are some very cool answers!

A few things that come to mind from my life, from back when I was still a
kid/teenager:

\- Building a three-way marble-track switch. There are many marble tracks that
have these two-way switches that alternate between two tracks, like this one
here:
[https://theworkbenchutah.files.wordpress.com/2013/04/06-marb...](https://theworkbenchutah.files.wordpress.com/2013/04/06-marble-
race-050.jpg) I wanted to build one that would evenly distribute marbles along
three tracks. Took me quite a few prototypes until I finally had a working
one. And all I had to work with was paper, tape, glue etc. No wood or metal,
no bearings or precision mechanics, only a kid's craft stuff.

\- A LEGO Technic robot leg. This is a bit hard to explain without images. I
wanted to build a legged robot (something spider-like) and needed legs that
could move a foot both forward and back and also up and down. As the LEGO
motors were quite big and heavy I didn't want to mount them on the leg itself
(e.g. at the "knee") but have them all fixed inside the body and transmit
their power via gears and shafts. The problem here was that the up-and-down
motion had to go through the forward-and-back joint, which meant that whenever
the leg moved forwards or backwards it would also move a bit up or down, even
if the up-and-down motor didn't move at all. So I wanted to decouple these two
motions. This would be trivial to do in software but I was just a kid playing
with LEGOs so I wanted a mechanical solution. I managed it by using a
mechanical differential to actually subtract one motion from the other.
Unfortunately, I then realised I didn't have enough motors for a full robot...

\- Years later I was learning C++ and I wanted some kind of "linked variables"
(I'm sure there's an official term for this), i.e. variables that would depend
on others and would update whenever one of their dependencies changed. I
thought this would be a cool way of writing a GUI. With a lot of operator
overloading and some abuse of the type system I could actually write things
like

    
    
        x = a + b * 2;
        y = x / b;
    

And 'y' would update whenever a or b (or x) was changed.
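
I no longer have the C++, but the same idea fits in a short Python sketch.
This version is pull-based (a dependent value is recomputed every time it is
read) rather than pushing updates, and `Var` is just a name I picked for the
example.

```python
class Var:
    """A value that can be combined with arithmetic into dependent expressions."""

    def __init__(self, value=None, compute=None):
        self._value = value
        self._compute = compute   # None for plain inputs, a callable for expressions

    @property
    def value(self):
        # Dependent variables re-evaluate their expression on every read.
        return self._compute() if self._compute else self._value

    def set(self, value):
        self._value = value

    def _binary(self, other, op):
        rhs = other if isinstance(other, Var) else Var(other)
        return Var(compute=lambda: op(self.value, rhs.value))

    def __add__(self, other):
        return self._binary(other, lambda a, b: a + b)

    def __mul__(self, other):
        return self._binary(other, lambda a, b: a * b)

    def __truediv__(self, other):
        return self._binary(other, lambda a, b: a / b)


a, b = Var(1), Var(2)
x = a + b * 2
y = x / b
before = y.value   # (1 + 2 * 2) / 2
a.set(3)
after = y.value    # (3 + 2 * 2) / 2
```

A GUI would more likely want the push variant, where each input keeps a list
of dependents and notifies them on `set`, so widgets only redraw on change.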

I'm not sure if they really are the hardest problems I've ever solved, but it
certainly felt so at the time ;)

~~~
_Nat_
The "linked variables" concept is often called "reactive programming"
([https://en.wikipedia.org/wiki/Reactive_programming](https://en.wikipedia.org/wiki/Reactive_programming))
or "data binding"
([https://en.wikipedia.org/wiki/Data_binding](https://en.wikipedia.org/wiki/Data_binding)).

And you're right that it's very helpful with GUI specification.

------
navinsylvester
This was 15 years ago. I signed up for a correspondence course to study company
law after my undergraduate degree (not CS). I was bored, so I wanted to pick up
a hobby, and was fascinated by robotics/automation. I taught myself the 8051 by
reading Kenneth J. Ayala, and this hobby soon turned lucrative.

I signed up with an institute which was helping college students with their
final year projects. I had a college junior (an amazing guy who is still a
buddy) to help me with the PCB fabrication, and I would handle the software
side. All was well until a project which required a heat sensor that I
couldn't source. My way of sourcing chips was via a friend in Chennai: I would
send him the list of items and he would go to Ritchie Street, get them for me
and mail them. Good old days. Sorry, I forgot the name of the chip. All I
could get was an alternative one with a single pin instead of two, which used
a different protocol. The alternative chip was delivered one day before the
girl student's viva voce. I got it working, and the examiner took the pain to
visit the lab in the institute to examine it and conduct the viva voce. Phew.

Another incident happened at a company where I was the second technical hire
and there was just me and the CTO on the tech team. The CTO had a working POC
and it was quite a complex product. Even before I could start writing a single
line of code the CTO left the company after a spat and without proper
hand-off. I wouldn't wish this scenario even upon my worst enemy. It was quite
a significant challenge to even get it running. One simple example: in the
initial days after he left, I found out that the OpenNebula VirtualBox driver
was not attaching the context disk on Windows. I had to write a patch in Ruby
and submit a pull request. I didn't know Ruby and hadn't written a single line
of Ruby code before that. But after quite a struggle, I got the code base to
start making significant money.

~~~
brianpgordon
> Even before i could start writing a single line of code the CTO left the
> company after a spat and without proper hand-off. I wouldn't wish this
> scenario even upon my worst enemy.

I know that feel. My first full-time job right out of college, the lead
developer left only a week after I joined. I individually inherited a 70k+
SLOC project (in Java, but still...) of his that sorta kinda worked, in a
domain I had no experience in. Especially as a new grad there was a massive
amount of anxiety trying to get up to speed with nobody to ask questions to.
But as the months went by I learned every nook and cranny of every class in
the codebase, and was able to make big sweeping changes. It turned out to be
an incredibly valuable lesson in soldiering on and not getting intimidated.

~~~
navinsylvester
I had loads of experience when i encountered the scenario so your story is
much more impressive. Agree that the schooling this kind of scenario provides
is harsh but an enriching one.

------
WheelsAtLarge
Just a tip for all DIYers. As you disassemble, it's always good to take
photos so you have reference points as you reassemble.

------
ttul
I once had to fix a rendering bug in the heart of Mozilla that prevented a
XUL-based code editor from working properly on Linux - this was back in 2001.
The bug was an off-by-one error about eight layers deep in the XML layout system.
It took me about two months of full time digging and nearly destroyed my life.
But I found it!

------
CodeWriter23
Getting clean off drugs and alcohol.

Oh, and debugging a DOS TSR (Terminate and Stay Resident) API for realtime
communications to mainframes and other information sources (async and X.25)
while running under Windows 386 - Windows 3.11. Every different network stack
(IPX, Netbeui, LanMan, etc.) presented a new brainfucker of a problem.

------
closed
The hardest problem I've ever solved is learning to sit down and collage.
Like, where you cut out bits of magazine and paste them onto paper, etc..

My friends do it pretty often. I find it super relaxing, and know I'll be
happy I did it, but the engineer / researcher in me wants to always have
fingers on a keyboard.

------
ransom1538
My task: Take a top 500 website, all written in Smalltalk (no documentation),
and port it to a modern scripting language. The company had: no money, a
hostile customer base, angry previous engineers, angry management, and hostile
current employees. The site generated billions of webviews and was all
unstructured (no database) user generated content.

It was grueling. There was only _one_ Smalltalk engineer that would talk to us
(the others quit, or weren't paid). Getting the dev environment running was
impossible due to license issues with Cincom. Luckily, I was able to get 3
Smalltalk endpoints that would dump XML -- using the last remaining favors we
had with the last Smalltalk dev. Using this and layers of hacks we were able
to start porting in 6 months.

The site was sold (3?) times and was ported again to something else.

~~~
throwaway20148
Ezboard?

------
tenaciousDaniel
I had to build an app where you could draw a trend line over a stock chart,
and it would set up an alert notification that would alert your phone if the
price tripped the line. The math that goes into this is absolutely insane.

I should add that I never really solved it, as the solution didn't work very
well.

------
typhonic
I convinced my wife to marry me. She was very pessimistic, thinking that I
would leave her. It took six months to get a positive response to my proposal
and another year of reassurance before we married. We have been married 41
years. The courtship was helpful experience for the marriage.

~~~
johnjohnsmith
Did her pessimism ever subside or did you just learn to accept it as part of
her?

~~~
typhonic
She still has a bit of fear and uncertainty, which I have learned to accept,
but I do not believe she is worried that I will ever leave.

~~~
johnjohnsmith
Thanks, I'm kinda in the same boat. :)

------
Dowwie
What was supposed to be a weekend project consisting of re-grouting bathroom
shower tile turned out to be a two-month side project replacement of wall from
the studs out. It was the kind of project that got worse at every step.

The contractors who built the bathroom cut corners by creating a bathroom in a
single day. They didn't sufficiently mortar each tile to the wall (back
butter) and wait a couple of days to set prior to grouting, as they should
have. As I cleared out grout between the tile using a multi tool, vibrations
agitated the tile, causing several to fall from the wall, crashing into pieces
in the bathtub. Evidently, many of the tile were hardly, if at all, mortared
to the substrate. Some tile weren't even fastened to the wall at all but were
rather being held in place by grout.

For those unfamiliar with the business of ceramic bathroom tile, the industry
does not make standard tile in perpetuity. Consequently, the odds that you
will find a match for 10 year old tile are extremely low. I could not find an
even remotely close style of tile to match against.

So, realizing the futility of this work, it seemed that I had no other option
but to remove the three walls of tile surrounding the bath area and start new.

I did all of this from a condominium unit four flights from ground. I cut tile
from the ground floor, outside, without appropriate workspace nor wet tile
cutting machine but rather an angle grinder and some buckets. If you're not
laughing, you really ought to try this some time.

Lessons learned:

    
    
        1. Never, ever, regrout anything beyond a very small area.  Avoid re-grouting entirely if possible.
        2. Buy at least twenty extra tile for any new project and store them somewhere safe for the long term.
        3. Professional contractors who tile for a living are worth their fee. Same applies for mortar and stone specialists.  
        4. DIY projects are a great distraction from cerebral day jobs but are minefields
    
    

This isn't the hardest problem I ever solved, by a stretch, but it's a horror
story I like to share.

~~~
Klathmon
I was always told to make sure you keep extra of any kind of material like
that around. A handful of extra tile, some extra wood flooring, some carpet
that we pulled out to replace with wood floor at one point, even some extra
molding just because why not?

Worst case scenario, it sits up in the attic and the 15 minutes spent taking
it up there is a waste, best case scenario it makes it so that a single
cracked tile or a deep scratch in a wood floor isn't as big of a deal. And I
have found that I pretty much always have at least some left over normally
anyway.

------
bobjordan
Back in 2008, during the start of the economic downturn, I was fresh out of
MBA studies, working as a management consultant.

I’d studied options and derivatives in my second year. From this, I further
became interested in “real options”, which provides ways to value real
decisions under conditions of uncertainty.

Anyway, we (as in the firm I worked for) were trying to figure out what was
going on with the economy, what it meant, and how we could make money off of
it advising customers. So, we had a few group ideation meetings, stuff like
that.

At this point in time in September-ish 2008, volatility in many prices for
many basic things, like fuel, even some food like milk, started noticeably
increasing. Like, fuel prices would go up and down $1 in a week.

I noticed people started delaying purchases until they absolutely had no more
time to delay. Car sales, for example, dropped dramatically. Many people
stopped buying cars because of the tangible levels of uncertainty they were
beginning to feel.

One day, a connection just clicked with me, that option valuation theory could
be really useful to help explain and predict, what was happening in the
economy.

As volatility increases, option value increases. In the real world, volatility
manifests itself as uncertainty. On a microeconomic level, the volatility
linked uncertainty gave consumers and businesses a more valuable real option
of postponement.
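
For anyone who wants to see the first claim in numbers, here is a small Python
sketch using the textbook Black-Scholes price of a European call. That is a
financial-option formula chosen purely for illustration, not necessarily the
model behind the deck, and the inputs are made up.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_value(spot, strike, rate, sigma, years):
    """Black-Scholes value of a European call on a non-dividend-paying asset."""
    d1 = (log(spot / strike) + (rate + 0.5 * sigma ** 2) * years) / (sigma * sqrt(years))
    d2 = d1 - sigma * sqrt(years)
    return spot * norm_cdf(d1) - strike * exp(-rate * years) * norm_cdf(d2)


# Same option, two volatility levels (illustrative inputs only).
calm = call_value(spot=100, strike=100, rate=0.02, sigma=0.2, years=1)
turbulent = call_value(spot=100, strike=100, rate=0.02, sigma=0.4, years=1)
```

With everything else held fixed, `turbulent` comes out higher than `calm`:
the right to wait and decide later is worth more when the future is noisier.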

On a macroeconomic level, the individual decisions of consumers and businesses
to postpone spending, contributed to a death spiral for the overall economy.

I’m no economist. But, I put these ideas together in a deck, centered around a
pitch for 3M company, showed it to some directors. They loved it, formed a
team around it, continued developing it, pitched to 3M, and it helped to sell
millions in project work.

Meanwhile, after I had made the connection, it was really hard to stop
thinking about it. It ties together a lot of branches, old schools of
economics (which, by the way, you will be flayed for questioning) with finance
(has a lot of assumptions built in, which you will also be flayed for
questioning) with human behavior.

Continuing to think about these connections, led me to quit my job at the
consulting firm, after I had made a good name for myself.

Ultimately led me to become an entrepreneur, to build a company more closely
aligned with writing real options, which is what I feel like I do, every time
we give a quote at BOM Quote Manufacturing.

------
bane
I think the hardest problems I've encountered are the ones without a specific
right answer but tend to be more something that finds a balance between many
complex and often contradictory requirements.

These are typically problems that involve interaction with very complex meat-
space concerns.

------
hguhghuff
In the early 1990’s I solved a major performance issue on a DOS based serial
server by enabling the buffers in the 16550 serial controller UART.

------
ingmarheinrich
I left my wife and my two kids of 6 and 8, because I couldn't adjust to my
role as a father. A few months later I realized this was the biggest mistake
of my life, but my wife wouldn't allow me to return - can't blame her.

For a few years I was eaten by feelings of guilt, and the only thing that kept
me alive was my two sons. Today, I still get guilt attacks, but I somehow
managed to move on. I have a good relationship with my soon-to-be ex wife
(she's the best person I ever met), and I'm seeing my sons as often as
possible.

So this basically isn't a "problem" per se, but it's the hardest thing I ever
had to overcome.

------
philpem
Teaching a team of hardware engineers moonlighting as software guys to replace
ZIP files and "version[1..n]_backup_2_reallyfinal.c" type filenames on a
network share with proper version control.

Then static analysis and CI.

Then the supportive manager left, to be replaced with a new manager (one of
the aforementioned hardware guys, "promoted out of mischief") hellbent on going
back to the "old, better way".

"Why should you get £1000 a year from capital expenditure just to keep our
static analysis licence! You should write good code to begin with! Otherwise
why did you get hired? Are you incompetent?"

Some problems are just unsolvable.

------
kabdib
I wrote a transactional object store for an early PDA that could use both
vanilla battery-backed RAM and flash media. Took about six back-to-back
hundred hour weeks (and had a surprisingly short tail of bugs).

[http://www.dadhacker.com/blog/?p=948](http://www.dadhacker.com/blog/?p=948)

Later, I hooked the store up to the NewtonScript garbage collector in a way
that let us have garbage-collected semantics for persistent memory-mapped
objects. This let us do things like "receive a FAX" on a system with not a
bunch of free RAM.

------
coleifer
In one day of interviews I variously had to use:

* Dijkstras

* Something I later learned were called powersets

* KMeans

* Dynamic programming

It was intense. I think I entered a fugue state. Got the job, too!

------
jcoffland
I wrote a 3-axis CNC simulator using a new algorithm and published it as
Open-Source. See https://camotics.org/ More recently I created an S-curve path
planner for CNCs. Also Open-Source.

------
yxhuvud
Built a package manager to transfer user selected data from one installation
of the system to another. The data was stored in a mysql database, and used
foreign keys everywhere, so installing the exported records in the right order
was necessary. Some data was local to the machines though, and should
absolutely not be exported. But exported entities need to be able to reference
these local settings, so these references were exported using a symbolic value
- the records had a name field which was used. Exportable entities included
packages, so the whole thing was recursive and all requirements also needed to
handle that.
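
The ordering part on its own is a topological sort. A minimal Python sketch,
assuming you already have a mapping from each table to the tables its foreign
keys point at (the table names here are invented, and this is not the code
from that system):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def install_order(depends_on):
    """Order tables so that every referenced table is installed first.

    `depends_on` maps each table to the set of tables its foreign keys reference.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical schema for illustration.
schema = {
    "order_items": {"orders", "products"},
    "orders": {"customers"},
    "products": set(),
    "customers": set(),
}
order = install_order(schema)
```

`static_order()` raises `graphlib.CycleError` when the references form a
cycle, which is a convenient point to fail before touching the database.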

Sounds fun, right? Well, everything should also be installed keeping the
integrity of the system intact, and if something didn't work out (say, an
entity of the same name as one that is being installed already exists), then
everything should fail in a controlled manner, the user should be given a
good error message and there shouldn't be any crap left behind in the system.

Then the people in charge realized that this is nice, but as they built a big
package hierarchy that was then installed and modified slightly in the
customer systems, it would be nice to be able to serialize, and uninstall the
local changes to packaged objects, uninstall the packages, install a new
version and reapply the changes. These should also be possible to package, so
that each customer system was possible to recreate. Supported serializable
changes were not only fields, but also the above-mentioned symbolic references
as well as whole new records.

This is easily the task that has formed me the most as a developer.

------
bpchaps
On a modded Minecraft server, I wanted to use an OpenComputers 3D printer to
print out PNGs from the internet. It had a palette system that let you choose
colors at specified dimensions, but I wanted to constrain myself to only use
colors that not only existed on existing blocks, but also in the same
position.

1\. Had to build a Minecraft computer and figure out its interface. The
computer came with git and curl commands, so that's how I downloaded files.

2\. Every block had to be pulled apart and every pixel had to be mapped to
its voxelspace dimensions.

3\. Some blocks had extra logic which caused random transparent pixels to show
up, so I had to find and exclude those from samples (flowers were the biggest
contributor). This was the bulk of the work, because there was no simple way
to find which blocks had extra logic. I even pulled out jad to find any sort
of useful pattern, but there really wasn't any.

4\. Had to do some k-map stuffs to find the closest color to each pixel from
all images.

5\. Create a file for each block, which all had to include the original
block's color for that specified dimension.

6\. Import each file one by one because the computer had limited diskspace.

7\. Place each block manually.
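
For readers curious about step 4, here is a minimal Python sketch of mapping each image pixel to the closest available block colour. It is a plain nearest-colour search by squared RGB distance, not necessarily what was actually used, and the palette names and values are invented for illustration.

```python
# Hypothetical palette: block name -> representative RGB colour (values invented).
PALETTE = {
    "stone": (125, 125, 125),
    "sand": (219, 211, 160),
    "grass": (95, 159, 53),
    "water": (47, 67, 244),
}


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_block(pixel, palette=PALETTE):
    """Return the name of the palette entry nearest to `pixel`."""
    return min(palette, key=lambda name: squared_distance(pixel, palette[name]))


def map_image(pixels, palette=PALETTE):
    """Map a 2D grid of RGB pixels to a 2D grid of block names."""
    return [[closest_block(p, palette) for p in row] for row in pixels]
```

Plain RGB distance is a crude stand-in for perceived colour difference, so a real palette match would probably want a perceptual colour space.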

It ended up working surprisingly well and could be scaled to any size. Problem
was, even looking at a printed block used a LOT of resources at both server
and client side. So I scrapped the project.

Fun stuff, especially with that being the first sort of image processing work
I've ever done.

------
dfox
As far as CS goes the hardest problem I've somehow "solved" involved designing
a program that produces reasonable approximations of optimal solutions for
TSP/VRP. The actual implementation took a weekend of wall time (which includes
several weeks of single-core CPU time), but that was preceded by two months of
reading papers with abstracts like "We present practical approach to solving
intractable problems".

------
lauremerlin
Removing from my budding startup team a highly motivated guy, bright,
experienced in big companies... without damaging his ego AND without lying AND
while being clear. If I can re-orient everyone I need to, as well as this guy
until the end, I'd be so proud and happy! And I won't ever make enemies that
way.

That was a very charged conversation emotionally, I was vulnerable and highly
respectful, but still decisive and honest.

I refrained from exposing the case I had 'against' him to make the decision to
move him from a potential cofounder role to an external advisory role.
Instead, once the decision was made, I committed to making this as good as
possible for everyone. I started bravely with the clear news, pointing out it
had been my mistake to bring him on at that point.

I had taken serious time imagining his viewpoint. I talked about all the good
he had brought to the project, for real, and all the good I valued about him,
for real, and asked him to be an advisor (yes, he's been of value in that
role, despite the hurt feelings).

~~~
lauremerlin
Oh, and I gave birth to my daughter in silence all by myself on the couch.
That was pretty intense too. While living in a squat, I also unfolded a dead
guy starting to harden, to make sure we could later lay him down in a coffin.
Life is full of strange adventures, I guess!

------
juangacovas
Helped to architect and code an MMO game when the only reference was EverQuest
;)

~~~
dfischer
I miss evercrack so much.

~~~
neuralzen
Trying to get a Glowing Black Stone drop broke me on that game. The final
straw was after camping the npc that drops it as a rare drop for days, with
like 6-8 hours between spawns (Pyzjn), she finally spawned again and midway
through casting a spell to kill her, she despawned because it was morning. :/

~~~
dfischer
That’s the thing. I miss being broken by a game. I miss that struggle. I’ve
tried to experience it again but maybe nostalgia is superior. I’m hoping for
pantheon to bring that feeling back. I genuinely miss being addicted to the
struggle and joy Everquest was. It helped mold me into who I am today and I
love it.

------
bjoli
I wrote an inliner once. It might seem like an easy problem, but a proper
inliner is a rabid animal you can just barely restrain on a leash.

The simple part is making it work without clashing. The hard part is making
inlining always be an optimisation with reasonable code growth with regards to
compilation time and cache use.

~~~
nthompson
That sounds tough. You also have a code size/speed trade off that must be
managed. Where did yours land?

------
jacquesm
Toolpath equidistant to mill around an object designed with a CAD system. Math
is not my strength and it took forever to come up with a way of doing it that
would not amplify subtle errors in the underlying model. The basic idea is
that you have an outline of what you want to make, then you have to offset
that outline by your tool radius. And preferably offset it to the _outside_
otherwise you end up destroying the workpiece. Now anywhere your original
input has overshoot (which is easy enough with a CAD package, too small to see
without zooming in in extreme detail) the direction will reverse and your
outside suddenly becomes insides.

Correcting that properly took me a long time.
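
To make the basic idea concrete, here is a minimal Python sketch of the naive offset: shift every edge outward by the tool radius and intersect neighbouring shifted edges. It assumes a simple polygon given in counter-clockwise order with no duplicate points, and it is exactly the kind of approach that misbehaves on tiny overshoots in the input; it does not include the correction described above.

```python
import math


def offset_polygon(points, distance):
    """Offset a counter-clockwise simple polygon outward by `distance`."""
    n = len(points)
    # Shift every edge outward along its normal; keep a start point and direction.
    shifted = []
    for i in range(n):
        (x1, y1), (x2, y2) = points[i], points[(i + 1) % n]
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        nx, ny = dy / length, -dx / length  # outward normal for CCW order
        shifted.append(((x1 + nx * distance, y1 + ny * distance), (dx, dy)))

    # New vertex i is where shifted edge i-1 meets shifted edge i.
    result = []
    for i in range(n):
        (px, py), (rx, ry) = shifted[i - 1]
        (qx, qy), (sx, sy) = shifted[i]
        cross = rx * sy - ry * sx
        if abs(cross) < 1e-12:
            # Collinear neighbours: the shifted start point is already correct.
            result.append((qx, qy))
            continue
        t = ((qx - px) * sy - (qy - py) * sx) / cross
        result.append((px + t * rx, py + t * ry))
    return result
```

Offsetting the unit square by a tool radius of 1 this way gives a square from (-1, -1) to (2, 2).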

------
cellis
I built a slot machine game with multiple lines, odds, payouts in unity. The
slot machine math was surprisingly complex ( calculating the odds of lines and
payouts ), and making the app performant was even more difficult ( 3D
animations and UI in NGUI ).

------
dawhizkid
Depression

------
anon2775
Convincing a department to graduate me with an ABET accredited degree without
finishing all coursework. It was significantly easier than deriving the time-
dependent wave equation for hydrogen, writing a Java-to-MIPS compiler with two
implementations, one in C++ and one in Java, and making a UNIX FORTRAN
reactor simulator compile on Windows 95 (lol!)... all with crushing,
untreated depression. Also, survived the Loma Prieta earthquake and Paradise
Camp Fire... nothing special there though.

My friends: getting an ASW account while being neither rich nor famous;
another, coming into possession of a MiG-21. /unhumble-brag

~~~
tanderson92
> It was significantly easier than deriving the time-dependent wave equation
> for hydrogen

Can you elaborate?

------
sonnyblarney
Building a hardware product from scratch, getting it manufactured, and selling
it.

~~~
hguhghuff
Tell us more! What did you build, how long did it take?

~~~
sonnyblarney
A high tech toy for pets. I have a lot of experience in software at Unicorn
telcos and also some other big names ...

But my god 'hardware is hard'.

Just some quick points for the uninitiated:

There are 'creative' aspects both to design, and to play, and then of course
issues with pets. Industrial design to make the physicality work, some high
tech secrets, a lot of prototyping on Arduino/Raspi. Hardware customization,
PCBs, optimizing for manufacturing ... getting many pieces to fit just right
together (I mean product/market mix, i.e. features, price - not just physical)
with long dev cycles, waiting for parts, trying to predict the cost of various
things at various quantities. Setting up relationships with overseas
manufacturers - which is partly a product issue ... but mostly - working
capital requirement.

If you're well funded up front, you can get 'very expensive' units built, put
them in the hand of journalists, bloggers who will write about it. Even better
- get your first production run in place and give them those to coordinate
with first sales run. But who invests in hardware startups enough to support
working capital? Not many - it's hard enough just operating on regular Angel
... but coming up with the money for the first 'x units' is a huge bet. If you
can't get prototypes in the hands of buyers, bloggers (and good ones) it
increases risk quite a lot.

So the biggest issue with factories is not so much 'getting it made' because
ultimately if you're careful you'll definitely find someone competent to make
it ... but it's how they get paid, which is related to the effect of the
'hotness' of your product. Factories are not interested in 'one off' runs,
they want something that's hot, the hotter, the better the terms. Things like
moulding etc. which drive up the initial unit costs - factories will eat the
cost of this if they really want your business. And of course paying for the
production ... if you have something white-hot or an operational history -
they might give you credit or much better terms, which makes getting to market
more accessible.

Finally there are the channel issues - buyers that want guarantees, the
ability to return product ... retailers who don't put those terms up front but
return a bunch of inventory anyhow. Just because. Every buyer wants a 'unique'
aspect to differentiate their offer, but it's hardware, so that's hard, you
have to get creative etc..

Online channels are quite fundamentally different than retail - you can sell
at higher price points online to cash flush buyers, but in-store, things are
_extremely_ price sensitive. It hardly matters what you are selling: once
you're past $30 with a retail physical thing ... it gets risky to the buyers
who want to see big ad spends, lots of social media activity, yada yada.

This latter part is why so many really cool products fail. They're just too
expensive for physical retail channels: plastic needs to have 'velocity' or
it's a big problem, and most 'cool things' are quite expensive. As soon as you
have a decent camera, or wifi, or a high end processor ... it starts to get
very hard to get prices down to a retail happy point.

And so many products tend to be 'hit driven' meaning it's hard to turn all of
that into something more long lived. Consider how many toy companies come and
go - they are so often 'betting the whole company' on every new line.

Unless you have some kind of well oiled machine with all of those things in
place, it's hard to break into - you kind of need a hit, and serious
backing/believers in order to create the initial momentum. There are a few SV
companies who've done that, kudos to them.

Very fun but very hard, and very 'business' i.e. a lot less about 'fun factor'
'design', many more raw operational concerns than a software startup.

These are just some quick thoughts, by no means comprehensive.

------
haolez
Develop a tool to boost the productivity of a software engineering team and
convince all members to adopt it. Software engineers (myself included) can be
very resistant to changes (especially when it comes to their tooling) and an
in-house tool with a single developer behind it didn't contribute to the ease
of adoption, but it was adopted regardless. I guess that the tool was indeed
very good, or maybe I'm a sweet talker :)

Note: the team was very specialized, so there really were no competing tools
around for that specific task.

~~~
emilsoman
What did the tool do?

~~~
haolez
The team had to work with tens of different embedded devices with all kinds of
weird compilers and flashing tools.

My tool would help with the context switching between projects, providing
wrapper scripts and inline documentation. It was some sort of literate
programming in bash :)

------
spo81rty
Building a profiler for .NET. Learning the .NET profiling APIs is a massive
learning curve and extremely complex. But, we figured it out! This is for
Stackify, BTW.

------
alok-g
Always-On Computer Vision [1]:

Summary: Qualcomm wanted me to devise a computer vision solution that was
nearly three orders of magnitude more power-efficient than what they had then.
There was a clear justification existing as to why such a drastic improvement
was needed. Nobody had a solution in spite of trying for a long time. Most
laughed it off as impossible. I started by looking for a proof as to why it could
not be done if it indeed could not be done. After some three months of pulling
my hair, I started getting glimpses of how to do it. Some three months later,
I could convince myself and a few others that it is doable. Some three months
later, the local team was fully convinced. Some three months later, the upper
management was convinced that the research phase was over.

Details:

I was told to reduce power consumption for an existing computer vision
solution by nearly three orders of magnitude. This included end-to-end power
consumption including camera as well as power for computer vision
computations.

No one in the industry knew how to do this. Many industry experts laughed at
the goal itself.

To make it worse, I by myself had no prior expertise in computer vision, nor
in camera design, though I knew image processing. I was assigned to the
project solely because of my brand name with the company. Being ignorant
perhaps helped since else I may not even have accepted the assignment.

I was told this is to be a five-years type research program, given the
aggressive goal.

I was told that the company was ready to change whatever was needed to make
this happen, be it the camera design (including pixel circuit), processor
architecture, computer vision algorithms, even optics and packaging.

With no starting point in hand (and in fact a false start assumed based on
half-baked ideas from one university), I had head-aches daily for more than a
month. Most of this was to learn and deep dive into the prior art.

The goal I assumed was to either solve the problem, or show a theoretical
basis for why it cannot be solved. (In hindsight, to solve a challenging
problem, try to _prove_ why it cannot be solved, since even that is usually
hard, and you may prove yourself wrong in the process by solving it!)

There was no assigned team, given that there was no solution in mind. There
were people with varied skillsets (optics, packaging, digital circuits) to
help as per need.

Three months in, I had some ideas for how it could work. Six months in, there
were three more engineers on the project working under me on a near full-time
basis. And some were complaining that we don't know what we were doing. Seven
months in, I had Excel-level calculations to show that it can work, and
actually a simpler solution than what I had in mind at the three-month point.
Nine months in, with about seven people working, the team was convinced of the
solution, as validated via rough circuit design and simulations. By the end of
the year, senior management was convinced that we have solved the problem and
the research phase is over. We had an estimated five years to solve the
problem, and the solution achieved three times less power than the target!

The solution was a mix of several new inventions (one was given gold rating by
the company) and significant power savings obtained just by careful design.
Unfortunately, I cannot talk much about the solution itself.

We started showing early preview of the capabilities (under NDA and without
disclosing how we solved it) at multiple Tier 1 companies, and saw their jaws
drop. At one company, one whispered to their team members, "Are these guys
kidding?!". At another company, someone smiled silently for half-an-hour till
we started showing the system emulator.

Soon, MIT Technology Review talked about the work [1].

I have since then left the company, so do not know much about the current
state of the project, other than additional news coverage received over the
time. Additional references are there in my LinkedIn profile.

PS: I have solved many hard problems throughout my career, and have written
about the one that gave me the most head-aches. Here is a list of some other ones [2].

[1] [https://www.technologyreview.com/s/603964/qualcomm-wants-
you...](https://www.technologyreview.com/s/603964/qualcomm-wants-your-
smartphone-to-have-energy-efficient-eyes/)

[2]
[https://news.ycombinator.com/item?id=14222590](https://news.ycombinator.com/item?id=14222590)

~~~
MauranKilom
> (In hindsight, to solve a challenging problem, try to prove why it cannot be
> solved, since even that is usually hard, and you may prove yourself wrong in
> the process by solving it!)

This is how I approach almost any hard problem if I get stuck at some point.
Attempt to prove that the next step is impossible - a missing link in the
attempted proof usually reveals some deeper structure or invariant that you
can use for solving the problem.

It's a great way to force yourself into a different perspective and distill
the problem down further.

------
effie
A deceptively easy-looking, obscure disentanglement brain teaser that I found
in a remarkable book on brain teasers from the 80's. When you start solving it,
you think "this has to be very easy". Hours and days go by, and slowly a feeling
sets in that your brain is not up to the level of this crazy problem. Then you
suddenly find the solution, no idea how. Banal stuff, but when you solve it,
you feel like a million bucks.

~~~
fmela
Could you share the puzzle?

~~~
effie
I think it was something like this:
[https://image.ibb.co/nsz4x0/puzzle.png](https://image.ibb.co/nsz4x0/puzzle.png)
The goal is to separate the string with the beads from the rubber piece. The
rubber has 5 holes in it. The beads cannot go through the hole.

------
d--b
I built an olap cube for manipulating time series of financial models, and
made an API to serve it through a query language that I created just for this.

~~~
etaioinshrdlu
It can be harder than it sounds :)

------
singularity2001
Derived the Riemann Tensor[0] on a whiteboard while tutoring. Obviously I
wasn't the first person to 'solve' it, but being relatively unprepared in
class I was shocked and very proud of my personal achievement.

[0]
[https://en.wikipedia.org/wiki/Riemann_curvature_tensor](https://en.wikipedia.org/wiki/Riemann_curvature_tensor)

------
hermitdev
Probably one of my harder ones was finding a memory leak in one of my servers.
It didn't happen often, never in development, and only leaked a fraction of a
byte per request. Turns out it was a bug in the logger: if it couldn't open a
new file when rolling the log, the logger would just queue up messages. It was
a real pain to track down as we normally had plenty of disk for logs.

~~~
fierro
out of curiosity, how do you leak a fraction of a byte

~~~
hermitdev
The fraction of a byte was amortized over the number of requests (i.e. the
mean bytes lost per request), which is what made it puzzling and hard
to track down. It only happened under specific conditions that were not
evident running under valgrind.

------
johan_larson
I once had to design a function to open a series of nodes in a displayed tree
from one given node down to another. Not so hard in general, but because of
how the system was set up I had to do it using callbacks. So tell the first
node to open itself and when done call the callback that tells the second node
to open itself and when done call the callback that tells the third node ...
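
A minimal Python sketch of that callback chain, assuming a hypothetical `Node` whose `open` method only reports completion through a callback (the class and function names are made up for illustration, not taken from the actual system):

```python
class Node:
    """Hypothetical tree node whose open() signals completion via a callback."""

    def __init__(self, name):
        self.name = name
        self.is_open = False

    def open(self, on_done):
        # A real UI would finish opening asynchronously; here it is immediate.
        self.is_open = True
        on_done()


def open_path(nodes, on_finished):
    """Open `nodes` one after another, each step triggered by the previous callback."""

    def open_from(index):
        if index == len(nodes):
            on_finished()
            return
        nodes[index].open(lambda: open_from(index + 1))

    open_from(0)
```

Calling `open_path([root, child, grandchild], done)` opens the three nodes in order and invokes `done` only after the last one reports completion.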

------
alok-g
I wrote about the hardest problem I have solved in comments on this page here:
[https://news.ycombinator.com/item?id=18478677](https://news.ycombinator.com/item?id=18478677)

Also wanted to brief about the hardest problem that I have _not_ been able to
solve, [... deleted as suggested by a comment below.]

~~~
Stratoscope
Edit: I'm glad my suggestion was helpful. I also edited this comment to remove
the reference. I wish you the best!

~~~
alok-g
I appreciate the advice. HN wasn't allowing me to delete my comment anymore,
so I just removed all details. Thanks.

------
Sharlin
Piecing together my polyamorous orientation and getting over my poor self-
esteem when it comes to romantic or sexual relationships. The journey from
being confused, lonely and depressed to understanding that there actually are
people that could be interested in me and desire similar things from a
relationship.

------
ipunchghosts
Determined the position of an underwater drone to within 1mm over a 30m span.
Not easy when competing with currents.

------
lbj
Definitely building a design language that had a minimal learning curve, yet
the expressiveness to build any website or shop imaginable. The MECE analysis
of CSS, the development of a flexible JS framework for users to mix and mash
without ever seeing the code. Took a few nights of thinking :)

------
flerchin
All of the trickiest software I ever wrote was preceded by an exhaustive
explanation to management to let me write it. In the end, the trick was
political, not technical. I could single-line-summary any of the changes and
they would sound absolutely banal to this crowd.

------
wdr1
Why Olympic tickets for the 2008 Games wouldn't print on the Chinese version
of Windows.

The short version: it came down to Java on Chinese Windows XP having a different
$CLASSPATH than the versions in the US & Europe.

The longer version:

Ticketing for the 2008 Games was done by Ticketmaster. At that point, the core
ticketing system was largely written in VAX assembler. For a long time, most
ticketing agents used a command line interface directly into the backend.
Imagine, if you can, something even more user hostile than MS-DOS & you'd be
close.

That wouldn't work in 2008. Tickets were going to be sold across China, at the
Bank of China, including some very remote locations. Obviously they didn't
speak English, nor was it feasible to train them on an English-based command
line system.

So we created a web-based point of sale. It actually went really well. We had
a set up process to make sure all the basics were in place -- the client
certificates were installed correctly, they could reach the relevant servers,
the location had sufficient bandwidth & so on. We literally caught thousands
of problems before onsale.

Months later the ticket printers landed & everyone was to make sure they
worked. For almost everyone, it failed & it failed silently. Debugging it from
the United States was a nightmare. The time difference, the language difference,
non technical people trying to relay technical information, the game of
telephone from the bank agents to our Chinese staff to our American staff to
us.

One thing to mention here is we need to print without any user interaction.
The print prompt offered too many options to screw up tickets that had to be
printed _exactly_.

So we had a Java applet to do the printing. We'd ship a PDF to the applet & it
would take it from there.

It worked flawlessly in the US. We had EU folks test & it worked. We had our
colleagues in our China office try it & it worked.

But for the Bank of China, no luck.

The Bank of China also didn't want us to have access to the machines out of
security concerns. They were on the other side of the teller wall & we
couldn't get on the other side, despite lots of escalations.

Eventually they brought a machine to the glass, turned the monitor so we could
see it, and connected a keyboard through the slot at the bottom used to pass
money back & forth.

We quickly found out the applet was crashing for them. It took the better part
of a day, but eventually we figured it out: The $CLASSPATH was different. On
top of that, applets could request classes remotely. With the different
classpath it couldn't find one of the classes (I believe it was font related),
would fall back to asking the server the applet was served from. If the server
had returned a 404 that would have been fine, but since this wasn't expected &
this server had other uses, it first wanted to authenticate the user. And to
authenticate them, it would 302 to a login page (which Java would follow),
that would return a 200. But instead of it being a .class file, it was the
HTML of a login page.

Java, thinking the HTML was a .class file, would barf & crash.
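
The failure mode above, an HTML login page being parsed as bytecode, can be caught with a cheap sanity check: every valid Java class file begins with the four magic bytes 0xCAFEBABE. A minimal Python sketch of such a check (the function name is made up for illustration; this is not the code we actually ran):

```python
CLASS_MAGIC = b"\xca\xfe\xba\xbe"  # first four bytes of every Java class file


def looks_like_class_file(payload: bytes) -> bool:
    """Return True if `payload` starts with the Java class-file magic number."""
    return payload[:4] == CLASS_MAGIC
```

A check like this on whatever the server returns for a `.class` request tells a login page apart from real bytecode immediately.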

Mystery solved.

There wasn't time to get all the machines to change the .class file, so we had
the server 404. Everything worked from there.

We later found out our office in China had the US version of XP, but several
people had the language set to Chinese. When they purchased the "China"
version of XP, we had no problem reproducing the problem. It was
somewhat moot, but we wanted to be ready for other problems.

This was some time ago, so my memory might be off on some of the details, but
that's the best as I remember it.

------
ezconnect
I worked on an online test with different types of questions, including video
viewing.

The hard part was that video viewing progress had to be tracked by the system,
so I had to implement my own video server and player. YouTube was new and I
wanted to use it, but I couldn't track viewing progress there.

------
Const-me
Recently, it's slicing of triangle meshes for a DLP 3D printer. The result is
FullHD or 4K * up to 16x16 for anti-aliasing * a few thousands of Z layers =
~1E+12 voxels, while users only have ~4E+9 bytes of RAM to store them.
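
A quick back-of-the-envelope check of those numbers in Python. The layer count of 2000 is an assumption standing in for "a few thousands", and the function name is made up:

```python
def voxel_count(width, height, aa, layers):
    """Voxels for a width x height slice, aa x aa sub-samples per pixel, `layers` slices."""
    return width * height * aa * aa * layers


LAYERS = 2000  # assumed value for "a few thousands of Z layers"

full_hd = voxel_count(1920, 1080, 16, LAYERS)  # about 1.06e12
four_k = voxel_count(3840, 2160, 16, LAYERS)   # about 4.25e12

# Even at one bit per voxel, the FullHD case alone needs far more than 4e9 bytes.
full_hd_bytes_at_one_bit = full_hd // 8        # about 1.33e11 bytes
```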

------
superasn
Automatically summarising a 550 word article into 2 to 3 lines using machine
learning.

Tried many libraries and even SaaS solutions like monkeylearn, etc., but couldn't
get it to the required quality.

Had to switch to a Mechanical Turk-like solution in the end :/

~~~
jcoffland
This could be the summary of 90% of the "AI" startups these days.

------
thomasfedb
I wrote a system that managed class assignments for a school.

Of course `Class` was a reserved constant and it took most of a week before I
settled on `TeachingGroup` as a feasible alternative.

You can't beat naming things for sheer hand-wringing difficulty.

------
kleopullin
Maybe my first program in Pascal, my first programming course.

The school had never taught programming before (yeah, I'm that old), and it
was a business teacher who didn't know how to program or use a Mac, and the
incomplete syllabus was for a class from another school that used a Honeywell
and punch cards.

The first Pascal program was to write a calculator that allowed a user to
input arithmetic problems in words and get the correct answer.

>five plus seventeen

22

>two hundred fifty six divided by three

85 r 1
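
For a sense of what that assignment involves, here is a minimal sketch in Python rather than Pascal. It handles only whole numbers below a million and four operators, and does none of the spelling-error handling mentioned further down; all names are made up for illustration.

```python
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}
OPERATORS = ("plus", "minus", "times", "divided by")


def words_to_int(text):
    """Convert e.g. 'two hundred fifty six' to 256."""
    total = current = 0
    for word in text.split():
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word == "thousand":
            total += current * 1000
            current = 0
        else:
            raise ValueError(f"unknown number word: {word!r}")
    return total + current


def calculate(problem):
    """Evaluate '<number words> <operator> <number words>' and return a string."""
    text = " ".join(problem.lower().split())
    for op in OPERATORS:
        left, sep, right = text.partition(f" {op} ")
        if not sep:
            continue
        a, b = words_to_int(left), words_to_int(right)
        if op == "plus":
            return str(a + b)
        if op == "minus":
            return str(a - b)
        if op == "times":
            return str(a * b)
        quotient, remainder = divmod(a, b)
        return f"{quotient} r {remainder}" if remainder else str(quotient)
    raise ValueError("no operator found")
```

With this sketch, `calculate("five plus seventeen")` returns `"22"` and `calculate("two hundred fifty six divided by three")` returns `"85 r 1"`.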

The professor hadn't taught us strings and the textbook Pascal was different
from the Mac.

However, the school required me to maintain minimum credits to stay in a
welding program I needed for a raise at work, so that by the time I realized
how bad the class was, it was too late to drop.

I welded all day at school, programmed for 1-2 hours, went to class, then
worked on the program all night at libraries at the UW. One of the UW students
complained to a librarian about my smell from the welding; he knew I was
junior college scum.

It seems ridiculous today, but it took me a week to pull it together. I
sometimes cried from the frustration. I wrote everything out longhand on
yellow legal pads at the UW libraries, welded 8 hours, then computer lab, and
class, bus back to the UW. I was struggling in my arc welding class, too. I
couldn't get the feel of striking an arc, and the men helped each other, but
initially refused to help the only woman in the class.

My first solution was clumsy, but I got all the components to run, figured out
reading strings, input/output, handling spelling errors. The text had a
rigorous theoretical description of top down programming, and after I got the
syntax and op system nailed, I used that to write an elegant and robust
algorithm. I didn't realize until years later what I was doing, writing an
algorithm.

Then late one night, library almost closing, I had it! Next day I entered the
hand written program into a Mac at school, and it ran perfectly.

It was the solving, not the problem that was hard, and I bet most here can't
imagine all the missing components I had to find for myself, but the lack of
tools, prior knowledge, and information made solving it almost impossible.
Today, I could Google all the help I needed for a zero-to-program in two
hours, but I'd never feel the level of victory as I did that afternoon when
all my classmates entered arithmetic problems into my calculator and got
answers.

------
dumbfoundded
I went from a software engineer to starting my own CBD company. Most of what I
do is sales now. Making the transition has been very difficult. It's a
completely different world.

~~~
uxcolumbo
This sounds interesting.

What have you learned and what was difficult?

Capital?

Good sources for CBD or are you producing everything from scratch?

~~~
dumbfoundded
The biggest thing I've learned with sales is that the logic of an engineering
mindset doesn't apply. With sales, you have to understand people. This sounds
sort of basic and I wish I could communicate it better but it's true. Sales
are all about social proof, making your customer feel special, and dealing
with people problems.

For capital, I've self-funded from working in the software industry in San
Francisco.

For CBD sourcing, initially, I was buying CBD to create products. This was a
good idea to test products but a bad way to grow a business. Since realizing
this, I've moved everything to Southern Oregon and partnered with a hemp farm.
The combination of my understanding of the internet plus his high quality CBD
rich hemp has allowed me to expand into other products while dropping my
prices by 50%.

------
hoodwink
One time I figured out how to double integrate acceleration from a MEMS
accelerometer to get displacement/distance. I did this for a fitness tracker
for lifting.
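
The core of double integration can be sketched in a few lines of Python using the trapezoidal rule. This assumes evenly spaced samples with gravity already removed and zero initial velocity; it leaves out the hard part in practice, which is that sensor bias and noise make the result drift quickly. The function names are made up for illustration.

```python
def integrate(samples, dt):
    """Cumulative trapezoidal integral of evenly spaced samples, starting at 0."""
    result = [0.0]
    for previous, current in zip(samples, samples[1:]):
        result.append(result[-1] + 0.5 * (previous + current) * dt)
    return result


def displacement(acceleration, dt):
    """Integrate acceleration twice: first to velocity, then to position."""
    velocity = integrate(acceleration, dt)
    return integrate(velocity, dt)
```

As a sanity check, a constant 2 m/s^2 for one second should give roughly 1 m, matching x = a*t^2/2.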

------
sky_projektor
Building healthy responses to mental ailment among some people who previously
had no idea how such ailments manifest or affect people. The effort is still
on!

------
kkielhofner
As usual, debugging. The Intel 82599 Ethernet controller issue (featured here
on HN and elsewhere) is easily the hardest issue I've ever worked through.

------
andersthue
My being stuck in moving forward with those areas of my life that are really
important.

------
james_s_tayler
Importing an Oracle database.

------
brianpgordon
Everyone loves a good bug hunt story so I'll share one of the hardest
_technical_ problems I've ever solved, and certainly the one that makes the
best story. Strap in; this is a long one.

The year was 2015 and the Christmas break was fast approaching. Unfortunately,
while the rest of the office thinned out, another engineer and I were stuck
debugging an increasingly urgent production issue. What had started weeks
prior as some random intermittent failures in a few of our microservices had
slowly escalated into a crisis where more and more services were experiencing
failures. We didn't have great monitoring back then to trace any given request
through the system's various microservices and figure out where the
bottlenecks were - all we knew was that lots of requests were getting backed
up _somewhere_.

We soon found Nagios metrics indicating that one particular critical service
on a few boxes had been seeing steadily increasing CPU usage over the past
days and weeks. It had reached a point where the service, which is normally
heavily IO-bound, was actually now CPU constrained. Our suspicions therefore
quickly centered on this service. Failures here could very well lead to the
cascading failures we were seeing across our system. A small but increasing
percentage of requests to this service were timing out. This made upstream
services time out, which in some cases made _their_ upstream services time
out.

Once we had sorted through the chaos of cascading failures, we were pretty
sure that this one critical service was the root cause of all of the trouble,
so we restarted it, one slave at a time so as not to cause downtime for the
whole product. Each instance came back up 100% healthy, with completely normal
CPU usage. Odd. But sure enough, within a day, CPU usage was spiraling out of
control again.

We knew the problem would just come back again if we kept restarting it, so we
enabled JMX on the JVMs and attached VisualVM to take some thread dumps and
run the profiler. After plenty of head-scratching at the stack traces and
close examination of the code, we finally figured out what was going on...

One of our developers had helpfully provided an implementation of
java.io.OutputStream for writing data back from the server to the client. The
one thing you should know about OutputStream is that it's a _blocking_
interface - if you write data to it, the data is written, and if there's a
failure then it should throw an exception right then and there, before the
method returns, so that the caller knows there was a failure. The one problem
with this is that our Java services were based on Netty, which is based on
java.nio, which is asynchronous. When you write to a Netty channel, you don't
get feedback right away on whether the data was successfully written to the
underlying socket. Instead, you get a java.util.concurrent.Future which will
_eventually_ tell you whether the write succeeded. It should be obvious that
there's a major impedance mismatch in trying to implement a blocking I/O
interface using nonblocking I/O primitives. Our developer had decided to
handle this by kicking off the I/O and then simply _completely discarding_ the
Future that the write call returned!

What would happen is that sometimes a client would disconnect while we were in
the process of returning a response, but the application would never find out
because it never checked the status of those discarded Futures. So the
application would happily keep streaming data through this OutputStream back
to the client. Every time the buffer was flushed, the data would make its way
through the Netty pipeline all the way to the bottom, where the write would
fail. This generated a rather large stack trace. This stack trace was written
to disk - and because it was written directly to standard error rather than
via the normal logging infrastructure, we never saw it. But it was being
written nonetheless, and every flush of the buffer would cause a new stack
trace to be generated and written out. It turns out that this is a rather
expensive thing for the JVM to do in a tight loop for dozens or hundreds of
concurrent connections.

Our solution was to do the obvious thing that should have been done in the
first place and check the results of the damn futures! Every API in the
service was depending on there being a blocking OutputStream to write data to,
and we didn't particularly want to do a major refactor over to async I/O and
push out such a high-risk change right before the Christmas break. So we made
a seemingly-harmless change, very minor, which should have fixed the issue and
cleared us for a well-deserved vacation. Where before the code was letting the
Future fall out of scope unused, now we made it block on its result, so that
it could throw an exception to stop the application if there was a failure.
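
For anyone who wants to see the shape of that change, here is a minimal Python
analogue of the two versions of the write path. It is not the original
Java/Netty code: `async_write`, `FireAndForgetStream`, `BlockingStream` and
`stream_response` are hypothetical names, and `concurrent.futures.Future`
stands in for the future that the channel write returns.

```python
from concurrent.futures import Future


def async_write(data: bytes, peer_connected: bool) -> Future:
    """Stand-in for a non-blocking channel write: returns a Future at once."""
    future: Future = Future()
    if peer_connected:
        future.set_result(len(data))
    else:
        future.set_exception(ConnectionResetError("peer disconnected"))
    return future


class FireAndForgetStream:
    """Buggy variant: the Future is dropped, so failures are never seen."""

    def __init__(self, peer_connected: bool) -> None:
        self.peer_connected = peer_connected

    def write(self, data: bytes) -> None:
        async_write(data, self.peer_connected)  # result silently discarded


class BlockingStream(FireAndForgetStream):
    """Variant that waits on the Future and surfaces failures to the caller."""

    def write(self, data: bytes) -> None:
        # result() blocks until completion and re-raises the stored exception.
        async_write(data, self.peer_connected).result()


def stream_response(stream, chunks) -> int:
    """Write chunks until done or until the stream reports a failure."""
    written = 0
    try:
        for chunk in chunks:
            stream.write(chunk)
            written += 1
    except ConnectionResetError:
        pass
    return written
```

With a disconnected peer, the first variant happily writes every chunk, while
the second stops at the first failed write.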

When we deployed this fix and restarted the servers, the CPU usage remained
normal. We breathed a sigh of relief. Then, a few hours later, things got
interesting.

On one of my monitors I happened to be tailing the logs on one of the servers
and noticed, all of a sudden, a cascade of these messages flowing down my
terminal:

    
    
        WARN  c.s.j.rep.utilint.ServiceDispatcher - Server accept exception: class java.io.IOException : Too many open files
    

Sure enough, netstat showed thousands upon thousands of open TCP connections,
enough to exhaust all of the file handles that the Linux kernel was willing to
allocate to the JVM.

netstat reported that these connections were almost all stuck in a CLOSE_WAIT
state. What the hell did that mean? I had taken a couple of networking courses
in college, one lab course from a tech's perspective and another programming
course from an engineer's perspective, so I was pretty handy with the tools
and the general theory. I went home to get a textbook and found the TCP state
diagram:

[http://www.ssfnet.org/Exchange/tcp/Graphics/tcpStateDiagram1...](http://www.ssfnet.org/Exchange/tcp/Graphics/tcpStateDiagram1.gif)

We took some packet dumps with tcpdump to make sure that the client wasn't
misbehaving. It wasn't. After noodling over the packet dumps, the state
diagram, and RFC 793 for a bit, it became clear that the client was sending a
FIN segment, but our application was never acknowledging that by calling close
on the socket. The TCP stack would hold the connection open until it reached a
timeout, at which time it would close the socket for the application. But the
application quickly supplied more stuck sockets to replace the ones killed by
the networking stack.
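
A small sketch of the same situation on loopback, assuming a plain Python
socket server rather than our actual service: the client closes first, the
server sees the FIN as an empty read, and the descriptor stays allocated until
the server calls close itself. The address and variable names are arbitrary.

```python
import socket

# Listen on an ephemeral loopback port (the address is arbitrary).
server = socket.create_server(("127.0.0.1", 0))
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()
conn.settimeout(5)

# The client closes first, so its kernel sends a FIN segment.
client.close()

# recv() returning b"" is how the application learns the peer sent FIN.
data = conn.recv(1024)
peer_closed = data == b""

# Until close() is called here, the kernel keeps this end in CLOSE_WAIT
# and the file descriptor stays allocated.
fd_before_close = conn.fileno()
conn.close()
fd_after_close = conn.fileno()  # -1 once the descriptor is released
server.close()
```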

Christmas Eve arrived and I needed to get on a flight back to the East Coast
to visit family. We decided that over the break we'd conduct rolling restarts
of the servers to clear out stuck sockets before they reached the maximum, and
pick the problem up after the new year.

After the holiday, we were able to locally reproduce the issue by introducing
a lengthy sleep stage inside our Netty pipeline. We went through the Netty
library's source code line by line to see exactly what was happening. As we
picked through the code one of us stumbled upon the following Javadoc comment:

[https://github.com/netty/netty/blob/6e840d8e62e98590e129ab6f...](https://github.com/netty/netty/blob/6e840d8e62e98590e129ab6f21e2deba46bfcacf/transport/src/main/java/io/netty/channel/ChannelFuture.java#L88)

Thank Christ for the Java community's pathological love of absurdly prolix
Javadoc. That footnote broke the case wide open. If you don't see the issue
yet, I'll lay it out... in the next comment - HN won't let me post the whole
thing in one comment.

~~~
brianpgordon
* Netty has a boss thread that accepts incoming connections and assigns each successfully-opened socket to a single particular I/O thread.

* Each I/O thread runs an infinite loop that repeatedly waits for activity on its assigned sockets (using epoll/kqueue/select) and runs each received TCP segment through our Netty pipeline on that thread.

* Usually when an I/O thread writes to one of its own sockets, the write takes place synchronously. However, crucially, it may defer at least part of the write until later, for example if the kernel’s buffer is full. The write would then be performed on a later loop when the selector (epoll/kqueue/select) reports that the socket is ready for writing.

* We were writing to a channel and then blocking on the result Future to see if it succeeded. Since we were writing from the I/O thread, the write would _usually_ be performed synchronously so the returned future would already be complete and the application would continue. However, sometimes the write wouldn’t fully complete and the future could not be completed until the next time around the selector loop. But the I/O thread was blocked, so the loop couldn’t proceed, so the future would never complete.

* Since the I/O thread was stuck, it couldn’t respond to any further messages from any of its managed sockets. Eventually the client would give up and send a FIN segment, and the kernel’s TCP/IP stack would put the socket in the CLOSE_WAIT state. Usually the I/O thread handles this in its worker loop by calling close on the socket, but it was stuck so this code never ran and the socket would never close.

* The boss thread was still running, so the system continued to accept new connections and assign some of them to the stuck I/O thread.

To add insult to injury, Netty actually has a deadlock detector to prevent
this scenario from occurring. Unfortunately, one of the libraries that we were
using at the time is also based on Netty, and they explicitly disabled the
deadlock checker, globally!

[https://github.com/AsyncHttpClient/async-http-client/commit/...](https://github.com/AsyncHttpClient/async-http-client/commit/b4a6dbd103ec611f70879cfc78fc5cce5c63a4e6#diff-f86cd91228f3cde51e071f609d997d5bR205)

So there you have it. Don't block in your Netty worker threads. This should
have been obvious at the time - I had worked with Netty before - but we were
too busy investigating to sit back and just _think_. If I had stood in front
of a whiteboard for an hour, I suspect I could have worked it out, but instead
we spent anxious days tearing our hair out attacking the problem by stepping
through multiple threads of code in a debugger (racing timeouts is always fun)
and taking endless stack traces and thread dumps.
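
The deadlock described in the bullets above can be reproduced in miniature.
This is a Python analogue, not Netty: a single-worker `ThreadPoolExecutor`
plays the role of one I/O thread, and `deferred_write` and
`handler_blocking_on_own_loop` are hypothetical names. A timeout is added only
so that the demo terminates.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# One worker thread plays the role of a single I/O (event loop) thread.
io_thread = ThreadPoolExecutor(max_workers=1)


def deferred_write() -> str:
    """The part of the write that can only run on a later loop iteration."""
    return "flushed"


def handler_blocking_on_own_loop() -> str:
    """Runs on the I/O thread and waits for work queued on that same thread."""
    pending = io_thread.submit(deferred_write)
    try:
        # The only thread able to run deferred_write is the one waiting here.
        # The timeout exists so the demo terminates; without it this call
        # would never return.
        return pending.result(timeout=0.5)
    except FutureTimeout:
        return "stuck"


outcome = io_thread.submit(handler_blocking_on_own_loop).result()
io_thread.shutdown(wait=True)
```

Here `outcome` ends up as `"stuck"`: the deferred work only gets to run after
the handler gives up and frees the thread.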

The short-term mitigation ended up being trivial: we could still block the
worker thread; we just had to check to see if the channel was already closed
first! This is not a perfect fix, but it let us preserve the illusion that the
application code had a real bona-fide blocking java.io.OutputStream. The long-
term solution is to switch the API handlers over to using real async I/O.

To this day, my fellow engineer investigating this issue refers to this saga
as the Christmas Miracle bug. :)

------
kevin_nisbet
Here's one that stands out from my previous career in cellular telecom.

The story goes something like this: our engineering team purchased a
vendor-supported BIND solution to replace our unmaintained Linux BIND servers.
Sometime after that, our GRX provider (one of the companies that connect
various cellular networks together) had one of their DNS servers down for
replacement, and an outage on 1 of 2 working DNS servers. Our problem was that
we didn't fail over to the last working server, and all inbound roaming (users
from other countries coming to Canada) was down (cached stuff still worked
until the caches started expiring).

In cellular roaming for IP services (circuit switched voice works
differently), the routing per APN back to the home network is done using DNS.
However, in investigating this and similar problems, I eventually found that
this wasn't fully standards compliant. And even when talking to our DNS vendor
about this problem, it was difficult because they weren't familiar with the
quirks of the GSMA recommendations.

The outage was eventually recovered, but the question remained: why did it
happen?

This took place over a 2 or 3 week period, and I don't remember the exact
sequence of events, conference calls, etc. now that it's been several years.
But I do know I ended up building a simulation of 3 networks (my own, the GRX
provider's, and a roaming partner's network), using dig to datafill each
server with the exact response from the partner. This allowed me to control
the startup / failure of each server across the simulation of the 3 companies.

Using this simulation I eventually found the root cause.

When BIND started, it would take our root hints, load them into its cache,
and begin to perform AAAA queries for the root servers. One of the upstream
servers would respond with 0 records, the other would respond with NXDOMAIN.
The server that responded with NXDOMAIN would subsequently get deleted from
our BIND server's cache, and would no longer be used as a root.

The next question was why?

After some sleuthing through the DNS RFCs, I eventually found the answer.
There are two ways for a DNS server to indicate that an answer to a query
doesn't exist: returning 0 records, and returning NXDOMAIN, and they have
slightly different meanings. Returning 0 records means that the label (think
example.com) exists, but the type of record does not (AAAA doesn't exist, but
A/SRV/TXT/etc. might). Returning NXDOMAIN means that the label doesn't exist
for any type of record, so don't bother querying me again for a different
record type (there may have been some vagueness around this, I don't
remember).
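
As a rough sketch of that distinction, assuming only the response code and the
answer count are inspected (a simplification: a real resolver also has to tell
referrals apart from empty answers), with `classify_negative_answer` being a
hypothetical helper:

```python
NOERROR = 0   # RCODE 0 in the DNS header
NXDOMAIN = 3  # RCODE 3, "Name Error"


def classify_negative_answer(rcode: int, answer_count: int) -> str:
    """Label a DNS response by what it says about the queried name."""
    if rcode == NXDOMAIN:
        return "name does not exist (any type)"
    if rcode == NOERROR and answer_count == 0:
        return "name exists, no records of this type"
    if rcode == NOERROR:
        return "answer"
    return "other error"
```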

The second discovery was that we had a typo in our configuration: what we
configured as the name of that root server didn't match what our GRX provider
had configured, which is why we were getting NXDOMAIN on one but not all of
the servers we had configured as our roots.

The next question was: why were our old servers working? This typo was
actually duplicated from our older servers... which still worked during that
outage.

So using my simulation, I tested every version of BIND released across
something like a 3 year period, until I found it. Older versions of BIND
interpreted NXDOMAIN the same as a 0 record answer, and at some point, I can
only assume, they fixed a bug, which changed this interpretation of NXDOMAIN.
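
A toy model of the behaviour change as I have described it, not BIND's actual
logic: `surviving_roots`, the `nxdomain_evicts` flag and the server names are
all made up for illustration.

```python
from typing import Dict, List


def surviving_roots(responses: Dict[str, str], nxdomain_evicts: bool) -> List[str]:
    """Return the root servers still in use after the AAAA lookups.

    responses maps a configured root name to "nodata" or "nxdomain".
    """
    kept = []
    for name, reply in responses.items():
        if reply == "nxdomain" and nxdomain_evicts:
            continue  # newer behaviour as described: the server is dropped
        kept.append(name)
    return kept


# Hypothetical names; the second stands in for the mistyped root entry.
replies = {"root-a.example": "nodata", "root-b-typo.example": "nxdomain"}
old_behaviour = surviving_roots(replies, nxdomain_evicts=False)
new_behaviour = surviving_roots(replies, nxdomain_evicts=True)
```

Under the old interpretation both roots survive; under the new one only the
correctly named root remains, so there is nothing left to fail over to when
that one goes down.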

Yay for not finding out your redundancy doesn't work until it's actually
triggered.

Anyways, there were many challenging issues like this in my telco days. This
probably isn't close to the hardest, but hopefully made sense to those who
don't have a background in telco standards.

------
fuzzfactor
Once I recover from natural disaster that will be it.

------
zipotm
To know the truth about the world we live in.

------
mdip
Wow, so many of these sound so much better than mine, but I'm going to share
anyway.

We had an audit requirement to gather all AD accounts (computer, user, group)
with group memberships in all domains[0], discover all servers, login and
acquire their local user/group ACLs/config (win/linux), discover all databases
(MSSQL, MySql, Postgres and Oracrap), acquire all of the ACLs in the
databases, themselves, and a variety of application-specific credentials. This
had to be completed once every 24-hours in order for other compliance
operations to complete in time. We purchased a system to do this[1]. It
managed to hit _all_ AD domains in 4 days. A few more and it could get the
Windows server account ACL information. We gave up adding anything else and a
coworker took it upon himself to rewrite portions of it in a scripting
language, which due to his shocking abilities in said language, netted Active
Directory environments in 24-hours.

I wrote a plug-in based system that was as lockless as could be with a thread
scheduler which spawned old-school threads, increasing count when capacity was
available, decreasing when it wasn't. This had to be done because the options
available to me in C# 2.0 wouldn't reach the required 24-hours and I had to
substantially beat it if they were going to let me build this thing out. When
I was done with the MVP, I was handling all of AD in under 4 hours.

The component that was going out and collecting data was the slowest and most
difficult to optimize, partly because we'd flood a link that was "pretty beefy
for a link from the US to Argentina/Brazil/Venezuela" (but that's not saying a
lot). So I split things up and made the processor into two components: an
authenticated web service, which handed off to a processor when it received
anything. I ended up using the oldest, most crotchety method to communicate
between the two apps because the web service ran as a down-level account and
the processor ran with incredible permissions (at least on the read-side).

It was horrible on every level. The scope was massive - more than 20,000 boxes
with the desire to even collect file system ACLs at some point. The
implications of screwing something up on the security side were daunting --
the firewall rules and layer upon layer upon layer that went into securing
each of these components (not just on the network, but isolating them as much
as possible using built-in OS ACLs and rules). Doing threading ... correctly
... in C# is easy to screw up and I think I managed to actually _witness_
every single thing that goes wrong when you fail to protect shared data that's
mutable. The available options for thread scheduling in .Net and Windows
didn't fit well with my use case (I tried several with tweaked parameters in
an attempt to bend them to my will). I ended up having to create a state
machine to monitor and compare various counters in the application to decide,
upon receiving a request, if it would improve speed to spin up a new thread in
a thread pool that I also hand-rolled. These were _certainly_ naive
implementations, but I could find no other way to make this work the way it
did. At the end of the day, I was able to scan the entire set of required
environment components _and_ grant, revoke, or apply a custom permissions rule
to anything I could scan. Rather than schedule it to run daily, I just had it
continue to repeat after it completed, meaning the completion time ranged from
about 3-hours to about 12-hours depending on traffic/server loads. This meant
when a new hire started, they received access around 4:00 AM to every system
in the entire company that their job required, and when they left, their
access was cut _immediately_ , _everywhere_ , due to a priority system I
_also_ added to the system.

This project came with a _mountain_ of politics. Two large companies, where
one purchased the other and I was in a team on "the other" ("the other",
thankfully, had said contractual requirements so the company who purchased us
had no prior knowledge on how to do something like this). They strongly
resisted rolling our own solution in this manner. In the end, it performed so
ridiculously well, and was so _easy_ to build a new "module" for (compared
against the off-the-shelf product that we had both, coincidentally, purchased)
that it was kept, upgraded, much of the threading complexity replaced with new
features available in .Net 4.5 and _hopefully_ a lot more (I left when .Net
4.5 was released).

[0] Accurately ... 14 domains with varieties of trusts, various
misconfigurations, one case of three domains in the middle of migration to a
single domain with the fun that SID history injects. sIDHistory alone
represented a month of debugging to get it to return the correct account:
Alice in Domain "A" is a member of Group "1" with her account in both Domain
"B" and "A". Abstracted, Alice's two accounts are supposed to be seen as one
account. Higher-level AD libraries will return _either_ account regardless of
what is joined to the group (there's ways to coerce it to return a specific
domain, but not a way to coerce it to return the correct one without using a
more painful library).

[1] It was one of those solutions where you buy a sort-of framework and pay
the company to code modules for it that are custom to your environment in a
DSL that is custom to the application. It works as well as you'd imagine.

------
me551ah
Missing notifications for Android. I was the lead for Flock Android, a team
messaging app which competes with the likes of Slack. We started receiving
some weird complaints from users about missing notifications. Since it was a
business messaging app, notifications were critical. We tried everything to
reproduce this issue, including bombarding devices with notifications and
trying out different networks, but we just could not reproduce it. Worse, we
had a couple of incidents with devices in our team but logs showed absolutely
nothing. Google GCM said that the notification was sent but there was no trace
of the notification in the app's logs; it was as if the OS was swallowing
notifications. And somehow popular apps like WhatsApp and Facebook were immune
to these issues. The final nail in the coffin was the CEO himself missing
notifications on some occasions. Google was unhelpful as usual with their
nonexistent dev customer support. We were able to find a bunch of support
articles on the internet where other apps had essentially listed steps to turn
off battery optimizations, and that seemed to work (interestingly, WhatsApp
and Facebook are usually added by default in these lists). But there was no
way to do it automatically and we could only do it in response to a support
ticket, which is painful for users.

After a couple of days we were able to encounter a few missed notifications on
a OnePlus device which we had connected to a machine taking system logs. And
voilà, we got a few lines showing what was happening.

01-12 09:05:23.649 2719 2839 I ActivityManager: [BgDetect]chkExcessCpu level: 0 doKills: true auto_mode: false uptime: 183054

01-12 09:05:23.661 2719 2839 I ActivityManager: [BgDetect]detect excessive cpu on process to.talk(pid : 13406) level 0 usage 29

01-12 09:05:23.892 2719 2839 I ActivityManager: [BgDetect]force stop to.talk (uid 10151) level 0

01-12 09:05:23.893 2719 2839 I ActivityManager: Force stopping to.talk appid=10151 user=0: from pid 2719

01-12 09:05:23.893 2719 2839 I ActivityManager: Killing 13406:to.talk/u0a151 (adj 200): stop to.talk

Apparently some process called BgDetect figured out that our app was taking
too much CPU and decided to kill it. Android being open source, we figured
that we should be able to get the source for it. But not only could we not
find BgDetect in the Android sources, it was nonexistent in the OnePlus
sources too. We then got in touch with the marketing team and reached out to
some contacts at OnePlus, who directed us to their dev team. They asked for
our APK and voilà, in a week we were whitelisted along with the likes of
WhatsApp and Facebook.

But the issue persisted on Xiaomi, Oppo and a bunch of other Chinese
manufacturers, and for those we still had to dish out steps to whitelist Flock
in battery optimizations. We eventually figured out that moving from HTTP to
XMPP for Android FCM notifications gave us delivery receipts for push
notifications on the device. On Google phones and those from reputable
manufacturers, these delivery receipts also meant that the app received the
notification. On affected devices, apparently the OS would give us a delivery
receipt but not deliver the notification to the application. We built a double
ack mechanism which also sent delivery receipts from the app. Based on the
time difference between app delivery receipts and FCM delivery receipts we
were able to figure out affected devices and send them directed bot messages
to preemptively ask users to whitelist Flock. You can read more about the
double ack mechanism architecture here:
[https://hackernoon.com/notifications-in-android-are-horribly...](https://hackernoon.com/notifications-in-android-are-horribly..).
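
The comparison at the heart of the double ack idea might look roughly like the
sketch below. It is an illustration only: `looks_affected` is a hypothetical
name, timestamps are assumed to be epoch seconds, and the 60 second tolerance
is made up, since the real threshold is not stated here.

```python
from typing import Optional


def looks_affected(fcm_receipt_ts: float,
                   app_ack_ts: Optional[float],
                   max_gap_seconds: float = 60.0) -> bool:
    """Flag a device whose OS acknowledged a push the app got late or never.

    fcm_receipt_ts: when the FCM delivery receipt arrived (epoch seconds).
    app_ack_ts: when the app's own acknowledgement arrived, or None if it
        never did.
    max_gap_seconds: hypothetical tolerance; the original threshold is unknown.
    """
    if app_ack_ts is None:
        return True
    return (app_ack_ts - fcm_receipt_ts) > max_gap_seconds
```

A device that keeps tripping this check is a candidate for the bot message
asking the user to whitelist the app.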

Google still refuses to even acknowledge the issue while problems like these
are widespread: [https://bit.ly/2QSzTKK](https://bit.ly/2QSzTKK)

------
khendron
The most elusive bug I ever investigated was also one of my first, in my first
job after graduation, back in the early 90s.

TL;DR: spent months investigating a problem that didn't actually exist.

I was put on an investigation into why the data acquisition interface we were
developing was not working. It was designed to plug into the serial bus on a
Coast Guard icebreaker. Since we couldn't develop on the ship, one of the
other engineers made an hour long recording of the bus traffic, which we were
able to play back in a simulated setup in our lab.

Unfortunately, the traffic on the bus was not as reliable as we expected.
There was supposed to be a fixed number of data channels being broadcast every
second. But frequently, seemingly at random, channels would disappear and then
reappear a few seconds later. This made it impossible to identify the
different channels, since they were order dependent. It usually wasn't even
easy to tell which channel was missing. Usually multiple channels would be
missing at the same time. Our data readings would get corrupted and everything
would fall apart. Since the end-product was to be a real-time engine
diagnostics system, this was unacceptable.

When I was hired, the company had already been working at the issue for some
time. I was tasked to try to find a pattern in the drop-outs, so that they
could be predicted and our interface parsing adjusted as necessary. I tried to
find correlations in the intervals between drop-outs, and the length of the
drop-outs. I tried to apply smoothing functions so the effect of the drop-outs
would be dampened. I tried doing predictive analysis on the data channels, so
that when a drop-out occurred we would at least be able to make a pretty good
guess which channel was missing, and realign all the other channels. Months
went by. Nothing worked.

Then, one day while I was mulling over a printout of the data recording
(literally 100s of pages of nothing but columns of numbers), I noticed
something odd: almost every time a drop-out occurred, a channel would
transition from positive to negative, or from negative to positive. That is,
before the drop-out a channel would be positive, and after the drop-out had
ended, the channel would be negative. Or vice versa. But a channel would never
be 0. I pored over the entire recording, and there was not a single 0 value
to be found.

The data recording had been made by plugging into the serial port on the
icebreaker's data bus and using some modem software to record the traffic on
the bus. We quickly put together a test setup using the same modem software,
and sure enough, when a 0 value was received the software would not record it.
It would just bloop right over it. There was nothing at all wrong with our
data acquisition interface. The problem was with the recording we were using
to test it. We had a new recording made using different modem software, and
the problem was gone.
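
To see why a recorder that skips zeros looks like channels dropping out, here
is a small sketch. The wire format is an assumption for illustration (one
signed byte per channel, three channels per frame); the real bus format was
different, and `encode_frame`, `lossy_record` and `decode` are hypothetical
names.

```python
import struct


def encode_frame(values):
    """Pack one frame: one signed byte per channel (assumed format)."""
    return struct.pack(f"{len(values)}b", *values)


def lossy_record(stream: bytes) -> bytes:
    """Model recording software that silently skips 0x00 bytes."""
    return bytes(b for b in stream if b != 0)


def decode(stream: bytes, channels: int):
    """Split a byte stream into frames of `channels` signed values."""
    usable = len(stream) - len(stream) % channels
    return [list(struct.unpack(f"{channels}b", stream[i:i + channels]))
            for i in range(0, usable, channels)]


# Channel 1 crosses zero in the second frame: 1 -> 0 -> -1.
frames = [[5, 1, 9], [5, 0, 9], [5, -1, 9]]
wire = b"".join(encode_frame(f) for f in frames)
recorded = lossy_record(wire)
decoded = decode(recorded, channels=3)
```

The recording comes out one byte short, every frame after the zero crossing is
misaligned, and no 0 ever shows up in the decoded data, which matches the clue
in the printout.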

~~~
sawmurai
Nice one

------
eatMyFuck
Digging myself the fuck out of multiple compound interest debt traps, multiple
times, while generally managing to subsist and have what I'm reluctant to
refer to as " _A Life_."

------
toredash
I'm 33 now. I'll come back in 20 years' time. I don't believe I've encountered
my hardest challenge yet.

~~~
mooreds
I don't know about everyone else, but I'd settle for hearing about the hardest
challenge you have faced as of now.

------
FailMore
How to interpret dreams - please see my presentation here:
[https://www.youtube.com/watch?v=iPPNxc7nApY](https://www.youtube.com/watch?v=iPPNxc7nApY)

