It seems the gain had been reduced to cover up another problem: a tonal hissing sound. Once I learned how polyphase filter banks work, I tracked the problem down to a premature optimization, namely the replacement of an integer divide by 2^n with a right shift.
Such a shift of a negative two's complement integer rounds toward negative infinity instead of zero. This caused a slight DC bias within each sub-band filter. In all but the lowest band, this DC bias gets shifted up to a non-zero frequency.
I call the optimization above premature because fixing it only added one cycle per operation. Granted, this was a real-time MP3 encoder on an ARM7, but the cycles were there.
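To make the rounding difference concrete, here is a minimal Python sketch (not the encoder's actual code; the function names are made up for illustration). Python's `>>` on negative integers behaves like an arithmetic shift, so it reproduces the floor behaviour, while `trunc_div` mimics C-style division that rounds toward zero. `biased_shift_div` shows one common way to repair the shift, assumed here rather than taken from the original fix.

```python
def shift_div(x: int, n: int) -> int:
    """Divide by 2**n with an arithmetic right shift (rounds toward -infinity)."""
    return x >> n


def trunc_div(x: int, n: int) -> int:
    """Divide by 2**n rounding toward zero, as C integer division does."""
    q = abs(x) >> n
    return -q if x < 0 else q


def biased_shift_div(x: int, n: int) -> int:
    """Shift-based divide that rounds toward zero: add 2**n - 1 to negative inputs first."""
    bias = (1 << n) - 1 if x < 0 else 0
    return (x + bias) >> n


def mean_error(div, n: int, lo: int = -1024, hi: int = 1024) -> float:
    """Average of (integer result - exact quotient) over a symmetric input range."""
    xs = range(lo, hi + 1)
    return sum(div(x, n) - x / 2**n for x in xs) / len(xs)


if __name__ == "__main__":
    print("shift :", mean_error(shift_div, 4))   # negative: a DC offset
    print("trunc :", mean_error(trunc_div, 4))   # errors cancel out
```

Over a symmetric input range the truncating version's errors cancel, while the plain shift leaves a mean error near -0.47 LSB for n = 4, which is the kind of DC offset described above.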
At yet another gig I developed a method to differentiate male and female insects at 30m using a beam-steered high speed low res camera.
Making the Newton data stores stable, about three months before we shipped. The Newton had a flash-based object store instead of a file system, and the code was such that a power loss or a reset in the middle of an update would toast your data. I spent about six 100+ hour weeks writing a transaction system to make sure that users wouldn't lose data to a crash or a dead battery. I think I had to fix only a couple of bugs in that code, after a massive checkin.
Then, making the audio pipeline for the Kinect stable enough for noise cancellation to work. I'd heard that doing isochronous audio was hard, and "yeah, yeah, sure," but I had no idea it was really hard until I'd shipped a system that used it, with tight constraints around latency variance. I worked with some really good people on this, and there were days when we were looking at Xbox hypervisor code, application code and even DMA traffic on the camera itself. Another three or four 100+ hour weeks, maybe three weeks before we shipped, changing scary stuff everywhere. I still remember a satisfying feeling when I discovered the exact buffer we needed to use as a clock root for audio (and it wasn't the obvious one).
Somehow the Newton team managed to do a whole stack of things like that in a ridiculously short time, and then shipped them in read-only memory -- not Flash, kids, but for-keeps permanently masked ROM -- and it worked. I've never seen anything like it since.
[That same B-tree code was powering our source code control system, IIRC. Every time we hit the next power of two on the database size something else would explode. Wheee.]
Patching the mask ROMs was fun. Nope, you can't change those things, but we had a bunch of bugs to fix between the time the ROMs taped out and we actually shipped. We couldn't afford a full ROM jump table (the newt had 4MB of ROM, 512K of RAM, and we ran the heaps mostly full and had essentially no memory left), so we randomized the jump table and played aliasing games with page mapping to get our patch budget down to 20K or so. It helped to have really good engineers -- people who really got into the swing of twisted, devious code -- write the patches. I wrote a couple of fixes, and had done a pretty good job, I thought, then handed them to Andy S. who always came back with something about five instructions long that worked better and maybe fixed another bug, too, with a bonus cackle of maniac laughter.
When I started, Newton was an expensive researchy pipe-dream. A year later we'd done a complete reset and shipped a product. I don't know how we did that.
Custom built hardware, including the actuators.
Custom built RTOS on micro nodes.
Driving via OpenCV: Hough transforms for lane detection, stereovision and flows for obstacles, SURF/etc. for traffic signs (don't ask, I was learning), on a stack of 2 laptops connected via gigabit ethernet.
Nicest thing: I got the models trained mainly without moving the car, by pumping the framebuffer of two racing/car games, rFactor and GTA3 through glcs to openCV and controlling the games via uinput to make a virtual city to drive in.
Don't have a lot of pics, here's some HW:
http://imgur.com/9cfzbMv ( yes, that's an old cordless drill and an angle grinder head with a bespoke bike chain :) )
http://imgur.com/5x9T9gi (piston is weirdly offset so I could still put my foot on the brake in emergencies, like that one time I smashed it into a fence...)
http://imgur.com/Ig7MaLT (notice how I kept the costs down to almost nothing, in this case the air distributor for the brakes made with Meccano and old air valves and a geared motor, since I didn't have the funds to buy anything)
I came up with an algorithm treating each scan line of the screen as a binary tree, which allowed me to keep track of which parts of each scan line had already been written to. That meant I was able to build the screen up from front to back and visit each video memory location only once. So, on a 320x200 screen, only 64000 bytes would ever be written to memory. With all the clipping etc. this was quite a complex beast and fully written in 286 assembler. In the end I think it made the overall graphics rendering about 20% to 25% faster.
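The front-to-back idea can be sketched roughly like this in Python. This is not the original code (which is lost and was 286 assembler), and it swaps the binary tree for a plain sorted list of still-uncovered spans to keep the example short; `ScanLine` and `draw` are made-up names.

```python
class ScanLine:
    """Tracks which pixels of one scan line are still unwritten."""

    def __init__(self, width: int):
        # Sorted, non-overlapping, half-open [start, end) spans not yet drawn.
        self.free = [(0, width)]

    def draw(self, x0: int, x1: int):
        """Claim [x0, x1) and return only the parts that were still uncovered."""
        visible = []
        remaining = []
        for a, b in self.free:
            lo, hi = max(a, x0), min(b, x1)
            if lo >= hi:
                remaining.append((a, b))      # no overlap, span stays free
                continue
            visible.append((lo, hi))          # this part actually gets written
            if a < lo:
                remaining.append((a, lo))     # free piece left of the new span
            if hi < b:
                remaining.append((hi, b))     # free piece right of the new span
        self.free = remaining
        return visible


if __name__ == "__main__":
    line = ScanLine(320)
    written = 0
    # Nearest object first, background last.
    for x0, x1 in [(100, 200), (50, 150), (0, 320)]:
        for lo, hi in line.draw(x0, x1):
            written += hi - lo
    print(written)  # each pixel of the line is written exactly once
```

Drawing nearest objects first means later (farther) spans only ever fill what is left, so the total number of writes per line never exceeds its width.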
Edit: I don't have the code anymore. I lost all my "floppy disks" in a house move... :(
In my case I solved a problem in genomics that people had been trying to solve for around 20 years and which would have saved the human genome project billions of dollars - the only problem is I did it 10 years too late :(
Edit. If anyone is interested I published the method in BMC Genomics a few years ago http://www.biomedcentral.com/1471-2164/10/344
For the non-molecular biologists I should explain how it would have saved billions. The human genome was effectively sequenced by breaking it into millions of random fragments of around 1000 bases (letters) and determining the sequence (order) of the bases in each fragment. In order to be able to put the fragments back together in the right order, each base was sequenced 10-15 times in different fragments. My idea allowed you to avoid all this redundancy, meaning you only had to sequence each base once or twice. I did some simulations based on the actual costs of the human genome project and it would have saved 80% of the costs and finished it 3 years earlier.
I slowly reinvented the basics of an information retrieval system (the curse of not having taken CS classes). Came up with the idea of a log-structured merge tree, made easier by this being a write-once database. Got some inspiration from the original Google paper. But most of it was just figuring out the least number of actions needed to retrieve info.
I published the core DB part, which maps an int64 (index hash value) to an int64 (docId) and stores in an efficient format (on our data, ~2.3 bits per packet). http://github.com/michaelgg/cidb - I couldn't find an existing library that has zero/low per-record overhead.
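For readers wondering how per-record overhead gets that low, here is a minimal Python sketch of one common building block: delta-encoding sorted integers and storing the gaps as varints. This is only an illustration under that assumption, not the cidb format (which reaches far fewer bits per record than this does), and all function names are made up.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_sorted(values) -> bytes:
    """Store ascending non-negative integers as varint-encoded gaps."""
    out = bytearray()
    prev = 0
    for v in values:
        if v < prev:
            raise ValueError("values must be non-negative and ascending")
        out += encode_varint(v - prev)
        prev = v
    return bytes(out)


def decode_sorted(data: bytes) -> list:
    """Inverse of encode_sorted."""
    values, prev, cur, shift = [], 0, 0, 0
    for byte in data:
        cur |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7            # more bytes belong to this gap
        else:
            prev += cur           # gap complete: rebuild the absolute value
            values.append(prev)
            cur, shift = 0, 0
    return values


if __name__ == "__main__":
    ids = [3, 300, 301, 70000]
    blob = encode_sorted(ids)
    print(len(blob), "bytes instead of", 8 * len(ids))
    assert decode_sorted(blob) == ids
```

The denser the sorted keys, the smaller the gaps, so most records cost a single byte or less once further packing is applied on top.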
On a Q6600 and a single 7200RPM platter, I was able to index a TB of SIP a day and provide fairly quick flow reconstructions going back as far as disk space allowed. On a quad-core i7 parsing+indexing was over 1Gbps.
Company impact was huge, because we could suddenly troubleshoot things in minutes instead of hours. A few years later, after I was gone, I heard they were still using it. Neat.
This was all in F#, which presented fun challenges regarding optimization. Lots of unsafe code and manual memory management. SSE would improve varint encoding - the CLR generated code is a joke in comparison.
Last month, I dropped this lib in as a replacement for storing routing information in a telecom app and dropped RAM requirements from 6GB to 1GB.
On the downside, I'm sure any compsci student could build a similar thing in a week, and probably they do so for school projects. But to a lot of app-level developers, this kind of algorithmic work is sorta magic for some reason.
I thought about commercializing it but interest seemed lukewarm.
Looking forward: Intel 10G NICs have a little processor in them, and there's a library called DNA (Direct NIC Access) that bypasses kernel mode for packet capture/transmit. Makes it super efficient and possible to do line-rate 10G apps on commodity hardware.
Those processors can also distribute packets to specific cores, which helps solve the next problem of scaling the capture part beyond a single core. Looking beyond 10G to 40G and beyond, I'd imagine something similar to Etherchannelling would be a cheap way to leverage existing hardware into breaking up the load. Unfortunately a low-level "dumb" load balancing system will split up dialogs so your indexing and compression gets a bit less efficient.
One of my semi-open research projects is to do line-rate SIP DDoS protection. So far, no one (not commercially, not academically) has any magic bullet and it would appear that almost every SIP network out there is trivially DoS-able.
State-of-the-art is along the lines of "we push a whitelist into our switches" (which, actually, technically doesn't even work with SIP since multiple IPs can get involved with the signalling, IPs unrelated to original hosts - although I've only seen this once or twice in production).
As far as I can tell, it's mainly an engineering problem - just do the grunt work of writing smart code. But there's no market for it until VoIP providers start suffering real serious pain - until then, I don't think telecom cares about security, DoS or otherwise.
- For an AI course at University, my partner and I developed a custom motion planning algorithm involving neural networks, RRTs  and POMDPs  in several thousand lines of Java. That was some of the craziest (and most fun) programming I've ever done. Our lecturer was Hanna Kurniawati , who is world famous (for some value of 'world') for her work on POMDPs, which was really cool.
Since then programming games was always my obsession, and just a few months ago I fulfilled my life long dream and got a job as a gameplay programmer at Ubisoft.
My initial hunch was that it was a timing issue, and that I was seeing different behavior based on temperature. Even when I made things super slow to exclude timing, the performance was inconsistent. Next on the list was excluding race conditions (maybe I'm not resetting in the right order, and getting lucky?)
At some point it was 8am after an all-nighter of debugging this, and I hadn't been able to reproduce the bug. Lo and behold, just when I gave up, the problem started occurring, right as I opened the blinds to get some sun.
Turns out that the display driver had light sensitivity issues. Since it was a cheap display, the backside of the driver IC was exposed (the epoxy fill didn't encapsulate it all the way, just the edges; you can see it as the white strip in the Digi-Key picture).
Putting a piece of tape over the IC solved the issue, and I didn't run into problems with the display again.
 PCB (business card sized): https://bitbucket.org/cyanoacry/ditch_day/src/3bf75f6bd2fba1...
Hardware picture: http://www.albertgural.com/blog/caltech-ditch-day-2013/image...
I do remember that they were however pretty picky regarding the reset sequence (something the datasheet warned about several times).
I was working for a company that made NICs which went into the EISA bus and we were seeing data corruption in this machine.
After a long, cold night in the Apollo works, an engineer from HP and I tracked it down to a timing problem on the EISA bus, where the 16 bits being sent were arriving in two 8 bit chunks, slightly delayed. Our NIC was spinning on a 16 bit word, looking for a change in the top 8 bits as a signal that the data had arrived. We'd then read all 16 bits, but 8 bits weren't ready yet.
Luckily, HP made lots of test equipment and getting a logic analyzer with 32 inputs, an in circuit emulator for the CPU, and some logic probes was easy...
 I think it was an HP 9000 http://en.wikipedia.org/wiki/HP_9000 being made at the old Apollo works outside Boston.
I don't work on that stuff anymore, but it definitely challenged my problem solving abilities to:
A) Learn Erlang.
B) Learn how to write solid distributed software in Erlang.
C) Figure out how to work around Google's temp-banning policies using IP balancing, captcha solving to produce cookies to balance connections on, and also how to localize the searches for geographic accuracy.
I don't like working on projects that are actively pitting me against someone though, so I'm happy to not be working on that. I now write scalable software for my energy-focused startup, we receive energy data from homes in near-real time, which has its own challenges.
That is the problem: many of the most interesting projects have some moral ambiguities (military, financial, etc.).
While one is getting paid, it is very easy to justify, or not even think about, where the money comes from (cue Sinclair quote).
Scraping the search results was about getting data on where items were positioned in the index, not pumping spam into the search results. Although to be clear, I did help do stuff like that for a while too but it never felt good and we vastly preferred the rank tracking product to link building services.
Also, Google's getting extremely good at combating search spam.
I'm surprised you didn't also mention the ethical problems with trying to take something (search results) without permission.
Your new project sounds awesome, though. : )
Also it's indicative of something if people are building businesses to scrape something they could offer as a pay-for API that many people would gladly / happily pay lots of money for. The arguments for NOT doing that are ridiculous because people will get the data regardless.
http://imgur.com/a/LiCxU ( a tiny part, full thing is 172 pages and I needed about 35 of them).
I'm developing software to help the MCS accredited renewable installers in the UK. I planned to buy in the API that does the calcs, so I duly purchased it and did some quick testing... got the documentation and, uh oh, this doesn't match the real thing! I rang them up: "oh yeah, we are getting out of doing the API as our competitors are using it against us".
I'm not a mathematician (I got a B in my GCSE Maths for Christ's sake) and now I have to implement code that works out the solar irradiation using tilt, latitude, solar_declination, a dozen look up tables and some hairy trigonometry.
Stuff that looks like this :-
A = k1 × sin^3(p/2) + k2 × sin^2(p/2) + k3 × sin(p/2)
B = k4 × sin^3(p/2) + k5 × sin^2(p/2) + k6 × sin(p/2)
C = k7 × sin^3(p/2) + k8 × sin^2(p/2) + k9 × sin(p/2) + 1
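For the curious, those three lines translate almost directly into code. A rough Python sketch, assuming p is an angle in degrees (presumably the tilt) and that k1..k9 come from a lookup table; the function name is made up and no real coefficient values are given here, since those live in the documentation.

```python
import math


def abc_terms(p_degrees: float, k):
    """Evaluate A, B and C from the formulas above.

    k is a sequence of nine coefficients (k1..k9). The real values come from
    lookup tables that are not reproduced here; callers supply them.
    """
    if len(k) != 9:
        raise ValueError("expected nine coefficients k1..k9")
    s = math.sin(math.radians(p_degrees) / 2)   # sin(p/2), with p in degrees
    a = k[0] * s**3 + k[1] * s**2 + k[2] * s
    b = k[3] * s**3 + k[4] * s**2 + k[5] * s
    c = k[6] * s**3 + k[7] * s**2 + k[8] * s + 1
    return a, b, c


if __name__ == "__main__":
    # Placeholder coefficients purely to exercise the function.
    print(abc_terms(30.0, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

The formulas themselves are the easy part; most of the work is in picking the right table row and chaining a dozen of these together.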
As it was it took me two weeks to implement (about the most stressful two weeks of my life).
I'd love to post the code but it represents a significant amount of work and there are competitors in the market and that would be a big hand up to at least one I know of.
Same here, I got a B at GCSE maths. When I was 23 I enrolled for maths A level at the local college, and I got an A. 16 years old is too young to assess someone's potential.
An in-house camera calibration application at my company solved the correspondence problem by using an algorithm that looked at properties that are not invariant under projection (i.e. angles and distances). This made the calibration process extremely fragile. The algorithm often failed to detect the calibration grid even when the image was crystal clear, which made calibrating a camera a very frustrating endeavour.
Since it was an in-house application there was never much priority to fix it. Eventually I got so annoyed though that I wrote a whole new real-time algorithm for the detection of arbitrarily-sized grids from a set of 2D points. The algorithm is capable of dealing with a significant amount of outliers (even when between the valid grid points), it can handle missing grid points and it is not affected by perspective or non-linear lens distortion. In the end it took me longer than I had hoped, but the resulting algorithm is one that still makes me proud.
I spent a lot of time reading books and papers on computational geometry. I had an idea that involved a few minutes of precomputing things, and eventually came across a useful algorithm in a paper that let me implement this as I envisaged. In the end, everything worked perfectly. It was very satisfying.
My algorithm worked by breaking the search space into a 2D array of much smaller squares. In the initialisation phase, I stored in each square the points that are closest to any possible search point in that square (some points were in multiple squares). Then, when a search was run, the square the search point fell in was looked up, and its list of twenty or so points was scanned for the closest point in a slightly optimised way (no square roots here).
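A simplified Python sketch of a grid-based nearest-point lookup, for anyone who wants to play with the idea. It is not the algorithm described above: instead of precomputing a complete candidate list per square (the part that needed the paper), it buckets each point into one square and searches outward ring by ring until no closer point can exist. `GridIndex`, `dist2` and the cell size are made-up names and parameters; distances are compared squared, so no square roots are needed.

```python
import math
from collections import defaultdict


def dist2(p, q) -> float:
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


class GridIndex:
    def __init__(self, points, cell: float):
        self.cell = cell
        self.cells = defaultdict(list)
        for p in points:
            self.cells[self._key(p)].append(p)

    def _key(self, p):
        return (math.floor(p[0] / self.cell), math.floor(p[1] / self.cell))

    def nearest(self, q):
        """Return the stored point closest to q."""
        if not self.cells:
            raise ValueError("index is empty")
        cx, cy = self._key(q)
        best, best_d2 = None, float("inf")
        r = 0
        while True:
            # Visit only the squares on the ring at Chebyshev distance r.
            for dx in range(-r, r + 1):
                for dy in range(-r, r + 1):
                    if max(abs(dx), abs(dy)) != r:
                        continue
                    for p in self.cells.get((cx + dx, cy + dy), ()):
                        d2 = dist2(p, q)
                        if d2 < best_d2:
                            best, best_d2 = p, d2
            # Anything outside the rings searched so far is farther than r * cell.
            if best_d2 <= (r * self.cell) ** 2:
                return best
            r += 1


if __name__ == "__main__":
    idx = GridIndex([(1.0, 1.0), (9.0, 9.0), (4.0, 6.0)], cell=2.0)
    print(idx.nearest((5.0, 5.0)))
```

The precomputed version trades a few minutes of setup for a single square lookup per query; the ring search needs no setup beyond bucketing but may touch several squares.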
I've been in tech for more than 30 years. In that time I've done so many things it's hard to pick just a couple of interesting ones. I think some of the fun ones are really old, but a couple are very recent:
* In the good old days HD's would die and people would bring them to me hoping I could revive them. I've swapped logic cards with working HD's to recover data. One disk was soaked from a fire sprinkler system in a small business' office. I took it apart and rinsed it with distilled water, cleaned up a bunch of parts by hand, swapped logic boards and applied lubricants to various parts and was able to spin it up just long enough to back up the customer's data. Don't try this at home with modern drives, get a recovery team to help you if the customer can afford the cost.
* In 1988 I was approached by a military contractor with a GPS board they had built that needed a device driver for SCO Unix. They wanted 50ns realtime responses, and I had to keep explaining to them that SCO Unix was not an RT OS, but they were cornered into the OS by contract and had no choice. So I pushed them on "why this strict timing?" and, after many signed documents and stuff I don't really want to know, I showed them we could build a device driver interface that allowed them to achieve the result they needed. After many tests it turned out we were able to beat their timing requirements. The contractor was very happy; I went as far as making it a SCO install disk, and they were able to make the install part of their build process. The strange part was that two years later I got a call from the contractor. They were in a panic because the driver didn't work with the latest version of SCO and they had to "urgently deploy a lot of these things" into an undisclosed "middle eastern territory". I told the guy I was really busy (I was), but the next day he showed up at my house (literally my house) with a machine in his car and a blank check and begged me. He said his job and a lot of others were on the line and they didn't have time for someone else to come up to speed and make the adjustments. I caved in and updated the drivers overnight. It wasn't too bad, just some changes in the kernel interfaces, and SCO had made some dumb changes to the way installs worked. I gave him the results late in the evening the next day, and he thanked me and drove off into the sunset. I was never told exactly what they were used for... I've always wondered. And no, I didn't burn them: I charged my regular rate, but I did work about 18 hours on it in a 24 hour period.
* In recent years I've taken to building low-latency, high-scale systems. One such system must respond to upwards of 1.6 million queries per second across six data centers, and the response must be received by the requestor within 10ms. The reality is that with jitter, even on local networks, you really have about 7ms to respond. I wrote this system in Python, Java, C, C++ and Nginx/LuaJIT. Each time I re-implemented the same solution, but with new tech and with twists to leverage the strengths of the underlying tech (long-living objects in Java to avoid GC, Cython, etc.), and my best implementation ended up being nginx/LuaJIT. I was able to get about 15k qps/core using this configuration and it was rock stable, running for weeks without need of a reboot. The best part is I've been able to publish the system internally with all the system settings (lots of network tweaks) and a script so others can deploy it and do their own testing. Previously everything was C with Libevent, and it's just painful trying to get a large pool of people up to speed on using that to do their projects. Most recently I've re-written this system in Go and am working through some crazy performance issues there. I can't seem to get Go's scheduler to react as quickly as Nginx, and oftentimes it seems to latch on to just four CPUs even though GOMAXPROCS is set to 8 or more.
I could go on for pages about all the other things I've done, but the underlying thing about them all is problem solving at a level of detail where most people give up. I often say I'm not very smart; I'm just really persistent. I'm willing to change just one thing, retest, tweak another, retest and onward until the problem starts to present itself. I often find people give up too soon: they think something is impossible or they're scared of how much time it will take to find the solution. For me, if I see forward progress and I have the intuition that what I'm doing will work, I keep pushing until I can disprove my intuition or prove it. I think the sign of a good technologist is less about how super smart they are and more about how they approach solving real world problems. I find it annoying when someone tries to get me to do some puzzle for an interview or other thought experiments that have little basis in reality. That said, if someone asks me to solve a real world problem in an interview I'll jump up to the whiteboard and tear it up with great passion.
Don't belittle yourself because you're doing CRUD to pay the bills. Instead challenge yourself to do more when someone isn't paying the bill. All my life I've worked and played with technology, and most people cannot tell when I'm working or playing. I am always pushing to learn new things. As the example above shows, the system I built is working fine, so why do Go? Why not? In 30 more years all the languages will change and all the tech will be different, but the problems will be related to today's problems, and the more you learn to stretch your mind and solve problems with many different approaches, the more valuable you will be in the complex future that is coming at us every single day.
Maybe I need a ghost writer who can shadow me on projects and then write them up for us both. LOL
too many people look down on those doing the "grunt" work, despite the fact that it is usually necessary.
I was just checking your profile for an email address when I saw your employer on your LinkedIn profile and suddenly it all makes sense ;)
Some of the big learnings were sysctl settings and other OS-level tweaks.
My primary goal was to have customer reference implementations in many different languages. That said the more I do this the less I'm convinced there will be a broad range of tools that can reliably handle the constraints.
I would love to know more about this - care to write up an article or something?
I dug deeper and found that the application could figure out, 5ms in advance, the GPS time code of when it needed to fire. With that in mind, what they really needed was the ability for the firmware to have a time trigger that was programmable through the device driver from the application. That way the dedicated board would fire the pulse exactly when the application wanted it fired.
Basically I turned their logic on its head and it fixed the problem. The firmware guys and I met and developed a protocol. A couple of days later they gave me a new board, and I had already written the driver changes and a sample client.
I guess the underlying point of the story is just because someone asks you to do something impossible doesn't mean the problem isn't solvable. They may be coming at you with incomplete information and you need to push to better understand the problem. Once we were all on the same page it took a firmware rev and some small changes to their application and it was a success.
I agree about the writing style also.. have you heard of Leanpub? Pretty sure you could make a small book that would be tons of fun to read.
I get tired of explaining Xenix to people so I guess I just defaulted to SCO Unix in this story.
My favorite Xenix tidbit:
"Microsoft, which expected that Unix would be its operating system of the future when personal computers became powerful enough, purchased a license for Version 7 Unix from AT&T in 1978" - http://en.wikipedia.org/wiki/Xenix
I used to laugh every time I'd boot a Xenix machine and see the MS Copyright.
The hard part was taking those 3 sets of bezier curves and turning it into a 3d mesh. There's no pleasant mathematical way to do this directly, and there's no way to convert the 3 sets of curves into a 2d bezier surface.
The eventual solution involved several steps - first the top-down curve was rasterized into points at intervals of N on the X axis (from front of hull towards back). The maximum distance between any two sets of symmetric points was used to "scale" the front-back view so that the endpoints of each copy of the front-back view would match each set of symmetric points on the top-down view. At this point, each set of symmetric top-down points has a matching front-back curve that connects the two points. Now each front-back curve is rasterized at intervals of I on the Y axis (from port to starboard).
At this point I have all the points I need and could actually rasterize them into a mesh, but with one problem - the side-view curve still isn't accounted for. If I were to rasterize it at this point, the ship would probably look like a bullet cut in half.
So to take the side profile curves into consideration, the side view was rasterized like the top-down curves were, into points at intervals of N on the X axis. These points are converted into proportions (from 0 to 1) of how far they are from the top deck relative to the deepest point on the side-profile curve. Finally, each proportion was multiplied with the Z components of each point on the point-rasterized front-back curves. In this way, the side-profile just acts like a "scale" to how deep the front-back curves are allowed to go.
I was pretty happy with the results - however the mesh had densities in bad places that I later smoothed out using bicubic interpolation. I don't have any pics of final product, but here's some of it before the interpolation phase:
His approach was to translate the code by hand without understanding much of it. Me, a code monkey? He told me I had approx. 1 week. I looked at the code and saw lots of optimized C++, most of whose features we didn't need. I quickly understood that I would have to figure out the core algorithms and implement them myself if I wanted to finish on time. The problem was that when you read the code of a complex algorithm in C++, some parts can be so complex that you spend a lot of time just to barely understand how they work. And I had 5 days, 8 hours a day, and the 3D game was waiting.
So I took 20 minutes and managed to figure out, on a piece of paper, how to make a navmesh-based pathfinding library without even reading Recast's code. The hard parts were the portal-based algorithm and getting the 3D floating-point geometry exact enough to avoid the nasty corner cases that cause bugs in the position of the game's main avatar. At the end of the week, our 3D game was running with the new library, and I had not done more than 1 hour of overtime a day on average. I felt classy =) It worked for the two years I was there, and games shipped with it.
Early 90's (when C++ was new and broken in a different way for each compiler), I was on a QA team for a project where developers were learning C++ as they built the next version of the product. There was a buggy math function in the standard lib, and the compiler vendor didn't see it as a high priority. Devs didn't know how to find expressions that were at risk so they could cast them to a type for which the libraries worked reliably.
I discovered a useful combination of compiler flags and wrote an awk script to take the compiler output and make a list of source files and lines that produced calls to the broken lib function. The lead developer insisted that I was wasting their time with a bogus list, until I explained how it worked.
More challenging: I worked on an industrial machine that had to mix measured amounts of gasses over time. The developer who wrote the mass flow controller (the device that controls gas flows, basically) task just opened the MFC at the start time, then slammed it shut after the correct amount of gas had passed. I coded up a smooth open/close that kept the area under the curve correct. In 6809 assembly language. That was in the early 80s, when a 2MHz 8-bit CPU was some serious horsepower.
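The "area under the curve" point can be shown with a small Python sketch. This assumes a simple trapezoid (linear ramp up, hold, linear ramp down), which may not be the exact curve shape used on the 6809; `flow_at`, `profile_duration` and their parameters are made-up names. Since the two ramps together deliver `peak * ramp`, the hold time is shortened so the total still comes out right.

```python
def profile_duration(total: float, peak: float, ramp: float) -> float:
    """Length of the whole open/hold/close cycle, in the same time unit as ramp."""
    hold = total / peak - ramp
    if ramp <= 0 or hold < 0:
        raise ValueError("ramp must be positive and short enough for this amount")
    return 2 * ramp + hold


def flow_at(t: float, total: float, peak: float, ramp: float) -> float:
    """Flow setpoint at time t for a trapezoid whose area equals `total`."""
    end = profile_duration(total, peak, ramp)
    if t <= 0 or t >= end:
        return 0.0
    if t < ramp:
        return peak * t / ramp              # opening ramp
    if t > end - ramp:
        return peak * (end - t) / ramp      # closing ramp
    return peak                             # hold at full flow


if __name__ == "__main__":
    total, peak, ramp = 100.0, 10.0, 2.0
    dt = 0.001
    steps = round(profile_duration(total, peak, ramp) / dt)
    # Midpoint-rule integration of the setpoints as a sanity check.
    delivered = sum(flow_at((i + 0.5) * dt, total, peak, ramp) * dt for i in range(steps))
    print(delivered)
```

Compared with slamming the valve open and shut, the cycle takes one ramp time longer, but the delivered amount is unchanged.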
Probably the only thing worse than reading C++ is when it's poorly documented and in German. :P
This was done so that the Ad network could cold call them for ad deals. I left immediately after that because of dropping levels of decency and ethical standards in that company.
- Deploying a financial app through 4 bastion hosts by keeping Russian doll ssh tunnels up (clients outsource IT bouncing across the world to get to the right boxes)
- porting an 8 MLoC Fortran nuclear reactor simulator from UNIX to Windows
- generating PowerShell on Linux to be run on a Windows box by reverse engineering the MS API.
- silencing dialog boxes by DLLs which patch and proxy system DLLs
- making Java JRE run from a CD-ROM with the right JNI dll/so and a custom installer I wrote, before the advent of installanywhere (talking Java 1.1 days)
Sorry, but my god, WHY? Was it morbid curiosity (I can see why you would want to do it then), or do you actually prefer the PowerShell syntax? (Sorry if this comes off as biased, and it may very well be, but I haven't actually heard of many people that can stand PowerShell.)
It's ugly in its own unique way, but it's better than bash in terms of functionality and elegance.
How so? I haven't used it much, but I've not noticed any elegance to it. Is it like Lisp, where if I use it for a while it will hit me what the power of it is?
You do have to be careful in some cases because UNIX style syntax is supported but the result may be different because of the underlying implementation. Like if you output text to a file using redirection, by default the output is UTF-16. It can make you crazy if you're not aware of things like that.
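If you end up consuming such redirected files from other tools, a small Python sketch like the one below can take the surprise out of it. It assumes the file starts with a byte order mark when it is UTF-16 (which is what I'd expect from that default, but check your PowerShell version) and otherwise falls back to UTF-8; it does not try to handle UTF-32, and `decode_redirected_output` is a made-up name.

```python
import codecs


def decode_redirected_output(data: bytes) -> str:
    """Decode bytes that may be BOM-prefixed UTF-16 or UTF-8, else plain UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        return data.decode("utf-16")
    return data.decode("utf-8")


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_LE + "hello\r\n".encode("utf-16-le")
    print(repr(decode_redirected_output(sample)))
```

Reading such a file as plain ASCII or UTF-8 is what produces the classic "every other byte is NUL" confusion.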
Why? Because Firefox didn't have any API to correlate cookies (network requests/responses) with tabs and I ended up traversing recursively objects to discover extremely indirect relationships.
2) Most weird?
i) Vulnerability and exploit using Linux terminal escape codes (1999 in C/ASM), so you can run the exploit even if the user just "cat"s your file or an FTP server displays a specific banner: http://www.shmoo.com/mail/bugtraq/sep99/msg00145.html
ii) Adding a second keyboard to a Commodore Amiga 500 (1989 in ASM) (from a Commodore 64) enabling two people to use the computer at the same time in different screen places with one monitor.
iii) Adding an API to Microsoft Outlook Express (2003/2004 in C++) (the application doesn't offer a whole API architecture): http://www.nektra.com/products/oeapi-windows-mail-outlook-ex...
iv) Doing a file compression tool in Cobol (circa 1997) for the MVS operating system because the organization didn't want to buy a relatively expensive C package for the mainframe.
Later, we switched to the NXP LPC platform using only open source tools: GCC ARM Embedded toolchain, OpenOCD + gdb for debugging and vim, make as 'IDE'. What a relief.
I have loads of experience with the LPC series, and I didn't realize how nice they are to work with until this project came by. Also, GCC + OpenOCD + gdb is a very nice toolchain to work with, although the first versions of OpenOCD were a bit of a pain to get (and keep!) running.
The tool was written in Java, and I was coding mostly in Java, so I figured it'd be easy. That first version was indeed pretty simple to patch (decompile, find a class with a method that ran a bunch of license checks, and just replace that bit with "return true" then rebuild the JAR). Then they released another version with more serious obfuscation applied to it. I found myself digging through classfile specs, writing my own code to edit classes directly (where decompilers failed), etc. As obfuscators evolved (and they upgraded), it got more interesting.
I bought a license after a while (and I certainly didn't distribute my cracked versions; I liked the tool and wanted to support them), but I kept cracking new versions when they came out for a few years; it was entertaining & highly educational.
I worked for a while on a standalone classfile editor, but eventually ran out of time for the hobby. But along the way I learned a good amount about Java bytecode, how the VM interpreted it (and how much flexibility there actually was involved). I learned about UTF-8 (and realized quickly that most text editors didn't seem to actually support it properly), put quite a lot of thought into security (trickier than I had thought) and how it relates to program flow (e.g., good design can easily mean more-easily-bypassed security).
The real downside is that I couldn't brag about it to my colleagues, or online -- I didn't want to tangle with any potential legal issues, or reputation problems, especially at the start of my career.
So I'd be walking around, buzzing with the new stuff I'd figured out (and new obstacles surmounted), but unable to say a word about it to anyone who would know what I was talking about. I considered contacting the developers of the tool directly, just to let them know this little contest was even happening and offer suggestions, but decided not to run even that risk.
I was able to give an architect a tool that was very easy to use (UnrealEd was pretty cool) for creating structures, touring them in real-time 3D, and modifying their textures and sounds.
Funny thing: I didn't have the time to remove or hide a certain feature of the game, namely dying. So if a visitor to a virtual building dropped more than a meter, the game would play the hurt animation, or the visitor would die, unless god mode was on.
The Genetic Programming part was the most fun because it required me to use Python in a way I hadn't before (passing functions as variables and making trees of different Python functions).
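For anyone who hasn't seen the "trees of functions" trick, here is a minimal sketch of how it can look in Python. This is not the original project's code. The function set, the tuple representation and the names `random_tree` and `evaluate` are assumptions chosen for illustration.

```python
import operator
import random

# Function set as (callable, arity). Terminals are the variable "x" or a constant.
FUNCTIONS = [(operator.add, 2), (operator.sub, 2), (operator.mul, 2)]

def random_tree(depth, rng):
    """Build a random expression tree as nested tuples: (func, child, child)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else rng.randint(-5, 5)
    func, arity = rng.choice(FUNCTIONS)
    return (func,) + tuple(random_tree(depth - 1, rng) for _ in range(arity))

def evaluate(tree, x):
    """Recursively evaluate a tree for a given value of x."""
    if isinstance(tree, tuple):
        func, *children = tree
        return func(*(evaluate(child, x) for child in children))
    if tree == "x":
        return x
    return tree  # numeric constant

# x * x + 1, written by hand as a tree of Python functions.
square_plus_one = (operator.add, (operator.mul, "x", "x"), 1)
print(evaluate(square_plus_one, 3))
```

A full genetic programming loop would add a fitness function plus crossover and mutation that swap or regenerate subtrees. Those are left out here.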
I used to work for a massive media company for a short stint and I kept trying to explain to them why DRM will always be broken but it just never took. I think it's hard for non-technical people to think about the 100+ ways media leaks on its way to the human's ears or eyes. :)
I taught myself how DFS works (Distributed File System, an extension of AD technology that maps a set of shortcuts on domain controllers and other DFS root holders), and expanded the available DFS roots, which reduced site load times.
I hooked up an ultrasonic distance sensor and a motor shield to an arduino, and put the assembly on an RC car. I programmed it to slow down as it approaches walls, and to back up if something is too close. I want to build robots to fill some need, but I'm still looking for that need.
The problem with boot code is that the debug process cannot interfere with the boot memory itself or the bug won't show up. So I got to make one simple change in the assembly bootstrap, assemble the code, boot from a floppy, open a raw disk editor and replace the old bytes with the new ones for the test, boot twice and see what happened (mostly some chars printed on screen).
After a week of focused work the problem finally showed itself. It was a tiny misplaced bit operation in the Blowfish algorithm (our version in assembly) that swapped one or two bytes in a specific sector on the disk. It happened only on this machine, with this HD and this version of the O.S., because during boot some code in the O.S. bootstrap wrote to exactly this "wrong sector". Fixing the problem was not hard; the hard part was finding it in the first place, after the system had already been running on thousands of different machines for a couple of years without a hint of it.
It was a pretty nice week in the end.
A long time ago I was Mister Fix-it for a company that made top-of-the-line lighting desks for touring rock bands. If your band wasn't touring with one of their desks you hadn't yet made it.
The company developed a new desk and sold the first couple. Then one band said they had an intermittent problem during rehearsals where the desk would just stop making the lights go. I got sent to fix it. Couldn't find anything, and couldn't make it break.
First show of the tour, 15,000 people in the audience, the house lights have gone down, the intro tape is running and the guy on the desk announces that it's stopped working! I ran backstage to get my tools to open the desk (god knows why I'd left them there), pelted back to the desk, and on the way I realised what must be going wrong. It must have been the adrenalin that heightened the thought processes. Opened the desk, moved the component that I'd worked out was shorting to ground and causing the output to stop, and he was back in business before the intro tape had finished.
I still get a bit of a high when I think about the total head rush of fixing that problem like that.
The 'permanent' fix for him was a tiny piece of gaffer tape under the component. The permanent fix was to use the component the PCB was designed for rather than one someone had to hand with a different footprint.
I had to customize WebCrossing to the needs of Svenska Spel, the Swedish lottery company. They had about 200 pages of drawn screenshots showing how they wanted _EVERY_ detail to look.
The problem was that WebCrossing did not offer much built-in customization, and I had to override basically every single template. In particular I had to write entirely new forum code, because the WebCrossing way of presenting things was not how the customer wanted it.
Also, for some reason, aftonbladet.se, Sweden's then (and still?) biggest news site, had got 2 guys from the WebCrossing USA office to come over and build their site.
Since I did the first gig, I also ended up rewriting the publishing system for aftonbladet.se's forum from a "pre-moderation" system into a self-moderating (post-moderation) system. The reason was that the site had managed to approve some nazi posts, and the publisher got fined.
After all this horror I could finally start learning PHP instead.
In the pre-Twitter, 3G world, where mobile Internet meant WAP, this was quite advanced. The main technical problem involved connectivity for the journalists. We borrowed cellular dongles and laptops, but there was simply no guarantee that the cellular network would be available at the show and wouldn't be overwhelmed by the number of attendees. Cellular data plans (we were roaming) were also ruinously expensive and had to be minimised.
We wanted the journalists to be able to write their stories directly into a structured Web form, but we didn't want them to have to cope with the flaky cellular network and the story disappearing with a 'no network' error on submit.
In the end I came up with the idea of simply running a Web server on each laptop to present the form, handle the input and batch it up ready for sending as and when the network was available.
It worked rather well and we were all rather chuffed.
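The store-and-forward idea above can be sketched roughly as follows. This is not the original code, which sat behind a local Web server. It only shows the queueing part, and `Outbox`, the spool directory and the `send` callback are hypothetical names.

```python
import json
import uuid
from pathlib import Path

class Outbox:
    """Keeps submitted stories on local disk until an upload succeeds."""

    def __init__(self, spool_dir):
        self.spool = Path(spool_dir)
        self.spool.mkdir(parents=True, exist_ok=True)

    def submit(self, story: dict) -> Path:
        """Called by the local form handler; needs no network at all."""
        path = self.spool / f"{uuid.uuid4().hex}.json"
        path.write_text(json.dumps(story), encoding="utf-8")
        return path

    def flush(self, send) -> int:
        """Try to upload every queued story and keep the ones that fail."""
        sent = 0
        for path in sorted(self.spool.glob("*.json")):
            story = json.loads(path.read_text(encoding="utf-8"))
            try:
                send(story)  # hypothetical uploader, raises OSError when offline
            except OSError:
                break  # network is down, try again on the next flush
            path.unlink()  # delete only after the send went through
            sent += 1
        return sent
```

The point of the design is that `submit` never touches the network, so the journalist never sees a 'no network' error. `flush` can run on a timer and simply gives up until next time when the link is down.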
For instance, a while back I helped a friend automate filling in a form, solving a captcha and sending a request on a site that required doing this daily.
The way we eventually did it was using nodejs, phantomjs as a module (node-phantom), deathbycaptcha for solving the captcha, and then running it daily from a crontab. This may be a very simple task to solve, but instead of solving it in languages we knew well (like php using curl requests), we used a new language and tried to find the easiest way to do it.
Also I've been switching to managing my servers using ansible instead of shell scripts. I'll pick up chef or puppet too some day.
I guess what I'm really trying to say is, I haven't had any challenging or really difficult problems to solve yet. Thinking about whether the 2h I would spend writing boilerplate code can be reduced or automated, or avoided by switching technologies/frameworks. Learning about new tools and their advantages and disadvantages. Like taking a look at emacs after a good year and a half of using vim. That is what is difficult and challenging to me, yet very rewarding in the end.
I know this is not exactly what the question was, my apologies for that.
The most important lesson I've learned: for software and tech development, people come first. Always.
* An algorithm to generate a "spine-spline" for an arbitrary 3D closed loop (simplest form is a torus): Essentially a path that passes through the volumetric centroid of the 3D object and closes with itself (for my upcoming game, to generate the pathway for a level just by uploading the 3D model for the level)
* Porting bitgym.com's vision processing algorithm to work on Android devices: Getting the camera frame pixel data in realtime, downsampling the image, integrating with a C++ vision processing algorithm through JNI, and making it actually finish a round-trip in under 50ms, and work on hundreds of different Android (versions and) devices. Actually considering there are still occasional crashes and bugs with BitGym on Android, I guess this is an ongoing challenge, still being solved!
* Countering the drift and rotational aberration in a high speed stepper motor that was trying to point a laser pointer at multiple precise coordinates on a flat wall, in quick succession (for college robotics class).
Solution: rename the second class, and create a style guide rule for class names.
I'll tell you a story about one of mine. When I was around 13, I tried to develop a smallish application that could keep a small local DB of medical results for some medical equipment. It was supposed to have a pretty-looking UI, and the fashionable technology at the time was Turbo Pascal/Turbo Vision. By then I was already an OK hacker (at the level of being able to write an antivirus for an unknown virus), hacking hardware for as long as I could remember and writing in C since I was eight or so. You get the gist.
Anyway, I had about a month to write that stupid DB application. Including learning Turbo Pascal, OOP, damn virtual function tables, Turbo Vision. With no internet, no prior knowledge of Turbo Pascal, no examples, no manuals, and an idiotic Turbo Vision book that contained broken code. I think it was my toughest project ever so far. And I failed it too ;)
On a serious note, I think that inventing useful and fundamentally new algorithms is it. Easily takes half a year of time keeping it in the back of your mind.
The second was a weird one. A customer had a web application that was showing odd intermittent failures, but only on HTTP POST. The system was an extranet application so I could speak in detail to the users having these problems. It appeared that only certain offices were having the problem, but not all users. I had to prove that it wasn't a problem in the application. I had to prove it wasn't something in IIS. I had to prove it wasn't something specific to a client. It turned out to be a misconfigured load balancer, where the MTU size was incorrectly set. The fact that only HTTP POSTs failed was the clue. Nearly all browsers send HTTP POSTs in two packets or more, even if they fit in one. GETs always go in one. When the penny dropped after weeks of pain it was extremely satisfying to see the problem solved.
At the highest level the engine is based on Hierarchical Task Networks (HTN) (http://en.wikipedia.org/wiki/Hierarchical_task_network). Primitive actions like Move and Fire can be compounded into higher level commands like Sustain Cover Fire or Setup Ambush. These are recursively aggregated into even higher level directives like Attack Enemy Position. All commands have pre- and post-conditions, and the AI engine stitches together a sequence of commands under the current constraints (eg using Charge Hill instead of Charge Building when Attack Enemy is specified on a knoll instead of a building, you need to Setup Ambush on an incoming road if expecting enemy reinforcement).
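To give a feel for the decomposition, here is a toy sketch in Python. It is not the engine's code. The task names come from the description above, but the particular decompositions, the `terrain` state key and the `plan` function are invented for illustration. A real HTN planner would also apply each action's post-conditions to the state as it goes, which this sketch skips.

```python
# Primitive actions that units can execute directly.
PRIMITIVES = {"Move", "Fire"}

# Compound task -> ordered list of (precondition, subtasks).
# These decompositions are invented purely for illustration.
METHODS = {
    "Attack Enemy Position": [
        (lambda s: s["terrain"] == "knoll", ["Sustain Cover Fire", "Charge Hill"]),
        (lambda s: s["terrain"] == "building", ["Sustain Cover Fire", "Charge Building"]),
    ],
    "Sustain Cover Fire": [(lambda s: True, ["Fire", "Fire"])],
    "Charge Hill": [(lambda s: True, ["Move", "Fire"])],
    "Charge Building": [(lambda s: True, ["Move", "Move", "Fire"])],
}

def plan(task, state):
    """Recursively expand a task into a flat list of primitive actions."""
    if task in PRIMITIVES:
        return [task]
    for precondition, subtasks in METHODS.get(task, []):
        if not precondition(state):
            continue
        steps = []
        for sub in subtasks:
            sub_plan = plan(sub, state)
            if sub_plan is None:
                break  # this method cannot be completed, try the next one
            steps.extend(sub_plan)
        else:
            return steps  # every subtask expanded successfully
    return None  # no applicable method for this task in this state

print(plan("Attack Enemy Position", {"terrain": "knoll"}))
```

The same high-level directive yields a different primitive sequence depending on the state, which is the "Charge Hill instead of Charge Building" behaviour described above.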
At the lowest level, I had to implement detailed algorithms for pathfinding (easy, but consider different terrain, on-road, etc), line-of-sight (actually pretty hard due to time constraints and size of map) for locating best vantage points, etc. Implementing fuzzy conditions is tricky, eg the idea of 'threat' where a unit knows it is outnumbered by a superior force (when is an enemy unit relevant? based on distance? what if it's currently engaged against an allied unit? Can you get threat from one direction but not the other? what if the enemy can't see you?)
The hardest part about this was actually getting the domain experts to write the heuristics using my script editor. These army guys don't think like programmers at all! Even getting them to encode their decision process using simple If-Then-Else was an insurmountable challenge in mental gymnastics. :(
Still it was fun. As it turns out, it was used to train actual army commanders in a small foreign country. :)
dead programmer/reverse engineering projects are in there, but not as common these days.
i did a bunch of rather amazing things in the rocket business for martin-marietta, raytheon, applied research associates.
once, i walked up to an employee who had been fighting a problem for two weeks, asked him what he was having an issue with, determined it was a variable initialization error, looked at his monitor, pointed to the missing # sign on an 8051 assembler listing and said, "it's right here". walked away... 60 seconds total. win.
another time... three engineers working on re-creating a poorly documented test station for a guidance section on a missile. again... two or three weeks spent trying to wake up an interface board based on a 68020 micro. they worked for me, and had sent an engineer to ask me what i'd do. i was busy, but took two pieces of test equipment down to the unit (a logic analyzer and a processor troubleshooter (Fluke 9010)), made them show me their failure. looked at what the processor was doing (nothing... just executing 00 op codes). inspected the memory to see where the real code lived. found it, but it was at the wrong address. burned an EPROM set with the code moved from 08000h to 000000h and installed them. problem solved. start to finish 2 hours. win. went back to my management stuff, which was making charts explaining schedule slips and cost overruns. these guys were hot shots. i smoked em. woot.
fixed a consumer product that had a serious design error. solved problem. product went on to sell several hundred million dollars of units, making the owner a very rich man and handing him 85% of his market. huge win.
i have at least one or two of these a year (though the last one is a bit more unusual because of the volumes involved.)
We had a cloud deployment system that was also responsible for configuring Nagios instances. The two processes (Java and Python) that were responsible for the Nagios configuration communicated over XMPP. The Java process could only be restarted once a week because we guaranteed a certain uptime to our customers.
One day I decided to refactor the way it worked, and this entailed a change to the api. I began with the Java code, but when I got to the Python parts, more important work came up so I left my code in a branch and forgot all about it. When the next release came up a coworker started to merge all the new features into the new release branch. He asked if he could merge my stuff, and as I had just finished some new features in another branch I said yes. Two days later the new half-refactored, untested and very scary Java code was live in production without my knowledge.
An hour after the release the first alerts of nagios syncing problems came in. We scratched our heads and soon discovered that we had merged too much. Now we had to choose: roll back or roll forward. To roll back we had to take down the application, restore a backup, synchronize a lot of other connected applications. After checking whether the Java code that was in production now actually worked and spit out sensible messages (it did, surprisingly) I decided to try and create the rest of the Python implementation right away.
So with my headphones on and a big NO sign on the door I started programming and created what would normally take one or two days in just over an hour, including the release to production. The good news was that we had had no customer downtime of the Java process and now had a much cleaner Nagios configuration system, the only bad thing was that the alerting system had been down for about 2 hours.
The situation was pretty fucked up ;)
Two years ago the startup I work at had an unofficial doorman working the building we occupied in the financial district of SF. A homeless Vietnam Vet who slept on the concrete above a steam pipe for warmth. This gentleman was almost always there when I left work for the evening, sitting on his milk crate or standing around either engaged in deep conversation with someone(s) about whatever or spouting err.. let's just say cat calls to women walking by. I'd always talk to this guy after work as his stories were always entertaining, funny and typically heartfelt.
Last summer he received a Sony Ericsson from a catholic priest in the tenderloin and was placed on her family plan. This was the W580 which, in its heyday, took amazing photographs. I don't remember how it was brought up but at some point I mentioned that he could share these pictures via the web for his adoring fans (of whom he had many) and his eyes lit up. "I'd love that, B!" he told me "I just don't want to be a part of twitter or facebook or anything like that." To which I responded that I could build a site for him.
I was winding down an entire rebuild of the UI for the company I work at. I was responsible for building aspects of everything from django db models to css (SCSS really) and at home that past year I had learned how to get a basic django site hosted on my very own virtual instance. If this guy had a smart phone the job would have been as simple as building a responsive UI and a basic django backend.
The challenge then became hooking now deprecated tech to the web. After considering the problem only one solution came to mind: e-mail. I then went to work attempting and failing to scrape gmail and wound up just installing sendmail on the same server I was serving his django instance from. After a number of late nights and a slew of cursing I got my first end-to-end sendmail to django integration set up and from there created email addresses which acted as api endpoints. I parsed the sender lists to ensure only my phone number or my friend's had permission to post to the site. I then went to work figuring out how to read, save and resize images emailed to this sendmail server.
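For anyone curious what the email-as-endpoint part can look like, here is a rough sketch using only Python's standard `email` package. It is not the code from the project. The whitelist address, the `extract_images` name and the sample message are all made up, and resizing is left out because it would need an imaging library.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from email.utils import parseaddr

# Hypothetical whitelist; a real one would hold the phones' gateway addresses.
ALLOWED_SENDERS = {"5551234567@carrier.example"}

def extract_images(raw, allowed=ALLOWED_SENDERS):
    """Return [(filename, bytes)] for image parts, or [] if the sender is not allowed."""
    msg = message_from_bytes(raw, policy=policy.default)
    _, sender = parseaddr(str(msg.get("From", "")))
    if sender.lower() not in allowed:
        return []
    images = []
    for part in msg.walk():
        if part.get_content_maintype() == "image":
            name = part.get_filename() or "unnamed"
            images.append((name, part.get_payload(decode=True)))
    return images

# Build a sample message roughly like a picture mail sent from a phone.
sample = EmailMessage()
sample["From"] = "Friend <5551234567@carrier.example>"
sample["To"] = "post@site.example"
sample.set_content("new photo")
sample.add_attachment(b"\x89PNG fake bytes", maintype="image",
                      subtype="png", filename="photo.png")

print([name for name, _ in extract_images(sample.as_bytes())])
```

Keep in mind that a From header is trivially spoofable, so a whitelist like this only keeps out accidental posts, not a determined prankster.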
By this time my friend's small Ericsson had seen better days. The man's hands were so big that his thumb usually pressed 4-5 keys at once and I would typically see him attempting to bang out text messages or dial phone numbers with his pinky. This is understandably frustrating AND he was working on actually cleaning up his drug and alcohol habits at the time so his phone would meet with the wall every so often. Typically I could repair it but eventually it was damaged beyond repair.
From there he ended up getting a cheap go phone with no camera and he was kinda bummed but I recommended a pivot to him. Audio posts powered by twilio. He liked the idea so I went about consuming the twilio api and was able to get a proof of concept working fairly easily. I had also unlocked the secrets of SSL, in order to give users a more secure login experience. I finally got one post from him, just saying hi, but from my cell phone. At this point I gave him a card with a number to call and access code to punch in, but he has asserted that he doesn't want audio only posts, he wants audio and image.
So, I now have a server for him running django and sendmail which work in concert only for registration (your email address needs to be white listed if you want to register and the only way to do that is to have me or my friend add your email address to the body of a text message we text to a white list email endpoint).
He has also since found out that his liver is essentially fucked and that he has terminal cancer which puts the idea of a website for him on the back burner for both of us.
So I dunno, this was essentially a CRUD type problem, and while the technical challenges weren't as difficult as many of the ones I've read here, I think the creative challenge of giving a semi-tech literate homeless man a means to operate his own website with a decent amount of autonomy was a worthy and fulfilling one. I haven't been in this industry terribly long, but the possibilities I see out there are damn exciting. I made no money from this work but learned a lot I didn't already know and achieved something I feel like most people wouldn't even attempt. If you feel like you're stuck in CRUD land, why not attempt to mix up your customer base a bit? Think of something you could do that will be challenging and fun enough that you could perhaps open yourself up to an entire arena of unpaid work (I know, blasphemy!). It might not sound glamorous.. and well.. it really isn't, but I've found that the connections I can make with people using skills learned in this industry are what really drive me.
So fuck it man, get CRUDDY and build some cool-ass shit for people who aren't looking at some rigid business model. You might find yourself seeing CRUD work in a whole new light.
I'm really having a hard time parsing this sentence... what's a tenderloin? Who is 'her'?
The priest is the her in this sentence.
But I know this is a catholic church and she is a (or maybe the?) priest there. It's also possible that I could be getting my terminology wrong but she is definitely not a nun. This place is right in the middle of the TL and I believe functions mostly as an aid for the homeless of San Francisco, they close their doors some time in the afternoon. So yeah, I don't know enough about the catholic church to know what is or is not allowed I just know what I see and am told.
Thanks again for the kind words, they made me smile :)
I needed a video mode that was a better fit for the screen, so I derived a set of parameters to add a new video mode to the Linux drivers for the VT1625 Via video card.
I did this by writing a small C program which could inspect the video card registers on a Win98 machine that was using the official driver. Once I had a set of register values, I used online documentation to transform them into the initialisation values used in the video driver.
The video mode was submitted back to the openchrome project, and as far as I know it's still there. It took me about 5 days of prodding, and another day of trial and error.
There was one development server and at some point we added a NAS. Due to the access restrictions and my inexperience with other stuff, the server would mount the NAS and reshare that through Samba.
On Windows machines every folder would show up as a file.
You could still manually enter the path and get a directory listing, but you could of course not navigate the directory structure in any satisfying way.
It took me a few weeks to figure out that the problem came from an error in the NFS file system. I figured out the fix and patched the kernel.
Felt nice when I was done.
I've put a link to the serverfault thread about the issue.
Turns out it's basically a black box they've bought from an obscure company.
We then needed to parse the HTML tables by hand, which should be relatively straight-forward for simple tables. But then it turns out the tables are filled with rowspans, colspans and nested tables. It took me quite a while to create a decent algorithm, as I had never done anything similar before. But now we can parse all the HTML tables on that website and cache them in the app.
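For the curious, the core of the rowspan/colspan handling can be sketched like this. This is a from-scratch Python illustration, not the algorithm from the app. It assumes the cells have already been pulled out of the HTML as `(text, rowspan, colspan)` tuples, and it ignores nested tables entirely.

```python
def expand_table(rows):
    """Expand rows of (text, rowspan, colspan) cells into a rectangular grid."""
    grid = {}  # (row, col) -> text
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            while (r, c) in grid:  # skip slots already claimed by a rowspan from above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    # Slots that no cell covers are filled with an empty string.
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]

# "A" spans two rows, "B" spans two columns.
table = [
    [("A", 2, 1), ("B", 1, 2)],
    [("C", 1, 1), ("D", 1, 1)],
]
for line in expand_table(table):
    print(line)
```

Repeating the spanned value in every slot it covers makes the result easy to cache and query by row and column, at the cost of losing the information that the cells were merged.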
As I said, nothing impressive or fascinating, but I was still quite proud of my algorithm.
I had multiline XPath-expressions and comments like "grey box on the top-right of product overview pages containing links like X, Y and Z"
and the XPath expression was looking for things like a table with a grey background that is a descendant of the table that comes after the comment "<!-- CONTENT STARTS HERE -->" and that is a following-sibling of a form with name bla.
I hacked together a system of file system listeners and named pipes so that I can inject code into a sandboxed app on OSX and have it communicate with the non-sandboxed process that put it there.
A coworker was leading a project that had to get a Qt/QML application to show an accelerated and responsive display in at most 3.5s from power-on. On his request I was pulled in to solve the hardest part of that pipeline.
All the simple solutions had been done already: boot loader timings were down to nothing; the NAND flash driver was optimised for reading with UBIFS tweaks; we loaded kernel modules in sequences of parallel insmods so the wallclock time was kept at minimum; and so on...
A year or two earlier we had discovered an Intel engineer's presentation about optimising Qt binary load times. The two main tricks were to build a completely static binary, and to reorder the binary symbols in the order they were read. Through unofficial channels I found out that Intel used a customised linker to generate their symbol list orders in fully automated fashion. I did not have enough time for that luxury, and as a bonus I had to deal with proprietary 3D drivers and their respective kernel modules. Fully static linking was not an option.
I ended up using OBS in a very weird way. I had three different Qt builds. One was explicitly compiled for dynamic builds without ANY optimisations, all inlines disabled, and with the function call tracer enabled. Another had all the regular optimisations and was built statically, to be used in production. The static Qt was patched just enough to allow loading dynamically built 3D driver modules. This was necessary for the next step...
The software was built against the dynamic, deoptimised Qt. It was then run on the target device and the function call trace log was saved and extracted. (GCC's function call tracer only worked with dynamic libraries.) I then had a set of tools: one to extract the order in which the Qt symbols were accessed, another to extract the symbols from the Qt libraries, and a third to produce a linker script that forced the library symbols to be linked into a static binary in the order they were read.
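The first and third tools boil down to "first-use order, then one linker script line per symbol". Here is a hedged Python sketch of that idea, not the actual tooling. The trace format (`enter <symbol>` lines with names already resolved) is hypothetical, and the `.text.<symbol>` section naming assumes the library was compiled with `-ffunction-sections`.

```python
def first_use_order(trace_lines, known_symbols):
    """Return library symbols in the order they were first entered in the trace."""
    seen = set()
    order = []
    for line in trace_lines:
        parts = line.split()
        if len(parts) != 2 or parts[0] != "enter":
            continue  # ignore exits and malformed lines
        symbol = parts[1]
        if symbol in known_symbols and symbol not in seen:
            seen.add(symbol)
            order.append(symbol)
    return order

def linker_fragment(order):
    """Emit one input-section line per symbol, in first-use order."""
    return "\n".join(f"    *(.text.{symbol})" for symbol in order)

# Hypothetical trace and symbol list, for illustration only.
trace = ["enter main", "enter qt_init", "exit qt_init",
         "enter paint", "enter qt_init", "enter app_only_helper"]
library_symbols = {"main", "qt_init", "paint"}

print(linker_fragment(first_use_order(trace, library_symbols)))
```

The fragment would go inside the `.text` output section of a linker script, so that code needed early in startup sits contiguously at the front of the binary.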
These tools together produced files for the final Qt build: a static, heavily optimised Qt with the ability to load 3D driver modules at runtime to avoid (L)GPL mousetraps. This Qt static build contained only the symbols that were needed to build the final binary. This way, when building the optimised binary, we knew that there were no unnecessary symbols in the product binary - and with the linker script we had forced them so that the read from the NAND flash could be a single linear sweep.
The reason to do this in three distinct phases was to ensure that the client could reproduce the steps in their own OBS instance. If and when they updated their QML code to use new functionality, they could then regenerate the intermediate files and the final binaries. Once documented and tested, the full cycle took about 10h total, thanks to the need to rebuild the intermediate Qt.
These tricks got the final boot time to about 3.9s. The coworker who had requested my assistance came up with the final trick: patch Qt a bit more to allow loading graphics in pre-processed form, so that they would not need to go through all the intermediate parsing steps. We did a few profiling runs and discovered that processing the PNG files, even when built into the binary as static assets, took nearly 700ms in total to transform them into regular Qt pixbufs.
It wasn't the hardest project I've been in, but it pulled off a lot of low-level magic to get around system and platform limitations. And it was certainly satisfying once the problems were sorted out. :)
I've got a FLIR infrared camera that runs Windows CE, which takes a lot longer than 3-4 seconds to cold-boot. The camera initially goes to sleep when you power it off, then after a day or so it shuts down entirely to save its batteries.
One nice thing they did was to include enough intelligence in the low-level ASIC to drive the display with an unprocessed, uncalibrated live image while the user is waiting for the "real" camera application to come up. Much better than staring at a blank screen for 20-30 seconds.
That's all I can say about it for now. (Although I did learn to dislike 3D drivers in that project. With a passion. Can't get contiguous kernel memory? Oh, I'll just wait here and block until I get what I want. What, you wanted to do something ELSE while I probe the hardware and spend 1.15s initialising myself? Tough luck, go sit in a corner and cry.)
* Complex reporting requirements with an extremely vague specification on a completely normalised database. The sheer quantity of various joins was fun. There must've been a better way
* Optimising certain read operations which operated over a potentially large dataset.
* Providing an efficient full-text search based around lucene which had some pretty interesting requirements, that came down to lucene filters in the end.
It turned out we were racing the .NET garbage collector when passing a reference off to some native code. The variable being used by the native code would be collected by .NET as it determined that we no longer needed the variable when we were still actually using it. A classic .NET interop mistake that was fixed by wrapping the use in a 'using' clause. The reason we didn't see it while developing was that .NET garbage collection is different between Debug and Release builds (Debug builds extend the lifetimes of all variables so that they live for the entire length of the function they are used in; Release builds optimize for the shortest variable lifetime).
Another time, had a crash that seemed to occur after a consistent time period. The crash was in third-party code that we didn't have the source for. It 'kind of' behaved like a memory leak, except that everything was telling us the app's memory usage was normal.
After several weeks almost tearing my hair out over it, I ended up setting a breakpoint in the third-party lib using WinDbg and using PageHeap on the app while it was working in order to see who owned the memory that the third-party library was trying to write to. That led us to a line of code that was performing a malloc() without checking the result, then sending that off to the third-party library for use later. The malloc failed, neither we nor the third-party library checked for the NULL, and the third-party lib wouldn't try to use the memory until later.
We eventually determined that we were seeing heap fragmentation causing the malloc to fail; we found that under certain conditions we'd leak small objects. Again WinDbg and other tools came in handy; they DID report high address space fragmentation, but it took us a long time to put the two things together. When we fixed the small object leak, the crash in the third-party library went away.
So classic bugs in hindsight, but they certainly taught us how awesome WinDbg can be (along with the other excellent debugging facilities provided by Windows). In the end, well worth the time investment in learning.
Very proud of the result :-)
Someone I knew in college heard Smith, called me,
and I flew from Maryland to FedEx and was hired to
solve the problem.
Back in Maryland, we had some meetings, including
one in a conference room at the Georgetown library,
with various approaches to the problem, none good.
There was some politics involved.
Really, the project was just mine, so I thought of a
first-cut approach and attacked the problem; I was
still teaching computer science at Georgetown. In
six weeks I had a program, turned in the grades for
the courses I was teaching at Georgetown, drove to
Memphis, rented a room, tweaked the program a
little, and declared it done.
Soon too many on the Board were saying that there
could be no good solution to the scheduling problem.
So, one evening a senior VP and I used my program to
schedule the whole planned fleet into all the
planned cities, printed out the schedule, and
At a senior staff meeting, Smith's reaction was, to
paraphrase, "An amazing document; solved the most
important problem facing FedEx".
Our two Board representatives from Board member
General Dynamics went over the schedule carefully
and announced, "It's a little tight in a few places
but it's flyable." The Board was happy and a big
chunk of equity funding was enabled.
So, the software solved the practical problem at the
time and in time. But the software was not a grand
solution to everything in fleet scheduling.
The hard parts were, (a) the politics, (b) designing
a program that would be powerful enough to solve the
practical problem at the time but easy enough to
write to solve the problem in time.
Later I attacked the problem via 0-1 integer linear
programming set covering; but the politics got much
worse; the promised stock was very late and still
just a handshake deal with Smith with only my offer
letter on paper; my wife was still in Maryland; and
I wanted either the stock or a Ph.D. Smith's last
promise of stock was $500,000 worth that might be
worth 1000 times that now, but it was just a
handshake deal. So, left for a Ph.D.
(2) Nuclear War at Sea. To support my wife and me to
the end of our Ph.D. degrees, I took a part time job
in military systems analysis. At one point the US
Navy wanted to know how long the US SSBN fleet would
last under a special scenario of global nuclear war
but limited to sea. They wanted their answer in two
There was an old paper of B. Koopman that argued that
'encounters' at sea between Red and Blue weapons
systems would form a Poisson point process. I added
on a little and got a continuous time, finite state
space Markov process. There is a closed form
solution as a matrix exponential, but due to a
combinatorial explosion the state space was far too
large for doing anything numerical with that
But it was easy enough to generate sample paths, so
I wrote software to generate and average, say, 500
sample paths. The work passed a technical review by
a famous mathematician, and the Navy got their
results on time. The next day we took a vacation in
Shenandoah, and my wife got a vacation she wanted on
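The sample-path idea in (2) can be sketched in a few lines of Python. This is only a toy illustration, not the model from that project: the state is just a pair of counts (red, blue), encounters arrive as a Poisson process with an assumed rate proportional to red * blue, and each encounter removes one unit from one side. The names simulate_path, mean_survivors, rate, and p_blue_lost are all made up for the example.

```python
import random

def simulate_path(red, blue, rate, p_blue_lost, horizon, rng):
    """Simulate one sample path of a toy continuous-time Markov chain.

    Encounters occur at total rate `rate * red * blue` (an assumption for
    this sketch). Each encounter removes one Blue unit with probability
    `p_blue_lost`, otherwise one Red unit. Returns Blue units left at `horizon`.
    """
    t = 0.0
    while red > 0 and blue > 0:
        # Exponential waiting time until the next encounter.
        t += rng.expovariate(rate * red * blue)
        if t > horizon:
            break
        if rng.random() < p_blue_lost:
            blue -= 1
        else:
            red -= 1
    return blue

def mean_survivors(red, blue, rate, p_blue_lost, horizon, n_paths=500, seed=1):
    """Average the surviving Blue count over `n_paths` independent sample paths."""
    rng = random.Random(seed)
    total = sum(
        simulate_path(red, blue, rate, p_blue_lost, horizon, rng)
        for _ in range(n_paths)
    )
    return total / n_paths

if __name__ == "__main__":
    print(mean_survivors(red=20, blue=10, rate=0.01, p_blue_lost=0.4, horizon=30.0))
```

The point of the approach is that the cost per path grows with the number of encounters, not with the size of the state space, so the combinatorial explosion never has to be enumerated.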
(3) Winning a Contract Competition. There in
Maryland I was in a software house working for a US
Navy lab. We were in a competition for a software
development process. Part of the work was to
measure the power spectrum of ocean wave noise. I
got smart on power spectral estimation and wrote
some illustrative software of passing white noise
through a filter with a specific transfer function
and accumulating the empirical power spectrum of the
output of the filter. The software showed what the
math claimed: At the low frequencies the project
wanted, an accurate power spectrum needed a
surprisingly long interval of data. The software
showed that with short intervals of data, the
estimated power spectrum could have big peaks that,
really, were just sampling noise that would go away
as the length of data increased.
So I called one of the customer's engineers and
showed them the news. As a result, our software
house won the competition.
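A rough Python sketch of the kind of experiment described in (3): white Gaussian noise goes through a filter, and the power at one low-frequency bin is estimated from records of different lengths. Everything specific here is an assumption for illustration: the one-pole filter y[k] = a*y[k-1] + x[k], the Bartlett-style averaging of fixed-length segments, and the function names. It is not the original software.

```python
import cmath
import math
import random

def filtered_noise(n, a, rng):
    """White Gaussian noise through the one-pole filter y[k] = a*y[k-1] + x[k]."""
    y, out = 0.0, []
    for _ in range(n):
        y = a * y + rng.gauss(0.0, 1.0)
        out.append(y)
    return out

def bartlett_bin(x, seg_len, k):
    """Periodogram value at DFT bin k, averaged over non-overlapping segments."""
    twiddle = [cmath.exp(-2j * math.pi * k * n / seg_len) for n in range(seg_len)]
    n_seg = len(x) // seg_len
    total = 0.0
    for s in range(n_seg):
        seg = x[s * seg_len:(s + 1) * seg_len]
        total += abs(sum(v * w for v, w in zip(seg, twiddle))) ** 2 / seg_len
    return total / n_seg

def relative_spread(n_samples, trials=40, seg_len=64, k=2, a=0.9, seed=0):
    """Std/mean of the estimate over repeated independent records of length n_samples."""
    rng = random.Random(seed)
    est = [bartlett_bin(filtered_noise(n_samples, a, rng), seg_len, k)
           for _ in range(trials)]
    mean = sum(est) / trials
    var = sum((e - mean) ** 2 for e in est) / trials
    return math.sqrt(var) / mean

if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, round(relative_spread(n), 3))
```

With a single short record the estimate at a given frequency scatters by roughly its own size, which is what shows up as spurious peaks; with the frequency resolution held fixed, the scatter shrinks as more data is averaged.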
(4) Anomaly Detection. I was working on applying
artificial intelligence to the monitoring and
management of server farms and their networks. One
of the main techniques was 'thresholds', and I
wanted something better.
So, I put my feet up, popped open a cold can of Diet
Pepsi, reviewed some of my best graduate school
material including some of ergodic theory, had some
ideas, wrote out some theorems and proofs, and wrote
some corresponding software.
So, suppose we are given a system to monitor. Suppose
100 times a second we get data on each of 15
variables. Collect such data as 'history' data
('learning data') for, say, three months (assume a
stable server farm or network). Study this data,
let it 'age', and be fairly sure that the system
being monitored was 'healthy' during that time.
Then in real time 100 times a second, report if the
system is 'sick' or 'healthy'. Have false alarm
rate known in advance and adjustable over a wide
range. So, have a statistical hypothesis test that
is both multi-dimensional and distribution-free, one
of the first such. Although there is not enough
data on actual anomalies to apply the Neyman-Pearson
result, do process the data in a way to promise
relatively high detection rate for any selected
false alarm rate. Find a way to do the computations
quickly and efficiently.
I did those things.
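The comment above does not say how the test was built, so the following Python sketch is NOT that method. It is one well-known way to get a multi-dimensional detector whose false alarm rate is set in advance without assuming a distribution: score each point by its distance to the k-th nearest neighbor in the healthy history, and take the threshold from the ranks of scores on a second batch of healthy data. The guarantee assumes the healthy points are exchangeable, which real time series may violate. All names (knn_score, calibrate, is_sick) are made up for the example.

```python
import math
import random

def knn_score(point, history, k=3):
    """Distance from `point` to its k-th nearest neighbor in `history`."""
    dists = sorted(math.dist(point, h) for h in history)
    return dists[k - 1]

def calibrate(history, calibration, alpha, k=3):
    """Threshold from ranks of scores on held-out healthy data.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest calibration score, so a
    fresh healthy point exceeds it with probability at most about `alpha`.
    """
    scores = sorted(knn_score(c, history, k) for c in calibration)
    idx = min(len(scores) - 1, math.ceil((1 - alpha) * (len(scores) + 1)) - 1)
    return scores[idx]

def is_sick(point, history, threshold, k=3):
    """Report 'sick' when the point is farther from the history than the threshold."""
    return knn_score(point, history, k) > threshold

if __name__ == "__main__":
    rng = random.Random(0)
    healthy = lambda: (rng.gauss(0, 1), rng.gauss(0, 1))
    history = [healthy() for _ in range(300)]
    calibration = [healthy() for _ in range(200)]
    thr = calibrate(history, calibration, alpha=0.05)
    print(is_sick((0.1, -0.2), history, thr), is_sick((8.0, 8.0), history, thr))
```

Changing alpha moves the threshold along the sorted calibration scores, which is what makes the false alarm rate adjustable. The brute-force neighbor search here would be far too slow for 100 samples a second against three months of history; doing that part quickly is a separate problem.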
Some office politics got involved: Suddenly I was
told to write a paper on my work and that the
company would review the paper. If the paper was
not publishable, then I would be fired. I wrote the
paper; the company claimed that the paper was not
publishable; and I was fired.
The guy who walked me out the door had been in
management for about 20 years but was demoted out of
management the next day. The main guy after my ass
was two levels higher up, pissed off at me for no
good reason, two weeks later was demoted one level
in the organization chart, was given a "performance
plan", which he failed, and was demoted out of
The company wrote me a letter giving me intellectual
property rights to my work. Out of the company, I
submitted the paper for publication. The paper was
published in a good Elsevier journal, the first
journal to which the paper was submitted, without
significant revision (one reviewer wanted to change
how the first line of each paragraph was indented).
The Editor in Chief of the journal invited me to
give the paper at a conference he was running -- I
(5) Internet Search, Discovery, Recommendation,
Curation, Notification, and Subscription. My view
is that current Internet search techniques are
effective for only about 1/3rd of Internet content,
searches users want to do, and results they want to
find. I want the other 2/3rds.
With my feet up again and another cold can of Diet
Pepsi, I had some ideas and wrote the corresponding
code. Now all the code is written for a
corresponding 'search engine' Web site except I have
a nasty little bug having to do with class instance
de/serialization. Should fix that today. Should be
live in a few more months.
- How to store data in an embedded Linux system with no writable partitions. (That's more of a hack than anything; still, it worked!)
2 - The Integrated Medical Systems LS-1, a portable "ICU in a box" done under contract for the US Army. The box integrated multiple medical device subsystems (ventilator, ECG, infusion pumps, smart battery charging, invasive and non-invasive blood pressure, SpO2, etc.). They communicated internally over 100-base-T cat-5, and also had to interface to hospital IT via wifi. And, there was a remote interface unit that had to be able to remotely monitor and control the entire system via internet, from the other side of the planet if necessary. We ended up doing a lot of work to ensure that the system performed exactly as expected and required by clinicians, even in these sorts of remote-operation scenarios. (The thing the Army wanted was to have video and audio from the bedside to remote clinicians, and have the remote people act like team members that were coordinating their efforts with the bedside clinical team.)
3 - (Totally for the hell of it, in hobby mode) I was reading a bit about Shor's algorithm to use quantum computing to factor products of pairs of prime numbers. One key part of the algorithm depends on FFTs, to help with detection of lengths of cycles. If you have a vector of length N, consisting of K equally spaced non-zero elements, and K evenly divides N, then the FFT of that vector will have N/K evenly spaced non-zero elements, and crucially they will start in the zero'th element of the output vector. (There is no such requirement or assumption on the input side.) Happily enough, this is "almost" true if K does not happen to evenly divide N. But this is a bit difficult to prove. Some reasonably complete and rigorous presentations of quantum computing basically say, "The proof of this is outside of the scope of the present work." So, for fun I took it as a challenge to create an elementary proof of this lemma. It turned out to be quite hard, but REALLY fun and satisfying.
4 - Never been really happy with the standard presentations of red-black tree algorithms. It just felt like there was some underlying simplicity in there that was struggling to get out. So, I created a new formulation of red-black trees and corresponding algorithms. Did a web site (gregfjohnson.com/redblackbuilder.html) that illustrates these algorithms, and supports forward and backward execution and single-stepping through the algorithms. This web site was pretty darn hard to get correct.
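For the FFT property in item 3 above, the exact case (K divides N) is easy to check numerically. A small Python sketch, using a plain O(N^2) DFT so it needs nothing beyond the standard library; the helper names comb, dft, and nonzero_bins are made up for the example, and the non-zero input elements are all taken to be 1.

```python
import cmath

def comb(n, k, offset=0):
    """Length-n vector with k equally spaced ones; k must divide n."""
    step = n // k
    return [1.0 if (t - offset) % step == 0 else 0.0 for t in range(n)]

def dft(x):
    """Plain discrete Fourier transform, O(n^2)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]

def nonzero_bins(spectrum, tol=1e-9):
    """Indices of output elements whose magnitude is above `tol`."""
    return [f for f, v in enumerate(spectrum) if abs(v) > tol]

if __name__ == "__main__":
    # N = 24, K = 4: the input ones sit at 2, 8, 14, 20 (shifted off zero).
    print(nonzero_bins(dft(comb(24, 4, offset=2))))
```

The shift on the input side only changes the phases of the output, so the non-zero bins land on multiples of K starting at bin 0 regardless of the offset. Trying an N that K does not divide requires a different way of building the input and is where the "almost" in the lemma comes in.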
Most of them were reinventing the wheel. :-(
I think the chemists have a saying: "With a few hard and dedicated work weeks in the lab, you might save hours in the library!"
Manually making sure your frontend/backend matches. I'll repeat. Manually. Weakly typed. Constant bugs. Constantly seeing parts of your implementation getting deprecated, going unsupported, and of course not matching clients' "hot new thing" (WHY do you need a UI designer, which is a title given to every fuckwit that has ever started photoshop, successfully or -usually- not on an internal order management system ?). Impossible to generate tests that actually catch those bugs. Inconsistent language implementations, to say nothing of dom. Zero support for any kind of legacy thing. Layout engine inferior to what was available before I was born on machines that get their ass kicked by my watch (talking about the NeXT platform).
Getting a redraw loop going for a game is so ridiculously difficult and slow it's painful to write about it. Give me a fucking canvas that calls me when it needs to draw itself, with OPTIONAL double-buffering. Doesn't exist (no, HTML5 canvas is not this, is slow and resource hungry, and this will never change due to how it's designed).
I was happy for 6 months with a small side-scroller implemented on a microcontroller that, together with the LCD, ran for about 6 months on an old CR2025 battery. This thing provides 225 mAh over its entire lifetime, keeping a game going for 6 months.
The state of the art HTML5 side scroller at http://playbiolab.com/ drains my phone battery in 42 minutes flat. Granted, the music's better. The screen size isn't. That battery is a rechargeable 2300 mAh.
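To put the two battery figures side by side, here is the back-of-the-envelope arithmetic in Python. It uses only the numbers quoted above and assumes 30-day months, full discharge in both cases, and that the game is the only load; it ignores the different cell voltages and the phone's much larger screen, so treat the ratio as an order of magnitude.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: 30-day months

def average_current_ma(capacity_mah, hours):
    """Average current draw in mA if `capacity_mah` is used up in `hours`."""
    return capacity_mah / hours

# Coin cell: 225 mAh over about 6 months.
micro_ma = average_current_ma(225, 6 * HOURS_PER_MONTH)   # about 0.052 mA (52 uA)

# Phone: 2300 mAh in 42 minutes.
phone_ma = average_current_ma(2300, 42 / 60)              # about 3286 mA

ratio = phone_ma / micro_ma                               # about 63,000

if __name__ == "__main__":
    print(f"microcontroller: {micro_ma * 1000:.0f} uA")
    print(f"phone:           {phone_ma:.0f} mA")
    print(f"ratio:           {ratio:,.0f}x")
```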
Getting anything remotely resembling productivity going on these platforms (I refuse to refer to web as a single platform) was so hard that I just gave up and ran the other way. And now, of course, every UI toolkit is "legacy" at best, just plain unmaintained at worst, and a non-starter for any project you might want to do.
That is actually considered progress.
How did webdev ever get this fucked up?
What do you think? How do your experiences on the web compare to the others?