Ask HN: What piece of code you wrote are you most proud of? - lucas-piske
======
codegeek
I was working for an investment bank in 2010 as a Tech. Business Analyst. My
job was not necessarily to write code. However, we had a major problem with one
of the projects where we had to process hundreds of thousands of rows in an
Excel spreadsheet for testing, and the version of Excel was 2003. The limit for
one worksheet was only about 65,000 rows, and the testing used to take 3-4
hours (don't remember the details anymore) due to manual copy-paste across
multiple worksheets. One time, my boss mentioned in a meeting that perhaps
someone could do it in MS Access instead? I decided to do it without telling
him, put together a proof of concept in MS Access in 2-3 days, and then
handed it over to a developer to finalize. I did not know much about VBA and
googled the heck out of it to put that application together. We reduced the
testing time to less than 30 minutes, and I was so proud of that. Not the best
code, but the best problem I had solved until then.

------
jjjbokma
I am quite proud of the tumblelog [0] code I wrote. There is both a Python and
a Perl version and it was quite a challenge to keep them in sync and idiomatic
(I hope I succeeded with the Python version). I am also happy with the output
[1].

[0] [https://github.com/john-bokma/tumblelog](https://github.com/john-bokma/tumblelog)

[1] [http://plurrrr.com/](http://plurrrr.com/)

~~~
eindiran
I just wanted to say that I saw your tumblelog post a few weeks back, along
with your posts on plurrrr, and it inspired me to try writing a static site
generator and start writing blog posts again. It's a very cool project.

~~~
jjjbokma
Thanks! And good to hear; I wish more people would start blogging (again).

------
closeparen
A concise, self-contained, unassuming little function that represents some
out-of-the-box thinking to vastly simplify a previously hard problem. Those 25
lines are backed by several months and 10+ pages of analysis establishing that
their simple approach is at least as good as the obvious approach, which would
have meant integrating several very complex dependencies. I've never been
prouder of a diff so small.

Close second, a system I designed and built 2.5 years ago, which faded into
the background and plugged along. Business interest has renewed, and
everything they’re now asking for fits neatly into the architecture. Really
satisfying to see something hold up.

------
throwaway43635
The ones I wrote and upstreamed to fix bugs in the Linux kernel.

------
watersb
The code I am most proud of is the code I did not need.

------
atomashpolskiy
I'm not particularly proud of any code that I've written (looking back, all of
it sucks in one way or another), so I'd rather share the piece of code that
I've written most recently (and maybe receive some constructive criticism,
hehe).

The piece of code I'm speaking of is a variation of the Bip Buffer, which is
itself a special kind of ring buffer. Ring buffers, also called circular
queues, are commonly used in producer-consumer scenarios. The main difference
between a typical ring buffer and a Bip Buffer is that the former operates on
distinct elements whereas the latter operates on contiguous data (e.g. binary
streams).

--

In short, a Bip Buffer is backed by an array, which it splits into two regions:
A and B. At first there's only a single region A, and all incoming data is
appended to it. Simultaneously, the data is consumed (stripped off) from the
beginning of the array, probably by some other process. The region's boundaries
are updated as the data comes in and goes out (in FIFO fashion), so the region
kind of floats from "left" to "right", akin to a sliding window, but shrinking
and growing in lockstep. Once the array gets more than half full (region A's
right boundary crosses the halfway point of the array), it's a good time to
start thinking about wrapping around. From then on, on each change to region A's
boundaries the Bip Buffer compares the free space on both sides of the region. As
soon as there's more space to the left than to the right, a second region B is
created there (on the left, i.e. at the beginning of the array).

Now the newly arriving data is appended to B, but the old data is still
consumed from A. This continues until one of two things happens:

1) B's right boundary hits A's left boundary. In this case no more data can be
added to the buffer, so the producer has to wait until the consumer catches
up.

2) All of A's data is consumed. Now there's only a single non-empty region
left, namely B, so we may as well start calling it A and start over.
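
For anyone who prefers code to prose, here's a minimal sketch of that
two-region bookkeeping in Java. It's my own simplified illustration (the names
aStart/aEnd/bEnd are mine, and it compares the free space on every write
instead of only after the halfway point), not the actual implementation linked
at the end of this comment:

    class BipBuffer {
        private final byte[] data;
        private int aStart, aEnd; // region A occupies [aStart, aEnd)
        private int bEnd;         // region B occupies [0, bEnd), once it exists
        private boolean bExists;

        BipBuffer(int capacity) { this.data = new byte[capacity]; }

        // Producer side: append as many bytes as currently fit, return the count.
        int write(byte[] src, int off, int len) {
            if (!bExists && aStart > data.length - aEnd) {
                bExists = true; // more free space before A than after it: open B at index 0
            }
            if (bExists) {
                int n = Math.min(len, aStart - bEnd); // B may grow until it hits A (case 1)
                System.arraycopy(src, off, data, bEnd, n);
                bEnd += n;
                return n;
            } else {
                int n = Math.min(len, data.length - aEnd); // A grows to the right
                System.arraycopy(src, off, data, aEnd, n);
                aEnd += n;
                return n;
            }
        }

        // Consumer side: read up to len bytes from the front of A (FIFO order).
        int read(byte[] dst, int off, int len) {
            if (aStart == aEnd) {                // A is fully consumed
                if (!bExists) { aStart = aEnd = 0; return 0; }
                aStart = 0;                      // case 2: B becomes the new A
                aEnd = bEnd;
                bEnd = 0;
                bExists = false;
            }
            int n = Math.min(len, aEnd - aStart);
            System.arraycopy(data, aStart, dst, off, n);
            aStart += n;
            return n;
        }
    }

The nice property is that both sides always deal with a contiguous span of the
array, so a socket can be drained straight into it and a parser can decode
from it in place.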

--

That aside, my use case is a bit more complicated than just reading a
continuous stream of data. I have a messaging system where nodes exchange
large quantities of data with each other. All communication goes through a
single TCP connection, and this includes both service messages (like requests
for blocks of data and announcements about having finished downloading certain
blocks) and the actual data.

What I, as a well-behaved peer, want to be able to do is:

(a) Process the actual data asynchronously, as there are multiple incoming
streams, disk I/O, checksum verification, etc., and all of these contend with
each other and generally take some time.

(b) Process service messages as soon as possible (e.g. in case some peer that
happens to be sending me data sends me a request for some other data in the
middle of the transfer). In technical terms, I want to minimize the latency of
my peers' requests.

A simple way to accomplish this would be to copy blocks of data into memory
and enqueue them for delayed, asynchronous processing (see the short sketch
after this list). This works pretty well for small amounts of data (tens of MB
per second), but has some limitations:

- Some runtimes incur overhead for each allocated object. E.g. on the JVM each
object is prepended with several bytes of metadata; references to the object
have to be tracked for the purposes of garbage collection; and if asynchronous
processing of the object takes too long, it may be promoted (copied) to a
different, long-lived part of memory (and, depending on the load, "long" may
actually mean just 50 ms more than usual).

- Copying takes time. Obviously, one copy from main memory to disk is
absolutely necessary. The NIC buffer can be mapped to main memory via DMA if
the OS supports it, or else that's one more copy. Why add yet another copy
into the application process's memory?
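
For reference, the straightforward approach described above is roughly this
(my own illustration in Java; the class and method names are made up):

    import java.util.Arrays;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    class CopyingQueue {
        private final BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();

        // Called on the I/O thread for every decoded data block: the extra
        // allocation and copy made here is exactly the overhead discussed above.
        void enqueue(byte[] receiveBuffer, int offset, int length) {
            pending.add(Arrays.copyOfRange(receiveBuffer, offset, offset + length));
        }

        // Called by worker threads that do disk I/O, checksum verification, etc.
        byte[] take() throws InterruptedException {
            return pending.take();
        }
    }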

--

So my idea is to tweak the Bip Buffer in such a way that it can hold the data
that's going to be processed asynchronously while _at the same time_
continuing to process the next messages in the stream. Basically, it remembers
which areas of the buffer are "reserved" for delayed processing (and thus must
not be overwritten), and goes on operating as usual, reading subsequent data,
until it wraps around and hits the reserved data, and only then does it stop.
And if the data arrival rate is not too high, chances are the reserved data
will already have been processed and disposed of by the time the buffer wraps
around.
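
The reservation bookkeeping itself can be fairly small. Here's a rough sketch
of just that part (again with my own made-up names, not the linked
InboundMessageProcessor): a block is recorded when it's handed off for
asynchronous processing and released when the consumer is done, and after
wrapping around the writer refuses to advance past the lowest still-reserved
offset:

    import java.util.TreeMap;

    class ReservationTracker {
        // offset in the buffer -> length of a block that is still being
        // processed asynchronously and must not be overwritten
        private final TreeMap<Integer, Integer> reserved = new TreeMap<>();

        // Called when a data message is decoded: instead of copying the payload
        // out of the buffer, just remember where it lives.
        synchronized void reserve(int offset, int length) {
            reserved.put(offset, length);
        }

        // Called by the asynchronous consumer (disk writer, checksum
        // verification, ...) once it is done with the block.
        synchronized void release(int offset) {
            reserved.remove(offset);
        }

        // After wrapping around, the writer may only fill the buffer up to this
        // offset; -1 means nothing is reserved and the whole buffer is free.
        synchronized int writeLimit() {
            return reserved.isEmpty() ? -1 : reserved.firstKey();
        }
    }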

Link to code:
[https://github.com/atomashpolskiy/bt/blob/d0f10b7a306c7595ef...](https://github.com/atomashpolskiy/bt/blob/d0f10b7a306c7595efb214dafd1ae52a51068d0d/bt-core/src/main/java/bt/net/pipeline/InboundMessageProcessor.java#L36)

Please keep in mind that it's a POC and was programmed in a single day, so
there might be (must be?) some edge cases that I did not think about, and the
code style is a bit rushed (I was too busy drawing pictures of this craziness
and debugging :)

