
Ask HN: What useful internal tools or libraries have you built in your company? - crack-the-code
I'm curious to know what kind of tools, scripts, automation, libraries, etc. you all have built to help boost the productivity of your team(s).
======
z3ugma
At a company of 10,000, it's important to know the 100 people you'll be
working closest with. I built a "memory" game as a webapp: it shows the
faces of 4 people on your team alongside a single name and a list of
self-assigned skills. You click on a photo to match the name to a face,
and once you guess right a new set loads. You can randomly click through
your whole team and learn a lot about them in just 15 minutes or so.

The whole thing was built with read-only SQL scripts, Flask, and some jQuery.
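
The core round logic for a game like this fits in a few lines. A hedged sketch (the data shape and names here are mine, not the original app's):

```python
import random

def make_round(people, rng, choices=4):
    """Pick one target person plus distractors for a single guessing round.

    `people` is a list of dicts with "name", "photo_url", and "skills" keys;
    this schema is an assumption for the sketch, not the original app's.
    """
    faces = rng.sample(people, choices)   # four distinct teammates
    target = rng.choice(faces)            # the one whose name is shown
    return {"name": target["name"], "skills": target["skills"], "faces": faces}
```

The Flask view would serve one round at a time and check the clicked photo against the target.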

~~~
suramya_tomar
This sounds very interesting and useful. I was thinking about making something
similar but with my personal contacts & people I meet socially.

Is your code open source, or only for internal use?

~~~
sloaken
I agree, I think that's a great idea. I would like to set it up at my place.
Mostly for me, as I can never remember people; it's not that big a company,
but there are still so many people I don't know.

------
reacharavindh
At my current job, I saw a lab technician working manually with Excel
sheets: entering sample IDs, pasting each sample ID into a website to get a
barcode, and then printing it out to stick on the box.

I wrote a Python script that uses the openpyxl module to read his Excel docs
and the reportlab module to generate the barcodes in a PDF document, with
appropriate spacers, so that he can simply print it out and stick the labels
on the boxes.

He is happy, and so am I that I could save him time. It only took me 20
minutes to write the script.
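
The "appropriate spacers" part of such a script is just grid arithmetic. A sketch in plain Python (the page/label sizes are assumptions; the real script would draw a reportlab barcode at each position):

```python
def label_positions(n_labels, page_w=595, page_h=842,
                    label_w=180, label_h=72, gap=12):
    """Compute (page, x, y) top-left positions for n_labels barcode labels,
    laid out in a grid with `gap` points of spacing (A4 in points assumed).
    """
    cols = (page_w + gap) // (label_w + gap)
    rows = (page_h + gap) // (label_h + gap)
    per_page = int(cols * rows)
    out = []
    for i in range(n_labels):
        page, slot = divmod(i, per_page)
        r, c = divmod(slot, int(cols))
        x = c * (label_w + gap)
        y = r * (label_h + gap)
        out.append((page, x, y))
        # the real script would render a barcode for the sample ID here
    return out
```

Feeding this from openpyxl's row iterator and drawing with reportlab is the remaining glue.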

~~~
thedevindevops
There genuinely needs to be more of this sort of thing.

The overall 'productionisation' of our industry has led us into a
cookie-cutter style of work and away from genuine problem solving like this.
Ironically, that sort of productivity-boosting work has been wrapped up in a
nonsense 'process automation consultant' role that is inflated beyond sense
and often dismissed by the receiving company as an unnecessary expense.

------
digitalsushi
I wrote a shim layer for all our Packer/Vagrant OS workflows to operate
against an unreliable vSphere ecosystem. It exposes a suite of POSIX sh
functions for sysadmins/developers to easily operate against this very
unreliable environment. It adds automatic logging, retrying, and adjustable
verbosity, because of the numerous ways this environment randomly fails.

People can just `.` source the file from a shared location and often find
that their scripts simply start to work better. It's not perfect; nothing
is. It's not even that clever. But when builds and deploys start to work
twice as well, even with the remaining failures, that's something. None of
the 65,000 employees using it will ever know, but it feels good to know we
were dropping 2/3 of them and now we're dropping 1/3.
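
The shim itself is sh, but the retry-and-log pattern it wraps around flaky infrastructure looks roughly like this (a Python analog for illustration; the names are mine, not the shim's):

```python
import logging
import time

log = logging.getLogger("flaky-infra")

def with_retries(fn, attempts=3, delay=0.0):
    """Call fn(), retrying on any exception and logging each failure.

    A sketch of the retry-plus-logging idea; the real thing is a set of
    sourceable POSIX sh functions, not Python.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            time.sleep(delay)
```

Wrapping every vSphere call in one place is what lets everyone's scripts "just start to work better" without changes.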

------
MediumD
Back at my old job, people would have trouble knowing what to do when on-call.

I built a slack app that would keep track of my team's pages and what people
did to respond to them. As new pages were triggered, the bot would show the
on-call person what previous people had done to resolve the page.
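
The bookkeeping behind a bot like that is essentially a map from alert to past resolutions. A toy sketch (the class and method names are invented; the real thing lived in Slack):

```python
from collections import defaultdict

class PageMemory:
    """Remember what people did to resolve each kind of page (alert)."""

    def __init__(self):
        self.history = defaultdict(list)

    def record(self, alert_name, resolution):
        self.history[alert_name].append(resolution)

    def suggest(self, alert_name, limit=3):
        # most recent resolutions first
        return list(reversed(self.history[alert_name]))[:limit]
```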

~~~
sethammons
Nifty. This should integrate with a service runbook.

------
i5h4n
In my previous organization, we dealt with a legacy enterprise software
product which had accumulated a massive bug history over multiple years and
sub-products. All being tracked by an in-house bug tracking product.

Many of the issues being reported were either already fixed or turned out
to be config issues. In order to (somewhat) quickly find existing fixes and
comments for newly reported issues, I built a search tool (webapp) which
scraped the bugs and their comments to find any information relevant to
your query, and listed the results in order of matching probability.

It was a pretty cool learning experience to build that out. I deployed it
on a personal remote VM that devs were granted; I have no idea if people
are still using it.
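
"Order of matching probability" can be approximated with simple term-overlap scoring. A minimal sketch in pure Python (the real tool's scoring is unknown; this is a generic stand-in):

```python
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_bugs(query, bugs):
    """Rank bug records by overlap between query terms and bug text.

    `bugs` is a list of (bug_id, text) pairs; the score is the total
    frequency of query terms in each bug's text, highest first.
    """
    q = set(tokens(query))
    scored = []
    for bug_id, text in bugs:
        tf = Counter(tokens(text))
        score = sum(tf[t] for t in q)
        if score:
            scored.append((score, bug_id))
    scored.sort(reverse=True)
    return [bug_id for _, bug_id in scored]
```

A real version would add TF-IDF weighting or an off-the-shelf index, but even this catches "already fixed" duplicates surprisingly often.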

------
tehlike
I built a hacked-up experimentation framework for client-side flags that
boosted my team's speed and confidence quite a bit. Hacked up because it
didn't use the existing server-side mechanism, for a bunch of reasons.
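
Client-side experiment assignment is typically deterministic hash bucketing, so the same user always sees the same variant. A generic sketch (not the framework described here):

```python
import hashlib

def variant(user_id, experiment, arms=("control", "treatment")):
    """Deterministically assign a user to an experiment arm.

    Hashing experiment + user id means no server round-trip and no stored
    assignment state; this is a common pattern, assumed rather than taken
    from the comment's actual framework.
    """
    h = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(h[:4], "big") % len(arms)
    return arms[bucket]
```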

I used the same experimentation framework for automated JavaScript binary
releases, so at some point I could release 5 times a week with no issues. I
have since left the team; people took it over and it keeps ticking along.

I also showed them how to use Powerdrill (a data drilling and analysis
tool) and taught them metrics. It is surprising how little people care
about what their work is ultimately for; bringing them a data-driven
mindset gave an even bigger productivity boost.

------
Adamantcheese
At my last job we had to do builds constantly and put them on hardware,
which was annoying because a build took 15 minutes and loading it onto the
hardware took another 10. We couldn't fix the latter because it wasn't in
our domain, but for the first half I managed to "multithread" the build
using a really hacky batch-script compilation method: a make file called
the compiler in a new command window for each file that needed to be
compiled, with some checks for "needs to be compiled" vs. "wasn't changed".
An extra script at the end of the process made sure that all the compiler
instances had finished before continuing with the next step. All of that
got the build down to 2 minutes, or about 30 seconds for small changes.

Another part of the job was integrating some configuration data with
existing files, which was as simple as writing a bunch of Excel macros to
do the copy/pasting and file output. It was hooked up to a shared folder on
the network, so the other team could just do their part and then my part
was entirely automated. In fact, at that point the team doing the testing
could do everything by themselves without any input from me, and only
needed me to answer the occasional question.

Yes, it's really hacky and the whole thing is entirely silly and could have
been solved by using more proper tools (i.e. not a defunct make software
without wildcard support for input files or Excel for configuration), but I
was VERY pleased when I got it working.
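
The trick amounts to a staleness check plus one compiler process per file, joined at the end. A sketch of that shape in Python (the compile step is a placeholder; the original used batch scripts and command windows):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def is_stale(src, obj):
    """True if the object file is missing or older than its source."""
    return (not os.path.exists(obj)
            or os.path.getmtime(obj) < os.path.getmtime(src))

def parallel_build(sources, compile_one, obj_for, workers=8):
    """Compile only stale sources, one compiler invocation per file.

    `compile_one(src)` would wrap the real compiler invocation, and
    `obj_for(src)` maps a source path to its object path; both are
    hypothetical hooks for the sketch. Returns the list of rebuilt files.
    """
    stale = [s for s in sources if is_stale(s, obj_for(s))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(compile_one, stale))   # blocks until all finish
    return stale
```

The `with` block doubles as the "wait for all compiler instances" script at the end of the original process.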

------
actionowl
I was working on a project where we'd be printing several hundred thousand
badges for several schools. We had all the data and just needed photos. The
client sent us a DVD with several hundred thousand photos; upon inspection
we realized that the photos were really bad:

\- No single aspect ratio

\- Some photos had no one in them (a picture of a chair, etc.)

\- Some photos had multiple people in the photo (!?)

\- Some photos were of such poor quality that you couldn't make out the
person.

It seemed some locations had let the students provide their own photo. This
was the first time we'd ever encountered data in this shape.

My company had two options: Print the data as-is (which would result in
thousands of reprints) or hire some temp staff to sort through the photos.

I asked them to let me try to sort the photos over the weekend with a
library I had just learned about (OpenCV). I wrote a custom Python OpenCV
script a little over a hundred lines long and ran it over the weekend to
crop and sort the photos into several categories (based on face detection),
leaving only a few thousand that had to be manually reviewed! That had a
real dollar impact and felt really good.

------
stevekemp
In the past week I've written a broken-link checker, in Perl, to
sanity-check the output of a static-site generator.

I've also written a trivial PHP parser which was designed to match up class-
definitions with comments above them:

[https://blog.steve.fi/parsing_php_for_fun_and_profit.html](https://blog.steve.fi/parsing_php_for_fun_and_profit.html)

Both of these tools were designed to be invoked by CI/CD systems, to flag
potential problems before they became live.

Most of my work involves scripting, or tooling, around existing systems and
solutions. For example another developer-automation hack was to automatically
add the `approved` label to pull-requests which had received successful
reviews from all selected reviewers - on a self-hosted Github Enterprise
installation.
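
The auto-labeling hack reduces to a predicate over review states. A sketch of that check (the data shapes stand in for the GitHub review objects the real hook inspected):

```python
def all_approved(requested_reviewers, reviews):
    """True when every requested reviewer's latest review is APPROVED.

    `reviews` is a list of (reviewer, state) pairs in submission order;
    only the most recent state per reviewer counts, so an approval after
    a CHANGES_REQUESTED supersedes it.
    """
    latest = {}
    for reviewer, state in reviews:
        latest[reviewer] = state
    return all(latest.get(r) == "APPROVED" for r in requested_reviewers)
```

When this returns true, the hook would call the GitHub Enterprise API to attach the `approved` label.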

------
rahulrrixe
I built a code generator package in Kotlin which generates code for Kotlin,
Swift, Web (JS), and React Native (TypeScript). Basically, you provide your
class definition in a DSL style (similar to TOML) and it will generate the
implementation and interfaces of the bridge for the different technologies.

~~~
s66qnf92
This is great! We're working on code generation from class definitions right
now.

Any good resources worth looking at?

~~~
rahulrrixe
I started by checking how you can write HTML using Kotlin DSL. Here is the
source code
[https://github.com/Kotlin/kotlinx.html](https://github.com/Kotlin/kotlinx.html)

Now, I have to generate the different languages once the DSL is finalized.
To achieve this I borrowed the architecture of the Flask framework: there,
routes have HTML templates; here, each generator has its own templates.
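
The template-per-target idea can be sketched in a few lines (Python here for illustration; the real generator is Kotlin, and the spec shape and templates below are invented):

```python
from string import Template

# one template per target language, in the spirit of
# "each generator has its own templates"
TEMPLATES = {
    "kotlin": Template("data class $name(\n$fields\n)"),
    "typescript": Template("export interface $name {\n$fields\n}"),
}

FIELD = {
    "kotlin": "    val {fname}: {ftype},",
    "typescript": "    {fname}: {ftype};",
}

def generate(spec, target):
    """Render a class spec like {"name": ..., "fields": [(name, type), ...]}
    into source code for one target language."""
    fields = "\n".join(FIELD[target].format(fname=n, ftype=t)
                       for n, t in spec["fields"])
    return TEMPLATES[target].substitute(name=spec["name"], fields=fields)
```

Adding a target is then just adding a template pair, which is what makes the routes-and-templates analogy apt.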

------
solumos
When our company was doing more active Go development, a colleague and I built
Charlatan.

[https://github.com/percolate/charlatan](https://github.com/percolate/charlatan)

Ended up saving us a lot of time writing mocks for tests.

------
cyanide911
Python 3+:

Blue - a dead-simple, event-based workflow execution framework.

I always find it easier to model systems from an event-driven perspective,
especially when you have to move fast and evolve unpredictably. I wanted a
framework anyone could learn to use within 5-10 minutes, yet one that could
solve all kinds of use cases requiring event-based coordination between
tasks in a distributed environment.

Works well for us for simple use cases (e.g. data processing workflows) and
complex ones (e.g. our entire retail order fulfilment system).
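
The core of event-based task coordination can be shown in a single-process toy: tasks subscribe to event names and may emit follow-up events. A sketch only; Blue itself is distributed and its actual API will differ:

```python
from collections import defaultdict, deque

class EventBus:
    """Minimal event-driven coordinator: handlers subscribe to event names
    and may return new (event, payload) pairs to keep a workflow moving."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, event, handler):
        self.handlers[event].append(handler)

    def emit(self, event, payload):
        queue = deque([(event, payload)])
        while queue:
            ev, data = queue.popleft()
            for handler in self.handlers[ev]:
                queue.extend(handler(data) or [])
```

An order-fulfilment flow then becomes a chain of small handlers instead of one monolithic script.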

~~~
raihansaputra
Is this similar to Prefect? I'm interested in using this kind of system; I
think it would make it really easy to mock up business processes quickly
before building a more robust solution.

------
dhruvkar
I wrote a shipping container tracking system for ~7 shipping lines.

Each shipping line offers a tracking service through one of these methods:
email, RSS, or a website form. Our container numbers are collected into a
Google Spreadsheet via our freight forwarders. Our employees use an
antiquated ERP with no API.

The script collects the relevant container numbers from the Google
spreadsheet, scrapes each update from the shipping line, and then scrapes
the ERP system to enter the update.
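
The glue is a dispatch table from shipping line to fetch method (email, RSS, or form scraper). A sketch with hypothetical fetchers standing in for the real scrapers:

```python
def track_all(containers, fetchers):
    """Collect the latest status for each container number.

    `containers` maps container number -> shipping line, and `fetchers`
    maps shipping line -> a callable that returns a status string (an
    email, RSS, or form scraper in the real script). Returns
    {container: status}; pushing the updates into the ERP would follow.
    """
    updates = {}
    for number, line in containers.items():
        fetch = fetchers.get(line)
        if fetch is not None:
            updates[number] = fetch(number)
    return updates
```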

------
Random_Person
I wrote a custom documentation tool that we use on all of our projects. It's a
few input fields for heading/paragraph/images and a few buttons. You can add
as many "sections" as you want. It exports HTML/CSS that you can stick in any
<div> and it scales well, handles popups for images, and such. It's made our
life much simpler when adding documentation to our sites.

------
shanecleveland
Automated discovery of late shipments eligible for a refund, which the
carriers otherwise make very difficult to track. There are some services that
can do this, but they take a big chunk of the refunds. We save thousands each
year.

Many other specialized calculators and templates, which tend to be more
foolproof than Excel.

~~~
schappim
I did the same in Australia. When I ran the script to request refunds for
missed Express Post SLAs, we were 90% of Auspost’s inquiries for the day and
got back thousands.

~~~
shanecleveland
The interesting thing with both FedEx and UPS is that they also guarantee
delivery times for regular "ground" shipments, which are fairly prone to
delays. The exceptions are circumstances out of their control, such as weather
or recipient issues. And they suspend during a couple holiday windows. But
most shippers either don't realize there are guarantees or don't have the
means to efficiently track them.
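
Mechanically, the eligibility check reduces to "delivered after the guarantee, with no excluded cause and no suspended window". A sketch with made-up field names; real carrier terms have many more edge cases:

```python
from datetime import datetime

# causes outside the carrier's control don't qualify (per the comment above)
EXCLUDED_CAUSES = {"weather", "recipient_unavailable"}

def refund_eligible(shipment, holiday_windows=()):
    """True if a shipment was delivered late and no exception applies.

    `shipment` is a dict with "guaranteed" and "delivered" datetimes plus
    an optional "delay_cause"; `holiday_windows` is a list of (start, end)
    datetimes during which guarantees are suspended.
    """
    if shipment["delivered"] <= shipment["guaranteed"]:
        return False
    if shipment.get("delay_cause") in EXCLUDED_CAUSES:
        return False
    for start, end in holiday_windows:
        if start <= shipment["guaranteed"] <= end:
            return False
    return True
```

Running something like this over a carrier invoice export is what makes the savings scale without a third-party service taking a cut.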

------
schappim
Mine was hardware and software related.

I built a WebUSB Postal Scale and WebUSB Label Printer so our e-commerce
company could print carrier shipping labels with just one click.

It took the process of fulfilling an order down to ~10 seconds per order.

------
theSage
Wrote a simple fizzbuzz server which brought down the time we spent
interviewing freshers for internships/jobs. Since we're a small team, this had
a big impact.
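
A server like that only needs a reference implementation and a comparison against the candidate's output. A minimal sketch (the grading function is mine; a real server would wrap it in an HTTP endpoint):

```python
def fizzbuzz(n):
    """Reference answer for a single value."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def grade(submission, upto=100):
    """Compare a candidate's output lines against the reference."""
    expected = [fizzbuzz(i) for i in range(1, upto + 1)]
    return submission == expected
```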

------
atomashpolskiy
My last job was at a company that develops one of the most popular mobile
MMO action games in the world (with hundreds of millions of installs). It
stores data in large Cassandra clusters (depending on the platform, DCs
contain up to a hundred nodes).

I designed and developed a command-line utility/daemon for performing
one-off and regular backups of production data. The solution is able to:

\- work with a 24/7 live Cassandra cluster, containing tens of nodes

\- exert tolerable and tuneable performance/latency footprint on the nodes

\- back up and restore from hundreds of GBs to multiple TBs of data as fast
as possible, given the constraints of the legacy data model and concurrent
load from online players; observed throughput is 5-25 MB/s, depending on
the environment

\- provide highly flexible declarative configuration of the subset of data
to back up and restore (full table exports; raw CQL queries; programmatic
extractors), with first-class support for foreign-key dependencies between
extractors, compiled into a highly parallelizable execution graph
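
Compiling foreign-key dependencies between extractors into a parallelizable graph is essentially topological layering. A Kahn-style sketch (extractor names and the layering approach are illustrative, not the tool's internals):

```python
def execution_stages(deps):
    """Group extractors into stages that can each run fully in parallel.

    `deps` maps extractor -> set of extractors it depends on (the
    foreign-key dependencies); returns a list of stages in run order,
    where everything in a stage only depends on earlier stages.
    """
    remaining = {k: set(v) for k, v in deps.items()}
    stages = []
    while remaining:
        ready = sorted(k for k, v in remaining.items() if not v)
        if not ready:
            raise ValueError("dependency cycle among extractors")
        stages.append(ready)
        for k in ready:
            del remaining[k]
        for v in remaining.values():
            v.difference_update(ready)
    return stages
```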

There was an "a-ha!" moment when I realized that this utility could be used
not only for backups of production data, but for a whole range of
day-to-day maintenance tasks, e.g.:

1) Restore a subset of production data onto development and test machines.
This solves the issue of developers and QA engineers having to fiddle with
the database when they need to test something, whether it be a new feature
or a bugfix for production. They can just restore a small subset of real,
meaningful, and consistent data onto their environment with a bit of
configuration and a simple command. Developers may do this manually when
needed, and the QA environment can be restored to a clean state
automatically by the CI server at night.

2) Perform arbitrary updates on graphs of database entities. It's a common
approach to traverse Cassandra tables, possibly with a column filter, in
order to process or update some of the attributes (e.g. iterate through all
users and send a push notification to each of them). The more users there
are, the longer this takes, and it negatively affects the cluster's
performance and latency for other concurrent operations. With a tool like
the one described, you can clone the user data onto a separate machine
beforehand (e.g. at night) and then run the maintenance operation during
the day, on data that is still reasonably up to date.

All in all, it was a fun devops experience, and one I'm quite fond of. With
just a little creativity and out-of-the-box thinking, there are lots of
ways to improve the typical workflow of working with data.

