
Show HN: Hiring DevOps and SREs – testing troubleshooting live - hkh
Hi HN,

We are launching our 3rd startup. Our 1st was a PhD project turned into a
startup which simulated public cloud costs before adoption - an example
use-case was Disney trying to figure out how much it would cost to render
their new movie on AWS and Google, using different configs, RIs vs on-demand,
and different services. That was PlanForCloud, and it was acquired by
RightScale.

Our 2nd startup was even more interesting - we built a public
Container-as-a-Service cloud, but in a place where Amazon/Google could not
enter - Iran! This grew fast and was acquired by the Uber of Iran - the
world's 5th largest ride-hailing app by rides. That was AbarCloud, and it was
acquired by Snapp. If anyone is interested in how you design a public cloud
with minimal external dependencies, one that continues to function when the
underlying internet connections to outside the country are cut, ask in the
questions :)

Back to our current idea. When scaling AbarCloud, we needed to hire DevOps
engineers & SREs. When hiring developers, we'd assess their skillsets with
something like HackerRank or Codility. But when hiring DevOps and SREs, where
troubleshooting, scaling, failover etc. are important, those assessment tools
don't help.

CircuitOps spins up live environments with broken things in them, drops the
candidate in, and asks them to troubleshoot, diagnose and fix the issues. All
their working is recorded and (future feature) auto-marked. Try it here:
www.CircuitOps.com

I'd love your feedback, specifically about:

1 - What is the hardest part of hiring? For anyone who has stopped hiring due
to Covid-19, has this impacted DevOps/SRE roles?

2 - What do you think about this style of testing candidates with real-life
scenarios? Have you done a DIY approach to this at all?

3 - If you are a hiring manager, would you use this tool during a live
interview, or as a pre-interview filter? How else would you use this tool?

Cheers from Scotland,
Hassan, Ali & Alistair
======
sgarland
Does it provide automated grading, and if so, how strict is it? For example,
whereas Warmup instructs the user to create a script with .sh file extension,
Find The Logs does not. If the user creates a working script but includes the
file extension, will it break? Does it require them to set the executable bit?

If this is all just recorded and the hiring manager can review it and decide
for themselves, moot point.

1\. Not a hiring manager in my current role; previously when I was, it wasn't
in software, but I'll give some thoughts that may be transferable: the hardest
part, especially for younger candidates, is trying to ascertain their ability
to learn. You may not hire someone who already knows your entire stack, but if
they are self-motivated and have a good baseline, they'll do well.

2\. I like it quite a bit, especially that it's not a simulated terminal with
limited commands.

3\. N/A

~~~
hkh
Hey! We currently don't auto-mark, and you are spot on - the hiring manager
can see exactly what the candidate ran and judge for themselves (e.g. in some
scenarios you can just restart the machine, but that doesn't solve the root
cause). But
I think we will need to create a report on how well the candidate did. Based
on what we've seen, it seems better to award lots of little points vs a few
big points for the "right answer". What would you expect/want to see in a
results report? Cheers, Hassan

~~~
sgarland
Agree on granular points. While a large chunk should certainly be "was it
successful," points for effort and/or general knowledge are good.

1\. Did they use man pages? While Google and Stack Overflow have the answers
to everything, being able to use the built-in help is useful. Obviously, if
they already know how to structure a command, don't deduct points here.

2\. How clever are they with bash? This one is hard to measure objectively,
but, for example, someone leaning on awk rather than chaining together calls
to grep and sed may know Linux more deeply. Again, I don't know that you
should lose points for following the Unix philosophy of "each program does
one thing well," but it's an interesting data point.

3\. Are they aware of the limitations of tools? I didn't do all the exercises
so I'm not sure if you have this, but having to find log entries in multi-GB
files can quickly demonstrate knowledge. If you blindly pass `grep -i foo` to
a giant file, it's going to take quite a while - even more so if the file is
compressed, as would be expected for a production system with logrotate.
Knowing that the -i flag slows grep down, and that forcing a simple
byte-oriented locale (e.g. LC_ALL=C) can speed things up, is a good sign.

3a. Make them / lead them to using find with xargs or exec. Once again,
understanding system calls, and how the way you construct a command line can
drastically alter performance, is a good indicator of skill.
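To make points 3 and 3a concrete - a quick sketch you could try yourself (the
file path and contents are made up):

```shell
# Build a throwaway log file to search.
yes 'ERROR disk full' | head -n 100000 > /tmp/app.log

# Case-insensitive search in the default locale works, but grep has to
# consider multi-byte characters, which costs time on huge files.
grep -ci error /tmp/app.log            # → 100000

# Forcing the C locale lets grep compare raw bytes - often much faster.
LC_ALL=C grep -ci error /tmp/app.log   # → 100000

# find | xargs batches many file names into few grep processes, instead
# of forking one grep per file as `-exec grep ... \;` would.
find /tmp -maxdepth 1 -name 'app.log' -print0 | xargs -0 grep -c ERROR
```

Timing the two grep invocations on a genuinely large file makes the locale
difference obvious.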

I suppose overall, I'd want to see the output of the script they wrote (you
don't have to auto-mark, you can just have the report print expected and
actual, so fuzziness is up to the manager), the time they took, what man pages
they read if any, and top commands issued.

------
tim775
I like this. Interviews are really hard for both parties. It's at least as
difficult to evaluate someone's skills and potential as it is for someone to
demonstrate them. I can see how this would be useful as a screening tool, but
I'm really interested in using it for live interviews to eliminate the power
imbalance that happens when one person already knows the answer. I want to
spin up a scenario then give it to the candidate and my technical interviewer
at the same time, sight unseen and let them work on it together. I think this
could get me a lot closer to answering "can you work with this person?"
instead of just "can they answer your questions?". Good luck!

~~~
hkh
Hey! That's a great idea too - pairing the candidate with your own team would
really surface how they discuss and reason through a problem, vs just knowing
the commands to run to troubleshoot the system.

------
anilgulecha
hkh - Hackerrank has a well-formed DevOps solution with terminal/remote ssh
and deep reporting, library of questions, and automated evaluation.

[https://www.hackerrank.com/products/projects/?h_r=products&h...](https://www.hackerrank.com/products/projects/?h_r=products&h_l=header#role5)

Disclosure: I built it :)

~~~
hkh
Hey Anil, ah this is awesome! Quick question - why is this part of HackerRank
in the higher pricing tier? Have you found that only bigger companies would
use this tool, or was it more of a pricing decision to unlock features as
customers pay more? What's been your experience of running it? Would love to
hear your thoughts on this space, good and bad! Cheers!

------
3dbrows
I think this is fantastic. I can see your investment pitch being "the
HackerRank/LeetCode of devops". I've only tried one scenario (DNS) so far but
it seems well done, not too long, not too easy, not too hard. I am excited to
try it out further.

First I'll address your questions and then add some comments of my own.

1\. I do a lot of technical interviewing. Many people can _sound_ like they
know what they're talking about, but it can be a struggle to get to the nitty
gritty to really test their understanding. This product seems to have an angle
of "take-home tests as a service", which is cool. Over time I can see how you
could come to have a large library of scenarios and use of them would be very
helpful in filtering candidates for real, demonstrable competence.

2\. I like it. I once tried a DIY approach involving having candidates talk to
a Kafka server I provided, with client certificates. It was cumbersome. There
was a 100% overlap of candidates who succeeded and candidates who could be
bothered to take the test at all. From this I gathered that the mere existence
of the test was a strong filter, but it's also fair to say that judging test
difficulty level well is extremely hard. (In other words: "oh no, he wants me
to actually understand how to use client certs? Bye.")

3\. Pre-interview. Async hiring process with just-in-time candidate attention
means the team does not lose too much productivity (as much as hiring is a
valuable use of time, developer time in a small company is at a premium).

Some quick-fire minor comments: consider making it easier to share the test
link (I generated it on my phone, wanted to run on my desktop); the password
field is before the username; it wasn't super obvious that different VMs were
in play between the warmup and real scenario, nor that a scenario could
involve _multiple_ VMs itself. Also, you've no doubt thought of this, but have
good defences against abuse of your VMs. By the way, have you considered this
approach [1] to VMs? More controllable, but less realistic.

Some questions/suggestions:

1\. As the test admin I'd love to provide a "verify.sh" that grades the test
using logic I specify. My script would test, for example, "Can this VM now
reach host X?" or "Is command Y in the bash history?" and dump out a report
somewhere. -- What ideas do you have for auto-marking?

2\. What revenue model do you have in mind?

3\. How ready is this for real-life use in a hiring pipeline?

4\. Do you think there is future scope for a customer to specify their own
scenarios, or is the idea more that you build up a library of them?

5\. Have an advanced mode where the candidate must supply a public key to you
for SSH access, rather than password. Boom, you've filtered for a basic
understanding of public key cryptography.

6\. I am not sure the user flow currently has a clear distinction between
"recruiter view" and "candidate view" but no doubt that is on the roadmap.

7\. Scenario idea: certificate expiry blocking secure comms between hosts.
This happens in K8S clusters often enough. Could be replicated here.

8\. Just from curiosity, any details you can share about how the stack works?
What am I talking to, a docker-compose script of Centos containers? You've
done a nice job with the SSH user routing; am I being tunnelled through your
jumpbox to the target machine/container?
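To make question 1 concrete, here is a minimal sketch of what such a
verify.sh might look like, run on the candidate's VM after the session (the
host name, history check and point values are all invented):

```shell
#!/usr/bin/env bash
# Hypothetical grading script a hiring manager could supply.
# Awards a point per check and prints a small report.
score=0
total=2

# Check 1: can this VM now reach host X? (app-db.internal is made up.)
if ping -c 1 -W 2 app-db.internal >/dev/null 2>&1; then
  echo "PASS: can reach app-db.internal"; score=$((score + 1))
else
  echo "FAIL: cannot reach app-db.internal"
fi

# Check 2: is command Y in the bash history? (Here: any journalctl use.)
if grep -q 'journalctl' ~/.bash_history 2>/dev/null; then
  echo "PASS: candidate inspected the service logs"; score=$((score + 1))
else
  echo "FAIL: no sign of log inspection"
fi

echo "score: ${score}/${total}"
```

Granular checks like these map naturally onto the "lots of little points"
marking style discussed upthread.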

That's a lot, sorry! I think this is an exciting idea and that you guys are on
to something.

[1] [https://blog.benjojo.co.uk/post/qemu-monitor-socket-rce-
vnc](https://blog.benjojo.co.uk/post/qemu-monitor-socket-rce-vnc)

~~~
hkh
Hey! That's awesome feedback! Thanks for taking the time to write it out for
us.

A few followups:

1\. That's an interesting idea - letting the hiring manager write a test
script for exactly what they want to check. We are thinking more about
auto-marking; there are some basics we could do, such as total time taken,
time between commands, and what 'types' of command were used a lot (is the
person circling back and forth between folders, or are they heading
somewhere?), plus the obvious 'did they fix it'. Question - what would you
think if we let people write a Terraform script to launch their own
scenarios?
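The timing signals above could be computed from a recorded session in a
single awk pass - a rough sketch, assuming a made-up log format of
"epoch_seconds command..." per line:

```shell
# Hypothetical session log, one "epoch_seconds command..." line per entry.
cat > /tmp/session.log <<'EOF'
1588327200 cd /var/log
1588327220 grep -c ERROR syslog
1588327280 systemctl restart nginx
EOF

# One pass: total duration, longest pause between commands, and what
# fraction of commands were plain directory hopping (cd).
awk '
  NR == 1    { first = $1 }
  prev       { gap = $1 - prev; if (gap > longest) longest = gap }
  $2 == "cd" { cds++ }
             { prev = $1; n++ }
  END {
    printf "total=%ds longest_pause=%ds cd_ratio=%.2f\n",
           prev - first, longest, cds / n
  }
' /tmp/session.log
# prints: total=80s longest_pause=60s cd_ratio=0.33
```

A long pause before the fixing command, for instance, can hint at where the
candidate actually figured out the root cause.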

2\. Current working idea is either $ per test, or groups of interviews. So
<100 interviews is a flat $x a month. Does this make sense to you or would you
prefer it a different way?

3\. Ready to go! It has actually been used in real life hiring by a company
already.

4\. Haha, great minds ... I only got to this question after writing my
response to #1.

5, 6, 7. Yep, and great ideas!

8\. We are using Pulumi to launch the scenarios, which run on VMs. SSH
connections are made through a jump box which records the sessions.
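If you're curious how jump-box recording like this is commonly done - a
sketch of one well-known pattern, not necessarily our exact setup, with
illustrative host names and paths - an sshd ForceCommand can wrap each login
in util-linux `script`:

```shell
# /etc/ssh/sshd_config on the jump box (illustrative snippet):
#   Match Group candidates
#       ForceCommand /usr/local/bin/record-session

# /usr/local/bin/record-session (illustrative):
#!/bin/sh
# Capture the candidate's entire terminal I/O with util-linux `script`,
# then hop on to the scenario VM.
exec script -q -c "ssh scenario-vm" "/var/log/sessions/$(date +%s)-${USER}.tty"
```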

Thank you so much for the feedback, greatly appreciated! Cheers, Hassan

