The Programming Interview from Hell (pythonforengineers.com)
174 points by seer on Feb 20, 2017 | 136 comments



The hiring manager of a small software company gave me a quick brief before handing me off to his technical heavy. "He's hard to get along with, but he's really smart. Oh, and he has two PhDs. He'll tell you that."

I was ushered in. The Guy with Two PhDs (he showed me his business card first, and there were indeed two PhDs on it) asked me:

"What is the simplest way to synchronize two threads?"

I rattled off some synchronization primitives. Semaphore. Critical section. I was ready to do an implementation if he wanted.

"No, the simplest."

I dug around. Interlocked operations? A mutex? A spinlock? I mentioned Dekker's algorithm (it sucks, but it's simple). I dredged up a few more.

"No, I want the SIMPLEST possible way to synchronize two threads. What is it?"

I'd run out. He gave a disgusted snort. The next question wasn't much better: "What's the BEST way to share data between programs?"

"Not sure what you mean by best. How many programs? Is it over the network? Hmm, shared memory and a maybe a signal of some kind?"

"No, the best way!"

This interview did not go well. At the end, Dr. DoublePhd scolded me for dropping out of college and told me to go back to school to finish my degree.

That evening I wrote the hiring manager that Dr. DoublePhd appeared to want the answer "raise interrupt priority" for the thread synchronization question -- which doesn't work on a multiprocessor -- and that I had no idea what the heck he was asking for on the other questions.

I didn't add that even the awful questions I was being asked could have been productive interview fodder in the hands of a good interviewer, but that in the hands of a terrible person they were destroying that company's ability to hire.

Turns out that I didn't need to add that last bit. A few months later the hiring manager emailed me, saying that they had fired Dr. DoublePhd and would I consider interviewing again? I politely declined.

I never found out the BEST way to share data between programs. In fact, I'm still looking. I think we all are.


> I never found out the BEST way to share data between programs

Come on, add another layer of indirection! When given a problem to solve, hand back another problem.

In this case:

Q: "I want the SIMPLEST possible way to synchronize two threads. What is it?"

A: The SIMPLEST possible way to synchronize two threads is a global maximal element (not necessarily unique) of a poset induced by a partial ordering on possible methods for synchronizing two threads. If the induced poset is totally ordered, and assuming that at least one possible way of synchronizing two threads exists, then there exists a SIMPLEST possible way to synchronize two threads. If you hand me a method for comparing two synchronization methods for SIMPLICITY, then an algorithm retrieving such a best possible solution is trivial, but at this point you must define what you mean by SIMPLEST before we can move any further.

Mr. Double PhD was trying to be clever.


I think we've found Dr. Double PhD ;)


Just run one after the other.


OMG...too funny.


If someone asks you the "best" way to do something but does not give you context for the problem being solved, then the question is stupid. There is no all-purpose "best" way to do anything when it comes to complex systems. You use the "best" way for the problem at hand, which may be shared memory, pipes, writing to a file, RPC, etc.


Exactly. When handed an under-specified problem (semantics), hand back a lambda compiled in the target language (syntax). "What do you mean? African or European?"


Upvote for "hand back a lambda". I was laughing.

"Here, run this unsigned code for me in your head." That has all kinds of possibilities . . .

[See John Barnes' book _Candle_ if you want some scary prognostication. Warning, it's kind of graphic and disturbing in spots]


Well put. If there were a best way, it would be the only way.


"Raise interrupt priority" is not a way to synchronize threads, let alone the simplest. I know what I'm talking about; I worked on multithreading for Linux before the rewrite to NPTL, and I invented the PTHREAD_MUTEX_ADAPTIVE_NP type that you still find in Glibc.

On a project several years ago, I did use interrupt priority in conjunction with threads as a complete hack. The issue was this: the interrupt service routines for an IDE-based compact flash were running CPU-intensive polling loops, taking away app CPU time. I created a priority scheme where certain interrupt priority levels overlapped with thread priority levels. That is, it was possible to suppress the interrupts for this IDE device while certain important threads were dispatched. As in, automatically: when the kernel dispatched a thread in that priority range, interrupts below that value would be blocked (not by disabling CPU interrupts, but via the interrupt controller's detailed mask, where we pick the specific ones that get masked). When the thread lost the CPU, they would be unblocked. This solution/workaround worked well enough to deal with the issue it was intended for.


You can raise IRQL and get okay synchronization on simple systems. It was fine for early versions of Unix, and I've used it quite a bit on uniprocessor embedded systems. Just a few instructions, hard to get wrong.

Of course it fails if you're working in user mode. Or you have non-maskable interrupts that you need to sync with.

In this case the poor PhD x 2 had been working for years on consumer PCs, which were at the time nearly all single CPU. I'd been mucking about with multiprocessing a fair amount, and the IRQL thing never even occurred to me during the interview. Just as well.


Early versions of Unix were co-operative. While a task executed kernel code, it was not preemptable. That might be the simplest thread synchronization. Masking interrupts is then just for exclusion between the one and only kernel thread, and interrupt context.


That's true.


> I never found out the BEST way to share data between programs. In fact, I'm still looking. I think we all are.

Copy-paste.


See, I was gonna go with Data.txt in a shared network drive.

Btw, if you're worried about conflicts, it's OK: there's a second file called "writeStatus.txt" which you have to claim by replacing the string NULL with your processId, thereby claiming write access to data.txt and causing any other processes to Thread.Sleep until it's free.

Unrelated, who's hiring...


But who has access to writeStatus.txt? Classic rookie mistake. You needed a writeStatusWriteStatus.txt.


Sounds like playing Byzantine Generals on the file system :)


Flat files FTW. Just re-invent client-server architecture using freely available tools and you're in line for management-level kudos.


>> I never found out the BEST way to share data between programs. In fact, I'm still looking. I think we all are.

>Copy-paste.

You may joke, but it's the most universally supported, and therefore the most likely to be available.


For Linux users it's gotta be the pipe. Works great.


You guys... Forgot about email...


My guess would be that the simplest way to sync two threads is to exit them both. ;)


...or not running them in the first place.


I interviewed someone who gave responses akin to "I'd Google it." When pressed, he did finally give some reasonable answers.

We ended up hiring him, as we'd interviewed 10 other candidates who'd failed at that point.

He ended up being a terrible employee.

Seriously, when someone asks you the details of a linked list, they're not trying to find out if you will be able to use one specifically on the job. They're trying to figure out if you know how to implement the details of a moderately tricky algorithm/data structure.

I've never implemented a linked list for work, but I have implemented complex data structures and other analogous code. Implementing a linked list demonstrates that I paid attention to the basics and can reconstitute mildly complex algorithms when needed.


If he gave reasonable answers when pressed, despite being initially resistant, it sounds like the interview didn't give any insight into whether he was a good employee or not.


It certainly did! I just didn't understand at the time not to hire employees who simply say "I'd Google it" over and over :D.


How about stack overflow?


Ah, so was the candidate poor because they had a bad attitude and refused to try to solve the problem at first, or was it because their technical skills were lacking?


More the former. Technically they seemed quite proficient, but they proved difficult to work with and didn't end up gelling with the team very well.


linked list is not complex for christ sake.


Tell that to the thousands of people who can't implement them.

I agree, it's not that hard, but that's all the more reason to avoid giving attitude about the question. Answer it quickly and move on. If you can't answer it quickly, maybe it _is_ that complex.

A lot of interviewers have a short time window in which to conduct their 1-on-1. Sometimes they ask easy questions for a reason.

Often, I start with a soft ball that I intend on building on - turn a linked list into a doubly linked list; a circular list; can you improve the lookup time; can you make it generic; what are the space constraints; what are the time constraints.

And if they can't answer the simple question, we just leave it at that.

A simple question can easily be built upon. "I'd Google it" can not.


> linked list is not complex for christ sake.

There are many people for whom this is complex; you're assuming a foundation that not everyone has. There are also many people for whom nothing is complex: they assume they can understand everything, and while they don't currently, they assume they'll be able to learn it without issue.

Dealing with new starts who are straight out of education is often like reading posts from 4chan.org/b, at first you don't know if they're joking.

It's not all doom and gloom; occasionally I'm pleasantly surprised by the calibre of those beginning their career/hobby, but this is the exception.


But there's also nothing crazy about a company saying: for this role, we expect the candidate to know linked lists. It's not esoteric, and I think a lot of us would consider it normal for programmers to have written linked lists (and other data structures) for one reason or another.


Personally I agree; however, I know people employed to code PHP who have no concept of this. Some apprentices fresh out of college have joined our company not knowing this.


I think the original intent of the linked list question was to see if the candidate knew pointers. Implementing one in a non-pointer language would be trivial and definitely not complex.


Are linked lists even a difficult example of pointer use? You don't exactly have to do fancy pointer arithmetic with them. You just have to be a little bit careful when inserting or deleting nodes.


You do need to understand the difference between an object and a reference to it. It seems trivial, but surprisingly many programmers seem not to understand it.


No, it is not. However, it is totally irrelevant for 99% of jobs requiring programming skills. Having been in this business for 15 years, I haven't seen linked lists used for any particular problem, but I understand that when you work in the right industry they are invaluable. If I had to implement one I would probably use a library and educate myself on the subject. I usually find programmers running into many other issues that don't show up during interviews.

Few questions I usually ask:

- How can you make sure that your Java application runs on the server not only on your laptop? (I take any answers: containers, single JAR, etc.)

- What is printed out:

    def add_list(val, list=[]):
        list.append(val)
        return list

    print add_list(10)
    print add_list(20)
    print add_list(123, [])

- Explain recursion

Funny to see how a non-trivial number of programmers fail to answer these questions.
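(In Python 3 syntax, a sketch of what that snippet prints. The gotcha is that the default list is created once, at function definition time, and then shared across calls:)

```python
def add_list(val, lst=[]):  # the default list is built once, at def time
    lst.append(val)
    return lst

print(add_list(10))        # [10]
print(add_list(20))        # [10, 20] -- the same default list, mutated again
print(add_list(123, []))   # [123]    -- a fresh list was passed explicitly
```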


> linked list is not complex for christ sake.

...unless you need them for lock-free programming. There was a nice C++ talk by Herb Sutter on that, with the apt title "Juggling Razor Blades". That pretty much says it all.

From a practical standpoint, the question "what is a linked list good for?" really might be more interesting. Does anyone know of potential use cases besides kernel design, lock-free programming or Clojure-style immutable data structures? I'd guess CPU caches have eradicated most of them...


Have you ever hired developers? FizzBuzz is still a very effective filter.
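For anyone who hasn't run into it: FizzBuzz asks for the numbers 1..n, with multiples of 3 replaced by "Fizz", multiples of 5 by "Buzz", and multiples of both by "FizzBuzz". A minimal Python sketch:

```python
def fizzbuzz(n):
    """Return the FizzBuzz sequence for 1..n as a list of strings."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:          # multiple of both 3 and 5
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out

print(fizzbuzz(15))
```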


At my last job, we asked two simple questions and many "senior developers" couldn't get them:

1. Write a function that determines if a number is prime. If they didn't know what a prime number was, we would tell them.

2. A simple problem that required designing a database schema and an SQL query involving a left outer join.

There were a lot of developers who couldn't do it.

On the other hand, there was one developer who was just learning C#, having spent most of his time doing VB.NET. He couldn't answer a lot of the technical questions, but we could tell by his thought process and how he explained real-world problems he'd solved that he would be a great asset to the company. We fought for him over more "senior" developers.

When I have a chance to hire again, I'm going to fight to get him -- even though he sucks at interviewing and I might have to do a little convincing.


> Write a function that determines if a number is prime. If they didn't know what a prime number was, we would tell them.

Did you also give them an algorithm to implement or was a brute force method good enough?


First the brute force algorithm. Then we would ask them how could they optimize it. A lot of people couldn't even get to the brute force algorithm.

1st level optimization: skip even numbers.

2nd level optimization, which I was the only person to get when I interviewed at the company: loop starting at 2, and go up to the square root of the argument.
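Both optimizations together, as a Python sketch (trial division works because if n = a*b, then at least one factor is at most the square root of n):

```python
import math

def is_prime(n):
    """Trial division: test odd divisors up to the square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:                              # 1st optimization: handle evens up front
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):    # 2nd optimization: stop at sqrt(n)
        if n % d == 0:
            return False
    return True
```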


They can be. Doubly linked lists. Intrusively linked lists. Ring buffers. Linked Lists optimized for storage in CPU caches. B Trees (stretching the definition a bit). And so forth.


linked list is not complex for christ sake

That depends on the language. If you're using something without pointers or references it's quite hard.


Hmm, which (general) language has no pointers or references?


Python. Ruby. Javascript. Lisp. Haskell (IIRC).

They all use them internally, but don't tend to make them available to the programmer (usually because they aren't needed).


They have references (at least python, lisp that I know of, likely the others as well), which is enough to implement linked lists.
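For example, a singly linked list in Python needs nothing beyond plain object references (a minimal sketch; `Node` and `to_pylist` are illustrative names):

```python
class Node:
    """One cell of a singly linked list: a value and a reference to the next cell."""
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def to_pylist(head):
    """Walk the chain and collect the values into an ordinary list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

chain = Node(1, Node(2, Node(3)))
print(to_pylist(chain))   # [1, 2, 3]
```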


The ability to implement linked lists != pointers and references

Behind the scenes, yes, every Python variable is a reference to an object. It's not addressable, however, and in the case of immutable objects (like strings), you can't modify the underlying object and keep all references pointed at the updated object.


If you have references you can pretty much always implement lists. First, you do not need mutability; you can implement linked lists from immutable cons cells. Second, Python has mutable references, so implementing linked lists is trivial (the fact that some objects are themselves immutable is immaterial).


Well, at least in Ruby and Javascript variables are references to objects and it is trivial to implement a linked list in both.


No, AND they are actually useful for all sorts of problems, like reading records from a file and that sort of stuff. Heck, I wrote one last week [0] because it's quick to implement and does the job. Same applies for the doubly-linked kind (slightly less so, but still nice) and queues. One of my favourite Linux kernel headers is list.h!

[0]: https://github.com/buserror/rf_bridge/blob/master/src/rf_bri...


I hope I'm past the part of my career where I go on cold job interviews, but if I ever had to go on another job hunt, I'd simply announce that I'm a web developer and not an algorithms person, that it looks like I might not be the best fit for the position, and just see how the interviewers respond.

Anything less than tossing the playbook warrants a gentle suggestion that we shouldn't waste any more time here and why don't we just end the interview. Stand up, shake hands, thank them for their time, and walk out.

Problems in the interview process should be seen as problems with company culture. I used to wonder about how to appropriately answer the question, "which companies are worth working for?" because it seems like you need a lot of time before you can really tell. But once I realized that the interview is just an extension of company culture, it got a lot easier to weigh opportunities.

I mean, obviously, if you need the money you need the money, but developers are hot enough commodities that it doesn't take long before you're entrenched enough to be able to call the shots like that.


> I'd simply announce that I'm a web developer and not a algorithms person

A while ago I was invited to a startup for a casual discussion about an open position they had (front-end developer). I went because I knew several employees and they were nice. The casual discussion turned out to be several questions about UDP/TCP and writing algorithms on a blackboard (counting leaves in a tree, finding if a word is an anagram, etc.). Nothing really hard, but quite disconnected from the day-to-day job of a front-end developer.


I hear about the whole 'casual discussion turns into hardcore interview' ploy all the time, and I'm curious about a better way to respond to it. It's not a serious knock on company culture, but it is impolite. I'd probably overlook it if I didn't have a current job, but would write off the company if I was already employed. I don't think it warrants a walk-out.


OK - but here's a genuine problem that came up the other day in my work (reconciling two datasets - we have various many-to-one mappings of ids that we then want to reconcile against each other). I think it's quite a neat computer science/algorithm challenge, so here goes:

Write a function which takes as input a list of sets, many of which are not disjoint, but will output a list of sets where all of the non-disjoint ones have been merged back together again. So the output is a list of sets which are all disjoint from each other because any intersecting sets have been merged together.

e.g. given the input:

    [(1,2,3), (2,4,8), (10,11,12)]
it will report back:

    [(1,2,3,4,8), (10,11,12)]
because (1,2,3) and (2,4,8) are not disjoint, but the (10,11,12) set is.

Whereas given the input:

    [(1,2,3), (2,4,8), (10,11,12), (8,10)]
it would report back a single set:

    [(1,2,3,4,8,10,11,12)]
because now all of the sets are connected - the 8 and the 10 now connect everything else together.

I came up with an algorithm which is acceptable for the dataset we currently have - but I've no idea what time complexity it is (for our real dataset it was able to do it in "one pass" - but in principle it could be worse than that). I don't know how I would implement a distributed version if the list of sets was too big to fit in memory, etc. etc.

And I haven't found a good solution on Google (but I'm not even sure what to Google).


I would put each element from each set into a disjoint-set data structure [1] and then report back the disjoint sets it ends up holding.

The complexity of this data structure is pretty interesting. It basically comes to O(N) for any N that can be represented in the known universe. It's also the coolest use of the inverse Ackermann function I've seen!

How to solve this on a distributed system I have no idea.

[1] https://en.wikipedia.org/wiki/Disjoint-set_data_structure


The standard data structure to look for in this case is the disjoint-set data structure (also called union-find), https://en.wikipedia.org/wiki/Disjoint-set_data_structure. That should make your function fairly easy to implement.
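For instance, a minimal sketch in Python (union-find with path halving, no union by rank; `merge_sets` is an illustrative name):

```python
def merge_sets(sets):
    """Merge overlapping sets via union-find (disjoint-set data structure)."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for s in sets:
        items = list(s)
        for x in items:
            parent.setdefault(x, x)
        for x in items[1:]:
            union(items[0], x)              # everything in s joins one group

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

print(merge_sets([{1, 2, 3}, {2, 4, 8}, {10, 11, 12}]))
```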


Thanks for that - I'll take a look!


You could model the problem as a graph (each integer represents a vertex and two consecutive integers an edge, e.g. (1, 2, 3) is a graph with nodes 1,2,3 and edges between 1 and 2 and 2 and 3). Then your problem is just to find all connected components of the graph (https://en.wikipedia.org/wiki/Connected_component_(graph_the....


Yeah but realising that two "nodes" of the graph are connected requires doing a set intersection, which I concluded was quite expensive to do between all possible sets. i.e. building the "graph" was an expensive operation... unless I've misunderstood you.


You build just one graph from the different tuples in your list. Then it's just a simple DFS/BFS with one random start node, which gives you your first component. Then you get the second by starting at a node which is not in the previous component, and so on until you have visited all nodes. This should all be in O(n).


[(1, 2, 3), (2, 4, 5)] would result in a graph like this:

    1 - 2 - 3
         \
          4 - 5

Building the (undirected) graph would take linear time, and once it is built, you can do a simple Depth First Search to mark all the connected components.
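A sketch of the graph approach in Python (`merge_via_graph` is an illustrative name): chain each set's elements into edges, then collect connected components with an iterative depth-first search:

```python
from collections import defaultdict

def merge_via_graph(sets):
    """Merge overlapping sets by building a graph and finding connected components."""
    adj = defaultdict(set)
    for s in sets:
        items = list(s)
        for a, b in zip(items, items[1:]):   # chain consecutive elements as edges
            adj[a].add(b)
            adj[b].add(a)
        for x in items:
            adj[x]                           # touch, so singleton sets become nodes
    seen, out = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:                         # iterative DFS over one component
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        out.append(comp)
    return out
```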


Great exercise! Here's my solution in Java; it seems to be about O(n):

    class Ptr {
        public Ptr next;
    }
    
    <T> List<List<T>> mergeIntersecting(List<List<T>> lists) {
        Map<T, Ptr> lookup = new HashMap<>();
        Map<Ptr, List<T>> output = new HashMap<>();
        for (List<T> list : lists) {
            Ptr ptr = new Ptr();
            if (list.isEmpty()) {
                output.put(ptr, new ArrayList<>());
            }
            for (T value : list) {
                Ptr prev = lookup.get(value);
                lookup.put(value, ptr);
                while (prev != null && prev != ptr) {
                    Ptr tmp = prev.next;
                    prev.next = ptr;
                    prev = tmp;
                }
            }
        }
        for (Map.Entry<T, Ptr> entry : lookup.entrySet()) {
            Ptr ptr = entry.getValue();
            while (ptr.next != null) {
                ptr = ptr.next;
            }
            if (!output.containsKey(ptr)) {
                output.put(ptr, new ArrayList<>());
            }
            output.get(ptr).add(entry.getKey());
        }
        return new ArrayList<>(output.values());
    }
If I give it random lists of integers, it takes around 1 microsecond per element of input. Really curious if there's any way to speed it up a lot.


Robert Sedgewick's course [1] and associated book/booksite [2] have a good overview of Union-Find problem and various algorithms to solve it.

[1] https://www.coursera.org/learn/algorithms-part1

[2] http://algs4.cs.princeton.edu/15uf/


Indeed, Union-Find is the first subject the course covers, because it uses it as an example of an elementary algorithm.


You could just go on merging sets with each other.

For i from 0 to n-1, find all sets from i+1 to n-1 which have a non-empty intersection with set i. Union set i with all those sets and replace set i with the union.

If you use a disjoint-set data structure this will be quadratic, i.e. O(n^2).

EDIT: On further thought you need to merge from the end and backwards.


This is wrongheaded.

Just add everything one by one to a disjoint-set union.

My bad.


I had a similar interview question at Booking.com once. You can solve it in linear time with some extra space (worst case - linear) for two constant-time lookup structures.


Your two examples have the same inputs but different outputs if I am reading this right. What did I miss?


On mobile, the 4th set is hidden until the line is dragged left.


Dang my bad. Thanks for clarifying.


The second example provides one more input set (8,10). So the first example has only 3 sets as input, the second example has 4 sets.


Here's a shell script for the case where the sets are too big to fit in memory, assuming that you are on a Unix system and it has a sort program that can sort a file that won't fit in memory.

Input: one file per set, with names of the form set.X. Format of the file is one value per line. E.g., the set (1, 2, 3) might be in file set.0 with contents

  1
  2
  3
Output: each run of the script will merge overlapping set.X files, deleting files that are made redundant. It will tell you how many sets were merged.

Run the script repeatedly until it says "merged 0".

    #!/bin/bash
    for i in set.*
    do
        sed -e "s/$/ $i/" < $i
    done | sort -k 1 -n > m.$$
    last_val=-1
    last_set=
    merged=0
    while read in
    do
        set x $in
        if [ $2 -eq $last_val ]
        then
            if [ -f $3 ]
            then
                cat $last_set $3 | sort -n | uniq > t
                mv t $last_set
                rm $3
                merged=$((merged + 1))
            fi
        else
            last_val=$2
            last_set=$3
        fi
    done < m.$$
    rm m.$$
    echo merged $merged
The above does more passes over the complete set of elements than is necessary, in order to minimize memory use. At the cost of a little more memory, it could write the commands done in the while loop (cat|sort|uniq;mv;rm) out to a file, and then edit that file to adjust it to take into account the effect of the rm's, and then do one pass of merging.

That would look something like this. First, you'd run this script once:

    #!/bin/bash
    for i in set.*; do sed -e "s/$/ $i/" < $i; done | sort -k 1 -n > m
    last_val=-1
    last_set=
    line=2
    > s
    while read in
    do
        set x $in
        if [ $2 -eq $last_val ]
        then
            #echo "if [ -f $3 ]; then cat $last_set $3 | sort -n | uniq > t; mv t $last_set; rm $3; fi"
            echo "cat $last_set $3 | sort -n | uniq > t; mv t $last_set; rm $3"
            echo "$line,\$s/$3/$last_set/g" >> s
            line=$((line + 1))
        else
            last_val=$2
            last_set=$3
        fi
    done < m > c
That gives an output command file, c, that looks like this:

  cat set.4 set.7 | sort -n | uniq > t; mv t set.4; rm set.7
  cat set.0 set.5 | sort -n | uniq > t; mv t set.0; rm set.5
  cat set.1 set.5 | sort -n | uniq > t; mv t set.1; rm set.5
  cat set.5 set.7 | sort -n | uniq > t; mv t set.5; rm set.7
  cat set.2 set.6 | sort -n | uniq > t; mv t set.2; rm set.6
  cat set.3 set.6 | sort -n | uniq > t; mv t set.3; rm set.6
Note the problem with this. Line #1 removes set.7 after merging it with set.4. But line #4 refers to set.7. Since 7 was merged into 4, it needs to refer to set.4 at that point, not set.7.

The script that made c also outputs a file, s, with sed commands to do the above fix. For the above example, it looks like this:

  2,$s/set.7/set.4/g
  3,$s/set.5/set.0/g
  4,$s/set.5/set.1/g
  5,$s/set.7/set.5/g
  6,$s/set.6/set.2/g
  7,$s/set.6/set.3/g
There is still a problem, because note that s suffers from the same problem that c does! Line #4 of s also refers to set.7, but at that point it should be set.4.

So, before using s to fix s, we have to use s to fix s: "sed -f s < s > s2", giving this for s2:

  2,$s/set.7/set.4/g
  3,$s/set.5/set.0/g
  4,$s/set.0/set.1/g
  5,$s/set.4/set.0/g
  6,$s/set.6/set.2/g
  7,$s/set.2/set.3/g
In this case, that is sufficient. We could now "sed -f s2 < c > c2" and then "bash c2", and we'd be left with set.1 and set.3, with the other sets properly merged in.

However, in more complicated cases one application of s to itself is not always enough. What we really should do is keep applying it to itself until we hit a fixed point, so "sed -f s2 < s2 > s3" giving:

  2,$s/set.7/set.4/g
  3,$s/set.5/set.0/g
  4,$s/set.0/set.1/g
  5,$s/set.4/set.1/g
  6,$s/set.6/set.2/g
  7,$s/set.2/set.3/g
and if you then apply s3 to itself, you will see that there is no change, so s3 is our fixed point. We could then "sed -f s3 < c > c3". Turns out that c3 is identical to c2, so we get the same results as earlier.


Assuming everything fits in memory, the following seems reasonable. The basic idea is to essentially think of each set as a region in some abstract space. If two sets have an element in common, then they are directly connected in that space. Build a map of these direct connections, and then you can use a flood fill to find connected regions. Each connected region corresponds to an output set.

Here's a test implementation, assuming input is one line per input set with space separated values on each line. Output format is the same.

    #!/usr/bin/env perl
    use strict;

    my @in;
    my @out;
    my %sawin;
    my %merge;
    my %merged;

    while (<>)
    {
        chomp;
        s/^\s+//;
        push @in, [split /\s+/];
    }

    for (my $i = 0; $i < @in; ++$i) {
        foreach (@{$in[$i]}) {
            if (defined $sawin{$_}) {
                $merge{$i}{$sawin{$_}} = 1;
                $merge{$sawin{$_}}{$i} = 1;
            } else {
                $sawin{$_} = $i;
            }
        }
    }

    foreach (my $i = 0; $i < @in; ++$i) {
        my %out;
        next if $merged{$i};
        $merged{$i} = 1;
        $out{$_} = 1 foreach @{$in[$i]};
        foreach my $ms (expand_merge($i)) {
            $merged{$ms} = 1;
            $out{$_} = 1 foreach @{$in[$ms]};
        }
        push @out, [sort {$a<=>$b} keys %out];
    }

    foreach (@out) {
        print join(" ", @$_), "\n";
    }

    sub expand_merge
    {
        my($base) = @_;
        my @todo = keys %{$merge{$base}};
        my %done = ($base => 1);
        while (@todo) {
            my $next = shift @todo;
            next if $done{$next};
            $done{$next} = 1;
            push @todo, keys %{$merge{$next}};
        }
        return keys %done;
    }
Everything should be linear in the total number of elements except for expand_merge (the flood fill-like part). I think worst case for expand_merge could be quadratic in the number of elements, which would occur if each set overlapped a large fraction of the other sets.

If things won't fit in memory, I don't know how to do it in the general case. I suppose the first thing I'd do is look at the source of the sets to see if there are any limits on that. For instance, if we are dealing with a very large number of sets without a lot of members per set, and the range of numbers in each set is not very large, then it should be possible to partition the input into two sets of sets, A and B, such that it is easy to show that no sets in A contain any overlap with any sets in B, so we've reduced the problem to two smaller problems that can be solved independently and their outputs concatenated. Repeat.

For the general case, I'd start out by sorting the elements of each set, and by sorting the set of sets. While Googling for a refresher on external sorting and then coding up that part, I'd be hoping for some flash of brilliance to deal with what to do after that.

If no flash of brilliance arrived, I'd probably try something like this (assuming that I can at least fit several of the sets into memory at once). Let's assume that each set is stored in a file, named after its order in the sorted list of sets.

Read the first set into memory. Then scan through the remaining sets, in sorted order, checking each for overlap with the first. For any that overlap, merge them in memory with the first. When all the sets have been processed, or a point is reached where the first element of the current set is larger than the last element of the merged first set and so you can infer that no more merging will happen on this pass, write the merged first set out, replacing the original first set, and delete the files for all the sets that merged with the first.

Repeat this until no new sets merge with the first. At this point, you can mark the first as done, and it becomes the first output set.

Repeat with the first remaining set as your new first set, and so on.

As long as the biggest single output set and the biggest single input set will both fit in memory at the same time, I think that the above approach works.

I have a feeling that there is some clever way to do this that is much more efficient and is much more obvious (in the mathematical sense...in other words, after you look at it for a very long time and think about it really really hard it was clearly obvious).

My guess is that the clever solution will heavily involve sorting...not that I'm really going out on a limb with that guess, because almost everything is sorting when you look at it right. For example, here's a shell script that given a list of x, y coordinates on STDIN (one coordinate pair per line, x and y separated by space) outputs the result of doing one generation of Conway's Life with the input being the initial cell configuration:

    # Record each live cell, and emit all eight of its neighbours.
    > alive.$$
    while read cells
    do
        echo $cells >> alive.$$
        set x $cells        # $2 = x coordinate, $3 = y coordinate
        x=$2
        y=$3
        echo $x $((y-1))
        echo $x $((y+1))
        echo $((x-1)) $((y-1))
        echo $((x-1)) $y
        echo $((x-1)) $((y+1))
        echo $((x+1)) $((y-1))
        echo $((x+1)) $y
        echo $((x+1)) $((y+1))
    done | sort | uniq -c > neighbors.$$   # count neighbours per cell
    # Exactly three neighbours: alive next generation, whether or not
    # it was alive before.
    grep '^ *3' < neighbors.$$ | sed -e 's/^ *[0-9].//'
    # Exactly two neighbours: survives only if it was already alive.
    grep '^ *2' < neighbors.$$ | sed -e 's/^ *[0-9].//' > has2.$$
    sort alive.$$ -o alive.$$
    comm -12 has2.$$ alive.$$
    rm has2.$$ neighbors.$$ alive.$$

Note that the key operation is "sort". This runs in O(n log n) where n is the number of live cells (assuming your Unix uses an n log n sort...).
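For comparison, here's a sketch of the same one-generation step in Python, trading `sort | uniq -c` for an explicit neighbour counter (`life_step` is just an illustrative name):

```python
from collections import Counter

def life_step(alive):
    """One generation of Conway's Life; `alive` is a set of (x, y) cells."""
    # Count, for every cell, how many live neighbours it has.
    neighbors = Counter(
        (x + dx, y + dy)
        for x, y in alive
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Born with exactly 3 neighbours, or survive with 2 or 3.
    return {cell for cell, n in neighbors.items()
            if n == 3 or (n == 2 and cell in alive)}

# A vertical blinker flips to a horizontal one.
blinker = {(0, -1), (0, 0), (0, 1)}
```

Here the hash-based `Counter` does in expected O(n) what the shell pipeline does in O(n log n) with sort.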


I have been hiring technical people for about 18 years, 15 of them for my own business. I stopped with all the bullshit questions years ago. My own research into which approach to interviewing is "best" turned out to be a tossup: you can spend a day, or even days, on a gruelling interview schedule, and your success rate won't be significantly higher. We are not Google, we don't have their needs, and we don't have the resources. If I'm looking for a dev or ops person, your CV matches the skills I am looking for, and I have a good feeling about how you present yourself on paper (decent cover letter, decent CV, etc.), you'll be invited for a formal interview, where we will discuss your CV ("tell me a bit more about this job you list here", etc.); that takes anywhere between 15 and 60 minutes. If there is a connection and a good vibe, your next interview will be informal, in the pub, with the rest of the team, to see how you get on. This approach has netted me a higher percentage of successful long-term hires than any other. I trust that if you apply for a developer position, you will actually be able to cope. Your skillset will be determined during your probation period, which is something I make clear beforehand; probation is where we see how good your skills are.


I really wish there were more companies like this. Interviewing in SF/Silicon Valley is an absolute nightmare and I've basically given up. I simply can't put myself through that ordeal of weeks of back and forth emailing and talking with recruiters, followed by multiple day long interviews, just to be rejected because I couldn't whip out an algorithm on your white board in front of five people while interviewing for a basic front end position.


I get that this is a caricature, an exaggeration of reality that's trying to convey some bad practices. It must be incredibly frustrating to be on the receiving end of an interview this bad! But I'm not sure all of this plays particularly well, and I want to push back a little bit.

For context here, I'm a college dropout. I made it through two years before I discovered that I could skip right to having a job, and for where I was in life, it was the right decision for me. But seriously, what's the deal with this hatred of "Big O" notation?

This isn't some cryptic language only for use by CS PhDs. It's not even a concept that's hard to explain to someone who isn't in our industry!

"The question we're trying to answer here is, how does our solution scale as the problem gets bigger? Are we able to solve it in the same amount of time regardless of size? Does our solution take more time as the problem gets bigger, but only in proportion to how much bigger the problem is? Or does our solution take much, much longer as the problem gets bigger, to the point where it's no longer feasible to solve?"

Asking a candidate to be able to have a conversation about this -- how do time and space complexity scale -- is not unreasonable. This is the difference between designing a workable solution early, and discovering that what worked great in a unit test and some one-offs behind the scenes fell over in production.

Similarly, I see no problem in expecting a candidate to have some familiarity with the basics of simple data structures. I don't generally rewrite them either! But I use data structures all the time, exposed to me via the standard libraries of programming languages that I use, and if I don't broadly understand how they work, when they are appropriate and when they're not, and when one choice is superior to another, I'm not doing my job.
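The choice between a list and a set for membership testing is exactly this kind of conversation. A rough sketch in Python (the timings are machine-dependent, but the ordering isn't):

```python
import timeit

# Membership test: O(n) linear scan in a list vs. O(1) average
# hash lookup in a set, over the same 100,000 elements.
items_list = list(range(100_000))
items_set = set(items_list)

# Look up the worst-case element (the last one) 100 times each.
t_list = timeit.timeit(lambda: 99_999 in items_list, number=100)
t_set = timeit.timeit(lambda: 99_999 in items_set, number=100)

assert t_set < t_list  # the set wins by orders of magnitude
```

No implementation of a hash table required, just knowing which one to reach for and why.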

I agree that in a lot of instances, Google is the answer. But Google is of no help if you don't know what to ask, or worse, if you don't even understand that there's a question to ask.

I'm not defending the truly awful interviews that I believe really do happen. But let's not pretend that it's not important to understand how these details work to get proficient, or even adequate, at our jobs.


That's hilarious.

I was once interviewing for a C++ position and the interviewer presented me with some C code with a broken "swap" implementation and a driver function and asked me to fix it. I simply prefixed the call to swap with "std::".

He wasn't very happy about it. ;-)


Nice. I was once asked to implement a basic filesystem API in C++ including file manipulation, directories and paths, etc. I just said use boost::filesystem and move on with your life. Interviewer was not impressed but at 95% of companies that's what you are going to want to do.

Can you imagine interviewing someone to build you a house and spending 90% of the interview time asking him how he'd chop down trees for wood and manufacture the nails? Welcome to software interviewing.


I partially agree. It depends on what you are hiring for. If you are hiring for a low level code monkey, you want to see how they code. If you are hiring for a senior level engineer/architect, you want to see how often they avoid coding and know how to leverage existing frameworks.


Old post but wanted to respond: Even when interviewing more junior roles, I want to see the solution that involves the smallest amount of code, ideally no code. Nothing wrong with expecting that existing mentality from both junior and senior people alike. Saves you from having to instill it.

A candidate's ability to re-implement the language's standard library is not so useful if we simply use the language's standard library in our application code.


Of course, you acted like a smartass for no apparent reason.

You will never be asked to reimplement a library function on the job; these questions are used to see how you approach a problem, your reasoning, etc.


That is how I approach a problem. I've seen too many inexperienced developers reinvent the wheel - badly - instead of taking a step back and wondering, "is this a solved problem"? Is this part of our core competency or can we outsource it, find an existing package, etc.


Even good, senior developers do this. A lot of people just jump into a problem because it's fun, but don't stop to say "hey, someone must have solved this already. Let me see if there's an open-source solution out there."


I might do something like that on my own time as a learning experience, but not as part of my job. The less surface area we have to support the better.

It's also a lot easier to onboard new people if you're using a popular third party package and you have thousands of people that have already had the same problems you might encounter.


Hired.

Seriously, that's the correct answer as far as any productive engineer is concerned, right?


Except these days you are not supposed to call std::swap directly, but should go through ADL. I.e.:

  using std::swap;
  swap(a, b);
/doublesmartass


What a fun interview it would be to actually do this.

My favorite interview was for an architect position, where my two future peers took turns telling me horror stories about working there, to see if I would run away screaming. I got the job and only ran away screaming 18 months later.


I've learned in the past few years that if the interviewers gripe about anything in the interview (especially management), it's best to stay away from that company. In an interview, the company is trying to put its best face on to sell themselves to you, and if the employees can't hold it in for 30 minutes, that's a bad sign.


I've been on both sides of the horror-story style interview -- as an interviewer and an interviewee.

As an interviewer, if there was someone we liked, we were brutally honest about the company culture. We (myself and my manager) wanted senior engineers who weren't afraid to speak up and help shake things up in the larger organization. If they still accepted the offer, they were what we wanted. We had too many people quit out of frustration instead of fighting.

At another job I applied to as an architect, the manager told me all of the issues with the processes (and lack thereof). He tried to "scare me away". I asked him one question: will I be able to come in and make the changes needed, and will he have my back if I inadvertently step on toes trying to do the right thing? He said yes and I accepted the offer.


I've actually used (politely, of course) many of these answers in interviews, whenever I felt the interviewer was trying to show off or read from the script instead of actually judging my abilities.

"I'd google it. Anything less would be a waste of our time." That cuts through a lot of bullshit. After saying that at my last interview, we got into actual problems (structuring a program, building an API, writing code to approximately simulate real world problems, data structures). Had a good time, got hired, and loved working with those people.

On the other end of the spectrum, I've been torn to shreds in an interview for not being able to write a fast hash table implementation in Java (didn't know Java at the time and the interviewer didn't know Ruby/Python/C) in an hour of trying. That was soul-destroying.


I used a polite version of "I'd google it" while interviewing with Google. I explained that I didn't think anyone could be reasonably expected to know how to do everything in every language on demand and that virtually every professional programmer I knew consulted Google from time to time.

Maybe I was just ahead of the times or they weren't looking for someone to be as honest as I was but I didn't make it past round 3 of their interviews.


    new java.util.HashMap()
In case you ever get such a stupid question again.


Yeah, I'm fairly proficient these days. But even then, I expressed amazement that Java didn't have a built-in or library data structure.

"It does."

"Then I'd use that."

"That wasn't the question."

"Oh... kay..."


> Manhole covers are round because that is the shape that won’t fall in

It's worth mentioning that this oft-repeated canard doesn't have much evidence. Manhole covers come in a great many shapes, and this has always been true: the Romans made them square. There are plenty of shapes which won't fall in, such as any curve of constant width. Basically fake-Feynman is right: manhole covers are round because it is often convenient to make manholes round.


Modern manhole covers are overwhelmingly round, and cast so as not to fall into their own fitting, also cast. Not sure how an obscure counterexample does anything other than disprove your point further. Source: worked IT for a public civil engineer for a few years.


I assume that by 'obscure' you mean 'outside your-state, USA'. Here in the UK, rectangular manhole covers are the overwhelming majority.


There are many cities where manhole covers are overwhelmingly rectangular. I'm typing from one right now: Rome. SPQR!

I used to live in a town in the US where many of the manhole covers were shaped like a D.


I think the point is that a circle is not the only shape that can't fall into its own hole/fitting, so it's the combination of this fact along with other factors that makes a circular cover popular.

https://en.wikipedia.org/wiki/Manhole_cover#Circular


I've called out interviewers for stupid puzzle questions multiple times. Never got the job in any case, but I like to think I improved the hiring process marginally before being rejected.


I've also called out crappy interviews - while in the interview. I went on to be offered the job - but declined.


Heh, I would imagine they just found fellow aristocrats who had seen the puzzle questions before and whose groupthinking conformity was a perfect match for the companies.


While I understand the context is "stupid interview questions", I would not say big-O notation is useless, even for non-Google companies - just read some of the stories at thedailywtf.com to learn how bad it gets.

Also, there's a difference between asking "How would you solve X (in Y language)" and expecting a line-by-line code sample. In the former you should be able to talk about how you would structure the code, what input/output you would need, how to set it up, data stores, etc. whereas the latter is (virtually) impossible without help from Google.

While it would be lovely to hire candidates by "everything it says in the resume", the truth is that only works for really great hires... which you don't know until you've talked to them about how they work, what they've done and how they like to solve problems.


Isn't it interviews like that that turn job-seeking developers into entrepreneurs?


No hire. Bad culture fit. Seriously, even if you think the interviewer is asking the wrong questions, insulting the interviewer as if it's their job to fix the hiring process of the entire industry...


> How do you feel about overtime pay?

We provide industry-leading compensation, but no one pays for overtime so neither do we.


I think everyone should get overtime pay, it would make the company think about how to get better utilisation out of their staff in an allotted time.


That's illegal in many countries. America is fucked up.


Exempt employees rip.


We are an equal opportunity employer, we hide behind the market and external factors to discriminate salaries.


I once had to use a linked list to implement a sorting algorithm on a huge but partially sorted input. This was to detect duplications in an AST, for a static code analysis startup. This was literally the only time I professionally used any sort of non-trivial data structures and algorithms, and I've been working in the field for some 10 years now.


Knowing of data structures is more important IMO. The hoops I've seen people jump through in code because they don't know about sets.
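A classic example of such a hoop is deduplication by nested scan instead of reaching for a set. A small illustrative sketch (the function names are mine):

```python
def dedup_quadratic(xs):
    # The hoop: re-scan the output list for every element, O(n^2).
    out = []
    for x in xs:
        if x not in out:
            out.append(x)
    return out

def dedup_with_set(xs):
    # Knowing about sets: dict.fromkeys dedups via hashing while
    # preserving first-seen order (Python 3.7+), O(n).
    return list(dict.fromkeys(xs))

assert dedup_quadratic([3, 1, 3, 2, 1]) == dedup_with_set([3, 1, 3, 2, 1])
```

Both give the same answer; only one of them survives a ten-million-element input.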


And when you see that do you wish they'd never been hired due to this insurmountable hole in their knowledge of data structures or do you take 30 seconds to tell them about sets?


At what point do we just take someone out of high school and teach them to program?


Wait, how is a linked list a non-trivial data structure? It just consists of: {data, pointer to next element}.
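Indeed, the whole structure really is just that pair. A minimal sketch in Python (illustrative, not taken from any comment above):

```python
class Node:
    """A singly linked list node: {data, pointer to next element}."""
    def __init__(self, data, next=None):
        self.data = data   # the payload
        self.next = next   # the next Node, or None at the tail

def to_list(head):
    """Walk the chain from head to tail, collecting the payloads."""
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out

# Build 1 -> 2 -> 3 by nesting constructors.
head = Node(1, Node(2, Node(3)))
```

That's the entire data structure; everything else (insertion, deletion, reversal) is pointer shuffling on top of it.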


Linked list was just a part of the data structure I ended up using.


It's somewhat ironic that they get horrified when you say you'd Google something, and then the lateral-thinking questions are normally the first five results from the Google search 'lateral thinking puzzles'.


I'd take a very negative view of an interviewee who claims linked lists and big-O (and the optimization lore connected to it) are relevant only in interviews. Of course, if he's interviewing for an HTML-"coding" grunt position at a company making trivial websites, I'd get it. But if any substantial programming is involved, it's completely false.


I had the exact same interview when applying to be a barista at Philz Coffee.

The technical questions were my fault because I told the interviewer I was a researcher in number theory and enjoyed working with embedded systems.. to which they said, 'oh, can you talk a bit about that?'

Each comma denotes an email response explaining I was chosen to move on and to please fulfill the next request: Send resume with cover letter, fill out online application, make 90s video about why you should serve coffee, first video chat interview, come in to meet recruiter, come in to meet store manager.

I only made it through the first of the three interviews. Ultimately I think we both dodged a bullet there.

Honestly!

Ask an employee next time you get a coffee there what kind of interview process they had to go through.

PS- should have made a 90s '90s video about why I should serve coffee.. definitely would have gotten their minimum wage offering then.


It was fun to read. However, while the interviewer was not perfect, the interviewee was far too crazy.

I always find it suspect when self-proclaimed awesome genius programmers take offense at simple, common questions (like linked lists). Had it been something esoteric, OK. But these come up so often and take so little time to learn.


I wrote a linked list today, outside of a job interview. Systems programmers exist :'(


Not even just system programming. It's a fundamental data structure and concept. I don't think it's unreasonable for a company to say: we expect this role to understand what goes on behind this abstraction.


I had a recent interview where it literally went like this:

He: "Hi. My name is [X] from [Y:company]. We were scheduled for an interview now."

Me: Yes.

He: "Ok. Am I audible?"

Me: Yes.

He: "Cool. [Answer this question]"

Me: baffled screeching

He talked about a lot of things then, including "How does Node.js manage its asynchronous thread? Where is the sequence of functions to run stored in memory?". In the end I asked him what a general day was like. To which (no kidding) his reply was: (sad tone) "It's Friday 6pm. Everyone is having donuts and here I am taking this interview".

Well, if you're more concerned about your donut, thanks I guess. I got rejected after this round.


If someone answered with "I'd google it", I would say "go ahead". I don't mind if someone doesn't know or remember something. That is a random signal. I want to know however if a person is mentally lazy and therefore is merely a "user" of technology or the "maker" of it.

That said, being able to at least describe some use cases of some relatively simple data structure is not too much to ask.


I once had an interview where they gave me some programming problems, prewritten unit tests that were failing that I had to make pass, and a laptop; told me to use any resource I wanted; and left me in a room for an hour. It was one of the best technical interviews I've had.


I once received a modified version of FizzBuzz (the word replaced by the company name) as my coding interview (via email). I told the HR person that I knew the problem already and expected a more challenging one. She asked me to do it anyway.

I gave them my solution, but, well, my interest in that company was gone.
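For anyone who hasn't run into it, the standard (un-renamed) FizzBuzz fits in a few lines — a sketch in Python:

```python
def fizzbuzz(n):
    """Multiples of 3 -> "Fizz", of 5 -> "Buzz", of both -> "FizzBuzz"."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# The canonical exercise: print fizzbuzz(i) for i in 1..100.
assert [fizzbuzz(i) for i in (1, 3, 5, 15)] == ["1", "Fizz", "Buzz", "FizzBuzz"]
```

The point of the exercise was never the code; it's a floor check, which is exactly why being handed it as the whole interview is deflating.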


I'm taking an introductory course to CS, and a week before the midterm they introduced linked lists -- which to me at first just seemed to be lists within lists with extra rules. What applications do they have in the real world?


I recently graduated and I'm going for my first interview in a couple of days. It's in a recruitment agency, and it's for a general web developer position. What do you think I could expect?


Linked Lists are terrible. They're one of the reasons why early 2000's natd in FreeBSD was terrible. Ask me how I know!

Here's a hint - I trussed natd while it was running. Ooops.


loooooooooooooooooooool.


i recently had an interview which was plain smart what i need to get my work done questions i loved it ( i took that offer ). but then there were few where they did ask me a question on solution implementation and when i solve it they said there is a better way to do this and then I would be like ok then let's discuss but then they were quiet on the other side and waiting to hear me answer the best possible way to solve question ( like 15 mins )


You should definitely contact a reddit user called CommaHorror.

This is his writing style, which I believe would complement yours quite nicely: https://www.reddit.com/user/commahorror


i completely agree. I wrote that comment from my mobile and it is not my friend.

P.S : this is replied from desktop. hope this works for you


Punctuation is your friend.


not everyone who speaks desires to be heard by everyone :)



