> Lots of people do this, which is fascinating to me, to handle the negative n case
It makes sense, because performance differences between that and the "optimal" solutions are usually negligible, and the "optimal" solutions aren't usually optimal when you take into account readability and maintainability.
Embedded development is significantly different than developing for the web. Buying more servers is usually more efficient than taking twice as long to develop and update code. Interviewing with people who refuse to acknowledge this is extremely frustrating.
> Interviewing with people who refuse to acknowledge this is extremely frustrating.
I agree. Let me be clear, I'm just having a bit of fun posing this challenge on my blog. In no way do I count minor inefficiencies in implementations against candidates. Now, if they come up with an n^2 solution I might be concerned...
I know how web development works, I understand throwing more servers at it usually is the easiest solution.
But also now that mobile is becoming hot, it is more like embedded systems than web development, so when I'm interviewing mobile candidates we definitely discuss things regarding runtime and memory tradeoffs.
Coming from an embedded background gives you a very different perspective on things. I too have a strong embedded programming background. This, among other things, makes you keenly aware of the CPU and resource cost of your code.
If you've ever spent a day hunting for clock cycles to trim you know how that one can affect run time dramatically with a little attention. After a while this sort of thing becomes second nature.
It pre-computes all that can be pre-computed outside the loop and then, once inside the loop, it uses a calculated index to address the array. The index only requires a comparison and, if going in reverse order, a subtraction, which are computationally cheap. If absolute performance was the goal --at the expense of a few more bytes of code-- this could be tightened further.
One of the problems with programmers that don't have a low-level background, and, in particular, those who grew-up, if you will, with object-oriented programming, is that they don't generally have a connection with the cost of their code.
Object oriented programming is great for the programmer, but the computer doesn't need it. It exists purely for the programmer's benefit. The computer would execute functionally equivalent non-OO code just fine and much, much faster. Case in point: Objective-C classes. I forget where the stats are but I remember that in some cases things run 400 times slower if you use certain "NS" classes. Four hundred times slower is a pretty serious price to pay for human convenience.
I wrote a genetic evolver for an iOS app I worked on. The initial code was done in full Objective-C on purpose. I wanted to see just how efficient the code could be made "using the Force". I then re-coded in nice-clean C. Wow! I don't have the data in front of me but I remember that it went from having to wait for a solution to executing hundreds of generations (and arriving at a solution) almost as quickly as you retracted your finger after pushing the "Go" button.
If we apply this to web programming it means that one could save a tremendous amount of money in server costs and provide a far more responsive system at the same time. Server costs go beyond the mere monthly cost of the hardware. Calculations should include MTBF, maintenance, power, cooling, etc. If you are using AWS these costs are all part of the package, of course.
So, no, in my case, I don't take it lightly. I don't like it when people suggest that we "throw another server at it". First I want to see that the code is as tight and fast as can be. Then we look at throwing hardware at it. Of course, there is a point where hunting down clock cycles becomes more expensive than it is worth. The location of this point on the timeline is different for each business and project.
> I don't like it when people suggest that we "throw another server at it". First I want to see that the code is as tight and fast as can be.
I understand that you don't like it, but that doesn't mean it's actually more efficient. Except in very rare cases, it usually saves money in the long run to add additional servers. The correct way isn't always the best way.
Performance differences are anything but negligible. In the case of the "billion elements", you are now spending the majority of time for reversing the array twice.
As a programmer you should always know what kind of cost you are incurring, both asymptotically and down to cycles.
Trying to condense all the cases into one (using reverse or local state like 'step' from OPs version) is what hurts readability and maintainability.
Also, the "buy more metal" argument is hardly applicable for implementing this one way or another. Its an argument for using whatever platform you are most productive in.
> As a programmer you should always know what kind of cost you are incurring, both asymptotically and down to cycles.
No, as a programmer, you should be able to determine what kind of cost you are incurring when it matters, and much more importantly, you should be able to determine when it matters and when it doesn't.
And even in the case of a billion elements, the difference in time is still negligible. Developer time is almost always better to optimize for. There's a reason Facebook and Google weren't coded entirely in C.
Here's my second Javascript draft. I intend to refactor it tomorrow so that the three if branches are unrolled into their own functions. Especially since some bizarre facet of Javascript's type coercion makes it impossible to do early returns inside the iterator function ([] << 1 somehow comes out as 1, not [1]).
I've posted and deleted this a few times, since I keep rethinking it. This is a Lua version, and I'm pretty new to Lua (it's good practice). No reverses needed; just numeric for loops.
I'd rather keep n = 0 as a separate case since it doesn't need to consider a break point or a count variable. The test - Is n 0? - happens once and by including it, you can omit a lot of pointless tests - Am I at the breakpoint? - and increments - Add another one to count. Those tests and increments seem much worse than the single test - especially if the array (a table in Lua) will grow to very, very large sizes.
But I would be happy to hear if I'm looking at it wrong or missing a good trick in Lua.
function add1(t, val, n)
if n >= 0 then
start = 1; stop = #t; step = 1
else
start = #t; stop = 1; step = -1
end
if n == 0 then
for i = start, stop, step do
if t[i] == val then t[i] = t[i] + 1 end
end
else
break_point = math.abs(n)
count = 0
for i = start, stop, step do
if count == break_point then break end
if t[i] == val then
t[i] = t[i] + 1
count = count + 1
end
end
end
return t
end
I changed the interface to what I think it's more 'Ruby'. If you are looking for in-place solution, we should use the ! to indicate it's more 'dangerous' than the other solution which returns a new copy.
I'm new to Ruby so I'd love to hear feedback from other more experienced Rubyists on HN. (And I couldn't find reverse_each_with_index so I had to write my own.)
While the implementation of this would be pretty straightforward, the method signature you're asking for seems convoluted, in my opinion. Here's why:
- "add1" is not very descriptive of what the method actually does, since it adds one, but only sometimes. This title makes it sound like the method simply increments each value by one.
- When you say "if it is zero, do this, but if it is positive, do this...", this thinking creates a "secret handshake" in your code, so a new developer would have to read the method in depth to understand what it does in each case.
- If this method will have widespread use in your code for arrays, consider extending the array class with the method, so that you can call it directly on an instance of an array.
While the implementation may get a little more complicated this way, you're making the method signature much more clear and descriptive, which will lead to better code overall.
> Are you using iterators like .each or .map, or are you using a normal, index-based for loop?
I doubt you mean anything very judgmental by 'normal', but this still jumped out at me. In Ruby an index-based for loop is pretty unusual. The for loop is non-idiomatic, and it also has a scope leak: https://gist.github.com/3232118.
There actually _is no_ "normal, index-based" for loop in Ruby, only a for ... in iteration construct. Most C-style for loops can be implemented as #times or #up_to/#down_to/#step iterations.
hmm, my apologies if that sounded judgmental. I started coding a long time ago, so doing for loops was a very fundamental part of learning how to code. I (incorrectly?) assumed most people learned that form of looping first.
Again, I didn't really think you were being judgmental. It's just that in Ruby, the for loop definitely doesn't have pride of place at all. You're may be right that most programmers learn languages where for is more common, but I wonder about people who start with Ruby. That is, do they think for is weirdly primitive when they start doing Javascript?
Anyhow, no offense taken at all. But I would be shocked if any Rubyist did use for.
Even in languages like C# the standard for-loop seems to be heading in the direction of being considered harmful. Its status hasn't gone nearly as far along the path as for Ruby, but the tide's turning that way.
As far as creating a new array (or several new arrays) goes, there are some new considerations such as composability that are starting to be valued more highly than efficiency. And pure functions are more composable than ones that mutate the existing data.
Granted, the n < 0 case could still be handled in a way that's more efficient but still pure by allocating an array and filling it backward rather than using two reverses and an iterator.
def add1(arr, val, n):
it = xrange(len(arr))
if n < 0:
it = reversed(it)
n *= -1
found = 0
for i in it:
if arr[i] == val:
arr[i] += 1
found += 1
if found == n:
break
return arr
First step: the increment requirement, the in place requirement, the
short circuit requirement, are all the same thing. So, break that part
off.
def addl(list, target, n):
indexes = lazy_find_n(list, target, n)
for index in indexes:
list[index] += 1
The general purpose for loop gives me the willies.
for (var i = start; i != stop; i += step) {
if (arr[i] == val) {
arr[i]++;
count++;
...
What is the difference between i += step, arr[i]++, count++? Are you
absolutely sure your start, stop, step & count won't over run your array
boundaries? Are you willing to bet control of your users, CPU
instruction pointer? What if a new requirement is an operation besides
increment? What happens when the next guy, implementing the new
operation, isn't as careful with his loop/subscript conditions? (end
rant)
Second step: create a lazy_find_n() which returns an iterator of the
indexes of the list items we want to modify.
from itertools import ifilter, islice
def lazy_find_n(list, target, n):
# generate indexes in desired direction
if n >= 0:
indexes = xrange(len(list))
else:
indexes = xrange(len(list)-1, -1, -1)
# filter indexes for items matching target
indexes = ifilter(lambda i : list[i] == target, indexes)
# only as many as requested
if n != 0: indexes = islice(indexes, abs(n))
return indexes
Since xrange, ifilter and islice return iterators, the value of indexes
is computed only as needed. That is, if islice only returns a 2 item
iterator, it doesn't matter how many items the iterator from ifilter
would have returned, as long as there are at least 2. (If there are
less, islice returns less.) Furthermore, the iterator from xrange only
processes until ifilter, and islice are satisfied.
> My background is mainly in embedded systems programming. My main concern when I code is for speed and efficiencey, both in runtime and in memory consumption.
Do you specify this to your candidates when they write their code?
This is the thing that annoys me most about interview programming questions. If you don't give me constraints, I am going to implement a naïve version of your requirements, because you have not given me any additional criteria by which you are going to judge my code.
Knowing things about what sorts of inputs the method has to expect (should I assume that i'm going to encounter a billion elements or not?) is really important to judge correctness of fit, or whether you're just wasting your time thinking about cases you'll never encounter.
Part of being a good developer is teasing the constraints out of the customer. You're never going to be handed a perfect requirement. If you're not given constraints don't get annoyed, simply ask about them.
As one of my colleagues likes to complain, the hardest thing in the world to find is programmers who try to make sure they're giving you what you want instead of just giving you what you asked for.
Precisely. I also say, "This is a very open-ended question in which I am intentionally vague on some points. Ask me about information you want or points you need clarified."
Like I said, I have no desire for them to create the "perfect, correct" solution; but I do want to know how they think about a problem and what questions they ask in the face of deficient information. I proactively encourage them to do so, so I am not trying to trick them into failure or anything.
A problem with entirely made-up domain-less questions like this is that any additional constraints are equally as arbitrary as the question itself, and thus difficult to probe for in the same way that someone would probe a customer for constraints to real problems.
It seems like you're very interested in people asking questions like "how big might the array be" or "what order of time can you afford to wait for the answer to this", to which you might answer "a billion" or, "as little as possible", but those questions are very different and less meaningful than questions like "how many users will this system have", and "how much time can users of this system afford to wait for a response", even though the answers may be the same.
I have to say, I respond quite differently to "I have a problem I'm trying to solve" vs something like "I need this algorithm implemented", and a lot of the way these programming challenges are presented always come off to me as the latter, and as a result, contrived.
this reminds me of how even fulton, the author of the ruby way, made a critical mistake in "the ruby way" 1st edition (fixed in 2nd edition) and matz chimed in saying " "Don't modify the receiver while you are iterating
over it." http://www.justskins.com/forums/collect-with-block-modifying...
def add1(arr, val, n):
todo = abs(n) or len(arr)
seq = xrange(0, len(arr), 1) if n >= 0 else xrange(len(arr) - 1, 0, -1)
for i in seq:
if arr[i] == val:
arr[i] += 1
todo -= 1
if not todo:
break
return arr
sub add1 {
my ($arrayref,$val,$count) = @_;
# Just incremet matches and return if we aren't limited
# Useless return of array ref
return \map { $_++ if $_ == $val } @$arrayref unless $count;
# increment up to count matches
for ($count<0 ? reverse @$arrayref : @$arrayref) {
next if $_ != $val;
$_++;
# Advance count towards zero
$count<0 ? $count++ : $count--;
return if $count == 0;
}
}
Splitting the loop into separate versions for positive and negative counts is quicker than checking if count is positive or negative each iteration, but I figured that's a step more than I wanted to take it.
Alternatively, defining a variable for the amount the $count changes (-1 or 1) and just adding that to count would work as well, but I'm working under the assumption that's less likely to be special cased in the interpreter than post-{in,de}crement.
Edit: Clarified reasoning behind post-increment, etc.
add1 = (arr, val, n) ->
n = arr.length if n is 0
arr = arr.reverse() if n < 0
count = 0
for num, i in arr
break if count > (Math.abs n)
arr[i]++ if num is val
count++
return if n < 0 then arr.reverse() else arr
Here's my version without using reverse or underscore. If coffeescript supported reverse loops or negative indices, this would be a lot cleaner.
add1 = (arr, val, n)->
d = Math.abs(n||1)/(n||1)
n = if n is 0 then arr.length else Math.abs n
i = if d < 0 then arr.length - 1 else 0
end = if d < 0 then 0 else arr.length
while n and i isnt end
num = arr[i]
if num is val
n--
arr[i]++
i = i + d
arr
Here's my cheating version that uses a third-party library:
add1_cheating = (arr, val, n) ->
n = arr.length if n is 0
start = if n < 0 then arr.length else 0
end = if n < 0 then 0 else arr.length
step = n/(Math.abs n)
n = Math.abs n
count = 0
for i in _.range start, end, step
break if count > n
arr[i]++ if arr[i] is val
count++
return arr
There was only two ways I could think of to do this without the double reverse.
One would introduce more complex branches (looping backwards in CoffeeScript ain't easy) and the other would be using Underscore.js's range function to do the same thing - that solution would be basically identical to this, except without the reverses, only I'd consider that cheating.
yes, that is technically correct (the best kind), however everyone has understood it as intended so far. I also state it like that so as not to give a tip that abs() might be helpful to them down the line.
I don't know Common Lisp either but if you know any Lisp it's quite readable. I'll guess an explanation (without looking at any docs) and let the real lispers correct me.
substitute traverses the list sequence replace any element equal to val with val + 1. I don't know if sequence does this in-place but that doesn't necessarily affect the correctness, since this wasn't mentioned in the spec.
If used as above then all elements equal to val would be replaced. But there are 2 additional (optional?), named parameters.
from-end indicates whether to traverse the list in reverse order. It will be true if and only if n is less than 0.
count indicates how many replacements should be made. If n is 0, then this will be nil, which indicates "replace all". Otherwise it will be the absolute value of n.
It's likely that the interviewer would ask you to implement substitute, though.
I don't see an obvious solution that doesn't "unroll" the three cases and doesn't waste any cycles, but maybe that is why I don't work at OP's company. ;)
But decrementing n would still cost you cycles in the case where n was given as 0 that you wouldn't need if you had separated out that case, wouldn't it?
OK, here's a C version. Had to add a parameter, of course:
void add1(int arr[], int length, int val, int n)
{
int direction = n >= 0;
int counter = n == 0 ? length : abs(n);
int index;
for(int i = 0; (i < length) && (counter != 0) ; i++)
{
index = direction ? i : length - i;
if(arr[index] == val)
{
arr[index]++;
counter--;
}
}
}
var x = n,
end = 0;
if( n < 0 ){
arr.reverse();
x = Math.abs(n);
}
for(var i=0, len=arr.length; i<len; i++){
if( arr[i] === val )
arr[i]++;
if( x > 0 ){
end++;
if( end >= x+1)
break;
}
}
if( n < 0 ) arr.reverse();
return arr;
Here is a solution in Microsoft's Visual Basic .NET making a little use of some of .NET:
' ADD.VB --
'
' Created at 21:34:59 on Wednesday, August 1st, 2012.
' Note: There are some risks of fixed overflow
' depending on (1) the values of the components
' of arr() and (2) how the logic works for ending
' a For-Next loop in the case n is the largest
' value of Int32.
Sub add( _
arr() As Int32, _
val As Int32, _
n As Int32 )
Dim arr_upper_bound As Int32
Dim i As Int32
Dim n_changes As Int32
' http://msdn.microsoft.com/en-us/library/system.array.getupperbound%28v=vs.100%29.aspx#Y104
arr_upper_bound = arr.GetUpperBound( 0 )
' http://msdn.microsoft.com/en-us/library/cy37t14y(VS.80).aspx
Select n
Case n = 0
For i = 0 To arr_upper_bound
If arr( i ) = val Then arr( i ) = arr( i ) + 1
Next i
Case n > 0
n_changes = 0
For i = 0 To arr_upper_bound
If arr( i ) = val Then
arr( i ) = arr( i ) + 1
n_changes = n_changes + 1
If n_changes = n Then Exit Select
End If
Next i
Case n < 0
n_changes = 0
For i = arr_upper_bound To 0 Step -1
If arr( i ) = val Then
arr( i ) = arr( i ) + 1
n_changes = n_changes + 1
If n_changes = -n Then Exit Select
End If
Next i
End Select
Return
End Sub
Right at first I would have two remarks. The first remark has to do with how to describe such a problem. The second remark points to what appears to be a serious error in the problem description.
My first remark would be that the problem statement is not in any natural language.
E.g., what is the role of the many instances of the character '#'? Those instances are not in a natural language.
E.g., what is the token
'add1'?
That token is not in my Webster's dictionary, and I have been given no way to parse it.
E.g.,
"increments items in an array matching specified value"
is not a sentence.
E.g., "param:" is not English.
Continuing, it is not clear what meant by an 'array'. So, we do not know the value of the smallest subscript or the largest or how to determine these.
Next the terms
"from left-to-right (forward)"
and
"right-to-left (backward)"
are undefined and require a reader to guess. A careful guess would want to know if the computer was big endian or little endian.
Finally, one might object to "most optimal" since, really, something is 'optimal' or it is not so that 'most' is superfluous.
Here is another way to start to write the problem statement:
Suppose you are given positive integer n, and suppose for i = 1, 2, ..., n, you are given an integer x(i).
Etc.
Lesson: In computing, clear communications are important. Those communications should be in a natural language, e.g., English.
My second remark would be to consider the part of the problem statement:
n < 0 means increment up to n occurrences of val from right-to-left (backward)
So, with this statement, if n < 0, then there is nothing to do since we cannot do something a negative number of times.
So it appears that what was intended was:
n < 0 means increment up to -n occurrences of val from right-to-left (backward)
One of the great things about two people from any reasonably mature field communicating with each other is that they can take advantage of widely-agreed-upon shorthand. Nearly everyone coming in for a programming interview in 2012 will understand that the '#' character is often used for non-executable program meta-data, that functions are often named with tokens such as 'add1', that 'param:' typically means a function parameter with a description of its value following the colon, what an 'array' is, how subscripts work with them, and what is meant by the terms left-to-right, forward, right-to-left, and backward (regardless of endian-ness, the inclusion of which is sort of humorous since our hypothetical interviewee has, up to this point, been assumed to be completely ignorant of programming-related terms). You go on to suggest defining the problem in terms of short-hand from a more mature, but less specifically relevant field, which I would argue would actually be well understood by fewer potential interviewees.
I do agree with you that it should probably be pointed out that the number of occurrences should be stated as the absolute value of 'n', rather than just 'n', but this would become obvious very quickly when the candidate starts asking clarification questions.
I was trying to be simple, elementary, and easy to read! But my comment was not at all "pedantic"!
Even with all the context you suggested, the problem statement is just not very clear, and, in computing, computer science, and software, that lack of clarity is not a small point.
Put more bluntly, there is no excuse, and in serious software, especially for embedded systems, there is no role, for abbreviations that are on or over the line into gibberish. You are correct that such gibberish is common in software; my point is that we should not do that and should object.
Yes, in my suggested problem statement, I used notation common in mathematics. Still, the math community still says that such writing is English. GOOD.
Here is a big point: Computer science, computing, and software have no good substitute for such use of how mathematics is written. There is a partial substitute in practice if assume a particular programming language.
The error of n instead of -n or the absolute value of n is small but is an example of a huge point: A subroutine needs a description. A subroutine is precisely something and should have a precise description. The problem here is so simple we can get by with being imprecise as you suggest, but computing is not limited to such simple routines, and the complicated ones also need precise descriptions. Net, we need to write precise descriptions. Knuth did this in TACP (yes, I abbreviate!); it really is possible.
For assuming that 'add1' means an argument 1 passed to a function add is asking a lot even just in the context of software.
In passing a constant to a function there will be the issue, that should be addressed, of passing 'by value' or 'by reference'. In Fortran a constant is passed by reference with all the problems, side effects, and surprises there unto appertaining! We can't yet assume that Fortran is buried!
If my work does well and I start to hire, I will give a lot of weight to people who are good at reading and writing technical material and will prefer that they know how to read and write abstract math. For learning Visual Basic .NET, a book of code samples and a one week class should be able to do a lot. Net, I'm questioning if I should require new hires for programming to know any 'computing' at all.
It makes sense, because performance differences between that and the "optimal" solutions are usually negligible, and the "optimal" solutions aren't usually optimal when you take into account readability and maintainability.
Embedded development is significantly different than developing for the web. Buying more servers is usually more efficient than taking twice as long to develop and update code. Interviewing with people who refuse to acknowledge this is extremely frustrating.