
This tile forces a pattern that does not repeat. If you use the simpler shape you can tile a wall such that it never repeats, but you can also make a repeating pattern.


Money is fungible, personal data is not.


You'd think that, but personal data piles up quickly and the individuals who make it up become dispensable. It's a little like saying any particular dollar is unique; I suppose it is, but if you get enough of them nobody cares. I think expecting Facebook to treat all of their data as if it's ITAR is unfortunately unrealistic; we're not all that unique or special, and if you use the internet your data has already been leaked or is otherwise public.


Ha! “Fungible” is the word I settled on, too.


Other advantages of zero based indexing, beyond being 'closer to the machine':

It works better with the modulo operator: `array[i%length]` vs `array[(i+length-1)%length+1]`. Or you would have to define a modulo-like operator that maps ℕ to [1..n].

It works better if you have a multi-dimensional index, for example the pixels in an image. With 0 based indexing, pixel `(x,y)` is at `array[x + width*y]`. With 1 based indexing it is at `array[x + width*(y-1)]`. You might argue that programming languages should support multi-dimensional arrays, but you still need operations like resizing, views, etc.
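
As a rough C sketch of both points (function names are just for illustration):

    #include <stddef.h>

    /* Circular (wrap-around) access: trivial with 0-based indexing. */
    int wrap_get(const int *array, size_t length, size_t i)
    {
        return array[i % length];
    }

    /* A width x height image stored in a flat array, 0-based:
       pixel (x, y) lives at array[x + width * y].
       With 1-based indexing it would be array[x + width * (y - 1)]. */
    int pixel_at(const int *array, size_t width, size_t x, size_t y)
    {
        return array[x + width * y];
    }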


Another advantage is with ranges: 0-based indexing and exclusive ranges work well together. This is apparent with cursor positions in text selection.

Consider:

    Characters       h e l l o
    Cursor index    0 1 2 3 4 5
    Char index       0 1 2 3 4
    Range [0,3)     [0,1,2]
    Range [2,5)         [2,3,4]
    Range [1,1)       []
If we used 1-based indexing and exclusive ranges, it leads to ranges where the end index is greater than the string's length...

    Characters       h e l l o
    Cursor index    0 1 2 3 4 5
    Char index       1 2 3 4 5
    Range [1,4)     [1,2,3]
    Range [3,6) (!)     [3,4,5]
    Range [2,2)       []
but if we use inclusive ranges, it leads to ranges where the end index is less than the start index...

    Characters       h e l l o
    Cursor index    0 1 2 3 4 5
    Char index       1 2 3 4 5
    Range [1,3]     [1,2,3]
    Range [3,5]         [3,4,5]
    Range [2,1] (!)   []
Also:

    Characters            h e l l o
    Cursor index         0 1 2 3 4 5
    0-based range [0,3)  [0,1,2]
    1-based range [1,4)  [1,2,3]
For the 0-based range [0, 3), the left array bracket is at cursor index 0, and the right bracket is at cursor index 3. With 1-based indexing it doesn't work like that, because the range is [1, 4).


This is the bane of my existence working with Lua.

Iterating an array or adding to the end are fine (we have ipairs and insert for that), but with ranges on strings I'm constantly having to think harder and write more code than necessary.

I love the language, wouldn't trade it for another, but the 1-based indexing on strings, which represents an empty string at position 3 as (3,2), is egregious.

Not as egregious as a dynamic language where 0 is false though.


> a dynamic language where 0 is false though.

Well, C also considers 0 to be false (and you can argue that C is "dynamic"!).


Yeah, offsets are just easier to mathematically manipulate than ordinals. It's not just pointer arithmetic where it matters that item i corresponds to start+i*step. Any time you want to convert between integer indices and general linearly-spaced values, 0-based indexing is more convenient.
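
A small sketch of that correspondence (helper names are just illustrative):

    #include <stddef.h>

    /* With 0-based indices, converting between an index and a linearly
       spaced value is a plain multiply/divide, with no +1/-1 fixups. */
    double index_to_value(double start, double step, size_t i)
    {
        return start + (double)i * step;
    }

    size_t value_to_index(double start, double step, double value)
    {
        return (size_t)((value - start) / step); /* truncates toward the bucket start */
    }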


In many domains code maintenance is more important than hardware costs. In many domains 1-based indexing is a better fit, meaning less conversion code, meaning simpler code. Thus, the best indexing choice depends on the domain and circumstances, as do many controversial questions. Most tend to specialize in specific kinds of domains and over-extrapolate their experience into other domains.


I'd agree, so why did languages that allow specifying the base die out (other than maybe VBA)?


Because changing the base is a great source of bugs.

That said, not all languages have given up on this. For example Julia allows it. https://docs.julialang.org/en/v1/devdocs/offset-arrays/

Ironically, I learned this from a discussion of Julia bugs. Apparently changing the offset of arrays has proven to be a source of bugs in Julia. So maybe someday they will come to the same conclusion as languages like Perl and stop allowing it.


I'd argue 0-base is a source of bugs too! Ideally we'd be able to catch more array indexing bugs at compile time - there are definitely cases where it should be possible to determine that arrays are being incorrectly indexed via static analysis.


The problem is that libraries which assume 0-base break when you have a 1-based array. And vice versa. Trying to combine libraries with different conventions becomes impossible.

Therefore changing the base leads to more bugs than either base alone.

That said, the more you can just use a foreach to not worry about the index at all, the better.

Of 0-based and 1-based, the only data point I have is a side comment of Dijkstra's that the language Mesa allowed both, and found that 0-based arrays led to the fewest bugs in practice. I'd love better data on that, but this is a good reason to prefer 0-based.

That said, I can work with either. But Python uses 0-based and plpgsql uses 1-based. Switching back and forth gets... annoying.


I'd expect the compiler not to let you pass a 0-based array to a library function expecting a 1-based array. I'm pretty sure that's how it worked with Visual Basic, which was the only language I ever used such a feature in.


You are demanding a lot from the type system.

Search for OffsetArrays in https://yuri.is/not-julia/ for practical problems encountered in trying to make this feature work in a language whose compiler does try to be smart.


The type system does tell you if this is used. `::OffsetArray`.


Yes, but did the programmer tell the type system that they are expecting an OffsetArray, they have tested it, and it will work correctly?

The existence of a mechanism does not guarantee its correct use. As that link demonstrates.


A disadvantage comes to mind. While the following loop works as expected:

    for (size_t i = 0; i < length; i++) ...
The following causes an unsigned integer underflow and is an infinite loop:

    for (size_t i = length - 1; i >= 0; i--) ...


As Jens Gustedt points out[1], the following intentional unsigned overflow works perfectly for downwards iteration (even when length is 0 or SIZE_MAX), though it looks a bit confusing at first:

  for (size_t i = length - 1; i < length; i--) ...
You are also free to start at any other (not necessarily in-bounds) index, just like with ascending iteration.

[1] https://gustedt.wordpress.com/2013/07/15/a-praise-of-size_t-...


Principle of least surprise violated.

Also, that behavior is not guaranteed. The programmer would need to be aware of how the particular machine in question actually handles that.

Then again, that's C.


Unsigned integer underflow and overflow are both guaranteed to wrap by the C standard.


Uh, I'm confused, but I don't know C++.

why doesn't that loop end instantly?

I mean length - 1 < length should always be true, right?

Or does it only terminate when the number underflows? Terribly confused here


For loops are translatable from:

  for(initialize; condition; increment) { ... }
to:

  initialize;
  while(condition) {
    ...
    increment
  }
(more or less, some scoping things not encompassed by the above; this is also how pretty much every for loop in a C-syntax language works) The condition of a for loop is equivalent to a while loop's condition. So yes, length - 1 < length will be true on the first iteration, which is fine because the loop continues as long as that condition is true.

What the above approach takes advantage of is that when underflow eventually happens you'll have this condition:

  SIZE_MAX < length
That comparison is false for every possible value of length, so the loop terminates.


It's an unsigned int, so going below 0 it wraps around to the maximum value.


Ooh, i see. I wasn't aware that they under/overflow at 0 when they're unsigned. Thanks for broadening my horizon!


It loops while the condition is true. When an underflow happens, it stops being true.


It’s a condition to run, not a condition to stop


That was my reaction - why should anyone think it might be the latter? Are there languages that do have such a syntax without explicit keywords ("do...until")?


I think lisp or scheme does. I was often confused by that when I was playing with it


The loop continues until i transitions from 0 to 0 minus 1. 0-1 in this case actually doesn't equal -1 since size_t is an unsigned type, instead it wraps around to be the largest possible positive integer instead. TLDR; yes as you speculate it terminates when the number underflows.


The footnote [1] should be [0], just for the sake of this very topic.

Seriously though, while the idiom does work for unsigned integers, it's a bad idiom to learn [makes code reviews harder]. The post-decrement one in the loop body works with everything (signed/unsigned), and it's well known.


In that specific case I'd do the following:

    for (size_t i = length; i-- > 0; ) ...
Or count from `length` down to 1 but subtract 1 when indexing in the loop body, or count up and subtract from the length in the loop body. Any modern compiler should be able to optimise these to be equivalent.

In the majority of cases, counting down is not necessary. Nor is ordered iteration. Most languages have a `for each` style syntax that's preferable anyway.


Ah yes, the goes-to operator -->


And the wink operator, so the compiler knows you know the deal


I like how GP is called "operator-name" and, instead of doing it himself, he gets others to joke about operator names. Although I'm not sure if it's altruism or highly manipulative behaviour.


Gold - have an upvote!


Or alternatively:

  size_t i = length;
  while (i--) ...


Unless I am remembering C wrong, this would work but the

   for (size_t i = length; i-- > 0 ; ) ...
that several other people posted would not execute for index 0. Shouldn't it be this instead?

   for (size_t i = length; --i > 0 ; ) ...


The correct way/idiom to reverse-iterate an array is

  for (size_t i = length; i-- > 0; )...
It's surprising how often the issue pops up; it works well with both signed and unsigned integers.

(edit) I started with one-based indexing (BASIC)... mixed with 0-based (assembly), more 1-based (Pascal), then more stuff (all zero-based). I have yet to see a real advantage of one-based indexing... after the initial adjustment.


Is this cache friendly?


Sure, why wouldn't it be? As far as a cache is concerned, I don't think reverse sequential iteration would be any different than forward sequential. The actual RAM accesses may be less optimal if there's some speculative pre-fetching with assumed forward sequential access, but that's conjecture.


With some exceptions, hardware prefetch works in terms of ascending accesses. To learn whether a particular CPU will prefetch for descending access, benchmarking is essential. Best to use software prefetch calls if performance is critical.


I would suspect that the cache prefetch/prediction could use the "velocity" of the memory access to predict the next access; so if the access pattern was going backwards, the "velocity" would be negative, but prefetching would still work if it just followed the predicted pattern.


It is not, unless your compiler is smart enough to recognize reverse iteration and prefetch appropriately.

If performance matters, you should experiment with __builtin_prefetch, which is available in clang and GCC.
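
For example, something along these lines; the prefetch distance of 8 elements is only a guess and would need benchmarking:

    #include <stddef.h>

    /* Reverse iteration with an explicit software prefetch (GCC/Clang builtin).
       The second argument 0 means "read", the third is a temporal-locality hint. */
    long sum_reverse(const long *array, size_t length)
    {
        long sum = 0;
        for (size_t i = length; i-- > 0; ) {
            if (i >= 8)
                __builtin_prefetch(&array[i - 8], 0, 1);
            sum += array[i];
        }
        return sum;
    }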


It’s not. It was nice on architectures where cache didn’t matter much and where subtracting and comparing to zero was just one instruction (looking at you, old ARM cores).


In the C programming language unsigned integers do not overflow. They wrap. This is well-defined behaviour and the example code is simply incorrect. Most modern compilers will give you a diagnostic for this.


If you have foolishly turned off -W in your build system, that could happen. Otherwise, you get a nice warning pointing out your folly.


Unless wrapping underflow is sensible for the domain (which it isn’t when representing the size of something), unsigned integers are usually a bad idea.


You can always rip a page out of C++’s playbook:

    for (size_t i = length; i > 0; i--) {
        // ...
        item = array[i - 1];
    }
(This is how reverse iterators work in C++.)


  for (size_t i = 0; i < length; i++) {
      size_t j = (length - 1) - i;
      ...
  }
EDIT: change i to j


That's quite broken (you'd want a different variable inside the body, vs clobbering the iteration counter, else this would process the last item in your list, then exit).


I hope you haven't done this anywhere.

Makes my brain hurt, but I think this will only run through the loop one time looking at the last element of the array.


Uh no don't reassign the loop variable in the inner scope. Use:

    const size_t j = (length - 1) - i;
in that case. Much safer.


You should save the old value of i somewhere and restore it back at the very end of the loop. Or simply define a new j like another comment says.


A rare case where 1-based indexing is more convenient is complete binary trees laid out breadth-first (as in a standard binary heap): parent is i div 2 and children are 2i and 2i+1 when starting at one and who knows what when starting at zero. But that’s the only one I know.


Except 1-based indexing is what we use in normal language. We don't use "zeroeth" or "player (number) zero" etc. And the word "first" is shortened to 1st etc. Personally I think we'd be better off if programming languages stuck to the same convention - off-by-1 errors aren't the hardest problems to deal with but they're still annoying.



Sure, it's technically a word, but hardly one you'd casually drop into your conversations (with non-programmers)


> "player (number) zero"

That would be because any game worth playing has at least one player... and so it's natural to continue from there. (In terms of language.)


You might have heard of Conway's Game of Life.


Fair :)


That's confusing ordinal and cardinal numbers. The element with index 0 is the first element. The element with index 1 is the second element, and so on.

Using the term "zeroth" is basically some form of showing off (even though it's kinda fun), but will be utterly confusing when you get to the fifty-second element which is the last in a group of 53 elements.


I'm not confusing them, my point about abbreviating "first" as 1st was that in typical speech we start counting at 1. Nobody says "let's start with item zero on the list". But programmers are stuck with having to say/think "item 0 in the array".


I don't disagree with you. I just don't think a programmer should be confused about the statement "item 0 is the 1st item".


The 2 major blunders in programming: 1) Off by one errors.


With 0-based indexing the children are at 2i+1 and 2i+2. The parent is at (i-1) div 2.

Not hard to figure out.
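
Both conventions side by side, as a minimal sketch:

    #include <stddef.h>

    /* Binary heap index arithmetic, 1-based (root at index 1). */
    size_t parent1(size_t i) { return i / 2; }
    size_t left1(size_t i)   { return 2 * i; }
    size_t right1(size_t i)  { return 2 * i + 1; }

    /* The same, 0-based (root at index 0; parent0 assumes i > 0,
       since the root has no parent). */
    size_t parent0(size_t i) { return (i - 1) / 2; }
    size_t left0(size_t i)   { return 2 * i + 1; }
    size_t right0(size_t i)  { return 2 * i + 2; }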


> With 0-based indexing the children are at 2i+1 and 2i+2. The parent is at (i-1) div 2.

> Not hard to figure out.

While that's true, "you just shift by 1" applies equally well to all arguments for or against 0-based indexing, so deploying it here probably won't convince anyone.


I was not trying to convince.

That said, the effort of one versus the other is so trivial that there is no point in ever using effort as an argument either way. Doubly so because what seems like effort to us is simple unfamiliarity.

What is important is which one leads to more careless errors in practice. As a trivial example, consistent indentation takes effort, but failing to do it leads to more careless errors. Therefore everyone indents code.

The only data point I've seen on that is the side remark about Mesa in https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/E.... That remark, therefore, is the only argument that I care about.


An incredibly unrare case happens all the time in my work: array[length - 1], or variations of this. Anything involving the last element of the array, and often iterating through the whole array will use something similar at some point.


This is the reason I always think of when I hear the question.


I think the 1-indexing folks would have to argue that a%b should return a value from 1 to b inclusive. This does make the same sort of intuitive sense as 1-indexing. For example we number clocks from 1 to 12.
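
In C, that clock-style operator would look something like this (the helper name is hypothetical):

    #include <stddef.h>

    /* Maps i = 1, 2, 3, ... onto 1..b, the way clock faces go 1..12.
       Assumes i >= 1, otherwise the subtraction would wrap. */
    size_t mod1(size_t i, size_t b)
    {
        return (i - 1) % b + 1;
    }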


% is defined as (mostly) a remainder operator. I don't think you want to change the semantics of division itself.


Right. Ultimately the 1-indexers are wrong.


But interestingly, the number at the top is 12, not 1.


No, we don't. Clocks start at 0. What time is it when it's half an hour after midnight? 00:30.


but nobody "says" zero o'clock - it's always twelve o'clock!

and the example is in 24hr format - which needs to have the 00 to differentiate it from being 12:30. But if you write in 12hr format, you don't ever use 00 - it's always 12:30am or 12:30pm


First of all, you don't. Or I guess most Americans don't. I do, or most Europeans do. 12.30am and 12.30pm feels just very wrong in Europe.

But yeah, I do agree with you. We don't say 0, we say 12, no matter if it's noon or midnight. That's because we humans avoid saying zero when we mean zero.

It's the same with other comments here, talking about counting seconds: we say "ok go, one, two, ..." and not "zero, one, two, ...". We say a 3-month-old baby, not a 0-year-old baby.

We use 0-indexing in so many things; we just avoid saying the word "zero", and we use other names or other units to avoid that word.


I'm from Sweden, and I certainly would say 00:15 (as zero fifteen). Although the time at exactly 00:00 would be called midnight.


12 is zero, in 12-hour clocks.


The ergonomics of this are really bad without some kind of implicit parameters and trait resolution. Without trait methods, ad-hoc overloading becomes impossible. You would have to explicitly specify what + operator and what == operator you use everywhere. After all, there could be multiple different additions or comparisons defined for a type, so which one do you mean by `+`?

Another problem is that a data structure's invariant can depend on the trait implementation. For example, a search tree's ordering invariant depends on a particular comparison. This means that the comparison has to be stored in the struct, with the corresponding runtime overhead. With traits you know that there is only a single implementation for any particular type, so even if the trait v-table is passed as an argument to every method call, there is no risk of different implementations being used at different times.
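
In C terms (no traits, no overloading), that corresponds to carrying the comparison around as a qsort-style function pointer stored in the struct; a rough sketch of the shape of the problem:

    #include <stddef.h>

    typedef int (*cmp_fn)(const void *a, const void *b);

    struct tree_node; /* details omitted */

    /* The ordering invariant of the tree depends on cmp, so cmp has to live
       in the struct, every comparison is an indirect call, and nothing stops
       two trees over the same element type from using different comparators. */
    struct tree {
        struct tree_node *root;
        cmp_fn cmp;
    };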


> This means that the comparison has to be stored in the struct, with the corresponding runtime overhead.

The runtime overhead is solvable in principle by making the comparison a const-generic parameter of the data structure type. But to do this properly requires dependent types, because the type of that const-generic parameter is taken from an earlier parameter in the same definition. It's a dependent sum type, sometimes called a dependent record.


From my understanding, compile-time dependent types (which is all that's needed to mimic statically dispatched generics) are easy, and C++ templates support them (I haven't tried Rust const generics but I hear they're quite limited). Runtime dependent types are much less common, and (given my limited understanding of dependent types) I could describe trait objects (&dyn Trait) as a dependent pair (ptr, Trait-shaped vtable containing *fn taking ptr's type) (though rustc hard-codes support for trait objects, and its job is easier since it never lets ptr change runtime type while its target is used for a &dyn Trait).


Rust's const generics today are limited to the built-in integral types (so, char, bool, and the various sizes of signed or unsigned integer) and the constant must actually be an obvious constant, so e.g. a literal, or something you could assign to a const in Rust. So yes that's much less flexible than C++.

Some of the flexibility in C++ here is desirable, much of it is either wanking or outright dangerous. C++ will allow a float generic for example, which is obviously a footgun because good luck explaining why Thing<NaN> and Thing<NaN> are different types...

Rust expects to sooner or later ship Const generics for user-defined types, but to exclude the nonsense of "float generic" the likely rule goes like this: Your const parameter type must derive PartialEq and Eq. It won't be enough to implement them yourself as your implementation might be nonsense (e.g. misfortunate::Maxwell implements Eq but no Maxwell is equal to anything, including itself) you'll need to use the derive macro which promises the compiler these things actually have working equivalence so that Thing<A> and Thing<B> are the same type if A and B are equivalent.


>Some of the flexibility in C++ here is desirable, much of it is either wanking or outright dangerous. C++ will allow a float generic for example, which is obviously a footgun because good luck explaining why Thing<NaN> and Thing<NaN> are different types...

It's not nonsense, there are legitimate use-cases for floating point template params. A classic example is something like generateInlinedVersionOfFunction<someInt, someFloat>() that generates a class or function with some particular values inlined for efficiency. Using nan as a template parameter is stupid, but it's not especially more dangerous than using nan in general, since any issues specific to its compile-time usage will manifest at compile time, not runtime.

It does nobody any favours to just dismiss advanced features Rust doesn't support as dangerous and unnecessary just because most Rust programmers haven't come across a use-case.


> It does nobody any favours to just dismiss advanced features Rust doesn't support as dangerous and unnecessary just because most Rust programmers haven't come across a use-case.

I think requiring decidable equality for const generics is a fine approach for a limited MVP like what Rust is going with at present. Sure there are more general settings but they would need dependent types and the user would have to write proofs of type equality/apartness anytime the issue came up in a compile. Seems like a bit of a non-starter for now, even though it's effectively what enables non-trivial logical features like homotopy types.


> A classic example is something like generateInlinedVersionOfFunction<someInt, someFloat>() that generates a class or function with some particular values inlined for efficiency.

This smells very bad. Seems like a better choice is a macro, and I understand that in C++ the macros are awful and worth avoiding, but Rust has decent declarative macros for this type of work.

Do you have some real world examples?


Can't you do that in Rust with a macro?


You can always just convert your float to a bit-equivalent int and back for the type param if you really want it. I have never seen floats used for template parameters in the wild, other than party tricks like wrapping a float in a struct with an "epsilon" parameter used for the operator<. Which is all kinds of wrong.


AIUI, practical dependent types are always evaluated at compile time, but the difference with something like C++ generics is that the types can include references to program bindings that relate to runtime values. This means that dependent types can express proofs evaluated at compile time about the runtime properties of a program.


It matters if people figure out how to make your model output inappropriate things and start sharing screenshots on Twitter. What if you use a GPT model as part of a chat bot? Then you definitely don't want to risk any sexual or discriminatory output, for example. If that were a person talking, they would be fired.


There is no bot to fire here though, as it's not employed.


Depending on the type of puzzle, it might be possible to work backwards. For example here, if you can compute reverse-transitions you could run one or two steps of that, and put all states from which the goal can be reached in two moves into a HashMap. Then you run the forward search and stop as soon as you hit anything in that HashMap. In theory, you can reduce the runtime to find an n-move solution from O(bf^n) to O(bf^{n/2}), at the cost of more memory use. This can also be combined with A*.

Low-level optimization can be worth it:

* You can try to pack the game state into integers and use bitwise operations. An 8×8 board can be stored as a 64-bit vector, i.e. a `u64` (see the sketch after this list). If you know the edges of the board are never occupied, then moving around can be as simple as a bit shift (probably not for this game).

* A smaller state representation also means that HashMap lookups will be faster.

* Instead of using a pair of integers to represent a position, use a single integer and save a multiplication for every lookup into a grid.

* Add a ring of impassable cells around the board, instead of checking for the edges of the board each time.
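
A rough sketch of the u64 packing idea; the bit layout and helper names are just one possible choice:

    #include <stdint.h>
    #include <stdbool.h>

    /* Bit (x + 8*y) of a uint64_t marks whether cell (x, y) of an 8x8 board
       is occupied. */
    typedef uint64_t board_t;

    bool occupied(board_t b, unsigned x, unsigned y)
    {
        return (b >> (x + 8u * y)) & 1u;
    }

    board_t set_cell(board_t b, unsigned x, unsigned y)
    {
        return b | ((board_t)1 << (x + 8u * y));
    }

    /* If the edge cells are known to be empty, moving every occupied cell by
       one square is a single shift: <<1 is one step in x, <<8 is one step in y. */
    board_t shift_x(board_t b) { return b << 1; }
    board_t shift_y(board_t b) { return b << 8; }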


This is a great idea, and the low-level optimization tips are all excellent ones I have used in the past. I want to talk a little bit more about using bidirectional A* though, because I think it's very interesting. It's a great strategy in general, but this may be a case where it doesn't do as well.

Working backwards for this particular puzzle is very difficult because on each turn an actor may or may not move. This effectively increases the branching factor from 4 (one for each direction) to 4 * 2^n (for each of four directions, each actor may or may not have moved). In practice it would be lower than that upper bound, but it could still be significantly higher than the forward branching factor. A nice visualization for this is to think of your start and end states as points in space, and your A* searches as cones emanating from one point and growing toward the other. The angle of the cone roughly corresponds to your branching factor, and when your cones meet each other or a point, the search is done. If your branching factor is the same forwards and backwards, you can travel through much less space by searching forwards and backwards simultaneously. However, if your backwards branching factor is higher, then the cone from the end state will be much broader. This could travel through much more space than just doing a forward search.

This kind of behavior is very evocative of one-way functions, and makes me think it might be related to NP-hardness in some way. I'm really not qualified to prove these kinds of statements though. Maybe someone else can offer a more rigorous mathematical perspective?


For the quite similar puzzle Atomix, it also seems like the branching factor would be much higher for backward search because upper bounds are weaker, but you can show that on average the branching factor is actually the same [1]. I wonder if the same argument would work here.

[1] http://hueffner.de/falk/hueffner-studienarbeit-atomix.pdf Section 5.5


I wonder if it would make sense to do backward search, even if the forward and backward branching factors are very different. For example if the branching factor for forward search is 10 vs. 100 for backwards search, wouldn’t it make sense to do one step of backward search for every two steps of forward search? Or more generally log(f)/log(b) backward search steps for every forward search step, where the forward branching factor is f and the backward branching factor is b? (Balancing the two frontier sizes, f^k ≈ b^m, gives m ≈ k·log(f)/log(b) backward steps for k forward steps.)

This is all based on spontaneous intuitive ideas of mine and very superficial reasoning (and probably not even new).


An important input to this (and similar) algorithms is multiple sequence alignment, which tells the algorithm which parts of proteins are preserved between species and variants, and which amino-acids mutate together. So already it is relying on natural selection to do some of the work. And the algorithm will probably not work very well if you input a random sequence not found in nature and ask it to find the folding.


I hope not, as knowing how a novel mutation in a patient alters a protein would be extremely useful when trying to find disease causing variants.


I don't expect building a fusion power plant to become cheaper than a gas power plant. Both need steam turbines, cooling pipes, a big building, etc. So that would be a lower bound on the construction cost.

If solar+batteries can outcompete fossil fuel plants (while ignoring fuel costs), then fusion likely wouldn't be viable commercially. And if you look at [the data](https://en.wikipedia.org/wiki/Cost_of_electricity_by_source#...), we are already close to this point.


A legitimate interest is a use of personal information that is needed to fulfill a service. This would be something like a session cookie for storing the contents of a shopping cart, a site's preferences, or login information. Using a cookie is the only way to provide that, and the user is basically implicitly asking for something to be stored. It would be silly to have consent checkboxes like "before you can shop with us we need your permission to register what you want to buy" or "you give us permission to share your address details with the delivery company so they can actually deliver stuff to you".


Annoyingly, legitimate interest covers more than that - it also covers opt-in-by-default to direct marketing. Yes, if a customer registers an account or makes a purchase, you can opt them in by default on the basis of "legitimate" interest[0].

[0] https://ico.org.uk/for-organisations/data-protection-advice-...


Yeah, the problem with "legitimate interests" is they're being used for "build a marketing profile of you" and "send you targeted advertisements" anyway, with the excuse that they're interested in doing that as the basis of their business.


I'm not saying I agree with it, but just for the sake of playing devil's advocate - what if the business legitimately makes its revenue by serving ad content on its site to its users?


What if a business legitimately makes its revenue by polluting the air around it?

Maybe that business should fail.


This seems like a respectable position as long as you don't ever complain about paywalls, geographical blocks, or the quality of journalism.

Seems like many commenters want the businesses to both fail and provide them with expensively produced content for free.


Journalism survived quite well before a few companies started following our every step and selling dossiers on us.

In fact, its quality was better, and it lived mostly on advertising.


Then it needs a new business model.


> A legitimate interest is a use of personal information that is needed to fulfill a service.

No, it's not. If you need it to fulfill a service, then you are covered by (b) of Article 6 GDPR I cited earlier:

processing is necessary for the performance of a contract to which the data subject is party or in order to take steps at the request of the data subject prior to entering into a contract;

Legitimate interest under (f) would be something that is not strictly needed to provide the service but (1) beneficial to the processor and (2) does not unduly negatively affect the data subject.


> I'd be interested to see what a sufficiently strong QuickCheck specification of this problem would look like.

I would write something like this in haskell:

    spec :: [Integer] -> Property
    spec xs =
       length xs <= 2  ==>  fun (intercalate "," (map show xs)) == sum xs
This captures the three requirements, but not the implicit fourth requirement that the function throws an exception for other inputs.


Nor does this exercise the trimming of the substrings, for example. This is good for testing the happiest path, I agree. I was interested in the tedious testing of all the unhappy paths.


> not the implicit fourth requirement that the function throws an exception for other inputs.

You could probably generate invalid inputs by taking a list of strings as input. Though of course at that point the property test has to reimplement half the function.

That's an issue I often end up having with property tests: the oracle for interesting properties is as complex as the SUT, so you end up with two of them.

