
Ask HN: How to deal with OSS license infringement? - reikonomusha
I have found a project which happens to use code from a friend of mine without abiding by the license. The repository and relevant issue are here:

https://github.com/bkkcoins/klondike/issues/5#issuecomment-25203253

The responses have not been constructive, unfortunately.

I am wondering how cases like these are routinely dealt with in the open source community. Is there any recourse, short of litigation, to get such a simple change made so that a project complies with the borrowed software's license? What would you do if your open source code was "stolen" and capitalized on?
======
dalke
[Summary: Both algorithms derive from the SHA2 specification. There's no
copyright infringement here.]

To make sure I understand, the project uses modified code, where the original
code (at
[https://github.com/thomdixon/pysha2/blob/master/sha2/sha256....](https://github.com/thomdixon/pysha2/blob/master/sha2/sha256.py))
is under the MIT license, but the modified version (at
[https://github.com/bkkcoins/klondike/blob/master/utils/sha25...](https://github.com/bkkcoins/klondike/blob/master/utils/sha256.py)
) is under the GPLv2-or-later, and doesn't attribute the original source.

To answer your question, if the copyright holder believes that someone on
GitHub is infringing their copyright, then the easiest thing to do is to have
the copyright holder send in a DMCA notice. See
[https://help.github.com/articles/dmca-takedown](https://help.github.com/articles/dmca-takedown).

That said, I don't agree that there is copyright infringement. The argument is
that:

> The only real differences is that the derived code has been made a bit more
> compact and less readable, the capitalization of the class has changed, a
> method name and a few field names were modified, and a little bit of the
> message/data plumbing was changed. Otherwise the core computational aspects,
> down to the order of operations, destructuring assignment, variable names,
> and auxiliary routines are all the same—all things that are hard to
> duplicate coincidentally.

However, if you look at
[http://en.wikipedia.org/wiki/SHA-2#Pseudocode](http://en.wikipedia.org/wiki/SHA-2#Pseudocode)
you'll see that it also has the same core computational aspects, variable
names, etc. These all come from a common source - the SHA specification. Take
a look at
[http://csrc.nist.gov/groups/STM/cavp/documents/shs/sha256-38...](http://csrc.nist.gov/groups/STM/cavp/documents/shs/sha256-384-512.pdf)
and compare.

What this means is that the different implementers used the same reference,
not that one implementer infringed the other's copyright. While it's true that
these are 'all things that are hard to duplicate coincidentally', there's no
coincidence here: both follow the specification.

Now, set aside the parts which are based directly on the standard and look at
what remains. Those remaining parts are quite different. Compare:

    
    
        def digest(self):
            mdi = self._counter & 0x3F
            length = struct.pack('!Q', self._counter<<3)
            
            if mdi < 56:
                padlen = 55-mdi
            else:
                padlen = 119-mdi
            
            r = self.copy()
            r.update('\x80'+('\x00'*padlen)+length)
            return ''.join([struct.pack('!L', i)
                   for i in r._h[:self._output_size]])
    
    

-to-
    
    
        def finalize(self):
          tailbytes = self._length & 0x3f
          if tailbytes < 56: padding = 55 - tailbytes
          else: padding = 119 - tailbytes
          self.update(b"\x80" + (b"\0" * padding) +
                struct.pack("!Q",   self._length << 3))
    
    

The `& 0x3f` mask is another way to say "mod 64". It's something that a C
programmer would think is faster, but profiling in Python just now shows that
the two have identical performance.
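
The equivalence is easy to check for yourself. This is only a rough sketch: the
helper names `tail_mask` and `tail_mod` are made up for illustration and appear
in neither repository, and the timings will vary by machine and Python version,
so no numbers are shown here.

```python
import timeit

def tail_mask(n):
    # Keep the low six bits: bytes sitting in the last, partial 64-byte block.
    return n & 0x3F

def tail_mod(n):
    # Same quantity, written as a remainder.
    return n % 64

# The two agree for every non-negative byte count tried here.
assert all(tail_mask(n) == tail_mod(n) for n in range(100_000))

# Rough timing of the two expressions; print rather than assert, since
# the result depends on the interpreter and the machine.
mask_time = timeit.timeit("n & 0x3F", setup="n = 123456789", number=100_000)
mod_time = timeit.timeit("n % 64", setup="n = 123456789", number=100_000)
print(f"mask: {mask_time:.4f}s  mod: {mod_time:.4f}s")
```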

Overall, both implementations of the padding are what you would expect from
the algorithm.
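
To see why anyone following the specification would land on the same 55 and 119
constants: the padded message is the data, one 0x80 byte, a run of zero bytes,
and an 8-byte length, and the total must be a multiple of 64. A small check,
using a hypothetical helper `pad_len` that is not taken from either project:

```python
def pad_len(total_bytes):
    """Zero bytes needed after the 0x80 marker (illustrative helper only)."""
    tail = total_bytes & 0x3F
    return 55 - tail if tail < 56 else 119 - tail

for n in range(1000):
    # data + 0x80 marker + zero padding + 64-bit length field
    padded = n + 1 + pad_len(n) + 8
    assert padded % 64 == 0
    assert pad_len(n) >= 0
```

The second branch exists because a tail of 56 or more bytes leaves no room for
the marker and the length in the current block, so the padding spills into one
more block.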

The last parts of the implementations, however, are quite different. The first
makes a copy() then reaches into the copy's internals. This means people can
continue to update the hash even after the digest was generated. This matches
the API from Python's hashlib.

The latter uses an in-place modification of the hash, and the name 'finalize'
reflects that it's a different implementation. It modifies state in-place, and
requires a get_bytes() to get the actual bytes. This doesn't match the Python
hashlib API at all.
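
For comparison, this is the hashlib behaviour that the first implementation
mirrors and the second does not. It uses only the standard library; the input
strings are arbitrary examples.

```python
import hashlib

h = hashlib.sha256()
h.update(b"hello ")
first = h.digest()      # digest() does not finalize the object
h.update(b"world")      # so further updates are still allowed
second = h.digest()

# Each digest reflects everything fed in up to that point.
assert first == hashlib.sha256(b"hello ").digest()
assert second == hashlib.sha256(b"hello world").digest()
assert first != second
```

An in-place finalize() cannot support this pattern, because the padding has
already been mixed into the running state.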

What I see is that the parts derived from the NIST specification are similar,
though not identical, and the parts which don't follow naturally from that
specification are quite different.

Therefore I disagree with you, and say that there is no infringement.

