Show HN: I started a repo for sharing algorithm implementations (github.com/kennyledet)
52 points by kennyledet on Dec 24, 2013 | 47 comments



I like this idea. Here's how I'd expand it if I had 100 hours to kill:

- Rename the repo to be shared and add yourself as a contributor, instead of branding it as yourself.

- Organize the files by src/ALGO/LANG/author-name.EXT. That way the emphasis is on the algorithm, then the language. I don't care much about who implemented it, other than to differentiate it from others' implementations for quality, etc.

- Provide a common-ish API. You don't want to stick your entire algorithm into main(), that's not useful. You want a nice clean interface, so that the consumers of this repo can copy-paste these clean implementations into their own code. Maybe remove main() entirely, and stick it into src/ALGO/LANG/main.EXT. That way I can consume an entire file from your repo without modifying it.

- Provide unit tests. If you separate out main(), you could stick all the tests into src/ALGO/LANG/tests.EXT. The value of this type of repo would be that I can grab a clean, well-tested copy of -sort and use it directly.

- Edit: Change from underscores to dashes in filenames. Underscores are for losers :). Use dashes whenever you can.

- Add lots more algorithms, not just basic ones.

- Get a few friends to add some algorithms, then thoroughly review them in comments on GitHub.

- Provide speed test comparisons of each implementation, stick them in src/ALGO/LANG/speed-tests.EXT

- Now that we are getting fancy with the testing, etc., provide a Makefile or some such for building the various test programs and running unit and speed tests on them.

- Edit: Stub out lots of potential algorithms. Even if you don't know how to implement them, create the directory and main.EXT for them, then open a ticket so that others can implement them. Give everyone that submits a pull request committer access, so they might work directly on this repo and not deal with any friction by having to wait for you.

There is probably more, but the above would eat nicely into that first 100 hours. I have no idea what you'd personally get out of this on the other end. Probably nothing. On the other hand, this could become the repository to check for a weird "2x faster with smaller memory usage" implementation of a red-black tree, or a clever general number sieve, etc.
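To make the speed-tests.EXT idea concrete, here is a rough sketch in C, assuming an int-array insertion sort like the one the repo starts with. The file path, `time_sort`, and `qsort_baseline` are made-up names for illustration, not anything in the actual repo, and `clock()` only gives coarse CPU time, so treat the numbers as a ballpark.

```c
/* Hypothetical src/insertion-sort/c/speed-tests.c; all names are illustrative. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Algorithm under test: sorts a[0..n-1] in ascending order, in place. */
void insertion_sort(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        /* Shift larger elements one slot right to make room for key. */
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Three-way comparator for ints, written to avoid overflow from subtraction. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Baseline with the same signature, backed by the standard library's qsort. */
void qsort_baseline(int *a, size_t n)
{
    qsort(a, n, sizeof *a, cmp_int);
}

/* Times one run of sort_fn on n pseudo-random ints generated from seed.
 * Returns CPU seconds, 0.0 for n == 0, or -1.0 if allocation fails. */
double time_sort(void (*sort_fn)(int *, size_t), size_t n, unsigned seed)
{
    assert(sort_fn != NULL);
    if (n == 0)
        return 0.0;

    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return -1.0;

    srand(seed);                      /* same seed gives every sort the same input */
    for (size_t i = 0; i < n; i++)
        a[i] = rand();

    clock_t start = clock();
    sort_fn(a, n);
    clock_t end = clock();

    free(a);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A main.EXT next to it could then call `time_sort(insertion_sort, n, seed)` and `time_sort(qsort_baseline, n, seed)` with the same seed and print both results side by side.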


Sorry I replied to this one late. I was waiting til I had a bit more free time to reply to this one seriously.

You have a ton of great suggestions in here, all taken into consideration and this will definitely help the repo be much better by tomorrow. I will get on it. Thank you.

Most helpful post in the thread vs. "lel it's already been done xD links to wiki site"

>On the other hand, this could become the repository to check for a weird "2x faster with smaller memory usage" implementation of a red-black tree, or a clever general number sieve, etc.

Precisely my aim. Thanks for understanding this!! Apparently other people think I'm trying to make it a primary learning resource or something...


Missing from the discussion seems to be the subject of licenses. Some of the files have one and some don't (making using the code murky), so you should probably establish a rule of an explicit BSD license or something. "Public domain" isn't suitable as some countries make it difficult or impossible for people to give up copyright (that's why CC0 exists, but I'm not sure how well that works for code).


Ah yes, you're right. I didn't think far enough into it before I pushed the initial commit. I guess I was just excited, but this is perhaps one of the most important changes that need to be made. Thank you for bringing this to the forefront of my mind, it will be a new convention now.



Nicely done!



Have you considered contributing to LiteratePrograms[0] (or such) instead?

This is great practice, and I encourage you to go for it, but for those interested in looking at algorithms implementations, or looking to cut their teeth and contribute implementations, I'd encourage you to look at already existing algorithm repositories.

[0]http://en.literateprograms.org/LiteratePrograms:Welcome


Again, the purpose of this is to not use wiki-based collaboration but leverage the power of Git and Github.

>instead

Look guys, this is not trying to replace anything. It's something new. It should be supplemental to one's journey, I'm not trying to pass this off as some attempt to be even the main resource for learning algorithms. That would be foolish.

>This wheel was invented, but it's better for traction in the snow

We really need to stop holding ourselves back so much in this field, it's not really proper for us.

Thanks for the link though, it will be added.


Why post this when you only have a single insertion sort in C? Practically, this has zero use compared to books which include algorithms, illustrations, and code.


Release early, release often?


Whether you recognize it or not, you face a choice between: a) keeping your name on said repo, and b) substantially higher growth rate. When people do things for free, they don't do it so someone else can put their name on it. (It's irrelevant whether this was your "intention" -- I'm not saying, I'm just saying.)


Well he also puts his name in the insertion sort file at the top named, wait for it.. "kennyledet_insertion_sort.c":

// Kendrick Ledet 2013

As if git blame wasn't enough, let's stick our full names in every file prefixed with our name. People have got to know that I'm the one who created this sample algo code!! <waits for applause>


> let's stick our full names in every file prefixed with our name. People have got to know that I'm the one who created this sample algo code!!

I'm not exactly sure what you're getting at here, or why you're trying to be funny in what I wanted to be a relatively serious affair. The implementation as well as the comment were both written way before I even had the idea for this repository.

Also,

1. Kennyledet just so happens to be my Github name. The full name comment at the top is not a listed convention.

2. The convention of putting your github username is actually a feature. Do you suggest just clumping everything together under one algo folder, where filenames could be similar and possibly conflict? :)

Again, it's not about people discovering me, it's about making this popular on Github so more practicing programmers can share, discuss and learn new algorithms. Please think more logically here..

P.S., I know what git blame is. Don't try to undermine the point here. I wanted the file structure to be flat and adhere to those conventions for more reasons than you probably think I have. You're welcome to start your own repo and try and get it as active as this one has within 2 hours...Then I'll close mine.

Thanks.


I'm gonna get on that repo, because that's what my criticism implies, creating my own repo. Yes.


I don't think you read far enough in the README to see the conventions man.

Please don't get the wrong idea. Clearly it's not about me. If you have a suggestion to keep things more "unbranded", let me know. I could just put it under an organization.


RosettaCode is a site that might be of interest.

http://rosettacode.org/wiki/Rosetta_Code


Yes, that's actually one of the sites that led to this idea, which in my mind is superior in many cases. For instance (and probably the only instance that matters), we get to leverage the power of Git and Github over just wiki-based editing.

Also, I think the organizational style on Github is more suitable for what this has the potential to be.

I would love if I could clone the code on RosettaCode to my machine, add things, and send the change requests right back from the power of the command line. We're both programmers, so of course we know it's doable with code. And I could go ahead and code that script. However, it's not really practical.

Thanks though, you just gave me a new idea for a command line utility (not just limited to Rosettacode's wiki)

P.S., I will add a link to it in the README.


That can be done and should be done. Go to IRC #rosettacode on freenode and ask whether one of the admins will add: http://www.mediawiki.org/wiki/Template:WikimediaGitCheckout or whether you can contribute something to their wiki's source code. It IS GPL, so you will be able to retrieve the code! :)


The shortcomings you mention with Rosetta Code living in Wiki format are some of the same motivations for Ward Cunningham's new federated wiki project.

See: https://github.com/WardCunningham/smallest-federated-wiki


I would suggest not putting the algorithm into a main function, as is done in your sort, but into a separate function with a proper signature.
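As one possible shape for such a signature, here is a sketch that borrows the parameter list of the standard library's qsort, so callers can sort any element type. The name `insertion_sort_generic` and the `cmp_int` comparator are assumptions for illustration, not what the repo uses.

```c
#include <assert.h>
#include <stddef.h>

/* Swaps two non-overlapping elements of `size` bytes each. */
static void swap_bytes(unsigned char *x, unsigned char *y, size_t size)
{
    while (size--) {
        unsigned char t = *x;
        *x++ = *y;
        *y++ = t;
    }
}

/* Insertion sort over n elements of `size` bytes starting at base.
 * cmp follows the qsort convention: negative, zero, or positive.
 * Equal elements are never swapped, so the sort is stable. */
void insertion_sort_generic(void *base, size_t n, size_t size,
                            int (*cmp)(const void *, const void *))
{
    assert(cmp != NULL && size > 0);
    unsigned char *a = base;

    for (size_t i = 1; i < n; i++) {
        /* Walk element i leftwards while its left neighbour is larger. */
        for (size_t j = i; j > 0 && cmp(a + (j - 1) * size, a + j * size) > 0; j--)
            swap_bytes(a + (j - 1) * size, a + j * size, size);
    }
}

/* Example comparator for ints; avoids overflow from plain subtraction. */
int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}
```

A caller would write something like `insertion_sort_generic(v, 5, sizeof v[0], cmp_int);`, with main() living in its own file. The byte-wise swap keeps the sketch free of allocation; a tuned version would likely copy the key once instead of swapping at every step.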


I've started the same thing before.

Right now I have algorithms implemented in Go, Java and C/C++

https://github.com/learnalgorithms/datastructures

Accepting contributions :)


I suggest you participate in online programming competitions. For me it was more helpful than just reading books.


I agree with this very much. Check out http://uva.onlinejudge.org for example contest problems with automatic judging.


I had discovered this link a few days ago. Added it to the resources section


I don't see where you think I just read books, lol, but thanks for the suggestion man.

I have been planning to join a few competitions, and definitely some hackathons in 2014.


There are a handful of websites showing algorithm implementations, and they are on edu sites. Unless you plan to write each algorithm in multiple popular languages...


How do you ensure the correctness of the implementations? Shouldn't there be tests?


Precisely. This is something that I have plans on implementing, but obviously I don't have the whole day 24/7 to manage this repo man. I'm doing other things right now (work), that's why I took so long to even answer the initial comments in this thread.


Tests and behavior should be defined before the code.

I recommend you comment out all code, write the tests and then un-comment the code line by line.

This not only tests the code, but tests the tests and assures that you are covering the edge cases the code contains already.
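A small sketch of what that could look like in C, assuming an int-array insertion sort. `check_case` and `run_insertion_sort_tests` are hypothetical helper names, and qsort from the standard library serves as the reference answer. If you stub out the body of `insertion_sort` (the "commented out" state), the reversed, duplicate, and extreme-value cases should start failing, which is the "testing the tests" part.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CASE 16   /* largest input the fixed-size scratch arrays can hold */

/* Code under test: sorts a[0..n-1] in ascending order, in place. */
void insertion_sort(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Returns 1 if insertion_sort and qsort produce identical output, else 0.
 * Agreement with qsort covers both ordering and "no elements lost or invented". */
static int check_case(const int *input, size_t n)
{
    int got[MAX_CASE], want[MAX_CASE];
    assert(n <= MAX_CASE);

    memcpy(got, input, n * sizeof *input);
    memcpy(want, input, n * sizeof *input);

    insertion_sort(got, n);
    qsort(want, n, sizeof *want, cmp_int);

    return memcmp(got, want, n * sizeof *got) == 0;
}

/* Runs every edge case and returns the number of failing cases. */
int run_insertion_sort_tests(void)
{
    static const int empty[1]   = {0};              /* passed with n == 0 */
    static const int single[]   = {7};
    static const int sorted[]   = {1, 2, 3, 4, 5};
    static const int reversed[] = {5, 4, 3, 2, 1};
    static const int dups[]     = {3, 1, 3, 1, 3};
    static const int extremes[] = {INT_MAX, 0, INT_MIN, -1};
    int failures = 0;

    failures += !check_case(empty, 0);
    failures += !check_case(single, 1);
    failures += !check_case(sorted, 5);
    failures += !check_case(reversed, 5);
    failures += !check_case(dups, 5);
    failures += !check_case(extremes, 4);
    return failures;
}
```

A tests.EXT file would just call `run_insertion_sort_tests()` from main() and exit non-zero when the count is not 0, so a Makefile target can pick up failures.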


Update: Unit tests are now a convention.


The code is the test. If the code runs without compilation errors, then use it for healthcare.gov </sarcasm>


You forgot to open the sarcasm tag :)


That's part of the hidden DOM. Implicit. ;-)


Sorry, but explicit > implicit for markup.

Btw, how's that repo going? :)


Sorry, haven't started it yet ;-)



Dude, holy crap. I had this link bookmarked in the past, lost all my bookmarks (yeah, back then I was dumb enough to not back these things up) and never remembered the link. Many thanks.


Great Idea! I have something similar going at - https://github.com/prakhar1989/Algorithms only in Python at the moment. Contributions are more than appreciated! :)


Nice man. Python is one of my favorite langs, starred and watching.


As far as books go, I am partial to "The Algorithm Design Manual" by Steven Skiena. It has good explanations of how the data structures and algorithms work, as well as entertaining "War Stories" of algorists in action.


Ah yes, surprised I forgot to include this classic. Adding it in.


It is nice to have a place with algorithms of fairly common interest.

There is however a definitive resource on the web: http://calgo.acm.org


I use The Stony Brook Algorithm Repository: http://www.cs.sunysb.edu/~algorith/.


This is pretty comprehensive. I linked to it in the resources section


I'm sure there is a resource somewhere that can be used to jump-start this repo with a bunch of implementations.


Of course, but the purpose of this repository was to keep it to the implementations of active participants in the repo.



