Given your current setup, it might be easiest to just take one bin of identical pieces and then sort it by color into various bins. I'm salivating over what this would look like.
If you posted that video, I bet it would rack up millions of YouTube views.
Any good ideas for that first stage are very welcome. I've looked at all kinds of mechanisms (belts, pickers, vibrating drums), and so far none combine the required speed and regularity with being quiet enough to be used in a home.
A combination of pins and holes will get you roughly sized items.
A vibrating table with lots of W-shaped channels underneath is going to help too.
A belt with "scoops" is a great way to feed out of your initial hopper. If you drop it onto a perpendicular belt (one with W grooves), you're going to get a good initial separation more naturally.
Air knives and puffers are your friend as well. Since you already have a detection system running it twice might be of benefit to you, just puff off anything that you can't sort as a single item.
I don't think "quiet" and "lego" really go together, so no happy solution for you there other than sound isolation.
Massive hopper (non-tapered if the tapering causes jams), feeding into slowly rotating (60 rpm) spiral-inner drum, falling onto very fast moving (10 feet per second) belt.
Parts get spread into a ~2 inch thick layer inside the drum, and then falling (at least a foot) onto the fast moving belt will then likely separate them further. For parts that end up entangled, end up on the belt too close together, or are unrecognisable in the orientation they fall, simply redirect them back into the top of the hopper to be resorted.
Even a 50% successful sort rate should be good enough.
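To put a rough number on why a 50% success rate can be good enough: if rejected parts just go back into the hopper, the number of trips a part makes follows a geometric distribution. A minimal sketch, assuming every pass succeeds independently with the same probability; the function names and the 30 parts/second belt rate are only illustrative.

```python
def expected_passes(success_rate: float) -> float:
    """Mean number of trips over the belt before a part is sorted."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


def effective_throughput(belt_rate: float, success_rate: float) -> float:
    """Parts per second that actually end up in a bin."""
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be in [0, 1]")
    return belt_rate * success_rate


# Assumed example: a belt that presents 30 parts per second.
print(expected_passes(0.5))
print(effective_throughput(30.0, 0.5))
```

So at 50% each part crosses the belt twice on average, and the price is halved throughput rather than lost parts.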
I think that a couple of tables, cones, and dividers (to create lanes of pieces) could do a reasonable job of turning a heap of pieces into a steady stream of individual pieces.
That reminds me of the economist George Stigler's observation: "If you never miss a plane, you're spending too much time at the airport".
If you take 1000 planes in your life, and spend an extra 1 hour per plane in order to be sure that you will be on time, that's 1000 hours in your life. If, instead, you cut down that hour to 30 minutes, and that makes you miss 3 planes in your life, you'd still have "earned" 500 hours, while losing 3 planes. It's a better trade off overall.
But yeah, just an analogy, I bet the guy never missed a plane...
I guess that's a rhetorical question, because the answer is the same to what you wrote; nvm.
I went and bought a duel deck the other day to play with the wife so this is going to become another issue in the future again.
A while back, I sorted my childhood LEGO pile before sending it to my brother. 73 pounds of LEGO sorted by hand. I'm sure it says things about how strangely wired my brain is, but this was an eminently enjoyable endeavor.
I found that by looking for pieces with certain shapes, my brain/eyes would lock in after a little while and it got really easy to spot them as I sifted through the pile.
It's interesting how much you learn about how our minds work by doing a job like this. All kinds of insights into color discrimination, shape recognition and so on, as well as how much of a job gets pushed down to our subconscious once we've done enough of it.
An idea: I have a three-month-old daughter who I'll definitely be buying Lego for when she gets old enough. What I'll be looking for is whatever's the least valuable in your collection -- big bins of cheap mixed stuff to see what she invents. Maybe that would be a cool side output for your project? Cheap chaff packs for kids of hackers ...
I've looked on eBay, but most of what I see in bulk there are lots of just one size or one color bricks, or just pounds of completely random pieces. I'd love it if I could find something like an assortment of 1,000 mixed 1-by-1/2/3/4/6/8 bricks in 4-5 colors, or the same with 2-by-N bricks.
Any aesthetic Lego project you're building would have color restrictions. Also, I imagine some colors are worth more than others, just because they're used in more projects.
Also, color separated Lego bins just look pretty.
Is it easier to find the functional piece you need in a bin full of same-colored pieces, or to find the color you need in a bin full of same-function pieces? For me, the latter would be far faster/easier.
That said, I do agree that individual pieces labeled by function and color is probably ideal for sale to buyers looking for a single elusive piece (or groups of pieces).
You're right about the need for individual separation when selling parts. Bricklink does that, so that people in need of, say, exactly four orange cheese wedges, can go and buy them. That seems to be the preferred style for most builders, since it makes possible a mapping from a parts list to a purchase order; there are other styles in use, including bulk sale by mass or approximate quantity, but the individual price for a given part is much lower when sold in bulk than alone. So if it can be made feasible to automatically, or mostly automatically, sort down to the individual part level by type and color, that'll show the best return on investment per part at the point of sale.
(Update: based on jacquesm's comments elsethread, I don't suppose I can any longer recommend selling on Bricklink. But I'd imagine the pricing effects I describe are similar elsewhere.)
If you have a bin of the same pieces in various colors, you just pick the color you want.
Shape discrimination is a 3-dimensional task. In the worst case you need to change your point-of-view or the orientation of the object.
You'd need to solve the inverse problem of how to efficiently go from bin to packing container, which doesn't seem too hard. Rotating carousels of bins with little robots or something. Set the item on a tray, and if it isn't the right item, puff it back into the bin.
The general nature of neural nets for sorting and classification I think is well recognized in industry and there are many companies that are capitalizing on this (and have been for some time). I highly doubt I'm doing anything original in that sense.
What does anyone else think? Should Jacques open it up for investment?
For 100K I will gladly sidestep all those problems and self-finance.
If it were me, I'd position it as an AI startup. I think Jacques might overestimate how awesome existing solutions are for sorting. Customer provides belts and bins, he provides software. The lego is a proof of concept and prototype.
Or he could do a lego website startup, and expand his project and purchase inventory, and operate a warehouse of lego. Bootstrap this way until the software is ready for general use.
Or he could construct systems like the one he has built and sell them to other lego resellers, and expand into industrial or agricultural sorting in time.
If the CAD models are not available, how much work is it to make a CAD model given an instance of a particular component? I'd expect that for some components, such as those NxM plates, you'd only need to build one model by hand in CAD software, and then could algorithmically generate the models for other sizes.
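A minimal sketch of what "algorithmically generate the models for other sizes" could look like for plates. It only derives the outline and stud positions, not a full CAD solid, and it assumes the commonly cited nominal dimensions (8 mm stud pitch, 3.2 mm plate height); `PlateModel` and `make_plate` are made-up names.

```python
from dataclasses import dataclass

# Commonly cited nominal dimensions in millimetres (an assumption, not from the thread).
STUD_PITCH = 8.0
PLATE_HEIGHT = 3.2


@dataclass(frozen=True)
class PlateModel:
    width: float          # extent along the n direction, mm
    depth: float          # extent along the m direction, mm
    height: float         # plate thickness without studs, mm
    stud_centers: tuple   # (x, y) centre of every stud, mm


def make_plate(n: int, m: int) -> PlateModel:
    """Derive the outline and stud positions of an n x m plate."""
    if n < 1 or m < 1:
        raise ValueError("plate dimensions must be at least 1 x 1")
    centers = tuple(
        ((i + 0.5) * STUD_PITCH, (j + 0.5) * STUD_PITCH)
        for i in range(n)
        for j in range(m)
    )
    return PlateModel(n * STUD_PITCH, m * STUD_PITCH, PLATE_HEIGHT, centers)


plate = make_plate(2, 4)
print(plate.width, plate.depth, len(plate.stud_centers))
```

One hand-built template plus a generator like this would cover the whole NxM family; parts with slopes, clips or hinges would still need individual modelling.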
I find it hard to believe that there are only 1421 bricks, including color combinations.
Example: "Apple with leaf", Color Family: Dark Green, Exact Color: Bright green, Category: Foodstuff Element ID: 4107050, Design ID: 33051, orientation 8: https://sh-s7-live-s.legocdn.com/is/image/LEGOPCS/4107050_s8
...This also makes me wonder whether it would make sense to set up multiple cameras, to get multiple orientations for every block. Use the image that has the lowest classification error, but use all orientations to train the network.
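A small sketch of the "use the best image" part. At run time the true classification error isn't known, so this assumes the classifier's top probability is a usable stand-in for it; `best_view` and the input layout (one list of class probabilities per camera view) are hypothetical.

```python
def best_view(view_probs):
    """Pick the most confident view.

    view_probs: one list of class probabilities per camera view.
    Returns (view_index, class_index, confidence).
    """
    best = None
    for view_idx, probs in enumerate(view_probs):
        if not probs:
            continue  # skip views that produced no prediction
        class_idx = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[class_idx]
        if best is None or confidence > best[2]:
            best = (view_idx, class_idx, confidence)
    if best is None:
        raise ValueError("no usable views")
    return best


# Two views, two classes: the second view is far more certain.
print(best_view([[0.6, 0.4], [0.1, 0.9]]))
```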
The images are not very useful to me because the images the machine takes are very different from those. But maybe it would somehow contribute. I'd be very wary of using other people's copyrighted content though.
I do use multiple orientations.
I believe he mentioned somewhere that he's using mirrors to achieve essentially the same effect with a single camera.
But, that said, were I looking to buy parts, I'd hesitate to do so in bulk by these categories, unless the price were very good indeed - between the inability to match a parts list to a purchase order, and the extensive manual labor required to sort parts sold by mass at a useful degree of granularity, they'd need to be very cheap in order to make the purchase worth my while.
(On the other hand, most of my builds back in the day were spacecraft of various fictional types, and I heavily favored complexity, which means a large number of small parts, designing in wiring channels for LED lighting, and the like. I'm not sure how representative my comments here are of any other style.)
Already did that for a batch, about 150 partial sets as a result. Those would then need to be completed and checked by hand, I don't think that can be done profitably, but maybe there is some way.
> But in terms of just selling off excess Lego looking at the mix of the builder sets would let you figure out a sort of SKU you could put together and sell on Ebay or else where as a 'builder pack 1', 'builder pack 2' etc.
Yes, and right now I'm looking for what the contents of those packs should be.
* Run the machine over a sample of raw parts, just counting what it sees. Use that estimate to predict which sets can be completely made with high certainty.
* Next choose 6 types of set which can be made, and program the machine to put into 6 bins the correct parts with the correct ratio to form the sets. Anything uncertain or not within the ratios goes in the junk box.
* Now rerun the contents of each box, this time programming the machine to deposit one complete set into each bucket, again discarding parts which don't have a high probability of being correct. Make it beep when it believes a set is complete.
* Now you have an operator weigh the complete set, and if it matches the correct set weight, bag and ship it (self sealing padded bag and auto-printed label), and if not, throw it all back into the input hopper.
Total human time per set should be sub 30 seconds, so assuming a large enough market, it sounds profitable. Assuming 300 pieces per set and 5 parts sorted per second with a 50% failure rate, you would need 4 machines to keep a human busy for the 2nd part of the process. The first part is slower, but could run without supervision.
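One way to read that arithmetic, as a sketch: it assumes the operator's 30 seconds is counted per set that passes, which is the reading that reproduces the figure of 4. If the operator also spends 30 seconds on every failed set, the answer halves. The function name is made up.

```python
import math


def machines_per_operator(pieces_per_set, parts_per_second,
                          failure_rate, operator_seconds_per_set):
    """Machines needed so that one operator always has a good set waiting."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    seconds_per_attempt = pieces_per_set / parts_per_second
    # Failed attempts are thrown back, so good sets arrive less often.
    seconds_per_good_set = seconds_per_attempt / (1.0 - failure_rate)
    return math.ceil(seconds_per_good_set / operator_seconds_per_set)


print(machines_per_operator(300, 5, 0.5, 30))
```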
Assuming you have a 95% part recognition accuracy first time round, and 99.9% the 2nd time (since you are selecting from what is already believed to be 95% the correct part mix for the job, and it's fine to have a very high reject rate), most sets of <1000 parts will be all correct. The weighing step will then probably weed out >75% of errors, and the remaining errors are likely only color related and not so serious.
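For a feel of how per-part accuracy compounds over a whole set, here is a quick sketch. It assumes recognition errors are independent from part to part, which real errors (for example systematic color confusion) may well not be.

```python
def p_all_correct(per_part_accuracy: float, parts_in_set: int) -> float:
    """Chance that every part in a set is right, assuming independent errors."""
    return per_part_accuracy ** parts_in_set


for size in (100, 300, 1000):
    print(size, round(p_all_correct(0.999, size), 3))
```

Under that assumption the chance of a flawless set drops steadily with set size, which is where the weighing step and the reship-on-complaint policy come in.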
For the remaining 1%, just send out a new pack entirely if the customer complains. It adds 1% to your costs. Big deal.
I wonder, though, if down the road you could set up a web site with the complete inventory of pieces that you cataloged, and let people pick and choose what pieces they want for a specific order.
Then just put the pieces through the hopper to pick/pack individual orders. Maybe get a super accurate scale, weigh pieces when cataloging and weigh the end sets to help ensure order accuracy?
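The weighing check could be as simple as the sketch below. The part names, per-part weights and the tolerance are all made-up values for illustration; a real tolerance would depend on the scale and on how much used parts vary.

```python
def expected_weight(order, part_weights):
    """Sum of cataloged per-part weights (grams) for an order."""
    return sum(part_weights[part] * qty for part, qty in order.items())


def weight_matches(measured_g, order, part_weights, tolerance_g=0.05):
    """True if the packed order weighs what the catalog says it should."""
    return abs(measured_g - expected_weight(order, part_weights)) <= tolerance_g


# Hypothetical catalog and order.
weights = {"brick_2x4": 2.5, "plate_1x1": 0.5}
order = {"brick_2x4": 2, "plate_1x1": 4}
print(expected_weight(order, weights))
print(weight_matches(7.02, order, weights))
```

A weight match can't tell apart two wrong parts that happen to weigh the same, so it is a sanity check rather than proof.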
If you start copying the piece assortments ("lego recipes") that lego also sells, that's where I'm unsure that you're in a safe legal spot. Lego put time and effort making sets for specific purposes, and I would think that those "lego recipes" are possibly protected.
Especially if you were thinking of selling a large number of different sets, which would mean you'd be creating kind of a "lego recipe" catalog?
There's likely someone who's willing to pay X% of the set's price for a bucket that has a 95% chance of having all of the pieces, and a 100% chance of having too many pieces, including some that don't belong in the set, since you'd want to err on the side of caution.
But even that might not be profitable.
E.g. minifigs can get high enough prices in some instances that the total value of a set can be higher if you sell the minifigs and higher value bricks separately, because some people want to buy just the minifig, or need some piece that only appears in a few sets.
This would be my main issue with buying Lego to resell - I'd want to be very sure I'm not buying from other resellers where there'd be a risk it'd been stripped of the more valuable pieces.
Coincidentally, it kind of simplifies the sorting problem too. Just do like Amazon and don't sort!
Instead, use the machine to figure out exactly what was in that bin you just bought. Give the bin an ID, and store it as is. Then, when putting together a set, have your software find the minimum number of bins you need to pull from to assemble the set. Run them through, and have the machine pull the parts.
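Finding the true minimum number of bins is a set-cover style problem, so a sketch like the one below settles for a greedy approximation: keep pulling whichever bin covers the most still-missing parts. It assumes the per-bin counts recorded by the machine are accurate; `pick_bins` and the bin IDs are hypothetical.

```python
from collections import Counter


def pick_bins(needed, bins):
    """Greedily choose bins until the parts list is covered.

    needed: {part: quantity} for the set being assembled.
    bins:   {bin_id: {part: quantity}} as recorded when each bin was scanned.
    Returns the chosen bin IDs in pull order (not guaranteed to be minimal).
    """
    remaining = +Counter(needed)  # unary plus drops zero or negative entries
    available = {bin_id: Counter(parts) for bin_id, parts in bins.items()}
    chosen = []
    while remaining:
        def coverage(bin_id):
            # How many still-missing parts this bin could supply.
            return sum(min(qty, available[bin_id][part])
                       for part, qty in remaining.items())

        best = max(available, key=coverage, default=None)
        if best is None or coverage(best) == 0:
            raise ValueError("inventory cannot complete this set")
        remaining -= available.pop(best)  # in-place subtract keeps positives only
        chosen.append(best)
    return chosen


bins = {"A": {"2x4": 3, "1x1": 1}, "B": {"1x1": 5}, "C": {"2x4": 1}}
print(pick_bins({"2x4": 3, "1x1": 2}, bins))
```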
- the sets have to be absolutely perfect to be worth something
- you need to check to make sure all parts are present. This is hard enough for Lego, which starts from known quantities and brand new parts; with second-hand parts and people being ultra picky about such details as year-of-manufacture of the bricks, it becomes an intractable problem.
- Rare parts are really rare. In fact, the only parts you would have to document like this are the rare ones, and 'rare' is actually one of the categories that I can sort for. This gives the option to sell only the rare parts for a certain set and to leave the bulk parts to some other method. Much easier to do that profitably.
Said another way, assemble sets around the rare needles you have stumbled across unintentionally while sorting the haystack.
And then there is the 'in-crowd', which does everything it can to get rid of newcomers. All in all I'm not too impressed with Bricklink in its current incarnation; the Lego Group, which used to quietly promote Bricklink, has gone completely silent on them.
A final item about bricklink that I don't like is that they tend to sue or send C&D letters to other Lego related sites for the use of 'their' content whereas that content is all user generated, it's just that Bricklink overnight went from having all that content owned by the contributors to claiming copyright on images and terminology. They really pulled a Gracenote/Freedb stunt here.
Imo they don't have a leg to stand on with 'their' numbering system because it is based on Lego mold numbers to begin with.
Then you'd go and pick the bins, run them through, and the machine assembles the sets from those bins.
That's similar to the way Amazon does it. Now they have the shelves on robotic trolleys that bring them directly to the packers, but that's just a required efficiency at their scale.
I guess the problem with this scheme is that you move the problem from classifying to identifying... twice. So the precision requirement goes up. I don't know how big your dataset would need to be to require minimum human intervention.
The closest I've come to this is where you identify which sets are present in a batch by doing a trial on a sample (that's a pretty easy statistical job), and then sort directly into sets starting from the largest sets down. That way you reduce the parts count very rapidly. So, I did this for a bit and now have 18 60 liter crates of almost complete sets which all need to be manually completed and checked. Again, not profitable.
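The "largest sets first" pass can be sketched on counts alone. This is a simplification that only ever commits to complete sets and ignores recognition errors and part condition, which is exactly where the manual completion work comes from in practice; `allocate_sets` and the data layout are assumptions.

```python
from collections import Counter


def allocate_sets(inventory, set_definitions):
    """Build as many complete sets as possible, largest set first.

    inventory:       {part: quantity} counted by the machine.
    set_definitions: {set_name: {part: quantity}}.
    Returns (list of built set names, leftover Counter).
    """
    stock = Counter(inventory)
    built = []
    by_size = sorted(set_definitions,
                     key=lambda name: sum(set_definitions[name].values()),
                     reverse=True)
    for name in by_size:
        recipe = set_definitions[name]
        if sum(recipe.values()) <= 0:
            continue  # an empty recipe would loop forever
        while all(stock[part] >= qty for part, qty in recipe.items()):
            stock.subtract(recipe)
            built.append(name)
    return built, +stock  # unary plus drops parts that ran out


built, leftover = allocate_sets(
    {"a": 5, "b": 2},
    {"big": {"a": 2, "b": 1}, "small": {"a": 1}},
)
print(built, dict(leftover))
```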
If you just want to do this to keep busy it is easy, if you want to make more than what you could make by flipping burgers it is surprisingly hard.
And picking them is stupendously hard due to the large variety in shapes and sizes.
Ok, should be much better now. Ironically, your helpful cache link made things worse because it kept people clicking on a version of the page with huge images, I totally forgot to resize them. Now I've copied the small images over the larger ones so that this link works too and that appears to have solved the issue.
You just need to re-train on those classes instead, but I bet the front-end of the TensorFlow network would already be trained well enough that re-training for new classes would be very fast.
So each item is actually identified right down to the size, colour, type, etc, but then literally bucketed into a group of similar items?
This kind of stuff fascinates me. I've worked on software for package sorting machines in Amazon warehouses. Very similar idea to this: identify, remove from conveyor at right place/time. Only the machines are millions of dollars, run at very high speeds, and use barcode scanning for identification.
Does anyone know if there's a realtime movie anywhere?
Once the parts are on the recognition belt things move very fast: a single part can be scanned and recognized in about 30 milliseconds, so roughly 30 parts / second.
Making the machine much faster then means making that feed mechanism faster than it is today (roughly 1 part per second tops) and that's the challenge I'm looking at right now. Ideas more than welcome! Even bad ones, you never know which bad idea with a twist can become a good idea or even a very good idea.
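The numbers above make the bottleneck easy to see: the slowest stage sets the pace for the whole machine. A tiny sketch using the two rates mentioned (about 30 ms per recognition, about 1 part per second from the feeder); the stage names are just labels.

```python
def pipeline_rate(stage_rates):
    """Return (slowest_stage, parts_per_second) for a chain of stages."""
    slowest = min(stage_rates, key=stage_rates.get)
    return slowest, stage_rates[slowest]


rates = {
    "feeder": 1.0,               # roughly 1 part per second tops
    "recognition": 1.0 / 0.030,  # about 30 ms per part
}
print(pipeline_rate(rates))
```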
I assume you've thought of this but I'm still curious.
Really cool stuff by the way!
I'm not familiar with the project but this seems like an interesting challenge.
There are some really good suggestions in this thread, the one that stands out for me most is the brush edged screw feeder, I think that one has most properties just right to work.
Would double the speed and sounds like the rest of it is fast enough that you wouldn't even need to worry about parts falling on top of each other.
Ideally parts would land on the belt in a very regular rhythm, without any kind of irregularity whether the parts are 8 mm round 1x1 plates or 16 long 1x1 bricks. You need about 3 cm of space on either side to make sure the right part gets blown off the belt, that's the most important thing that will eventually limit the speed.
Oh, another factor limiting the speed is the maximum acceleration over the stretch towards the camera, otherwise the parts are still moving on the belt when they reach the camera and then you get bad images.
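The spacing requirement turns into a throughput ceiling once a belt speed is fixed. A rough sketch, assuming neighbouring parts can share the 3 cm gap between them and that the belt speed (not given in the thread) is a free parameter; lengths are in millimetres.

```python
def max_parts_per_second(belt_speed_mm_s, part_length_mm, clearance_mm=30.0):
    """Upper bound on parts per second for evenly spaced, identical parts."""
    if belt_speed_mm_s < 0 or part_length_mm <= 0 or clearance_mm < 0:
        raise ValueError("speeds and lengths must be positive")
    pitch = part_length_mm + clearance_mm  # leading edge to leading edge
    return belt_speed_mm_s / pitch


# Hypothetical 500 mm/s belt: 8 mm round plate versus a 16-stud (128 mm) brick.
for length in (8.0, 128.0):
    print(length, round(max_parts_per_second(500.0, length), 2))
```

Long parts eat into the ceiling much more than the clearance does, so a mixed feed will run nearer the lower figure.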
Initially, after stepping on one in the middle of the night, any that I found left on the floor went right into the trash. I assume he dug them back out, being a kid and all. However, he learned.
Eventually, he amassed a giant collection. With his permission, I gave them away to a kid that lives nearby. There were six large tote boxes and several dozen box sets. That kid now wants to be an engineer, though my son went into biology.
So, they are really great and it's even better that the old blocks still work with the new blocks. Just look out, they're caltrops, on par with a four-sided die.
Yeah, watch out for them when you're not wearing shoes. Evil buggers.
Order all of them.
If I'd decided to sell out of my collection instead of giving it away, I'd have had some parallel cases - any large enough collection is liable to include a small but significant proportion of one-off parts. I'm not sure a 2x4 black brick is likely enough to sell that it'd be worth listing alone, but I suppose someone with a sufficiently completist bent might go to the trouble.
It sounds like your existing hopper is reliable but slow? If that's the case, could you just make two (perhaps somewhat smaller) hoppers and merge their outputs?
How about maximizing it instead? Try to build bridges as long as possible? Or some other fitness function?
In theory you could control the vibration sequence and the part feeding sequence (once you have sorted all the parts :) ) and use an evolutionary algorithm...
Essentially it would sort pieces according to their free-fall terminal velocity. You could exploit the Venturi effect to make the bottom of the tube have a higher velocity than the top, or simply add bleed-air holes at progressive levels. There could be take-off doors at different levels of the tube. A nice property is that if the ingress and egress mechanisms are designed elegantly, it could run continuously.
(Although, that last point is one reason I've never built it: it seems like there would be a fine line between "air powered sorter" and "air powered rock tumbler" and I worry it could wear the corners of all my Lego bricks if it goes wrong.)
Another potential issue is that there might not be much logical relationship between the pieces which share similar terminal velocities, and different pieces may have different or metastable terminal velocities depending on initial orientation.
Obviously this wouldn't be a perfect sorter, but it could be a good pre- or post- sorter for your visual system. And for my purposes it might work well since instead of adapting the system to suit my sorting preferences I could simply train my sorting expectations based on what I learn the machine tends to sort together.
Hmmm now that I have an industrial dust collector installed in my workshop, I actually already have the bulkiest / most expensive part of this invention. I just need a tall acrylic tube and I could try this!
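A back-of-envelope sketch of the orientation problem, using the standard drag balance v = sqrt(2mg / (rho * A * Cd)). The brick mass (about 2.3 g for a 2x4), its 16 x 32 x 9.6 mm envelope and a drag coefficient of 1.0 are all assumed values for illustration, not measurements.

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, air at sea level
GRAVITY = 9.81       # m/s^2


def terminal_velocity(mass_kg, frontal_area_m2, drag_coefficient=1.0):
    """Speed at which drag balances weight for a falling object."""
    return math.sqrt(
        2.0 * mass_kg * GRAVITY
        / (AIR_DENSITY * frontal_area_m2 * drag_coefficient)
    )


# Hypothetical 2x4 brick falling flat versus end-on.
mass = 0.0023
flat = terminal_velocity(mass, 0.016 * 0.032)
end_on = terminal_velocity(mass, 0.016 * 0.0096)
print(f"flat: {flat:.1f} m/s, end-on: {end_on:.1f} m/s")
```

With these assumptions the same brick nearly doubles its terminal velocity depending on how it is facing the airflow, so the take-off levels would overlap unless parts settle into a preferred orientation.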
This is a well-known thing in machine learning. As far as I know all ML algorithms have this property, where the learning curve is basically logarithmic and each additive increase in performance requires multiplicatively more training material.
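Taking the "basically logarithmic" description at face value, accuracy would look like a + b * ln(n) for n training images, and the extra data needed for a fixed gain is then a constant multiple of what you already have. A sketch under that assumption only; the slope value is hypothetical.

```python
import math


def data_multiplier(accuracy_gain, slope):
    """Factor by which the training set must grow for a given accuracy gain,
    if accuracy = a + slope * ln(n)."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return math.exp(accuracy_gain / slope)


# Hypothetical slope: each doubling of data adds 0.05 * ln(2) accuracy.
slope = 0.05
print(data_multiplier(0.01, slope))
print(data_multiplier(0.02, slope))
```

Doubling the desired gain squares the multiplier, which is the "multiplicatively more training material" effect.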
Jacques, any plans to publish your models or image dataset somewhere?
A university has asked for copies of the images that I have right now to make a project out of it.
The idea is that if you can get a dataset of Lego packs / models you can use your inventory to make them. You can offer sales of the kits based on dynamic info. You can also provide inventory dynamically to users to purchase parts or upload a build manifest which will be filled.
I think the 2 huge assets you have are sorting for processing AND fulfillment, and real-time inventory levels of verified authentic parts.
Also, I would make some sort of contest, auction fun game where a person can win the whole pile of discarded stuff. Idk what's in there but I bet someone would want it.
You'd probably need the set numbers so that it'd be easier to choose which set to associate a part with.
I'm more than happy to be a beta customer!
To scale... I've always wanted to rent a cargo ship!
You drop parts onto the variable speed belt from enough sources to get a good feed rate, and constantly vary the speed of that belt with video feedback to drop them onto the main belt with reasonably even spacing.
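A sketch of the speed calculation behind that idea. It assumes the main belt runs at constant speed and the camera can measure the distance between the next two parts on the feeder belt; the landing pitch on the main belt is then main speed times the time between drops. All names and units (mm, mm/s) are illustrative.

```python
def feeder_speed(main_belt_speed, spacing_on_feeder, target_pitch, max_speed=None):
    """Feeder belt speed that makes the next part land target_pitch behind
    the previous one on the main belt.

    Time between drops = spacing_on_feeder / feeder speed, and the main belt
    travels main_belt_speed * that time in between.
    """
    if spacing_on_feeder <= 0 or target_pitch <= 0:
        raise ValueError("spacing and pitch must be positive")
    speed = main_belt_speed * spacing_on_feeder / target_pitch
    if max_speed is not None:
        speed = min(speed, max_speed)  # respect the motor's limit
    return speed


# Parts 10 mm apart on the feeder, main belt at 500 mm/s, 50 mm pitch wanted.
print(feeder_speed(500.0, 10.0, 50.0))
```

Recomputing this for every upcoming pair of parts is what "constantly vary the speed" amounts to; the belt's limited acceleration would cap how well it can track.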
Possibly you could feed in parts lists from a huge amount of existing sets - have some sort of (handwavey learning thing or even markov chain), and generate mixes of pieces based on data sets from spaceships / cars / etc.
Edit: this is in response to your asking for feedback about categories/demand.
One guy made a little Lego car factory:
EDIT: Some discussion on alternatives: https://news.ycombinator.com/item?id=11913330
2. Supports Booters, script kiddies, and pay-to-play blackhat hackers
3. Fucks over Tor any chance they can get.
Proof for said allegations:
[Youtube video] https://www.youtube.com/watch?v=wW5vJyI_HcU
A: He continually berates Krebs because "he called and Brian Krebs never responded"
B: Cloudflare's CEO states that the booters (DDoS pay-as-you-go sites protected by Cloudflare) don't even pay, or pay with stolen credit cards, and admits it is "just a disaster".
C: Direct insult towards Krebs onstage "Well, who needs to actually ask questions as a journalist?" 48:08 [/Youtube Video]
look for user:eastdakota
page text:"Yes, you can see Brian's critique of us here:"
I'm not sure I see this from your perspective; multiple people in the crowd clap after your point "C", and much more clapping is heard when he sits down. If I were forced to classify the behavior (edit:specifically at 48:08) I would choose "smartass". The entire discussion is interesting in light of what happened recently with Uber/Kalanick.
He was invited on stage, and continually berates Krebs. He could have done that graciously, but instead he continually insults him. He admits that when the booters do pay, they pay with stolen CCs. And then, finally, he attacks Krebs' standing as a journalist.
Uh, no. Absolutely not. No way, no how.
I'll deal with Google, Amazon, Microsoft.. Whomever for a CDN. I'm not touching CloudFlare.
I personally appreciate the willingness to be true to oneself with a quick-witted reply rather than 100% PR-safe professionalism all the time. However, I don't mean to imply that your impression somehow should change or that you should change your decision! My world would be so much nicer if I could just avoid working with people that rubbed me the wrong way.
But the content is about whether CloudFlare will kick booters off their platform. And he defends them, EVEN after saying they don't even pay.
This isn't a "Should we let the KKK march in the streets?", or "Should $sexist group be able to spread propaganda?". This issue is directly about people who harm the fabric of the Internet, and the companies that knowingly allow it and continue to propagate it.
I get defending controversial things in terms of defending Free Speech. But these skiddies want to precisely use their "Free Speech" and actions to deprive others of it.