1) Air is blown through the product and any dust is taken out.
2) The product is run through a bunch of screens that take out anything too big or too small.
3) The product is put through a gravity separator to separate based on mass.
4) Finally, the product is put through an optical sorter (https://www.youtube.com/watch?v=O0gWUeqzk_o) which uses blasts of air to push out unwanted materials from a stream of falling product.
I'm sure you could use the same process for Legos. Not sure about how to distinguish between branded and unbranded Legos though.
What is the %age by weight of 'trash' versus 'good stuff' for such a sorter?
I do use screens for various pre-sorting stages, not shown in the article. The sorter is only good for parts up to 40 mm, and for anything that isn't a wheel or otherwise round, since round parts roll away while being imaged.
That's by far the bulk though so for me if it does that part well it is already more than worth it.
Branded/unbranded: spectrum is different (far more different than you would say by looking at it with the naked eye), weight does not match for the part (though this can be very close with really good fakes), logo on the studs is different.
I've been thinking about doing that gravity thing, but a bit more fancy: rather than just a binary sort, shooting parts in several directions. An alternative is a spiral slide at a steep angle where parts are fed in at the top and ejected when they reach the right bin.
That's a lot more complicated to make than what I have right now mechanically and also the time available for a classification operation would be much shorter, but it would allow for a much larger number of output bins without taking up a whole lot of space. So maybe a next generation, if I still need it (this one is going through piles of lego now).
Plans to launch a Lego sorting service? ;-)
Seven, so it takes multiple passes before it is done.
> Or was this more just a proof of concept thing?
Tough question :) No, it's for real: it really has to sort through the 2000 kg, but if it needs to be beefed up or changed to get to the end then I'll do it. The next step 'up' would be a machine designed from scratch incorporating all the lessons learned with essentially unchanged software. There are still some limitations that could be addressed but then you'd lose the training set and you'd have to start all over from scratch. That might be worth it to get the last 1% error or so, so if this ever becomes 'real' then I'd have to do that. I highly doubt it will get to that though. Time will tell.
> Plans to launch a Lego sorting service? ;-)
Not at present, though you're not the first person to think of that, parents with kids are suggesting I should make it mobile to visit people at their homes for $x / shot :) Still, that will only happen if I really have nothing better to do, which means likely never.
Don't sell yourself short. You built the thing, after all. Could be some fun road trips with, I'm sure, gracious hosts to entertain you during Lego sort. That could be a whole retirement life right there ! ;-)
The VCs should start lining up in ...five...four...three...
OP is familiar with the secondary marketplaces and knows that if he could classify and sort well enough to make sets, then he could potentially make money buying bulk and reselling sets on these marketplaces.
It could be a real business. He's already proven he has the chops to design it. If he doesn't, I don't doubt someone else will try.
Obviously Lego itself could do this, but they probably make more money from melting and recycling their own parts under their brand as new, like Apple does with its recyclers. Do we know if Lego is recyclable as new Lego by melting?
There was a short period that Lego allowed people to design their own sets and order them through Lego but it was so popular they had to shut it down.
People even buy new sets to sort them out just to get new bricks to combine into their own creations without having to buy them in bulk and not being able to use half of what they buy.
Could you make money just buying bulk, sorting out rare parts, and reselling those?
The price of buying a new (or used) box vs. the price of "Bricklink"ing the parts is usually pretty much in favor of the former, which makes sense since the latter involves more S&H fees.
In addition, new moulds or new colors for existing moulds come up all the time (yet new moulds are all designed in the same system, so that they increase the versatility of Lego rather than Playmobil-izing them). Therefore, advanced fans who design their own creations and buy bricks in bulk do get lots of new Lego boxes. For example the VW Beetle model, besides being really cool in itself, had a lot of azure bricks, including many shapes that had never been realised in that color. Likewise for some Architecture boxes.
 "New bricks are too specialized" is the Lego version of "HN is turning into Reddit"
There are plenty of people doing that but by hand. I figure that's about minimum wage, doing it like this should be quite a bit more lucrative.
Once Lego's patent expired it tried a fairly shady legal theory that the interconnect shapes were trademarked/trademarkable. They pursued this all the way to the Canadian Supreme Court where they lost unanimously.
I'm surprised that this strategy isn't more quicksort-ish, I was expecting a first pass into warm or cool colors, then another pass into large or small lego, etc.
As for the multiple passes, that's actually how it works, the first pass sorts into the 7 most common categories and then quickly works down from there. But the 'runoff' bin is by far the largest at the end of every run. After 3 passes the bulk has completely disappeared and most of the remains are sufficiently rare to be dumped into the right container individually. It goes pretty quickly: first pass: 7 bins, second pass: 7 bins again, but 49 if you put the first seven through again (but in larger volume accumulated from several runs on smaller lots). And so on. Since the bulk of the lego is in those first 49 and the value is in the remainder it doesn't take long to add value to a pile.
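The arithmetic behind the passes can be sketched roughly like this. It is only an illustration: `reachable_bins`, `sort_pass`, the part names and the idea that unknown parts share a single "runoff" bin are assumptions for the example, not the actual sorter code.

```python
from collections import defaultdict

def reachable_bins(outputs_per_pass, passes):
    """Upper bound on distinct piles if every output bin is fed through again."""
    return outputs_per_pass ** passes

def sort_pass(parts, classify, targets):
    """One pass: parts whose label is a target get their own bin,
    everything else lands in the shared 'runoff' bin."""
    bins = defaultdict(list)
    for part in parts:
        label = classify(part)
        bins[label if label in targets else "runoff"].append(part)
    return dict(bins)

# Toy lot where the "classifier" is just the part name itself.
lot = ["2x4 brick", "2x2 brick", "1x2 plate", "2x4 brick", "minifig head"]
first_pass = sort_pass(lot, classify=lambda p: p,
                       targets={"2x4 brick", "2x2 brick"})

print(reachable_bins(7, 2))   # 49, matching the "49" above
print(first_pass["runoff"])   # the rarer parts, to be sorted in a later pass
```

With 7 outputs the count grows as 7, 49, 343, which is why a few passes are enough to get the bulk out of the way.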
As for the mechanics, any obstruction, no matter how well intended will immediately become a point for a bridge to anchor to.
This isn't parallel invention - the principles of optical sorting and air ejection are well known and understood. (Which is not to lessen the achievement in building this, but building and inventing are not the same thing.)
Had I seen something like the optical sorter back then, I would have thought it (Arthur C. Clarkeian) magic!
I think that people might pay for the convenience of getting a bag full of proper chickpeas with no duds and no stones (I've heard of this, but never really seen it).
Maybe some day Suraj and NuPak will go a little further with their cleaning and get rid of damaged beans and not just stones.
Though, frankly I think they could do better than average with a specific gravity/densimetric sorter; it would probably require fewer passes through the optical sorter as well.
The equipment costs money; and reduces the volume of "product" left for sale. Your problem is capitalism - the companies don't do that because it lowers their profit.
Here's a fun one in Taipei: http://www.brickfinder.net/2017/03/22/taiwan-lego-store-visi...
(They also custom print on the surface of the parts; I saw an awesome Trump Lego man there complete with red hat.)
I bet these people would love to talk about this machine!
This is really amazing, awesome work!
What do they do with bodies that are facing backwards? Is there a little side channel that flips them around? Nope, they just get tossed back into the hopper to try again. It's effectively random which way they'll be facing when they get picked up, so they'll eventually get through.
I thought that was a nice illustration of keeping things simple and taking advantage of processes you already have, and seems similar in spirit to this.
Just a remark from a Lego nerd: it's attaching arms to bodies, not heads. The neck has a stripe on the front, so that is where the camera looks at to determine which side is facing up.
Maybe there's some defective minifig body with a mechanical flaw that causes it to always get picked up facing backwards, that's spent years being picked up and rejected over and over.
Adapt as needed :)
There's clearly not any air-tight seal. Most of the nubs on the top of pieces only have 3 or 4 points of contact with the piece they're connected to.
I made a mould out of Lego (to make Lego-shaped gummy candies) and with a thin coating of vaseline on the outside, it was indeed water-tight (and silicone-goop-tight too).
They're astonishingly well made for a children's toy.
That said, they've been cutting quality, moving production to China, using cheaper formulation plastic. That's a bad move imnsho. The brand image is quality and people pay for that, if they ever seriously let that go it's going to be game over for them.
Of course, chances are they'll reduce their production costs but maintain the high prices.
Do you have any specific set in mind?
He already has 31052 and 31051 which I think are excellent 3-in-1 sets. Would love to see more like those. They are about AU$80-120 which is a workable birthday/xmas pricepoint.
That's really not that much (1-999 μm). Keep in mind that sub-1mm is micrometer, and 1mm is a huge distance. I'm sure that LEGO bricks have tolerances of single or low double digits of micrometers...
The moulds are permitted a tolerance of up to two micrometres, to ensure the bricks remain connected.
For odd shapes -- clamping both ends then vibrating the grippers (just a little) could split bits.
It's an interesting problem. What percentage is still connected? It may be better to pay a slightly higher price for bits that are not connected (distribute the job out to the sellers).
Question: Were you able to utilise any data about Lego parts from Lego's own catalogues (current and historical) or technical specifications? It sounds like you trained the classifier manually. I imagine if you want to sort into sets you need to know what makes up a particular set... does Lego provide an API or anything regarding parts/sets?
Further to that, if you have pricing data on sets you have a nice little optimisation problem - given my metric ton of parts, what are the most valuable complete sets I can make?
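To make the optimisation question concrete, here is a minimal greedy sketch. Everything in it is hypothetical: the part names, the set recipes, the prices and the `greedy_sets` helper are made up for illustration. Greedy-by-price is just a heuristic; an exact answer would need something like an integer program.

```python
from collections import Counter

def copies_buildable(stock, recipe):
    """How many complete copies of one set the current stock allows."""
    return min(stock[part] // qty for part, qty in recipe.items())

def greedy_sets(inventory, catalogue):
    """catalogue maps set name -> (price, recipe Counter).
    Builds the most expensive sets first. Not guaranteed optimal."""
    stock = Counter(inventory)
    built = {}
    by_price = sorted(catalogue.items(), key=lambda kv: -kv[1][0])
    for name, (price, recipe) in by_price:
        n = copies_buildable(stock, recipe)
        if n:
            built[name] = n
            for part, qty in recipe.items():
                stock[part] -= qty * n
    return built, stock

# Made-up inventory and catalogue, purely for illustration.
inventory = Counter({"brick_2x4": 10, "plate_1x2": 4, "wheel": 4})
catalogue = {
    "car": (30.0, Counter({"brick_2x4": 4, "wheel": 4})),
    "wall": (5.0, Counter({"brick_2x4": 3, "plate_1x2": 2})),
}
built, leftover = greedy_sets(inventory, catalogue)
print(built)  # {'car': 1, 'wall': 2}
```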
I tried, but in the end a straight up train-correct-retrain loop took care of all the edge cases much more quickly and reliably than any feature engineering and database correlation that I tried before. This is roughly the fourth incarnation of the software and by far the most clean and effective. HN pointed me in the direction of Keras a few weeks ago, that coupled with Jeremy Howard's course gave me the keys to finally crack the software in a decisive way.
> It sounds like you trained the classifier manually.
Only the first batch, after that it was mostly corrections. What it does is while it classifies one batch it saves a log which gives me more data to feed the classifier with for the next training session. There are so few errors now that I can add another 4K images to the training set in half an hour or so.
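The log-then-correct step could look something like the sketch below. The log format, the image ids, the part numbers and the helper names are assumptions for illustration. The real software isn't published, so treat this as one possible shape of the idea rather than how it is actually done.

```python
def merge_log(training_set, log, corrections):
    """log: list of (image_id, predicted_label) written while sorting.
    corrections: {image_id: true_label} for the predictions fixed by hand.
    Returns the training set extended with every logged image."""
    merged = list(training_set)
    for image_id, predicted in log:
        merged.append((image_id, corrections.get(image_id, predicted)))
    return merged

def error_rate(log, corrections):
    """Fraction of logged predictions that had to be corrected."""
    if not log:
        return 0.0
    wrong = sum(1 for image_id, predicted in log
                if corrections.get(image_id, predicted) != predicted)
    return wrong / len(log)

# Made-up ids and part numbers.
training_set = [("img0", "3001")]
log = [("img1", "3001"), ("img2", "3003"), ("img3", "3001"), ("img4", "3020")]
corrections = {"img2": "3002"}  # the one mistake found while reviewing the log

training_set = merge_log(training_set, log, corrections)
print(len(training_set), error_rate(log, corrections))  # 5 0.25
```

The fewer errors the classifier makes, the less of the log needs touching, which would explain why adding thousands of images gets quick.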
> Further to that, if you have pricing data on sets you have a nice little optimisation problem - given my metric ton of parts, what are the most valuable complete sets I can make?
I'm on that one :) And a few others that are not so obvious. There is a lot to know about lego. Far more than you'd think at first glance.
I'd love to hear more about what you tried specifically. I'm considering doing this myself, and I was thinking of building a very large labeled dataset of 3d rendered images using the LDraw parts library and training on that. I could include hundreds of images per part by using different viewing angles, zoom levels, focus, etc in the rendering process. Did you try anything like that?
After endless messing around I finally bit the bullet and trained a neural net, from 0 to 100 in a few weeks and it is rapidly getting more usable now.
The feature detection code may get a second life though: as a meta-data vector to be embedded in to the net. But only if it is really necessary.
I'm quite curious though if you can get your method to work, especially for the parts that are very rare and rare colors.
I was assuming that at minimum I'd need to do a lot of filtering in order to get the camera images and renders into a state where they are similar enough to work for training.
Any chance that you'll be releasing source code for this project and/or your labeled dataset?
Yes, but not yet. It needs to get a lot better before I'm going to stamp my name on it as a release. Right now it is rather embarrassing from a code quality point of view, it has been ripped apart and put together several times now and every time it gets a lot better but we're not there yet.
Just so I understand the process correctly, did you manually sort some pieces to get a labeled training set, feed those through the machine, train the NN with that, and then manually correct the errors when sorting unknown pieces, added all those pictures to the same training set and then finally run the full training again? How many labeled images do you need to start getting acceptable performance? Are you training the NN continuously with every new image, or from scratch with an increasing data set?
Do you think a stereo camera would improve the classification in a meaningful way, or maybe a second camera from a different angle?
Yes, but that cycle repeats every day. So the training never really stops, it just runs at night and the machine runs during the day. Today it sorted close to 10K parts and those images will now be added to the training set and then I'll start the training overnight so tomorrow morning my error rate should be much better than it was today and so on.
> How many labeled images do you need to start getting acceptable performance?
Good question! Answer: I don't really know, but judging by how fast the error rate is improving, between 100 and 200 per 'class', so that will be 200K images or so when it is done with the 1000 most commonly found parts.
> Are you training the NN continuously with every new image, or from scratch with an increasing data set?
From scratch with every expanded set. I suspect that's the better way but I have no proof. My intuition is that it is hard to make a neural net learn something entirely new that it has not seen before and every day totally new stuff gets added. So I re-train all the way from noise.
> Do you think a stereo camera would improve the classification in a meaningful way, or maybe a second camera from a different angle?
You're getting close to the secret sauce :)
I guess my lack of knowledge in the field shines through. Continuous learning is apparently under active research at the moment, and this blog post about it is less than two months old, so your intuition was right.
If I were to guess the secret sauce I'd say that a mirror might be involved. Is depth information not worth the trouble for these kinds of classification problems?
You might be right there :)
> Is depth information not worth the trouble for these kinds of classification problems?
Yes, it would be, but there's much more to it than that. Also keep in mind that there are parts that are almost transparent and that no matter what background color you come up with there will be a bunch of lego parts that match it.
Colored strobes may also help separating out different color pieces, although I expect that would be overkill.
This way you could also cross-reference with part seller sites to see the going rate for a part and determine whether it's worth your time to separate it manually. Have a bin for rare parts worth separating by hand.
Yes, but that did not give me accuracy enough, so now I train from scratch. I had hoped to save having to train the conv layers.
> also, did you use batch normalization?
> also, did you try ResNets?
> you probably don't care at this point, but all of this would __massively__ decrease training time
Oh, I care all right :) I'm re-training the net every evening (it's running right now) after adding another batch of training images.
I've played with OpenCV and tried for fun to train a HAAR cascade classifier to recognise a minifigure. It didn't work, which made me realise one has to really understand what goes on under the hood of machine learning like this in order to give it good training data.
Kudos. Very, very impressive.
For the belt that lifts item up out of the hopper, I notice there's a little white hook (or platform, not sure what to call that) jutting out that does the actual lifting of the legos. How did you get the size of that right? Did you install that jutting-out part, or did it come pre-attached to the belt?
What tools are you using to make a computer do the actual belt rotation? I'm wondering how low-level it is - are you spinning the steppers directly or did the conveyor belts come with some kind of API? I'm guessing the belts don't have a USB port for easy control.
If you look closely at the belt you can see the traces of many failed experiments before I found a shape that worked without accidentally getting stuck on a part.
It is attached with super glue to the belt. I use the narrowest parts because that way it doesn't end up fighting with the curvature of the belt when it goes over the roller.
The belt rotation is done with a 3 phase AC motor hooked up to an inverter for the vertical belt, the camera belt is driven by a DC motor hooked up to a variable power supply.
So no steppers, that would have made life a bit easier because then I'd know (modulo some slippage) where the belt is positioned. So now I have to reconstruct that optically, hence the wavy line on the belt.
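One way to recover belt movement from a wavy line, purely as a guess at the general idea: sample the line's sideways offset along the belt in two consecutive frames and look for the shift that makes them agree. The function name, the synthetic line and the assumption that the belt only moves forward by a whole number of pixels are all made up for this sketch.

```python
import math

def estimate_shift(prev_profile, curr_profile, max_shift):
    """Return the forward shift (in samples) that best aligns two profiles
    of the line's lateral offset, using mean squared difference."""
    best_shift, best_err = 0, float("inf")
    for shift in range(max_shift + 1):
        overlap = len(prev_profile) - shift
        if overlap <= 0:
            break
        err = sum((prev_profile[shift + i] - curr_profile[i]) ** 2
                  for i in range(overlap)) / overlap
        if err < best_err:
            best_shift, best_err = shift, err
    return best_shift

# Synthetic, non-repeating wavy line standing in for the one drawn on the belt.
belt_line = [math.sin(0.07 * x) + 0.5 * math.sin(0.19 * x) for x in range(400)]
prev_frame = belt_line[100:200]   # what the camera saw first
curr_frame = belt_line[137:237]   # the belt has moved 37 samples since then

print(estimate_shift(prev_frame, curr_frame, max_shift=60))  # 37
```

A line that never repeats within one frame matters here, otherwise several shifts would match equally well.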
If necessary a lot can be pushed through the machine twice for instance to sort parts by length or to pick out sets (that last bit works in theory but in practice there are a lot of problems to overcome because of the limited number of bins to deposit into).
As for the hardware, there is a nifty little camera with a macro lens that connects to the USB port (noname Asian stuff), it has a 10x magnifying lens, a pololu servo/gpio to USB card to drive the relays and a Sainsmart 16 port relay board to drive the solenoids for the air valves.
The software is all in python with a generous amount of help from the people who wrote numpy, opencv, keras and theano.
The error rate is between 3 and 5% depending on how fast I set the machine. There are a number of sources for the errors: obviously classification errors, but also sometimes two parts are too close to each other and even if the classifier got them right the airpuff for one pushes the other off the belt as well. To minimize this effect I keep the airpuff super short, on the order of 10 ms, which is about as fast as the solenoids can open and close reliably, but it does mean that if it misses even by a bit there is nothing to be done about it and that part will land in the 'other' bin.
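To get a feel for how tight a 10 ms puff is, here is a small timing sketch. Only the 10 ms figure comes from the description above. The belt speed, the camera-to-nozzle distance and the function names are assumed values for illustration.

```python
def puff_window(seen_at_s, distance_m, belt_speed_m_s, puff_s=0.010):
    """Start and end time of a puff centred on the moment the part
    reaches a nozzle sitting distance_m downstream of the camera."""
    arrival = seen_at_s + distance_m / belt_speed_m_s
    return arrival - puff_s / 2, arrival + puff_s / 2

def belt_travel_during_puff(belt_speed_m_s, puff_s=0.010):
    """How far the belt moves while the valve is open."""
    return belt_speed_m_s * puff_s

# Assumed numbers: belt at 0.5 m/s, nozzle 0.30 m past the camera.
start, end = puff_window(seen_at_s=0.0, distance_m=0.30, belt_speed_m_s=0.5)
print(round(start, 3), round(end, 3))                # 0.595 0.605
print(round(belt_travel_during_puff(0.5) * 1000, 1)) # 5.0 (mm)
```

Under those assumed numbers the part only moves about 5 mm while the valve is open, so a small error in the estimated position or speed is enough to miss it.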
That error rate is still too high but with every run the classification errors go down and that's the main component.
One nasty little problem was that I spaced the puffers too regularly in the first iteration, which meant that sometimes the parts would line up just so in the order in which they came under the camera, so that more than one puffer would be active at once, leading to a reduction in pressure, and no parts would be pushed off the belt. That was a tricky one!
"more than one puffer would be active at once" — sounds like a job for prime numbers!
And that's exactly how it was solved. The puffers are now spaced prime distances apart and that took care of that, it would be harder for a longer belt because then you'd start wasting an awful lot of space next to the belt.
Since the pieces fall onto the faster conveyor belt with random spacing, isn't it possible that two consecutive pieces will have the spacing of the puffers they are destined for (within margins), regardless of the puffer placements?
In the end the error rate went down a lot because of a less predictable spacing. But you are right that if the pieces would fall with random spacings that it would not matter what the distance between the puffer stations would be.
The funny thing is that I spent a lot of time measuring out the vertical spacing on the conveyor in the hopper. If I had done that more sloppily it would have worked better :)
At any snapshot, the pieces are lying with a Gaussian distribution around multiples of x along the belt, i.e. centred at x, 2x, 3x, ... nx.
So for the bins to not overlap:
1) their width/span-along-the-belt should be less than x
2) And they can be placed at x, 3x/2, 5x/2, 7x/2 (i.e. prime multiples of x/2)
Wow! Learnt something useful today. Thank you. :)
Edit: I realize after posting that my solution won't work! If somebody can explain how the prime thing works, that would be great. I can imagine, though, that the bin placements should be such that at any given time a piece is only in front of a single bin. Meaning, no pair of bins should be a multiple of x apart. I can guess that heuristics which work well could be devised. But it would be great to know the mathematical solution for this.
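The condition guessed at above (no two puffers a whole multiple of the part spacing apart) is easy to check for a given layout. This is a sketch under that assumption only: positions and pitch are made-up integers in millimetres, and `clashing_pairs` is a hypothetical helper, not anything from the actual machine.

```python
from itertools import combinations

def clashing_pairs(puffer_positions_mm, pitch_mm):
    """Pairs of puffers whose separation is a whole multiple of the spacing
    between parts, i.e. pairs that evenly spaced parts can hit at once."""
    return [(a, b)
            for a, b in combinations(sorted(puffer_positions_mm), 2)
            if (b - a) % pitch_mm == 0]

regular = [100, 200, 300, 400]   # evenly spaced puffers
primes = [20, 30, 50, 70]        # primes 2, 3, 5, 7 scaled by 10 mm

print(len(clashing_pairs(regular, 100)))  # 6, every pair can fire together
print(clashing_pairs(primes, 100))        # []
print(clashing_pairs(primes, 20))         # [(30, 50), (30, 70), (50, 70)]
```

The last line shows that prime positions are not a guarantee on their own: whether a layout clashes depends on the part spacing it meets, which fits with the later remark that less predictable spacing of the parts helped too.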
12V on the valve -> air streams out of the corresponding tube.
Did you use any other resources to learn about deep learning besides http://course.fast.ai/? I'm looking to get started learning, and wondered what the best way forward would be.
After that it was mostly googling each and every term that I didn't understand until it all started to make sense.
course.fast.ai is probably the fastest way to get something concrete going which is very useful if you need that instant gratification kick to keep going.
EDIT: but some initial level of knowledge is advisable so you know how to pick a competition.
Several years ago I designed an industrial machine that is used to untangle and sort nails, screws, etc. for feeding robots in automatic product lines. The main elements were vibration beds (using an eccentric), slopes with geometry to sort things out, and pneumatic cylinders to untangle items at high speed.
This one's a nice Lego sorter, too:
A full speed video is super hard to shoot because you can't follow the parts as they move, they basically disappear because of the air puff being so short that a part is there one frame and gone the next. Right now classification takes about 30 ms, that's the limiting factor because that belt keeps moving during that 30 ms so you need to be 'back' at the camera before it moves so far that you can't stitch the next image to the previous one.
Another limiting factor is the relationship between the two belts, the second belt can only go so fast before the precision of the puffers starts to be insufficient to aim the parts into the right bin, they also carry too much forward momentum, and the second belt needs to go many times faster than the first in order to spread out the parts sufficiently. Yet another problem related to that: if you look at the video you'll notice that one of the little parts grabbers on the belt can push ahead of it quite a bit of stuff, if fortune is against you all of that lands on the belt in one go. By making that belt go slow it creates just enough pause between parts to be able to separate them with air without pushing the wrong part off the belt. It pays off to leave some safety margin there so I tend to set the second belt a bit faster than optimum and the first one a bit slower. That way the accuracy goes up quite a bit.
It took lots of experimenting and tweaking to get to this stage.
About 1 part per second is a practical upper limit right now, it can go way faster than that but then it starts dumping stuff all over the room :)
I'll see if I can shoot a video at a higher speed than the one in the post right now.
> also it would take lots of bins and there are some details that are important to get right that are almost invisible to the camera).
I wonder if you could build something that would use a small number of bins (maybe just two) and multiple passes.
The first pass could just get an inventory and just show you what sets it could build toward (with emphasis on sets that are the most profitable, closest to completion, or your favorites). The second pass would split off legos that are a part of the set(s) that are trying to be built.
You could also pass along what you think was a completed set again at the end to check for mistakes.
The air puffs seem like the most unreliable parts at the moment?
Maybe if you could have some manual labour, one thing to do would be to mount a projector above the setup to shine a graphic you generate onto it.
If you have bins labeled, you could project the label next to or on top of the part as it's moving on the conveyor. That way if you have human pickers, you're not constrained by the number of bins with each needing its own air-puff solenoid.
Yes, absolutely. There are ways to improve on that but nothing simple.
> Maybe if you could have some manual labour, one thing to do would be to mount a projector above the setup to shine a graphic you generate onto it.
I don't follow you, what do you mean?
> If you have bins labeled, you could project the label next to or on top of the part as it's moving on the conveyor. That way if you have human pickers, you're not constrained by the number of bins with each needing its own air-puff solenoid.
Ah I see, no, the airpuffs are reliable enough for now even if they are the most unreliable part, besides humans would not keep up with the machine at speed. But it is a solvable problem, just not the most interesting one at the moment. The most interesting one is the hopper feeder belt because if I can make that work just a little bit better it will speed things up by a factor of 4 to 10 depending on how well it will work.
I tried faster but then it is pointless: you simply can't track the camera fast enough from the hopper to the bin where the part will end up. Hope that is satisfactory :)
Lots of interesting questions come to mind though, in that if you have two bits of Lego that are attached, what bin do you put them into? And have you looked at ways to automatically disassemble Legos? And did any of your purchases have Legos that were superglued together? (as is done in some displays.)
Most fun I've had programming in years. Finally something where I don't have to worry right from the get-go if it is secure or not.
> sorry you lost your van though!
So am I. I had a ton of work in that thing and even if the insurance covered the value they did not give me back the many weeks I spent building it.
> Also interesting to know that the pile of Lego Technic parts I've got from my lego bot building days actually might have some resale value :-).
It'd be better if you had some really nice boxed sets from the 60's ;)
> if you have two bits of Lego that are attached, what bin do you put them into?
'Other', then pick them apart and run them through again
> And have you looked at ways to automatically disassemble Legos?
Yes, but this is very hard to do without damage.
> And did any of your purchases have Legos that were superglued together?
I've seen a few bits here and there but for the most part that doesn't happen. Kids are pretty destructive though so you have to count on a good %age of damaged / unusable parts.
As for competition, yes, plenty, but all manual. Scaling up now that I have the software working is definitely an option but I have a good set of very well paying customers and not much can compete with that.
Dropout is at .5 both for conv and connected layers.
All those things you mention have already been done and yet the error is as high as it is. But this does not surprise me at all, the difference is that the pre-trained models actually do very poorly on this dataset and so I'm having to train them up from scratch. That's getting there though, another week or two and I should be in the tens of thousands of images to train with and then life will get a lot easier.
Data augmentation works well and has been enabled right from the get-go.
I'll definitely do another write-up in a while about the software end of things, once it crystallizes into something a bit more stable. I'm still hacking on it daily.
I will post the training set at some point, first it needs to become large enough. But with the machine helping now that is getting there, rapidly.
Maybe you could use a push broom scanner? I don't know if there's any hardware available for consumers, otherwise that seems ideal for a conveyor belt application.
> You should cross post to hackaday blog.
I post my own stuff on my own blog, I don't feel like making accounts in 20 different places. HN and Twitter are my exceptions.
So why micro imaging instead of macro?
I haven't done any of the nn cv stuff yet, can you pair different image sources of the same item?
Because the differences between some of the parts are only very minor and yet that can be the difference between the part in its normal version or (for instance) its technic version.
> I haven't done any of the nn cv stuff yet, can you pair different image sources of the same item?
I took a chance on that and it seems to work. I just stack the images and present them as one larger image. Quite possibly that's 'wrong' and I should present the two images separately to two different nets which are then recombined at the end but this trick seems to work.
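Stacking two views into one taller image is about as simple as it sounds. A dependency-free sketch, with images as nested lists of pixel values and a made-up helper name (with NumPy arrays the same thing is `np.concatenate([top, bottom], axis=0)`):

```python
def stack_vertically(top, bottom):
    """Combine two images of equal width into one taller image."""
    if len(top[0]) != len(bottom[0]):
        raise ValueError("images must have the same width")
    return [row[:] for row in top] + [row[:] for row in bottom]

# Two tiny made-up grayscale views of the same part.
view_a = [[1, 2],
          [3, 4]]
view_b = [[5, 6]]

combined = stack_vertically(view_a, view_b)
print(combined)  # [[1, 2], [3, 4], [5, 6]]
```

The net then sees both views in a single input, at the cost of having to learn for itself that the two halves show the same object.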
Good to know it worked on the composite. Thanks for the info!
Buying lots off of Ebay, it seems like you risk buying presorted Lego which have already been picked over for the most valuable pieces.
This was perfect timing for a good laugh from the title and an interesting read. Thanks!
1) What's the input image resolution?
2) How many classes do you have?
3) How many samples per class did you need to achieve acceptable accuracy?
4) How long did the training take? How many epochs did it require?
2) the 1000 most common lego parts, 'other' and 'mess'. In the end the idea is to get to 20K classes and to sort directly into sets. This is very much a pipe dream at the moment but I think it is doable given a large enough set of samples. The problem is that you have to see all those parts at least a hundred times or so before it gets detected reliably.
3) too little :( The training data is still woefully insufficient but it is now good enough to bootstrap the rest. This took a while to achieve because without any sorted lego to begin with you have nothing to train with. So the first 20 kg or so were sorted by hand and imaged on the sorter without any actual sorting happening (everything into the run-off bin), then labeling the results by hand until the accuracy of the test set (500 parts or so) went over 80%. That was a week ago and since then it's been improving steadily day-by-day.
4) one training run per night, typically a few 100 epochs on the current set, but this will change soon. The machine is now expanding the training set rapidly with associated improvement in accuracy. This means that the training sessions are taking longer and longer but I'll be running fewer of them. What I'll probably do is offload training to one machine which will drop off a new set once per week or so and inference on another which is doing the sorting and capturing the new training data.
Checking the logged images for errors still takes up a bit of time though, but with the current error rate that is very well manageable. (Before it was an endless nightmare).
I'm doing a similar experiment now to train a model to parse out an image of a blood pressure monitor that's a 7-segment LCD display. To do it I separated out each segment of the display as masks with Gimp/Photoshop and then I can create my own images by just overlaying them on top of an image of a blank LCD display. That gets me basically unlimited training photos.
If you could render the 3D parts from various angles, colours, etc then something similar might be possible.
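The mask-overlay trick for the LCD display can be shown in miniature. This is a toy under heavy assumptions: a 5x3 pixel "display", hand-made segment masks and a hypothetical `render_digit` helper stand in for the real photos and Gimp/Photoshop masks.

```python
# Hypothetical 5x3 pixel masks for the seven segments (a..g) of one digit.
MASKS = {
    "a": {(0, c) for c in range(3)},     # top bar
    "b": {(r, 2) for r in range(0, 3)},  # upper right
    "c": {(r, 2) for r in range(2, 5)},  # lower right
    "d": {(4, c) for c in range(3)},     # bottom bar
    "e": {(r, 0) for r in range(2, 5)},  # lower left
    "f": {(r, 0) for r in range(0, 3)},  # upper left
    "g": {(2, c) for c in range(3)},     # middle bar
}
DIGIT_SEGMENTS = {
    "0": "abcdef", "1": "bc", "2": "abdeg", "3": "abcdg", "4": "bcfg",
    "5": "acdfg", "6": "acdefg", "7": "abc", "8": "abcdefg", "9": "abcdfg",
}

def render_digit(digit, blank):
    """Overlay the masks of the lit segments on a copy of the blank display."""
    image = [row[:] for row in blank]
    for segment in DIGIT_SEGMENTS[digit]:
        for r, c in MASKS[segment]:
            image[r][c] = 1
    return image

blank = [[0] * 3 for _ in range(5)]
# Every rendered image comes with its label for free.
samples = [(render_digit(d, blank), d) for d in "0123456789"]
for row in render_digit("4", blank):
    print("".join("#" if pixel else "." for pixel in row))
```

Varying the blank background (lighting, blur, position) on top of this is what would turn ten images into an effectively unlimited supply.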
Also, you said you're doing modified VGG and into 20k classes. That works, but another thing to maybe try is to use binary_crossentropy as the loss function and a sigmoid (instead of softmax) on the final activation layer, to be able to do multi-label classification.
Then your labels could be a vector of shape possibilities, colour possibilities, or whatever you could divide your 20k classes into.
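A minimal sketch of that idea in plain NumPy, to show what the label vector and loss look like. The attribute lists are invented for illustration; this setup (several independent sigmoid outputs, each scored with binary cross-entropy) is usually called multi-label classification.

```python
import numpy as np

# Hypothetical attribute groups; a real split of the classes would be much larger.
SHAPES = ["brick", "plate", "tile"]
COLOURS = ["red", "blue", "black"]

def encode(shape, colour):
    """Multi-hot target: one bit per shape followed by one bit per colour."""
    y = np.zeros(len(SHAPES) + len(COLOURS))
    y[SHAPES.index(shape)] = 1.0
    y[len(SHAPES) + COLOURS.index(colour)] = 1.0
    return y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, logits, eps=1e-7):
    """Mean per-output binary cross-entropy; each output is scored independently."""
    p = np.clip(sigmoid(logits), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

y = encode("plate", "red")                 # two bits set, one per attribute group
good_logits = np.where(y == 1, 4.0, -4.0)  # confident and correct
bad_logits = -good_logits                  # confident and wrong
print(binary_crossentropy(y, good_logits), binary_crossentropy(y, bad_logits))
```

With softmax the outputs compete for a single winner; with per-output sigmoids the network can say "plate" and "red" at the same time, so 3 + 3 outputs cover 9 combinations.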
> Also, you said you're doing modified VGG and into 20k classes. That works,
Right now there are 1002 classes, the 1000 most common lego parts, 'mess' and 'other'.
> but another thing to maybe try is to use binary_crossentropy as the loss function and a sigmoid (instead of softmax) on the final activation layer, to be able to do multi-label classification. Then your labels could be a vector of shape possibilities, colour possibilities, or whatever you could divide your 20k classes into.
Ok, I can try that. Thank you!
Do you use artificial augmentation of the training set (random rotations, translations)?
Somewhat related aside: I have also worked on a classification task (albeit, much simpler one): detect the direction of grain in a piece of wood. I built the first version by manually extracting features (essentially, a few direction sensitive Gabor filters) so that I could collect a training dataset for CNN.
Turned out that accuracy of manual version was more than enough (~98%) so I didn't get to play with fun stuff :(
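For readers who have not met Gabor filters: the sketch below shows the general technique on a synthetic striped image, not the system described above. The kernel size, wavelength, sigma and number of angles are arbitrary assumptions, and the function names are made up.

```python
import numpy as np

def gabor_kernel(theta, wavelength=8.0, sigma=4.0, size=21):
    """Cosine wave varying along direction `theta`, under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    along = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    k = envelope * np.cos(2 * np.pi * along / wavelength)
    return k - k.mean()          # zero mean, so flat regions give no response

def stripe_normal_angle(img, n_angles=8):
    """Angle in [0, pi), image coordinates, along which intensity varies most.

    The stripes (grain lines) themselves run perpendicular to this angle.
    """
    angles = np.arange(n_angles) * np.pi / n_angles
    spectrum = np.fft.fft2(img - img.mean())
    energies = []
    for theta in angles:
        # Circular convolution via FFT; fine for comparing total energy.
        kernel_ft = np.fft.fft2(gabor_kernel(theta), s=img.shape)
        response = np.real(np.fft.ifft2(spectrum * kernel_ft))
        energies.append(np.mean(response**2))
    return angles[int(np.argmax(energies))]

# Synthetic "grain": stripes whose intensity varies along 45 degrees.
yy, xx = np.mgrid[0:64, 0:64]
phi = np.pi / 4
img = np.sin(2 * np.pi * (xx * np.cos(phi) + yy * np.sin(phi)) / 8.0)
print(np.degrees(stripe_normal_angle(img)))
```

Real wood is far messier than a sine pattern, so a practical version would likely need several wavelengths and some smoothing across neighbouring patches.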
Demo of the system I'm talking about: https://youtu.be/vGa0tFXPffE
Every part that passes through the machine is logged and will be made part of the training set for the next round of training.
> Do you use artificial augmentation of the training set (random rotations, translations)?
Yes, quite a bit. It is definitely useful, but it is not of the same quality as really having more samples.
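As an illustration of what such augmentation can look like (not the code used on the sorter), here is a small NumPy sketch with quarter-turn rotations and zero-padded translations. The function name and `max_shift` value are assumptions; rotations by arbitrary angles would need an interpolating routine such as `scipy.ndimage.rotate`.

```python
import numpy as np

def augment(img, rng, max_shift=4):
    """Return a randomly rotated (multiple of 90 degrees) and shifted copy."""
    out = np.rot90(img, k=int(rng.integers(0, 4)))
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    h, w = out.shape[:2]
    shifted = np.zeros_like(out)
    # Copy the part of the image that stays in frame; the rest is left as zeros.
    src = out[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    shifted[max(0, dy):max(0, dy) + src.shape[0],
            max(0, dx):max(0, dx) + src.shape[1]] = src
    return shifted

rng = np.random.default_rng(0)
img = np.arange(64, dtype=float).reshape(8, 8) + 1.0   # stand-in for a part image
batch = [augment(img, rng) for _ in range(16)]          # 16 variants of one sample
```

Every variant shows the same part from the same camera angle, which is one reason augmented data does not fully replace new samples.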
> Somewhat related aside: I have also worked on a classification task (albeit, much simpler one): detect the direction of grain in a piece of wood. I built the first version by manually extracting features (essentially, a few direction sensitive Gabor filters) so that I could collect a training dataset for CNN.
> Turned out that accuracy of manual version was more than enough (~98%) so I didn't get to play with fun stuff :(
Find a harder task!
> Demo of the system I'm talking about: https://youtu.be/vGa0tFXPffE
Slick! Do you use that to determine orientation prior to lamination so the laminate does not warp when it ages?
> Slick! Do you use that to determine orientation prior to lamination so the laminate does not warp when it ages?
In this specific case, to reduce surface defects during planing. Interestingly, the client forgot to specify this requirement when we designed the equipment which feeds the planer. Feeding the boards in the correct orientation was something the operators had learned from each other as they improved reject rates in the downstream scanner, but nobody in management knew about that.
I am kinda boggled that you thought "Huh, Lego, think I'll get into that" and immediately ordered two metric tons of Lego. o_O
I get that you thought (for some reason) that you would only win some small fraction of your bids, but ordering, say, a quarter-ton of Lego at a go isn't reasonable either. The whole episode is pretty hilarious.
You should have seen my face. Also, try to explain to your s.o. that you're about to buy an extra garage solely to house something that you have no idea how it will all work out and when - and if - it will ever go away again. And that was two years ago.
It really is hilarious. For me it's more or less business as usual though, I take lots of chances. Some work out and some don't. This one is still undecided.
This is always an awesome, awesome feeling and one I live for when training neural nets.
I use some relatively cheap plastic sliders stacked 10 high, parts go by length from the top down and by width left to right. Then there are departments for minifigs and associated parts as well as irregular stuff like base-plates and so on. Storage could easily be another blog post all by itself! It's a crazy problem.
For Technic, which is many small parts, I use small bins and bags inside the larger bins, but you probably could use a raaco rack or equivalent if you don't have too much of it.
Maybe you package stuff up nicely and give it away as a course, or try and sell the plans as a course to one of the coding schools or large Education companies?
Imagine a group of students only had the rough scheme: which pieces do they need, maybe some templates, but then it's up to them to actually implement / build the thing.
Heck a university could even host a competition, where the machine with the fastest throughput + lowest % of errors wins :)
But if you're making some regular wage then you could easily live off this.
What about an initial bucket for pieces that are too close to be reliably puffed? Maybe you already do that, I couldn't tell.
On the issue of pressure drop from simultaneous puffs, if you add a buffer tank with a pressure regulator for every two puffers, you'd probably avoid that problem. Like the little capacitors that used to sit by every 74xx IC.
The simpler solution was to just stagger the tubes a bit, that way all I needed to do was drill a few new holes. I see what you mean though, that's also a clever way but the 'feed' to the air valves is a single 6 way manifold that they bolt straight on to. It would be very hard to put a buffer between the manifold and the valves.
What is the limiting factor determining how many buckets you can operate with?
And I may have missed this detail, but you must perform a multi stage sort since you just have a few buckets? What is the level of the final sort, and what becomes of the parts at that time?
Yep. But this fortunately does not happen often.
> What is the limiting factor determining how many buckets you can operate with?
Belt length, measurement precision during the stitching, cumulative errors in the math determining how far the belt has moved.
> And I may have missed this detail, but you must perform a multi stage sort since you just have a few buckets?
> What is the level of the final sort, and what becomes of the parts at that time?
A single part category in multiple colors, or, and that's the next level, all (or at least most) of the parts required to make 6 sets + remainder.
That final stage is a bit tricky, it also interfaces with a database that keeps track of the lot id of that set and what parts are still missing. I have most of the pieces for that second idea ready now but they are not yet tied together.
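A rough sketch of how such bookkeeping might work, purely as an illustration: the class, the `(part_id, colour)` keys and the lot ids are all hypothetical, and a real version would sit on top of a database rather than in-memory counters.

```python
from collections import Counter

class SetLot:
    """One set being assembled: tracks which parts are still missing."""

    def __init__(self, lot_id, inventory):
        self.lot_id = lot_id
        self.missing = Counter(inventory)   # (part_id, colour) -> count still needed

    def offer(self, part):
        """Accept the part if this lot still needs it; return True if taken."""
        if self.missing[part] > 0:
            self.missing[part] -= 1
            return True
        return False

    def complete(self):
        return sum(self.missing.values()) == 0

def route(part, lots):
    """Index of the first lot that needs `part`, or None for the remainder bin."""
    for i, lot in enumerate(lots):
        if lot.offer(part):
            return i
    return None

# Two tiny hypothetical lots that each need one red 2x4 brick.
lots = [SetLot("lot-A", {("3001", "red"): 1}),
        SetLot("lot-B", {("3001", "red"): 1})]
bins = [route(("3001", "red"), lots) for _ in range(3)]
```

The third brick finds no lot that still needs it, so it goes to the remainder.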
And what about doubling the bins by placing them on both sides of the belt, with puffers blowing in both directions?
Or a laser trip sensor to get a more precise belt location measurement?
That's an option. But the problem is solved so that part is ok now.
> And what about doubling the bins by placing them on both sides of the belt, with puffers blowing in both directions?
It does that, you can see it in the video if you look closely and also in one of the pictures. There are two bins at the back and four in the front (that's the easiest way to fit them right now).
> Or a laser trip sensor to get a more precise belt location measurement?
Or a gray code printed on the belt! That would give you absolute positioning.
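For anyone unfamiliar with Gray codes: consecutive values differ in exactly one bit, so a sensor that reads a strip halfway between two marks is off by at most one step rather than by an arbitrary amount. A minimal sketch, where the 10-bit width and the 2 mm mark pitch are assumptions for illustration:

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Inverse of to_gray: XOR together g, g >> 1, g >> 2, ..."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

BITS = 10          # hypothetical: 1024 distinct marks along the belt
PITCH_MM = 2.0     # hypothetical spacing between marks

def belt_position_mm(code_read: int) -> float:
    """Absolute belt position from one Gray-coded mark, no accumulated drift."""
    return from_gray(code_read) * PITCH_MM

# The pattern that would be printed at mark 5, and decoding it again.
pattern = format(to_gray(5), f"0{BITS}b")
position = belt_position_mm(int(pattern, 2))
```

Unlike counting encoder ticks, every reading is absolute, so errors in the distance calculation cannot accumulate along the belt.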
> Fun project!
The machine learning itself: models are hard to train, harder than you'd think at first glance. You need a lot of data for training, and if your particular application does not have any samples then you're going to have to come up with clever ways of making images, because the quantities needed are not something you can just go and shoot by hand.
Also, that bootstrapping is a viable method: the machine does the hard part and you only have to correct, which takes the sting out of 90% or so of the work of labeling a dataset.
As for opensourcing it, yes, but then likely lego would have a trademark case so I would probably use another name. The whole thing would make for a pretty neat Kaggle competition.
What's the age bracket for red and white? (Plus grey base plates ;-)
Most of my lego is blue, black, or transparent red.
(space police 1)
Some is black and trans-yellow (blacktron)
The blacktron sets are from the late 1980s.
Metric ton / tonne: 1000 kg
Imperial (long) ton: 2240 lbs (1016 kg) (UK)
US (short) ton: 2000 lbs (907 kg) (USA, Canada)
Your Dutch skills are shining through :)
What an obnoxious way to end your piece.