Image2Lego: Customized Lego Set Generation from Images (arxiv.org)
98 points by hubraumhugo 26 days ago | hide | past | favorite | 27 comments

"Finally, we test these automatically generated LEGO sets by constructing physical builds using real LEGO bricks."

That reads to me like: "I'm not playing honey, this is work!" I sure hope the team doing this had a lot of fun :)

The algorithm itself is OK but doesn't appear all that novel to me. It looks like they're using ShapeNet (or similar dataset) to train a 3D voxel autoencoder which can predict geometry from an image. And then they convert those voxels to LEGO instructions.

For the first part, I've seen equally good papers before, for example by the ShapeNet team. For the second part, there are existing explicit algorithms. So their main contribution appears to be combining two working things into a (somewhat) useful whole.

I think a big improvement here would be supporting all the specialty pieces rather than the generic LEGO blocks and panels. Would certainly make for more interesting builds!

Translation: we took the fun part out of building Lego and gave it to a machine...

Seriously, though, I have a real-world problem which needs a solution: given the roughly 3000 loose bricks in my son's Lego collection, tell me which ones go with which sets? Thank you - we can take it from here.

Of course there is an app for this: https://brickit.app/

Thank you sir!

It probably can be done, if someone hasn't already done it: put the bricks in a bin that feeds them onto a conveyor belt, use image recognition to determine color + shape, and match that up with known Lego pieces. The list of parts for every set is stored online somewhere as well.

The difficulty is that some bricks look like other bricks, close enough that the image recognition might not be accurate. But some margin for error is fine.

Here are a few... (and yes, they were on HN at the time)

2017 - Sorting 2 Metric Tons of Lego

2019 - AI-powered Lego sorting machine built with Lego bricks

I did see a sorting project somewhere with image recognition. Apparently there is money to be made buying old Lego collections and selling them sorted. Just the part where it puts the sets back together was missing. If only I had the time...

This is a video about such a project - of course done in LEGO!


I was involved in a small group of people looking into creating a commercial LEGO sorting machine, into which you would feed buckets of bricks, and they would come out sorted and indexed on the other side (in boxes).

We mothballed it because we could not build it at a price point that made commercial sense.

jacquesm here famously built the sorter; everything except for the boxing it up step.

rebrickable.com is intended for finding new things to build with your existing sets/bricks, but as a consequence of that, it has a searchable database of which bricks shipped with which sets.

> In order to demonstrate the broad applicability of our system, we generate step-by-step building instructions and animations for LEGO models

Then perhaps the AI experts are on track for a successful prediction:

> The team also predicts that by the year 2023, a machine will be able to “physically assemble any Lego set given the pieces and instructions, using non-specialized robotics hardware.”


I made a tool that converts pixel art to Lego mosaics that people in this thread might find interesting: https://city41.github.io/pxtobrx/
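For anyone curious, the core of such a converter is just nearest-color quantization against a brick palette. A minimal sketch (the palette values are illustrative, not official LEGO colors, and a real tool like the one linked would also handle dithering and brick sizes):

```python
# Map a pixel to the closest color in a small brick palette.
# Palette RGB values are rough approximations chosen for the example.

PALETTE = {
    "black":  (27, 42, 52),
    "white":  (242, 243, 242),
    "red":    (196, 40, 27),
    "blue":   (13, 105, 171),
    "yellow": (245, 205, 47),
}

def nearest_brick(rgb):
    """Pick the palette entry with the smallest squared RGB distance."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name])),
    )

print(nearest_brick((250, 250, 250)))  # → white
```

Run that over every pixel of a downscaled image and you have a mosaic plan; the remaining work is counting bricks per color and laying out the build grid.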

Wouldn’t it be simpler to focus on the “3d model” -> “lego model” part of the conversion?

What are the benefits of generating a lego set from an image, as opposed to generating a lego set from a 3d model (and doing the “image to 3d model” conversion with a different tool)?

> doing the “image to 3d model” conversion with a different tool

Or using your phone's LIDAR sensors

Because then it's more steps and more things to go wrong, I'd assume.

A large amount of work, and it gave me some nice insight into image-to-voxel methods, which I had never tried before. What would be worth adding is a real-world scenario, or what actual benefit you think it would produce.

Ages ago I built a 'lego' system. Basically I figured out you could build any Lego shape from five basic voxel types: cube, bevel, cone segment, sphere segment, and cylinder. I think I was calling them Lego atoms, as you could scale the resolution depending on the amount of detail you needed. Not sure that is true anymore with the more interesting shapes they have introduced; not sure those five would cut it now.

I only took it to the paper/whiteboard design stage. The problem I had was that voxels became wildly more memory-consuming once you introduced more than a binary there/not-there type. I was looking at 10-15 megabytes in memory just to describe a simple 2x4 brick at the minimum resolution that still worked as a Lego. At the time I was lucky to have 96MB... The exe was pretty small (maybe 1-2MB total); the data, on the other hand, I could not find a nice way to compress enough to keep a few dozen bricks from filling my memory space. Mostly I got bored with the project and moved on.
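The blow-up described above is easy to reproduce on the back of an envelope. A sketch, with an assumed (hypothetical) resolution of 64 sub-voxels per stud edge and 4 bytes per voxel for type + color + flags:

```python
# Estimate memory for a dense 3D voxel grid: one record per voxel.
# The 64-subvoxel resolution and 4-byte record are assumptions for
# illustration, not the commenter's actual numbers.

def dense_voxel_bytes(dims, bytes_per_voxel):
    """Bytes needed for a dense grid of the given dimensions."""
    x, y, z = dims
    return x * y * z * bytes_per_voxel

brick = (4 * 64, 2 * 64, 64)                # a 2x4 brick footprint
binary = dense_voxel_bytes(brick, 1) / 8    # 1 bit/voxel, bit-packed
typed = dense_voxel_bytes(brick, 4)         # type + color + flags

print(f"binary: {binary / 1e6:.2f} MB, typed: {typed / 1e6:.2f} MB")
```

A packed binary grid stays around a quarter of a megabyte, but the moment each voxel carries attributes the same brick costs roughly 8 MB, which is in the same ballpark as the 10-15 MB figure above.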

The 'voxel' style has one very nice quality over the LDraw CAD system: unintentional gaps are nearly impossible, and overlap is dead easy to find. Correct overlap lets you do interesting things, in that you can intersect parts and it will just work on the snap grid. The LDraw system has to jump through a few hoops to make it look like it is working; the effect is the same, but it's more of a pain making sure your floating point is correct. LDraw has one nice quality in that rotating an object in free space is 'cheap', since you only have to rotate a few points, whereas in a voxel system like the one I designed you in effect have to rotate everything.

At this point the LDraw system is the one to use if you are doing this, though. When I built my system there were maybe 200 different bricks total; now there are thousands.

Love it! I've started working on a side project on Lego classification and detection: an algorithm to identify which sets you could build from bricks scattered on the floor. What's changed is that I don't have to be as aware of memory consumption as you were in your project back then. All the best!

I had toyed with that idea. This was before ML visualizers had become decent; I was going to use the set numbers to help me figure out what I could or could not do. I think there are a couple of online resources that do that. There are a lot of sets, so an "I have a pile of xyz pieces" query that exhaustively goes through them, maybe with a % complete, would be simple enough if you had a complete library of which pieces were in each set. You could get a bit clever with "I have piece x, which sets is that in?" and filter it down. Not sure you could get away without it being basically O(m^n), though. But given how small the number of existing sets is compared to the compute power most computers have these days, it may not be that bad to just brute-force it. It really becomes a combinatoric problem, and there are a lot of algs to choose (hehe) from. I personally was noodling with 3 or 4 SQL queries that did it.
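The brute-force "% complete" idea above is just a multiset comparison per set. A minimal sketch (the part IDs and set inventories here are made up for illustration; a real version would pull inventories from one of the online databases):

```python
# Score each set by what fraction of its parts exist in the pile.
# Piles and set inventories are multisets: part ID -> count.

from collections import Counter

def completeness(pile: Counter, set_parts: Counter) -> float:
    """Fraction of a set's parts that are present in the pile."""
    needed = sum(set_parts.values())
    have = sum(min(pile[p], n) for p, n in set_parts.items())
    return have / needed

# Hypothetical pile and set inventories.
pile = Counter({"3001": 10, "3020": 4, "3062": 2})
sets = {
    "1234-1": Counter({"3001": 8, "3020": 2}),
    "5678-1": Counter({"3001": 20, "3062": 5}),
}

for set_id, parts in sorted(sets.items()):
    print(set_id, f"{completeness(pile, parts):.0%}")
```

Scanning every known set this way is linear in the total inventory size, so the exhaustive pass really is cheap; the combinatorial blow-up only appears if you try to partition one pile into several complete sets at once.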

With ML detection you can get a good ways decently; the hard part is 'hidden' features. Depending on orientation, some pieces will hide features. For example, a 2x4 flat looks the same as a 2x2 flat in many orientations (an extreme example, but it shows off the effect nicely). I have seen some people tumble the piece or use a few cameras to mitigate the issue.

My project was more along the lines of how you represent a single piece in memory without using a planar point system like LDraw's. I had toyed with the idea of converting between the two systems for memory reasons, but it became compute/IO-expensive very quickly for collision detection with other pieces.

The really interesting thing is that I hadn't even considered the issues you mention. At the moment I've focused only on the ML/DL algorithm and its accuracy, but its center of gravity lies strictly in the quality of the gathered dataset. That gives me the feeling I'm missing something, and that it won't be usable at a production level. Indicating which set you could build from the predicted bricks shouldn't be hard, because as far as I know there are official APIs that let you gather specific information about each set. Thank you for explaining part of your side project; it was really interesting!

Nice! You could brute-force it a bit. It may take a little work, but you could take the part descriptions from LDraw (from ldraw.org), find a decent way to render a piece (in random known colors), then feed those renders at random orientations into your net. That would let you automate things very nicely: you'd cut out the huge time sink of the hardware bit and get a net that is 'close'. Once it is close, you can add in the hardware bit. If you wanted to simulate a random pile of bricks, you could use one of the physics simulation engines, add random bricks, and let them flop where they may; that could give you input values for your net fairly quickly.
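The random-orientation step can be prototyped without any renderer at all. A minimal sketch, where a part is stood in for by its corner points, the orthographic projection replaces a real render, and all dimensions are made up (real use would rasterize LDraw meshes instead):

```python
# Generate synthetic "views" of a part: random yaw/pitch rotation
# followed by a trivial orthographic projection onto the xy plane.

import math
import random

def rotate(points, yaw, pitch):
    """Rotate 3D points by yaw (about z) then pitch (about x)."""
    out = []
    for x, y, z in points:
        x, y = (x * math.cos(yaw) - y * math.sin(yaw),
                x * math.sin(yaw) + y * math.cos(yaw))
        y, z = (y * math.cos(pitch) - z * math.sin(pitch),
                y * math.sin(pitch) + z * math.cos(pitch))
        out.append((x, y, z))
    return out

def random_view(points):
    """One synthetic view: random orientation, drop the depth axis."""
    yaw = random.uniform(0, 2 * math.pi)
    pitch = random.uniform(0, math.pi)
    return [(x, y) for x, y, _ in rotate(points, yaw, pitch)]

# Hypothetical 2x4 brick, represented only by its 8 corners.
brick_corners = [(x, y, z) for x in (0, 4) for y in (0, 2) for z in (0, 1)]
views = [random_view(brick_corners) for _ in range(3)]
```

Swap the projection for an actual rasterizer and you get labeled training images for free, including the ambiguous orientations mentioned above, since the net will see the degenerate views too.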

TLDR: Image2voxel algorithm with a trivially simple voxels2lego bit tagged on for fun and/or clickbait

Could be seen as "The image to 3d models produce mostly unusable garbage. Looks a bit like Lego though..."

Should work fine for Minecraft creations too then, I guess :)
