IKEA’s knowledge graph and why it has three layers (medium.com/flat-pack-tech)
208 points by mooreds on Sept 8, 2022 | 77 comments



When I was a kid, I thought IKEA product codes (eg: 002.638.50) were coordinates in 3D space of where the item was stored in a giant IKEA warehouse (like in the film Cube, but less deadly).

I'd like to know the reasoning behind it though. Parts of a same product usually have very different numbers.


I think that very different numbering is to make sure customers don't pick up the wrong box. They seem to put very different numbers next to each other in the warehouse.


That seems smart - but then they need other numbers for coordinates I guess.


Assuming that all warehouses are the same size and shape and carry the same things, which is why they instead use Local bin numbers. Works a lot like an IP and MAC I suppose.


Which they do: Aisle and Bin numbers on each item describing where it is in the warehouse.

I think the product number codes are equivalent to SKUs.


Items in the warehouse go through many coordinates during manufacture, shipment, merchandiser, etc. before going out to the parking lot with a customer.

The part number is for accounting.

It allows allocation of expenses such as raw materials in a meaningful way.


Kinda like that (maybe false?) story about Stalin solving a problem during WWII involving same-diameter mortar shells and rockets getting mixed up in logistical orders (so, your unit needs mortar shells, but you receive rockets, and vice versa) by dictating that one be re-labeled with an incorrect (but unique!) diameter. Make it different enough that even a partial or partially-understood name is unlikely to be mistaken for something else.


The World War II Sherman tank originally had a 75 mm gun; later this was upgraded to a 76 mm high-velocity gun. The interesting thing is that the 76 mm gun was actually 75mm; the army used the 76mm designation because the ammo was incompatible and they wanted to avoid exactly the sort of problems you would have if you had to deliver ammo for two different 75 mm guns.

https://en.wikipedia.org/wiki/76_mm_gun_M1


The wiki page seems to imply it was actually 76mm?


You may be right. I was mainly going off memory, and did a quick check to confirm it using reference point 10:

'''The "76-mm" designation was chosen to help keep the supply of ammunition from being confused between the two guns'''


I'm guessing infantry went "1mm, close enough" and did it anyway?



Next time you're in the store... look up


Wow, those fans are pretty big ...


Maybe big ass fans, even.


Essentially it’s an 8-digit “ARTicle number”, aka SKU. The periods are a UI/UX feature added 50 years ago for human readability.
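
A minimal sketch of that idea, assuming the stored value is a bare 8-digit string and the dotted 3-3-2 grouping (as in 002.638.50) is applied only for display. The function names are made up for illustration and are not anything IKEA has published:

```python
def _clean(value: str) -> str:
    """Drop display dots and validate that exactly 8 ASCII digits remain."""
    digits = value.replace(".", "").strip()
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 8 digits, got {value!r}")
    return digits


def format_article_number(raw: str) -> str:
    """Render an 8-digit article number in the dotted 3-3-2 display form."""
    digits = _clean(raw)
    return f"{digits[:3]}.{digits[3:6]}.{digits[6:]}"


def parse_article_number(display: str) -> str:
    """Return the bare 8-digit string behind a dotted display form."""
    return _clean(display)
```

Keeping the value as a string rather than an int preserves leading zeros, which matters for a code like 002.638.50.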

Factor in that the exact same article may have a different ART number due to a country's legal requirements, certifications, or labels. Since all of them share one global namespace, keeping any structure within that 8-digit range becomes impossible.

The article didn’t dive into the true complexity or the core problem they are solving: the relationships between Articles and the Product Groupings (Combinations) that make up a complete product (desk, bed, PAX wardrobe, kitchen, couch, etc.), plus the attributes and metadata needed to keep track of the compatible legs, hinges, slats, wheels, doors, and covers required for each combination of each color of each variation, along with the true assembled size and the relationships back to compatible articles and complementary products.

2D/3D Coordinates were in use and a representation may be found looking up ;).


Can confirm the dots are just UI. And this is a pretty accurate description of some of the challenges.


Maybe I'm missing something, but .. what problem is this solving?


Most companies are a mess internally, and information/metadata about various parts of the org is badly grouped and/or named. When reporting time comes, you have a big ol' mess and have to categorize things manually to make sense of it. The same issues spill over into code and thus websites and catalogues (ever wondered why some companies' websites are better organised and filterable? Dell is one example, but think of a car manufacturer and all their parts), so it really matters that someone or a group sits down and formalizes their vocabulary & taxonomy & grouping of things. Metadata is often more important than the actual thing being described! This all plugs into DDD & standardization efforts (see banking and healthcare) and why programmers subconsciously hammers on about "naming things are hard" - whether they are aware of it or not, they are sensing that somehow the order and naming of things matter. We also have to deal with versioning of things, and not just in code, but in hardware too. So it really pays off if done well.


See also the parametric search available at McMaster Carr:

https://www.mcmaster.com/screws/hex-head-screws/

or Digikey:

https://www.digikey.com/en/products/filter/ceramic-capacitor...

or PCpartPicker:

https://pcpartpicker.com/products/motherboard/

compared to trying to find the same products on Amazon or AliExpress. Many of the products are available, but every product on those examples can be filtered against comparable products by the specs that matter to the engineers doing the selection. On other sites, the metadata is generated inconsistently by individual sellers.


It's very interesting to me that, as an absolute amateur at electronics, I find the Amazon/AliExpress approach infinitely more usable, even though Digi-Key has a better selection of products. Of course, this is because my needs are usually something extremely generic and simple like "I need some 400ohm resistors" or "I need an 8-bit shift register," with a strong emphasis on the "anything goes" aspect.

On Digi-Key I get absolutely lost. There are thousands of options, with different form factors, costs, availability, specs, and more. Can I buy 5 of them? What's the lead time? Is it SMD or through? How much is shipping? Which vendor do I choose?

On AliExpress, I get a bunch of "5 8-bit shift registers for electronics projects" listings, check a couple of them, look up some info, and purchase one in a few minutes.

Of course this means I end up with mediocre, sometimes outright broken parts, but it feels like it's worth the time I save. I wish there was a curated list or an "easy mode" on Digi-Key.


Conversely, when I want a power jack with a 3/4 inch hole diameter, 3 wires and no more than 1 inch penetration into the chassis, it is so nice to be able to go to a vendor that has actual metadata on the products they are selling. I hate going on Amazon and having to wade through 15 pages of stuff.

What I do when I find myself with several similar items on Digi-Key and am unable to make a choice is go with whatever has the biggest inventory, the assumption being that whatever sells the best will probably be the best... or at least the most available, which is good enough for me.


As a not-so-absolute amateur: the key is to type "<parts site> <what you want>" into Google and then fiddle.

For example "8 vit shit register" gives first link to SN74HC595B, a generic shift register and second link to parametric search

https://eu.mouser.com/c/semiconductors/logic-ics/counter-shi...

with "logic type=shift register" and "number of bits=8" already selected.

You don't even need to spell it correctly

Googling smartly is

> I wish there was a curated list or an "easy mode" on Digi-Key.

exactly what you want.


You can ignore most of the specs that are irrelevant to your project on parametric searches.

For example, here's how you'd find a generic through hole LED. Search for "through hole LED", pick a color, stocking option: in stock, sort by price, and pick whatever seems right for your project, making sure that you're not ordering something silly like 2500 LEDs (it'll be pretty obvious when you check out anyway). If there's something else that's relevant to your project (eg. the size or shape of the LED) then you can specify that.
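
As a rough sketch of that kind of filtering: keep only in-stock parts that match the specs you care about, then sort by price. The `Part` fields and catalog rows below are invented for illustration and are not any distributor's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    mounting: str      # e.g. "through hole" or "surface mount"
    color: str
    in_stock: int      # units available
    unit_price: float


def parametric_search(parts, **required):
    """Keep in-stock parts whose attributes equal every given filter, cheapest first."""
    matches = [
        p for p in parts
        if p.in_stock > 0
        and all(getattr(p, key) == value for key, value in required.items())
    ]
    return sorted(matches, key=lambda p: p.unit_price)


# Hypothetical catalog rows, made up for this example.
catalog = [
    Part("LED-A", "through hole", "red", 1200, 0.12),
    Part("LED-B", "surface mount", "red", 5000, 0.05),
    Part("LED-C", "through hole", "red", 0, 0.08),
    Part("LED-D", "through hole", "green", 300, 0.10),
    Part("LED-E", "through hole", "red", 40, 0.09),
]

results = parametric_search(catalog, mounting="through hole", color="red")
```

Any attribute you leave out of the call is simply ignored, which is the "ignore the specs that are irrelevant" part.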

The datasheet is just a click or two away to do a sanity check that you're ordering something that makes sense.

Obviously, for a generic component like a LED where the specs might not matter too much and it's fine if you get one of dubious quality, you can just get a random grab bag of LEDs off AliExpress. Or if the catalog overwhelms you, there's always Adafruit, SparkFun, Pololu, etc.


For passives, I have a bunch of resistor/capacitor/inductor/etc kits for through hole and SMD. You can buy them on Amazon so they're usually available fast. You don't have to worry too much about the specs but definitely learn the basics like electrolytic vs solid state, voltage/current/power ratings, etc.

For ICs search Google for "<part description> breakout" which will often show you Sparkfun, Adafruit, or some other hacker focused shop/blog. For example, some results roughly ranked by obscurity: 8-bit shift register [1], thermocouple [2], air quality [3], CPLD [4], crosspoint switch [5].

For anything more complex than that - for example, a DLP micromirror device - you'll have to dig deeper and search for "<part description> dev kit". Manufacturers usually make dev kits using their flagship chips from which you can easily navigate related parts. Be prepared to spit your morning coffee out at some of the prices of this path though (i.e. aforementioned DLP dev kits are $1k+, easily twice what you'd pay for a DLP projector using the same chip)

I assume that's what your Amazon/Aliexpress searches have done by proxy (I often do use Amazon too for more basic stuff like relays, ESP8/32 etc. when I'm impatient)

[1] https://www.sparkfun.com/products/10680

[2] https://www.adafruit.com/product/269

[3] https://www.adafruit.com/product/3566

[4] http://dangerousprototypes.com/docs/XC9500XL_CPLD_breakout_b...

[5] https://hackaday.io/project/167228-adg2128-breakout


Thanks for the tips! That's pretty much exactly what my process looks like today if I want to figure out which variant is the most "popular." Most of the uncertainty is specifically about specialist sites like Digi-Key.

For example, the 74HC595 you linked is a super common shift register, but DK Canada has 88 listings for it. Some surface mount, some in different DIP configurations, and now I have to scour through and figure out the one variant that works for me.

Generalist sites like Amazon and AliExpress tend to mostly stock the variants that tinkerers/makers want when prototyping, which makes it easier for me to plop "74HC595" in the search bar and quickly get a sensible result, especially when sorting by amount sold.


Digi-Key needs a sort by “most popular”. You can often sort by number in stock to find a “normal” one.


Number in stock tends to be better than popular -- popular Digi-Key items tend to go out of stock.


>See also the parametric search available at McMaster Carr

About 25 years ago I built a series of calculators for a competing industrial supplier, where you could match up the bolt / screw / washer / nut needed for each size, metal vs wood application, etc. Still a very relevant need today that I've seldom seen replicated usefully on any site.


To provide a concrete example, TFA said that 'product' was a 'concept'. Another concept could be 'part'. Another concept could be 'package'.

(You buy a product such as a bookcase. You then go and pick up one or more packages which make up that product. When you get the packages home and open them up, each contains several parts.)

There are lots of different systems at IKEA, which work somewhat in parallel but sometimes have to interoperate. For example, point of sale, returns, warehouse management, showroom design. Some of them care about all of product/package/part, but many of them care more about one of those levels than the others. Point of sale just needs to know what products you bought, not what parts are in the packages. Warehouse really cares about what packages there are, but also wants to be aware of which sets of packages are products.

Suppose that one system doesn't make a distinction between product and package, or uses 'product' to mean either product or package. That system might work perfectly fine on its own, but when it has to interact with another system which cares a lot about that categorical distinction, things will go wrong. So you want to have one place where all those concepts are defined. Then when you create a new system, you can confirm that you are using the same categories with the same meanings as everyone else.
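
To make the product/package/part distinction concrete, here is a small sketch with one shared set of definitions and two "systems" that each read only the level they care about. All class names, function names and IDs are hypothetical, not IKEA's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str


@dataclass(frozen=True)
class Package:
    package_id: str
    parts: tuple[Part, ...]


@dataclass(frozen=True)
class Product:
    product_id: str
    packages: tuple[Package, ...]


def point_of_sale_line(product: Product) -> str:
    """Point of sale only needs to know which product was bought."""
    return product.product_id


def warehouse_pick_list(product: Product) -> list[str]:
    """The warehouse cares about which packages make up the product."""
    return [pkg.package_id for pkg in product.packages]


# An invented bookcase that ships as two boxes.
bookcase = Product(
    "bookcase",
    (
        Package("box-1", (Part("side panel"), Part("shelf"))),
        Package("box-2", (Part("back panel"),)),
    ),
)
```

Because both functions take the same `Product` type, neither system can quietly treat a package as a product.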


> programmers subconsciously hammers on about "naming things are hard" - whether they are aware of it or not, they are sensing that somehow the order and naming of things matter.

Never seen someone noble savage programmers before.


This (disclosure: I'm one of the tech leads on this project)


I'm consistently impressed with how IKEA does business, with the exception of their website. For example, search for "storage bin" then try to filter by the dimensions of the bin (for fitting on a bookshelf or in a cubby hole). You're given two filtering choices: Large and Medium. Those are not very helpful choices.

I'm sure their metadata is great compared to some groups, but in my case it's still insufficient.


Not only you. It's an identified problem the Knowledge Graph is attempting to solve.


I also noticed it might be considering a more European usage of the word "bin". There were lots of trash cans in the results. The metadata might need to be regionalized.


Indeed localisation is more than just labels, even whole rooms or activities can be treated differently in countries (sleeping just for starters...)


Agreed the article doesn't make it clear. I know companies like Netflix have built knowledge graphs of their products which they create embeddings of to feed to ML models which helps them make better recommendations. NASA also built one to ease search of safety information when building new things.

https://netflixtechblog.com/supporting-content-decision-make...

https://neo4j.com/blog/nasa-critical-data-knowledge-graph/


NASA has more than one Knowledge Graph.

https://www.stardog.com/blog/nasas-knowledge-graph/


Search, data normalisation, equivalencies, recommendation engines, supply chain analytics and reporting, etc.

Having a well defined ontology helps power lots of data products, and even build new ones.


Hi, I'm the author of the blog post. What problem we are solving at IKEA exactly is something I plan to write next about, once we get some numbers in from AB testing etc. We are not there yet to share concrete results.

KGs in the general e-commerce space solve search disambiguation and bridging the gap of industry language with customer language; common sense recommendation, especially when more and more customers do not opt-in to sharing their data; and embeddable info box content throughout the customers browsing experience.

KG is not a technology but a paradigm shift from an application-centric data modeling (classic data engineering) to a business/purpose/people-centric data modeling. It means that on top of the data layer, the KG enables to access the data via business concepts. It enables e.g. simple questions like which products are suitable for customers that live in small apartments and automatically solves the problem of combining different data sources to answer the question.


Turns out it's easier to reason with binary relationships so if you only use tables with 2 columns it's somewhat easier to infer things from your data.

Categories of products is a good example as they form a transitive relation, but not necessarily a strict hierarchy (any particular entity might be in multiple distinct categories).
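
A toy illustration, assuming a two-column table of (child, parent) rows with made-up category names: following the pairs transitively gives every category an item belongs to, even when it has more than one parent.

```python
from collections import defaultdict


def ancestors(pairs, start):
    """Return every category reachable from `start` by following (child, parent) pairs."""
    parents = defaultdict(set)
    for child, parent in pairs:
        parents[child].add(parent)

    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:   # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# A two-column "table"; the rows are invented for this example.
subcategory_of = [
    ("bookcase", "storage furniture"),
    ("bookcase", "living room"),
    ("storage furniture", "furniture"),
    ("living room", "rooms"),
]

bookcase_categories = ancestors(subcategory_of, "bookcase")
```

Here "bookcase" sits under two distinct parents, so the result is a graph walk rather than a single path up a tree.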


I've done contract work for IKEA.

When talking to their senior engineers, a common trope was:

> I wouldn't know, I've only worked here for 5 years.

The sheer volume of new hierarchies replacing old ones, teams being split into other teams, rebranding of departments ...

I doubt any one person understands how IKEA works, let alone enough to be a source for others.

Perhaps they are projecting onto their customers here.


wake up babe new computerphile video on knowledge graphs just dropped https://www.youtube.com/watch?v=PZBm7M0HGzw


I feel like IKEA likely operates in a pretty restricted problem space compared to a general knowledge graph like wikidata or what have you. I don't know that you can separate the "all knowledge" graph into 3 layers like this.


I don’t think they claimed it would work for everyone. Or indeed for anyone else. Seems like it's just saying "this is what we do."

(Edit: removed needlessly snarky wording.)


My main interest in a knowledge graph is the universal application of it, and I suspect that's true for much of HN audience. So of course it's natural when reading this article to wonder if their methodology applies, and I was answering my own implicit question.

Good on IKEA for solving the problem they have efficiently, but I can be interested in a larger scope.


Yes, you’re correct and that’s a reasonable interest. I also realize that I worded my comment in a needlessly obnoxious way.


> I don't know that you can separate the "all knowledge" graph into 3 layers like this.

The 3 layers mentioned are just the basic structure of ontologies. Concepts, e.g. “vehicle, person, disease, atom, bridge,…”, then categories of the concepts, and finally data, i.e. instances.

Larger knowledge graphs don’t grow by adding more “layers” they add more content to all three layers.


I'm not an expert in this field, but I had never heard of this as a standard ontological structure. I can't find any solid description of it either, are there any references you can share?

I thought that it gets complicated because you can have concepts of concepts, and you can have both categories of concepts and concepts of categories etc etc. It ends up being more than 3 layers, and you get layer violations and loops etc.


Maybe some day this will be available in JSON.

I built a system for TaskRabbit that scraped all the IKEA products from a variety of sources and ran algorithms to determine their category and predict how long they would take to be assembled. Then there was a Mechanical Turk sort of system for human input. When combined with real-world feedback from the Taskers, it was pretty good.

For better or worse, I've personally been through the entire catalog multiple times.


Maybe as Turtle or JSON-LD, but you need a format that encapsulates triples and various data types. Otherwise you're throwing away most of the utility of a semantic web knowledge base.


I like the idea of concepts, categories, and data but I had trouble seeing how it fits into the traditional knowledge graph structure of tuples that I am used to.


  (Product) -> (bookshelf) -> (billy white 80cm)

The above would be the graph representation of concepts, categories, and data. In RDF triple terms of subject-predicate-object it could be represented as:

  Product (s) has a (p) bookshelf (o)
  Billy white 80cm (s) is a (p) bookshelf (o)


It is actually (in pseudo RDF):

[BILLY bookcase in white 80x28x202 cm] rdf:type :Product . :Product rdf:type :Class .

[BILLY bookcase in white 80x28x202 cm] :categorisedAs :bookcase . :bookcase rdf:type :Category .

*[ ] = IRI of that particular product
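
For anyone who prefers code to pseudo RDF, the same four statements can be sketched as plain Python tuples with a tiny pattern matcher. The `ex:` identifiers and helper names are illustrative only; a real setup would use an RDF library and proper IRIs.

```python
# Each triple is (subject, predicate, object); all identifiers are illustrative.
BILLY = "ex:billy-bookcase-white-80x28x202"

triples = {
    (BILLY, "rdf:type", "ex:Product"),
    ("ex:Product", "rdf:type", "ex:Class"),
    (BILLY, "ex:categorisedAs", "ex:bookcase"),
    ("ex:bookcase", "rdf:type", "ex:Category"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return {
        (ts, tp, to)
        for ts, tp, to in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    }


def products_in_category(triples, category):
    """Subjects typed as a Product that are also categorised under `category`."""
    products = {s for s, _, _ in match(triples, p="rdf:type", o="ex:Product")}
    in_category = {s for s, _, _ in match(triples, p="ex:categorisedAs", o=category)}
    return products & in_category
```

Note that the product is an instance of the Product class, while the category is itself an instance (of Category) that the product merely points to.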


OWL2 does a good job of formalising these things. What they call concepts and categories would be Classes in OWL. What they call data would be Individuals.


The concepts are Classes in OWL and the categories are individuals of rdf:type Category.


Another analogy I can think of is food-related. The concept would be "bread". The categories would be "recipes for bread". The data would be "bread ingredients".


The only example of "Concept" in the article is "product", yet it talks to great length about it (Concept).

What are some other examples? I really struggle to understand the value of it, and find the lack of transparency here bizarre given how much they espouse this 3 tier system.


I'm not with IKEA but I've been part of other ontology exercises. Off the top of my head, they are a retailer with physical stores and distribution centers--which could both be categories within the concept "location". They have associates (people), and customers (maybe just "people" again, but maybe a different concept.) I could also imagine transactions or loyalty accounts as other concepts. Depending on how far they go, they could include things like "region" (i.e., region-> has many -> location) or general ledger accounts.


Concepts can go more into products and be actual home furnishings and interior design. IKEA is also a retailer so has things like mandatory store concepts that don't have a digital equivalent.

Product concepts are actually the low-hanging fruit here.


That is a wonderful article for several reasons. I especially like explaining the upper Ontology in terms for both W3C linked data and also for property graphs. I also especially liked the breakdown of data ownership.


Thank you, the intention was exactly to highlight managing and communicating data ownership and create a framework for data governance.


There should be a fourth layer in a fully fleshed out pyramid, and it was hinted at in the text, that comprises each individual copy of BILLY bookcase, white, 80cm, etc..

Maybe even a fifth layer could be necessary if the individual products are broken down by subcomponent.

In the software context this would be each individual copy of an OS with subcomponents being each individual copy of a driver, etc...


a little surprised they don't have a fourth layer containing the individual product variants; i.e., layer 3 has 'Billy Bookcase' while layer 4 has each product variant and code (white, 3-shelves, 003.389.132)


I'm really grateful to read this about Ikea's effort and Dave McComb's continued work. I just received email of the below training opportunity from Dave's company Semantic Arts. If you are interested in Gist, ontologies, knowledge graph, it may be of interest to you.

https://events.eventzilla.net/e/semantic-boot-camp-introduct...

I have no relationship with Semantic Arts other than they performed a contract for a company at which I worked.


Dave McComb is a huge influence in my thinking of Enterprise Knowledge Graphs.


I wonder if they're using something like WebProtege to help make authoring/maintenance easier.


This seems wildly pointless and pedantic


I think the point is it's a technical structure that supports a very large organization's unified datastore, despite the fact that ownership of the structure of that datastore is decentralized and the responsibility for populating that datastore is maybe even more decentralized.

Very large organizations need technical solutions to organizational problems, especially when there are many loosely coupled teams interacting in one way or another.


Every large company will grow a pointless and pedantic way to organise data. Might just as well make it explicit.


Conway’s Law would have you believe that this is because every large company will grow a pointless and pedantic org chart. :P

(Yeah yeah I know that was technically about software architecture but code is data soooo…)


To the contrary - it's anything but pointless.

It tends to become pedantic to combat entropy.


If anything, years in corporate taught me that something might be pedantic and pointless and still not combat entropy very well.


Oh, it doesn't necessarily work. But it's a natural response, and typically the alternative is even worse!


Prone to disruption.


So Ikea is written in Prolog?


How do you get that?



