Interesting post and great reference to [1] about why laundry hits a sweet spot of capability.
Interestingly, the repeated critiques in the article are about sensor richness - primarily force feedback and tactility - which points to a hardware gap. Software-only robotics has a long and fraught history, but it really feels to me that current industrial hardware could be driven more intelligently without much change. No doubt the "ideal" robot for any given task requires developments in both.
I'm also curious about safety, since generally capable mechanisms need a multilayered safety stack that includes semantics, and cobot certification is likely no longer enough. Examples: feeding someone the wrong pill, pouring a glass of water into electronics, cutting vegetables vs. fingers.
“””
2035: AI is 10,000 times smarter than the smartest human. It composes "A Brief, Exhaustive and Completely Correct History of Robotics" which is much funnier than this one.
“””
I had a dickens of a time with the ending. Having it end at the present seemed super abrupt (as it really feels like we are in the middle of a big shift), but I didn't really want to venture into my own predictions. One of my early readers suggested taking prominent AI CEOs' and VCs' predictions about the near future and treating them seriously, as if they were inevitable fact, which I found very funny. And really this is all about amusing myself.
Two of the authors of the original Aurora system left Microsoft to found https://silurian.ai/ - interesting to keep tabs on if you're interested in this space!
Thanks, Benjie! Great to see you here. I hope it's OK if I plug your excellent writings on robotics that I think everyone should check out: https://generalrobots.substack.com/
Great questions! A few answers to the various parts, most of which I think you've already identified yourself:
- Golf clubs specifically (and most sporting equipment) actually go down a different pipeline in most airports, since this stuff generally doesn't behave well on conveyor belts.
- Data-driven approaches can tell you a lot from visual information alone, usually about the deformability of objects, but also about expected centers of mass, etc.
- Part of the reason we're using single big robots is that you can use heavy-duty end-effectors: grasp all the way around the object with a grasp that's predicted to be robust to these kinds of perturbations, and then use quick feedback to safely execute a plan to place it.
You're absolutely correct, though, that there's a long tail of things coming through and that some objects are very, very difficult. Our problem formulation then becomes estimating confidence in graspability, and deciding explicitly that we shouldn't attempt to grasp some objects and should instead flag them for human handling.
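A toy sketch of what that confidence gating looks like (the function name and threshold are made up for illustration, not our actual pipeline):

```python
# Hypothetical sketch: gate grasp attempts on a predicted confidence score,
# routing low-confidence long-tail items to a human-handling queue instead
# of risking a failed (or damaging) grasp.

GRASP_CONFIDENCE_THRESHOLD = 0.85  # assumed tuning parameter

def route_item(grasp_confidence: float) -> str:
    """Decide whether the robot attempts a grasp or flags for a human."""
    if grasp_confidence >= GRASP_CONFIDENCE_THRESHOLD:
        return "robot_grasp"
    return "flag_for_human"

print(route_item(0.95))  # high confidence: robot handles it
print(route_item(0.40))  # long-tail item: escalate to human
```

The interesting part in practice is calibrating that confidence score, not the gate itself.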
Not a pest at all, and I've long been frustrated with ROS - our early demos were actually just a single C++ binary with multiple threads running perception, control, and visualization, and I byte-packed robot control messages in our own software to avoid using ROS.
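For the curious, "byte-packing" here just means a fixed wire layout both ends agree on. Our version was C++, but a Python equivalent of the idea looks like this (the field layout below is invented for illustration):

```python
import struct

# Fixed little-endian layout shared by sender and receiver, no middleware:
# uint32 sequence number, uint64 timestamp (ns), six float64 joint targets.
FMT = "<IQ6d"

def pack_command(seq: int, t_ns: int, joints: list[float]) -> bytes:
    return struct.pack(FMT, seq, t_ns, *joints)

def unpack_command(buf: bytes):
    seq, t_ns, *joints = struct.unpack(FMT, buf)
    return seq, t_ns, joints

msg = pack_command(42, 1_700_000_000_000, [0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
seq, t_ns, joints = unpack_command(msg)
print(seq, len(msg))  # sequence 42, fixed 60-byte message
```

The upside is zero dependencies and predictable latency; the downside is you own versioning and schema evolution yourself, which is exactly where this approach starts to hurt.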
Unfortunately this breaks down in a few ways that you're probably familiar with, given that you asked this question:
- A crash in (e.g.) a third-party sensor driver can bring down your whole binary; any signal catching here is awkward, and you end up wanting process isolation.
- Perception is, for better or worse, easiest to prototype and try off the shelf in Python/PyTorch, so you either end up with pybind11 and driving things from Python, ONNX (which is IME brittle for some of the crazier PyTorch modules), or message serialization and process isolation.
ROS/ROS2 does _way_ too much in my opinion - why does it have a build system and a ton of packages? This, plus pinning OS versions, is a huge pain point. Unfortunately, I also think many community-contributed ROS/ROS2 packages are fairly low code quality, with notable exceptions. Overall, I'd prefer ROS to be a pubsub library with a few extra tools for logging and visualization.
In the end, we're currently using ROS2 for the reasons listed above and for easy prototyping, but I'd like to move to something more like FPrime, Basis, Cerulion, or Copper in the near future. I'd really like to grow something in-house on Zenoh or iceoryx2, but I don't want to waste a lot of time on middleware, since I don't think it's what's kept the problem from being solved.
(At the end of this post I now see you're working on Basis, I apologize that I'm over-explaining to you. I'd love to chat about Basis if you have some time in the next few days!)
I agree. :) We don't specify an OS version, we allow arbitrary process boundaries, we use plain old CMake. Our perception story isn't great, yet. We have some tooling which could make certain GPU workflows easier, but have only prototyped with ONNX w/C++.
Would love to be able to provide your middleware - we've connected on LinkedIn; let's chat.
There have been some solid attempts in this space before - many projects take on the whole baggage-system design and end up very, very complex and often over budget. We're focusing on introducing tech that plays well in a larger system, particularly in "brownfield" existing processes. Our bet is that recent advances in robot autonomy let us handle items that weren't possible before, so our units can be introduced in a more flexible way.
John B was obviously aware from previous experience what a manual, injury-prone process this was, but I've also been really surprised as I've dug deeper into airport operations myself.
Bagroom is definitely what we're targeting first - being indoors (usually) is a huge plus, and lets us focus on the manipulation part of the problem without going fully mobile yet.
That said, we're definitely targeting tarmac/ramp operations, particularly between a TUG/PowerStow and narrow-body bag carts. Inside the bin is much trickier, but we agree it's the least ergonomic part of the job - you just can't move a massive industrial arm in and out of a plane very easily. We have it on our longer-term roadmap, though, and intend to leverage the baggage dynamics data we collect everywhere else to give us a head start on the packing and manipulation problems there, just with a different mechanism.
Cargo packing is a huge area of interest for us! Particularly around optimizing weight distribution in loaded planes, or just optimizing packing efficiency in general.
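As a toy illustration of the weight-distribution angle (real loading has center-of-gravity envelopes, container constraints, and sequencing; this greedy sketch ignores all of that and just balances two zones):

```python
# Hypothetical sketch: greedily assign bags, heaviest first, to whichever of
# two zones is currently lighter, keeping the load roughly balanced.

def balance(weights: list[float]) -> tuple[list[float], list[float]]:
    fwd, aft = [], []
    for w in sorted(weights, reverse=True):
        (fwd if sum(fwd) <= sum(aft) else aft).append(w)
    return fwd, aft

fwd, aft = balance([23.0, 18.5, 30.0, 12.0, 25.5])
print(sum(fwd), sum(aft))  # roughly equal zone totals
```

Greedy "longest processing time first" scheduling like this is a classic heuristic for load balancing; real cargo optimization is a much richer constrained problem.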
We're really focused on the health and safety aspects of this job - in a repetitive-stress sense, these jobs are much more dangerous than many people imagine, and people end up with lifelong injuries.
Generally, regulators seem to be moving in this direction as well. The EU has introduced new regulations on the total amount of weight someone can move in a shift, and the Dutch government has mandated that baggage handling move away from manual processes like this in the near future.
Despite the focus difference, do you think it's unlikely that automating baggage handling will replace baggage handlers' jobs?
The regulator focus seems like it'd reduce the maximum allowed weight of a checked bag, not automate the baggage handler handling the checked bag. So I don't see the similarity between the regulatory push and your product. Edit - to clarify: beyond what Dutch regulators say about Dutch markets, which are a very small subset of "regulator focus" internationally.
Suction has gotten us pretty far at the prototype level but definitely isn't enough - we're testing out some new gripper designs that use suction as one part of a broader overall grasping system.
For these videos we have lidars and two Intel RealSense depth cameras mounted on the safety cage and on a wall. We're working on moving as many sensors on-robot as possible in the near future to aid deployability.
[1] https://substack.com/redirect/82d94852-76b6-4b0d-8595-86e46a...