Software and Hardware for General Robots (evjang.com)
42 points by ericjang on Nov 29, 2020 | 32 comments

In my robotics experience I've come to the conclusion that some problems just need to get back to the physics, and trying to put labels on things like "grasp" is fraught with problems that constrain things in unhelpful ways.

It's more useful to think in terms of what forces are needed to be applied to the environment to accomplish a task. Then you can build a solver to develop innovative physics based solutions that are unconstrained by the semantics.

Often defining a function that classifies whether a task is accomplished or not requires a fair bit of semantic precision as well...

... And going further, what it even means for a task to be 'accomplished'.

Certainly you cannot ever truly abstract away the physics, but also certainly you do not follow instructions, communicate to others, or introspect about how to e.g. cook dinner on the level only of forces.

Many tasks such as "cook an egg" have a well-defined physical meaning. If you're trying to:

"Move egg insides into oiled pan on heated surface, move egg shell into trash" it can be purely physics based.

You'll have to find the egg: it's inside the carton, inside the fridge, behind the milk.

The fridge door, the milk, and the carton are all blocking the desired trajectory of shell and interior.

From a physics perspective you can minimise the energy expenditure required to apply the forces and motions of all objects in the immediate environment to accomplish the objective.

Such as: The door needs to rotate about the hinge, not be ripped out of the way. The milk could be pushed gently to the side rather than removed and set on the counter to expose the carton.
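To make the energy argument concrete, here is a minimal sketch that compares two candidate plans for getting the milk out of the way by total mechanical work. The plan names, forces, and distances are made-up placeholders, and a real solver would have to account for far more than straight-line pushes.

```python
def work_joules(segments):
    """Total mechanical work for a plan given as (force_newtons, distance_metres) pairs."""
    return sum(force * distance for force, distance in segments)


# Hypothetical candidate plans; the numbers are illustrative only.
plans = {
    # Slide the milk a short distance along the shelf.
    "push_milk_aside": [(2.0, 0.15)],
    # Lift the milk, then carry it over to the counter.
    "lift_milk_to_counter": [(10.0, 0.30), (2.0, 0.60)],
}

# Pick the plan that needs the least work.
best = min(plans, key=lambda name: work_joules(plans[name]))
print(best)
```

Under these assumed numbers the gentle push wins, which matches the intuition in the comment above.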

We still haven't programmed proper gripping technique for an egg, but if the chef robot only has a single spatula for a manipulator, a physics based robot would just have to generate teppanyaki techniques and balance the egg on the spatula rather than fail.

At what point would you consider an egg "cooked"? How would you define this as a measurable physical quantity?

I would say that has a physics-based answer as well, one made more precise by a non-semantic description. Sliders for the amount of mixing and flipping, firmness, griddle temperature, testing for color change, etc. all tie directly into physics or sensors. Writing a "Sunny Side Up" function first, then an "Over Easy" function, and then a "Scrambled" function becomes much easier and more robust when those names map to settings of the physics model.
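One way to picture the "names map to settings" idea is the rough sketch below. The field names, preset values, and the `is_done` check are all assumptions for illustration, not tested recipes or anyone's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EggSettings:
    griddle_temp_c: float   # surface temperature of the griddle
    mix_amount: float       # 0.0 = yolk left intact, 1.0 = fully mixed
    flips: int              # number of times the egg is turned
    target_firmness: float  # 0.0 = runny, 1.0 = fully set (sensor-estimated)


# Hypothetical presets: each semantic name is just a point in the parameter space.
PRESETS = {
    "sunny_side_up": EggSettings(griddle_temp_c=120.0, mix_amount=0.0, flips=0, target_firmness=0.4),
    "over_easy": EggSettings(griddle_temp_c=130.0, mix_amount=0.0, flips=1, target_firmness=0.5),
    "scrambled": EggSettings(griddle_temp_c=110.0, mix_amount=1.0, flips=0, target_firmness=0.7),
}


def is_done(settings: EggSettings, measured_firmness: float) -> bool:
    """Treat the egg as done once the measured firmness reaches the target."""
    return measured_firmness >= settings.target_firmness
```

Whether a single firmness reading is a good enough definition of "cooked" is exactly the open question in this thread; the sketch only shows the mapping from names to parameters.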

You've asserted the existence of a physics-based answer, but you haven't said what the "physics-based" criterion for an egg being "cooked" is. I assert that a straightforward short definition does not exist.

A thoughtful article. It is becoming increasingly obvious that the integration of field/home robotics into our daily lives is going to require us to adapt to the technology, more so than it adapts to us. For example, a (new) house constructed with a robotic-assistant in mind might alleviate some of the issues mentioned in the text; a gantry robotic arm, for example, with full access to the entire house. Washers/dryers designed to be integrated with robotic manipulators; things of this nature.

I think that's probably the way to go if a business wants to tackle the problem of a useful home robot in the next decade. I totally agree with your point about home developers being forward-looking and designing homes with robots in mind.

That said, I don't relish the idea of my oven or toaster being internet-connected (https://tech.slashdot.org/story/20/11/25/1910244/aws-outage-...).

I also think it's good to think long term, beyond what is immediately feasible or commercially viable. Personally, as someone who enjoys the sci-fi movie Blade Runner, I would much prefer to see a robotic android being "more human than human" than to have to compromise in the design of homes.

It doesn't have to be a compromise in the design of the home. Almost nobody thinks of the inclusion of electrical wiring and sewer piping into modern houses as a compromise, even though they take up space and cost money. If there were a reasonable system for (say) a standardized robotic arm on a rail to cart itself around my house and do useful stuff, that might be really nice, and it could slowly grow as more appliances get adapted to the robot arm.

Humor me - what would you use such an arm for? What does the design look like? How many times a day would you use it? A lot of time in homemaking (a full time job for some people) is spent cleaning, unpacking things, and packing things away. How would such a system work?

If you'll indulge some science fiction dreams: I'd want something like the canadarm on the ISS but with ~4 segments of 50 cm each, a multipurpose gripper/vacuum/button presser attachment on one side and it'd be attached to some sort of rail system on the ceiling and/or walls.

Software wise it'd be both individually programmable and have access to some sort of thingiverse equivalent with actions that others have dreamed up. Otherwise it would not really need an internet connection at all, except perhaps to the local wifi to talk to other more-or-less automated systems in the same house.

For applications, some of the ideas I had while typing the rest of this comment:

- If there is a package outside the door, open the door, pull the package inside, close the door again.

- If the mail has been delivered (for houses that have a front door with a mail slot in them), use the vacuum attachment to pick up the mail and deliver it to a central location (desk? dinner table? kitchen?)

- If the dishwasher is done, use the universal gripper to pull it open, open the closet and put the dishes away. (Or into a drying rack, whatever).

- For most people I know the washing machine for dirty laundry is someplace away from the bedroom. I don't think it'd be reasonable with current tech to expect robot folding of clothing but at least it could pick up dirty laundry from various places where it's collected and bring it to the washing machine so people only have to come in and turn the machine on.

- Depending on how good the gripper is and how well adapted my coffee machine is to it, perhaps it could pick up a coffee capsule and prepare it while I'm in the shower.

- Once per day it would make a round around all the bathrooms and refill the spare toilet paper roll holder if there is only one roll left.

- Maybe it could wipe down sinks and stuff regularly too.

All in all, it'd be useful several times per day I think. Not doing anything I couldn't do myself but definitely taking care of various simple chores throughout the day. It would need pretty good sensor coverage as well of course. Finally, I realize that with current tech this would be prohibitively expensive.

Something like this? https://www.theburnin.com/technology/toyota-ceiling-mounted-...

Thanks for enumerating the use cases. They are fun to think about. I wanted to dive into the first application you mentioned. As mentioned in my post, even the most basic manipulation tasks on human-centric objects and spaces are actually full of little details.

I hope people think more on that low-level dextrous manipulation when designing robot hardware rather than the fairly high level "open door, pull package inside, close the door", which might make robotics people receptive to the idea that a humanoid is truly the only viable solution.

What if your door has a step down to the porch where the package is delivered? The robot can't "just" pull a large box inside, it would get stuck. You'd need to lift it up, and at this point you'd require two arms or a gantry that can extend outside of the home above the porch. Obviously the home can be re-designed around this, but my point is that there are really two kinds of robotics - ones that try to solve a human problem, and ones that try to do everything a human can.

My vision for a helper robot is much simpler. I want a magic wand (laser pointer) that I can use to cast spells (issue commands) on anything around me. If I tell the toy on the floor to put itself away (point laser, issue command 3), then it will put itself on the shelf where it belongs (the robot arm moves the object to the next empty buffer location on the shelf).

The example of opening a package of dates is actually a pretty complex task, even for humans. A child or even an adult with weaker fingers won't be able to do so.

There are plenty of kitchen tasks which are way easier for a robot, for example making a bowl of cereal. Which brings me to the second point. You say:

> The only viable hardware for a robot meant to do any task in human spaces is an adult-sized humanoid, with two-arms, two-legs, and five fingers on each hand.

but I think this is only true if the robot is constrained to using tools designed for humans. In the strawberry example, the knife is designed for multiple fingers, and the cutting board has no elevated border because it is meant to work well with human knives.

I think a properly designed "kitchen arm" (with maybe compliant/under-actuated fingers, a few sharp blades, a vacuum grip, etc.) as well as robot utensils (like uneven cutting boards that the food does not roll off from) would allow robots to do a large fraction of the kitchen tasks.

And maybe your bag of dates would be cut open instead of opened nicely -- but you'd still get to eat them.

Are you sure that making a bowl of cereal is that much easier? Are you sure that the act of making cereal doesn't involve some physical impediment that is trivial for humans but actually quite difficult for any non-humanoid robot morphology? I would encourage you to record a video of you making cereal with your hands or potentially even a "mocked robot end effector" that you control with your hands, and see if you can do the task from start to finish with that hardware.

I believe robotics hardware is as near as damn it there, and has been for a while now. We're already seeing semi-reliable robots such as 'smart' hoovers enter homes.

The biggest problem by far is the software, in particular the AI. The biggest companies in the world have thrown billions of dollars at AI, universities have had some of the most brilliant minds among their ranks, and we essentially got some (impressive) slightly better search algorithms.

There are absolutely fundamental questions (+) that need to be tackled in order to have the kind of generalized AI such environments require. Most AI (that I'm aware of) currently lacks the ability to do anything other than optimize itself for strictly specified scenarios/environments.

Personally I quite like the information theoretic approaches to self-motivated agents, there are some nice mechanisms out there such as empowerment [1]. It's not the full picture, but it's a step in the right direction. I don't think this is something we can throw larger neural networks and computation resources at and hope it solves itself.

(+) This is the subject of a paper I am currently writing.

[1] https://arxiv.org/abs/1310.1863
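For readers unfamiliar with empowerment: it is the channel capacity between an agent's action sequences and the state it ends up in. In a deterministic world this reduces to the log of the number of distinct states reachable in n steps. Below is a minimal sketch on a toy grid world; the grid, the action set, and the function names are illustrative assumptions and are not taken from the linked paper.

```python
import math
from itertools import product

# Hypothetical action set for a toy grid world: (dx, dy) per action.
ACTIONS = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
    "stay": (0, 0),
}


def step(state, action, width, height):
    """Deterministic transition: moves that would leave the grid keep the agent in place."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < width and 0 <= ny < height:
        return (nx, ny)
    return state


def empowerment(state, n, width, height):
    """n-step empowerment in bits: log2 of the number of distinct end states."""
    end_states = set()
    for sequence in product(ACTIONS, repeat=n):
        s = state
        for action in sequence:
            s = step(s, action, width, height)
        end_states.add(s)
    return math.log2(len(end_states))


# An agent in the open has more options than one stuck in a corner.
print(empowerment((2, 2), 1, 5, 5), empowerment((0, 0), 1, 5, 5))
```

The brute-force enumeration grows exponentially with n, so this only works for tiny toy problems; stochastic or continuous settings need an actual channel-capacity estimate.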

I'm not a roboticist, but can the ability to do general purpose manipulation be built up from a universe of known simpler manipulation tasks using something a bit like transfer learning? Is this used? Are there methods that don't need this?

Also, what would a good interface between a Software 1.0 program and a Software 2.0 program look like in robot software? I mean, what would the boundary between (3) and (4), and (4) and (5) look like in this imaginary stack?:

  (5) Autonomous controller (software 2.0)
  (4) A high level interface for giving instructions to (3), and finding out what (5) is doing
  (3) Motor manipulation controller (software 2.0)
  (2) A daemon for converting NN outputs into safe hardware control outputs (software 1.0)
  (1) OS kernel (software 1.0)
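As a rough illustration of what layer (2) in a stack like this might do, here is a minimal sketch of a safety filter that sits between a network's requested joint angle and the hardware. The `JointLimits` type, the `safe_command` function, and the numbers are hypothetical; a real daemon would also need velocity, torque, and collision checks.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class JointLimits:
    min_angle: float  # lowest allowed joint angle (radians)
    max_angle: float  # highest allowed joint angle (radians)
    max_step: float   # largest allowed change per control tick (radians)


def safe_command(current: float, requested: float, limits: JointLimits) -> float:
    """Turn a raw network output into a command the hardware may execute."""
    # Hold position if the network produced NaN or infinity.
    if not math.isfinite(requested):
        return current
    # Limit how far the joint may move in a single tick.
    delta = max(-limits.max_step, min(limits.max_step, requested - current))
    # Keep the resulting target inside the joint's range.
    return max(limits.min_angle, min(limits.max_angle, current + delta))


limits = JointLimits(min_angle=-1.0, max_angle=1.0, max_step=0.1)
print(safe_command(0.0, 5.0, limits))
```

The point of keeping this layer in plain "software 1.0" is that its behaviour can be read and tested exhaustively, whatever the learned layers above it decide to output.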

I’d love to work on the AI side of this. Are there any good arms I can buy for not too expensive and start trying out ideas?

Could I really have a pick_up_object() startup? There’s actually demand for that?

Regarding a startup, there is quite a demand in manufacturing. I hear investors will throw money at projects that promise such things, but there isn't much scope for "real" AI as such.

"Picking" in manufacturing is a well studied and basically well solved problem. It only becomes challenging when generic solutions are required: "pick up this bowling ball or this egg or this living chicken or this living plant with the same system, while maintaining a high speed and constrained motion envelope".

Probably more in logistics space, e.g. for retailer warehouses with many different products.

I would suggest picking up a small, cheap arm first. There are a few you can control via Arduino. Look at one with an MG958 15kg metal gear servo; with that you can already pick up some small objects.

Once you go to a large, heavy arm you need more powerful and precise motors, and the cost of mistakes increases dramatically (both in money and in physical damage).

The TinkerKit Braccio robotic arm seems a reasonable entry point.


Robotics founder here. IMHO the problem with notions about general robotics is that it doesn't exist: robots are very specific if they are any good, at least if you measure success as cost vs. effectiveness. This is unlikely to change greatly due to physics demanding specific and greatly disparate automation solutions for many tasks.

Came here to say this as well. I have a background in mechatronics, and when I think "robotics", I think of an assembly line with a number of specialized devices on the line. When I talk to other people about robotics, they think of a "humanoid robot that can cook and clean and move stuff for them". As far as I can imagine, a generalized robot would also need a generalized intelligence to perform general tasks, and to learn how to use the tools available to it to accomplish those.

Great article. What do you think of this paper? https://arxiv.org/abs/1909.10893

This type of research makes me optimistic about possibilities for generalization in the future.

Manual mode with onbody interface solves a lot of transitional problems.

See https://www.xprize.org/prizes/avatar

I actually registered a company named "General Biomimetics" in Delaware and have so much ambition and so many (unfortunately mostly vague) ideas about this. Specifically I have been thinking about washing dishes and other tasks in the kitchen. So it's been something I have been spending at least two days a week on (I have a job).

But due to the depth of the problem and not having resources or much knowledge, it was maybe a little silly to create the company. But I like the idea of reserving that name, just in case I ever get anywhere.

From the hardware side, I feel like some of the robotics issues can be resolved by "just" copying people more closely. For example, it seems like the way real arms and muscles work should provide more leverage and force than the typical servo setup. And having five fingers provides the potential that manipulations could be copied from people.

There is also a very promising new type of artificial muscles called HASEL.

Of course, in order to efficiently build these human-like limbs, "all" we need is a way to 3D print with several materials at once, including a new type of conductive ink that can handle high voltages for the HASEL muscles.

But the starting point to me is a robot that can actually understand what it's looking at. In that it sees with depth, and understands the composition of objects and their orientation, etc.

Capsule networks seem interesting but also maybe are a bit computationally expensive and unproven? Also he seems to be focused on just the transformation matrix, but it seems like there are more aspects of the state that could be relevant and maybe are unique to different object types. But I am slowly trying to understand capsules anyway.

I have seen a few ideas about more general neural network-based systems that suggest it is necessary to have multiple neural networks, or networks of networks, or neural modules, etc.

To me it seems like the ideal thing would be to have some standard shapes for networks or modules and also be able to reuse and adapt them for different tasks.

So my vague ideas now are something like: standard-shaped modules, trained on core modeling tasks such as finding 3d surfaces in 2d images. But at the same time somehow segmenting into different objects. And the potential high-level objects should be able to feed into the potential low-level understanding and vice versa.

My intuition is that ideally there is a sort of 3d wireframe overlaid on the 2d image, identifying each object and sub-object with its exact dimensions, shape and orientation. Kind of like I've seen in one or two science fiction movies. So if I can somehow generate all of that, I know I have properly decoded the image.

Today I was looking at a GAN tutorial. But I have never made a CNN before, so decided that must be first.

Usually I think about this stuff for a while and then just decide I don't really know what to do and then go back to Coursera. I finished Ng's first class and am looking at the hyperparameters one. I feel like I need to make some actual neural networks on my own though, because mainly Ng is teaching me how to convert from math notation to vectorized Python as much as anything.

If nothing else, this is really motivating me to learn about existing AI techniques. I feel like, to be a good programmer, I should actually be able to use things like TensorFlow for narrow AI tasks.

Awesome write-up. Sounds like you might be on the right track. I was thinking something similar myself. Or perhaps genetic programming would be a good fit for the problem.

Feel free to get in touch if you want to chat and trade ideas. (Email in profile)
