I like to introduce my research in terms of what it enables a robot to do, and I usually begin with the example of a robot helping a motor-impaired person. Consider the figure above, where a robot assists a motor-impaired person with daily tasks. The robot needs to retrieve an orange juice bottle from the back of a cluttered fridge. Obstacles at the front of the fridge prevent the robot from directly reaching and grasping it. The robot therefore needs to find a plan to manipulate those obstacles, perhaps by safely pushing them aside, to create the space required to reach and grasp the bottle.
At first the problem sounds trivial: humans solve such problems on a daily basis, one could argue without even conscious thought. But when you analyse the problem in depth, and especially when you attempt to tackle it by building a robot, you run into multiple challenges. I will outline these challenges next.
Perception and Computer Vision
Starting with the obvious problem of perception: a robot requires the capability to sense the environment, build a model of the world to work with, and identify objects of interest, obstacles, and their positions in space. Although deep learning and computer vision have made a huge leap in recent years, the problem of pose estimation (the capability of a program to estimate the position and orientation of bodies in 3D space) remains unsolved in cluttered and occluded environments. In the real world (e.g., the fridge in our example), objects can be occluded and in close proximity to each other. State-of-the-art systems get confused, and the error can be on the order of centimetres.
Motion planning

A robot also needs the capability to plan its motion. The term motion planning refers to the problem of finding a solution that drives a robot from an initial state to a goal state. For a mobile robot, that could mean moving from one position to another (say, from the kitchen to the living room). If you own a vacuum robot that maps its environment and navigates it, it likely runs motion planning algorithms every time it operates. For a robotic arm, the more specific term is manipulation planning: finding the sequence of controls that moves the arm from its current configuration to achieve a certain manipulation goal (say, reaching into the fridge to grasp an object). A robotic arm is usually a much higher degrees-of-freedom (DoF) system, which makes planning much harder and computationally more expensive.
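The core idea, searching for a sequence of states that connects a start state to a goal state, can be sketched in a toy setting. Everything below (the grid map, the breadth-first search, the `plan_path` name) is my illustrative assumption, not any particular robot's planner; real systems plan in continuous, high-dimensional spaces.

```python
from collections import deque

def plan_path(grid, start, goal):
    """Breadth-first search over a 2D occupancy grid.

    grid: list of strings where '#' marks an obstacle cell.
    start, goal: (row, col) tuples.
    Returns the shortest list of cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}          # also serves as the visited set
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []                  # walk parent links back to start
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None                        # goal is unreachable

# Toy map: a wall with a single gap separates the two "rooms".
world = ["....#....",
         "....#....",
         "....#....",
         ".........",
         "....#...."]
path = plan_path(world, (0, 0), (0, 8))
```

On a grid this is trivial; the difficulty the rest of this section describes comes from the fact that an arm's state space is continuous and high-dimensional, so it cannot be enumerated cell by cell.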
Although motion planning has seen great advances in capability, thanks to sampling-based planning (e.g., RRT and PRM), trajectory optimisation, and deep learning, this progress has mainly been limited to collision-free motion: robot arms moving through space without touching anything. In our problem, the collision-free constraint is not always a hard one. Sometimes we do want to avoid collisions (e.g., we don’t want to hit the fridge, other humans, or the walls), but in other cases we explicitly want our motions not to be collision-free (when we manipulate objects). A hard collision-free constraint would not permit a robot arm to push obstacles out of the way, for example.
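To show what sampling-based planning looks like, here is a minimal RRT-style sketch for a 2D point robot among circular obstacles (the workspace bounds, step size, and goal bias are illustrative assumptions, not a reference implementation). Note how the `collides` check is a hard constraint: any sampled state in contact is simply discarded, which is exactly what rules out pushing.

```python
import math
import random

def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
        goal_bias=0.1, iters=5000, seed=0):
    """Minimal RRT for a 2D point robot among circular obstacles.

    obstacles: list of (cx, cy, radius) circles, treated as hard
    collision constraints -- the assumption that breaks down once
    we want the robot to push objects rather than avoid them.
    Returns a list of waypoints from start to near the goal, or None.
    """
    rng = random.Random(seed)
    lo, hi = bounds

    def collides(p):
        return any(math.dist(p, (cx, cy)) <= r for cx, cy, r in obstacles)

    nodes = [start]
    parent = {start: None}
    for _ in range(iters):
        # Sample a random state, occasionally biased toward the goal.
        sample = goal if rng.random() < goal_bias else \
            (rng.uniform(lo, hi), rng.uniform(lo, hi))
        near = min(nodes, key=lambda n: math.dist(n, sample))
        d = math.dist(near, sample)
        if d == 0:
            continue
        # Steer a fixed step from the nearest tree node toward the sample.
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if collides(new):
            continue                   # geometric planning: reject contact
        nodes.append(new)
        parent[new] = near
        if math.dist(new, goal) < step:
            path = []                  # goal reached: extract the branch
            while new is not None:
                path.append(new)
                new = parent[new]
            return path[::-1]
    return None
```

The same structure scales to an arm by sampling joint configurations instead of 2D points; the collision check then runs against the arm's full geometry.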
The main challenge of this motion planning problem is its high dimensionality, which makes it infeasible for an algorithm to brute-force. We therefore need to employ advanced methods and algorithms to solve it efficiently. That’s where the terms “Artificial Intelligence”, “Machine Learning” and “Deep Learning”, to name a few, kick in: we need to develop intelligent systems to tackle this challenge.
Physics-based motion planning

Earlier, I described collision-free motion planning; I refer to this as geometric motion planning, because the algorithm plans collision-free motion by running a collision checking algorithm against the geometry of the world (the robot and the objects). Physics-based motion planning refers to algorithms that reason about the problem beyond geometry. Here, we acknowledge that the world is governed by the laws of physics and that dynamics are part of the problem: we want the robot to reason about the consequences of its actions when interacting with obstacles.
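To make the contrast concrete, here is a toy sketch of the physics-based idea. This 1-D push model is entirely my own illustrative assumption (not a real physics engine or any published algorithm): instead of rejecting an action that makes contact, the planner forward-simulates the push and accepts it only if its consequences are safe, e.g., no obstacle gets shoved off the shelf.

```python
def simulate_push(positions, idx, push, width=1.0, shelf=(0.0, 6.0)):
    """Toy 1-D push dynamics along a fridge shelf.

    positions: left edges of initially non-overlapping objects, each
    `width` wide. Pushing an object into a neighbour carries the
    neighbour along (a crude stand-in for a physics simulator).
    Returns the new positions, or None if any object would end up
    off the shelf -- a consequence a purely geometric planner
    never reasons about.
    """
    new = list(positions)
    new[idx] += push
    order = sorted(range(len(positions)), key=positions.__getitem__)
    if push > 0:                       # propagate contact to the right
        for left, right in zip(order, order[1:]):
            if new[right] < new[left] + width:
                new[right] = new[left] + width
    else:                              # propagate contact to the left
        rev = order[::-1]
        for right, left in zip(rev, rev[1:]):
            if new[left] > new[right] - width:
                new[left] = new[right] - width
    lo, hi = shelf
    if min(new) < lo or max(new) + width > hi:
        return None                    # bad consequence: object falls off
    return new

# A safe push carries the contacted obstacle along ...
ok = simulate_push([0.0, 2.0, 3.5], 1, 1.0)
# ... while a larger push would shove the last obstacle off the shelf,
# so simulate_push returns None and the planner must reject the action.
bad = simulate_push([0.0, 2.0, 3.5], 1, 2.5)
```

A real physics-based planner runs this kind of forward simulation with a full rigid-body engine at every candidate action, which is exactly where the cost and accuracy challenges discussed next come from.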
Implementing such physics-aware, generic algorithms requires the use of a physics simulator, which brings extra challenges. Two of these have been my focus so far: (1) physics simulation is expensive to run1, and can slow us down when we search for a motion plan, and (2) physics simulation is an approximation of the dynamics of the real world. The latter is a challenge because if we want robots that operate outside simulation, they need to succeed in the real world; since simulation only approximates real-world dynamics, it can produce solutions that succeed in simulation but fail miserably in reality.
Open world manipulation
Given a fixed environment, with a fixed number of objects, known geometry, no unexpected actors (e.g., humans walking in), and a known sequence of actions, one can build a pretty successful system that completes manipulation tasks most of the time. However, we rarely have such environments. Such predictable environments exist mainly in manufacturing plants, where robots that have no “intelligence”2 precisely execute the same motion over and over again; they are not safe to work or collaborate with humans, and they are not robust to environment changes (they can fail if a change is introduced into their process).
Most of the environments we want robots to work in are dynamic, with an unknown number of objects, unknown geometry, a mix of rigid and deformable objects, and humans unexpectedly working alongside the robots. Building a system that works in an open world, in collaboration with humans, is an extra challenge. What does each fridge look like? How are objects expected to be placed in every fridge? Obviously this is the least of our problems at the moment, but it is important to keep the end goal in mind. Indeed, most researchers in manipulation assume at least some of these parameters are fixed, and they focus on one problem at a time.
The general focus of my research is on the above challenges: perception, motion planning, physics-based manipulation planning, and motion control in an open world. I am interested in building algorithms that enable robots to solve real-world problems and amplify what humans can do.
1. You could write your system dynamics by “hand”, for a specific task, and escape this challenge to a certain extent, but if you aim to develop generic motion planning algorithms for an open world, a generic physics simulator is appealing. ↩︎
2. The word “intelligence” in this context can mean many things. I use it here to refer to robotic systems that can sense and adapt to their environment to successfully complete a task. ↩︎