Close your eyes and grab a nearby object, and it isn't difficult to figure out what it is. The information gained from touching, holding and picking up objects lets humans quickly infer what they are.
The same can’t be said for robots, which still struggle with manipulating physical objects. Their biggest problem is a lack of data – robot hands simply haven’t held and picked up as many objects or as frequently as the average person has.
"Humans can identify and handle objects well because we have tactile feedback. As we touch objects, we feel around and realize what they are. Robots don't have that rich feedback," explains robotics researcher Subramanian Sundaram, an MIT graduate.
"We've always wanted robots to do what humans can do, like doing the dishes or other chores. If you want robots to do these things, they must be able to manipulate objects really well," he adds.
In a paper published in Nature this week, Sundaram and MIT colleagues demonstrate how to give robots a helping hand by building up a huge dataset of object interactions using a $15 glove rig they call STAG (scalable tactile glove).
The knitted glove is equipped with 548 tiny sensors across nearly the entire hand. The glove is worn by a human who feels, lifts, holds, and drops a range of objects, and the sensors capture pressure signals as they do so.
The MIT researchers generated the dataset by picking up 26 everyday objects including a soda can, scissors, tennis ball, spoon, pen, and mug.
Using the tactile data alone, the system predicted the objects' identities with up to 76 per cent accuracy. It also estimated the weight of most objects to within about 60 grams.
Grasping the problem
The glove is wired up to a circuit board that translates the pressure data into ‘tactile maps’ – essentially brief videos of dots growing and shrinking across a graphic of a hand. The bigger the dot, the greater the pressure.
Using 135,000 video frames, a convolutional neural network (CNN) – commonly used to classify images – was trained to associate specific pressure patterns with specific objects and to predict their weights by feel alone, without any visual input.
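The article doesn't reproduce the network itself, but the basic idea – treat each tactile frame as a small image and feed it through convolution, pooling and a classification layer – can be sketched in plain NumPy. The frame size (32×32), the filter count and the random weights below are all illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, kernels):
    """Valid 2D convolution of an (H, W) pressure frame with (K, kh, kw) filters."""
    K, kh, kw = kernels.shape
    H, W = x.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return out

def max_pool(x, s=2):
    """Non-overlapping s-by-s max pooling over each (H, W) feature map."""
    K, H, W = x.shape
    return x[:, :H // s * s, :W // s * s].reshape(K, H // s, s, W // s, s).max(axis=(2, 4))

def forward(frame, kernels, weights):
    """Conv -> ReLU -> pool -> flatten -> linear score per object class."""
    h = np.maximum(conv2d(frame, kernels), 0.0)
    h = max_pool(h).ravel()
    return weights @ h

# Toy setup: one 32x32 pressure frame, 4 random filters, 26 object classes
# (matching the 26 everyday objects in the dataset).
frame = rng.random((32, 32))
kernels = rng.standard_normal((4, 3, 3))
weights = rng.standard_normal((26, 4 * 15 * 15))
scores = forward(frame, kernels, weights)
predicted = int(np.argmax(scores))
print(scores.shape)
```

With untrained random weights the prediction is meaningless, of course; the point is the shape of the pipeline, with training left to a standard deep-learning framework.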
The researchers wanted their CNN to mimic the way humans can hold an object in just a few different ways in order to recognise it. They designed it to choose the eight semi-random frames from the video that were most dissimilar – for example, a mug held by its rim, its bottom and its handle.
"We want to maximize the variation between the frames to give the best possible input to our network," says MIT researcher Petr Kellnhofer.
"All frames inside a single cluster should have a similar signature that represents the similar ways of grasping the object. Sampling from multiple clusters simulates a human interactively trying to find different grasps while exploring an object," he explains.
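The article doesn't spell out the sampling procedure, but Kellnhofer's description – cluster the frames by grasp signature, then draw from different clusters – can be sketched with an ordinary k-means pass. The cluster count of eight comes from the article; everything else here (flattened frames as features, nearest-to-centroid selection) is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means on the rows of X; returns cluster labels and centroids."""
    r = np.random.default_rng(seed)
    centroids = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    return labels, centroids

def sample_dissimilar_frames(frames, k=8):
    """Cluster flattened tactile frames, then keep the frame nearest each
    centroid -- one representative per distinct grasp signature."""
    X = frames.reshape(len(frames), -1).astype(float)
    labels, centroids = kmeans(X, k)
    picks = []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if len(idx):
            picks.append(idx[np.linalg.norm(X[idx] - centroids[c], axis=1).argmin()])
    return picks

# Toy recording: 200 frames of a 32x32 pressure grid.
frames = rng.random((200, 32, 32))
chosen = sample_dissimilar_frames(frames, k=8)
print(len(chosen))
```

Because each pick comes from a different cluster, the selected frames are maximally spread across the grasps observed in the recording, which is the variation Kellnhofer describes wanting to feed the network.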
The researchers also used the dataset to examine how regions of the hand interacted during object manipulations. For example, when someone uses the middle joint of their index finger, they rarely use their thumb. But the tips of the index and middle fingers always correspond to thumb usage.
"We quantifiably show, for the first time, that if I'm using one part of my hand, how likely I am to use another part of my hand," Sundaram says.
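The co-usage analysis Sundaram describes amounts to asking, across many frames, how often one region of the hand is active given that another is. With hypothetical region labels and a made-up pressure threshold, a conditional co-activation matrix can be computed like this (a sketch of the idea, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: per-frame mean pressure for a few hand regions (rows = frames).
regions = ["thumb_tip", "index_tip", "index_mid_joint", "middle_tip"]
pressure = rng.random((500, len(regions)))

# A region counts as "in use" when its pressure exceeds a threshold.
active = pressure > 0.5

# co_use[i, j] = P(region j active | region i active).
n = len(regions)
co_use = np.zeros((n, n))
for i in range(n):
    used_i = active[:, i]
    if used_i.any():
        co_use[i] = active[used_i].mean(axis=0)

print(np.round(co_use, 2))
```

On real glove recordings, a row of this matrix answers exactly Sundaram's question: given that one part of the hand is in use, how likely each other part is to be in use at the same time.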
It is hoped the work will help prosthetics manufacturers choose optimal spots for placing pressure sensors and make prosthetics better suited to interacting with everyday objects.
“Insights from the tactile signatures of the human grasp – through the lens of an artificial analogue of the natural mechanoreceptor network – can thus aid the future design of prosthetics, robot grasping tools and human–robot interactions,” Sundaram says.