In depth: Siri, how do I feel?
23 June, 2012 11:05
Microsoft is working on technology that will spy on you in your own home, watching your body language and face and listening to your voice for cues about your mood and emotional state.
But would you want to install such a device in your home?
Maybe you already have.
Microsoft this month filed a patent application for a method of "Targeting Advertisements Based on Emotion."
The idea is that users would install Kinect for the fun and games. But when they're not playing, Kinect will continue to watch everything they do. It will know when they're laughing and crying, slumping or beaming.
Microsoft wants to combine this data with information collected as people conduct searches with Bing and surf the Web with Internet Explorer. Using that data, the system will build an emotional profile of a user that will enable it to deliver ads "with the highest monetization values to the users that are emotionally compatible," according to the patent application.
Microsoft's plans are merely at the patent application stage. Other major companies are much closer to implementing emotion-sensing technology.
Facebook has acquired the face-recognition startup Face.com, a company whose software can scan a photograph and identify who's in it, based on user tagging.
While much of the press coverage of this acquisition has focused on the potential threat to privacy posed by facial recognition tools, a lesser-known feature of Face.com technology is the ability to detect emotion -- with stunning accuracy.
Face.com even has a Web page where you can give this technology a try. Upload a photo, and it will tell you the mood of the person in the picture. It will also identify the individual's gender and report other details. It's designed for developers, but anyone can try it. (To use it, just click on the "upload photo" link in the left nav bar and choose a picture from your computer. Or, click "Use URLs" and paste in the URL of a picture online. Then click the "Call Method" button, and hover your mouse pointer over the picture for analysis.)
Google has been offering good facial recognition tools on Google+ since last year. The company has not announced a tool for detecting emotion. But algorithm-based data-crunching of the type that emotion detection requires is Google's core competency. The company is probably working on it.
Google theoretically has a lot more access than Facebook to people's faces in real-time because of its Hangouts video service.
The company already demonstrates the ability to recognize the basic location of people's faces in Hangouts. A goofy feature called Google Effects lets users add cartoonish caricature glasses, hats, beards and other "enhancements" to their own faces while using Hangouts. The technology makes it clear that Google can recognize faces, understand their orientation and features, and do it in real time at scale.
Google+ doesn't have advertising. But if the company chose to, it could easily add "mood" as a "signal" for serving up contextual ads.
Transmitting your facial expressions but not your face
Both Facebook's integrated Skype service and Google Hangouts, as well as other services like the new Chatroulette-like Airtime, seek to persuade people to put their faces on video, where computers will be able to identify them and detect their moods.
The problem is that many people (probably most people) are uncomfortable about broadcasting images of themselves in live video chats -- but they do want to interact with people online.
There are many reasons why people don't like to do video chats. They may be shy. They may want to protect their privacy. They may be afraid of being recorded doing something embarrassing.
That's why it's likely that live-capture avatars will prove very popular on future social networking sites.
To create a live-capture avatar, a user first chooses an avatar or character. This might be a 3D cartoonish version of the user. Or it could be a picture of a bear or a celebrity or an image of a cartoon character, like Spider-Man or Bugs Bunny.
A standard webcam captures the user's movements, facial expressions and the orientation of his or her head, neck and torso. But instead of simply broadcasting video of the person, the system uses that information to animate the chosen avatar image in real time.
So as the person talks, for example, the avatar's mouth moves in concert with the user's mouth, and the avatar appears to be saying what the person is saying. The avatar smiles when the user smiles, frowns when the user frowns and shrugs when the user shrugs.
I believe this kind of live-capture avatar chat will prove very popular, because it will offer people a rich interactive experience online without exposing actual images of users or even identifying them.
The cues needed to animate the avatar can just as easily be read as moods or emotions.
Live-capture avatar services will probably generate revenue from contextual advertising.
Call someone who cares
Analyzing still images and video is just one way for computers to detect emotion. Monitoring activity is another.
The idea is that people being treated for depression will have the technology running on their phones. As they go about their lives, the app will always be paying attention. When the app detects that a user is severely depressed -- say, because he or she stayed in bed all day or hasn't been physically active -- it would automatically contact a healthcare professional working with the patient.
The technology is also being tested for use with average, healthy users. Ideally, it could provide a lot of information about whether someone is happy or sad, depressed or anxious -- in short, it could constantly detect mood and emotion.
Researchers at Samsung are using entirely different data to detect the emotions of smartphone users.
Samsung's method is to monitor how the user interacts with the phone itself -- how fast the user types, how much the backspace button is used and even how much the phone is shaking during use -- to figure out if the user is happy, angry, fearful, sad or disgusted.
In fact, there are many ways to use the sensors in smartphones to detect people's moods or emotions.
A virtual assistant that's virtually human
Each of these technologies represents part of a larger trend toward computers that detect our emotions. Our computers will do it. Our phones will do it. We'll be monitored at all times for how we feel.
The payoff could be something that feels like empathy from the machines in our lives.
For example, the Siri virtual assistant from Apple or the Google version, Google Assistant (which may be announced next week), will become increasingly "human" by developing the ability to detect emotion.
Real human assistants interact with you differently if you're in a good mood or a bad one. They lend an ear when you want to vent, and share your happiness when you get good news. Similarly, virtual assistants will be improved to the point where they can emulate empathy.
There's no question in my mind that the evolution of Apple's Siri and Google Assistant will include mood and emotion detection capabilities.
Mood and emotion detection is a new frontier in human-machine interaction. It will make our computers and phones both more capable and more "human."
It will also help advertisers target us more effectively.
How do you feel about this? If you're not sure, don't worry. Soon, your phone will be able to tell you.
Mike Elgan writes about technology and tech culture. You can contact Mike and learn more about him at Elgan.com, or subscribe to his free email newsletter, Mike's List. You can also see more articles by Mike Elgan on Computerworld.com.