Building Rapport Between Human and Machine

RESEARCH | YAHOO! & CARNEGIE MELLON UNIVERSITY

 

THE CHALLENGE

Explore how to create a rapport-building AI system that can have socially-aware conversations with users.

ROLE

Conversation Designer, User Researcher, Psychology Researcher

FUNDING

$10M from Yahoo!/Verizon, over the course of 5 years. Additional support from Microsoft, Google, LivePerson, World Economic Forum, Drone Data, KETI, and SimCoach Games

THE OUTCOME

A socially-aware robot assistant (SARA) that recommends movies while using rapport-building conversational strategies. Data showed that users were more likely to accept a recommendation from SARA if she used social conversational strategies.

PUBLICATION

We also published a paper on our research findings.

AWARDS

Best Paper Award at the Human-Agent Interaction Conference in Kyoto, Japan, 2019

 

How do we build relationships with AI? 

How should AI talk to us?

How can AI build trust and foster healthy rapport with users to keep them coming back? 

All of these questions can be answered by directing our attention back to the fundamentals of psychology:

  • How do people converse? 

  • How do people connect with each other? 

  • How do we foster trust in relationships?

Our small team at Carnegie Mellon University was interested in investigating these questions as part of a $10M project funded by Yahoo! to explore the (at the time) nascent space of conversational AI technology.

As part of this exploration, we decided to build SARA, a Socially Aware Robot Assistant, that would recommend movies to users while using rapport-building conversational strategies. SARA was a mobile app experience with an avatar that covered the whole screen.

As the psychology researcher on the team, I was tasked with researching conversational strategies and recommendation systems to design a computational language model. 

RESEARCH PROCESS

 

CONVERSATIONAL STRATEGIES

My research revealed several patterns in conversation that I documented in great detail. Here are some of the conversational strategies I observed people using while discussing movies:

  • Explanation: This refers to any elaboration or description of a movie or movie-going experience. It can also include a recommendation.

    • Movie Descriptions: Any explanation about the movie itself.

      • Public Opinion or Review

      • Rankings/Awards

      • Actor

      • Genre

      • Plot

      • Other

    • Personal Experience: This describes someone’s movie-going experience.

      • Logistics

      • Anecdotes

      • Comparisons

      • Philosophical/Analytical

    • Personal Opinion: This describes someone’s opinion about the movie.

      • Positive

      • Negative

      • Neutral

      • Structured Opinion

    • Initiation: When someone initiates or begins a conversation about a movie.

    • Third-Party Experience: When participants reference a person who is not present but who they’re familiar with (kids, spouse, friends, etc.) and their opinions on a movie.

  • Explanation Timing: This refers to when the explanation about the movie was given.

    • Same Turn: The participant uses an explanation and mentions the movie title in the same utterance without a cue.

    • When Cued: There are several cues which signal to a person to explain more about the movie.

    • Neither: This is for an explanation that is neither cued nor given on the same turn.

  • Reaction: This refers to the different reactions people have in response to the other person’s explanations or movie recommendations.

    • Follow-up Questions: Any question about movies that is not an initiation.

    • Cue: A social cue to keep talking about the same movie.

    • Agreement: Agreement on opinions about a movie.

      • Yes

      • No

    • Loaded Response: Rhetorical questions that don’t need an answer but serve as transitions to one’s opinion.

We used these strategies to code transcripts of users talking over the phone about movies. That data was then used to train our language models so SARA could have a rapport-building conversation with users. 

Here’s the full coding manual I created for this process. It includes further descriptions of each code and our full coding process.
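To give a concrete sense of what coded transcript data like this can look like, here is a minimal sketch in Python. The top-level labels mirror the scheme above, but the data structure, example utterances, and export format are illustrative assumptions rather than our actual annotation tooling.

```python
from dataclasses import dataclass

@dataclass
class CodedUtterance:
    """One transcript turn annotated with codes from the coding manual."""
    speaker: str    # which speaker in the phone transcript
    text: str       # the raw utterance
    strategy: str   # top-level code: explanation, explanation_timing, or reaction
    sub_code: str   # finer-grained code, e.g. personal_opinion_positive

# A hypothetical coded exchange about movies (invented for illustration,
# not drawn from our actual transcripts).
coded_transcript = [
    CodedUtterance("A", "Seen anything good lately?",
                   "explanation", "initiation"),
    CodedUtterance("B", "I loved Knives Out; the plot twists kept me guessing.",
                   "explanation", "personal_opinion_positive"),
    CodedUtterance("A", "Oh really? Tell me more!",
                   "reaction", "cue"),
    CodedUtterance("B", "My sister thought it was overrated, though.",
                   "explanation", "third_party_experience"),
]

# Coded turns can be exported as (text, label) pairs, a convenient format
# for training a strategy classifier or conditioning a dialogue model.
training_pairs = [(u.text, f"{u.strategy}/{u.sub_code}") for u in coded_transcript]
for text, label in training_pairs:
    print(f"{label:40} {text}")
```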

 

USER RESEARCH

Once SARA was trained and ready, we piloted her conversations with a convenience sample of students in the building to gather feedback and iterate.

After this, we used Mechanical Turk to gather data from a larger set of users interacting with SARA.

Below is an example conversation between SARA and a user. 

 

Textual transcription of a conversation between a user (in roman type) and the version of our agent (in italics) that implements our model of social explanations. Notice that SARA uses personal opinions, experiences, compliments, and hedging.
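To make the idea of layering a social strategy onto a task recommendation more tangible, here is a small, purely illustrative sketch. The template function and phrasings are assumptions for demonstration; SARA's actual responses were produced by the trained language models described above, not hand-written templates.

```python
import random
from typing import Optional

# Hypothetical templates that pair a movie recommendation with one of the
# social strategies visible in the example conversation. Illustrative only.
SOCIAL_TEMPLATES = {
    "personal_opinion": "I really enjoyed {title}; I think you would too.",
    "personal_experience": "I watched {title} last weekend and I'm still thinking about it.",
    "compliment": "You clearly know your {genre} films, so {title} feels like a natural pick.",
    "hedging": "I might be wrong, but {title} could be just what you're looking for.",
}

def recommend(title: str, genre: str, strategy: Optional[str] = None) -> str:
    """Wrap a bare movie recommendation in one of the social strategies above."""
    strategy = strategy or random.choice(list(SOCIAL_TEMPLATES))
    return SOCIAL_TEMPLATES[strategy].format(title=title, genre=genre)

print(recommend("Spirited Away", "animated", strategy="hedging"))
```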

CONCLUSION

Our results showed that users were more likely to accept a movie recommendation from SARA when she used social conversational strategies.

PUBLICATION

We later published a paper with our findings. It won the Best Paper Award at the Human-Agent Interaction Conference in Kyoto, Japan, 2019. You can read it here.

Our research paper has been cited 20 times and downloaded more than 700 times. It shows that voice assistants can provide recommendations in a socially-aware way by employing the right rapport-building conversational strategies.