Why PMs Need Qualitative Research

What A/B testing won’t be able to tell you, but talking to your customers will

Jens-Fabian Goetzmann

26 May 2017 ‧ 6 min read

I was recently at a panel discussion with a large group of very data-informed product managers. One of the participants asked the following question (I am somewhat exaggerating):

If we can A/B test, why do we need qualitative research?

Superficially, the question seems to hold some merit: A/B testing is as close to a scientific method as we have. As such, it reduces human biases, including the tendency to believe in individual anecdotes and stories even if the data speaks against them.

However, I have found (and written about in the past) that data can’t substitute for the real, deep insight gained from talking to real customers and users. The question above made me think a bit more about why exactly I believe that qualitative research is essential for a product manager. Here are six reasons why you should run qualitative research in addition to (or sometimes even instead of) A/B tests.

1. To get to the “why”

A/B test results tell you what the impact of a feature is. Users use some feature less, stay on your site for longer, or you acquire more users. What the data doesn’t tell you, however, is why these changes happened and why users behaved the way they did. Sometimes that’s obvious — you made a button bigger, more people clicked on it. Most of the time, however, it’s not going to be as easy. Your A/B test results may well be negative, ambiguous (some metrics up, others down) or flat (no significant movement in metrics despite drastic UI changes).

A classic question that user testing can answer after flat or negative A/B test results is: Did users not discover the new feature, did they not understand it, or did they understand it but find no value in it? Getting to the bottom of this with A/B tests alone can take many iterations, but just speaking to a handful of users will often get you the answer much faster.

Speaking of iterations: Even with positive A/B tests, you will often want to iterate on a feature. Talking to users will help you understand what the remaining pain points are, where there is still friction.

Lastly, user testing can even uncover negative impact that’s hidden by positive A/B test results. The reason is that many of the metrics we measure are inherently trailing, i.e., they reflect the past, not the future. For example, consider a project that increases the number of push notifications that an app sends. In the short term, that is extremely likely to increase engagement metrics for your app, since there are more re-engagement hooks. However, the experience of getting “spammed” by notifications might frustrate users. That frustration builds up over time and might only show up in the metrics after the A/B test is over. In contrast, a user study will likely highlight this much earlier.

2. To generate hypotheses to test

You need to have a deep understanding of user goals and problems to build a product that is valuable to users. Qualitative user research is the best tool to provide you with insight into those goals and problems. That insight, in turn, can be used to generate hypotheses to test, e.g. in an A/B test.

When you first start creating a product, you often have little more than a hunch about a problem, or you may have a problem yourself but not know to what extent other people feel the same pain. You could of course start building something and A/B test, but qualitative research (a.k.a. talking to people, Steve Blank’s “getting out of the building”) is going to be faster and yield deeper insights.

Even when you are much further along, working on a product with established product/market fit, the nuances of what users are trying to achieve with your product will not always be obvious. If you want to understand how and why users are using your product today, and how you could make their lives better, you simply have to talk to them.

This is true even if you are already A/B testing features. When you get positive results, how do you know whether you should try more of the same, or whether a fundamental user problem has now been solved and you should move on to improve some other part of the product? Again, only talking to users will give you those answers.

3. To validate features you can’t A/B test

Unfortunately, not all features can be A/B tested with conclusive results. Sometimes, features have so little usage that in order to get a large enough sample size in an A/B test, you would have to run the test forever (or the feature is unlikely to impact the overall user behavior much). Other times, the feature is a sales blocker, so those users and customers that you want to affect with the feature aren’t even using your product yet — for example, compliance features.
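To make "run the test forever" concrete, here is a rough back-of-the-envelope sketch using the textbook normal-approximation sample size formula for comparing two proportions. The function names, the 2% baseline rate, the 10% relative lift, and the 200 eligible users per day are all made-up assumptions for illustration, not numbers from a real product.

```python
import math
from statistics import NormalDist


def users_per_group(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed per group to detect a relative lift
    in a conversion rate (two-sided test, normal approximation)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(n_per_group, eligible_users_per_day, groups=2):
    """Days needed to fill all groups at a given daily traffic level."""
    return math.ceil(n_per_group * groups / eligible_users_per_day)


# Hypothetical low-usage feature: 2% of users convert today, we hope
# for a 10% relative improvement, and 200 users per day see the feature.
n = users_per_group(baseline_rate=0.02, relative_lift=0.10)
print(n, "users per group,", days_to_run(n, 200), "days")
```

Under these assumptions the estimate comes out at roughly 80,000 users per group, which at 200 eligible users per day means running the test for more than two years. A handful of user interviews looks very attractive in comparison.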

There are also more fundamental reasons why a feature can’t be A/B tested (easily): A/B tests are typically administered on a per-user basis (i.e., users are randomly assigned to the experiment groups). Some features, however, have interaction effects. Think about a feature like Facebook’s Reactions: Having the ability to react in ways beyond a simple “Like” will likely impact the behavior of the person doing the liking / reacting, but also that of the person whose post was liked / reacted to. There is no way to easily measure that cross-user impact in a simple user-based A/B test.
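A small sketch can illustrate why per-user assignment struggles with interaction effects. It assumes a common approach, hashing the user ID together with an experiment name to pick a group; the function names, the experiment name, and the simulated user IDs are hypothetical and not a description of how any particular company runs its experiments.

```python
import hashlib
import random


def assign_group(user_id, experiment, groups=("control", "treatment")):
    """Deterministically assign a user to a group by hashing their ID."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return groups[int(digest, 16) % len(groups)]


def crosses_groups(author_id, reactor_id, experiment):
    """True if the post author and the reacting user are in different groups."""
    return assign_group(author_id, experiment) != assign_group(reactor_id, experiment)


# Simulate interactions between random pairs of (hypothetical) users.
rng = random.Random(0)
users = [f"user-{i}" for i in range(1000)]
pairs = [rng.sample(users, 2) for _ in range(10_000)]
crossing = sum(crosses_groups(a, b, "reactions") for a, b in pairs)
print(f"{crossing / len(pairs):.0%} of interactions cross group boundaries")
```

With two equally sized groups, about half of all interactions involve one user from each group. The author of a post in the control group still receives reactions from users in the treatment group, so the control group's behavior is no longer a clean baseline.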

4. To open up new opportunities

If you want to expand the reach of your products and tap into new customer bases, you will have to identify those new customers’ and users’ needs and validate that your solution works for them. However, they aren’t using your product yet, so you can’t A/B test on them. In contrast, qualitative research — ranging from simply talking to prospective customers to testing prototypes of potential solutions — can give you the answers to those questions.

Even for existing customers, there might be adjacent needs that you didn’t think about when first building your product. Users might already be using the product for things it wasn’t intended or optimized for. The data will never give you these insights (it tells you what users are doing, not why or more specifically what for). When you engage in qualitative research, you will be able to identify these opportunities and decide whether you want to expand your product to address those use cases in a more dedicated fashion.

At the product manager event that prompted this article, Steven Sinofsky answered the question partly by referring to this aspect. He called it “step function changes”, meaning that A/B testing can yield incremental improvements, but qualitative research is required to jump to an entirely new level.

5. To get answers faster

Even though talking to actual users involves all of the overhead that human communication has (scheduling, sending emails back and forth, talking, …), it is still often much faster to get feedback by qualitative means than by actually building and A/B testing something. Think of the GV Design Sprint: it gets you from problem to (qualitatively) validated solution in a week. Even if you have the fastest engineers and the best analyst team, there is no way that you can design and build the solution to any meaningful problem, then run an A/B test, get enough users to reach statistical significance, and analyze the results in the space of a week.

It is also often much cheaper to build a prototype than the actual experience. It could be a paper or PowerPoint prototype. It could be one built in Framer or InVision but with large parts of the experience non-functional. You don’t ever have to build a proper backend for a prototype.

Additionally, when A/B testing, you want to make your hypothesis as specific as possible, which means you want to make your incremental change as small as possible. Otherwise, if you test a bigger change that affects multiple aspects of your product at once, and you see metrics move in unexpected ways, you won’t know which of the changes had what effect. In contrast, when user testing prototypes, you can easily mock up complex holistic experiences and, simply by observing and asking, determine which aspects of your prototype caused which user reactions. So validating a holistic user experience might take many iterations of A/B tests, but typically only one round of qualitative user research.

6. To keep your feet on the ground

Last, but not least: When you have worked on a product for a long time and you’ve gone through many iterations, improving and improving your metrics, you can start to feel like you “know” what is going to work and what isn’t. You feel like you understand how and why people use your product, and you have tons of ideas for how to improve their experience.

That is of course great — but it can also backfire very easily. Your own biases creep in, and reality is always more complex than the simplified models you can come up with in your head. Talking to users and customers on a regular basis provides a reality check and keeps you humble. Personally, every time I get off a call with a Yammer user, I think: “Wow, I did not know that you could use Yammer in this way / for this use case”. If I always just looked at the data and ran A/B tests, I would never have these revelations.

At Yammer, we pride ourselves on being very data-informed, and we A/B test every feature we can. Nevertheless, for all of the above reasons, we also use qualitative user research extensively.

TL;DR: You should use qualitative research in addition to A/B testing because…

… it can get you to the “why” behind the data
… it can generate new ideas and hypotheses to test
… you can validate hypotheses that aren’t A/B testable
… it can address new users that aren’t using your product today
… it can get you to answers faster (without building anything)
… you will stay more humble and grounded

Did you find this article interesting? If so, feel free to follow me on Twitter, where I share interesting product management articles I come across daily.

Thanks to Cindy Alvarez and Emma Beede from the amazing Yammer research team for providing input to this article.