Uber Labs: Causal Inference in Practice

gingo 2021-08-18 16:27:36

Mediation Modeling at Uber: Understanding Why Product Changes Work (and Don’t Work)

Bonnie Li and Totte Harinen

April 18, 2018

At Uber Labs, our mission is to leverage insights and methodologies from behavioral science to help product and marketing teams improve the customer experience. Recently, we introduced mediation modeling, a statistical approach from academic research, to address user pain points.

Mediation modeling goes beyond simple cause and effect relationships in an attempt to understand what underlying mechanisms led to a result. Using this type of analysis, we can fine-tune product changes and develop new ones that focus on the underlying mechanisms behind successful features on the Uber platform.

Knowing Whether vs. Understanding Why

At Uber, we have a strong culture of improvement, frequently conducting experiments to test whether one variable impacts another to ensure reliable, safe, and seamless user experiences.

Most of the time we have a hypothesis about why two variables are related. For example, suppose we believe that a promotion to give new riders trip discounts on their first few trips may improve new rider retention. While a standard analysis would tell us whether this promotion helped increase new rider retention, it would not tell us why. For example, do new riders return to the app after their first several trips specifically because of reduced trip fares? Or could it be that the promotion helped new riders familiarize themselves with the app, or something else entirely? If multiple mechanisms are present, which one plays a larger role and by how much? In a standard analysis, an underlying mechanism (i.e., why something happened) is often assumed to exist but not empirically tested with data.

While we may have some evidence that two variables are related, we may not have a clear understanding of why they are related, and when we do not understand why, we have to rely on trial and error. However, just like in academic research, knowing why is equally important for Uber because it helps us build better products for our users. For instance, in the example above, if we found that their familiarity with the app kept new riders on the platform, we should then prioritize product changes that encourage riders to use the app.

Mediation modeling: opening the black box

Mediation modeling opens the black box between a treatment and an outcome variable to reveal the underlying mechanisms, i.e., why something happened. Although widely used in academic research, this approach is under-utilized in business [1, 2, 3]. When we have a causal assumption, instead of leaving it at that or relying on correlational evidence, mediation modeling lets us empirically test (rather than logically infer) the causal pathways between the two variables. More importantly, understanding these mechanisms enables us to develop better products more quickly and efficiently because it helps us pinpoint which features of a change are responsible for its success.

So, what exactly can we do with mediation modeling to improve the user experience? We outline some sample hypothetical use cases in Figure 1, below:

First, we can use mediation modeling to test product assumptions. For example, we may believe that a new rider promotion could increase retention because of reduced trip fares (Figure 1a). With mediation modeling, we can empirically test this assumption. The results of these tests can tell us whether our assumption is true, and if so, how much of the effect is due to reduced trip fares as opposed to other mechanisms (e.g., familiarity with the app).

Second, we can use mediation modeling to compare multiple mechanisms. In a hypothetical example, we may believe that a new design of Uber Eats menus can increase orders through more than one mechanism (Figure 1b). With mediation modeling, we are able to estimate how much of the treatment effect (i.e., increased orders) is accounted for by each of these mechanisms and which mechanism plays the largest role. The results help inform how we design our products and their future iterations.

Mediation modeling also allows us to connect intangible variables associated with a specific feature, such as consumer sentiment, to business metrics. As we know, customer feelings and satisfaction are critical to the success of a business, but it is often difficult to quantify their business impact. Mediation modeling enables us to test how these variables affect downstream business metrics (e.g., rider referrals, as depicted in Figure 1c).

Moreover, mediation modeling can be a creative way to break a long-term goal into smaller intermediate steps. For example, suppose our goal is to increase long-term rider satisfaction with Uber. How do we break this goal into smaller pieces that can be tied to our day-to-day work? If we have previously identified a key mediator behind rider satisfaction, we can then leverage this mediator as a short-term key performance indicator (KPI) (Figure 1d). If most of the effect of an intervention is mediated through a particular mechanism, then influencing that key mediator can be a necessary (although not sufficient) condition for the intervention to work.

Across various use cases, we identify upstream and downstream variables and test how they are connected with each other. Next, we take this explanation a step further and discuss the conceptual details behind mediation modeling.

Mediation modeling as causal inference

To successfully execute this technique, what exactly are the quantities we estimate and with what methodology? To answer these questions, we depict the simplest possible mediation model in Figure 2, below:

Figure 2. The conceptual framework of mediation modeling incorporates the intervention, mediator, and, finally, outcome.

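From this type of model, our goal is to estimate three key quantities, where a denotes the path from the intervention to the mediator, b the path from the mediator to the outcome, and c the direct path from the intervention to the outcome: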

- The average direct effect (ADE): c
- The average causal mediated effect (ACME): ab
- The average total effect (ATE): ab + c
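
As a rough illustration of how these three quantities relate, the following minimal sketch simulates data from a purely linear mediation model and recovers a, b, and c with two least-squares regressions. The coefficient values, sample size, and the helper `ols` are assumptions made up for this example; the product-of-coefficients shortcut shown here only applies when both models are linear without interactions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all coefficients are made up).
a_true, b_true, c_true = 0.5, 0.8, 0.3
t = rng.integers(0, 2, size=n)                      # randomized treatment (0/1)
m = a_true * t + rng.normal(size=n)                 # mediator
y = c_true * t + b_true * m + rng.normal(size=n)    # outcome

def ols(columns, target):
    """Least-squares coefficients for target ~ 1 + columns (intercept first)."""
    X = np.column_stack([np.ones(len(target)), *columns])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

a_hat = ols([t], m)[1]              # path a: treatment -> mediator
_, c_hat, b_hat = ols([t, m], y)    # path c (direct) and path b (mediator -> outcome)

acme = a_hat * b_hat    # mediated effect, ab
ade = c_hat             # direct effect, c
ate = acme + ade        # total effect, ab + c
print(f"ACME={acme:.3f}  ADE={ade:.3f}  ATE={ate:.3f}")
```

With these made-up coefficients the estimates should land near ab = 0.4, c = 0.3, and ab + c = 0.7.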

In recent years, researchers have begun to understand mediation modeling in causal terms [4, 5, 6]. This has allowed them to conceptualize mediation using formal frameworks developed for causal inference, such as the potential outcomes approach developed by Neyman, Rubin, and others [7].

To better grasp how this framework functions, suppose we have an outcome $Y$ and treatment assignment $t$, such that $t$ is 1 if an individual is in the treatment group and 0 if the individual is in the control group. Then, the outcome under treatment assignment $t$ for individual $i$ is $Y_i(t)$.

Now, we are often interested in estimating the difference in $Y$ between the treatment and control, which for individual $i$ is

$$Y_i(1) - Y_i(0)$$

However, most of the time we only observe one of these two outcomes because an individual is usually in just one experimental condition. So, for example, if individual $i$ is in the treatment condition ($t = 1$), then outcome $Y_i(0)$ for that individual is just a potential outcome, i.e., an outcome that could have occurred but did not actually occur.

Since we usually cannot observe the treatment effects at an individual level, we estimate the group-level average treatment effect, which is defined as

$$E[Y_i(1) - Y_i(0)]$$

Here, the outcome Y under the treatment and the control assignment is estimated from the group-level quantities.
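
To make the group-level estimate concrete, here is a small simulation sketch. Both potential outcomes are generated for every individual (which is only possible in a simulation), but only one is revealed by the random assignment; the difference in group means then approximates the average treatment effect. The effect size of 0.25 and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical potential outcomes for every individual.
y0 = rng.normal(loc=1.0, size=n)               # outcome under control
y1 = y0 + 0.25 + 0.1 * rng.normal(size=n)      # outcome under treatment

t = rng.integers(0, 2, size=n)                 # randomized assignment
y_obs = np.where(t == 1, y1, y0)               # only one outcome is observed

true_ate = np.mean(y1 - y0)                    # knowable only in a simulation
ate_hat = y_obs[t == 1].mean() - y_obs[t == 0].mean()
print(f"true ATE={true_ate:.3f}  estimated ATE={ate_hat:.3f}")
```

Because assignment is randomized, the two group means are unbiased for the average potential outcomes under treatment and control, which is why the simple difference works here.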

What does all this have to do with mediation modeling? Well, it turns out we can represent the three key mediation quantities using the above framework. Let $M_i(t)$ correspond to the potential mediator value for individual $i$ under treatment assignment $t$. Then, we can start defining the key mediation quantities as follows:

$$\mathrm{ATE} = E[Y_i(1, M_i(1)) - Y_i(0, M_i(0))]$$

In short, the ATE is the difference between the potential outcomes under the treatment and the control assignment when the mediator takes the value it naturally would under each assignment. This is the average effect of the treatment on the outcome.

As mentioned earlier, the goal of mediation analysis is to decompose the total treatment effect into two parts: the average direct effect (ADE) and the average causal mediated effect (ACME). The ADE is the impact of the treatment on the outcome that does not go through the mediator. So, if we fix the value of the mediator while varying the treatment status, we obtain the direct effect. Using the potential outcomes notation, we have

$$\mathrm{ADE}(t) = E[Y_i(1, M_i(t)) - Y_i(0, M_i(t))]$$

for $t = 0, 1$. So, the ADE can be understood as any additional contribution of the treatment assignment to the outcome once we prevent the mediator from changing with the treatment status [8]. (This prevention does not happen literally. We use statistical models to estimate what the outcome would have been if the mediator were fixed.)

Once we have established the ADE, it is clear that the ACME is simply its complement:

$$\mathrm{ACME}(t) = E[Y_i(t, M_i(1)) - Y_i(t, M_i(0))]$$

for $t = 0, 1$. The ACME corresponds to the difference in potential outcomes that would occur if we were to flip the mediator to the value it would take under the treatment status while holding the treatment status itself fixed.
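
To see why the two effects are complements, it can help to write out the decomposition explicitly. Writing ADE(t) and ACME(t) for the two quantities evaluated at a given $t$, and adding and subtracting the cross term $Y_i(1, M_i(0))$ inside the expectation, gives

$$
\begin{aligned}
\mathrm{ATE} &= E[Y_i(1, M_i(1)) - Y_i(0, M_i(0))] \\
&= E[Y_i(1, M_i(1)) - Y_i(1, M_i(0))] + E[Y_i(1, M_i(0)) - Y_i(0, M_i(0))] \\
&= \mathrm{ACME}(1) + \mathrm{ADE}(0)
\end{aligned}
$$

Adding and subtracting $Y_i(0, M_i(1))$ instead yields $\mathrm{ATE} = \mathrm{ACME}(0) + \mathrm{ADE}(1)$. When the treatment and the mediator do not interact, the effects do not depend on $t$, and the decomposition reduces to $\mathrm{ATE} = \mathrm{ACME} + \mathrm{ADE}$, which matches ab + c in the simple linear model.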

One of the great things about these definitions is that they do not make any reference to a particular model. Consequently, researchers like Imai et al. [8] have developed algorithms that estimate the key mediation quantities using any valid model. This means, among other things, that we are free to estimate the relationships in the mediation graph using nonparametric and nonlinear models, a considerable advancement compared to the past [9, 10, 11, 12]. For example, in a traditional approach such as Hayes’ PROCESS method, the mediators cannot be categorical variables and the outcome variables are restricted to those that can be properly modeled with ordinary least squares or logistic regression [13]. These limitations on mediators and outcome variables prevent us from modeling discrete data, which makes the PROCESS method unsuitable for our work at Uber.
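
The model-agnostic idea can be sketched with a simplified plug-in procedure: fit a mediator model and an outcome model, simulate the potential mediator values under each treatment status, and average the predicted outcome contrasts. The example below is an illustrative simplification and not the algorithm of Imai et al.; it uses made-up simulated data with a treatment-mediator interaction (so the effects differ between t = 0 and t = 1), plain least-squares models, and it does not compute confidence intervals.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Hypothetical data with a treatment-mediator interaction.
t = rng.integers(0, 2, size=n)
m = 0.5 * t + rng.normal(size=n)
y = 0.3 * t + 0.8 * m + 0.4 * t * m + rng.normal(size=n)

def fit(X, target):
    """Least-squares coefficients for target ~ X."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

# Step 1: mediator model M ~ 1 + T, plus its residual standard deviation.
Xm = np.column_stack([np.ones(n), t])
beta_m = fit(Xm, m)
sigma_m = np.std(m - Xm @ beta_m)

# Step 2: outcome model Y ~ 1 + T + M + T*M.
beta_y = fit(np.column_stack([np.ones(n), t, m, t * m]), y)

def predict_y(t_val, m_val):
    """Predicted outcome when treatment is t_val and the mediator equals m_val."""
    return (beta_y[0] + beta_y[1] * t_val
            + beta_y[2] * m_val + beta_y[3] * t_val * m_val)

# Step 3: simulate potential mediators M(0) and M(1) from the mediator model.
noise = rng.normal(scale=sigma_m, size=n)
m_pot = {0: beta_m[0] + noise, 1: beta_m[0] + beta_m[1] + noise}

# Step 4: average the predicted potential-outcome contrasts.
acme = {s: np.mean(predict_y(s, m_pot[1]) - predict_y(s, m_pot[0])) for s in (0, 1)}
ade = {s: np.mean(predict_y(1, m_pot[s]) - predict_y(0, m_pot[s])) for s in (0, 1)}
ate = np.mean(predict_y(1, m_pot[1]) - predict_y(0, m_pot[0]))

print("ACME(0), ACME(1):", round(acme[0], 3), round(acme[1], 3))
print("ADE(0),  ADE(1): ", round(ade[0], 3), round(ade[1], 3))
print("ATE:             ", round(ate, 3))
```

Because steps 3 and 4 only call the fitted models through prediction, the two least-squares fits could in principle be swapped for other model classes (for example, a logistic outcome model) without changing the rest of the procedure.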

Finally, the potential outcomes framework proves helpful in laying out the identification assumptions for mediation effects. In the context of randomized experiments, the major assumption is that the mediator should be statistically independent of the potential outcomes of Y conditional on the observed treatment status and the values of the pretreatment covariates included in the model [14, 15]. The reason why the mediation quantities are not consistently estimated without this assumption is that it would otherwise be possible for some third variable to confound the mediator-outcome relationship.

Even though the potential outcomes framework helps us to define this assumption clearly, it is not possible to conclusively verify that the assumption holds. The best we can do is to (1) include any pretreatment covariates that theoretical considerations suggest could de-confound the relationship between the mediator and the outcome and (2) conduct sensitivity analyses to see how our estimates would change if our assumptions were not satisfied to different degrees [16]. Luckily, as a data-driven technology company, we generally have a good set of pretreatment covariates that we can use in our models to mitigate the risk of confounding.
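
The role of pretreatment covariates can be illustrated with a small simulation. In the sketch below, a hypothetical covariate `x` affects both the mediator and the outcome. Even though the treatment is randomized, omitting `x` biases the estimated mediated effect, while including it recovers the value used to generate the data. All coefficients and names are assumptions for illustration, and this is not a formal sensitivity analysis.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

x = rng.normal(size=n)                         # pretreatment covariate (hypothetical)
t = rng.integers(0, 2, size=n)                 # randomized treatment
m = 0.5 * t + 1.0 * x + rng.normal(size=n)     # mediator depends on x
y = 0.3 * t + 0.8 * m + 1.5 * x + rng.normal(size=n)   # outcome also depends on x

def ols(columns, target):
    """Least-squares coefficients for target ~ 1 + columns (intercept first)."""
    X = np.column_stack([np.ones(len(target)), *columns])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

def acme_linear(covariates):
    """Product-of-coefficients ACME, adjusting both models for the given covariates."""
    a = ols([t, *covariates], m)[1]        # treatment -> mediator
    b = ols([t, m, *covariates], y)[2]     # mediator -> outcome
    return a * b

acme_naive = acme_linear([])       # ignores the confounder
acme_adjusted = acme_linear([x])   # adjusts for the confounder
print(f"naive ACME={acme_naive:.3f}  adjusted ACME={acme_adjusted:.3f}  (true value 0.4)")
```

Randomization protects the treatment-to-mediator path, so the bias here enters entirely through the mediator-to-outcome coefficient, which is the relationship the independence assumption is about.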
