Addressing Hidden Imperfections in Online Experimentation

AI-generated keywords: randomized controlled trials, technology companies, biases, experiment design, imperfections

AI-generated Key Points

  • Use of randomized controlled trials (RCTs) in technology companies for product development is increasing
  • RCTs in technology companies can be imperfectly executed
  • Biases such as opt-in and user activity bias, selection bias, non-compliance with treatment, and challenges in testing the question of interest can affect RCT results
  • Collaboration between experiment designers, product designers, and user experience designers is recommended to balance learning goals and minimize burden on end consumers
  • Practical guidance provided on designing and scoping experiments, instrumenting the experimentation funnel, monitoring measurement imperfections, and adjusting statistical analysis
  • Challenges discussed are applicable to both on-device and server-side experiments
  • Experimenters need to consider how experiences are randomized, how users trigger randomized experiences, who the target population is, how users enter the experiment subset, and which mechanisms may create unequal randomization in treatment assignment
  • Thoughtful experiment design improves the validity and reliability of experimental findings

Authors: Jeffrey Wong, Jasmine Nettiksimmons, Jiannan Lu, Katherine Livins

Presented at CODE@MIT 2021
License: CC BY 4.0

Abstract: Technology companies are increasingly using randomized controlled trials (RCTs) as part of their development process. Despite having fine control over engineering systems and data instrumentation, these RCTs can still be imperfectly executed. In fact, online experimentation suffers from many of the same biases seen in biomedical RCTs including opt-in and user activity bias, selection bias, non-compliance with the treatment, and more generally, challenges in the ability to test the question of interest. The result of these imperfections can lead to a bias in the estimated causal effect, a loss in statistical power, an attenuation of the effect, or even a need to reframe the question that can be answered. This paper aims to make practitioners of experimentation more aware of imperfections in technology-industry RCTs, which can be hidden throughout the engineering stack or in the design process.
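One way to see the attenuation mentioned in the abstract is the following sketch, which rests on standard textbook assumptions rather than on the paper's own derivation. Suppose non-compliance is one-sided (users assigned to control never receive the treatment), assignment affects the outcome only through actually receiving the treatment, and a fraction \pi_c of users assigned to treatment actually receive it. The symbols Z (assignment), Y (outcome), \pi_c (compliance rate), ITT (intention-to-treat effect), and CACE (effect among compliers) are illustrative notation, not taken from the paper.

```latex
\mathrm{ITT} \;=\; E[Y \mid Z = 1] - E[Y \mid Z = 0] \;=\; \pi_c \cdot \mathrm{CACE},
\qquad 0 < \pi_c \le 1
```

For instance, if only 40% of users assigned to treatment ever receive the new experience and the effect among those users is 5 units, the difference between the assigned groups would be about 0.4 × 5 = 2 units.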

Submitted to arXiv on 25 Aug. 2022

Results of the summarizing process for the arXiv paper: 2209.00649v1

Randomized controlled trials (RCTs) are becoming increasingly common in technology companies for developing new products and features. However, despite fine control over engineering systems and data instrumentation, these RCTs can still be imperfectly executed. Like biomedical RCTs, online experiments are prone to biases such as opt-in and user activity bias, selection bias, and non-compliance with treatment, as well as challenges in testing the question of interest. These imperfections can result in biased estimates of causal effects, reduced statistical power, attenuation of effects, or a need to reframe the research question. The paper aims to make practitioners of experimentation more aware of these imperfections, which may be hidden throughout the engineering stack or the design process.

The authors recommend that experiment designers collaborate closely with product and user experience designers to balance learning goals against the burden placed on end consumers. They provide practical guidance on designing and scoping experiments, instrumenting the experimentation funnel, proactively monitoring measurement imperfections, and adjusting the statistical analysis to mitigate imperfections. The concepts are illustrated with a running example that assumes on-device treatment assignment, but the challenges discussed apply to server-side experiments as well.

Experimenters need to carefully consider how users' experiences are randomized, how users trigger randomized experiences, who the target population is, how users enter the experiment subset, and any mechanisms that may create unequal randomization in treatment assignment. Overall, the paper highlights the importance of thoughtful experiment design and provides strategies for addressing imperfections in technology-industry RCTs. By following these guidelines, practitioners can improve the validity and reliability of their experimental findings.
Created on 24 Jan. 2024
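
The summary mentions monitoring for mechanisms that create unequal randomization in treatment assignment. A common way to check for this in practice is a sample ratio mismatch test, sketched below for a two-arm experiment. This is a minimal illustration and not the authors' method: the function name `srm_check`, the default 50/50 split, and the strict `alpha` threshold are all assumptions made for the example.

```python
import math


def srm_check(n_control, n_treatment, expected_treatment_share=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test (1 degree of freedom) on a two-arm split.

    Hypothetical helper for illustration. Returns (chi2, p_value, flagged),
    where flagged is True when the observed split is unlikely under the
    intended assignment ratio.
    """
    if not 0.0 < expected_treatment_share < 1.0:
        raise ValueError("expected_treatment_share must be strictly between 0 and 1")
    total = n_control + n_treatment
    if total == 0:
        raise ValueError("no users observed")

    # Counts we would expect if assignment followed the intended ratio.
    expected_t = total * expected_treatment_share
    expected_c = total - expected_t

    chi2 = ((n_treatment - expected_t) ** 2 / expected_t
            + (n_control - expected_c) ** 2 / expected_c)

    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < alpha


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    chi2, p_value, flagged = srm_check(5000, 5500)
    print(f"chi2={chi2:.2f}, p={p_value:.2e}, flagged={flagged}")
```

A flagged result does not say where the imbalance comes from; it only signals that the funnel between assignment and logging deserves inspection before the treatment effect is interpreted.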
