Commit 7e043cd

Craigievar, craig-statsig
authored and committed
Richer Cohort Docs
1 parent c1359bc commit 7e043cd

19 files changed: +124 -53 lines changed

cspell.json

+4-1
Original file line number | Diff line number | Diff line change
@@ -349,6 +349,9 @@
349349
"xAU",
350350
"yyyy",
351351
"nd",
352+
"lossier",
353+
"cohorted",
354+
"Cohorted",
352355
"winsorized",
353356
"onboarded",
354357
"misconfigured",
@@ -691,4 +694,4 @@
691694
"/\\[.*?\\]\\(.*?\\)/",
692695
"```[\\s\\S]*?```"
693696
]
694-
}
697+
}

docs/statsig-warehouse-native/configuration/metrics.md

+2-2
@@ -166,11 +166,11 @@ Most aggregation-type metrics (sum, count, count distinct, unit count, means, ra
166166

167167
Cohort settings allow you to specify a window for data collection after a unit's exposure. For example, a 4-6 day cohort window would only count actions from days 4, 5, and 6 after a unit was exposed to an experiment.
168168

169-
Only include units with a completed window can be selected to remove units out of pulse analysis for this metric until the cohort window has completed. On experiment settings, you can choose to enable post-experiment data collection to allow these cohorts to mature in the case that you believe the intervention effect will still apply even if the user gets the control/shipped experiment (e.g. NUX experiments).
169+
Please refer to the full documentation on cohorts [here](../features/cohort-metrics.md).
170170

171171
### Baking
172172

173-
Many metric types support baking. Statsig will wait to calculate baked metrics, and use "old" data for baked metrics. This is appropriate for cases like credit card chargebacks, where you may adjust your payments dataset to account for chargebacks in a "net revenue" metric.
173+
Many metric types support baking. Statsig will wait to calculate baked metrics, and use "old" data for baked metrics. This is appropriate for cases like credit card chargebacks, where you may adjust your payments dataset to account for chargebacks in a "net revenue" metric. See additional information in the [cohort documentation](../features/cohort-metrics.md).
174174

175175
Statsig will:
176176

docs/statsig-warehouse-native/features/cohort-metrics.md

+85-11
@@ -1,28 +1,102 @@
11
---
22
title: Cohort Metrics
33
slug: /statsig-warehouse-native/features/cohort-metrics
4-
sidebar_label: Cohort/Window Metrics
5-
description: Analyze Cohorts with Statsig
6-
displayed_sidebar: cloud
4+
sidebar_label: Cohort Metrics
5+
description: Analyzing Cohorts with Statsig
76
keywords:
8-
- owner:vm
7+
- owner:craig
98
last_update:
10-
date: 2024-08-05
9+
date: 2025-05-07
1110
---
1211

1312
# Cohort Metrics
1413

15-
Cohort metrics are a way to analyze the impact of an experiment in a certain time frame per user.
14+
Cohort metrics are a way to analyze the impact of an experiment in a certain time frame per experimental unit.
1615

17-
This can be a useful way to ignore noisy day-1 metrics or capture a specific critical window around something like subscriptions. It is also an easy way to measure experiment-based retention, since you can use it to ask if an experiment - for example - caused users to come back in their 2nd week from being exposed to an experiment.
16+
This is useful for many reasons. Common use cases are:
1817

19-
![Cohort Metric](https://github.com/statsig-io/docs/assets/102695539/cc96d4ba-4edc-4b31-b937-7ad8d62245f7)
18+
- Ensuring all users have equal periods for data collection gives an "apples to apples" comparison across users enrolled early or late in the experiment (which often corresponds to power vs. occasional users), and across different time periods that may have extrinsic factors like holidays
19+
- When analyzing an unbounded period, experimental units' variance in the population can increase over time, leading to a scenario where error bars don't actually converge towards 0 as the experiment runs longer
20+
- This allows one to skip noisy early metrics, or focus on outcomes after users might churn - e.g. capturing "week-2 engagement" if a product has a 1-week trial period
21+
- This can also be used to capture "one-shot retention". [Retention metrics](../metrics/retention.md) are used to capture rolling, ongoing retention. A user metric with a window from day X to day Y is a good way to check if an experiment is causing more users to retain at least X days
2022

21-
In the example above, the metric will count distinct users in an experiment group with a click event between 3 and 7 days from their first exposure.
23+
The downsides of cohort metrics are that:
2224

23-
Since "Wait until users reach end window to include them in calculation" is clicked, users who are not yet at 7 days from their first exposure will be excluded from the numerator and the denominator of the metric in the analysis.
25+
- They do not capture any sort of long-term impact, or how that evolves over time. This is purely a point in time analysis and may not be appropriate for measuring complex, evolving behaviors
26+
- They make topline impact estimates lossier and harder to trust
2427

25-
# Metric Bake Windows
28+
Some practitioners have made compelling arguments that cohort metrics are a better "standard" metric for organizations to use in analysis. Statsig tends to believe that the use of cohorts depends on business context, but it's worth considering whether they should be at least a part of an experiment's measurement (e.g. measuring topline revenue as an overall evaluation criterion, but also measuring 7d revenue alongside it for additional context).
29+
30+
Cohort metrics on Statsig are feature-rich. This page explains the different settings available, what they do, and how they interact so that you can be confident that you know what's being measured.
31+
32+
## Basic Cohort Windows
33+
34+
Basic cohort windows are fairly simple. They are a filter on metric data with a time range relative to the unit's time of exposure. For example, a cohort metric with a window from 1 to 6 days would filter to events from 24 hours until 144 hours after the unit was exposed to the experiment.
35+
36+
Note that this is calculated as a timestamp comparison; a unit enrolled at 12pm will have exactly 24 hours until they hit the end of a 0-1 day cohort.
37+
38+
For metrics from data sources that are marked as daily data, the cohort comparison is truncated to a date so that day-0 data behaves as expected (e.g. a user exposed on `2025-01-05T09:00` will include the date-based data from `2025-01-05` instead of truncating to times "after 9:00am").
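
As a rough illustration of the two comparisons described above, here is a minimal Python sketch. The function names are hypothetical, and treating both window endpoints as inclusive is an assumption; this is not Statsig's actual query logic.

```python
from datetime import date, datetime, timedelta


def in_cohort_window(exposure: datetime, event: datetime,
                     start_day: int, end_day: int) -> bool:
    """Timestamp comparison: keep events between start_day*24h and
    end_day*24h after exposure (inclusive endpoints are an assumption)."""
    elapsed = event - exposure
    return timedelta(days=start_day) <= elapsed <= timedelta(days=end_day)


def in_daily_cohort_window(exposure: datetime, event_date: date,
                           start_day: int, end_day: int) -> bool:
    """Daily data: truncate the exposure time to a date and compare whole days,
    so day-0 data from the exposure date itself is kept."""
    days_since = (event_date - exposure.date()).days
    return start_day <= days_since <= end_day


exposure = datetime(2025, 1, 5, 9, 0)
# A 1-6 day window spans 24 to 144 hours after exposure.
print(in_cohort_window(exposure, datetime(2025, 1, 6, 9, 0), 1, 6))
print(in_cohort_window(exposure, datetime(2025, 1, 11, 9, 1), 1, 6))
# Daily data dated 2025-01-05 counts as day 0 for this unit.
print(in_daily_cohort_window(exposure, date(2025, 1, 5), 0, 1))
```

Under these assumptions the three calls print `True`, `False`, and `True`.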
39+
40+
## Waiting for Maturation
41+
42+
By default, cohort metrics can have a mix of maturation in the experimental population. For a 1-week cohort, users enrolled within the last week before analysis will have a mix of maturities. This does yield maximal sample, but can "dilute" the analysis with partial cohort windows. To prevent this, mark the metric as needing to "Wait for cohort window to complete". This drops the metric data of units whose window is incomplete and removes those units from the analysis population for that metric.
43+
44+
In the examples below, one metric forces cohorts to complete. It has **fewer units** in the analysis, since many units do not have a complete window, a **lower total** because of the smaller unit count, but a **higher mean** since the units it does include have completed their window and have a longer data collection period on average.
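
The effect on unit count, total, and mean can be reproduced with a small Python sketch. The `summarize` helper and the numbers are made up for illustration, and the completion check (exposure plus window end on or before the analysis time) is an assumption about how maturity is decided.

```python
from datetime import datetime, timedelta


def summarize(units, analysis_time, end_day, wait_for_completion):
    """units: list of (exposure_time, metric_value_collected_so_far).
    Returns (unit_count, total, mean) for the analysis population."""
    kept = [
        value
        for exposure, value in units
        # With waiting enabled, keep only units whose window has fully elapsed.
        if not wait_for_completion
        or exposure + timedelta(days=end_day) <= analysis_time
    ]
    total = sum(kept)
    mean = total / len(kept) if kept else 0.0
    return len(kept), total, mean


# Hypothetical data: two mature units and two recently enrolled units.
units = [
    (datetime(2025, 1, 1), 10.0),
    (datetime(2025, 1, 2), 8.0),
    (datetime(2025, 1, 9), 2.0),
    (datetime(2025, 1, 10), 0.0),
]
now = datetime(2025, 1, 11)

print(summarize(units, now, end_day=7, wait_for_completion=False))  # (4, 20.0, 5.0)
print(summarize(units, now, end_day=7, wait_for_completion=True))   # (2, 18.0, 9.0)
```

In this toy data, waiting for completion halves the unit count and lowers the total, while the mean rises because only fully observed units remain.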
45+
46+
![Metric that did not wait to mature](/img/whn/basic_cohort_metric.png)
47+
![Metric that waited to mature](/img/whn/wait_to_mature_metric.png)
48+
49+
This can cause confusion since it leads to different populations between metrics, and also filters out the last few days of an experiment's data in the daily timeseries since new cohorts' data is not included. A metric's cohort settings can be seen quickly by hovering over the name of the metric in the experiment scorecard.
50+
51+
## Visual Examples
52+
53+
This is what the data collection looks like for a standard cohort metric with a 0-6 day window. Note that this collects data for **7 days** - it's 0-indexed.
54+
55+
![Basic cohort example](/img/whn/basic_cohort_example.png)
56+
57+
If the cohort period extends past the end of the experiment, the default behavior is that data collection is truncated to the end of the experiment.
58+
59+
![Basic cohort over end example](/img/whn/basic_cohort_over_end_example.png)
60+
61+
If the metric is configured to only allow completed cohort windows, a unit with an incomplete window is completely excluded from the analysis. For example, it is not part of the denominator for the average value of a sum or count metric, and its metric data is filtered from the analysis.
62+
63+
![Basic cohort example](/img/whn/completed_window_example.png)
64+
65+
If **Allow Cohort Metrics to Mature After Experiment End** is enabled on the experiment, cohort metrics will continue to collect data after the end of the experiment (whether or not waiting for completed windows is enabled).
66+
67+
![Basic cohort example](/img/whn/mature_after_end_example.png)
68+
69+
## Experiment-based Cohort Settings
70+
71+
Cohort controls are also available on experiments. This is because the relevance of these features depends on the kind of experiment being run, as discussed below.
72+
73+
### Allow post-experiment cohort data
74+
75+
Checking **Allow Cohort Metrics to Mature After Experiment End** allows metrics to be collected after the end of the experiment. This is recommended for one-time interventions, e.g. a new signup page; getting post-experiment signal from units who did get the intervention means additional statistical power.
76+
77+
![Allow Cohorts to Mature](/img/whn/allow_cohorts_to_mature.png)
78+
79+
This is **not** recommended in cases where the intervention is continuous, e.g. a ranking change; post-experiment data no longer reflects the assigned experience (e.g. test users might get the control experience in their post-experiment period), which dilutes results.
80+
81+
On the data side, the analysis will be extended past the end of the experiment by the length of the longest cohort window on any metric in the analysis. Non-cohorted metrics will have their date constrained to the analysis period, and cohort metrics will be filtered to the end of the experiment + their cohort window.
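
A minimal Python sketch of the date constraints described above. The helper names are hypothetical, and measuring a cohort window's "length" by its end day is an assumption rather than confirmed behavior.

```python
from datetime import datetime, timedelta


def metric_cutoff(experiment_end, cohort_end_day, mature_after_end):
    """Latest time data is kept for one metric.
    cohort_end_day is None for non-cohorted metrics."""
    if cohort_end_day is None or not mature_after_end:
        # Non-cohorted metrics, or the setting is off: stop at experiment end.
        return experiment_end
    return experiment_end + timedelta(days=cohort_end_day)


def analysis_end(experiment_end, cohort_end_days, mature_after_end):
    """The analysis extends as far as the longest cohort window requires."""
    return max(
        (metric_cutoff(experiment_end, d, mature_after_end) for d in cohort_end_days),
        default=experiment_end,
    )


experiment_end = datetime(2025, 2, 1)
# One 7-day cohort metric, one non-cohorted metric, one 14-day cohort metric.
print(analysis_end(experiment_end, [7, None, 14], mature_after_end=True))
print(analysis_end(experiment_end, [7, None, 14], mature_after_end=False))
```

With the setting enabled, the analysis in this example runs 14 days past the experiment end; with it disabled, it stops at the experiment end.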
82+
83+
### Fixed Duration
84+
85+
On the **Experiment Population** section of the experiment setup page, there is an option for fixed-window cohorting under **Configure Analysis Period** with Analysis Type **Fixed Duration**. This only counts metrics for a certain period after **experiment start**. This is useful when an experiment is tied to a specific point in time, such as an email campaign or another fixed enrollment event.
86+
87+
![Analysis Period Settings](/img/whn/configure_analysis_period.png)
88+
89+
This setting is only available for assign and analyze experiments.
90+
91+
### Cohorted Duration
92+
93+
On the **Experiment Population** section of the experiment setup page, there is an option for allocation-based cohorting under **Configure Analysis Period** with Analysis Type **Cohorted Duration**. This is identical to the metric-based cohort, but applies globally to all possible metrics in the experiment analysis. This is a great way to globally add a cohort when appropriate, e.g. new user experiments.
94+
95+
Note that when this is used in conjunction with metric cohorts, the earlier of the two cohort window ends will be used. For example, if a metric cohort is set to end at 7 days and the experiment at 10, 7 will be used. If the metric cohort is set to end at 7 days and the experiment at 5, 5 will be used.
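
The rule above amounts to taking a minimum. A short Python sketch, where `effective_cohort_end` is a hypothetical helper name and `None` stands for a setting that is not configured:

```python
def effective_cohort_end(metric_end_day, experiment_end_day):
    """Return the cohort end (in days) used when metric-level and
    experiment-level cohorts are combined; None means 'not configured'."""
    ends = [d for d in (metric_end_day, experiment_end_day) if d is not None]
    return min(ends) if ends else None


print(effective_cohort_end(7, 10))  # 7
print(effective_cohort_end(7, 5))   # 5
```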
96+
97+
The **only include units with a completed cohort window** setting can also be specified at the experiment level. If checked there, it applies to all metrics in the experiment; if not, it is applied on a metric-by-metric basis.
98+
99+
## Metric Bake Windows
26100

27101
In some cases, a metric in your warehouse may not be matured until a certain time period, after which you care about the daily value. Statsig provides the option to specify a bake window for your metrics. Similarly to the option above, metric values that have not reached the bake window's end will be excluded from the numerator and denominator of the metric in the analysis.
28102

docs/statsig-warehouse-native/metrics/count-distinct.md

+1-1
@@ -74,6 +74,6 @@ In the metrics page view, we use APPROX_COUNT_DISTINCT (or equivalent) to avoid
7474
- Specify if you want to calculate CUPED, and the lookback window for CUPED's pre-experiment data inputs
7575
- Thresholding
7676
- Turn this metric into a 1/0 unit count metric counting if the unit's total count equals to or surpasses (>=) a given threshold
77-
- Cohort Windows
77+
- [Cohort Windows](../features/cohort-metrics.md)
7878
- You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
7979
- **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed

docs/statsig-warehouse-native/metrics/count.md

+3-3
@@ -70,8 +70,8 @@ Count metrics are simple, and will use the sql COUNT aggregation. However, there
7070
- Specify if you want to calculate CUPED, and the lookback window for CUPED's pre-experiment data inputs
7171
- Thresholding
7272
- Turn this metric into a 1/0 unit count metric counting if the unit's total count equals to or surpasses (>=) a given threshold
73-
- Cohort Windows
73+
- [Cohort Windows](../features/cohort-metrics.md)
7474
- You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
7575
- **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
76-
- Baked Metrics
77-
- Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
76+
- [Baked Metrics](../features/cohort-metrics.md)
77+
- [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.

docs/statsig-warehouse-native/metrics/latest-value.md

+1-1
@@ -65,7 +65,7 @@ Users without a value will be treated as 0s; note that if there is an existing v
6565

6666
- Metric Breakdowns
6767
- You can configure Metadata Columns to group results by, getting easy access to dimensional views in pulse results
68-
- Cohort Windows
68+
- [Cohort Windows](../features/cohort-metrics.md)
6969
- You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
7070
- **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
7171
- CUPED

docs/statsig-warehouse-native/metrics/max-min.md

+6-6
@@ -9,12 +9,12 @@ last_update:
99

1010
## Summary
1111

12-
Max metrics calculate the maximum of a column from the metric source at the unit level.
13-
Min Metrics calculate the minimum of a column from the metric source at the unit level.
12+
Max metrics calculate the maximum of a column from the metric source at the unit level.
13+
Min Metrics calculate the minimum of a column from the metric source at the unit level.
1414

1515
### Use Cases
1616

17-
Max/min metrics allow you to easily track users' extremes during an experiment.
17+
Max/min metrics allow you to easily track users' extremes during an experiment.
1818

1919
Common examples are:
2020

@@ -100,8 +100,8 @@ Max/min metrics are simple and there are many advanced options you can apply.
100100
- Specify if you want to calculate CUPED, and the lookback window for CUPED's pre-experiment data inputs
101101
- Thresholding
102102
- Turn this metric into a 1/0 unit count metric counting if the unit's max/min equals to or surpasses (>=) a given threshold
103-
- Cohort Windows
103+
- [Cohort Windows](../features/cohort-metrics.md)
104104
- You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
105105
- **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
106-
- Baked Metrics
107-
- Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
106+
- [Baked Metrics](../features/cohort-metrics.md)
107+
- [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.

docs/statsig-warehouse-native/metrics/mean.md

+3-3
@@ -52,8 +52,8 @@ Mean metrics have the delta method applied to account for covariance between uni
5252
- You can configure Metadata Columns to group results by, getting easy access to dimensional views in pulse results
5353
- Winsorization
5454
- Specify a lower and/or upper percentile bound to winsorize at. All values below the lower threshold, or above the upper threshold, will be clamped to that threshold to reduce the outsized impact of outliers on your analysis
55-
- Cohort Windows
55+
- [Cohort Windows](../features/cohort-metrics.md)
5656
- You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
5757
- **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
58-
- Baked Metrics
59-
- Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
58+
- [Baked Metrics](../features/cohort-metrics.md)
59+
- [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.

docs/statsig-warehouse-native/metrics/ratio.md

+3-3
@@ -63,12 +63,12 @@ By default, Statsig only includes numerators from metrics with non-null, non-zer
6363

6464
## Options
6565

66-
- Cohort Windows (Numerator and Denominator)
66+
- [Cohort Windows](../features/cohort-metrics.md) (Numerator and Denominator)
6767
- You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
6868
- **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
6969
- Winsorization
7070
- Specify a lower and/or upper percentile bound to winsorize at. Winsorization and its thresholds can be specified for both the numerator and denominator of the ratio metric independently. All values below the lower threshold, or above the upper threshold, will be clamped to that threshold to reduce the outsized impact of outliers on your analysis
7171
- Include units which do not have a denominator
7272
- Control whether you want to include numerators from units which don't have a denominator value
73-
- Baked Metrics
74-
- Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
73+
- [Baked Metrics](../features/cohort-metrics.md)
74+
- [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.

0 commit comments