statsig-io
diff --git a/cspell.json b/cspell.json
+4 -1
diff --git a/docs/statsig-warehouse-native/configuration/metrics.md b/docs/statsig-warehouse-native/configuration/metrics.md
+2 -2
diff --git a/docs/statsig-warehouse-native/features/cohort-metrics.md b/docs/statsig-warehouse-native/features/cohort-metrics.md
+85 -11
diff --git a/docs/statsig-warehouse-native/metrics/count-distinct.md b/docs/statsig-warehouse-native/metrics/count-distinct.md
+1 -1
diff --git a/docs/statsig-warehouse-native/metrics/count.md b/docs/statsig-warehouse-native/metrics/count.md
+3 -3
diff --git a/docs/statsig-warehouse-native/metrics/latest-value.md b/docs/statsig-warehouse-native/metrics/latest-value.md
+1 -1
diff --git a/docs/statsig-warehouse-native/metrics/max-min.md b/docs/statsig-warehouse-native/metrics/max-min.md
+6 -6
diff --git a/docs/statsig-warehouse-native/metrics/mean.md b/docs/statsig-warehouse-native/metrics/mean.md
+3 -3
diff --git a/docs/statsig-warehouse-native/metrics/ratio.md b/docs/statsig-warehouse-native/metrics/ratio.md
+3 -3
@@ -349,6 +349,9 @@
     "xAU",
     "yyyy",
     "nd",
+    "lossier",
+    "cohorted",
+    "Cohorted",
     "winsorized",
     "onboarded",
     "misconfigured",
@@ -691,4 +694,4 @@
     "/\\[.*?\\]\\(.*?\\)/",
     "```[\\s\\S]*?```"
   ]
-}
+}
@@ -166,11 +166,11 @@ Most aggregation-type metrics (sum, count, count distinct, unit count, means, ra
 
 Cohort settings allow you to specify a window for data collection after a unit's exposure. For example, a 4-6 day cohort window would only count actions from days 4, 5, and 6 after a unit was exposed to an experiment.
 
-Only include units with a completed window can be selected to remove units out of pulse analysis for this metric until the cohort window has completed. On experiment settings, you can choose to enable post-experiment data collection to allow these cohorts to mature in the case that you believe the intervention effect will still apply even if the user gets the control/shipped experiment (e.g. NUX experiments).
+For full details on cohorts, see the [cohort metrics documentation](../features/cohort-metrics.md).
 
 ### Baking
 
-Many metric types support baking. Statsig will wait to calculate baked metrics, and use "old" data for baked metrics. This is appropriate for cases like credit card chargebacks, where you may adjust your payments dataset to account for chargebacks in a "net revenue" metric.
+Many metric types support baking. Statsig will wait to calculate baked metrics, and use "old" data for baked metrics. This is appropriate for cases like credit card chargebacks, where you may adjust your payments dataset to account for chargebacks in a "net revenue" metric. See additional information in the [cohort documentation](../features/cohort-metrics.md).
 
 Statsig will:
 
 
@@ -1,28 +1,102 @@
 ---
 title: Cohort Metrics
 slug: /statsig-warehouse-native/features/cohort-metrics
-sidebar_label: Cohort/Window Metrics
-description: Analyze Cohorts with Statsig
-displayed_sidebar: cloud
+sidebar_label: Cohort Metrics
+description: Analyzing Cohorts with Statsig
 keywords:
-  - owner:vm
+  - owner:craig
 last_update:
-  date: 2024-08-05
+  date: 2025-05-07
 ---
 
 # Cohort Metrics
 
-Cohort metrics are a way to analyze the impact of an experiment in a certain time frame per user.
+Cohort metrics are a way to analyze the impact of an experiment in a certain time frame per experimental unit.
 
-This can be a useful way to ignore noisy day-1 metrics or capture a specific critical window around something like subscriptions. It is also an easy way to measure experiment-based retention, since you can use it to ask if an experiment - for example - caused users to come back in their 2nd week from being exposed to an experiment.
+This is useful for many reasons. Common use cases are:
 
-![Cohort Metric](https://github.com/statsig-io/docs/assets/102695539/cc96d4ba-4edc-4b31-b937-7ad8d62245f7)
+- Because all users have equal data collection periods, there is an "apples to apples" comparison across users enrolled early or late in the experiment (which often corresponds to power versus occasional users), and across different "time periods" that may have extrinsic factors like holidays
+- When analyzing an unbounded period, the variance of experimental units in the population can increase over time, leading to a scenario where error bars don't actually converge towards 0 as the experiment runs longer!
+- This allows one to skip noisy early metrics, or focus on outcomes after users might churn, e.g. capturing "week-2 engagement" if a product has a 1-week trial period
+- This can also be used to capture "one-shot retention". [Retention metrics](../metrics/retention.md) are used to capture rolling, ongoing retention. A user metric with a window from day X to day Y is a good way to check if an experiment is causing more users to retain for at least X days
 
-In the example above, the metric will count distinct users in an experiment group with a click event between 3 and 7 days from their first exposure.
+The downsides of cohort metrics are that:
 
-Since "Wait until users reach end window to include them in calculation" is clicked, users who are not yet at 7 days from their first exposure will be excluded from the numerator and the denominator of the metric in the analysis.
+- They do not capture any sort of long-term impact, or how that impact evolves over time. This is purely a point-in-time analysis and may not be appropriate for measuring complex, evolving behaviors
+- They make topline impact estimates lossier and harder to trust
 
-# Metric Bake Windows
+Some practitioners have made compelling arguments that cohort metrics are a better "standard" metric for organizations to use in analysis. Statsig tends to believe that the use of cohorts is dependent on business context, but it's worth considering whether they should be at least a part of an experiment's measurement (e.g. measuring topline revenue as an overall evaluation criterion, but also measuring 7d revenue alongside it for additional context).
+
+Cohort metrics on Statsig are feature-rich. This page explains the different settings available, what they do, and how they interact so that you can be confident that you know what's being measured.
+
+## Basic Cohort Windows
+
+Basic cohort windows are fairly simple. They are a filter on metric data with a time range relative to the unit's time of exposure. For example, a cohort metric from 1 to 6 days would filter to events from 24 hours until 144 hours after the unit was exposed to the experiment.
+
+Note that this is calculated as a timestamp comparison; a unit enrolled at 12pm will have exactly 24 hours until they hit the end of a 0-1 day cohort.
+
+For metrics from data sources that are marked as daily data, the cohort comparison is truncated to a date so that day-0 data behaves as expected (e.g. a user exposed on `2025-01-05T09:00` will include the date-based data from `2025-01-05` instead of truncating to times "after 9:00am").
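A minimal Python sketch of the window filter described above may help make the two comparison modes concrete. It is not Statsig's implementation: the function name `in_cohort_window` and its parameters are hypothetical, the boundaries are assumed to be inclusive, and the window is assumed to run from `start_day * 24h` to `end_day * 24h` after exposure, as in the 1-to-6 day example.

```python
from datetime import datetime, timedelta


def in_cohort_window(exposure_ts, event_ts, start_day, end_day, daily_data=False):
    """Return True if an event falls inside the unit's cohort window.

    Hypothetical helper for illustration; inclusive boundaries are an assumption.
    """
    if daily_data:
        # Daily sources: compare calendar dates so day-0 data is kept whole.
        elapsed_days = (event_ts.date() - exposure_ts.date()).days
        return start_day <= elapsed_days <= end_day
    # Timestamped sources: compare exact offsets from the exposure time.
    elapsed = event_ts - exposure_ts
    return timedelta(days=start_day) <= elapsed <= timedelta(days=end_day)


exposure = datetime(2025, 1, 5, 9, 0)
# 30 hours after exposure is inside a 1-6 day window (24h to 144h).
print(in_cohort_window(exposure, exposure + timedelta(hours=30), 1, 6))
# 12 hours after exposure is before the window opens.
print(in_cohort_window(exposure, exposure + timedelta(hours=12), 1, 6))
# Daily data dated 2025-01-05 counts as day 0 rather than "before 9:00am".
print(in_cohort_window(exposure, datetime(2025, 1, 5), 0, 1, daily_data=True))
```

The `daily_data` flag stands in for the "daily data" marking on a metric source. Without it, the last call would be rejected, because midnight on `2025-01-05` precedes the 9:00 exposure.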
+
+## Waiting for Maturation
+
+By default, cohort metrics can include units at different stages of maturation. For a 1-week cohort, units enrolled in the last week of the experiment will have only partially completed windows at analysis time. This yields the maximal sample, but can "dilute" the analysis with partial cohort windows. To prevent this, mark the metric as needing to "Wait for cohort window to complete". This drops the metric data of units with incomplete windows from the analysis and removes those units from the experiment analysis population.
+
+In the examples below, one metric forces cohorts to complete. It has **fewer units** in the analysis, since many units do not have a complete window; a **lower total**, because of the smaller unit count; and a **higher mean**, since the units it includes have all completed their window and so have a longer data collection period on average.
+
+![Metric that did not wait to mature](/img/whn/basic_cohort_metric.png)
+![Metric that waited to mature](/img/whn/wait_to_mature_metric.png)
+
+This can cause confusion since it leads to different populations between metrics, and also filters out the last few days of an experiment's data in the daily timeseries, since new cohorts' data is not included. A metric's cohort settings can be seen quickly by hovering over the name of the metric in the experiment scorecard.
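The effect of waiting for maturation can be sketched with toy numbers. The data below is invented purely for illustration, and `summarize` is a hypothetical helper rather than anything Statsig exposes. It assumes a unit's window is complete once `window_days` have elapsed since exposure.

```python
from datetime import datetime, timedelta


def summarize(units, analysis_ts, window_days, wait_for_completion):
    """Summarize a cohort metric over (exposure_ts, metric_value) pairs.

    Hypothetical sketch: each pair holds a unit's exposure time and the
    metric value collected so far inside its cohort window.
    """
    window = timedelta(days=window_days)
    if wait_for_completion:
        # Keep only units whose full window has elapsed at analysis time.
        units = [(ts, v) for ts, v in units if ts + window <= analysis_ts]
    total = sum(v for _, v in units)
    count = len(units)
    mean = total / count if count else 0.0
    return {"units": count, "total": total, "mean": mean}


analysis = datetime(2025, 1, 15)
toy_units = [
    (datetime(2025, 1, 1), 10.0),  # window complete
    (datetime(2025, 1, 3), 8.0),   # window complete
    (datetime(2025, 1, 12), 2.0),  # window still open
    (datetime(2025, 1, 14), 0.0),  # window still open
]
print(summarize(toy_units, analysis, 7, wait_for_completion=False))
print(summarize(toy_units, analysis, 7, wait_for_completion=True))
```

With these made-up values, waiting for completion keeps 2 of 4 units, lowers the total from 20.0 to 18.0, and raises the mean from 5.0 to 9.0, which is the same direction of change described for the two example metrics.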
+
+## Visual Examples
+
+This is what the data collection looks like for a standard cohort metric with a 0-6 day window. Note that this collects data for **7 days** - it's 0-indexed.
+
+![Basic cohort example](/img/whn/basic_cohort_example.png)
+
+If the cohort period goes over the end of the experiment, the default behavior is that the data collection is truncated to the end of the experiment.
+
+![Basic cohort over end example](/img/whn/basic_cohort_over_end_example.png)
+
+If the metric is configured to only allow completed cohort windows, the unit is completely excluded from the analysis. For example, they are not part of the denominator for the average value of a sum or count metric, and their metric data is filtered from the analysis.
+
+![Completed window example](/img/whn/completed_window_example.png)
+
+If **Allow Cohort Metrics to Mature After Experiment End** is enabled on the experiment, data collection continues after the end of the experiment (whether or not waiting for maturation is enabled).
+
+![Mature after end example](/img/whn/mature_after_end_example.png)
+
+## Experiment-based Cohort Settings
+
+Cohort controls are also available on experiments. This is because the relevance of these features depends on the kind of experiment being run, as discussed below.
+
+### Allow post-experiment cohort data
+
+Checking **Allow Cohort Metrics to Mature After Experiment End** allows metrics to be collected after the end of the experiment. This is recommended for one-time interventions, e.g. a new signup page; getting post-experiment signal from units who did get the intervention means additional statistical power.
+
+![Allow Cohorts to Mature](/img/whn/allow_cohorts_to_mature.png)
+
+This is **not** recommended in cases where the intervention is continuous, e.g. a ranking change; post-experiment data will dilute results, since test users might get the control experience in their post-experiment period.
+
+On the data side, the analysis will be extended past the end of the experiment by the length of the longest cohort window on any metric in the analysis. Non-cohorted metrics will have their date constrained to the analysis period, and cohort metrics will be filtered to the end of the experiment + their cohort window.
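The date arithmetic in the paragraph above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `analysis_end` and `metric_cutoff` are hypothetical, and a cohort window is represented simply by its end day (`None` for a non-cohorted metric).

```python
from datetime import datetime, timedelta


def analysis_end(experiment_end, cohort_end_days):
    """End of the extended analysis: experiment end plus the longest cohort window."""
    windows = [d for d in cohort_end_days if d is not None]
    longest = max(windows) if windows else 0
    return experiment_end + timedelta(days=longest)


def metric_cutoff(experiment_end, cohort_end_day=None):
    """Latest timestamp kept for a single metric when post-experiment data is allowed."""
    if cohort_end_day is None:
        # Non-cohorted metrics stay constrained to the experiment's own period.
        return experiment_end
    # Cohort metrics are filtered to experiment end + their own window.
    return experiment_end + timedelta(days=cohort_end_day)


end = datetime(2025, 3, 1)
print(analysis_end(end, [None, 7, 14]))  # extended by the 14-day window
print(metric_cutoff(end))                # non-cohorted metric
print(metric_cutoff(end, 7))             # 7-day cohort metric
```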
+
+### Fixed Duration
+
+On the **Experiment Population** section of the experiment setup page, there is an option for fixed-window cohorting under **Configure Analysis Period** with Analysis Type **Fixed Duration**. This only counts metrics for a certain period after **experiment start**. This is useful when an experiment is tied to a specific point in time, like email campaigns or other cases of fixed enrollment events.
+
+![Analysis Period Settings](/img/whn/configure_analysis_period.png)
+
+This setting is only available for assign and analyze experiments.
+
+### Cohorted Duration
+
+On the **Experiment Population** section of the experiment setup page, there is an option for allocation-based cohorting under **Configure Analysis Period** with Analysis Type **Cohorted Duration**. This is identical to the metric-based cohort, but applies globally to all possible metrics in the experiment analysis. This is a great way to globally add a cohort when appropriate, e.g. new user experiments.
+
+Note that when this is used in conjunction with metric cohorts, the earlier of the two cohort window ends will be used. For example, if a metric cohort is set to end at 7 days and the experiment cohort at 10, 7 will be used. If the metric cohort is set to end at 7 days and the experiment cohort at 5, 5 will be used.
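The rule above amounts to taking the minimum of the two window ends. A small sketch, with the hypothetical helper name `effective_window_end` and `None` standing in for "no cohort configured at that level":

```python
def effective_window_end(metric_end_day=None, experiment_end_day=None):
    """Return the cohort window end (in days) that applies to a metric.

    Hypothetical helper: when both a metric-level and an experiment-level
    cohort are set, the earlier end wins; otherwise whichever is set applies.
    """
    if metric_end_day is None:
        return experiment_end_day
    if experiment_end_day is None:
        return metric_end_day
    return min(metric_end_day, experiment_end_day)


print(effective_window_end(7, 10))  # metric ends at 7, experiment at 10
print(effective_window_end(7, 5))   # metric ends at 7, experiment at 5
```

The two calls mirror the two examples in the text and print 7 and 5 respectively.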
+
+The **only include units with a completed cohort window** setting can also be specified at the experiment level and will apply to all metrics in the experiment if so. If not checked, this will be applied on a metric-by-metric basis.
+
+## Metric Bake Windows
 
 In some cases, a metric in your warehouse may not be matured until a certain time period, after which you care about the daily value. Statsig provides the option to specify a bake window for your metrics. Similarly to the option above, metrics who have not reached the bake window's end will be excluded from the numerator and denominator of the metric in the analysis.
 
 
@@ -74,6 +74,6 @@ In the metrics page view, we use APPROX_COUNT_DISTINCT (or equivalent) to avoid
   - Specify if you want to calculate CUPED, and the lookback window for CUPED's pre-experiment data inputs
 - Thresholding
   - Turn this metric into a 1/0 unit count metric counting if the unit's total count equals to or surpasses (>=) a given threshold
-- Cohort Windows
+- [Cohort Windows](../features/cohort-metrics.md)
   - You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
     - **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
@@ -70,8 +70,8 @@ Count metrics are simple, and will use the sql COUNT aggregation. However, there
   - Specify if you want to calculate CUPED, and the lookback window for CUPED's pre-experiment data inputs
 - Thresholding
   - Turn this metric into a 1/0 unit count metric counting if the unit's total count equals to or surpasses (>=) a given threshold
-- Cohort Windows
+- [Cohort Windows](../features/cohort-metrics.md)
   - You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
     - **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
-- Baked Metrics
-  - Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
+- [Baked Metrics](../features/cohort-metrics.md)
+  - [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
@@ -65,7 +65,7 @@ Users without a value will be treated as 0s; note that if there is an existing v
 
 - Metric Breakdowns
   - You can configure Metadata Columns to group results by, getting easy access to dimensional views in pulse results
-- Cohort Windows
+- [Cohort Windows](../features/cohort-metrics.md)
   - You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
     - **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
 - CUPED
 
@@ -9,12 +9,12 @@ last_update:
 
 ## Summary
 
-Max metrics calculate the maximum of a column from the metric source at the unit level. 
-Min Metrics calculate the minimum of a column from the metric source at the unit level. 
+Max metrics calculate the maximum of a column from the metric source at the unit level.
+Min Metrics calculate the minimum of a column from the metric source at the unit level.
 
 ### Use Cases
 
-Max/min metrics allow you to easily track users' extremes during an experiment.  
+Max/min metrics allow you to easily track users' extremes during an experiment.
 
 Common examples are:
 
@@ -100,8 +100,8 @@ Max/min metrics are simple and there are many advanced options you can apply.
   - Specify if you want to calculate CUPED, and the lookback window for CUPED's pre-experiment data inputs
 - Thresholding
   - Turn this metric into a 1/0 unit count metric counting if the unit's max/min equals to or surpasses (>=) a given threshold
-- Cohort Windows
+- [Cohort Windows](../features/cohort-metrics.md)
   - You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
     - **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
-- Baked Metrics
-  - Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
+- [Baked Metrics](../features/cohort-metrics.md)
+  - [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
@@ -52,8 +52,8 @@ Mean metrics have the delta method applied to account for covariance between uni
   - You can configure Metadata Columns to group results by, getting easy access to dimensional views in pulse results
 - Winsorization
   - Specify a lower and/or upper percentile bound to winsorize at. All values below the lower threshold, or above the upper threshold, will be clamped to that threshold to reduce the outsized impact of outliers on your analysis
-- Cohort Windows
+- [Cohort Windows](../features/cohort-metrics.md)
   - You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
     - **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
-- Baked Metrics
-  - Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
+- [Baked Metrics](../features/cohort-metrics.md)
+  - [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
@@ -63,12 +63,12 @@ By default, Statsig only includes numerators from metrics with non-null, non-zer
 
 ## Options
 
-- Cohort Windows (Numerator and Denominator)
+- [Cohort Windows](../features/cohort-metrics.md) (Numerator and Denominator)
   - You can specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment
     - **Only include units with a completed window** can be selected to remove units out of pulse analysis for this metric until the cohort window has completed
 - Winsorization
   - Specify a lower and/or upper percentile bound to winsorize at. Winsorization and its thresholds can be specified for both the numerator and denominator of the ratio metric independently. All values below the lower threshold, or above the upper threshold, will be clamped to that threshold to reduce the outsized impact of outliers on your analysis
 - Include units which do not have a denominator
   - Control whether you want to include numerators from units which don't have a denominator value
-- Baked Metrics
-  - Baked metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.
+- [Baked Metrics](../features/cohort-metrics.md)
+  - [Baked Metrics](../features/cohort-metrics.md) allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit's metric has matured.