Add warning/error for fused_tasks_and_states compute mode for 2D/List state tensors (#2899)

ge0405 · facebook-github-bot · commit 9eaec095ce75 · 2025-04-21T15:15:06.000-07:00
Summary:

Pull Request resolved: #2899

I visited all child classes that use `RecMetricComputation` to see if any is incompatible with the `FUSED_TASKS_AND_STATES_COMPUTATION` mode added in D72010614. As of Apr 16, 2025, searching "`(RecMetricComputation`" in fbcode returned 47 results:

- 1 is rec_metric in a comment (not shown in the following table)
- 42 metric classes defined in torchrec, 29 in OSS, 13 in FB
- 4 metric classes in customer codebases, e.g. MVAI, admarket (see rows 43-46 in the table below)

`RecMetricComputation` uses `state` tensors to compute/update results. I looked through the state tensors of all the following metrics to see if any of them are (1) not a Tensor (e.g. List) or (2) a 2D Tensor. If so, when these metrics use `FUSED_TASKS_AND_STATES_COMPUTATION`, the init should show a warning (and simply FUSED_TASKS_COMPUTATION will be used) or raise an Exception (if these metrics can't allow any fuse mode).

| # | dir | metric | type of state tensors [1] | tests/warning |
|---|---|---|---|---|
| 1 | oss | auc.py | [List](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/metrics/auc.py?lines=192) | warning |
| 2 | oss | tower_qps.py | | |
| 3 | oss | precision_session.py | not allow fuse, [code](https://www.internalfb.com/code/fbsource/[390e95a4e30dfd1f2c7ad23b2de92b1b7edbdf15]/fbcode/torchrec/metrics/precision_session.py?lines=203) | both fuse modes raise exception |
| 4 | oss | serving_ne.py | | |
| 5 | oss | recall_session.py | not allow fuse, [code](https://www.internalfb.com/code/fbsource/[390e95a4e30dfd1f2c7ad23b2de92b1b7edbdf15]/fbcode/torchrec/metrics/recall_session.py?lines=242) | both fuse modes raise exception |
| 6 | oss | calibration.py | | |
| 7 | oss | multiclass_recall.py | [2D tensor](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/metrics/multiclass_recall.py?lines=99) | warning |
| 8 | oss | ndcg.py | | |
| 9 | oss | gauc.py | | |
| 10 | oss | tensor_weighted_avg.py | | |
| 11 | oss | ne.py | | |
| 12 | oss | serving_calibration.py | | |
| 13 | oss | segmented_ne.py | [2D tensor](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/metrics/segmented_ne.py?lines=187) | warning |
| 14 | oss | scalar.py | | |
| 15 | oss | mae.py | | |
| 16 | oss | ne_positive.py | | |
| 17 | oss | weighted_avg.py | | |
| 18 | oss | output.py | | |
| 20 | oss | cali_free_ne.py | | |
| 21 | oss | unweighted_ne.py | | |
| 22 | oss | hindsight_target_pr.py | 1D but not n_tasks, [code](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/metrics/hindsight_target_pr.py?lines=131) | can still fuse states |
| 23 | oss | rauc.py | [List](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/metrics/rauc.py?lines=238) | warning |
| 24 | oss | precision.py | | |
| 25 | oss | recall.py | | |
| 26 | oss | auprc.py | [List](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/metrics/auprc.py?lines=186) | warning |
| 27 | oss | mse.py | | |
| 28 | oss | ctr.py | | |
| 29 | oss | accuracy.py | | |
| 30 | fb | log_normal_cnll.py | | |
| 31 | fb | coarse_grained_multiclass_ne.py | [2D tensor](https://www.internalfb.com/code/fbsource/[8670f823b23ca84165106e7a7bc236d066a03c7d]/fbcode/torchrec/fb/metrics/coarse_grained_multiclass_ne.py?lines=62) | warning |
| 32 | fb | regression_huber.py | | |
| 33 | fb | modified_poisson_nll.py | | |
| 34 | fb | res_ne.py | | |
| 35 | fb | dist_shift.py | | |
| 36 | fb | serving_ne.py | | |
| 37 | fb | unjoined_calibration.py | | |
| 38 | fb | unjoined_ne.py | | |
| 39 | fb | serving_calibration.py | | |
| 40 | fb | multiclass_ne.py | [2D tensor](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/torchrec/fb/metrics/multiclass_ne.py?lines=148) | warning |
| 41 | fb | bucket_metric.py [2] | not allow fuse, [code](https://www.internalfb.com/code/fbsource/[0cc90263a7ae86e5327b153ccfe2f79b5956c69d]/fbcode/torchrec/fb/metrics/bucket_metric.py?lines=151-157) | both fuse modes raise exception |
| 42 | fb | bucket_weighted_average_metric.py [3] | shouldn't allow fuse, [code](https://www.internalfb.com/code/fbsource/[42cb13ac942ea4d2cb504d051b35419ccc6760f8]/fbcode/torchrec/fb/metrics/bucket_weighted_average_metric.py?lines=303) | both fuse modes should raise exception |
| 43 | MVAI | metrics.py | [List](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/minimal_viable_ai/models/blue_reels_true_interest/metrics.py?lines=84) | |
| 44 | MVAI | ndcg_metrics.py | | |
| 45 | admarket | metrics.py | [2D or 3D?](https://www.internalfb.com/code/fbsource/[0cd67a62f39734525b63e9fc054f9d169e48b793]/fbcode/admarket/targeting/lookalike_nextgen_trainer/lal_lr_trainer/utils/metrics.py?lines=157) | |
| 46 | mrs/fm | metrics.py | | |

[1] For metrics whose "type of state tensors" is not specified in the above table, all state tensors are 1D tensors with (n_tasks) shape.

[2] There are 7 bucket metrics (bucket_calibration, bucket_ctr, bucket_hindsight_target_pr, bucket_mse, bucket_ne, bucket_precision, bucket_recall) that inherit `BucketMetricComputation` defined in bucket_metric.py. All of them have 2D tensors and most shapes are (n_tasks, num_buckets).

[3] There is 1 bucket metric (bucket_weighted_average_logloss) that inherits `BucketWeightedAverageMetricComputation` defined in bucket_weighted_average_metric.py. All the state tensors are 2D (n_tasks, num_buckets).

Reviewed By: iamzainhuda

Differential Revision: D73293593

fbshipit-source-id: 4ad9dfafeed1171d63dc301f4dc7608c52d105b8
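
The rule described in the summary (warn and fall back when states are not 1D tensors, raise when a metric allows no fusion at all) can be sketched in plain Python. This is a minimal illustration and not the torchrec implementation: `ComputeMode`, `resolve_compute_mode`, the `allows_fusion` flag, the shape-tuple encoding of states, and the use of `ValueError` in place of `RecMetricException` are all assumptions made for the example.

```python
import logging
from enum import Enum, auto
from typing import Dict, Optional, Tuple

logger: logging.Logger = logging.getLogger(__name__)


class ComputeMode(Enum):
    """Hypothetical stand-in for torchrec's RecComputeMode."""

    UNFUSED_TASKS_COMPUTATION = auto()
    FUSED_TASKS_COMPUTATION = auto()
    FUSED_TASKS_AND_STATES_COMPUTATION = auto()


# Assumption for this sketch: a state is described by its tensor shape,
# or by None when it is not a tensor at all (e.g. a List state).
StateShape = Optional[Tuple[int, ...]]


def resolve_compute_mode(
    requested: ComputeMode,
    state_shapes: Dict[str, StateShape],
    namespace: str,
    allows_fusion: bool = True,
) -> ComputeMode:
    """Return the mode that would take effect for a metric (illustrative only)."""
    if requested == ComputeMode.UNFUSED_TASKS_COMPUTATION:
        return requested
    if not allows_fusion:
        # Stands in for metrics where both fused modes raise an exception.
        raise ValueError(f"Fused computation is not supported for {namespace}")
    if requested == ComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION:
        all_1d = all(
            shape is not None and len(shape) == 1
            for shape in state_shapes.values()
        )
        if not all_1d:
            logger.warning(
                "compute_mode FUSED_TASKS_AND_STATES_COMPUTATION can't support "
                "%s yet because its states are not 1D Tensors. "
                "Only FUSED_TASKS_COMPUTATION will take effect.",
                namespace,
            )
            return ComputeMode.FUSED_TASKS_COMPUTATION
    return requested
```

Under these assumptions, a metric with two `(n_tasks,)` states keeps the requested mode, a metric with a List state or an `(n_tasks, k)` state falls back to `FUSED_TASKS_COMPUTATION`, and a 1D state whose length is not `n_tasks` still counts as fusable, matching the hindsight_target_pr row.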
diff --git a/torchrec/metrics/auc.py b/torchrec/metrics/auc.py
@@ -7,6 +7,7 @@
 
 # pyre-strict
 
+import logging
 from functools import partial
 from typing import Any, Callable, cast, Dict, List, Optional, Tuple, Type
 
@@ -23,6 +24,8 @@
 )
 
 
+logger: logging.Logger = logging.getLogger(__name__)
+
 PREDICTIONS = "predictions"
 LABELS = "labels"
 WEIGHTS = "weights"
@@ -405,3 +408,8 @@ def __init__(
         )
         if kwargs.get("grouped_auc"):
             self._required_inputs.add(GROUPING_KEYS)
+        if self._compute_mode == RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION:
+            logger.warning(
+                f"compute_mode FUSED_TASKS_AND_STATES_COMPUTATION can't support {self._namespace} yet "
+                "because its states are not 1D Tensors. Only FUSED_TASKS_COMPUTATION will take effect."
+            )
diff --git a/torchrec/metrics/auprc.py b/torchrec/metrics/auprc.py
@@ -7,6 +7,7 @@
 
 # pyre-strict
 
+import logging
 from functools import partial
 from typing import Any, cast, Dict, List, Optional, Type
 
@@ -23,6 +24,8 @@
 )
 
 
+logger: logging.Logger = logging.getLogger(__name__)
+
 PREDICTIONS = "predictions"
 LABELS = "labels"
 WEIGHTS = "weights"
@@ -361,3 +364,8 @@ def __init__(
         )
         if kwargs.get("grouped_auprc"):
             self._required_inputs.add(GROUPING_KEYS)
+        if self._compute_mode == RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION:
+            logger.warning(
+                f"compute_mode FUSED_TASKS_AND_STATES_COMPUTATION can't support {self._namespace} yet "
+                "because its states are not 1D Tensors. Only FUSED_TASKS_COMPUTATION will take effect."
+            )
diff --git a/torchrec/metrics/metrics_config.py b/torchrec/metrics/metrics_config.py
@@ -81,8 +81,12 @@ class RecComputeMode(Enum):
     """This Enum lists the supported computation modes for RecMetrics.
 
     FUSED_TASKS_COMPUTATION indicates that RecMetrics will fuse the computation
-    for multiple tasks of the same metric. This can be used by modules where the
-    outputs of all the tasks are vectorized.
+        for multiple tasks of the same metric. This can be used by modules where the
+        outputs of all the tasks are vectorized.
+    FUSED_TASKS_AND_STATES_COMPUTATION fuses both the tasks (same as FUSED_TASKS_COMPUTATION)
+        and states (e.g. calibration_num and calibration_denom for calibration) of the
+        same metric. This currently only supports 1D state tensors (e.g. when all state
+        tensors are of the same (n_tasks) shape).
     """
 
     FUSED_TASKS_COMPUTATION = 1
diff --git a/torchrec/metrics/multiclass_recall.py b/torchrec/metrics/multiclass_recall.py
@@ -7,9 +7,11 @@
 
 # pyre-strict
 
+import logging
 from typing import Any, cast, Dict, List, Optional, Type
 
 import torch
+from torchrec.metrics.metrics_config import RecComputeMode
 from torchrec.metrics.metrics_namespace import MetricName, MetricNamespace, MetricPrefix
 
 from torchrec.metrics.rec_metric import (
@@ -20,6 +22,9 @@
 )
 
 
+logger: logging.Logger = logging.getLogger(__name__)
+
+
 def compute_true_positives_at_k(
     predictions: torch.Tensor,
     labels: torch.Tensor,
@@ -154,3 +159,11 @@ def _compute(self) -> List[MetricComputationReport]:
 class MulticlassRecallMetric(RecMetric):
     _namespace: MetricNamespace = MetricNamespace.MULTICLASS_RECALL
     _computation_class: Type[RecMetricComputation] = MulticlassRecallMetricComputation
+
+    def __init__(self, *args: Any, **kwargs: Any) -> None:
+        super().__init__(*args, **kwargs)
+        if self._compute_mode == RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION:
+            logger.warning(
+                f"compute_mode FUSED_TASKS_AND_STATES_COMPUTATION can't support {self._namespace} yet "
+                "because its states are not 1D Tensors. Only FUSED_TASKS_COMPUTATION will take effect."
+            )
diff --git a/torchrec/metrics/rauc.py b/torchrec/metrics/rauc.py
@@ -7,6 +7,7 @@
 
 # pyre-strict
 
+import logging
 from functools import partial
 from typing import Any, Callable, cast, Dict, List, Optional, Tuple, Type
 
@@ -23,6 +24,8 @@
 )
 
 
+logger: logging.Logger = logging.getLogger(__name__)
+
 PREDICTIONS = "predictions"
 LABELS = "labels"
 WEIGHTS = "weights"
@@ -448,3 +451,8 @@ def __init__(
         )
         if kwargs.get("grouped_rauc"):
             self._required_inputs.add(GROUPING_KEYS)
+        if self._compute_mode == RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION:
+            logger.warning(
+                f"compute_mode FUSED_TASKS_AND_STATES_COMPUTATION can't support {self._namespace} yet "
+                "because its states are not 1D Tensors. Only FUSED_TASKS_COMPUTATION will take effect."
+            )
diff --git a/torchrec/metrics/segmented_ne.py b/torchrec/metrics/segmented_ne.py
@@ -7,6 +7,7 @@
 
 # pyre-strict
 
+import logging
 from typing import Any, Dict, List, Optional, Type
 
 import torch
@@ -21,6 +22,8 @@
 )
 
 
+logger: logging.Logger = logging.getLogger(__name__)
+
 PREDICTIONS = "predictions"
 LABELS = "labels"
 WEIGHTS = "weights"
@@ -346,3 +349,8 @@ def __init__(
         else:
             # pyre-ignore[6]
             self._required_inputs.add(kwargs["grouping_keys"])
+        if self._compute_mode == RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION:
+            logger.warning(
+                f"compute_mode FUSED_TASKS_AND_STATES_COMPUTATION can't support {self._namespace} yet "
+                "because its states are not 1D Tensors. Only FUSED_TASKS_COMPUTATION will take effect."
+            )
diff --git a/torchrec/metrics/tests/test_mae.py b/torchrec/metrics/tests/test_mae.py
@@ -49,7 +49,7 @@ class MAEMetricTest(unittest.TestCase):
     clazz: Type[RecMetric] = MAEMetric
     task_name: str = "mae"
 
-    def test_unfused_mae(self) -> None:
+    def test_mae_unfused(self) -> None:
         rec_metric_value_test_launcher(
             target_clazz=MAEMetric,
             target_compute_mode=RecComputeMode.UNFUSED_TASKS_COMPUTATION,
@@ -63,7 +63,7 @@ def test_unfused_mae(self) -> None:
             entry_point=metric_test_helper,
         )
 
-    def test_fused_mae(self) -> None:
+    def test_mae_fused_tasks(self) -> None:
         rec_metric_value_test_launcher(
             target_clazz=MAEMetric,
             target_compute_mode=RecComputeMode.FUSED_TASKS_COMPUTATION,
@@ -77,6 +77,20 @@ def test_fused_mae(self) -> None:
             entry_point=metric_test_helper,
         )
 
+    def test_mae_fused_tasks_and_states(self) -> None:
+        rec_metric_value_test_launcher(
+            target_clazz=MAEMetric,
+            target_compute_mode=RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION,
+            test_clazz=TestMAEMetric,
+            metric_name="mae",
+            task_names=["t1", "t2", "t3"],
+            fused_update_limit=0,
+            compute_on_all_ranks=False,
+            should_validate_update=False,
+            world_size=WORLD_SIZE,
+            entry_point=metric_test_helper,
+        )
+
 
 class MAEGPUSyncTest(unittest.TestCase):
     clazz: Type[RecMetric] = MAEMetric
diff --git a/torchrec/metrics/tests/test_precision_session.py b/torchrec/metrics/tests/test_precision_session.py
@@ -12,8 +12,12 @@
 
 import torch
 from torch import no_grad
-from torchrec.metrics.metrics_config import RecTaskInfo, SessionMetricDef
 
+from torchrec.metrics.metrics_config import (
+    RecComputeMode,
+    RecTaskInfo,
+    SessionMetricDef,
+)
 from torchrec.metrics.precision_session import PrecisionSessionMetric
 from torchrec.metrics.rec_metric import RecMetricException
 
@@ -234,6 +238,37 @@ def test_error_messages(self) -> None:
                 tasks=[task_info2],
             )
 
+    def test_compute_mode_exception(self) -> None:
+        task_info = RecTaskInfo(
+            name="Task1",
+            label_name="label1",
+            prediction_name="prediction1",
+            weight_name="weight1",
+        )
+        with self.assertRaisesRegex(
+            RecMetricException,
+            "Fused computation is not supported for precision session-level metrics",
+        ):
+            PrecisionSessionMetric(
+                world_size=1,
+                my_rank=0,
+                batch_size=100,
+                tasks=[task_info],
+                compute_mode=RecComputeMode.FUSED_TASKS_COMPUTATION,
+            )
+
+        with self.assertRaisesRegex(
+            RecMetricException,
+            "Fused computation is not supported for precision session-level metrics",
+        ):
+            PrecisionSessionMetric(
+                world_size=1,
+                my_rank=0,
+                batch_size=100,
+                tasks=[task_info],
+                compute_mode=RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION,
+            )
+
     def test_tasks_input_propagation(self) -> None:
         task_info1 = RecTaskInfo(
             name="Task1",
diff --git a/torchrec/metrics/tests/test_recall_session.py b/torchrec/metrics/tests/test_recall_session.py
@@ -12,7 +12,11 @@
 
 import torch
 from torch import no_grad
-from torchrec.metrics.metrics_config import RecTaskInfo, SessionMetricDef
+from torchrec.metrics.metrics_config import (
+    RecComputeMode,
+    RecTaskInfo,
+    SessionMetricDef,
+)
 from torchrec.metrics.rec_metric import RecMetricException
 
 from torchrec.metrics.recall_session import RecallSessionMetric
@@ -243,6 +247,37 @@ def test_error_messages(self) -> None:
                 tasks=[task_info2],
             )
 
+    def test_compute_mode_exception(self) -> None:
+        task_info = RecTaskInfo(
+            name="Task1",
+            label_name="label1",
+            prediction_name="prediction1",
+            weight_name="weight1",
+        )
+        with self.assertRaisesRegex(
+            RecMetricException,
+            "Fused computation is not supported for recall session-level metrics",
+        ):
+            RecallSessionMetric(
+                world_size=1,
+                my_rank=0,
+                batch_size=100,
+                tasks=[task_info],
+                compute_mode=RecComputeMode.FUSED_TASKS_COMPUTATION,
+            )
+
+        with self.assertRaisesRegex(
+            RecMetricException,
+            "Fused computation is not supported for recall session-level metrics",
+        ):
+            RecallSessionMetric(
+                world_size=1,
+                my_rank=0,
+                batch_size=100,
+                tasks=[task_info],
+                compute_mode=RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION,
+            )
+
     def test_tasks_input_propagation(self) -> None:
         task_info1 = RecTaskInfo(
             name="Task1",
diff --git a/torchrec/metrics/tests/test_serving_ne.py b/torchrec/metrics/tests/test_serving_ne.py
@@ -78,7 +78,7 @@ def test_ne_unfused(self) -> None:
             entry_point=metric_test_helper,
         )
 
-    def test_ne_fused(self) -> None:
+    def test_ne_fused_tasks(self) -> None:
         rec_metric_value_test_launcher(
             target_clazz=ServingNEMetric,
             target_compute_mode=RecComputeMode.FUSED_TASKS_COMPUTATION,
@@ -92,7 +92,21 @@ def test_ne_fused(self) -> None:
             entry_point=metric_test_helper,
         )
 
-    def test_ne_update_fused(self) -> None:
+    def test_ne_fused_tasks_and_states(self) -> None:
+        rec_metric_value_test_launcher(
+            target_clazz=ServingNEMetric,
+            target_compute_mode=RecComputeMode.FUSED_TASKS_AND_STATES_COMPUTATION,
+            test_clazz=TestNEMetric,
+            metric_name=ServingNEMetricTest.task_name,
+            task_names=["t1", "t2", "t3"],
+            fused_update_limit=0,
+            compute_on_all_ranks=False,
+            should_validate_update=False,
+            world_size=WORLD_SIZE,
+            entry_point=metric_test_helper,
+        )
+
+    def test_ne_update_unfused(self) -> None:
         rec_metric_value_test_launcher(
             target_clazz=ServingNEMetric,
             target_compute_mode=RecComputeMode.UNFUSED_TASKS_COMPUTATION,