@@ -152,6 +152,10 @@ def _test_sharding(
         use_inter_host_allreduce: bool = False,
         allow_zero_batch_size: bool = False,
         custom_all_reduce: bool = False,
+        use_offsets: bool = False,
+        indices_dtype: torch.dtype = torch.int64,
+        offsets_dtype: torch.dtype = torch.int64,
+        lengths_dtype: torch.dtype = torch.int64,
     ) -> None:
         self._build_tables_and_groups(data_type=data_type)
         self._run_multi_process_test(
@@ -176,6 +180,10 @@ def _test_sharding(
             use_inter_host_allreduce=use_inter_host_allreduce,
             allow_zero_batch_size=allow_zero_batch_size,
             custom_all_reduce=custom_all_reduce,
+            use_offsets=use_offsets,
+            indices_dtype=indices_dtype,
+            offsets_dtype=offsets_dtype,
+            lengths_dtype=lengths_dtype,
         )
 
 
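The four new keyword arguments default to the current behavior (lengths-based inputs with int64 tensors), so existing call sites are unaffected. Below is a minimal sketch of how such knobs could be consumed downstream when building a KeyedJaggedTensor input; the helper build_kjt and the feature key "f1" are hypothetical and not part of this diff.

    # Hypothetical helper, not part of this PR: shows how use_offsets and the
    # three dtype knobs could shape the sparse input tensors.
    import torch
    from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

    def build_kjt(
        values: torch.Tensor,
        lengths: torch.Tensor,
        use_offsets: bool = False,
        indices_dtype: torch.dtype = torch.int64,
        offsets_dtype: torch.dtype = torch.int64,
        lengths_dtype: torch.dtype = torch.int64,
    ) -> KeyedJaggedTensor:
        values = values.to(indices_dtype)
        if use_offsets:
            # Derive offsets [0, l0, l0+l1, ...] from per-bag lengths.
            offsets = torch.cat(
                [torch.zeros(1, dtype=lengths.dtype), lengths.cumsum(0)]
            ).to(offsets_dtype)
            return KeyedJaggedTensor(keys=["f1"], values=values, offsets=offsets)
        return KeyedJaggedTensor(
            keys=["f1"], values=values, lengths=lengths.to(lengths_dtype)
        )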
@@ -901,3 +909,58 @@ def test_sharding_grid_8gpu(
             apply_optimizer_in_backward_config=apply_optimizer_in_backward_config,
             pooling=pooling,
         )
+
+    @unittest.skipIf(
+        torch.cuda.device_count() <= 1,
+        "Not enough GPUs, this test requires at least two GPUs",
+    )
+    # pyre-fixme[56]
+    @given(
+        dtype=st.sampled_from([torch.int32, torch.int64]),
+        use_offsets=st.booleans(),
+        sharder_type=st.sampled_from(
+            [
+                SharderType.EMBEDDING_BAG_COLLECTION.value,
+            ]
+        ),
+        kernel_type=st.sampled_from(
+            [
+                EmbeddingComputeKernel.FUSED.value,
+            ],
+        ),
+    )
+    @settings(verbosity=Verbosity.verbose, max_examples=2, deadline=None)
+    def test_sharding_diff_table_index_type(
+        self,
+        dtype: torch.dtype,
+        use_offsets: bool,
+        sharder_type: str,
+        kernel_type: str,
+    ) -> None:
+        """
+        Test that the model correctly handles input indices and offsets
+        with both int32 and int64 data types.
+        """
+        sharders = [
+            cast(
+                ModuleSharder[nn.Module],
+                create_test_sharder(
+                    sharder_type=sharder_type,
+                    sharding_type=ShardingType.ROW_WISE.value,  # or any other relevant sharding type
+                    kernel_type=kernel_type,
+                    device=self.device,
+                ),
+            ),
+        ]
+        # TODO - how to pass dtype so that sampled data uses different type indices/offsets?
+        self._test_sharding(
+            sharders=sharders,
+            backend=self.backend,
+            apply_optimizer_in_backward_config=None,
+            variable_batch_size=False,
+            pooling=PoolingType.SUM,
+            use_offsets=use_offsets,
+            indices_dtype=dtype,
+            offsets_dtype=dtype,
+            lengths_dtype=dtype,
+        )
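For reference, the property the new test exercises, namely that both int32 and int64 indices and offsets are accepted, can be sanity-checked standalone with plain PyTorch. This is a sketch under that assumption, not the TorchRec test harness; nn.EmbeddingBag documents support for both IntTensor and LongTensor inputs.

    # Standalone sanity check (plain PyTorch, outside the TorchRec harness):
    # the same indices/offsets work as IntTensor and as LongTensor.
    import torch

    bag = torch.nn.EmbeddingBag(num_embeddings=10, embedding_dim=4, mode="sum")
    for dtype in (torch.int32, torch.int64):
        indices = torch.tensor([1, 2, 4, 5, 4, 3], dtype=dtype)
        offsets = torch.tensor([0, 3], dtype=dtype)  # two bags of three indices
        out = bag(indices, offsets)
        assert out.shape == (2, 4)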