Description
We recently had to increase the timeout for the ROCm job from 2 hours to 2.5 hours because runs had started timing out.
I collected the 100 slowest tests (listed below). It would be good for someone to go through them and identify ones to cut.
Note: some tests, such as test_int8_dynamic_quant_subclass_api_10_cuda, are parameterized to run 10+ times with different settings. The durations of those runs are listed separately rather than summed, so the total cost of such a test is higher than any single entry suggests. Some of those settings could likely be cut.
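Because parameterized runs are reported one per line, a small script can help sum them per file or per test class before deciding what to cut. The following is a minimal sketch, not part of the CI setup: the function name `total_by_prefix` and the `depth` parameter are hypothetical, and it assumes the log lines follow the `<seconds>s call <node id>` shape shown in the listing.

```python
import re
from collections import defaultdict

# Matches duration lines such as:
#   91.91s call test/integration/test_integration.py::TestSubclass::test_x_10_cuda
DURATION_LINE = re.compile(
    r"^(?P<secs>\d+(?:\.\d+)?)s\s+(?:call|setup|teardown)\s+(?P<nodeid>\S+)$"
)


def total_by_prefix(log_text: str, depth: int = 1) -> dict[str, float]:
    """Sum durations grouped by the first `depth` '::'-separated parts of the node id.

    depth=1 groups by test file, depth=2 by file and class (or file and
    function for module-level tests). Hypothetical helper for illustration.
    """
    totals: dict[str, float] = defaultdict(float)
    for line in log_text.splitlines():
        match = DURATION_LINE.match(line.strip())
        if match is None:
            continue  # skip headers, tracebacks, and other non-duration lines
        key = "::".join(match["nodeid"].split("::")[:depth])
        totals[key] += float(match["secs"])
    # Return the groups ordered from slowest to fastest.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    # A few lines copied from the listing, plus a non-duration line to show it is ignored.
    sample = """
108.52s call test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_06_cuda
91.91s call test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_10_cuda
89.74s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype6]
Traceback (most recent call last):
"""
    for name, seconds in total_by_prefix(sample, depth=1).items():
        print(f"{seconds:8.2f}s  {name}")
```

Running it over the full listing with `depth=1` gives a per-file total, and `depth=2` a per-class total, which makes it easier to see where the parameterized variants add up.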
============================ slowest 100 durations =============================
399.42s call test/sparsity/test_sparse_api.py::TestQuantBlockSparseWeight::test_sparse_compile_True
222.07s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_mixed_precision_training_compile_True_config0_module_swap_False
131.86s call test/quantization/pt2e/test_x86inductor_quantizer.py::TestQuantizePT2EX86Inductor::test_qat_conv2d_unary
127.37s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_training_compile_True_device_cuda
116.83s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_training_compile_False_device_cuda
108.52s call test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_06_cuda
96.88s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_compile_leading_dims1_bias_False_device_cuda
91.91s call test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_10_cuda
90.94s call test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_08_cuda
89.74s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype6]
84.81s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_compile_leading_dims0_bias_False_device_cuda
80.68s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype4]
78.79s call test/prototype/test_quantized_training.py::TestFSDP2::test_fsdp2_correctness
78.78s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_True
78.03s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_True
77.84s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_False
77.72s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_False
77.70s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_True
77.57s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_False_is_dynamic_False
77.38s call test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_to_scaled_tc_floatx_compile_ebits_2_mbits_2_device_cuda
73.38s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_False_is_dynamic_True
69.83s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_False
69.24s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_compile_leading_dims0_bias_True_device_cuda
66.96s call test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_from_scaled_tc_floatx_compile_ebits_3_mbits_2_device_cuda
65.46s call test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_from_scaled_tc_floatx_compile_ebits_2_mbits_2_device_cuda
63.60s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_no_conv_bias
62.90s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_no_conv_bias
61.49s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype0]
59.80s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_compile_10_cuda
55.92s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_compile_16_cuda
52.46s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype1]
50.53s call test/prototype/test_blockwise_triton.py::test_blockwise_fp8_gemm[dtype0-67-6656-1408]
49.92s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_23_cuda
49.74s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_17_cuda
48.96s call test/integration/test_integration.py::TestSubclass::test_aq_int8_weight_only_quant_3_subclass_3_cuda
48.79s call test/prototype/test_blockwise_triton.py::test_blockwise_fp8_gemm[dtype0-13-8704-8576]
48.21s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_27_cuda
48.19s call test/prototype/test_blockwise_triton.py::test_blockwise_fp8_gemm[dtype0-26-18944-1664]
47.00s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_18_cuda
46.60s call test/prototype/test_blockwise_triton.py::test_blockwise_fp8_gemm[dtype0-3-2048-2048]
45.42s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_22_cuda
44.79s call test/prototype/test_blockwise_triton.py::test_blockwise_fp8_gemm[dtype0-4-3584-640]
44.63s call test/integration/test_integration.py::TestSubclass::test_aq_int8_weight_only_quant_3_subclass_5_cuda
44.49s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_double_access_5_cuda
44.33s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_compile_leading_dims2_bias_True_device_cuda
44.31s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_28_cuda
43.75s call test/integration/test_integration.py::SmoothquantUnitTest::test_weight_t_and_non_t_numerics_match
43.55s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_compile_13_cuda
42.96s call test/prototype/test_quantized_training.py::TestQuantizedTraining::test_int8_weight_only_compile_leading_dims1_bias_True_device_cuda
40.52s call test/prototype/test_spinquant.py::test_spinquant_no_quantization[cpu]
38.83s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_compile_12_cuda
37.94s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_double_access_3_cuda
37.77s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_False_is_dynamic_True
35.60s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion
35.45s call test/prototype/test_blockwise_triton.py::test_blockwise_fp8_gemm[dtype0-2-512-128]
35.42s call test/integration/test_integration.py::TestSubclass::test_aq_int8_dynamic_quant_subclass_4_cuda
35.38s call test/prototype/test_spinquant.py::test_spinquant_no_quantization[cuda]
35.23s call test/integration/test_integration.py::TestSubclass::test_int4_weight_only_hqq_quant_subclass_api_5_cuda
35.06s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_mha_3_cuda
34.59s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype5]
34.51s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_False
34.31s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_26_cuda
33.87s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_relu_fusion_cuda
33.85s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_double_access_4_cuda
33.69s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_16_cuda
33.68s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_relu_fusion
32.65s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_cuda
32.58s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_mha_5_cuda
32.46s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype3]
32.32s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_literal_args
32.18s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion
32.14s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_relu_fusion_cuda
32.12s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_relu_fusion_no_conv_bias
32.03s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_relu_fusion
32.01s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_cuda
31.90s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_True
31.83s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_True
31.51s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_literal_args
31.33s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_True
31.32s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_mha_4_cuda
30.90s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_relu_fusion_no_conv_bias
29.75s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_transpose_bn
29.65s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_compile_15_cuda
29.42s call test/integration/test_integration.py::TestWeightOnlyInt8Quant::test_weight_only_quant_force_mixed_mm_4_cuda
29.37s call test/integration/test_integration.py::TestSubclass::test_aq_int8_dynamic_quant_subclass_3_cuda
29.22s call test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_False
29.11s call test/quantization/pt2e/test_x86inductor_quantizer.py::TestQuantizePT2EX86Inductor::test_qat_conv2d_binary
28.66s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_update_shared_qspec
28.00s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_compile_09_cuda
27.83s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_transpose_bn
27.46s call test/integration/test_integration.py::TestWeightOnlyInt8Quant::test_weight_only_quant_force_mixed_mm_5_cuda
27.43s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_update_shared_qspec
26.88s call test/integration/test_integration.py::TestSubclass::test_int8_weight_only_quant_subclass_api_4_cuda
26.61s call test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_transpose_bn_relu
26.58s call test/integration/test_integration.py::TestWeightOnlyInt8Quant::test_weight_only_quant_force_mixed_mm_3_cuda
25.63s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_21_cuda
25.54s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_19_cuda
25.28s call test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_09_cuda
25.25s call test/integration/test_integration.py::TestAutoQuant::test_autoquant_one_input_29_cuda
25.15s call test/dtypes/test_uintx.py::test_uintx_target_dtype_compile[dtype2]
=========================== short test summary info ============================