DISABLED test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16 (__main__.TestFlexAttentionCUDA)


The test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16 test case in the TestFlexAttentionCUDA suite has been disabled because it is flaky: it has been failing intermittently in Continuous Integration (CI), and its recent failures are tracked in the PyTorch issue tracker.

This disable applies only to the Linux platform.

To debug this test case, follow these steps:

  1. Open the recent samples link: visit the [recent examples](https://hud.pytorch.org/flakytest?name=test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16&suite=TestFlexAttentionCUDA&limit=100) page to view the recent failures of this test case.
  2. Open the workflow logs: from a failure on that page, follow its workflow logs link to view the logs of the failed run.
  3. Expand the Test step: expand the Test step of the failing job so that it is fully visible; this makes the full output searchable.
  4. Search for the test case name: use the `grep` command (or the Python sketch after this list) to locate the output of the specific run that failed.
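Step 4 can also be done without shell tools. The following is a minimal Python sketch that mimics `grep -C 2` for the test name; the log filename is an assumption, so save the raw job log locally under that name first.

```python
# Scan a downloaded workflow log for the failing test's name and print
# surrounding context. "workflow_log.txt" is an assumed filename.
test_name = (
    "test_builtin_score_mods_different_block_size_"
    "float16_score_mod6_BLOCK_SIZE3_cuda_float16"
)
with open("workflow_log.txt", errors="replace") as f:
    lines = f.readlines()
for i, line in enumerate(lines):
    if test_name in line:
        # Print two lines of context on each side, grep -C 2 style.
        print("".join(lines[max(0, i - 2) : i + 3]), end="")
```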

The following is a sample error message that may be encountered when running this test case:

```python
Traceback (most recent call last):
  File "/var/lib/jenkins/workspace/test/inductor/test_flex_attention.py", line 1201, in test_builtin_score_mods_different_block_size
    self.run_test(score_mod, dtype, block_mask=block_mask, device=device)
  File "/var/lib/jenkins/workspace/test/inductor/test_flex_attention.py", line 491, in run_test
    golden_out.backward(backward_grad.to(torch.float64))
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py", line 648, in backward
    torch.autograd.backward(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/__init__.py", line 354, in backward
    _engine_run_backward(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py", line 824, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py", line 307, in apply
    return user_fn(self, *args)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flex_attention.py", line 679, in backward
    ) = flex_attention_backward(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flex_attention.py", line 132, in __call__
    return super().__call__(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 490, in __call__
    return wrapper()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 486, in wrapper
    return self.dispatch(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 346, in dispatch
    return kernel(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 320, in maybe_run_autograd
    return self(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flex_attention.py", line 132, in __call__
    return super().__call__(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 490, in __call__
    return wrapper()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 486, in wrapper
    return self.dispatch(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 346, in dispatch
    return kernel(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flex_attention.py", line 870, in sdpa_dense_backward
    grad_scores, _, _, _, _, *grad_score_mod_captured = joint_score_mod(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py", line 202, in wrapped
    return vmap_impl(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 334, in vmap_impl
    return _flat_vmap(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 484, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py", line 202, in wrapped
    return vmap_impl(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 334, in vmap_impl
    return _flat_vmap(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 484, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py", line 202, in wrapped
    return vmap_impl(
  File "/opt/conda/envs/py_3.10/lib/python3.10-packages/torch/_functorch/vmap.py", line 334, in vmap_impl
    return _flat_vmap(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 484, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py", line 202, in wrapped
    return vmap_impl(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 334, in vmap_impl
    return _flat_vmap(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 484, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph_module.py", line 833, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph_module.py", line 409, in __call__
    raise e
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph_module.py", line 396, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "<eval_with_key>.1013 from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py:1265 in wrapped", line 9, in forward
    where = torch.ops.aten.where.self(ge, add, scalar_tensor);  add = scalar_tensor = where = None
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 795, in __call__
    return self._op(*args, **kwargs)
  File "/opt/conda/envs/py<br/>
**Q&A: DISABLED test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16 (__main__.TestFlexAttentionCUDA)**

**Q: What is the current status of the test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16 test case?**
A: The test case has been disabled due to its flaky nature and recent failures in Continuous Integration (CI) environments.

**Q: Why is the test case failing?**
A: The recorded traceback is truncated before the final error line, so the root cause cannot be confirmed from this report alone. What it does show is the failure surfacing during the backward pass of flex attention, triggered from `golden_out.backward(...)` in the test's `run_test` helper.

**Q: Could this be a CUDA out-of-memory error?**
A: Possibly. CUDA out-of-memory errors are a common cause of flakiness for tests like this one, which allocates a float64 golden reference alongside the float16 tensors under test, and free GPU memory varies across CI runs. Treat OOM as a hypothesis to verify against the full workflow logs, for example with the diagnostic sketch below.
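The following minimal sketch records peak CUDA memory and dumps the allocator summary if an OOM is actually raised; `reproduce_failure()` is a hypothetical stand-in for the reproduction sketch shown earlier in this report.

```python
# Diagnostic sketch: confirm or rule out CUDA OOM when rerunning locally.
# reproduce_failure() is a hypothetical stand-in for the repro sketch above.
import torch

torch.cuda.reset_peak_memory_stats()
try:
    reproduce_failure()
except torch.cuda.OutOfMemoryError:
    # Dump the caching allocator's view of memory before re-raising.
    print(torch.cuda.memory_summary())
    raise
finally:
    peak_mib = torch.cuda.max_memory_allocated() / 2**20
    print(f"peak CUDA memory allocated: {peak_mib:.1f} MiB")
```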

**Q: How can I debug the test case?**
A: Follow the four debugging steps listed earlier in this report: open the [recent examples](https://hud.pytorch.org/flakytest?name=test_builtin_score_mods_different_block_size_float16_score_mod6_BLOCK_SIZE3_cuda_float16&suite=TestFlexAttentionCUDA&limit=100) page, open the workflow logs of a failed run, expand the Test step, and search the logs for the test case name.

**Q: What error message may be encountered when running this test case?**
A: See the sample traceback earlier in this report: the failure surfaces in `flex_attention_backward` → `sdpa_dense_backward` → a vmapped `joint_score_mod`, and the log is truncated before the final error line.