Skip to content

BatchOperator incompatible with airflow 3 #50535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 tasks done
davidcroda opened this issue May 13, 2025 · 6 comments · May be fixed by #50751 or #50932
Open
2 tasks done

BatchOperator incompatible with airflow 3 #50535

davidcroda opened this issue May 13, 2025 · 6 comments · May be fixed by #50751 or #50932
Assignees
Labels
area:providers good first issue kind:bug This is a clearly a bug provider:amazon AWS/Amazon - related issues

Comments

@davidcroda
Copy link

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==9.7.0

Apache Airflow version

3.0.1

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Astronomer

Deployment details

No response

What happened

The BatchOperator is unable to be used when invoked with the .expand function. It fails to be parsed with the following error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/airflow/serialization/serialized_objects.py", line 1314, in _serialize_node
    op.operator_extra_links.__get__(op)
  File "/usr/local/lib/python3.12/site-packages/airflow/providers/amazon/aws/operators/batch.py", line 156, in operator_extra_links
    wait_for_completion = self.wait_for_completion
                          ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MappedOperator' object has no attribute 'wait_for_completion'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/airflow/serialization/serialized_objects.py", line 1787, in to_dict
    json_dict = {"__version": cls.SERIALIZER_VERSION, "dag": cls.serialize_dag(var)}
                                                             ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/airflow/serialization/serialized_objects.py", line 1696, in serialize_dag
    raise SerializationError(f"Failed to serialize DAG {dag.dag_id!r}: {e}")
airflow.exceptions.SerializationError: Failed to serialize DAG 'parcel_slope_pipeline': 'MappedOperator' object has no attribute 'wait_for_completion

I believe this is due to the following check failing: https://github.com/apache/airflow/blob/main/providers/amazon/src/airflow/providers/amazon/aws/operators/batch.py#L148-L157

Because MappedOperator is imported from the old import path it is failing the isinstance check on airflow 3.

I am willing to make a PR to fix it but I'm not familiar enough with airflow to know how to fix this in a way which would would be backwards compatible.

What you think should happen instead

The BatchOperator uses an older import path for MappedOperator. This causes the isinstance check to fail on line 148. This check or import should be updated in a way that is compatible with both airflow 2.x and airflow 3. I was able to monkey patch the issue in my own dag with the following code:

# Monkey patch to fix MappedOperator detection in amazon provider 9.7.0
import airflow.providers.amazon.aws.operators.batch as _batch
from airflow.sdk.bases.operator import MappedOperator as _sdk_map
_batch.MappedOperator = _sdk_map 

How to reproduce

Here is a minimal example which reproduces the issue on airflow 3

from airflow.providers.amazon.aws.operators.batch import BatchOperator
from airflow.sdk import dag

@dag()
def repro():

    BatchOperator.partial(
        task_id="repro",
        job_definition="repo",
        job_queue="repro"
    ).expand(job_name=[1, 2, 3])

repro()

Anything else

Here is the error log from the example for posterity, similar to the one in my original issue:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/airflow/serialization/serialized_objects.py", line 1314, in _serialize_node
    op.operator_extra_links.__get__(op)
  File "/usr/local/lib/python3.12/site-packages/airflow/providers/amazon/aws/operators/batch.py", line 156, in operator_extra_links
    wait_for_completion = self.wait_for_completion
                          ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MappedOperator' object has no attribute 'wait_for_completion'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/airflow/serialization/serialized_objects.py", line 1787, in to_dict
    json_dict = {"__version": cls.SERIALIZER_VERSION, "dag": cls.serialize_dag(var)}
                                                             ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/airflow/serialization/serialized_objects.py", line 1696, in serialize_dag
    raise SerializationError(f"Failed to serialize DAG {dag.dag_id!r}: {e}")
airflow.exceptions.SerializationError: Failed to serialize DAG 'repro': 'MappedOperator' object has no attribute 'wait_for_completion'

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@davidcroda davidcroda added kind:bug This is a clearly a bug area:providers needs-triage label for new issues that we didn't triage yet labels May 13, 2025
Copy link

boring-cyborg bot commented May 13, 2025

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@eladkal eladkal added provider:amazon AWS/Amazon - related issues good first issue and removed area:dynamic-task-mapping AIP-42 needs-triage label for new issues that we didn't triage yet labels May 13, 2025
@valentinDruzhinin
Copy link
Contributor

@eladkal could you please assign it to me? I would like to work on this issue.

@valentinDruzhinin
Copy link
Contributor

@potiuk may I ask you to assign this issue on me please?

@valentinDruzhinin
Copy link
Contributor

Looks like @Shashwatpandey4 already created PR for that issue :) @potiuk please assign this issue on him :)

@davidcroda
Copy link
Author

If you are referring to #50751 that does not address this issue.

@valentinDruzhinin
Copy link
Contributor

@davidcroda sorry, my bad :) it appears in issue for some reason

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment