Replace iterrows() with itertuples() for better performance #1

@SaFE-APIOpt


Current code:

for index, row in x.iterrows():
    ...

Recommended replacement:

for row in x.itertuples(index=True):
    ...

iterrows() returns each row as a pandas Series. Every iteration allocates a new Series, and because a single Series holds one dtype, values from mixed-dtype columns are upcast to a common type, so dtypes are not preserved across the row. Column access then goes through label-based indexing on that Series. This per-row overhead can become a major performance bottleneck when iterating over large DataFrames.

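itertuples(), by contrast, yields each row as a lightweight namedtuple built from the column values, with no Series allocated per row. Columns are read as attributes (row.col), and with index=True the index value is the first field, available as row.Index. It is generally much faster than iterrows() and keeps each column's values in their own type, making it the preferred method for row-wise access when mutation is not required. Two caveats: the tuples are immutable, so loops that assign to row need a different approach, and column names that are invalid Python identifiers, repeated, or start with an underscore are renamed to positional names such as _1.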
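To make the replacement concrete, here is a minimal sketch of the same loop written both ways. The DataFrame contents, the column names price and qty, and the totals_* dictionaries are hypothetical stand-ins for illustration; they are not taken from the code this issue targets.

```python
import pandas as pd

# Hypothetical data standing in for `x`; column names are assumptions.
x = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [2, 1, 4]},
    index=["a", "b", "c"],
)

# Current pattern: each row is a Series, columns are looked up by label.
totals_iterrows = {}
for index, row in x.iterrows():
    totals_iterrows[index] = row["price"] * row["qty"]

# Suggested pattern: each row is a namedtuple, columns are attributes,
# and the index value is exposed as row.Index.
totals_itertuples = {}
for row in x.itertuples(index=True):
    totals_itertuples[row.Index] = row.price * row.qty

# Both loops compute the same per-row totals.
print(totals_iterrows == totals_itertuples)
```

The only changes needed at the call site are dropping the unpacked index variable in favor of row.Index and switching row["col"] to row.col. Columns whose names are not valid identifiers would need positional access instead.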
