wildcard performance #1592

@dmpetrov

Description

Wildcard performance is a bit concerning:

$ time datachain ls 'gs://datachain-demo/dogs-and-cats/' > /dev/null
datachain ls 'gs://datachain-demo/dogs-and-cats/' > /dev/null  0.52s user 0.09s system 98% cpu 0.624 total

Adding a wildcard makes it about 40 times slower (0.62s vs. 24.24s wall-clock):

$ time datachain ls 'gs://datachain-demo/dogs-and-cats/*' > /dev/null
datachain ls 'gs://datachain-demo/dogs-and-cats/*' > /dev/null  0.57s user 0.11s system 2% cpu 24.240 total

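And it's only 400 files, which is nothing. CPU usage drops from 98% to 2% in the wildcard run, so nearly all of the extra time is spent waiting rather than computing.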
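To make the comparison easier to repeat, here is a minimal timing sketch. It assumes the `datachain` CLI is on `PATH` and that the public bucket from the commands above is reachable; the helper names `time_command` and `slowdown` are made up for this example and are not part of DataChain.

```python
import subprocess
import sys
import time


def time_command(cmd: list[str]) -> float:
    """Run cmd with stdout discarded and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def slowdown(baseline: float, candidate: float) -> float:
    """Return how many times slower candidate is than baseline."""
    return candidate / baseline


if __name__ == "__main__":
    base_uri = "gs://datachain-demo/dogs-and-cats/"
    try:
        plain = time_command(["datachain", "ls", base_uri])
        wildcard = time_command(["datachain", "ls", base_uri + "*"])
    except FileNotFoundError:
        sys.exit("datachain CLI not found on PATH")
    print(f"plain:    {plain:.3f}s")
    print(f"wildcard: {wildcard:.3f}s")
    print(f"slowdown: {slowdown(plain, wildcard):.1f}x")
```

Wall-clock time is the number to watch here, since the user and system CPU times are almost identical in the two runs reported above.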

Version Info

0.44.9
Python 3.12.11


Labels: bug
