Closed
Describe the bug
Calling read_parquet_metadata with a path that doesn't exist raises an IndexError instead of a meaningful exception:
[ERROR] IndexError: list index out of range
Traceback (most recent call last):
  File "/var/task/obscured/obscured.py", line 123, in do_a_thing
    column_types, _ = wr.s3.read_parquet_metadata(
  File "/opt/python/awswrangler/_config.py", line 715, in wrapper
    return function(**args)
  File "/opt/python/awswrangler/_utils.py", line 178, in inner
    return func(*args, **kwargs)
  File "/opt/python/awswrangler/s3/_read_parquet.py", line 846, in read_parquet_metadata
    columns_types, partitions_types, _ = _read_parquet_metadata(
  File "/opt/python/awswrangler/s3/_read_parquet.py", line 140, in _read_parquet_metadata
    return reader.read_table_metadata(
  File "/opt/python/awswrangler/s3/_read.py", line 280, in read_table_metadata
    merged_schemas = _validate_schemas(schemas=schemas, validate_schema=False)
  File "/opt/python/awswrangler/s3/_read.py", line 304, in _validate_schemas
    first: pa.schema = schemas[0]
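The last frame suggests that `schemas` is an empty list when no objects match the path, so `schemas[0]` fails. A minimal sketch of a guard, assuming that is the cause. `validate_schemas_sketch` and the local `NoFilesFound` class are hypothetical stand-ins for illustration, not the library's actual code:

```python
class NoFilesFound(Exception):
    """Local stand-in for awswrangler.exceptions.NoFilesFound (assumption)."""


def validate_schemas_sketch(schemas):
    # Hypothetical simplification: only the first-element access is taken
    # from the traceback; everything else here is illustrative.
    if not schemas:
        # Without this guard, schemas[0] raises IndexError on an empty list.
        raise NoFilesFound("No files found for the given path.")
    return schemas[0]
```

With a guard like this, an empty listing would surface as a descriptive exception rather than an IndexError from deep inside the reader.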
How to Reproduce
import awswrangler
awswrangler.s3.read_parquet_metadata(path='s3://bucket-you-can-read/file-that-doesnt-exist')
Expected behavior
An exception such as exceptions.NoFilesFound is raised, or perhaps some kind of empty result is returned. It's unclear what the correct behavior should be, but it should not be an IndexError :)
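Until the behavior is settled, a caller-side workaround might look like the sketch below. `read_metadata_or_raise` is a hypothetical helper, not part of awswrangler. The reader function and exception type are passed in so the helper itself has no dependencies:

```python
def read_metadata_or_raise(path, reader, not_found_exc=FileNotFoundError):
    """Call reader(path=...) and translate IndexError into a clearer error.

    Hypothetical helper: `reader` is any callable accepting a `path` keyword,
    and `not_found_exc` is the exception type to raise instead.
    """
    try:
        return reader(path=path)
    except IndexError as exc:
        # Assumption: an IndexError here means no files matched the path.
        raise not_found_exc(f"No Parquet files found at {path!r}") from exc
```

For example, `read_metadata_or_raise(path, wr.s3.read_parquet_metadata, wr.exceptions.NoFilesFound)` would raise NoFilesFound for a missing path. Note that this also catches any unrelated IndexError raised by the reader, so it is only a stopgap.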
Your project
No response
Screenshots
No response
OS
Linux
Python version
3.11
AWS SDK for pandas version
3.7.3
Additional context
No response