-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Description
In Sagemaker Im tring to load the data set from S3 path as follows
`train_path = 's3://xxxxxxxxxx/xxxxxxxxxx/train.csv'
valid_path = 's3://xxxxxxxxxx/xxxxxxxxxx/validation.csv'
test_path = 's3://xxxxxxxxxx/xxxxxxxxxx/test.csv'
data_files = {}
data_files["train"] = train_path
data_files["validation"] = valid_path
data_files["test"] = test_path
extension = train_path.split(".")[-1]
datasets = load_dataset(extension, data_files=data_files, s3_enabled=True)
print(datasets)`
I getting an error of
algo-1-7plil_1 | File "main.py", line 21, in <module> algo-1-7plil_1 | datasets = load_dataset(extension, data_files=data_files) algo-1-7plil_1 | File "/opt/conda/lib/python3.6/site-packages/datasets/load.py", line 603, in load_dataset algo-1-7plil_1 | **config_kwargs, algo-1-7plil_1 | File "/opt/conda/lib/python3.6/site-packages/datasets/builder.py", line 155, in __init__ algo-1-7plil_1 | **config_kwargs, algo-1-7plil_1 | File "/opt/conda/lib/python3.6/site-packages/datasets/builder.py", line 305, in _create_builder_config algo-1-7plil_1 | m.update(str(os.path.getmtime(data_file))) algo-1-7plil_1 | File "/opt/conda/lib/python3.6/genericpath.py", line 55, in getmtime algo-1-7plil_1 | return os.stat(filename).st_mtime algo-1-7plil_1 | FileNotFoundError: [Errno 2] No such file or directory: 's3://lsmv-sagemaker/pubmedbert/test.csv
But when im trying with pandas , it is able to load from S3
Does the datasets library support S3 path to load