mscoco bash instruction verification, does this look right to you?

This is my current download script. Does this look right to you?
```
# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
# Option A (not used here): use gsutil to download them to mscoco/:
# cd $DATASRC/mscoco/
# mkdir -p train2017
# gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
# gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip .
# unzip annotations_trainval2017.zip

# Option B (used here): download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/. ETA ~36 min.
mkdir -p "$MDS_DATA_PATH/mscoco"
wget http://images.cocodataset.org/zips/train2017.zip -O "$MDS_DATA_PATH/mscoco/train2017.zip"
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O "$MDS_DATA_PATH/mscoco/annotations_trainval2017.zip"
# Both zips should now be there. Note: downloading the zips takes some time.
ls "$MDS_DATA_PATH/mscoco/"
# Extract both archives into mscoco/ (my reading of the instructions, and it matches what the gsutil commands above appear to do).
# Takes some time, but unzip shows good progress output.
unzip "$MDS_DATA_PATH/mscoco/train2017.zip" -d "$MDS_DATA_PATH/mscoco"
unzip "$MDS_DATA_PATH/mscoco/annotations_trainval2017.zip" -d "$MDS_DATA_PATH/mscoco"
# Two folders should be there: annotations/ and train2017/.
ls "$MDS_DATA_PATH/mscoco/"
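# Check that the .jpg images and .json annotations are there.
# The dot is escaped and the pattern anchored so that only real file extensions are counted.
ls "$MDS_DATA_PATH/mscoco/train2017"
ls "$MDS_DATA_PATH/mscoco/train2017" | grep -c '\.jpg$'
ls "$MDS_DATA_PATH/mscoco/annotations"
ls "$MDS_DATA_PATH/mscoco/annotations" | grep -c '\.json$'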
# Move the files up into mscoco/, since the natural-language instructions say so.
# Reference for moving a large number of files: https://stackoverflow.com/a/75034830/1601580 (thanks, ChatGPT!)
find "$MDS_DATA_PATH/mscoco/train2017" -type f -print0 | xargs -0 mv -t "$MDS_DATA_PATH/mscoco"
# train2017/ should now report 0 images, and mscoco/ should hold all of them.
ls "$MDS_DATA_PATH/mscoco/train2017" | grep -c '\.jpg$'
ls "$MDS_DATA_PATH/mscoco" | grep -c '\.jpg$'
mv "$MDS_DATA_PATH"/mscoco/annotations/* "$MDS_DATA_PATH/mscoco/"
ls "$MDS_DATA_PATH/mscoco/" | grep -c '\.json$'

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
# 80 tfrecords files named [0-79].tfrecords
ls "$RECORDS/mscoco/" | grep -c '\.tfrecords$'
# dataset_spec.json (see note 1)
ls "$RECORDS/mscoco/dataset_spec.json"
```
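
The `ls | grep -c` checks in the script only print a number that I then have to eyeball. A minimal sketch of how they could be turned into pass/fail checks is below. The helper names `count_ext` and `expect_count` are hypothetical (they are not part of meta_dataset), and the sketch assumes a `find` that supports `-maxdepth`, as GNU and BSD find do.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of meta_dataset: count files by extension
# and compare the count against an expected number.

# count_ext DIR EXT: print how many regular files directly inside DIR
# have names ending in ".EXT" (subdirectories are not searched).
count_ext() {
  local dir=$1 ext=$2
  find "$dir" -maxdepth 1 -type f -name "*.${ext}" | wc -l | tr -d ' '
}

# expect_count DIR EXT N: return 0 only if DIR holds exactly N ".EXT" files.
expect_count() {
  local dir=$1 ext=$2 want=$3 got
  got=$(count_ext "$dir" "$ext")
  if [ "$got" -eq "$want" ]; then
    echo "OK: $got .$ext files in $dir"
  else
    echo "MISMATCH: $got .$ext files in $dir, expected $want" >&2
    return 1
  fi
}
```

For example, `expect_count "$RECORDS/mscoco" tfrecords 80` would check the 80 record files expected in step 4. For the images, the COCO site lists train2017 as roughly 118K images, so the exact number to pass as `N` should be taken from the site or from the `unzip` output rather than assumed.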
