How can I download a copy of an S3 public data set?
- by tripleee
i was naively assuming I could do something like
s3cmd sync s3://snap-d203feb5 /var/tmp/copy
but I seem to have the wrong idea of how to go about this. I cannot even get a simple thing to work;
vnix$ s3cmd ls s3://snap-d203feb5
Bucket 'snap-d203feb5':
ERROR: Bucket 'snap-d203feb5' does not exist
I guess the identifier I have is not for a "bucket" but for a "public data set". How do I go from one to the other? Do I have to start up an EC2 instance and create a bucket for this? How? The instructions at http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/using-public-data-sets.html seem to assume I want to use the data in an EC2 instance, but in this case, I'd just like to browse a bit, at least for a start.
By the by, copy/pasting the "US Snapshot ID" causes a nasty traceback from Python; they publish the ID with a weird Unicode (I presume) dash which cannot directly be copy/pasted. Is there a mistake when I copy it? And what's the significance of "US" in there? Can't I use the data outside North America??