We will be using the cloud computing services provided by Amazon, known as Amazon Web Services (AWS). The specific AWS service used in this lab is the Amazon Simple Storage Service (S3). S3 allows users to store data items, known as objects, in locations known as buckets. Technically, an object is an arbitrary array of bytes, but it's best to think of an object as being the same thing as a file stored on a computer's hard drive. Each object has a key, which is a string that can be used to retrieve the object, and is analogous to the full name of a file, including its parent folders. The data in an object is analogous to the contents of a file. In S3, a bucket is an abstract concept that can be thought of as a location where objects are stored. Very roughly, a bucket is analogous to a disk drive attached to a computer: just as you can store files on the disk drives attached to your computer, you can store S3 objects in the buckets in an S3 account. One important difference between S3 buckets and the file systems we are used to is that buckets have no notion of hierarchy: a bucket has no folders or directories, so its namespace is flat and every object has a unique key in the bucket's namespace.
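Because a key is just a string in a flat namespace, a backup utility has to decide for itself how local file paths turn into keys. The following is a minimal sketch of one possible convention, not a required design: the class name KeyNames, the method toObjectKey, and the idea of starting every key with a backup label are all assumptions made for illustration. The "/" characters in the result mean nothing special to the bucket's namespace; they are simply part of the key.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class KeyNames {
    // Builds an object key from a (hypothetical) backup label and a file path
    // given relative to the directory being backed up. The path elements are
    // joined with '/' so the key looks the same on every operating system.
    static String toObjectKey(String backupLabel, Path relativePath) {
        StringBuilder key = new StringBuilder(backupLabel);
        for (Path part : relativePath) {
            key.append('/').append(part.toString());
        }
        return key.toString();
    }

    public static void main(String[] args) {
        Path relative = Paths.get("docs", "notes", "MyFile.txt");
        // prints backup-1-5-2011/docs/notes/MyFile.txt
        System.out.println(toObjectKey("backup-1-5-2011", relative));
    }
}
```

With a convention like this, all objects from one backup run share a common leading string, which can be convenient when listing objects later.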
Certain credentials are required for interacting with S3. Specifically, you need an access key and a secret key. The access key is similar to a login name; it allows S3 to identify you. The secret key is similar to a password. For this lab, you have been provided with an access key, a secret key, and the name of a bucket. You have full privileges to read, write, and list objects within this bucket.
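Since the secret key works like a password, you may prefer not to type it directly into your source code. One option, sketched below under assumptions, is to read both keys from a small properties file. The class name CredentialLoader and the property names accessKey and secretKey are hypothetical choices for this example; nothing in the lab requires them.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

final class CredentialLoader {
    // Reads the two keys from properties-format text. Returns a two-element
    // array: index 0 is the access key, index 1 is the secret key.
    // The property names "accessKey" and "secretKey" are assumptions.
    static String[] load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        String accessKey = props.getProperty("accessKey");
        String secretKey = props.getProperty("secretKey");
        if (accessKey == null || secretKey == null) {
            throw new IllegalArgumentException("both accessKey and secretKey must be set");
        }
        return new String[] { accessKey.trim(), secretKey.trim() };
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a FileReader on a real properties file.
        Reader demo = new StringReader("accessKey=EXAMPLE-ACCESS\nsecretKey=EXAMPLE-SECRET\n");
        String[] keys = load(demo);
        System.out.println("access key: " + keys[0]);
    }
}
```

In a real program the Reader would come from a file kept outside the project directory, so that the keys are not handed in along with the code.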
Please note that your bucket resides within the instructor's personal AWS account. Amazon charges the instructor real money for use of this account. For any reasonable usage, the charges are minuscule: about 10 cents per gigabyte per month for storage, and one cent for every thousand objects transferred into or out of S3. Thus, please limit your backup experiments to at most a few thousand objects whose total size is less than a gigabyte. As with any other unacceptable behavior at the college, any abuse of your S3 access will be dealt with through the college disciplinary system. After the lab has been graded, your S3 credentials will be revoked and all objects in your bucket will be deleted.
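To get a feel for these numbers, the following rough sketch turns the two rates quoted above into a cost estimate. It uses only those two rates (10 cents per gigabyte per month, one cent per thousand objects transferred) and ignores everything else; the class and method names are made up for this example.

```java
final class CostEstimate {
    // Rates taken from the paragraph above.
    static final double CENTS_PER_GB_MONTH = 10.0;
    static final double CENTS_PER_THOUSAND_TRANSFERS = 1.0;

    // Estimated charge in cents for storing the given number of gigabytes for
    // the given number of months and transferring the given number of objects.
    static double estimateCents(double gigabytesStored, double months, long objectsTransferred) {
        double storage = gigabytesStored * months * CENTS_PER_GB_MONTH;
        double transfers = (objectsTransferred / 1000.0) * CENTS_PER_THOUSAND_TRANSFERS;
        return storage + transfers;
    }

    public static void main(String[] args) {
        // One gigabyte kept for one month, 3000 objects transferred:
        // 10 cents of storage plus 3 cents of transfers.
        System.out.println(estimateCents(1.0, 1.0, 3000)); // prints 13.0
    }
}
```

So an experiment at the upper end of the suggested limits costs on the order of a dime.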
Apart from these high-level requirements, the behavior of the program is left unspecified. You are free to use creativity in designing a useful, robust, and user-friendly backup utility. In particular, your program may use additional command-line arguments.
Amazon provides many ways of interacting with S3, including support for Java, Perl, PHP, C#, Python, and Ruby. You are welcome to use any programming language and programming tools for this lab, but these notes will provide tips only for Java, and will assume you're developing within the Eclipse environment.
Assuming you are using Java, you'll need the following zip file of code libraries and documentation: CloudLab.zip. Unzip the file, create a new Eclipse project, and add the four .jar files to its build path ("add External JARs" in Java Build Path properties). Add the AWS JavaDoc to the project by associating it with the file aws-java-sdk-1.1.1.jar (right-click on this file in Package Explorer, choose Properties, choose JavaDoc location, and navigate to the directory aws-javadoc in your unzipped version of CloudLab.zip).
The following code snippets give simple examples of the basic S3 functionality you will need: connecting to S3, uploading a file, downloading an object, and listing the objects whose keys begin with a given prefix.
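// Imports needed by the snippets below.
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.File;
import java.io.InputStream;

// Connect to S3 using the access key and secret key you were given.
String accessKey = "????????????";
String secretKey = "?????????????????????";
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonS3 s3 = new AmazonS3Client(credentials);

// Upload a local file as an object in your bucket.
String bucketName = "????????????????????????";
String objectKey = "backup-of-MyFile.txt";
String fileName = "MyFile.txt";
File file = new File(fileName);
s3.putObject(bucketName, objectKey, file);

// Download an object and read its contents as a stream.
S3Object object = s3.getObject(bucketName, objectKey);
InputStream input = object.getObjectContent();
/* read and save the data from the input stream */
input.close();

// List the objects whose keys begin with a given prefix.
// A long listing may be truncated; see ObjectListing.isTruncated() and AmazonS3.listNextBatchOfObjects().
String prefix = "backup-1-5-2011";
ObjectListing listing = s3.listObjects(bucketName, prefix);
for (S3ObjectSummary objectSummary : listing.getObjectSummaries()) {
    String key = objectSummary.getKey();
    /* do something with the key */
}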
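The download snippet leaves open how to "read and save the data from the input stream". The following is one possible sketch using only the standard Java library, assuming Java 7 or later is available. The class name StreamSaver and the method saveToFile are hypothetical, and a ByteArrayInputStream stands in for the stream returned by getObjectContent() so that the example runs without contacting S3.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class StreamSaver {
    // Copies every byte from the stream into the target file, replacing the
    // file if it already exists, and closes the stream when finished.
    // Returns the number of bytes written.
    static long saveToFile(InputStream input, Path target) throws IOException {
        try (InputStream in = input) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream of a downloaded S3 object.
        InputStream fake = new ByteArrayInputStream("hello, S3".getBytes(StandardCharsets.UTF_8));
        Path target = Files.createTempFile("restored-", ".txt");
        long bytes = saveToFile(fake, target);
        System.out.println(bytes + " bytes written to " + target);
    }
}
```

Because saveToFile closes the stream itself, a caller that uses it would not need the separate input.close() call.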
For further details, consult the online JavaDoc for AWS, and other documents as needed from Amazon's online S3 documentation.
It is possible to achieve an excellent grade on the assignment by implementing only the bare-bones utility described above in the "Basic Requirements" section. A reasonable attempt at any of the extensions suggested in the "Possible extensions" section above will receive up to 10% extra credit. If you attempt any of the extensions, clearly indicate this in the comments at the top of your main source file.