Using JetS3t to upload a large number of files to S3

I was looking for a tool to upload a large number of files to S3. While I have been a great fan of the bash tools for browsing and accessing S3 objects and buckets and for managing a limited number of files, I could not find an easy way of uploading a large number of files (the first batch being around 800K files).

Then I downloaded JetS3t. It has a nice GUI called Cockpit for managing files on S3. The GUI is pretty neat. However, for simple uploads and downloads, S3 Organizer, a simple Firefox plugin, does the job. If you need to manage your files extensively, then JetS3t’s Cockpit is the way to go.

For uploading a large number of files, I was looking for something multi-threaded and configurable. The JetS3t suite has a “synchronize” application, which is meant to synchronize files between a local PC and S3. JetS3t allows you to configure the number of threads and connections to the S3 service. Without reinventing the wheel, I got what I wanted. However, one additional thing I needed was the ability to delete the local files once the upload was complete. After tinkering with the Java source, I modified Synchronize.java and added the following code fragments:

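public void uploadLocalDirectoryToS3(FileComparerResults disrepancyResults, Map filesMap,
        Map s3ObjectsMap, S3Bucket bucket, String rootObjectPath, String aclString) throws Exception {
    ...
    // Paths of the local files to remove once the upload has finished
    List filesToDelete = new ArrayList();
    ...
    // Only plain files are queued for deletion; directories are left in place
    if (!file.isDirectory()) {
        filesToDelete.add(file.getPath());
    }
    ...

    // Delete the local files once the objects are in S3
    for (Iterator ite = filesToDelete.iterator(); ite.hasNext();) {
        String fName = (String) ite.next();
        File f = new File(fName);
        // File.delete() returns false rather than throwing when it fails
        if (!f.delete()) {
            System.err.println("Could not delete " + fName);
        }
    }
}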
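
The same idea can be tried outside JetS3t. Below is a minimal, self-contained sketch of the "upload on several threads, then delete the local copy" pattern. It does not use the JetS3t API: the Uploader interface, the DeleteAfterUpload class and the uploadAndDelete method are hypothetical names made up for this illustration, and the real upload call would have to be plugged in behind Uploader. Unlike the fragment above, it deletes a file only when its own upload finished without an exception.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DeleteAfterUpload {

    /** Hypothetical stand-in for the real upload call; not part of JetS3t. */
    public interface Uploader {
        void upload(Path file) throws Exception;
    }

    /**
     * Uploads every regular file on a fixed-size thread pool and deletes the
     * local copy only if that file's upload completed without throwing.
     * Returns the paths that were uploaded and then deleted.
     */
    public static List<Path> uploadAndDelete(List<Path> files, Uploader uploader, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Path> submitted = new ArrayList<>();
        List<Future<?>> results = new ArrayList<>();
        try {
            for (Path file : files) {
                if (Files.isDirectory(file)) {
                    continue; // directories are never uploaded or deleted here
                }
                submitted.add(file);
                results.add(pool.submit(() -> {
                    uploader.upload(file);
                    return null;
                }));
            }

            List<Path> deleted = new ArrayList<>();
            for (int i = 0; i < results.size(); i++) {
                Path file = submitted.get(i);
                try {
                    results.get(i).get(); // rethrows any upload failure
                    Files.delete(file);   // reached only after a successful upload
                    deleted.add(file);
                } catch (ExecutionException | IOException e) {
                    System.err.println("Keeping " + file + ": " + e);
                }
            }
            return deleted;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass the list of local files, an Uploader that wraps whatever S3 client is in use, and a thread count. Files whose upload failed stay on disk, so the run can simply be repeated.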


  • I’m glad you like JetS3t’s Synchronize application. Thought you might like to know that the latest version of the code in CVS includes a new command-line option (--move) which will delete the source file/object once it has been uploaded or downloaded.

    It works much the same as your code changes, except it works in both directions – it’s very handy for downloading S3 log files and deleting them in one step.

    James (JetS3t author)

  • Thanks for the note. I’ll check it out.