Kumina | Blog

awssyncer: an automatic syncer for Amazon S3 that makes use of inotify

# awssyncer: Continuous syncing of local files into Amazon AWS S3.

At Kumina, we’re heavy users of the Amazon AWS cloud computing platform. We’ve been using EC2 instances for quite some time and are now expanding on this with Kubernetes.

While setting this up, we noticed that some of our jobs need to keep track of small amounts of local state (i.e., files on disk). We decided to store this data in S3, while still having it efficiently available through the local file system. The advantage of using S3 for this purpose is that a bucket is stored redundantly across multiple Availability Zones and can be reached from any of them, whereas an EBS volume is tied to a single Availability Zone.

For this purpose we’ve developed a new utility called awssyncer, which is now available on GitHub! awssyncer is written in C++ and uses Linux’s inotify to keep track of local modifications to a directory on disk. It uses these inotify events to determine which files need to be synced back into S3, and thus provides continuous one-way synchronisation from local disk to S3. A simple container startup script syncs files from S3 to local disk on startup.
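To give an idea of what inotify-based change tracking looks like, here is a minimal sketch in C++. It is not awssyncer’s actual code: the `DirectoryWatcher` class, its `ReadEvents` method and the choice of events to watch are all assumptions made for illustration. It watches a single directory and collects the names of files that may need to be synced.

```cpp
// Sketch only: watches a single directory with inotify and collects the
// names of files that changed. This is NOT awssyncer's actual code; the
// DirectoryWatcher class and its methods are hypothetical.
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

#include <cstdint>
#include <set>
#include <stdexcept>
#include <string>

class DirectoryWatcher {
 public:
  explicit DirectoryWatcher(const std::string& dir)
      : fd_(inotify_init1(IN_CLOEXEC)) {
    if (fd_ < 0) throw std::runtime_error("inotify_init1() failed");
    // Assumed event set: files that were written and closed, files renamed
    // into or out of the directory, and deleted files.
    const uint32_t mask =
        IN_CLOSE_WRITE | IN_MOVED_TO | IN_MOVED_FROM | IN_DELETE;
    if (inotify_add_watch(fd_, dir.c_str(), mask) < 0) {
      close(fd_);
      throw std::runtime_error("inotify_add_watch() failed");
    }
  }
  ~DirectoryWatcher() { close(fd_); }
  DirectoryWatcher(const DirectoryWatcher&) = delete;
  DirectoryWatcher& operator=(const DirectoryWatcher&) = delete;

  // Waits up to timeout_ms milliseconds (-1 waits forever) for events and
  // adds the names of the affected files to *dirty. Returns false if the
  // wait timed out without any events.
  bool ReadEvents(std::set<std::string>* dirty, int timeout_ms) {
    pollfd pfd = {fd_, POLLIN, 0};
    const int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0) throw std::runtime_error("poll() failed");
    if (ready == 0) return false;

    // Events are variable-length records packed back to back in the buffer.
    alignas(inotify_event) char buf[4096];
    const ssize_t len = read(fd_, buf, sizeof(buf));
    if (len < 0) throw std::runtime_error("read() failed");
    for (const char* p = buf; p < buf + len;) {
      const auto* event = reinterpret_cast<const inotify_event*>(p);
      // Using a set means repeated writes to one file yield a single entry.
      if (event->len > 0) dirty->insert(event->name);
      p += sizeof(inotify_event) + event->len;
    }
    return true;
  }

 private:
  int fd_;
};
```

A caller would invoke `ReadEvents` in a loop, hand each name in the set to whatever performs the upload or removal in S3, and then clear the set. A real tool has more to deal with: an inotify watch is not recursive, so every subdirectory needs its own watch, and the kernel reports `IN_Q_OVERFLOW` when events have been dropped, at which point a full rescan is the safe response.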

Though we realise that this utility is fairly specific to our own situation, we do invite all of you to give it a try. Feel free to get in touch if you have any questions or discover any bugs!
