How to Transfer Petabytes of Data from AWS S3 to OCI Object Storage with Rclone

January 29, 2024

If you have a large amount of data stored in Amazon Web Services (AWS) Simple Storage Service (S3) and you want to migrate it to Oracle Cloud Infrastructure (OCI) Object Storage, you might face some challenges. For example, how do you transfer petabytes of data efficiently and securely? How do you monitor the progress and handle errors? How do you optimize the performance and cost of the data transfer?

This article shows how to use Rclone, a popular open-source command-line tool, to transfer petabytes of data from AWS S3 to OCI Object Storage. Rclone supports many cloud storage providers, including AWS S3 and OCI Object Storage. It also has several features that make it suitable for large-scale data migration:

  • Parallelism: Rclone can run multiple transfers in parallel, which can speed up the data transfer and utilize the available bandwidth.
  • Resume: If a run is interrupted, re-running the same command skips files that have already been transferred, which saves time and avoids copying data twice.
  • Encryption: Rclone can encrypt the data before transferring it, which can enhance the security and privacy of the data.
  • Logging: Rclone can generate detailed logs of the transfer process, which can help you monitor the progress and troubleshoot errors.
  • Filtering: Rclone can filter the files to be transferred based on various criteria, such as name, size, date, etc., which can help you select the files that you need to migrate.

To use Rclone to transfer petabytes of data from AWS S3 to OCI Object Storage, you need to follow these steps:

  1. Install Rclone on a Linux server that has access to both AWS S3 and OCI Object Storage. You can download the latest version of Rclone from its official website or use a package manager such as yum or apt.
  2. Configure Rclone with the credentials and settings of both AWS S3 and OCI Object Storage. You can use the rclone config command to interactively create or edit the configuration file, or edit it manually with a text editor. The Rclone documentation for the Amazon S3 and Oracle Object Storage backends describes the available settings in detail.
  3. Run the rclone sync command to start the data transfer. The sync command makes the destination identical to the source: it copies new and changed files and deletes files in the destination that do not exist in the source. Because sync can delete data, consider testing it first with the --dry-run flag. You can also use other commands such as copy or move depending on your needs. You can specify various options to customize the transfer process, such as:
    • --transfers: The number of file transfers to run in parallel. The default value is 4, but you can increase it to improve the performance.
    • --checkers: The number of checkers to run in parallel. The checkers are responsible for checking if a file needs to be transferred or not. The default value is 8, but you can increase it to speed up the checking process.
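    • --bwlimit: The bandwidth limit for the data transfer. The value is measured in bytes per second (KiB/s by default, or with a suffix such as M for MiB/s), not bits per second, so divide a link speed given in bits by 8. You can use this option to limit the bandwidth usage and avoid network congestion.
    • --crypt-remote: A backend option of the crypt remote that names the underlying remote it wraps. On its own, this option does not encrypt a transfer between two ordinary remotes. To encrypt the data before transferring it, configure a crypt remote in advance with the rclone config command so that it wraps the OCI bucket, and then use that crypt remote as the destination of the sync.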
    • --log-file: The name of the log file where Rclone will write the transfer details. You can use this option to monitor the progress and troubleshoot errors.
    • --filter-from: The name of the file that contains the filter rules. You can use this option to filter the files that you want to transfer based on various criteria.
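
As a rough illustration of step 2, the remotes can also be created non-interactively with rclone config create. The sketch below is only an assumption about how such a setup might look: the remote names (aws, oci), the function names, and the region, namespace, and compartment arguments are hypothetical, and credentials are expected to come from the environment (AWS) and from ~/.oci/config (OCI).

```shell
#!/bin/sh
# Sketch only: remote names "aws" and "oci" are placeholders, not from the article.

# Create an S3 remote that reads AWS credentials from the environment
# (for example AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
# $1 = AWS region
create_aws_remote() {
    rclone config create aws s3 provider=AWS env_auth=true region="$1"
}

# Create an OCI Object Storage remote that authenticates with the
# API key profile stored in ~/.oci/config.
# $1 = namespace, $2 = compartment OCID, $3 = OCI region
create_oci_remote() {
    rclone config create oci oracleobjectstorage \
        provider=user_principal_auth \
        namespace="$1" compartment="$2" region="$3" \
        config_file="$HOME/.oci/config" config_profile=Default
}
```

After calling both functions, running rclone lsd aws: and rclone lsd oci: is a quick way to confirm that each remote can list its buckets.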

You can find more options and usage examples in the Rclone documentation.
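
To make the --filter-from option more concrete, here is a small sketch that writes a filter file. The rules and the path patterns (*.tmp, /logs/) are made-up examples rather than rules from a real migration, and the function name is hypothetical. Rclone reads the rules from top to bottom, and the first matching rule decides whether a file is included.

```shell
#!/bin/sh
# Write an example filter file to the path given as $1.
write_filter_file() {
    cat > "$1" <<'EOF'
# Skip temporary files anywhere in the bucket.
- *.tmp
# Include everything under the top-level logs/ directory.
+ /logs/**
# Exclude everything else.
- *
EOF
}
```

Calling write_filter_file filter.txt produces a file that can be passed to Rclone with --filter-from filter.txt.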

Here is an example of an rclone sync command that transfers data from the AWS S3 bucket named aws-s3-bucket to the OCI Object Storage bucket named oci-object-bucket with 16 parallel transfers, 32 parallel checkers, a bandwidth limit of about 1 Gbps (roughly 119 MiB/s, because --bwlimit counts bytes), encryption enabled, a log file named rclone.log, and a filter file named filter.txt. The example assumes that step 2 created an S3 remote named aws and a crypt remote named oci-object-bucket-crypt that wraps oci-object-bucket on the OCI remote:

rclone sync aws:aws-s3-bucket oci-object-bucket-crypt: --transfers 16 --checkers 32 --bwlimit 119M --log-file rclone.log --filter-from filter.txt
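
Because --bwlimit counts bytes per second while network links are usually quoted in bits per second, a small conversion helper can prevent an accidental eightfold error. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the source and destination arguments are expected to be remotes that already exist in your Rclone configuration.

```shell
#!/bin/sh
# Convert a link speed in Mbit/s into an rclone --bwlimit value in KiB/s.
mbit_to_bwlimit() {
    # Mbit/s -> bytes/s (x 1,000,000 / 8) -> KiB/s (/ 1024), rounded down
    echo "$(( $1 * 1000000 / 8 / 1024 ))K"
}

# Run a sync capped at the given link speed.
# $1 = source, $2 = destination, $3 = cap in Mbit/s; extra args go to rclone.
sync_with_cap() {
    src=$1 dst=$2 cap=$3
    shift 3
    rclone sync "$src" "$dst" \
        --transfers 16 --checkers 32 \
        --bwlimit "$(mbit_to_bwlimit "$cap")" \
        --log-file rclone.log "$@"
}
```

For example, sync_with_cap aws:aws-s3-bucket oci:oci-object-bucket 1000 --dry-run previews a transfer capped at about 1 Gbps without copying anything; dropping --dry-run starts the real transfer.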

  4. Wait for the data transfer to complete. Depending on the amount of data and the network conditions, this might take hours, days, or even weeks. You can check the log file, or add the --progress flag to the sync command, to see how much data has been transferred and the estimated time remaining.
  5. Verify that all the data has been transferred successfully. You can use the rclone check command to compare the files in both AWS S3 and OCI Object Storage and report any differences. You can also use the rclone ls command to list the files in both storages, or the rclone size command to compare the total counts and sizes.
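
The verification step can be sketched as follows. The helper first compares the object count and total bytes that rclone size --json reports for each side, and only then runs the slower file-by-file rclone check. The function names are hypothetical, and the parsing assumes the JSON output contains numeric count and bytes fields.

```shell
#!/bin/sh
# Extract a numeric field ($2) from a flat JSON object ($1).
json_number() {
    printf '%s\n' "$1" | sed -n "s/.*\"$2\":[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Succeed only if both size reports parse and agree on count and bytes.
same_totals() {
    c1=$(json_number "$1" count); c2=$(json_number "$2" count)
    b1=$(json_number "$1" bytes); b2=$(json_number "$2" bytes)
    [ -n "$c1" ] && [ -n "$b1" ] && [ "$c1" = "$c2" ] && [ "$b1" = "$b2" ]
}

# Compare totals first (cheap), then compare files one by one.
# $1 = source, $2 = destination
verify_migration() {
    same_totals "$(rclone size --json "$1")" "$(rclone size --json "$2")" &&
    rclone check "$1" "$2"
}
```

A call such as verify_migration aws:aws-s3-bucket oci:oci-object-bucket returns a non-zero exit status if the totals differ or if rclone check reports a mismatch, which makes it easy to use in a larger script.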

By following these steps, you can use Rclone to transfer petabytes of data from AWS S3 to OCI Object Storage with ease and efficiency. Rclone is a powerful tool that can help you migrate your data to the cloud and take advantage of the benefits of OCI Object Storage, such as scalability, durability, security, and cost-effectiveness.