- Uploads to Google Drive are limited to 750 gigabytes per day. There is also a rate limit of 2 files per second.
- A single file is limited to 5 terabytes.
- You need to have a Google account through ITS. Note that Beocat staff cannot create or reset passwords for ksu.edu Google accounts and have no control over this part of the process.
- Send an e-mail to firstname.lastname@example.org stating that you need to have a ksu.edu Google account for archiving data on Google Drive. This must come from your ksu.edu email account and not a personal email address.
- When this account request is approved, you will get an email from email@example.com with instructions on how to activate your account. It's quite likely this email will end up in your Spam, Junk Mail, or Clutter folders. Follow the instructions in that email to activate your Google account, which must be done within 48 hours. Note that passwords to your Google account are NOT synchronized with your eID.
- Log in to the FIONA (Flash I/O Network Appliance) node.
- While logged in to Beocat, you can run 'ssh fiona'
- Alternatively, you can also ssh using your favorite ssh client (the same one you use to get to Beocat) to fiona.beocat.ksu.edu
- The username is your eID username and the password is your eID password
- While on fiona, set up rclone
- Run the command 'rclone config'
- Create a (n)ew remote
- Give it a name. For my example, I'll use 'remotegdrive', but you can change this to whatever you want (just be sure to make this change throughout)
- The type of connection is "drive" (at the time of this writing, it is option #12, but this may change)
- You will be asked for the Google Application Client Id and Google Application Client Secret. For these two prompts, just press Enter, as you don't need to provide them
- Select '1' for full drive access
- For the next two prompts, which ask for the Root Folder ID and Service Account Credentials, you can press Enter at each prompt
- Type 'n' to skip the advanced configuration
- You are working on a remote machine, so choose 'n' for Auto Config
- You will be shown a link/URL that you need to copy and paste into a browser (keep the terminal running rclone config open). Once you access that URL, you will be asked to log in. Log in using the ksu.edu Google account username and password from above. You will then see a window saying that rclone wants to access your Google account. Click "allow".
- A code will show up on your web browser. You will need to copy this code
- Go back to your rclone setup and paste the verification code there
- Type 'n' to the prompt on setting this up as a team drive
- Type 'y' to confirm the setup
- You should now see that you have a 'remotegdrive' (or whatever you named it) remote
- Type 'q' to quit rclone setup
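Before moving on, it can be worth confirming that the remote you just created is really there. The sketch below is only an illustration: 'has_remote' is a made-up helper name, and it relies on 'rclone listremotes' printing one remote per line with a trailing colon.

```shell
# Hypothetical helper (not part of rclone or the gdrive script).
# Succeeds if the remote named in $1 appears in the list read from stdin.
# 'rclone listremotes' prints one remote per line, each ending in a colon.
has_remote() {
    grep -Fxq "$1:"
}

# Typical use on fiona (substitute your own remote name):
#   rclone listremotes | has_remote remotegdrive && echo "remote is configured"
```

The helper reads the list from standard input, so it only matches a whole line; a remote called 'remotegdrive2' will not be mistaken for 'remotegdrive'.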
Using the gdrive bash script to manage your Google Drive account
Currently working on the gdrive script and this documentation
Your Google Drive account offers free unlimited long-term storage for your data. You can use this to archive data that you need to keep a copy of, but don't need ready access to. You can also store data that you will need later, which can then be copied back to your /scratch directory and accessed freely if you only need it for 30 days or less, or to your /bulk directory where you will be charged a usage fee. When deciding if this model will work for your data, keep in mind that compressing/uncompressing and transferring the data can take hours or days for very large data sets. Typical transfer rates are around 1 Gbps, but the compression process can take even longer. If you have questions about how best to use your Google Drive account, please contact us.
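To get a feel for these numbers, here is a rough planning sketch. It uses only two figures from this page, the 750 gigabytes per day upload limit and the typical 1 Gbps transfer rate, and it ignores compression time entirely. The function names are made up for illustration, and real throughput will vary.

```shell
# Rough planning numbers only; both values come from this page.
QUOTA_GB_PER_DAY=750   # daily upload limit
RATE_GBPS=1            # typical transfer rate in gigabits per second

# Seconds needed to move $1 gigabytes at RATE_GBPS (8 bits per byte).
transfer_seconds() {
    echo $(( $1 * 8 / RATE_GBPS ))
}

# Minimum number of days needed to upload $1 gigabytes
# under the daily quota, rounded up.
quota_days() {
    echo $(( ($1 + QUOTA_GB_PER_DAY - 1) / QUOTA_GB_PER_DAY ))
}
```

For example, 'transfer_seconds 750' gives 6000 (about 100 minutes for one full day's quota), and 'quota_days 2000' gives 3, so a 2 TB archive needs at least three days no matter how fast the network is.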
We have a gdrive bash script that can help manage your files on your Google Drive account. You can use the rclone commands themselves, but the gdrive script transfers data more efficiently by automatically breaking the data files into blocks, compressing all blocks with gzip in parallel, then transferring those blocks in parallel, and provides you with some idea of the progress of the process at each step.
It is highly recommended that you run all but the smallest transfers using either screen or tmux, so that if your computer disconnects from the network the transfer will continue and you can simply reattach to the screen or tmux session. The directions below cover using the screen command.
Below are the commands that you would typically run if you wanted to compress and transfer data to your Google Drive account, then copy it back to your /scratch directory and compare it to the original directory, because we're all paranoid about our data. I will use the Google Drive name remotegdrive: and copy my /bulk/daveturner/wtest directory, and start from either Beocat head node.
screen
ssh fiona
gdrive copy /bulk/daveturner/wtest remotegdrive:
Check what is on the Google Drive account by looking at the files with ls or the directories using lsd.
gdrive ls
gdrive lsd
You can copy the files back to your /scratch directory to verify that it worked properly.
gdrive copy remotegdrive:wtest /scratch/daveturner
Compare the directory downloaded to /scratch with the original (this can take some time).
gdrive compare /scratch/daveturner/wtest /bulk/daveturner/wtest
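If you copied the data back with plain rclone instead of the gdrive script, a similar check can be done with the standard diff command. This is only a sketch of one way to do it, not a description of how 'gdrive compare' works internally, and 'same_tree' is a made-up name.

```shell
# Hypothetical helper: succeeds only if the two directory trees
# contain the same files with the same contents.
same_tree() {
    diff -r -q "$1" "$2" > /dev/null
}

# Example, using the directories from above:
#   same_tree /scratch/daveturner/wtest /bulk/daveturner/wtest && echo "copies match"
```

Like the gdrive comparison, this reads every file in both trees, so expect it to take a while on large directories.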
Now that you know the data is safely stored on your Google Drive account, finish the process by deleting the files from the /scratch directory and the original directory.
gdrive delete /scratch/daveturner/wtest
gdrive delete /bulk/daveturner/wtest
Finally, type 'exit' twice to complete the process: once to leave fiona and once to end the screen session.
If your terminal does become detached, you can reattach by running 'screen -r' on the same head node where you started the session.
Copying from Beocat to Google Drive
Below is an example of how to copy a file to the remote Google Drive. Note that the rclone copy command follows this convention: rclone copy <path_file_you_want_to_move> <remote_string_name>:<destination_directory_you_want_to_move_into>. Don't forget the colon between the remote name and the destination path. The -v flag is optional. It gives you verbose output that includes the progress of the upload.
[user@fiona ~]$ touch movethis.txt # Create a test file to copy
[user@fiona ~]$ rclone copy movethis.txt remotegdrive:main -v
2018/10/17 11:34:41 INFO : movethis.txt: Copied (new)
2018/10/17 11:34:41 INFO :
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors: 0
Checks: 0 / 0, -
Transferred: 1 / 1, 100%
Elapsed time: 1.8s
Copying from Google Drive to Beocat
To copy from the remote to local, make sure the source and destination paths are correct. The file above was copied into the 'main' directory on the remote, so that directory is part of the source path:
[user@fiona ~]$ rm movethis.txt # Delete the local copy we created above - not necessary if this is a new file
[user@fiona ~]$ rclone copy remotegdrive:main/movethis.txt . # Copy movethis.txt to the current local directory
- To copy files from, say, your bulk space to the remote (here the remote name is remotegdrive; substitute whatever you named yours):
[user@fiona ~]$ rclone copy /bulk/your-username/somefile remotegdrive:destination_path/ # Copy somefile to remote destination
- To transfer large files in parallel, you can use the options shown below with rclone copy. For details, see the documentation at https://rclone.org/commands/rclone_copy/ .
- --transfers int Number of file transfers to run in parallel. (default 4)
- --checkers int Number of checkers to run in parallel. (default 8)
- --drive-chunk-size SizeSuffix Upload chunk size. Must be a power of 2 >= 256k. (default 8M)
- --drive-upload-cutoff SizeSuffix Cutoff for switching to chunked upload (default 8M)
- E.g.: rclone --transfers=32 --checkers=16 --drive-chunk-size=16384k --drive-upload-cutoff=16384k copy source:path remote:path
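The chunk-size rule above (a power of 2, at least 256k) is easy to get wrong when you are experimenting with values. Here is a minimal sketch of a check you could run before starting a long transfer. 'valid_chunk_kib' is a made-up helper, not an rclone feature, and it assumes you give the size as a plain number of kilobytes.

```shell
# Hypothetical helper: succeeds if $1 (a size in kilobytes, no suffix)
# is a power of two and at least 256.
valid_chunk_kib() {
    n=$1
    [ "$n" -ge 256 ] || return 1
    # Halve repeatedly; any odd value above 1 means it is not a power of two.
    while [ "$n" -gt 1 ]; do
        [ $(( n % 2 )) -eq 0 ] || return 1
        n=$(( n / 2 ))
    done
    return 0
}

# Example:
#   valid_chunk_kib 16384 && rclone --drive-chunk-size=16384k copy source:path remote:path
```

With this check, 16384 and 256 are accepted, while 300 (not a power of 2) and 128 (too small) are rejected.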
- Viewing remote content
Of course, you can always use a web browser ( https://drive.google.com/ ) to view your files, but you can also use the rclone command to view files and directories.
List remote directories:
rclone lsd remotegdrive:
List all your files:
rclone ls remotegdrive:
List all the files in the remote directory 'mydirectory':
rclone ls remotegdrive:mydirectory
- Forgot which remote you set up?
If you changed 'remotegdrive' to something else in the above example, but don't remember what you changed it to, you can list your configured remotes with 'rclone listremotes'.
Many thanks to our friends at the University of Kentucky Center for Computational Sciences from whom we copied large swaths of setup instructions.