Transferring large amounts of data over unreliable network connections

Ever tried to transfer a batch of multi-gigabyte files through a slow DSL link and run into trouble? Then this article may help. It shows how to do this properly, using standard Linux/Unix command line tools and working SSH key authentication between the two hosts.

Let's look at the problems you may encounter when attempting to, say, transfer an 8 GiB video file over the Internet, uploading through a slow link:

  • The connection to the destination host may break at any time, due to any network equipment failure in between, or DSL reconnects by the ISP, or your son pulling the plug of your home router, …
  • The data may get corrupted by faulty hardware or software along the way
  • The upload may saturate your ADSL connection, essentially making it unusable for anything else
  • Transferring the data without encryption would allow possibly sensitive data to be read by others, e.g. someone at your ISP, or someone near the destination host
  • The transfer may take hours or days, depending on the amount/link speed ratio. Not getting notified when the transfer finishes may be undesirable
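
To get a feeling for "hours or days", here is a rough back-of-the-envelope sketch. The function name transfer_seconds is made up for this illustration, and the calculation assumes 1 KB = 1024 bytes, no protocol overhead and a link that runs at full speed the whole time, so real transfers will take longer. It uses the 8 GiB file from above and a 64 KB/s uplink (a 512 kbit/s ADSL upload):

```shell
#!/bin/sh
# transfer_seconds SIZE_GIB RATE_KB_PER_S   (hypothetical helper, illustration only)
# Prints the idealised transfer time in whole seconds.
transfer_seconds() {
    size_gib=$1
    rate_kb=$2
    # GiB -> KiB is a factor of 1024 * 1024; integer division rounds down
    echo $(( size_gib * 1024 * 1024 / rate_kb ))
}

secs=$(transfer_seconds 8 64)
echo "$(( secs / 3600 )) h $(( secs % 3600 / 60 )) min"
# prints: 36 h 24 min
```

So even at full uplink speed this single file needs about a day and a half, and correspondingly longer once the upload is throttled.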

Here is my solution to all the above problems, in a couple of commands. Once entered, you don’t need to worry about the transfer – just wait for a notification e-mail:

  1. ssh-agent bash
  2. ssh-add
  3. while ! rsync \
  4.           --bwlimit <KB/s value> \
  5.           -rP \
  6.           /path/to/directory_that_contain_the_data_to_be_transferred \
  7.           user@destination.host:/path/to/target_directory ; \
  8. do sleep 60 ; done && \
  9.  echo "File transfer completed successfully at $(date)." | mail -s "File transfer completed" your@e-mail.address

Quick explanation of the commands:

  1. Start a shell that is ssh-agent enabled, i.e. the SSH key passphrase can be cached within that shell
  2. Unlock the SSH key by entering the passphrase (which is then cached for all following commands in this shell session)
  3. Start a loop that will only end when the ‘rsync’ command (used as the loop’s ending condition) completes successfully, i.e. the file transfer is done.
    rsync is the perfect tool for the job, since it transfers files reliably (through checksums), can resume efficiently, and all traffic is going through an encrypted SSH connection.
  4. The --bwlimit option of rsync throttles the transfer to <KB/s value>. With my 512 kbit/s = 64 KB/s ADSL upload, that means I would use a value around 35 or 40, to guarantee there is still some upload bandwidth left for other things.
  5. -r stands for recursive (transfer the whole directory, and all sub-directories), and -P keeps partially transferred files for resuming and shows progress information; for details see ‘man rsync’
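  6. The source: the local directory holding the data to be transferred. Without a trailing slash, rsync creates the directory itself inside the target directory; for details see ‘man rsync’
  7. The destination, given as user@host:/path, which makes rsync connect to the remote host via SSH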
  8. The actual content of the while loop is just “wait for 60 seconds”. So in case there is a connection problem, the ‘rsync’ command will be retried every minute.
  9. Once the whole loop has completed successfully, which equals a successful transfer, send a short e-mail that notifies you about the completed transfer.
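
The arithmetic behind the value for line 4 can be sketched as follows. This is only an illustration: the function name bwlimit_for and the idea of reserving a fixed percentage are my assumptions, not something rsync provides. It converts the uplink speed from kbit/s to KB/s (divide by 8) and takes the given percentage of it:

```shell
#!/bin/sh
# bwlimit_for UPLINK_KBIT PERCENT   (hypothetical helper, illustration only)
# Prints a value for rsync's --bwlimit: PERCENT of the uplink, in KB/s.
bwlimit_for() {
    uplink_kbit=$1
    percent=$2
    # kbit/s -> KB/s is a division by 8; integer division rounds down
    echo $(( uplink_kbit * percent / (8 * 100) ))
}

bwlimit_for 512 60
# prints: 38
```

Using 60 % of a 512 kbit/s uplink gives 38, which sits in the 35 to 40 range mentioned above.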

(By the way, you can of course make one line out of lines 3 to 9, I just split it up to make it easier to read and explain)
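
If you use this pattern often, the retry loop can be wrapped in a small shell function. The following is a minimal sketch under my own naming (retry_until_success is hypothetical, not a standard tool); it does the same as the while loop above, for any command:

```shell
#!/bin/sh
# retry_until_success DELAY COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND again and again until it exits with status 0,
# sleeping DELAY seconds after every failed attempt.
retry_until_success() {
    delay=$1
    shift
    until "$@"; do
        echo "Attempt failed, retrying in ${delay} seconds" >&2
        sleep "$delay"
    done
}

# Usage with the placeholders from the listing above (not executed here):
#   retry_until_success 60 rsync --bwlimit <KB/s value> -rP \
#       /path/to/source user@destination.host:/path/to/target_directory \
#     && echo "Done at $(date)." | mail -s "File transfer completed" your@e-mail.address
```

Note that, just like the original loop, this retries forever. If the command can fail permanently (wrong path, rejected key), you have to interrupt it by hand.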

Reading news the way I like it

I have been using Tiny Tiny RSS (or TT-RSS) (www-apps/tt-rss in Gentoo) as my news aggregator since the end of 2008. It is a web application (written in PHP) that provides a great news aggregator UI. Since it’s a web application, you can use it from anywhere with just a browser, and thus have all your feeds in the same state (read/starred/…), anywhere you are. Think of it as Google Reader, but without Google knowing exactly what news articles you’re interested in, since you can run TT-RSS on your own server / web host.
So TT-RSS alone was already great, but its UI is better suited to a mouse and a big browser window than to a phone.

Recently I discovered the Android App TT-RSS Reader which solves exactly that problem. It connects to your TT-RSS instance via an API, and brings all your feeds, with all their states just as on the web interface, to your Android device. The interface is specifically designed for touch screens, so it’s much easier to navigate through your feeds and articles than via the web interface. Furthermore, it can cache articles and images, which you can trigger while on WiFi before leaving the house, and then read everything on the way in Offline mode, to save mobile traffic, which is expensive and/or slow for some. When you’re back on WiFi, you switch back to Online mode, and TT-RSS Reader synchronises your state to your TT-RSS instance. Absolutely awesome 🙂

A couple of quick screenshots from the web interface (TT-RSS):


… and the Android App (TT-RSS Reader):


Update (2011-07-03): Shortly after posting this, hwoarang offered to take TT-RSS from the Sunrise Overlay into Gentoo’s main repository (Portage), with me proxy-maintaining it. So now it’s even easier to get TT-RSS onto your Gentoo-powered server.