Thoughts on HTML5’s <time> element and other semantic info on the web

I just read about the <time> HTML5 element, and how it was introduced, then removed, and then re-introduced. While I think proper syntax, consistency, etc. are important, I am more concerned with what such new “semantic” elements will actually mean for the web and its users. This is not limited to <time>, but here it should be easy to explain my general concern, using an example:

It’s March 2012
Joe from the U.S. writes on his blog: “I’ll be on vacation in Europe starting 5/4/12, looking forward to meeting you there!”
Pierre from France reads the blog, and, knowing Joe is from the U.S., he will have the following thoughts: “Cool, Joe will be around… what’s that date… ah, Americans with their month/day/year format… ok, I interpret this as 4th of May, i.e. 4/5/12 in proper French format”

It’s March 2015, HTML5 and <time> are starting to get used
Joe from the U.S. writes on his blog: “I’ll be on vacation in Europe starting <time>2015-05-04</time>, looking forward to meeting you there!”
Pierre from France reads the blog, and since he has set his browser language to French, it shows “I’ll be on vacation in Europe starting 4/5/15, looking forward to meeting you there!”. Not knowing that his browser is being clever and showing him the date in the format he is used to, he thinks: “Cool, Joe will be around… what’s that date… ah, Americans with their month/day/year format… ok, I interpret this as 5th of April, i.e. 5/4/15 in proper French format”

Of course, with proper highlighting of automatically localised dates this could be mitigated to some extent, but I can imagine lots of cases where our current assumptions, coupled with technology that is trying to help, will cause even more confusion than we have now. When communicating, lots of information is “out of band” or just assumed known context. Therefore we need to be very careful when programming our machines to help us communicate, otherwise we achieve the opposite.
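
To make the ambiguity concrete, here is a small sketch that renders the one unambiguous ISO date from the example in both day/month orders. It assumes GNU date (the -d option and the %-m / %-d flags are GNU extensions); the variable names are made up for illustration.

```shell
#!/bin/bash
# Render one ISO 8601 date in the two day/month orders from the example.
# Assumption: GNU date is available (-d and the "-" padding flag are GNU extensions).
iso_date="2015-05-04"

us_style=$(date -d "$iso_date" +%-m/%-d/%y)   # month/day/year, as Joe would write it
fr_style=$(date -d "$iso_date" +%-d/%-m/%y)   # day/month/year, as Pierre would read it

echo "ISO:    $iso_date"
echo "US:     $us_style"
echo "French: $fr_style"
```

Both short forms are plausible dates, so a reader who sees only one of them cannot tell which order was used unless the page says so.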

Transferring large amounts of data over unreliable network connections

Ever wanted to transfer a batch of multi-Gigabyte files through a slow DSL link and had trouble? Then this article may help – it shows how to do this properly, with standard Linux/Unix command line tools and working SSH key-auth between the two hosts.

Let’s look at the problems one may encounter when attempting to, say, transfer an 8 GiB video file over the Internet, uploading through a slow link:

  • The connection to the destination host may break at any time, due to any network equipment failure in between, or DSL reconnects by the ISP, or your son pulling the plug of your home router, …
  • The data may get corrupted by faulty hardware or software along the way
  • The uploading may clog up your ADSL connection, essentially making it unusable for anything else
  • Transferring the data without encryption would allow possibly sensitive data to be read by others, e.g. someone at your ISP, or near the destination host
  • The transfer may take hours or days, depending on the amount/link speed ratio. Not getting notified when the transfer finishes may be undesirable

Here is my solution to all the above problems, in a couple of commands. Once entered, you don’t need to worry about the transfer – just wait for a notification e-mail:

  1. ssh-agent bash
  2. ssh-add
  3. while ! rsync \
  4.           --bwlimit <KB/s value> \
  5.           -rP \
  6.           /path/to/directory_that_contain_the_data_to_be_transferred \
  7.           user@destination.host:/path/to/target_directory ; \
  8. do sleep 60 ; done && \
  9.  echo "File transfer completed successfully at $(date)." | mail -s "File transfer completed" your@e-mail.address

Quick explanation of the commands:

  1. Start a shell that is ssh-agent enabled, i.e. the SSH key passphrase can be cached within that shell
  2. Unlock the SSH key by entering the passphrase (which is then cached for all following commands in this shell session)
  3. Start a loop that will only end when the ‘rsync’ command (used as the loop’s ending condition) completes successfully, i.e. the file transfer is done.
    rsync is the perfect tool for the job, since it transfers files reliably (through checksums), can resume efficiently, and all traffic is going through an encrypted SSH connection.
  4. The --bwlimit option of rsync throttles the transfer to <KB/s value>. With my 512 kbit/s (= 64 KB/s) ADSL upload, I would use a value around 35 or 40, so that some upload bandwidth is left for other things.
  5. -r stands for recursive (transfer the whole directory, and all sub-directories), and -P keeps partially transferred files for resuming and shows progress information; for details see ‘man rsync’
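  6. The source directory to transfer. Without a trailing slash, rsync creates the directory itself inside the target; for details see ‘man rsync’
  7. The destination in user@host:/path form, which makes rsync connect to the remote host through SSH; for details see ‘man rsync’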
  8. The actual content of the while loop is just “wait for 60 seconds”. So in case there is a connection problem, the ‘rsync’ command will be retried every minute.
  9. Once the whole loop has completed successfully, which equals a successful transfer, send a short e-mail that notifies you about the completed transfer.
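
The arithmetic behind the value in step 4 can be sketched as follows. The 512 kbit/s figure is from my connection; the 60 percent share and the variable names are only assumptions for illustration, so adjust them to taste.

```shell
#!/bin/bash
# Derive a --bwlimit value from the uplink speed.
uplink_kbit=512      # ADSL upload speed in kbit/s (figure from the text)
share_percent=60     # assumed share of the uplink to give rsync

uplink_kbyte=$(( uplink_kbit / 8 ))                  # 512 / 8 = 64 KB/s
bwlimit=$(( uplink_kbyte * share_percent / 100 ))    # 64 * 60 / 100 = 38 (integer division)

echo "Uplink: ${uplink_kbyte} KB/s, suggested --bwlimit: ${bwlimit}"
```

With these numbers the result is 38, which sits in the 35 to 40 range mentioned above.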

(By the way, you can of course make one line out of lines 3 to 9; I just split it up to make it easier to read and explain.)
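
If you use this pattern often, the retry loop could also be wrapped in a small shell function. This is only a sketch: retry_until_success is a made-up helper name, not something rsync or bash provides, and the rsync call in the comment uses the same placeholders as above.

```shell
#!/bin/bash
# retry_until_success DELAY COMMAND [ARGS...]
# Runs COMMAND repeatedly, sleeping DELAY seconds after each failure,
# and returns as soon as COMMAND exits with status 0.
# (Hypothetical helper for illustration only.)
retry_until_success() {
    local delay=$1
    shift
    until "$@"; do
        # $? still holds the exit status of the failed command here
        echo "Command failed with status $?; retrying in ${delay}s..." >&2
        sleep "$delay"
    done
}

# Example usage (commented out, placeholders as in the commands above):
# retry_until_success 60 rsync --bwlimit <KB/s value> -rP \
#     /path/to/directory_that_contain_the_data_to_be_transferred \
#     user@destination.host:/path/to/target_directory \
#   && echo "File transfer completed successfully at $(date)." \
#        | mail -s "File transfer completed" your@e-mail.address
```

The function behaves like the while loop in lines 3 to 8: it only returns once the command has succeeded, so anything chained with && runs exactly once, after a successful transfer.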