Technology

Migrating from Google Apps to Office 365

Just over six years ago, I consolidated all my personal email accounts onto Google Apps, back when the accounts were still free. It wasn't actually that long afterwards that I switched up to a paid Google Apps for Business account – there were a number of useful benefits as part of the package, not least the removal of the advertising that plagued the interface. My view has always been that while free services are great, unless there is a sustainable business model behind them, don't expect the service to always be around – especially important if you become heavily dependent on that service.

You might think that a company like Google is unlikely to drop free services? Take Google Reader as a prime example: an excellent online RSS reader, dropped by the wayside in 2013 even though it was arguably the best service available at the time. Only in more recent years have services appeared offering comparable equivalents, and you'll notice the best ones are not 100% free. If you're not paying for the product, you are the product.

Earlier this year I decided to move back into the Microsoft eco-system – the package of services within Microsoft Office 365 is hard to compete against. Even at a personal-use level, for about £4 per month you get fully hosted email, calendar and contacts (Exchange), unlimited online file storage (OneDrive for Business, aka SharePoint) and a real-time communication tool (Lync/Skype for Business, which also provides a WebEx-like service). Believe it or not, take-up of Office 365 has increased massively over the past 12 months and now exceeds Google Apps deployments – the service many had expected to become the disrupter in this space and the online office-suite 'SaaS' king, pushing Microsoft and its traditional software licensing model out of the market. Instead, Microsoft has adapted and is now unifying its approach across all platforms, not just providing software and services as first-class citizens on Windows.

Once I was all set up, all said and done, the experience has been very good – the web interface is excellent; on my MacBook a new version of Outlook for Mac has been released which links into the same web interface; and with the acquisition of Acompli late last year, the rebranded Outlook Mobile is now probably one of the best mobile email clients available. It's not all been completely smooth sailing, however, as migrating over all my historic data has brought its own challenges, in particular the past 10+ years of email.

Migrating Email via IMAP

Within Office 365 there is a Connected Accounts option which allows you to link to an existing (in this case, old) account, but for whatever reason it doesn't support Google's IMAP servers and so is effectively useless. Another option is to use the migration tools available within Exchange's administration interface, but again the CSV file I created detailing my old account credentials was not accepted in any shape or form.

Googling (the irony) exposed many commercial software packages and chargeable services available to carry out a migration, but with some persistence I did come across a more hands-on approach, providing an ideal minimal-cost option. Rick Sanders' IMAP Tools are a set of Perl scripts, one of which (imapcopy.pl) can be used to copy mail between two different IMAP servers. Rather than try to get them up and running on my MacBook, I decided to spin up an instance on DigitalOcean – this would take advantage of the high-speed connectivity from the DigitalOcean data centre, and would not require my own MacBook to remain online and connected throughout the migration process.

There are a couple of steps that need to be completed beforehand – first, within Google Apps, make sure that IMAP has been enabled, and if two-factor authentication is enabled, an app password will also need to be generated. Similarly, if two-factor authentication is enabled on Office 365, an app password needs to be generated there as well.

Next, create an account at DigitalOcean (use this link to get $10 free credit). Create a Droplet, assign a name (this can be anything), set it to the smallest size ($5 per month) in the nearest region (in my case, London), and set the distribution to CentOS 7 x64 – no other options are required. It takes around 60 seconds to complete, at which point click on the 'Console Access' button. Type the user name as 'root', then check your email for a message from DigitalOcean, which contains the root password. When prompted, change the password, and you'll have completed the login process.

There are a couple of dependencies to install, so type:

yum install -y perl openssl perl-IO-Socket-SSL screen

The IMAP Toolkit is no longer free; however, the last free version of imapcopy.pl is still available on Google Code. This can be downloaded and prep'd with the following commands:

wget https://imaputils.googlecode.com/svn-history/r5/trunk/imapcopy.pl
chmod 755 imapcopy.pl

Next, to be able to convert between Google and Office 365’s folder structure, a mapping file needs to be created using the following command – I wanted to move all mail into a single folder on Office 365 called ‘migrated’, but you can add additional mappings as required:

echo "[Gmail]/All Mail:migrated" >> map
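
If you want to bring across more than just the All Mail archive, extra mappings can be appended to the same file – for example, a hypothetical mapping that sends Gmail's Sent Mail to its own folder (the destination folder name here is just an illustration):

echo "[Gmail]/Sent Mail:migrated-sent" >> map

Any folders you add here also need to be listed, comma-separated, in the -m option of the migration command below.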

Finally, to start the migration process use the following commands – this uses the screen command so that if you lose your connection to the DigitalOcean host, the migration will still carry on in the background uninterrupted. Be sure to replace your Google Apps and Office 365 email addresses and passwords in the command below – and if you did add extra entries to the mapping file above, you'll need to add those additional folders, comma-separated, to the -m option:

screen
time ./imapcopy.pl -S imap.gmail.com:993/<google email address>/<google apps password> -D outlook.office365.com:993/<office 365 email address>/<office 365 password> -v -U -d -I -z -M map -m "[Gmail]/All Mail" -L migration.log

Depending on the size of your mailbox, this may take a few hours to run. Once complete, the time command will display how long the migration took. The log file 'migration.log' will have full details of every action taken, including any errors encountered – the migration command can be re-run exactly as-is and it will pick up where it left off. If you do get disconnected at any point, reconnect to your Droplet and log in as root as before, then type the following to reattach to the migration script:

screen -D -r

On a few occasions I found that the copy hit a problematic email, which prevented the migration from proceeding – I was able to resolve this by opening up the log file using the following command:

nano migration.log

I would then press CTRL-W to initiate a search and type 'BAD FETCH'. This jumps to the place in the log where the issue first occurred. Scrolling up a few lines reveals the message ID (it looks like an email address) of the problematic email, which I then searched for in Google Apps with the search phrase 'rfc822msgid:<message id>'. Once found, I deleted the email and restarted the migration script – fortunately, none of the problematic emails were important.
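
As an alternative to scrolling through the log in nano, the same information can be pulled out in one go with grep – a quick sketch, assuming the standard GNU grep shipped with CentOS, where -B prints the preceding lines that contain the offending message ID:

grep -B 10 'BAD FETCH' migration.log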

Checking for Duplicates

After the migration had completed, I noticed that a few disconnects/reconnects had taken place, so I decided to check that no duplicates had been created, just in case the migration had re-started from scratch on each reconnect. Another Google search later produced the answer in Quentin Stafford-Fraser's IMAPdedup, hosted on GitHub. This can be downloaded and prep'd as follows:

wget https://raw.githubusercontent.com/quentinsf/IMAPdedup/master/imapdedup.py
chmod 755 imapdedup.py

This is run in a similar way to the migration script, but this time it is executed only against the folder copied over to Office 365 (i.e. 'migrated'). I prefer this to running it against the original source on Google Apps, because if there are any issues or lost mail, I can roll back by re-copying the mail over from Google Apps. As an extra precaution, it's initially run with -n, which carries out a dry run and doesn't actually delete any mail; the -c and -m options checksum the To, From, Subject, Date, Cc, Bcc and Message-ID fields, which is the safest method of checking for duplicates.

./imapdedup.py -s outlook.office365.com -u <office 365 email address> -x -c -m -n migrated

A final report details how many messages would have been deleted as duplicates, if any. To proceed with deletion of the duplicates, re-run the command without the -n:

./imapdedup.py -s outlook.office365.com -u <office 365 email address> -x -c -m migrated

Once you're happy that all your email has successfully migrated over, log back into the DigitalOcean control panel and destroy the Droplet – the account will only be charged for the hourly usage while the Droplet was running, deducted from the free $10 credit if you used the sign-up link previously mentioned.

Technology

Yesterday’s internet, today’s dependency

In today's connected society, access to the internet is now assumed. Everything we do – paying taxes, shopping, watching TV, doing our jobs – is online, and it's only going to grow in ways we can't currently imagine or foresee. Just 10 years ago I had an ADSL 'Max' connection running into my home, providing a download speed of around 6Mbit per second, and 5 years before that I was one of the first ADSL customers in the UK with BT Broadband, able to download at around 512Kbit per second – a huge jump compared to the dial-up modem of the 90s, literally 10 times faster (and a lot quieter), marking out the very early foundations of the online social revolution.

Back to the present day, and I am now able to download at around 60Mbit per second through my BT Infinity connection (10 times faster than 10 years ago) – or was, until my phone line went dead on Thursday. What is pretty much seen as a critical (or the 4th) 'utility' of the modern household won't be back online until Wednesday at the earliest. To be fair, I've probably got a bit more tech than most, and I recognise that broadband speed still varies greatly across the UK, but in my house no internet means no TV (Netflix/Amazon Prime/YouView/YouTube), no music (Spotify/Google Play), no gaming (Xbox/PC), no remote working, and the central heating can't check in for weather information or be accessed remotely away from home (Nest), among other things. It is also fair to say I am not going to die from lack of internet, but just look at how quickly our dependency on the internet has changed the way we run and operate our lives.

It also highlights a legacy that still exists – the phone line. I have a telephone number at home, but very few people know it, and even fewer ever call it, yet the need to have one is still enforced because I need a fixed-line internet connection. We've not got past needing a 'full' phone line just to have a broadband connection – don't get me wrong, I need the physical wire into my property, which also needs to be maintained, but I don't need the additional costs lumped into the line rental for services I don't use, a charge which itself goes up every year without fail. The cost I pay for broadband should be all-inclusive.

Fortunately, I have a mobile phone with a 3G internet connection, which means all is not lost. Although accessing the internet over a 3G connection feels like I have gone back 10 years (and is far more dependent on reception), I am able to get a level of access which does not significantly change the way I live and work day to day. While my own use of technology is higher than most, it's easy to see how many people could live without a fixed-line connection. Research in this area shows a year-on-year increase in the number of homes with no phone line at all, as the use of mobile technology gradually negates the need for these services. With the advent of faster 4th generation (4G) data services, and beyond, what can be received over the air is becoming on a par with traditional fixed-line broadband connections, except it's with you everywhere you go and not just at home.

Technology

Even Google gives bad support on PAID services

I’ve hosted my personal domains on Google Apps for years without issue, to the point that I switched over to a paid account sometime back to get extra email routing options and to remove the adverts. However, just today, Google sent out the following email (I’ve added the highlight/bold):

Your Google Apps tax setting will change in the next few weeks

Hello,

We’re writing to let you know about a change to Google Apps that will affect your account for apps.andrewallen.co.uk. We will be changing your tax setting in Google Apps to “business” from its existing setting of “personal” in the next few weeks. After this change, we won’t add VAT to your Google Apps charges, and you’ll be responsible for determining taxes due in your country if you’re not an Irish customer. If you are an Irish customer, you will still be charged VAT.

A “business” tax setting means that your Google Apps account is used for commercial purposes. We believe that your account is a business account, which includes corporates, affiliates, sole traders, self-employed merchants or professionals, and partnerships.

If you are using Google Apps for non-business reasons, then you will be allowed to sign up again as non-business account. We will send another communication with details about how to sign up again.

Sincerely,

The Google Apps Team

No contact details were provided, and return emails just bounced. Makes you wonder, how do they decide (incorrectly in my case) if you’re a business? Do they read your emails, your calendar and make assumptions? Is a human even involved, or is this a result of data mining all the information they hold on you from across all their services? Oh, and they graciously say I’ll “be allowed” to sign up again as a non-business account.

It suddenly feels like, even though I pay for services, I have no control or say over my own account. Even I can’t decide my own status. An Orwellian moment.

Update: Thursday, 9th October, 2014

I called up Google support over lunch to ask for clarification. The response was that all European Google Apps for Work customers will be switched over to a business account setting sometime in mid-November. I asked what happens to all the individuals who are not businesses and use Google Apps for personal use, and the response, after a couple of minutes on hold, was that an email will be sent out sometime afterwards which will allow users to switch the setting back to personal.

Why would Google make such a bad, broad assumption? Why cause unnecessary work for its customers? My immediate thought is that this is some kind of accounting dodge – I am in no way an expert and it's just personal opinion, but if Google no longer has to account for collecting tax from customers across all the various countries of Europe and instead puts the onus back onto their customers, it's got to save them some money, right? And with a number of large corporates being added to those under investigation by the European Commission for the way they operate their tax arrangements, now including Amazon, I don't believe it will be long before Google joins the same list.

In any case, the original email was terribly worded and insinuated that its customers were acting nefariously in some way, and in my book that is totally unacceptable.

Technology

Can I teach my kids to code?

With the change in the air outside here in the UK and autumn really beginning to kick in after the extended warm weather in September, it's time to recognise that keeping the kids shut outside is no longer a humane method of childcare – besides, thawing a wet, cold child is not a peaceful affair, and certainly doesn't afford the appropriate investment of time. In a sheer stroke of coincidence, the fruits of a Kickstarter campaign I joined last year have just been delivered through my mailbox this week, which will again bring a sense of peace and calm to the household, if only for one day.

Good things come in small packages, which seems to hold true at least for this boxed-up modern-day electronics kit: after taking the bright orange box out of the bland brown packaging and sliding out the beige box within, the colourful picture of the printed circuit board on the front is revealed. With what appears to be an Apple-like attention to detail, the front edge of the box lifts up, held in place with a satisfyingly grippy magnetic strip hidden away inside the cardboard, and the neatly organised interior is finally exposed.

Nestled inside the top are a couple of manuals and sheets of stickers, while the rest is divided into different sections, each securely holding the various components and cables required to build the computer itself. Since this is a UK version, it's been supplied with an appropriate mains adapter, but otherwise you need to supply the monitor or display, which does need to sport an HDMI connection.

Since it would defeat the purpose, I've managed to resist chucking out the manuals (who RTFMs anyway?) and slamming it all together myself, but I did have a quick flick through the coding manual, enough to identify that my kids will be well and truly absorbed by the Minecraft section. It looks like I might already be calling mission accomplished on the quiet-household front, since Minecraft has held their attention for many hours along with their friends, and to be honest some of the results of their virtual construction are very, very impressive.

Closing back up the box ready for Sunday, the true test will be keeping them from fighting over who gets to touch it first.

Productivity, Technology

Productivity on the Mac: aText

Recently I've been looking at a number of ways to improve productivity – this includes changing the way I work, live and operate, but some of these improvements can be made in small steps as well. Having switched back to the Mac at the end of last year as my primary device for work and play (the biggest improvements came through choosing high-quality hardware with an SSD, by the way!), I have since built up a useful collection of software utilities which make those small improvements in productivity.

The latest of these is called aText, which I came across when originally looking at starting to use TextExpander. In simple terms, it automatically replaces text with pre-defined text, which you configure within the app. How is that useful? It takes some getting used to, but if you spend any time thinking about it, you'll find that you regularly type the same particular phrases all day long – this app allows you to, for example, type 'tkvm' in any application, and aText will replace it with 'Thank you very much'.

Not an immediately obvious benefit, but as I've begun to populate aText with business names, people and products, I do find myself writing emails, documents and meeting notes much quicker. In Evernote I recently built a template note which I use to document all my meetings – using aText, I can now quickly include attendees (e.g. 'AA*' becomes 'Andrew Allen'), add dates ('ddate' becomes 'Friday, 23 May 2014'), include timings ('ttime' becomes '13:56') and so on.

Arranging meetings and including the conference bridge details is quick as well. I've set up 'bridge*' to be automatically replaced with:

To join the teleconference:

United Kingdom: 0800 xxxxxxx (freephone) or 0203 xxxx xxx
United States: 855 xxx xxxx (toll-free) or 404 xxx xxxx
Conference Code: 7785xxxxxx#

To view all global dial-in numbers, please click the link:
https://www.tcconline.com/offSite/OffSiteController.jpf?cc=7785xxxxxx

So why aText? As I mentioned above, I was looking to use TextExpander after hearing it described on a recent productivity podcast, but I couldn't justify ~$35 – fortunately, a few Google searches later I came across aText, which provides pretty much the same functionality, and at only $5 on the App Store it was a no-brainer.

Technology

Archiving on Microsoft Outlook 2010

Google Apps is my primary choice for email at home; however, Microsoft Outlook 2010 is the staple mail client supplied on my corporate laptop. We're pretty lucky, as the mailbox limit is pretty large compared to all the previous organisations I've worked for – I've had anything from 50MB to 150MB on average, but even my current 2GB limit won't hold everything. Since starting at Savvis in 2011, I've adopted an archive-everything policy rather than deleting any messages – this has been, and continues to be, a very useful source of information, both as a record of all previous conversations and as a reference library.

The downside to effectively keeping everything you send and receive is a constantly growing storage requirement – the old data needs to be taken out of the mailbox and stored offline in PST files. If you've ever used archiving in Outlook, something you may not have realised is that messages are archived based on their received date or their last modified date and time, whichever is later. This is a problem for the way I work, since I move messages between folders and update categories on a regular basis, all of which affects the last modified date. I have also recently been trialling Taglocity and re-tagging a lot of mail, which ended up changing the last modified date on 95% of the mail currently in my mailbox – when I ran an archive on all mail over 3 months old, hardly anything was moved out to the archive.

Fortunately, it is possible to change the default behaviour by adding the following registry key:

Key: HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Outlook\Preferences
Name: ArchiveIgnoreLastModifiedTime
Type: DWORD
Value: 1

This removes the check against the last modified date, so only the received date is checked when processing messages as part of the archive process. In Outlook 2010, this feature was added in KB2516474, so you need to make sure that patch is installed. Once I had made this change and re-ran the archive, 600MB of data was moved out to my local PST. More information on this registry change can be found in KB2553550.
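
If you'd rather not create the value by hand in the Registry Editor, the same change can be made from a Command Prompt – a quick sketch using the built-in reg.exe, assuming a default Outlook 2010 install (hence the 14.0 key); you may need to restart Outlook for the change to take effect:

reg add "HKCU\Software\Microsoft\Office\14.0\Outlook\Preferences" /v ArchiveIgnoreLastModifiedTime /t REG_DWORD /d 1 /f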

Technology

Setting Up Vagrant for Octopress

Back in the summer of 2012, I switched my site from Posterous over to Octopress, having decided to remove the dependency on a proprietary blogging platform, primarily because all the site data was held by an organisation that might on a whim choose to cease the service. Case in point: in February of this year Posterous announced they were closing down the service – having been acquired by Twitter back in March 2012, you would probably agree that this was inevitable.

If you're not familiar with Octopress, it's a static blogging framework built on top of Jekyll, which itself is a simple, blog-aware, static site generator. It takes a template directory containing raw text files in various formats, runs them through Markdown and Liquid converters, and spits out a complete, ready-to-publish static website suitable for serving from your preferred web server. Jekyll is very minimalistic and very efficient. The most important thing to realise about Jekyll is that it creates a static representation of a website, requiring only a static web server. Traditional dynamic blogs like WordPress require a database and server-side code, and heavily trafficked dynamic blogs must employ a caching layer that ultimately performs the same job Jekyll sets out to do: serve static content.

Since there are no server-side requirements, the dependencies for building out the site exist on your own machine. If you've ever followed the documentation on the Octopress site, you'll have an implementation similar to what I had in place last year – on my Windows 7 machine, I had installed Git for Windows, along with Yari to install and manage Ruby. Once all the relevant RubyGems were installed, I was able to create and maintain my Octopress site, but since I was publishing to Amazon S3, I also needed to use CloudBerry Explorer for Amazon S3 to manually upload the generated content. SkyDrive was also used to store a synced copy of the site data.

Having initially used an Amazon S3 bucket to host the static content and Amazon CloudFront as the CDN, in the past few days I've switched over to hosting the site content on Heroku, with CloudFlare as the CDN. It has been quite a number of months since I last made a post, and while working on the move it quickly became apparent that the local system I was previously using no longer existed (it had since been rebuilt), along with the various software components required to post to and maintain the site.

This time around, rather than re-install all the various components again, I wanted to find a more suitable solution which would allow me to update the site from a number of different machines; i.e. a more portable and reproducible method of setting up the environment to allow ongoing maintenance of the site. Some Googling later, I came across a very interesting piece of software called Vagrant, which allows exactly that.

Vagrant provides easy-to-configure, reproducible and portable work environments by isolating dependencies and their configuration within a single disposable, consistent environment. From a single Vagrantfile, regardless of whether the platform is Mac, Windows or Linux, the environment will be brought up in exactly the same way, against the same dependencies, all configured identically.

Not without its own requirements, Vagrant needs an installation of VirtualBox to be present (used to manage the virtual machines) and also an SSH client (used to connect into the virtual machines). On my Windows 7 system, my preference was to install Cygwin with the OpenSSH package – if you're installing Cygwin on a system where you don't have administrative rights, make sure you rename SETUP.EXE (retrieved from the Cygwin site) to anything else, such as CYGWIN.EXE, otherwise the system will trigger UAC and prompt for administrative credentials.

With Cygwin, VirtualBox and Vagrant now installed (with default configuration), I then opened up a Cygwin Terminal. By default, I was in my Cygwin home directory (C:\cygwin\home\andrew.allen) – for example's sake, this is where I built my Vagrant environment, but if you have your own preference for where to store your project data, you should change to the relevant directory on your system. I then typed the following commands:

vagrant init precise32 http://files.vagrantup.com/precise32.box
vagrant up
vagrant ssh

Once these three commands completed, I had a new virtual machine started in VirtualBox, running Ubuntu 12.04 LTS 32-bit, to which I was now connected via SSH – now that was pretty cool, and took little to no effort.

Next, I configured the machine with the dependencies I needed for Octopress. In the Cygwin Terminal window, I typed the following:

sudo su -
apt-get update && apt-get install -y git curl
curl -L https://get.rvm.io | bash -s stable --ruby=1.9.3
source /usr/local/rvm/scripts/rvm
rvm rubygems latest
gem install bundler
gem install heroku
exit
exit

This installed Git, cURL, RVM and Ruby 1.9.3 with its RubyGems dependencies, including support for publishing to Heroku, then disconnected from the VM. Here is the clever bit – by updating the Vagrantfile with these commands, bringing up this environment in future will automatically install these components. When I first ran the 'vagrant init' command, it created the Vagrantfile in the current directory – I opened this file up in Notepad and added the following lines, just before the last 'end':

config.vm.synced_folder "/Personal Data/SkyDrive/Projects/www.andrewallen.co.uk", "/projects/www.andrewallen.co.uk"
config.vm.provision :shell, :path => "bootstrap.sh"

The first line creates a mapping between the location on my host system where I wanted to store my Octopress site and a folder within the virtual machine, while the second line references a script file which will contain the commands I manually executed previously. I saved the file, closed Notepad, created a new file in the same directory called 'bootstrap.sh', then added the following lines:

#!/usr/bin/env bash
apt-get update && apt-get install -y git curl
curl -L https://get.rvm.io | bash -s stable --ruby=1.9.3
source /usr/local/rvm/scripts/rvm
rvm rubygems latest
gem install bundler
gem install heroku

Now, when the virtual machine is brought online with 'vagrant up', this script file will be transferred into the VM and executed as root. After saving the file and closing Notepad, I proved this to be the case – switching back to the Cygwin Terminal window, I ran the following commands:

vagrant destroy -f
vagrant up
vagrant ssh

Once I was connected back to a newly created VM instance, I ran 'ruby --version', which confirmed that Ruby 1.9.3 was successfully installed. Taking a copy of the Vagrantfile I had created, anyone can now bring up the exact same environment, ready for use by Octopress. Back in the Cygwin Terminal window, I ran the following commands to configure a clean Octopress site within the project directory I had mapped in the Vagrantfile:

cd /projects/www.andrewallen.co.uk/
git clone git://github.com/imathis/octopress.git octopress
rm octopress/.rvmrc
rm octopress/.rbenv-version
cd octopress
bundle install
rake install

These files were directly written to the mapped folder on my host machine, which also happens to be within my SkyDrive folder, and not within the VM itself.

Following this work, I am now in the position of being able to bring up an environment to support the management of my site quickly, easily and consistently, while all my site data is stored safely, synced automatically and version controlled within my SkyDrive account. On any machine, I am now able to quickly recreate my blogging environment.

Using the Vagrantfile that I've created, you can now bring up an environment with all the correct pre-requisites and dependencies to support an Octopress site, simply by installing Vagrant and VirtualBox onto your preferred platform (Mac, Windows or Linux) along with a suitably installed SSH client.
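
As a rough sketch of the day-to-day workflow once everything is in place (the rake tasks are the standard Octopress ones, and the paths match my own setup above):

vagrant up
vagrant ssh
cd /projects/www.andrewallen.co.uk/octopress
rake new_post["A shiny new post"]
rake generate

The generated output lands in the synced folder on the host, kept safe in SkyDrive and ready to be deployed from there.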