Migrate from Blogger to ruhoh with proper redirects
Published 2013-1-19
Watch on YouTube: youtu.be/ScBoaQw84Ks
Overview
This is based on Migrate from Blogger to Jekyll, from many moons ago.
You're going to
- export your BlogSpot blog (posts and comments) to xml
- translate it from xml to html for static bloggers
- set up redirects to maintain traffic and SEO
It used to be that in order to redirect without losing your google page rank (SEO), you had to use a custom domain so that the redirect would be on the same domain. Supposedly that's not true anymore but, whatever, we'll play on the safe side. Plus it means more is being handled by you and your servers rather than by Google anyway.
To be clear, the full migration path involves 3 redirects:
- The custom domain redirect, which is handled by HTTP 301 + Location (done by blogspot automatically)
  oldblog.blogger.com -> oldblog.mysite.com
- The template variable redirect, handled by a <meta> refresh + canonical <link> in <head> (inserted by you into a classic template)
  oldblog.mysite.com/post-url -> newblog.mysite.com/redirect?blogger=http://oldblog.mysite.com/post-url
- Your server redirect, handled by HTTP 301 + Location (via nginx, NodeJS, ruby, python, etc)
  newblog.mysite.com/redirect?blogger=http://oldblog.mysite.com/post-url -> newblog.mysite.com/post-url
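To make the chain concrete, here's a minimal Python sketch that traces one post URL through the three hops. The hostnames are the placeholder ones from the list, and the function name trace_redirects is made up for illustration. It only builds strings and never touches the network.

```python
from urllib.parse import urlsplit


def trace_redirects(blogspot_url, custom_host, new_host):
    """Return the URL a visitor lands on after each of the three redirects.

    All hostnames are placeholders; nothing here talks to the network.
    """
    path = urlsplit(blogspot_url).path

    # Hop 1: Blogger's automatic 301 to the custom domain (same path).
    hop1 = "http://" + custom_host + path

    # Hop 2: the <meta> refresh in the classic template appends the
    # permalink, unencoded, as the 'blogger' query argument.
    hop2 = "http://" + new_host + "/redirect?blogger=" + hop1

    # Hop 3: your server's 301 keeps only the path of the old permalink.
    hop3 = "http://" + new_host + urlsplit(hop1).path

    return [hop1, hop2, hop3]


for url in trace_redirects(
    "http://oldblog.blogger.com/post-url",
    "oldblog.mysite.com",
    "newblog.mysite.com",
):
    print(url)
```

If the final URL doesn't match where your new blog actually serves the post, the third redirect will land on a 404, so it's worth eyeballing a few real permalinks this way.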
Remember: once you embark on this journey, you MUST keep the redirects in place for several months (or as long as you can).
Most users will probably never update their bookmarks, but most of them probably find your site again through search rather than through bookmarks anyway.
You see, there was some talk about how non-HTTP cross-domain redirects could be malicious and even talk about future browsers ignoring cross-domain meta refreshes in HTML. I think it's all bunk and we'll be fine.
Setting up a custom domain
If you don't already have a domain, I'd recommend purchasing from name.com or gandi.net
- Login at Blogger.com (you should end up at http://blogger.com/home)
- Click on the title of the blog you wish to migrate
- At the bottom of the bar on the left, click settings
- Under Publishing select Blog Address and Add a custom domain
- You will need to add 2 CNAME records through your domain registrar. (In the screencast I show how this is done with http://name.com, but it will vary from site to site)
Backup your blog in full in both "upgraded" and "classic" formats
Back on your blog's settings page we'll download all posts, comments, settings, etc as they currently stand (for your safety).
- Click Settings -> Other
- At the top under Blog tools you'll see export blog, click it
- Rename the file from blog-dd-mm-yyyy.xml to something like blog-dd-mm-yyyy.current.xml
The "classic" template is what I actually tested with and I know it works. With the "upgraded" template style, it's not clear to me how to use the variables we'll need later on.
- Click Template on the left bar
- Scroll to the bottom and click Revert to classic template
Now to export again (this is the one we'll use)
- Go back to Settings on the left, and then back to Other.
- Click export blog again (your blog is now in a different format, btw)
- Rename the file to something like blog-dd-mm-yyyy.classic.xml
Import to Ruhoh (for example)
Go on over to http://nodejs.org to download and install NodeJS, then translate your backup xml file into html with yaml frontmatter using blogger2jekyll:
npm install -g blogger2jekyll
blogger2jekyll ~/Downloads/blog-*.classic.xml ~/blogger-posts/
Note: I used xmllint --format to help me look at the xml glob and figure out what it meant.
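If you'd rather get a quick summary than read the formatted xml by eye, here's a rough Python sketch that counts the entries in the export by kind. It assumes the export is an Atom feed where each entry carries a category whose scheme ends in '#kind' and whose term ends in '#post', '#comment' and so on. That matches the exports I'd expect from Blogger, but treat it as an assumption and check it against your own file. The sample feed and the function name count_kinds are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"

# A tiny stand-in for a real export (assumed structure, not real data).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <category scheme="http://schemas.google.com/g/2005#kind"
              term="http://schemas.google.com/blogger/2008/kind#post"/>
    <title>Hello</title>
  </entry>
  <entry>
    <category scheme="http://schemas.google.com/g/2005#kind"
              term="http://schemas.google.com/blogger/2008/kind#comment"/>
    <title>Nice post</title>
  </entry>
</feed>"""


def count_kinds(xml_data):
    """Count Atom <entry> elements by the suffix of their 'kind' category."""
    root = ET.fromstring(xml_data)
    counts = Counter()
    for entry in root.iter(ATOM + "entry"):
        kind = "unknown"
        for cat in entry.findall(ATOM + "category"):
            if cat.get("scheme", "").endswith("#kind"):
                # Keep only what follows the '#', e.g. 'post' or 'comment'.
                kind = cat.get("term", "").rsplit("#", 1)[-1]
        counts[kind] += 1
    return counts


print(count_kinds(SAMPLE))
```

To run it on a real export, read the file in binary mode and pass the bytes to count_kinds. If the number of posts is far off from what your Blogger dashboard shows, something went wrong with the export.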
Now head to http://rvm.io and make sure you have ruby-1.9.3 afterwards.
source ~/.rvm/scripts/rvm
rvm reload
rvm use default
ruby --version
See Installing Ruby on Ubuntu 12.04 if you run into any trouble with the default install.
Time to get ruhoh up and rolling as per http://ruhoh.com
pushd ~
git clone 'git://github.com/ruhoh/blog.git' 'blog-v2'
pushd blog-v2
git checkout "2.0.alpha"
bundle install
bundle exec rackup -p 9292
Check out your empty blog at http://localhost:9292 then hit ctrl+c in the terminal to kill rackup.
rsync -avhP ~/blogger-posts/ ~/blog-v2/posts/
bundle exec rackup -p 9292
Check out your fancy blog at http://localhost:9292 then hit ctrl+c to exit.
You should really see Hosting your blog on ruhoh.com, because you still need to configure a few things before you're ready to rumble in the blog jungle!
Get hosted
Now it's time to get your site hosted. If you don't have a server yet, I recommend either thrustvps or chunkhost.
bundle exec ruhoh compile
rsync -avhP ./compiled/ user@server:/var/www/blog.example.com/
ssh user@server
npm install -g blogger2jekyll
sudo blogger2jekyll-server 80 /var/www/blog.example.com/
See the appendix for instructions on how to host with Nginx.
Contact me if you need further help. There is such a broad range of environments, I can't explain it all here.
Finalize the redirects
You need to get to the template editor:
- Go back to your Blog settings
- Click Template on the left bar
- Scroll to the bottom, where the editor is
Now you need to put some template code up in the <head>.
It will not work in <body>!
Here's an example of the code. Note, however, that you need to replace all four occurrences of localhost:8080 with your domain!
<head>
...
<MainPage>
<meta
http-equiv=refresh
content="0; url=http://localhost:8080/">
<link
rel="canonical"
href="http://localhost:8080/" />
</MainPage>
<ItemPage><Blogger>
<meta
http-equiv=refresh
content="0; url=http://localhost:8080/redirect?blogger=<$BlogItemPermalinkURL$>">
<link
href="http://localhost:8080/redirect?blogger=<$BlogItemPermalinkURL$>"
rel="canonical"/>
</Blogger></ItemPage>
...
</head>
<body>
...
<!--script>
location.href = 'http://blog.coolaj86.com/?blogger=' + location.href; //location.pathname.split('/').pop()
</script-->
</body>
Once you hit save you should utilize http://validator.w3.org/checklink to make sure that everything was moved over and redirected properly.
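The link checker works against the live site. For a quick local spot check, here's a small Python sketch that pulls the meta refresh target and the canonical link out of a page's HTML so you can confirm they agree. It assumes you've saved the rendered source of one of your old post pages; the sample markup and the names RedirectTags and redirect_tags are made up for illustration.

```python
from html.parser import HTMLParser


class RedirectTags(HTMLParser):
    """Collect the meta-refresh target and the canonical href from a page."""

    def __init__(self):
        super().__init__()
        self.refresh_url = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # content looks like "0; url=http://..."
            content = attrs.get("content") or ""
            idx = content.lower().find("url=")
            if idx != -1:
                self.refresh_url = content[idx + len("url="):].strip()
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")


def redirect_tags(html):
    """Return (refresh_url, canonical_href); either may be None."""
    parser = RedirectTags()
    parser.feed(html)
    return parser.refresh_url, parser.canonical


# Stand-in for what the template should render on a post page.
PAGE = """<head>
<meta http-equiv=refresh
  content="0; url=http://newblog.mysite.com/redirect?blogger=http://oldblog.mysite.com/post-url">
<link href="http://newblog.mysite.com/redirect?blogger=http://oldblog.mysite.com/post-url"
  rel="canonical"/>
</head>"""

refresh, canonical = redirect_tags(PAGE)
print(refresh == canonical, refresh)
```

If either value comes back as None, or still contains the literal template variable, the template tags probably didn't render (for example because the code ended up in the body instead of the head).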
BTW, if you don't want to take my word for what the heck that glob means, feel free to check my references:
- Template Tag list
- Template Tags: defined
- What are conditional tags?
- How To Add rel=canonical to Blogspot Blogs
Note: I tried using just $BlogItemURL$, but it didn't work.
Be Happy
Now you sit and await the probably-will-never-come day when you can remove your old blag.
Links I couldn't have done without
- http://stackoverflow.com/questions/5392001/seo-consequences-of-redirecting-with-meta-refresh
- http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
- http://blogger2wordpress.com/
- http://techie-buzz.com/webmaster-tips/how-to-add-rel-canonical-blogspot-blogs.html (January 31st, 2012)
- http://ruhoh.com/docs/2/pages/#toc_33
- http://ruhoh.com/docs/1/configure/#toc_10
- http://support.google.com/blogger/bin/answer.py?hl=en&answer=97416
Appendix
Nginx
You'll probably want to put this in /etc/nginx/conf.d/blogger-redirect.inc or directly into /etc/nginx/sites-available/blog.example.com.
# matches '/redirect' and '/redirect/'
location ~ /redirect/?$ {
  # match a querystring 'blogger=<anything>//<anything-not-slash>/(all-the-leftovers)'
  if ($args ~ blogger=.*//[^/]*/(?<postUrl>.*)) {
    # '^' matches any url (we've already limited ourselves to /redirect above)
    # '$postUrl' is the variable as matched above
    # the trailing '?' means 'throw away the original query arguments'
    rewrite ^ /$postUrl? permanent;
    # the arguments matched but the rewrite failed? that's a server error
    return 500;
  }
  # Couldn't process the redirect because it didn't match the pattern
  return 422;
}
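If you're not running nginx, the same logic fits in a few lines of any language. Here's a rough Python sketch using only the standard library. It is not what blogger2jekyll-server does internally (I'm only mirroring the nginx rules above), and the names new_location, RedirectHandler and serve are made up for illustration. One difference: it only accepts a path of exactly /redirect or /redirect/, while the nginx pattern accepts any path ending that way.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Same idea as the nginx pattern: blogger=<anything>//<host>/<post path>
BLOGGER_ARG = re.compile(r"blogger=.*//[^/]*/(?P<post_url>.*)")


def new_location(request_target):
    """Map '/redirect?blogger=http://old/post' to '/post', or None."""
    parts = urlsplit(request_target)
    if parts.path.rstrip("/") != "/redirect":
        return None
    match = BLOGGER_ARG.search(parts.query)
    if match is None:
        return None
    return "/" + match.group("post_url")


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        location = new_location(self.path)
        if location is None:
            # Didn't match the pattern, same as the nginx 'return 422'.
            self.send_error(422, "Unprocessable redirect")
            return
        self.send_response(301)
        self.send_header("Location", location)
        self.end_headers()


def serve(port=8080):
    """Run the redirect server until interrupted (port is an assumption)."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()


print(new_location("/redirect?blogger=http://oldblog.mysite.com/post-url"))
```

Call serve() to actually listen. In practice you'd put this behind whatever is serving the compiled blog, since this handler answers nothing except the redirect.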
TODO yet another tutorial
Bad Words
If you're wondering why I didn't mention Apache or PHP in any of this... just do us all a favor and get with the program! What is this, 1996? Stop it. No. Just, no!
By AJ ONeal