See the new article: Migrate from Blogger to Jekyll with Proper Redirects


Migrate from blogspot to Jekyll.


  • indexing related posts with --lsi
  • parse formatted content (not likely to be accomplished)


Turn on full RSS feeds for your blog

  1. Navigate to your blog http://yourblog.blogspot.com
  2. Sign in
  3. Click on the edit link for any post
  4. Click settings
  5. Click site feed
  6. Click advanced mode
  7. Select full for all options
  8. Save


The caveats are that you either lose a lot of formatting or a lot of your time. You pick.

Use vilcans' Jekyll rss_importer


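BLOGGER=yourblog  # your blogspot subdomain, as in http://yourblog.blogspot.com

git clone http://github.com/vilcans/jekyll.git
cd jekyll/
git branch -a  # check that the rss_importer branch is listed
git checkout -b rss_importer origin/rss_importer
mkdir -p _posts  # the importer writes the converted posts here
# Ruby's YAML library has to be required as "yaml" (lowercase)
sed -i 's/require "YAML"/require "yaml"/' ./lib/jekyll/converters/rss.rb
# Quote the URL so the shell doesn't treat the ? as a glob character
wget "http://${BLOGGER}.blogspot.com/feeds/posts/default?alt=rss" -O "${BLOGGER}.rss.xml"
ruby -r './lib/jekyll/converters/rss' -e 'Jekyll::RSS.process("'"${BLOGGER}"'.rss.xml")'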

Use the by-hand approach


wget --convert-links --html-extension --mirror --random-wait --wait 3 http://${BLOGGER}.blogspot.com/

Essentially, you would want to parse each mirrored page and pull out:

  • the filename, YYYY_MM_DD_title.html, to use as the post's filename and as name:
  • everything from <title> up to </title>, to put in title:
  • everything from <div class='post-body entry-content'> until <div class='post-footer'>, to put after the YAML front-matter
  • everything from <h2 class='date-header'> up to </h2>, to put in date: (this includes the time, which the filename doesn't)
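A rough sketch of that parsing in bash might look like the following. It is only a starting point under several assumptions: html_to_post is a hypothetical name (it is not part of Jekyll or the importer), each marker tag is assumed to sit on a single line of the mirrored HTML, and the layout: post line is an assumption about what you want in the front-matter. Since Jekyll expects post files named YYYY-MM-DD-title, the sketch also swaps the underscores in a leading YYYY_MM_DD_ for hyphens.

```shell
#!/usr/bin/env bash
# Sketch: turn one mirrored Blogger page into a Jekyll post.
# html_to_post is a hypothetical helper; adjust the markers to your template.
html_to_post() {
  local src="$1" out_dir="${2:-_posts}"
  local name post_name title post_date

  name="$(basename "$src" .html)"
  # Jekyll wants YYYY-MM-DD-title, so rewrite a leading YYYY_MM_DD_ prefix.
  post_name="$(printf '%s\n' "$name" \
    | sed 's/^\([0-9]\{4\}\)_\([0-9]\{2\}\)_\([0-9]\{2\}\)_/\1-\2-\3-/')"

  # Text between <title> and </title> (first match only).
  title="$(sed -n 's:.*<title>\(.*\)</title>.*:\1:p' "$src" | head -n 1)"
  # Text between <h2 class='date-header'> and </h2>, copied verbatim.
  post_date="$(sed -n "s:.*<h2 class='date-header'>\(.*\)</h2>.*:\1:p" "$src" | head -n 1)"

  mkdir -p "$out_dir"
  {
    printf '%s\n' '---'
    printf 'layout: post\n'            # assumption: default layout name
    printf 'name: %s\n' "$name"
    printf 'title: %s\n' "$title"      # a title containing ":" needs quoting by hand
    printf 'date: %s\n' "$post_date"
    printf '%s\n' '---'
    # Body: from the post-body line up to, but not including, the post-footer line.
    awk -v start="<div class='post-body entry-content'>" \
        -v stop="<div class='post-footer'>" '
      index($0, start) { inside = 1 }
      index($0, stop)  { inside = 0 }
      inside
    ' "$src"
  } > "$out_dir/$post_name.html"
}
```

Calling html_to_post path/to/page.html _posts once per mirrored page would fill _posts. The date is copied exactly as Blogger printed it, so it may still need reformatting before Jekyll accepts it, and the body keeps all of Blogger's markup.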

If you write a script to strip out all of the garbage and keep the post + formatting, I'd love to hear about it.

Here's a post that will get you halfway to converting html to markdown.

Categorize by Blog

I'm using Fastr as a template for my blog. Fastr supports categories with vanilla Jekyll.

Here's a script I used to go through one of my blogs, which was created back when there was no title field:

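ID=0 # Fastr doesn't allow posts of the same name

cd "${BLOG}_posts"
ls | while read -r POST; do
  # Add a unique placeholder title plus a categories line (\n needs GNU sed)
  sed -i "s/^title:/title: untitled ${ID}\ncategories: ${BLOG} uncategorized/" "${POST}"
  # Make the filename unique as well
  mv "${POST}" "$(basename "${POST}" .html)_${ID}.html"
  ID=$((ID + 1))
done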

And the other, which thankfully did have titles:


cd "${BLOG}_posts"
ls | while read -r POST; do
  # Keep the existing title and add a categories line after it (\n needs GNU sed)
  sed -i "s/^\(title:.*\)/\1\ncategories: ${BLOG} uncategorized/" "${POST}"
done

And then to give them the Fastr layout:

ls | while read -r P; do
  sed -i "s/layout: post/layout: article/" "${P}"
done
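After loops like these, it can be worth checking that no post was skipped. The sketch below lists the files in a directory that still lack a given front-matter key. missing_key is a hypothetical helper name, and it assumes the key starts at the beginning of a line, the way the sed expressions above expect.

```shell
#!/usr/bin/env bash
# Sketch: print every file in a directory that has no "<key>:" line.
# missing_key is a hypothetical helper, not part of Jekyll or Fastr.
missing_key() {
  local key="$1" dir="${2:-.}"
  # grep -L prints the names of files with no matching line.
  # Its exit status differs between grep versions, so it is ignored here.
  grep -L "^${key}:" "$dir"/* || true
}
```

Running missing_key categories inside the posts directory should print nothing once every post has been categorized. Running missing_key title works the same way for titles.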

Possible Errors

If you didn't enable full RSS feeds (and click Save):

No content in RSS item '2006_03_01_archive'
Created 0 posts!

If you didn't replace "YAML" with "yaml":

/home/user/jekyll/lib/jekyll/converters/rss.rb:5:in `require': no such file to load -- YAML (LoadError)
  from /home/user/jekyll/lib/jekyll/converters/rss.rb:5:in `<module:Jekyll>'
  from /home/user/jekyll/lib/jekyll/converters/rss.rb:1:in `<top (required)>'
  from ruby:0:in `require'

If you don't have a _posts directory:

 http://coolaj86.blogspot.com/2010_05_01_archive.html#8976446356395410673 -> _posts/2010-05-06-2010_05_01_archive.html
/home/user/jekyll/lib/jekyll/converters/rss.rb:39:in `initialize': No such file or directory - _posts/2010-05-06-2010_05_01_archive.html (Errno::ENOENT)
  from /home/user/jekyll/lib/jekyll/converters/rss.rb:39:in `open'
  from /home/user/jekyll/lib/jekyll/converters/rss.rb:39:in `block in process'
  from /usr/local/lib/ruby/1.9.1/rexml/element.rb:906:in `block in each'
  from /usr/local/lib/ruby/1.9.1/rexml/xpath.rb:64:in `each'
  from /usr/local/lib/ruby/1.9.1/rexml/xpath.rb:64:in `each'
  from /usr/local/lib/ruby/1.9.1/rexml/element.rb:906:in `each'
  from /home/user/jekyll/lib/jekyll/converters/rss.rb:16:in `process'
  from -e:1:in `<main>'

By AJ ONeal
