Blogging With Dexy

September 15, 2010

Blogging about programming often means talking about code and including code snippets in your blog posts. Anyone who has tried to set up a technical blog will know that this isn’t easy. GUI text editors like to reformat your code, in particular by discarding whitespace. Then there’s the thorny issue of syntax highlighting. And blog posts have the same issues as any other type of writing when it comes to including source code: if you write the source code directly into the document, it might be wrong (and I find it to be a relatively slow process). If you write it separately and test it, then copy and paste, it’s probably correct, but then any changes you make afterwards may introduce errors. (I always find that, as I write, I iterate the example code.) Also, the snippet you want to type into your blog may not be able to run on its own.

You want to be able to write code in code files using your favourite text editor, test and run it, and then leave it there to be run again at any time or modified and re-tested, but be able to pull out just the useful and interesting sections to include in your discussion. This post is going to show you how Dexy makes this possible.

I started out using WordPress for my personal blog a few years ago, but eventually gave up and switched to a static text generator called webby which gave me all the control I wanted over my content, but didn’t have ‘bloggy’ features like comments. I ended up writing something for that which works but certainly isn’t as convenient as what you’d find in a system like WordPress. Also, the static text generation approach means you need to be writing at a machine which has your blog source and all the tools needed to build it. You can’t just log in from anywhere, anytime and fire off a short post.

So, when I was getting ready to start blogging regularly about Dexy, I started thinking about how I would do that. And, I decided to find out whether Dexy could solve this for me. I (eventually) found the documentation for WordPress’s XMLRPC-based APIs (there’s a confusing mess of them), and a shockingly short time later I was posting content directly to WordPress. I added a Pygments stylesheet to my WordPress template, and my source code was instantly highlighted.

In principle, you could use Dexy to post content to any blogging system with an API that supports creating and editing posts and uploading media content. I have implemented WordPress first since it’s widely available. If you have a favourite blogging platform which has an API and you’d like Dexy to support it, then please feel free to submit a patch or request it as a feature via Bitbucket. I have been asked already about supporting Posterous, but the Posterous API does not seem to support editing posts, just creating them, so it would be very difficult to make corrections to a post after it had been posted. For this reason I won’t implement Posterous support just yet (but shout if you really, really want it).

To configure posting to WordPress via Dexy, you need to create a file named wp-config.json which specifies ‘user’, ‘pass’ and ‘xmlrpc_url’:

{
  "user" : "author",
  "pass" : "password",
  "xmlrpc_url" : "http://blog.dexy.it:80/xmlrpc.php"
}

Make sure you have enabled the XMLRPC interface; it’s off by default. The user should have sufficient privileges to create and publish content.
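Before running Dexy it can be handy to check that the config file is well formed. The following is only a minimal sketch, written for Python 3 rather than the Python 2 used by the handler later in this post. The function name check_wp_config is made up for illustration and is not part of Dexy. It applies the same rule the handler does: the file must contain exactly the three keys listed above.

```python
import json

# The three keys the config file is expected to contain, in sorted order.
EXPECTED_KEYS = ["pass", "user", "xmlrpc_url"]


def check_wp_config(path="wp-config.json"):
    """Hypothetical helper: load the config and fail loudly on unexpected keys."""
    with open(path, "r") as f:
        conf = json.load(f)
    actual_keys = sorted(conf.keys())
    if actual_keys != EXPECTED_KEYS:
        raise ValueError(
            "expected keys %s in %s, found %s" % (EXPECTED_KEYS, path, actual_keys)
        )
    return conf
```

Calling check_wp_config() from the root of the blog directory either returns the parsed dict or raises an error naming the keys it actually found.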

The wp-config.json file should be in the root of your blog directory. I use a convention like this, where you write the blog post content in a file called index.html or index.txt and have all files relating to the blog post kept in the same folder:

.dexy
wp-config.json
2010/09/name-of-blog-post/index.html
2010/09/name-of-blog-post/post.json
2010/09/name-of-blog-post/source code files etc...
2010/09/another-blog-post/index.html
2010/09/another-blog-post/post.json
2010/09/another-blog-post/source code files etc...

Then my dexy config file looks like this:

{
  "index.txt|jinja|textile|wp" : {
    "allinputs" : true,
    "inputs" : [
      "wp-handler.py|idio",
      "wp-config-example.json|dexy",
      "../../../.dexy|dexy"
      ]
  }
}

Then I can run “dexy .” or “dexy 2010” or “dexy 2010/09/name-of-blog-post” to render and upload the content. The post.json file specifies the title of the blog post and whether it should be published or just uploaded as a draft. Once you have uploaded once to WordPress this file also stores the post id.

{"post_id": "241", "publish": true, "title": "Blogging With Dexy"}

Handler Code

Here’s the code which handles uploading content to WordPress. (You can see the most up-to-date version of this code here.)

import json
import re
import xmlrpclib

class WordPressHandler(DexyHandler):
    ALIASES = ['wp']

WordPressHandler is a subclass of DexyHandler, and we define ‘wp’ to be the alias for this handler. This means you add ‘wp’ to your list|of|filters to indicate that content should be uploaded to WordPress.

The ‘process_text’ method does all the work, starting by opening the wp-config.json file to read and then check its contents:

    def process_text(self, input_text):
        f = open("wp-config.json", "r")
        wp_conf = json.load(f)
        f.close()

        expected_keys = ["pass", "user", "xmlrpc_url"]
        actual_keys = sorted(wp_conf.keys())
        if actual_keys != expected_keys:
            exception_msg = "expected to find wp-config.json file with keys %s, instead found %s"
            raise Exception(exception_msg % (expected_keys, actual_keys))

That file contains the configuration for the blog as a whole: the login credentials and the location of the blog. Next we want the information specifically relating to this blog post, in the post.json file:

        self.artifact.load_input_artifacts()
        matches = [k for k in self.artifact.input_artifacts_dict.keys() if k.endswith("post.json|dexy")]
        k = matches[0]
        # Read config from file
        post_conf = json.loads(self.artifact.input_artifacts_dict[k]['data'])

So now the post_conf dict holds the title of this blog post and, if it has one already, the post id.

Next we connect to the server:

        # Connect to server
        s = xmlrpclib.ServerProxy(wp_conf["xmlrpc_url"], verbose=False)

Now we are going to define a nested function, a feature of Python which I love! If our blog post contains images, audio files or other media, we want to upload these to WordPress automatically. I’ll skip ahead first and show you how the upload_files_to_wp function is used:

        input_text = upload_files_to_wp('(<img src="(artifacts/.+\.(\w{2,4}))")', input_text)
        input_text = upload_files_to_wp('(<embed src="(artifacts/.+\.(\w{2,4}))")', input_text)
        input_text = upload_files_to_wp('(<audio src="(artifacts/.+\.(\w{2,4}))")', input_text)

We pass in regular expressions which should match the tags we have used to embed images and audio. Our function is going to search for these regular expressions, upload the file that is being referenced, then update the URL to be the remote WordPress URL instead of a local URL. You can see we are replacing the input_text variable each time we call this.
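To make the three capture groups concrete, here is a small standalone sketch (Python 3) that runs the image regular expression over some made-up HTML. The sample text and variable names are assumptions for illustration only. Each match is a tuple: the whole tag up to the closing quote, the local path, and the file extension. These correspond to t[0], t[1] and t[2] in the function below.

```python
import re

# Same pattern as the handler uses for images, written as a raw string.
IMG_REGEXP = r'(<img src="(artifacts/.+\.(\w{2,4}))")'

# Made-up sample content with one image tag per line.
sample_html = (
    '<p><img src="artifacts/plot.png" /></p>\n'
    '<p><img src="artifacts/photo.jpeg" /></p>\n'
)

matches = re.findall(IMG_REGEXP, sample_html)
for whole_tag, local_path, extension in matches:
    print(whole_tag, local_path, extension)
```

One thing to be aware of: because `.+` is greedy, two matching tags on the same line would be captured as a single match, so this pattern works best with one tag per line.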

Here’s the function. It takes the regular expression and the blog post text, and the first thing it does is set up a dict to cache uploaded files. We will use this so that if the same file is referenced more than once, we only upload it once:

        def upload_files_to_wp(regexp, input_text):
            url_cache = {}

Then we search for the regular expression and loop over every occurrence in the text. If we can find the file in the cache, then we’re done:

            for t in re.findall(regexp, input_text):
                if t[1] in url_cache:
                    url = url_cache[t[1]]
                    print "using cached url for", t[1], url
                else:

If not, then we start the upload process. We read the contents of the media file, and convert them to a binary form which the WordPress API is expecting:

                    f = open(t[1], 'rb')
                    image_base_64 = xmlrpclib.Binary(f.read())
                    f.close()

Then we set up a dict so that we can figure out the MIME type based on the file extension:

                    mime_types = {
                        'png' : 'image/png',
                        'jpg' : 'image/jpeg',
                        'jpeg' : 'image/jpeg',
                        'aiff' : 'audio/x-aiff',
                        'wav' : 'audio/x-wav',
                        'wave' : 'audio/x-wav',
                        'mp3' : 'audio/mpeg'
                    }

Now we set up a dict which has the elements we will upload: the file’s name, MIME type and the actual file contents:

                    upload_file = {
                        'name' : t[1].split("/")[1],
                        'type' : mime_types[t[2]], # *should* raise error if not on whitelist
                        'bits' : image_base_64,
                        'overwrite' : 'true'
                    }
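For readers on Python 3, where xmlrpclib has been renamed to xmlrpc.client, the three steps above (read the file, look up the MIME type, build the dict) could be gathered into one helper. This is only a sketch: build_upload_struct is a hypothetical name, and it takes the extension from the path with os.path rather than from the regular expression match as the handler does.

```python
import os
import xmlrpc.client

# Whitelist of extensions we are willing to upload, as in the handler.
MIME_TYPES = {
    "png": "image/png",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "aiff": "audio/x-aiff",
    "wav": "audio/x-wav",
    "wave": "audio/x-wav",
    "mp3": "audio/mpeg",
}


def build_upload_struct(local_path):
    """Hypothetical helper: build the dict passed to wp.uploadFile."""
    extension = os.path.splitext(local_path)[1].lstrip(".")
    # A plain dict lookup raises KeyError for anything not on the whitelist.
    mime_type = MIME_TYPES[extension]
    with open(local_path, "rb") as f:
        bits = xmlrpc.client.Binary(f.read())
    return {
        "name": os.path.basename(local_path),
        "type": mime_type,
        "bits": bits,
        "overwrite": "true",
    }
```

Because the MIME lookup happens before the file is opened, an unsupported extension fails immediately without touching the disk.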

Finally, we upload. WordPress will return the new URL of the file. We note this and store it in our cache in case we use this same file again later:

                    upload_result = s.wp.uploadFile(0, wp_conf["user"], wp_conf["pass"], upload_file)
                    url = upload_result['url']
                    url_cache[t[1]] = url
                    print "uploaded", t[1], "to", url

Almost done. Now we need to replace the local URL in our document with the new URL which WordPress gave us:

                replace_string = t[0].replace(t[1], url)
                input_text = input_text.replace(t[0], replace_string)
            return input_text
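If you want to see the cache-and-replace logic in one piece without a WordPress server, the sketch below (Python 3) is one way to try it out. It is not the handler’s code. The name replace_local_media, the upload callable passed in as an argument, and the fake_upload stand-in with its example.com URL are all assumptions made so the example can run on its own.

```python
import re


def replace_local_media(regexp, input_text, upload):
    """Upload each referenced file once and point its tag at the returned URL.

    `upload` is a hypothetical callable: it takes a local path, returns a URL.
    """
    url_cache = {}
    for whole_tag, local_path, _extension in re.findall(regexp, input_text):
        if local_path in url_cache:
            url = url_cache[local_path]
        else:
            url = upload(local_path)
            url_cache[local_path] = url
        new_tag = whole_tag.replace(local_path, url)
        input_text = input_text.replace(whole_tag, new_tag)
    return input_text


uploaded = []


def fake_upload(local_path):
    """Stand-in for the XML-RPC call: records the path and invents a URL."""
    uploaded.append(local_path)
    return "http://example.com/wp-content/uploads/" + local_path.split("/")[1]


# The same image is referenced twice, so it should be uploaded only once.
html = (
    '<img src="artifacts/plot.png" />\n'
    '<img src="artifacts/plot.png" />\n'
)
result = replace_local_media(
    r'(<img src="(artifacts/.+\.(\w{2,4}))")', html, fake_upload
)
print(result)
print(uploaded)
```

Passing the uploader in as an argument is purely a convenience for experimenting. The handler instead relies on the nested function being able to see the server connection and config from the enclosing method.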

Once again, here’s how we use this:

        input_text = upload_files_to_wp('(<img src="(artifacts/.+\.(\w{2,4}))")', input_text)
        input_text = upload_files_to_wp('(<embed src="(artifacts/.+\.(\w{2,4}))")', input_text)
        input_text = upload_files_to_wp('(<audio src="(artifacts/.+\.(\w{2,4}))")', input_text)

Now that our media has been uploaded, we’re finally ready to upload our actual blog post:

        content = { 'title' : post_conf['title'], 'description' : input_text}
        publish = post_conf['publish']
        if post_conf.has_key('post_id'):
            post_id = post_conf['post_id']
            s.metaWeblog.editPost(post_id, wp_conf["user"], wp_conf["pass"], content, publish)
        else:
            post_id = s.metaWeblog.newPost(0, wp_conf["user"], wp_conf["pass"], content, publish)
            # Save post_id in JSON file for next revision 
            post_conf['post_id'] = post_id
            json_file = re.sub('\|dexy$', "", k)
            f = open(json_file, 'w')
            json.dump(post_conf, f)
            f.close()
        return "post %s updated" % post_id

If this is the first time we’ve posted, then we will use the newPost() API call. This returns a post_id which we save in our post.json config file. If there’s already a post_id in there, then we instead call editPost() which changes the contents of the post.

A nice feature of the API is that you can use it to create a draft of your blog post so you can preview it. You do this by passing ‘publish’ : false. When you are ready to publish, update your config to ‘publish’ : true and run Dexy again.
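The draft-then-publish cycle can be tried out without a server using the sketch below (Python 3). The helper push_post and the FakeServer and FakeMetaWeblog classes are invented for this illustration. They only mimic the argument order of the metaWeblog.newPost and metaWeblog.editPost calls used above, and the returned id "241" is simply the one from the post.json example.

```python
def push_post(server, wp_conf, post_conf, body):
    """Hypothetical helper: create or update a post and return its id.

    `server` is anything exposing metaWeblog.newPost and metaWeblog.editPost,
    for example an XML-RPC ServerProxy.
    """
    content = {"title": post_conf["title"], "description": body}
    publish = post_conf["publish"]
    if "post_id" in post_conf:
        server.metaWeblog.editPost(
            post_conf["post_id"], wp_conf["user"], wp_conf["pass"], content, publish
        )
    else:
        post_conf["post_id"] = server.metaWeblog.newPost(
            0, wp_conf["user"], wp_conf["pass"], content, publish
        )
    return post_conf["post_id"]


class FakeMetaWeblog:
    """In-memory stand-in so the sketch runs without a WordPress server."""

    def __init__(self):
        self.calls = []

    def newPost(self, blog_id, user, password, content, publish):
        self.calls.append(("newPost", publish))
        return "241"

    def editPost(self, post_id, user, password, content, publish):
        self.calls.append(("editPost", publish))
        return True


class FakeServer:
    def __init__(self):
        self.metaWeblog = FakeMetaWeblog()


server = FakeServer()
wp_conf = {"user": "author", "pass": "password"}
post_conf = {"title": "Blogging With Dexy", "publish": False}

# First run: no post_id yet, so a draft is created.
first_id = push_post(server, wp_conf, post_conf, "<p>draft</p>")
# Second run: flip the flag, and the existing post is edited and published.
post_conf["publish"] = True
second_id = push_post(server, wp_conf, post_conf, "<p>final</p>")
print(server.metaWeblog.calls)
```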

Blogging Workflow

I mentioned at the start that a disadvantage of using a static text generator is that you are ‘tied’ somewhat to a machine with the correct software installed. To write and publish a blog post using Dexy, you will need to be on a machine with Dexy installed, along with whatever other dependencies you need (for example, R if you are blogging about R and want to run examples). However, by using WordPress you do not need to be at such a machine to moderate comments, write a new non-code blog post, or even to make changes to a blog post created using Dexy (although if you did make a manual edit to a generated blog post, then you should update your sources as soon as possible so that they remain in sync with the published version). This approach gives you the best of both worlds.

The source code for this blog is available from a Mercurial repository hosted on Bitbucket. You can view the raw textile source for this blog post, and see how the various source files are incorporated using Jinja tags. If you decide to try blogging with Dexy, then I would recommend that you do something similar so that it’s easy for people to recreate your examples and download your source code files.

2 Comments
  1. If you are a WordPress.com user, you’ll need to pay extra to enable custom CSS in order to use Pygments or a similar external syntax highlighter. If you’d rather not and you don’t mind being restricted to a rather smaller subset of languages, you can use the sourcecode plugin and wrap your code blocks in [sourcecode language="R"] blocks as per this: http://en.support.wordpress.com/code/posting-source-code/. I do think it would be nice if WordPress offered a free plugin with the Pygmentize stylesheets. I’m working on support for the Tumblr API currently, and this offers the option of custom CSS for free.

  2. An alternative to getting dexy to publish (and update) the blog could be to get the blog to pull the content from webby. I do this with two WordPress blogs where one live drags content from the other (I had to augment the WordPress XML-RPC API a little for this). I do this server side with a WordPress plugin, but this could be managed client side using JavaScript syndication, which I would guess would then work with Posterous so long as it isn’t too heavy with HTML/JS sanitisation.