Batch-remove empty lines at the end of many Confluence pages

In a customer project we’ve decided to collaboratively write a bigger bunch of documentation in Atlassians Confluence and export that with Scroll Office, a third-party Confluence plugin, into Word.

That worked fine so far, but soon we figured that we’ve been kind of sloppy with empty lines at the end of each page, which were obviously taken over into the final document. So instead of going over each and every page and remove the empty lines there, I thought it might be easier to directly do this on the database, in our case MySQL.

The query was quickly developed, but then I realized that MySQL had no PREG_REPLACE function built-in, so I needed to install a UDF, a user-defined function first. Luckily, this UDF worked out of the box and so the query could be finalized:

UPDATE BODYCONTENT 
JOIN CONTENT ON CONTENT.CONTENTID=BODYCONTENT.CONTENTID 
   AND CONTENTTYPE LIKE "PAGE" AND PREVVER IS NULL 
SET BODY=PREG_REPLACE("/(<p>&nbsp;<.p>)+$/", "", BODY) 
WHERE BODY LIKE "%<p>&nbsp;</p>";

This query updates all current pages (no old versions) from all spaces that end with at least one empty line <p>&nbsp;</p> – this is Confluence’s internal markup for that – and removes all of these empty lines from all matches pages.

This was tested with MySQL 5.5.35, lib_mysqludf_preg 1.2-rc2 and Confluence 5.4.2.

I don’t need to mention that it is – of course – highly recommended that you backup your database before you execute this query on your server, right?