Search and replace multiple lines across many files

sed is usually my favourite tool to search and replace things from the command line, but sometimes Perl’s regexes are far more convenient to use. Recently I found another reason why Perl’s -pi -e is superior to plain sed: when you want to change multiple lines in a document!

Imagine you have hundreds of source code files where somebody once had the great idea to add a ___version___ property into each class:

public class Foo
{
    private static final String ___version___ = "$Version:$";
    
    // other stuff
}

With Perl the line in question is easy to remove:

$ # -p prints every (possibly modified) line, -i edits each file in place
$ find . -name "*.java" -exec perl -pi -e \
      's/\s*private.+___version___.+\n//' {} +

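But there is one problem: with -p, Perl reads and processes the file one line at a time, so the regex never sees more than a single line and cannot touch the whitespace-only line that follows the removed one. That leaves an unwanted empty line behind: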

public class Foo
{
    
    // other stuff
}

Then I stumbled upon this article, and the solution is to set a special input record separator that makes Perl slurp in each file as a whole:

$ # -0777 reads each file as one record, so \n can match real line breaks
$ find . -name "*.java" -exec perl -0777 -pi -e \
      's/\s*private.+___version___.+\n(\s*\n)*/\n/g' {} +

And voilà, we get what we want:

public class Foo
{
    // other stuff
}

Digging a little deeper into what -0777 actually means leads us to perlrun(1):

The special value 00 will cause Perl to slurp files in paragraph mode. The value 0777 will cause Perl to slurp files whole because there is no legal byte with that value.

Another day saved – thanks to Perl!

And while we’re at it, have a look at Rakudo Star, the best Perl 6 compiler, which was released just recently. Perl 6 is in my humble opinion one of the most well-designed languages I’ve come across so far, so if you find some time, go over and read the last Christmas special; it’s really worth it!