Faster Regexes: What to do when text matching is your bottleneck

By Aaron Crane (arc) from Edinburgh.pm, London.pm
Date: Friday 16 November 2007, 17:15
Duration: 20 minutes
Audience: Intermediate
Language: English
Tags: optimisation regex


We all know how good Perl is at munging text. But what do you do when your Perl text-munging code isn't fast enough for what you're trying to do?

We needed to extract useful information from tens of gigabytes of web-server log files. Our Perl code was simple and obvious, but not fast enough for our purposes. When profiling revealed a frequently executed regex as the bottleneck, we tried several things to make it faster.

This talk looks at what we did to speed up our regex-heavy code (by a factor of well over 100 in some places), identifying a few general-purpose optimisation techniques on the way.