Ravenous Regex
July 30th, 2006
I like ViM. No… I love ViM. No… my fingers love ViM, and my brain doesn’t know it.
One thing I really like about ViM is that after 10 years, I still learn new things. For example, today I learned how to construct a non-greedy regex.
For one of my consulting gigs, I’m migrating a decent-sized website into the Rails framework. In doing so, I’m also converting (to the best of my ability and time) the site to use as much CSS as possible. Thus, gone are things like <font…> tags. The problem is trying to glob out the attributes of the font tag. How do you match <font size="12px" color="#ebeb1d">? One might try “:%s/<font.*>//g”. The problem is that the match doesn’t stop at the intended >… rather, the .* is greedy: it skips the intended > and runs on to the last > on the line. So how do you disable the “greediness”?
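The trick is to replace the “.*” in front of the intended > with “.\{-0,\}” (that’s a zero), which matches as few characters as possible. The shorter “.\{-}” means the same thing.
Here’s the actual ViM example that will match <font…>: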
:%s/<font.\{-0,\}>//g
I agree that the keystrokes are a bit insane, but (a) it works and (b) it works. Enjoy!
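If you want to watch the difference outside of ViM, here is a minimal Python sketch of the same two substitutions. It assumes Python's `re` module, where the lazy form is spelled `.*?` rather than `.\{-0,\}`; the sample line is made up for illustration and is not from the site being migrated.

```python
import re

# Made-up sample line: a later ">" gives the greedy match somewhere to run to.
line = '<font size="12px" color="#ebeb1d">Hello</font> world'

# Greedy: ".*" grabs as much as it can, so the match ends at the LAST ">".
greedy = re.sub(r"<font.*>", "", line)

# Lazy: ".*?" grabs as little as it can, so the match ends at the FIRST ">".
lazy = re.sub(r"<font.*?>", "", line)

print(repr(greedy))  # the text between the tags is gone too
print(repr(lazy))    # only the opening font tag is removed
```

The greedy version eats "Hello" along with the closing tag, which is exactly the surprise described above; the lazy version leaves the text alone.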






August 2nd, 2006 at 9:36 pm
In most other regular expression grammars this can be done much more simply by adding a non-greedy modifier (surprised VIM doesn’t have this). For example, in Java you can just do this:
.*?
This will be non-greedy. Wonder why VIM didn’t add something like that as well. Perhaps you should request an enhancement. =)
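For what it's worth, the trailing `?` works on the other quantifiers too, at least in engines that follow the Perl-style syntax. Here is a rough sketch in Python (assumed here only because its `re` module uses the same `.*?` spelling as the Java example); the sample string is invented for illustration.

```python
import re

# Invented sample text with two tags on one line.
text = "<b>one</b><b>two</b>"

# Greedy star: runs to the last ">" in the string.
greedy_star = re.match(r"<.*>", text).group()

# Appending "?" makes a quantifier lazy, so each of these stops at the first ">".
lazy_star = re.match(r"<.*?>", text).group()
lazy_plus = re.match(r"<.+?>", text).group()

# "{0,}?" is the spelling closest to ViM's \{-0,\}: zero or more, as few as possible.
lazy_range = re.match(r"<.{0,}?>", text).group()

print(greedy_star, lazy_star, lazy_plus, lazy_range)
```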
August 17th, 2006 at 2:21 pm
I prefer to use this construction:
%s#]*>##g
I find it cleaner to read, easier to add capturing parens, and more portable across the regex engines I use.
Does the non-greedy (lazy) modifier have advantages I’m not seeing?
August 17th, 2006 at 2:28 pm
Uh-oh. Escaping probs in the previous post. Example was supposed to be:
:%s#<font[^>]*>##g
(assuming I escaped correctly this time)
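As a rough comparison of the two approaches, here is a small Python sketch (Python's `re` module is an assumption, chosen only to make the behaviour easy to run; both sample strings are invented). For the font tag the negated class and the lazy quantifier give the same result. One place they can differ is when the terminator is longer than one character, such as the `-->` that ends an HTML comment.

```python
import re

# Invented sample line containing the font tag from the post.
line = '<font size="12px" color="#ebeb1d">Hello</font> world'

# "[^>]*" can never step over a ">", so being greedy does no harm here.
negated = re.sub(r"<font[^>]*>", "", line)
lazy = re.sub(r"<font.*?>", "", line)

# Invented sample with a multi-character terminator: the first comment has a ">" inside it.
comments = "a <!-- x > y --> b <!-- z --> c"

# Lazy: stops at the first "-->" after each "<!--", so both comments go.
lazy_comments = re.sub(r"<!--.*?-->", "", comments)

# Negated class: cannot cross the ">" inside the first comment, so that one is missed.
negated_comments = re.sub(r"<!--[^>]*-->", "", comments)

print(negated == lazy)
print(repr(lazy_comments))
print(repr(negated_comments))
```

So for single-character terminators the negated class is arguably the tidier choice, while the lazy form seems to earn its keep once the closing delimiter is a longer string.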
November 2nd, 2006 at 2:57 pm
This is not a problem with vim’s regexp. This is intentional and is common regexp design. I highly recommend checking out the Mastering Regular Expressions book for details and for many other cool regexp tricks. If you’re going to be reformatting a lot of HTML, this book can probably help.
http://www.amazon.com/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124/sr=8-1/qid=1162479023/ref=pd_bbs_sr_1/104-7862648-1582341?ie=UTF8&s=books