Zero-width negative lookbehind assertion

by Erik Lane 28. September 2005 01:12

This grouping construct continues the match only if the sub-expression does not match at the position immediately to the left.

I learned how to use this yesterday.  I had some ASP files that had two tables in them.  One had an include statement just above it at the top of the file, and the other had all of the content that I needed.

<!--#include virtual="includes/file.asp"-->

<table width="100%" border="0" cellspacing="0" cellpadding="0">

  <tr>

    <td> I don't want this table </td>

  </tr>

</table>

<table width="100%" border="0" cellspacing="0" cellpadding="0">

  <tr>

    <td> I want this table </td>

  </tr>

</table>

Every regex I tried that included some variation of "<table>...</table>" would return one match that included both tables.  Grrr.
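To show why that happens, here is a small sketch in TypeScript. The post doesn't say which tool the expressions were run in, so treat this as an illustration and not a reproduction. The pattern is a hypothetical "variation" of the kind described above, and the sample string and variable names are made up. The `*` quantifier is greedy, so it runs all the way to the last "</table>" it can find.

```typescript
// Hypothetical, trimmed-down version of the two-table file.
const twoTables: string =
  "<table>\n<tr><td> I don't want this table </td></tr>\n</table>\n" +
  "<table>\n<tr><td> I want this table </td></tr>\n</table>";

// Hypothetical first attempt. [\s\S]* is greedy, so it consumes
// everything up to the LAST </table> in the input.
const greedyTable: RegExp = new RegExp(String.raw`<table[\s\S]*</table>`, "g");

// With the "g" flag, match() returns every match as an array of strings.
const greedyMatches: string[] = twoTables.match(greedyTable) ?? [];

console.log(greedyMatches.length);           // 1
console.log(greedyMatches[0] === twoTables); // true: one match covering both tables
```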

In comes the negative lookbehind assertion.  I mentioned that at the top of the file, before the first table, was an include statement.  All include statements end with "-->".  A negative lookbehind assertion means that text will only be returned as a match if it is not preceded by a specific value.

Here is my expression:  (?<!-->\s*)(<table[\x0A-\x7E]*>[\x0A-\x7E]*</table>)

Here I'm saying I want to match any table tag, its contents and the closing table tag... as long as it's NOT preceded by "-->" followed by optional white space.  The negative lookbehind syntax I'm using is "(?<!-->\s*)".  Yes, it includes the parentheses (but not the quotes).
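Here is a minimal sketch of that expression run against a sample like the one above. It is written in TypeScript and assumes a runtime with ES2018 lookbehind support, which may differ from the engine I originally used. The `sample` string and the variable names are made up for illustration.

```typescript
// Hypothetical sample mirroring the ASP file described in the post.
const sample: string = [
  '<!--#include virtual="includes/file.asp"-->',
  '<table width="100%" border="0" cellspacing="0" cellpadding="0">',
  "  <tr>",
  "    <td> I don't want this table </td>",
  "  </tr>",
  "</table>",
  '<table width="100%" border="0" cellspacing="0" cellpadding="0">',
  "  <tr>",
  "    <td> I want this table </td>",
  "  </tr>",
  "</table>",
].join("\n");

// The expression from the post, pasted verbatim. String.raw keeps the
// backslashes intact so nothing needs extra escaping.
const notAfterInclude: RegExp = new RegExp(
  String.raw`(?<!-->\s*)(<table[\x0A-\x7E]*>[\x0A-\x7E]*</table>)`
);

// Group 1 holds the table. The first <table is rejected because
// "-->" plus white space sits directly to its left.
const wanted: string = sample.match(notAfterInclude)?.[1] ?? "";

console.log(wanted.includes("I want this table"));       // true
console.log(wanted.includes("I don't want this table")); // false
```

Two caveats. Not every regex flavor accepts a variable-length lookbehind like "\s*"; Python's built-in re module, for example, only allows fixed-width ones. Also, the body of the expression is still greedy, so a file with more tables after the one you want would get them all in one match.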

You can also do the opposite and only include matches that are preceded by a specific value.  Those are called positive lookbehind assertions and that syntax would be "(?<=-->\s*)".
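A quick sketch of the positive version, again in TypeScript and again assuming ES2018 lookbehind support. The sample is hypothetical and deliberately laid out differently from my ASP files: the include sits above the second table, so the table that follows "-->" is the one returned.

```typescript
// Hypothetical layout: the include statement precedes the SECOND table.
const reversedSample: string = [
  '<table id="a"><tr><td> no include above me </td></tr></table>',
  '<!--#include virtual="includes/file.asp"-->',
  '<table id="b"><tr><td> include above me </td></tr></table>',
].join("\n");

// Same table pattern as before, but with (?<= ... ) instead of (?<! ... ).
const afterInclude: RegExp = new RegExp(
  String.raw`(?<=-->\s*)(<table[\x0A-\x7E]*>[\x0A-\x7E]*</table>)`
);

// Only a <table that has "-->" plus optional white space to its left qualifies.
const afterIncludeMatch: string = reversedSample.match(afterInclude)?.[1] ?? "";

console.log(afterIncludeMatch.startsWith('<table id="b">')); // true
console.log(afterIncludeMatch.includes('id="a"'));           // false
```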
