Hpricot is a very flexible HTML parser, based on Tanaka Akira's HTree and John Resig's JQuery, but with the scanner recoded in C (using Ragel for scanning).