ToddBlanchard / HTMCSSValidatingParser
About HTMCSSValidatingParser
This is an HTML and CSS parser and DOM that copes well with malformed HTML and broken CSS. I wrote it to validate web pages, and it is the underlying technology behind http://www.badpage.info. The tag nesting and attribute rules are determined by interpreting the DTDs published by the W3C. Hopefully this will make it fairly future-proof. The CSS parser understands most of CSS 2 and some of CSS 3, and the CSS selectors can tell whether they match a given DOM node. There is no visual rendering and no layout calculation.
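To illustrate what "a selector can tell if it matches a DOM node" means, here is a minimal sketch in Python. It is not this package's API (the package itself is Smalltalk); the `Node` class and the `matches` and `matches_compound` functions are hypothetical names invented for the example, and only tag, `#id`, `.class`, and the descendant combinator are handled.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A hypothetical, minimal DOM element: tag name, attributes, parent link."""
    tag: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Node"] = None

    def classes(self):
        # The class attribute is a whitespace-separated list of names.
        return set(self.attrs.get("class", "").split())


# One compound selector: an optional tag (or *), then any number of #id/.class parts.
_COMPOUND = re.compile(r"^(?P<tag>[a-zA-Z][\w-]*|\*)?(?P<rest>(?:[#.][\w-]+)*)$")
_PART = re.compile(r"([#.])([\w-]+)")


def matches_compound(compound: str, node: Node) -> bool:
    """Return True if a single compound selector (e.g. 'div#main.wide') fits node."""
    m = _COMPOUND.match(compound)
    if m is None:
        raise ValueError(f"unsupported selector: {compound!r}")
    tag = m.group("tag")
    if tag not in (None, "*") and tag.lower() != node.tag.lower():
        return False
    for kind, name in _PART.findall(m.group("rest")):
        if kind == "#" and node.attrs.get("id") != name:
            return False
        if kind == "." and name not in node.classes():
            return False
    return True


def matches(selector: str, node: Node) -> bool:
    """Match a descendant-combinator selector such as 'div.content p.note'."""
    parts = selector.split()
    if not parts or not matches_compound(parts[-1], node):
        return False
    # Walk up the ancestors, matching the remaining compounds right to left.
    ancestor = node.parent
    for compound in reversed(parts[:-1]):
        while ancestor is not None and not matches_compound(compound, ancestor):
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True


# Usage: build a three-element tree by hand and ask whether selectors match.
body = Node("body")
div = Node("div", {"id": "main", "class": "content wide"}, body)
p = Node("p", {"class": "note"}, div)
print(matches("div.content p.note", p))
print(matches("span p", p))
```

Matching right to left, starting from the node itself and climbing through its ancestors, means the question "does this selector match this node?" can be answered without searching the whole document.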
I hereby license it free for all uses under the standard MIT license.
I also find it quite useful for scraping web pages, and Sebastian Sastre has used it in Sanitize.
