Is there any Python library that allows me to parse an HTML document similar to what jQuery does?
i.e. I'd like to be able to use CSS selector syntax to grab an arbitrary set of nodes from the document, read their content/attributes, etc.
The only Python HTML parsing lib I've used before was BeautifulSoup, and even though it's fine I keep thinking it would be faster to do my parsing if I had jQuery syntax available. :D
The lxml library (http://lxml.de/) supports CSS selectors (http://lxml.de/cssselect.html).
via stackoverflow.com
If you are fluent with BeautifulSoup, you could just add soupselect to your libs.
Soupselect is a CSS selector extension for BeautifulSoup.
Usage:
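A minimal sketch of the idea, assuming the newer BeautifulSoup 4 package (`bs4`) rather than soupselect itself: its built-in `select()` method accepts CSS selectors directly, so this is a stand-in and not soupselect's own API. The sample HTML and variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, only for illustration.
HTML = """
<html><body>
  <ul class="menu">
    <li><a href="/apples">Apples</a></li>
    <li><a href="/pears">Pears</a></li>
  </ul>
  <p class="note">Updated weekly.</p>
</body></html>
"""

# "html.parser" is the parser from the standard library, so no extra
# parser package is assumed here.
soup = BeautifulSoup(HTML, "html.parser")

# select() returns a list of every tag matching the CSS selector.
links = soup.select("ul.menu li a")
hrefs = [a["href"] for a in links]
labels = [a.get_text() for a in links]

# select_one() returns only the first match (or None if nothing matches).
note = soup.select_one("p.note").get_text()

print(hrefs, labels, note)
```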
Consider PyQuery:
http://packages.python.org/pyquery/