HTML Parser

An HTML parser reads raw HTML text and converts it into a structured tree of nodes, most commonly the Document Object Model (DOM), that programs can traverse and query.
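As a rough illustration of the text-to-tree step, the sketch below builds a small node tree with Python's standard-library `html.parser` module. The `Node`, `TreeBuilder`, `parse`, and `outline` names are hypothetical helpers written for this example, and the short `VOID_TAGS` set is a simplifying assumption. Real parsers also apply the HTML error-recovery rules that this sketch leaves out.

```python
from html.parser import HTMLParser

# Assumption: a small subset of the elements that never have a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One element in the tree (hypothetical helper for this sketch)."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Turns the parser's start/end/data events into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented tag names, one line per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


tree = parse("<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>")
print("\n".join(outline(tree)))
```

Each node keeps a reference to its parent and children, which is what makes traversal and querying possible once parsing is done.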

Scrapers use parsers such as BeautifulSoup (Python), Cheerio (Node.js), or the browser's built-in DOM APIs to locate and extract specific data elements.
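A minimal sketch of the locate-and-extract step follows. To stay self-contained it uses Python's standard-library `html.parser` rather than the libraries named above, which offer higher-level query APIs. The `LinkExtractor` class and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, link text) pairs for every <a> that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical sample page fragment.
sample = (
    '<ul>'
    '<li><a href="/a">First</a></li>'
    '<li><a href="/b">Second <em>item</em></a></li>'
    '<li><a>no href</a></li>'
    '</ul>'
)

extractor = LinkExtractor()
extractor.feed(sample)
extractor.close()
print(extractor.links)
```

Anchors without an `href` are skipped, and text inside nested tags such as `<em>` is folded into the link text.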

Related Terms

Web Scraping

Web scraping is the automated extraction of structured data from websites.

XPath

XPath is a query language for selecting nodes from an XML or HTML document tree.
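The sketch below shows what an XPath query can look like in practice, using Python's standard-library `xml.etree.ElementTree`. That module supports only a limited subset of XPath and requires well-formed XML, so the sample document here is assumed to be XHTML-style markup. The element names and values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample document.
doc = """<html><body>
<div class="product"><h2>Widget</h2><span class="price">9.99</span></div>
<div class="product"><h2>Gadget</h2><span class="price">19.50</span></div>
<div class="ad"><span class="price">0.00</span></div>
</body></html>"""

root = ET.fromstring(doc)

# ".//div" searches all descendants; "[@class='product']" filters by attribute;
# "/h2" then steps to the direct h2 children of each matching div.
names = [h2.text for h2 in root.findall(".//div[@class='product']/h2")]

prices = [
    float(span.text)
    for span in root.findall(".//div[@class='product']/span[@class='price']")
]

print(names)
print(prices)
```

The price inside the `ad` block is excluded because the path requires a parent `div` whose `class` is exactly `product`.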

CSS Selector

A CSS selector is a pattern used to select HTML elements by their tag, class, ID, attribute, or structural position.
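To make the matching idea concrete, here is a deliberately small sketch that handles only compound selectors made of a tag, an ID, and classes (for example `span.price` or `div#main.card`). It does not cover combinators, attribute selectors, or pseudo-classes, and it is not how any particular library implements selectors. The `parse_simple_selector`, `matches`, and `select` functions are hypothetical helpers, and the sample markup is assumed to be well-formed so the standard-library `xml.etree.ElementTree` can parse it.

```python
import re
import xml.etree.ElementTree as ET


def parse_simple_selector(selector):
    """Split a compound selector such as 'div#main.card' into its parts."""
    tag = re.match(r"[A-Za-z][\w-]*", selector)
    ids = re.findall(r"#([\w-]+)", selector)
    return {
        "tag": tag.group(0) if tag else None,
        "id": ids[0] if ids else None,
        "classes": set(re.findall(r"\.([\w-]+)", selector)),
    }


def matches(element, parts):
    """True if the element satisfies every part of the selector."""
    if parts["tag"] and element.tag != parts["tag"]:
        return False
    if parts["id"] and element.get("id") != parts["id"]:
        return False
    element_classes = set(element.get("class", "").split())
    return parts["classes"] <= element_classes  # all required classes present


def select(root, selector):
    """Return every element in the tree (root included) that matches."""
    parts = parse_simple_selector(selector)
    return [el for el in root.iter() if matches(el, parts)]


# Hypothetical, well-formed sample document.
root = ET.fromstring(
    '<body><div id="main" class="card featured">'
    '<span class="price">9.99</span>'
    '<span class="price sale">7.49</span>'
    '<span class="note">new</span>'
    '</div></body>'
)

print([el.text for el in select(root, "span.price")])
print([el.text for el in select(root, ".sale")])
print([el.get("id") for el in select(root, "div#main.card")])
```

A class selector matches when the named class appears anywhere in the element's space-separated `class` attribute, which is why `span.price` also picks up the element with `class="price sale"`.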