XPath, short for XML Path Language, is a query language for navigating XML documents. Version 1.0 became a W3C Recommendation in 1999, providing a standard way to address the parts of an XML document. The language is built around the concept of a "path", which selects nodes based on their position in the document tree.
One of the key features of XPath is its path notation, which lets you select elements by their location in the document tree. For example, in an HTML document, the path "html/body/p", evaluated from the top of the document, selects all p elements that are direct children of the body element, which is in turn a direct child of the html element.
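To make the path notation concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports a limited subset of XPath. The sample document is made up for illustration. ElementTree evaluates paths relative to the element you call it on, so starting from the html root the path is written as `body/p`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = """
<html>
  <body>
    <p>First paragraph</p>
    <div><p>Nested paragraph</p></div>
    <p>Second paragraph</p>
  </body>
</html>
"""

root = ET.fromstring(doc.strip())  # root is the <html> element

# "body/p" walks one level at a time: <body> under <html>, then <p> under <body>.
# The <p> nested inside <div> is not a direct child of <body>, so it is skipped.
direct_paragraphs = [p.text for p in root.findall("body/p")]
print(direct_paragraphs)  # ['First paragraph', 'Second paragraph']
```

Each step in the path descends exactly one level, which is why the paragraph inside the div is left out.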
A common XPath pattern for locating elements is fairly straightforward:
//tagname[@attribute='value']
Here, 'tagname' is the type of HTML element you are looking for (e.g. div, a, p), 'attribute' is the attribute of that element used for the search (e.g. class), and 'value' is the specific value you want to match. The leading // means the element can appear at any depth in the document.
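The pattern above can be tried out with a short sketch in Python's standard-library `xml.etree.ElementTree`. The helper name `find_by_attribute` and the sample markup are hypothetical, and ElementTree requires a leading `.` before `//` because it searches relative to the element it is given.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = """
<html>
  <body>
    <div class="intro"><p class="highlight">Hello, world!</p></div>
    <p class="note">Not selected</p>
    <p class="highlight">Also selected</p>
  </body>
</html>
"""

root = ET.fromstring(doc.strip())


def find_by_attribute(element, tagname, attribute, value):
    """Return descendants matching //tagname[@attribute='value'] (illustrative helper)."""
    # ".//" searches all descendants of `element`, at any depth.
    return element.findall(f".//{tagname}[@{attribute}='{value}']")


matches = find_by_attribute(root, "p", "class", "highlight")
texts = [el.text for el in matches]
print(texts)  # ['Hello, world!', 'Also selected']
```

Note that `[@class='highlight']` compares the whole attribute string, so an element with `class="highlight big"` would not match this expression.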
CSS selectors are the patterns that CSS (short for Cascading Style Sheets) uses to select elements based on properties such as tag name, class, id, and other attributes. They are generally easier to read and understand than XPath, but they are more limited in how they can navigate the document; for example, they cannot select text nodes directly. CSS selectors are mostly used for styling and layout, but you can also use them to extract information from a webpage:
<html> <body> <p class="highlight">Hello, world!</p> </body> </html>
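CSS selectors match elements rather than text, so to get the text "Hello, world!" we first select the element in JavaScript and then read its text:
// Select the first <p> element that has the class "highlight"
let p_tag = document.querySelector("p.highlight")
// Read the element's rendered text
let p_text = p_tag.innerText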
XPath, on the other hand, was designed specifically as a query language for XML documents and comes with a range of built-in functions, such as count(), contains(), and string-length(), that can perform calculations and extract specific information from elements. It can also address text directly: with the text() node test, you can select the text content of an element:
<html> <body> <p>Hello, world!</p> </body> </html>
To select the text "Hello, world!" with text() in XPath, the expression would be:
/html/body/p/text()
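For comparison, here is a rough equivalent in Python's standard-library `xml.etree.ElementTree`. This is only a sketch under one important assumption: ElementTree's XPath subset does not include the text() step, so the code selects the p element with a path and then reads its text, instead of evaluating `/html/body/p/text()` itself. A full XPath engine would accept the expression as written.

```python
import xml.etree.ElementTree as ET

doc = "<html> <body> <p>Hello, world!</p> </body> </html>"
root = ET.fromstring(doc)  # root is the <html> element

# ElementTree has no text() step, so we locate <p> with the path "body/p"
# (relative to <html>) and let findtext() return that element's text.
p_text = root.findtext("body/p")
print(p_text)  # Hello, world!
```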