==== CSS Selector Syntax ====

^Selector^Example^Example description|
|.//class//|.intro|Selects all elements with class="intro"|
|.//class1.class2//|.name1.name2|Selects all elements with both name1 and name2 set within their class attribute|
|.//class1 .class2//|.name1 .name2|Selects all elements with class name2 that are descendants of an element with class name1|
|#//id//|#firstname|Selects the element with id="firstname"|
|*|*|Selects all elements|
|//element//|p|Selects all <p> elements|
|//element.class//|p.intro|Selects all <p> elements with class="intro"|
|//element,element//|div, p|Selects all <div> elements and all <p> elements|
|//element// //element//|div p|Selects all <p> elements inside <div> elements|
|//element//>//element//|div > p|Selects all <p> elements where the parent is a <div> element|
|//element//+//element//|div + p|Selects all <p> elements that are placed immediately after <div> elements|
|//element1//~//element2//|p ~ ul|Selects every <ul> element that is preceded by a sibling <p> element|
|[//attribute//]|[target]|Selects all elements with a target attribute|
|[//attribute//=//value//]|[target=_blank]|Selects all elements with target="_blank"|
|[//attribute//~=//value//]|[title~=flower]|Selects all elements with a title attribute containing the word "flower"|
|[//attribute//%%|%%=//value//]|[lang%%|%%=en]|Selects all elements with a lang attribute value equal to "en" or starting with "en-"|
|[//attribute//%%^%%=//value//]|a[href%%^%%="https"]|Selects every <a> element whose href attribute value begins with "https"|
|[//attribute//$=//value//]|a[href$=".pdf"]|Selects every <a> element whose href attribute value ends with ".pdf"|
|[//attribute//*=//value//]|a[href*="w3schools"]|Selects every <a> element whose href attribute value contains the substring "w3schools"|
|:active|a:active|Selects the active link|
|::after|p::after|Inserts generated content after the content of each <p> element|
|::before|p::before|Inserts generated content before the content of each <p> element|
|:checked|input:checked|Selects every checked <input> element|
|:default|input:default|Selects the default <input> element|
|:disabled|input:disabled|Selects every disabled <input> element|
|:empty|p:empty|Selects every <p> element that has no children (including text nodes)|
|:enabled|input:enabled|Selects every enabled <input> element|
|:first-child|p:first-child|Selects every <p> element that is the first child of its parent|
|::first-letter|p::first-letter|Selects the first letter of every <p> element|
|::first-line|p::first-line|Selects the first line of every <p> element|
|:first-of-type|p:first-of-type|Selects every <p> element that is the first <p> element of its parent|
|:focus|input:focus|Selects the input element which has focus|
|:hover|a:hover|Selects links on mouse over|
|:in-range|input:in-range|Selects input elements with a value within a specified range|
|:indeterminate|input:indeterminate|Selects input elements that are in an indeterminate state|
|:invalid|input:invalid|Selects all input elements with an invalid value|
|:lang(//language//)|p:lang(it)|Selects every <p> element with a lang attribute equal to "it" (Italian)|
|:last-child|p:last-child|Selects every <p> element that is the last child of its parent|
|:last-of-type|p:last-of-type|Selects every <p> element that is the last <p> element of its parent|
|:link|a:link|Selects all unvisited links|
|:not(//selector//)|:not(p)|Selects every element that is not a <p> element|
|:nth-child(//n//)|p:nth-child(2)|Selects every <p> element that is the second child of its parent|
|:nth-last-child(//n//)|p:nth-last-child(2)|Selects every <p> element that is the second child of its parent, counting from the last child|
|:nth-last-of-type(//n//)|p:nth-last-of-type(2)|Selects every <p> element that is the second <p> element of its parent, counting from the last child|
|:nth-of-type(//n//)|p:nth-of-type(2)|Selects every <p> element that is the second <p> element of its parent|
|:only-of-type|p:only-of-type|Selects every <p> element that is the only <p> element of its parent|
|:only-child|p:only-child|Selects every <p> element that is the only child of its parent|
|:optional|input:optional|Selects input elements with no "required" attribute|
|:out-of-range|input:out-of-range|Selects input elements with a value outside a specified range|
|::placeholder|input::placeholder|Selects the placeholder text of input elements that have the "placeholder" attribute specified|
|:read-only|input:read-only|Selects input elements with the "readonly" attribute specified|
|:read-write|input:read-write|Selects input elements with the "readonly" attribute NOT specified|
|:required|input:required|Selects input elements with the "required" attribute specified|
|:root|:root|Selects the document's root element|
|::selection|::selection|Selects the portion of an element that is selected by a user|
|:target|#news:target|Selects the current active #news element (clicked on a URL containing that anchor name)|
|:valid|input:valid|Selects all input elements with a valid value|
|:visited|a:visited|Selects all visited links|

\\


=== Problem ===
You want to find or manipulate elements using a CSS or jQuery-like selector syntax.

=== Solution ===
Use the Element.select(String selector) and Elements.select(String selector) methods:

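<code java>
import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Jsoup.parse(File, ...) throws IOException; call it from a method that declares or handles it.
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");

// a elements that have an href attribute
Elements links = doc.select("a[href]");
// img elements whose src ends with .png
Elements pngs = doc.select("img[src$=.png]");

// first div with class "masthead" (null if nothing matches)
Element masthead = doc.select("div.masthead").first();

// a elements that are direct children of an h3 with class "r"
Elements resultLinks = doc.select("h3.r > a");
</code>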


=== Description ===
jsoup elements support a CSS (or jQuery) like selector syntax for finding matching elements, which allows powerful and robust queries.

The select method is available on Document, Element, and Elements. It is contextual, so you can filter by selecting from a specific element, or by chaining select calls.

select returns the matching elements as an Elements list, which provides a range of methods to extract and manipulate the results.
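
To make the contextual behaviour concrete, here is a minimal sketch. The markup, the class name ContextualSelectSketch and the helper contentLinkUrls are made up for illustration; only Jsoup.parse, select, first and absUrl come from jsoup.

<code java>
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ContextualSelectSketch {
    // Hypothetical markup, invented for this example.
    static final String HTML =
        "<div class=masthead><a href='/home'>Home</a></div>"
      + "<div class=content><a href='/one'>One</a> <a href='/two'>Two</a> <a>No link</a></div>";

    /** Absolute URLs of the links found inside div.content only. */
    static List<String> contentLinkUrls(Document doc) {
        List<String> urls = new ArrayList<>();
        // first() returns null when nothing matches, so guard against it.
        Element content = doc.select("div.content").first();
        if (content == null) {
            return urls;
        }
        // Selecting from an element searches only that element's subtree.
        for (Element link : content.select("a[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML, "http://example.com/");

        // Selecting from the document searches the whole tree.
        Elements allLinks = doc.select("a[href]");
        // Chaining select calls narrows the previous result.
        Elements chained = doc.select("div.content").select("a[href]");

        System.out.println(allLinks.size());
        System.out.println(chained.size());
        System.out.println(contentLinkUrls(doc));
    }
}
</code>

With this markup the document-wide query finds three links, while the contextual and chained queries find only the two inside div.content.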

=== CSS Selector overview ===

  * tagname: find elements by tag, e.g. a
  * ns|tag: find elements by tag in a namespace, e.g. fb|name finds <fb:name> elements
  * #id: find elements by ID, e.g. #logo
  * .class: find elements by class name, e.g. .masthead
  * [attribute]: elements with attribute, e.g. [href]
  * [^attr]: elements with an attribute name prefix, e.g. [^data-] finds elements with HTML5 dataset attributes
  * [attr=value]: elements with attribute value, e.g. [width=500] (also quotable, like [data-name='launch sequence'])
  * [attr^=value], [attr$=value], [attr*=value]: elements with attributes that start with, end with, or contain the value, e.g. [href*=/path/]
  * [attr~=regex]: elements with attribute values that match the regular expression; e.g. img[src~=(?i)\.(png|jpe?g)]
  * *: all elements, e.g. *
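
A few of the selectors listed above can be tried with a sketch like the following. The markup, the class name BasicSelectorSketch and the helper count are hypothetical; note that the backslash in the regular expression has to be doubled inside a Java string literal.

<code java>
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class BasicSelectorSketch {
    // Hypothetical markup, invented for this example.
    static final String HTML =
        "<img src='a.PNG'><img src='b.jpeg'><img src='c.gif'>"
      + "<div id=logo class=masthead data-id='7'>Logo</div>"
      + "<span data-name='launch sequence'>Go</span>";

    /** Number of elements in the document that match the selector. */
    static int count(Document doc, String selector) {
        return doc.select(selector).size();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        String[] selectors = {
            "img",                            // by tag
            "#logo",                          // by ID
            ".masthead",                      // by class name
            "[data-id]",                      // attribute present
            "[^data-]",                       // attribute name prefix
            "[data-name='launch sequence']",  // quoted attribute value
            "[src$=.gif]",                    // attribute value ends with
            "img[src~=(?i)\\.(png|jpe?g)]"    // attribute value matches regex
        };
        for (String selector : selectors) {
            System.out.println(selector + " -> " + count(doc, selector));
        }
    }
}
</code>

For this markup, [^data-] matches both the div and the span, and the regex selector matches a.PNG and b.jpeg but not c.gif.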

=== Selector combinations ===

  * el#id: elements with ID, e.g. div#logo
  * el.class: elements with class, e.g. div.masthead
  * el[attr]: elements with attribute, e.g. a[href]
  * Any combination, e.g. a[href].highlight
  * ancestor child: child elements that descend from ancestor, e.g. .body p finds p elements anywhere under a block with class "body"
  * parent > child: child elements that descend directly from parent, e.g. div.content > p finds p elements whose parent is a div with class "content", and body > * finds the direct children of the body tag
  * siblingA + siblingB: finds sibling B element immediately preceded by sibling A, e.g. div.head + div
  * siblingA ~ siblingX: finds sibling X element preceded by sibling A, e.g. h1 ~ p
  * el, el, el: group multiple selectors, find unique elements that match any of the selectors; e.g. div.masthead, div.logo
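
The difference between the descendant, child and sibling combinators is easiest to see side by side. This is a small sketch on made-up markup; the class name CombinatorSketch and the helper texts are hypothetical, while select and eachText are jsoup calls.

<code java>
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class CombinatorSketch {
    // Hypothetical markup, invented for this example.
    static final String HTML =
        "<div class=content><p>One</p><section><p>Nested</p></section></div>"
      + "<h1>Title</h1><p>A</p><ul><li>Item</li></ul><p>B</p>";

    /** Text of every element matching the selector, in document order. */
    static List<String> texts(Document doc, String selector) {
        return doc.select(selector).eachText();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        String[] selectors = {
            "div.content p",    // any p below div.content
            "div.content > p",  // only p elements that are direct children
            "h1 + p",           // the p immediately after the h1
            "h1 ~ p",           // every later sibling p of the h1
            "h1, ul"            // elements matching either selector
        };
        for (String selector : selectors) {
            System.out.println(selector + " -> " + texts(doc, selector));
        }
    }
}
</code>

Here "div.content p" finds both One and Nested, "div.content > p" finds only One, "h1 + p" finds only A, and "h1 ~ p" finds A and B.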

=== Pseudo selectors ===

  * :lt(n): find elements whose sibling index (i.e. its position in the DOM tree relative to its parent) is less than n; e.g. td:lt(3)
  * :gt(n): find elements whose sibling index is greater than n; e.g. div p:gt(2)
  * :eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)
  * :has(selector): find elements that contain elements matching the selector; e.g. div:has(p)
  * :not(selector): find elements that do not match the selector; e.g. div:not(.logo)
  * :contains(text): find elements that contain the given text. The search is case-insensitive; e.g. p:contains(jsoup)
  * :containsOwn(text): find elements that directly contain the given text
  * :matches(regex): find elements whose text matches the specified regular expression; e.g. %%div:matches((?i)login)%%
  * :matchesOwn(regex): find elements whose own text matches the specified regular expression

\\
Note that the indexed pseudo-selectors above (:lt, :gt and :eq) are 0-based: the first element is at index 0, the second at 1, and so on.
See the Selector API reference for the full supported list and details.
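
The 0-based indexing and the text-matching pseudo selectors can be checked with a short sketch such as this one. The markup, the class name PseudoSelectorSketch and the helper texts are hypothetical and only serve as an illustration.

<code java>
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class PseudoSelectorSketch {
    // Hypothetical markup, invented for this example.
    static final String HTML =
        "<table><tr><td>a</td><td>b</td><td>c</td><td>d</td></tr></table>"
      + "<div><p>Uses jsoup</p></div>"
      + "<div class=logo><span>Login here</span></div>";

    /** Text of every element matching the selector, in document order. */
    static List<String> texts(Document doc, String selector) {
        return doc.select(selector).eachText();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        String[] selectors = {
            "td:lt(3)",                // sibling index 0, 1 and 2
            "td:eq(1)",                // sibling index 1, i.e. the second cell
            "td:gt(2)",                // sibling index 3 and above
            "div:has(p)",              // divs that contain a p
            "div:not(.logo)",          // divs without class "logo"
            "p:contains(JSOUP)",       // case-insensitive text search
            "div:containsOwn(jsoup)",  // text must be directly in the div
            "div:matches((?i)login)",  // regex against the element's text
            "span:matchesOwn(^Login)"  // regex against the element's own text
        };
        for (String selector : selectors) {
            List<String> found = texts(doc, selector);
            System.out.println(selector + " -> " + found);
        }
    }
}
</code>

With this markup, td:eq(1) yields the second cell (b) because indexes start at 0, and div:containsOwn(jsoup) finds nothing because the text sits in the nested p rather than directly in the div.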

