public class Element extends Node
An Element has a tag name and may have attributes (eg, <elementName attributename='attributeValue'>). An element may also have one or more children (which can be other Elements, Comments, or Text), and a single parent. When two elements share the same parent, they are said to be sibling elements with respect to each other. The term "ancestor" refers to any element that rests at or above the level of the element's parent, and the term "descendant" refers to any element at or beneath the level of the element's child/children.
Fields inherited from class Node: COMMENT_TYPE, DOCUMENT_TYPE, ELEMENT_TYPE, TEXT_TYPE
Constructor and Description |
---|
Element(String markup)
Constructs an Element (as well as child/descendant nodes) that represents the specified HTML/XHTML/XML markup; if the specified markup consists of multiple elements at the topmost level, all but the first are ignored.
|
Element(String name,
List<Node> children)
Constructor
|
Modifier and Type | Method and Description |
---|---|
void |
addChild(Node node)
Adds the specified Node as a child after the last existing child.
|
void |
addChildren(int index,
List<Node> nodes)
Inserts the specified Nodes at the specified index (any elements at that position are shifted down).
|
void |
erase()
Removes this element but not this element's children; the children are inserted into this element's parent, starting at the position formerly occupied by this element.
|
List<String> |
findAttributeValues(String query)
Searches all child/descendant elements and retrieves all attribute values that were found to match the query; example queries: "<a href>" (finds all href values in anchor tags), "<img src='http:.*'>" (finds all image URLs starting with 'http:'), "< id>" (finds all id values in tags of any name). |
List<Comment> |
findEach(short commentType)
Searches all children/descendants and returns all Comments matching the specified Comment type constant (CDATA sections, processing instructions, and DOCTYPE definitions are treated as types of Comment).
|
Elements |
findEach(String query)
Searches child/descendant elements and retrieves elements that were found to match the query, not including those elements that are children/descendants of any matching elements.
|
Elements |
findEvery(String query)
Searches all child/descendant elements and retrieves all elements that were found to match the query.
|
Comment |
findFirst(short commentType)
Searches all children/descendants and returns the first Comment matching the specified Comment type constant (CDATA sections, processing instructions, and DOCTYPE definitions are treated as types of Comment).
|
Element |
findFirst(String query)
Searches all child/descendant elements and retrieves the first one found to match the query.
|
Element |
findNearestAncestor(String tagNameRegex)
Finds the nearest ancestor Element (working upwards from the current Element) whose tagname matches the specified case insensitive regular expression (using Matcher.matches()).
|
String |
getAt(String attributeName)
Returns the attribute value for the specified attributeName, or throws a NotFound Exception if the attribute name or attribute value is not found.
|
String |
getAtString(String attributeName)
Returns the attribute value for the specified (case insensitive) attribute name; returns an empty string rather than throwing an Exception if the attribute name or attribute value is not found.
|
List<String> |
getAttributeNames()
Retrieves a list of the attribute names.
|
List<Element> |
getChildElements()
Returns a list of child Elements, or an empty list if no child elements exist.
|
List<Node> |
getChildNodes()
Returns a list of child nodes; Attribute nodes are not included in the List.
|
String |
getChildText()
Returns a concatenation of all immediate Text children, or an empty String if none exist.
|
String |
getChildText(boolean replaceEntities)
Returns child text (a concatenation of all Text nodes that are immediate children), or an empty String if none exist.
|
Comment |
getComment(int n)
Retrieves the nth child Comment (CDATA sections, processing instructions, and DOCTYPE definitions are treated as types of Comment).
|
List<Comment> |
getEach(short commentType)
Searches only children (not all descendants) and returns all Comments matching the specified Comment type constant (CDATA sections, processing instructions, and DOCTYPE definitions are treated as types of Comment).
|
Elements |
getEach(String query)
Searches only child elements (not all descendants) and retrieves those that were found to match the query.
|
Element |
getElement(int n)
Retrieves the nth child Element or throws a NotFound Exception if no element exists at the specified position.
|
Comment |
getFirst(short commentType)
Searches only children (not all descendants) and returns the first Comment matching the specified Comment type constant (CDATA sections, processing instructions, and DOCTYPE definitions are treated as types of Comment).
|
Element |
getFirst(String query)
Searches only child elements (not all descendants) and retrieves the first child element that was found to match the query.
|
String |
getInlineStyle(String attributeName)
Returns the CSS value of the inline style associated with the specified CSS attribute name, or null if no value exists.
|
Map<String,String> |
getInlineStyles()
Collects and returns a Map containing the inline styles from the style attribute (or an empty Map if no styles exist).
|
String |
getName()
Retrieves tagname normalized to lowercase.
|
String |
getNameOC()
Retrieves tagname in original case.
|
Element |
getRoot()
Returns the root ancestor Element (or Document container, if it exists).
|
int |
getSiblingIndex()
Retrieves the sibling index number of this Element (eg, if it's the first child element of its parent, it has index 0).
|
String |
getTextContent()
Returns the concatenation of all children/descendants of type Text, or an empty String if none exist.
|
String |
getTextContent(String delimeter,
boolean excludeScripts,
boolean replaceEntities)
Returns the concatenation of all children/descendants of type Text delimited by the specified delimiter.
|
String |
getTextContent(String delimeter,
Node startNode,
Node endNode,
boolean rangeInclusive,
boolean includeComments,
boolean excludeScripts,
boolean replaceEntities)
Returns the concatenation of all children/descendants of type Text delimited by the specified delimiter.
|
boolean |
hasAttribute(String attributeName)
Tests whether the specified attribute or keyword exists.
|
boolean |
hasKeyword(String keyword)
Returns a boolean value indicating whether the specified (case-insensitive) attribute is present, either as an attribute name or a keyword (ie an attribute name without an associated attribute value).
|
String |
innerHTML()
Returns a String representation of this element's children/descendant nodes as unformatted HTML.
|
String |
innerHTML(int indents)
Returns a String representation of this element's children/descendant nodes as formatted (ie indented) HTML.
|
void |
innerHTML(String html)
Replaces the contents of this element (ie the descendant nodes) with the specified HTML (or XHTML).
|
String |
innerText()
Deprecated. Use getTextContent() instead. |
String |
innerText(String delimeter,
boolean excludeScripts,
boolean replaceEntities)
Deprecated. Use getTextContent(String, boolean, boolean) instead. |
String |
innerText(String delimeter,
Node startNode,
Node endNode,
boolean rangeInclusive,
boolean includeComments,
boolean excludeScripts,
boolean replaceEntities)
Deprecated.
|
String |
innerXML()
Returns a String representation of this element's children/descendant nodes as unformatted XML.
|
String |
innerXML(int indents)
Returns a String representation of this element's children/descendant nodes as formatted (ie indented) XML.
|
void |
innerXML(String xml)
Replaces the contents of this element (ie the descendant nodes) with the specified XML, which is parsed in a manner that ignores the default HTML/XHTML DTD.
|
Element |
nextSiblingElement()
Retrieves the next sibling Element in the DOM (ie, the next Element that shares the same parent); note that this method is not suitable for iterating through the children of an Elements container, since those elements may not be siblings in the DOM tree.
|
String |
outerHTML()
Returns a String representation of this element (including its children/descendant nodes) as unformatted (ie unindented) HTML.
|
String |
outerHTML(int indents)
Returns a String representation of this element as formatted (ie indented) HTML, including its children/descendants.
|
String |
outerXML()
Returns a String representation of this element (including its children/descendant nodes) as unformatted (ie unindented) XML.
|
String |
outerXML(int indents)
Returns a String representation of this element (including its children/descendant nodes) as formatted (ie indented) XML.
|
com.jaunt.Attribute |
removeAttribute(String attName)
Removes the specified attribute or keyword.
|
void |
removeAttributes()
Removes all attributes.
|
boolean |
removeChild(Node node)
Removes the specified Node and sets the parent property of the removed Node to null.
|
void |
removeChildren()
Removes all child nodes and sets their parent property to null.
|
void |
saveAs(String filename)
Saves the current element and its children/descendants as (UTF-8) HTML in the file specified by the relative filepath.
|
void |
saveAsXML(String filename)
Saves the current element and its children/descendants as (UTF-8) XML in the file specified by the relative filepath.
|
void |
setAttribute(String attributeName,
String attributeValue)
Adds a new attribute name and attribute value or associates an existing attribute name with a new attribute value.
|
String |
toString()
Creates a string representation of the current Element as an HTML tag (not including its children/descendants or counterpart closing tag).
|
String |
toXMLString()
Creates a string representation of the current Element as an XML tag (but not including its children/descendants).
|
Methods inherited from class Node: delete, findNearestCommonAncestor, getParent, getType, isAfter, isBefore, isBetween, isBetween, nextNode, nextNodeSibling, nextNonDescendantNode, previousNode, previousNodeSibling, typeToString
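The difference between findEach(String) and findEvery(String) in the method summary is easiest to see when matches are nested. The following is a standalone sketch that does not use the library: MiniElement, sampleTree, and the two static methods are hypothetical stand-ins, matching is reduced to a case-insensitive tag-name comparison, and a pre-order (document-order) traversal is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FindEachVsEvery {

    /** Hypothetical stand-in for an element: a tag name, an id label, and child elements. */
    public static final class MiniElement {
        final String name;
        final String id;
        final List<MiniElement> children = new ArrayList<>();

        public MiniElement(String name, String id, MiniElement... kids) {
            this.name = name;
            this.id = id;
            children.addAll(Arrays.asList(kids));
        }
    }

    /** Returns the ids of all matching descendants, nested matches included. */
    public static List<String> findEvery(MiniElement root, String tagName) {
        List<String> ids = new ArrayList<>();
        collect(root, tagName, true, ids);
        return ids;
    }

    /** Returns the ids of matching descendants, skipping anything inside a match. */
    public static List<String> findEach(MiniElement root, String tagName) {
        List<String> ids = new ArrayList<>();
        collect(root, tagName, false, ids);
        return ids;
    }

    // Pre-order walk over the descendants of 'parent' (the parent itself is not tested).
    private static void collect(MiniElement parent, String tagName,
                                boolean descendIntoMatches, List<String> ids) {
        for (MiniElement child : parent.children) {
            boolean matches = child.name.equalsIgnoreCase(tagName);
            if (matches) {
                ids.add(child.id);
            }
            if (!matches || descendIntoMatches) {
                collect(child, tagName, descendIntoMatches, ids);
            }
        }
    }

    /** Builds body > [div#outer > [div#inner, p#para], div#second]. */
    public static MiniElement sampleTree() {
        MiniElement outer = new MiniElement("div", "outer",
                new MiniElement("div", "inner"),
                new MiniElement("p", "para"));
        return new MiniElement("body", "root", outer, new MiniElement("div", "second"));
    }

    public static void main(String[] args) {
        MiniElement root = sampleTree();
        System.out.println(findEvery(root, "div")); // [outer, inner, second]
        System.out.println(findEach(root, "div"));  // [outer, second]
    }
}
```

In this model the inner div is reported only by findEvery, because findEach stops descending once an element has matched.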
public Element(String markup) throws NotFound
Throws:
NotFound
public String getName()
public String getNameOC()
public Element getRoot()
public Element findNearestAncestor(String tagNameRegex) throws NotFound
Parameters:
tagNameRegex - case insensitive regular expression for matching the tagname.
Throws:
NotFound
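The method summary states that findNearestAncestor matches the tagname with Matcher.matches(), which requires the whole tagname to match rather than a substring. The sketch below illustrates that distinction with java.util.regex only; the helper names are hypothetical, and compiling the pattern with Pattern.CASE_INSENSITIVE is an assumption about how case insensitivity might be achieved, not a statement about the library's internals.

```java
import java.util.regex.Pattern;

public class TagNameRegexDemo {

    /** Whole-string, case-insensitive match, as Matcher.matches() performs. */
    public static boolean matchesWhole(String regex, String tagName) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(tagName).matches();
    }

    /** Substring search, as Matcher.find() performs, shown for contrast. */
    public static boolean findsAnywhere(String regex, String tagName) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(tagName).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("t(d|h)", "TD"));  // true: whole name matches, case ignored
        System.out.println(matchesWhole("t", "table"));    // false: "t" is only a prefix of "table"
        System.out.println(findsAnywhere("t", "table"));   // true: find() accepts a partial match
    }
}
```

Under whole-string matching, a regex such as "t" would not select a table ancestor; "table" or "t.*" would be needed.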
public Element nextSiblingElement() throws NotFound
Throws:
NotFound
public int getSiblingIndex()
public void addChild(Node node)
public void addChildren(int index, List<Node> nodes)
See Also:
innerHTML(String), innerXML(String)
public boolean removeChild(Node node)
public void removeChildren()
public List<Element> getChildElements()
public List<Node> getChildNodes()
public Comment getComment(int n) throws NotFound
Parameters:
n - the index (starting at 0) of the child Comment to return
Throws:
NotFound
public Comment getFirst(short commentType) throws NotFound
Parameters:
commentType - Comment type constant or -1 to match any Comment type
Throws:
NotFound
public List<Comment> getEach(short commentType)
Parameters:
commentType - Comment type constant or -1 to match any Comment type
public Comment findFirst(short commentType) throws NotFound
Parameters:
commentType - Comment type constant or -1 to match any Comment type
Throws:
NotFound
public List<Comment> findEach(short commentType)
Parameters:
commentType - Comment type constant or -1 to match any Comment type
public Element getElement(int n) throws NotFound
Parameters:
n - the index (starting at 0) of the child Element to return
Throws:
NotFound
public Element getFirst(String query) throws NotFound
Parameters:
query - a query that has the general form <tagnameRegex attributeName="attributeValueRegex">childTextRegex, where multiple attributes are allowed. In order for the query to match against an element, the tagnameRegex, attribute name, attributeValueRegex and childTextRegex must each match if they are specified (see Matcher.find()).
Throws:
NotFound
public Elements getEach(String query)
Parameters:
query - a query that has the general form <tagnameRegex attributeName="attributeValueRegex">childTextRegex, where multiple attributes are allowed. In order for the query to match against an element, the tagnameRegex, attribute name, attributeValueRegex and childTextRegex must each match if they are specified (see Matcher.find()).
public Element findFirst(String query) throws NotFound
Parameters:
query - a query that has the general form <tagnameRegex attributeName="attributeValueRegex">childTextRegex, where multiple attributes are allowed. In order for the query to match against an element, the tagnameRegex, attribute name, attributeValueRegex and childTextRegex must each match if they are specified (see Matcher.find()).
Throws:
NotFound
public Elements findEach(String query)
Parameters:
query - a query that has the general form <tagnameRegex attributeName="attributeValueRegex">childTextRegex, where multiple attributes are allowed. In order for the query to match against an element, the tagnameRegex, attribute name, attributeValueRegex and childTextRegex must each match if they are specified (see Matcher.find()).
public Elements findEvery(String query)
Parameters:
query - a query that has the general form <tagnameRegex attributeName="attributeValueRegex">childTextRegex, where multiple attributes are allowed. In order for the query to match against an element, the tagnameRegex, attribute name, attributeValueRegex and childTextRegex must each match if they are specified (see Matcher.find()).
public List<String> getAttributeNames()
public String getAtString(String attributeName)
Parameters:
attributeName - case insensitive attribute name.
See Also:
getAt(String)
public boolean hasKeyword(String keyword)
public String getAt(String attributeName) throws NotFound
Parameters:
attributeName - case insensitive attribute name.
Throws:
NotFound
See Also:
getAtString(String)
public void setAttribute(String attributeName, String attributeValue)
Parameters:
attributeName - the attribute name.
attributeValue - the attribute value.
public com.jaunt.Attribute removeAttribute(String attName)
Parameters:
attName - the case insensitive name of the attribute (or keyword) to be removed and returned.
public void removeAttributes()
public boolean hasAttribute(String attributeName)
Parameters:
attributeName - case insensitive attribute name or keyword.
public String getInlineStyle(String attributeName)
Parameters:
attributeName - the case insensitive CSS attribute name.
public Map<String,String> getInlineStyles()
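As an illustration of what collecting inline styles into a Map involves, here is a minimal standalone sketch that splits a style attribute value into property/value pairs. It is not the library's implementation: parseInlineStyles and lookup are hypothetical helpers, lowercasing the property name is an assumed way to get case-insensitive lookup, and values containing semicolons (for example inside url(...) or quoted strings) are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class InlineStyleSketch {

    /** Splits "prop: value; prop2: value2" into an ordered map; empty map if there are no styles. */
    public static Map<String, String> parseInlineStyles(String styleAttribute) {
        Map<String, String> styles = new LinkedHashMap<>();
        if (styleAttribute == null) {
            return styles;
        }
        for (String declaration : styleAttribute.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon <= 0) {
                continue; // skip empty or malformed declarations
            }
            String property = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = declaration.substring(colon + 1).trim();
            if (!property.isEmpty()) {
                styles.put(property, value);
            }
        }
        return styles;
    }

    /** Case-insensitive lookup of a single property; null when the property is absent. */
    public static String lookup(String styleAttribute, String property) {
        return parseInlineStyles(styleAttribute).get(property.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String style = "color: red; Font-Size:12px;";
        System.out.println(parseInlineStyles(style));   // {color=red, font-size=12px}
        System.out.println(lookup(style, "FONT-SIZE")); // 12px
        System.out.println(lookup(style, "margin"));    // null
    }
}
```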
public void erase()
See Also:
Node.delete()
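The method summary describes erase() as removing the element while re-inserting its children into the parent at the position the element occupied. The sketch below models that splice on a hypothetical MiniNode class; it is a standalone illustration rather than the library's code, and doing nothing when the node has no parent is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class EraseSketch {

    /** Hypothetical tree node with a label, a parent link, and ordered children. */
    public static final class MiniNode {
        final String label;
        MiniNode parent;
        final List<MiniNode> children = new ArrayList<>();

        public MiniNode(String label) {
            this.label = label;
        }

        /** Appends a child and returns this node so calls can be chained. */
        public MiniNode add(MiniNode child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        /** Removes this node and splices its children into its former position. */
        public void erase() {
            if (parent == null) {
                return; // assumption: nothing to splice into for a root node
            }
            int index = parent.children.indexOf(this);
            parent.children.remove(index);
            parent.children.addAll(index, children);
            for (MiniNode child : children) {
                child.parent = parent;
            }
            children.clear();
            parent = null;
        }

        public List<String> childLabels() {
            List<String> labels = new ArrayList<>();
            for (MiniNode child : children) {
                labels.add(child.label);
            }
            return labels;
        }
    }

    /** Builds root > [a, b > [b1, b2], c], erases b, and returns root's child labels. */
    public static List<String> demo() {
        MiniNode root = new MiniNode("root");
        MiniNode b = new MiniNode("b").add(new MiniNode("b1")).add(new MiniNode("b2"));
        root.add(new MiniNode("a")).add(b).add(new MiniNode("c"));
        b.erase();
        return root.childLabels();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, b1, b2, c]
    }
}
```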
public void saveAs(String filename) throws IOException
Parameters:
filename - the path/filename relative to system property user.dir (the current working directory).
Throws:
IOException
public void saveAsXML(String filename) throws IOException
Parameters:
filename - the path/filename relative to system property user.dir (the current working directory).
Throws:
IOException
public String getChildText()
public String getChildText(boolean replaceEntities)
Parameters:
replaceEntities - whether to replace entity references with character equivalents.
public String getTextContent()
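Per the method summary, getChildText() concatenates only the immediate Text children, whereas getTextContent() concatenates Text from all children/descendants, optionally with a delimiter. The standalone sketch below contrasts the two on a hypothetical MiniNode model of <p>Hello <b>bold</b> world</p>; it does not use the library and leaves out the excludeScripts and replaceEntities options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TextContentSketch {

    /** Hypothetical node: a text node when 'text' is non-null, otherwise an element. */
    public static final class MiniNode {
        final String text;
        final List<MiniNode> children = new ArrayList<>();

        private MiniNode(String text) {
            this.text = text;
        }

        public static MiniNode ofText(String value) {
            return new MiniNode(value);
        }

        public static MiniNode element(MiniNode... kids) {
            MiniNode element = new MiniNode(null);
            element.children.addAll(Arrays.asList(kids));
            return element;
        }
    }

    /** Concatenates only the text nodes that are immediate children. */
    public static String childText(MiniNode element) {
        StringBuilder sb = new StringBuilder();
        for (MiniNode child : element.children) {
            if (child.text != null) {
                sb.append(child.text);
            }
        }
        return sb.toString();
    }

    /** Concatenates text from all descendants; a null delimiter means no delimiter. */
    public static String textContent(MiniNode element, String delimiter) {
        List<String> parts = new ArrayList<>();
        gather(element, parts);
        return String.join(delimiter == null ? "" : delimiter, parts);
    }

    // Depth-first, document-order collection of text node values.
    private static void gather(MiniNode node, List<String> parts) {
        for (MiniNode child : node.children) {
            if (child.text != null) {
                parts.add(child.text);
            } else {
                gather(child, parts);
            }
        }
    }

    /** Models <p>Hello <b>bold</b> world</p>. */
    public static MiniNode sample() {
        return MiniNode.element(
                MiniNode.ofText("Hello "),
                MiniNode.element(MiniNode.ofText("bold")),
                MiniNode.ofText(" world"));
    }

    public static void main(String[] args) {
        System.out.println(childText(sample()));        // Hello  world (two spaces, "bold" omitted)
        System.out.println(textContent(sample(), null)); // Hello bold world
        System.out.println(textContent(sample(), "|"));  // Hello |bold| world
    }
}
```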
@Deprecated public String innerText()
Deprecated. Use getTextContent() instead.
public String getTextContent(String delimeter, boolean excludeScripts, boolean replaceEntities)
Parameters:
delimeter - a delimiter to insert between each text when concatenating, or null for no delimiter
excludeScripts - whether to exclude any text content of script tags.
replaceEntities - whether to replace entity references with character equivalents.
@Deprecated public String innerText(String delimeter, boolean excludeScripts, boolean replaceEntities)
Deprecated. Use getTextContent(String, boolean, boolean) instead.
Parameters:
delimeter - a delimiter to insert between each text when concatenating, or null for no delimiter
excludeScripts - whether to exclude any text content of script tags.
replaceEntities - whether to replace entity references with character equivalents.
public String getTextContent(String delimeter, Node startNode, Node endNode, boolean rangeInclusive, boolean includeComments, boolean excludeScripts, boolean replaceEntities)
Parameters:
delimeter - a delimiter to insert between each text when concatenating, or null for no delimiter
startNode - the node at which text concatenation begins, or null to leave the start node unspecified.
endNode - the node at which text concatenation ends, or null to leave the end node unspecified.
rangeInclusive - whether to include the text of the start and end nodes themselves, or only the text of the nodes between them.
includeComments - whether to include text occurring within HTML/XML comments.
excludeScripts - whether to exclude any text content of script tags.
replaceEntities - whether to replace entity references with character equivalents.
@Deprecated public String innerText(String delimeter, Node startNode, Node endNode, boolean rangeInclusive, boolean includeComments, boolean excludeScripts, boolean replaceEntities)
Deprecated. Use getTextContent(String, Node, Node, boolean, boolean, boolean, boolean) instead.
Parameters:
delimeter - a delimiter to insert between each text when concatenating, or null for no delimiter
startNode - the node at which text concatenation begins, or null to leave the start node unspecified.
endNode - the node at which text concatenation ends, or null to leave the end node unspecified.
rangeInclusive - whether to include the text of the start and end nodes themselves, or only the text of the nodes between them.
includeComments - whether to include text occurring within HTML/XML comments.
excludeScripts - whether to exclude any text content of script tags.
replaceEntities - whether to replace entity references with character equivalents.
public String toString()
Overrides:
toString in class Object
See Also:
outerHTML()
public String toXMLString()
See Also:
outerXML()
public String outerHTML()
public String outerHTML(int indents)
Parameters:
indents - the number of whitespace characters to use per indent.
public String innerHTML()
public String innerHTML(int indents)
Parameters:
indents - the number of whitespace characters to use per indent.
public void innerHTML(String html)
public String outerXML()
public String outerXML(int indents)
Parameters:
indents - the number of whitespace characters to use per indent.
public String innerXML()
public String innerXML(int indents)
Parameters:
indents - the number of whitespace characters to use per indent.
public void innerXML(String xml)
public List<String> findAttributeValues(String query)
Example queries: "<a href>" (finds all href values in anchor tags), "<img src='http:.*'>" (finds all image URLs starting with 'http:'), "< id>" (finds all id values in tags of any name).
Parameters:
query - a query that has the general form <tagnameRegex attributeName="attributeValueRegex">. In order for the query to match against an attribute value, the tagnameRegex, attribute name, and attributeValueRegex must match if they are specified.