public class Document extends Element
Modifier and Type | Field and Description |
---|---|
boolean |
truncated
Whether the document was truncated during parsing as a result of UserAgentSettings.maxBytes.
|
Fields inherited from class Node: COMMENT_TYPE, DOCUMENT_TYPE, ELEMENT_TYPE, TEXT_TYPE
Modifier and Type | Method and Description |
---|---|
Document |
apply(Object... params)
Applies the specified parameters in succession to the editable (i.e., visible and non-disabled) inputs of the active form, starting from the input after the most recently edited input (or from the first input, if the form has not yet been edited). The "active" form is the form that was most recently edited; if no form is active, the parameters are applied to the first editable form in the document (i.e., the first form that has at least one visible, non-disabled input).
|
Document |
choose(short labelPosition,
String labelRegex)
Deprecated.
|
Document |
choose(String menuLabel,
String menuItemRegex)
Deprecated. Use chooseMenuItem(java.lang.String, java.lang.String) instead.
|
Document |
chooseCheckBox(String labelRegex,
short labelPosition)
Selects the checkbox whose label is matched by the specified (case-insensitive) regular expression.
|
Document |
chooseMenuItem(String menuLabel,
String menuItemRegex)
Completes the menu that has the specified text label (case-insensitive, wordspace-insensitive) by selecting the menu item(s) that match the specified (case-insensitive) regular expression (using Matcher.matches).
|
Document |
chooseRadioButton(String labelRegex,
short labelPosition)
Selects the radiobutton whose label is matched by the specified (case-insensitive) regular expression.
|
Document |
fillout(String label,
String value)
Deprecated. Use filloutField(java.lang.String, java.lang.String) instead.
|
Document |
filloutField(String label,
String value)
Fills the textfield, password field, or textarea field that has the specified text label (case-insensitive, wordspace-insensitive) with the specified value.
|
boolean |
followMetaRedirect(int redirectCount)
Follows the first redirecting meta tag in the document.
|
Form |
getActiveForm()
Returns the Form that is currently active (i.e., most recently edited), or null if no form has been edited.
|
Form |
getForm(Element element)
Returns the Form for the specified form Element or throws a NotFound exception.
|
Form |
getForm(int n)
Returns the Form object corresponding to the nth form (starting at 0 for the first form).
|
Form |
getForm(String elementQuery)
Returns a Form object for the first form Element that matches the specified elementQuery (or throws a NotFound exception).
|
Form |
getFormByButton(String buttonLabelRegex)
Returns a Form object for the first form Element containing a button whose text/value is matched (case-insensitive) by the specified regular expression (using Matcher.matches()), or throws a NotFound exception.
|
Form |
getFormByButtons(String[] buttonLabelRegexs)
Returns a Form object for the first form Element containing buttons whose text/values are matched (case-insensitive) by the regular expressions in the specified array (using Matcher.matches()), or throws a NotFound exception.
|
List<Form> |
getForms()
Returns a List of Form components for this page.
|
MultiMap<String,String> |
getHeaders(String key)
Returns a MultiMap containing the headers received for this document, where each header name is associated with one or more header values.
|
com.jaunt.component.Hyperlink |
getHyperlink(String linkRegex)
Returns the first hyperlink whose element has innerText (from Element.innerText()) matching the specified (case-insensitive) regular expression (using Matcher.matches).
|
com.jaunt.component.Meta |
getRedirectingMeta()
Returns a Meta component representing the first redirecting meta tag in the document, or null if the document contains no redirecting meta tag.
|
com.jaunt.component.Table |
getTable(Element tableElement)
Returns a Table component for the specified table Element or throws a NotFound exception if the specified element is not a table.
|
com.jaunt.component.Table |
getTable(int n)
Returns a Table component representing the nth non-nested table Element in the document, or throws a NotFound exception if no table element is found.
|
com.jaunt.component.Table |
getTable(String elementQuery)
Returns a Table component for the first table Element that matches the specified elementQuery (or throws a NotFound exception).
|
String |
getUrl()
Returns the URL used to request the document, or null if the document was not retrieved using an HTTP request.
|
com.jaunt.component.Hyperlink |
nextPageLink()
Returns the next hyperlink in a series of numeric links within the document, such as links that represent pages of search results; throws NotFound if no such series is found, or MultipleFound if multiple (non-equivalent) series are found.
|
com.jaunt.component.Hyperlink |
nextPageLink(Element container)
Returns the next hyperlink in a series of numeric links within the specified container, such as links that represent pages of search results; throws NotFound if no such series is found, or MultipleFound if multiple (non-equivalent) series are found.
|
boolean |
nextPageLinkExists()
Returns true if another hyperlink exists in a series of numeric links within the document, such as links that represent pages of search results.
|
boolean |
nextPageLinkExists(Element container)
Returns true if another hyperlink exists in a series of numeric links within the specified container, such as links that represent pages of search results.
|
List<ResponseException> |
saveCompleteWebPage(File file)
Saves the document in the specified file and creates a folder in the same directory where associated content (e.g., images, js, css, nested frames) is saved, so that the document may be viewed offline.
The name of the content folder is "content_for_FILENAME", where FILENAME is the specified file's name.
|
Document |
submit()
Submits the active form by pressing the submit button; throws a MultipleFound exception if more than one submit button exists in the form.
|
Document |
submit(String buttonLabelRegex)
Submits the active form using the submit button whose text matches the specified (case-insensitive) regular expression, using Matcher.matches.
|
Document |
unchoose(short labelPosition,
String labelRegex)
Deprecated. Use unchooseRadioButton(String,short) or unchooseCheckBox(String,short) instead.
|
Document |
unchooseCheckBox(String labelRegex,
short labelPosition)
Deselects the checkbox whose label is matched by the specified (case-insensitive) regular expression.
|
Document |
unchooseRadioButton(String labelRegex,
short labelPosition)
Deselects the radiobutton whose label is matched by the specified (case-insensitive) regular expression.
|
Methods inherited from class Element: addChild, addChildren, erase, findAttributeValues, findEach, findEach, findEvery, findFirst, findFirst, findNearestAncestor, getAt, getAtString, getAttributeNames, getChildElements, getChildNodes, getChildText, getChildText, getComment, getEach, getEach, getElement, getFirst, getFirst, getInlineStyle, getInlineStyles, getName, getNameOC, getRoot, getSiblingIndex, getTextContent, getTextContent, getTextContent, hasAttribute, hasKeyword, innerHTML, innerHTML, innerHTML, innerText, innerText, innerText, innerXML, innerXML, innerXML, nextSiblingElement, outerHTML, outerHTML, outerXML, outerXML, removeAttribute, removeAttributes, removeChild, removeChildren, saveAs, saveAsXML, setAttribute, toString, toXMLString
Methods inherited from class Node: delete, findNearestCommonAncestor, getParent, getType, isAfter, isBefore, isBetween, isBetween, nextNode, nextNodeSibling, nextNonDescendantNode, previousNode, previousNodeSibling, typeToString
public final boolean truncated
public com.jaunt.component.Hyperlink nextPageLink(Element container) throws NotFound, MultipleFound
NotFound
MultipleFound
public com.jaunt.component.Hyperlink nextPageLink() throws NotFound, MultipleFound
NotFound
MultipleFound
public boolean nextPageLinkExists()
public boolean nextPageLinkExists(Element container)
public com.jaunt.component.Meta getRedirectingMeta()
public String getUrl()
public com.jaunt.component.Hyperlink getHyperlink(String linkRegex) throws NotFound
Returns the first hyperlink whose element has innerText (from Element.innerText()) matching the specified (case-insensitive) regular expression (using Matcher.matches).
NotFound
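Because the match uses Matcher.matches, the regular expression has to cover the entire text, not just part of it. The following is a minimal stand-alone sketch of that rule using only java.util.regex; the class name LinkTextMatch is hypothetical, and the use of Pattern.CASE_INSENSITIVE is an assumption about how the case-insensitivity is achieved, not Jaunt's actual code.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for the matching rule described above; not part of Jaunt. */
final class LinkTextMatch {
    /** Returns true only if the regex matches the whole text, ignoring case. */
    static boolean matches(String regex, String text) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(text)
                      .matches(); // full-string match, unlike Matcher.find()
    }
}
```

Under this rule a regex such as "next" matches the link text "NEXT" but not "Next page"; a pattern such as "next.*" would be needed for the latter.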
public com.jaunt.component.Table getTable(String elementQuery) throws NotFound
elementQuery
- a query of the form <tagnameRegex attributeName="attributeValueRegex">.
NotFound
public com.jaunt.component.Table getTable(Element tableElement) throws NotFound
NotFound
public com.jaunt.component.Table getTable(int n) throws NotFound
NotFound
public Form getFormByButtons(String[] buttonLabelRegexs) throws NotFound
buttonLabelRegexs
- can be null to force a button match; otherwise contains the regular expressions for matching (in a case-insensitive manner) button text/values.
NotFound
public Form getFormByButton(String buttonLabelRegex) throws NotFound
buttonLabelRegex
- can be null to force a button match; otherwise contains the regular expression for matching (in a case-insensitive manner) button text/values.
NotFound
public Form getForm(int n)
public Form getForm(String elementQuery) throws NotFound
elementQuery
- a query of the form <tagnameRegex attributeName="attributeValueRegex">.
NotFound
public Form getForm(Element element) throws NotFound
NotFound
public boolean followMetaRedirect(int redirectCount) throws ResponseException
redirectCount
- the number of automatic, sequential redirects that occurred prior to this redirection.
ResponseException
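For orientation, a redirecting meta tag is conventionally written as a meta element with http-equiv="refresh" and a content attribute such as "0; url=http://foo.com/bar.htm". The sketch below shows one way the redirect target could be pulled out of such a content value with a regular expression. The class and method names are hypothetical, and this is an illustration under that assumed tag format rather than the parser Jaunt uses.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper illustrating the meta-refresh content format; not part of Jaunt. */
final class MetaRefreshSketch {
    // Expected shape: "<delay seconds>; url=<target>", with optional quotes around the target.
    private static final Pattern CONTENT = Pattern.compile(
        "^\\s*\\d+\\s*[;,]\\s*url\\s*=\\s*['\"]?([^'\"]+?)['\"]?\\s*$",
        Pattern.CASE_INSENSITIVE);

    /** Returns the redirect target, or empty if the content value only sets a refresh delay. */
    static Optional<String> redirectTarget(String content) {
        Matcher m = CONTENT.matcher(content);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```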
public MultiMap<String,String> getHeaders(String key)
public Form getActiveForm()
public Document apply(Object... params) throws JauntException
params
- a string param will populate a textfield, textarea, or password field (unless the string is "\t", in which case the next form component is skipped); a boolean param will set/unset a checkbox; an integer param will select the nth radio button in a radiobutton group OR select the nth option in a menu (starting at index 0); a regular expression (a string that is enclosed in round brackets) will select the matching menu option(s) in a menu; a File object will set the value of a file-upload component; any other object datatype will be treated as a string by calling its toString() method.
JauntException
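The parameter rules above amount to a dispatch on the runtime type (and, for strings, the shape) of each argument. The following sketch restates those rules as a small classifier. The class ApplyParamSketch and its Action enum are hypothetical names introduced for illustration; the real method also takes the target form component into account, which this sketch does not model.

```java
import java.io.File;

/** Hypothetical restatement of the apply(Object...) parameter rules; not Jaunt's implementation. */
final class ApplyParamSketch {
    enum Action { FILL_TEXT, SKIP_COMPONENT, SET_CHECKBOX, SELECT_BY_INDEX, SELECT_BY_REGEX, SET_FILE }

    static Action classify(Object param) {
        if (param instanceof Boolean) return Action.SET_CHECKBOX;     // set/unset a checkbox
        if (param instanceof Integer) return Action.SELECT_BY_INDEX;  // nth radio button or menu option
        if (param instanceof File)    return Action.SET_FILE;         // file-upload component
        String s = String.valueOf(param);                             // other types: string form
        if (s.equals("\t")) return Action.SKIP_COMPONENT;             // tab skips the next component
        if (s.length() >= 2 && s.startsWith("(") && s.endsWith(")")) {
            return Action.SELECT_BY_REGEX;                            // regex in round brackets
        }
        return Action.FILL_TEXT;                                      // textfield, textarea, password
    }
}
```

For example, the arguments "tom", "\t", true, 2, and "(red|blue)" would classify as FILL_TEXT, SKIP_COMPONENT, SET_CHECKBOX, SELECT_BY_INDEX, and SELECT_BY_REGEX respectively.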
public Document filloutField(String label, String value) throws NotFound, MultipleFound
label
- case-insensitive, wordspace-insensitive, left-side label.
NotFound
MultipleFound
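As a rough illustration of what "case-insensitive, wordspace-insensitive" label matching could mean, the sketch below compares two labels after removing all whitespace and lower-casing them. This reading of "wordspace-insensitive" is an assumption, and the class and method names are hypothetical; the library may normalize labels differently.

```java
import java.util.Locale;

/** Hypothetical label comparison; assumes whitespace and case are both ignored. */
final class LabelMatchSketch {
    /** Strips all whitespace and lower-cases the label. */
    static String normalize(String label) {
        return label.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    }

    /** True if the two labels are equal once whitespace and case are ignored. */
    static boolean sameLabel(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```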
@Deprecated public Document fillout(String label, String value) throws NotFound, MultipleFound
Deprecated. Use filloutField(java.lang.String, java.lang.String) instead.
label
- case-insensitive, wordspace-insensitive, left-side label.
NotFound
MultipleFound
public Document chooseMenuItem(String menuLabel, String menuItemRegex) throws NotFound, MultipleFound
menuLabel
- case-insensitive, wordspace-insensitive, left-side label.
menuItemRegex
- case-insensitive regular expression to match the text of one or more options.
NotFound
MultipleFound
@Deprecated public Document choose(String menuLabel, String menuItemRegex) throws NotFound, MultipleFound
Deprecated. Use chooseMenuItem(java.lang.String, java.lang.String) instead.
menuLabel
- case-insensitive, wordspace-insensitive, left-side label.
menuItemRegex
- case-insensitive regular expression to match the text of one or more options.
NotFound
MultipleFound
public Document chooseCheckBox(String labelRegex, short labelPosition) throws NotFound, MultipleFound
labelRegex
- case-insensitive regular expression matching the label of the checkbox.
labelPosition
- position constant of class Label indicating the position of the label with respect to the checkbox.
NotFound
MultipleFound
public Document chooseRadioButton(String labelRegex, short labelPosition) throws NotFound, MultipleFound
labelRegex
- case-insensitive regular expression matching the label of the radiobutton.
labelPosition
- position constant of class Label indicating the position of the label with respect to the radiobutton.
NotFound
MultipleFound
@Deprecated public Document choose(short labelPosition, String labelRegex) throws NotFound, MultipleFound
Deprecated. Use chooseRadioButton(java.lang.String, short) or chooseCheckBox(java.lang.String, short) instead.
labelRegex
- case-insensitive regular expression matching the label of the radiobutton or checkbox.
labelPosition
- position constant of class Label indicating the position of the label with respect to the radiobutton or checkbox.
NotFound
MultipleFound
public Document unchooseRadioButton(String labelRegex, short labelPosition) throws NotFound, MultipleFound
labelRegex
- case-insensitive regular expression matching the label of the radiobutton.
labelPosition
- position constant of class Label indicating the position of the label with respect to the radiobutton.
NotFound
MultipleFound
public Document unchooseCheckBox(String labelRegex, short labelPosition) throws NotFound, MultipleFound
labelRegex
- case-insensitive regular expression matching the label of the checkbox.
labelPosition
- position constant of class Label indicating the position of the label with respect to the checkbox.
NotFound
MultipleFound
@Deprecated public Document unchoose(short labelPosition, String labelRegex) throws NotFound, MultipleFound
Deprecated. Use unchooseRadioButton(String,short) or unchooseCheckBox(String,short) instead.
labelRegex
- case-insensitive regular expression matching the label of the radiobutton or checkbox.
labelPosition
- position constant of class Label indicating the position of the label with respect to the radiobutton or checkbox.
NotFound
MultipleFound
public Document submit() throws SearchException, ResponseException
SearchException
ResponseException
public Document submit(String buttonLabelRegex) throws SearchException, ResponseException
buttonLabelRegex
- case-insensitive regular expression to match the submit button's value attribute.
SearchException
ResponseException
public List<ResponseException> saveCompleteWebPage(File file) throws IOException
Filenames of saved content are sanitized (using String.replaceAll("\\W+", "")), with filenames longer than 200 chars truncated to the last 200 characters.
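The naming rules for the content folder and for sanitized filenames can be restated in a few lines of plain Java. This is a sketch of the stated rules only: the class SavedContentNames is hypothetical, and which input string the library actually sanitizes (a URL is used in the usage note here) is an assumption.

```java
/** Hypothetical restatement of the naming rules described for saveCompleteWebPage. */
final class SavedContentNames {
    /** Content folder name: "content_for_" followed by the saved file's name. */
    static String contentFolderName(String fileName) {
        return "content_for_" + fileName;
    }

    /** Removes runs of non-word characters, then keeps at most the last 200 characters. */
    static String sanitize(String name) {
        String cleaned = name.replaceAll("\\W+", "");
        return cleaned.length() > 200
            ? cleaned.substring(cleaned.length() - 200)
            : cleaned;
    }
}
```

Applied to "http://foo.com/img/logo.png", sanitize yields "httpfoocomimglogopng", since the colon, slashes, and dots are all non-word characters.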
Within the content folder, subfolders are created to store content for any HTML/XHTML documents that are referenced through frames. Nested frames are supported to a maximum of three nesting levels.
Note that if the specified file (or content folder) already exists, it is deleted/overwritten when the save operation occurs.
To have download progress printed to the console, see UserAgentSettings.showTravel.
After the download is complete, a JSON manifest object is created and is accessible via userAgent.json. Below is the format of the manifest:
{
  "id": "downloadCompleteWebPage",
  "downloadStartTime": 1454869283120,
  "downloadEndTime": 1454869293681,
  "contentItems": [
    {
      "class": "contentItem",
      "requestUrl": "http:\/\/foo.com\/bar.htm",
      "filePath": "C:\\Users\\Moi\\Downloads\\bar.htm",
      "contentType": "text\/html",
      "downloadStartTime": 1454869283345,
      "downloadEndTime": 1454869293681,
      "statusCode": 404,
      "attempts": 1,
      "exception": { "message": "[message]" }
    }
  ]
}
Each contentItem object represents a single downloaded file. If no ResponseException occurred during downloading (or if a second attempt at downloading the file succeeded), the "exception" field is null. In the event of a ResponseException, the downloadStartTime will be set to a time value, and the downloadEndTime (and other fields) will be set to null.
file
- the destination file in which to save the current document.
IOException
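Since a null "exception" field marks a successful download, the failed items in a manifest can be found by filtering on that field. The sketch below assumes the manifest has already been read into simple objects; the ContentItem holder and the failedUrls method are hypothetical and model only two of the manifest fields, independent of how userAgent.json exposes them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of manifest contentItems; not a Jaunt class. */
final class ManifestSketch {
    static final class ContentItem {
        final String requestUrl;
        final String exceptionMessage; // null when the manifest's "exception" field is null

        ContentItem(String requestUrl, String exceptionMessage) {
            this.requestUrl = requestUrl;
            this.exceptionMessage = exceptionMessage;
        }

        boolean failed() {
            return exceptionMessage != null;
        }
    }

    /** Returns the request URLs of all items whose download ended with an exception. */
    static List<String> failedUrls(List<ContentItem> items) {
        List<String> urls = new ArrayList<>();
        for (ContentItem item : items) {
            if (item.failed()) {
                urls.add(item.requestUrl);
            }
        }
        return urls;
    }
}
```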