public class HTMLParser
extends org.jsoup.Jsoup
Title: Framework Support Library
Description: Wrapper for Jsoup library.
Copyright: Copyright (c) 2008
Company: StreamScape Technologies
Constructor and Description |
---|
HTMLParser() |
Modifier and Type | Method and Description |
---|---|
java.lang.String | cleanHTML(java.lang.String bodyHtml, java.lang.String baseUri, org.jsoup.safety.Whitelist whitelist) Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted tags and attributes. |
java.lang.String | cleanHTML(java.lang.String bodyHtml, org.jsoup.safety.Whitelist whitelist) Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted tags and attributes. |
static HTMLParser | getInstance() |
boolean | isValidHTML(java.lang.String bodyHtml, org.jsoup.safety.Whitelist whitelist) Test if the input body HTML has only tags and attributes allowed by the Whitelist. |
static org.jsoup.nodes.Document | parseHTML(java.lang.String html) Parse HTML into a Document. |
org.jsoup.nodes.Document | parseHTML(java.lang.String html, java.lang.String baseUri) Parse HTML into a Document. |
org.jsoup.nodes.Document | parseHTMLBodyFragment(java.lang.String bodyHtml) Parse a fragment of HTML, with the assumption that it forms the body of the HTML. |
org.jsoup.nodes.Document | parseHTMLBodyFragment(java.lang.String bodyHtml, java.lang.String baseUri) Parse a fragment of HTML, with the assumption that it forms the body of the HTML. |
public static HTMLParser getInstance()
public org.jsoup.nodes.Document parseHTML(java.lang.String html, java.lang.String baseUri)
Parse HTML into a Document.
Parameters:
html - HTML to parse
baseUri - The URL where the HTML was retrieved from. Used to resolve relative URLs that occur before the HTML declares a <base href> tag into absolute URLs.
public static org.jsoup.nodes.Document parseHTML(java.lang.String html)
Parse HTML into a Document. As no base URI is specified, absolute URL detection relies on the HTML including a <base href> tag.
Parameters:
html - HTML to parse
See Also:
Jsoup.parse(String, String)
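The role of the baseUri parameter can be illustrated without any parsing library. The following is a minimal sketch using only java.net.URI from the JDK; the class name BaseUriSketch, the helper resolve, and the example.com URLs are hypothetical and are not part of HTMLParser or jsoup. It shows the kind of relative-to-absolute resolution a base URI makes possible, not how the library implements it.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class BaseUriSketch {
    // Resolve a link found in a page against the URL the page was retrieved from.
    public static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "https://example.com/docs/index.html";
        // A relative path is resolved against the directory of the base URI.
        System.out.println(resolve(base, "guide.html"));    // https://example.com/docs/guide.html
        // A root-relative path keeps only the scheme and host of the base URI.
        System.out.println(resolve(base, "/img/logo.png")); // https://example.com/img/logo.png
        // An absolute URL is returned unchanged.
        System.out.println(resolve(base, "https://other.example/a")); // https://other.example/a
    }
}
```

When no base URI is supplied, a link such as guide.html has nothing to be resolved against, which is why the single-argument overload depends on the document declaring its own <base href>.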
public org.jsoup.nodes.Document parseHTMLBodyFragment(java.lang.String bodyHtml, java.lang.String baseUri)
Parse a fragment of HTML, with the assumption that it forms the body of the HTML.
Parameters:
bodyHtml - body HTML fragment
baseUri - URL to resolve relative URLs against.
See Also:
Document.body()
public org.jsoup.nodes.Document parseHTMLBodyFragment(java.lang.String bodyHtml)
Parse a fragment of HTML, with the assumption that it forms the body of the HTML.
Parameters:
bodyHtml - body HTML fragment
See Also:
Document.body()
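public java.lang.String cleanHTML(java.lang.String bodyHtml, java.lang.String baseUri, org.jsoup.safety.Whitelist whitelist)
Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted tags and attributes.
Parameters:
bodyHtml - input untrusted HTML (body fragment)
baseUri - URL to resolve relative URLs against
whitelist - white-list of permitted HTML elements
See Also:
Cleaner.clean(Document)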
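public java.lang.String cleanHTML(java.lang.String bodyHtml, org.jsoup.safety.Whitelist whitelist)
Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted tags and attributes.
Parameters:
bodyHtml - input untrusted HTML (body fragment)
whitelist - white-list of permitted HTML elements
See Also:
Cleaner.clean(Document)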
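To make the idea of filtering through a white-list concrete, here is a deliberately simplified sketch in plain Java. It is an assumption-laden toy: the class AllowListCleanSketch and its clean method are hypothetical, it matches tags with a regular expression instead of parsing the HTML, and it drops every attribute rather than consulting a per-tag attribute list. It is not how jsoup's Cleaner works and it is not safe for real untrusted input; it only shows the keep-or-drop decision made per tag.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy, for illustration only; not a real sanitizer.
class AllowListCleanSketch {
    // Matches an opening or closing tag; group 1 is the optional slash, group 2 the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*>");

    // Keep allowed tags (with all attributes removed) and delete every other tag.
    // Text between tags is left untouched.
    public static String clean(String bodyHtml, Set<String> allowedTags) {
        Matcher m = TAG.matcher(bodyHtml);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(2).toLowerCase();
            String kept = allowedTags.contains(name)
                    ? "<" + m.group(1) + name + ">"
                    : "";
            m.appendReplacement(out, Matcher.quoteReplacement(kept));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "<p onclick=\"x()\">Hi <b>there</b> <font color=\"red\">friend</font></p>";
        // Prints: <p>Hi <b>there</b> friend</p>
        System.out.println(clean(dirty, Set.of("p", "b")));
    }
}
```

A parser-based cleaner such as the one referenced under See Also handles cases this sketch cannot, for example malformed markup and attribute values that contain a > character.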
public boolean isValidHTML(java.lang.String bodyHtml, org.jsoup.safety.Whitelist whitelist)
Test if the input body HTML has only tags and attributes allowed by the Whitelist.
The input HTML should still be run through the cleaner to set up enforced attributes and to tidy the output.
Assumes the HTML is a body fragment (i.e., it will be used in an existing HTML document body).
Parameters:
bodyHtml - HTML to test
whitelist - whitelist to test against
See Also:
Jsoup.clean(String, org.jsoup.safety.Whitelist)
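The difference between testing and cleaning can be sketched in a few lines of plain Java. The class AllowListValidSketch and the method hasOnlyAllowedTags are hypothetical names, and the check below looks at tag names only, using a regular expression rather than a parser, so it is narrower than the tag-and-attribute test described above. It is meant as an illustration of a yes/no validity check, not as a stand-in for the library call.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy, for illustration only; checks tag names, not attributes.
class AllowListValidSketch {
    // Captures the name of each opening or closing tag.
    private static final Pattern TAG_NAME = Pattern.compile("</?([a-zA-Z][a-zA-Z0-9]*)");

    // Return true only if every tag name in the fragment is in the allowed set.
    public static boolean hasOnlyAllowedTags(String bodyHtml, Set<String> allowedTags) {
        Matcher m = TAG_NAME.matcher(bodyHtml);
        while (m.find()) {
            if (!allowedTags.contains(m.group(1).toLowerCase())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> allowed = Set.of("p", "b");
        System.out.println(hasOnlyAllowedTags("<p>Hi <b>there</b></p>", allowed));       // true
        System.out.println(hasOnlyAllowedTags("<p>Hi <script>x</script></p>", allowed)); // false
    }
}
```

A check like this only reports whether the input conforms; it leaves the input unchanged, which is why the description above still recommends passing the HTML through the cleaner afterwards.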
Copyright © 2015-2024 StreamScape Technologies. All rights reserved.