Usage and code examples of the org.htmlparser.Parser.setLexer() method

x33g5p2x, reposted on 2022-01-26 in Other

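This article collects code examples of the Java method org.htmlparser.Parser.setLexer() and shows how Parser.setLexer() is used in practice. The examples come mainly from platforms such as GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a useful reference. Details of the Parser.setLexer() method are as follows:
Package: org.htmlparser
Class name: Parser
Method name: setLexer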

Introduction to Parser.setLexer

Set the lexer for this parser. The current NodeFactory is transferred to (set on) the given lexer, since the lexer owns the node factory object. It does not adjust the feedback object. Trying to set the lexer to null is a no-op.
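The behavior described above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: SketchParser, SketchLexer and SketchNodeFactory are hypothetical stand-ins written for this example, not the real org.htmlparser classes. They model only three points from the description: the lexer owns the node factory, setLexer carries the current factory over to the new lexer, and a null argument is ignored.

```java
// Hypothetical stand-ins for illustration only; NOT the real org.htmlparser classes.
class SketchNodeFactory {
    final String name;

    SketchNodeFactory(String name) {
        this.name = name;
    }
}

class SketchLexer {
    // The lexer, not the parser, holds the node factory.
    private SketchNodeFactory nodeFactory;

    SketchNodeFactory getNodeFactory() {
        return nodeFactory;
    }

    void setNodeFactory(SketchNodeFactory factory) {
        this.nodeFactory = factory;
    }
}

class SketchParser {
    private SketchLexer lexer = new SketchLexer();

    SketchLexer getLexer() {
        return lexer;
    }

    // The parser keeps no factory of its own; it delegates to its lexer.
    SketchNodeFactory getNodeFactory() {
        return lexer.getNodeFactory();
    }

    void setNodeFactory(SketchNodeFactory factory) {
        lexer.setNodeFactory(factory);
    }

    void setLexer(SketchLexer newLexer) {
        if (newLexer == null) {
            return; // modeled as a no-op, as the description states
        }
        // Assumption of this sketch: only a non-null current factory is moved.
        SketchNodeFactory current = lexer.getNodeFactory();
        if (current != null) {
            newLexer.setNodeFactory(current);
        }
        lexer = newLexer;
    }
}

class SetLexerSketch {
    public static void main(String[] args) {
        SketchParser parser = new SketchParser();
        SketchNodeFactory factory = new SketchNodeFactory("custom");
        parser.setNodeFactory(factory);

        SketchLexer replacement = new SketchLexer();
        parser.setLexer(replacement);
        // The factory set before the swap is now held by the new lexer.
        System.out.println(replacement.getNodeFactory() == factory);

        parser.setLexer(null);
        // The null call changed nothing: the parser still uses the replacement.
        System.out.println(parser.getLexer() == replacement);
    }
}
```

In this model, a factory configured on the parser survives a lexer swap, which is why callers can configure the factory either before or after calling setLexer.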

Code examples

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Set the connection for this parser.
 * This method creates a new <code>Lexer</code> reading from the connection.
 * @param connection A fully conditioned connection. The connect()
 * method will be called so it need not be connected yet.
 * @exception ParserException if the character set specified in the
 * HTTP header is not supported, or an i/o exception occurs creating the
 * lexer.
 * @see #setLexer
 */
public void setConnection (URLConnection connection)
  throws
    ParserException
{
  if (null == connection)
    throw new IllegalArgumentException ("connection cannot be null");
  setLexer (new Lexer (connection));
}

Code example source: origin: org.htmlparser/htmlparser

/**
 * Set the connection for this parser.
 * This method creates a new <code>Lexer</code> reading from the connection.
 * @param connection A fully conditioned connection. The connect()
 * method will be called so it need not be connected yet.
 * @exception ParserException if the character set specified in the
 * HTTP header is not supported, or an i/o exception occurs creating the
 * lexer.
 * @see #setLexer
 * @see #getConnection
 * @exception IllegalArgumentException if <code>connection</code> is <code>null</code>.
 * @exception ParserException if a problem occurs in connecting.
 */
public void setConnection (URLConnection connection)
  throws
    ParserException
{
  if (null == connection)
    throw new IllegalArgumentException ("connection cannot be null");
  setLexer (new Lexer (connection));
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Initializes the parser with the given input HTML String.
 * @param inputHTML the input HTML that is to be parsed.
 * @throws ParserException If an error occurs in setting up the
 * underlying Lexer.
 */
public void setInputHTML (String inputHTML)
  throws
    ParserException
{
  if (null == inputHTML)
    throw new IllegalArgumentException ("html cannot be null");
  if (!"".equals (inputHTML))
    setLexer (new Lexer (new Page (inputHTML)));
}

Code example source: origin: org.htmlparser/htmlparser

/**
 * Initializes the parser with the given input HTML String.
 * @param inputHTML the input HTML that is to be parsed.
 * @throws ParserException If an error occurs in setting up the
 * underlying Lexer.
 * @exception IllegalArgumentException if <code>inputHTML</code> is <code>null</code>.
 */
public void setInputHTML (String inputHTML)
  throws
    ParserException
{
  if (null == inputHTML)
    throw new IllegalArgumentException ("html cannot be null");
  if (!"".equals (inputHTML))
    setLexer (new Lexer (new Page (inputHTML)));
}
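The two setInputHTML variants above share a guard worth noting: a null argument is rejected, while an empty string is silently ignored, so the previously configured lexer stays in place. The following is a minimal sketch of that control flow only. InputGuardSketch is a hypothetical class written for this example, and its String field merely stands in for the lexer that the real method would create with setLexer (new Lexer (new Page (inputHTML))).

```java
// Hypothetical model of the guard logic only; NOT the real org.htmlparser.Parser.
class InputGuardSketch {
    // Stands in for "the input of the currently configured lexer".
    private String source = "<initial>";

    String getSource() {
        return source;
    }

    void setInputHTML(String inputHTML) {
        if (inputHTML == null) {
            throw new IllegalArgumentException("html cannot be null");
        }
        if (!inputHTML.isEmpty()) {
            // Stands in for setLexer (new Lexer (new Page (inputHTML))).
            source = inputHTML;
        }
        // An empty string falls through: the previous state is kept.
    }
}

class InputGuardDemo {
    public static void main(String[] args) {
        InputGuardSketch sketch = new InputGuardSketch();

        sketch.setInputHTML("");
        System.out.println(sketch.getSource()); // still "<initial>"

        sketch.setInputHTML("<p>hi</p>");
        System.out.println(sketch.getSource()); // now "<p>hi</p>"

        try {
            sketch.setInputHTML(null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // "html cannot be null"
        }
    }
}
```

A caller that reuses one parser for several inputs may therefore want to check for empty strings itself, since in this model an empty input leaves the earlier content active.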

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Construct a parser using the provided lexer and feedback object.
 * This would be used to create a parser for special cases where the
 * normal creation of a lexer on a URLConnection needs to be customized.
 * @param lexer The lexer to draw characters from.
 * @param fb The object to use when information,
 * warning and error messages are produced. If <em>null</em> no feedback
 * is provided.
 */
public Parser (Lexer lexer, ParserFeedback fb)
{
  setFeedback (fb);
  if (null == lexer)
    throw new IllegalArgumentException ("lexer cannot be null");
  setLexer (lexer);
  setNodeFactory (new PrototypicalNodeFactory ());
}

Code example source: origin: org.htmlparser/htmlparser

/**
 * Construct a parser using the provided lexer and feedback object.
 * This would be used to create a parser for special cases where the
 * normal creation of a lexer on a URLConnection needs to be customized.
 * @param lexer The lexer to draw characters from.
 * @param fb The object to use when information,
 * warning and error messages are produced. If <em>null</em> no feedback
 * is provided.
 */
public Parser (Lexer lexer, ParserFeedback fb)
{
  setFeedback (fb);
  setLexer (lexer);
  setNodeFactory (new PrototypicalNodeFactory ());
}

Code example source: origin: org.htmlparser/htmlparser

setLexer (new Lexer (new Page (resource)));
else
  setLexer (new Lexer (getConnectionManager ().openConnection (resource)));

Code example source: origin: org.htmlparser/htmlparser

/**
 * Create a Parser that takes a String as input (instead of a URL or a string representing the URL location).
 * <BR>The string will be parsed as if it were a file.
 * @param input The input string.
 * @return The Parser with the string as its input stream.
 */
public static Parser createParserParsingAnInputString (String input)
  throws ParserException, UnsupportedEncodingException
{
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  Page page = new Page(input);
  lexer.setPage(page);
  parser.setLexer(lexer);

  return parser;
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Create a Parser that takes a String as input (instead of a URL or a string representing the URL location).
 * <BR>The string will be parsed as if it were a file.
 * @param input The input string.
 * @return The Parser with the string as its input stream.
 */
public static Parser createParserParsingAnInputString (String input)
  throws ParserException, UnsupportedEncodingException
{
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  Page page = new Page(input);
  lexer.setPage(page);
  parser.setLexer(lexer);

  return parser;
}

Code example source: origin: com.bbossgroups.pdp/pdp-cms

parser.setLexer(lexer);
if(CMSUtil.autoCloseHtmlTag())

Code example source: origin: org.opencms/opencms-core

/**
 * @see org.opencms.util.I_CmsHtmlNodeVisitor#process(java.lang.String, java.lang.String)
 */
public String process(String html, String encoding) throws ParserException {
  m_result = new StringBuffer();
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  // initialize the page with the given char set
  Page page = new Page(html, encoding);
  lexer.setPage(page);
  parser.setLexer(lexer);
  if ((m_noAutoCloseTags != null) && (m_noAutoCloseTags.size() > 0)) {
    // Degrade Composite tags that do have children in the DOM tree
    // to simple single tags: This allows to finish this tag with opened HTML tags without the effect
    // that html parser will generate the closing tags.
    PrototypicalNodeFactory factory = configureNoAutoCorrectionTags();
    lexer.setNodeFactory(factory);
  }
  // process the page using the given visitor
  parser.visitAllNodesWith(this);
  // return the result
  return getResult();
}

Code example source: origin: org.opencms/opencms-solr

/**
 * @see org.opencms.util.I_CmsHtmlNodeVisitor#process(java.lang.String, java.lang.String)
 */
public String process(String html, String encoding) throws ParserException {
  m_result = new StringBuffer();
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  // initialize the page with the given char set
  Page page = new Page(html, encoding);
  lexer.setPage(page);
  parser.setLexer(lexer);
  if (m_noAutoCloseTags != null && m_noAutoCloseTags.size() > 0) {
    // Degrade Composite tags that do have children in the DOM tree 
    // to simple single tags: This allows to finish this tag with opened HTML tags without the effect 
    // that html parser will generate the closing tags. 
    PrototypicalNodeFactory factory = configureNoAutoCorrectionTags();
    lexer.setNodeFactory(factory);
  }
  // process the page using the given visitor
  parser.visitAllNodesWith(this);
  // return the result
  return getResult();
}

Code example source: origin: org.opencms/org.opencms.workplace.tools.content

Page page = new Page(html, encoding);
lexer.setPage(page);
parser.setLexer(lexer);

Code example source: origin: org.opencms/opencms-core

/**
 * Extract the text from a HTML page.<p>
 *
 * @param in the html content input stream
 * @param encoding the encoding of the content
 *
 * @return the extracted text from the page
 * @throws ParserException if the parsing of the HTML failed
 * @throws UnsupportedEncodingException if the given encoding is not supported
 */
public static String extractText(InputStream in, String encoding)
throws ParserException, UnsupportedEncodingException {
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  Page page = new Page(in, encoding);
  lexer.setPage(page);
  parser.setLexer(lexer);
  StringBean stringBean = new StringBean();
  parser.visitAllNodesWith(stringBean);
  String result = stringBean.getStrings();
  return result == null ? "" : result;
}

Code example source: origin: com.bbossgroups.pdp/pdp-cms

parser.setLexer(lexer);

Code example source: origin: com.bbossgroups.pdp/pdp-cms

/**
 * Extract the text from a HTML page.<p>
 *
 * @param in the html content input stream
 * @param encoding the encoding of the content
 *
 * @return the extracted text from the page
 * @throws ParserException if the parsing of the HTML failed
 * @throws UnsupportedEncodingException if the given encoding is not supported
 */
public static String extractText(InputStream in, String encoding)
throws ParserException, UnsupportedEncodingException {
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  Page page = new Page(in, encoding);
  lexer.setPage(page);
  parser.setLexer(lexer);
  StringBean stringBean = new StringBean();
  parser.visitAllNodesWith(stringBean);
  return stringBean.getStrings();
}

Code example source: origin: org.opencms/opencms-solr

/**
 * Extract the text from a HTML page.<p>
 *
 * @param in the html content input stream
 * @param encoding the encoding of the content
 *
 * @return the extracted text from the page
 * @throws ParserException if the parsing of the HTML failed
 * @throws UnsupportedEncodingException if the given encoding is not supported
 */
public static String extractText(InputStream in, String encoding)
throws ParserException, UnsupportedEncodingException {
  Parser parser = new Parser();
  Lexer lexer = new Lexer();
  Page page = new Page(in, encoding);
  lexer.setPage(page);
  parser.setLexer(lexer);
  StringBean stringBean = new StringBean();
  parser.visitAllNodesWith(stringBean);
  String result = stringBean.getStrings();
  return result == null ? "" : result;
}

Code example source: origin: org.opencms/opencms-core

Page page = new Page(content);
lexer.setPage(page);
parser.setLexer(lexer);

Code example source: origin: org.opencms/opencms-solr

Page page = new Page(content);
lexer.setPage(page);
parser.setLexer(lexer);
