Usage and code examples for the org.apache.commons.io.input.XmlStreamReader class

Reposted by x33g5p2x on 2022-02-03 under "Other"

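This article collects code examples for the Java class org.apache.commons.io.input.XmlStreamReader and shows how the class is used. The examples come mainly from platforms such as GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should be useful as a reference. Details of the XmlStreamReader class are as follows:
Package path: org.apache.commons.io.input.XmlStreamReader
Class name: XmlStreamReader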

About XmlStreamReader

Character stream that handles all the necessary voodoo to figure out the charset encoding of the XML document within the stream.

IMPORTANT: This class is not related in any way to the org.xml.sax.XMLReader. This one IS a character stream.

All this has to be done without consuming characters from the stream; otherwise the XML parser will not recognize the document as valid XML. This is not 100% true, but it's close enough (the UTF-8 BOM is not handled by all parsers right now; XmlStreamReader handles it, so things work in all parsers).
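
The "without consuming characters" requirement can be illustrated with plain JDK classes. The sketch below is not the Commons IO implementation; `BomSniffer`, `Result`, and `sniff` are hypothetical names chosen for this example. It reads up to four bytes, compares them with the well-known byte order marks, and pushes back every byte that is not part of a BOM so the XML content reaches the parser intact.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of Commons IO.
final class BomSniffer {

    /** Detected charset name (null when no BOM was found) plus a stream positioned after the BOM. */
    static final class Result {
        final String charsetName;
        final InputStream stream;

        Result(String charsetName, InputStream stream) {
            this.charsetName = charsetName;
            this.stream = stream;
        }
    }

    static Result sniff(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 4);
        byte[] head = new byte[4];
        int n = 0;
        while (n < head.length) {
            int r = pb.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than four bytes
            }
            n += r;
        }
        // Unsigned byte values; -1 marks positions past the end of the stream.
        int[] b = new int[4];
        for (int i = 0; i < 4; i++) {
            b[i] = i < n ? head[i] & 0xFF : -1;
        }
        String name = null;
        int bomLen = 0;
        // Check 4-byte marks first: the UTF-32LE mark begins with the UTF-16LE mark.
        if (b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF) {
            name = "UTF-32BE"; bomLen = 4;
        } else if (b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00) {
            name = "UTF-32LE"; bomLen = 4;
        } else if (b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) {
            name = "UTF-8"; bomLen = 3;
        } else if (b[0] == 0xFE && b[1] == 0xFF) {
            name = "UTF-16BE"; bomLen = 2;
        } else if (b[0] == 0xFF && b[1] == 0xFE) {
            name = "UTF-16LE"; bomLen = 2;
        }
        // Return the non-BOM bytes to the stream so nothing of the document is lost.
        if (n > bomLen) {
            pb.unread(head, bomLen, n - bomLen);
        }
        return new Result(name, pb);
    }
}
```

A caller would pass `result.stream` on to the next detection step or to the parser; when `charsetName` is null, no BOM was present and the stream is unchanged.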

The XmlStreamReader class handles the charset encoding of XML documents in Files, raw streams and HTTP streams by offering a wide set of constructors.

By default the charset encoding detection is lenient; the constructor with the lenient flag can be used for strict detection (following the HTTP MIME and XML specifications). All this is nicely explained by Mark Pilgrim in his blog post, Determining the character encoding of a feed.

Originally developed for ROME under Apache License 2.0.

Code examples

Code example source: origin: commons-io/commons-io

protected void _testHttpLenient(final String cT, final String bomEnc, final String streamEnc,
                final String prologEnc, final String shouldbe) throws Exception {
  final InputStream is = getXmlStream(bomEnc,
      prologEnc == null ? XML2 : XML3, streamEnc, prologEnc);
  final XmlStreamReader xmlReader = new XmlStreamReader(is, cT, true);
  assertEquals(shouldbe, xmlReader.getEncoding());
  xmlReader.close();
}

Code example source: origin: commons-io/commons-io

// Excerpt: some lines of the original method are omitted between these statements.
final String cTMime = getContentTypeMime(httpContentType);
final String cTEnc  = getContentTypeEncoding(httpContentType);
final boolean appXml  = isAppXml(cTMime);
final boolean textXml = isTextXml(cTMime);
// ...
    return calculateRawEncoding(bomEnc, xmlGuessEnc, xmlEnc);
  } else {
    return defaultEncoding == null ? US_ASCII : defaultEncoding;
// ...

Code example source: origin: commons-io/commons-io

/**
 * Process a HTTP stream.
 *
 * @param bom BOMInputStream to detect byte order marks
 * @param pis BOMInputStream to guess XML encoding
 * @param httpContentType The HTTP content type
 * @param lenient indicates if the charset encoding detection should be
 *        relaxed.
 * @return the encoding to be used
 * @throws IOException thrown if there is a problem reading the stream.
 */
private String doHttpStream(final BOMInputStream bom, final BOMInputStream pis, final String httpContentType,
    final boolean lenient) throws IOException {
  final String bomEnc      = bom.getBOMCharsetName();
  final String xmlGuessEnc = pis.getBOMCharsetName();
  final String xmlEnc = getXmlProlog(pis, xmlGuessEnc);
  try {
    return calculateHttpEncoding(httpContentType, bomEnc,
        xmlGuessEnc, xmlEnc, lenient);
  } catch (final XmlStreamReaderException ex) {
    if (lenient) {
      return doLenientDetection(httpContentType, ex);
    } else {
      throw ex;
    }
  }
}

Code example source: origin: commons-io/commons-io

/**
 * Process the raw stream.
 *
 * @param bom BOMInputStream to detect byte order marks
 * @param pis BOMInputStream to guess XML encoding
 * @param lenient indicates if the charset encoding detection should be
 *        relaxed.
 * @return the encoding to be used
 * @throws IOException thrown if there is a problem reading the stream.
 */
private String doRawStream(final BOMInputStream bom, final BOMInputStream pis, final boolean lenient)
    throws IOException {
  final String bomEnc      = bom.getBOMCharsetName();
  final String xmlGuessEnc = pis.getBOMCharsetName();
  final String xmlEnc = getXmlProlog(pis, xmlGuessEnc);
  try {
    return calculateRawEncoding(bomEnc, xmlGuessEnc, xmlEnc);
  } catch (final XmlStreamReaderException ex) {
    if (lenient) {
      return doLenientDetection(null, ex);
    } else {
      throw ex;
    }
  }
}
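
Both methods above call `getXmlProlog` to obtain the encoding declared in the XML prolog, but its body is not shown here. As a rough, JDK-only sketch of how such a lookup can work without consuming the stream: mark the stream, read a bounded prefix, reset, and search the prefix with a regular expression. `PrologSniffer`, `declaredEncoding`, the 1024-byte limit, and the pattern are assumptions made for this illustration, not the library's code.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Commons IO.
final class PrologSniffer {

    // Assumed upper bound on how many bytes are inspected for the XML declaration.
    private static final int PEEK_BYTES = 1024;

    // Matches encoding="..." or encoding='...' inside the leading <?xml ... ?> declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z0-9._:-]+)[\"']");

    /**
     * Returns the declared encoding in upper case, or null when none is declared.
     * The stream is reset afterwards, so the caller still sees the whole document.
     */
    static String declaredEncoding(BufferedInputStream in, String guessedCharset)
            throws IOException {
        in.mark(PEEK_BYTES);
        byte[] buf = new byte[PEEK_BYTES];
        int total = 0;
        int r;
        while (total < buf.length && (r = in.read(buf, total, buf.length - total)) > 0) {
            total += r;
        }
        in.reset(); // rewind: nothing has been consumed from the caller's point of view

        // Decode the prefix with the guessed charset (for example the one implied by a BOM).
        String head = new String(buf, 0, total, Charset.forName(guessedCharset));
        Matcher m = ENCODING.matcher(head);
        return m.find() ? m.group(1).toUpperCase(Locale.ROOT) : null;
    }
}
```

The guessed charset is needed because the declaration itself has to be decoded before it can be read; UTF-8 is a workable guess for any ASCII-compatible encoding.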

Code example source: origin: commons-io/commons-io

// Excerpt: bom, conn, contentType and lenient are defined earlier in the original code.
final BOMInputStream pis = new BOMInputStream(bom, true, XML_GUESS_BYTES);
if (conn instanceof HttpURLConnection || contentType != null) {
  this.encoding = doHttpStream(bom, pis, contentType, lenient);
} else {
  this.encoding = doRawStream(bom, pis, lenient);
}

Code example source: origin: commons-io/commons-io

@Test
public void testReadXmlWithBOMUtf32Be() throws Exception {
  Assume.assumeTrue("JVM and SAX need to support UTF_32BE for this", jvmAndSaxBothSupportCharset("UTF_32BE"));
  final byte[] data = "<?xml version=\"1.0\" encoding=\"UTF-32BE\"?><X/>".getBytes("UTF_32BE");
  parseXml(new BOMInputStream(createUtf32BeDataStream(data, true), ByteOrderMark.UTF_32BE));
  // XML parser does not know what to do with UTF-32, so we wrap the input stream with an XmlStreamReader
  parseXml(new XmlStreamReader(createUtf32BeDataStream(data, true)));
}

Code example source: origin: commons-io/commons-io

@Test
public void testRawContent() throws Exception {
  final String encoding = "UTF-8";
  final String xml = getXML("no-bom", XML3, encoding, encoding);
  final ByteArrayInputStream is = new ByteArrayInputStream(xml.getBytes(encoding));
  final XmlStreamReader xmlReader = new XmlStreamReader(is);
  assertEquals("Check encoding", encoding, xmlReader.getEncoding());
  assertEquals("Check content", xml, IOUtils.toString(xmlReader));
}

Code example source: origin: commons-io/commons-io

protected void _testRawNoBomInvalid(final String encoding) throws Exception {
  final InputStream is = getXmlStream("no-bom", XML3, encoding, encoding);
  try {
    (new XmlStreamReader(is, false)).close();
    fail("It should have failed");
  } catch (final IOException ex) {
    assertTrue(ex.getMessage().contains("Invalid encoding,"));
  }
}

Code example source: origin: commons-io/commons-io

private void checkHttpError(final String msgSuffix, final boolean lenient, final String httpContentType,
    final String bomEnc, final String xmlGuessEnc, final String xmlEnc, final String defaultEncoding) {
  try {
    checkHttpEncoding("XmlStreamReaderException", lenient, httpContentType, bomEnc, xmlGuessEnc, xmlEnc, defaultEncoding);
    fail("Expected XmlStreamReaderException");
  } catch (final XmlStreamReaderException e) {
    assertTrue("Msg Start: " + e.getMessage(), e.getMessage().startsWith("Invalid encoding"));
    assertTrue("Msg End: "   + e.getMessage(), e.getMessage().endsWith(msgSuffix));
    assertEquals("bomEnc",      bomEnc,      e.getBomEncoding());
    assertEquals("xmlGuessEnc", xmlGuessEnc, e.getXmlGuessEncoding());
    assertEquals("xmlEnc",      xmlEnc,      e.getXmlEncoding());
    assertEquals("ContentTypeEncoding", XmlStreamReader.getContentTypeEncoding(httpContentType),
                      e.getContentTypeEncoding());
    assertEquals("ContentTypeMime", XmlStreamReader.getContentTypeMime(httpContentType),
                    e.getContentTypeMime());
  } catch (final Exception e) {
    fail("Expected XmlStreamReaderException, but threw " + e);
  }
}

Code example source: origin: commons-io/commons-io

/**
 * Do lenient detection.
 *
 * @param httpContentType content-type header to use for the resolution of
 *        the charset encoding.
 * @param ex The thrown exception
 * @return the encoding
 * @throws IOException thrown if there is a problem reading the stream.
 */
private String doLenientDetection(String httpContentType,
    XmlStreamReaderException ex) throws IOException {
  if (httpContentType != null && httpContentType.startsWith("text/html")) {
    httpContentType = httpContentType.substring("text/html".length());
    httpContentType = "text/xml" + httpContentType;
    try {
      return calculateHttpEncoding(httpContentType, ex.getBomEncoding(),
          ex.getXmlGuessEncoding(), ex.getXmlEncoding(), true);
    } catch (final XmlStreamReaderException ex2) {
      ex = ex2;
    }
  }
  String encoding = ex.getXmlEncoding();
  if (encoding == null) {
    encoding = ex.getContentTypeEncoding();
  }
  if (encoding == null) {
    encoding = defaultEncoding == null ? UTF_8 : defaultEncoding;
  }
  return encoding;
}
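
The fallback order in `doLenientDetection` can be restated as two small pure functions, which makes it easier to test in isolation. This is only a simplified restatement of the snippet above: `LenientFallback`, `htmlAsXml`, and `pick` are hypothetical names, and the second attempt at `calculateHttpEncoding` with the rewritten content type is left out.

```java
// Hypothetical restatement of the lenient fallback shown above; not library code.
final class LenientFallback {

    /** Treats a text/html content type as text/xml, keeping any parameters such as charset. */
    static String htmlAsXml(String httpContentType) {
        if (httpContentType != null && httpContentType.startsWith("text/html")) {
            return "text/xml" + httpContentType.substring("text/html".length());
        }
        return httpContentType;
    }

    /**
     * Picks the first available candidate: the encoding declared in the XML prolog,
     * then the charset from the Content-Type header, then the configured default,
     * and finally UTF-8.
     */
    static String pick(String xmlPrologEnc, String contentTypeEnc, String defaultEnc) {
        if (xmlPrologEnc != null) {
            return xmlPrologEnc;
        }
        if (contentTypeEnc != null) {
            return contentTypeEnc;
        }
        return defaultEnc != null ? defaultEnc : "UTF-8";
    }
}
```

For example, `LenientFallback.pick(null, "ISO-8859-1", null)` yields `"ISO-8859-1"`, and with all three arguments null the result is `"UTF-8"`.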

Code example source: origin: commons-io/commons-io

final BOMInputStream bom = new BOMInputStream(new BufferedInputStream(is, BUFFER_SIZE), false, BOMS);
final BOMInputStream pis = new BOMInputStream(bom, true, XML_GUESS_BYTES);
this.encoding = doHttpStream(bom, pis, httpContentType, lenient);
this.reader = new InputStreamReader(pis, encoding);

Code example source: origin: commons-io/commons-io

@SuppressWarnings("boxing")
private void checkAppXml(final boolean expected, final String mime) {
  assertEquals("Mime=[" + mime + "]", expected, XmlStreamReader.isAppXml(mime));
}

Code example source: origin: commons-io/commons-io

@SuppressWarnings("boxing")
private void checkTextXml(final boolean expected, final String mime) {
  assertEquals("Mime=[" + mime + "]", expected, XmlStreamReader.isTextXml(mime));
}
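
The two test helpers above exercise `isAppXml` and `isTextXml`, whose bodies are not shown in this article. The sketch below shows one plausible classification, assuming the XML media types registered in RFC 3023 (including the `+xml` suffix convention). `XmlMimeKind` and its method names are hypothetical, and the exact set of types the library accepts may differ.

```java
// Hypothetical classifier based on the RFC 3023 XML media types; not library code.
final class XmlMimeKind {

    /** True for application/xml, its RFC 3023 siblings, and any application/*+xml type. */
    static boolean isApplicationXml(String mime) {
        return mime != null
                && (mime.equals("application/xml")
                    || mime.equals("application/xml-dtd")
                    || mime.equals("application/xml-external-parsed-entity")
                    || (mime.startsWith("application/") && mime.endsWith("+xml")));
    }

    /** True for text/xml, text/xml-external-parsed-entity, and any text/*+xml type. */
    static boolean isTextXml(String mime) {
        return mime != null
                && (mime.equals("text/xml")
                    || mime.equals("text/xml-external-parsed-entity")
                    || (mime.startsWith("text/") && mime.endsWith("+xml")));
    }
}
```

The distinction matters when the Content-Type header carries no charset: as the `calculateHttpEncoding` excerpt earlier shows, application XML types fall back to the raw-stream rules, while the other branch falls back to US-ASCII unless a default encoding is set.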

Code example source: origin: commons-io/commons-io

private void checkContentTypeMime(final String expected, final String httpContentType) {
  assertEquals("ContentTypeMime=[" + httpContentType + "]", expected, XmlStreamReader.getContentTypeMime(httpContentType));
}

Code example source: origin: commons-io/commons-io

private void checkContentTypeEncoding(final String expected, final String httpContentType) {
  assertEquals("ContentTypeEncoding=[" + httpContentType + "]", expected, XmlStreamReader.getContentTypeEncoding(httpContentType));
}
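
`getContentTypeMime` and `getContentTypeEncoding` split an HTTP Content-Type header into its MIME type and its charset parameter. A minimal JDK-only sketch of that split follows; `ContentTypeParts`, its method names, the regular expression, and the upper-casing of the charset are assumptions for illustration and may not match the library's behavior in edge cases.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical header parser; not part of Commons IO.
final class ContentTypeParts {

    // charset=value, optionally quoted, terminated by ';', whitespace, or a quote.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^;\\s\"']+)[\"']?", Pattern.CASE_INSENSITIVE);

    /** Returns the part before the first ';', trimmed, or null for a null header. */
    static String mime(String header) {
        if (header == null) {
            return null;
        }
        int semi = header.indexOf(';');
        return (semi >= 0 ? header.substring(0, semi) : header).trim();
    }

    /** Returns the charset parameter in upper case, or null when the header has none. */
    static String charset(String header) {
        if (header == null) {
            return null;
        }
        int semi = header.indexOf(';');
        if (semi < 0) {
            return null; // no parameters at all
        }
        Matcher m = CHARSET.matcher(header.substring(semi + 1));
        return m.find() ? m.group(1).toUpperCase(Locale.ROOT) : null;
    }
}
```

With the header `text/xml; charset=utf-8`, `mime` returns `text/xml` and `charset` returns `UTF-8`; with `application/xml` alone, `charset` returns null.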

Code example source: origin: commons-io/commons-io

@Test
public void testReadXmlWithBOMUtf32Le() throws Exception {
  Assume.assumeTrue("JVM and SAX need to support UTF_32LE for this", jvmAndSaxBothSupportCharset("UTF_32LE"));
  final byte[] data = "<?xml version=\"1.0\" encoding=\"UTF-32LE\"?><X/>".getBytes("UTF_32LE");
  parseXml(new BOMInputStream(createUtf32LeDataStream(data, true), ByteOrderMark.UTF_32LE));
  // XML parser does not know what to do with UTF-32, so we wrap the input stream with an XmlStreamReader
  parseXml(new XmlStreamReader(createUtf32LeDataStream(data, true)));
}

Code example source: origin: commons-io/commons-io

@Test
public void testHttpContent() throws Exception {
  final String encoding = "UTF-8";
  final String xml = getXML("no-bom", XML3, encoding, encoding);
  final ByteArrayInputStream is = new ByteArrayInputStream(xml.getBytes(encoding));
  final XmlStreamReader xmlReader = new XmlStreamReader(is, encoding);
  assertEquals("Check encoding", encoding, xmlReader.getEncoding());
  assertEquals("Check content", xml, IOUtils.toString(xmlReader));
}

Code example source: origin: org.onosproject/onlab-thirdparty

/**
 * Process the raw stream.
 *
 * @param bom BOMInputStream to detect byte order marks
 * @param pis BOMInputStream to guess XML encoding
 * @param lenient indicates if the charset encoding detection should be
 *        relaxed.
 * @return the encoding to be used
 * @throws IOException thrown if there is a problem reading the stream.
 */
private String doRawStream(BOMInputStream bom, BOMInputStream pis, boolean lenient)
    throws IOException {
  String bomEnc      = bom.getBOMCharsetName();
  String xmlGuessEnc = pis.getBOMCharsetName();
  String xmlEnc = getXmlProlog(pis, xmlGuessEnc);
  try {
    return calculateRawEncoding(bomEnc, xmlGuessEnc, xmlEnc);
  } catch (XmlStreamReaderException ex) {
    if (lenient) {
      return doLenientDetection(null, ex);
    } else {
      throw ex;
    }
  }
}

Code example source: origin: commons-io/commons-io

protected void _testHttpInvalid(final String cT, final String bomEnc, final String streamEnc,
                final String prologEnc) throws Exception {
  final InputStream is = getXmlStream(bomEnc,
      prologEnc == null ? XML2 : XML3, streamEnc, prologEnc);
  try {
    (new XmlStreamReader(is, cT, false)).close();
    fail("It should have failed for HTTP Content-type " + cT + ", BOM "
        + bomEnc + ", streamEnc " + streamEnc + " and prologEnc "
        + prologEnc);
  } catch (final IOException ex) {
    assertTrue(ex.getMessage().contains("Invalid encoding,"));
  }
}

Code example source: origin: org.onosproject/onlab-thirdparty

// Excerpt: bom, conn, contentType and lenient are defined earlier in the original code.
BOMInputStream pis = new BOMInputStream(bom, true, XML_GUESS_BYTES);
if (conn instanceof HttpURLConnection || contentType != null) {
  this.encoding = doHttpStream(bom, pis, contentType, lenient);
} else {
  this.encoding = doRawStream(bom, pis, lenient);
}
