Usage and code examples of the org.apache.hadoop.io.Text.bytesToCodePoint() method

Reposted by x33g5p2x on 2022-01-29, category: Other

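This article collects Java code examples for the org.apache.hadoop.io.Text.bytesToCodePoint() method and shows how Text.bytesToCodePoint() is used in practice. The examples come mainly from GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as useful references. Details of the Text.bytesToCodePoint() method:
Package path: org.apache.hadoop.io.Text
Class name: Text
Method name: bytesToCodePoint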

About Text.bytesToCodePoint

Returns the next code point at the current position in the buffer. The buffer's position will be incremented. Any mark set on this buffer will be changed by this method!
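
The behavior described above (decode one code point, advance the position, overwrite the mark) can be illustrated without Hadoop on the classpath. The following is a minimal standalone sketch, not Hadoop's implementation: `CodePointSketch` and `nextCodePoint` are hypothetical names, and the sketch assumes standard 1 to 4 byte UTF-8 and returns -1 for anything else.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in that mimics the documented behavior; not Hadoop's code. */
class CodePointSketch {

    /**
     * Decodes one UTF-8 code point at the buffer's current position and
     * advances the position past it. Returns -1 without advancing if the
     * first byte is a continuation byte or not a 1 to 4 byte lead byte
     * (an assumption of this sketch).
     */
    static int nextCodePoint(ByteBuffer buf) {
        buf.mark();                        // overwrites any mark the caller had set
        int lead = buf.get() & 0xFF;
        int extra;                         // number of continuation bytes to read
        int cp;                            // payload bits taken from the lead byte
        if (lead < 0x80) {
            extra = 0; cp = lead;
        } else if (lead < 0xC0) {
            buf.reset(); return -1;        // continuation byte, not a lead byte
        } else if (lead < 0xE0) {
            extra = 1; cp = lead & 0x1F;
        } else if (lead < 0xF0) {
            extra = 2; cp = lead & 0x0F;
        } else if (lead < 0xF8) {
            extra = 3; cp = lead & 0x07;
        } else {
            buf.reset(); return -1;        // outside standard UTF-8
        }
        for (int i = 0; i < extra; i++) {
            // get() throws BufferUnderflowException if the sequence is truncated
            cp = (cp << 6) | (buf.get() & 0x3F);
        }
        return cp;
    }

    public static void main(String[] args) {
        // "a", the euro sign (3 bytes) and U+1F600 (4 bytes)
        byte[] utf8 = "a\u20AC\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.wrap(utf8);
        while (buf.hasRemaining()) {
            int cp = nextCodePoint(buf);
            System.out.printf("U+%04X, position now %d%n", cp, buf.position());
        }
        // prints:
        // U+0061, position now 1
        // U+20AC, position now 4
        // U+1F600, position now 8
    }
}
```

Because the position moves with every call, a caller that needs to re-read from the same offset should work on a `duplicate()` or `slice()` of the buffer rather than rely on a previously set mark.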

Code examples

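Code example source: origin: org.apache.hadoop/hadoop-common (the same method also appears in ch.cern.hadoop/hadoop-common, io.prestosql.hadoop/hadoop-apache, io.hops/hadoop-common, com.facebook.hadoop/hadoop-core, com.github.jiayuhan-it/hadoop-common, and org.jvnet.hudson.hadoop/hadoop-core)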

/**
 * Returns the Unicode Scalar Value (32-bit integer value)
 * for the character at <code>position</code>. Note that this
 * method avoids using the converter or doing String instantiation
 * @return the Unicode scalar value at position or -1
 *          if the position is invalid or points to a
 *          trailing byte
 */
public int charAt(int position) {
 if (position > this.length) return -1; // too long
 if (position < 0) return -1; // duh.
  
 ByteBuffer bb = (ByteBuffer)ByteBuffer.wrap(bytes).position(position);
 return bytesToCodePoint(bb.slice());
}

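Code example source: origin: apache/hive (the same method also appears in apache/drill and com.facebook.presto.hive/hive-apache)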

/**
 * Translates the input string based on {@link #replacementMap} and {@link #deletionSet} and
 * returns the translated string.
 *
 * @param input
 *          input string to perform the translation on
 * @return translated string
 */
private String processInput(Text input) {
 StringBuilder resultBuilder = new StringBuilder();
 // Obtain the byte buffer from the input string so we can traverse it code point by code point
 ByteBuffer inputBytes = ByteBuffer.wrap(input.getBytes(), 0, input.getLength());
 // Traverse the byte buffer containing the input string one code point at a time
 while (inputBytes.hasRemaining()) {
  int inputCodePoint = Text.bytesToCodePoint(inputBytes);
  // If the code point exists in deletion set, no need to emit out anything for this code point.
  // Continue on to the next code point
  if (deletionSet.contains(inputCodePoint)) {
   continue;
  }
  Integer replacementCodePoint = replacementMap.get(inputCodePoint);
  // If a replacement exists for this code point, emit out the replacement and append it to the
  // output string. If no such replacement exists, emit out the original input code point
  char[] charArray = Character.toChars((replacementCodePoint != null) ? replacementCodePoint
    : inputCodePoint);
  resultBuilder.append(charArray);
 }
 String resultString = resultBuilder.toString();
 return resultString;
}

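Code example source: origin: apache/hive (the same fragment also appears in apache/drill and com.facebook.presto.hive/hive-apache)

int fromCodePoint = Text.bytesToCodePoint(fromBytes);
int toCodePoint = Text.bytesToCodePoint(toBytes);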
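
The fragment above reads one code point from each of two buffers, and `processInput` consults a `replacementMap` and a `deletionSet`. The sketch below shows one way such tables could be filled and then applied. It is an illustration in plain Java using `String.codePoints()` instead of `Text`, and the pairing rule (pair by position, first occurrence wins, unpaired `from` code points are deleted) is an assumption, not something stated by the quoted code. `TranslateSketch` and `translate` are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of a code-point translation table; not the Hive source. */
class TranslateSketch {

    static String translate(String input, String from, String to) {
        Map<Integer, Integer> replacementMap = new HashMap<>();
        Set<Integer> deletionSet = new HashSet<>();

        int[] fromCps = from.codePoints().toArray();
        int[] toCps = to.codePoints().toArray();
        for (int i = 0; i < fromCps.length; i++) {
            int fromCodePoint = fromCps[i];
            // Assumption: the first rule seen for a code point wins
            if (replacementMap.containsKey(fromCodePoint)
                    || deletionSet.contains(fromCodePoint)) {
                continue;
            }
            if (i < toCps.length) {
                replacementMap.put(fromCodePoint, toCps[i]);  // paired by position
            } else {
                deletionSet.add(fromCodePoint);               // no partner: delete
            }
        }

        // Same traversal idea as processInput: one code point at a time
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp -> {
            if (deletionSet.contains(cp)) {
                return;  // emit nothing for deleted code points
            }
            out.appendCodePoint(replacementMap.getOrDefault(cp, cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("hello", "el", "ip"));  // prints "hippo"
        System.out.println(translate("hello", "lo", "x"));   // prints "hexx"
    }
}
```

Working on code points rather than `char` values keeps supplementary characters intact, which is the same reason the quoted code calls `Character.toChars` before appending.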

Code example source: origin: org.apache.orc/orc-core

/**
 * Get the next code point from the ByteBuffer. Moves the position in the
 * ByteBuffer forward to the next code point.
 * @param param the source of bytes
 * @param defaultValue if there are no bytes left, use this value
 * @return the code point that was found at the front of the buffer.
 */
static int getNextCodepoint(ByteBuffer param, int defaultValue) {
 if (param.remaining() == 0) {
  return defaultValue;
 } else {
  return Text.bytesToCodePoint(param);
 }
}

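Code example source: origin: ch.cern.hadoop/hadoop-common (the same test also appears in com.github.jiayuhan-it/hadoop-common)

public void testbytesToCodePointWithInvalidUTF() {
 try {
  // A lone byte -2 (0xFE) is not a complete sequence, so decoding must run out of input
  Text.bytesToCodePoint(ByteBuffer.wrap(new byte[] {-2}));
  fail("testbytesToCodePointWithInvalidUTF error unexp exception !!!");
 } catch (BufferUnderflowException ex) {
  // expected
 } catch (Exception e) {
  fail("testbytesToCodePointWithInvalidUTF error unexp exception !!!");
 }
}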

Code example source: origin: ch.cern.hadoop/hadoop-common (the same test also appears in com.github.jiayuhan-it/hadoop-common)

/**
 * Tests that {@code Text.bytesToCodePoint(bytes)} consumes all six bytes
 * without throwing {@code BufferUnderflowException}.
 */
public void testBytesToCodePoint() {
 try {
  ByteBuffer bytes = ByteBuffer.wrap(new byte[] {-2, 45, 23, 12, 76, 89});
  Text.bytesToCodePoint(bytes);
  assertTrue("testBytesToCodePoint error !!!", bytes.position() == 6);
 } catch (BufferUnderflowException ex) {
  fail("testBytesToCodePoint unexp exception");
 } catch (Exception e) {
  fail("testBytesToCodePoint unexp exception");
 }
}
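
Taken together, the two tests above show that a lead byte of -2 (0xFE) makes the method read six bytes, and that a shorter buffer ends in `BufferUnderflowException`. A caller that prefers to avoid the exception could check the buffer first. The sketch below is one hypothetical way to do that in plain Java: `UnderflowGuardSketch`, `sequenceLength`, and `hasCompleteSequence` are invented names, and the lengths assigned to lead bytes 0xF8 to 0xFF are an assumption chosen to agree with the six-byte test, so verify them against the Hadoop version in use.

```java
import java.nio.ByteBuffer;

/** Hypothetical pre-check for truncated input; not part of Hadoop. */
class UnderflowGuardSketch {

    /**
     * Number of bytes a sequence needs, judged from its lead byte alone.
     * Returns -1 for a continuation byte. The 5 and 6 byte cases are an
     * assumption matching the legacy lengths implied by the test above.
     */
    static int sequenceLength(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;
        if (b < 0xC0) return -1;  // continuation byte
        if (b < 0xE0) return 2;
        if (b < 0xF0) return 3;
        if (b < 0xF8) return 4;
        if (b < 0xFC) return 5;   // assumption: legacy 5-byte form
        return 6;                 // assumption: legacy 6-byte form
    }

    /** True if the buffer holds every byte of the sequence at its position. */
    static boolean hasCompleteSequence(ByteBuffer buf) {
        if (!buf.hasRemaining()) {
            return false;
        }
        // Absolute get: peeks at the lead byte without moving the position
        int needed = sequenceLength(buf.get(buf.position()));
        return needed > 0 && buf.remaining() >= needed;
    }

    public static void main(String[] args) {
        ByteBuffer lone = ByteBuffer.wrap(new byte[] {-2});
        ByteBuffer full = ByteBuffer.wrap(new byte[] {-2, 45, 23, 12, 76, 89});
        System.out.println(hasCompleteSequence(lone));  // prints "false"
        System.out.println(hasCompleteSequence(full));  // prints "true"
    }
}
```

The check only compares lengths. It does not validate that the following bytes are continuation bytes, so it guards against underflow but not against malformed input.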

Code example source: origin: org.apache.orc/orc-core

int cp = Text.bytesToCodePoint(sourceBytes);
