Usage and code examples for the org.apache.pig.data.TupleFactory class

x33g5p2x, reposted on 2022-01-30 under Other

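This article collects code examples for the Java class org.apache.pig.data.TupleFactory and shows how the class is used in practice. The examples come mainly from GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a useful reference. Details of the TupleFactory class:
Package path: org.apache.pig.data.TupleFactory
Class name: TupleFactory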

About TupleFactory

A factory to construct tuples. This class is abstract so that users can override the tuple factory if they want to provide their own factory that returns their own tuple implementation. If the property pig.data.tuple.factory.name is set to a class name and pig.data.tuple.factory.jar is set to a URL pointing to a jar that contains that class, then getInstance() creates an instance of the named class using the indicated jar. Otherwise, it creates an instance of DefaultTupleFactory.
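The lookup described above can be sketched with plain JDK classes. This is a simplified stand-in, not Pig's implementation: SketchTupleFactory, DefaultSketchFactory, CustomSketchFactory, and the property name sketch.tuple.factory.name are all hypothetical, tuples are modelled as plain lists, and the jar-loading step driven by pig.data.tuple.factory.jar is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an abstract tuple factory; not Pig's actual class.
abstract class SketchTupleFactory {
    // Hypothetical property name, chosen for this sketch only.
    static final String FACTORY_PROPERTY = "sketch.tuple.factory.name";

    private static SketchTupleFactory instance;

    // Returns the configured factory, or the default one when no class name is set.
    static synchronized SketchTupleFactory getInstance() {
        if (instance == null) {
            String className = System.getProperty(FACTORY_PROPERTY);
            if (className == null) {
                instance = new DefaultSketchFactory();
            } else {
                try {
                    instance = (SketchTupleFactory) Class.forName(className)
                            .getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Cannot create " + className, e);
                }
            }
        }
        return instance;
    }

    // Clears the cached instance so the property is read again (demo helper).
    static synchronized void reset() {
        instance = null;
    }

    // A tuple is modelled as a list of fields, pre-filled with nulls.
    abstract List<Object> newTuple(int size);
}

class DefaultSketchFactory extends SketchTupleFactory {
    @Override
    List<Object> newTuple(int size) {
        List<Object> fields = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            fields.add(null);
        }
        return fields;
    }
}

// A user-supplied factory; here it simply inherits the default behaviour.
class CustomSketchFactory extends DefaultSketchFactory {
}

class FactoryLookupDemo {
    public static void main(String[] args) {
        // No property set: the default factory is used.
        System.out.println(SketchTupleFactory.getInstance().getClass().getSimpleName());

        // Property set: the named class is instantiated reflectively.
        System.setProperty(SketchTupleFactory.FACTORY_PROPERTY,
                CustomSketchFactory.class.getName());
        SketchTupleFactory.reset();
        System.out.println(SketchTupleFactory.getInstance().getClass().getSimpleName());
    }
}
```

Caching the instance means the property is read only on first use, which is why the sketch needs a reset() helper to switch factories within one run.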

Code examples

Code example source: elastic/elasticsearch-hadoop

private PigTuple createTuple(Object obj, ResourceSchema schema) {
  PigTuple tuple = new PigTuple(schema);
  tuple.setTuple(TupleFactory.getInstance().newTuple(obj));
  return tuple;
}

Code example source: elastic/elasticsearch-hadoop

@Override
public Object addToArray(Object array, List<Object> value) {
  return TupleFactory.getInstance().newTupleNoCopy(value);
}
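The example above uses newTupleNoCopy, which keeps the list it is given, whereas newTuple(List) copies it. The practical difference is aliasing, which the following sketch illustrates with plain lists. CopySketch and its two methods are hypothetical stand-ins that only model the copy versus no-copy behaviour; they are not Pig's classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; only models copying versus keeping the caller's list.
class CopySketch {
    // Models a copying constructor: the tuple gets its own list.
    static List<Object> newTuple(List<Object> fields) {
        return new ArrayList<>(fields);
    }

    // Models a no-copy constructor: the tuple keeps the caller's list.
    static List<Object> newTupleNoCopy(List<Object> fields) {
        return fields;
    }

    public static void main(String[] args) {
        List<Object> fields = new ArrayList<>();
        fields.add("a");

        List<Object> copied = newTuple(fields);
        List<Object> shared = newTupleNoCopy(fields);

        // Mutating the caller's list afterwards only affects the shared tuple.
        fields.add("b");
        System.out.println("copied size: " + copied.size());
        System.out.println("shared size: " + shared.size());
    }
}
```

Skipping the copy saves an allocation, but the caller should stop using the list once it has been handed over.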

Code example source: elastic/elasticsearch-hadoop

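// Excerpt: intermediate lines are elided in the source, so i and result are declared elsewhere.
dataMap = reader.getCurrentValue();
// Preallocate one field per map entry.
Tuple tuple = TupleFactory.getInstance().newTuple(dataMap.size());
// ... (lines elided in the source excerpt)
    tuple.set(i, result);
// ... (lines elided in the source excerpt)
  Set<Entry<?, ?>> entrySet = dataMap.entrySet();
  for (Map.Entry entry : entrySet) {
    tuple.set(i++, entry.getValue());
  }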

Code example source: org.apache.pig/pig

@Override
public Tuple exec(Tuple input) throws IOException {
  // Initial is called in the map.
  // we just send the tuple down
  try {
    // input is a bag with one tuple containing
    // the column we are trying to operate on
    DataBag bg = (DataBag) input.get(0);
    if (bg.iterator().hasNext()) {
      return bg.iterator().next();
    } else {
      // make sure that we call the object constructor, not the list constructor
      return tfact.newTuple((Object) null);
    }
  } catch (ExecException e) {
    throw e;
  } catch (Exception e) {
    int errCode = 2106;
    throw new ExecException("Error executing an algebraic function", errCode, PigException.BUG, e);
  }
}
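The (Object) cast in the excerpt above matters because of Java overload resolution: when both an Object overload and a List overload exist, a bare null argument resolves to the more specific List overload. The sketch below reproduces this with a hypothetical stand-in class, OverloadSketch, whose two methods only report which overload was chosen; it is not Pig's TupleFactory.

```java
import java.util.List;

// Hypothetical stand-in with the same pair of overloads; not Pig's TupleFactory.
class OverloadSketch {
    // Models the single-field overload.
    static String newTuple(Object field) {
        return "object overload";
    }

    // Models the list-of-fields overload.
    static String newTuple(List<?> fields) {
        return "list overload";
    }

    public static void main(String[] args) {
        // A bare null fits both overloads; the compiler picks the more specific one.
        System.out.println(newTuple(null));          // prints "list overload"
        // The cast forces the single-field overload, as in the excerpt.
        System.out.println(newTuple((Object) null)); // prints "object overload"
    }
}
```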

Code example source: org.apache.pig/pig

private Tuple createTuple(Tuple[] data) throws ExecException {
  Tuple out = TupleFactory.getInstance().newTuple();
  for (int i = 0; i < data.length; ++i) {
    Tuple t = data[i];
    int size = t.size();
    for (int j = 0; j < size; ++j) {
      out.append(t.get(j));
    }
  }
  return illustratorMarkup(out, out, 0);
}

Code example source: org.apache.pig/pig

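if (input.size() != 2) {
  int errCode = 2107;
  String msg = "DIFF expected two inputs but received " + input.size() + " inputs.";
  throw new ExecException(msg, errCode, PigException.BUG);
}
// ... (lines elided in the source excerpt)
Object o1 = input.get(0);
if (o1 instanceof DataBag) {
  DataBag bag1 = (DataBag)o1;
  // ... (lines elided in the source excerpt, including the declaration of d1)
  Object d2 = input.get(1);
  if (!d1.equals(d2)) {
    // The two values differ: wrap each one in its own single-field tuple.
    output.add(mTupleFactory.newTuple(d1));
    output.add(mTupleFactory.newTuple(d2));
  }
}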

Code example source: org.netpreserve.commons/webarchive-commons

private ArrayList<Tuple> applyView(Tuple inner) {
  // [0] is the JSON. Remaining elements are Strings describing paths
  // into the JSON to "flatten" into a single tuple:
  if(inner == null || inner.size() == 0) {
    return null;
  }
  try {
    JSONObject json = new JSONObject(inner.get(0).toString());
    List<List<String>> matches = view.apply(json);
    if(matches.size() == 0) {
      return null;
    }
    ArrayList<Tuple> results = new ArrayList<Tuple>();
    for(List<String> t : matches) {
      mCacheProtoTuple.addAll(t);
      Tuple tup = mCacheTupleFactory.newTuple(mCacheProtoTuple);
      mCacheProtoTuple.clear();
      results.add(tup);
    }
    return results;
  } catch (JSONException e) {
    LOG.warning("Failed to parse JSON:"+e.getMessage());
  } catch (ExecException e) {
    LOG.warning("ExecException:"+e.getMessage());
  }
  
  return null;
}

Code example source: org.apache.pig/pig

@Override
public Tuple exec(Tuple input) throws IOException {
  try {
    return tfact.newTuple(doTupleWork(input, this));
  } catch (ExecException ee) {
    throw ee;
  } catch (Exception e) {
    int errCode = 2106;
    throw new ExecException("Error executing function on Doubles", errCode, PigException.BUG, e);
  }
}

Code example source: lucidworks/solr-scale-tk

public DataBag exec(Tuple input) throws IOException {
  DataBag outputBag = bagFactory.newDefaultBag();        
  String idBase = (String)input.get(0);        
  for (int k=0; k < numKeys; k++) {
   String key = idBase+k;
   int key_bucket = random.nextInt(maxRandom);
   Tuple next = tupleFactory.newTuple(2);
   next.set(0, key);
   next.set(1, key_bucket);
   outputBag.add(next);
  }
  return outputBag;
}

Code example source: pl.edu.icm.coansys/dc-logic

private DataBag getCategories(List<ClassifCode> classifCodeList) {
  DataBag db = new DefaultDataBag();
  for (ClassifCode code : classifCodeList) {
    for (String co_str : code.getValueList()) {
      db.add(TupleFactory.getInstance().newTuple(co_str));
    }
  }
  return db;
}

Code example source: org.apache.pig/pig

final TupleFactory tf = TupleFactory.getInstance();
Map<String, Object> distMap = (Map<String, Object>) t.get(0);
DataBag partitionList = (DataBag) distMap.get(PartitionSkewedKeys.PARTITION_LIST);
Iterator<Tuple> it = partitionList.iterator();
while (it.hasNext()) {
  Tuple idxTuple = it.next();
  Integer maxIndex = (Integer) idxTuple.get(idxTuple.size() - 1);
  Integer minIndex = (Integer) idxTuple.get(idxTuple.size() - 2);
  // ... (lines elided in the source excerpt)
  // Build the key from every field except the trailing minIndex and maxIndex.
  Tuple keyTuple = tf.newTuple();
  for (int i = 0; i < idxTuple.size() - 2; i++) {
    keyTuple.append(idxTuple.get(i));
  }
  // ... (lines elided in the source excerpt)
}
// ... (lines elided in the source excerpt, including the catch clause that declares e)
log.warn(e.getMessage());

Code example source: com.linkedin.datafu/datafu

@Override
public void accumulate(Tuple arg0) throws IOException
{
 DataBag inputBag = (DataBag)arg0.get(0);
 for (Tuple t : inputBag) {
  Tuple t1 = TupleFactory.getInstance().newTuple(t.getAll());
  t1.append(i);
  outputBag.add(t1);
  if (count % 1000000 == 0) {
   outputBag.spill();
   count = 0;
  }
  i++;
  count++;
 }
}

Code example source: mozilla-metrics/akela

@Override
public Tuple exec(Tuple input) throws IOException {
  if (input == null || input.size() == 0) {
    return null;
  }
  DataBag db = (DataBag) input.get(0);
  Iterator<Tuple> iter = db.iterator();
  Tuple output = tupleFactory.newTuple();
  while (iter.hasNext()) {
    Tuple t = iter.next();
    for (Object o : t.getAll()) {
      output.append(o);
    }
  }
  return output;
}

Code example source: apache/hive

private static Tuple transformToTuple(List<?> objList, HCatSchema hs) throws Exception {
 if (objList == null) {
  return null;
 }
 Tuple t = tupFac.newTuple(objList.size());
 List<HCatFieldSchema> subFields = hs.getFields();
 for (int i = 0; i < subFields.size(); i++) {
  t.set(i, extractPigObject(objList.get(i), subFields.get(i)));
 }
 return t;
}

Code example source: org.apache.pig/pig

@Override
public Tuple call(Tuple input) throws Exception {
  Tuple output = TupleFactory.getInstance()
      .newTuple(input.getAll().size() - 2);
  
  for (int i = 1; i < input.getAll().size() - 2; i ++) {
    output.set(i, input.get(i+2));
  }
  
  long offset = calculateOffset((Integer) input.get(0));
  output.set(0, offset + (Long)input.get(2));
  return output;
}

Code example source: thedatachef/varaha

public DataBag exec(Tuple input) throws IOException {
  if (input == null || input.size() < 1 || input.isNull(0))
    return null;

  // Output bag
  DataBag bagOfTokens = bagFactory.newDefaultBag();

  StringReader textInput = new StringReader(input.get(0).toString());
  PTBTokenizer ptbt = new PTBTokenizer(textInput, new CoreLabelTokenFactory(), "");

  for (CoreLabel label; ptbt.hasNext(); ) {
    label = (CoreLabel)ptbt.next();
    Tuple termText = tupleFactory.newTuple(label.toString());
    bagOfTokens.add(termText);
  }

  return bagOfTokens;
}

Code example source: org.apache.pig/pig

@Override
public Tuple exec(Tuple input) throws IOException {
  // Since Initial is guaranteed to be called
  // only in the map, it will be called with an
  // input of a bag with a single tuple - the
  // count should always be 1 if bag is non empty
  DataBag bag = (DataBag)input.get(0);
  return mTupleFactory.newTuple(bag.iterator().hasNext()?
      Long.valueOf(1L) : Long.valueOf(0L));
}

Code example source: org.apache.pig/pig

@Override
public Tuple exec(Tuple input) throws IOException {
  // Since Initial is guaranteed to be called
  // only in the map, it will be called with an
  // input of a bag with a single tuple - the
  // count should always be 1 if bag is non empty
  DataBag bag = (DataBag)input.get(0);
  Iterator it = bag.iterator();
  if (it.hasNext()){
    Tuple t = (Tuple)it.next();
    if (t != null && t.size() > 0 && t.get(0) != null)
      return mTupleFactory.newTuple(Long.valueOf(1));
  }
  return mTupleFactory.newTuple(Long.valueOf(0));
}

Code example source: pl.edu.icm.coansys/document-similarity-logic

private <T1, T2> DataBag listToDataBag(List<T1> list1, List<T2> list2)
    throws ExecException {
  DataBag output = BagFactory.getInstance().newDefaultBag();
  for (int i = 0; i < Math.min(list1.size(), list2.size()); i++) {
    Tuple t = TupleFactory.getInstance().newTuple(2);
    t.set(0, list1.get(i));
    t.set(1, list2.get(i));
    output.add(t);
  }
  return output;
}

Code example source: apache/phoenix

Tuple t = tupleFactory.newTuple();
t.append(1);
t.append(dt);
t.append(dt);
t.append(dt);
