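This article collects code examples for the Java class cc.mallet.types.InstanceList and shows how the class is used in practice. The examples come mainly from GitHub, Stack Overflow, and Maven, and were extracted from a selection of projects, so they should serve as a useful reference. Details of the InstanceList class follow: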
Package path: cc.mallet.types.InstanceList
Class name: InstanceList
A list of machine learning instances, typically used for training or testing of a machine learning algorithm.
All of the instances in the list will have been passed through the same cc.mallet.pipe.Pipe, and thus must also share the same data and target Alphabets. InstanceList keeps a reference to the pipe and the two alphabets.
The most common way of adding instances to an InstanceList is through the add(PipeInputIterator) method (the code examples collected here make the equivalent call as addThruPipe(Iterator<Instance>)). PipeInputIterators are a way of mapping general data sources into instances suitable for processing through a pipe. As each cc.mallet.types.Instance is pulled from the PipeInputIterator, the InstanceList copies the instance and runs the copy through its pipe (with resultant destructive modifications) before saving the modified instance on its list. This is the usual way in which instances are transformed by pipes.
InstanceList also contains methods for randomly generating lists of feature vectors; splitting lists into non-overlapping subsets (useful for test/train splits), and iterators for cross validation.
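The splitting behaviour described above can be illustrated without MALLET on the classpath. The following is a minimal plain-Java sketch, not MALLET's implementation: the class SplitSketch and its split method are hypothetical names, and a generic List<T> stands in for an InstanceList. It shuffles a copy of the input and cuts it into consecutive, non-overlapping parts according to the given proportions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for splitting an InstanceList; not part of MALLET.
final class SplitSketch {

    /**
     * Shuffles a copy of {@code items} and cuts it into consecutive parts.
     * Proportions are normalized, so {0.8, 0.2} and {4, 1} behave the same.
     */
    static <T> List<List<T>> split(List<T> items, Random random, double[] proportions) {
        double total = 0.0;
        for (double p : proportions) {
            if (p < 0.0) {
                throw new IllegalArgumentException("proportions must be >= 0");
            }
            total += p;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("proportions must sum to > 0");
        }

        // Work on a copy so the caller's list keeps its order.
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, random);

        List<List<T>> parts = new ArrayList<>();
        double cumulative = 0.0;
        int start = 0;
        for (int k = 0; k < proportions.length; k++) {
            cumulative += proportions[k];
            // The last part always ends at the list size so rounding loses nothing.
            int end = (k == proportions.length - 1)
                    ? shuffled.size()
                    : (int) Math.round(cumulative / total * shuffled.size());
            end = Math.min(Math.max(end, start), shuffled.size());
            parts.add(new ArrayList<>(shuffled.subList(start, end)));
            start = end;
        }
        return parts;
    }
}
```

Passing a seeded Random makes the split reproducible, which is usually what a test/train split needs.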
Code example origin: de.julielab/jcore-mallet-2.0.9
public InstanceList pipeInstances (Iterator<Instance> source)
{
    // I think that pipes should be associated neither with InstanceLists, nor
    // with Instances. -cas
    // First pass: run the raw instances through the tokenization pipe.
    InstanceList toked = new InstanceList (tokenizationPipe);
    toked.addThruPipe (source);
    // Second pass: run the tokenized instances through the feature pipe.
    InstanceList piped = new InstanceList (getFeaturePipe ());
    piped.addThruPipe (toked.iterator());
    return piped;
}
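The two-pass pattern in pipeInstances (fill one list through a first pipe, then feed its iterator into a second list with a different pipe) can be sketched in plain Java. This is an illustration only: PipedListSketch and tokenCounts are hypothetical names, and a java.util.function.Function stands in for a MALLET Pipe.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for an InstanceList bound to a pipe; not MALLET code.
final class PipedListSketch<I, O> {
    private final Function<I, O> pipe;
    private final List<O> items = new ArrayList<>();

    PipedListSketch(Function<I, O> pipe) {
        this.pipe = pipe;
    }

    /** Pulls each raw item from the source, transforms it, and stores the result. */
    void addThruPipe(Iterator<? extends I> source) {
        while (source.hasNext()) {
            items.add(pipe.apply(source.next()));
        }
    }

    Iterator<O> iterator() {
        return items.iterator();
    }

    List<O> asList() {
        return items;
    }

    /** Two-stage processing: split each line into tokens, then count the tokens. */
    static List<Integer> tokenCounts(List<String> lines) {
        PipedListSketch<String, String[]> toked =
                new PipedListSketch<>(line -> line.trim().split("\\s+"));
        toked.addThruPipe(lines.iterator());

        PipedListSketch<String[], Integer> piped =
                new PipedListSketch<>(tokens -> tokens.length);
        piped.addThruPipe(toked.iterator());
        return piped.asList();
    }
}
```

Each list here owns exactly one transformation, which mirrors the idea that every element of a list has passed through the same pipe.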
Code example origin: de.julielab/jcore-mallet-2.0.9
/** Return a list of instances with a particular label. */
public InstanceList getCluster(int label) {
    // Reuse the pipe of the source list so the alphabets stay shared.
    InstanceList cluster = new InstanceList(instances.getPipe());
    for (int n=0 ; n<instances.size() ; n++)
        if (labels[n] == label)
            cluster.add(instances.get(n));
    return cluster;
}
Code example origin: cc.mallet/mallet
public Alphabet[] getAlphabets () {
    return new Alphabet[] {getDataAlphabet(), getTargetAlphabet() };
}
Code example origin: cc.mallet/mallet
public InstanceList subList (int start, int end)
{
    // Copies the instances in [start, end) into an empty list with the same pipe.
    InstanceList other = this.cloneEmpty();
    for (int i = start; i < end; i++) {
        other.add (get (i));
    }
    return other;
}
Code example origin: cc.mallet/mallet
public void testFixedNumLabels () throws IOException, ClassNotFoundException
{
    Pipe p = new GenericAcrfData2TokenSequence (2);
    InstanceList training = new InstanceList (p);
    // Blank lines ("^$") separate the groups of lines that form one instance.
    training.addThruPipe (new LineGroupIterator (new StringReader (sampleFixedData), Pattern.compile ("^$"), true));
    assertEquals (1, training.size ());
    Instance inst1 = training.get (0);
    LabelsSequence ls1 = (LabelsSequence) inst1.getTarget ();
    assertEquals (4, ls1.size ());
}
Code example origin: com.github.steveash.mallet/mallet
public void testOne ()
{
    Pipe p = createPipe();
    InstanceList ilist = new InstanceList (p);
    ilist.addThruPipe(new StringArrayIterator(data));
    assertTrue (ilist.size() == 3);
}
Code example origin: com.github.steveash.jg2p/jg2p-core
private InstanceList makeExamplesFromAligns(Collection<SWord> inputs) {
    Pipe pipe = makePipe();
    int count = 0;
    InstanceList instances = new InstanceList(pipe);
    for (SWord word : inputs) {
        // Only the data field is set; target, name and source are left null.
        Instance ii = new Instance(word, null, null, null);
        instances.addThruPipe(ii);
        count += 1;
    }
    log.info("Read {} instances of training data for syll phone tag", count);
    return instances;
}
Code example origin: uk.gov.dstl.baleen/baleen-mallet
@Override
protected void doProcess(JCas jCas) throws AnalysisEngineProcessException {
    // Pipe the document text with the same pipe the classifier was trained with.
    InstanceList instances = new InstanceList(classifierModel.getInstancePipe());
    instances.addThruPipe(new Instance(jCas.getDocumentText(), "", "from jcas", null));
    Classification classify = classifierModel.classify(instances.get(0));
    // Store the best label as document metadata.
    Metadata md = new Metadata(jCas);
    md.setKey(metadataKey);
    md.setValue(classify.getLabeling().getBestLabel().toString());
    addToJCasIndex(md);
}
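Code example origin: cc.mallet/mallet
public InstanceList subList (double proportion)
{
    if (proportion > 1.0)
        throw new IllegalArgumentException ("proportion must by <= 1.0");
    // Note: clone() appears to copy every instance into 'other', so the loop
    // below appends to a full copy instead of filling an empty list, and
    // get(i) reads from this (unshuffled) list rather than from 'other'.
    InstanceList other = (InstanceList) clone();
    other.shuffle(new java.util.Random());
    proportion *= other.size();
    for (int i = 0; i < proportion; i++)
        other.add (get(i));
    return other;
}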
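For comparison, here is a minimal plain-Java sketch of what a random proportional sub-list is usually expected to return: a subset of roughly proportion × size distinct elements. It is not MALLET code and does not claim to be the intended behaviour of subList(double); SubListSketch and randomSubList are hypothetical names, and a generic List<T> stands in for an InstanceList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; not part of MALLET.
final class SubListSketch {

    /** Returns a random subset holding ceil(proportion * size) items, without repeats. */
    static <T> List<T> randomSubList(List<T> items, double proportion, Random random) {
        if (proportion < 0.0 || proportion > 1.0) {
            throw new IllegalArgumentException("proportion must be in [0.0, 1.0]");
        }
        // Shuffle a copy so the caller's list keeps its order.
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, random);
        // ceil matches the number of iterations of a loop "for (i = 0; i < p * size; i++)".
        int count = (int) Math.ceil(proportion * shuffled.size());
        return new ArrayList<>(shuffled.subList(0, count));
    }
}
```

Starting from an empty result and reading from the shuffled copy are the two points where this sketch differs from the snippet above.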
Code example origin: cc.mallet/mallet
public double getInstanceWeight (int index) {
    // Note: this check lets index == size() through; a strict bound would use >=.
    if (index > this.size()) {
        throw new IllegalArgumentException("Index out of bounds: index="+index+" size="+this.size());
    }
    if (instWeights != null) {
        Double value = instWeights.get(get(index));
        if (value != null) {
            return value;
        }
    }
    // Instances without an explicit weight default to 1.0.
    return 1.0;
}
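The lookup-with-default pattern in getInstanceWeight can be sketched in plain Java with a strict bounds check. This is an illustration under assumptions, not MALLET's implementation: WeightLookupSketch, setWeight and getWeight are hypothetical names, and weights are keyed by element in a HashMap.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of per-item weights with a default; not part of MALLET.
final class WeightLookupSketch<T> {
    private final List<T> items;
    private final Map<T, Double> weights = new HashMap<>();

    WeightLookupSketch(List<T> items) {
        this.items = items;
    }

    void setWeight(int index, double weight) {
        weights.put(items.get(index), weight);
    }

    /** Returns the stored weight, or 1.0 when none was set for the item. */
    double getWeight(int index) {
        // Valid indices are 0 .. size-1, so index == size is rejected as well.
        if (index < 0 || index >= items.size()) {
            throw new IllegalArgumentException(
                    "Index out of bounds: index=" + index + " size=" + items.size());
        }
        return weights.getOrDefault(items.get(index), 1.0);
    }
}
```

Because the map is keyed by element, two equal elements share one weight in this sketch.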
Code example origin: cc.mallet/mallet
public Sequence pipeInput (Object input)
{
    InstanceList all = new InstanceList (getFeaturePipe ());
    // Only the data field is supplied; target, name and source are null.
    all.add (input, null, null, null);
    return (Sequence) all.get (0).getData();
}
Code example origin: cc.mallet/mallet
public InstanceList sampleWithReplacement (java.util.Random r, int numSamples)
{
    InstanceList ret = this.cloneEmpty();
    // Each draw is independent, so the same instance may be picked more than once.
    for (int i = 0; i < numSamples; i++)
        ret.add (this.get(r.nextInt(this.size())));
    return ret;
}
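The same sampling loop can be tried out in plain Java. The sketch below is an illustration rather than MALLET code: BootstrapSketch is a hypothetical class name, a generic List<T> replaces the InstanceList, and the argument checks are additions that the snippet above does not have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of sampling with replacement; not part of MALLET.
final class BootstrapSketch {

    /** Draws numSamples items uniformly at random; an item may appear several times. */
    static <T> List<T> sampleWithReplacement(List<T> items, Random random, int numSamples) {
        if (numSamples < 0) {
            throw new IllegalArgumentException("numSamples must be >= 0");
        }
        if (items.isEmpty() && numSamples > 0) {
            throw new IllegalArgumentException("cannot sample from an empty list");
        }
        List<T> sample = new ArrayList<>(numSamples);
        for (int i = 0; i < numSamples; i++) {
            sample.add(items.get(random.nextInt(items.size())));
        }
        return sample;
    }
}
```

Two Random objects created with the same seed produce the same sample, which helps when a bagging experiment has to be repeated.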
Code example origin: de.julielab/jcore-mallet-2.0.9
public LabelVector targetLabelDistribution ()
{
    if (this.size() == 0) return null;
    if (!(get(0).getTarget() instanceof Labeling))
        throw new IllegalStateException ("Target is not a labeling.");
    // One accumulator slot per label in the target alphabet.
    double[] counts = new double[getTargetAlphabet().size()];
    for (int i = 0; i < this.size(); i++) {
        Instance instance = get(i);
        Labeling l = (Labeling) instance.getTarget();
        // Add this instance's labeling, scaled by its instance weight.
        l.addTo (counts, getInstanceWeight(i));
    }
    return new LabelVector ((LabelAlphabet)getTargetAlphabet(), counts);
}
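The accumulation loop above can be sketched in plain Java as follows. This is a simplified illustration, not MALLET code: LabelDistributionSketch and its methods are hypothetical names, each item carries a single String label instead of a Labeling, and the normalize step is an extra that the snippet itself does not perform.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of weighted label counting; not part of MALLET.
final class LabelDistributionSketch {

    /** Sums the weight of every item per label, like the counts array in the snippet. */
    static Map<String, Double> weightedCounts(List<String> labels, double[] weights) {
        if (labels.size() != weights.length) {
            throw new IllegalArgumentException("labels and weights differ in length");
        }
        Map<String, Double> counts = new TreeMap<>();
        for (int i = 0; i < weights.length; i++) {
            counts.merge(labels.get(i), weights[i], Double::sum);
        }
        return counts;
    }

    /** Extra step, not in the snippet: rescales the counts so they sum to 1. */
    static Map<String, Double> normalize(Map<String, Double> counts) {
        double total = 0.0;
        for (double c : counts.values()) {
            total += c;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            result.put(e.getKey(), e.getValue() / total);
        }
        return result;
    }
}
```

With labels a, b, a and weights 1, 2, 3, label a accumulates a weight of 4 and label b a weight of 2.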
Code example origin: de.julielab/jcore-mallet-2.0.9
/**
 *
 * @param i
 * @param j
 * @return A new {@link InstanceList} containing the two argument {@link Instance}s.
 */
public static InstanceList makeList (Instance i, Instance j) {
    // A Noop pipe carries the alphabets of the first instance without transforming anything.
    InstanceList list = new InstanceList(new Noop(i.getDataAlphabet(), i.getTargetAlphabet()));
    list.add(i);
    list.add(j);
    return list;
}
Code example origin: cc.mallet/mallet
public void setPerLabelFeatureSelection (FeatureSelection[] selectedFeatures)
{
    if (selectedFeatures != null) {
        // Every selection must refer to the very same data alphabet object as this list.
        for (int i = 0; i < selectedFeatures.length; i++)
            if (selectedFeatures[i].getAlphabet() != getDataAlphabet())
                throw new IllegalArgumentException ("Vocabularies do not match");
    }
    perLabelFeatureSelection = selectedFeatures;
}
Code example origin: cc.mallet/mallet
/** Replaces the <code>Instance</code> at position <code>index</code>
 * with a new one. */
public void setInstance (int index, Instance instance)
{
    assert (this.getDataAlphabet().equals(instance.getDataAlphabet()));
    assert (this.getTargetAlphabet().equals(instance.getTargetAlphabet()));
    this.set(index, instance);
}
Code example origin: com.github.steveash.mallet/mallet
public BaggingClassifier train (InstanceList trainingList)
{
    Classifier[] classifiers = new Classifier[numBags];
    java.util.Random r = new java.util.Random ();
    for (int round = 0; round < numBags; round++) {
        // Each bag is a bootstrap sample with the same size as the training list.
        InstanceList bag = trainingList.sampleWithReplacement (r, trainingList.size());
        classifiers[round] = underlyingTrainer.newClassifierTrainer().train (bag);
    }
    this.classifier = new BaggingClassifier (trainingList.getPipe(), classifiers);
    return classifier;
}
Code example origin: de.julielab/jcore-mallet-2.0.9
/** Adds the input instance to this list, after passing it through the
 * InstanceList's pipe.
 * <p>
 * If several instances are to be added then accumulate them in a List<Instance>
 * and use <tt>addThruPipe(Iterator<Instance>)</tt> instead.
 */
public void addThruPipe(Instance inst)
{
    addThruPipe(new SingleInstanceIterator(inst));
}
Code example origin: de.julielab/jcore-mallet-2.0.9
public TokenClassifiers(ClassifierTrainer trainer, InstanceList trainList, int randSeed, int numCV)
{
    super(trainList.getPipe());
    m_trainer = trainer;
    m_randSeed = randSeed;
    m_numCV = numCV;
    m_table = new HashMap();
    doTraining(trainList);
}
Code example origin: cc.mallet/mallet
public void testSetGetParameters ()
{
    MaxEntTrainer trainer = new MaxEntTrainer();
    Alphabet fd = dictOfSize (6);
    String[] classNames = new String[] {"class0", "class1", "class2"};
    // Randomly generated list: 20 instances over 6 features and 3 classes.
    InstanceList ilist = new InstanceList (new Randoms(1), fd, classNames, 20);
    Optimizable.ByGradientValue maxable = trainer.getOptimizable (ilist);
    TestOptimizable.testGetSetParameters (maxable);
}