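This article collects code examples for the Java class org.apache.flink.examples.java.clustering.KMeans and shows how the KMeans class is used. The examples come mainly from GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they can serve as a practical reference. Details of the KMeans class:
Package path: org.apache.flink.examples.java.clustering.KMeans
Class name: KMeans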
This example implements a basic K-Means clustering algorithm.
K-Means is an iterative clustering algorithm and works as follows:
K-Means is given a set of data points to be clustered and an initial set of K cluster centers. In each iteration, the algorithm computes the distance of each data point to each cluster center. Each point is assigned to the cluster center which is closest to it. Subsequently, each cluster center is moved to the center (mean) of all points that have been assigned to it. The moved cluster centers are fed into the next iteration. The algorithm terminates after a fixed number of iterations (as in this implementation) or if cluster centers do not (significantly) move in an iteration.
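The loop described above can be sketched in plain Java without any Flink dependency. This is only an illustration of the assign-then-move steps, not the code of the Flink example: the class and method names (KMeansSketch, nearest, run) are hypothetical, and leaving a center in place when no point is assigned to it is an assumption made here.

```java
import java.util.Arrays;

/** Minimal, non-distributed sketch of the K-Means loop on 2D points (hypothetical names). */
final class KMeansSketch {

    /** Returns the index of the center closest to point p. */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double dx = p[0] - centers[i][0];
            double dy = p[1] - centers[i][1];
            double dist = dx * dx + dy * dy; // squared distance keeps the same ordering
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    /** Runs a fixed number of iterations and returns the moved centers. */
    static double[][] run(double[][] points, double[][] initial, int iterations) {
        // Copy the initial centers so the caller's array is not modified.
        double[][] centers = new double[initial.length][];
        for (int i = 0; i < initial.length; i++) {
            centers[i] = initial[i].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sum = new double[centers.length][2];
            int[] count = new int[centers.length];
            // Step 1: assign each point to its closest center.
            for (double[] p : points) {
                int c = nearest(p, centers);
                sum[c][0] += p[0];
                sum[c][1] += p[1];
                count[c]++;
            }
            // Step 2: move each center to the mean of its assigned points.
            for (int c = 0; c < centers.length; c++) {
                if (count[c] > 0) { // assumption: an empty cluster keeps its old center
                    centers[c][0] = sum[c][0] / count[c];
                    centers[c][1] = sum[c][1] / count[c];
                }
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] points = {{1.0, 1.0}, {1.0, 3.0}, {9.0, 9.0}, {11.0, 9.0}};
        double[][] start = {{0.0, 0.0}, {10.0, 10.0}};
        System.out.println(Arrays.deepToString(run(points, start, 10)));
    }
}
```

A convergence-based stop (ending once no center moves by more than a small threshold) would replace the fixed iteration count with a check at the end of each pass.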
The Wikipedia entry for the K-Means clustering algorithm gives further background.
This implementation works on two-dimensional data points.
It computes an assignment of data points to cluster centers, i.e., each data point is annotated with the id of the final cluster (center) it belongs to.
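The final annotation step might look like the following plain-Java sketch, which tags each two-dimensional point with the id of its closest center. The types Center and Assigned and the method assign are hypothetical names chosen for illustration; they are not the classes used by the Flink example.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of annotating points with the id of their closest center (hypothetical names). */
final class ClusterAssignmentSketch {

    /** A cluster center given as (id, x, y). */
    static final class Center {
        final int id;
        final double x;
        final double y;

        Center(int id, double x, double y) {
            this.id = id;
            this.x = x;
            this.y = y;
        }
    }

    /** A data point annotated with the id of the center it belongs to. */
    static final class Assigned {
        final int centerId;
        final double x;
        final double y;

        Assigned(int centerId, double x, double y) {
            this.centerId = centerId;
            this.x = x;
            this.y = y;
        }
    }

    /** Pairs every point with the id of its closest center. */
    static List<Assigned> assign(double[][] points, List<Center> centers) {
        if (centers.isEmpty()) {
            throw new IllegalArgumentException("at least one center is required");
        }
        List<Assigned> result = new ArrayList<>();
        for (double[] p : points) {
            Center best = centers.get(0);
            double bestDist = Double.POSITIVE_INFINITY;
            for (Center c : centers) {
                double dx = p[0] - c.x;
                double dy = p[1] - c.y;
                double dist = dx * dx + dy * dy; // squared Euclidean distance
                if (dist < bestDist) {
                    bestDist = dist;
                    best = c;
                }
            }
            result.add(new Assigned(best.id, p[0], p[1]));
        }
        return result;
    }
}
```

The result keeps the input order, so each output row corresponds to the point at the same position in the input.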
Input files are plain text files and must be formatted as follows:
"1.2 2.3\n5.3 7.2\n" gives two data points (x=1.2, y=2.3) and (x=5.3, y=7.2).
"1 6.2 3.2\n2 2.9 5.7\n" gives two centers (id=1, x=6.2, y=3.2) and (id=2, x=2.9, y=5.7).
Usage: KMeans --points <path> --centroids <path> --output <path> --iterations <n>
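As an illustration of the two input formats, the sketch below parses such text in plain Java. It is not the reader used by the Flink example: the class KMeansInputSketch, its methods, and the Centroid holder are hypothetical, and splitting on any run of whitespace is an assumption made here for leniency.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of parsing the point and centroid text formats (hypothetical names). */
final class KMeansInputSketch {

    /** A parsed center line: "id x y". */
    static final class Centroid {
        final int id;
        final double x;
        final double y;

        Centroid(int id, double x, double y) {
            this.id = id;
            this.x = x;
            this.y = y;
        }
    }

    /** Parses lines of the form "x y" into {x, y} arrays. */
    static List<double[]> parsePoints(String text) {
        List<double[]> points = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines, e.g. after a trailing newline
            }
            String[] fields = trimmed.split("\\s+");
            points.add(new double[] {
                Double.parseDouble(fields[0]),
                Double.parseDouble(fields[1])
            });
        }
        return points;
    }

    /** Parses lines of the form "id x y" into Centroid objects. */
    static List<Centroid> parseCentroids(String text) {
        List<Centroid> centroids = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] fields = trimmed.split("\\s+");
            centroids.add(new Centroid(
                Integer.parseInt(fields[0]),
                Double.parseDouble(fields[1]),
                Double.parseDouble(fields[2])));
        }
        return centroids;
    }
}
```

Malformed lines surface as NumberFormatException or ArrayIndexOutOfBoundsException in this sketch; a real reader would likely report the offending line instead.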
If no parameters are provided, the program is run with default data from org.apache.flink.examples.java.clustering.util.KMeansData and 10 iterations.
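The fallback to defaults could be handled roughly as follows. This is a plain-Java sketch, not Flink's own parameter handling: KMeansArgsSketch, parse, and iterations are hypothetical helpers, and only the default of 10 iterations is taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of reading "--key value" arguments with a default iteration count (hypothetical names). */
final class KMeansArgsSketch {

    /** Turns {"--points", "p.txt", ...} into a map such as {"points" -> "p.txt"}. */
    static Map<String, String> parse(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("arguments must come in --key value pairs");
        }
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("expected --key, got: " + args[i]);
            }
            params.put(args[i].substring(2), args[i + 1]);
        }
        return params;
    }

    /** Returns the requested iteration count, or 10 when none was given. */
    static int iterations(Map<String, String> params) {
        return Integer.parseInt(params.getOrDefault("iterations", "10"));
    }

    public static void main(String[] args) {
        Map<String, String> params = parse(args);
        System.out.println("iterations = " + iterations(params));
        // A missing "points" or "centroids" key would signal that default data should be used.
        System.out.println("uses default points = " + !params.containsKey("points"));
    }
}
```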
This example shows how to use bulk iterations, broadcast variables in bulk iterations, and custom Java objects (POJOs).
Code example source: apache/flink
DataSet<Point> points = getPointDataSet(params, env);
DataSet<Centroid> centroids = getCentroidDataSet(params, env);
Code example source: apache/flink
TestingExecutionEnvironment.setAsNext(validator, parallelism);
KMeans.main(new String[0]);
KMeans.main(new String[] {
    "--points", tmpDir,
    "--centroids", tmpDir,
    // ... (excerpt truncated in the source)
Code example source: org.apache.flink/flink-java-examples
public static void main(String[] args) throws Exception {
    if (!parseParameters(args)) {
        return;
    }
    // ... (lines elided in the source excerpt)
    DataSet<Point> points = getPointDataSet(env);
    DataSet<Centroid> centroids = getCentroidDataSet(env);
    // ... (excerpt ends here)
}
Code example source: org.apache.flink/flink-examples-batch_2.10
DataSet<Point> points = getPointDataSet(params, env);
DataSet<Centroid> centroids = getCentroidDataSet(params, env);
Code example source: apache/flink
@Test
public void dumpIterativeKMeans() {
    // prepare the test environment
    PreviewPlanEnvironment env = new PreviewPlanEnvironment();
    env.setAsContext();
    try {
        KMeans.main(new String[] {
            "--points ", IN_FILE,
            "--centroids ", IN_FILE,
            "--output ", OUT_FILE,
            "--iterations", "123"});
    } catch (OptimizerPlanEnvironment.ProgramAbortException pae) {
        // all good.
    } catch (Exception e) {
        e.printStackTrace();
        Assert.fail("KMeans failed with an exception");
    }
    dump(env.getPlan());
}
Code example source: com.alibaba.blink/flink-examples-batch
DataSet<Point> points = getPointDataSet(params, env);
DataSet<Centroid> centroids = getCentroidDataSet(params, env);
Code example source: org.apache.flink/flink-examples-batch
DataSet<Point> points = getPointDataSet(params, env);
DataSet<Centroid> centroids = getCentroidDataSet(params, env);