k表示集群mahout

i1icjdpr  于 2021-05-29  发布在  Hadoop
关注(0)|答案(1)|浏览(329)

我试图集群的样本数据集是在csv文件格式。但当我发出以下命令时,

user@ubuntu:/usr/local/mahout/trunk$ bin/mahout kmeans -i /root/Mahout/temp/parsedtext-seqdir-sparse-kmeans/tfidf-vectors/ -c /root/Mahout/temp/parsedtext-kmeans-clusters -o /root/Mahout/reuters21578/root/Mahout/temp/parsedtext-kmeans -dm org.apache.mahout.common.distance.CosineDistanceMeasure -x 2 -k 1 -ow --clustering -cl

我得到以下错误,说没有可用的输入集群,并检查 -c 群集参数。有人能帮帮我吗>
下面是我对上述命令的错误:

16/05/11 16:09:15 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
Exception in thread "main" java.lang.IllegalStateException: No input clusters found in /root/Mahout/temp/parsedtext-kmeans-clusters/part-randomSeed. Check your -c argument.
at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:213)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:147)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:110)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
a64a0gku

a64a0gku1#

让我为您复制错误消息:
在/root/mahout/temp/parsedtext kmeans clusters/part randomseed中找不到输入群集。检查你的-c参数。
你考虑过检查或删除你的-c参数吗?
但行家k-means的质量真的很低。用别的东西。 apt-get install elki 试试看,速度快多了。

相关问题