Using logistic regression in Mahout

Posted by llycmphe on 2021-06-03 in Hadoop


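I have about 11,000 rows of data in a CSV file with two columns, Text and Class. Each Text value is a Twitter message, and each message is labeled true or false in the Class column. I use the two commands below to train and test a logistic regression model on this data, but the results are poor, with an AUC of 0.52. I don't fully understand some of the parameters, such as --rate, --features, and --lambda. Can anyone help me with a more suitable command? Thanks!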

$ bin/mahout trainlogistic --passes 100 --rate 50 --lambda 0.001 --input twitter.csv --features 10000 --output twitter.model --target Class --categories 2 --predictors Text --types t

$ bin/mahout runlogistic --input twitter.csv --model twitter.model --AUC --confusion

Link to the data file: twitter.csv

hadoop machine-learning logistic-regression regression mahout

Source: https://stackoverflow.com/questions/23208799/using-logistic-regression-in-mahout

1 answer


9gm1akwq1#

These are the parameters used to train the model.

"input" : training data
"output" : path to the file where model will be written.
"target" : dependent variable which is to be predicted
"categories" : number of unique possible values that target can be assigned
"predictors" : list of field names that are to be used to predict target variable
"types" : datatypes for the items in predictor list
"passes" : number of passes over the input data
"features" : size of internal feature vector
"lambda" : amount of coefficient decay (regularization) to use
"rate" : initial learning rate

For a detailed description, see the post on logistic regression in Mahout.
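To make the roles of passes, rate, lambda and features more concrete, here is a minimal Python sketch of logistic regression trained by stochastic gradient descent on hashed text features. It is not Mahout's code: the helper names (hash_features, train, predict), the CRC32 token hashing, the rate / sqrt(step) decay schedule, and the L2-style shrinkage are all assumptions chosen for illustration, and Mahout's own encoders, annealing, and regularization may differ.

```python
import math
import zlib


def hash_features(text, num_features):
    """Hash whitespace-separated tokens into a sparse count vector.

    num_features plays the role of --features: the fixed vector size.
    """
    vec = {}
    for token in text.lower().split():
        idx = zlib.crc32(token.encode("utf-8")) % num_features
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, passes, rate, lam, num_features):
    """rows is a list of (text, label) pairs with label 0 or 1."""
    weights = [0.0] * num_features
    step = 0
    for _ in range(passes):                # --passes: sweeps over the data
        for text, label in rows:
            step += 1
            eta = rate / math.sqrt(step)   # --rate: initial step size (assumed decay)
            x = hash_features(text, num_features)
            p = sigmoid(sum(weights[i] * v for i, v in x.items()))
            for i, v in x.items():
                # Gradient step on the log-likelihood, minus a shrinkage
                # term scaled by lam (--lambda) that pulls weights to zero.
                weights[i] += eta * ((label - p) * v - lam * weights[i])
    return weights


def predict(weights, text):
    x = hash_features(text, len(weights))
    return sigmoid(sum(weights[i] * v for i, v in x.items()))


# Toy data, invented for this example only.
rows = [
    ("great win today", 1),
    ("awful loss today", 0),
    ("great game", 1),
    ("awful game", 0),
]
weights = train(rows, passes=50, rate=0.5, lam=0.001, num_features=10000)
print(predict(weights, "great"), predict(weights, "awful"))
```

In a sketch like this, a very large initial rate makes each update overshoot, a larger lambda shrinks the weights harder, and too small a feature vector makes unrelated words collide in the same slot. Treat those as things to experiment with when tuning, not as a diagnosis of the AUC reported in the question.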

