Mapper and reducer programs for prediction in Python

Posted by 7gcisfzg on 2021-05-29 in Hadoop

I'm confused about how to implement this on Hadoop. I have written code that predicts class values:

import os

import pandas as pd
from sklearn import tree
from sklearn.metrics import accuracy_score

os.chdir('/home/PYTHON/')
data = pd.read_csv('wine.csv')

accuracy = []
n_fold = 10

for i in range(n_fold):
    # Use a different seed per iteration; a fixed seed would repeat the same split.
    train_data = data.sample(frac=0.70, random_state=i)
    test_data = data.loc[~data.index.isin(train_data.index)]

    # Columns 0-12 are the features, column 13 is the class label.
    predictors = train_data.iloc[:, 0:13]
    train_y = train_data.iloc[:, 13]
    test_feat = test_data.iloc[:, 0:13]
    test_y = test_data.iloc[:, 13]

    model = tree.DecisionTreeClassifier().fit(X=predictors, y=train_y)

    # Predict the class of each test row and record the accuracy.
    test_preds = model.predict(X=test_feat)
    accuracy.append(accuracy_score(test_y, test_preds))

print("Accuracy by acc_score", sum(accuracy) / len(accuracy))

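I'm a beginner in both Python and Hadoop, and I don't know how to split this program into a mapper and a reducer. I'm using hadoop-2.7.3 with only 3 data nodes. Can I run this program on the Hadoop cluster to predict the class values and compute the accuracy? If not, what would the mapper and reducer look like for predicting the classes and computing the accuracy?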
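
One possible way to split the work, offered as a rough sketch rather than a definitive answer: train the decision tree once outside Hadoop, pickle it, and let Hadoop Streaming handle only the scoring step. The mapper would predict the class of each CSV row and emit a "correct / total" pair, and the reducer would sum those pairs into a single accuracy. Everything named below is an assumption made for illustration: the file name `model.pkl`, the key `acc`, the `map`/`reduce` command-line argument, and the helper names `map_rows` and `reduce_counts`. It also assumes the same layout as the code above (13 feature columns followed by the class label), that labels in the CSV print the same way as the model's predictions, and that Python 3 and scikit-learn are installed on every node.

```python
#!/usr/bin/env python3
"""Sketch: Hadoop Streaming mapper/reducer that scores a pre-trained model."""
import pickle
import sys


def map_rows(lines, predict, n_features=13):
    """Mapper logic: yield 'acc<TAB>correct<TAB>1' for every valid CSV row.

    `predict` is any callable that takes a list of feature rows and returns
    a sequence of predicted labels (e.g. a fitted model's `predict` method).
    """
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) <= n_features:
            continue  # blank or truncated line
        try:
            features = [float(v) for v in fields[:n_features]]
        except ValueError:
            continue  # non-numeric row, such as a CSV header
        label = fields[n_features]
        predicted = predict([features])[0]
        # Assumption: labels compare equal once both are rendered as text.
        correct = int(str(predicted) == label)
        yield "acc\t%d\t1" % correct


def reduce_counts(lines):
    """Reducer logic: sum the mapper output into (correct, total)."""
    correct = total = 0
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 3:
            continue  # ignore anything that is not key, correct, count
        correct += int(parts[1])
        total += int(parts[2])
    return correct, total


def main(argv):
    role = argv[1] if len(argv) > 1 else ""
    if role == "map":
        # Hypothetical file name; the pickle must be shipped with the job.
        with open("model.pkl", "rb") as fh:
            model = pickle.load(fh)
        for record in map_rows(sys.stdin, model.predict):
            print(record)
    elif role == "reduce":
        correct, total = reduce_counts(sys.stdin)
        if total:
            print("accuracy\t%.4f" % (correct / total))


if __name__ == "__main__":
    main(sys.argv)
```

With Hadoop Streaming, the same script could be passed as the mapper (with the `map` argument) and as the reducer (with the `reduce` argument), since streaming feeds each task its input on stdin and collects whatever it prints to stdout. Because every record uses the single key `acc`, all counts end up in one reducer, which is fine for a single accuracy figure but would need a different key scheme for per-fold or per-class results. Training inside the mapper is deliberately avoided here: each mapper only sees its own input split, so a model fitted there would not see the full training set.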
