MapReduce Java: computing the average of an array list

zu0ti5jz, posted on 2021-05-29 in Hadoop

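I have been given a MapReduce task, and I am new to MapReduce programming. I want to compute the average, minimum, and maximum for each year and for a specific city. Here is a sample of my input: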
2009-01-01-0760468012694,2.52077754,0.52577754,0.0657571168,0.06571168,0.065668362,0.972051954,0.037000279,0.037000279,0.022319018,,,,0.003641149,2009-01-01-076080808012694,2.525757577754,0.5257575757571168,0.0657575711688,0.0.0656565636362,0.0,0,0 0 0.25516,0.0,0,0,0.5050505151518,0,0.0.0.0,0,0.0.0 0 0 0 0.31313131313131313131318,,,,,,0.0.0.0.0.0 0.0.0.0.12312331313145522,0.097792741,0.032738892,0.368454554,0.019228992,0.032882053,,,0.004778065,,,0.003190444,,,0.064203865 Calgary,ab,2010-01-1360468012551,3.006492921,0.09051656,0.041508534,0.215395047,0.012081755,0.023706119,,,0.004231772,,,0.003083003,,,0.155212503
I know how to extract the city and the year, using this code:

String line = value.toString();
String[] tokens = line.split(",");
String[] date = tokens[2].split("-");
String year = date[0];
String location = tokens[0];

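Now I want to find these two numbers in each line (for example 2.5207754 and 0.065721168; the values differ from line to line, but they are always the numbers after the third and fourth commas), and then compute an average, a minimum, and a maximum.
The output should look like this:
Calgary 2009 average: " ", min: " ", max: " "
Calgary 2010 average: " ", min: " ", max: " "
I tried to find the values in each line with the code below, but because the lines of the dataset do not all have the same layout, I get errors (either there is no data at that position, or the line is longer than the offsets assume).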

float number = 0;
float number2 = 0;
char a;
char c;
// Fixed character offsets: these only work if every line has the same width.
a = line.charAt(34);
c = line.charAt(44);
if (a == ',')
{
    number = Float.parseFloat(line.substring(35, 44));
}
else
{
    number = Float.parseFloat(line.substring(35, 46));
}

if (c == ',')
{
    number2 = Float.parseFloat(line.substring(45, 56));
}
else
{
    number2 = Float.parseFloat(line.substring(47, 58));
}

Text numbers = new Text(number + " " + number2 + " ");

Then I tried this code, but like the code above it does not work:

String number = tokens[4];
String number2 = tokens[5];

Can you help me with this project?

gblwokeq #1

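Looking at your input, your records appear to be separated by spaces. You can split on " " first, then split each record on "," to get the individual values and use them in your calculation.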

// Each space-separated chunk is one record.
String[] arr = line.split(" ");
for (String val : arr) {
    // Within a record, the fields are comma-separated.
    String[] dataArr = val.split(",");
    String city = dataArr[0];
    String date = dataArr[2];
    String v1 = dataArr[5];
    String v2 = dataArr[6];
    System.out.println("city: "+city +" date: "+ date +" v1: "+ v1+"v2: "+ v2);
}

city: Calgary date: 2009-01-07 v1: 2.5207754v2: 0.065721168
city: Calgary date: 2009-12-30 v1: 2.051769654v2: 0.060114973
city: Calgary date: 2010-01-06 v1: 4.015745522v2: 0.097792741
city: Calgary date: 2010-01-13 v1: 3.006492921v2: 0.09051656
city: Calgary date: 2009-01-07 v1: 2.5207754v2: 0.065721168
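
Building on the split shown above, the remaining aggregation could be sketched roughly as follows. This is only an illustration in plain Java, not a tested Hadoop job: the class name `CityYearStats` and the methods `map`, `reduce`, and `run` are hypothetical, the field positions (0 = city, 2 = date, 5 and 6 = values) are taken from the snippet above, and it is an assumption that both values of a record are pooled into a single average/min/max per city and year. In an actual job, the two steps would correspond to the bodies of Hadoop's `Mapper.map` and `Reducer.reduce`; here they are static methods so the sketch runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class CityYearStats {

    // Assumed field positions, copied from the answer's snippet.
    static final int CITY = 0, DATE = 2, V1 = 5, V2 = 6;

    /** "Map" step: adds every usable value of a line under the key "city year". */
    static void map(String line, Map<String, List<Double>> grouped) {
        for (String record : line.trim().split("\\s+")) {
            String[] f = record.split(",", -1);   // -1 keeps empty trailing fields
            if (f.length <= V2) {
                continue;                         // record too short, skip it
            }
            String year = f[DATE].split("-")[0];
            String key = f[CITY] + " " + year;
            for (int idx : new int[] {V1, V2}) {
                try {
                    double v = Double.parseDouble(f[idx]);
                    grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(v);
                } catch (NumberFormatException e) {
                    // blank or malformed number, skip this field only
                }
            }
        }
    }

    /** "Reduce" step: average, minimum and maximum of all values for one key. */
    static String reduce(String key, List<Double> values) {
        double sum = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return String.format(Locale.ROOT, "%s average: %.4f, min: %.4f, max: %.4f",
                key, sum / values.size(), min, max);
    }

    /** Runs both steps over a multi-line input, keeping keys in first-seen order. */
    static List<String> run(String input) {
        Map<String, List<Double>> grouped = new LinkedHashMap<>();
        for (String line : input.split("\n")) {
            map(line, grouped);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            result.add(reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Values taken from the output above; fields 3 and 4 are left empty
        // because their real contents are not needed for this sketch.
        String input = "Calgary,AB,2009-01-07,,,2.5207754,0.065721168 "
                + "Calgary,AB,2009-12-30,,,2.051769654,0.060114973 "
                + "Calgary,AB,2010-01-06,,,4.015745522,0.097792741 "
                + "Calgary,AB,2010-01-13,,,3.006492921,0.09051656";
        for (String row : run(input)) {
            System.out.println(row);
        }
    }
}
```

Splitting with a limit of `-1` keeps empty fields such as `,,,` in place, so the indices do not shift when a value is missing; records that are too short and fields that are blank are skipped rather than causing an exception. If the two values should be reported separately instead of pooled, one option would be to append a suffix to the key for each field.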
