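I have a MapReduce assignment, and I am new to MapReduce programming. I want to compute the average, minimum, and maximum values for each year and for a specific city. Here is a sample of my input: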
2009-01-01-0760468012694,2.52077754,0.52577754,0.0657571168,0.06571168,0.065668362,0.972051954,0.037000279,0.037000279,0.022319018,,,,0.003641149,2009-01-01-076080808012694,2.525757577754,0.5257575757571168,0.0657575711688,0.0.0656565636362,0.0,0,0 0 0.25516,0.0,0,0,0.5050505151518,0,0.0.0.0,0,0.0.0 0 0 0 0.31313131313131313131318,,,,,,0.0.0.0.0.0 0.0.0.0.12312331313145522,0.097792741,0.032738892,0.368454554,0.019228992,0.032882053,,,0.004778065,,,0.003190444,,,0.064203865 Calgary,AB,2010-01-1360468012551,3.006492921,0.09051656,0.041508534,0.215395047,0.012081755,0.023706119,,,0.004231772,,,0.003083003,,,0.155212503
I know how to get the city and the year, using this code:
String line = value.toString();
String[] tokens = line.split(",");
String[] date = tokens[2].split("-");
String year = date[0];
String location = tokens[0];
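Now I want to pick out two numbers from each line (for example 2.5207754 and 0.065721168; the values differ from line to line, but they are always the numbers after the third and fourth commas) and then compute an average, a minimum, and a maximum.
The output should look like this:
Calgary 2009 average: "", min: "", max: ""
Calgary 2010 average: "", min: "", max: ""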
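I tried to extract the values from each line with the code below, but because the lines in the dataset do not all have the same layout, I get errors (either there is no data at that position, or the value is longer than the fixed width I assumed):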
float number = 0;
float number2 = 0 ;
char a;
char c;
a = line.charAt(34);
c = line.charAt(44);
if (a == ',')
{
number = Float.parseFloat(line.substring(35, 44));
}
else
{
number = Float.parseFloat(line.substring(35, 46));
}
if (c == ',')
{
number2 = Float.parseFloat(line.substring(45, 56));
} else
{
number = Float.parseFloat(line.substring(47, 58));
}
Text numbers = new Text(number + " " + number2 + " ");
Then I tried this code instead, but like the version above it did not work:
String number = tokens[4];
String number2 = tokens[5];
Can you help me with this project?
1 Answer
gblwokeq #1
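Looking at your input, your records appear to be separated by spaces. You can split on " " first, then split each record on commas to get the individual values and use them in your calculation.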
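A minimal sketch of that two-step split in plain Java, without any Hadoop classes. The class name RecordParser and the column positions are assumptions for illustration: the city at index 0 and the date at index 2 come from the question's own code, while indices 3 and 4 for the two values follow the "after the third and fourth commas" description and may need adjusting if the real file has extra columns. Passing -1 to split keeps empty fields, so a blank value does not shift the later columns.

```java
import java.util.ArrayList;
import java.util.List;

class RecordParser {
    // Assumed column positions; adjust them to match the real file.
    static final int CITY = 0, DATE = 2, V1 = 3, V2 = 4;

    /** Returns {city, year, v1, v2} for every usable record in the line. */
    static List<String[]> parse(String line) {
        List<String[]> out = new ArrayList<>();
        // Step 1: records are separated by whitespace.
        for (String record : line.trim().split("\\s+")) {
            // Step 2: fields are separated by commas; -1 keeps empty fields.
            String[] tokens = record.split(",", -1);
            if (tokens.length <= V2) {
                continue; // record too short to contain both values
            }
            if (tokens[V1].isEmpty() || tokens[V2].isEmpty()) {
                continue; // one of the values is missing
            }
            String year = tokens[DATE].split("-")[0];
            out.add(new String[] {tokens[CITY], year, tokens[V1], tokens[V2]});
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up line in the assumed layout; the second record has a blank value.
        String line = "Calgary,AB,2009-01-07,2.5207754,0.065721168 "
                    + "Calgary,AB,2010-01-13,,0.09051656";
        for (String[] r : parse(line)) {
            System.out.println(r[0] + " " + r[1] + " v1=" + r[2] + " v2=" + r[3]);
        }
    }
}
```

Working with tokens instead of fixed character offsets such as charAt(34) avoids the out-of-range and parse errors that appear when values have different lengths.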
City: Calgary Date: 2009-01-07 v1: 2.5207754 v2: 0.065721168
City: Calgary Date: 2009-12-30 v1: 2.051769654 v2: 0.060114973
City: Calgary Date: 2010-01-06 v1: 4.015745522 v2: 0.097792741
City: Calgary Date: 2010-01-13 v1: 3.006492921 v2: 0.09051656
City: Calgary Date: 2009-01-07 v1: 2.5207754 v2: 0.065721168
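Once the values are extracted, the average, minimum, and maximum can be computed per city and year. The following is only a sketch in plain Java; the names YearlyStats, Stats, and aggregate are hypothetical, and the input rows are assumed to be already parsed into {city, year, value}. In a Hadoop job the mapper would typically emit the "city year" key with the value, and the reducer would run the same kind of loop over the values it receives for one key.

```java
import java.util.Map;
import java.util.TreeMap;

class YearlyStats {
    /** Running min, max, sum, and count for one "city year" key. */
    static class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0;
        long count = 0;

        void add(double v) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
            count++;
        }

        double average() {
            return sum / count;
        }
    }

    /** Groups rows of {city, year, value} by "city year" and accumulates stats. */
    static Map<String, Stats> aggregate(String[][] rows) {
        Map<String, Stats> byKey = new TreeMap<>();
        for (String[] r : rows) {
            String key = r[0] + " " + r[1];
            byKey.computeIfAbsent(key, k -> new Stats()).add(Double.parseDouble(r[2]));
        }
        return byKey;
    }

    public static void main(String[] args) {
        // v1 values taken from the listing above, keyed by city and year.
        String[][] rows = {
            {"Calgary", "2009", "2.5207754"},
            {"Calgary", "2009", "2.051769654"},
            {"Calgary", "2010", "4.015745522"},
            {"Calgary", "2010", "3.006492921"},
        };
        for (Map.Entry<String, Stats> e : aggregate(rows).entrySet()) {
            Stats s = e.getValue();
            System.out.printf("%s average: %.6f, min: %.6f, max: %.6f%n",
                    e.getKey(), s.average(), s.min, s.max);
        }
    }
}
```

Keeping a running sum and count, instead of storing every value, means the same accumulator works when the values for a key arrive one at a time.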