Here is a sample of my input data file:
+-------------+---------+--------+------+------+-----+
|applicationId|firstname|lastname|addres|gender|phone|
+-------------+---------+--------+------+------+-----+
|           t1|    suraj| ghimire| 10oak|     m| 1111|
|           t2|    Kiran|   Kumar| 5wren|     m| 2222|
|           t3|     sara| chauhan| 15nvi|     f| 6666|
|           t4|    suraj| ghimire| 11oak|     m| 1111|
|           t5|    Kiran|   Kumar| 5wren|     f| 2222|
|           t6|  Prakash|     Jha| 18nvi|     f| 3333|
|           t7|    Kiran|   Kumar| 5wren|     f| 2222|
|           t8|    Suraj| Ghimire| 10oak|     m| 1111|
|           t9|  Prakash|     Jha| 18nvi|     m| 3333|
|          t10|    Kiran|   Kumar| 5wren|     f| 2222|
|          t11|    Suraj| Ghimire| 10oak|     m| 1111|
|          t12|  Prakash|     Jha| 18nvi|     f| 3333|
+-------------+---------+--------+------+------+-----+
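Using the Spark RDD API, I need to group the data and print only the first record with the highest number of occurrences in any column. The final output should look like this: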
+-------------+---------+--------+------+------+-----+
|applicationId|firstname|lastname|addres|gender|phone|
+-------------+---------+--------+------+------+-----+
|           t3|     sara| chauhan| 15nvi|     f| 6666|
|           t6|  Prakash|     Jha| 18nvi|     f| 3333|
|           t7|    Kiran|   Kumar| 5wren|     f| 2222|
|           t8|    Suraj| Ghimire| 10oak|     m| 1111|
+-------------+---------+--------+------+------+-----+
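
One possible reading of the requirement can be sketched with PySpark RDD operations. The sketch below is only a starting point and rests on assumptions the question does not state: rows are grouped by `phone`, two rows count as the same "variant" when every column except `applicationId` matches exactly (case-sensitive), and ties are broken by the original row order. All function names and the `PHONE` constant are hypothetical. Under this reading the Kiran Kumar group yields t5, the first of its three identical rows, whereas the expected output above lists t7, so the intended tie-breaking rule may differ.

```python
# Sample rows copied from the question:
# (applicationId, firstname, lastname, addres, gender, phone)
ROWS = [
    ("t1", "suraj", "ghimire", "10oak", "m", "1111"),
    ("t2", "Kiran", "Kumar", "5wren", "m", "2222"),
    ("t3", "sara", "chauhan", "15nvi", "f", "6666"),
    ("t4", "suraj", "ghimire", "11oak", "m", "1111"),
    ("t5", "Kiran", "Kumar", "5wren", "f", "2222"),
    ("t6", "Prakash", "Jha", "18nvi", "f", "3333"),
    ("t7", "Kiran", "Kumar", "5wren", "f", "2222"),
    ("t8", "Suraj", "Ghimire", "10oak", "m", "1111"),
    ("t9", "Prakash", "Jha", "18nvi", "m", "3333"),
    ("t10", "Kiran", "Kumar", "5wren", "f", "2222"),
    ("t11", "Suraj", "Ghimire", "10oak", "m", "1111"),
    ("t12", "Prakash", "Jha", "18nvi", "f", "3333"),
]

PHONE = 5  # ASSUMPTION: group by the `phone` column (index 5)


def to_variant_pair(row_and_index):
    """Key a row by every column except applicationId.

    The value is (original position, row, count of 1).
    """
    row, index = row_and_index
    return (row[1:], (index, row, 1))


def merge_same_variant(a, b):
    """Merge two entries of one variant: keep the earlier row, add the counts."""
    first = a if a[0] <= b[0] else b
    return (first[0], first[1], a[2] + b[2])


def to_group_pair(variant_pair):
    """Re-key a merged variant entry by the assumed grouping column."""
    _, entry = variant_pair
    return (entry[1][PHONE], entry)


def pick_most_frequent(a, b):
    """Keep the entry with the higher count; on a tie, the earlier one."""
    if a[2] != b[2]:
        return a if a[2] > b[2] else b
    return a if a[0] <= b[0] else b


def most_frequent_first(rdd):
    """Return an RDD with one representative row per group, in input order."""
    return (
        rdd.zipWithIndex()                   # (row, position)
        .map(to_variant_pair)                # (variant, (position, row, 1))
        .reduceByKey(merge_same_variant)     # count each variant
        .map(to_group_pair)                  # (phone, (position, row, count))
        .reduceByKey(pick_most_frequent)     # best variant per phone
        .map(lambda kv: kv[1])
        .sortBy(lambda entry: entry[0])      # restore original order
        .map(lambda entry: entry[1])
    )


def run_on_spark():
    """Run the pipeline on a local SparkContext (requires PySpark)."""
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "most-frequent-first")
    try:
        return most_frequent_first(sc.parallelize(ROWS)).collect()
    finally:
        sc.stop()
```

Calling `run_on_spark()` in an environment with PySpark installed collects the selected rows as a list of tuples. Using two `reduceByKey` passes instead of `groupByKey` keeps only one running entry per key during the shuffle, which tends to matter once the real file is larger than this sample.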