Spark Structured Streaming file sink creates empty JSON files

8ehkhllq · posted 2021-05-18 · in Spark

I am reading data from a Kafka topic and can process it. However, when I try to sink the results as .json files, the files that land in HDFS are empty. I have verified the output using the console sink, but the file sink keeps creating empty files.
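For reference, the console-based verification mentioned above might look roughly like this. This is a minimal sketch, not the original poster's code: `KPI_Final_DF` and the one-minute trigger are taken from the file-sink query below, and everything else is an assumption.

```python
# Hypothetical console-sink query used to verify the stream's output.
# Assumes KPI_Final_DF is the aggregated streaming DataFrame from the
# question; this sketch mirrors the file-sink query with the sink swapped.
console_query = KPI_Final_DF \
    .writeStream \
    .outputMode("append") \
    .format("console") \
    .option("truncate", "false") \
    .trigger(processingTime="1 minute") \
    .start()

console_query.awaitTermination()
```

Note that `truncate` is meaningful here: for the console sink it controls whether wide column values are cut off, which is why the full window timestamps appear in the batch output shown below.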
```python
query = KPI_Final_DF \
    .writeStream \
    .outputMode("append") \
    .format("json") \
    .option("truncate", "false") \  # note: "truncate" only affects the console sink; it has no effect on the file sink
    .option("path", "output_3") \
    .option("checkpointLocation", "output_json") \
    .trigger(processingTime="1 minute") \
    .start()

# block until the query terminates
query.awaitTermination()
```

Below is the console output:
-------------------------------------------
Batch: 27
-------------------------------------------
+------------------------------------------+--------------+------------------+---+-------------------+
|window                                    |country       |Total_Volume_Sale |OPM|Rate_Return        |
+------------------------------------------+--------------+------------------+---+-------------------+
|[2020-11-05 16:30:00, 2020-11-05 16:31:00]|United Kingdom|37.010000705718994|2  |0.0                |
|[2020-11-05 16:29:00, 2020-11-05 16:30:00]|United Kingdom|613.1199990212917 |11 |0.15384615384615385|
+------------------------------------------+--------------+------------------+---+-------------------+

-------------------------------------------
Batch: 28
-------------------------------------------
+------------------------------------------+--------------+-----------------+---+-----------+
|window                                    |country       |Total_Volume_Sale|OPM|Rate_Return|
+------------------------------------------+--------------+-----------------+---+-----------+
|[2020-11-05 16:30:00, 2020-11-05 16:31:00]|United Kingdom|66.70999991893768|3  |0.0        |
+------------------------------------------+--------------+-----------------+---+-----------+

This is the HDFS output (note that every part file has size 0, and `hadoop fs -cat` prints nothing):

[root@ip-10-0-0-22 ~]# hadoop fs -ls output_3
Found 9 items
drwxr-xr-x   - root supergroup          0 2020-11-05 21:02 output_3/_spark_metadata
-rw-r--r--   3 root supergroup          0 2020-11-05 20:58 output_3/part-00000-08fb95d2-b875-4a18-8c55-4f09f282b825-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 21:03 output_3/part-00000-12f4656f-64d9-412e-a1a5-a8d1eb8f0991-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 21:00 output_3/part-00000-45b83c6a-f348-4b2a-b982-0cb99a201223-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 21:02 output_3/part-00000-6c4f1258-afd4-426c-9b52-9822e6a9f8ab-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 20:57 output_3/part-00000-82519fb3-9801-4847-934b-32e57b3346cc-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 21:01 output_3/part-00000-8fb0f74c-c6be-4401-ab34-35cd05385122-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 20:59 output_3/part-00000-90a55ab9-04f7-4d19-ad22-5175a3306050-c000.json
-rw-r--r--   3 root supergroup          0 2020-11-05 20:56 output_3/part-00000-c3e7a091-1f32-4c56-a1e6-3ba59e719dae-c000.json
[root@ip-10-0-0-22 ~]# hadoop fs -cat output_3/part-00000-6c4f1258-afd4-426c-9b52-9822e6a9f8ab-c000.json
[root@ip-10-0-0-22 ~]#
