The column dat contains timestamps in two different formats. I am trying to convert multiple string date formats into a single format.
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql.functions import from_unixtime
# Sample data
data1 = [
    ("host1", "cpu", "2020-03-23 07:30:20"),
    ("host2", "memory", "1616131516"),
    ("host3", "disk", "2020-03-23 08:50:00"),
    ("host4", "memory", "1816131316"),
]
# Defining Schema
schema1 = StructType([
    StructField("hostname", StringType(), True),
    StructField("kpi", StringType(), True),
    StructField("dat", StringType(), True),
])
# Creating dataframe
df = spark.createDataFrame(data=data1,schema=schema1)
df.printSchema()
df.show(truncate=False)
root
 |-- hostname: string (nullable = true)
 |-- kpi: string (nullable = true)
 |-- dat: string (nullable = true)

+--------+------+-------------------+
|hostname|kpi   |dat                |
+--------+------+-------------------+
|host1   |cpu   |2020-03-23 07:30:20|
|host2   |memory|1616131516         |
|host3   |disk  |2020-03-23 08:50:00|
|host4   |memory|1816131316         |
+--------+------+-------------------+
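I have code that converts only the unixtime format. I need to convert both formats in the column dat to the desired format "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" in a single line of code, because I am working with a data stream.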
# Keep the DataFrame in df1; chaining .show() onto the assignment would store None
df1 = df.withColumn('datetime', from_unixtime(df.dat, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"))
df1.show(truncate=False)
+--------+------+-------------------+------------------------+
|hostname|kpi   |dat                |datetime                |
+--------+------+-------------------+------------------------+
|host1   |cpu   |2020-03-23 07:30:20|null                    |
|host2   |memory|1616131516         |2021-03-19T05:25:16.000Z|
|host3   |disk  |2020-03-23 08:50:00|null                    |
|host4   |memory|1816131316         |2027-07-21T00:55:16.000Z|
+--------+------+-------------------+------------------------+
The DataFrame I want is:
+--------+------+-------------------+------------------------+
|hostname|kpi   |dat                |datetime                |
+--------+------+-------------------+------------------------+
|host1   |cpu   |2020-03-23 07:30:20|2020-03-23T07:30:20.000Z|
|host2   |memory|1616131516         |2021-03-19T05:25:16.000Z|
|host3   |disk  |2020-03-23 08:50:00|2020-03-23T08:50:00.000Z|
|host4   |memory|1816131316         |2027-07-21T00:55:16.000Z|
+--------+------+-------------------+------------------------+
1 Answer
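You can use date_format to convert the other, "standard" date format to the desired format, and coalesce the result with your existing column converted using from_unixtime.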