Dynamically adding padding zeros

bsxbgnwa  posted on 2021-05-27  in  Spark
mock_data = [('TYCO', ' 1303','13'),('EMC', '  120989  ','123'), ('VOLVO  ', '102329  ','1234'),('BMW', '1301571345  ',' '),('FORD', '004','21212')]

df = spark.createDataFrame(mock_data, ['col1', 'col2','col3'])

+-------+------------+-----+
|   col1|        col2| col3|
+-------+------------+-----+
|   TYCO|        1303|   13|
|    EMC|    120989  |  123|
|VOLVO  |    102329  | 1234|
|    BMW|1301571345  |     |
|   FORD|         004|21212|
+-------+------------+-----+

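I need to trim col2 and then, based on its length (10 - length of the trimmed col2), dynamically left-pad col3 with zeros to that width. Finally, col2 and col3 should be concatenated.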

from pyspark.sql.functions import length, trim

# Number of characters left for col3 once col2 is trimmed
df2 = df.withColumn('length_col2', 10 - length(trim(df.col2)))

+-------+------------+-----+-----------+
|   col1|        col2| col3|length_col2|
+-------+------------+-----+-----------+
|   TYCO|        1303|   13|          6|
|    EMC|    120989  |  123|          4|
|VOLVO  |    102329  | 1234|          4|
|    BMW|1301571345  |     |          0|
|   FORD|         004|21212|          7|
+-------+------------+-----+-----------+

Expected output

+-------+------------+-----+----------+
|   col1|        col2| col3|    output|
+-------+------------+-----+----------+
|   TYCO|        1303|   13|1303000013|
|    EMC|    120989  |  123|1209890123|
|VOLVO  |    102329  | 1234|1023291234|
|    BMW|1301571345  |     |1301571345|
|   FORD|         004|21212|0040021212|
+-------+------------+-----+----------+
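
For reference, the padding rule described above can be written out in plain Python, without Spark. This is only a sketch of the arithmetic: the function name pad_and_join and its width parameter are illustrative and do not come from the post, and it assumes col3 should be trimmed as well (the BMW row, whose col3 is a single space, suggests so).

```python
def pad_and_join(col2: str, col3: str, width: int = 10) -> str:
    """Trim col2, left-pad the trimmed col3 with zeros, and join them.

    Hypothetical helper: the zero-padding fills whatever is left of
    `width` characters after the trimmed col2.
    """
    left = col2.strip()
    # Room left for col3; never negative, even if col2 is longer than width.
    fill = max(width - len(left), 0)
    # str.rjust pads on the left but never truncates a longer value.
    right = col3.strip().rjust(fill, "0")
    return left + right


mock_data = [
    ("TYCO", " 1303", "13"),
    ("EMC", "  120989  ", "123"),
    ("VOLVO  ", "102329  ", "1234"),
    ("BMW", "1301571345  ", " "),
    ("FORD", "004", "21212"),
]

for col1, col2, col3 in mock_data:
    print(col1.strip(), pad_and_join(col2, col3))
```

Every row of the mock data comes out 10 characters wide under this rule, which is what the expected output table shows.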
0mkxixxg  #1

The function you are looking for is rpad, from pyspark.sql.functions. It is documented here => https://spark.apache.org/docs/2.3.0/api/sql/index.html
See the solution below:

%pyspark

mock_data = [('TYCO', ' 1303','13'),('EMC', '  120989  ','123'), ('VOLVO  ', '102329  ','1234'),('BMW', '1301571345  ',' '),('FORD', '004','21212')]

df = spark.createDataFrame(mock_data, ['col1', 'col2','col3'])

df.createOrReplaceTempView("input_df")

spark.sql("SELECT *, concat(rpad(trim(col2),10,'0') , col3) as OUTPUT from input_df").show(20,False)

And the result:

+-------+------------+-----+---------------+
|col1   |col2        |col3 |OUTPUT         |
+-------+------------+-----+---------------+
|TYCO   | 1303       |13   |130300000013   |
|EMC    |  120989    |123  |1209890000123  |
|VOLVO  |102329      |1234 |10232900001234 |
|BMW    |1301571345  |     |1301571345     |
|FORD   |004         |21212|004000000021212|
+-------+------------+-----+---------------+
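
Note that the query above pads col2 itself to 10 characters before appending col3, so its OUTPUT values are longer than the 10-character values in the question's expected output. A minimal sketch of an alternative is given here. It left-pads col3 to the remaining width (10 minus the length of the trimmed col2) using lpad inside a SQL expression string, where the length argument can be computed per row. The names OUTPUT_EXPR and with_output are illustrative, and trimming col3 is an assumption based on the BMW row.

```python
# Spark SQL expression: trim both columns, then left-pad col3 with zeros so
# that trim(col2) followed by the padded col3 is 10 characters wide.
OUTPUT_EXPR = (
    "concat(trim(col2), "
    "lpad(trim(col3), 10 - length(trim(col2)), '0'))"
)


def with_output(df):
    """Return df with an extra `output` column (hypothetical helper)."""
    # Imported lazily so the expression string above can be inspected
    # without a Spark installation.
    from pyspark.sql import functions as F

    return df.withColumn("output", F.expr(OUTPUT_EXPR))


# Usage, assuming `df` is the DataFrame built from mock_data above:
# with_output(df).show(20, False)
```

For the TYCO row this is intended to yield 1303000013, matching the question. Keep in mind that lpad also truncates: if the trimmed col3 is longer than the remaining width, the extra characters are dropped rather than kept.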
