我需要从csv文件转置第一和第三行,并需要在其中添加一些关键字。例如,我有一个包含这样数据的文件。
Col1,Col2,Col3,Col4,Col5,Col6,Col7,Test1_AA_Col8,Test1_AA_Col9,Test1_AA_Col10,Test1_BB_col11,Test1_BB_col12,Test1_BB_col13,Test2_AA_col14,Test2_AA_col15,Test2_AA_cl16,Test2_BB_col17,Test2_BB_col18,Test2_BB_col19,Test3_AA_col20,Test3_AA_col21,Test3_AA_col22,Test3_BB_col23,Test3_BB_col24,Test3_BB_col25,Test4_AA_col26,Test4_AA_cl27,Test4_AA_col28,Test4_BB_col29,Test4_BB_col30,Test4_BB_col31
对象,对象,对象,对象,对象,对象,对象,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64字符串,字符串,字符串,字符串,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”,“十进制(35,15)”
我试着换位第一行和第三行,我的输出是这样的
CAST(Col1 AS String) AS Col1,
CAST(Col2 AS String) AS Col2,
CAST(Col3 AS String) AS Col3,
CAST(Col4 AS String) AS Col4,
CAST(Col5 AS String) AS Col5,
CAST(Col6 AS String) AS Col6,
CAST(Col7 AS String) AS Col7,
CAST(Test1_AA_Col8 AS Decimal(35,15)) AS Test1_AA_Col8,
CAST(Test1_AA_Col9 AS Decimal(35,15)) AS Test1_AA_Col9,
CAST(Test1_AA_Col10 AS Decimal(35,15)) AS Test1_AA_Col10,
CAST(Test1_BB_col11 AS Decimal(35,15)) AS Test1_BB_col11,
CAST(Test1_BB_col12 AS Decimal(35,15)) AS Test1_BB_col12,
CAST(Test1_BB_col13 AS Decimal(35,15)) AS Test1_BB_col13,
CAST(Test2_AA_col14 AS Decimal(35,15)) AS Test2_AA_col14,
CAST(Test2_AA_col15 AS Decimal(35,15)) AS Test2_AA_col15,
CAST(Test2_AA_cl16 AS Decimal(35,15)) AS Test2_AA_cl16,
CAST(Test2_BB_col17 AS Decimal(35,15)) AS Test2_BB_col17,
CAST(Test2_BB_col18 AS Decimal(35,15)) AS Test2_BB_col18,
CAST(Test2_BB_col19 AS Decimal(35,15)) AS Test2_BB_col19,
CAST(Test3_AA_col20 AS Decimal(35,15)) AS Test3_AA_col20,
CAST(Test3_AA_col21 AS Decimal(35,15)) AS Test3_AA_col21,
CAST(Test3_AA_col22 AS Decimal(35,15)) AS Test3_AA_col22,
CAST(Test3_BB_col23 AS Decimal(35,15)) AS Test3_BB_col23,
CAST(Test3_BB_col24 AS Decimal(35,15)) AS Test3_BB_col24,
CAST(Test3_BB_col25 AS Decimal(35,15)) AS Test3_BB_col25,
CAST(Test4_AA_col26 AS Decimal(35,15)) AS Test4_AA_col26,
CAST(Test4_AA_cl27 AS Decimal(35,15)) AS Test4_AA_cl27,
CAST(Test4_AA_col28 AS Decimal(35,15)) AS Test4_AA_col28,
CAST(Test4_BB_col29 AS Decimal(35,15)) AS Test4_BB_col29,
CAST(Test4_BB_col30 AS Decimal(35,15)) AS Test4_BB_col30,
) AS Test4_BB_col31
这是我写的代码。
hadoop fs -text $f | sed -n '1p;3p' | awk -F, '{for (i=1; i<=NF; i++) a[i,NR=$i; max=(max<NF?NF:max)} END {for (i=1; i<=max; i++) {for (j=1; j<=NR;j++) printf "%s%s" , a[i,j], (i==NR?RS:FS) }}'
这就是我想要达到的目标。
CAST(Col1 AS String) AS Col1,
CAST(Col2 AS String) AS Col2,
CAST(Col3 AS String) AS Col3,
CAST(Col4 AS String) AS Col4,
CAST(Col5 AS String) AS Col5,
CAST(Col6 AS String) AS Col6,
CAST(Col7 AS String) AS Col7,
CAST(Test1_AA_Col8 AS Decimal(35,15)) AS Test1_AA_Col8,
CAST(Test1_AA_Col9 AS Decimal(35,15)) AS Test1_AA_Col9,
CAST(Test1_AA_Col10 AS Decimal(35,15)) AS Test1_AA_Col10,
CAST(Test1_BB_col11 AS Decimal(35,15)) AS Test1_BB_col11,
CAST(Test1_BB_col12 AS Decimal(35,15)) AS Test1_BB_col12,
CAST(Test1_BB_col13 AS Decimal(35,15)) AS Test1_BB_col13,
CAST(Test2_AA_col14 AS Decimal(35,15)) AS Test2_AA_col14,
CAST(Test2_AA_col15 AS Decimal(35,15)) AS Test2_AA_col15,
CAST(Test2_AA_cl16 AS Decimal(35,15)) AS Test2_AA_cl16,
CAST(Test2_BB_col17 AS Decimal(35,15)) AS Test2_BB_col17,
CAST(Test2_BB_col18 AS Decimal(35,15)) AS Test2_BB_col18,
CAST(Test2_BB_col19 AS Decimal(35,15)) AS Test2_BB_col19,
CAST(Test3_AA_col20 AS Decimal(35,15)) AS Test3_AA_col20,
CAST(Test3_AA_col21 AS Decimal(35,15)) AS Test3_AA_col21,
CAST(Test3_AA_col22 AS Decimal(35,15)) AS Test3_AA_col22,
CAST(Test3_BB_col23 AS Decimal(35,15)) AS Test3_BB_col23,
CAST(Test3_BB_col24 AS Decimal(35,15)) AS Test3_BB_col24,
CAST(Test3_BB_col25 AS Decimal(35,15)) AS Test3_BB_col25,
CAST(Test4_AA_col26 AS Decimal(35,15)) AS Test4_AA_col26,
CAST(Test4_AA_cl27 AS Decimal(35,15)) AS Test4_AA_cl27,
CAST(Test4_AA_col28 AS Decimal(35,15)) AS Test4_AA_col28,
CAST(Test4_BB_col29 AS Decimal(35,15)) AS Test4_BB_col29,
CAST(Test4_BB_col30 AS Decimal(35,15)) AS Test4_BB_col30,
CAST(Test4_BB_col31 AS Decimal(35,15)) AS Test4_BB_col31
1条答案
按热度按时间mrphzbgm1#
使用gnu awk从给定样本到预期输出(如果使用
FPAT
,不是FS
):输出:
一个管道内衬:
已更新带引号字段的输出:
更新的版本
FPAT
少一点臭味。它有一个入门级解析器来处理(单对)双引号中的逗号(\"
未正确处理):