Trying to load a csv file that contains spaces between the columns.
First row of the csv:
058921107 039128053 20200701-290640-0 20200701 000000BORGWARNER ITHACA LLC DBA BORGWARNE 489140-10001 LDD INVENTORY 039128053 1 4359697 PACKAGE,CHAIN DRIVE 005 285000492 0 19691231 185959 0 20200101 00000020200630 000000IMMEDIATE 1600 20200630 000000
Sample of the script used:
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

val df1: DataFrame = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .option("delimiter", " ")
  .option("ignoreLeadingWhiteSpace", "true")
  .option("ignoreTrailingWhiteSpace", "true")
  .csv("test.csv")

df1.show(2)
1 answer
I have specified the column sizes as 18, whether or not that is correct.
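Since values in the sample row contain internal spaces (e.g. "BORGWARNER ITHACA LLC DBA BORGWARNE", "PACKAGE,CHAIN DRIVE"), a single-space delimiter will split those values apart. If the file is actually fixed-width, one common approach is to read each line as a single string column and slice it by position with `substring`. The sketch below is only an illustration: the `(name, start, length)` entries in `colWidths` are placeholders, not the real layout of this file, which would need all 18 column widths filled in.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, substring, trim}

val spark = SparkSession.builder().appName("fixed-width-read").getOrCreate()

// Hypothetical (name, start, length) triples -- replace with the actual
// layout of the 18 columns. Positions are 1-based, as Spark's substring is.
val colWidths = Seq(
  ("col1", 1, 9),
  ("col2", 11, 9),
  ("col3", 21, 17)
)

// Read every line of the file as one string column named "value".
val raw = spark.read.text("test.csv")

// Slice each line by position and trim the padding spaces.
val df = colWidths
  .foldLeft(raw) { case (acc, (name, start, len)) =>
    acc.withColumn(name, trim(substring(col("value"), start, len)))
  }
  .drop("value")

df.show(2, truncate = false)

This avoids the delimiter problem entirely, because no splitting on spaces happens; each column is cut out of the line at a known offset.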