DataX: syncing data from HDFS to MySQL is extremely slow

exdqitrt · posted 2022-10-20 in HDFS

I'm using DataX to sync data from HDFS to MySQL and the transfer is extremely slow.
The data in HDFS is in ORC format and is written into RDS over the internal network, yet the transfer rate is only about 500 KB/s.

Launch command:

python2 /alxxxxxtax/bin/datax.py -p "-Ddt=2020-06-30" --jvm="-Xms5G -Xmx5G" /axxxxs/dwsxxxxofile_df.json
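Presumably the -p flag is meant to define the dt variable that the reader path references as $dt; DataX substitutes -p "-Dname=value" definitions into those placeholders, so pulling a different partition should only require changing that one value. A minimal sketch, keeping the abbreviated paths from the question (the 2020-07-01 date is illustrative):

    # hedged example: same job, loading the 2020-07-01 partition instead
    python2 /alxxxxxtax/bin/datax.py -p "-Ddt=2020-07-01" --jvm="-Xms5G -Xmx5G" /axxxxs/dwsxxxxofile_df.json
    # hdfsreader would then read /xxxx_df/2020-07-01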

JSON job configuration:
{
  "core": {
    "transport": {
      "channel": {
        "speed": {
          "channel": 10,
          "record": -1,
          "byte": -1,
          "batchSize": 20480
        }
      }
    }
  },
  "job": {
    "setting": {
      "speed": {
        "channel": 10,
        "record": -1,
        "byte": -1,
        "batchSize": 20480
      }
    },
    "content": [
      {
        "reader": {
          "name": "hdfsreader",
          "parameter": {
            "path": "/xxxx_df/$dt",
            "defaultFS": "hdfs://17xxxxx:8020",
            "hadoopConfig": {
              "dfs.client.use.datanode.hostname": "true",
              "dfs.nameservices": "flashHadoop",
              "dfs.ha.namenodes.minq-cluster": "nn1,nn2",
              "dfs.namenode.rpc-address.flashHadoop.nn1": "17xxxx:8020",
              "dfs.namenode.rpc-address.flashHadoop.nn2": "172xx2:8020",
              "dfs.client.failover.proxy.provider.minq-cluster": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
            },
            "column": [
              {
                "index": 0,
                "type": "string"
              },
              {
                "index": 1,
                "type": "string"
              },
              ...
              {
                "index": 134,
                "type": "string"
              }
            ],
            "compress": "SNAPPY",
            "fileType": "orc",
            "encoding": "UTF-8",
            "fieldDelimiter": "\t"
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "writeMode": "insert ignore",
            "username": "dxxxxxx",
            "password": "8xxxxxxxx",
            "column": [
              "created_at",
              ...
              "isss_vms",
              "isssssp",
              "isxxx_irm"
            ],
            "preSql": [
              " truncate dwsxxxxxe_df "
            ],
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://rxxxxncs.com:3306/daxxxvice",
                "table": [
                  "xxxxxxxxxxx"
                ]
              }
            ]
          }
        }
      }
    ]
  }
}


kognpnkq 1#

That's normal: because HDFS is distributed, DataX has to open a separate thread for each file to transfer its data.
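On the write side, one thing worth checking (a sketch only, based on the general mysqlwriter documentation rather than this specific job): the batchSize in the config above sits under the speed blocks, while mysqlwriter takes it as a writer-level parameter, and the MySQL JDBC driver's rewriteBatchedStatements option is a common companion setting for faster batched inserts. Both the 2048 value and the URL flag below are illustrative and should be verified against the driver in use:

    "writer": {
      "name": "mysqlwriter",
      "parameter": {
        "writeMode": "insert ignore",
        "batchSize": 2048,
        "username": "dxxxxxx",
        "password": "8xxxxxxxx",
        "column": [ ... ],
        "preSql": [" truncate dwsxxxxxe_df "],
        "connection": [
          {
            "jdbcUrl": "jdbc:mysql://rxxxxncs.com:3306/daxxxvice?rewriteBatchedStatements=true",
            "table": ["xxxxxxxxxxx"]
          }
        ]
      }
    }

Also, since the work is split per file as noted above, the 10 configured channels only help if there are at least that many ORC files under /xxxx_df/$dt.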
