Gobblin: Error: java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStart

Posted by q43xntqr on 2021-05-27 in Hadoop

I am trying to ingest data from a Kafka topic into HDFS, following https://gobblin.readthedocs.io/en/latest/case-studies/kafka-hdfs-ingestion/
Steps I followed:
Start ZooKeeper:
$ zookeeper-server-start.bat C:\Users\name\kafka_2.11-1.1.0\config\zookeeper.properties
Start Kafka:
$ kafka-server-start.bat C:\Users\name\kafka_2.11-1.1.0\config\server.properties
Create the Kafka topic (if it does not exist):
$ kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Start Hadoop:
$ C:\Users\name\hadoop-3.1.3\sbin\start-all.cmd
Create kafka-hdfs.pull in GOBBLIN_JOB_CONFIG_DIR with the following content:

job.group=GobblinKafka
job.description=Gobblin quick start job for Kafka
job.lock.enabled=false

kafka.brokers=localhost:9092

source.class=org.apache.gobblin.source.extractor.extract.kafka.KafkaSimpleSource
extract.namespace=org.apache.gobblin.extract.kafka

writer.builder.class=org.apache.gobblin.writer.SimpleDataWriterBuilder
writer.file.path.type=tablename
writer.destination.type=HDFS
writer.output.format=txt

data.publisher.type=org.apache.gobblin.publisher.BaseDataPublisher

mr.job.max.mappers=1

metrics.reporting.file.enabled=true
metrics.log.dir=/gobblin-kafka/metrics
metrics.reporting.file.suffix=txt

bootstrap.with.offset=earliest

fs.uri=hdfs://localhost:9000
writer.fs.uri=hdfs://localhost:9000
state.store.fs.uri=hdfs://localhost:9000

mr.job.root.dir=/gobblin-kafka/working
state.store.dir=/gobblin-kafka/state-store
task.data.root.dir=/jobs/kafkaetl/gobblin/gobblin-kafka/task-data
data.publisher.final.dir=/gobblintest/job-output
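
Several of the settings above name directories that the job creates or writes to. As a rough sketch (not part of Gobblin), the function below prints each `*.dir` value from a .pull file prefixed with its `fs.uri`, so the locations can be inspected before the job runs. The function name `list_job_dirs` is hypothetical, and it assumes every `*.dir` key is resolved against `fs.uri`, which may not hold for every key (for example `metrics.log.dir`).

```shell
#!/usr/bin/env bash
# Sketch: list the directories a Gobblin .pull file refers to, as full URIs.
# Assumption: every key ending in ".dir" is resolved against fs.uri.

list_job_dirs() {
  local pull_file=$1 content fs_uri
  # Strip carriage returns in case the file was saved with Windows line endings.
  content=$(tr -d '\r' < "$pull_file")
  # Value of the plain fs.uri key (writer.fs.uri etc. do not match).
  fs_uri=$(printf '%s\n' "$content" | sed -n 's|^fs\.uri=||p' | head -n 1)
  # Print the value of every "<something>.dir=/absolute/path" line.
  printf '%s\n' "$content" \
    | sed -n 's|^[A-Za-z.]*\.dir=\(/.*\)$|\1|p' \
    | while IFS= read -r dir; do
        printf '%s%s\n' "$fs_uri" "$dir"
      done
}
```

Calling `list_job_dirs "$GOBBLIN_JOB_CONFIG_DIR/kafka-hdfs.pull"` prints one URI per line; each can then be passed to `hdfs dfs -ls -d <uri>` to see the owner and mode of that directory, if it exists.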

Set GOBBLIN_WORK_DIR:
$ export GOBBLIN_WORK_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_WORK_DIR
Set GOBBLIN_JOB_CONFIG_DIR:
$ export GOBBLIN_JOB_CONFIG_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_JOB_CONFIG_DIR
Start Gobblin in standalone mode:
$ bin/gobblin.sh service standalone start

Here are some of the errors found in logs/standalone.out:

[JobScheduler-0] org.apache.gobblin.scheduler.JobScheduler$NonScheduledJobRunner  637 - Failed to run job GobblinKafkaQuickStart
org.apache.gobblin.runtime.JobException: Failed to run job GobblinKafkaQuickStart

ERROR [ForkExecutor-0] org.apache.gobblin.runtime.fork.Fork  258 - Fork 0 of task task_GobblinKafkaQuickStart_1580883582897_0 failed to process data records. Set throwable in holder org.apache.gobblin.runtime.ForkThrowableHolder@721ea24d
java.lang.RuntimeException: Error creating writer

ERROR [TaskExecutor-0] org.apache.gobblin.runtime.Task  545 - Task task_GobblinKafkaQuickStart_1580883582897_0 failed
java.lang.RuntimeException: Some forks failed.

ERROR [Commit-thread-0] org.apache.gobblin.runtime.SafeDatasetCommit  196 - Failed to persist dataset state for dataset  of job job_GobblinKafkaQuickStart_1580883582897
org.apache.hadoop.security.AccessControlException: Permission denied: user=name, access=WRITE, inode="/":name:supergroup:drwxrwxr-x

ERROR [JobScheduler-0] org.apache.gobblin.util.executors.IteratorExecutor  163 - Iterator executor failure.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=name, access=WRITE, inode="/":name:supergroup:drwxrwxr-x

ERROR [JobScheduler-0] org.apache.gobblin.runtime.AbstractJobLauncher  521 - Failed to launch and run job job_GobblinKafkaQuickStart_1580883582897: java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStart_1580883582897
java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStart_1580883582897
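
The later errors in this excerpt all wrap the same AccessControlException, so it may help to reduce the log to the distinct permission failures. The following is a minimal sketch that assumes the message format shown above; `summarize_denials` is a hypothetical helper name, not a Gobblin or Hadoop tool.

```shell
#!/usr/bin/env bash
# Sketch: summarise HDFS "Permission denied" messages found in a log file.
# Assumes the message layout seen in the excerpt:
#   Permission denied: user=U, access=A, inode="PATH":OWNER:GROUP:MODE

summarize_denials() {
  # Extract only the "Permission denied" fragment of each matching line,
  # drop duplicates, then rewrite each fragment as key=value pairs.
  grep -o 'Permission denied: user=[^,]*, access=[A-Z_]*, inode="[^"]*":[^:]*:[^:]*:[a-z-]*' "$1" \
    | sort -u \
    | sed -E 's/^Permission denied: user=([^,]*), access=([A-Z_]*), inode="([^"]*)":([^:]*):([^:]*):([a-z-]*)$/user=\1 access=\2 path=\3 owner=\4 group=\5 mode=\6/'
}

# Run against the standalone log only if it exists in the current directory.
if [ -f logs/standalone.out ]; then
  summarize_denials logs/standalone.out
fi
```

For the two AccessControlException lines above this yields a single line, `user=name access=WRITE path=/ owner=name group=supergroup mode=drwxrwxr-x`. That output can be compared with what `hdfs dfs -ls -d /` reports for the same path.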

Please let me know how to resolve this.
