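I have a graph in which some nodes have millions of incoming edges. I need to get the edge count of these nodes periodically. I am using Cassandra as the storage backend. Query: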
g.V().has('vid','qwerty').inE().count().next()
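All the available documentation explains how to do this with Apache Spark from the Gremlin console. Is there a way to write the logic outside the Gremlin console as a Spark job and run it periodically on a Hadoop cluster?
Here is the output of the query on the Gremlin console when I am not using Spark: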
14108889 [gremlin-server-session-1] WARN org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor - Exception processing a script on request [RequestMessage{, requestId=c3d902b7-0fdd-491d-8639-546963212474, op='eval', processor='session', args={gremlin=g.V().has('vid','qwerty').inE().count().next(), session=2831d264-4566-4d15-99c5-d9bbb202b1f8, bindings={}, manageTransaction=false, batchSize=64}}].
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
    at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143)
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100)
    at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129)
    at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288)
    at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285)
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
    at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
    at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285)
    at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441)
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054)
    at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
    at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054)
    at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113)
    at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133)
    at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95)
    at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200)
    at java_util_Iterator$next.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
    at Script13.run(Script13.groovy:1)
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843)
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
    at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
However, g.V().has('vid','qwerty').inE().limit(10000).count().next()
works fine and returns ==>10000
1 Answer
zengzsys #1
Here is a Java client that uses SparkGraphComputer to create the graph:
The corresponding properties file is:
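A minimal sketch of what such a properties file might contain, written as a small Java program so that it stays in the same language as the client. This is not the answerer's original file. The class name `HadoopGraphConfig`, the host `localhost`, the keyspace `janusgraph`, the Thrift port `9160` and the `local[4]` Spark master are illustrative assumptions. The input format class also depends on the JanusGraph version: `CassandraInputFormat` is assumed here because the stack trace shows the Thrift backend, while newer releases ship different input format classes.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper: builds a HadoopGraph configuration for reading a
// JanusGraph keyspace from Spark. All values are assumptions to adapt.
final class HadoopGraphConfig {

    static Properties build(String cassandraHost, String keyspace, String sparkMaster) {
        Properties p = new Properties();
        // Open the graph as a HadoopGraph so that an OLAP graph computer can read it.
        p.setProperty("gremlin.graph",
                "org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph");
        // Assumption: Thrift-era JanusGraph; check the class shipped with your version.
        p.setProperty("gremlin.hadoop.graphReader",
                "org.janusgraph.hadoop.formats.cassandra.CassandraInputFormat");
        p.setProperty("gremlin.hadoop.graphWriter",
                "org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat");
        p.setProperty("gremlin.hadoop.jarsInDistributedCache", "true");
        p.setProperty("gremlin.hadoop.inputLocation", "none");
        p.setProperty("gremlin.hadoop.outputLocation", "output");
        // Storage settings handed through to the JanusGraph input format.
        p.setProperty("janusgraphmr.ioformat.conf.storage.backend", "cassandrathrift");
        p.setProperty("janusgraphmr.ioformat.conf.storage.hostname", cassandraHost);
        p.setProperty("janusgraphmr.ioformat.conf.storage.port", "9160");
        p.setProperty("janusgraphmr.ioformat.conf.storage.cassandra.keyspace", keyspace);
        // Must match the partitioner configured on the Cassandra cluster.
        p.setProperty("cassandra.input.partitioner.class",
                "org.apache.cassandra.dht.Murmur3Partitioner");
        // Spark settings; replace the master with your cluster manager for real runs.
        p.setProperty("spark.master", sparkMaster);
        p.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        return p;
    }

    // Writes the configuration in standard .properties format.
    static void write(Properties p, Path target) throws IOException {
        try (Writer w = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            p.store(w, "HadoopGraph configuration for JanusGraph on Cassandra (sketch)");
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get(args.length > 0 ? args[0] : "read-cassandra.properties");
        write(build("localhost", "janusgraph", "local[4]"), target);
        System.out.println("Wrote " + target.toAbsolutePath());
    }
}
```

A file like this can be passed to TinkerPop's `GraphFactory.open(...)`, and the traversal source obtained with `graph.traversal().withComputer(SparkGraphComputer.class)`, so that `g.V().has('vid','qwerty').inE().count()` is evaluated by Spark instead of in a single Gremlin Server thread. For periodic runs on a Hadoop cluster, `spark.master` would presumably point at the cluster manager rather than `local[4]`.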
And the corresponding pom.xml dependencies for all the Spark- and Hadoop-specific classes:
Hope this helps :)