weixin_39897267
2021-01-12 13:49

A bug of titan-hadoop in Titan 0.5.2?

I ran a Hadoop job using Gremlin and hit the following exception:

```java
Error: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl
    at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
    at com.thinkaurelius.titan.hadoop.formats.util.TitanInputFormat.getGraphSetup(TitanInputFormat.java:49)
    at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraRecordReader.initialize(TitanCassandraRecordReader.java:44)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:525)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
    ... 10 more
Caused by: com.thinkaurelius.titan.core.TitanException: Could not execute operation due to backend exception
    at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:44)
    at com.thinkaurelius.titan.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:428)
    at com.thinkaurelius.titan.diskstorage.BackendTransaction.edgeStoreQuery(BackendTransaction.java:253)
    at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.edgeQuery(StandardTitanGraph.java:344)
    at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$1.retrieveSchemaRelations(StandardTitanGraph.java:299)
    at com.thinkaurelius.titan.graphdb.database.cache.MetricInstrumentedSchemaCache$1.retrieveSchemaRelations(MetricInstrumentedSchemaCache.java:33)
    at com.thinkaurelius.titan.graphdb.database.cache.StandardSchemaCache.getSchemaRelations(StandardSchemaCache.java:157)
    at com.thinkaurelius.titan.graphdb.database.cache.MetricInstrumentedSchemaCache.getSchemaRelations(MetricInstrumentedSchemaCache.java:53)
    at com.thinkaurelius.titan.graphdb.types.vertices.TitanSchemaVertex.getRelated(TitanSchemaVertex.java:98)
    at com.thinkaurelius.titan.graphdb.types.vertices.RelationTypeVertex.getBaseType(RelationTypeVertex.java:74)
    at com.thinkaurelius.titan.graphdb.database.management.ManagementSystem$7.apply(ManagementSystem.java:1081)
    at com.thinkaurelius.titan.graphdb.database.management.ManagementSystem$7.apply(ManagementSystem.java:1077)
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:649)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at com.thinkaurelius.titan.graphdb.schema.SchemaContainer.<init>(SchemaContainer.java:37)
    at com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl.<init>(TitanHadoopSetupImpl.java:40)
    ... 15 more
Caused by: com.thinkaurelius.titan.diskstorage.PermanentBackendException: Permanent failure in storage backend
    at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.convertException(CassandraThriftKeyColumnValueStore.java:249)
    at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:148)
    at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:91)
    at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:80)
    at com.thinkaurelius.titan.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:65)
    at com.thinkaurelius.titan.diskstorage.util.MetricInstrumentedStore$1.call(MetricInstrumentedStore.java:92)
    at com.thinkaurelius.titan.diskstorage.util.MetricInstrumentedStore$1.call(MetricInstrumentedStore.java:90)
    at com.thinkaurelius.titan.diskstorage.util.MetricInstrumentedStore.runWithMetrics(MetricInstrumentedStore.java:214)
    at com.thinkaurelius.titan.diskstorage.util.MetricInstrumentedStore.getSlice(MetricInstrumentedStore.java:89)
    at com.thinkaurelius.titan.diskstorage.keycolumnvalue.cache.KCVSCache.getSliceNoCache(KCVSCache.java:63)
    at com.thinkaurelius.titan.diskstorage.BackendTransaction$1.call(BackendTransaction.java:256)
    at com.thinkaurelius.titan.diskstorage.BackendTransaction$1.call(BackendTransaction.java:253)
    at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:56)
    at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:42)
    ... 31 more
Caused by: TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14526)
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14463)
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14389)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:732)
    at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:716)
    at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:129)
    ... 43 more
```

Some map tasks failed, but the job ultimately SUCCEEDED. I don't know whether this is a bug or bad data in my graph?
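
For reference, the root cause at the bottom of the trace is a Cassandra Thrift `TimedOutException` during `multiget_slice`, raised while `TitanHadoopSetupImpl` builds a `SchemaContainer`, i.e. the mapper's very first schema read against Cassandra timed out. Below is a minimal sketch (the properties file path is a placeholder; it assumes a Titan 0.5.x client config for the same cassandrathrift keyspace) that performs the same schema load outside MapReduce, to check whether the timeout is reproducible on its own or was transient:

```java
import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;
import com.thinkaurelius.titan.graphdb.schema.SchemaContainer;

// Hedged sketch: new SchemaContainer(graph) is the same call the failing
// TitanHadoopSetupImpl constructor makes, so running it standalone exercises
// the exact schema read that hit the Cassandra timeout in the trace above.
public class SchemaReadCheck {
    public static void main(String[] args) {
        // Placeholder path: point this at the same cassandrathrift keyspace
        // the Hadoop job reads from.
        TitanGraph graph = TitanFactory.open("conf/titan-cassandra.properties");
        try {
            new SchemaContainer(graph);
            System.out.println("Schema loaded without a backend timeout.");
        } finally {
            graph.shutdown();
        }
    }
}
```

If this also fails intermittently, the timeout is likely on the Cassandra side (overloaded nodes or a low `read_request_timeout_in_ms`) rather than anything specific to titan-hadoop.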

This question comes from the open-source project: thinkaurelius/titan


5 replies

  • weixin_39621185 · 4 months ago

    I edited the stacktrace in your original report. The trace has newlines and indentation now instead of being strung out on one very long line. The content should be identical aside from whitespace changes.

    I can't seem to reproduce this. I induced this failure in a job by changing TitanHadoopSetupImpl's constructor:

    ```java
    public TitanHadoopSetupImpl(final Configuration config) {
        BasicConfiguration bc = ModifiableHadoopConfiguration.of(config).getInputConf();
        graph = (StandardTitanGraph) TitanFactory.open(bc);
        throw new TitanException("#904");
        // FaunusSchemaManager.getTypeManager(null).setSchemaProvider(new SchemaContainer(graph));
        // tx = (StandardTitanTx) graph.buildTransaction().readOnly().setVertexCacheSize(200).start();
    }
    ```

    Then I compiled and tried reading the Graph of the Gods from Cassandra + ES:

    
    ```
    $ bin/gremlin.sh
    
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----
    23:52:42 WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    gremlin> h = HadoopFactory.open('conf/hadoop/titan-cassandra-input.properties')
    ==>titangraph[hadoop:titancassandrainputformat->graphsonoutputformat]
    gremlin> h._
    23:52:49 INFO  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Added mapper IdentityMap.Map via ChainMapper with output (class org.apache.hadoop.io.NullWritable,class com.thinkaurelius.titan.hadoop.FaunusVertex); current state is MAPPER
    23:52:49 INFO  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Configuring 1 MapReduce job(s)...
    23:52:49 INFO  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Configuring [Job 1/1: IdentityMap.Map]
    23:52:49 INFO  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Configured 1 MapReduce job(s)
    23:52:49 INFO  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Preparing to execute 1 MapReduce job(s)...
    23:52:49 INFO  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Executing [Job 1/1: IdentityMap.Map]
    23:52:49 WARN  org.apache.hadoop.mapreduce.JobSubmitter  - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
    23:52:51 INFO  org.apache.hadoop.mapreduce.JobSubmitter  - number of splits:5
    23:52:51 INFO  org.apache.hadoop.mapreduce.JobSubmitter  - Submitting tokens for job: job_local1519462141_0001
    23:52:51 WARN  org.apache.hadoop.conf.Configuration  - file:/tmp/hadoop-dalaro/mapred/staging/dalaro1519462141/.staging/job_local1519462141_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
    23:52:51 WARN  org.apache.hadoop.conf.Configuration  - file:/tmp/hadoop-dalaro/mapred/staging/dalaro1519462141/.staging/job_local1519462141_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
    23:52:54 WARN  org.apache.hadoop.conf.Configuration  - file:/tmp/hadoop-dalaro/mapred/local/localRunner/dalaro/job_local1519462141_0001/job_local1519462141_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
    23:52:54 WARN  org.apache.hadoop.conf.Configuration  - file:/tmp/hadoop-dalaro/mapred/local/localRunner/dalaro/job_local1519462141_0001/job_local1519462141_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
    23:52:54 INFO  org.apache.hadoop.mapreduce.Job  - The url to track the job: http://localhost:8080/
    23:52:54 INFO  org.apache.hadoop.mapreduce.Job  - Running job: job_local1519462141_0001
    ...
    23:52:56 INFO  org.apache.hadoop.mapred.LocalJobRunner  - Map task executor complete.
    23:52:56 WARN  org.apache.hadoop.mapred.LocalJobRunner  - job_local1519462141_0001
    java.lang.Exception: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl
            at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
    Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl
            at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
            at com.thinkaurelius.titan.hadoop.formats.util.TitanInputFormat.getGraphSetup(TitanInputFormat.java:49)
            at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraRecordReader.initialize(TitanCassandraRecordReader.java:44)
            at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:524)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:762)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
            at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
            at java.util.concurrent.FutureTask.run(FutureTask.java:262)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:744)
    Caused by: java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
            at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
            ... 11 more
    Caused by: com.thinkaurelius.titan.core.TitanException: #904
            at com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl.<init>(TitanHadoopSetupImpl.java:41)
            ... 16 more
    23:52:56 INFO  org.apache.hadoop.mapreduce.Job  - Job job_local1519462141_0001 failed with state FAILED due to: NA
    23:52:56 INFO  org.apache.hadoop.mapreduce.Job  - Counters: 0
    23:52:56 ERROR com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Error executing [Job 1/1: IdentityMap.Map]; this job has failed and 0 subsequent MapReduce job(s) have been canceled
    gremlin>
    ```

    The job reports final state FAILED. My exception is thrown on TitanHadoopSetupImpl.java:41 and yours is thrown on line 40, but it's effectively the same point in execution. My line is just shifted down by one because I added an import for TitanException.

    I used the local/inmemory Hadoop executor above, but I also tried with a 2.2.0 pseudo-distributed Hadoop cluster, and I got effectively the same result. The pseudo-distributed cluster retried the mapper three times, but it failed with the exception each time. Hadoop then gave up and switched the job to state FAILED.

    Could this have been a transient timeout, such that Hadoop retried the failed task and it then completed? That's the only explanation I have for this right now.
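
    One detail that would explain "some maps failed but the job SUCCEEDED": Hadoop retries each failed map task attempt up to `mapreduce.map.maxattempts` times (typically 4 on Hadoop 2.x) and only marks the task, and then the job, FAILED once every attempt fails. So a transient Cassandra timeout on the first attempt followed by a successful retry shows up as failed map attempts while the job still finishes SUCCEEDED. A hedged sketch of where those knobs live, assuming you set them on the job's Configuration (values are illustrative, not recommendations):

    ```java
    import org.apache.hadoop.conf.Configuration;

    // Illustrative sketch of the standard Hadoop 2.x retry/timeout keys.
    public class RetryTuning {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // A map task attempt that dies with the backend timeout is retried
            // up to this many times before the whole task (and job) is failed.
            conf.setInt("mapreduce.map.maxattempts", 8);      // default: 4
            // Extra headroom before a slow attempt is killed by the framework.
            conf.setLong("mapreduce.task.timeout", 1200000L); // milliseconds
            System.out.println(conf.get("mapreduce.map.maxattempts"));
        }
    }
    ```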

  • weixin_39871162 · 4 months ago

    I am facing the same issue reported by "boliza" on Titan 0.5.4. What is the resolution?

  • weixin_39983383 · 4 months ago

    I'm reasonably sure this was closed because dalaro was unable to reproduce the problem and no additional information was supplied. It would be interesting to know whether Titan 1.x has this problem. I'm not sure if you are in a position to upgrade, but you might try that as an option.

  • weixin_39871162 · 4 months ago

    Hi Stephen,

    Upgrading is not possible at the moment, as we would have to make a lot of code changes to become compatible with Titan 1.x. Recently we set up a replica of our production environment for reporting, and we are using Titan-Hadoop there; unfortunately, whenever we run a job we get the exception below.

    java.lang.Exception: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
    Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl
        at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanInputFormat.getGraphSetup(TitanInputFormat.java:49)
        at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraRecordReader.initialize(TitanCassandraRecordReader.java:44)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:524)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:762)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
        ... 11 more
    Caused by: java.lang.IllegalStateException: Could not find type for id: 247829
        at com.google.common.base.Preconditions.checkState(Preconditions.java:177)
        at com.thinkaurelius.titan.graphdb.types.vertices.TitanSchemaVertex.getName(TitanSchemaVertex.java:45)
        at com.thinkaurelius.titan.graphdb.schema.EdgeLabelDefinition.<init>(EdgeLabelDefinition.java:20)
        at com.thinkaurelius.titan.graphdb.schema.SchemaContainer.<init>(SchemaContainer.java:34)
        at com.thinkaurelius.titan.hadoop.formats.util.input.current.TitanHadoopSetupImpl.<init>(TitanHadoopSetupImpl.java:40)
        ... 16 more

    Please find our properties file (conf/hadoop/titan-cassandra-input.properties) below:

    # input graph parameters
    titan.hadoop.input.format=com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraInputFormat
    titan.hadoop.input.conf.storage.backend=cassandrathrift
    titan.hadoop.input.conf.storage.hostname=
    titan.hadoop.input.conf.storage.port=
    titan.hadoop.input.conf.storage.cassandra.keyspace=titan
    titan.hadoop.input.conf.index.search.backend=elasticsearch
    titan.hadoop.input.conf.index.search.hostname=
    cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
    cassandra.input.split.size=512
    cassandra.thrift.framed.size_mb=160
    cassandra.thrift.message.max_size_mb=161

    # output data (graph or statistic) parameters
    titan.hadoop.sideeffect.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
    titan.hadoop.output.format=com.thinkaurelius.titan.hadoop.formats.graphson.GraphSONOutputFormat
    titan.hadoop.output.conf.storage.batch-loading=true
    titan.hadoop.output.conf.storage.infer-schema=true

    # It is possible to provide Hadoop configuration parameters here.
    # Note that these parameters are provided to each MapReduce job within the
    # entire Titan/Hadoop job pipeline.
    # Some of these parameters may be overwritten by Titan/Hadoop as deemed necessary.
    mapred.linerecordreader.maxlength=5242880
    mapred.map.child.java.opts=-Xmx1024m
    mapred.reduce.child.java.opts=-Xmx1024m
    mapred.map.tasks=6
    mapred.reduce.tasks=3
    mapred.job.reuse.jvm.num.tasks=-1
    mapred.task.timeout=5400000
    mapred.reduce.parallel.copies=50
    io.sort.factor=100
    io.sort.mb=200
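
    Side note on how those keys are consumed: everything under `titan.hadoop.input.conf.` is stripped of that prefix inside each mapper and used as an ordinary Titan client configuration (that is what the `ModifiableHadoopConfiguration.of(config).getInputConf()` call in the constructor shown in the first reply produces). A small, hedged diagnostic sketch, using only the JDK, that prints that effective input-side configuration from the properties file (the path is an assumption):

    ```java
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    // Hedged diagnostic: lists the input-side Titan keys the mappers will use
    // (storage.backend, storage.hostname, ...). It mimics the prefix-stripping
    // conceptually; it is not titan-hadoop's actual code path.
    public class ShowInputConf {
        private static final String PREFIX = "titan.hadoop.input.conf.";

        public static void main(String[] args) throws IOException {
            Properties props = new Properties();
            try (FileInputStream in =
                    new FileInputStream("conf/hadoop/titan-cassandra-input.properties")) {
                props.load(in);
            }
            for (String key : props.stringPropertyNames()) {
                if (key.startsWith(PREFIX)) {
                    System.out.println(key.substring(PREFIX.length())
                            + "=" + props.getProperty(key));
                }
            }
        }
    }
    ```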

    Quick help would be highly appreciated.

    Thanks,

    Karteek.

    On Wed, Nov 25, 2015 at 4:12 PM, stephen mallette wrote:

    I'm reasonably sure this was closed as https://github.com/dalaro was unable to reproduce the problem and no additional information was supplied. It would be interesting to know if Titan 1.x has this problem. I'm not sure if you are in a position to upgrade, but you might try that as an option.

    — Reply to this email directly or view it on GitHub https://github.com/thinkaurelius/titan/issues/904#issuecomment-159568068 .

  • weixin_39983383 · 4 months ago

    Caused by: java.lang.IllegalStateException: Could not find type for id: 247829

    I've known that error to represent data corruption (which isn't something the initial post showed, so this isn't quite the same problem). It's been suggested to me that this issue is repairable with Cassandra tooling, but I'm not 100% sure.
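
    If it helps to narrow it down: the message comes from a checkState in TitanSchemaVertex.getName(), so the graph holds a reference to a type id (247829 here) that no longer resolves to a named schema element. A hedged sketch (Titan 0.5.x management API, placeholder properties path) that walks the defined edge labels and property keys and reports the first one that trips the same check:

    ```java
    import com.thinkaurelius.titan.core.EdgeLabel;
    import com.thinkaurelius.titan.core.PropertyKey;
    import com.thinkaurelius.titan.core.TitanFactory;
    import com.thinkaurelius.titan.core.TitanGraph;
    import com.thinkaurelius.titan.core.schema.TitanManagement;

    // Hedged sketch: enumerate the schema much as SchemaContainer does and
    // report the element that trips "Could not find type for id ...".
    public class SchemaWalk {
        public static void main(String[] args) {
            // Placeholder path: a client config for the affected keyspace.
            TitanGraph graph = TitanFactory.open("conf/titan-cassandra.properties");
            TitanManagement mgmt = graph.getManagementSystem();
            try {
                for (EdgeLabel el : mgmt.getRelationTypes(EdgeLabel.class)) {
                    System.out.println("edge label: " + el.getName());
                }
                for (PropertyKey pk : mgmt.getRelationTypes(PropertyKey.class)) {
                    System.out.println("property key: " + pk.getName());
                }
                System.out.println("schema walked cleanly");
            } catch (IllegalStateException e) {
                // Same failure the mapper hit, e.g. "Could not find type for id: 247829"
                System.out.println("dangling schema reference: " + e.getMessage());
            } finally {
                mgmt.rollback();
                graph.shutdown();
            }
        }
    }
    ```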

    dalaro might have to weigh in and provide an answer on that one.
