dtmm0148603 2017-03-21 09:47
浏览 60
已采纳

在AWS EMR sdk中使用AddJobFlowStep的正确方法是什么?

I've used the go AWS sdk to create a cluster and added a job flow step to it. However the execution of the step always fails when I do it programatically. An interesting point to notice is that when I attach the jar from the UI, it successfully executes.

So when the jar is attached from the UI, this is the outcome of the step execution(it runs successfully and moves to the COMPLETED state): (Copying the full text)

JAR location : command-runner.jar Main class : None Arguments : spark-submit --deploy-mode cluster --class Hello s3://mdv-testing/Util-assembly-1.0.jar Action on failure: Continue

However, this is the output of the step when I try programatically:

Status :FAILED Reason : Main Class not found. Log File : s3://mdv-testing/awsLogs/j-3RW9K14BS6GLO/steps/s-337M25MLV3BHT/stderr.gz Details : Caused by: java.lang.ClassNotFoundException: scala.reflect.api.TypeCreator JAR location : s3://mdv-testing/Util-assembly-1.0.jar Main class : None > Arguments : spark-submit "--class Hello" Action on failure: Cancel and wait

I tried various combinations for the arguments and realised that the command-runner.jar was never present. I accordingly made changes to the code and send the command-runner.jar as the argument now. This now reflects the same details as the step that executes successfully. This is the revised output:

Status :FAILED Reason : Unknown Error. Log File : s3://mdv-testing/awsLogs/j-3RW9K14BS6GLO/steps/s-3NI5ZO15VTWQK/ JAR location : command-runner.jar Main class : None Arguments : "spark-submit --deploy-mode cluster --class Hello s3://mdv-testing/Util-assembly-1.0.jar Action on failure: Cancel and wait

Go Code

package main
import (
"fmt"

"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/emr"
)

func main() {
sess := session.New(&aws.Config{Region: aws.String("us-east-1")})
svc := emr.New(sess)

params := &emr.AddJobFlowStepsInput{
JobFlowId: aws.String("j-3RW9K14BS6aaa"),
Steps: []*emr.StepConfig{
{
    ActionOnFailure: aws.String("CANCEL_AND_WAIT"), //TERMINATE_CLUSTER"),
    HadoopJarStep: &emr.HadoopJarStepConfig{
    Args: []*string{
                     aws.String("spark-submit --deploy-mode cluster --class Hello s3://mdv-testing/Util-assembly-1.0.jar"),
                   },
                     Jar: aws.String("command-runner.jar"), },
                     Name: aws.String("ReportJarExecution"),
    },
},
}

resp, err := svc.AddJobFlowSteps(params)

if err != nil {
// Print the error, cast err to awserr. sError to get the Code and
// Message from an error.
fmt.Println(err.Error())
return
}

// Pretty-print the response data.
fmt.Println(resp)
}

Can someone please help me !!! I think I'm pretty close to the solution but it is evading me big time :(

  • 写回答

1条回答 默认 最新

  • duancaishi1897 2017-03-23 10:31
    关注

    I managed to solve this issue. For anyone who is struggling with something similar, the answer is that we need to send the arguments separately in an array.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 delta降尺度计算的一些细节,有偿
  • ¥15 Arduino红外遥控代码有问题
  • ¥15 数值计算离散正交多项式
  • ¥30 数值计算均差系数编程
  • ¥15 redis-full-check比较 两个集群的数据出错
  • ¥15 Matlab编程问题
  • ¥15 训练的多模态特征融合模型准确度很低怎么办
  • ¥15 kylin启动报错log4j类冲突
  • ¥15 超声波模块测距控制点灯,灯的闪烁很不稳定,经过调试发现测的距离偏大
  • ¥15 import arcpy出现importing _arcgisscripting 找不到相关程序