大数据东哥(Aidon) 2023-08-08 14:20 Acceptance rate: 0%

Hive UDF error in DolphinScheduler 3.1.1

A Hive UDF fails with DolphinScheduler 3.1.1 and Hive 3.1.2

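I wrote a Hive UDF in Java, uploaded the jar to the Resource Center, created a UDF function from it, and then used the function in a SQL task. Running the task throws a NullPointerException. The log is as follows: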

[LOG-PATH]: /opt/install/dolphinscheduler-3.1.1/worker-server/logs/20230310/8742018725184_4-128-245.log, [HOST]:  Host{address='172.24.86.97:1234', ip='172.24.86.97', port=1234}
[INFO] 2023-03-10 02:52:18.081 +0000 - Begin to pulling task
[INFO] 2023-03-10 02:52:18.084 +0000 - Begin to initialize task
[INFO] 2023-03-10 02:52:18.085 +0000 - Set task startTime: Fri Mar 10 02:52:18 UTC 2023
[INFO] 2023-03-10 02:52:18.085 +0000 - Set task envFile: /opt/install/dolphinscheduler-3.1.1/worker-server/conf/dolphinscheduler_env.sh
[INFO] 2023-03-10 02:52:18.085 +0000 - Set task appId: 128_245
[INFO] 2023-03-10 02:52:18.085 +0000 - End initialize task
[INFO] 2023-03-10 02:52:18.088 +0000 - Set task status to TaskExecutionStatus{code=1, desc='running'}
[INFO] 2023-03-10 02:52:18.088 +0000 - TenantCode:hdfs check success
[INFO] 2023-03-10 02:52:18.089 +0000 - ProcessExecDir:/opt/install/dolphinscheduler-3.1.1/data/exec/process/hdfs/7781792440512/8742018725184_4/128/245 check success
[INFO] 2023-03-10 02:52:18.089 +0000 - Resources:{} check success
[INFO] 2023-03-10 02:52:18.098 +0000 - Task plugin: SQL create success
[INFO] 2023-03-10 02:52:18.098 +0000 - Success initialized task plugin instance success
[INFO] 2023-03-10 02:52:18.098 +0000 - Success set taskVarPool: null
[INFO] 2023-03-10 02:52:18.099 +0000 - Full sql parameters: SqlParameters{type='HIVE', datasource=2, sql='select testudf("abc");', sqlType=0, sendEmail=true, displayRows=10, limit=0, segmentSeparator=, udfs='3', showType='null', connParams='null', groupId='1', title='775600040@qq.com', preStatements=[], postStatements=[]}
[INFO] 2023-03-10 02:52:18.099 +0000 - sql type : HIVE, datasource : 2, sql : select testudf("abc"); , localParams : [],udfs : 3,showType : null,connParams : null,varPool : [] ,query max result limit  0
[ERROR] 2023-03-10 02:52:18.138 +0000 - sql task error
java.lang.NullPointerException: null
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.lambda$buildJarSql$1(SqlTask.java:500)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.buildJarSql(SqlTask.java:505)
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.createFuncs(SqlTask.java:474)
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.handle(SqlTask.java:158)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerDelayTaskExecuteRunnable.executeTask(DefaultWorkerDelayTaskExecuteRunnable.java:49)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.run(WorkerTaskExecuteRunnable.java:174)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[ERROR] 2023-03-10 02:52:18.138 +0000 - Task execute failed, due to meet an exception
org.apache.dolphinscheduler.plugin.task.api.TaskException: Execute sql task failed
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.handle(SqlTask.java:168)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerDelayTaskExecuteRunnable.executeTask(DefaultWorkerDelayTaskExecuteRunnable.java:49)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.run(WorkerTaskExecuteRunnable.java:174)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: null
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.lambda$buildJarSql$1(SqlTask.java:500)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.buildJarSql(SqlTask.java:505)
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.createFuncs(SqlTask.java:474)
    at org.apache.dolphinscheduler.plugin.task.sql.SqlTask.handle(SqlTask.java:158)
    ... 9 common frames omitted
[INFO] 2023-03-10 02:52:18.139 +0000 - Get a exception when execute the task, will send the task execute result to master, the current task execute result is TaskExecutionStatus{code=6, desc='failure'}

1 answer

  • CSDN-Ada助手 (official CSDN-AI account) 2023-08-08 17:10

    [The following answer was generated by GPT]

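    The stack trace shows the NullPointerException is thrown inside SqlTask.buildJarSql (SqlTask.java:500), while DolphinScheduler is still preparing the UDF's jar statement and before the query reaches Hive. That points to a problem with how the UDF is configured in DolphinScheduler, most likely a value of the UDF being null at that point.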
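To make the shape of that stack trace concrete, here is a minimal sketch of how a null field surfaces as a NullPointerException inside a stream lambda, and how a check can name the offending UDF instead. `UdfInfo`, its fields, and both methods are hypothetical stand-ins written for illustration. They are not DolphinScheduler's actual source, and which field is null in the asker's case is not known from the log.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JarSqlSketch {

    // Hypothetical stand-in for the UDF metadata handed to a SQL task.
    // NOT DolphinScheduler's real class; the field names are assumptions.
    static class UdfInfo {
        final String funcName;
        final String resourceName; // jar path; null if the UDF is not bound to a jar

        UdfInfo(String funcName, String resourceName) {
            this.funcName = funcName;
            this.resourceName = resourceName;
        }
    }

    // Prefixes the jar path with "/" when it is relative.
    static String toAddJar(String resourceName) {
        return "add jar " + (resourceName.startsWith("/") ? resourceName : "/" + resourceName);
    }

    // Unchecked: a null resourceName throws a NullPointerException from
    // inside the lambda, with no hint about which UDF is incomplete.
    static List<String> buildJarSqlUnchecked(List<UdfInfo> udfs) {
        return udfs.stream()
                .map(u -> toAddJar(u.resourceName))
                .collect(Collectors.toList());
    }

    // Checked: fail early with a message that names the incomplete UDF.
    static List<String> buildJarSqlChecked(List<UdfInfo> udfs) {
        return udfs.stream()
                .map(u -> {
                    if (u.resourceName == null) {
                        throw new IllegalStateException(
                                "UDF '" + u.funcName + "' has no jar resource bound to it");
                    }
                    return toAddJar(u.resourceName);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<UdfInfo> udfs = Arrays.asList(new UdfInfo("testudf", null));
        try {
            buildJarSqlUnchecked(udfs);
        } catch (NullPointerException e) {
            System.out.println("unchecked: NullPointerException inside the stream lambda");
        }
        try {
            buildJarSqlChecked(udfs);
        } catch (IllegalStateException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

If the real cause is similar, checking the UDF function's record in the Resource Center for missing values (for example the jar it is bound to) would be a reasonable first step.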

    First, make sure the Java Hive UDF itself is correct and has been compiled and packaged into a jar successfully.
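One quick way to check the packaging is to confirm that the jar really contains the UDF class under the expected package. The sketch below uses only the JDK's `java.util.jar` API. The class name `UdfJarCheck` and its helper methods are made up for this example, and the jar path and class name you pass in are whatever your own build produces.

```java
import java.io.IOException;
import java.util.jar.JarFile;

class UdfJarCheck {

    // Converts a fully qualified class name into its entry path inside a jar,
    // e.g. com.example.MyUDF becomes com/example/MyUDF.class.
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath contains the compiled class.
    static boolean jarContainsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(toEntryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: UdfJarCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        boolean found = jarContainsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
    }
}
```

A missing class would normally show up as an error from Hive rather than as the NullPointerException above, but it is cheap to rule out before looking at the scheduler side.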

    Then make sure the UDF function is configured correctly in DolphinScheduler.

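    1. Make sure the UDF jar has been uploaded to the DolphinScheduler Resource Center. The code below is offered for the upload, but the DolphinScheduler classes and the uploadResource method it uses are not confirmed to exist in 3.1.1, so treat it as illustrative only; uploading through the Resource Center UI, as described in the question, serves the same purpose: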
    import org.apache.dolphinscheduler.api.dto.resources.upload.ResourceUploadResponseDto;
    import org.apache.dolphinscheduler.api.enums.ResourceType;
    import org.apache.dolphinscheduler.api.service.ProcessDefinitionService;
    import org.apache.dolphinscheduler.api.service.impl.ProcessDefinitionServiceImpl;
    import org.apache.dolphinscheduler.api.service.impl.ResourceServiceImpl;
    import org.apache.dolphinscheduler.common.Constants;
    
    import java.io.File;
    
    public class UDFUpload {
    
        public static void main(String[] args) {
            String resourcePath = "/path/to/udf.jar";   // replace with the actual path of the jar
            String resourceName = "udf.jar";   // replace with the actual jar name
            String uploadPath = "/upload";   // replace with the actual upload path
    
            ResourceUploadResponseDto responseDto = uploadResource(resourcePath, resourceName, uploadPath);
            System.out.println(responseDto.getMsg());
        }
    
        private static ResourceUploadResponseDto uploadResource(String resourcePath, String resourceName, String uploadPath) {
            File resourceFile = new File(resourcePath);
            ResourceType resourceType = ResourceType.UDF;
    
            ProcessDefinitionService processDefinitionService = new ProcessDefinitionServiceImpl();
            ResourceServiceImpl resourceService = new ResourceServiceImpl();
    
            return processDefinitionService.uploadResource(resourceFile, resourceType, uploadPath, resourceName, resourceService);
        }
    }
    
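    2. Configure the UDF function's path, and make sure the function name and number of arguments match the actual function. The code below is offered for this, but the request classes it uses are not confirmed to exist in DolphinScheduler 3.1.1, so treat it as illustrative only; creating the function in the Resource Center UI serves the same purpose: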
    import org.apache.dolphinscheduler.api.dto.resources.udf.AddUdfFunctionRequest;
    import org.apache.dolphinscheduler.api.dto.resources.udf.UdfFuncParameter;
    import org.apache.dolphinscheduler.api.dto.resources.udf.UdfType;
    import org.apache.dolphinscheduler.api.enums.Status;
    import org.apache.dolphinscheduler.api.service.impl.ResourceServiceImpl;
    
    public class UDFConfig {
    
        public static void main(String[] args) {
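            String resourceName = "udf.jar";   // replace with the actual jar name
            String functionClassName = "com.example.MyUDF";   // replace with the UDF's fully qualified class name
            String functionName = "testudf";   // replace with the actual UDF function name
            int functionArgNum = 1;   // replace with the UDF's actual number of arguments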
    
            AddUdfFunctionRequest request = new AddUdfFunctionRequest();
    
            request.setStatus(Status.ONLINE);
            request.setResourceName(resourceName);
            request.setFunctionClassName(functionClassName);
            request.setFunctionName(functionName);
            request.setFunctionDescription("");
            request.setUdfType(UdfType.HIVE);   // the function in the question is a Hive UDF
            request.setFunctionArgNum(functionArgNum);
    
            for (int i = 0; i < functionArgNum; i++) {
                request.getUdfFuncParameters().add(new UdfFuncParameter());
            }
    
            ResourceServiceImpl resourceService = new ResourceServiceImpl();
            resourceService.addUdfFunction(request);
        }
    }
    
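    3. Make sure the SQL task calls the UDF correctly. The log you provided shows the SQL statement calling testudf("abc"); check that the function name and argument types match the actual function.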
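For the call in step 3 to resolve, Hive needs the jar added and the function registered in the same session before the query runs. The sketch below only assembles those standard Hive statements (`ADD JAR`, `CREATE TEMPORARY FUNCTION`) as strings. The class `UdfStatements` and its method are hypothetical, the jar path and class name are the placeholder values from the answer's code, and the exact statements DolphinScheduler generates internally may differ.

```java
import java.util.ArrayList;
import java.util.List;

class UdfStatements {

    // Builds the statements a Hive session needs, in order, before a
    // jar-based temporary function can be used in a query.
    static List<String> registrationStatements(String jarPath, String funcName,
                                               String className, String query) {
        List<String> sql = new ArrayList<>();
        sql.add("ADD JAR " + jarPath);
        sql.add("CREATE TEMPORARY FUNCTION " + funcName + " AS '" + className + "'");
        sql.add(query);
        return sql;
    }

    public static void main(String[] args) {
        // Placeholder values; replace with the real jar location and class name.
        List<String> sql = registrationStatements(
                "/path/to/udf.jar", "testudf", "com.example.MyUDF", "select testudf(\"abc\")");
        for (String statement : sql) {
            System.out.println(statement + ";");
        }
    }
}
```

Running the printed statements by hand in a Hive client is one way to separate the two possible failure points: if they succeed there, the UDF and jar are fine and the problem lies in the DolphinScheduler configuration.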

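    I hope these steps help. If the problem persists, please provide more details so it can be diagnosed more precisely. For further help, consult the official DolphinScheduler documentation or ask the DolphinScheduler community.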


