weixin_39796152
2020-12-31 03:29

Yolov4-tiny not showing detections

The image window opens and the picture is displayed, but no detections are drawn. I used the following command:

BASH
user-pc:~/darknet$ ./darknet detector test cfg/coco.data cfg/yolov4-tiny.cfg weights/yolov4-tiny.weights data/dog.jpg 
Device IDs: 1
Device ID: 0
Device name: Ellesmere
Device vendor: Advanced Micro Devices, Inc.
Device opencl availability: OpenCL 1.2 AMD-APP (3180.7)
Device opencl used: 3180.7
Device double precision: YES
Device max group size: 256
Device address bits: 64
layer     filters    size              input                output
    0 conv     32  3 x 3 / 2   416 x 416 x   3   ->   208 x 208 x  32  0.075 BFLOPs
    1 conv     64  3 x 3 / 2   208 x 208 x  32   ->   104 x 104 x  64  0.399 BFLOPs
    2 conv     64  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x  64  0.797 BFLOPs
    3 route  2
Unused field: 'groups = 2'
Unused field: 'group_id = 1'
    4 conv     32  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x  32  0.399 BFLOPs
    5 conv     32  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  32  0.199 BFLOPs
    6 route  5 4
    7 conv     64  1 x 1 / 1   104 x 104 x  64   ->   104 x 104 x  64  0.089 BFLOPs
    8 route  2 7
    9 max          2 x 2 / 2   104 x 104 x 128   ->    52 x  52 x 128
   10 conv    128  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 128  0.797 BFLOPs
   11 route  10
Unused field: 'groups = 2'
Unused field: 'group_id = 1'
   12 conv     64  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x  64  0.399 BFLOPs
   13 conv     64  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x  64  0.199 BFLOPs
   14 route  13 12
   15 conv    128  1 x 1 / 1    52 x  52 x 128   ->    52 x  52 x 128  0.089 BFLOPs
   16 route  10 15
   17 max          2 x 2 / 2    52 x  52 x 256   ->    26 x  26 x 256
   18 conv    256  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 256  0.797 BFLOPs
   19 route  18
Unused field: 'groups = 2'
Unused field: 'group_id = 1'
   20 conv    128  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 128  0.399 BFLOPs
   21 conv    128  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 128  0.199 BFLOPs
   22 route  21 20
   23 conv    256  1 x 1 / 1    26 x  26 x 256   ->    26 x  26 x 256  0.089 BFLOPs
   24 route  18 23
   25 max          2 x 2 / 2    26 x  26 x 512   ->    13 x  13 x 512
   26 conv    512  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x 512  0.797 BFLOPs
   27 conv    256  1 x 1 / 1    13 x  13 x 512   ->    13 x  13 x 256  0.044 BFLOPs
   28 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512  0.399 BFLOPs
   29 conv    255  1 x 1 / 1    13 x  13 x 512   ->    13 x  13 x 255  0.044 BFLOPs
   30 yolo4
[yolo4] params: iou loss: ciou (4), iou_norm: 0.07, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000 
   31 route  27
   32 conv    128  1 x 1 / 1    13 x  13 x 256   ->    13 x  13 x 128  0.011 BFLOPs
   33 upsample            2x    13 x  13 x 128   ->    26 x  26 x 128
   34 route  33 23
   35 conv    256  3 x 3 / 1    26 x  26 x 384   ->    26 x  26 x 256  1.196 BFLOPs
   36 conv    255  1 x 1 / 1    26 x  26 x 256   ->    26 x  26 x 255  0.088 BFLOPs
   37 yolo4
[yolo4] params: iou loss: ciou (4), iou_norm: 0.07, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000 
Loading weights from weights/yolov4-tiny.weights...Done!
data/dog.jpg: Predicted in 0.393254 seconds.
user-pc:~/darknet$

Yolo3, yolo3-tiny and yolo4 are working as expected. Is this because yolo4-tiny is not supported?

This question originates from the open-source project: sowson/darknet

15 answers

  • weixin_39796152 4 months ago

    Ok, no problem man. I will wait for any update.

  • weixin_39796152 4 months ago

    Sorry for the late reply.

    Detection still shows nothing.

    Screenshot from 2020-11-24 17-42-08

    As for training: I no longer get the segmentation fault, but now something else is wrong. Training does not work at all. I get the following output: out.pdf. The avg value is NaN, and it does not change no matter how many iterations I let it run.
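
To narrow down a NaN like this, it can help to find the first iteration at which the avg value turns NaN (immediately, or only after some steps). The contents of out.pdf are not shown here, so the following is only a minimal sketch: it assumes the training output was saved as plain text with lines shaped like `N: loss, avg avg, ...`, and the sample lines are invented for illustration. `first_nan_iteration` is a hypothetical helper name.

```python
import math
import re

# Assumed line shape: "<iteration>: <loss>, <avg> avg, ..." (an assumption, not
# taken from the actual out.pdf).
LINE = re.compile(r"^\s*(\d+):\s*(\S+),\s*(\S+)\s+avg")


def first_nan_iteration(lines):
    """Return the first iteration whose avg value parses as NaN, or None."""
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that are not per-iteration summaries
        try:
            avg = float(match.group(3))  # float() accepts "nan" and "-nan"
        except ValueError:
            continue
        if math.isnan(avg):
            return int(match.group(1))
    return None


# Invented sample lines, only to show the call.
sample = [
    "1: 512.3, 512.3 avg, 0.000000 rate, 3.1 seconds, 64 images",
    "2: -nan, -nan avg, 0.000000 rate, 3.0 seconds, 128 images",
]
print(first_nan_iteration(sample))
```

A NaN from the very first iteration usually points at the model or engine rather than at the dataset, which would fit the observation that yolov3-tiny trains on the same data.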

  • weixin_39633493 4 months ago

    Can you please try removing yolov4-tiny.conv.29 from the train command? Thanks!

  • weixin_39796152 4 months ago

    Still the same NaN: out.pdf. Here is the config file, in case it is useful: yolov4-tiny-custom.txt. I assume the dataset and everything else are in good condition, because yolov3-tiny can be trained successfully with them.

  • weixin_39633493 4 months ago

    I will look into it soon; for now I am training other models. The answer is probably in the model: I have to compare it with yolo4 and look for any additional layer or activation function that I may not have in the engine. Sorry for the inconvenience.

  • weixin_39633493 4 months ago

    I re-ported the route layer from the YOLO4 repo one more time (your output indicated unused variables there), but it is still not detecting objects. I will commit it soon. Maybe the threshold is too high?
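
For context on the `Unused field: 'groups = 2'` and `'group_id = 1'` warnings in the log: in the upstream yolov4-tiny config these options tell a route layer to split its input channels into `groups` equal parts and forward only part `group_id`. The sketch below illustrates that slicing under this assumption; `route_group` is a hypothetical helper, not code from this repository.

```python
def route_group(channels, groups, group_id):
    """Return the channel slice a grouped route layer would forward.

    channels is a list with one entry per channel (any objects).
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    start = group_id * per_group
    return channels[start:start + per_group]


# Layer 2 in the log outputs 64 channels; label them 0..63 for illustration.
layer2 = list(range(64))
half = route_group(layer2, groups=2, group_id=1)
print(len(half), half[0], half[-1])  # 32 32 63
```

If the option is ignored, the route forwards all 64 channels of layer 2, which matches the `104 x 104 x 64` input shown for layer 4 in the log, whereas a grouped route would hand on 32. That mismatch may be why the pretrained weights produce no detections, though this is only a guess from the printed shapes.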

  • weixin_39796152 4 months ago

    Lowering the threshold has no effect

  • weixin_39633493 4 months ago

    Maybe you should try to train this model on your own? Thx!

  • weixin_39796152 4 months ago

    Ok, I will try that. I will update results as soon as I have them.

  • weixin_39796152 4 months ago

    I still can't train yolo4-tiny. Before posting the issue I was able to train yolo3 and yolo3-tiny, but now I cannot train any of those either. Here is the output:

    
    user-pc:~/darknet2$ ./darknet detector train data/obj.data yolo-obj.cfg yolov3-tiny.conv.11
    Device IDs: 1
    Device ID: 0
    Device name: Ellesmere
    Device vendor: Advanced Micro Devices, Inc.
    Device opencl availability: OpenCL 1.2 AMD-APP (3180.7)
    Device opencl used: 3180.7
    Device double precision: YES
    Device max group size: 256
    Device address bits: 64
    yolo-obj
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16  0.150 BFLOPs
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32  0.399 BFLOPs
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64  0.399 BFLOPs
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128  0.399 BFLOPs
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256  0.399 BFLOPs
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512  0.399 BFLOPs
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024  1.595 BFLOPs
       13 conv    256  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 256  0.089 BFLOPs
       14 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512  0.399 BFLOPs
       15 conv     21  1 x 1 / 1    13 x  13 x 512   ->    13 x  13 x  21  0.004 BFLOPs
       16 yolo
       17 route  13   18 conv    128  1 x 1 / 1    13 x  13 x 256   ->    13 x  13 x 128  0.011 BFLOPs
       19 upsample            2x    13 x  13 x 128   ->    26 x  26 x 128
       20 route  19 8   21 conv    256  3 x 3 / 1    26 x  26 x 384   ->    26 x  26 x 256  1.196 BFLOPs
       22 conv     21  1 x 1 / 1    26 x  26 x 256   ->    26 x  26 x  21  0.007 BFLOPs
       23 yolo
    Loading weights from yolov3-tiny.conv.11...Done!
    Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
    Saving weights to backup/yolo-obj.start.conv.weights
    Resizing
    384
    Segmentation fault (core dumped)
    user-pc:~/darknet2$
    

    I really have no idea what is wrong. I used exactly the same files, and I even recreated them from scratch, but it is still not working. I have run out of ideas; training yolo3-tiny was working a few days ago.
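
The `Resizing` / `384` lines come from multi-scale training, which `random=1` in the `[yolo]` sections enables. Assuming this fork keeps the upstream pjreddie/darknet logic (an assumption, not checked against this repository), the new network size is a random multiple of 32 between 320 and 608. A small sketch of that choice, with `pick_train_dim` as a hypothetical name:

```python
import random


def pick_train_dim(rng):
    # Assumed upstream rule: (random value in 0..9, plus 10) times 32.
    return (rng.randrange(10) + 10) * 32


# Every size the rule above can produce.
possible = sorted({(k + 10) * 32 for k in range(10)})
print(possible)
print(pick_train_dim(random.Random(0)) in possible)
```

Since the crash follows the resize message directly, setting `random=0` in both `[yolo]` sections may be a cheap way to test whether the resize path is involved.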

  • weixin_39796152 4 months ago

    I followed all of AlexeyAB's training instructions, multiple times, in different ways.
    - The images were generated using yolo-mark, and they worked before, so I doubt the problem is there.
    - I downloaded the initial weights for yolo3-tiny from here
    - yolo-obj.cfg:

    
    [net]
    # Testing
    #batch=1
    #subdivisions=1
    # Training
    batch=64
    subdivisions=16
    width=416
    height=416
    channels=3
    momentum=0.9
    decay=0.0005
    angle=0
    saturation = 1.5
    exposure = 1.5
    hue=.1
    
    learning_rate=0.001
    burn_in=1000
    max_batches = 6000
    policy=steps
    steps=4800,5400
    scales=.1,.1
    
    [convolutional]
    batch_normalize=1
    filters=16
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [maxpool]
    size=2
    stride=2
    
    [convolutional]
    batch_normalize=1
    filters=32
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [maxpool]
    size=2
    stride=2
    
    [convolutional]
    batch_normalize=1
    filters=64
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [maxpool]
    size=2
    stride=2
    
    [convolutional]
    batch_normalize=1
    filters=128
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [maxpool]
    size=2
    stride=2
    
    [convolutional]
    batch_normalize=1
    filters=256
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [maxpool]
    size=2
    stride=2
    
    [convolutional]
    batch_normalize=1
    filters=512
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [maxpool]
    size=2
    stride=1
    
    [convolutional]
    batch_normalize=1
    filters=1024
    size=3
    stride=1
    pad=1
    activation=leaky
    
    ###########
    
    [convolutional]
    batch_normalize=1
    filters=256
    size=1
    stride=1
    pad=1
    activation=leaky
    
    [convolutional]
    batch_normalize=1
    filters=512
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [convolutional]
    size=1
    stride=1
    pad=1
    filters=21
    activation=linear
    
    
    
    [yolo]
    mask = 3,4,5
    anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
    classes=2
    num=6
    jitter=.3
    ignore_thresh = .7
    truth_thresh = 1
    random=1
    
    [route]
    layers = -4
    
    [convolutional]
    batch_normalize=1
    filters=128
    size=1
    stride=1
    pad=1
    activation=leaky
    
    [upsample]
    stride=2
    
    [route]
    layers = -1, 8
    
    [convolutional]
    batch_normalize=1
    filters=256
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [convolutional]
    size=1
    stride=1
    pad=1
    filters=21
    activation=linear
    
    [yolo]
    mask = 0,1,2
    anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
    classes=2
    num=6
    jitter=.3
    ignore_thresh = .7
    truth_thresh = 1
    random=1
    

    No matter what I change, the result is the same.
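
The `filters=21` in the convolutional layer before each `[yolo]` block above is consistent with the usual YOLOv3 rule of thumb, `filters = (classes + 5) * number_of_masks`. A quick check, with `yolo_filters` as a hypothetical helper name:

```python
def yolo_filters(classes, masks_per_head):
    # Each anchor predicts 4 box coordinates, 1 objectness score,
    # and one score per class.
    return (classes + 5) * masks_per_head


# classes=2 and three mask entries per [yolo] block, as in the config above.
print(yolo_filters(classes=2, masks_per_head=3))  # 21
```

The same rule gives 255 for the 80-class COCO config, matching the 255-filter layers in the yolov4-tiny log at the top of the thread.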

  • weixin_39796152 4 months ago

    Should I use a specific branch or version? Is the master branch safe to clone? Do the images used for training need to be a specific size (in pixels)? Is there a limit? Do I need a different procedure to train with this repo? Those are other questions I have.

  • weixin_39633493 4 months ago

    The code is fine, and so is the compilation. Your GPU needs a rest: turn off your PC, unplug the power cord, give it about 1-2 hours of rest, and everything will be fine again :D. I often have a similar issue after many attempts where OpenCL is initialized without being deinitialized. I checked, and on my computer all the training runs mentioned work just fine. On your end, there is garbage in VRAM that has to be cleaned up. Hope that helps.

  • weixin_39633493 4 months ago

    By the way, gdb is your friend. If you build with the -g flag or DEBUG=1, you can run your training command under gdb and see where it fails. If it fails in opencl.c, my last comment is highly likely to be relevant :).

  • weixin_39633493 4 months ago

    There was an error with freeing OpenCL resources in the Route layer. I have just fixed and committed it. Thanks!
