code generation for object detection using yolo v3 deep learning network

this example shows how to generate cuda® mex for a you only look once (yolo) v3 object detector. yolo v3 improves upon yolo v2 by adding detection at multiple scales, which helps it detect smaller objects. the loss function used for training is separated into mean squared error for bounding box regression and binary cross-entropy for object classification, which helps improve detection accuracy. the yolo v3 network in this example was trained on the coco dataset. the tiny yolo v3 network, which this example uses, has far fewer convolution layers than the full yolo v3 network. it needs less computing power and is therefore better suited to real-time object detection. for more information, see the yolo v3 documentation in computer vision toolbox.
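as a rough illustration of the two loss terms described above, the following matlab sketch computes a mean squared error over box coordinates and a binary cross-entropy over class probabilities. all values and variable names are made up for illustration, and the sketch leaves out the other parts of a real training loss, such as the objectness term and any per-scale weighting.

```matlab
% toy values only: one predicted box and one ground-truth box, [x y w h]
predbox = [0.48 0.52 0.30 0.20];
truebox = [0.50 0.50 0.32 0.18];

% mean squared error for bounding box regression
boxloss = mean((predbox - truebox).^2);

% predicted class probabilities (after a sigmoid) and one-hot targets
predcls = [0.90 0.05 0.10];
truecls = [1 0 0];

% binary cross-entropy averaged over classes; eps guards against log(0)
clsloss = -mean(truecls.*log(predcls + eps) + ...
    (1 - truecls).*log(1 - predcls + eps));

fprintf('box loss: %.5f, class loss: %.5f\n', boxloss, clsloss);
```

the two terms are kept separate because box coordinates are continuous quantities, while each class score is treated as an independent yes-or-no decision.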

third-party prerequisites

required

  • cuda enabled nvidia® gpu and compatible driver.

optional

for non-mex builds such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.

  • nvidia cuda toolkit.

  • nvidia cudnn library.

  • environment variables for the compilers and libraries. for more information, see the third-party hardware documentation.
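one quick way to see whether such environment variables are visible to matlab is to query them with getenv. the sketch below is only an assumption about which names matter: CUDA_PATH and NVIDIA_CUDNN are used here as plausible examples, and the names your platform actually needs should be confirmed in the gpu coder setup documentation.

```matlab
% assumed variable names; confirm them for your platform before relying on this
varnames = {'CUDA_PATH','NVIDIA_CUDNN'};

for k = 1:numel(varnames)
    value = getenv(varnames{k});   % returns an empty char array if unset
    if isempty(value)
        fprintf('%s is not set\n', varnames{k});
    else
        fprintf('%s = %s\n', varnames{k}, value);
    end
end
```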

verify gpu environment

to verify that the compilers and libraries for running this example are set up correctly, use the coder.checkGpuInstall function.

envcfg = coder.gpuEnvConfig('host');
envcfg.DeepLibTarget = 'cudnn';
envcfg.DeepCodegen = 1;
envcfg.Quiet = 1;
coder.checkGpuInstall(envcfg);
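because the quiet setting suppresses the printed summary, it can help to capture the result and inspect it in code. the sketch below assumes that coder.checkGpuInstall returns a struct of pass/fail flags and that the fields gpu, cuda, cudnn, and deepcodegen exist. these field names are assumptions that may vary by release, so the code guards each access with isfield.

```matlab
% configure the same checks as above and capture the result struct
envcfg = coder.gpuEnvConfig('host');
envcfg.DeepLibTarget = 'cudnn';
envcfg.DeepCodegen = 1;
envcfg.Quiet = 1;
results = coder.checkGpuInstall(envcfg);

% assumed field names for the checks this example depends on
required = {'gpu','cuda','cudnn','deepcodegen'};
for k = 1:numel(required)
    f = required{k};
    if isfield(results,f) && ~isempty(results.(f)) && ~results.(f)
        warning('gpu environment check "%s" did not pass.', f);
    end
end
```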

pretrained yolo v3 network

this example uses a pretrained yolo v3 object detection network trained on the coco dataset. the object detector can detect 80 different objects, including person, bicycle, car, and so on. to use the yolo v3 network, download and install the computer vision toolbox model for yolo v3 object detection support package from add-on explorer. for more information about installing add-ons, see the matlab documentation on add-ons.

specify a name for the network and save the yolov3ObjectDetector object to a mat-file.

name = "tiny-yolov3-coco";
vehicledetector = yolov3ObjectDetector(name);
matfile = 'tinyyolov3coco.mat';
save(matfile,'vehicledetector');
net = vehicledetector.Network;
inputlayersize = net.Layers(1).InputSize;
disp(vehicledetector.ClassNames(1:5))
     person 
     bicycle 
     car 
     motorbike 
     aeroplane 

the yolov3detect entry-point function

the yolov3detect entry-point function takes an image input and runs the detector on the image. the function loads the network object from the tinyyolov3coco.mat file into a persistent variable, yolov3obj, and reuses that object during subsequent detection calls.

type('yolov3detect.m')
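the listing that type prints is not reproduced here. a minimal sketch of what such an entry-point function might look like is shown below. it assumes the detector is loaded with coder.loadDeepLearningNetwork and that detections are drawn with insertObjectAnnotation. the function name yolov3detect_sketch and the 0.5 threshold are illustrative assumptions, not the contents of the shipped file.

```matlab
function outimg = yolov3detect_sketch(in,matfile)
% hypothetical entry-point sketch; not the shipped yolov3detect.m

% load the detector once and reuse it on later calls
persistent yolov3obj;
if isempty(yolov3obj)
    yolov3obj = coder.loadDeepLearningNetwork(matfile);
end

% run the detector; the 0.5 threshold is an assumed value
[bboxes,~,labels] = detect(yolov3obj,in,'Threshold',0.5);

% convert categorical labels to a cell array of character vectors
labels = cellstr(labels);

% draw the bounding boxes and labels on the input image
outimg = insertObjectAnnotation(in,'rectangle',bboxes,labels);
end
```

keeping the detector in a persistent variable means the mat-file is read only on the first call, which matters when the function runs once per video frame.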

generate cuda mex

to generate cuda code for the entry-point function, create a gpu code configuration object for a mex target and set the target language to c++. use the coder.DeepLearningConfig function to create a cudnn deep learning configuration object and assign it to the DeepLearningConfig property of the gpu code configuration object. run the codegen command specifying an input size of 416-by-416-by-3. this value corresponds to the input layer size of the yolo v3 network.

cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
inputargs = {coder.typeof(uint8(0),inputlayersize),coder.Constant(matfile)};
codegen -config cfg yolov3detect -args inputargs -report
code generation successful: view report
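the -args specification fixes the class and size of the image that the generated mex accepts. the sketch below inspects a type object built the same way and uses it as a guard before calling the mex. the hard-coded size [416 416 3] and the variable names imgtype and im are assumptions for illustration. in the example, the size comes from the network input layer.

```matlab
% assumed size; the example reads it from the network input layer
inputlayersize = [416 416 3];
imgtype = coder.typeof(uint8(0),inputlayersize);

disp(imgtype.ClassName)      % class the mex expects
disp(imgtype.SizeVector)     % size the mex expects
disp(imgtype.VariableDims)   % all false means every dimension is fixed

% hypothetical guard to run on an image before passing it to the mex
im = zeros(inputlayersize,'uint8');
assert(isa(im,imgtype.ClassName) && isequal(size(im),imgtype.SizeVector), ...
    'image does not match the type used for code generation');
```

a mex generated for fixed-size uint8 input rejects images of a different size or class, so a check like this can make a mismatch easier to diagnose.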

test the generated mex on an image

load an input image. call yolov3detect_mex on the input image and display the detection results.

im = imread('highway.png');
im = preprocess(vehicledetector,im);
outputimage = yolov3detect_mex(im,matfile);
imshow(outputimage);
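to sanity-check the mex output, it can be useful to run the same detector directly in matlab and look at the raw detections. the sketch below is one way to do that, assuming the support package is installed and that highway.png is on the path. the 0.5 threshold and the variable name detector are illustrative choices.

```matlab
% load the pretrained detector and the test image
detector = yolov3ObjectDetector("tiny-yolov3-coco");
im = imread('highway.png');

% run detection in matlab; the threshold value is an assumption
[bboxes,scores,labels] = detect(detector,im,'Threshold',0.5);

% list the detections, or report that none were found
if isempty(bboxes)
    disp('no detections above the threshold');
else
    disp(table(labels,scores,bboxes));
    annotated = insertObjectAnnotation(im,'rectangle',bboxes,cellstr(labels));
    figure; imshow(annotated);
end
```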

test the generated mex on a video

set up the video file reader and read the input video. create a video player to display the video and the output detections.

videofile = 'highway_lanechange.mp4';
videofreader = vision.VideoFileReader(videofile,'VideoOutputDataType','uint8');
depvideoplayer = vision.DeployableVideoPlayer('Size','Custom','CustomSize',[640 480]);

read the video input frame-by-frame and detect the vehicles in the video using the detector.

cont = ~isDone(videofreader);
while cont
    i = step(videofreader);
    in = imresize(i,inputlayersize(1:2));
    out = yolov3detect_mex(in,matfile);
    step(depvideoplayer, out);
    % exit the loop if the video player figure window is closed
    cont = ~isDone(videofreader) && isOpen(depvideoplayer);
end
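a loop like the one above is also a natural place to measure throughput. the sketch below times a per-frame call with tic and toc and reports an average frame rate. so that it runs without generated code, it uses an identity function handle named detectfcn as a stand-in for the mex call. that handle, the synthetic frame, and the frame count are all hypothetical.

```matlab
% stand-in for yolov3detect_mex so the sketch runs without generated code
detectfcn = @(frame) frame;

numframes = 20;                       % assumed number of frames to time
frame = zeros(416,416,3,'uint8');     % synthetic frame of the network input size
elapsed = zeros(numframes,1);

for k = 1:numframes
    t = tic;
    out = detectfcn(frame);           % replace with the mex call
    elapsed(k) = toc(t);
end

% skip the first call, which usually includes one-time setup cost
fps = 1/mean(elapsed(2:end));
fprintf('average rate: %.1f frames per second\n', fps);
```

when the real loop finishes, calling release on the video file reader and the video player frees the file and the display window.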

references

1. redmon, joseph, and ali farhadi. “yolov3: an incremental improvement.” arxiv, april 8, 2018. http://arxiv.org/abs/1804.02767.

2. lin, tsung-yi, michael maire, serge belongie, james hays, pietro perona, deva ramanan, piotr dollár, and c. lawrence zitnick. “microsoft coco: common objects in context.” in computer vision – eccv 2014, edited by david fleet, tomas pajdla, bernt schiele, and tinne tuytelaars, 8693:740–55. cham: springer international publishing, 2014. https://doi.org/10.1007/978-3-319-10602-1_48.
