
Darknet Python IMAGE object #289

Closed
FranciscoGomez90 opened this issue Nov 7, 2017 · 56 comments


@FranciscoGomez90

Hi,

I'm trying to forward the feed of a webcam to YOLO in Python.
My problem is that I capture images with OpenCV, but I don't know how to transform them into an IMAGE object in order to call .network_detect().
Could you please shed some light on this matter?

Thank you!

@TheMikeyR

TheMikeyR commented Nov 7, 2017

Not sure if this will work, but try it out, @FranciscoGomez90:

def nparray_to_image(img):
    data = img.ctypes.data_as(POINTER(c_ubyte))
    image = ndarray_image(data, img.ctypes.shape, img.ctypes.strides)
    return image

source

@FranciscoGomez90
Author

Thank you for your answer!
I can't test the code, as the function ndarray_image is not defined. Is it inside the darknet lib?
Thank you again!

@TheMikeyR

I forgot some lines. After the line load_image = lib.load_image_color in your darknet.py file, add these lines and it might work (I haven't tested it, though):

ndarray_image = lib.ndarray_to_image
ndarray_image.argtypes = [POINTER(c_ubyte), POINTER(c_long), POINTER(c_long)]
ndarray_image.restype = IMAGE

@FranciscoGomez90
Author

I'm sorry, I can't make it work.
undefined symbol: ndarray_to_image
It seems the function is not in the library :S
Thank you!

@TheMikeyR

Can you post the Python script you are using for the webcam? I'll see if I can make it work and report back :)

@FranciscoGomez90
Author

FranciscoGomez90 commented Nov 7, 2017

Here is an extract of the code:

def runLive():
    handDetector = HandDetectorYolo() #Yolo-darknet encapsulation
    video_capture = cv2.VideoCapture(1)
    while True:
        ret, frame = video_capture.read()
        im = cv2.resize(frame,(640,480))
        # somehow transform im from opencv python to an IMAGE object
        # network_detect is self.network_detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]
        handDetector.network_detect(self.net, im, self.thresh, self.hier_thresh, self.nms, boxes, probs)

So I need to convert an OpenCV image to an IMAGE object.

@FranciscoGomez90
Author

Sorry, I closed it by mistake.

@TheMikeyR

TheMikeyR commented Nov 7, 2017

@FranciscoGomez90 I've managed to compile with webcam support; this is what needs to be changed.
In src/image.c, at line 558, add this:

#ifdef NUMPY
image ndarray_to_image(unsigned char* src, long* shape, long* strides)
{
    int h = shape[0];
    int w = shape[1];
    int c = shape[2];
    int step_h = strides[0];
    int step_w = strides[1];
    int step_c = strides[2];
    image im = make_image(w, h, c);
    int i, j, k;
    int index1, index2 = 0;

    for(i = 0; i < h; ++i){
            for(k= 0; k < c; ++k){
                for(j = 0; j < w; ++j){

                    index1 = k*w*h + i*w + j;
                    index2 = step_h*i + step_w*j + step_c*k;
                    //fprintf(stderr, "w=%d h=%d c=%d step_w=%d step_h=%d step_c=%d \n", w, h, c, step_w, step_h, step_c);
                    //fprintf(stderr, "im.data[%d]=%u data[%d]=%f \n", index1, src[index2], index2, src[index2]/255.);
                    im.data[index1] = src[index2]/255.;
                }
            }
        }

    rgbgr_image(im);

    return im;
}
#endif
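
What this routine does can also be sketched in NumPy, as a way to check the index math (illustrative only, not part of darknet):

```python
# NumPy sketch of what ndarray_to_image amounts to: take an HWC uint8 BGR
# frame (OpenCV's layout), scale to [0, 1], repack it in darknet's CHW
# float layout, then swap B and R as rgbgr_image does.
import numpy as np

def ndarray_to_image_np(frame):
    im = frame.astype(np.float32) / 255.0  # uint8 -> float in [0, 1]
    im = np.transpose(im, (2, 0, 1))       # HWC -> CHW (index1 = k*w*h + i*w + j)
    return im[::-1].copy()                 # BGR -> RGB channel swap
```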

Then in src/image.h, add this after line 19:

#ifdef NUMPY
image ndarray_to_image(unsigned char* src, long* shape, long* strides);
#endif

In your Makefile, add this after line 47:

ifeq ($(NUMPY), 1) 
COMMON+= -DNUMPY -I/usr/include/python2.7/ -I/usr/lib/python2.7/dist-packages/numpy/core/include/numpy/
CFLAGS+= -DNUMPY
endif

And lastly, also in the Makefile, add a NUMPY flag at the top like this:

GPU=1
CUDNN=1
OPENCV=1
OPENMP=0
NUMPY=1
DEBUG=0

Then recompile the library by running make in the darknet root.

I've managed to detect using tiny-yolo.cfg and the weights from the website: my webcam takes an image, sends it to darknet, and I get the output back:

[('person', 0.850121259689331, (55.81684494018555, 350.7244873046875, 83.28942108154297, 125.45098876953125)), ('person', 0.7424463033676147, (423.0081481933594, 325.21337890625, 454.625, 314.12713623046875)), ('cup', 0.6773239374160767, (167.36326599121094, 377.0586853027344, 26.889423370361328, 39.0584602355957)), ('person', 0.6160455346107483, (267.4810485839844, 332.66485595703125, 62.76264190673828, 101.07070922851562)), ('tvmonitor', 0.5468893647193909, (266.1971435546875, 301.3278503417969, 40.8376350402832, 48.03203582763672))]

This is the script I've used, named webcam.py, placed in the root of the darknet folder:

from ctypes import *
import math
import random
import cv2


def sample(probs):
    s = sum(probs)
    probs = [a / s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs) - 1


def c_array(ctype, values):
    return (ctype * len(values))(*values)


class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]


class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]


class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]


lib = CDLL("/home/mikeyr/git/darknet/libdarknet.so", RTLD_GLOBAL)
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

make_boxes = lib.make_boxes
make_boxes.argtypes = [c_void_p]
make_boxes.restype = POINTER(BOX)

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

num_boxes = lib.num_boxes
num_boxes.argtypes = [c_void_p]
num_boxes.restype = c_int

make_probs = lib.make_probs
make_probs.argtypes = [c_void_p]
make_probs.restype = POINTER(POINTER(c_float))

detect = lib.network_predict
detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

ndarray_image = lib.ndarray_to_image
ndarray_image.argtypes = [POINTER(c_ubyte), POINTER(c_long), POINTER(c_long)]
ndarray_image.restype = IMAGE

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)

network_detect = lib.network_detect
network_detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]


def classify(net, meta, im):
    out = predict_image(net, im)
    res = []
    for i in range(meta.classes):
        res.append((meta.names[i], out[i]))
    res = sorted(res, key=lambda x: -x[1])
    return res


def detect(net, meta, image, thresh=.5, hier_thresh=.5, nms=.45):
    im = load_image(image, 0, 0)
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res


def detect_im(net, meta, im, thresh=.5, hier_thresh=.5, nms=.45):
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res


def detect_np(net, meta, np_img, thresh=.5, hier_thresh=.5, nms=.45):
    im = nparray_to_image(np_img)
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res


def nparray_to_image(img):
    data = img.ctypes.data_as(POINTER(c_ubyte))
    image = ndarray_image(data, img.ctypes.shape, img.ctypes.strides)

    return image


if __name__ == "__main__":
    cap = cv2.VideoCapture(0)
    ret, img = cap.read()
    # im=nparray_to_image(img)

    net = load_net(b"cfg/tiny-yolo.cfg", b"tiny-yolo.weights", 0)
    meta = load_meta(b"cfg/coco.data")
    r = detect_np(net, meta, img)
    print(r)

@FranciscoGomez90
Author

Yay! That worked like a charm!!
Thank you @TheMikeyR and @reinaldomaslim !!

@sleebapaul

Hi @TheMikeyR ,

That works it seems. Thanks very much.
The incoming stream is captured and displayed. I tried it on CPU and it is obviously sluggish.
Do you have any best practices to suggest if I try it on an incoming 30 fps H.264 stream with a GPU?

Another worry is restreaming this tagged output video. My current understanding is that the output will be written to an MJPEG container, and that file is then taken for further processing. FFmpeg can be involved for this, and we can restream.

@TheMikeyR

TheMikeyR commented Feb 6, 2018

@sleebapaul
Whether you get 30 fps depends mostly on your hardware.

I haven't tested the maximum fps from a video stream, but when loading images (2208x1242) from disk I can achieve 15 fps with tiny yolo + SORT (an online tracker) on an M1200 GPU. I'm primarily using the multiprocessing package: an imageLoader() function runs in its own process and focuses only on loading images (according to my results this made it 3x faster), and I save the results to a file from another separate process. (It tears on the memory, though; if I reach a limit I will make the loop sleep until I've cleared the buffer.)

About restreaming: I've restreamed the images I save to disk with great success using Flask, modifying the code to stream images as they arrive in a folder. I had to run that Python script separately, though.

I think it would be much better to go with the FFmpeg method, but I haven't tried that yet, since I couldn't find a good example. Feel free to share if you have a great example or solution :)
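
For reference, a stdlib-only sketch of the restreaming idea (serving saved frames as an MJPEG HTTP stream). The folder name, port, and handler are all hypothetical; this stands in for the Flask version rather than reproducing it:

```python
# Serve JPEG frames from a folder as a multipart/x-mixed-replace (MJPEG)
# HTTP stream, picking up new files as they appear. Illustrative only.
import glob
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

BOUNDARY = b"frame"

def mjpeg_part(jpg_bytes):
    # One multipart chunk: boundary marker, headers, JPEG payload, CRLF.
    return (b"--" + BOUNDARY + b"\r\n"
            b"Content-Type: image/jpeg\r\n"
            b"Content-Length: " + str(len(jpg_bytes)).encode() + b"\r\n\r\n"
            + jpg_bytes + b"\r\n")

class StreamHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type",
                         "multipart/x-mixed-replace; boundary=frame")
        self.end_headers()
        served = set()
        while True:
            for path in sorted(glob.glob("output/*.jpg")):
                if path not in served:
                    served.add(path)
                    with open(path, "rb") as f:
                        self.wfile.write(mjpeg_part(f.read()))
            time.sleep(0.05)  # poll the folder for new frames

# To run: HTTPServer(("0.0.0.0", 8080), StreamHandler).serve_forever()
```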

@stucksubstitute

Hi @TheMikeyR, I'm trying to follow your instructions. Could you post the entire Python code? I need a way to both display the webcam image and get text output. Thanks!

@TheMikeyR

@stucksubstitute

from ctypes import *
import math
import random
import cv2
import time


def sample(probs):
    s = sum(probs)
    probs = [a / s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs) - 1


def c_array(ctype, values):
    return (ctype * len(values))(*values)


class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]


class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]


class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]

# Loading libdarknet.so
# I've used absolute path with great success, didn't experiment with relative path
lib = CDLL("/PATH/TO/libdarknet.so", RTLD_GLOBAL) 
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

make_boxes = lib.make_boxes
make_boxes.argtypes = [c_void_p]
make_boxes.restype = POINTER(BOX)

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

num_boxes = lib.num_boxes
num_boxes.argtypes = [c_void_p]
num_boxes.restype = c_int

make_probs = lib.make_probs
make_probs.argtypes = [c_void_p]
make_probs.restype = POINTER(POINTER(c_float))

detect = lib.network_predict
detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

ndarray_image = lib.ndarray_to_image
ndarray_image.argtypes = [POINTER(c_ubyte), POINTER(c_long), POINTER(c_long)]
ndarray_image.restype = IMAGE

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)

network_detect = lib.network_detect
network_detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]


def classify(net, meta, im):
    out = predict_image(net, im)
    res = []
    for i in range(meta.classes):
        res.append((meta.names[i], out[i]))
    res = sorted(res, key=lambda x: -x[1])
    return res


def detect(net, meta, image, thresh=.5, hier_thresh=.5, nms=.45):
    im = load_image(image, 0, 0)
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res


def detect_im(net, meta, im, thresh=.5, hier_thresh=.5, nms=.45):
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res


def detect_np(net, meta, np_img, thresh=.5, hier_thresh=.5, nms=.45):
    im = nparray_to_image(np_img)
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res


def nparray_to_image(img):

    data = img.ctypes.data_as(POINTER(c_ubyte))
    image = ndarray_image(data, img.ctypes.shape, img.ctypes.strides)

    return image


def convertBack(x, y, w, h):
    xmin = int(round(x - (w / 2)))
    xmax = int(round(x + (w / 2)))
    ymin = int(round(y - (h / 2)))
    ymax = int(round(y + (h / 2)))
    return xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # load video here
    cap = cv2.VideoCapture(0)
    ret, img = cap.read()
    # im=nparray_to_image(img)
    fps = cap.get(cv2.CAP_PROP_FPS)
    print("Frames per second using video.get(cv2.CAP_PROP_FPS) : {0}".format(fps))
    net = load_net(b"cfg/tiny-yolo.cfg", b"tiny-yolo.weights", 0)
    meta = load_meta(b"cfg/coco.data")
    cv2.namedWindow("img", cv2.WINDOW_NORMAL)
    while(1):
        ret, img = cap.read()
        r = detect_np(net, meta, img)
        # print(r)
        for i in r:
            x, y, w, h = i[2][0], i[2][1], i[2][2], i[2][3]
            xmin, ymin, xmax, ymax = convertBack(float(x), float(y), float(w), float(h))
            pt1 = (xmin, ymin)
            pt2 = (xmax, ymax)
            cv2.rectangle(img, pt1, pt2, (0, 255, 0), 2)
            cv2.putText(img, i[0].decode() + " [" + str(round(i[1] * 100, 2)) + "]", (pt1[0], pt1[1] + 20), cv2.FONT_HERSHEY_SIMPLEX, 1, [0, 255, 0], 4)
        cv2.imshow("img", img)
        k = cv2.waitKey(1)
        if k == 27:
            cv2.destroyAllWindows()
            exit()

@iraadit

iraadit commented Mar 7, 2018

Hi @TheMikeyR,

Would you mind sharing your implementation of sort with YOLO, or point me to good resources?

Thank you

@TheMikeyR

TheMikeyR commented Mar 7, 2018

@iraadit there aren't many steps to add to the pipeline with the code above: just import sort.py into your code and follow the steps from https://github.com/abewley/sort/#using-sort-in-your-own-project

Furthermore, you need to do some data juggling to go from YOLO output to SORT input.
YOLO outputs [class, probability, [x, y, w, h]] and SORT expects [x_min, y_min, x_max, y_max, score] according to https://github.com/abewley/sort/blob/master/sort.py#L188; in this case probability_yolo = score_sort.

I hope this provides enough info to help you add it to your own implementation.
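
That data juggling can be sketched as a small converter (an illustration under the format assumptions above, not code from either project):

```python
# Convert YOLO's (class, probability, (x_center, y_center, w, h)) tuples
# into the (N, 5) array of [x_min, y_min, x_max, y_max, score] rows that
# sort.py's update() expects; probability_yolo becomes score_sort.
import numpy as np

def yolo_to_sort(detections):
    rows = [[x - w / 2, y - h / 2, x + w / 2, y + h / 2, prob]
            for _cls, prob, (x, y, w, h) in detections]
    return np.array(rows) if rows else np.empty((0, 5))
```

The class label is dropped here; keep it alongside if you need to re-attach it to SORT's track IDs afterwards.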

@iraadit

iraadit commented Mar 7, 2018

@TheMikeyR, thank you for your answer.
I will try to do the same, but with a C++ implementation of SORT: https://github.com/mcximing/sort-cpp
Or this one, which seems to be an improvement on SORT: https://github.com/samuelmurray/tracking-by-detection

@sleebapaul

sleebapaul commented Apr 24, 2018

Hi @TheMikeyR, there are issues with YOLOv3 I guess; I followed all the instructions but am still getting errors:

Traceback (most recent call last):
  File "darknet_test.py", line 122, in <module>
    make_boxes = lib.make_boxes
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: dlsym(0x118d08c40, make_boxes): symbol not found

@TheMikeyR

TheMikeyR commented Apr 24, 2018

@sleebapaul the Python wrapper isn't updated for YOLOv3, so the "hack" above doesn't work. It still works for YOLOv2, though.

@animebing

@TheMikeyR I'm trying to detect from a numpy array without adding another function. Before detect_np, for each frame I get from cv2, I do pre-processing like below:

    # from (h, w, 3) to (3, h, w) and convert to float
    frame_in = np.transpose(frame, (2, 0, 1)) / 255.0
    # from BGR to RGB
    frame_in = frame_in[[2, 1, 0], :, :].astype(np.float32)
    detect_np(net, meta, frame_in)

Then in detect_np, I call make_image and assign the pointer as below:

    im = make_image(frame_in.shape[2], frame_in.shape[1], 3)
    im.data = frame_in.ctypes.data_as(POINTER(c_float))

It gets the right detections, but after detect_np there is a Segmentation fault (core dumped) and the program exits. I am new to ctypes; can you give me some suggestions about the segmentation fault? Thanks.
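
A likely culprit is memory ownership: if free_image (or darknet) frees a buffer that numpy owns, or Python garbage-collects frame_in while darknet still holds a pointer to it, you get exactly this crash. A sketch of the safer pattern (keep the array referenced alongside the struct and never call free_image on a numpy-backed IMAGE; the helper name is hypothetical):

```python
# Wrap a contiguous float32 (c, h, w) numpy buffer as a darknet-style
# IMAGE struct without copying. The numpy array must outlive the IMAGE
# (hence returning both), and free_image must NOT be called on it,
# since numpy owns the memory.
import numpy as np
from ctypes import Structure, POINTER, c_int, c_float

class IMAGE(Structure):
    _fields_ = [("w", c_int), ("h", c_int), ("c", c_int),
                ("data", POINTER(c_float))]

def wrap_as_image(frame_in):
    arr = np.ascontiguousarray(frame_in, dtype=np.float32)
    c, h, w = arr.shape
    im = IMAGE(w, h, c, arr.ctypes.data_as(POINTER(c_float)))
    return im, arr  # keep arr referenced for as long as im is in use
```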

dmaugis added a commit to dmaugis/darknet that referenced this issue Nov 24, 2018
@julianweisbord

julianweisbord commented Nov 29, 2018

After following TheMikeyR's darknet compilation steps, I too was getting the make_boxes error with YOLOv3. This worked for me:

from ctypes import *
import math
import random
import cv2
import time
import numpy as np


def sample(probs):
    s = sum(probs)
    probs = [a/s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs)-1

def c_array(ctype, values):
    arr = (ctype*len(values))()
    arr[:] = values
    return arr

class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]

class DETECTION(Structure):
    _fields_ = [("bbox", BOX),
                ("classes", c_int),
                ("prob", POINTER(c_float)),
                ("mask", POINTER(c_float)),
                ("objectness", c_float),
                ("sort_class", c_int)]


class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]

class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]



#lib = CDLL("/home/pjreddie/documents/darknet/libdarknet.so", RTLD_GLOBAL)
lib = CDLL("./libdarknet.so", RTLD_GLOBAL)
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

set_gpu = lib.cuda_set_device
set_gpu.argtypes = [c_int]

make_image = lib.make_image
make_image.argtypes = [c_int, c_int, c_int]
make_image.restype = IMAGE

get_network_boxes = lib.get_network_boxes
get_network_boxes.argtypes = [c_void_p, c_int, c_int, c_float, c_float, POINTER(c_int), c_int, POINTER(c_int)]
get_network_boxes.restype = POINTER(DETECTION)

make_network_boxes = lib.make_network_boxes
make_network_boxes.argtypes = [c_void_p]
make_network_boxes.restype = POINTER(DETECTION)

free_detections = lib.free_detections
free_detections.argtypes = [POINTER(DETECTION), c_int]

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

network_predict = lib.network_predict
network_predict.argtypes = [c_void_p, POINTER(c_float)]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

do_nms_obj = lib.do_nms_obj
do_nms_obj.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

do_nms_sort = lib.do_nms_sort
do_nms_sort.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

rgbgr_image = lib.rgbgr_image
rgbgr_image.argtypes = [IMAGE]

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)


def convertBack(x, y, w, h):
    xmin = int(round(x - (w / 2)))
    xmax = int(round(x + (w / 2)))
    ymin = int(round(y - (h / 2)))
    ymax = int(round(y + (h / 2)))
    return xmin, ymin, xmax, ymax

def array_to_image(arr):
    # need to return old values to avoid python freeing memory
    arr = arr.transpose(2,0,1)
    c, h, w = arr.shape[0:3]
    arr = np.ascontiguousarray(arr.flat, dtype=np.float32) / 255.0
    data = arr.ctypes.data_as(POINTER(c_float))
    im = IMAGE(w,h,c,data)
    return im, arr

def detect(net, meta, image, thresh=.5, hier_thresh=.5, nms=.45):
    im, image = array_to_image(image)
    rgbgr_image(im)
    num = c_int(0)
    pnum = pointer(num)
    predict_image(net, im)
    dets = get_network_boxes(net, im.w, im.h, thresh,
                             hier_thresh, None, 0, pnum)
    num = pnum[0]
    if nms: do_nms_obj(dets, num, meta.classes, nms)

    res = []
    for j in range(num):
        a = dets[j].prob[0:meta.classes]
        if any(a):
            ai = np.array(a).nonzero()[0]
            for i in ai:
                b = dets[j].bbox
                res.append((meta.names[i], dets[j].prob[i],
                           (b.x, b.y, b.w, b.h)))

    res = sorted(res, key=lambda x: -x[1])
    if isinstance(image, bytes): free_image(im)
    free_detections(dets, num)
    return res


if __name__ == "__main__":
    # load video here
    cap = cv2.VideoCapture("your_camera")
    ret, img = cap.read()
    fps = cap.get(cv2.CAP_PROP_FPS)
    print("Frames per second using video.get(cv2.CAP_PROP_FPS) : {0}".format(fps))
    net = load_net(b"cfg/your_config.cfg", b"your_weights.weights", 0)
    meta = load_meta(b"cfg/your_data.data")
    cv2.namedWindow("img", cv2.WINDOW_NORMAL)
    while(1):

        ret, img = cap.read()
        if ret:
            # r = detect_np(net, meta, img)
            r = detect(net, meta, img)

            for i in r:
                x, y, w, h = i[2][0], i[2][1], i[2][2], i[2][3]
                xmin, ymin, xmax, ymax = convertBack(float(x), float(y), float(w), float(h))
                pt1 = (xmin, ymin)
                pt2 = (xmax, ymax)
                cv2.rectangle(img, pt1, pt2, (0, 255, 0), 2)
                cv2.putText(img, i[0].decode() + " [" + str(round(i[1] * 100, 2)) + "]", (pt1[0], pt1[1] + 20), cv2.FONT_HERSHEY_SIMPLEX, 1, [0, 255, 0], 4)
            cv2.imshow("img", img)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

@Flock1

Flock1 commented Jan 17, 2019

@TheMikeyR, I followed every step you recommended to get rid of the error, but I'm still getting the same:
undefined symbol: ndarray_to_image

I'm trying to run it on a TX2 board. Can you help me?

@PallawiSinghal

> (TheMikeyR's earlier build instructions and webcam.py script, quoted in full)
    image = ndarray_image(data, img.ctypes.shape, img.ctypes.strides)

    return image


if __name__ == "__main__":
    cap = cv2.VideoCapture(0)
    ret, img = cap.read()
    # im=nparray_to_image(img)

    net = load_net(b"cfg/tiny-yolo.cfg", b"tiny-yolo.weights", 0)
    meta = load_meta(b"cfg/coco.data")
    r = detect_np(net, meta, img)
    print(r)
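For anyone adapting this: `detect_np` returns a list of `(name, prob, (x, y, w, h))` tuples already sorted by probability. The post-processing is plain Python, so it can be sketched and sanity-checked without darknet at all (the sample values below are made up for illustration):

```python
# Post-processing sketch for detect_np()-style results: each entry is
# (class_name, probability, (x_center, y_center, width, height)).
# The values below are made up for illustration.
results = [
    (b"dog", 0.92, (100.0, 120.0, 50.0, 80.0)),
    (b"cat", 0.40, (300.0, 200.0, 40.0, 60.0)),
    (b"dog", 0.75, (150.0, 130.0, 55.0, 85.0)),
]

def filter_and_sort(res, thresh=0.5):
    """Keep detections above thresh, highest probability first."""
    kept = [d for d in res if d[1] > thresh]
    return sorted(kept, key=lambda d: -d[1])

top = filter_and_sort(results)
print([name for name, prob, box in top])  # → [b'dog', b'dog']
```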

Thank you for your help. I want to test it on a single-channel image. How do I do that?

@FranciscoGomez90
Copy link
Author

@PallawiSinghal You will need to modify the network. As it stands, the input volume of the architecture is (batchSize x height x width x 3), where the trailing 3 is the RGB channels of an image. You will need to change that so it takes (batchSize x height x width x 1), where the 1 means a greyscale image.

@PallawiSinghal
Copy link

@PallawiSinghal You will need to modify the network. As it stands, the input volume of the architecture is (batchSize x height x width x 3), where the trailing 3 is the RGB channels of an image. You will need to change that so it takes (batchSize x height x width x 1), where the 1 means a greyscale image.

Thank you, But where do I make changes in the code. Can you please help.

@PallawiSinghal
Copy link

Hi, everyone,
Is there a solution for this:
AttributeError: /home/pret/darknet/libdarknet.so: undefined symbol: make_boxes
I tried recompiling and running the script above, but I still get the same error.
Thanks

I am getting the same error, can you help me if u have solved the problem.

@FranciscoGomez90
Copy link
Author

FranciscoGomez90 commented Jan 30, 2019

@PallawiSinghal You should change the architecture itself so I think you must hack the .cfg files. For instance, the yolov3-voc.cfg has this input layer:

[net]
# Testing
 batch=1
 subdivisions=1
# Training
# batch=64
# subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

You must change the parameter channels to 1

ashirgao added a commit to ashirgao/darknet that referenced this issue Mar 8, 2019
ashirgao added a commit to ashirgao/darknet that referenced this issue Mar 8, 2019
ashirgao added a commit to ashirgao/darknet that referenced this issue Mar 8, 2019
@MarviB16
Copy link

MarviB16 commented Mar 9, 2019

@TheMikeyR Does this also work for the current darknet? When I try to adapt things in the current darknet, the lines are not valid.

And when I put it in the spots where I think it belongs, I get the error:

NameError: global name 'nparray_to_image' is not defined

@TheMikeyR
Copy link

@MarviB16 I don't believe so, but you can try this out #289 (comment)

@enesozi
Copy link

enesozi commented Mar 18, 2019

@TheMikeyR @julianweisbord I tried the code you shared for Yolov3 but I'm observing that bounding boxes aren't aligned correctly as given below. I looked into the coordinates of boxes and I confirm that they are wrong. Do you have any idea what would be the cause of this ?
TIA,
img

@enesozi
Copy link

enesozi commented Mar 18, 2019

@TheMikeyR @julianweisbord I tried the code you shared for Yolov3 but I'm observing that bounding boxes aren't aligned correctly as given below. I looked into the coordinates of boxes and I confirm that they are wrong. Do you have any idea what would be the cause of this ?
TIA,
img

Ok, I solved the issue. For anyone who's been struggling with this problem, the following lines of code may help you:

 custom_image = cv2.cvtColor(custom_image_bgr, cv2.COLOR_BGR2RGB)
 custom_image = cv2.resize(custom_image, (lib.network_width(
            self.net), lib.network_height(self.net)), interpolation=cv2.INTER_LINEAR)
 im, _ = self.array_to_image(custom_image)

 num = c_int(0)
 pnum = pointer(num)
 predict_image(self.net, im)
 dets = get_network_boxes(self.net, custom_image_bgr.shape[1], custom_image_bgr.shape[
            0], self.thresh, hier_thresh, None, 0, pnum, 0)
 num = pnum[0]
 if nms:
    do_nms_obj(dets, num, self.meta.classes, nms)

a

Edit:
If you use NEAREST NEIGHBOUR interpolation instead of LINEAR, you would get better results.

 custom_image = cv2.resize(custom_image, (lib.network_width(
            self.net), lib.network_height(self.net)), interpolation=cv2.INTER_NEAREST)
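The misalignment above comes down to mixing coordinate spaces: boxes produced in network-input coordinates have to be rescaled before being drawn on the original frame. When darknet is not asked to do this for you (as it is in the snippet above, by passing the original shape to get_network_boxes), the mapping for a plain, non-letterbox resize is a simple per-axis scale; a minimal sketch (the function name and sizes are illustrative, not from the thread's code):

```python
def scale_box(box, net_size, img_size):
    """Map an (x, y, w, h) box from network-input coordinates back to
    original-image coordinates, for a plain (non-letterbox) resize."""
    x, y, w, h = box
    net_w, net_h = net_size
    img_w, img_h = img_size
    sx, sy = img_w / net_w, img_h / net_h
    return (x * sx, y * sy, w * sx, h * sy)

# 416x416 network input, 1280x720 original frame (illustrative numbers)
print(scale_box((208, 208, 104, 104), (416, 416), (1280, 720)))
```

Letterboxed inputs additionally need the padding offset subtracted before scaling, which is why letterbox and plain-resize pipelines must not be mixed.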

@saddiesh
Copy link

@TheMikeyR I tried the modifications in darknet and got this error:

File "/usr/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: ....../libdarknet.so: undefined symbol: ndarray_to_image

Since I'm using Python 3.6, I changed the Makefile as:
ifeq ($(NUMPY), 1)
COMMON+= -DNUMPY -I/usr/include/python3.6/ -I/usr/lib/python3.6/dist-packages/numpy/core/include/numpy/
CFLAGS+= -DNUMPY
endif

I don't know what's the problem? Can you help me with that?

@ganeshkharad2
Copy link

To get the above changes for YOLO v3, go to my link:

https://github.com/ganeshkharad2/darknet-yolo_v3_numpy

I have uploaded the necessary files there; just replace the 3 original files with them.

@strivehub
Copy link

@TheMikeyR I tried the modifications in darknet and got this error:

File "/usr/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: ....../libdarknet.so: undefined symbol: ndarray_to_image

since I'm using python3.6, I change the makefile as:
ifeq ($(NUMPY), 1)
COMMON+= -DNUMPY -I/usr/include/python3.6/ -I/usr/lib/python3.6/dist-packages/numpy/core/include/numpy/
CFLAGS+= -DNUMPY
endif

I don't know what's the problem? Can you help me with that?

I have the same problem. How did you solve it?

@rafferino
Copy link

rafferino commented Jul 16, 2019

@julianweisbord I tried your code and my kernel simply crashes when it attempts to predict on the webcam image. There is no error message; the code simply crashes. I'm assuming it's due to some problem in the C code rather than the Python code. It could also be crashing because some variable isn't set up correctly. Could you (or anyone else) point me in some direction to fix this problem?

@zrion
Copy link

zrion commented Jul 31, 2019

@saddiesh Did you solve the problem?

@saddiesh
Copy link

saddiesh commented Aug 3, 2019

@saddiesh Did you solve the problem?

Yes. I updated some libraries and then it worked. I couldn't say which library updating helped to fix this problem and I can't remember clearly now. If you still have a problem with that I can check the libraries next week.

@zrion
Copy link

zrion commented Aug 3, 2019

@saddiesh Did you solve the problem?

Yes. I updated some libraries and then it worked. I couldn't say which library updating helped to fix this problem and I can't remember clearly now. If you still have a problem with that I can check the libraries next week.

That's fine, I solved it myself. I'm not sure why, but my problem was that the flag NUMPY=1 was not recognized in image.c, so that code was not compiled. Hence, I had to set it manually in image.c.

@skeeseem
Copy link

Where is the libdarknet.so file?

@zrion
Copy link

zrion commented May 18, 2020

@TheMikeyR Could you please explain why you need to swap the channels with rgbgr_image()? As far as I understand darknet takes rgb images for training and test without the swapping. Thank you!

@bamboosdu
Copy link

bamboosdu commented Sep 5, 2020

This code processes a video and records the output video, with the FPS overlaid, using OpenCV.
Screenshot from 2020-09-04 18-53-08

"darknet.py"
from ctypes import *
import math
import random
import numpy as np
import cv2
import time


def sample(probs):
    s = sum(probs)
    probs = [a / s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs) - 1


def c_array(ctype, values):
    arr = (ctype * len(values))()
    arr[:] = values
    return arr


class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]


class DETECTION(Structure):
    _fields_ = [("bbox", BOX),
                ("classes", c_int),
                ("prob", POINTER(c_float)),
                ("mask", POINTER(c_float)),
                ("objectness", c_float),
                ("sort_class", c_int)]


class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]


class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]


lib = CDLL("/home/**/git-space/darknet/libdarknet.so", RTLD_GLOBAL)
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

set_gpu = lib.cuda_set_device
set_gpu.argtypes = [c_int]

make_image = lib.make_image
make_image.argtypes = [c_int, c_int, c_int]
make_image.restype = IMAGE

get_network_boxes = lib.get_network_boxes
get_network_boxes.argtypes = [c_void_p, c_int, c_int, c_float, c_float, POINTER(c_int), c_int, POINTER(c_int)]
get_network_boxes.restype = POINTER(DETECTION)

make_network_boxes = lib.make_network_boxes
make_network_boxes.argtypes = [c_void_p]
make_network_boxes.restype = POINTER(DETECTION)

free_detections = lib.free_detections
free_detections.argtypes = [POINTER(DETECTION), c_int]

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

network_predict = lib.network_predict
network_predict.argtypes = [c_void_p, POINTER(c_float)]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

do_nms_obj = lib.do_nms_obj
do_nms_obj.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

do_nms_sort = lib.do_nms_sort
do_nms_sort.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

rgbgr_image = lib.rgbgr_image
rgbgr_image.argtypes = [IMAGE]

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)


def classify(net, meta, im):
    out = predict_image(net, im)
    res = []
    for i in range(meta.classes):
        res.append((meta.names[i], out[i]))
    res = sorted(res, key=lambda x: -x[1])
    return res


def array_to_image(arr):
    # need to return the array as well to stop Python freeing the memory
    arr = arr.transpose(2, 0, 1)
    c, h, w = arr.shape[0:3]
    arr = np.ascontiguousarray(arr.flat, dtype=np.float32) / 255.0
    data = arr.ctypes.data_as(POINTER(c_float))
    im = IMAGE(w, h, c, data)
    return im, arr


def convertBack(x, y, w, h):
    xmin = x - 0.5 * w
    ymin = y - 0.5 * h
    xmax = x + 0.5 * w
    ymax = y + 0.5 * h
    return int(xmin), int(ymin), int(xmax), int(ymax)


def cvDrawBoxes(detections, img):
    for detection in detections:
        print(detection)
        if detection[1] > 0.5:
            x, y, w, h = detection[2][0], \
                         detection[2][1], \
                         detection[2][2], \
                         detection[2][3]
            xmin, ymin, xmax, ymax = convertBack(
                float(x), float(y), float(w), float(h))
            pt1 = (xmin, ymin)
            pt2 = (xmax, ymax)
            cv2.rectangle(img, pt1, pt2, (13, 23, 227), 4)
            cv2.putText(img,
                        detection[0].decode() +
                        " [" + str(round(detection[1] * 100, 2)) + "]",
                        (pt1[0], pt1[1] - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                        [0, 255, 0], 2)
    return img


def detect(net, meta, image, thresh=.3, hier_thresh=.5, nms=.45):
    is_path = isinstance(image, bytes)  # test before `image` is reassigned below
    if is_path:
        im = load_image(image, 0, 0)
    else:
        im, arr = array_to_image(image)  # keep `arr` alive while `im` is in use
        rgbgr_image(im)

    num = c_int(0)
    pnum = pointer(num)
    predict_image(net, im)
    dets = get_network_boxes(net, im.w, im.h, thresh,
                             hier_thresh, None, 0, pnum)

    num = pnum[0]
    if nms:
        do_nms_obj(dets, num, meta.classes, nms)

    res = []
    for j in range(num):
        a = dets[j].prob[0:meta.classes]
        if any(a):
            ai = np.array(a).nonzero()[0]
            for i in ai:
                b = dets[j].bbox
                res.append((meta.names[i], dets[j].prob[i],
                            (b.x, b.y, b.w, b.h)))

    res = sorted(res, key=lambda x: -x[1])
    image = np.array(image)
    img = cvDrawBoxes(res, image)

    if is_path:
        free_image(im)
    free_detections(dets, num)
    return res, img


if __name__ == "__main__":
    net = load_net(b"myData/my_yolov3.cfg", b"my_yolov3_900.weights", 0)
    meta = load_meta(b"myData/my_data.data")

    capture = cv2.VideoCapture("data/insulator_video/s4.mp4")
    fps = 0.0
    w = 640
    h = 480
    video_writer = cv2.VideoWriter('/home/**/git-space/darknet/data/s4.avi',
                                   cv2.VideoWriter_fourcc(*'XVID'), 22, (w, h))
    flag = capture.isOpened()
    while flag:
        t1 = time.time()
        success, frame = capture.read()
        if not success:  # was `if not flag`, which never changes inside the loop
            break
        res, img = detect(net, meta, frame)
        fps = (fps + (1. / (time.time() - t1))) / 2
        print("fps= %.2f" % fps)
        fframe = cv2.putText(img, "fps= %.2f" % fps, (0, 40),
                             cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        video_writer.write(fframe)
        cv2.imshow("video", fframe)
        c = cv2.waitKey(1) & 0xff
        if c == 27:
            capture.release()
            break
    capture.release()
    video_writer.release()
    cv2.destroyAllWindows()
    print('Finished!')
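The convertBack step in the script above is plain center-to-corner arithmetic, so it can be checked in isolation, without OpenCV or darknet:

```python
def convert_back(x, y, w, h):
    """Convert a center-format box (x, y, w, h) to corner format
    (xmin, ymin, xmax, ymax), as expected by cv2.rectangle."""
    xmin = int(x - 0.5 * w)
    ymin = int(y - 0.5 * h)
    xmax = int(x + 0.5 * w)
    ymax = int(y + 0.5 * h)
    return xmin, ymin, xmax, ymax

print(convert_back(100.0, 80.0, 40.0, 20.0))  # → (80, 70, 120, 90)
```

Note that in image coordinates y grows downward, so (xmin, ymin) is the top-left corner and (xmax, ymax) the bottom-right.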

@jgvaraujo
Copy link

@TheMikeyR! Thank you so much!
You made my day.

I just made a change to your Makefile commands that may help some people:

NPPATH = $(shell python3 -c "import os; import numpy; print(os.path.dirname(numpy.__file__))")
PYINCL = $(shell python3 -c "from sysconfig import get_paths; print(get_paths()['include'])")
COMMON+= -DNUMPY -I$(PYINCL)/ -I$(NPPATH)/core/include/numpy/
CFLAGS+= -DNUMPY

If you're using another version of Python or a specific interpreter of Python, just change the python3 in the NPPATH and PYINCL variables.
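Those two shell-outs just query the interpreter, so you can verify what the Makefile will see by running the same lookups directly (NPPATH additionally requires that numpy is importable by that interpreter):

```python
import os
import sysconfig

# Same value as the PYINCL shell-out in the Makefile snippet above.
print(sysconfig.get_paths()["include"])

# Same value as NPPATH; guarded so the check degrades gracefully.
try:
    import numpy
    print(os.path.dirname(numpy.__file__))
except ImportError:
    print("numpy is not installed for this interpreter")
```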

@muhammetsaitcelik
Copy link

Hi, I have a question: this is my error, and I did what was said above but nothing changed. How can I fix this?

pygame 2.0.1 (SDL 2.0.14, Python 3.6.9)
Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
File "corona.py", line 125, in <module>
ndarray_image = lib.ndarray_to_image
File "/usr/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
func = self.__getitem__(name)
File "/usr/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /home/sait/Genel/Darknet/darknet/libdarknet.so: undefined symbol: ndarray_to_image

@ljbkusters
Copy link

ljbkusters commented Aug 21, 2021

Hello everyone,

I recently ran into this same problem and saw TheMikeyR's solution. It helped me find a solution that is implemented 100% in Python, so there's no need to recompile darknet (unless the C implementation is a little faster and you need the speed). My solution uses numpy, so it's pretty fast as well, and surprisingly simple!

TL;DR: All you have to do is the following (it's explained inline):

import cv2
import ctypes
import darknet as dn
import numpy as np

def np_image_to_c_IMAGE(input_frame):
    """
    Converts a numpy image (h x w x c ndarray, as returned by OpenCV) to a
    C type IMAGE as defined in darknet.
    """
    # The image data is stored in IMAGE.data as a C float pointer, therefore...

    # create a flattened image, normalized by division by 255.; it needs to be
    # float32 so it can be handed over as a float pointer
    flattened_image = input_frame.transpose(2, 0, 1).flatten().astype(np.float32) / 255.
    # NOTE ON THE ABOVE LINE:
    # transpose(2, 0, 1) permutes the axes from 0,1,2 to 2,0,1 (clockwise cycle),
    # which is required because on the C side the data is laid out channel-first

    # define the LP_c_float type (float *)
    c_float_p = ctypes.POINTER(ctypes.c_float)

    # cast flattened_image to LP_c_float
    c_float_p_frame = flattened_image.ctypes.data_as(c_float_p)

    # create an empty C IMAGE object; note that cv2 ndarrays are (h, w, c)
    # while make_image takes (w, h, c)
    h, w, c = input_frame.shape
    C_IMAGE_frame = dn.make_image(w, h, c)

    # point its data at the buffer holding the image data
    C_IMAGE_frame.data = c_float_p_frame

    return C_IMAGE_frame

# initialize net
net, class_names, class_colors = dn.load_network("path/to/config", "path/to/meta", "path/to/weights")

cap = cv2.VideoCapture(0)
ret, img = cap.read()
C_IMAGE = np_image_to_c_IMAGE(img)
r = dn.detect(net, class_names, C_IMAGE)
# do something with r. ...

Note that the darknet.py module I use is AlexeyAB's version

The function without all the comments:

def np_image_to_c_IMAGE(input_frame):
    """Converts a numpy image (h x w x c ndarray) to a C type IMAGE as defined in darknet."""
    flattened_image = input_frame.transpose(2, 0, 1).flatten().astype(np.float32) / 255.
    c_float_p = ctypes.POINTER(ctypes.c_float)
    c_float_p_frame = flattened_image.ctypes.data_as(c_float_p)
    h, w, c = input_frame.shape
    C_IMAGE_frame = dn.make_image(w, h, c)
    C_IMAGE_frame.data = c_float_p_frame
    return C_IMAGE_frame

Some more info for the interested reader:
The transpose and flatten step essentially does the same as this in TheMikeyR's code:

for(i = 0; i < h; ++i){
    for(k= 0; k < c; ++k){
        for(j = 0; j < w; ++j){
            index1 = k*w*h + i*w + j;
            index2 = step_h*i + step_w*j + step_c*k;
            //fprintf(stderr, "w=%d h=%d c=%d step_w=%d step_h=%d step_c=%d \n", w, h, c, step_w, step_h, step_c);
            //fprintf(stderr, "im.data[%d]=%u data[%d]=%f \n", index1, src[index2], index2, src[index2]/255.);
            im.data[index1] = src[index2]/255.;
        }
    }
}

I initially implemented the loop above exactly in Python. It worked, but it was rather slow. Testing my code a little, I found that the for loop simply converts the (h x w x c) ndarray into a one-dimensional array, with the data strung together column-wise (the columns as they are visually represented by print(np_image)). For this reason, all you have to do is use np.transpose() and np.flatten().

Since it is not entirely trivial how transpose works on a 3-dimensional array (and I didn't know how the function was implemented either), a short explanation: you are permuting the axes from 0, 1, 2 (h, w, c) to 2, 0, 1 (c, h, w). Flattening the result with np.flatten() then gives you exactly the data you need, and all that is left is to cast this numpy array to a float pointer using ctypes. Using numpy instead of a Python loop sped it up significantly.
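That equivalence is easy to verify numerically: the C loop writes index1 = k*w*h + i*w + j, which is exactly a (h, w, c) → (c, h, w) transpose followed by a row-major flatten. A quick self-contained check (assuming NumPy is installed):

```python
import numpy as np

h, w, c = 4, 5, 3
src = np.arange(h * w * c, dtype=np.float32).reshape(h, w, c)

# NumPy route: permute axes (h, w, c) -> (c, h, w), then flatten row-major.
fast = src.transpose(2, 0, 1).flatten() / 255.0

# Literal translation of the C loop: im.data[k*w*h + i*w + j] = src[i][j][k] / 255.
slow = np.empty(h * w * c, dtype=np.float32)
for i in range(h):
    for k in range(c):
        for j in range(w):
            slow[k * w * h + i * w + j] = src[i, j, k] / 255.0

print(np.allclose(fast, slow))  # → True
```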

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests