I got a desktop computer to train deep learning models last week. The GPU is a GTX 1050 Ti with 4 GB of memory, which is enough for basic training on object detection. But the CPU is too old: when I run the training process, the CPU idle time is 0%. I need to reduce its burden.
I tried DALI from Nvidia. While I admit it is powerful, I also noticed that DALI is too specific to use with custom datasets. For example, if I want label structures more complicated than plain bounding-box coordinates, I can't find any code example in DALI that meets this requirement. Besides, the GPU memory in my computer is not big enough, so if I moved the computation burden from the CPU to the GPU, I would have to reduce the training batch size. That's not a good option either.
Yesterday, I saw this suggestion in this post:

You can use jpeg4py, a library dedicated to decoding big JPEG files much faster than PIL. Just read the image using this library, then transform it to PIL.

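For reference, here is a minimal sketch of that suggestion. It assumes libjpeg-turbo is installed on the system (jpeg4py is a thin wrapper around it), and the file name sample.jpg is just a placeholder:

```python
import jpeg4py
from PIL import Image

def load_image(path):
    # jpeg4py decodes via libjpeg-turbo and returns an
    # RGB uint8 numpy array of shape (H, W, 3)
    rgb = jpeg4py.JPEG(path).decode()
    # Wrapping it in a PIL Image keeps torchvision-style
    # transforms working without further changes
    return Image.fromarray(rgb)

img = load_image("sample.jpg")  # "sample.jpg" is a placeholder path
print(img.size, img.mode)
```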
After changing my code from cv.imread() to jpeg4py.JPEG().decode(), the average training time for 1000 batches of my model improved from 700 seconds to 670 seconds. This is exactly the simple, useful solution I need.
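For completeness, this is roughly what the swap looks like inside a PyTorch-style Dataset; the paths list and the transform are placeholders, not my exact training code. One thing to watch: cv2.imread returns BGR while jpeg4py decodes to RGB, so any manual channel reordering should be dropped along with the change.

```python
import jpeg4py
from torch.utils.data import Dataset
# import cv2  # no longer needed after the swap

class JpegDataset(Dataset):
    """Loads JPEGs with libjpeg-turbo instead of OpenCV's plain libjpeg."""

    def __init__(self, paths, transform=None):
        self.paths = paths          # list of JPEG file paths (placeholder)
        self.transform = transform  # any callable taking an RGB ndarray

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Before: img = cv2.imread(self.paths[idx])   # BGR, plain libjpeg
        img = jpeg4py.JPEG(self.paths[idx]).decode()  # RGB, libjpeg-turbo
        if self.transform is not None:
            img = self.transform(img)
        return img
```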