The development and study of computer vision systems (CVS) for mobile robotic complexes is an important research direction. Today, CVS developers most often rely on convolutional neural networks (CNNs). To speed up object detection in images, there is a growing trend toward implementing CNNs in hardware on field-programmable gate arrays (FPGAs). This article shows that the tiny-YOLO CNN, from the YOLO family, is a promising candidate for hardware implementation on an FPGA. To reduce the FPGA computing resources this CNN requires, the use of Inception-ResNet modules is proposed. We found that the tiny-YOLO-Inception-ResNet2 network architecture provides high object detection accuracy with minimal resource requirements. It is obtained by replacing the fifth convolutional layer of the tiny-YOLO CNN with two Inception-ResNet modules applied in sequence. We also present results on the object detection accuracy of this architecture when the resource-intensive operations of batch normalization and bias addition are excluded from the calculations. These studies were performed for different number representation formats in the FPGA.
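As a rough illustration of why the number representation format matters on an FPGA, the following minimal sketch rounds a few values to a signed fixed-point grid and reports the worst rounding error. It is not the authors' implementation: the function name `quantize_fixed_point`, the sample weights, and the two formats (8 bits with 5 fractional bits, 16 bits with 12 fractional bits) are hypothetical choices made only for this example, since the formats studied are not listed here.

```python
def quantize_fixed_point(value, total_bits, frac_bits):
    """Round a real value to a signed fixed-point grid and saturate.

    Hypothetical helper for illustration: total_bits includes the sign bit,
    frac_bits is the number of bits after the binary point.
    """
    scale = 1 << frac_bits                    # grid step is 1 / scale
    q_min = -(1 << (total_bits - 1))          # most negative integer code
    q_max = (1 << (total_bits - 1)) - 1       # most positive integer code
    code = round(value * scale)               # nearest code (ties go to even)
    code = max(q_min, min(q_max, code))       # saturate instead of wrapping
    return code / scale


if __name__ == "__main__":
    # Made-up sample weights, not values taken from the network.
    weights = [0.7312, -0.0184, 1.9, -2.5]
    for total_bits, frac_bits in [(8, 5), (16, 12)]:
        quantized = [quantize_fixed_point(w, total_bits, frac_bits) for w in weights]
        worst = max(abs(w - q) for w, q in zip(weights, quantized))
        print(f"{total_bits}-bit, {frac_bits} fractional bits: worst error {worst:.6f}")
```

A narrower format needs fewer FPGA resources per multiplier but has a coarser grid and a smaller representable range, which is the kind of trade-off against detection accuracy that a study across formats would measure.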