Licence: Creative Commons Attribution 4.0 International (CC BY-NC-SA 4.0)
Copyright: Jeremy Fix, CentraleSupelec
Last revision: Ma10:13
Link to source: 01-pytorch-object-detection.md
Lectures project page:

Objectives

In the previous practical, you trained feedforward neural networks for classifying images, i.e. assigning a single label to each image, hopefully reaching a good accuracy on the test set. We now consider a second problem in computer vision: object detection. The task here is to find every occurrence of a set of classes of objects. Given an image, we want to output a set of bounding boxes for every object class of interest. Below is an example of what we want to do:

[Figure: Object detection: bounding box regression and classification]

In this practical, we will work with the Pascal VOC 2012 dataset. Pascal VOC used to be a popular contest on the topic of object recognition in computer vision. The dataset consists of 11,530 images, annotated with 27,450 bounding boxes, each belonging to one of 20 classes. Segmentations, which we are not going to use for now, are also provided.

We will progress step by step, starting by regressing and classifying the largest object's bounding box and then moving on to detecting multiple objects (an interesting pedagogical approach I borrow from J.). Also, we will follow a particular track to perform object detection, but a lot of variations are actually possible. I invite you to read (Huang et al., 2016; Hui, 2018), which present some variations.

One of the interests of this practical also lies in the way we will compute the features from which to detect objects. We will use pretrained models, and more specifically, models like resnet, densenet, etc. From these models, we will cut off the classification head and keep the model up to the last convolutional feature maps. The head of the model will be defined depending on the problem of interest.

You are provided with some basic code that allows you to explore the dataset. In the next sections, you will progressively extend this code. To explore the dataset, you need to download data.py, which provides some useful functions to create your Pascal VOC datasets.

The python code below shows you how to load your dataset. The function data.make_trainval_dataset is the one loading your data and takes a few parameters. In this practical, we will play around with three important parameters, image_transform_params, target_transform_params and transform:

image_transform_params: this defines how the images are resized; it is a dictionary that can take one of several forms. We will use this parameter when defining our first model in Largest object detection.
transform: the operations to be applied on the images before feeding them into the models.

Extracting the features with a pretrained model

Now we are ready to feed the data into a model! Great! How will we proceed? We will be using a pretrained model, i.e. a model trained on ImageNet for classification; we will cut off the head and replace it with the appropriate head for classifying and regressing the largest object.

The interest of using a pretrained model lies in the fact that all the ImageNet data has been used to train that model and, hopefully, the extracted features are sufficiently good for this other task of object detection, or at least can provide a reasonably good starting point.

In a first approach, we will not further train the pretrained layers, so it makes sense to propagate the datasets through the pretrained layers once, save the output features, and then separately proceed with training the classification and regression heads. There is no need to forward propagate through the pretrained layers during training if the pretrained layers are kept fixed! Do not worry, we will come back to our task soon; for now, we focus on extracting and saving the features of a pretrained model.

Torchvision provides several pretrained models in torchvision.models.