Last month, we released the Edge library to help data scientists and computer vision engineers run training experiments faster and more simply. In this article, we help you get started with RetinaNet, a state-of-the-art object detection algorithm, on your custom data and arrive at an accurate & deployable model. We recommend first reading about how RetinaNet works, how it offers higher accuracy than other single-stage networks such as YOLOv3, and how it compares to two-stage networks such as Faster R-CNN.

1. Preparing data

To make things simpler and faster, we have put together small datasets so you can quickly get started with testing and understanding the input data format. For detection, we have an aerial vehicle detection dataset that contains drone imagery with vehicle annotations (car, truck, bus, minibus, cyclist). Keep the annotations in Pascal VOC (XML) format, as this is currently the only format we support.
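
Before training, it is worth sanity-checking that every image has a matching annotation file. The snippet below is a minimal sketch (not part of Edge) that assumes flat folders of .jpg images and Pascal VOC .xml files sharing base names; adjust the paths and extensions to your dataset.

# Sanity check: every .jpg should have a matching Pascal VOC .xml file
# (assumes flat folders and matching base names; adjust to your layout)
from pathlib import Path

image_dir = Path("/path/to/jpgs/folder")
annotation_dir = Path("/path/to/xml/folder")

image_stems = {p.stem for p in image_dir.glob("*.jpg")}
annotation_stems = {p.stem for p in annotation_dir.glob("*.xml")}

print(f"{len(image_stems)} images, {len(annotation_stems)} annotations")
print("Images without annotations:", sorted(image_stems - annotation_stems))
print("Annotations without images:", sorted(annotation_stems - image_stems))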

2. Setting Up Segmind Edge

Dependencies

  • tensorflow-gpu==1.15
  • cython
  • numpy
  • opencv-python==3.4.2.17
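
If you prefer not to use the Docker image described below, these dependencies can be installed manually, for example:

sudo pip3 install tensorflow-gpu==1.15 cython numpy opencv-python==3.4.2.17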

Docker

A simpler and faster approach is to use the Docker image that has all the dependencies pre-installed and ready to use. Use the following commands to start the Docker container.

sudo docker pull segmind/edge:tf1.15-py36
sudo docker run --gpus all -it segmind/edge:tf1.15-py36
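
If your dataset and workspace live on the host machine, you can mount them into the container with Docker's -v flag (the host paths below are placeholders):

sudo docker run --gpus all -it -v /path/to/data:/data -v /path/to/project:/project segmind/edge:tf1.15-py36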

a. Download & Install Edge

# Download Edge
wget https://segmind-data.s3.ap-south-1.amazonaws.com/edge/dist/edge-0.3.0/edge-0.3.0-cp36-cp36m-linux_x86_64.whl

# Install Edge
sudo pip3 install edge-0.3.0-cp36-cp36m-linux_x86_64.whl

b. Configure

mkdir ~/.segmind
nano ~/.segmind/secret.file

In secret.file, add your credentials. You can get your email/password by signing up here.

[secret]
email=john.doe@gmail.com
password=the_password
File format to store your credentials in secret.file

3. Start training

# -----------------
#  Setup project
# -----------------

from edge import set_project_name, set_workspace

set_project_name("first_retinanet_project")

# workspace path to store all the project files.
set_workspace("/path/to/project/") 

# Fetch and view data analytics
from edge.detection import get_analytics

analytics_result = get_analytics(
    image_dir="/path/to/jpgs/folder",
    annotation_dir="/path/to/xml/folder",
)

print(analytics_result)

# -----------------
#  Start training
# -----------------

from edge.detection.retinanet import train

train(
    resize_height=600,
    resize_width=600,
    num_epochs=100,
    batch_size=2,
    checkpoint_prefix="try1",
    snapshot_every_epoch=10,
    steps_per_epoch=None,
    sizes=[16, 32, 64, 128, 256],
    strides=[8, 16, 32, 64, 128],
    ratios=[0.5, 1, 2.0],
    scales=[1, 1.25, 1.5],
    initial_epoch=0,
    weights='imagenet',
    aug=None,
    backbone_network='resnet50',
    lr=1e-5,
    print_summary=True)
Training Script
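
Leaving steps_per_epoch as None lets the library decide how many batches make up an epoch. If you want to set it explicitly, a common convention (a general Keras-style rule of thumb, not something specific to Edge) is one full pass over the dataset per epoch:

# Rule of thumb for steps_per_epoch: one pass over the dataset per epoch
# (general convention, not an Edge requirement)
import math

num_images = 1000          # replace with the number of images in your dataset
batch_size = 2
steps_per_epoch = math.ceil(num_images / batch_size)   # 500 in this example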

For more information on parameters, you can check the documentation here.

4. Inference

from edge.detection.retinanet import get_inference_model, image_predict

# Checkpoint file stored at set_workspace path
checkpoint = "path_to_checkpoint"

# Image for inference
image_path = "path_to_image_for_inference"

# Load the model
model = get_inference_model(checkpoint)

# Use the loaded instance to get predictions
detections = image_predict(image_path, model)

from edge.detection import draw_bboxes

# Draw boxes to view output
image = draw_bboxes(image_path,
  bboxes=detections['boxes'],
  labels=detections['labels'],
  scores=detections['scores'],
  threshold=0.5,
  label_dict=None)

image.show()
Inference after 80 Epochs
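
Besides drawing the boxes, you can work with the raw predictions directly. The sketch below assumes detections['boxes'], detections['labels'] and detections['scores'] are parallel sequences of equal length (as the draw_bboxes call above suggests) and prints the detections above a confidence threshold:

# Print detections above a confidence threshold
# (assumes boxes, labels and scores are parallel sequences of equal length)
threshold = 0.5
for box, label, score in zip(detections['boxes'],
                             detections['labels'],
                             detections['scores']):
    if score >= threshold:
        print(f"label={label} score={score:.2f} box={box}")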

5. Improving accuracy

The network can often be made more accurate by tuning the hyperparameters. For RetinaNet, this mainly means adjusting the anchor sizes, strides, ratios and scales. You can also try one of the three backbone options (ResNet-50, ResNet-101 or ResNet-152) depending on your deployment constraints.
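
For example, a follow-up run might switch to a deeper backbone and shrink the anchor sizes. The call below simply reuses the parameters from the training script above with a few illustrative changes; it is a sketch of one possible experiment, not a recommended configuration, and the 'resnet101' identifier is assumed to follow the same naming pattern as 'resnet50'.

# A second experiment: deeper backbone and smaller anchor sizes (illustrative values)
from edge.detection.retinanet import train

train(
    resize_height=600,
    resize_width=600,
    num_epochs=100,
    batch_size=2,
    checkpoint_prefix="try2",
    snapshot_every_epoch=10,
    steps_per_epoch=None,
    sizes=[8, 16, 32, 64, 128],
    strides=[8, 16, 32, 64, 128],
    ratios=[0.5, 1, 2.0],
    scales=[1, 1.25, 1.5],
    initial_epoch=0,
    weights='imagenet',
    aug=None,
    backbone_network='resnet101',  # assumed identifier for the ResNet-101 backbone
    lr=1e-5,
    print_summary=True)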

If you have any suggestions or feedback, please comment on the post or feel free to write to us at contact@segmind.com.