Last month, we released the Edge library to help data scientists and computer vision engineers run training experiments faster and with less effort. In this article, we help you get started with RetinaNet, a state-of-the-art object detection algorithm, on your custom data and get to an accurate, deployable model. We recommend first reading how RetinaNet works and how it offers higher accuracy than other detectors such as the single-stage YOLOv3 and the two-stage Faster R-CNN.
1. Preparing data
To make things simpler and faster, we have put together small datasets so you can quickly start testing and understand the input data format. For detection, we have an aerial vehicle detection dataset that contains drone imagery with vehicle annotations (car, truck, bus, minibus, cyclist). Use annotations in Pascal VOC (XML) format, as this is the format Edge currently supports.
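To get a feel for the Pascal VOC layout before training, it can help to read one annotation by hand. The sketch below uses only the Python standard library and a made-up annotation (the file name and box coordinates are illustrative, not taken from the dataset above). It shows the standard `object` / `name` / `bndbox` structure, not how Edge loads data internally.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up Pascal VOC annotation, for illustration only.
SAMPLE_XML = """<annotation>
  <filename>drone_0001.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>180</xmax><ymax>260</ymax></bndbox>
  </object>
  <object>
    <name>truck</name>
    <bndbox><xmin>400</xmin><ymin>320</ymin><xmax>560</xmax><ymax>430</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    objects = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        box = obj.find("bndbox")
        # Some tools write coordinates as floats, so go through float() first.
        coords = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((label, *coords))
    return filename, objects


filename, objects = parse_voc(SAMPLE_XML)
print(filename)
for label, xmin, ymin, xmax, ymax in objects:
    print(label, xmin, ymin, xmax, ymax)
```

Running a check like this over your own XML files is a quick way to catch misspelled class names or boxes with swapped corners before a training run.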
2. Setting Up Segmind Edge
A simpler and faster approach is to use the Docker image, which has all the dependencies pre-installed and ready for use. Use the following commands to pull the image and start a container.
sudo docker pull segmind/edge:tf1.15-py36
sudo docker run --gpus all -it segmind/edge:tf1.15-py36
a. Download & Install Edge
# Download Edge
wget https://segmind-data.s3.ap-south-1.amazonaws.com/edge/dist/edge-0.3.0/edge-0.3.0-cp36-cp36m-linux_x86_64.whl
# Install Edge
sudo pip3 install edge-0.3.0-cp36-cp36m-linux_x86_64.whl
mkdir ~/.segmind
nano ~/.segmind/secret.file
In secret.file, enter your credentials. You can get your email/password by signing up here.
3. Start training
For more information on parameters, you can check the documentation here.
4. Running inference
from edge.detection.retinanet import get_inference_model, image_predict

# Checkpoint file stored at set_workspace path
checkpoint = "path_to_checkpoint"

# Image for inference
image_path = "path_to_image_for_inference"

# Load the model
model = get_inference_model(checkpoint)

# Use the loaded instance to get predictions
detections = image_predict(image_path, model)

from edge.detection import draw_bboxes

# Draw boxes to view output
image = draw_bboxes(image_path, bboxes=detections['boxes'], labels=detections['labels'], scores=detections['scores'], threshold=0.5, label_dict=None)
image.show()
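If you want to work with the predictions directly rather than only drawing them, a small filter over the scores is often enough. The sketch below assumes that `detections` behaves like a dictionary of parallel sequences under the keys `boxes`, `labels` and `scores`, as the snippet above suggests. The helper `filter_detections` and the sample values are hypothetical and not part of Edge, and the exact return type of `image_predict` may differ.

```python
def filter_detections(detections, threshold=0.5):
    """Keep (box, label, score) triples whose score is at least `threshold`.

    Assumes `detections` maps "boxes", "labels" and "scores" to parallel
    sequences. This is an assumption about the output format, not a
    documented Edge guarantee.
    """
    return [
        (box, label, score)
        for box, label, score in zip(
            detections["boxes"], detections["labels"], detections["scores"]
        )
        if score >= threshold
    ]


# Made-up detections, for illustration only.
example = {
    "boxes": [[100, 200, 180, 260], [400, 320, 560, 430], [10, 10, 30, 30]],
    "labels": [0, 1, 0],
    "scores": [0.91, 0.48, 0.75],
}

confident = filter_detections(example, threshold=0.5)
for box, label, score in confident:
    print(label, score, box)
```

Raising the threshold trades recall for precision: fewer false positives, but more missed vehicles.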
5. Improving accuracy
The network can be tuned for better accuracy by adjusting its hyperparameters. RetinaNet hyperparameter tuning mainly involves the anchor box settings: sizes, strides, ratios and scales. You can also try one of the three backbone options (ResNet 50, 101 or 152), depending on your deployment constraints.
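To see what these four settings control, the sketch below computes anchor shapes for each feature pyramid level and estimates the total number of anchors for an image. The values are the defaults described in the RetinaNet paper; whether Edge uses the same defaults or parameter names is an assumption, and the helper functions are hypothetical.

```python
import math

# Defaults from the RetinaNet paper (pyramid levels P3 to P7).
# Assumed here for illustration; check the Edge documentation for its values.
SIZES = [32, 64, 128, 256, 512]      # base anchor side length per level, in pixels
STRIDES = [8, 16, 32, 64, 128]       # pixel spacing between anchor centers per level
RATIOS = [0.5, 1.0, 2.0]             # height / width
SCALES = [2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3)]


def anchors_for_level(size, ratios, scales):
    """Return the (width, height) of every anchor placed at one location."""
    shapes = []
    for ratio in ratios:
        for scale in scales:
            area = (size * scale) ** 2
            width = math.sqrt(area / ratio)
            height = width * ratio
            shapes.append((width, height))
    return shapes


def count_anchors(image_width, image_height):
    """Estimate the total anchor count, assuming ceil(dim / stride) cells per level."""
    total = 0
    for stride in STRIDES:
        cells = math.ceil(image_width / stride) * math.ceil(image_height / stride)
        total += cells * len(RATIOS) * len(SCALES)
    return total


for size, stride in zip(SIZES, STRIDES):
    shapes = anchors_for_level(size, RATIOS, SCALES)
    smallest = min(w * h for w, h in shapes) ** 0.5
    largest = max(w * h for w, h in shapes) ** 0.5
    print(f"stride {stride:3d}: {len(shapes)} anchors, side {smallest:.0f} to {largest:.0f} px")

print("anchors for a 512x512 image:", count_anchors(512, 512))
```

For aerial imagery, where vehicles may cover only a few dozen pixels, comparing the printed side lengths against your typical object size is a reasonable first check: if most objects are smaller than the smallest anchor, smaller sizes or scales may be worth trying.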
If you have any suggestions or feedback, please comment on the post or feel free to write to us at firstname.lastname@example.org.