Despite rapid advances in AI, computer vision (CV) still struggles to match the precision of human perception. Training data matters here as much as algorithms: the more accurate the input annotations, the more effective the model's predictions. How do we annotate data, though? There are multiple approaches, and the right one depends on your use case. In this article, we'll take a deeper dive into bounding boxes, one of the most widely used annotation techniques. We'll walk you through the following:
- What are bounding boxes?
- Why are they important?
- Bounding boxes for object detection
- Common use cases
- Precautions and best practices
- Bounding boxes with SuperAnnotate
- Key takeaways
What are bounding boxes?
Bounding boxes are rectangles that serve as a point of reference for object detection in CV. Fully enclosing the target, they are used to assign a class to the object of interest. Functionally, bounding boxes make it considerably easier for algorithms to find what they're looking for in an image and associate the detected object with the classes they were trained on.
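In practice, a bounding box is just four numbers plus a class label. The sketch below shows two common coordinate conventions (corner-style, as used in Pascal VOC, and corner-plus-size, as used in COCO) and how to convert between them; the function names and the sample annotation are our own illustrative choices, not part of any specific tool's API.

```python
# Two common bounding box formats (illustrative sketch):
# "VOC" style:  (x_min, y_min, x_max, y_max) -- two opposite corners.
# "COCO" style: (x_min, y_min, width, height) -- top-left corner plus size.

def voc_to_coco(box):
    """Convert (x_min, y_min, x_max, y_max) to (x, y, w, h)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)

def coco_to_voc(box):
    """Convert (x, y, w, h) back to corner coordinates."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

# A labeled annotation pairs a class with its box (hypothetical example).
annotation = {"label": "car", "box_voc": (40, 60, 200, 140)}
annotation["box_coco"] = voc_to_coco(annotation["box_voc"])
```

Mixing up these two conventions is a common source of silently wrong annotations, so it's worth being explicit about which one your pipeline uses.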
Why are they important?
Bounding boxes are fundamental to image annotation: they constitute the training and testing data for any model expected to perform a CV task. Without these annotations, machines cannot detect the objects of interest.
Bounding boxes for object detection
So, what connects bounding boxes and object detection? Let's first clarify one thing: object detection consists of object classification and object localization. In other words, the model has to know both what is in the image and where it is to make accurate predictions. That said, no matter how carefully you annotate an image, bounding boxes alone do not guarantee the highest prediction rates. Some projects, say, road lanes for self-driving cars, may be better suited to other annotation techniques, which we will cover in a separate post.
Common use cases
The use cases of bounding box annotations are countless and growing steadily. Let's take a look at several of them:
Object detection for self-driving cars
Bounding box training data helps machines detect objects on the road and beyond, such as traffic lights, other cars, lanes, street signs, and pedestrians. The more extensive and varied the training data, the better machines can recognize obstacles on the streets and execute instructions based on the perceived information.
Image tagging for eCommerce and retail
Bounding box annotations ensure better product visualization in retail stores and online shops. Perception models trained on such data can recognize objects like fashion items, pieces of furniture, and skincare products when they are labeled correctly. Here are several problems bounding box annotations address in retail:
Incorrect search results: Incorrect catalog data leads to incorrect search results, a significant disadvantage given that search is often how customers discover an eCommerce store in the first place.
The continuous digitization process: All products need to be digitized and tagged in a timely manner so that customers do not miss out on new offerings. Besides, all tags have to be in context, which is hard to manage as the volume of products in stock expands.
Chaotically organized supply chains: If you intend to grow your retail business to the point where you ship millions of products annually, your offline and online data have to stay consistent with each other.
Damage detection for insurance claims
In insurance, bounding box annotations are used to train models that quickly identify recurrent mishaps and accidents. Damage to the roof, body, front and tail lights, broken windows, and other defects are all identifiable by CV. Here, bounding box annotations help machines estimate the level of damage so that insurance companies can process claims accordingly.
Object detection with robotics and drone imagery
The applications of bounding boxes extend beyond cars and retail products, additionally covering object recognition in robotics and drone imagery. Drones and other unmanned aerial vehicles can spot damaged roofs, AC units, and the migration of species when trained on accurately annotated data. With a variety of elements annotated by bounding boxes, it becomes easier for robots and drones to detect physical objects from a distance.
Disease and plant growth identification in agriculture
Early identification of plant disease greatly increases the chances of prevention. With the development of smart farming comes the challenge of collecting training data to teach models to detect plant diseases and growth rates alike. Bounding box annotation plays a major role in this process, giving machines the vision they need.
Precautions and best practices
Pixel-perfect bounding boxes are well within reach, as long as you keep the tips below in mind.
Box size variety
Keep in mind that your model will perform worse if it is only trained on objects of a single size. If the same object appears considerably smaller, the model will have a hard time detecting it. The reverse case is more forgiving: IoU is less affected when objects appear larger than expected and thus occupy more pixels. Lesson learned: vary the size and scale of annotated objects to ensure the desired outcome.
Box tightness
Tightness is a top priority. The edges of a bounding box should hug the labeled object as closely as possible. Consistent gaps create issues with IoU (Intersection over Union: the area of overlap between the model's prediction and the ground truth, divided by the area of their union). Remember, the IoU of perfectly overlapping annotations has a value of 1.00.
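The IoU definition above translates directly into a few lines of code. The sketch below computes IoU for two corner-style boxes; it is a standard formulation rather than any particular library's implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Identical boxes give an IoU of 1.0, disjoint boxes give 0.0, and a loosely drawn box shifted off the ground truth lands somewhere in between, which is exactly why sloppy, gappy annotations drag evaluation scores down.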
Diagonal items in bounding boxes
You may want to fight me on this, but the problem with diagonal items is that they occupy significantly less space within the bounding box than the background does. Take a closer look at the image below. Even though it is obvious to the human eye that the object of interest is the watch, with repeated exposure the model may assume that the background is its target simply because it takes up more space. So, diagonal objects are best labeled with polygons and instance segmentation. That said, teaching the model to identify them with a bounding box is still possible with enough training data.
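A rough back-of-the-envelope calculation makes the point concrete. If we model a diagonal object (say, a watch strap) as a thin strip running corner to corner of its axis-aligned bounding box, the strip covers only a small fraction of the box; the numbers below are an illustrative approximation, not measurements from a real image.

```python
import math

def diagonal_fill_fraction(box_side, strip_width):
    """Approximate fraction of a square box occupied by a thin diagonal
    strip running corner to corner (thin-strip approximation)."""
    strip_length = math.hypot(box_side, box_side)  # length of the diagonal
    strip_area = strip_length * strip_width        # area ~ length * width
    box_area = box_side ** 2
    return strip_area / box_area

# A strip 5 px wide across a 100 x 100 px box fills only ~7% of it,
# leaving ~93% of the box to the background.
fraction = diagonal_fill_fraction(100, 5)
```

With the background dominating the box this heavily, it is easy to see why a polygon that traces the object's actual outline gives the model a much cleaner training signal.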
Reduce box overlap
Annotation overlap should be avoided at all costs. Sometimes the nature of your image leaves you with nothing but a pile of overlapping boxes, which is especially the case when dealing with clutter. Objects whose labels overlap with other items will be detected considerably worse, because with excessive overlap the model fails to distinguish the target element and may associate a box with another entity. What if there's no way to avoid the overlap? In that case, consider using polygons instead for higher precision.
Bounding boxes with SuperAnnotate
Feeding your system faulty data is one of the shortest paths to sabotaging your CV project. The phrase "garbage in, garbage out" couldn't be more relevant here. Mislabeled data will only swallow your resources, not to mention the time needed to diagnose and fix it along the way. SuperAnnotate increases your chances of getting pixel-perfect results. SuperAnnotate's user-friendly editor and enthusiastic team of professionals are here to lend you a helping hand at any point in your CV cycle. Don't waste your time on defective data. Outsource your image and video annotation to SuperAnnotate's marketplace of vetted professionals.
Bounding boxes are the first thing that comes to mind when we think of image annotation or labeling. Images annotated with bounding boxes advance the object detection capabilities of visual perception models by marking targets across multiple industries. Those industries keep expanding, opening up room for wider applications of bounding boxes and adding to the list of precautions to take when annotating data. We hope this article gave you a solid understanding of how this annotation technique facilitates object detection. Let us know if we can be of further help.