Mislabeled data is a common problem in machine learning, and in object detection projects it leads to inaccurate models and poor performance. In our experience, roughly 70% of project time is spent identifying and fixing mislabels. That process has been manual and has demanded considerable effort from machine learning engineers, subject matter experts, and labelers.
To address this problem, we are proud to share a solution: the mislabel detection feature in our computer vision software, LandingLens. The feature automatically identifies incorrect bounding box labels and suggests fixes, saving our users time and effort.
About the Feature
Our mislabel detection feature works by scanning through users’ label data and identifying inconsistencies among labels for similar examples. This process is very time-consuming to do manually when there are a large number of labels to review, but it becomes feasible with the use of machine learning algorithms. We leverage our proprietary pre-trained machine learning algorithms to analyze the label data, learn from the data labeling patterns, and compare labels with what the algorithms would expect to see. If inconsistencies are detected, the feature flags them and suggests alternative label fixes.
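To make the idea of comparing labels with what a model would expect more concrete, here is a minimal sketch in Python. It is not the LandingLens algorithm, which is proprietary. It assumes a separate detector has already produced predicted boxes for an image, and it compares them with the user's labels using intersection over union (IoU). The names `Suggestion`, `iou`, and `suggest_fixes`, and the two thresholds, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixels. Illustrative convention only.
Box = Tuple[float, float, float, float]


@dataclass
class Suggestion:
    kind: str                       # "add" (missing label) or "resize" (loose label)
    box: Box                        # the box the sketch proposes
    replaces: Optional[int] = None  # index of the user label to replace, if any


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def suggest_fixes(
    labels: List[Box],
    predictions: List[Box],
    match_iou: float = 0.1,  # assumed: below this, a prediction matches no label
    tight_iou: float = 0.7,  # assumed: below this, a matched label is a poor fit
) -> List[Suggestion]:
    """Compare user labels with predicted boxes and flag inconsistencies."""
    suggestions: List[Suggestion] = []
    for pred in predictions:
        # Find the user label that overlaps this prediction the most.
        best_idx, best = None, 0.0
        for i, lab in enumerate(labels):
            overlap = iou(lab, pred)
            if overlap > best:
                best_idx, best = i, overlap
        if best < match_iou:
            # Nothing was labeled here: likely a missed instance.
            suggestions.append(Suggestion("add", pred))
        elif best < tight_iou:
            # A label exists but fits poorly, e.g. it is much too large.
            suggestions.append(Suggestion("resize", pred, best_idx))
    return suggestions
```

Given one oversized label `(0, 0, 100, 100)` and two predictions, `(30, 30, 70, 70)` inside it and `(200, 200, 240, 240)` far away, this sketch would propose resizing the existing label to the first prediction and adding the second as a new label. A production system would also need to handle class names, confidence scores, and labels that no prediction matches, all of which are left out here.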
The feature runs in the background and displays suggestions in a pop-up banner, as shown in the image below:
Once you enter the review page, you will see all the mislabels the feature identified, along with a suggested alternative for each. If you agree with a suggestion, click “Accept” and your labels will be updated automatically.
Below are two mislabeling examples:
In the left image, the user wants to label all instances of the `Heart` pattern, but two `Heart` shapes have been missed. The feature detects these missing shapes and suggests adding a bounding box around each of them.
In the right image, the user has drawn a very large bounding box around the `Spade` pattern. A loose box like this includes a lot of background, which is not ideal for model training. The feature therefore suggests shrinking the bounding box so that it tightly encloses the target `Spade` pattern.
In the following image, the goal is to label all `foreign materials` instances among cereals. With so much cereal in the image, it is easy to miss a few `foreign materials` instances. The feature has found the missed instances and suggests adding a label for each of them.
In addition to saving time and effort, the mislabel detection feature also improves the accuracy and performance of object detection models. Because a model learns from its labels, every mislabel that is found and fixed removes a source of error from the training data.
Our team has tested the feature extensively on real-world machine learning projects with mislabeled data. Since launching the mislabel detection feature this February, we have closely monitored its metrics to make sure it is helpful across our users’ projects. The feature has identified and fixed more than 4,600 incorrect labels, and that number continues to grow. For projects that actively use the mislabel detection feature, we have observed performance improvements of up to 71%. The more label suggestions that users review and accept, the better the performance becomes.
In conclusion, the mislabel detection feature in LandingLens is an essential tool for anyone working on object detection projects. It saves you time and effort when fixing labels and improves the accuracy and performance of your models along the way. We encourage you to give it a try and see how it can benefit your project. If you have any feedback or suggestions, please feel free to share them with us in the community! With the mislabel detection feature, you can be confident that your object detection models are trained on cleaner labels and can perform at their best.