Abstract | A major obstacle to deploying neural networks in fully autonomous applications, such as self-driving cars, is rarely occurring edge cases. These edge cases are underrepresented or entirely absent in both the training and test sets. We implemented an automatic and a semi-automatic pipeline that use diffusion models to identify and generate underrepresented edge cases without requiring any specific domain knowledge or prior information. Enriching the data set with the generated samples allows us to train a more robust classifier. With our automatic pipeline, classifier accuracy increases by up to 20.84% on the edge case data of the Oxford-IIIT Pet data set (OPD), with improvements of up to 54.16% for individual classes and a decrease in standard deviation of up to 2.07%. Even on the entire OPD, classifier accuracy improves slightly. With our semi-automatic pipeline, we achieve improvements of up to 12.87% on a subset of manually generated edge cases, with individual classes gaining up to 37%. Our automatic pipeline also achieves an improvement of up to 8.52% on the edge case data of the CIFAR-100 data set. |
---|