To further improve model performance, we applied augmentation techniques that simulate varying lighting conditions and structural variations. We used the DeepLabv3 neural network architecture, a state-of-the-art approach to semantic image segmentation. The model and training pipeline were implemented in PyTorch.
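The kind of augmentation described above can be sketched as follows. This is a minimal NumPy illustration, not the pipeline actually used. The specific transforms (contrast and brightness jitter for lighting, flips and 90-degree rotations for structural variation), the parameter ranges, and the function name `augment` are all assumptions.

```python
import numpy as np


def augment(image, mask, rng):
    """Randomly augment an image/mask pair (illustrative transforms only).

    image: float array with values in [0, 1], shape (H, W) or (H, W, C)
    mask:  binary array of shape (H, W) marking crack pixels
    rng:   a numpy.random.Generator
    """
    # Photometric change (assumed ranges): random contrast gain and
    # brightness offset to mimic different lighting. Image only.
    gain = rng.uniform(0.7, 1.3)
    bias = rng.uniform(-0.2, 0.2)
    image = np.clip(image * gain + bias, 0.0, 1.0)

    # Geometric change: random flips and a random multiple of 90 degrees,
    # applied identically to image and mask so the labels stay aligned.
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:
        image, mask = image[::-1, :], mask[::-1, :]
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))

    return np.ascontiguousarray(image), np.ascontiguousarray(mask)
```

The key design point is that geometric transforms must be applied to the image and its mask together, while photometric transforms touch only the image.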
Model performance was evaluated using the Dice coefficient and the true positive rate (TPR). The Dice coefficient indicates how well the predicted crack pixels overlap with the ground truth. The TPR measures the fraction of ground-truth crack pixels that the model detected.
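For binary masks, the Dice coefficient is twice the number of pixels that are crack in both prediction and ground truth, divided by the sum of the crack pixel counts in the two masks. The TPR is the number of true positives divided by the number of ground-truth crack pixels. The sketch below computes both with NumPy. The function names and the small smoothing term `eps` (which avoids division by zero on empty masks) are assumptions, not details from the original pipeline.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def true_positive_rate(pred, target, eps=1e-7):
    """TPR = TP / (TP + FN): share of ground-truth crack pixels detected."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()
    fn = np.logical_and(~pred, target).sum()
    return (tp + eps) / (tp + fn + eps)


# Toy example: 4 ground-truth crack pixels, 4 predicted, 3 of them correct.
target = np.array([[0, 1, 1, 0],
                   [0, 1, 1, 0]])
pred = np.array([[0, 1, 1, 1],
                 [0, 1, 0, 0]])
dice = dice_coefficient(pred, target)   # approximately 0.75
tpr = true_positive_rate(pred, target)  # approximately 0.75
```

Note that the TPR ignores false positives, so a model that over-predicts cracks can score a high TPR; the Dice coefficient penalizes both missed and spurious crack pixels, which is why the two metrics are reported together.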
The final model achieved a Dice coefficient of 88% and a TPR of 97% on the validation set. We consider this performance suitable for automated crack detection applications.
Predictions were then denoised using morphological opening. After stitching the tiles back together, we could analyze entire images with respect to the number of cracks as well as crack length and width.
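The post-processing steps can be illustrated with a small NumPy sketch. Morphological opening is an erosion followed by a dilation, which removes isolated specks smaller than the structuring element while leaving larger regions largely unchanged. The 3x3 structuring element, the border handling, the non-overlapping row-major tile layout, and the helper names are assumptions made for illustration. In practice a library routine such as `scipy.ndimage.binary_opening` or OpenCV's `cv2.morphologyEx` would typically be used.

```python
import numpy as np


def _neighbourhood_stack(mask, pad_value):
    """Stack the 9 shifted views forming each pixel's 3x3 neighbourhood."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])


def binary_opening_3x3(mask):
    """Opening = erosion then dilation with a 3x3 square (assumed size)."""
    mask = np.asarray(mask, dtype=bool)
    # Erosion: keep a pixel only if its whole neighbourhood is foreground.
    # Padding with True means the tile border itself does not erode cracks.
    eroded = _neighbourhood_stack(mask, True).all(axis=0)
    # Dilation: set a pixel if any neighbour survived the erosion.
    return _neighbourhood_stack(eroded, False).any(axis=0)


def stitch_tiles(tiles, n_rows, n_cols):
    """Reassemble equally sized, non-overlapping tiles in row-major order."""
    rows = [np.hstack(tiles[r * n_cols:(r + 1) * n_cols])
            for r in range(n_rows)]
    return np.vstack(rows)


# Toy tile: a 4x4 crack region plus one isolated noisy pixel.
tile = np.zeros((8, 8), dtype=bool)
tile[2:6, 2:6] = True
tile[0, 7] = True
cleaned = binary_opening_3x3(tile)  # noisy pixel removed, region kept

# Stitch four cleaned tiles back into one 16x16 mask.
full_mask = stitch_tiles([cleaned] * 4, n_rows=2, n_cols=2)
```

Opening also removes structures thinner than the structuring element, so the element size has to be chosen smaller than the thinnest crack that should be preserved.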