How to justify that VGG16 gives better results in medical image classification than newer, more performant models (ResNet, DenseNet, etc.)?

I have a question: how can I justify that the CNN I used (VGG16 with an attention mechanism) gives better classification results than DenseNet, ResNet, and Inception equipped with the same attention mechanism for the same task?

First, I trained three different CNN models (VGG16, DenseNet, and ResNet). Among them, DenseNet gave me the best accuracy. Later I modified these models to add an attention mechanism, and when I checked the results again, VGG16 now gives the best performance. I don’t know how to justify this, since VGG16 is an older architecture than the other networks but still performs well.


Hi @zakirshah

It may be helpful to provide more information about the network design for each model, especially how the attention mechanism is deployed in each one. Also, take a look at the papers below.
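For reference, here is a minimal sketch of one common way to attach attention to a backbone's feature maps: a squeeze-and-excitation-style channel gate. This is purely an assumption for illustration (the weight matrices are random placeholders, not a trained model, and your actual attention design may differ); the point is that the same module can wrap features from VGG16, ResNet, or DenseNet alike, so documenting exactly where and how it is inserted in each backbone matters.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel attention (illustrative sketch).

    features: (C, H, W) feature map from any backbone (VGG16, ResNet, ...).
    w1, w2: projection matrices; random here, learned in a real model.
    """
    # Squeeze: global average pool each channel to a single descriptor.
    z = features.mean(axis=(1, 2))               # shape (C,)
    # Excite: bottleneck with ReLU, then sigmoid gating in (0, 1).
    h = np.maximum(0.0, w1 @ z)                  # shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))          # shape (C,)
    # Re-weight channels; the backbone itself is untouched.
    return features * s[:, None, None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                          # toy sizes, reduction ratio r
feats = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = channel_attention(feats, w1, w2)
print(out.shape)
```

Because the gate only scales channels by values in (0, 1), the module's effect depends heavily on which layer's features it sees, which is one reason the same mechanism can rank backbones differently.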


The performance benchmarks for these models were run on very large datasets of natural images (ImageNet, for example). A model that performs well in the particular use case of natural images is not automatically the better model for every scenario. In fact, there may well be architectures that consistently perform better on medical images than the ImageNet leaders do. The architecture should be matched to the particular problem; the age of the architecture is irrelevant to its performance on your specific use case.

The answer to your overall question, however, has to do with your dataset.
