# Siamese Network

A simple but pragmatic implementation of Siamese Networks in PyTorch, using the pre-trained feature extraction networks provided in `torchvision.models`.

## Design Choices:
- The siamese network in this repository uses a sigmoid at its output, turning the problem into a binary classification task (positive = same, negative = different) trained with binary cross-entropy loss, as opposed to the triplet loss that is generally used.
- I have added dropout to the final classification head along with BatchNorm. Online discussions suggest that dropout combined with BatchNorm is ineffective; however, I found it to improve results on my specific private dataset.
- Instead of concatenating the feature vectors of the two images, I opted to multiply them element-wise, which increased the validation accuracy on my specific dataset. A sketch combining these choices is shown after this list.
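To make the above concrete, the classification head could look roughly like the minimal sketch below. This is an illustration only, not the exact code in this repository; the ResNet-18 backbone, the 512-dimensional feature size, the hidden-layer width, and the dropout probability are all assumptions.

```python
import torch.nn as nn
import torchvision.models as models


class SiameseNetwork(nn.Module):
    """Minimal sketch of a siamese network with a binary classification head.

    Assumptions (not taken from this repository): ResNet-18 backbone,
    512-dim features, one hidden layer, dropout p=0.5.
    """

    def __init__(self):
        super().__init__()
        backbone = models.resnet18(pretrained=True)
        # Keep everything up to (and including) the global average pool.
        self.feature_extractor = nn.Sequential(*list(backbone.children())[:-1])
        self.head = nn.Sequential(
            nn.Linear(512, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # similarity score in [0, 1]
        )

    def forward(self, img1, img2):
        feat1 = self.feature_extractor(img1).flatten(start_dim=1)
        feat2 = self.feature_extractor(img2).flatten(start_dim=1)
        # Element-wise multiplication of the two feature vectors
        # instead of concatenation.
        return self.head(feat1 * feat2)


# Training uses binary cross-entropy on the sigmoid output, e.g.:
#   criterion = nn.BCELoss()
#   loss = criterion(model(img1, img2), labels)  # labels: 1 = same, 0 = different
```

With this formulation, each training pair only needs a binary same/different label rather than the explicit anchor/positive/negative triplets required by a triplet loss.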

## Setting up the dataset:
The expected format for the training and validation datasets is the same. Images belonging to a single entity/class should be placed in a folder named after that class. The folders for all classes are then placed inside a common root directory (which is passed to the training and evaluation scripts). The folder structure is illustrated below:
```
|--Train or Validation dataset root directory
    |--Class1
        |-Image1
        |-Image2
        .
        .
        .
        |-ImageN
    |--Class2
    |--Class3
    .
    .
    .
    |--ClassN
```
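For reference, image pairs can be drawn from this layout along the lines of the sketch below. This is only an illustration of the expected structure; the dataset class actually used by `train.py` and `eval.py` may sample pairs differently, and the class and parameter names here are assumptions.

```python
import os
import random

from PIL import Image
from torch.utils.data import Dataset


class PairDataset(Dataset):
    """Sketch of a dataset yielding (image1, image2, same_class) pairs
    from the root/ClassN/ImageN layout described above."""

    def __init__(self, root, transform=None):
        self.transform = transform
        self.paths_by_class = {
            cls: [
                os.path.join(root, cls, name)
                for name in os.listdir(os.path.join(root, cls))
            ]
            for cls in os.listdir(root)
            if os.path.isdir(os.path.join(root, cls))
        }
        self.classes = list(self.paths_by_class)
        self.samples = [
            (path, cls) for cls, paths in self.paths_by_class.items() for path in paths
        ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path1, cls1 = self.samples[index]
        # Draw a positive (same class) or negative (different class) partner
        # with equal probability.
        same = random.random() < 0.5
        cls2 = cls1 if same else random.choice([c for c in self.classes if c != cls1])
        path2 = random.choice(self.paths_by_class[cls2])
        img1 = Image.open(path1).convert("RGB")
        img2 = Image.open(path2).convert("RGB")
        if self.transform is not None:
            img1, img2 = self.transform(img1), self.transform(img2)
        return img1, img2, float(same)
```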

## Training the model:
To train the model, run the following command with the required command-line arguments:
```shell
python train.py [-h] --train_path TRAIN_PATH --val_path VAL_PATH -o OUT_PATH
                [-b BACKBONE] [-lr LEARNING_RATE] [-e EPOCHS] [-s SAVE_AFTER]

optional arguments:
  -h, --help            show this help message and exit
  --train_path TRAIN_PATH
                        Path to directory containing training dataset.
  --val_path VAL_PATH   Path to directory containing validation dataset.
  -o OUT_PATH, --out_path OUT_PATH
                        Path for outputting model weights and tensorboard
                        summary.
  -b BACKBONE, --backbone BACKBONE
                        Network backbone from torchvision.models to be used in
                        the siamese network.
  -lr LEARNING_RATE, --learning_rate LEARNING_RATE
                        Learning Rate
  -e EPOCHS, --epochs EPOCHS
                        Number of epochs to train
  -s SAVE_AFTER, --save_after SAVE_AFTER
                        Model checkpoint is saved after each specified number
                        of epochs.
```
The backbone can be chosen from any of the networks listed in [torchvision.models](https://pytorch.org/vision/stable/models.html).
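For example, a run might look like the following (the paths and hyperparameter values are placeholders, not defaults shipped with the repository):

```shell
python train.py --train_path ./data/train --val_path ./data/val -o ./outputs \
    -b resnet18 -lr 1e-4 -e 50 -s 5
```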

## Evaluating the model:
The following command can be used to evaluate the model on a validation set. Output images containing each pair and its corresponding similarity confidence are written to `{OUT_PATH}`.

Note: During evaluation, the pairs are generated with a deterministic seed for the numpy random module, so that multiple evaluations can be compared against each other.

```shell
python eval.py [-h] -v VAL_PATH -o OUT_PATH -c CHECKPOINT

optional arguments:
  -h, --help            show this help message and exit
  -v VAL_PATH, --val_path VAL_PATH
                        Path to directory containing validation dataset.
  -o OUT_PATH, --out_path OUT_PATH
                        Path for saving prediction images.
  -c CHECKPOINT, --checkpoint CHECKPOINT
                        Path of model checkpoint to be used for inference.
```
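For example (again with placeholder paths; the checkpoint filename depends on how training was run):

```shell
python eval.py -v ./data/val -o ./eval_outputs -c ./outputs/checkpoint.pth
```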