Gun Detection Lab
Background
In recent years, gun-related violence has become a significant concern for law enforcement agencies and communities worldwide. Detecting firearms in images or videos can play a crucial role in preventing crimes, enhancing surveillance systems, and ensuring public safety. Object detection models, such as Faster R-CNN, have proven to be effective tools for identifying and localizing objects like guns in visual data.
This lab introduces you to the basics of computer vision and object detection using PyTorch. You will build a gun detection system using a pre-trained Faster R-CNN model fine-tuned on a custom dataset containing annotated images of guns.
This lab is a modified version of the Gun Detect project hosted on Kaggle.
Goal
The goal of this lab is to:
- Understand R-CNN, Fast R-CNN, and Faster R-CNN.
- Train a Faster R-CNN model to detect guns in images.
- Evaluate the model's performance using metrics such as precision, recall, and F1-score.
- Test the trained model on custom images to visualize its predictions.
By the end of this lab, you will have a working gun detection system that can identify and localize firearms in static images.
Dataset
Source
The dataset used in this lab is available at (an original and a backup copy):
Structure
The dataset contains the following folders:
- Images: Contains JPEG images with guns.
- Labels: Contains text files with bounding box annotations in Pascal VOC-style corner coordinates:
  <number_of_guns>
  xmin ymin xmax ymax
  - The first line specifies the number of guns in the image.
  - Each subsequent line contains the bounding box coordinates (xmin, ymin, xmax, ymax) for a single gun.
- our_test_images (Optional): A folder containing custom test images for evaluating the trained model.
Preprocessing
The annotations are parsed into bounding boxes and labels, where:
- All objects are labeled as 1 (Gun).
- Background regions are labeled as 0.
An example training image with annotations
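The label format and labeling rule described above can be sketched in plain Python. This is a minimal illustration, not the notebook's own parser: the function names `parse_label_text` and `to_target` are hypothetical, and the sketch assumes coordinates are whitespace-separated numbers, one box per line.

```python
def parse_label_text(text):
    """Parse the contents of one label file into (xmin, ymin, xmax, ymax) tuples.

    Assumed layout (from the Structure section): the first non-empty line is
    the number of guns, and each following line holds one bounding box.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    count = int(lines[0])
    boxes = []
    for ln in lines[1:1 + count]:
        # int(float(...)) also accepts values written as "10.0"
        xmin, ymin, xmax, ymax = (int(float(v)) for v in ln.split()[:4])
        boxes.append((xmin, ymin, xmax, ymax))
    if len(boxes) != count:
        raise ValueError(f"expected {count} boxes, found {len(boxes)}")
    return boxes


def to_target(boxes):
    """Attach class label 1 (Gun) to every box; 0 is reserved for background."""
    return {"boxes": list(boxes), "labels": [1] * len(boxes)}


sample = "2\n10 20 110 220\n5 6 50 60\n"
print(to_target(parse_label_text(sample)))
```

In a PyTorch dataset the two lists would typically be converted to tensors before being handed to the model.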
R-CNN and Fast R-CNN Algorithm
- The R-CNN (Region-based Convolutional Neural Network) algorithm is a foundational object detection technique in computer vision. You MUST watch the R-CNN Tutorial first.
- Fast R-CNN is an object detection algorithm that significantly improved upon the original R-CNN by addressing its speed limitations. Fast R-CNN processes the entire input image through the CNN only once and shares the resulting feature map across all region proposals, which removes the redundant per-region computation.
Faster R-CNN Algorithm
Faster R-CNN is an object detection algorithm that builds upon Fast R-CNN, addressing its primary remaining bottleneck: region proposal generation. It integrates a Region Proposal Network (RPN) directly into the detector, so proposals are computed from the same convolutional features used for detection, which makes it significantly faster and more efficient.
Faster R-CNN is a two-stage object detection model:
- Region Proposal Network (RPN): Proposes candidate regions of interest (RoIs) in the image.
- Classification and Regression Head: Classifies the RoIs and refines their bounding box coordinates.
Why Faster R-CNN?
- Accuracy: As a two-stage detector, Faster R-CNN classifies and refines each proposed region separately, which helps when detecting small or partially occluded objects such as guns.
- Transfer Learning: By leveraging a pre-trained backbone (ResNet-50 + FPN), the model performs well even with smaller datasets.
The Architecture of Faster R-CNN (Image Credits)
Training Process
Steps
- Load Dataset:
  - The GunDataset class loads images and parses annotations from the Images and Labels folders.
  - Data augmentation (e.g., random horizontal flip) is applied during training.
- Model Initialization:
  - A pre-trained Faster R-CNN model (torchvision.models.detection.fasterrcnn_resnet50_fpn) is loaded.
  - The classifier head is replaced to support two classes: Background and Gun.
- Training Loop:
  - The model is trained for 10 epochs using the SGD optimizer.
  - Losses (classification and regression) are computed and backpropagated.
- Save Model:
  - The trained model is saved as gun_detection_model.pth.
Code Snippet
# Train for 10 epochs
num_epochs = 10
for epoch in range(num_epochs):
    # epoch is zero-based; print it one-based so the log reads 1/10 ... 10/10
    print(f"Epoch {epoch + 1}/{num_epochs}")
    train_one_epoch(model, optimizer, train_loader, device, epoch)
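The snippet above calls `train_one_epoch`, whose body is not shown here. One plausible shape for it is sketched below, under stated assumptions: the model follows the torchvision detection convention of returning a dictionary of losses when called in training mode with images and targets, and each batch from the loader is a pair `(images, targets)`. The returned mean loss and the printed summary line are illustrative choices, not the notebook's actual logging.

```python
import torch


def train_one_epoch(model, optimizer, data_loader, device, epoch):
    """Run one pass over data_loader and return the mean total loss."""
    model.train()
    total, num_batches = 0.0, 0
    for images, targets in data_loader:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        # In training mode the detector returns a dict of loss tensors
        # (classification and box regression terms, among others).
        loss_dict = model(images, targets)
        loss = sum(loss_dict.values())

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total += loss.item()
        num_batches += 1

    mean_loss = total / max(num_batches, 1)
    print(f"Epoch {epoch + 1}: mean loss {mean_loss:.4f}")
    return mean_loss
```

Summing the individual losses into a single scalar lets one backward pass update both the region proposal and detection parts of the network.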
Testing and Evaluation Process
Testing
- Load Pre-trained Model:
  - The saved model (gun_detection_model.pth) is loaded for inference.
  - Alternatively, a pre-trained model can be downloaded from Dropbox:
- Predict on Custom Images:
  - The model predicts bounding boxes and confidence scores for guns in images from the our_test_images folder.
  - Predictions with confidence scores above a threshold (e.g., 0.5) are visualized.
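The confidence filtering step can be illustrated with a small helper. This is a sketch in plain Python; the name `filter_by_score` is hypothetical, and the notebook may apply the threshold directly on tensors instead.

```python
def filter_by_score(boxes, scores, threshold=0.5):
    """Keep only the predictions whose confidence is strictly above threshold."""
    kept = [(box, score) for box, score in zip(boxes, scores) if score > threshold]
    return [box for box, _ in kept], [score for _, score in kept]


boxes = [(0, 0, 1, 1), (2, 2, 3, 3), (4, 4, 5, 5)]
scores = [0.9, 0.5, 0.3]
print(filter_by_score(boxes, scores))
```

Raising the threshold trades missed detections for fewer false alarms, so the value is worth tuning on the test set rather than fixing at 0.5.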
Evaluation
- Metrics:
  - Precision, recall, and F1-score are computed using the classification_report function from sklearn.metrics.
  - Intersection over Union (IoU) is used to match predicted bounding boxes with ground truth boxes.
- Visualization:
  - Images with annotated bounding boxes are displayed using matplotlib.
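To make the IoU matching concrete, here is a small sketch in plain Python. It computes precision, recall, and F1 directly from match counts rather than through `classification_report`, so it is a simplified stand-in and not the notebook's evaluation code. The greedy one-to-one matching rule, the 0.5 IoU threshold, and all function names are assumptions.

```python
def iou(a, b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(pred_boxes, gt_boxes, iou_threshold=0.5):
    """Greedily match each prediction to one unused ground truth box.

    Returns (true_positives, false_positives, false_negatives).
    """
    unmatched = list(range(len(gt_boxes)))
    tp = 0
    for pred in pred_boxes:
        best_j, best_iou = None, iou_threshold
        for j in unmatched:
            value = iou(pred, gt_boxes[j])
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j is not None:
            unmatched.remove(best_j)  # each ground truth box is used once
            tp += 1
    return tp, len(pred_boxes) - tp, len(unmatched)


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, `match_detections([(0, 0, 10, 10)], [(1, 1, 10, 10)])` counts one true positive, because the two boxes overlap with an IoU of 0.81.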
Code Snippet
# Evaluate the model
evaluate_model(model, test_loader, device)
# Test on custom images
predict_custom_images(model, custom_test_images_dir, get_transform(train=False))
Usage Instructions
Setup
- Install dependencies:
  pip install torch torchvision matplotlib opencv-python pycocotools py7zr requests
- Download and extract the dataset:
  !wget -O /content/data.7z https://github.com/frankwxu/AI4DigitalForensics/raw/main/lab02_Gun_detection
  !py7zr x /content/data.7z /content/dataset
- Run the notebook cells sequentially to:
  - Train the model.
  - Save the trained model.
  - Test the model on custom images.
Testing
To test the model on custom images:
- Place your test images in the our_test_images folder.
- Run the predict_custom_images function to visualize predictions.
Expected Outputs
- Training Logs:
  Epoch 1/10, Iteration 0, Loss: 0.5200
- Evaluation Metrics:
  Classification Report:
                precision    recall  f1-score   support
    Background       0.95      0.98      0.96       500
           Gun       0.90      0.85      0.87       500
      accuracy                           0.92      1000
- Custom Prediction:
  - A plot of the test image with bounding boxes around detected guns.
Conclusion
This lab demonstrates how to build a gun detection system using PyTorch and Faster R-CNN. By leveraging transfer learning and a custom dataset, you can achieve high accuracy in detecting and localizing firearms in images. This project can be extended for real-time applications, such as video surveillance or embedded systems.
For questions or feedback, feel free to reach out!