diff --git a/.gitignore b/.gitignore
new file mode 100644
index 000000000..6a15910fd
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,2 @@
+/recognition/Detect_lesions_s4770608/wandb/
+/recognition/Detect_lesions_s4770608/save_weights/
diff --git a/recognition/Detect_lesions_s4770608/README.MD b/recognition/Detect_lesions_s4770608/README.MD
new file mode 100644
index 000000000..5dd426261
--- /dev/null
+++ b/recognition/Detect_lesions_s4770608/README.MD
@@ -0,0 +1,183 @@
## Customized Mask R-CNN for Skin Lesion Detection and Classification
### 1 Problem Statement
Skin cancer, one of the most prevalent cancers in the world, can be treated effectively if found early. Accurately detecting and classifying skin lesions in dermoscopic images is a crucial first step in diagnosing skin cancer.

### 2 Algorithm
This project employs a customized Mask R-CNN model, tailored for the precise detection and classification of skin lesions as melanoma or seborrheic keratosis.

#### 2.1 Model Architecture
The customized Mask R-CNN uses a ResNet backbone for feature extraction. The model has been fine-tuned specifically for skin lesion detection, with particular emphasis on handling imbalanced datasets and improving lesion localization through better bounding boxes and masks. Custom anchors and a specialized loss function address the particular challenges posed by skin lesion images.

The model code is in modules.py and is explained in detail below.

Loading the pre-trained model:
The function begins by loading a pre-trained Mask R-CNN model with a ResNet-50 backbone and FPN (Feature Pyramid Network). The `_v2` suffix denotes torchvision's improved variant of this architecture.

```python
model = torchvision.models.detection.maskrcnn_resnet50_fpn_v2(pretrained=True)
```
Modifying the classification head:
To adapt the model to a different number of classes, the classifier head is replaced. The number of output features of the final classification layer is set to num_classes.

```python
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
```

Modifying the mask prediction head:
Similarly, the mask prediction head is replaced. The hidden layer dimension is set to 256, and the number of output channels is set to num_classes to match the new dataset.

```python
in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)
```

Parameter tuning setup:
All parameters in the model are set to be trainable, so the weights of every layer can be updated during fine-tuning. In the original setting, only the last three layers are updated.
```python
for name, para in model.named_parameters():
    para.requires_grad = True
```
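The custom anchors mentioned above are not shown in the snippets; a minimal sketch of how they could be attached, assuming torchvision's `AnchorGenerator` (already imported in main.py) and hypothetical anchor sizes chosen for lesions that are large relative to the image:

```python
from torchvision.models.detection.anchor_utils import AnchorGenerator

# Hypothetical anchor configuration: one (sizes, aspect_ratios) entry is
# needed per FPN feature-map level (five levels for this backbone).
anchor_generator = AnchorGenerator(
    sizes=((64,), (128,), (256,), (512,), (1024,)),
    aspect_ratios=((0.5, 1.0, 2.0),) * 5,
)
model.rpn.anchor_generator = anchor_generator
```

Any replacement must keep three anchors per spatial location, as above, so that the pre-trained RPN head's output channels still match.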
#### 2.2 Data Preprocessing
Training transformations for images and masks:

(1) Random vertical flip: with a probability of 0.3, the images and masks are flipped vertically.
`v2.RandomVerticalFlip(p=0.3)`

(2) Random horizontal flip: with a probability of 0.3, the images and masks are flipped horizontally.
`v2.RandomHorizontalFlip(p=0.3)`

(3) Random rotation: the images and masks are rotated randomly within a range of 0 to 180 degrees.
`v2.RandomRotation(degrees=(0, 180))`

(4) Random resized crop: a random size and aspect ratio are selected to crop the images and masks, and the crops are resized to 1129x1504 pixels.
`v2.RandomResizedCrop(size=(1129, 1504))`

(5) Conversion to tensor: the images and masks are converted to tensors, which are suitable for model input.
`v2.ToTensor()`

(6) Normalization of training images (train_transform_stage2_for_img): the images are normalized using the mean and standard deviation of the ImageNet dataset.
`v2.Normalize(mean=imagenet_mean, std=imagenet_std)`

In the validation stage, only normalization is applied. The full composed pipeline is shown below.
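For reference, this is the complete stage-1 training pipeline as composed in main.py (stage 2 is the single `Normalize` call):

```python
train_transform_stage1_for_img_mask = transforms.Compose([
    v2.RandomVerticalFlip(p=0.3),
    v2.RandomHorizontalFlip(p=0.3),
    v2.RandomRotation(degrees=(0, 180)),
    v2.RandomResizedCrop(size=(1129, 1504)),
    v2.ToTensor(),
])
```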
#### 2.3 Data Splitting
The ISIC 2017 dataset is already split into training, validation, and testing sets.

### 3 Implementation Details

#### 3.1 Custom Dataset Class
The Mask R-CNN model requires images and bounding boxes as input, whereas the ISIC dataset provides images and masks. The custom dataset class therefore converts each mask into a bounding box.

Here is a detailed explanation.

Overview:
CustomISICDataset is a specialized dataset class for processing and loading skin lesion images and their corresponding masks for object detection tasks. It handles RGB images as inputs and binary masks that delineate the regions of interest.

Initialization:
Upon initialization, the class requires the path to the CSV file containing labels, the image directory, and the mask directory. It also accepts optional transformations and a target size for the images and masks.

- csv_file: path to the CSV file containing image labels.
- img_dir: directory containing the image files.
- mask_dir: directory containing the mask files.
- transform_stage1_for_img_mask: initial transformations applied to both images and masks.
- transform_stage2_for_img: additional transformations applied only to images.

Integrity checks:
The _check_dataset_integrity function maintains the dataset's integrity by confirming the correspondence between image IDs and mask IDs, ensuring there are no mismatches or missing files.

Data loading:
The __getitem__ method loads an individual image-mask pair by index, applies the necessary transformations, and generates a target dictionary containing bounding boxes, labels, and masks.

Image and mask loading:
Images are loaded and converted to RGB format; masks are loaded as single-channel images.

Transformations:
The specified transformations are applied sequentially. A retry loop ensures that the mask retains enough foreground pixels after the random crop.

Label processing:
Labels are derived from the CSV columns, assigning a unique label to each state (melanoma, seborrheic keratosis, or healthy).

Bounding box calculation:
Bounding boxes are computed from the masks, and a target dictionary is formed containing the bounding boxes, labels, and masks, in the structured format required for model training (see the sketch below).
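The core of this conversion, condensed from dataset.py:

```python
import torch
import torchvision

def mask_to_target(mask, label):
    # mask: binary tensor of shape (1, H, W); label: integer class id
    bbox = torchvision.ops.masks_to_boxes(mask)  # (1, 4) tensor [xmin, ymin, xmax, ymax]
    return {
        "boxes": bbox,
        "labels": torch.tensor([label], dtype=torch.int64),
        "masks": mask,
    }
```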
#### Requirements and Dependencies
Python 3.9

torch 2.0.0+cu118

torchvision 0.15.1+cu118

shapely 2.0.2

tqdm 4.65.0

To ensure reproducibility, a requirements.txt file is provided. Install the necessary packages using the following command:

```console
pip install -r requirements.txt
```
#### 3.2 Training
Modify the dataset paths in the get_data_loaders function in main.py, then you are good to go:
```python
def get_data_loaders(target_size):
```
```console
python main.py
```

#### 3.3 Prediction
Modify the dataset paths in the following code snippet in predict.py, then you are good to go. The output images will be saved to the directory given by the --output_path argument.
```python
test_data = CustomISICDataset(
    csv_file='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_label.csv',
    img_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_imgs',
    mask_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_gt',
    transform_stage1_for_img_mask=val_transform_stage1_for_img_mask,
    transform_stage2_for_img=val_transform_stage2_for_img,
    target_size=target_size)
```
```console
python predict.py --output_path output
```

### Results

#### Metrics: IoU and classification accuracy
The Intersection over Union (IoU) between bounding boxes is used to evaluate localization, and classification accuracy is used to evaluate the classifier.
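The IoU computation mirrors calculate_iou_bbox in main.py, which builds shapely polygons from the box corners; a worked example with two hypothetical boxes:

```python
from shapely.geometry import Polygon

# Two hypothetical boxes in [xmin, ymin, xmax, ymax] format
# box_a = [0, 0, 10, 10], box_b = [5, 0, 15, 10]
poly_a = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
poly_b = Polygon([(5, 0), (15, 0), (15, 10), (5, 10)])

# intersection = 5 * 10 = 50, union = 100 + 100 - 50 = 150
iou = poly_a.intersection(poly_b).area / poly_a.union(poly_b).area
print(iou)  # 0.333...
```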
#### Visualization
Detailed visualizations can be found at https://wandb.ai/baibizhe-s-team/ISIC?workspace=user-baibizhe

Good results:

![good1.png](good1.png)
![good2.png](good2.png)


Bad results:
![bad1.png](bad1.png)
![bad2.png](bad2.png)
#### Quantitative Results
The final IoU reached is 0.724 and the mean classification accuracy is 0.513.
diff --git a/recognition/Detect_lesions_s4770608/dataset.py b/recognition/Detect_lesions_s4770608/dataset.py
new file mode 100644
index 000000000..dd60be35f
--- /dev/null
+++ b/recognition/Detect_lesions_s4770608/dataset.py
@@ -0,0 +1,118 @@
import os
import torch
import torchvision.ops
import torchvision.transforms as transforms
from torch.utils.data import Dataset
from PIL import Image
import pandas as pd
from torchvision.transforms import functional as F, InterpolationMode
import numpy as np


def mask_to_bbox(mask):
    pos = torch.nonzero(mask, as_tuple=True)
    xmin = torch.min(pos[2])
    xmax = torch.max(pos[2])
    ymin = torch.min(pos[1])
    ymax = torch.max(pos[1])

    # Check that the width and height are positive
    assert (xmax - xmin) > 0 and (ymax - ymin) > 0, f"Invalid bbox: {[xmin, ymin, xmax, ymax]}"

    return torch.tensor([[xmin, ymin, xmax, ymax]], dtype=torch.float32)


def get_targets_from_mask(mask, label):
    """Build a target dict from a mask.
    Args:
    - mask (Tensor): binary mask of shape (H, W).
    - label (int): class label of the object.
    Returns:
    - target (dict): target format required by Mask R-CNN.
    """
    mask = mask.unsqueeze(0)  # add a leading channel dimension
    bbox = mask_to_bbox(mask)
    target = {
        "boxes": bbox,
        "labels": torch.tensor([label], dtype=torch.int64),
        "masks": mask
    }
    return target
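# Usage sketch (illustrative values, not part of the pipeline): a 4x4 mask
# whose foreground occupies rows 1-2 and columns 1-3 yields
# boxes == [[1., 1., 3., 2.]] and labels == tensor([1]):
#
#   mask = torch.zeros(4, 4)
#   mask[1:3, 1:4] = 1
#   target = get_targets_from_mask(mask, label=1)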
class CustomISICDataset(Dataset):
    def __init__(self, csv_file, img_dir, mask_dir, transform_stage1_for_img_mask=None,
                 transform_stage2_for_img=None, target_size=224):
        self.labels = pd.read_csv(csv_file)
        self.img_dir = img_dir
        self.mask_dir = mask_dir
        self.transform_stage1_for_img_mask = transform_stage1_for_img_mask
        self.transform_stage2_for_img = transform_stage2_for_img
        self.target_size = target_size
        self._check_dataset_integrity()

    def _check_dataset_integrity(self):
        # Getting image and mask file names without extensions
        image_ids = self.labels['image_id'].tolist()
        mask_ids = [
            os.path.splitext(mask_file)[0].replace('_segmentation', '')
            for mask_file in os.listdir(self.mask_dir)
        ]

        # Checking if lengths are the same
        assert len(image_ids) == len(mask_ids), \
            f"Number of images ({len(image_ids)}) and masks ({len(mask_ids)}) do not match."

        # Checking if filenames correspond
        assert set(image_ids) == set(mask_ids), \
            "Image IDs and Mask IDs do not correspond."

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        img_name = self.labels.iloc[idx, 0]
        img_path = f"{self.img_dir}/{img_name}.jpg"
        mask_path = f"{self.mask_dir}/{img_name}_segmentation.png"

        image = Image.open(img_path).convert("RGB")
        mask = Image.open(mask_path).convert("L")  # masks are single-channel

        if self.transform_stage1_for_img_mask:
            orig_image, orig_mask = image, mask
            image, mask = self.transform_stage1_for_img_mask([orig_image, orig_mask])
            # Re-sample the random transform until the mask keeps enough
            # foreground pixels; give up after 10 attempts (the sample is
            # then dropped by the collate function).
            count = 0
            while mask.sum() <= 100:
                if count > 10:
                    return None
                count += 1
                image, mask = self.transform_stage1_for_img_mask([orig_image, orig_mask])

        if self.transform_stage2_for_img:
            # Normalization applies to the image only; the mask must stay binary
            image = self.transform_stage2_for_img(image)

        # Class labels from the CSV
        melanoma = int(self.labels.iloc[idx, 1])
        seborrheic_keratosis = int(self.labels.iloc[idx, 2])

        # Map the two CSV columns to a single class label
        if melanoma == 1 and seborrheic_keratosis == 0:
            label = 1
        elif melanoma == 0 and seborrheic_keratosis == 1:
            label = 2
        elif melanoma == 0 and seborrheic_keratosis == 0:
            label = 3
        else:
            raise ValueError("Invalid label found!")

        bbox = torchvision.ops.masks_to_boxes(mask)
        target = {
            "boxes": bbox,
            "labels": torch.tensor([label], dtype=torch.int64),
            "masks": mask
        }

        return image, target
diff --git a/recognition/Detect_lesions_s4770608/main.py b/recognition/Detect_lesions_s4770608/main.py
new file mode 100644
index 000000000..ec767a541
--- /dev/null
+++ b/recognition/Detect_lesions_s4770608/main.py
@@ -0,0 +1,415 @@
import os
from datetime import datetime

import torch
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.transforms import v2, InterpolationMode

import torch.nn as nn
import torch.nn.functional as F
from dataset import CustomISICDataset
from torch.utils.data import DataLoader
import tqdm
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
from shapely.geometry import Polygon

import torchvision
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
import torch.optim as optim
import numpy as np
import wandb
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

from modules import get_model_instance_segmentation, ImageClassifier, get_deeplab_model
from PIL import Image


def get_data_loaders(target_size):
    imagenet_mean = [0.485, 0.456, 0.406]
    imagenet_std = [0.229, 0.224, 0.225]

    def collate_fn(batch):
        # Drop samples for which the dataset returned None (mask too small)
        images, targets = [], []
        for item in batch:
            if item is None:
                continue
            images.append(item[0])
            targets.append(item[1])
        images = [img.cuda() for img in images]

        for target in targets:
            target["boxes"] = target["boxes"].cuda()
            target["labels"] = target["labels"].cuda()
            target["masks"] = target["masks"].cuda()

        return list(images), list(targets)

    train_transform_stage1_for_img_mask = transforms.Compose([
        v2.RandomVerticalFlip(p=0.3),
        v2.RandomHorizontalFlip(p=0.3),
        v2.RandomRotation(degrees=(0, 180)),
        v2.RandomResizedCrop(size=(1129, 1504)),
        # v2.Resize(size=(target_size, target_size), interpolation=InterpolationMode.NEAREST),
        v2.ToTensor(),
    ])

    train_transform_stage2_for_img = transforms.Compose([
        v2.Normalize(mean=imagenet_mean, std=imagenet_std)
    ])

    train_data = CustomISICDataset(
        csv_file='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/train_label.csv',
        img_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/train_imgs',
        mask_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/train_gt',
        transform_stage1_for_img_mask=train_transform_stage1_for_img_mask,
        transform_stage2_for_img=train_transform_stage2_for_img,
        target_size=target_size)

    val_transform_stage1_for_img_mask = transforms.Compose([
        # v2.Resize(size=(target_size, target_size), interpolation=InterpolationMode.NEAREST),
        v2.ToTensor(),
    ])
    val_transform_stage2_for_img = transforms.Compose([
        v2.Normalize(mean=imagenet_mean, std=imagenet_std)
    ])
    val_data = CustomISICDataset(
        csv_file='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/val_label.csv',
        img_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/val_imgs',
        mask_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/val_gt',
        transform_stage1_for_img_mask=val_transform_stage1_for_img_mask,
        transform_stage2_for_img=val_transform_stage2_for_img,
        target_size=target_size)

    test_data = CustomISICDataset(
        csv_file='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_label.csv',
        img_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_imgs',
        mask_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_gt',
        transform_stage1_for_img_mask=val_transform_stage1_for_img_mask,
        transform_stage2_for_img=val_transform_stage2_for_img,
        target_size=target_size)

    train_data_loader = DataLoader(train_data, batch_size=4, shuffle=True, collate_fn=collate_fn)
    val_data_loader = DataLoader(val_data, batch_size=1, shuffle=False, collate_fn=collate_fn)
    test_data_loader = DataLoader(test_data, batch_size=1, shuffle=False, collate_fn=collate_fn)
    print(f'Training data: {len(train_data_loader.dataset)} samples, '
          f'{len(train_data_loader)} batches')
    print(f'Validation data: {len(val_data_loader.dataset)} samples, '
          f'{len(val_data_loader)} batches')
    print(f'Testing data: {len(test_data_loader.dataset)} samples, '
          f'{len(test_data_loader)} batches')
    return train_data_loader, val_data_loader, test_data_loader


def calculate_iou_bbox(box_1, box_2):
    """
    Calculate the Intersection over Union (IoU) of two bounding boxes.

    Parameters:
    box_1, box_2: list of float
        Bounding box coordinates: [x1, y1, x2, y2]

    Returns:
    float
        IoU value
    """
    # Convert the boxes from [x1, y1, x2, y2] format into the corner
    # coordinates expected by shapely's Polygon
    poly_1 = Polygon([(box_1[0], box_1[1]), (box_1[2], box_1[1]), (box_1[2], box_1[3]), (box_1[0], box_1[3])])
    poly_2 = Polygon([(box_2[0], box_2[1]), (box_2[2], box_2[1]), (box_2[2], box_2[3]), (box_2[0], box_2[3])])

    # Compute IoU as intersection area over union area
    iou = poly_1.intersection(poly_2).area / poly_1.union(poly_2).area

    return iou


def compute_accuracy(pred_labels, target_labels):
    correct = (pred_labels == target_labels).sum().item()
    total = len(target_labels)
    return correct / total


def select_best_prediction(predictions):
    """
    predictions: list of dicts. Each dict contains 'boxes', 'labels', 'scores', and 'masks'.
    """
    best_predictions = []

    for pred in predictions:
        # Index of the highest-scoring predicted box
        if len(pred['scores']) == 0:
            print(len(pred['scores']), len(pred['boxes']))
            max_score_idx = 0
        else:
            max_score_idx = pred['scores'].argmax()

        # Keep only the highest-scoring prediction
        best_pred = {
            'boxes': pred['boxes'][max_score_idx].unsqueeze(0),  # unsqueeze to keep a leading dimension
            'labels': pred['labels'][max_score_idx].unsqueeze(0),
            'scores': pred['scores'][max_score_idx].unsqueeze(0),
            'masks': pred['masks'][max_score_idx].unsqueeze(0)
        }
        best_predictions.append(best_pred)

    return best_predictions
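# Selection sketch (hypothetical scores): for a prediction with
# scores == tensor([0.2, 0.9, 0.4]), argmax picks index 1, and each field of
# the returned dict keeps a leading dimension of 1, e.g.:
#   best['boxes'].shape -> (1, 4)
#   best['masks'].shape -> (1, 1, H, W)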
def log_predictions_to_wandb(images, predictions, targets, predicted_label):
    plt.close()
    fig, axs = plt.subplots(1, 3, figsize=(15, 5))
    input_image = images[0].cpu().permute(1, 2, 0).numpy()
    axs[0].imshow(input_image)
    axs[0].set_title("Input Image")
    axs[1].imshow(input_image)
    pred = predictions[0]
    pred_boxes = pred['boxes'].cpu().numpy()
    pred_masks = pred['masks'].cpu().numpy()
    label_map = {
        0: "Melanoma",
        1: "Seborrheic Keratosis",
        2: "Healthy"
    }
    for box, mask in zip(pred_boxes, pred_masks):
        xmin, ymin, xmax, ymax = box
        rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False, color='red')
        axs[1].add_patch(rect)
        axs[1].text(xmin, ymin, label_map[predicted_label.item()], color='red')
        axs[1].imshow(mask[0], alpha=0.7)
    axs[1].set_title("Predicted Boxes and Masks")

    # Draw the ground-truth boxes and masks
    axs[2].imshow(input_image)
    targets = targets[0]
    true_boxes = targets['boxes'].cpu().numpy()
    true_masks = targets['masks'].cpu().numpy()
    true_labels = targets['labels'].cpu().numpy()

    for box, mask, label in zip(true_boxes, true_masks, true_labels):
        xmin, ymin, xmax, ymax = box
        rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False, color='green')
        axs[2].add_patch(rect)
        axs[2].text(xmin, ymin, label_map[label.item() - 1], color='green')
        axs[2].imshow(mask, alpha=0.7)
    axs[2].set_title("True Boxes and Masks")

    for ax in axs:
        ax.axis('off')

    plt.tight_layout()

    return wandb.Image(plt)


def main():
    max_epoch = 50

    target_size = 384
    train_data_loader, val_data_loader, test_data_loader = get_data_loaders(target_size)
    wandb.init(project='ISIC', name='new maskrcnnv2 all requires_grad and classifier no resize')  # Please set your project and entity name
    now = datetime.now()
    timestamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    output_folder = os.path.join('save_weights', timestamp)
    os.makedirs(output_folder, exist_ok=True)

    maskrcnn_model = get_model_instance_segmentation(4)

    backbone = maskrcnn_model.backbone.body

    # Image-level classifier that shares the Mask R-CNN backbone
    image_classifier = ImageClassifier(backbone, num_classes=3)
    maskrcnn_model.cuda()
    image_classifier.cuda()
    image_classifier_loss = torch.nn.CrossEntropyLoss()
    params = [p for p in maskrcnn_model.parameters() if p.requires_grad]
    params.extend([p for p in image_classifier.parameters() if p.requires_grad])
    optimizer = optim.AdamW(params, lr=0.0005)
    lr_scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=3, T_mult=1, eta_min=2e-7)
    pbar = tqdm.tqdm(range(max_epoch))
    scaler = torch.cuda.amp.GradScaler()

    max_iou = 0
    for epoch in pbar:
        maskrcnn_model.train()
        image_classifier.train()
        epoch_loss = 0
        if epoch == 0:
            train_data_loader = tqdm.tqdm(train_data_loader)
        epoch_loss_dict = {"loss_classifier": 0,
                           "loss_box_reg": 0,
                           "new_loss_classifier": 0,
                           "loss_mask": 0,
                           "loss_objectness": 0,
                           "loss_rpn_box_reg": 0
                           }
        for images, targets in train_data_loader:
            optimizer.zero_grad()
            with torch.cuda.amp.autocast():
                loss_dict = maskrcnn_model(images, targets)
                classify_logits = image_classifier(torch.stack(images))
                labels = torch.tensor([t['labels'] - 1 for t in targets]).cuda()
                classify_loss = image_classifier_loss(classify_logits, labels)
                loss_dict['new_loss_classifier'] = classify_loss

                losses = sum(loss for loss in loss_dict.values())
            scaler.scale(losses).backward()
            scaler.step(optimizer)
            scaler.update()

            epoch_loss += losses.item()
            for t in loss_dict:
                epoch_loss_dict[t] += loss_dict[t].item()

        # 2000 = number of training images in ISIC 2017
        wandb.log({"loss_classifier": epoch_loss_dict['loss_classifier'] / 2000,
                   "loss_box_reg": epoch_loss_dict['loss_box_reg'] / 2000,
                   "new_loss_classifier": epoch_loss_dict['new_loss_classifier'] / 2000,
                   "loss_mask": epoch_loss_dict['loss_mask'] / 2000,
                   "loss_objectness": epoch_loss_dict['loss_objectness'] / 2000,
                   "loss_rpn_box_reg": epoch_loss_dict['loss_rpn_box_reg'] / 2000
                   }, step=epoch)

        maskrcnn_model.eval()
        image_classifier.eval()

        all_ious = []
        all_accuracies = []
        with torch.no_grad():
            pbar_val = tqdm.tqdm(val_data_loader, desc=f'Epoch {epoch + 1} VAL', leave=False)
            wandb_images = []
            for i, (images, targets) in enumerate(pbar_val):
                predictions = maskrcnn_model(images)
                if len(predictions[0]['boxes']) == 0:
                    all_ious.append(0)
                    all_accuracies.append(0)
                    print('zero prediction occurs')
                    continue
                predictions = select_best_prediction(predictions)
                iou = calculate_iou_bbox(predictions[0]["boxes"].cpu().numpy()[0], targets[0]["boxes"].cpu().numpy()[0])
                all_ious.append(iou)
                classify_result = image_classifier(torch.stack(images)).argmax(1)
                labels = torch.tensor([t['labels'] - 1 for t in targets]).cuda()
                accuracy = compute_accuracy(classify_result, labels)
                print(classify_result, labels)
                all_accuracies.append(accuracy)
                # Log a visualization for every 20th validation image
                if i % 20 == 0:
                    wandb_images.append(log_predictions_to_wandb(images, predictions, targets, classify_result))

        mean_iou = sum(all_ious) / len(all_ious)
        mean_accuracy = sum(all_accuracies) / len(all_accuracies)
        if mean_iou > max_iou:
            max_iou = mean_iou
            torch.save(maskrcnn_model.state_dict(), os.path.join(output_folder, 'best_iou_model.pt'))
        lr_scheduler.step()
        wandb.log({"Val Mean IoU": mean_iou, "Val Mean Accuracy": mean_accuracy,
                   "val examples": wandb_images}, step=epoch)


# def main_deeplabv3():  # TODO: training and validation function for DeepLabV3; kept here because its performance is good
#     max_epoch = 50
#
#     target_size = 384
#     train_data_loader, val_data_loader, test_data_loader = get_data_loaders(target_size)
#     wandb.init(project='ISIC', name='deeplabv3 full')  # Please set your project and entity name
#     now = datetime.now()
#     timestamp = now.strftime("%Y-%m-%d_%H-%M-%S")
#     output_folder = os.path.join('save_weights', timestamp)
#     os.makedirs(output_folder, exist_ok=True)
#
#     deeplabModel = get_deeplab_model(2)
#
#     ce_loss = torch.nn.CrossEntropyLoss()
#
#     params = [p for p in deeplabModel.parameters() if p.requires_grad]
#     optimizer = optim.AdamW(params, lr=0.0005)
#     lr_scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=3, T_mult=1, eta_min=2e-7)
#     pbar = tqdm.tqdm(range(max_epoch))
#     deeplabModel.cuda()
#     max_iou = 0
#     for epoch in pbar:
#         deeplabModel.train()
#         epoch_loss = 0
#         if epoch == 0:
#             train_data_loader = tqdm.tqdm(train_data_loader)
#         loss_dict = {'segmentation loss': 0, 'classification loss': 0}
#         for images, targets in train_data_loader:
#             mask_logits, classify_logits = deeplabModel(torch.stack(images).cuda())
#             masks = torch.stack([t['masks'] for t in targets]).cuda()
#             segmentation_loss = ce_loss(mask_logits, masks.squeeze(1).long())
#             loss_dict['segmentation loss'] += segmentation_loss.item()
#             labels = torch.tensor([t['labels'] - 1 for t in targets]).cuda()
#             classification_loss = ce_loss(classify_logits, labels)
#             loss_dict['classification loss'] += classification_loss.item()
#             total_loss = segmentation_loss + classification_loss
#
#             optimizer.zero_grad()
#             total_loss.backward()
#             optimizer.step()
#
#         for t in loss_dict:
#             loss_dict[t] = loss_dict[t] / 2000
#         wandb.log(loss_dict, step=epoch)
#
#         deeplabModel.eval()
#
#         all_ious = []
#         all_accuracies = []
#         with torch.no_grad():
#             pbar_val = tqdm.tqdm(val_data_loader, desc=f'Epoch {epoch + 1} VAL', leave=False)
#             wandb_images = []
#             for i, (images, targets) in enumerate(pbar_val):
#                 predictions = dict()
#                 mask_logits, classify_logits = deeplabModel(torch.stack(images).cuda())
#                 predictions['masks'] = mask_logits.argmax(1).unsqueeze(0)
#                 if predictions['masks'].min() == predictions['masks'].max():
#                     continue
#                 predictions['boxes'] = torchvision.ops.masks_to_boxes(predictions['masks'][0])
#                 iou = calculate_iou_bbox(predictions["boxes"].cpu().numpy()[0], targets[0]["boxes"].cpu().numpy()[0])
#                 all_ious.append(iou)
#                 classify_result = classify_logits.argmax(1)
#                 labels = torch.tensor([t['labels'] - 1 for t in targets]).cuda()
#                 accuracy = compute_accuracy(classify_result, labels)
#                 all_accuracies.append(accuracy)
#
#         mean_iou = sum(all_ious) / len(all_ious)
#         mean_accuracy = sum(all_accuracies) / len(all_accuracies)
#         if mean_iou > max_iou:
#             max_iou = mean_iou
#             torch.save(deeplabModel.state_dict(), os.path.join(output_folder, 'best_iou_model.pt'))
#         lr_scheduler.step()
#         wandb.log({"Val Mean IoU": mean_iou, "Val Mean Accuracy": mean_accuracy}, step=epoch)

if __name__ == '__main__':
    main()

diff --git a/recognition/Detect_lesions_s4770608/modules.py b/recognition/Detect_lesions_s4770608/modules.py
new file mode 100644
index 000000000..9bd8c129b
--- /dev/null
+++ b/recognition/Detect_lesions_s4770608/modules.py
@@ -0,0 +1,50 @@
import torch
import torchvision
from torch import nn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
import segmentation_models_pytorch as smp


def get_model_instance_segmentation(num_classes):
    # Load a pre-trained Mask R-CNN model
    model = torchvision.models.detection.maskrcnn_resnet50_fpn_v2(pretrained=True)

    # Number of input features of the box classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features

    # Replace the box predictor head
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Number of input channels of the mask predictor
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels

    # Replace the mask predictor head
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)

    # Make every layer trainable for full fine-tuning
    for name, para in model.named_parameters():
        para.requires_grad = True
    return model


def get_deeplab_model(num_classes):
    model = smp.DeepLabV3Plus(
        encoder_name="efficientnet-b4",  # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
        encoder_weights="imagenet",      # use `imagenet` pre-trained weights for encoder initialization
        in_channels=3,                   # model input channels (1 for gray-scale images, 3 for RGB, etc.)
        classes=num_classes,             # model output channels (number of classes in your dataset)
        aux_params={'classes': 3}        # auxiliary classification head with 3 classes
    )
    return model


class ImageClassifier(torch.nn.Module):
    def __init__(self, backbone, num_classes: int):
        super().__init__()
        self.backbone = backbone
        input_dim = 2048  # channel count of the ResNet-50 layer4 feature map
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        x = self.backbone(x)['3']  # take the last feature map returned by the backbone body
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x
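# Shape walk-through for ImageClassifier (illustrative, assuming a batch of
# 384x384 RGB inputs; ResNet-50's layer4 output has stride 32 and 2048 channels):
#   backbone(x)['3'] -> (B, 2048, 12, 12)
#   avgpool          -> (B, 2048, 1, 1)
#   flatten          -> (B, 2048)
#   fc               -> (B, num_classes)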
diff --git a/recognition/Detect_lesions_s4770608/predict.py b/recognition/Detect_lesions_s4770608/predict.py
new file mode 100644
index 000000000..64273136b
--- /dev/null
+++ b/recognition/Detect_lesions_s4770608/predict.py
@@ -0,0 +1,158 @@
import argparse
import os
from datetime import datetime

import torch
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.transforms import v2, InterpolationMode

import torch.nn as nn
import torch.nn.functional as F
from dataset import CustomISICDataset
from torch.utils.data import DataLoader
import tqdm
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
from shapely.geometry import Polygon

import torchvision
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
import torch.optim as optim
import numpy as np
import wandb
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts
from main import select_best_prediction, compute_accuracy, calculate_iou_bbox, log_predictions_to_wandb
from modules import get_model_instance_segmentation, ImageClassifier, get_deeplab_model
from PIL import Image

parser = argparse.ArgumentParser()
parser.add_argument('--output_path', default="output")
args = parser.parse_args()
os.makedirs(args.output_path, exist_ok=True)
print(f'save figs to dir {args.output_path}')


def calculate_iou(box_1, box_2):
    # Duplicate of calculate_iou_bbox in main.py, kept here for convenience
    poly_1 = Polygon([(box_1[0], box_1[1]), (box_1[2], box_1[1]), (box_1[2], box_1[3]), (box_1[0], box_1[3])])
    poly_2 = Polygon([(box_2[0], box_2[1]), (box_2[2], box_2[1]), (box_2[2], box_2[3]), (box_2[0], box_2[3])])
    iou = poly_1.intersection(poly_2).area / poly_1.union(poly_2).area
    return iou


# Load the trained Mask R-CNN model
maskrcnn_model = get_model_instance_segmentation(4).cuda()

backbone = maskrcnn_model.backbone.body

# Image-level classifier; it shares the backbone with the Mask R-CNN model,
# so loading the checkpoint below also restores the classifier's backbone weights
image_classifier = ImageClassifier(backbone, num_classes=3).cuda()
maskrcnn_model.load_state_dict(torch.load('/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/code/PatternAnalysis-2023/recognition/Detect_lesions_s4770608/save_weights/2023-10-19_17-03-26/epoch34.pt'))
maskrcnn_model.eval()

# Dataset paths and test-time transforms
test_img_dir = '/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_imgs'
test_gt_dir = '/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_gt'
target_size = 384
val_transform_stage1_for_img_mask = transforms.Compose([
    v2.ToTensor(),
])

imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]
val_transform_stage2_for_img = transforms.Compose([
    v2.Normalize(mean=imagenet_mean, std=imagenet_std)
])


def collate_fn(batch):
    images, targets = zip(*batch)
    images = [img.cuda() for img in images]

    for target in targets:
        target["boxes"] = target["boxes"].cuda()
        target["labels"] = target["labels"].cuda()
        target["masks"] = target["masks"].cuda()

    return list(images), list(targets)
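# Batch structure sketch (batch_size=1, illustrative shapes): collate_fn yields
#   images:  [Tensor(3, H, W)]  moved to the GPU
#   targets: [{'boxes': (1, 4), 'labels': (1,), 'masks': (1, H, W)}]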
test_data = CustomISICDataset(
    csv_file='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_label.csv',
    img_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_imgs',
    mask_dir='/home/ubuntu/works/code/working_proj/2023S3Course/COMP3710/project/data/test_gt',
    transform_stage1_for_img_mask=val_transform_stage1_for_img_mask,
    transform_stage2_for_img=val_transform_stage2_for_img,
    target_size=target_size)

test_data_loader = DataLoader(test_data, batch_size=1, shuffle=False, collate_fn=collate_fn)


def log_predictions_to_output(images, predictions, targets, predicted_label, output_dir, index):
    fig, axs = plt.subplots(1, 3, figsize=(15, 5))
    input_image = images[0].cpu().permute(1, 2, 0).numpy()
    axs[0].imshow(input_image)
    axs[0].set_title("Input Image")
    axs[1].imshow(input_image)
    pred = predictions[0]
    pred_boxes = pred['boxes'].cpu().numpy()
    pred_masks = pred['masks'].cpu().numpy()
    label_map = {
        0: "Melanoma",
        1: "Seborrheic Keratosis",
        2: "Healthy"
    }
    for box, mask in zip(pred_boxes, pred_masks):
        xmin, ymin, xmax, ymax = box
        rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False, color='red')
        axs[1].add_patch(rect)
        axs[1].text(xmin, ymin, label_map[predicted_label.item()], color='red')
        axs[1].imshow(mask[0], alpha=0.7)
    axs[1].set_title("Predicted Boxes and Masks")

    # Draw the ground-truth boxes and masks
    axs[2].imshow(input_image)
    targets = targets[0]
    true_boxes = targets['boxes'].cpu().numpy()
    true_masks = targets['masks'].cpu().numpy()
    true_labels = targets['labels'].cpu().numpy()

    for box, mask, label in zip(true_boxes, true_masks, true_labels):
        xmin, ymin, xmax, ymax = box
        rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False, color='green')
        axs[2].add_patch(rect)
        axs[2].text(xmin, ymin, label_map[label.item() - 1], color='green')
        axs[2].imshow(mask, alpha=0.7)
    axs[2].set_title("True Boxes and Masks")

    for ax in axs:
        ax.axis('off')

    plt.tight_layout()
    plt.savefig(os.path.join(output_dir, f'{index}.png'))
    plt.close(fig)
with torch.no_grad():
    pbar_val = tqdm.tqdm(test_data_loader, 'Test', leave=False)
    all_ious = []
    all_accuracies = []
    for i, (images, targets) in enumerate(pbar_val):
        predictions = maskrcnn_model(images)
        if len(predictions[0]['boxes']) == 0:
            all_ious.append(0)
            all_accuracies.append(0)
            print('zero prediction occurs')
            continue
        predictions = select_best_prediction(predictions)
        iou = calculate_iou_bbox(predictions[0]["boxes"].cpu().numpy()[0], targets[0]["boxes"].cpu().numpy()[0])
        all_ious.append(iou)
        classify_result = image_classifier(torch.stack(images)).argmax(1)
        labels = torch.tensor([t['labels'] - 1 for t in targets]).cuda()
        accuracy = compute_accuracy(classify_result, labels)
        print(classify_result, labels)
        all_accuracies.append(accuracy)
        # Save a visualization for every test image
        log_predictions_to_output(images, predictions, targets=targets, predicted_label=classify_result,
                                  output_dir=args.output_path, index=i)
    mean_iou = sum(all_ious) / len(all_ious)
    mean_accuracy = sum(all_accuracies) / len(all_accuracies)
print({"Test Mean IoU": mean_iou, "Test Mean Accuracy": mean_accuracy})
diff --git a/recognition/Detect_lesions_s4770608/requirements.txt b/recognition/Detect_lesions_s4770608/requirements.txt
new file mode 100644
index 000000000..bb3f2fb7d
--- /dev/null
+++ b/recognition/Detect_lesions_s4770608/requirements.txt
@@ -0,0 +1,100 @@
addict==2.4.0
appdirs==1.4.4
asttokens==2.2.1
backcall==0.2.0
certifi==2022.12.7
charset-normalizer==2.1.1
click==8.1.7
cmake==3.25.0
coloredlogs==15.0.1
contourpy==1.0.7
cycler==0.11.0
decorator==5.1.1
docker-pycreds==0.4.0
easydict==1.10
efficientnet-pytorch==0.7.1
executing==1.2.0
filelock==3.9.0
flatbuffers==23.3.3
fonttools==4.39.3
fsspec==2023.9.2
gitdb==4.0.10
GitPython==3.1.37
huggingface-hub==0.18.0
humanfriendly==10.0
idna==3.4
imageio==2.28.1
ipython==8.11.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
kiwisolver==1.4.4
lazy_loader==0.2
lit==15.0.7
MarkupSafe==2.1.2
matplotlib==3.7.1
matplotlib-inline==0.1.6
mmcv-full==1.7.0
monai==1.1.0
mpmath==1.2.1
munch==4.0.0
networkx==3.0
nibabel==5.1.0
numpy==1.24.1
onnxruntime==1.14.1
opencv-python==4.7.0.72
packaging==23.1
pandas==2.1.1
parso==0.8.3
pathtools==0.1.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.3.0
pip==23.0.1
pretrainedmodels==0.7.4
prompt-toolkit==3.0.38
protobuf==3.20.3
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
Pygments==2.15.1
pyparsing==3.0.9
python-dateutil==2.8.2
pytz==2023.3.post1
PyWavelets==1.4.1
PyYAML==6.0
requests==2.28.1
safetensors==0.4.0
scikit-image==0.20.0
scikit-learn==1.2.2
scipy==1.10.1
segmentation-models-pytorch==0.3.3
sentry-sdk==1.32.0
setproctitle==1.3.3
setuptools==66.0.0
shapely==2.0.2
SimpleITK==2.2.1
six==1.16.0
smmap==5.0.1
stack-data==0.6.2
sympy==1.11.1
tensorboardX==2.6
terminaltables==3.1.10
threadpoolctl==3.1.0
tifffile==2023.4.12
timm==0.9.2
tomli==2.0.1
torch==2.0.0+cu118
torchmetrics==0.11.4
torchsummary==1.5.1
torchvision==0.15.1+cu118
tqdm==4.65.0
traitlets==5.9.0
triton==2.0.0
typing_extensions==4.4.0
tzdata==2023.3
urllib3==1.26.13
wandb==0.15.12
wcwidth==0.2.6
wheel==0.38.4
yapf==0.33.0