Changes from all commits
87 commits
c3aff8b
Added recognition branch and README for info.
shakes76 Sep 17, 2023
8a02bfc
set up repo
ShanJiang929 Oct 3, 2023
5f66726
add readme
ShanJiang929 Oct 3, 2023
ec949cd
Fix readme format
ShanJiang929 Oct 3, 2023
4d425c1
Fix readme format
ShanJiang929 Oct 3, 2023
47fb2ca
Downsized all data
Ericliol Oct 14, 2023
217593a
new approach: split train and validation using train AD data
Ericliol Oct 14, 2023
293f1db
normalise training and validation data
Ericliol Oct 14, 2023
566359b
define test data
Ericliol Oct 14, 2023
b319178
Create low-res and high-res for train and validation data
Ericliol Oct 14, 2023
9a65e5e
build model
Ericliol Oct 14, 2023
02f24d0
Add utility functions
Ericliol Oct 14, 2023
24c31bb
define ESPCNCallback
Ericliol Oct 14, 2023
c779e7c
define more callbacks
Ericliol Oct 14, 2023
dfac2c2
added model training code and comments
Ericliol Oct 14, 2023
f67f5e4
added code to plot loss and metric during training
Ericliol Oct 14, 2023
7ae051d
change crop size
Ericliol Oct 14, 2023
9827d9d
changed epoch number
Ericliol Oct 15, 2023
096733b
added code for testing
Oct 15, 2023
09f0fa6
Fixed bug: undefined test path
Oct 15, 2023
1bf4a6b
fixed bug: invalid check point directory
Ericliol Oct 15, 2023
e52a9c7
added prediction file
Ericliol Oct 17, 2023
736dabf
added some comment
Ericliol Oct 17, 2023
c375da5
added some comment
Ericliol Oct 17, 2023
0527fc7
Create README.md
ShanJiang929 Oct 17, 2023
321a34c
Update README.md
ShanJiang929 Oct 17, 2023
53cb22e
Update README.md
ShanJiang929 Oct 17, 2023
4a6eca4
Update README.md
ShanJiang929 Oct 17, 2023
5725810
Finish Introduction in readme
ShanJiang929 Oct 18, 2023
6ce2660
Reformat readme
ShanJiang929 Oct 18, 2023
d7ba5a6
Reformat readme
ShanJiang929 Oct 18, 2023
5f8a2b4
Reformat readme
ShanJiang929 Oct 18, 2023
a4f9955
Update README.md
ShanJiang929 Oct 18, 2023
3ee0785
added code to save loss plot
ShanJiang929 Oct 21, 2023
a31ff15
Merge branch 'topic-recognition' of https://github.com/ShanJiang929/P…
ShanJiang929 Oct 21, 2023
a41d1c8
added code to save loss plot
ShanJiang929 Oct 21, 2023
79c939f
Move code for test data loading into dataset.py
ShanJiang929 Oct 21, 2023
ec12619
Move code for prediction data loading into dataset.py
ShanJiang929 Oct 21, 2023
de71ae4
Implemented saving prediction result
ShanJiang929 Oct 21, 2023
6796d3d
added requirement file
ShanJiang929 Oct 21, 2023
1845b3e
cleaned code in dataset.py and write readme for dataset.py
ShanJiang929 Oct 21, 2023
d276528
Added images for readme
ShanJiang929 Oct 21, 2023
b9bc6ec
Added images into readme
ShanJiang929 Oct 21, 2023
43af106
Delete recognition/SuperResolutionShanJiang/README.md
ShanJiang929 Oct 21, 2023
8566991
Create README.md
ShanJiang929 Oct 21, 2023
2a7d3cc
Delete recognition/SuperResolutionShanJiang/readme.md
ShanJiang929 Oct 21, 2023
0127bf0
Delete recognition/SuperResolutionShanJiang/README.md
ShanJiang929 Oct 21, 2023
97560af
Added readme back in
Ericliol Oct 21, 2023
96cc4d3
Changed path for images in readme
Ericliol Oct 21, 2023
8a837e7
Added model information in readme
Ericliol Oct 21, 2023
d250a7c
Added description for utils.py in readme
Ericliol Oct 22, 2023
3f6ae63
Fixed: requirement file cannot be executed
Ericliol Oct 22, 2023
f54177c
Added section for train.py in readme
Ericliol Oct 22, 2023
a65958c
changed path in files
Ericliol Oct 22, 2023
c94acba
added section for prediction in readme
Ericliol Oct 22, 2023
69b09e7
Test out image display in readme
Ericliol Oct 22, 2023
5b64f95
Test out image display in readme
Ericliol Oct 22, 2023
fe8e43a
Test out image display in readme
Ericliol Oct 22, 2023
01e4139
Test out image display in readme
Ericliol Oct 22, 2023
c3ae62e
Test out image display in readme
ShanJiang929 Oct 22, 2023
1cf5f8c
Test out image display in readme
ShanJiang929 Oct 22, 2023
7bb1d27
Test out image display in readme
ShanJiang929 Oct 22, 2023
8f99518
Test out image display in readme
ShanJiang929 Oct 22, 2023
a625002
Test out image display in readme
ShanJiang929 Oct 22, 2023
5fb8c6f
Test out image display in readme
ShanJiang929 Oct 22, 2023
3d4566a
Test out image in readme
ShanJiang929 Oct 23, 2023
8668664
Adds file to test image display in readme
ShanJiang929 Oct 23, 2023
6edb048
Delete recognition/SuperResolutionShanJiang/lake.jpg
ShanJiang929 Oct 23, 2023
9290524
Test out image in readme
ShanJiang929 Oct 23, 2023
ecc0555
Merge branch 'topic-recognition' of https://github.com/ShanJiang929/P…
ShanJiang929 Oct 23, 2023
0abd5c4
Test out image in readme
ShanJiang929 Oct 23, 2023
cc647b2
Test out image in readme
ShanJiang929 Oct 23, 2023
905aac9
Test out image in readme
ShanJiang929 Oct 23, 2023
befef2f
Test out image in readme
ShanJiang929 Oct 23, 2023
97cea08
Test out image in readme
ShanJiang929 Oct 23, 2023
0bdd215
Test out image in readme
ShanJiang929 Oct 23, 2023
ff2b693
Test out image in readme
ShanJiang929 Oct 23, 2023
89bd9b9
Test out image in readme
ShanJiang929 Oct 23, 2023
8facf9d
Test out image in readme
ShanJiang929 Oct 23, 2023
2f361ac
Delete unwanted files
ShanJiang929 Oct 23, 2023
852a103
Uploaded loss plot for readme
ShanJiang929 Oct 23, 2023
11c8149
Added line number and more images in readme
ShanJiang929 Oct 23, 2023
2980d04
Fixed errors in readme
ShanJiang929 Oct 23, 2023
9510efa
added references
ShanJiang929 Oct 23, 2023
207eefd
Added link for dataset in readme
ShanJiang929 Oct 23, 2023
7a1c803
split a paragraph in readme
ShanJiang929 Oct 23, 2023
f93ffc1
Added note for marker in readme
ShanJiang929 Oct 23, 2023
Empty file added git
Empty file.
10 changes: 10 additions & 0 deletions recognition/README.md
@@ -0,0 +1,10 @@
# Recognition Tasks
Various recognition tasks solved in deep learning frameworks.

Tasks may include:
* Image Segmentation
* Object detection
* Graph node classification
* Image super resolution
* Disease classification
* Generative modelling with StyleGAN and Stable Diffusion
71 changes: 71 additions & 0 deletions recognition/SuperResolutionShanJiang/README.md
@@ -0,0 +1,71 @@
Note: the images in this repo, and hence in this readme, may not load properly; please check the Turnitin submission of the readme file to view the images included in this document. Sorry for the inconvenience.
# Brain MRI super-resolution network
## Introduction
This project implements a super-resolution CNN trained on the ADNI brain dataset. The trained model converts a low-resolution image into a high-resolution version. The CNN consists of four convolutional layers followed by a depth-to-space transformation. The dataset used for training and testing is the ADNI dataset, which contains 2D MRI slices for both Alzheimer's disease patients (AD) and healthy controls (HC). For our purposes, the AD and HC data are combined into one dataset, since the model does not perform classification. Training achieves a mean PSNR of 28.82 with a loss of 0.0013; testing achieves a mean PSNR of 27.56, which is higher than the mean PSNR of the low-resolution baseline images (25.96).
## Getting Started
### Install the required dependencies
pip install -r recognition/SuperResolutionShanJiang/requirements.txt
### Loading dataset
The dataset used for training (and validation) and testing is loaded in `dataset.py`. The images are cropped to a specified size (256×248 in our case), and 20% of the training dataset is reserved for validation. Pixel values of training and validation images are rescaled to the range 0 to 1. A list of the paths of the testing and prediction images is also created for later use. The dataset can be found at [ADNI MRI Dataset (2D slices)](https://cloudstor.aarnet.edu.au/plus/s/L6bbssKhUoUdTSI).

Then we produce pairs of high-resolution and low-resolution images from the training and validation datasets. To get the high-resolution image, we convert the image from the RGB colour space to the YUV colour space and keep only the Y channel. To get the low-resolution version, we convert the image from RGB to YUV, keep only the Y channel, and resize it by a given ratio (4 in our case) so that its resolution is reduced. Each pair is put into a tuple to be fed into the model for training (a minimal sketch of this conversion follows the example images below). The following images show an example of a high-resolution image and the corresponding low-resolution image.
![A high resolution MRI image](./readme_images/high_res_train.png)
![A low resolution MRI image](./readme_images/low_res_train.png)
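The conversion described above can be summarised in a few lines. The sketch below is only an illustration (the function name `make_pair` is ours; the project's actual implementation is the `process_input`/`process_target` pair shown in `dataset.py` further down):

```python
import tensorflow as tf

def make_pair(rgb_image, crop_h=248, crop_w=256, upscale_factor=4):
    """Illustrative sketch: produce one (low-res, high-res) training pair
    from a single RGB image tensor whose pixel values are already in [0, 1]."""
    # Keep only the luminance (Y) channel of the YUV representation.
    y, _, _ = tf.split(tf.image.rgb_to_yuv(rgb_image), 3, axis=-1)
    high_res = y  # the target keeps the full crop resolution
    low_res = tf.image.resize(  # the input is downsampled by upscale_factor
        y, [crop_h // upscale_factor, crop_w // upscale_factor], method="area"
    )
    return low_res, high_res
```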
To run `dataset.py`, follow these steps:
1. Create a folder in the same directory as the python file and put the training images inside. Specify the exact directory of this folder at line 20 by altering the value of the variable `training_dir`.
2. Create a folder in the same directory as the python file and put the testing images inside. Specify the exact directory of this folder at line 80 by altering the value of the variable `test_path`.
3. Create a folder in the same directory as the python file and put the prediction images inside. Specify the exact directory of this folder at line 95 by altering the value of the variable `prediction_path`. These images are used to provide a demo of the model's prediction results in `predict.py`.
4. (Optional) Change the value of `upscale_factor` at line 14 to downsample your training and validation images by a different ratio.
5. Change the values of `crop_width_size` at line 16 and `crop_height_size` at line 15 to make sure they are less than or equal to the original width and height of the images and are divisible by `upscale_factor`.
6. (Optional) Adjust the batch size for training and validation by changing the value of `batch_size` at line 17.
7. Run `dataset.py`.
### Building model
The model structure is defined in `modules.py` using the Keras framework. The structure of the model is as follows:
- First layer: a convolutional layer with 64 filters and a kernel size of 5 to extract features.
- Second layer: a convolutional layer with 64 filters and a kernel size of 3 to extract features.
- Third layer: a convolutional layer with 32 filters and a kernel size of 3 to extract features.
- Fourth layer: a convolutional layer with `channels * (upscale_factor ** 2)` filters and a kernel size of 3 to increase spatial resolution.
- Depth-to-space operation: TensorFlow's `tf.nn.depth_to_space` function performs a depth-to-space upscaling by the specified `upscale_factor` to produce the super-resolved image at the higher resolution.

Note: for best performance, keep the value of the `upscale_factor` parameter of `get_model` (default 4) the same as the value of `upscale_factor` defined in `dataset.py`, and keep `channels` at its default value (1).
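A quick sanity check of the model, assuming `modules.py` is importable from the working directory:

```python
import numpy as np
from modules import get_model

model = get_model(upscale_factor=4, channels=1)
model.summary()

# A 62x64 single-channel input should come out 4x larger in each spatial dimension.
dummy = np.zeros((1, 62, 64, 1), dtype="float32")
print(model(dummy).shape)  # expected: (1, 248, 256, 1)
```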
### Utilities
Two functions are defined in `utils.py` (a minimal sketch of both is given after this list).
- `get_lowres_image(img, upscale_factor)` downsamples the given `img` by the ratio `upscale_factor`. It is later used in `train.py` to convert testing images to low-resolution images.
- `upscale_image(model, img)` preprocesses the given `img` and uses the given `model` to increase its resolution. The preprocessing converts the image into the YCbCr colour space, isolates and normalises (divides by 255) the Y channel, and reshapes the Y-channel array so that its shape matches the model's input shape. The prediction output from the model is denormalised (multiplied by 255) and restored to the RGB colour space.
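A minimal sketch of the two utilities as described above; the actual implementations live in `utils.py` and follow the cited Keras example, so details may differ slightly:

```python
import numpy as np
import PIL.Image
from keras.utils import img_to_array

def get_lowres_image(img, upscale_factor):
    """Downsample a PIL image by the given factor (bicubic resampling assumed)."""
    return img.resize(
        (img.size[0] // upscale_factor, img.size[1] // upscale_factor),
        PIL.Image.BICUBIC,
    )

def upscale_image(model, img):
    """Predict a higher-resolution version of a PIL image with the model."""
    ycbcr = img.convert("YCbCr")
    y, cb, cr = ycbcr.split()
    y = img_to_array(y) / 255.0                  # normalise the Y channel
    model_input = np.expand_dims(y, axis=0)      # add a batch dimension
    out = model.predict(model_input)[0] * 255.0  # denormalise the prediction
    out = out.clip(0, 255).reshape(out.shape[0], out.shape[1])
    out_y = PIL.Image.fromarray(np.uint8(out), mode="L")
    # Upscale the chroma channels to match, then merge back to RGB.
    cb = cb.resize(out_y.size, PIL.Image.BICUBIC)
    cr = cr.resize(out_y.size, PIL.Image.BICUBIC)
    return PIL.Image.merge("YCbCr", (out_y, cb, cr)).convert("RGB")
```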
### Model training
Model training, validation and testing are implemented in `train.py`.
#### Training and validation
The `ESPCNCallback` class is used to monitor and display the mean PSNR after each epoch, and plots of the loss (for both training and validation) versus epoch number are saved to a specified directory every 10 epochs. Mean squared error is used as the loss function and Adam is used as the optimiser. `early_stopping_callback` is set so that training stops automatically if the loss does not improve for 10 consecutive epochs. During training, the best model weights (those giving the minimal loss) are saved to a specified path. The following image is the plot of the loss over epochs for the entire training process (epochs 1 to 60).
![loss for each epoch during training](./readme_images/loss_plot.png)
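A condensed sketch of this setup, based on the description above (the complete version, including `ESPCNCallback` and the loss-plot saving, is in `train.py`; the learning rate shown here is an assumption):

```python
from keras import callbacks, losses, optimizers
from modules import get_model
from dataset import train_ds, valid_ds

checkpoint_filepath = "path/to/checkpoint/"  # placeholder path

model = get_model(upscale_factor=4, channels=1)
model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss=losses.MeanSquaredError())

early_stopping_callback = callbacks.EarlyStopping(monitor="loss", patience=10)
checkpoint_callback = callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    monitor="loss",
    save_best_only=True,       # keep only the weights giving the lowest loss
    save_weights_only=True,
)

model.fit(train_ds, epochs=60, validation_data=valid_ds,
          callbacks=[early_stopping_callback, checkpoint_callback], verbose=2)
```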
To run model training, do the following in `train.py`:
1. Make sure the value of `upscale_factor` at line 12 is the same as the one defined in `dataset.py`.
2. Make sure the training dataset is well defined in `dataset.py` (refer to "Loading dataset" in this doc).
3. Create an empty folder in the same directory as the python files to save the weights. Specify the exact directory of this folder by altering the value of the variable `checkpoint_filepath` at line 16. Make sure to add a "/" at the end of the path, for example: "exact/path/to/the/folder/".
4. Create an empty folder in the same directory as the python files to save the loss plots. Specify the exact directory of this folder by altering the value of the variable `loss_plot_path` at line 13. Make sure to add a "/" at the end of the path, for example: "exact/path/to/the/folder/".
5. Comment out the code for testing (from line 88 to line 117).
6. Run `train.py`.
#### Testing
During model testing, the images are first downsampled by passing them to the function `get_lowres_image(img, upscale_factor)`, and a reconstructed high-resolution version is then predicted using the model. The average PSNR of the low-resolution images and of the predictions are calculated to verify the effectiveness of the model (the PSNR of the predictions should be higher than that of the low-resolution images).
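A minimal sketch of that evaluation loop, assuming the trained weights have been saved and the helpers from `utils.py` and `dataset.py` are available (the checkpoint path below is a placeholder):

```python
import tensorflow as tf
from keras.utils import load_img, img_to_array
from modules import get_model
from utils import get_lowres_image, upscale_image
from dataset import get_test_img_paths, upscale_factor

model = get_model()
model.load_weights("path/to/checkpoint/")  # placeholder path

total_lowres_psnr, total_pred_psnr, n = 0.0, 0.0, 0
for path in get_test_img_paths():
    img = load_img(path)
    lowres = get_lowres_image(img, upscale_factor)
    w, h = lowres.size[0] * upscale_factor, lowres.size[1] * upscale_factor
    highres_arr = img_to_array(img.resize((w, h)))
    naive_arr = img_to_array(lowres.resize((w, h)))        # naive upsizing baseline
    pred_arr = img_to_array(upscale_image(model, lowres))  # model reconstruction
    total_lowres_psnr += tf.image.psnr(naive_arr, highres_arr, max_val=255)
    total_pred_psnr += tf.image.psnr(pred_arr, highres_arr, max_val=255)
    n += 1

print("Avg. PSNR of low-res images:  %.4f" % (total_lowres_psnr / n))
print("Avg. PSNR of reconstructions: %.4f" % (total_pred_psnr / n))
```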
To run model testing, do the following in `train.py`:
1. Make sure the value of `upscale_factor` at line 12 is the same as the one defined in `dataset.py`.
2. Make sure the model has been trained and its weights have been saved (see the training part).
3. Make sure the testing dataset is well defined in `dataset.py` (refer to "Loading dataset" in this doc).
4. Comment out the code for training (from line 18 to line 85).
5. Run `train.py`.
### Prediction
Example usage of this model is shown in `predict.py`. In this file, 10 images from the testing dataset are chosen to be downsampled and then predicted with the model. For each image, we show the low-resolution version, the high-resolution version and the prediction in one figure, which is saved to a specified directory. The following figure shows an example. ![prediction figure](./readme_images/prediction.jpeg)
To run this file, do the following in `predict.py`:
1. Make sure the model has been trained and its weights have been saved (see the training part).
2. Make sure the value of `checkpoint_filepath` at line 17 is the same as the `checkpoint_filepath` defined in `train.py`.
3. Make sure the prediction dataset is well defined in `dataset.py` (refer to "Loading dataset" in this doc).
4. Create an empty folder in the same directory as the python files to save the example figures. Specify the exact directory of this folder by altering the value of the variable `prediction_result_path` at line 22. Make sure to add a "/" at the end of the path, for example: "exact/path/to/the/folder/".
5. Run `predict.py`.
### References
Long X. (2020). Image Super-Resolution using an Efficient Sub-Pixel CNN. https://keras.io/examples/vision/super_resolution_sub_pixel/







117 changes: 117 additions & 0 deletions recognition/SuperResolutionShanJiang/dataset.py
@@ -0,0 +1,117 @@
import tensorflow as tf
import os
from tensorflow import keras
from keras import layers
from keras.utils import load_img
from keras.utils import array_to_img
from keras.utils import img_to_array
from keras.preprocessing import image_dataset_from_directory
from IPython.display import display

# Reference
""" Title: Image Super-Resolution using an Efficient Sub-Pixel CNN
Author: Xingyu Long
Date: 28/07/2020
Availability: https://keras.io/examples/vision/super_resolution_sub_pixel/"""

#Set parameters for cropping
crop_width_size = 256
crop_height_size = 248
upscale_factor = 4 # ratio by which original images are downsampled for training and by which images are upscaled at prediction
input_height_size = crop_height_size // upscale_factor
input_width_size = crop_width_size // upscale_factor
batch_size = 8

#Specify directory containing training dataset
training_dir = "D:/temporary_workspace/comp3710_project/PatternAnalysis_2023_Shan_Jiang/recognition/SuperResolutionShanJiang/train_dataset"

# Create training dataset
train_ds = image_dataset_from_directory(
    training_dir,
    batch_size=batch_size,
    image_size=(crop_height_size, crop_width_size),
    validation_split=0.2,
    subset="training",
    seed=1337,
    label_mode=None,
)

#Create validation dataset
valid_ds = image_dataset_from_directory(
    training_dir,
    batch_size=batch_size,
    image_size=(crop_height_size, crop_width_size),
    validation_split=0.2,
    subset="validation",
    seed=1337,
    label_mode=None,
)

# Rescale training and validation images to take values in the range [0, 1].
def scaling(input_image):
    input_image = input_image / 255.0
    return input_image

train_ds = train_ds.map(scaling)
valid_ds = valid_ds.map(scaling)

# A function that keeps only the Y (luminance) channel of the given image and resizes it to the low-resolution input size
def process_input(input, input_height_size, input_width_size):
    input = tf.image.rgb_to_yuv(input)
    last_dimension_axis = len(input.shape) - 1
    y, u, v = tf.split(input, 3, axis=last_dimension_axis)
    return tf.image.resize(y, [input_height_size, input_width_size], method="area")

# A function that keeps only the Y (luminance) channel of the given image (the high-resolution target)
def process_target(input):
    input = tf.image.rgb_to_yuv(input)
    last_dimension_axis = len(input.shape) - 1
    y, u, v = tf.split(input, 3, axis=last_dimension_axis)
    return y


# Process train dataset: create low resolution images and corresponding high resolution images, and put the pair into a tuple
train_ds = train_ds.map(
    lambda x: (process_input(x, input_height_size, input_width_size), process_target(x))
)
train_ds = train_ds.prefetch(buffer_size=32)

# Process validation dataset: create low resolution images and corresponding high resolution images, and put the pair into a tuple
valid_ds = valid_ds.map(
    lambda x: (process_input(x, input_height_size, input_width_size), process_target(x))
)
valid_ds = valid_ds.prefetch(buffer_size=32)

#Specify directory containing testing dataset
test_path = 'D:/temporary_workspace/comp3710_project/PatternAnalysis_2023_Shan_Jiang/recognition/SuperResolutionShanJiang/test_dataset'
#Put path of each testing image into a sorted list
test_img_paths = sorted(
    [
        os.path.join(test_path, fname)
        for fname in os.listdir(test_path)
        if fname.endswith(".jpeg")
    ]
)

#return a list containing path of each image for testing
def get_test_img_paths():
    return test_img_paths

#Specify directory containing prediction dataset
prediction_path = "D:/temporary_workspace/comp3710_project/PatternAnalysis_2023_Shan_Jiang/recognition/SuperResolutionShanJiang/prediction_dataset"
#Put path of each prediction image into a sorted list
prediction_path = sorted(
    [
        os.path.join(prediction_path, fname)
        for fname in os.listdir(prediction_path)
        if fname.endswith(".jpeg")
    ]
)


# return a list containing path of each image to be predicted
def get_prediction_img_paths():
    return prediction_path



34 changes: 34 additions & 0 deletions recognition/SuperResolutionShanJiang/modules.py
@@ -0,0 +1,34 @@
import tensorflow as tf
from tensorflow import keras
from keras import layers


# Reference
""" Title: Image Super-Resolution using an Efficient Sub-Pixel CNN
Author: Xingyu Long
Date: 28/07/2020
Availability: https://keras.io/examples/vision/super_resolution_sub_pixel/"""

def get_model(upscale_factor=4, channels=1):
    """Build a super-resolution model.

    Args:
        upscale_factor: ratio by which to upscale the image. Defaults to 4.
        channels: number of channels. Defaults to 1.

    Returns:
        keras.Model: the super-resolution model
    """
    conv_args = {
        "activation": "relu",
        "kernel_initializer": "Orthogonal",
        "padding": "same",
    }
    inputs = keras.Input(shape=(None, None, channels))
    x = layers.Conv2D(64, 5, **conv_args)(inputs)
    x = layers.Conv2D(64, 3, **conv_args)(x)
    x = layers.Conv2D(32, 3, **conv_args)(x)
    x = layers.Conv2D(channels * (upscale_factor ** 2), 3, **conv_args)(x)
    outputs = tf.nn.depth_to_space(x, upscale_factor)

    return keras.Model(inputs, outputs)
92 changes: 92 additions & 0 deletions recognition/SuperResolutionShanJiang/predict.py
@@ -0,0 +1,92 @@
from utils import *
from modules import *
from dataset import *
import matplotlib.pyplot as plt
import matplotlib.image as mpimg


from tensorflow import keras
from keras import layers
from keras.utils import load_img
from keras.utils import array_to_img
import os
import math


# Reference
""" Title: Image Super-Resolution using an Efficient Sub-Pixel CNN
Author: Xingyu Long
Date: 28/07/2020
Availability: https://keras.io/examples/vision/super_resolution_sub_pixel/"""

# load the trained model
checkpoint_filepath= "D:/temporary_workspace/comp3710_project/PatternAnalysis_2023_Shan_Jiang/recognition/SuperResolutionShanJiang/checkpoint/"
model = get_model()
model.load_weights(checkpoint_filepath)

# Specify path to store prediction result
prediction_result_path = "D:/temporary_workspace/comp3710_project/PatternAnalysis_2023_Shan_Jiang/recognition/SuperResolutionShanJiang/prediction_result/"


# Downsample the resolution of the images by a factor of 4, then predict a higher-resolution image using the model
total_bicubic_psnr = 0.0 # PSNR of downsampled image
total_test_psnr = 0.0 # PSNR of model output



for index, prediction_img_path in enumerate(get_prediction_img_paths()):
    img = load_img(prediction_img_path)
    lowres_input = get_lowres_image(img, upscale_factor)  # downsample
    w = lowres_input.size[0] * upscale_factor
    h = lowres_input.size[1] * upscale_factor
    highres_img = img.resize((w, h))
    prediction = upscale_image(model, lowres_input)  # predict
    lowres_img = lowres_input.resize((w, h))
    lowres_img_arr = img_to_array(lowres_img)
    highres_img_arr = img_to_array(highres_img)
    predict_img_arr = img_to_array(prediction)
    bicubic_psnr = tf.image.psnr(lowres_img_arr, highres_img_arr, max_val=255)
    test_psnr = tf.image.psnr(predict_img_arr, highres_img_arr, max_val=255)
    print("higher resolution")
    display(array_to_img(highres_img))
    print("lower resolution")
    display(array_to_img(lowres_img))
    print("prediction")
    display(array_to_img(prediction))
    # array_to_img(prediction).show()

    total_bicubic_psnr += bicubic_psnr
    total_test_psnr += test_psnr

    image1 = array_to_img(highres_img)
    image2 = array_to_img(lowres_img)
    image3 = array_to_img(prediction)

    # Create a figure with three subplots
    fig, axes = plt.subplots(1, 3, figsize=(12, 4))

    # Display the high resolution image in the first subplot
    axes[0].imshow(image1)
    axes[0].set_title('high resolution')

    # Display the low resolution image in the second subplot
    axes[1].imshow(image2)
    axes[1].set_title('low resolution')

    # Display the prediction image in the third subplot
    axes[2].imshow(image3)
    axes[2].set_title('prediction')

    # Adjust spacing between subplots
    plt.tight_layout()

    # Save the plot, then close the figure to free memory
    filename = os.path.basename(prediction_img_path)
    plt.savefig(prediction_result_path + filename)
    plt.close(fig)

    # PSNR of the current image pair
    print("PSNR of lowres image is %.4f" % bicubic_psnr)
    print("PSNR of reconstruction is %.4f" % test_psnr)

print("Avg. PSNR of lowres images is %.4f" % (total_bicubic_psnr / 10))
print("Avg. PSNR of reconstructions is %.4f" % (total_test_psnr / 10))
(Four added binary files, likely the readme images, cannot be displayed in the diff view.)
5 changes: 5 additions & 0 deletions recognition/SuperResolutionShanJiang/requirements.txt
@@ -0,0 +1,5 @@
tensorflow >= 2.14.0
IPython >=8.16.1
matplotlib >= 3.7.1
Pillow >= 9.5.0
numpy >= 1.24.2