Super-resolution

DEM resolution can be increased with machine learning techniques.

Contents

Input data
Data preprocessing for training
Test data
Types of Super-resolution algorithm
Previous attempts
VDSR-based super-resolution DEM
Next Steps
Publications and Implementations

Input data

Super-resolution (SR) machine learning techniques were developed based on well-known image datasets like DIV2K. These contain PNG images with three layers (RGB - red, green and blue) and 8-bit values (0-255).

This is not the case for DEMs, where we have a single layer with float (or at least 16-bit) values. Common geodata formats such as GeoTIFF cannot be read out of the box by the image libraries used in machine learning frameworks, so the code of most SR implementations needs to be adjusted.

Have a look at this Jupyter Notebook to see how to handle GeoTIFF files with Tensorflow. Thanks to Zushicat for the support.

Furthermore, for most algorithms, downscale processing plays an important role:

"DIV2K bicubic downscaled images have been created with the MATLAB imresize function. It is important that you create downscaled versions of Set5 images in the very same way [...] For example, to bicubic downscale by a factor of 4 use imresize(x, 0.25, method='bicubic') and then feed the downscaled image into a pre-trained model" (source: krasserm/super-resolution).

With bicubic downscaling, the information of the high-resolution image is therefore partly preserved in each pixel of the downscaled image. However, this is not a realistic assumption for SR on DEMs: in many cases, we do not know exactly how a low-resolution DEM was derived from a high-resolution dataset. Nearest-neighbour downsampling is the best-case scenario, because the original values are preserved.
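The difference can be illustrated with a few lines of NumPy (a synthetic 9×9 tile, downscaling factor 3; the block mean stands in for a smoothing resampler such as bicubic):

```python
import numpy as np

rng = np.random.default_rng(0)
hr = rng.normal(500.0, 50.0, size=(9, 9)).astype(np.float32)  # fake 9x9 DEM tile

# Nearest-neighbour downscaling by factor 3: every third sample is kept,
# so each low-resolution value is an untouched original height.
lr_nn = hr[::3, ::3]

# A smoothing resampler (here a simple 3x3 block mean for illustration)
# produces blended values that generally do not occur in the original data.
lr_mean = hr.reshape(3, 3, 3, 3).mean(axis=(1, 3))
```

This is why nearest neighbour is the best case for SR training data: the low-resolution grid is an exact subset of the high-resolution grid.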

Let's have a look at an example. The authorities in Bavaria (Germany) offer a range of different digital terrain model products:

| Product | Horizontal resolution [m] | Height accuracy [m] | Licence |
|---------|---------------------------|---------------------|---------|
| DGM 1 | 1 | generally better than ±0.2 | proprietary |
| DGM 2 | 2 | generally better than ±0.2 | proprietary |
| DGM 5 | 5 | generally better than ±0.2 | proprietary |
| DGM 10 | 10 | generally better than ±0.2 | proprietary |
| DGM 25 | 25 | ~ ±2-3 | proprietary |
| DGM 50 | 50 | ~ ±5 | CC BY 3.0 DE |

Tab. 1: DTM (DGM in German) data products from Bavaria (link).

From the test data of the DTMs, you can see that there is no difference between the heights of the proprietary datasets: the data are simply thinned out. The declared height accuracy of DGM 25 is therefore not correct. It is also not possible to reconstruct the DGM 50 values by bicubic or bilinear interpolation of the DGM 25 or DGM 1 datasets. Most importantly, the downscaling preprocessing that SR methods usually assume cannot be guaranteed for real DEM products.

Data preprocessing for training

Using GDAL's gdalwarp with an even downscaling factor and nearest neighbour resampling results in slight shifts in the pixel positions; the same is true for gdal_translate.

Fig. 1: Pixel shift with gdalwarp: nearest neighbour downsampling with resolution 40 m vs. 10 m (use mouse-over to see the effect).

It is therefore better to downscale by an odd factor such as three to reduce the sub-pixel shift; otherwise, the images cannot be compared pixel by pixel.
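The effect of the downscaling factor on the sub-pixel shift can be seen from a little pixel-centre arithmetic (a sketch assuming a pixel size of 1 and a shared grid origin; `centre_offset` is a hypothetical helper, not part of GDAL):

```python
# Offset between an output pixel centre and the nearest input pixel centre,
# for a grid downscaled by factor f (pixel size 1.0, shared origin).
def centre_offset(f):
    out_centre = f * 0.5          # centre of the first output pixel
    # input pixel centres lie at 0.5, 1.5, 2.5, ...
    nearest = round(out_centre - 0.5) + 0.5
    return abs(out_centre - nearest)

print(centre_offset(2))  # 0.5 -> even factor: centre falls on a pixel corner
print(centre_offset(3))  # 0.0 -> odd factor: centre coincides with an input centre
print(centre_offset(4))  # 0.5
```

For an odd factor, every output pixel centre coincides with an input pixel centre, so nearest-neighbour resampling keeps the grids aligned; for an even factor, the centre falls on a pixel corner and the resampler has to pick a neighbour half a pixel away.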

Test data

A free Austrian DTM dataset (10 m) was used for processing.

test area austria

Fig. 2: Training and validation data for regions in Austria.

The data were downscaled with GDAL from 10 m to 30 m, using nearest neighbour interpolation, and then upsampled with cubic interpolation back to 10 m. The data can be downloaded here (under licence: Geoland.at (2020) Attribution 4.0 International, CC BY 4.0).
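The actual processing was done with GDAL; purely as an illustration, the same downscale/upsample pipeline can be sketched with NumPy and SciPy on synthetic data (`scipy.ndimage.zoom` with `order=3` gives cubic interpolation):

```python
import numpy as np
from scipy.ndimage import zoom

rng = np.random.default_rng(42)
dem_10m = rng.normal(800.0, 100.0, size=(90, 90)).astype(np.float32)  # fake 10 m tile

# Step 1: thin the 10 m grid out to 30 m (nearest neighbour keeps original heights).
dem_30m = dem_10m[::3, ::3]                      # (30, 30)

# Step 2: cubic upsampling back to the 10 m grid - this is the baseline
# that the super-resolution model has to beat.
dem_cubic = zoom(dem_30m, 3, order=3)            # (90, 90)

rmse = float(np.sqrt(np.mean((dem_10m - dem_cubic) ** 2)))
```

Comparing `dem_cubic` against `dem_10m` then yields the baseline error metrics.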

The pixel values in the training area have an uneven distribution (see below), which is not ideal for machine learning.

Histogram of the test data

Fig. 3: Histogram of pixel values in the training area.

Types of Super-resolution algorithm

In general, SR techniques can be divided into two groups (see Figure 4 below):

1) those that give a high perceptual quality (i.e. the results look good to human beings)

2) those that give high accuracy (i.e. reasonable data values that are good for DEM processing)

Super-resolution quality and accuracy

Fig. 4: Super-resolution algorithms, plotted according to the mean reconstruction accuracy (measured by RMSE values) and mean perceptual quality (source: Yochai Blau, Roey Mechrez and Radu Timofte - The 2018 PIRM Challenge on Perceptual Image Super-resolution)

For SR DEM, the second group is the right choice.

Previous attempts

- Enhanced Deep Residual Networks for Single Image Super-Resolution, implementation by Krasserm (Github)

- Enhanced Deep Residual Networks for Single Image Super-Resolution, implementation by Weber with OpenCV (Github)

- Image Super-Resolution Using Deep Convolutional Networks (SRCNN), implementation by Green (Github)

Unfortunately, no better results could be achieved than with simple cubic upsampling.
Here is a trial with Krasserm's super-resolution EDSR. The algorithm needs an image with three layers (RGB); the usual procedure for greyscale images is simply to stack the same values into three layers. Finally, the three output bands are averaged: (band1 + band2 + band3) / 3.
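This band handling can be sketched in NumPy (the helper names `to_rgb`/`from_rgb` are illustrative, not part of Krasserm's code):

```python
import numpy as np

def to_rgb(dem):
    """Stack a single-band DEM into the three-band (RGB) input EDSR expects."""
    return np.stack([dem, dem, dem], axis=-1)

def from_rgb(sr_rgb):
    """Average the three output bands back into a single band."""
    return sr_rgb.mean(axis=-1)

dem = np.arange(16.0).reshape(4, 4)
rgb = to_rgb(dem)            # shape (4, 4, 3)
merged = from_rgb(rgb)       # identical to dem while all bands are equal
```

After SR processing the three bands generally differ slightly, which is why the statistics below list each band separately as well as the average.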

[Image grid: Original resolution 10 m | Downsampling NN 30 m input | Cubic upsampling 10 m | EDSR 10 m]

Fig. 5: Comparison between the original 10 m dataset, the downsampled 30 m dataset and the derived products.

As can be seen from Fig. 5, the EDSR SR data looks sharper than the simple bicubic upsampling data, and is perhaps slightly too sharp. Statistics for the data are given below.

| | Mean (abs) [m] | Max [m] | Min [m] | STDDEV [m] |
|---|---|---|---|---|
| Original | 385.87 | 503.89 | 310.45 | 32.66 |
| Cubic Upsampling | 385.87 | 503.42 | 311.05 | 32.66 |
| SR Band 1 | 385.60 | 502.68 | 309.08 | 32.61 |
| SR Band 2 | 385.82 | 503.09 | 309.19 | 32.65 |
| SR Band 3 | 386.02 | 503.52 | 308.99 | 32.7 |
| SR (1+2+3)/3 | 385.81 | 503.07 | 309.08 | 32.65 |
| abs(original - derived) | | | | |
| Cubic | 0.16 | 9.69 | 0 | 0.32 |
| SR Band 1 | 0.35 | 13.01 | 0 | 0.35 |
| SR Band 2 | 0.23 | 13.58 | 0 | 0.35 |
| SR Band 3 | 0.25 | 13.53 | 0 | 0.35 |
| SR (1+2+3)/3 | 0.22 | 13.37 | 0 | 0.35 |

Tab. 2: Comparison of statistics for the processed data (for the test area shown in Figure 6)

The results in Table 2 show that with standard EDSR processing, no better results could be achieved than with ordinary cubic interpolation. Be aware that when using QGIS for comparison, the statistics obtained are only approximations. You can see this from the STATISTICS_APPROXIMATE=YES parameter. It is therefore better to use gdalinfo -stats for statistics.

Testarea 1

Fig. 6: Test area for the data shown in Table 2 (EPSG:31287, 435000.0, 460000.0 : 450000.0, 475000.0).

We will now look at a more mountainous region.

| | Mean (abs) [m] | Max [m] | Min [m] | STDDEV [m] |
|---|---|---|---|---|
| Original | 1244.13 | 2641.45 | 525.24 | 478.32 |
| Cubic Upsampling | 1244.13 | 2634.0 | 525.29 | 478.31 |
| SR Band 1 | 1243.86 | 2642.54 | 524.44 | 477.57 |
| SR Band 2 | 1244.06 | 2644.0 | 524.92 | 478.16 |
| SR Band 3 | 1244.2 | 2647.24 | 521.86 | 478.88 |
| SR (1+2+3)/3 | 1244.04 | 2644.59 | 523.74 | 478.2 |
| abs(original - derived) | | | | |
| Cubic | 1.1 | 63.88 | 0 | 1.84 |
| SR Band 1 | 1.51 | 64.37 | 0 | 2.02 |
| SR Band 2 | 1.3 | 63.66 | 0 | 1.99 |
| SR Band 3 | 1.4 | 61.14 | 0 | 2.0 |
| SR (1+2+3)/3 | 1.27 | 63.05 | 0 | 2.0 |

Tab. 3: Comparison of statistics for processed data (for the test area shown in Figure 7)

Testarea 2

Fig. 7: Test area for the data shown in Table 3 (EPSG:31287, 130000.0, 360000.0 : 145000.0, 375000.0)


The results obtained by Müller et al. 2020 with different SR techniques also show that when applied to real data (satellite imagery in this case), cubic upsampling is hard to beat using standard SR techniques.

table results Müller et al.

Tab. 4: Results of different super-resolution techniques compared to bicubic upsampling (Müller et al. 2020: “Super-resolution of Multispectral Satellite Images Using Convolutional Neural Networks”)

For an overview of SR techniques and metrics, have a look at the paper. The peak signal-to-noise ratio (PSNR) is defined in terms of the mean squared error (MSE) and the maximum possible value; higher values are better.
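A minimal NumPy sketch of this definition (the `psnr` helper is illustrative; for DEMs, the maximum elevation of the region can serve as the maximum possible value):

```python
import numpy as np

def psnr(original, derived, max_value):
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((np.asarray(original, float) - np.asarray(derived, float)) ** 2)
    if mse == 0:
        return float("inf")  # identical rasters
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Note that the choice of `max_value` shifts all PSNR values by a constant, so it must be kept fixed when comparing methods.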

VDSR-based super-resolution DEM

The Very Deep Super-Resolution (VDSR) CNN used here is based on the implementation by George Seif.

The underlying architecture was proposed in: Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Accurate image super-resolution using very deep convolutional networks," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.

More about convolutional neural networks.

Code for DEM processing is available from Github:

GitHub

This is a Jupyter Notebook project running on Google Colab with data storage on Google Drive. It could also be adapted to use local resources.

Processing Epochs: 5250

Processing Time: ~ 200 hours

Carbon Footprint: ~ 25 kg CO2, compensation via Atmosfair.de.

Datasets: 2003 training and 101 validation datasets (each 300*300 px). See the Test data section for download and licence.
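Cutting a large raster into such fixed-size training tiles can be sketched as follows (`tile_raster` is a hypothetical helper; the linked notebook contains the actual preprocessing):

```python
import numpy as np

def tile_raster(arr, size=300):
    """Cut a raster into non-overlapping size x size tiles (edge remainders dropped)."""
    rows, cols = arr.shape[0] // size, arr.shape[1] // size
    return [arr[r * size:(r + 1) * size, c * size:(c + 1) * size]
            for r in range(rows) for c in range(cols)]
```

A 600*900 px raster, for example, yields six 300*300 px tiles.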

Results

| | SR | Cubic | SR sliced | Cubic sliced |
|---|---|---|---|---|
| MAE | 95 | 6 | 59 | 42 |
| RMSE | 101 | 0 | 101 | 0 |
| PSNR | 101 | 0 | 101 | 0 |
| SSIM | 80 | 21 | 79 | 22 |

Tab. 5: Comparison of the best results obtained for the metrics of mean absolute error (MAE), root mean squared error (RMSE), peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) for 101 validation datasets using VDSR and cubic upsampling. Datasets were sliced by 15 pixels to reduce border effects.

Overall, the RMSE for the validation dataset was improved by 2.04% (1.482 vs. 1.513), while the difference in MAE was negligible (0.8645 vs. 0.8640). The maximum error was improved by almost 0.5 m.

Statistics for the first ten validation datasets are given below.

| Validation dataset | SR MAE | Cubic MAE | SR RMSE | Cubic RMSE | Max. SR | Max. Cubic |
|---|---|---|---|---|---|---|
| 278490_371780 | 0.759 | 0.761 | 1.188 | 1.208 | 16.131 | 16.734 |
| 293400_360980 | 1.253 | 1.242 | 2.391 | 2.440 | 59.760 | 60.862 |
| 275640_359420 | 0.692 | 0.691 | 1.226 | 1.253 | 23.100 | 23.278 |
| 282360_365720 | 0.803 | 0.810 | 1.248 | 1.281 | 17.579 | 18.204 |
| 278400_377660 | 0.699 | 0.700 | 1.115 | 1.136 | 12.054 | 11.460 |
| 280980_358340 | 1.520 | 1.5108 | 2.653 | 2.688 | 36.645 | 36.714 |
| 271440_374750 | 0.650 | 0.659 | 0.981 | 1.013 | 12.230 | 12.368 |
| 279090_369500 | 0.799 | 0.808 | 1.208 | 1.238 | 18.682 | 18.654 |
| 270780_357290 | 1.162 | 1.164 | 2.149 | 2.194 | 31.964 | 34.771 |
| 290100_367340 | 0.674 | 0.685 | 1.081 | 1.111 | 15.273 | 14.000 |

Tab. 6: Comparison of the basic statistics (mean absolute error, root mean squared error, maximum error) for the first ten validation datasets

| Validation dataset | SR PSNR | Cubic PSNR | SR SSIM | Cubic SSIM |
|---|---|---|---|---|
| 278490_371780 | 67.806 | 67.667 | 0.993 | 0.993 |
| 293400_360980 | 61.73 | 61.56 | 0.988 | 0.988 |
| 275640_359420 | 67.537 | 67.346 | 0.990 | 0.990 |
| 282360_365720 | 67.382 | 67.151 | 0.993 | 0.993 |
| 278400_377660 | 68.362 | 68.194 | 0.993 | 0.992 |
| 280980_358340 | 60.830 | 60.717 | 0.991 | 0.991 |
| 271440_374750 | 69.471 | 69.196 | 0.993 | 0.993 |
| 279090_369500 | 67.664 | 67.448 | 0.990 | 0.990 |
| 270780_357290 | 62.659 | 62.479 | 0.992 | 0.991 |
| 290100_367340 | 68.631 | 68.389 | 0.994 | 0.994 |

Tab. 7: Further comparison of the metrics of peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) for the first ten validation datasets (higher values are better).


The maximum possible value used in the PSNR was set to the maximum elevation of the whole training region (3150 m).

border effect

Fig. 8: Border effects in a difference image (abs(original - super-resolution)).

Surprisingly, the border effect was even more pronounced for cubic upsampling.

| | MAE before | MAE after | Difference |
|---|---|---|---|
| Super Resolution | 0.8229524 | 0.77234507 | 0.050607324 |
| Cubic Upsampling | 0.81898385 | 0.761203 | 0.05778086 |

Tab. 8: Metrics (mean absolute error) for super resolution and cubic upsampling, before and after slicing the border from 300*300 px to 270*270 px, for the first validation image (278490_371780).

It is important to slice the results to avoid these border effects, which arise from the convolution filters of CNNs and from missing data at the edges in cubic upsampling. Although padding may reduce the problem for CNNs, the effects are still present.
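The slicing itself is a one-liner in NumPy (`slice_border` is a hypothetical helper; a 15 px margin reduces a 300*300 px tile to 270*270 px, as in Tab. 8):

```python
import numpy as np

def slice_border(arr, margin=15):
    """Crop a fixed margin from each edge to suppress border effects."""
    return arr[margin:-margin, margin:-margin]

tile = np.zeros((300, 300), dtype=np.float32)
inner = slice_border(tile)   # shape (270, 270)
```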

Watch the algorithm learning

SR Learning

Fig. 9: Loss for epochs 5 to 100. The first five epochs were removed from the plot for clearer presentation.

SR Learn 2

Fig. 10: Loss for epochs 100 to 200.

SR Learning 3

Fig. 11: Loss for epochs 300 to 400.

 

Next Steps

An improvement in RMSE of about 2% compared to simple cubic upsampling is not very impressive. Further work will involve adapted filters, higher scales and more training data.

If an improvement of 5% is reached, the 50 m Bavarian DTM will be processed.

Other types of SR techniques such as Cycle-GANs and Pix2Pix will be investigated.

Publications and Implementations

Here is a great list of different SR implementations: Awesome Open Source - TOP Super Resolution Open Source Projects

List of some publicly available papers about SR with DEMs:

Bekir Z Demiray, Muhammed Sit and Ibrahim Demir - D-SRGAN: DEM Super-Resolution with Generative Adversarial Networks

Dongjoe Shin, Stephen Spittle - LoGSRN: Deep Super Resolution Network for Digital Elevation Model

Mohammad Pashaei et al. - Deep Learning-Based Single Image Super-Resolution: An Investigation for Dense Scene Reconstruction with UAS Photogrammetry

Zherong Wu, Peifeng Ma - ESRGAN-Based DEM Super-Resolution for Enhanced Slope Deformation Monitoring in Lantau Island of Hong Kong
