Image processing articles from across Nature Portfolio

Image processing is the manipulation of an image that has been digitized and loaded into a computer. Software modifies the image to make it more useful, for example to enable image recognition.

Moving towards a generalized denoising network for microscopy

The visualization and analysis of biological events using fluorescence microscopy is limited by the noise inherent in the images obtained. Now, a self-supervised spatial redundancy denoising transformer is proposed to address this challenge.

  • Lachlan Whitehead

Latest Research and Reviews

The artificial intelligence-based model ANORAK improves histopathological grading of lung adenocarcinoma

Yuan and colleagues developed an artificial intelligence-based method to derive growth patterns and morphological features from hematoxylin and eosin-stained slides of lung adenocarcinoma samples, for improved tumor grading and patient prognostication.

  • Khalid AbdulJabbar
  • David A. Moore

Identification of wheel track in the wheat field

  • Wanhong Zhang

Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes

CytoCommunity enables both supervised and unsupervised analyses of spatial omics data in order to identify complex tissue cellular neighborhoods based on cell phenotypes and spatial distributions.

  • Jiazhen Rong

Learning multi-site harmonization of magnetic resonance images without traveling human phantoms

Data harmonization of MRI scans can improve consistency when analysing MRI data from different instruments in different locations. Liu and Yap report an efficient deep neural network method to disentangle site-specific information from site-invariant anatomical information in MRI images. The approach allows data in a wide range of existing studies, conducted via different imaging protocols, to be harmonized.

  • Pew-Thian Yap

Improving deep neural network generalization and robustness to background bias via layer-wise relevance propagation optimization

Image background features can undesirably affect deep networks’ decisions. Here, the authors show that the optimization of Layer-wise Relevance Propagation explanation heatmaps can hinder such influence, improving out-of-distribution generalization.

  • Pedro R. A. S. Bassi
  • Sergio S. J. Dertkigil
  • Andrea Cavalli

Image restoration of degraded time-lapse microscopy data mediated by near-infrared imaging

InfraRed-mediated Image Restoration (IR²) uses deep learning to combine the benefits of deep-tissue imaging with NIR probes and the convenience of imaging with GFP for improved time-lapse imaging of embryogenesis.

  • Nicola Gritti
  • Rory M. Power
  • Jan Huisken

News and Comment

JDLL: a library to run deep learning models on Java bioimage informatics platforms

  • Carlos García López de Haro
  • Stéphane Dallongeville
  • Jean-Christophe Olivo-Marin

Imaging across scales

New twists on established methods and multimodal imaging are poised to bridge gaps between cellular and organismal imaging.

  • Rita Strack

Visual proteomics

Advances will enable proteome-scale structure determination in cells.

Inferring how animals deform improves cell tracking

Tracking cells is a time-consuming part of biological image analysis, and traditional manual annotation methods are prohibitively laborious for tracking neurons in the deforming and moving Caenorhabditis elegans brain. By leveraging machine learning to develop a ‘targeted augmentation’ method, we substantially reduced the number of labeled images required for tracking.

napari-imagej: ImageJ ecosystem access from napari

  • Gabriel J. Selzer
  • Curtis T. Rueden
  • Kevin W. Eliceiri


Editorial: Current Trends in Image Processing and Pattern Recognition

  • PAMI Research Lab, Computer Science, University of South Dakota, Vermillion, SD, United States

Editorial on the Research Topic Current Trends in Image Processing and Pattern Recognition

Technological advancements in computing have opened up multiple opportunities in a wide variety of fields, ranging from document analysis ( Santosh, 2018 ), biomedical and healthcare informatics ( Santosh et al., 2019 ; Santosh et al., 2021 ; Santosh and Gaur, 2021 ; Santosh and Joshi, 2021 ), and biometrics to intelligent language processing. These applications primarily leverage AI tools and/or techniques, drawing on topics such as image processing, signal and pattern recognition, machine learning, and computer vision.

With this theme, we opened a call for papers on Current Trends in Image Processing & Pattern Recognition that followed the third International Conference on Recent Trends in Image Processing & Pattern Recognition (RTIP2R), 2020 (URL: http://rtip2r-conference.org ). The call was not limited to RTIP2R 2020; it was open to all. Altogether, 12 papers were submitted, and seven of them were accepted for publication.

In Deshpande et al., the authors addressed the use of global fingerprint features (e.g., ridge flow, frequency, and other interest/key points) for matching. With a convolutional neural network (CNN) matching model, which they called “Combination of Nearest-Neighbor Arrangement Indexing (CNNAI),” they achieved a highest rank-1 identification rate of 84.5% on the FVC2004 and NIST SD27 datasets. The authors claimed that their results are comparable with state-of-the-art algorithms and that their approach is robust to rotation and scale. Similarly, in Deshpande et al., using the same datasets, the same set of authors addressed the importance of minutiae extraction and matching, taking low-quality latent fingerprint images into account. Their minutiae extraction technique showed remarkable improvement in their results; as claimed by the authors, the results were comparable to state-of-the-art systems.

In Gornale et al., the authors extracted distinguishing features that were geometrically distorted or transformed by taking Hu’s invariant moments into account. With this, the authors focused on early detection and grading of knee osteoarthritis, and they claimed that their results were validated by orthopedic surgeons and rheumatologists.

In Tamilmathi and Chithra, the authors introduced a new deep-learned, quantization-based coding scheme for 3D airborne LiDAR point cloud images. In their experiments, the model compressed an image into a constant 16 bits of data and decompressed it with approximately 160 dB PSNR, an execution time of 174.46 s, and an execution speed of 0.6 s per instruction. The authors claimed that their method compares favourably with previous algorithms/techniques when space and time are considered.
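
For readers unfamiliar with the metric, PSNR quantifies how closely a decompressed image matches the original. The following is a minimal NumPy sketch of the standard formula, illustrative only and not the authors' implementation:

```python
import numpy as np

def psnr(original: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels; higher means less distortion."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Dummy 8-bit image and a slightly perturbed reconstruction of it.
img = np.random.randint(0, 256, (64, 64)).astype(np.uint8)
rec = np.clip(img + np.random.randint(-2, 3, img.shape), 0, 255).astype(np.uint8)
print(round(psnr(img, rec), 2), "dB")
```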

In Tamilmathi and Chithra, the authors carefully inspected possible signs of plant leaf diseases. They employed feature learning and observed the correlation and/or similarity between disease-related symptoms, making disease identification possible.

In Das Chagas Silva Araujo et al., the authors proposed a benchmark environment for comparing multiple algorithms for depth reconstruction from two event-based sensors. In their evaluation, a stereo matching algorithm was implemented, and multiple experiments were conducted with different camera settings and parameters. The authors claimed that this work can serve as a benchmark for the robust evaluation of the multitude of new techniques emerging in event-based stereo vision.

In Steffen et al.; Gornale et al., the authors employed handwritten signatures to better understand this behavioral biometric trait for document authentication/verification, for instance in letters, contracts, and wills. They used handcrafted features such as LBP and HOG to extract descriptors from 4,790 signatures so that shallow learning could be applied efficiently. Using k-NN, decision tree, and support vector machine classifiers, they reported promising performance.
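
As a rough illustration of this handcrafted-feature pipeline (HOG descriptors fed to a shallow classifier), here is a minimal sketch using scikit-image and scikit-learn; the random "signature" images and labels are hypothetical stand-ins for a real dataset, and the parameters are not those of the cited work:

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def hog_features(image: np.ndarray) -> np.ndarray:
    """Resize a grayscale signature image and extract a HOG descriptor."""
    image = resize(image, (128, 256), anti_aliasing=True)
    return hog(image, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

# Hypothetical stand-in data: random grayscale "signatures" and genuine/forged labels.
rng = np.random.default_rng(0)
images = rng.random((40, 120, 250))
labels = rng.integers(0, 2, size=40)

X = np.stack([hog_features(img) for img in images])
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2, random_state=0)

clf = SVC(kernel="rbf").fit(X_train, y_train)       # shallow learning on handcrafted features
print("verification accuracy:", clf.score(X_test, y_test))
```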

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Santosh, KC, Antani, S., Guru, D. S., and Dey, N. (2019). Medical Imaging Artificial Intelligence, Image Recognition, and Machine Learning Techniques . United States: CRC Press . ISBN: 9780429029417. doi:10.1201/9780429029417

Santosh, KC, Das, N., and Ghosh, S. (2021). Deep Learning Models for Medical Imaging, Primers in Biomedical Imaging Devices and Systems . United States: Elsevier . eBook ISBN: 9780128236505.

Santosh, KC (2018). Document Image Analysis - Current Trends and Challenges in Graphics Recognition . United States: Springer . ISBN 978-981-13-2338-6. doi:10.1007/978-981-13-2339-3

Santosh, KC, and Gaur, L. (2021). Artificial Intelligence and Machine Learning in Public Healthcare: Opportunities and Societal Impact . Spain: SpringerBriefs in Computational Intelligence Series . ISBN: 978-981-16-6768-8. doi:10.1007/978-981-16-6768-8

Santosh, KC, and Joshi, A. (2021). COVID-19: Prediction, Decision-Making, and its Impacts, Book Series in Lecture Notes on Data Engineering and Communications Technologies . United States: Springer Nature . ISBN: 978-981-15-9682-7. doi:10.1007/978-981-15-9682-7

Keywords: artificial intelligence, computer vision, machine learning, image processing, signal processing, pattern recognition

Citation: Santosh KC (2021) Editorial: Current Trends in Image Processing and Pattern Recognition. Front. Robot. AI 8:785075. doi: 10.3389/frobt.2021.785075

Received: 28 September 2021; Accepted: 06 October 2021; Published: 09 December 2021.

Copyright © 2021 Santosh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: KC Santosh, [email protected]

This article is part of the Research Topic

Current Trends in Image Processing and Pattern Recognition

Deep learning and medical image processing for coronavirus (COVID-19) pandemic: A survey

Sweta Bhattacharya

a School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India

Praveen Kumar Reddy Maddikunta

Quoc-Viet Pham

b Research Institute of Computer, Information and Communication, Pusan National University, Busan 46241, Republic of Korea

Thippa Reddy Gadekallu

Siva Rama Krishnan S, Chiranji Lal Chowdhary, Mamoun Alazab

c College of Engineering, IT & Environment, Charles Darwin University, Australia

Md. Jalil Piran

d Department of Computer Science and Engineering, Sejong University, 05006, Seoul, Republic of Korea

Since December 2019, the coronavirus disease (COVID-19) outbreak has caused many deaths and affected all sectors of human life. With the gradual progression of time, COVID-19 was declared by the World Health Organization (WHO) as an outbreak, which has imposed a heavy burden on almost all countries, especially ones with weaker health systems and slower responses. In the field of healthcare, deep learning has been implemented in many applications, e.g., diabetic retinopathy detection, lung nodule classification, fetal localization, and thyroid diagnosis. Numerous sources of medical images (e.g., X-ray, CT, and MRI) make deep learning a strong technique to combat the COVID-19 outbreak. Motivated by this fact, a large number of research works were proposed and developed during the initial months of 2020. In this paper, we first focus on summarizing the state-of-the-art research works related to deep learning applications for COVID-19 medical image processing. Then, we provide an overview of deep learning and its applications to healthcare over the last decade. Next, three use cases in China, Korea, and Canada are presented to show deep learning applications for COVID-19 medical image processing. Finally, we discuss several challenges and issues related to deep learning implementations for COVID-19 medical image processing, which are expected to drive further studies in controlling the outbreak and managing the crisis, contributing to smart, healthy cities.

1. Introduction

The coronavirus disease (COVID-19) pandemic and the related containment efforts have generated a worldwide health crisis impacting all sectors of human life. At its initial stage, with the number of people affected by the disease being minimal, it did not appear to pose a threat of such enormous scale, and the majority of cases resolved spontaneously. With the gradual progression of time, COVID-19 was declared an outbreak by the World Health Organization (WHO), with an extremely high-risk potential of affecting millions of lives in all countries, especially ones with weaker health systems. The virus is deadly for two basic reasons: firstly, it is novel, with no vaccines discovered, and secondly, it is easily transmitted through direct or indirect contact with an affected individual.

The statistics of COVID-19 give immense cause for concern, with almost 44,748,380 people affected globally and 1,179,035 patients losing the battle and succumbing to death as of October 29, 2020 ( WORLDOMETER, 2020 ). To add to this horrific statistical figure, the United States of America (USA), one of the leading flag bearers in healthcare advancement, records the highest number of COVID-19 victims, followed by Brazil, India, Russia, South Africa, and so on across 215 countries around the globe. The total number of COVID-19 diagnosed cases in the USA alone is 9,120,751, with 5,933,212 recoveries and 233,130 total deaths as of October 29, 2020 ( WORLDOMETER, 2020 ). The number of new cases reported every single day has been increasing at an accelerated rate, compelling governments and administrative authorities across the globe to impose non-compromising lockdowns to ensure social distancing for the containment of the disease.

The global response to interrupt the spread of COVID-19 has been prompt and unanimous: the majority of affected countries have sealed their borders, barring travel and transportation services. The WHO and the Centers for Disease Control and Prevention (CDC) have issued structured guidelines to be followed by general citizens, governments, and national and international corporations to ensure complete containment of the disease, thereby breaking the chain leading to this pandemic. The global strategy for COVID-19 response as framed by the WHO includes five steps: (1) mobilization of all sectors of human life to maintain hygiene and social distancing, (2) control of sporadic cases to prevent community spread, (3) suppression of community transmission by imposing relevant restrictions, (4) provision of healthcare services to reduce mortality, and (5) development of vaccines and therapeutics for large-scale administration. Fig. 1 shows the transmission of COVID-19.

Fig. 1

Transmission of COVID-19.

The WHO and CDC have identified and listed symptoms that indicate plausible COVID-19 infection, including fever, dry cough, vomiting, diarrhoea, and myalgia. The general public in all countries has been made aware of these symptoms so that treatment can be sought at the earliest opportunity, reducing morbidity rates. Governments have begun to invest generously and enthusiastically in COVID-19 vaccines and related research. Adding to this initiative, an enormous amount of research and development activity is being directed at the COVID-19 pandemic. Machine learning (ML) and deep learning (DL) approaches have been a predominant choice for the detection of various diseases ( Zhang, Yang, Chen, & Li, 2018 ). Image processing techniques have gained immense momentum in all sectors of healthcare, especially in cancer detection in smart cities ( Khan, Asif, Ahmad, Alharbi, & Aljuaid, 2020 ). Hence these approaches have been a natural choice for COVID-19 research as well. The present study focuses on highlighting the contributions of DL and medical image processing techniques to combat the COVID-19 pandemic, presenting an extensive review of the state-of-the-art frameworks developed by employing these technologies.

1.1. State-of-the-arts and contributions

In a desperate attempt to combat the COVID-19 pandemic, scientific studies have been initiated in all directions, and DL integrated with medical image processing techniques has been explored rigorously to find a definite solution ( Hakak, Khan, Imran, Choo, & Shoaib, 2020 ; Iwendi et al., 2020 ). Numerous research publications have appeared with similar objectives, as shown in Table 1. The uniqueness of the present work lies in its effort to emphasize the significant DL and image processing techniques proposed for the detection of COVID-19 and to highlight the challenges associated with such implementations, in order to open specific dimensions of future research that are yet to be explored. The approaches discussed are elicited from various published articles of reputed publishers and thus help to compile a serious set of recommendations for the research community and administrative authorities to combat the disease.

Deep learning implementations in COVID-19 datasets.

One of the major issues in COVID-19 research is the lack of reliable and adequate data. Due to the limited number of tests conducted, multiple deaths and virus-affected cases go unreported. It is difficult even to estimate whether the factor of undetected COVID-19 infections is three, three hundred, or more. Across the globe, no country has been able to provide reliable datasets on the existence of the virus in a representative sample of the mass population. But research and development activity cannot stop, and thus information fusion plays an extremely important role. The technical definition of information fusion is “the process of combining and associating information from one or multiple sources to provide useful information for the detection, identification, and characterization of a particular entity”. In ML and DL applications, the availability of large-scale, high-quality datasets plays a major role in the accuracy of the results. Information fusion helps to integrate multiple datasets and use them in DL models to achieve enhanced prediction accuracy. As an example, computed tomography (CT) images from Xi’an Jiaotong University, Nanchang First Hospital, and Xi’an No.8 Hospital have been integrated as part of information fusion and fed into AI and DL models ( Wang, Kang, et al., 2020 ). Similar information fusion has been observed in Ghoshal and Tucker (2020) , where lung X-ray images from Dr. Joseph Cohen's GitHub repository were augmented with chest X-ray images from the publicly available Kaggle repository. In Apostolopoulos and Mpesiana (2020) , X-ray image datasets from GitHub (Cohen), the Radiological Society of North America (RSNA), and the Italian Society of Medical and Interventional Radiology (SIRM) were combined and fed into a CNN for detecting COVID-19. In later sections of the text, similar references illustrate applications of information fusion to fill the gap of data unavailability while continuing to generate predictions of enhanced quality.
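
In practice, this kind of dataset-level fusion can be as simple as concatenating image collections that have been preprocessed identically. The sketch below, which assumes hypothetical local folders standing in for two public X-ray repositories (it is not the pipeline of any cited paper), shows one way to do this with PyTorch:

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

# Identical preprocessing so images from different sources share one input space.
tfm = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical local copies of two public repositories (e.g., a GitHub X-ray set and a
# Kaggle chest X-ray set), each laid out as class-named subfolders; for the labels to be
# consistent, the class subfolder names must match across the two sources.
cohen_xrays = datasets.ImageFolder("data/covid_github", transform=tfm)
kaggle_xrays = datasets.ImageFolder("data/kaggle_chest_xray", transform=tfm)

fused = ConcatDataset([cohen_xrays, kaggle_xrays])       # simple dataset-level fusion
loader = DataLoader(fused, batch_size=32, shuffle=True)  # fed to a DL model downstream
```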

It is important to understand that the pandemic is at its peak, and existing medical facilities are overwhelmed. Emergency departments and intensive care facilities have been stretched beyond their regular capacity to serve the ever-growing population of patients. In such a crisis, healthcare providers and patient family members need to make rapid decisions with minimal information. The phenotype of the COVID-19 disease starts with mild or no symptoms at all, yet can rapidly change course, leaving patients extremely critical, suffering multi-organ failure, with even fatal outcomes. The objective is to reduce such abrupt deteriorations and to detect the disease at the earliest using the limited knowledge base and resources. The traditional lab-based RT-PCR (real-time polymerase chain reaction) test using a nose-throat swab has limited sensitivity and is also time-consuming. When the number of patients is huge, shortages of RT-PCR reagents and of the specialized laboratory resources for performing COVID screening tests are inevitable. Tools that help to augment these resources are thus an absolute necessity. ML and artificial intelligence (AI) are techniques that have the potential to enable accelerated decision making and improve patient-centered outcomes. Various studies have developed ML models to serve this purpose using minimal resources, with accuracy (82–86%) and sensitivity (92–95%) comparable to the gold standard RT-PCR test ( Brinati et al., 2020 ). AI and ML models could be coupled with radiological images, leading to more accurate detection of the disease at an earlier stage and ensuring the availability of treatment. Models like DarkNet detect COVID-19 cases, classify them against pneumonia, and serve locations where there are shortages of radiologists due to an overwhelming number of patients ( Ozturk et al., 2020 ).

1.2. Impact of DL based COVID-19 detection for sustainable cities

After conducting an extensive background study, it is evident that not many surveys have been conducted emphasizing the applications of DL frameworks and image processing in the prediction of COVID-19 cases. The present pandemic situation across the globe has impacted millions of lives. Thousands upon thousands of people are being affected by this highly contagious disease, raising questions about the survival and sustainability of the human race ( Megahed & Ghoneim, 2020 ). The only way to contain the disease is to detect it at its initiation, preventing others from getting infected. This requires accelerated diagnosis without associated health hazards. Traditional approaches fail to provide this due to challenges pertaining to detection time, the need to clean diagnostic machinery after each use, and the availability of resources. The use of ML approaches eliminates these issues and also enables faster detection. ML approaches, if used more widely, can lead to containment of the disease and reduced mortality.

The paper thus provides comprehensive information on various DL implementations in COVID-19 using real-time as well as publicly available image datasets. The unique contributions of our study are mentioned below:

  • The survey includes basic information on COVID-19 and its spread, which establishes the motivation and need for accelerated disease prediction ensuring containment of the disease in smart cities.
  • The role of DL applications in medical image processing is discussed in detail in support of its capability in COVID-19 predictions.
  • The recent works on DL and image processing implementations in COVID-19 are discussed explicitly.
  • The datasets, methodologies, evaluation metrics, research challenges, and the lessons learned are included from these state-of-the-art research works, in addition to future directions in controlling the pandemic in smart cities.

1.3. Paper organization

The rest of this work is organized as follows. Section 2 presents fundamental information on COVID-19, DL and expresses the general motivation towards the adoption of DL to process and analyze medical images from the existing literature. An overview of DL applications in medical image processing is presented in Section 3 . Section 4 presents a focused review of potential DL implementations for medical image processing in COVID-19. Section 5 presents three use cases of plausible DL-based implementations for COVID-19 image processing. Section 6 summarizes the aforementioned reviews highlighting the lessons learned and enlisting the recommendations guiding towards the future direction of research. The paper is concluded in Section 7 .

2. Background and motivations

This section presents the fundamentals of COVID-19, DL, and an overview of the adoption of DL to process and analyze medical images from the existing literature.

2.1. Overview and status of COVID-19 outbreak

At the outset, multiple pneumonia cases were registered in the Wuhan city of the People's Republic of China (PRC) towards the end of 2019 ( Pham, Nguyen, Huynh-The, Hwang, & Pathirana, 2020 ). The COVID-19 index case was identified in Wuhan, Hubei province, PRC, in December 2019. COVID-19 was identified and declared an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Investigations revealed that the source of COVID-19 was likely the Huanan Seafood Market in Wuhan city, and eventually the government of the PRC officially declared an additional 27 cases by December 2019. Based on several experiments, researchers observed that the infection was transmitted from wild bats ( Andersen, Rambaut, Lipkin, Holmes, & Garry, 2020 ). This virus falls under the category of beta-coronaviruses (beta-CoV), which includes the SARS coronavirus (SARS-CoV). The authors in Unhale et al. (2020) noted that the COVID-19 epidemic started around the spring festival in the PRC, when many people from across the globe traveled to participate in the event; the gathering of this huge mass from several countries catalyzed the virus not only to spread within China but also to be carried across international boundaries to other countries.

The WHO China country office, the Regional Office for the Western Pacific, and WHO headquarters have been working rigorously on analyzing the effects of COVID-19 since the first week of January 2020 ( WHO, 2019 ). In the last week of January, WHO officials declared the outbreak a Public Health Emergency of International Concern (PHEIC).

The authors in Kampf, Todt, Pfaender, and Steinmann (2020) observed that the spread of COVID-19 is minimal in regions with high temperatures and humidity. The authors also highlighted the use of steam therapy for reducing the threat of coronavirus: the inhaled steam travels through the respiratory tract to the alveoli and aids in improving oxygen levels. The authors in Shereen, Khan, Kazmi, Bashir, and Siddique (2020) inferred that the parameters governing the growth or spread of the virus depend on environmental conditions, hygiene, respiratory water droplets, and physical contact.

The WHO put forward an effective strategy on 3 February 2020 to combat coronavirus ( WHO, 2020 ), providing guidelines and protocols for health workers, doctors, government officials, and others to address the COVID-19 crisis and work efficiently as front-line workers while ensuring personal safety. One strategy is to test all suspected cases, isolate them, and trace their contact and travel history. Another important strategy given by the WHO is to impose lockdowns, which can be an effective way to contain the virus. The guideline report ( W.H. Organization, 2020 ) released by the WHO on 19 March 2020 mentions that infection prevention was based on knowledge gained from the case histories of patients who had suffered from severe acute respiratory syndrome (SARS). The report also focuses on precautions such as the use of medical masks by the public in COVID-19 affected areas and cleansing the hands with alcohol-based solutions or hand-wash; health care personnel are instructed not to touch patients with bare hands. According to the WHO, no specific medication is available for COVID-19 to date. The United Nations Conference on Trade and Development (UNCTAD) has consolidated a list of best practices and guidelines ( UNCTAD, 2020 ) that nations can follow to run essential services and support economic conditions during COVID-19. The World Economic Forum has proposed the use of digital Foreign Direct Investment (FDI) ( WEFORUM, 2020 ) to accelerate financial growth in developing countries.

2.2. Fundamentals of deep learning

DL and neural networks (NN) have gained immense momentum in present-day scientific research since they have the capability to learn from context ( Alazab et al., 2020 ; Gadekallu, Khare, et al., 2020 ; Reddy, Parimala, Chowdhary, Hakak, & Khan, 2020 ; Schmidhuber, 2015 ). These techniques have been widely used in various applications such as classification and prediction problems, image recognition, smart homes, self-driving cars, and object recognition, due to their capability to adapt to multiple data types across different domains. Fig. 2 depicts various techniques used in DL. DL mimics the way the human brain filters information for accurate decision making. Similar to the human brain, DL trains a system to filter inputs through successive layers that aid the prediction and classification of data. Each layer passes its output forward to the next layer, and this cycle continues until the final output is obtained. The output is shaped by the weights assigned in each layer, and during training these weights are adjusted to produce accurate output.
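
As a concrete illustration of this description (a stack of layers whose weights are adjusted iteratively to reduce prediction error), here is a minimal PyTorch sketch on dummy data; the layer sizes and learning rate are arbitrary choices, not taken from the survey:

```python
import torch
from torch import nn

# A small stack of layers: each layer transforms the output of the previous one.
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 784)             # a dummy batch of flattened images
y = torch.randint(0, 10, (32,))      # dummy class labels

for _ in range(100):                 # training loop: adjust weights to reduce the loss
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                  # gradients flow backwards through the layers
    optimizer.step()                 # weights are nudged towards more accurate output
```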

Fig. 2

Various techniques and applications of deep learning.

DL techniques can be categorized as supervised, semi-supervised, and unsupervised. In supervised learning, the model is trained with known input-output pairs. Each training example consists of an input vector and a desired value, referred to as the supervisory signal. The method uses existing labels to predict the labels of the desired output. Classification methods use supervised learning ( Patel et al., 2020 ) and can be applied to scenarios such as identifying faces and traffic symbols, recognizing spam in a given text, and converting speech to text.

Semi-supervised learning sits between supervised and unsupervised ML methodologies: its training data consists of both labeled and unlabeled values. Unlabeled data, when used in conjunction with a small amount of labeled data, can produce a considerable improvement in learning accuracy. Certain assumptions underlie these techniques ( Cheng, 2019 ): first, data points in close proximity to each other share the same label; second, the cluster assumption, whereby data in the same cluster share the same label; and third, the data are restricted to a limited dimension rather than the complete input space. Unsupervised learning deals with discovering the inter-relations among the elements of a data set and then classifying the data without using labels. Algorithms following this approach include clustering, anomaly detection, and NN. Clustering is the principle of identifying similar elements or anomalies in a data set ( De Simone & Jacques, 2019 ). Anomaly detection with unsupervised learning is widely applied in security domains ( Lima & Keegan, 2020 ).
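
A minimal scikit-learn sketch of the two unsupervised ideas mentioned above, clustering and anomaly detection, run on synthetic unlabeled feature vectors (illustrative only):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 8))          # unlabeled feature vectors

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
anomaly = IsolationForest(random_state=0).fit_predict(features)  # -1 marks anomalies

print("cluster sizes:", np.bincount(clusters))
print("anomalies found:", int((anomaly == -1).sum()))
```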

Most DL techniques use an artificial neural network (ANN) for feature processing and extraction. A feedback technique is used for the learning mechanism ( Shanmuganathan, 2016 ), wherein each level transforms its input data to form a more summarized representation. The term “deep” in DL refers to the number of layers through which the data are transformed, described by a credit assignment path (CAP). In the case of a feed-forward NN, the depth of the CAP is the number of hidden layers plus the output layer.

In the case of a recurrent neural network (RNN), a signal may traverse a layer more than once, and thus the CAP depth cannot be determined ( Yu, Si, Hu, & Zhang, 2019 ). One of the predominantly used NN techniques for image processing is the CNN ( Gadekallu, Rajput, et al., 2020 ; Huynh-The, Hua, Pham, & Kim, 2020; Rawat & Wang, 2017 ). In a CNN, feature extraction is automated and performed during training on the images, making DL the most accurate method for image processing domains. RNNs work similarly to CNNs, but the difference is that RNNs are used for language and other sequential computation. An RNN uses feedback loops, where the output of one step is fed back as input to the next. RNNs can be used for datasets involving time series ( Che, Purushotham, Cho, Sontag, & Liu, 2018 ), text, financial data, audio, video, etc.
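
The following minimal PyTorch sketch shows the structure of a CNN as just described: convolutional layers that learn feature extraction automatically, followed by a classifier head. The architecture and sizes are arbitrary illustrative choices:

```python
import torch
from torch import nn

class TinyCNN(nn.Module):
    """Convolutional feature extractor followed by a small classifier head."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)                 # learned filters extract image features
        return self.classifier(x.flatten(1))

logits = TinyCNN()(torch.randn(4, 1, 224, 224))  # dummy batch of grayscale scans
print(logits.shape)                              # torch.Size([4, 2])
```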

Generative adversarial networks (GANs) work on the concept of a generator network and a discriminator. The generator network produces fake data, while the discriminator differentiates fake from real data. The two networks compete and thereby improve the training process, so GANs are mostly used in applications that require the generation of images ( Greenspan, Van Ginneken, & Summers, 2016 ), for example from text. Google's Inception network introduces the inception block, which computes convolutions and pooling operations that run simultaneously for the effective processing of complex tasks. This is an advanced form of DL used to automate the work involved in image processing ( Alom, Yakopcic, Nasrin, Taha, & Asari, 2019 ).
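
A bare-bones sketch of the generator/discriminator pair described above (illustrative only, without the adversarial training loop and not tied to any medical application):

```python
import torch
from torch import nn

latent_dim = 64

generator = nn.Sequential(           # maps random noise to a fake 28x28 image
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 28 * 28), nn.Tanh(),
)
discriminator = nn.Sequential(       # scores whether a flattened image looks real
    nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

noise = torch.randn(16, latent_dim)
fake_images = generator(noise)
realness = discriminator(fake_images)    # discriminator's guess: real (→1) or fake (→0)
print(fake_images.shape, realness.shape)
```

In full adversarial training, the discriminator is updated to separate real from generated samples while the generator is updated to fool it, which is what drives the generated images to become realistic.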

DL can be applied in varied domains that involve the processing of vast amounts of data. DL has great potential in smart cities, as a huge amount of data will be generated there due to digitization ( Bhattacharya, Somayaji, Gadekallu, Alazab, & Maddikunta, 2020 ; Habibzadeh, Nussbaum, Anjomshoa, Kantarci, & Soyata, 2019 ). The effectiveness of DL techniques relies on two factors: the enormous amount of data to be processed and the massive computational power required. DL also aids in the faster analysis of complex medical images ( Greenspan et al., 2016 ) for rendering an accurate diagnosis. DL is widely implemented in the healthcare sector for broad data interpretation ( Razzak, Naz, & Zaib, 2018 ), aiding the early diagnosis of diseases and thereby reducing manual workload. The following section provides an overview of DL applications for medical image processing.

3. Overview of DL applications for medical image processing

Advances in medical science have significantly changed health care over the last few decades, allowing doctors to identify and treat diseases more effectively ( Sahiner et al., 2019 ). But doctors, like any human beings, are also prone to errors. The scholarly credentials of a doctor lie not only in the individual's level of intelligence but also in the way they treat patients' problems and in the type of health system that supports them ( Lundervold & Lundervold, 2019 ). This combination accounts for wide variations in clinical outcomes, and ML, in this regard, is a strong means of improving a doctor's ability to diagnose and treat patients ( Huang et al., 2019 ). The effectiveness of ML algorithms depends on the types of features extracted and the data representation. ML algorithms primarily face two key challenges: efficiency in scanning high-dimensional datasets, and training the model to find the most appropriate task ( Ghesu et al., 2016a ; Krawczyk, Minku, Gama, Stefanowski, & Woźniak, 2017 ). DL has been one of the commonly used techniques that guarantees a higher degree of accuracy in disease prediction and detection. Applications of DL techniques have introduced new breakthroughs in the field of healthcare, as shown in Fig. 3 . In the real world, numerous sources of medical data, including magnetic resonance imaging (MRI), X-ray, positron emission tomography (PET), and computed tomography (CT) scans, have provided doctors with vast volumes of information ( Liu, Liu, & Wang, 2015 ; Rehman, Zia, et al., 2020 ; Shen, Wu, & Suk, 2017 ). The CNN is one of the most preferred algorithms for image processing and analysis ( Huynh-The et al., 2020 ). The authors in Litjens et al. (2017) reviewed various DL methods for medical image processing and described the use of DL for object identification, image categorization, segmentation, and related tasks. In the medical domain, DL for image processing is used in various departments such as ophthalmology, neurology, psychotherapy, cancer detection, and cardiology. The authors also listed the unresolved research challenges in DL relevant to image analysis. In the current scenario, with patients and medical stakeholders maintaining electronic records, AI has helped to ease medical image processing. The authors in Ker, Wang, Rao, and Lim (2017) reviewed various AI techniques that can be implemented for medical image analysis. From diverse literature, they found that CNNs have been widely used for this analysis, along with big data techniques for processing. They also highlighted the main challenge of the unavailability of high-quality labeled data for better interpretation.

Fig. 3

Applications of DL in medical image processing.

3.1. Classification

Classification is often termed computer-aided diagnosis (CAD) and plays a significant role in medical image processing. During the classification phase, one or more images are taken as input samples, and a single diagnostic value is produced as output to classify the image ( Gao, Li, Loomes, & Wang, 2017 ). In 1995, the authors in Lo et al. (1995) used DL to classify lung nodules. The detection procedure involved 55 chest X-ray images and two hidden neural layers; using this approach, 82% of lung nodules were detected. In Shen, Zhou, Yang, Yang, and Tian (2015) , the authors used multi-scale DL approaches to identify lung nodules in CT images. The experimental setup comprised three hidden layers, which take CT images as input and provide a response about the lung nodule at the output layer. In Rajpurkar et al. (2017) , the authors introduced the CheXNet DL model with 121 convolutional layers, using 112,120 chest X-ray images as the input dataset to diagnose 14 different forms of lung disease. In this evaluation, the CheXNet algorithm exceeded the radiologists' F1-metric performance. In Pratt, Coenen, Broadbent, Harding, and Zheng (2016) , the authors developed a model by training a 10-layer CNN with three fully connected layers on around 90,000 fundus images to diagnose diabetic retinopathy (DR). Experimental tests attained 95% sensitivity and 75% accuracy on 5,000 testing images. Another related study in Abràmoff et al. (2016) employed IDx-DR version X2.1, trained on 1.2 million DR images, to identify DR. Results indicate that the design can achieve 97% sensitivity and a 30% increase in specificity. The work in Kawahara and Hamarneh (2016) proposed a multi-layer CNN for the classification of skin lesions, trained on a variety of high-resolution images. Results on a publicly available dataset of skin lesions reveal that the proposed model achieves a better accuracy rate than other existing models.
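
A common way to build such a classifier today is transfer learning: start from a pretrained backbone and replace its final layer with a task-specific head. The sketch below uses a DenseNet-121 backbone (the architecture CheXNet is based on) with a 14-way multi-label head; it is a generic illustration on dummy tensors, not the cited authors' code, and the label count is only borrowed from the 14-disease setting mentioned above:

```python
import torch
from torch import nn
from torchvision import models

# ImageNet-pretrained DenseNet-121 backbone; use weights=None to skip the download.
model = models.densenet121(weights="IMAGENET1K_V1")
model.classifier = nn.Linear(model.classifier.in_features, 14)  # 14 disease outputs

criterion = nn.BCEWithLogitsLoss()          # multi-label: each disease is a yes/no output
x = torch.randn(8, 3, 224, 224)             # dummy batch standing in for chest X-rays
y = torch.randint(0, 2, (8, 14)).float()    # dummy multi-hot disease labels

loss = criterion(model(x), y)
loss.backward()                             # one training step's gradients
```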

3.2. Localization

In classification, images are fed to a CNN and the contents of the image are identified. After image classification, the next step in disease detection is localization, which places a bounding box around the position of interest; this is called classification with localization, where localization refers to figuring out where in the image the disease is. Localization of anatomy is a crucial pre-processing phase in clinical diagnosis that enables the radiologist to recognize certain essential features. In recent years, several research works have used DL models to localize disease. For example, in Roth et al. (2015) , the authors presented a model for the classification of organs or body parts using a deep CNN. The CNN was trained with 4,298 X-rays of 1,675 patients to recognize five body regions: legs, abdomen, neck, lungs, and liver. The experimental results produced a 5.9% classification loss and an area under the curve (AUC) of 0.998. Another study in Guo, Gao, and Shen (2015) introduced a model to localize the prostate in T2 MR images. Accurate prostate localization faces several obstacles, including thickness variations and inconsistent appearance. The authors used a stacked sparse auto-encoder (SSAE) to train on prostate images, and the learned features highlight the identity of the prostate in the image. The study was performed on a dataset containing 66 prostate MR images, and the findings are positive, providing better performance than existing models. The work in Shin, Orton, Collins, Doran, and Leach (2012) trained a DL approach on 78 dynamic contrast-enhanced MRI (DCE-MRI) images of patients’ livers and kidneys. Three different datasets were used for localizing multi-organ disease, and the proposed model is, as a whole, more accurate for the localization of diseases in heterogeneous organs. In Payer, Štern, Bischof, and Urschler (2016) , the authors used a 3D CNN for landmark detection in medical images. A Spatial Configuration-Net (SCN) architecture was used to combine accurate responses with landmark localization, and experimental evaluation on 3D image datasets using the CNN and SCN architecture provided higher accuracy. The work in Baumgartner et al. (2016) developed a model that helps to localize the fetus in an image. During this process, a CNN was trained to recognize up to 12 scan planes, and a network model was designed to detect the fetus accurately. Experimental tests achieved 69% precision, 80% recall, and 81% accuracy.
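
The "classification with localization" idea can be sketched as a shared backbone with two heads, one predicting a class and one regressing bounding-box coordinates. The following is a hypothetical, minimal illustration (not any of the cited architectures); the class count and box parameterization are arbitrary:

```python
import torch
from torch import nn

class ClassifyAndLocalize(nn.Module):
    """Shared backbone with a class head and a bounding-box regression head."""
    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
            nn.Flatten(),
        )
        self.class_head = nn.Linear(16 * 8 * 8, n_classes)   # which body part / finding
        self.box_head = nn.Linear(16 * 8 * 8, 4)              # (x, y, width, height)

    def forward(self, x: torch.Tensor):
        feats = self.backbone(x)
        return self.class_head(feats), self.box_head(feats)

logits, boxes = ClassifyAndLocalize()(torch.randn(2, 1, 224, 224))
print(logits.shape, boxes.shape)   # torch.Size([2, 5]) torch.Size([2, 4])
```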

3.3. Detection

Creating accurate ML models capable of classifying, localizing, and detecting multiple objects in a single image has remained a core challenge in computer vision ( Duan et al., 2019 ). With recent advancements in DL and computer vision models, medical image detection applications are easier to develop than ever before. Object detection allows for the recognition and localization of multiple objects within an image or video. It is a computer vision technique used to identify instances of real-world objects: detection methods train predictive models or use matching templates to locate and identify objects, relying on extracted features and learning algorithms to identify object-type instances. Object detection is a key technology behind applications such as video surveillance, image retrieval systems, and medical diagnostics ( Albarqouni et al., 2016 ). The work in Ghesu et al. (2016b) proposed the marginal space DL model for object detection. Adaptive training patterns are used to achieve better performance in deep NN layers. Approximate position estimation and boundary delineation, incorporated into a DL model, yield image outline segmentation. The experimental study included 869 patients and 2,891 aortic valve images, delivering 45.2% better performance compared to previous models. Another work in Shin et al. (2016) introduced a novel model by training a CNN with triple cross-validation to detect interstitial lung disease and lymph nodes. The experimental results achieved an AUC of 0.95 and a sensitivity of 85%. Due to graphics processing unit (GPU) memory restrictions, extending 2-D image detection to 3-D image detection is a severe challenge. In Peng and Schmid (2016) , the authors used 2-D region proposal networks to capture solutions for 2-D objects and later used a different framework to integrate the 2-D solutions into 3-D solutions.

Another interesting work in Liao, Liang, Li, Hu, and Song (2019) proposed a novel lung cancer detection model. The process encompasses two steps. Step one detects suspicious pulmonary nodules using a 3-D NN. The second step performs cancer detection by selecting the top five nodules and integrating them into a leaky noisy-OR gate. The results achieved 85.96% accuracy on training sets and 81.42% on test sets. In Xu et al. (2015) , the authors proposed a DL model, the SSAE, for the detection of nuclei in breast images. The SSAE is trained to extract high-level features, which are used as inputs to a softmax classifier for nuclei detection. The experimental results showed that the proposed model achieved an 84.49% F-measure and 78.83% precision. Similar research in Cruz-Roa, Ovalle, Madabhushi, and Osorio (2013) used a CNN with an SSAE to obtain high-level features, with a softmax classifier used for detecting cancer in the skin. The experiment was carried out on 1,417 images and achieved 91.4% accuracy and an 89.4% F-measure.
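
For readers who want to see what the detection interface looks like in practice, the sketch below runs torchvision's generic pretrained Faster R-CNN (COCO classes) on a dummy image; it is not any of the cited authors' models, and medical use would require re-training on annotated scans, but the box/label/score output format is the same:

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Generic pretrained detector; weights="DEFAULT" downloads COCO-trained weights.
detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = torch.rand(3, 512, 512)                 # dummy RGB image scaled to [0, 1]
with torch.no_grad():
    predictions = detector([image])[0]          # one dict per input image

# Each detection comes with a bounding box, a class label, and a confidence score.
for box, label, score in zip(predictions["boxes"], predictions["labels"], predictions["scores"]):
    if score > 0.5:
        print(label.item(), round(score.item(), 2), box.tolist())
```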

3.4. Segmentation

Image segmentation plays a crucial role in disease diagnosis within medical image processing. Segmentation divides a digital image into several fragments, with the aim of making digital images simpler and easier to examine. The output of medical image segmentation is a collection of segments covering the whole medical image ( Gu et al., 2019 ). Many inter-disciplinary techniques are currently being used to process medical data in order to obtain better diagnostic accuracy. The authors in Guo, Li, Huang, Guo, and Li (2019) propose a supervised artificial neural network (ANN) technique combined with cross-modality information, applied at all levels of the ANN. Moreover, the authors design an image segmentation method using a CNN to delineate soft-tissue lesions in images obtained by various modalities such as magnetic resonance, CT, and positron emission tomography. Proper mining and analysis of organ features are essential before applying image processing; this can help in scenarios where images are unclear or erroneous due to system malfunctions. The major challenge in existing image processing methods is the lack of coordination between the training objectives and the dependencies of the output.

The authors in Oktay et al. (2017) propose a novel technique to overcome this challenge by implementing a generic training strategy that embeds prior knowledge into CNNs using a thoroughly trained regularisation model. The identification of biomarkers in medical images helps in the diagnosis process. Supervised DL is mostly used for image processing but can lack accuracy, since it needs extensive knowledge of the position of the organ and its characteristics. The biomarker discovery process can instead be achieved by detecting anomalous regions; such anomaly detection can reveal hidden information about the anatomical structure. The authors in Seeböck et al. (2019) use this idea to implement a Bayesian DL technique, assuming that the epistemic uncertainties relate to anatomical eccentricities with respect to the training dataset. The authors also use a Bayesian U-Net trained with weak labels of a given anatomy, using different ANN methods. Segmentation of prostate MR scans can reveal valuable information for the detection of prostate cancer, but it faces several challenges, such as missing boundary regions between the prostate and neighbouring organs, a multifaceted background, and variation in the appearance of the prostate. The authors in Zhu, Du, and Yan (2019) design a novel technique called the Boundary-Weighted Domain Adaptive Neural Network (BOWDA-Net), which eases boundary detection in segmentation by implementing a boundary-weighted segmentation loss. The authors also deploy a boundary-weighted transfer learning method to address the issue of small image datasets.
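
Segmentation quality is commonly measured (and often optimized) with the Dice coefficient, the overlap score also reported for the registration experiments later in this survey. A minimal, generic sketch on dummy masks:

```python
import torch

def dice_coefficient(pred_mask: torch.Tensor, true_mask: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Overlap between a predicted and a reference binary mask (1.0 = perfect)."""
    pred = pred_mask.float().flatten()
    true = true_mask.float().flatten()
    intersection = (pred * true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

pred = torch.rand(1, 128, 128) > 0.5   # dummy predicted segmentation
true = torch.rand(1, 128, 128) > 0.5   # dummy ground-truth mask
print(dice_coefficient(pred, true).item())
# 1 - dice_coefficient(...) on soft predictions is a common training loss for segmentation.
```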

3.5. Registration

Image registration is a method for transforming datasets into a single coordinate system. It plays a vital role in medical and biological imaging, and is necessary to analyze or integrate data from several medical sources. Usually, a medical technician has to inspect several images in different orientations to reconcile the visual differences between them ( Haskins, Kruger, & Yan, 2020 ), and is often expected to manually mark points in the image that show significant signal variation as part of a sizeable anatomical structure. Automated medical image registration saves a lot of time for doctors and physicists. To address the shortcomings of the manual registration process, DL implementations have improved the productivity of image registration ( de Vos, Berendsen, Viergever, Staring, & Išgum, 2017 ).

More than half of cancer patients undergo radiation therapy, making it one of the most prevalent cancer treatments ( Elmahdy et al., 2019 ). As the number of patients rises, medical image processing will assist more doctors and patients. Elastix is fully automated 3D deformable registration software, and its application can enhance the radiotherapy process in radiation oncology departments around the world. The proposed model provides a computationally efficient global feature search. Previously, it was difficult to find 3D registration applications that could accurately accommodate changes in alignment, translation, and intensity. The proposed model saves a lot of time for physicians and offers superior registration outcomes with clinical trust. An enhanced ability to combine diagnostic scans means that patients undergo fewer repeat scans, saving time and minimizing exposure to radiation. The experimental findings on the Haukeland Medical Center cancer dataset obtained a success rate of 97% for the prostate, 93% for the seminal vesicles, and 87% for the lymph nodes. In Bai et al. (2013) , the authors proposed a multi-atlas classifier to enhance registration accuracy. The proposed model comprises two steps: in the first step, a patch-based label fusion method is formulated in a Bayesian framework to extract multiple features from atlas classifiers; in the second step, better registration accuracy is achieved using the label data. The experiments, conducted on a cardiac MRI dataset, obtained Dice scores of 0.92, 0.89, and 0.82 for the left ventricular cavity, right ventricular cavity, and myocardium, respectively.

In Chee and Wu (2018) , the authors used a self-supervised learning model to establish 3D image registration. The goal of this model is to conduct image registration in minimal time in order to classify the internal areas of the brain and to incorporate information from different data types. The experiments, conducted on the axial view of brain scans, achieved a 100x faster run time. Another interesting work in Lv, Yang, Zhang, and Wang (2018) suggested a CNN image registration model for abdominal images, aiming to obtain motion-free abdominal images. The experiments were conducted on ten different patients with a 1.5 T MRI scanner, and the DNN model achieved better registration results compared to other existing models.
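
The core operation shared by learned registration methods is warping a moving image under a predicted transformation so that it aligns with a fixed reference. The sketch below applies a hand-written 2D affine transform with PyTorch's affine_grid/grid_sample; in a learned registration model this matrix (or a dense deformation field) would be the network's output. The matrix values here are arbitrary illustrative numbers, not from any cited method:

```python
import torch
import torch.nn.functional as F

moving = torch.rand(1, 1, 256, 256)      # image to be aligned to a fixed reference

# A 2x3 affine matrix (slight rotation plus translation) in normalized coordinates.
theta = torch.tensor([[[0.98, -0.17, 0.05],
                       [0.17,  0.98, -0.03]]])

grid = F.affine_grid(theta, moving.shape, align_corners=False)
warped = F.grid_sample(moving, grid, align_corners=False)   # resample the moving image
print(warped.shape)   # torch.Size([1, 1, 256, 256])
```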

3.6. Summary

In this section, we examined DL applications for medical image processing: classification, localization, detection, segmentation, and registration. In today's world of advances in DL, health care has changed significantly over the past few years as new opportunities to improve people's lives have emerged. DL for the analysis of biological images has been used in the scientific domain to identify various diseases, and it reaches into numerous applications for medical image processing. Previously, doctors spent a long time examining reports, and most of the work was done manually. DL has begun to improve these time-consuming processes, providing better results, services, and sophisticated medical tools, and is reshaping the healthcare industry by enhancing medical care for people. We summarize the existing DL applications for medical image processing in Table 2 .

Review of DL applications in medical image processing.

4. Deep learning for medical image processing in COVID-19

This section discusses the potential of DL in medical image processing to combat the COVID-19 pandemic through four strategies: outbreak prediction, virus spread tracking, coronavirus diagnosis and treatment, and vaccination and drug discovery, as shown in Fig. 4 .

Fig. 4

Deep learning in medical image processing to fight COVID-19 pandemic.

X-ray is used to diagnose pneumonia and early-stage cancers, whereas CT is a more sophisticated technique that can detect minute changes in the structure of internal organs, using X-rays together with computer vision technology to produce its results. X-ray struggles to support diagnosis related to soft tissues, as it produces 2-D images; a CT scan, in contrast, builds a 3-D view by taking multiple images from various angles of the body organ. Although both X-ray and CT capture images of internal body structures, in a traditional X-ray the structures tend to overlap: for example, the ribs shadow the heart and the lungs, obscuring the structures of major diagnostic interest and preventing a more accurate diagnosis. In a CT scan these overlaps are eliminated, so the internal anatomy appears clearly and provides a better understanding of the health condition.

The initial diagnosis of COVID-19 is usually based on basic symptoms of pneumonia, analysis of the patient's travel history, and exposure to other COVID-19 patients. But chest imaging plays a significant role in understanding the extent of infection and the follow-up requirements. COVID-19 cases typically show patchy or diffuse, asymmetric airspace opacities. In CT images, the indicators have revealed bilateral lung involvement: patients in serious condition in the ICU have shown a consolidative pattern, whereas non-ICU patients have shown a ground-glass pattern in their reports. In contrast, chest images in SARS and MERS have revealed unilateral indicators. However, in 15% of cases the initial X-ray and chest images have appeared normal for patients already infected by the disease. This emphasizes the need for further confirmation through physical tests or the use of DL-based approaches, as discussed in Hosseiny, Kooraki, Gholamrezanezhad, Reddy, and Myers (2020) .

In CT imaging, an X-ray source rotates around the patient and captures projections of a particular section from varied angles. These projections are stored in the computer and combined to reconstruct a new image that is free of overlap. Such images help doctors understand internal structures with enhanced clarity, giving a complete picture of their size, structure, density, texture, and shape. CT is therefore considered a more effective diagnostic technique than plain X-ray. However, neither chest CT nor X-ray can reliably differentiate COVID-19 from other respiratory infections; they mostly detect the presence of an infection, which could also be the consequence of another disease. Moreover, COVID-19 is extremely contagious, and using imaging equipment on multiple patients is hazardous: scanners are sophisticated machines, thorough cleaning after every single patient is impractical, and even with cleaning the probability of the virus persisting on scanner surfaces remains high. Swab tests have therefore proven more prudent for COVID-19 detection and diagnosis than imaging techniques, and many COVID-19 patients with normal chest CT or X-ray findings were later found to be positive. The traditional RT-PCR method can detect the disease accurately but suffers from long detection times and the need for reagents. In a pandemic crisis with an ever-growing number of patients, the dire need is accelerated detection of the disease using minimal resources. To fulfill this need, ML algorithms based on image processing can play a significant role in reducing the overwhelming load on swab testing and related procedures. Sample CT scan and X-ray images of COVID-19 patients are depicted in Fig. 5 .

Fig. 5

Sample X-ray and CT scan images of COVID-19 patients ( Bernheim et al., 2020 , Ozturk et al., 2020 ).
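To make the reconstruction step described above concrete, the short sketch below simulates the rotate-and-project process and then applies filtered back-projection. It is an illustrative example only, assuming scikit-image is installed (the filter_name argument requires a recent version); the standard Shepp-Logan phantom stands in for a real body slice.

# Illustrative sketch (not from the surveyed papers): reconstructing a CT
# slice from simulated X-ray projections with filtered back-projection,
# using scikit-image's radon/iradon transforms.
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

# A standard synthetic "body slice" used to demo tomographic reconstruction.
image = rescale(shepp_logan_phantom(), scale=0.5)

# Simulate the rotating X-ray source: one 1-D projection per angle.
angles = np.linspace(0.0, 180.0, max(image.shape), endpoint=False)
sinogram = radon(image, theta=angles)

# Filtered back-projection combines all angles into a single slice in which
# structures no longer overlap, unlike a single plain radiograph.
reconstruction = iradon(sinogram, theta=angles, filter_name="ramp")
error = np.sqrt(np.mean((reconstruction - image) ** 2))
print(f"RMS reconstruction error: {error:.4f}")

Each individual column of the sinogram is the analogue of a plain radiograph, in which structures overlap; only the combination across all angles yields the overlap-free slice discussed above.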

4.1. Outbreak prediction

The world faced an unprecedented global health crisis due to the outbreak of COVID-19 ( Velavan & Meyer, 2020 ; Wu & McGoogan, 2020 ). Simple epidemiological and statistical models have attracted considerable attention from the authorities with regard to COVID-19 detection and prediction. Governments and other legislative bodies have always relied on outbreak prediction models to guide the implementation of new policies and to assess the efficacy of earlier decisions. In the present crisis, authorities around the world are similarly emphasizing the deployment of outbreak prediction models on COVID-19 data to make well-informed decisions, which would enable them to implement appropriate control measures ( Ardabili et al., 2020 ) and develop protocols for COVID-19 containment, detection, and prediction. Some worldwide COVID-19 outbreak predictions are available at ( OurworldinData, 2020 ; WHO, 2020a ; WORLDOMETER, 2020 ). While the literature contains many attempts to model the COVID-19 outbreak and related concerns, there remains a dire need to strengthen the capabilities of traditional models, predominantly to enhance the robustness of the predicted results.

Digital technologies have contributed immensely to resolving significant health care and related therapeutic concerns. The majority of these new technologies implement big data analytics, the Internet of Things (IoT) with 5G, blockchain technology, and AI (with ML and DL) ( Ting, Carin, Dzau, & Wong, 2020 ). It is an established fact that DL has gained immense momentum within ML, with implementations across all sectors of human life ( Tajbakhsh et al., 2020 ). For example, in data-centric areas such as computer vision, DL methods have proved extremely successful in providing optimal solutions ( Anwar et al., 2018 ; Liu et al., 2018 ).

DL methods have been used extensively in medical image processing and related studies ( Lundervold & Lundervold, 2019 ; Suzuki, 2017 ). Since DL is already a popular choice among researchers in the healthcare sector, it is a natural candidate for modeling the present outbreak as well. A fully connected CNN for diagnosing COVID-19 is depicted in Fig. 6 . This choice is motivated primarily by the dynamic nature of COVID-19 and the variability of its behavior from nation to nation. For example, the study in Ardabili et al. (2020) collected data pertinent to the COVID-19 pandemic in Italy, and its ML and soft computing models predict the possibility of an outbreak, giving the administration an opportunity to plan disease control and related economic arrangements. In DL implementations, the quality and size of the data play a significant role in generating accurate results. The COVID-19 outbreak started in Wuhan, China, and patients presumed to be affected by 2019-nCoV were admitted to a designated hospital in Wuhan as part of an effective disease control strategy ( Wu, Wu, Liu, & Yang, 2020 ). The data of these patients with laboratory-confirmed 2019-nCoV infection were prospectively collected and analyzed by the authors to assist DL- and ML-related research activities ( Huang, Wang, et al., 2020 ).

Fig. 6

A fully connected CNN for COVID-19 diagnosis.
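As a concrete illustration of the kind of network sketched in Fig. 6, the snippet below defines a small convolutional feature extractor followed by fully connected layers in PyTorch. The layer sizes, the 224x224 grayscale input, and the three-class output (COVID-19 / other pneumonia / normal) are illustrative assumptions rather than the exact architecture used in any of the surveyed studies.

# Minimal sketch of a CNN with a fully connected classification head.
import torch
import torch.nn as nn

class ChestXrayCNN(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Fully connected head that maps the pooled feature maps to class scores.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = ChestXrayCNN()
dummy_batch = torch.randn(4, 1, 224, 224)   # 4 grayscale chest images
print(model(dummy_batch).shape)             # torch.Size([4, 3])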

For countries such as Iran, China, Italy, and South Korea, Google Trends data have been used to collect coronavirus-related information ( Ayyoubzadeh, Ayyoubzadeh, Zahedi, Ahmadi, & Kalhori, 2020 ; Husnayain, Fuad, & Su, 2020 ; Strzelecki, 2020 ). Another data source for outbreak prediction involves fitting the cumulative case curve and related measurements from infected patients in Hubei, China, with an exponential curve ( Remuzzi & Remuzzi, 2020 ) to visualize geographical areas with a possible outbreak. DL thus enables the prediction of COVID-19 epidemics on a global scale. The accuracy depends on the factors considered for COVID-19 cases, categorized as: a) confirmed; b) active; c) recovered; d) deceased; e) daily reported; f) population; g) living conditions; and h) environment. DL can also generate data-driven features and manage high-dimensional data, whereas classical ML typically relies on hand-crafted features and handles only low-dimensional data ( Cao et al., 2018 ). DL is therefore well suited to genomic prediction problems such as those arising in COVID-19 research. Similar implementations have been observed for the prediction of other infectious diseases that also spread rapidly and are contagious ( Chae, Kwon, & Lee, 2018 ).
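The exponential curve fitting mentioned above can be sketched in a few lines with SciPy. The cumulative case counts below are invented for illustration; they are not the Hubei data from Remuzzi and Remuzzi (2020).

# Fit a simple exponential growth model to a hypothetical cumulative curve.
import numpy as np
from scipy.optimize import curve_fit

days = np.arange(10)
cumulative_cases = np.array([45, 62, 121, 198, 291, 440, 571, 830, 1287, 1975])

def exponential(t, a, b):
    # Growth model: cases(t) = a * exp(b * t)
    return a * np.exp(b * t)

params, _ = curve_fit(exponential, days, cumulative_cases, p0=(50.0, 0.3))
a, b = params
print(f"fitted growth rate b = {b:.3f}, doubling time ~ {np.log(2) / b:.1f} days")

# Extrapolate a few days ahead to flag a possible outbreak trajectory.
forecast = exponential(np.arange(10, 15), a, b)
print(np.round(forecast).astype(int))

In practice such simple growth models are reliable only in the early phase of an outbreak, which is one reason the text stresses the need for more robust DL-based predictors.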

The work in Liu et al. (2020) deployed an ML methodology to predict outbreak-related events of COVID-19 in various Chinese provinces, using a clustering technique that exploited geo-spatial synchronicity. DL techniques in medical imaging can also support pandemic modeling by interpreting the cumulative number of infected people versus the number of recovered cases worldwide. COVID-19 cases range from asymptomatic infection to severe pneumonia, acute respiratory distress, and multiple organ dysfunction. The findings on COVID-19 patients analyzed at Anhui Medical University in China ( Fu et al., 2020 ) revealed that the patients were initially identified by detecting SARS-CoV-2 RNA with reverse-transcription polymerase chain reaction (RT-PCR); analysis of the genetic sequence identified similarities that confirmed SARS-CoV-2 infection. The Central Hospital of Wuhan examined some patients with conventional pathogen detection alongside chest CT scans ( Zhou et al., 2020 ). The CT reports detected the presence of the COVID-19 virus in the lungs ( Bernheim et al., 2020 ; Zhang, Yang, et al., 2020), with additional claims of diagnosing COVID-19 faster than the current RT-PCR tests. It is important to note that COVID-19 shares extremely similar imaging features with other types of pneumonia, which makes a conclusive diagnosis difficult. To reduce this difficulty, the main objective of this study is to review existing DL algorithms, exploit significant features extracted from CT scan images of COVID-19 through medical image processing, and reduce prediction errors to improve the accuracy of future estimations.
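A hedged sketch of the clustering idea used in Liu et al. (2020) is shown below: regions whose daily case curves rise and fall together end up in the same cluster. The region names and counts are invented placeholders, not data from that study.

# Group regions by the shape of their daily case curves.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

regions = ["Region A", "Region B", "Region C", "Region D"]
daily_cases = np.array([
    [2, 5, 12, 30, 70, 150],   # fast early growth
    [1, 4, 10, 28, 65, 140],   # similar trajectory -> same cluster
    [0, 1,  2,  3,  5,   8],   # slow, flat trajectory
    [0, 0,  1,  2,  4,   7],
])

features = StandardScaler().fit_transform(daily_cases)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
for region, label in zip(regions, labels):
    print(region, "-> cluster", label)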

4.2. Virus spread tracking

In December 2019, as people awaited the New Year celebrations of 2020, a few cases of pneumonia caused by a novel coronavirus (2019-nCoV) ( Wu, Leung, & Leung, 2020 ) were reported in Wuhan, China. The work in Rothan and Byrareddy (2020) revealed that a significant number of people were infected at the wet animal market in Wuhan city, considered the zoonotic origin of COVID-19. Eventually, cases spread across China and then worldwide, giving the disease the status of a global outbreak ( Surveillances, 2020 ). Attempts have been made to identify a host reservoir or intermediate carrier that initiated the spread of COVID-19 from animals to humans ( Cascella, Rajnik, Cuomo, Dulebohn, & Di Napoli, 2020 ). The authors in Lu et al. (2020) considered two species of snakes as a possible reservoir of COVID-19, whereas another study ( Zhang, Zheng, et al., 2020 ) rejected this possibility. The work in Xu et al. (2020) identified pangolins as a latent intermediate host of the coronavirus. A study in Bassetti, Vena, and Giacobbe (2020) demonstrated that COVID-19 genomic sequence analysis showed similarities with two severe bat-derived acute respiratory syndrome (SARS)-like coronaviruses. Similarly, the study in Malik et al. (2020) claims that bats, civet cats, and pangolins are potential sources of SARS-CoV-2 infection in humans.

Various applications based on computer vision, ML, and image processing have been developed to monitor and control the spread of COVID-19. Computer vision, image processing, and ML-based devices are being used for inspection, identification, gauging, and guidance tasks related to COVID-19. For example, protective gear, respirators, ventilators, and automatic sanitizers are being used to treat patients and protect healthcare professionals while also ensuring virus containment. Thermal screening is used to measure the temperature of individuals, as elevated body temperature is a primary symptom. Social distancing is being strictly enforced to ensure a safe distance from affected patients, and vision-guided robots are being used in this regard. The Australian government has engaged Draganfly, an unmanned aerial vehicle (UAV) company, for immediate deployment of drones to detect COVID-19 infections among people in remote locations. In collaboration with the firm DarwinAI, the University of Waterloo has developed a deep CNN, COVID-Net, for the detection of COVID-19 cases from chest radiography images ( VISION, 2020 ). Image processing and computer vision technologies are also being used for the mass production of healthcare products and gear used by all stakeholders in hospitals. These devices help avoid spreading the virus by minimizing human contact, and the technologies have proven their role in diagnosis and in reducing exposure to airborne virus particles that could otherwise infect a large number of people.

Virus spread tracking systems rely on data science that emphasizes informing responses to queries or issues relevant to the outbreak situation ( Ji, Wang, Zhao, Zai, & Li, 2020 ). The key challenge of this evidence-based approach is to execute the model, involving data collection, analysis, and reporting, in real time. A report from the Ministry of Health in New Zealand ( Kvalsvig, Barnard, Gray, Wilson, & Baker, 2020 ) proposed a detailed analysis focused on the location of a typical case, time, and other epidemiological parameters. Using DL and ML approaches, it is possible to understand the impact and spread of COVID-19 from such surveillance data based on gender, geographical region, travel history, age, and daily updates.

The work in Pourghasemi et al. (2020) , Saha, Gupta, and Patil (2020) used geographic information system (GIS) tools for tracking infectious diseases. The authors in Boulos and Geraghty (2020) combined a GIS-based ML algorithm with a Support Vector Machine (SVM) for risk assessment of COVID-19 outbreak cases in Fars Province, Iran. As far as data are concerned, the internet is a very useful source of information about the COVID-19 virus. COVID-19-related information, such as the number of confirmed cases, death tolls, and recoveries, is also available on the Johns Hopkins University dashboard ( C. for Systems Science J. H. U. Engineering (CSSE), 2020 ). Later, the WHO also launched a COVID-19 dashboard ( WHO, 2019 ), which operates on ArcGIS. HealthMap ( Hossain & Househ, 2016 ) is another dashboard that aggregates information from various sources, and the Aarogya Setu mobile app ( G.o.I. NIC & MEITY, 2020 ) provides official data on COVID-19 cases in India.
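A minimal sketch of the GIS-plus-SVM risk assessment idea attributed to Boulos and Geraghty (2020) is given below. The district-level features (population density, distance to the nearest confirmed cluster, mobility index) and labels are hypothetical and serve only to show the workflow.

# Train an SVM on GIS-derived district features and score a new district.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# columns: population density, km to nearest cluster, mobility index
X = np.array([
    [12000, 0.5, 0.9], [9500, 1.2, 0.8], [800, 25.0, 0.2],
    [650, 30.0, 0.3], [11000, 0.8, 0.7], [400, 40.0, 0.1],
])
y = np.array([1, 1, 0, 0, 1, 0])   # 1 = high outbreak risk

risk_model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
risk_model.fit(X, y)

new_district = np.array([[7000, 2.0, 0.6]])
print("high-risk probability:", risk_model.predict_proba(new_district)[0, 1])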

DL applications have been combined with medical image processing approaches for the development and validation of a model at Renmin Hospital of Wuhan University in China ( Chen, Wu, et al., 2020 ). The study retrospectively collected 46,096 de-identified images from 106 hospitalized patients, who fell into two categories: COVID-19-infected patients and patients with other diseases. Of these, 51 patients were diagnosed with COVID-19. A team of medical practitioners evaluated and compared the CT scan images of 21 COVID-19 pneumonia cases with the model developed at Renmin Hospital ( Chen, Wu, et al., 2020 ). The DL-based system was designed to help doctors detect COVID-19-related pneumonia early enough to control the epidemic.

4.3. Coronavirus diagnosis and treatment

Coronavirus is not a single virus but a family of related viruses. Once a patient is infected with a coronavirus, the symptoms can resemble a common cold or a severe respiratory syndrome; Severe Acute Respiratory Syndrome (SARS-CoV) and Middle East Respiratory Syndrome (MERS-CoV) are examples of such severe infections ( Wang, Wang, Ye, & Liu, 2020 ) with symptoms similar to COVID-19. Numerous people around the globe have been affected, and many countries have declared national lockdowns with millions of citizens strictly quarantined ( Das, Ghosh, Sen, & Mukhopadhyay, 2020 ; Hopman, Allegranzi, & Mehtar, 2020 ; Rahman, 2020 ; Singh & Adhikari, 2020 ). In such a crisis, outbreak prediction models and virus spread tracking tools that involve DL and medical image processing have huge potential for COVID-19 diagnosis and treatment, supporting doctors in initial screening and enabling rapid detection for accurate diagnosis of the disease.

Technology plays a very important role in applying DL and medical image processing to combat COVID-19, ensuring faster and more accurate patient diagnosis. The authors in Li et al. (2020) discussed the potential role of AI (with ML and DL) in the diagnosis of COVID-19. Initially, tests were conducted by taking throat swab and nasopharyngeal samples from a patient, from which viral RNA is extracted using specific chemical processes. The RNA is mixed with a reverse transcriptase enzyme, which converts it into double-stranded DNA (the reverse-transcription step of RT-PCR). The DNA is then amplified using short sequences called “primers” and labeled with a fluorescent dye; when the fluorescent signal indicates the presence of viral DNA, the patient's test is reported as COVID-19 positive ( Molteni & Rogers, 2020 ).

Medical image processing techniques such as computed tomography have always helped in the fast and accurate diagnosis of diseases, and COVID-19 is no different. The sensitivity of CT-based COVID-19 diagnosis, at roughly 80%-90%, has been observed to be significantly better than that of RT-PCR, although its specificity is on the low side at about 60%-70% ( Ai et al., 2020 ; Bai et al., 2020 ). DL and medical image processing play an important role in differentiating between COVID-19-infected and non-infected patients, since COVID-19 symptoms closely match those of regular pneumonia. Hospitals in Spain consider this methodology a default component of their diagnostic pathway, although other sources have identified X-ray as an alternative examination ( Chen, Zhou, et al., 2020 ).
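The sensitivity and specificity figures quoted above can be made concrete with a small worked example. The counts below are invented so that they roughly reproduce the reported ranges; they are not the data of Ai et al. (2020).

# Compute sensitivity/specificity of a hypothetical CT-based reading.
from sklearn.metrics import confusion_matrix

y_true = [1] * 50 + [0] * 50                 # 1 = RT-PCR-confirmed COVID-19
y_pred = [1] * 44 + [0] * 6 + [1] * 17 + [0] * 33   # CT-based reading

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # proportion of infected patients flagged by CT
specificity = tn / (tn + fp)   # proportion of uninfected patients correctly cleared
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")  # 0.88, 0.66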

Interestingly, the work in Peng, Wang, Zhang, and C.C.C.U.S. Group (2020) , Poggiali et al. (2020) presented a comparison between ultrasound and CT findings, suggesting that ultrasound is a more reliable approach for pneumonia detection than chest X-ray. Other work in Qin, Liu, Yen, and Lan (2020) , Zou and Zhu (2020) noted that PET-CT can provide additional information in the diagnosis of COVID-19 candidates. Recently, the use of CT imagery with AI-based detection has helped in diagnosing COVID-19 cases with distinct manifestations ( Butt, Gill, Chun, & Babu, 2020 ). The work in Jin et al. (2020) proposed a detailed guidance report with useful tools to support COVID-19 diagnosis and care. The guideline covers the methodology, epidemiological characteristics, population prevention, diagnosis, and treatment of COVID-19.

Stanford University has provided data, models, tools, research studies, and funding opportunities for COVID-19 research. Such research efforts, combined with COVID-19 datasets, have helped build comprehensive medical image processing and DL models for identification, virus diagnosis, treatment, and even potential vaccine development. Some available medical imaging datasets worth mentioning are: (a) the Società Italiana di Radiologia Medica collection; (b) the IEEE8023 chest X-ray and CT dataset on COVID-19; (c) the CNN-based dataset from DarwinAI and the University of Waterloo; (d) the Centre for Mathematical Imaging in Healthcare, which provides AI support for COVID-19 diagnosis; and (e) the RadiologyAi Consortium (CT scans of COVID-19 patients).

DL-based screening models have made it possible to generate consistent and accurate outcomes by digitizing and standardizing image data and integrating various medical image processing techniques. It has also been observed that RT-PCR has a relatively lower positive detection rate for COVID-19 than radiographic patterns on chest CT scans in the initial stages of the disease. Several CNN models have been explored by the authors in Butt et al. (2020) to distinguish CT samples with COVID-19, influenza viral pneumonia, or no infection. The authors also worked on existing 2D and 3D DL models, integrating clinical understanding, and reported AUC, sensitivity, and accuracy for coronavirus vs. non-coronavirus cases on the basis of thoracic CT findings.

4.4. Vaccine discovery and drug research

The WHO's Department of Research and Development has embarked on promoting the development of diagnostics, vaccines, and therapies for the novel coronavirus COVID-19 ( WHO, 2020b ). COVID-19 infections require immediate identification to increase patients' chances of recovery and to allow treatment to start as early as possible. Diagnosis is thus a valuable step for understanding the number of people affected by COVID-19 and for identifying individuals who are resistant and potentially “protected” from infection. Developing an efficient and stable novel vaccine against this highly infectious disease is essential, and medical image processing and DL have the potential to contribute to the discovery of vaccines and related drugs.

ML/DL algorithms can be trained on massive datasets of chemical compounds, some of which enhance human immunity and some of which do not. In this way, ML/DL algorithms can quickly learn the patterns of compounds that help build immunity to a virus. Researchers can then use these ML-based models to test whether a newly designed combination of compounds in a vaccine could act against the virus. ML/DL algorithms thus play a vital role in the process of discovering vaccines and drugs ( Kannan, Subbaram, Ali, & Kannan, 2020 ).
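A hedged sketch of this screening loop is shown below: a classifier is trained on labelled compounds and then used to rank new candidates. The molecular descriptors and labels are generic placeholders, not a real immunogenicity or antiviral dataset.

# Rank candidate compounds with a classifier trained on labelled examples.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# rows = known compounds, columns = simple numeric descriptors
# (e.g. molecular weight, logP, polar surface area - placeholders here)
X_known = np.array([
    [320.4, 1.2,  85.0], [410.9, 3.5,  60.2], [150.2, 0.3, 110.5],
    [505.6, 4.1,  40.8], [280.7, 2.2,  95.1], [475.3, 3.9,  55.0],
])
y_known = np.array([1, 0, 1, 0, 1, 0])   # 1 = elicited the desired response

screen = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_known, y_known)

candidates = np.array([[300.0, 1.0, 90.0], [490.0, 4.0, 50.0]])
print("predicted activity probability:", screen.predict_proba(candidates)[:, 1])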

Viral vaccines essentially consist of the same antigens, or parts of the antigens, that cause the disease. When introduced into the body, vaccines activate the immune system to generate specific antibodies that detect and neutralize the virus. Viruses typically multiply quickly, and their antigens are prone to mutations that can prevent recognition by these antibodies. A vaccine production effort to classify T-cell epitopes for the SARS-CoV-2 virus is discussed in Qiao, Tran, Shan, Ghodsi, and Li (2020) . The work in Tayebi (2020) used CNN algorithms as a DL approach for the prediction of cross-immunoreactivity (CR) in heterogeneous epitope vaccines.

RADLogics ( RADLogics, 2020 ) developed AI-based systems and DL tools for use in hospitals. These tools operate on medical imaging such as chest CT or X-ray scans to screen mild cases, triage new infections, and monitor advanced disease in order to detect COVID-19 infections. DL applications based on medical imaging also help identify drug-development approaches to combat COVID-19; such breakthroughs could set the stage for vaccines or an effective antiviral. The work in Zhang, Saravanan, et al. (2020) suggests the use of the Deeper-Feature CNN (DFCNN) to identify potential drugs for 2019-nCoV. The study in Beck, Shin, Choi, Park, and Kang (2020) proposes a drug-target interaction DL model for the prediction of commercial antiviral drugs against COVID-19. In Ong, Wong, Huffman, and He (2020) , a reverse vaccinology and ML-based approach for developing a vaccine against the COVID-19 coronavirus is presented: viral proteins (spike, nucleocapsid, and membrane) previously tested for the development of SARS and MERS vaccines are analyzed, and then the reverse vaccinology tool Vaxign and the ML application Vaxign-ML are used to predict candidates for COVID-19 vaccines. On 5 May 2020, the Israel Institute for Biological Research (IIBR) claimed to have developed a monoclonal neutralizing antibody that can neutralize the coronavirus within the carrier's body, which is a ray of hope in COVID-19 vaccine discovery ( ISRAEL, 2020 ).

4.5. Limitations of DL based image processing in COVID-19

Although millions of patients have been infected by the disease, large publicly available datasets are still lacking, particularly ones that capture infections missed by testing. The accuracy of any DL model depends on the availability of data, and in the case of COVID-19 this is a major concern: a model trained on insufficient data produces questionable results. Access to CT and X-ray images covering wider demographics is also required but difficult to obtain from healthcare organizations. Moreover, policy decisions based on DL implementations require access to behavioral attributes such as work, education, and family background; this information is necessary to understand how some people remain asymptomatic even after being infected by the virus. These limitations stand in the way of optimal scientific decisions, but as presented in this paper, various studies are being conducted relentlessly to eliminate such challenges and bring more clarity to COVID-19 diagnosis ( Vaid, Kalantar, & Bhandari, 2020 ).

4.6. Summary

After critically reviewing the evolving literature, it is evident that medical image processing and DL play a significant role in fighting the COVID-19 pandemic through a range of promising applications, including outbreak prediction, virus spread tracking, coronavirus diagnosis and treatment, and vaccine and drug discovery. The comprehensive survey and selected references are summarized in Table 3 .

Summary of deep learning for medical image processing in COVID-19.

The manual detection of COVID-19 or non-COVID-19 pneumonia cases is a demanding and time-consuming process, as these cases are increasing exponentially. The application of DL in medical image analysis effectively supports disease prediction from huge datasets obtained from available sources such as health organizations (e.g., the WHO) and healthcare institutes (e.g., the China National Health Commission and the Indian Council of Medical Research). DL applications focusing on medical imaging have emerged as a promising solution: they are used to process and analyze medical imaging data to help radiologists and doctors enhance diagnostic accuracy, and DL applied to medical images even has the potential to identify possible targets for an appropriate COVID-19 vaccine. Multiple studies have accordingly emphasized automated COVID-19 identification using DL systems on medical imaging datasets.

5. Deep learning for COVID-19 medical image processing: use cases

This section presents some use cases of DL for COVID-19 medical image processing.

5.1. Use case 1: automated detection and monitoring of COVID-19 patients in China using medical image-based DL techniques

It goes without saying that COVID-19 cases spread at an alarming rate, with dreadful effects on normal human life, general public health, and the global economy. At this stage, it is crucial to develop auxiliary diagnostic systems that detect the disease in minimal time to ensure containment. The application of AI-based DL techniques integrated with radiology imaging was the immediate need addressed in a study conducted in China in collaboration with the USA. The study revealed that rapidly developed AI-based analysis of computed tomography images can detect COVID-19-positive cases with high accuracy. The datasets used in the study were collected from China as well as other countries with significant numbers of COVID-19 cases. The study implemented a system that used 2D and 3D DL models, integrated them with existing AI models, and incorporated full clinical knowledge. Multiple experiments were conducted to detect significant COVID-19-related CT features and thereby detect the disease. The progression of the disease in each patient was monitored using a 3D volume view, which generated a COVID-19 score. The accuracy of the results in classifying positive and negative COVID-19 cases was quite promising, with an AUC of more than 99 percent ( Gozes et al., 2020 ).

Another significant study, in the radiology department of Zhongnan Hospital, highlights the use of AI-based DL software to detect visual signs of pneumonia in CT scans of the lungs of COVID-19 patients. The software has been immensely beneficial in helping overworked healthcare professionals screen potential COVID-19 patients and forward them for further medical tests, saving a lot of time. Differentiating between ordinary pneumonia and COVID-19 pneumonia is very difficult; the software aided in identifying typical or partial signs in more than 35,000 cases across 34 hospitals in China ( WIRED, 2020 ).

5.2. Use case 2: automated detection and monitoring of COVID-19 patients in Canada using medical image-based DL techniques

Researchers at the University of Waterloo have designed an AI- and DL-assisted X-ray screening method named COVID-Net to augment the polymerase chain reaction (PCR) swab tests conducted on COVID-19 patients. This augmentation with computational techniques has enhanced accuracy and reduced screening time, contributing effectively to the containment of the disease. The screening tool was developed using a dataset of almost 6000 chest radiography images collected from about 3000 patients ( Ozturk et al., 2020 ; Wang & Wong, 2020 ).

The developers of COVID-Net have taken the next step with COVID-RiskNet, which aims to assess the risk associated with the severity of the disease in an affected patient. The tool also suggests a plausible direction of treatment and separates patients who are well enough to self-isolate from severe cases needing inpatient medical care. In essence, it summarizes the severity status and helps prioritize the line of treatment accordingly.

The Toronto-based startup BlueDot developed a platform using AI, ML, and big data technologies to track and predict the outbreak of any infectious disease. Such a platform can alert private-sector and government policymakers to implement their mitigation plans at the earliest opportunity, potentially saving millions of lives. The platform successfully alerted the authorities to an unusual cluster of pneumonia patients forming around a local market in Wuhan, China; this was the first reported and recognized information on COVID-19, and the WHO made its official announcement almost nine days later. The software gathers data on 150 different types of diseases and syndromes around the world and updates its database every 15 minutes, round the clock. The repository includes data from organizations such as the CDC and the WHO together with external healthcare sources, traveler histories, human and animal population data, climate data, and local news reporting amounting to roughly one million articles every day. Analysts manually classify the collected data, develop a specific taxonomy for efficient keyword searching, and then apply ML and natural language processing to train the model. The results yield only a small number of highly filtered cases, which are then analyzed by human experts for opinion and necessary action ( DIGINOMICA, 2020 ).
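The keyword-taxonomy-plus-ML step described above can be sketched with a tiny text classifier. The news snippets and labels are invented; BlueDot's actual pipeline is proprietary and far larger.

# Triage news items into "potential outbreak signal" vs. "irrelevant".
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

articles = [
    "cluster of unexplained pneumonia cases reported near seafood market",
    "hospital admissions rise sharply with severe respiratory symptoms",
    "city council approves new budget for road maintenance",
    "local football team wins the regional championship",
]
labels = [1, 1, 0, 0]   # 1 = potential outbreak signal

triage = make_pipeline(TfidfVectorizer(), LogisticRegression())
triage.fit(articles, labels)

new_item = ["officials investigate a spike in atypical pneumonia admissions"]
print("outbreak-signal probability:", triage.predict_proba(new_item)[0, 1])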

Apart from the above-mentioned cases, DL models and genomic sequences of the COVID-19 virus extracted from patients are being used to detect interaction effects within the viral genome. Neural network approaches are trained on large numbers of sequences from varied geographical locations to identify mutations appearing in the RNA based on the surrounding nucleotides in the genomic sequence. This profiling of viral evolution could help identify mid-level interactions and plausible viral genomes and mutations for specific proteins, supporting faster treatment responses to severe disease ( Mila, 2020 ).
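A common preprocessing step for such sequence models is one-hot encoding of the RNA, after which mutated positions can be flagged against a reference, as in the toy sketch below (the fragments are invented, not real SARS-CoV-2 genomes).

# One-hot encode RNA fragments and flag positions that differ from a reference.
import numpy as np

BASES = "ACGU"

def one_hot(seq: str) -> np.ndarray:
    # Each row is a position; each column corresponds to A, C, G, U.
    encoding = np.zeros((len(seq), len(BASES)))
    for i, base in enumerate(seq):
        encoding[i, BASES.index(base)] = 1.0
    return encoding

reference = "AUGGCUACGUUA"
sample    = "AUGGCAACGUCA"   # two substitutions relative to the reference

ref_encoded, sample_encoded = one_hot(reference), one_hot(sample)
mutated_positions = np.where((ref_encoded != sample_encoded).any(axis=1))[0]
print("mutated positions:", mutated_positions.tolist())   # [5, 10]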

5.3. Use case 3: automated detection and monitoring of COVID-19 patients in South Korea using medical image-based DL techniques

The Republic of Korea has been quite successful in containing COVID-19 without implementing a complete lockdown of its economy, although public places where large numbers of citizens gather have been closed. The Division of Risk Assessment and International Cooperation at the Korean Disease Control and Prevention Center has placed immense emphasis on adopting advanced ICT techniques to combat the spread of COVID-19 and to detect it ( ITUNEWS, 2020 ). The company Seegene developed a COVID-19 detection kit at a very early stage using AI techniques, which enabled widespread testing with a primary focus on high-risk groups such as people with underlying diseases, elderly citizens sharing homes with multiple individuals in crowded city locations, and patients returning from international travel ( Seegene, 2020 ).

The company VUNO developed an AI-based decision-support tool for chest X-ray images that uses an algorithm capable of detecting abnormalities; the tool helped classify and examine intensive care patients from their X-ray images in less than three seconds ( ITN, 2020 ). JLK Inspection developed a medical platform called AiHub for the diagnosis of COVID-19 using DL, AI, and big data technology to examine anomalies in the lungs within seconds. The same company has developed a handheld chest X-ray camera that scans the chest in seconds and presents a heatmap visualization highlighting abnormal lesions ( LABPULSE, 2020 ).

6. Lessons, challenges, and future directives

Although DL has gained immense momentum and popularity and has generated impressive results on simple 2D images, there are limitations to achieving a similar level of performance in medical image processing. Research in this regard is still in progress, and some of the lessons learned are listed below:

  • One of the most inhibiting factors is the unavailability of large datasets with high-quality images for training. Data synthesis is a possible solution, allowing data collected from varied sources to be integrated.
  • The majority of state-of-the-art DL models are trained on 2D images, whereas CT and MRI are usually 3D, which adds a further dimension to the problem. Since conventional DL models are not adjusted to this, experience plays a major role when DL models are applied to such images.
  • The non-standardized process of collecting image data is one of the major issues in medical image processing. As data variety increases, larger datasets are needed to ensure that the DL algorithm generates robust solutions. The best way to address this issue is transfer learning, which makes pre-processing efficient and mitigates scanner and acquisition differences.

6.1. Challenges and issues

The challenges and issues pertaining to DL implementations for medical image processing for controlling the COVID-19 pandemic in smart cities are listed below:

  • Privacy – The availability of high-quality COVID-19 images and larger datasets is a major challenge given the privacy of patient data.
  • Variability in outbreak pattern – The outbreak has followed complex patterns with extreme variation in behavior across countries, which makes the reliability of disease prediction an additional challenge.
  • Regulation and transparency – Countries across the globe have adopted strict regulatory protocols for sharing COVID-19 data; one of the major protocols states that the minimum amount of data and specimens should be collected from patients in the minimum amount of time, which makes analysis more difficult.
  • Variability in the testing process across hospitals is also an important concern, leading to non-uniformity in data labels.
  • The symptoms of ordinary pneumonia and COVID-19 pneumonia are very similar. Identifying an appropriate DL technique to exclusively and specifically detect COVID-19 with optimal accuracy remains a visible challenge.

Moreover, the coronavirus genome has been completely sequenced based on data collected from thousands of patients suffering from the disease across the globe. This genome sequence has been extremely beneficial, especially because the virus accumulates mutations as it spreads. Present diagnostic tests identify specific genes of the virus, and test accuracy depends on the targeted regions of the genome. The effect of mutation on these diagnostic tests is alarming: there is a real possibility of generating a “false negative” for a patient who is actually suffering from the disease, because the tests scrutinize coronavirus genes that can vary as the disease spreads from one human to another ( Bos, Heijnen, Luytjes, & Spaan, 1995 ).
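The false-negative mechanism can be illustrated with a toy primer-matching check: a single substitution in the primer-binding region is enough to make a naive exact-match assay miss an infected sample. The sequences below are invented examples, not real assay primers.

# Toy illustration: an exact-match "assay" misses a mutated target.
primer = "ACGTTGCA"
reference_target = "GGGACGTTGCATTT"   # primer binds -> positive
mutated_target   = "GGGACGTAGCATTT"   # single substitution in the binding site

def detects(target: str, primer: str) -> bool:
    return primer in target

print(detects(reference_target, primer))   # True  -> test positive
print(detects(mutated_target, primer))     # False -> false negative despite infection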

6.2. Future directives

The future directions in COVID-19 research lie in connecting hierarchical features of COVID-19 image datasets with other clinical information to conduct multi-omics modeling for enhanced prediction of the disease. In addition, because the available datasets are relatively small and insufficient for yielding robust predictions, transfer learning is a promising research direction that can detect anomalies in smaller datasets and yield remarkable results. In COVID-19 diagnosis, detecting the disease as early as possible is the major necessity, and ML algorithms help expedite the process using limited resources. Transfer learning supports the same objective and is an apt technique for COVID-19 detection, where time-to-delivery and availability of training data are primary concerns. The technique takes pre-trained models from academia, research institutes, or open-source communities and reuses them to perform new ML tasks, thereby saving time and resources; the learned parameters or knowledge are transferred to new models as engineered features. DL yields good results when large volumes of data are available, but with transfer learning similar performance can be achieved with a limited labeled dataset ( Zhuang et al., 2020 ). In the COVID-19 pandemic, the availability of datasets, let alone labeled ones, is an obvious challenge, and hence transfer learning has immense potential for COVID-19 detection. As an example, in the study by Rehman, Naz, Khan, Zaib, and Razzak (2020) , X-ray and CT images of COVID-19 cases were collected from a public GitHub repository and parsed to select the COVID-19-positive samples. In addition, bacterial pneumonia, viral pneumonia, and healthy image datasets were collected from the Kaggle repository. Combining these sources produced a database of 200 X-ray and CT images of COVID-19 cases, 200 cases of viral pneumonia, 200 cases of bacterial pneumonia, and 200 healthy subjects. When this dataset was fed into a pre-trained CNN architecture, it delivered enhanced accuracy in COVID-19 detection. A similar approach was taken in Apostolopoulos and Mpesiana (2020) , where a dataset of 1427 images containing 224 COVID-19 cases was combined with 700 images of common bacterial pneumonia and 504 images of normal patients to form a data repository; this dataset, fed into a DL model, detected COVID-19 with better accuracy, sensitivity, and specificity than traditional approaches. Thus, the creation of a centralized repository of COVID-19 patient data is enormously important for developing predictive, diagnostic, and therapeutic strategies to combat the COVID-19 crisis and similar future pandemics in smart healthy cities.
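A minimal transfer-learning sketch along these lines is shown below in PyTorch: an ImageNet-pretrained ResNet-18 is frozen and only a new classification head is trained. The four-class setup mirrors the dataset composition described above, but the weights API assumes a recent torchvision release, and the dummy batch stands in for a real DataLoader over labelled X-ray/CT images.

# Reuse a pretrained backbone and retrain only a small classification head.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 4   # COVID-19, viral pneumonia, bacterial pneumonia, normal
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor; only the new head will be trained.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print("training loss on dummy batch:", loss.item())

Freezing the backbone is what makes this approach viable with only a few hundred labelled images per class, which matches the dataset sizes reported in the cited studies.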

As reviewed in the existing studies, various DL implementations have been applied to different datasets using different evaluation criteria, with radiology imaging datasets being the most prevalent. However, translating these implementations into real-world medical practice is a major concern, which dictates the immediate need for benchmarking frameworks for the evaluation and comparison of existing methodologies. Such frameworks should use comparable computational hardware infrastructures, similar patient records, data pre-processing methods, and evaluation criteria across the various AI methods, ensuring data interpretability and transparency.

The inhabitants of the present-day world are far more fortunate than the generation that witnessed the Spanish flu pandemic of 1918, as we are equipped with advanced technology. AI has been used extensively in all spheres of human life, and since it has touched all sectors, the same technology should be fully explored to combat the COVID-19 pandemic as well. For example, the future of AI lies in the development of autonomous robots and machines for disinfection, healthcare work in hospitals, delivering medications and essentials to patients, and providing personal care for them. AI can also be integrated with natural language processing (NLP) technologies to develop chatbots that communicate remotely with patients and provide consultations during this crisis period.

Beyond the aforementioned benefits, AI can play a significant role in eradicating fake news spread on social media platforms. AI can filter information related to government policies, pandemic prevention protocols, and the science behind virus spread and containment, ensuring that only authentic information reaches the public and reducing the chances of unnecessary panic. At this juncture of the seemingly never-ending battle with the COVID-19 crisis, the main ray of hope is the development of a novel vaccine against the virus. AI has immense potential in this regard for investigating the genetic and protein structure of the virus to accelerate the process of drug discovery. Although this process is time-consuming and economically expensive using traditional methods, with AI and DL techniques it may soon be possible to identify the most promising drug candidates from huge datasets of hundreds of millions of molecules. This is certainly one of the most interesting and necessary future directions of research for winning the battle against the COVID-19 pandemic.

7. Conclusion

Among the approaches to combat the COVID-19 pandemic, DL has been considered a powerful technique for providing intelligent solutions. Motivated by the many applications of DL to medical image processing in the last decade, we have summarized recent efforts addressing the COVID-19 outbreak for smart, healthy cities. DL has been employed to support various responses to the COVID-19 disruption, including outbreak prediction, virus spread tracking, diagnosis and treatment, and vaccine discovery and drug research. Despite promising results, the successful use of DL to process COVID-19 medical images still requires considerable time and effort as well as close cooperation between government, industry, and academia. We have also listed a number of challenges and issues associated with existing studies, such as data privacy, variability of outbreak patterns, regulation and transparency, and the distinction between COVID-19 and non-COVID-19 symptoms. Finally, we have discussed a number of future directions for DL applications in COVID-19 medical image processing. We believe that the COVID-19 outbreak will come to an end sooner with help from DL and image processing techniques as well as many other technologies such as biomedicine, data science, and mobile communications. We also hope that our work serves as a useful reference and can drive many novel studies on DL and medical image processing in the battle against the COVID-19 outbreak.

Declaration of Competing Interest

The authors report no declarations of interest.

  • Abràmoff M.D., Lou Y., Erginay A., Clarida W., Amelon R., Folk J.C. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Investigative Ophthalmology & Visual Science. 2016; 57 (13):5200–5206.
  • Ai T., Yang Z., Hou H., Zhan C., Chen C., Lv W. Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases. Radiology. 2020:200642.
  • Alazab M., Khan S., Krishnan S.S.R., Pham Q.-V., Reddy M.P.K., Gadekallu T.R. A multidirectional LSTM model for predicting the stability of a smart grid. IEEE Access. 2020; 8 :85454–85463.
  • Albarqouni S., Baur C., Achilles F., Belagiannis V., Demirci S., Navab N. AggNet: Deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Transactions on Medical Imaging. 2016; 35 (5):1313–1321.
  • Alom M.Z., Yakopcic C., Nasrin M.S., Taha T.M., Asari V.K. Breast cancer classification from histopathological images with inception recurrent residual convolutional neural network. Journal of Digital Imaging. 2019; 32 (4):605–617.
  • Andersen K.G., Rambaut A., Lipkin W.I., Holmes E.C., Garry R.F. The proximal origin of SARS-CoV-2. Nature Medicine. 2020; 26 (4):450–452.
  • Anwar S.M., Majid M., Qayyum A., Awais M., Alnowami M., Khan M.K. Medical image analysis using convolutional neural networks: A review. Journal of Medical Systems. 2018; 42 (11):226.
  • Apostolopoulos I.D., Mpesiana T.A. COVID-19: Automatic detection from X-ray images utilizing transfer learning with convolutional neural networks. Physical and Engineering Sciences in Medicine. 2020:1–6.
  • Ardabili S.F., Mosavi A., Ghamisi P., Ferdinand F., Varkonyi-Koczy A.R., Reuter U. 2020. COVID-19 outbreak prediction with machine learning. Available at SSRN 3580188.
  • Ayyoubzadeh S.M., Ayyoubzadeh S.M., Zahedi H., Ahmadi M., Kalhori S.R.N. Predicting COVID-19 incidence through analysis of Google trends data in Iran: Data mining and deep learning pilot study. JMIR Public Health and Surveillance. 2020; 6 (2):e18828.
  • Bai W., Shi W., O’regan D.P., Tong T., Wang H., Jamil-Copley S. A probabilistic patch-based label fusion model for multi-atlas segmentation with registration refinement: Application to cardiac MR images. IEEE Transactions on Medical Imaging. 2013; 32 (7):1302–1315.
  • Bai H.X., Hsieh B., Xiong Z., Halsey K., Choi J.W., Tran T.M.L. Performance of radiologists in differentiating COVID-19 from viral pneumonia on chest CT. Radiology. 2020:200823.
  • Bassetti M., Vena A., Giacobbe D.R. The novel Chinese coronavirus (2019-nCoV) infections: Challenges for fighting the storm. European Journal of Clinical Investigation. 2020; 50 (3):e13209.
  • Baumgartner C.F., Kamnitsas K., Matthew J., Smith S., Kainz B., Rueckert D. Real-time standard scan plane detection and localisation in fetal ultrasound using fully convolutional neural networks. International conference on medical image computing and computer-assisted intervention. 2016:203–211.
  • Beck B.R., Shin B., Choi Y., Park S., Kang K. 2020. Predicting commercially available antiviral drugs that may act on the novel coronavirus (2019-nCoV), Wuhan, China through a drug-target interaction deep learning model. bioRxiv.
  • Bernheim A., Mei X., Huang M., Yang Y., Fayad Z.A., Zhang N. Chest CT findings in coronavirus disease-19 (COVID-19): Relationship to duration of infection. Radiology. 2020:200463.
  • Bhattacharya S., Somayaji S.R.K., Gadekallu T.R., Alazab M., Maddikunta P.K.R. Internet Technology Letters; 2020. A review on deep learning for future smart cities; p. e187.
  • Bos E.C., Heijnen L., Luytjes W., Spaan W.J. Mutational analysis of the murine coronavirus spike protein: Effect on cell-to-cell fusion. Virology. 1995; 214 (2):453–463.
  • Boulos M.N.K., Geraghty E.M. 2020. Geographical tracking and mapping of coronavirus disease COVID-19/severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic and associated events around the world: How 21st century GIS technologies are supporting the global fight against outbreaks and epidemics.
  • Brinati D., Campagner A., Ferrari D., Locatelli M., Banfi G., Cabitza F. 2020. Detection of COVID-19 infection from routine blood exams with machine learning: A feasibility study. medRxiv.
  • Butt C., Gill J., Chun D., Babu B.A. Deep learning system to screen coronavirus disease 2019 pneumonia. Applied Intelligence. 2020:1.
  • Cao C., Liu F., Tan H., Song D., Shu W., Li W. Deep learning and its applications in biomedicine. Genomics, Proteomics & Bioinformatics. 2018; 16 (1):17–32.
  • Cascella M., Rajnik M., Cuomo A., Dulebohn S.C., Di Napoli R. StatPearls [internet]. StatPearls Publishing; 2020. Features evaluation and treatment coronavirus (COVID-19).
  • Chae S., Kwon S., Lee D. Predicting infectious disease using deep learning and big data. International Journal of Environmental Research and Public Health. 2018; 15 (8):1596.
  • Che Z., Purushotham S., Cho K., Sontag D., Liu Y. Recurrent neural networks for multivariate time series with missing values. Scientific Reports. 2018; 8 (1):1–12.
  • Chee E., Wu Z. 2018. AirNet: Self-supervised affine registration for 3D medical images using neural networks. arXiv:1810.02583 (arXiv preprint)
  • Chen J., Wu L., Zhang J., Zhang L., Gong D., Zhao Y. 2020. Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: A prospective study. medRxiv.
  • Chen N., Zhou M., Dong X., Qu J., Gong F., Han Y. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: A descriptive study. The Lancet. 2020; 395 (10223):507–513.
  • Cheng Y. Semi-supervised learning for neural machine translation. Joint training for neural machine translation. 2019:25–40.
  • Cruz-Roa A.A., Ovalle J.E.A., Madabhushi A., Osorio F.A.G. A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection. International conference on medical image computing and computer-assisted intervention. 2013:403–410.
  • C. for Systems Science J. H. U. Engineering (CSSE) 2020. COVID-19 dashboard. https://coronavirus.jhu.edu/map.html Accessed 05.05.20.
  • Das S., Ghosh P., Sen B., Mukhopadhyay I. 2020. Critical community size for COVID-19 – a model based approach to provide a rationale behind the lockdown. arXiv:2004.03126 (arXiv preprint)
  • De Simone A., Jacques T. Guiding new physics searches with unsupervised learning. The European Physical Journal C. 2019; 79 (4):289.
  • de Vos B.D., Berendsen F.F., Viergever M.A., Staring M., Išgum I. End-to-end unsupervised deformable image registration with a convolutional neural network. Deep learning in medical image analysis and multimodal learning for clinical decision support. 2017:204–212.
  • 2020. How Canadian AI start-up BlueDot spotted Coronavirus before anyone else had a clue. https://www.wired.com/story/chinese-hospitals-deploy-ai-help-diagnose-covid-19/ Accessed 10.03.20.
  • Duan J., Bello G., Schlemper J., Bai W., Dawes T.J., Biffi C. Automatic 3D bi-ventricular segmentation of cardiac images by a shape-refined multi-task deep learning approach. IEEE Transactions on Medical Imaging. 2019; 38 (9):2151–2164.
  • Elmahdy M.S., Jagt T., Zinkstok R.T., Qiao Y., Shahzad R., Sokooti H. Robust contour propagation using deep learning and image registration for online adaptive proton therapy of prostate cancer. Medical Physics. 2019; 46 (8):3329–3343.
  • Fu L., Fei J., Xiang H.-X., Xiang Y., Tan Z.-X., Li M.-D. 2020. Influence factors of death risk among COVID-19 patients in Wuhan, China: A hospital-based case-cohort study. medRxiv.
  • G.o.I. NIC, MEITY. 2020. Aarogya Setu app. https://www.mygov.in/aarogya-setu-app/ Accessed 02.04.20.
  • Gadekallu T.R., Khare N., Bhattacharya S., Singh S., Reddy Maddikunta P.K., Ra I.-H. Early detection of diabetic retinopathy using PCA-firefly based deep learning model. Electronics. 2020; 9 (2):274.
  • Gadekallu T.R., Rajput D.S., Reddy M.P.K., Lakshmanna K., Bhattacharya S., Singh S. A novel PCA-whale optimization-based deep neural network model for classification of tomato plant diseases using GPU. Journal of Real-Time Image Processing. 2020:1–14.
  • Gao X., Li W., Loomes M., Wang L. A fused deep learning architecture for viewpoint classification of echocardiography. Information Fusion. 2017; 36 :103–113.
  • Ghesu F.C., Krubasik E., Georgescu B., Singh V., Zheng Y., Hornegger J. Marginal space deep learning: Efficient architecture for volumetric image parsing. IEEE Transactions on Medical Imaging. 2016; 35 (5):1217–1228.
  • Ghoshal B., Tucker A. 2020. Estimating uncertainty and interpretability in deep learning for coronavirus (COVID-19) detection. arXiv:2003.10769 (arXiv preprint)
  • Gozes O., Frid-Adar M., Greenspan H., Browning P.D., Zhang H., Ji W. 2020. Rapid AI development cycle for the coronavirus (COVID-19) pandemic: Initial results for automated detection & patient monitoring using deep learning CT image analysis. arXiv:2003.05037 (arXiv preprint)
  • Greenspan H., Van Ginneken B., Summers R.M. Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Medical Imaging. 2016; 35 (5):1153–1159.
  • Gu Z., Cheng J., Fu H., Zhou K., Hao H., Zhao Y. CE-Net: Context encoder network for 2D medical image segmentation. IEEE Transactions on Medical Imaging. 2019; 38 (10):2281–2292.
  • Guo Y., Gao Y., Shen D. Deformable MR prostate segmentation via deep feature learning and sparse patch matching. IEEE Transactions on Medical Imaging. 2015; 35 (4):1077–1089.
  • Guo Z., Li X., Huang H., Guo N., Li Q. Deep learning-based image segmentation on multimodal medical imaging. IEEE Transactions on Radiation and Plasma Medical Sciences. 2019; 3 (2):162–169.
  • Habibzadeh H., Nussbaum B.H., Anjomshoa F., Kantarci B., Soyata T. Sustainable Cities and Society; 2019. A survey on cybersecurity, data privacy, and policy issues in cyber-physical system deployments in smart cities.
  • Hakak S., Khan W.Z., Imran M., Choo K.-K.R., Shoaib M. Have you been a victim of COVID-19-related cyber incidents? Survey, taxonomy, and mitigation strategies. IEEE Access. 2020; 8 :124134–124144.
  • Haskins G., Kruger U., Yan P. Deep learning in medical image registration: A survey. Machine Vision and Applications. 2020; 31 (1):8.
  • Hemdan E.E.-D., Shouman M.A., Karar M.E. 2020. COVIDX-Net: A framework of deep learning classifiers to diagnose COVID-19 in X-ray images. arXiv:2003.11055 (arXiv preprint)
  • Hopman J., Allegranzi B., Mehtar S. Managing COVID-19 in low- and middle-income countries. JAMA. 2020; 323 (16):1549–1550.
  • Hossain N., Househ M.S. Using HealthMap to analyse Middle East respiratory syndrome (MERS) data. ICIMTH. 2016:213–216.
  • Hosseiny M., Kooraki S., Gholamrezanezhad A., Reddy S., Myers L. Radiology perspective of coronavirus disease 2019 (COVID-19): Lessons from severe acute respiratory syndrome and Middle East respiratory syndrome. American Journal of Roentgenology. 2020; 214 (5):1078–1082.
  • Huang W., Luo M., Liu X., Zhang P., Ding H., Xue W. Arterial spin labeling images synthesis from structural magnetic resonance imaging using unbalanced deep discriminant learning. IEEE Transactions on Medical Imaging. 2019; 38 (10):2338–2351.
  • Huang C., Wang Y., Li X., Ren L., Zhao J., Hu Y. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet. 2020; 395 (10223):497–506.
  • Huang L., Han R., Ai T., Yu P., Kang H., Tao Q. Serial quantitative chest CT assessment of COVID-19: Deep-learning approach. Radiology: Cardiothoracic Imaging. 2020; 2 (2):e200075.
  • Husnayain A., Fuad A., Su E.C.-Y. Applications of Google search trends for risk communication in infectious disease management: A case study of COVID-19 outbreak in Taiwan. International Journal of Infectious Diseases. 2020.
  • Huynh-The T., Hua C.-H., Pham Q.-V., Kim D.-S. MCNet: An efficient CNN architecture for robust automatic modulation classification. IEEE Communications Letters. 2020; 24 (4):811–815. [ Google Scholar ]
  • 2020. Israel isolates coronavirus antibody in “significant breakthrough”: Minister. https://www.reuters.com/article/us-health-coronavirus-israel-treatment Accessed 07.05.20. [ Google Scholar ]
  • 2020. Vuno offers a suite of AI solutions for COVID-19. itnonline.com/content/vuno-offers-suite-ai-solutions-covid-19 Accessed 12.04.20. [ Google Scholar ]
  • 2020. COVID-19. https://news.itu.int/covid-19-how-korea-is-using-innovative-technology-and-ai-to-flatten-the-curve/ Accessed 11.04.20. [ Google Scholar ]
  • Iwendi C., Bashir A.K., Peshkar A., Sujatha R., Chatterjee J.M., Pasupuleti S. COVID-19 patient health prediction using boosted random forest algorithm. Frontiers in Public Health. 2020; 8 :357. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ji W., Wang W., Zhao X., Zai J., Li X. Homologous recombination within the spike glycoprotein of the newly identified coronavirus may boost cross-species transmission from snake to human. Journal Medical Virol. 2020 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Jin Y.-H., Cai L., Cheng Z.-S., Cheng H., Deng T., Fan Y.-P. A rapid advice guideline for the diagnosis and treatment of 2019 novel coronavirus (2019-nCoV) infected pneumonia (standard version) Military Medical Research. 2020; 7 (1):4. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kampf G., Todt D., Pfaender S., Steinmann E. Persistence of coronaviruses on inanimate surfaces and its inactivation with biocidal agents. Journal of Hospital Infection. 2020 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kannan S., Subbaram K., Ali S., Kannan H. The role of artificial intelligence and machine learning techniques: Race for covid-19 vaccine. Archives of Clinical Infectious Diseases. 2020; 15 (2) [ Google Scholar ]
  • Kawahara J., Hamarneh G. Multi-resolution-tract CNN with hybrid pretrained and skin-lesion trained layers. International workshop on machine learning in medical imaging. 2016:164–171. [ Google Scholar ]
  • Ker J., Wang L., Rao J., Lim T. Deep learning applications in medical image analysis. IEEE Access. 2017; 6 :9375–9389. [ Google Scholar ]
  • Kermany D.S., Goldbaum M., Cai W., Valentim C.C., Liang H., Baxter S.L. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018; 172 (5):1122–1131. [ PubMed ] [ Google Scholar ]
  • Khan F.A., Asif M., Ahmad A., Alharbi M., Aljuaid H. Sustainable Cities and Society; 2020. Blockchain technology, improvement suggestions, security challenges on smart grid and its application in healthcare for sustainable development; p. 102018. [ Google Scholar ]
  • Krawczyk B., Minku L.L., Gama J., Stefanowski J., Woźniak M. Ensemble learning for data stream analysis: A survey. Information Fusion. 2017; 37 :132–156. [ Google Scholar ]
  • Kvalsvig A., Barnard L.T., Gray L., Wilson N., Baker M. 2020. Supporting the COVID-19 pandemic response: Surveillance and outbreak analytics. [ Google Scholar ]
  • 2020. AI-based IVD rises to challenge of novel coronavirus. labpulse.com/index.aspx?sec=sup&sub=mic&pag=dis&ItemID=801043 Accessed 14.04.20. [ Google Scholar ]
  • Li L., Qin L., Xu Z., Yin Y., Wang X., Kong B. Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT. Radiology. 2020:200905. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Liao F., Liang M., Li Z., Hu X., Song S. Evaluate the malignancy of pulmonary nodules using the 3-D deep leaky noisy-or network. IEEE Transactions on Neural Networks and Learning Systems. 2019; 30 (11):3484–3495. [ PubMed ] [ Google Scholar ]
  • Lima A.Q., Keegan B. Cyber influence and cognitive threats. Elsevier; 2020. Challenges of using machine learning algorithms for cybersecurity: A study of threat-classification models applied to social media communication data; pp. 33–52. [ Google Scholar ]
  • Litjens G., Kooi T., Bejnordi B.E., Setio A.A.A., Ciompi F., Ghafoorian M. A survey on deep learning in medical image analysis. Medical Image Analysis. 2017; 42 :60–88. [ PubMed ] [ Google Scholar ]
  • Liu Y., Liu S., Wang Z. A general framework for image fusion based on multi-scale transform and sparse representation. Information Fusion. 2015; 24 :147–164. [ Google Scholar ]
  • Liu Y., Chen X., Wang Z., Wang Z.J., Ward R.K., Wang X. Deep learning for pixel-level image fusion: Recent advances and future prospects. Information Fusion. 2018; 42 :158–173. [ Google Scholar ]
  • Liu D., Clemente L., Poirier C., Ding X., Chinazzi M., Davis J.T. 2020. A machine learning methodology for real-time forecasting of the 2019–2020 COVID-19 outbreak using internet searches, news alerts, and estimates from mechanistic models. arXiv: 2004.04019 (arXiv preprint) [ Google Scholar ]
  • Lo S.-C., Lou S.-L., Lin J.-S., Freedman M.T., Chien M.V., Mun S.K. Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Transactions on Medical Imaging. 1995; 14 (4):711–718. [ PubMed ] [ Google Scholar ]
  • Lu R., Zhao X., Li J., Niu P., Yang B., Wu H. Genomic characterisation and epidemiology of 2019 novel coronavirus: Implications for virus origins and receptor binding. The Lancet. 2020; 395 (10224):565–574. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lundervold A.S., Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Zeitschrift für Medizinische Physik. 2019; 29 (2):102–127. [ PubMed ] [ Google Scholar ]
  • Lv J., Yang M., Zhang J., Wang X. Respiratory motion correction for free-breathing 3D abdominal MRI using CNN-based image registration: A feasibility study. The British Journal of Radiology. 2018; 91 (xxxx):20170788. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Malik Y.S., Kumar N., Sircar S., Kaushik R., Bhatt S., Dhama K. 2020. Pandemic coronavirus disease (COVID-19): Challenges and a global perspective. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Megahed N.A., Ghoneim E.M. Sustainable Cities and Society; 2020. Antivirus-built environment: Lessons learned from covid-19 pandemic; p. 102350. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 2020. Mila COVID-19 related projects. https://mila.quebec/en/covid-19/ Accessed 16.03.20. [ Google Scholar ]
  • Molteni M., Rogers A. 2020. Everything you need to know about coronavirus testing. https://www.wired.com/story/everything-you-need-to-know-about-coronavirus-testing/ Accessed 02.05.20. [ Google Scholar ]
  • Narin A., Kaya C., Pamuk Z. 2020. Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks. arXiv:2003.10849 (arXiv preprint) [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Oktay O., Ferrante E., Kamnitsas K., Heinrich M., Bai W., Caballero J. Anatomically constrained neural networks (ACNNs): Application to cardiac image enhancement and segmentation. IEEE Transactions on Medical Imaging. 2017; 37 (2):384–395. [ PubMed ] [ Google Scholar ]
  • Ong E., Wong M.U., Huffman A., He Y. 2020. COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning. BioRxiv. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 2020. Max Roser and Hannah Ritchie and Esteban Ortiz-ospina and Joe Hasell. https://ourworldindata.org/coronavirus Accessed 15.05.20. [ Google Scholar ]
  • Ozturk T., Talo M., Yildirim E.A., Baloglu U.B., Yildirim O., Acharya U.R. Automated detection of COVID-19 cases using deep neural networks with X-ray images. Computers in Biology and Medicine. 2020:103792. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Patel H., Singh Rajput D., Thippa Reddy G., Iwendi C., Kashif Bashir A., Jo O. A review on classification of imbalanced data for wireless sensor networks. International Journal of Distributed Sensor Networks. 2020; 16 (4) 1550147720916404. [ Google Scholar ]
  • Payer C., Štern D., Bischof H., Urschler M. Regressing heatmaps for multiple landmark localization using CNNs. International conference on medical image computing and computer-assisted intervention. 2016:230–238. [ Google Scholar ]
  • Peng X., Schmid C. Multi-region two-stream R-CNN for action detection. European conference on computer vision. 2016:744–759. [ Google Scholar ]
  • Peng Q.-Y., Wang X.-T., Zhang L.-N., C.C.C.U.S. Group . Intensive care medicine; 2020. Findings of lung ultrasonography of novel corona virus pneumonia during the 2019–2020 epidemic; p. 1. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pham Q., Nguyen D.C., Huynh-The T., Hwang W., Pathirana P.N. Artificial intelligence (AI) and big data for coronavirus (COVID-19) pandemic: A survey on the state-of-the-arts. IEEE Access. 2020; 8 :130820–130839. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Poggiali E., Dacrema A., Bastoni D., Tinelli V., Demichele E., Mateo Ramos P. Can lung US help critical care clinicians in the early diagnosis of novel coronavirus (COVID-19) pneumonia? Radiology. 2020:200847. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pourghasemi H.R., Pouyan S., Farajzadeh Z., Sadhasivam N., Heidari B., Babaei S. 2020. Assessment of the outbreak risk, mapping and infestation behavior of COVID-19: Application of the autoregressive and moving average (ARMA) and polynomial models. medRxiv. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pratt H., Coenen F., Broadbent D.M., Harding S.P., Zheng Y. Convolutional neural networks for diabetic retinopathy. Procedia Computer Science. 2016; 90 :200–205. [ Google Scholar ]
  • Qiao R., Tran N.H., Shan B., Ghodsi A., Li M. 2020. Personalized workflow to identify optimal T-cell epitopes for peptide-based vaccines against COVID-19. arXiv:2003.10650 (arXiv preprint) [ Google Scholar ]
  • Qin C., Liu F., Yen T.-C., Lan X. 18 F-FDG PET/CT findings of COVID-19: A series of four highly suspected cases. European Journal of Nuclear Medicine and Molecular Imaging. 2020:1–6. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 2020. AI-based detection and the COVID-19 pandemic. https://www.radlogics.com/coronavirus/ Accessed 16.03.20. [ Google Scholar ]
  • Rahman M.A. Sustainable Cities and Society; 2020. Data-driven dynamic clustering framework for mitigating the adverse economic impact of COVID-19 lockdown practices; p. 102372. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rajaraman S., Candemir S., Kim I., Thoma G., Antani S. Visualization and interpretation of convolutional neural network predictions in detecting pneumonia in pediatric chest radiographs. Applied Sciences. 2018; 8 (10):1715. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rajpurkar P., Irvin J., Zhu K., Yang B., Mehta H., Duan T. 2017. Chexnet: Radiologist-level pneumonia detection on chest X-rays with deep learning. arXiv:1711.05225 (arXiv preprint) [ Google Scholar ]
  • Rawat W., Wang Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Computation. 2017; 29 (9):2352–2449. [ PubMed ] [ Google Scholar ]
  • Razzak M.I., Naz S., Zaib A. Classification in bioApps. Springer; 2018. Deep learning for medical image processing: Overview, challenges and the future; pp. 323–350. [ Google Scholar ]
  • Reddy T., Parimala M., Swarnapriya R.M., Chowdhary C.L., Hakak S., Khan W.Z. Computer Communications; 2020. A deep neural networks based model for uninterrupted marine environment monitoring. [ Google Scholar ]
  • Rehman Z.U., Zia M.S., Bojja G.R., Yaqub M., Jinchao F., Arshid K. Texture based localization of a brain tumor from MR-images by using a machine learning approach. Medical Hypotheses. 2020:109705. [ PubMed ] [ Google Scholar ]
  • Rehman A., Naz S., Khan A., Zaib A., Razzak I. 2020. Improving coronavirus (COVID-19) diagnosis using deep transfer learning. [ Google Scholar ]
  • Remuzzi A., Remuzzi G. COVID-19 and Italy: What next? Lancet. 2020 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Roth H.R., Lee C.T., Shin H.-C., Seff A., Kim L., Yao J. Anatomy-specific classification of medical images using deep convolutional nets. 2015 IEEE 12th international symposium on biomedical imaging (ISBI) 2015:101–104. [ Google Scholar ]
  • Rothan H.A., Byrareddy S.N. The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak. Journal of Autoimmunity. 2020:102433. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Saha A., Gupta K., Patil M. 2020. Monitoring and epidemiological trends of coronavirus disease (COVID-19) around the world. [ Google Scholar ]
  • Sahiner B., Pezeshk A., Hadjiiski L.M., Wang X., Drukker K., Cha K.H. Deep learning in medical imaging and radiation therapy. Medical Physics. 2019; 46 (1):e1–e36. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Schmidhuber J. Deep learning in neural networks: An overview. Neural Networks. 2015; 61 :85–117. [ PubMed ] [ Google Scholar ]
  • Seeböck P., Orlando J.I., Schlegl T., Waldstein S.M., Bogunović H., Klimscha S. Exploiting epistemic uncertainty of anatomy segmentation for anomaly detection in retinal OCT. IEEE Transactions on Medical Imaging. 2019; 39 (1):87–98. [ PubMed ] [ Google Scholar ]
  • 2020. Simultaneous screening & confirmation of COVID-19 using real-time PCR. http://www.seegene.com/covid19_detection Accessed 01.03.20. [ Google Scholar ]
  • Shan+ F., Gao+ Y., Wang J., Shi W., Shi N., Han M. 2020. Lung infection quantification of COVID-19 in ct images with deep learning. arXiv: 2003.04655 (arXiv preprint) [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shanmuganathan S. Artificial neural network modelling. Springer; 2016. Artificial neural network modelling: An introduction; pp. 1–14. [ Google Scholar ]
  • Shen W., Zhou M., Yang F., Yang C., Tian J. Multi-scale convolutional neural networks for lung nodule classification. International conference on information processing in medical imaging. 2015:588–599. [ PubMed ] [ Google Scholar ]
  • Shen D., Wu G., Suk H.-I. Deep learning in medical image analysis. Annual Review of Biomedical Engineering. 2017; 19 :221–248. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shereen M.A., Khan S., Kazmi A., Bashir N., Siddique R. COVID-19 infection: Origin, transmission, and characteristics of human coronaviruses. Journal of Advanced Research. 2020 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shin H.-C., Orton M.R., Collins D.J., Doran S.J., Leach M.O. Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2012; 35 (8):1930–1943. [ PubMed ] [ Google Scholar ]
  • Shin H.-C., Roth H.R., Gao M., Lu L., Xu Z., Nogues I. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging. 2016; 35 (5):1285–1298. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Singh R., Adhikari R. 2020. Age-structured impact of social distancing on the COVID-19 epidemic in India. arXiv:2003.12055 (arXiv preprint) [ Google Scholar ]
  • Strzelecki A. Italy and Iran: A google trends study; 2020. The second worldwide wave of interest in coronavirus since the COVID-19 outbreaks in South Korea. arXiv: 2003.10998 (arXiv preprint) [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Surveillances V. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) China, 2020. China CDC Weekly. 2020; 2 (8):113–122. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Suzuki K. Overview of deep learning in medical imaging. Radiological Physics and Technology. 2017; 10 (3):257–273. [ PubMed ] [ Google Scholar ]
  • Tajbakhsh N., Jeyaseelan L., Li Q., Chiang J.N., Wu Z., Ding X. Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation. Medical Image Analysis. 2020:101693. [ PubMed ] [ Google Scholar ]
  • Tayebi Z. 2020. Machine learning and deep learning to predict cross-immunoreactivity of viral epitopes. [ Google Scholar ]
  • Ting D.S.W., Carin L., Dzau V., Wong T.Y. Digital technology and COVID-19. Nature Medicine. 2020:1–3. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 2020. Coronavirus (COVID-19): News, analysis and resources. https://unctad.org/en/Pages/coronavirus.aspx Accessed 27.04.20. [ Google Scholar ]
  • Unhale S.S., Ansar Q.B., Sanap S., Thakhre S., Wadatkar S., Bairagi R. A review on corona virus (COVID-19) World Journal of Pharmaceutical and Life Sciences. 2020; 6 (4) [ Google Scholar ]
  • Vaid S., Kalantar R., Bhandari M. International Orthopaedics; 2020. Deep learning covid-19 detection bias: Accuracy through artificial intelligence; p. 1. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Velavan T.P., Meyer C.G. The COVID-19 epidemic. Tropical Medicine & International Health. 2020; 25 (3):278. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 2020. Coronavirus outbreak Can machine vision and imaging play a part? https://www.vision-systems.com/home/article/14170078/coronavirus-outbreak-can-machine-vision-and-imaging-play-a-part Accessed 05.05.20. [ Google Scholar ]
  • W.H. Organization . World Health Organization; 2020. Infection prevention and control during health care when COVID-19 is suspected: Interim guidance, 19 march 2020, Tech. rep. [ Google Scholar ]
  • Wang L., Wong A. 2020. COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images. arXiv: 2003.09871 (arXiv preprint) [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wang S., Kang B., Ma J., Zeng X., Xiao M., Guo J. 2020. A deep learning algorithm using CT images to screen for corona virus disease (COVID-19) MedRxiv. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wang L.-s., Wang Y.-r., Ye D.-w., Liu Q.-q. A review of the 2019 novel coronavirus (COVID-19) based on current evidence. International Journal of Antimicrobial Agents. 2020:105948. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 2020. How digital investment can help the COVID-19 recovery. https://www.weforum.org/agenda/2020/04/covid-19-digital-foreign-direct-investment-economic-recovery/ Accessed 28.04.20. [ Google Scholar ]
  • WHO . 2019. Coronavirus disease (COVID-19) outbreak. https://www.https://www.who.int/westernpacific/emergencies/covid-19 Accessed 22.12.19. [ Google Scholar ]
  • 2020. Situation report 91. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200420-sitrep-91-covid-19.pdf?sfvrsn=fcf0670b_4 Accessed 22.04.20. [ Google Scholar ]
  • 2020. WHO coronavirus disease (COVID-19) dashboard. https://covid19.who.int/ Accessed 15.05.20. [ Google Scholar ]
  • WHO . 2020. Public statement for collaboration on COVID-19 vaccine development. https://www.who.int/news-room/detail/13-04-2020-public-statement-for-collaboration-on-covid-19-vaccine-development Accessed 07.05.20. [ Google Scholar ]
  • 2020. Chinese hospitals deploy AI to help diagnose Covid-19. https://www.wired.com/story/chinese-hospitals-deploy-ai-help-diagnose-covid-19/ Accessed 16.02.20. [ Google Scholar ]
  • 2020. COVID-19 coronavirus pandemic. https://www.worldometers.info/coronavirus/ Accessed 26.08.20. [ Google Scholar ]
  • Wu Z., McGoogan J.M. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: Summary of a report of 72 314 cases from the Chinese center for disease control and prevention. JAMA. 2020; 323 (13):1239–1242. [ PubMed ] [ Google Scholar ]
  • Wu J.T., Leung K., Leung G.M. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study. The Lancet. 2020; 395 (10225):689–697. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wu D., Wu T., Liu Q., Yang Z. The SARS-CoV-2 outbreak: What we know. International Journal of Infectious Diseases. 2020 [ Google Scholar ]
  • Xu J., Xiang L., Liu Q., Gilmore H., Wu J., Tang J. Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images. IEEE Transactions on Medical Imaging. 2015; 35 (1):119–130. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Xu X., Chen P., Wang J., Feng J., Zhou H., Li X. Evolution of the novel coronavirus from the ongoing wuhan outbreak and modeling of its spike protein for risk of human transmission. Science China Life Sciences. 2020; 63 (3):457–460. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yu Y., Si X., Hu C., Zhang J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation. 2019; 31 (7):1235–1270. [ PubMed ] [ Google Scholar ]
  • Zhang Q., Yang L.T., Chen Z., Li P. A survey on deep learning for big data. Information Fusion. 2018; 42 :146–157. [ Google Scholar ]
  • Zhang H., Saravanan K.M., Yang Y., Hossain M.T., Li J., Ren X. 2020. Deep learning based drug screening for novel coronavirus 2019-nCov. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhang F., Yang D., Li J., Gao P., Chen T., Cheng Z. 2020. Myocardial injury is associated with in-hospital mortality of confirmed or suspected COVID-19 in Wuhan, China: A single center retrospective cohort study. MedRxiv. [ Google Scholar ]
  • Zhang C., Zheng W., Huang X., Bell E.W., Zhou X., Zhang Y. Protein structure and sequence re-analysis of 2019-nCoV genome refutes snakes as its intermediate host or the unique similarity between its spike protein insertions and hiv-1. Journal of Proteome Research. 2020 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhou Y., Yang Z., Guo Y., Geng S., Gao S., Ye S. 2020. A new predictor of disease severity in patients with COVID-19 in Wuhan, China. medRxiv. [ Google Scholar ]
  • Zhu Q., Du B., Yan P. Boundary-weighted domain adaptive neural network for prostate MR image segmentation. IEEE Transactions on Medical Imaging. 2019; 39 (3):753–763. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhuang F., Qi Z., Duan K., Xi D., Zhu Y., Zhu H. A comprehensive survey on transfer learning. Proceedings of the IEEE. 2020 [ Google Scholar ]
  • Zou S., Zhu X. FDG PET/CT of COVID-19. Radiology. 2020:200770. [ PMC free article ] [ PubMed ] [ Google Scholar ]

Methods for image denoising using convolutional neural network: a review

  • Survey and State of the Art
  • Open access
  • Published: 10 June 2021
  • Volume 7, pages 2179–2198 (2021)

  • Ademola E. Ilesanmi
  • Taiwo O. Ilesanmi

Image denoising faces significant challenges arising from the many sources of noise; in particular, Gaussian, impulse, salt-and-pepper, and speckle noise complicate imaging. Convolutional neural networks (CNNs) have received increasing attention for the image denoising task, and several CNN denoising methods, evaluated on different datasets, have been studied. In this paper, we offer an elaborate study of the different CNN techniques used in image denoising. Previous and recent CNN methods for image denoising are categorized and analyzed, and the popular datasets used to evaluate them are investigated. Motivations and principles of the CNN methods are outlined; some state-of-the-art CNN image denoising methods are depicted in graphical form, while others are explained in detail. Potential challenges and directions for future research are also fully explicated.


Introduction

In the last decade, the use of images has grown tremendously. Images are corrupted by noise during acquisition, compression, and transmission; environmental conditions, transmission channels, and other factors are the mediums through which this corruption occurs. In image processing, image noise is a random variation in signal that affects the brightness or color of an image and hinders observation and information extraction. Noise adversely affects image processing tasks (such as video processing, image analysis, and segmentation) and can result in wrong diagnoses [ 1 ]. Hence, image denoising is a fundamental step that underpins most image processing tasks.

Due to the increasing number of digital images captured in poor conditions, image denoising methods have become an imperative tool for computer-aided analysis, and restoring information from noisy images to obtain a clean image is a problem of pressing importance. Image denoising procedures remove noise and restore a clean image. A major problem in image denoising is how to distinguish between noise, edges, and texture, since they all have high-frequency components. The most discussed noise types in the literature are additive white Gaussian noise (AWGN) [ 2 ], impulse noise [ 3 ], quantization noise [ 4 ], Poisson noise [ 5 ], and speckle noise [ 6 ]. AWGN arises in analog circuitry, while impulse, speckle, Poisson, and quantization noise arise from faulty manufacturing, bit errors, and inadequate photon counts [ 7 ]. Image denoising methods are used in medical imaging, remote sensing, military surveillance, biometrics and forensics, industrial and agricultural automation, and person recognition. In medical and biomedical imaging, denoising algorithms are fundamental pre-processing steps used to remove noise such as speckle, Rician, and quantum noise [ 8 , 9 ]. In remote sensing, denoising algorithms are used to remove salt-and-pepper and additive white Gaussian noise [ 10 , 11 ]. Synthetic aperture radar (SAR) images support space and airborne operations in military surveillance [ 12 ], and image denoising algorithms have helped to reduce speckle in SAR images [ 13 ]. Forensic images do not contain one specific kind of noise; they can be corrupted by any kind, which reduces the quality of the evidence they carry, and denoising methods have helped suppress such noise [ 14 ]. Image denoising has also been used to filter paddy leaf images for the detection of rice plant disease. Undoubtedly, image denoising is an active area of research that cuts across many fields.
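To make these noise models concrete, the short sketch below synthesizes the four corruption types most often simulated in the denoising literature on a clean grayscale image using NumPy; the noise variance, impulse probability, and photon-count parameters are illustrative assumptions rather than values taken from the cited works.

```python
import numpy as np

def add_gaussian(img, sigma=0.1):
    # Additive white Gaussian noise (AWGN) with zero mean.
    return np.clip(img + np.random.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_salt_pepper(img, prob=0.05):
    # Impulse (salt-and-pepper) noise: a fraction of pixels is forced to 0 or 1.
    out = img.copy()
    mask = np.random.rand(*img.shape)
    out[mask < prob / 2] = 0.0        # pepper
    out[mask > 1 - prob / 2] = 1.0    # salt
    return out

def add_poisson(img, peak=30.0):
    # Photon-count (Poisson) noise; 'peak' sets the effective photon budget.
    return np.clip(np.random.poisson(img * peak) / peak, 0.0, 1.0)

def add_speckle(img, sigma=0.2):
    # Multiplicative speckle noise, as found in ultrasound and SAR imaging.
    return np.clip(img * (1.0 + np.random.normal(0.0, sigma, img.shape)), 0.0, 1.0)

if __name__ == "__main__":
    clean = np.random.rand(64, 64)    # stand-in for a clean image scaled to [0, 1]
    for name, noisy in [("awgn", add_gaussian(clean)),
                        ("impulse", add_salt_pepper(clean)),
                        ("poisson", add_poisson(clean)),
                        ("speckle", add_speckle(clean))]:
        print(name, float(np.mean((noisy - clean) ** 2)))
```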

Linear, non-linear, and non-adaptive filters were the first filters used in image applications [ 15 ]. Noise reduction filters fall into six categories: linear, non-linear, adaptive, wavelet-based, partial differential equation (PDE), and total variation filters. Linear filters compute each output pixel as a weighted combination of neighboring input pixels (a matrix multiplication procedure) to reduce noise; because they do not preserve edge information, they are considered poor filtering methods. Non-linear filters suppress noise while preserving edge information, and in most filtering applications a non-linear filter is used in place of a linear one. A simple example of a non-linear filter is the median filter (MF) [ 16 ]. Adaptive filters employ statistical components for real-time applications (least mean square [ 17 ] and recursive mean square [ 18 ] are examples). Wavelet-based filters transform images to the wavelet domain and are used to reduce additive noise [ 19 , 20 ]. A detailed review of different denoising filters is available in references [ 21 , 22 ].
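The contrast between linear and non-linear filtering can be illustrated with a few lines of SciPy: the sketch below applies a linear Gaussian filter and a non-linear median filter to an impulse-corrupted image; the kernel sizes and noise level are arbitrary illustrative choices, not settings from the cited filter papers.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

rng = np.random.default_rng(0)
img = rng.random((64, 64))

# Corrupt roughly 5% of the pixels with salt-and-pepper (impulse) noise.
mask = rng.random(img.shape)
noisy = img.copy()
noisy[mask < 0.025] = 0.0
noisy[mask > 0.975] = 1.0

# Linear filter: a weighted average of neighbouring pixels; it attenuates
# noise but blurs edges because every pixel is smoothed in the same way.
linear = gaussian_filter(noisy, sigma=1.0)

# Non-linear filter: the median of a 3x3 neighbourhood; isolated impulse
# outliers are rejected while edges are largely preserved.
nonlinear = median_filter(noisy, size=3)

print("MSE (linear, Gaussian):", float(np.mean((linear - img) ** 2)))
print("MSE (non-linear, median):", float(np.mean((nonlinear - img) ** 2)))
```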

Most of the above-mentioned filters have produced reasonably good results; however, they have drawbacks, including poor test-phase optimization, manual parameter settings, and noise-specific denoising models. Fortunately, the flexibility of convolutional neural networks (CNNs) has shown the ability to overcome these drawbacks [ 23 ]. CNN algorithms have demonstrated a strong ability to solve many problems [ 24 ]; for example, CNNs have achieved excellent results in image recognition [ 25 ], robotics [ 26 ], self-driving [ 27 ], facial expression recognition [ 28 ], natural language processing [ 29 ], handwritten digit recognition [ 30 ], and many other areas. Chiang and Sullivan [ 31 ] were the first to use CNNs (deep learning) for image denoising: a neural network (weighting factor) was used to remove complex noise, and a feedforward network [ 32 ] then balanced the efficiency and performance of the denoised image. In the early development of CNNs, vanishing gradients, the activation functions of the time (sigmoid [ 33 ] and Tanh [ 34 ]), and the lack of supporting hardware made training difficult. However, the development of AlexNet [ 35 ] in 2012 changed this, and further CNN architectures (such as VGG [ 36 ] and GoogleNet [ 37 ]) have since been applied to computer vision tasks. References [ 38 , 39 ] describe the first CNN architectures used in image denoising. Zhang et al. [ 40 ] used the denoising CNN (DnCNN) for image denoising, super-resolution, and JPEG deblocking; the network consists of convolutions, batch normalization [ 41 ], rectified linear units (ReLU) [ 42 ], and residual learning [ 43 ].
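As a concrete illustration of the DnCNN-style design described above (stacked Conv + BN + ReLU blocks with residual learning, so the network predicts the noise map and subtracts it from its input), the PyTorch sketch below builds such a denoiser; the depth and channel width are common defaults used here for illustration, not necessarily the exact settings of Zhang et al. [ 40 ].

```python
import torch
import torch.nn as nn

class DnCNNLike(nn.Module):
    """Residual denoiser: the body estimates the noise n(y) and the clean
    image is recovered as x = y - n(y) (residual learning)."""
    def __init__(self, depth=17, channels=64, image_channels=1):
        super().__init__()
        layers = [nn.Conv2d(image_channels, channels, 3, padding=1),
                  nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1, bias=False),
                       nn.BatchNorm2d(channels),   # batch normalization
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(channels, image_channels, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        noise = self.body(noisy)   # predict the noise map
        return noisy - noise       # subtract it to obtain the denoised image

if __name__ == "__main__":
    model = DnCNNLike()
    y = torch.randn(1, 1, 64, 64)          # dummy noisy patch
    print(model(y).shape)                  # torch.Size([1, 1, 64, 64])
```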

The use of CNNs is not limited to general image denoising: CNNs have produced excellent results for blind denoising [ 44 ], real noisy images [ 45 ], and many other settings. Although several researchers have developed CNN methods for image denoising, only a few have proposed a review to summarize these methods. Reference [ 46 ] summarized CNN methods for image denoising, with categories based on noise type. Although that review is elaborate, it does not consider several methods designed for specific image types, nor the most recent work; several studies published in late 2020 were consequently omitted. Our review provides an overview of CNN image denoising methods for different kinds of noise (including noise specific to particular image types). We discuss state-of-the-art methods with emphasis on image type and noise specification. The outline of CNN image denoising methods is depicted in Fig. 1. It is hoped that the explanations in this study will provide an understanding of the CNN architectures used in image denoising. Our contributions are summarized as follows:

Analysis of different CNN image denoising models, database, and image type.

A highlight of the commonly used objective evaluation methods in CNN image denoising.

Potential challenges and road maps in CNN image denoising.

Fig. 1: CNN image denoising scheme

The rest of the paper is organized as follows. In Sect. 2, we review different CNN image denoising methods. In Sect. 3, we review databases for CNN image denoising algorithms. Section 4 gives an analysis of CNN image denoising; finally, the paper is concluded in Sect. 5.

Literature review

In this section, several existing methods for CNN image denoising are discussed. We divide CNN image denoising approaches into two groups: (1) CNN denoising for general images, and (2) CNN denoising for specific images. The first approach uses CNN architectures to denoise general images, while the second uses CNNs to denoise specific image types; the first is far more widely used in CNN denoising applications than the second. General images are images that serve a general purpose rather than a specialized one (see [ 47 ] for samples of general images). Specific images are images intentionally created for a special or particular purpose; medical images, infrared images, and remote sensing images, for example, are kinds of specific images. The reason for dividing CNN denoising by image category is to bring readers up to speed with the latest CNN architectures with regard to image type. A block diagram depicting the different approaches is shown in Fig. 1.

CNN denoising for general images

Reference [ 48 ] proposed the attention-guided CNN (ADNet) for image denoising. ADNet consists of 17 layers organized into 4 blocks: a sparse block (SB), a feature enhancement block (FEB), an attention block (AB), and a reconstruction block (RB). Since sparsity has been shown to be effective for image applications [ 49 ], the SB was used to improve efficiency and performance and to reduce the depth of the denoising framework. The SB has 12 layers of two types (dilated Conv + BN + ReLU, and Conv + BN + ReLU). The FEB has 4 layers of 3 types (Conv + BN + ReLU, Conv, and Tanh), while the AB has a single convolution layer. The AB guides the SB and FEB, which is useful for unknown noise. Finally, the RB performs reconstruction to produce a clean image. The mean square error [ 50 ] was used as the training loss (see Fig. 2).

Fig. 2: Attention-guided denoising CNN [ 48 ]

Some deep learning algorithms produce excellent results on synthetic noise; however, most of these networks do not perform well on images corrupted by realistic noise. The research by Guo et al. [ 51 ] proposed the noise estimation removal network (NERNet), which reduces noise in images with realistic noise. The architecture is divided into two modules: a noise estimation module and a noise removal module. The noise estimation module approximates the noise-level map using the symmetric dilated block [ 52 , 53 ] and pyramid feature fusion [ 54 ], while the removal module uses the estimated noise-level map to remove noise. Global and local information for preserving details and texture is aggregated into the removal module, and the output of the noise estimation module is passed to the removal module to produce clean images.

It is no exaggeration to say that CNNs learn noise patterns and image patches effectively; however, this learning requires a large amount of training data and image patches. For this reason, reference [ 55 ] proposed the patch complexity local divide and deep conquer network (PCLDCNet). The network was divided into local subtasks (according to clean image patches and conquer blocks), each trained on its local space, and each noisy patch weighting mixture was combined with its local subtask. Image patches were grouped by complexity [ 56 ], while the training of the k networks was achieved with modified stacked denoising autoencoders [ 57 ]. Network degradation is another problem in deep networks (the deeper the network, the higher the error rate). Although the introduction of ResNet [ 58 ] addressed this issue, there is still room for improvement. Shi et al. [ 59 ] proposed a hierarchical residual learning network that does not require identity mapping for image denoising. The network has 3 sub-networks: feature extraction, inference, and fusion. The feature extraction sub-network extracts patches representing higher-dimensional feature maps. The inference sub-network [ 60 ] contains cascaded convolutions that produce a large receptive field; the cascading was performed to learn noise maps from multiscale information and to tolerate errors in noise estimation. Finally, the fusion sub-network fuses the entire noise map to produce the estimate.

Gai and Bao [ 61 ] used an improved CNN (MP-DCNN) for image denoising. MP-DCNN is an end-to-end adaptive residual CNN constructed to model noisy images. Noise in the input image was extracted with leaky ReLU activations, and the image features were reconstructed. An initial denoised image was passed into a SegNet to obtain edge information, and the MSE together with a perceptual loss function [ 62 ] was used to obtain the final denoised image (see Fig. 3).
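The combination of a pixel-wise MSE term with a perceptual term computed on deep features, as used above, can be sketched as follows; the VGG16 feature extractor, the layer cut-off, and the weighting factor are illustrative assumptions and not the exact configuration of MP-DCNN [ 61 ] (which additionally uses SegNet edge information).

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class MSEPerceptualLoss(nn.Module):
    """Pixel MSE plus an MSE between deep feature maps (perceptual loss)."""
    def __init__(self, feature_layer=16, weight=0.1):
        super().__init__()
        # In practice pretrained ImageNet weights would be loaded here;
        # weights=None keeps the sketch self-contained.
        vgg = vgg16(weights=None).features[:feature_layer]
        for p in vgg.parameters():
            p.requires_grad = False        # the feature extractor stays frozen
        self.vgg = vgg.eval()
        self.mse = nn.MSELoss()
        self.weight = weight

    def forward(self, denoised, clean):
        pixel = self.mse(denoised, clean)
        perceptual = self.mse(self.vgg(denoised), self.vgg(clean))
        return pixel + self.weight * perceptual

if __name__ == "__main__":
    loss_fn = MSEPerceptualLoss()
    denoised = torch.rand(2, 3, 64, 64, requires_grad=True)
    clean = torch.rand(2, 3, 64, 64)
    print(float(loss_fn(denoised, clean)))
```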

Fig. 3: MP-DCNN [ 61 ]

Another study [ 63 ] proposed a new dictionary learning model for a mixture of Gaussians (MOG) distribution, developed within the expectation–maximization framework [ 64 ]; a minimization problem using sparse coding and dictionary updating, with quantitative and visual comparison, was adopted. To learn hierarchical mapping functions and prevent vanishing-gradient problems, Zhang et al. [ 65 ] proposed the separation aggregation network (SANet). SANet uses three blocks (a convolutional separation block, a deep mapping block, and a band aggregation block) to remove noise. The convolutional separation block decomposes the noisy input into sub-blocks [ 66 , 67 ]; each band is then mapped into a clean, latent form using convolution and ReLU layers; finally, the band aggregation block concatenates all maps and convolves the features to produce the output. The SANet model was inspired by the non-local patch (NPL) model [ 67 ], which consists of patch grouping, transformation, and patch aggregation. Residual images obtained by learning the difference between noisy and clean image pairs can lose information that is important for producing an effective noise-free output. Reference [ 68 ] proposed the detail retaining CNN (DRCNN) to navigate between noisy and clean pairs without losing this information. DRCNN focuses on the integrity of high-frequency image content and yields better generalization ability; a minimization problem derived from the detail loss function was analyzed, designed, and solved. DRCNN has two modules: a generalization module (GM) and a detail retaining module (DRM). The GM involves convolution layers with a stride of 1, while the DRM involves several convolution layers. Unlike many architectures, DRCNN does not use BN.

Computational cost is an emerging problem in CNN applications: very large networks occupy large amounts of memory and require high computational capacity, making them unsuitable for smart and portable devices. To address this, Yin et al. [ 69 ] proposed a side window CNN (SW-CNN) for image filtering. SW-CNN has two parts: side kernel convolution (SKC), and fusion and regression (FR). SKC aligns the side or corner of the operation window with the target pixel to preserve edges and is combined with the CNN to provide effective representation power; a residual learning strategy [ 70 ] was adopted to map layers. FR involves two convolutional phases consisting of three operations: pattern expression, non-linear mapping, and weight calculation. Pattern expression calculates gradients from the feature map tensor to produce a pattern tensor; non-linear mapping convolves the pattern tensor with different kernels to produce a tensor of dimension H × W × D; finally, the weight calculation generates a weighting coefficient for each pixel.

Reducing a single noise type with a CNN is a difficult task; removing mixed noise from an image is harder still, and most mixed-noise removal algorithms involve outlier pre-processing. Reference [ 71 ] proposed the denoising-based generative adversarial network (DeGAN) for removing mixed noise from images. The generative adversarial network (GAN) [ 72 ] has been widely used in deep learning applications. DeGAN involves a generator, a discriminator, and a feature extractor network. The generator network uses the U-Net [ 73 ] architecture, while the discriminator network consists of 10 end-to-end layers; the main purpose of the discriminator is to check whether the image estimated by the U-Net generator is noise free. Finally, the feature extraction network uses VGG19 [ 74 ] to extract features and assist model training by contributing to the loss function (see Fig. 4).
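To show how the three components of a GAN-based denoiser interact during training, the sketch below writes out one update step for a generic generator/discriminator pair, with an added pixel-fidelity term on the generator; the toy networks and loss weights are placeholders for illustration and do not reproduce the DeGAN [ 71 ] architecture (U-Net generator, 10-layer discriminator, VGG19 feature loss).

```python
import torch
import torch.nn as nn

def gan_denoise_step(gen, disc, opt_g, opt_d, noisy, clean, pixel_weight=100.0):
    bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

    # Discriminator update: real clean images vs. detached generator output.
    opt_d.zero_grad()
    fake = gen(noisy).detach()
    real_logits, fake_logits = disc(clean), disc(fake)
    d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
             bce(fake_logits, torch.zeros_like(fake_logits))
    d_loss.backward()
    opt_d.step()

    # Generator update: fool the discriminator while staying close to the target.
    opt_g.zero_grad()
    fake = gen(noisy)
    fake_logits = disc(fake)
    g_loss = bce(fake_logits, torch.ones_like(fake_logits)) + pixel_weight * l1(fake, clean)
    g_loss.backward()
    opt_g.step()
    return float(d_loss), float(g_loss)

if __name__ == "__main__":
    gen = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 1, 3, padding=1))
    disc = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                         nn.Flatten(), nn.Linear(16 * 32 * 32, 1))
    opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
    noisy, clean = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
    print(gan_denoise_step(gen, disc, opt_g, opt_d, noisy, clean))
```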

Fig. 4: DeGAN [ 71 ]

Xu et al. [ 75 ] proposed Bayesian deep matrix factorization (BDMF) for multi-image denoising. BDMF uses a deep neural network (DNN) for the low-rank components and optimization via stochastic gradient variational Bayes [ 76 , 77 , 78 ]; the network is a combination of a deep matrix factorization (DMF) network and a Bayesian method, and it was evaluated on synthetic and hyperspectral images. Reference [ 79 ] proposed a classifier/regression CNN for image denoising. The classifier network detects impulse noise, while the regression network restores the noisy pixels identified by the classifier. The classifier network involves convolution, BN, ReLU, a softmax, and a skip connection; based on the labels predicted by the classifier, the regression network uses four layers and a skip connection to predict the clean image (see Fig. 5).

Fig. 5: Classifier/regression CNN [ 79 ]

Reference [ 80 ] proposed a complex-valued CNN (CDNet) for image denoising. First, the input image is passed through 24 sequentially connected convolutional units (SCCU), which involve complex-valued (CV) convolutional layers, CV ReLU, and CV BN; 64 convolutional kernels are used in the network. A residual block is implemented for the middle 18 units, and convolution/deconvolution layers with a stride of 2 are used to improve computational efficiency. Finally, a merging layer transforms the complex-valued features into a real-valued image. Overall, CDNet has five building blocks: CV Conv, CV ReLU, CV BN, a CV residual block (RB), and the merging layer (see Fig. 6).

Fig. 6: Complex-valued CNN [ 80 ]

Zhang et al. [ 81 ] proposed a detection-and-reconstruction CNN for (3-channel) color images. The method has three networks: a classifier network, a denoiser network, and a reconstruction network. The classifier network predicts, per color channel, the probability of impulse noise in the image; a decision-maker procedure (which computes the label vector of each pixel) is employed to determine whether a color pixel is noisy or noise free. A sparse clean image replaces corrupted channels (0 for noise free). Finally, the denoised image is reconstructed by the image reconstruction architecture. In a nutshell, the classifier network (convolution and ReLU layers) predicts the channel probabilities, the denoiser network (convolution, BN, and ReLU layers) recovers the noise-free color pixels, and the reconstruction network (convolutions only) reconstructs the image. Although the networks share the same structure, their depths and numbers of nodes differ. Adaptive moment estimation (Adam) [ 82 ] was used to optimize the networks.

Reference [ 83 ] proposed the CNN variation model (CNN-VM) for image denoising. The CNN used in this work, termed EdgeNet, consists of multiple-scale residual blocks (MSRB). EdgeNet extracts features from the noisy image through an edge regularization method; total variation regularization is used to obtain superior performance at sharp edges, and the Bregman splitting method is used to solve the model. Each MSRB employs a kernel of size two for each bypass to detect local features; a skip connection feeds the input data forward and generates output features, and another skip connection is applied after each MSRB, with a bottleneck layer fusing the detected features. Four MSRB blocks were adopted in the EdgeNet training procedure. A comparison of the different methods in this section is available in Tables 1 and 2.
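One way the total variation regularization mentioned above enters a denoising objective is as a penalty on the output's spatial gradients added to a data-fidelity term; the sketch below shows this simplified form with an assumed weighting factor, rather than the full Bregman-split model of [ 83 ].

```python
import torch

def total_variation(x):
    """Anisotropic total variation of a batch of images shaped (N, C, H, W)."""
    dh = torch.abs(x[:, :, 1:, :] - x[:, :, :-1, :]).sum()
    dw = torch.abs(x[:, :, :, 1:] - x[:, :, :, :-1]).sum()
    return dh + dw

def tv_regularized_loss(denoised, clean, tv_weight=1e-4):
    fidelity = torch.mean((denoised - clean) ** 2)       # data-fidelity (MSE) term
    return fidelity + tv_weight * total_variation(denoised) / denoised.numel()

if __name__ == "__main__":
    out = torch.rand(2, 1, 64, 64, requires_grad=True)
    target = torch.rand(2, 1, 64, 64)
    loss = tv_regularized_loss(out, target)
    loss.backward()                                      # gradients flow back to the network output
    print(float(loss))
```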

CNN denoising for specific images

Islam et al. [ 84 ] proposed a feedforward CNN method to remove mixed (Gaussian–impulse) noise. The method adopts a computationally efficient transfer learning approach for noise removal. The model consists of a pre-processing stage and four convolution filtering stages. A rank-order filtering operation is applied at each stage, and each convolution layer precedes the ReLU and max-pooling layers: the output of the first stage is fed into a ReLU whose output is max-pooled, the second and third stages use convolution and ReLU layers, and the last stage uses a convolution layer alone. A back-propagation algorithm (with a differentiable and traceable loss function) was used to train the model, and data augmentation [ 85 , 86 ] was used for effective learning. Another study, by Tian et al. [ 87 ], proposed a deep learning method based on U-Net [ 73 ] and the Noise2Noise [ 88 ] method. First, the noise was validated on computer-generated holography (CGH) images; then the classical Gerchberg–Saxton (GS) algorithm [ 89 ] was used to generate different (two-phase) holograms; next, the noise reduction mechanism (U-Net and Noise2Noise) was applied. The MSE was used as the loss function and the learning rate was set to 0.001. As in the previous method, the MSE served as the loss function; evidently, MSE can act as a good loss function in image denoising.
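The Noise2Noise idea referenced above trains the denoiser to map one noisy observation of a scene to a second, independent noisy observation of the same scene; with zero-mean noise the learned mapping approaches the clean target even though no clean image is ever used as the training label. The sketch below shows that training step with a generic small CNN and synthetic Gaussian noise; the model, noise level, and batch size are illustrative assumptions (the learning rate of 0.001 follows the setting reported in [ 87 ]).

```python
import torch
import torch.nn as nn

def noise2noise_step(model, optimizer, clean_batch, sigma=0.1):
    # Two independent noisy realizations of the same underlying images.
    noisy_input = clean_batch + sigma * torch.randn_like(clean_batch)
    noisy_target = clean_batch + sigma * torch.randn_like(clean_batch)

    optimizer.zero_grad()
    # The loss is computed against the second *noisy* image, never the clean one.
    loss = nn.functional.mse_loss(model(noisy_input), noisy_target)
    loss.backward()
    optimizer.step()
    return float(loss)

if __name__ == "__main__":
    model = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(32, 1, 3, padding=1))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    clean = torch.rand(8, 1, 64, 64)   # only used here to simulate paired noisy data
    for _ in range(3):
        print(noise2noise_step(model, optimizer, clean))
```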

Reference [ 90 ] proposed the spectral–spatial denoising residual network (SSDRN). SSDRN uses a spectral difference [ 91 ] mapping method based on a CNN with residual learning for denoising. The network is an end-to-end algorithm that preserves the spectral profile while removing noise; a key band is selected based on a principal transform matrix and denoised with DnCNN [ 40 ]. Overall, SSDRN involves three parts: spectral difference learning, key band selection, and the denoising (DnCNN) model. Unlike most CNN denoising models, SSDRN uses a batch normalization [ 92 ] layer in each block of the algorithm. Reference [ 93 ] proposed patch-group deep learning for image denoising: a training set of patch groups was created, and a deep learning method [ 94 , 95 ] was then used to reduce the noise. Reference [ 96 ] developed an end-to-end deep neural network (DDANet) for computational ghost image reconstruction. DDANet uses a bucket signal with multiple tunable noise-level maps; a clear image is output after training with the simulated bucket signals and ground-truth images. DDANet has 21 layers, including fully connected layers, dense blocks, and convolution layers. Input, transformation, noise-adding [ 97 ], encoding [ 98 ], and object recovery layers are used in the architecture, together with a skip connection [ 99 , 100 ] for passing high-frequency feature information. An attention gate (AG) [ 101 ] and dilated convolutions filter the features; finally, a dropout layer [ 102 ] is used to avoid overfitting, while BN accelerates convergence of the loss function.

Zhang et al. [ 103 ] proposed the deep spatio-spectral Bayesian posterior network (DSSBPNet) for hyperspectral images. A blend of a Bayesian variational posterior and a deep neural network produces DSSBPNet. Specifically, the method is divided into two parts: a deep spatio-spectral (DSS) network and a Bayesian posterior. The DSS network splits the input image into three parts, producing a spatio-spectral gradient [ 104 ] for each part, and uses different convolutions throughout. The likelihood of the original data, the noise estimate, the noise distribution, and the sparse noise gradient constitute the Bayesian posterior method, and a forward–backward propagation method connects the DSS network with the Bayesian posterior. Reference [ 105 ] proposed a two-stage cascaded residual CNN to remove mixed noise from infrared images. The model uses mixed convolutional layers, combining dilated convolutions, sub-pixel convolutions, and standard convolutions, to extract features and improve accuracy. A residual learning method estimates the calibration parameters from the input image. Five feature extraction blocks (FEBs) use a coarse–fine convolution unit (CF-Conv) and a spatial-channel noise attention unit (SCNAU) to stack noise features; the last convolution layer of each network consists of a single filter with a fixed kernel size (see Fig. 7). Giannatou et al. [ 106 ] proposed a residual learning CNN (SEMD) for noise removal in scanning electron microscopy images. SEMD is a residual learning method inspired by DnCNN and trained to estimate the noise at each pixel of a noisy image; its input block consists of a convolutional layer followed by ReLU and BN, and its output block consists of a convolution with one filter for reconstruction. Jiang et al. [ 107 ] proposed a generative adversarial network based on a deep network for denoising underwater images (UDnNet). UDnNet consists of two sub-networks: a generator and a discriminator. The generator produces realistic samples using the training procedure, an asymmetric codec structure, and a skip connection; its output is processed by convolution–instance norm–leaky ReLU layers, and deconvolution–instance norm–leaky ReLU layers decode the features.

Fig. 7: Two-stage cascaded residual CNN [ 105 ]

Reference [ 108 ] combined a bilateral filter, hybrid optimization, and a CNN to remove noise. The bilateral filter [ 100 , 109 ] removes noise, while the hybrid optimization uses a swarm-intelligence strategy [ 110 ] to preserve edges; finally, a CNN classifier (with convolution layers, a pooling layer with feature extraction, and a fully connected layer) classifies the image. For evaluation, the peak signal-to-noise ratio, vector root mean square error, structural similarity index, and root mean square error were adopted [ 8 , 111 ]. A major challenge when using CNNs for speckle reduction is labeling: ultrasound images are not labeled, so it is very difficult for deep learning to identify speckle. Feng et al. [ 112 ] proposed a hybrid CNN method for speckle reduction involving a three-part knowledge system. Since speckle noise resembles a Gaussian distribution in the logarithm transform domain, the distribution parameters were estimated in the logarithm domain with maximum likelihood estimation. Second, a transferable denoising network was trained on a clean natural-image dataset. Finally, VGGNet was used to extract structural boundaries from the trained images. Overall, the transferable denoising network was trained on Gaussian prior knowledge of clean ultrasound images, and the pre-trained network was then fine-tuned with prior knowledge of structural boundaries. Ultrasound images (breast, liver, and spinal) and artificially generated phantom (AGP) images were used to evaluate the method.
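The homomorphic trick described above, where multiplicative speckle becomes approximately additive (and roughly Gaussian) after a logarithm transform, can be sketched as follows: any additive-noise denoiser is applied in the log domain and the result is mapped back by exponentiation. The Gaussian filter used here is only a stand-in for the transferred CNN denoiser of [ 112 ], and the speckle strength is an illustrative assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def despeckle_log_domain(noisy, additive_denoiser, eps=1e-6):
    """Apply an additive-noise denoiser in the log domain to suppress speckle."""
    log_img = np.log(noisy + eps)          # multiplicative noise -> additive noise
    log_clean = additive_denoiser(log_img)
    return np.exp(log_clean) - eps         # back to the intensity domain

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = 0.2 + 0.6 * rng.random((64, 64))
    speckled = clean * (1.0 + 0.3 * rng.standard_normal(clean.shape))
    restored = despeckle_log_domain(np.clip(speckled, 1e-3, None),
                                    lambda x: gaussian_filter(x, sigma=1.0))
    print("MSE before:", float(np.mean((speckled - clean) ** 2)))
    print("MSE after :", float(np.mean((restored - clean) ** 2)))
```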

Reference [ 113 ] used a pre-trained residual learning network (RLN) for despeckling ultrasound images. The model consists of a noise model and a pre-trained RLN. A noise model was created from the training dataset, random patches were generated from the speckle-noise images, and the RLN was then trained on these random patches to produce despeckled images. The pre-trained RLN has 59 layers (Conv, ReLU, and BN) for training and testing. The method was tested on artificial and natural images corrupted with speckle noise (see Fig. 8).

Fig. 8: Pre-trained RLN [ 113 ]

Kim and Lee [ 114 ] proposed a conditional generative adversarial network (CGAN) for noise reduction in low-dose chest images. The CGAN involves a generative model [ 115 ], a discriminator model [ 116 ], and a prediction model. The generator model has 14 layers and focuses on synthesizing realistic images from random noise vectors, while the discriminator model has 4 layers and trains on ground-truth images; a tensor library was used to implement the CGAN architecture. Li et al. [ 117 ] proposed a progressive network learning strategy (PNLS) that fits the Rician distribution with large convolutional filters. The network consists of two residual blocks (used for fitting and matching pixel domains). The first residual block uses Conv and ReLU layers without BN; the second uses Conv, ReLU, and BN layers. Each block has 5 layers, with three convolution layers acting as intermediaries between the blocks (see Fig. 9).

Fig. 9: Progressive network learning strategy [ 117 ]

Reference [ 118 ] proposed a novel CNN method for denoising MRI scans (CNN-DMRI). The network uses convolutions to separate image features from noise. CNN-DMRI is an encoder–decoder structure that preserves important features and ignores unimportant ones; the network learns prior features from the image domain and produces clean images. Down-sampling and up-sampling factors of 2 were adopted. CNN-DMRI is a four-layer network: the first two layers have 64 filters followed by Conv layers, and the down-sampling layer has 128 filters followed by 4 residual blocks and a 64-filter up-sampling layer. Finally, the noisy image is concatenated with the network output to produce a clean MRI. A comparison of the different methods in this section is available in Tables 3 and 4.

CNN image denoising performance measures

Performance evaluation is a key index in image denoising. Over the years, researchers have adopted different objective evaluation methods for CNN image denoising; the most common ones are described below.

The mean square error (MSE): the average of the squared differences between the original image and the denoised image. Lower MSE values signify better image quality.
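In its standard form, for an original image \(I\) and a denoised estimate \(\hat{I}\), each with \(N\) pixels,

\[ \mathrm{MSE} = \frac{1}{N}\sum _{i=1}^{N}\left(I_{i} - \hat{I}_{i}\right)^{2}. \]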

Peak signal-to-noise ratio (PSNR): derived from the MSE, this engineering term measures the ratio between the maximum possible signal value and the MSE. Higher PSNR values signify better image quality.
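With \(I_{\max }\) denoting the maximum possible pixel value (255 for 8-bit images), the PSNR follows directly from the MSE:

\[ \mathrm{PSNR} = 10\log _{10}\left(\frac{I_{\max }^{2}}{\mathrm{MSE}}\right). \]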

Structural similarity index measure (SSIM): measures the perceptual difference (in luminance, contrast, and structure) between two similar images. Higher SSIM values signify better image quality. In its standard form, for two image patches I and L,

\[ \mathrm{SSIM}(I,L) = \frac{\left(2\mu _{I}\mu _{L} + Q_{1}\right)\left(2\sigma _{{IL}} + Q_{2}\right)}{\left(\mu _{I}^{2} + \mu _{L}^{2} + Q_{1}\right)\left(\sigma _{I}^{2} + \sigma _{L}^{2} + Q_{2}\right)}, \]

where \(\mu _{I}\) and \(\mu _{L}\) are the average gray values, \(\sigma _{I}^{2}\) and \(\sigma _{L}^{2}\) are the variances of the patches, \(\sigma _{{IL}}\) is the covariance of I and L, and Q 1 and Q 2 denote two small positive stabilizing constants (typically 0.01).

Root mean square error (RMSE): measures the difference between estimated predictions and actual observed values; the MSE is the square of the RMSE. The RMSE between two images P and Q, each with N pixels, is

\[ \mathrm{RMSE}(P,Q) = \sqrt{\frac{1}{N}\sum _{i=1}^{N}\left(P_{i} - Q_{i}\right)^{2}}. \]

Feature Similarity (FSIM and FSIMc) : is designed for gray-scale images and luminance components of color images. It computes local similarity maps and pools these maps into a single similarity score.

To learn more about FSIM and FSIMc, see reference [ 119 ].

The signal-to-noise ratio (SNR): measures the strength of the original signal relative to the residual noise. For an original image \(I\) and its estimate \(\hat{I}\), a common form is

\[ \mathrm{SNR} = 10\log _{10}\left(\frac{\sum _{i=1}^{N} I_{i}^{2}}{\sum _{i=1}^{N}\left(I_{i} - \hat{I}_{i}\right)^{2}}\right). \]

The spectral angle mapper (SAM) and the Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS) [ 120 ] are used alongside other evaluation methods for remote sensing images. Overall, the PSNR and the SSIM are the most widely used evaluation methods for CNN denoising; these two metrics are popular because they are easy to compute and are considered tested and valid [ 121 ].
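Because PSNR and SSIM dominate the evaluation of CNN denoisers, the sketch below computes both for a pair of images with scikit-image; the image size and noise level are arbitrary illustrative values.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
clean = rng.random((128, 128))
denoised = np.clip(clean + 0.05 * rng.standard_normal(clean.shape), 0.0, 1.0)

# Both metrics require the dynamic range of the images (here [0, 1]).
psnr = peak_signal_noise_ratio(clean, denoised, data_range=1.0)
ssim = structural_similarity(clean, denoised, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")
```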

Datasets for CNN image denoising

This section provides a list of datasets used by CNN image denoising algorithms. They include: the ImageNet large-scale visual recognition challenge object detection set (ILSVRC-DET) [ 122 ], Places2 [ 123 ], the Berkeley Segmentation Dataset (BSD) [ 124 ], the Waterloo Exploration Database [ 125 ], EMNIST [ 126 ], the COCO dataset [ 127 ], MIT-Adobe FiveK [ 128 ], ImageNet [ 129 ], BSD68 [ 130 ], Set14 [ 131 ], RENOIR [ 132 ], NC12 [ 133 ], NAM [ 134 ], SIDD [ 135 ], SUN397 [ 136 ], Set5 [ 137 ], CAVE [ 138 ], the Harvard database [ 139 ], the MRI brain dataset [ 140 ], LIVE1, Chelsea, the DIV2K dataset [ 141 ], the Xiongan HSI dataset, the First Affiliated Hospital of Sun Yat-Sen University dataset, the Shenzhen Third People's Hospital dataset, artificially generated phantom (AGP) images [ 142 ], ultrasound datasets [ 143 , 144 ], the SPIE–American Association of Physicists in Medicine lung CT challenge database [ 145 ], the SIAT-CAS MRI dataset, BrainWeb [ 146 , 147 ], the IXI dataset [ 148 ], the multiple sclerosis dataset [ 149 ], the prostate MRI dataset [ 150 ], and the Thammasat University Hospital dataset [ 151 ]. A few samples from the datasets used by researchers for CNN denoising are shown in Fig. 10, and Fig. 11 graphs the datasets used for evaluating CNN denoising methods. The Berkeley Segmentation Dataset has the highest usage because it is particularly well suited to image denoising research. Three major points that matter when selecting a dataset are relevance, usability, and quality [ 47 , 152 ]; we believe the datasets most used for CNN denoising tasks satisfy these points. It should be noted that some datasets were not recorded in the graph (Fig. 11) because they appear only rarely in CNN denoising research.

figure 10

A few samples of images in datasets used by researchers

figure 11

Datasets for CNN-IQA methods

A total of 152 references were included in this paper, of which 31 are research papers specifically on CNN image denoising. In this review, a conscious effort was made to include all research articles relating to CNN image denoising; however, some studies might have been missed. A graph depicting the number of CNN image denoising publications per year is available in Fig. 12. From the figure, it is clear that researchers have only recently adopted CNNs for image denoising, leaving ample room for further experimentation and exploration. Finally, a graph of the image types adopted by the CNN image denoising methods is available in Fig. 13.

figure 12

Number of papers published yearly

figure 13

List of image types

Conclusions and future directions

CNN architectures have recently become quite useful in image denoising. We have presented a survey of different techniques relating to CNN image denoising. A clear explanation of the different concepts and methods was provided to give readers a grasp of recent trends, and several techniques for CNN denoising have been enumerated. A total of 144 references were included in this paper. From the study, we observed that the GAN was the most used method for CNN image denoising: several methods used the generator and the discriminator for feature extraction and clean-image generation, and, interestingly, some researchers combined the GAN method with DCNN methods. The feedforward CNN and U-Net were also used. The residual network was used repeatedly by researchers; a reason for its high usage could be its effectiveness and efficiency, as researchers used residual connections to limit the number of convolutions in their networks. A creative measure adopted by several researchers was to tackle mixed noise (combined impulse and Gaussian noise); reducing mixed noise in images required several carefully designed deep convolutions. Rician and speckle noise are common in medical images, and pre-trained networks have worked excellently in medical image noise reduction. The Berkeley database was the most used in CNN image denoising. In addition, the attention mechanism and residual networks are commonly used CNN techniques in image denoising tasks; the reason for such wide acceptance is their popularity and effectiveness in image denoising.
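To make the residual-learning idea above concrete, the following is a minimal sketch of a residual denoiser in the spirit of DnCNN (Zhang et al., listed in the references): the network predicts the noise and a skip connection subtracts it from the input. The depth, width, and patch size are illustrative choices rather than settings from any particular surveyed paper.

```python
# Minimal sketch of a residual-learning denoiser (DnCNN-style). The model
# estimates the noise residual; subtracting it from the input eases gradient
# flow. Depth, width, and input size are illustrative, not prescriptive.
import torch
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    def __init__(self, channels=1, features=64, depth=8):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        # Skip connection: the predicted noise is removed from the noisy input.
        return noisy - self.body(noisy)

if __name__ == "__main__":
    model = ResidualDenoiser()
    x = torch.randn(4, 1, 40, 40)      # a batch of noisy 40x40 patches
    print(model(x).shape)              # torch.Size([4, 1, 40, 40])
```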

Some problems confronting CNN image denoising methods include insufficient memory for CNN applications and difficulty in solving unsupervised denoising tasks. Moreover, only very few CNN methods have been applied to medical images; it would be encouraging if more CNN methods were applied to denoising medical images. In addition, the authors attempted to collect the associated codes and software; however, they were largely unavailable. The provision of larger memory allocations for CNN tasks would be very helpful and could be a topic for future research.

The findings of the review can be summarized below:

From the available literature, it is clear that CNNs can remove many kinds of noise from images and considerably advance image denoising capability. Several studies reported higher performance of CNN architectures for image denoising. CNN architectures support end-to-end procedures and can be implemented promptly.

CNN architectures can be customized for noise removal tasks, creating connection patterns (such as residual and skip connections) that alleviate the bottleneck of vanishing gradients.

CNN methods are designed using technical knowledge and principles in concert with an understanding of the noise type and noise model.

Most studies used pre-trained CNN models; however, noise properties vary continuously and often require a model built from scratch. Building such a model creates room for readjustment and fine-tuning, but training from the beginning requires large amounts of computation space and time. With the introduction of cloud-based services (e.g., Colab), it is hoped that the problems of space and time will be alleviated.

The use of spatial patterns in CNN architectures could create a shift from conventional methods to deep learning methods. Contrary to the perception that a CNN is a black box, feature visualization methods provide a trusted platform for noise removal; however, the greatest challenges remain computational time and space.

Diwakar M, Kumar M (2018) A review on CT image noise and its denoising. Biomed Signal Process Control 42:73–88. https://doi.org/10.1016/j.bspc.2018.01.010


Buades A, Coll B, Morel JM (2005) A review of image denoising algorithms, with a new one. Multiscale Model Simul 4(2):490–530


Awad A (2019) Denoising images corrupted with impulse, Gaussian, or a mixture of impulse and Gaussian noise. Eng Sci Technol Int J 22(3):746–753


Bingo W-KL, Charlotte YFH, Qingyun D, Reiss JD (2014) Reduction of quantization noise via periodic code for oversampled input signals and the corresponding optimal code design. Digit Signal Process 24:209–222

Rajagopal A, Hamilton RB, Scalzo F (2016) Noise reduction in intracranial pressure signal using causal shape manifolds. Biomed Signal Process Control 28:19–26

Ilesanmi AE, Idowu OP, Chaumrattanakul U, Makhanov SS (2021) Multiscale hybrid algorithm for pre-processing of ultrasound images. Biomed Signal Process Control 66:102396

Goyal B, Dogra A, Agrawal S, Sohi BS, Sharma A (2020) Image denoising review: from classical to state-of-the-art approaches. Inform Fusion 55:220–244

Gai S, Zhang B, Yang C, Lei Yu (2018) Speckle noise reduction in medical ultrasound image using monogenic wavelet and Laplace mixture distribution. Digital Signal Process 72:192–207

Baselice F, Ferraioli G, Pascazio V, Sorriso A (2019) Denoising of MR images using Kolmogorov–Smirnov distance in a non local framework. Magn Reson Imaging 57:176–193

Vijay M, Devi LS (2012) Speckle noise reduction in satellite images using spatially adaptive wavelet thresholding. Int J Comput Sci Inf Technol 3:3432–3435

Bhosale NP, Manza RR (2013) Analysis of effect of noise removal filters on noisy remote sensing images. Int J Sci Eng Res 4:1151

Berens P. Introduction to synthetic aperture radar (SAR). NATO OTAN, pp 1–14

Sivaranjani R, Mohamed-Mansoor-Roomi S, Senthilarasi M (2019) Speckle noise removal in SAR images using multi-objective PSO (MOPSO) algorithm. Appl Soft Comput 76:671–681

Aljarf A, Amin S (2015) Filtering and reconstruction system for gray forensic images. World Acad Sci Eng Technol Int J Inform Commun Eng 9(1)

Huang T (1971) Stability of two-dimensional recursive filters (mathematical model for stability problem in two-dimensional recursive filtering)

Jaspin-Jeba-Sheela C, Suganthi G (2020) An efficient denoising of impulse noise from MRI using adaptive switching modified decision based unsymmetric trimmed median filter. Biomed Signal Process Control 55:101657

Zhao H, Zheng Z (2016) Bias-compensated affine-projection-like algorithms with noisy input. Electron Lett 52(9):712–714

Dinga F, Wanga Y, Ding J (2015) Recursive least squares parameter identification algorithms for systems with colored noise using the filtering technique and the auxilary model. Digit Signal Process 37:100–108

Stolojescu-Crisan C (2015) A hyperanalytic wavelet based denoising technique for ultrasound images. In: International Conference on Bioinformatics and Biomedical Engineering, pp 193–200

Zhang X, Feng X (2014) Multiple-step local wiener filter with proper stopping in wavelet domain. J Vis Commun Image Represent 25(2):254–262

Mohd Sagheer SV, George SN (2020) A review on medical image denoising algorithms. Biomed Signal Process Control 61:102036

Fan L, Zhang F, Fan H et al (2019) Brief review of image denoising techniques. Vis Comput Ind Biomed Art 2:7. https://doi.org/10.1186/s42492-019-0016-7

Lucas A, Iliadis M, Molina R, Katsaggelos AK (2018) Using deep neural networks for inverse problems in imaging: beyond analytical methods. IEEE Signal Process Mag 35(1):20–36

Khan A, Sohail A, Zahoora U et al (2020) A survey of the recent architectures of deep convolutional neural networks. Artif Intell Rev 53:5455–5516

Zhu P, Isaacs J, Fu B et al. (2017) Deep learning feature extraction for target recognition and classification in underwater sonar images. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), IEEE, 2017

Paolini R, Rodriguez A, Srinivasa SS et al (2014) A data-driven statistical framework for post-grasp manipulation. Int J Robot Res 33(4):600–615

Ramos S, Gehrig S, Pinggera P et al (2017) Detecting unexpected obstacles for self driving cars: Fusing deep learning and geometric modeling. In: 2017 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2017

Wu H, Liu Y, Liu Y, Liu S (2019) Efficient facial expression recognition via convolution neural network and infrared imaging technology. Infrared Phys Technol 102:103031

Firmansyah I, Yamaguchi Y (2020) FPGA-based implementation of a chirp signal generator using an OpenCL design. Microprocess Microsyst 77:103199

LeCun Y, Bottou L, Bengio Y, Haffner P et al (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

Chiang Y, Sullivan BJ (1989) Multi-frame image restoration using a neural network. In: Proceedings of the 32nd midwest symposium on circuits and systems, IEEE, pp 744–747

Hu J, Wang X, Shao F, Jiang Q (2020) TSPR: deep network-based blind image quality assessment using two-side pseudo reference images. Digit Signal Process 106:102849

Marreiros AC, Daunizeau J, Kiebel SJ, Friston KJ (2008) Population dynamics: variance and the sigmoid activation function. Neuroimage 42(1):147–157

Jarrett K, Kavukcuoglu K, Ranzato M, LeCun Y (2009) What is the best multi-stage architecture for object recognition. In: 2009 IEEE 12th international conference on computer vision, pp 2146–2153

Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in Neural information Processing Systems, pp 1097–1105

Ha I, Kim HJ, Park S, Kim H (2018) Image retrieval using BIM and features from pretrained VGG network for indoor localization. Build Environ 140:23–31

Tang P, Wang H, Kwong S (2017) G-MS2F: GoogLeNet based multi-stage feature fusion of deep CNN for scene recognition. Neurocomputing 225:188–197

Liang J, Liu R (2015) Stacked denoising autoencoder and dropout together to prevent overfitting in deep neural network, In: 2015 8th international congress on image and signal processing (CISP), pp 697–701

Xu J, Zhang L, Zuo W, Zhang D, Feng X (2015) Patch group based nonlocal self-similarity prior learning for image denoising. In: Proceedings of the IEEE international conference on computer vision, pp 244–252

Zhang K, Zuo W, Chen Y, Meng D, Zhang L (2017) Beyond a Gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE Trans Image Process 26(7):3142–3155

Li Y, Wang N, Shi J, Hou X, Liu J (2018) Adaptive batch normalization for practical domain adaptation. Pattern Recogn 80:109–117

Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814

Zhang M, Yang L, Yu D, An J (2021) Synthetic aperture radar image despeckling with a residual learning of convolutional neural network. Optik 228:165876

Zhang F, Liu D, Wang X, Chen W, Wang W (2018) Random noise attenuation method for seismic data based on deep residual networks. In: International geophysical conference, Beijing, China, 24–27, 2018, Society of Exploration Geophysicists and Chinese Petroleum Society, pp 1774–1777

Guo Z, Sun Y, Jian M, Zhang X (2018) Deep residual network with sparse feedback for image restoration. Appl Sci 8(12):2417

Tian C, Fei L, Zheng W, Xu Y, Zuo W, Lin C-W (2020) Deep learning on image denoising: an overview. Neural Netw 131:251–275

Koesten L, Simperl E, Blount T, Kacprzak E, Tennison J (2020) Everything you always wanted to know about a dataset: studies in data summarization. Int J Human-Comp Stud 135:102367

Schwenker F, Kestler HA, Palm G (2001) Three learning phases for radial-basis-function networks. Neural Netw 14(4):439–458


Tian C, Zhang Q, Sun G, Song Z, Li S (2018) FFT consolidated sparse and collaborative representation for image classification. Arab J Sci Eng 43(2):741–758

Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

Guo B, Song K, Dong H, Yan Y, Tu Z, Zhu L (2020) NERNet: noise estimation and removal network for image denoising. J Vis Commun Image R 71:102851

Wei Y et al (2018) Revisiting dilated convolution: a simple approach for weakly-and semi-supervised semantic segmentation. IEEE Conf Comp Vis Pattern Recogn (Cvpr) 2018:7268–7277

Li X et al (2019) Selective kernel networks. IEEE Conf Comp Vis Pattern Recogn (Cvpr) 2019:510–519

He KM et al (2014) Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Computer Vision - Eccv 2014, Pt Iii,. 8691: pp 346–361

Hong I, Hwang Y, Kim D (2019) Efficient deep learning of image denoising using patch complexity local divide and deep conquer. Pattern Recogn 96:106945

Chatterjee P, Milanfar P (2011) Practical bounds on image denoising: from estimation to information. IEEE Trans Image Process 20(5):1221–1233

Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P-A (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11:3371–3408

He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778

Shi W, Jiang F, Zhang S, Wang R, Zhao D, Zhou H (2019) Hierarchical residual learning for image denoising. Signal Process Image Commun 76:243–251

Kim J, Kwon Lee J, Mu Lee K (2016) Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1646–1654

Gai S, Bao Z (2019) New image denoising algorithm via improved deep convolutional neural network with perceptive loss. Expert Syst Appl 138:112815

Wu CZ, Chen X, Ji D, Zhan S (2018) Image denoising via residual network based on perceptual 1oss. J Image Graphics 23(10):1483–1491

Zhang J, Luo H, Hui B, Chang Z (2019) Unknown noise removal via sparse representation model. ISA Trans 94:135–143

Liu J, Tai X, Huang H, Huan Z (2013) A weighted dictionary learning models for denoising images corrupted by mixed noise. IEEE Trans Image Process 22(3):1108–1120

Zhang L, Li Y, Wang P, Wei W, Xu S, Zhang Y (2019) A separation–aggregation network for image denoising. Appl Soft Comp J 83:105603

Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2921–2929

Gu S, Zhang L, Zuo W, Feng X (2014) Weighted nuclear norm minimization with application to image denoising. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2862–2869

Li X, Xiao J, Zhou Y, Ye Y, Lv N, Wang X, Wang S, Gao S (2020) Detail retaining convolutional neural network for image denoising. J Vis Commun Image R 71:102774

Yin H, Gong Y, Qiu G (2020) Fast and efficient implementation of image filtering using a side window convolutional neural network. Signal Process 176:107717

Zhang K, Zou W, Chen Y, Meng D, Zhang L (2017) Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans Image Process 26(7):3142–3155

Lyu Q, Guo M, Pei Z (2020) DeGAN: mixed noise removal via generative adversarial networks. Appl Soft Comp J 95:106478

Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley B, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Proceedings of the 28th Annual Conference on Neural Information Proceeding System, pp 2672–2680

Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp 234–241

Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. ArXiv preprint arXiv: 1409.1556.

Xu S, Zhang C, Zhang J (2020) Bayesian deep matrix factorization network for multiple images denoising. Neural Netw 123:420–428

Blei DM, Kucukelbir A, McAuliffe JD (2017) Variational inference: a review for statisticians. J Am Stat Assoc 112(518):859–877


Blundell C, Cornebise J, Kavukcuoglu K, Wierstra D (2015) Weight uncertainty in neural networks. In: ICML, pp 1613–1622

Kingma DP, Welling M (2014) Auto-encoding variational Bayes. In: International conference on learning representations, ICLR Banff, AB, Canada, pp 14–16

Jin L, Zhang W, Ma G, Song E (2019) Learning deep CNNs for impulse noise removal in images. J Vis Commun Image R 62:193–205

Quan Y, Chen Y, Shao Y, Teng H, Xu Y, Ji H (2021) Image denoising using complex-valued deep CNN. Pattern Recogn 111:107639

Zhang W, Jin L, Song E, Xu X (2019) Removal of impulse noise in color images based on convolutional neural network. Appl Soft Comp J 82:105558

Kingma D, Ba J (2015) Adam: a method for stochastic optimization. In: International Conference on Learning Representation

Fang Y, Zeng T (2020) Learning deep edge prior for image denoising. Comp Vis Image Understand 200:103044

Islam MT, Rahman SMM, Ahmad MO, Swamy MNS (2018) Mixed Gaussian-impulse noise reduction from images using convolutional neural network. Signal Process Image Commun 68:26–41

Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Proc. Int. Conf. Neural Information Processing Systems, Lake Tahoe, NV, pp 1097–1105

Oyelade ON, Ezugwu AE (2021) A deep learning model using data augmentation for detection of architectural distortion in whole and patches of images. Biomed Signal Process Control 65:102366

Yin D, Gu Z, Zhang Y, Gu F, Nie S, Feng S, Ma J, Yuan C (2020) Speckle noise reduction in coherent imaging based on deep learning without clean data. Opt Lasers Eng 133:106151

Lehtinen J, Munkberg J, Hasselgren J, Laine S, Karras T, Aittala M et al (2018) Noise2Noise: learning image restoration without clean data. arXiv: 1803.04189

Gerchberg RW, Saxton WO (1972) A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik (Stuttg) 35(2):237–250

Xi W, Li Y, Jia X (2018) Deep convolutional networks with residual learning for accurate spectral-spatial denoising. Neurocomputing 312:372–381

Li Y, Hu J, Zhao X, Xie W, Li J (2017) Hyperspectral image super-resolution using deep convolutional neural network. Neurocomputing 266:29–42

Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the International Conference on Machine Learning (ICML), pp 448–456

Park M, Lee S, Choi S, Lee S, Han S, Lee H, Kang S-H, Lee Y (2020) Deep learning-based noise reduction algorithm using patch group technique in cadmium zinc telluride fusion imaging system: A Monte Carlo simulation study. Opt Int J Light Electron Opt 207:164472

Feng J, Song L, Huo X, Yang X, Zhang W (2015) An optimized pixel-wise weighting approach for patch-based image denoising. IEEE Signal Process Lett 22:115–119

Xu J, Ren D, Zhang L, Zhang D (2017) Patch group based Bayesian learning for blind image denoising. In: Computer Vision-ACCV 2016 Workshops. ACCV 2016, Lecture Notes in Computer Science 10116, pp 79–95

Wu et al (2020) Deep-learning denoising computational ghost imaging. Opt Lasers Eng 134:106183

Zhang K, Zuo W, Zhang L (2018) FFDNet: toward a fast and flexible solution for CNN-based image denoising. IEEE T Image Process 27(9):4608–4622

Li S, Deng M, Lee J, Sinha A, Barbastathis G (2018) Imaging through glass diffusers using densely connected convolutional networks. Optica 5(7):803–813

Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–8

Routray S, Malla PP, Sharma SK, Panda SK, Palai G (2020) A new image denoising framework using bilateral filtering based non-subsampled Shearlet transform. Optik 216:164903

Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B (2018) Attention u-net: learning where to look for the pancreas, arXiv preprint arXiv: 1804.03999

Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 2012:1097–1105

Zhang Q, Yuan Q, Li J, Sun F, Zhang L (2020) Deep spatio-spectral Bayesian posterior for hyperspectral image non-i.i.d.noise removal. ISPRS J Photogramm Remote Sens 164:125–137

Xu Z et al (2019) Deep gradient prior network for DEM super-resolution: transfer learning from image to DEM. ISPRS J Photogramm Remote Sens 150:80–90

Guan J, Lai R, Xiong A, Liu Z, Gu L (2020) Fixed pattern noise reduction for infrared images based on cascade residual attention CNN. Neurocomputing 377:301–313

Giannatoua E, Papavieros G, Constantoudis V, Papageorgiou H, Gogolides E (2019) Deep learning denoising of SEM images towards noise-reduced LER measurements. Microelectron Eng 216:111051

Jiang Q, Chen Y, Wang G, Ji T (2020) A novel deep neural network for noise removal from underwater image. Signal Process Image Commun 87:115921

Elhoseny M, Shankar K (2019) Optimal bilateral filter and convolutional neural network based denoising method of medical image measurements. Measurement 143:125–135

Ilesanmi AE, Idowu OP, Makhanov SS (2020) Multiscalesuperpixel method for segmentation of breast ultrasound. Comp Biol Med 125:103879

Mafarja M, Aljarah I, Heidari AA, Faris H, Fournier-Viger P, Li X, Mirjalili S (2018) Binary dragonfly optimization for feature selection using time-varying transfer functions. Knowl-Based Syst 161(1):185–204

Singh K, Ranade SK, Singh C (2017) A hybrid algorithm for speckle noise reduction of ultrasound images. Comp Methods Progr Biomed 148:55–69

Feng X, Huang Q, Li X (2020) Ultrasound image de-speckling by a hybrid deep network with transferred filtering and structural prior. Neurocomputing 414:346–355

Kokil P, Sudharson S (2020) Despeckling of clinical ultrasound images using deep residual learning. Comp Methods Programs Biomed 194:105477

Kim H-J, Lee D (2020) Image denoising with conditional generative adversarial networks (CGAN) in low dose chest images. Nuclear Inst Methods Phys Res A 954:161914

Kunfeng W, Xuan L, Lan Y et al (2017) Generative adversarial networks for parallel vision. In: Proc., Chinese Autom. Cong., Jinan, China

Burlingame EA, Margolin A, Gray JW et al (2018) SHIFT: speedy histopathological to-immunofluorescent translation of whole slide images using conditional generative adversarial networks. In: Proc., SPIE Medical Imaging, Houston, Texas, United States

Li S, Zhou J, Liang D, Liu Q (2020) MRI denoising using progressively distribution-based neural network. Magn Reson Imaging 71:55–68

Tripathi PC, Bag S (2020) CNN-DMRI: a convolutional neural network for denoising of magnetic resonance images. Pattern Recogn Lett 135:57–63

Zhang L, Zhang L, Mou X, Zhang D (2011) FSIM: a feature similarity index for image quality assessment. IEEE Trans Image Process 20(8):2378–2386

Garzelli A (2016) A review of image fusion algorithms based on the super-resolution paradigm. Remote Sens 8:797. https://doi.org/10.3390/rs8100797

Setiadi DIM (2021) PSNR vs SSIM: imperceptibility quality assessment for image steganography. Multimed Tools Appl 80:8423–8444

Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252

Zhou B, Khosla A, Lapedriza A, Torralba A, Oliva A. Places2: A large-scale database for scene understanding, ArXiV preprint. arXiv: 1610.02055

Martin D, Fowlkes C, Tal D, Malik J et al. (2001) A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: ICCV Vancouver

Ma K, Duanmu Z, Wu Q, Wang Z, Yong H, Li H et al (2016) Waterloo exploration database: new challenges for image quality assessment models. IEEE Trans Image Process 26(2):1004–1016

Cohen G, Afshar S, Tapson J, van Schaik A (2017) EMNIST: an extension of MNIST to handwritten letters. arXiv: 1702.05373

Lin T-Y, Maire M, Belongie S, Bourdev L, Girshick R, Hays J et al (2014) Microsoft COCO: common objects in context. In: European conference on computer vision. Springer, Cham, pp 740–55

Bychkovsky V et al (2011) Learning photographic global tonal adjustment with a database of input/output image Pairs. IEEE Conf Comp Vis Pattern Recogn (Cvpr) 2011:97–104

J. Deng et al (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), pp 248–255

Zeyde R, Elad M, Protter M (2010) On single image scale-up using sparse representations. In: International conference on curves and surfaces, pp 711–730

Roth S, Black MJ (2009) Fields of experts. Int J Comput Vision 82(2):205–229

Anaya J, Barbu A (2018) RENOIR—a dataset for real low-light image noise reduction. J Vis Commun Image Represent 51:144–154

Lebrun M, Colom M, Morel JM (2014) The noise clinic: a universal blind denoising algorithm. In 2014 IEEE International Conference on Image Processing (Icip), pp 2674–2678

Nam S et al (2016) A holistic approach to cross-channel image noise modeling and its application to image denoising. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), pp 1683–1691

Abdelhamed A, Lin S, Brown MS (2018) A high-quality denoising dataset for smartphone cameras. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), pp 1692–1700

Xiao J, Ehinger KA, Hays J, Torralba A, Oliva A (2016) SUN database: exploring a large collection of scene categories. Int J Comput Vis 119(1):3–22

Bevilacqua M, Roumy A, Guillemot C, Alberi-Morel ML (2012) Low-complexity single-image super-resolution based on nonnegative neighbor embedding. BMVA Press

Chakrabarti YA, Zickler T (2011) Statistics of real-world hyperspectral images. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, pp 193–200

Yasuma F, Mitsunaga T, Iso D, Nayar SK (2010) Generalized assorted pixel camera: post capture control of resolution, dynamic range, and spectrum. IEEE Trans Image Process 19(9):2241–2253

https://www.kaggle.com/ilknuricke/neurohackinginrimages . Accessed 15 Mar 2021

Timofte R, Gu S, Wu J, Van Gool L (2018) Ntire 2018 challenge on single image super-resolution: methods and results. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018. pp 852–63

Tay PC, Acton ST, Hossack JA (2006) Ultrasound despeckling using an adaptive window stochastic approach. In: Proceedings of the International Conference on Image Processing, 2006, pp 2549–2552

Geertsma T (2011) Ultrasoundcases.info

Antony J (2015), Ultrasound-images.com

Cancer image archive database, Available at: https://www.cancerimagingarchive.net/ Accessed 15 Mar 2021

Brain web, Simulated brain database, McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill, 2004. https://www.mcgill.ca/bic/software/brainweb-mri-simulator . Accessed 15 Mar 2021

Kwan RK-S, Evans AC, Pike GB (1996) An extensible MRI simulator for post-processing evaluation. In: Visualization in Biomedical Computing (VBC’96). Lecture Notes in Computer Science, vol. 1131. Springer Verlag, pp 135–140

IXI MRI, Brain MRI database, Imperial College London (2018). https://brain-development.org/ixi-dataset/ Accessed 15 Mar 2021

Loizou CP, Murray V, Pattichis MS, Seimenis I, Pantziaris M, Pattichis CS (2010) Multiscale amplitude-modulation frequency-modulation (am–fm) texture analysis of multiple sclerosis in brain MRI images. IEEE Trans Inform Technol Biomed 15(1):119–129

Prostate MRI, Prostate MR image database, National Center for Image Guided Therapy (2008). https://prostatemrimagedatabase.com/ . Accessed 15 Mar 2021

Rodtook A, Kirimasthong K, Lohitvisate W, Makhanov SS (2018) Automatic initialization of active contours and level set method in ultrasound images of breast abnormalities. Pattern Recogn 79:172–182

Koesten LM, Kacprzak E, Tennison JFA, Simperl E (2017) The trials and tribulations of working with structured data: a study on information seeking behavior. In: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, ACM, New York, NY, USA, pp. 1277–1289, https://doi.org/10.1145/3025453.3025838


Acknowledgements

This research is supported by Thailand Research Fund grant  RSA6280098  and the Center of Excellence in Biomedical Engineering of Thammasat University. Special thanks to the management of Alex Ekwueme Federal University, Ndufu-Alike for support during this research. The authors express thanks to Professor Stanislav S. Makhanov for his encouragement and support during this review paper.

Author information

Authors and Affiliations

School of ICT, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani, 12000, Thailand

Ademola E. Ilesanmi

Alex Ekwueme Federal University, Ndufu-Alike Ikwo, Abakaliki, Ebonyi State, Nigeria

National Population Commission, Abuja, Nigeria

Taiwo O. Ilesanmi


Corresponding author

Correspondence to Ademola E. Ilesanmi .

Ethics declarations

Conflict of interest.

All authors in this paper have no potential conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Ilesanmi, A.E., Ilesanmi, T.O. Methods for image denoising using convolutional neural network: a review. Complex Intell. Syst. 7 , 2179–2198 (2021). https://doi.org/10.1007/s40747-021-00428-4


Received : 19 January 2021

Accepted : 05 June 2021

Published : 10 June 2021

Issue Date : October 2021

DOI : https://doi.org/10.1007/s40747-021-00428-4


Keywords

  • Convolutional neural network
  • Image denoising
  • Deep neural network
  • Noise in images



Published on 11.1.2024 in Vol 26 (2024)

The Association Between Linguistic Characteristics of Physicians’ Communication and Their Economic Returns: Mixed Method Study

Authors of this article:


Original Paper

  • Shuang Geng 1 , PhD   ; 
  • Yuqin He 1 , BSc   ; 
  • Liezhen Duan 1 , BSc   ; 
  • Chen Yang 1 , PhD   ; 
  • Xusheng Wu 2 , MSc   ; 
  • Gemin Liang 1 , BSc   ; 
  • Ben Niu 1 , DPhil  

1 College of Management, Shenzhen University, Shenzhen, China

2 Shenzhen Health Development Research and Data Management Center, Shenzhen, China

Corresponding Author:

Chen Yang, PhD

College of Management

Shenzhen University

Xueyuan Road 1066

Shenzhen, 518055

Phone: 86 18565859968

Email: [email protected]

Background: Web-based health care has the potential to improve health care access and convenience for patients with limited mobility, but its success depends on active physician participation. The economic returns of internet-based health care initiatives are an important factor that can motivate physicians to continue their participation. Although several studies have examined the communication patterns and influences of web-based health consultations, the correlation between physicians’ communication characteristics and their economic returns remains unexplored.

Objective: This study aims to investigate how the linguistic features of 2 modes of physician-patient communication, instrumental and affective, determine the physician’s economic returns, measured by the honorarium their patients agree to pay per consultation. We also examined the moderating effects of communication media (web-based text messages and voice messages) and the compounding effects of different communication features on economic returns.

Methods: We collected 40,563 web-based consultations from 528 physicians across 4 disease specialties on a large, web-based health care platform in China. Communication features were extracted using linguistic inquiry and word count, and we used multivariable linear regression and K-means clustering to analyze the data.

Results: We found that the use of cognitive processing language (ie, words related to insight, causation, tentativeness, and certainty) in instrumental communication and positive emotion–related words in affective communication were positively associated with the economic returns of physicians. However, the extensive use of discrepancy-related words could generate adverse effects. We also found that the use of voice messages for service delivery magnified the effects of cognitive processing language but did not moderate the effects of affective processing language. The highest economic returns were associated with consultations in which the physicians used few expressions related to negative emotion; used more terms associated with positive emotions; and later, used instrumental communication language.

Conclusions: Our study provides empirical evidence about the relationship between physicians’ communication characteristics and their economic returns. It contributes to a better understanding of patient-physician interactions from a professional-client perspective and has practical implications for physicians and web-based health care platform executives.

Introduction

Web-based health care platforms offer environments where patients can consult physicians and pay for their services remotely. These platforms are particularly helpful for patients residing in rural areas with limited access to medical resources and patients with limited mobility [ 1 ]. Physicians also benefit from providing web-based consultations, both in terms of economic returns and social returns, such as improving their reputation [ 2 ]. In recent years, web-based health care has been upheld as a national health priority in China. The number of web-based hospitals surged to 2700, and the number of users surged to 360 million in 2022 [ 3 ]. Physicians affiliated with offline hospitals can also provide services on third-party health care platforms, such as the Dingxiang Doctor and Haodf websites [ 4 , 5 ]. Unlike offline service prices, web-based consultation prices are not governed by certain pricing standards on these third-party health care platforms [ 6 ]. Thus, physicians can charge higher or lower fees per web-based consultation than their offline consultations. Meanwhile, patients can register as platform users, consult physicians, and make payment. Most of these health care platforms offer both synchronous telemedicine and asynchronous, message-based consultation services. We focus specifically on asynchronous, message-based consultations on the third-party health care platforms.

The quality of patient-physician interactions is vital for consultation outcome and patient satisfaction [ 7 , 8 ]. Although many studies have investigated the communication styles and features during web-based health care consultation [ 7 - 10 ], scant attention has been devoted to its associations with the economic returns of physicians. Characterizing physician-patient communications at a more granular level and exploring their associations with physicians’ economic returns can provide important guidance for physicians to improve their communication skills and economic benefits. The findings also have implications for the amelioration of platform incentive mechanisms.

Previous studies have identified 2 types of physician-patient interactions during web-based consultations: affective, which focuses on expressing care for the patients [ 8 ], and instrumental, which focuses on addressing the patients’ health problems [ 11 ]. Instrumental interactions typically involve discussions about the severity of the disease or syndrome, its causes, and potential treatment plans [ 12 , 13 ]. They are problem-solving oriented and emphasize the provision of medical information and expertise to patients. These 2 types of interactions are the 2 components of the professional-client interaction theory proposed by Ben-Sira [ 11 ] and are crucial for effective web-based health care consultations [ 14 ].

This study investigated the associations between these 2 types of interactions and the economic returns of physicians in the context of the Chinese web-based health care system. Specifically, we focused on asynchronous, text message–based consultation services, in which physicians are paid on a per-consultation basis. Physicians usually receive a larger portion (ie, 90%) of the consultation fee, with the remaining portion paid to the platform. We measured the economic returns of the physician as the consultation payment on a per-consultation basis. On the one hand, the consultation prices are initially established by physicians and range from RMB 30 (US $4.19) to RMB 699 (US $97.65) [ 4 ]. However, patients can make a choice among physicians, and they are only willing to pay a higher price for better service. Thus, physicians need to adjust the consultation prices to maintain the patient base, and the economic returns are regarded as being mutually determined by the patient and physician. During the consultation process, physicians can choose to use text or voice message for service delivery. Therefore, besides the association between communication characteristics and economic returns, we were also interested in how the associations change vis-à-vis different communication media (text vs voice).

Instrumental and Affective Linguistic Features

Communication quality is critical for patient satisfaction and effective use of medical resources [ 15 ]. Previous studies have investigated the quality of patient-physician interactions using self-reported survey [ 7 , 16 - 18 ], statistical method [ 9 , 19 , 20 ], and manual labeling combined with text classification [ 14 ]. We derived insights from sociopsychological studies and used Linguistic Inquiry and Word Count (LIWC) to extract the linguistic indicators. LIWC provides a psychometrically validated dictionary comprising approximately 6400 words, word stems, and selected emoticons [ 21 , 22 ]. So far, LIWC has been widely used to identify emotional and cognitive dimensions in social, health care, and personality psychology research [ 23 - 26 ]. For instance, Liu et al [ 27 ] used LIWC to ascertain positive emotion words in replies provided by physicians to indicate their emotional support to patients.

In instrumental interactions, physicians demonstrate their expertise and cognitive thinking process. Hence, physicians tend to use words related to the cognitive process, presenting linguistic features regarding insight, causation, discrepancy, tentativeness, and certainty to deliver disease knowledge to patients [ 28 ]. Affective interactions articulate the emotions of physicians including the emotional support they provide to their patients to alleviate patient anxiety [ 29 ]. Notably, the emotional expressions of physicians could comprise diverse feelings such as happiness, anxiety, anger, and sadness. Thus, we included both positive and negative emotional linguistic features for affective interactions. Specifically, we use a simplified Chinese LIWC dictionary established according to the LIWC lexicon and its Chinese version [ 30 - 32 ]. In total, there are 9 linguistic features for the 2 types of interactions, as shown in Figure 1 . Example words for each feature are provided in Multimedia Appendix 1 .
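As an illustration of this kind of feature extraction, the sketch below computes LIWC-style category ratios for a message. The actual LIWC and TextMind dictionaries are proprietary, so the category word lists here are hypothetical stand-ins used only to show the mechanics.

```python
# Illustrative sketch of LIWC-style feature extraction: the fraction of a
# message's words that fall into each psycholinguistic category. The word
# lists below are hypothetical, not the real LIWC/TextMind dictionaries.
from collections import Counter

CATEGORIES = {
    "insight": {"think", "know", "consider"},
    "causation": {"because", "effect", "hence"},
    "tentative": {"maybe", "perhaps", "possibly"},
    "certainty": {"always", "definitely", "never"},
    "posemo": {"good", "glad", "improve"},
}

def liwc_ratios(message: str) -> dict:
    tokens = message.lower().split()
    counts = Counter()
    for token in tokens:
        for category, words in CATEGORIES.items():
            if token in words:
                counts[category] += 1
    total = max(len(tokens), 1)
    return {category: counts[category] / total for category in CATEGORIES}

print(liwc_ratios("I think this will definitely improve because the treatment is good"))
```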


Research Model

Figure 1 displays our conceptual model. This study explored the associations between the communication behaviors of physicians and their economic returns in 3 stages. First, it probed the impact of instrumental and affective communication of physicians on their economic returns. Second, it investigated the potential moderating effects of communication media (voice vs text). Third, it investigated the interaction patterns among different communication features and identified the pattern yielding the highest economic returns.

Instrumental Interactions by Physicians and Their Economic Returns

Patients seek knowledge related to their health conditions through consultations with physicians because of their professional capital [ 33 ]. Thus, physicians are expected to provide disease-specific information, suggestions, or guidance to patients to improve communication quality and patient satisfaction. Physicians’ instrumental behaviors manifest mainly through the use of words related to the cognitive process that facilitate the inculcation of due medical knowledge in patients [ 34 ]. Thus, instrumental interactions are vital for physicians to achieve a consensus with their patients regarding relevant disease knowledge, treatment plans, and problem solutions.

Economic returns incentivize physicians by satisfying their financial needs. In our research context, health care platforms allow physicians to levy consultation charges as they deem fit [ 6 ]; patients can also offer monetary gifts to their physicians. According to the capital exchange theory [ 2 ], physicians use their decision capital to perform exchange actions. To a certain degree, the consultation service charges reflect the self-evaluation of physicians vis-à-vis their decision capital and service quality. In contrast, patients may attribute high expectations for the consultation process and outcomes when they agree to pay high service charges. Therefore, we postulated that high economic returns could correlate to more intense instrumental interactions. Specifically, we hypothesized the following: The instrumental communication features of physicians positively influence their economic returns (hypothesis 1).

Affective Interactions by Physicians and Their Economic Returns

Physicians’ affective communications involve the expression of care about the feelings of their patients as human beings rather than medical cases [ 11 ]. During the consultation process, patients are often emotionally occupied and could unintentionally express their emotional concerns. Many studies have emphasized the psychological needs of patients, citing anxiety and distress as critical issues for health care services [ 7 , 8 , 14 , 35 ]. The expression of feelings such as empathy, sympathy, caring, and concern to patients could help mitigate the stress or anxiety induced in patients by their health issues [ 36 ]. Hence, patient satisfaction is also associated with the affective behaviors of physicians. In addition, many patients are ignorant of the complexities and technical challenges of their treatment solutions [ 6 , 11 ]. Hence, affective behaviors by physicians can influence patients’ perceptions about their service quality. Patients may place higher expectations for web-based consultation with higher charges in terms of the sensed emotional support. Therefore, we hypothesized the following: The affective communication features of physicians positively influence their economic returns (hypothesis 2).

The Moderating Effects of Communication Media

Physicians and patients communicate primarily via textual and vocal means in asynchronous web-based health consultations. The media synchronicity theory asserts that media characteristics function significantly in information transmission and processing [ 37 ]. Vocal communication can deliver the speech signals of physicians, such as pronunciation and intonation, which textual communication using character data cannot [ 38 ]. Therefore, vocal communication allows patients to discern the subtleties of the thoughts or concerns of their physicians about their diseases. Moreover, vocal communication also adds to the multiplicity of cues to support complex tasks [ 14 ]. In addition, physicians can use vocal communication to deliver immediate and convenient feedback because typing messages is more time consuming. However, it cannot be presumed that vocal communication is inherently better than textual communication because the latter allows physicians time to think about the inquiry in great depth.

Communication comprises 2 primary processes according to the media synchronicity theory: the conveyance of information and the convergence of meaning [ 37 ]. Thus, we relate instrumental communication to the process of conveying information and associate affective communication with the process of achieving the convergence of meaning. The match between media capabilities and communication purposes could cause differences in communication performance. We further investigated the potential moderating effects of communication media (text vs voice) on the correlations between the communication features and economic returns of physicians. Therefore, we hypothesized the following: the associations between the instrumental communication features of physicians and their economic returns are moderated by the type of communication media (hypothesis 3), and the associations between the affective communication features of physicians and their economic returns are moderated by the type of communication media (hypothesis 4).

Data Collection

Objective secondhand data were collected from the Dingxiang Doctor website [ 4 ], a predominant web-based health consultation platform in China for >20 years. So far, the website serves >5.5 million active users and 2.1 million participating physicians. The Dingxiang Doctor website categorizes physicians into 39 disease specialties, which deliver different treatment solutions according to patients’ disease severity. We selected physicians who treat severe diseases (malignant tumors and heart disease) and who treat less severe diseases (digestive and endocrine diseases) to control interference from the disease type and severity [ 39 ]. We used a data crawling application to collect historical physician-patient consultation records. Each consultation instance contained several rounds of physician-patient interactions. Figure 2 presents a sample historical consultation record encompassing physician-related information (title, professional domain, disease department, hospital, and overall patient rating), patient consultation inquiries, physician’s replies, and consultation service charges. We collected data from 2 periods: June 2021 and December 2022. We excluded some duplicated or invalid samples containing errors and missing values and ultimately analyzed 40,563 consultation instances generated from 528 physicians between September 2016 and December 2022. Of the 528 physicians, 54 (10.2%) were directors, 154 (29.2%) were associate directors, 231 (43.8%) were attending physicians, and 89 (16.9%) were residents. The studied physicians had worked for an average of 14.34 (SD 7.01) years, and their average consultation fees were RMB 36.53 (US $5.10; SD 21.37).


Ethical Considerations

Crawling techniques were used to collect data about web-based consultation from a public domain data set visible to all platform users. The studied consultations were not conducted for research purposes. The crawling program followed the Robots Exclusion Protocol. The home pages of the concerned physicians provided only the publicly available information of the physician (such as name, designations, and hospitals) and only the patient consultation information (content, time, and price) that patients had agreed to make public. The patient names were automatically anonymized on the home pages. In the data set handling process, we took precautionary measures to guarantee data security. We also applied to the institutional review board of Shenzhen University for the ethical review of the research project and obtained due approval (202300004) for the study protocol. The institutional review board had waived the requirement to obtain informed consent for this study.

Dependent Variable

The payment received by individual physicians on a per-consultation basis was used to measure their economic returns. We used the log value of the returns in the empirical model. Notably, physicians could vary their consultation charges for different patients.

Independent Variables

The instrumental and affective communication features were estimated from the messages transmitted by physicians for each consultation instance. We used a software named Textmind [ 31 ] to extract and quantify the instrumental and affective linguistic features as ratio values between 0 and 1. Textmind is a simplified Chinese-language analysis program developed based on LIWC and Chinese LIWC by the Computational Cyber-Psychology Lab at the Institute of Psychology at the Chinese Academy of Sciences [ 31 ]. It delivers a 1-stop solution from word segmentation to language feature analysis and has been effectively used in multiple Chinese-language studies [ 21 , 22 ].

Moderating Variables

We classified the physician messages as textual and vocal communication by introducing an additional feature labeled as message media. Voice messages were assigned a value of 1, and textual messages were assigned a value of 0 [ 14 ].

Control Variables

We controlled for the designations and working years of physicians, levels and rankings of their hospitals, development levels of cities in which the hospitals are located, and disease types. Physicians are accorded 4 designations on the Dingxiang Doctor website: director, associate director, attending physician, and resident physician. Physicians who are assigned high designations are assumed to have more experience in the treatment of particular diseases. Hospital types include private, public, and 3A. The term 3A denotes tertiary hospitals equipped with more staff members and more patient beds (>501 beds) [ 40 ]. These hospitals are usually located in urban cities and provide high-level, specialized medical services. For example, a 3A hospital should set up at least 13 designated clinical departments and have >60 m² of space on each floor [ 40 ]. Hospital rankings may also influence patient expectations. According to Fudan's 2021 China hospital rankings [ 41 ], we classified institutions as ranked in the top 100 hospitals or not ranked in the top 100 hospitals. Disease types included malignant tumors, heart disease, digestive disease, and endocrine disease. The abovementioned control variables, apart from working years, were treated as categorical variables in the model estimation process and were transformed into dummy variables.

Table 1 presents the descriptive statistics of all the variables, and the correlations among variables are reported in Multimedia Appendix 2 . We used Spearman correlation instead of Pearson correlation because of the nonnormality of many LIWC indicators [ 25 ]. The correlations among variables ( Multimedia Appendix 2 ) provide the basis for further analysis of the proposed model.

Model Estimation

We first conducted multiple linear regression analyses using SPSS (version 22; IBM Corp) to test our hypotheses about the impact of instrumental and affective communications on the economic returns of physicians (log value). Table 2 reports the analysis results, which disclose that our independent variables and control variables explained 31.9% of the variance in the economic returns of physicians. The effect size (f²) is 0.468, and the regression model is significant at the 0.001 level. The variance inflation factor values ranged between 1.052 and 2.566, indicating that our study did not evince the problem of multicollinearity [ 42 ].

Table 2 notes: R² = 0.319; F(24) = 792.686 (P<.001); 2-tailed tests; VIF: variance inflation factor; N/A: not applicable.
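For readers who wish to reproduce this style of analysis, the sketch below fits an ordinary least squares model of log economic returns on the linguistic features and reports variance inflation factors; the column names and data file are hypothetical, as the study data are not public.

```python
# Sketch of the estimation described above: OLS regression of log economic
# returns on LIWC features (plus controls) with VIF diagnostics. The file
# and column names are hypothetical placeholders.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("consultations.csv")            # hypothetical data file
features = ["insight", "causation", "discrepancy", "tentative", "certainty",
            "posemo", "anxiety", "anger", "sadness", "working_years"]
X = sm.add_constant(df[features])                # controls would be added similarly
model = sm.OLS(df["log_return"], X).fit()
print(model.summary())

vifs = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vifs)                                      # multicollinearity check
```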

Direct Effect

The results presented in Table 2 indicate that instrumental communication using words indicating insight (B=0.248; P <.001), causation (B=0.155; P <.001), tentativeness (B=0.189; P <.001), and certainty (B=0.528; P <.001) was positively associated with the economic returns of physicians. Affective communication using language expressing positive emotion (B=0.283; P <.001) was also linked with high economic returns. In contrast, the use of terms conveying discrepancy (B=−0.294; P <.001) by physicians was negatively related to their economic returns. Surprisingly, the use of words related to anxiety (B=0.048; P =.57), anger (B=0.132; P =.60), and sadness (B=−0.228; P =.14) did not affect the economic returns of physicians.

We also conducted 2 robustness tests to evaluate the results’ stability. First, because the feature values are small, we scaled the independent variables up by 100-fold and performed the multiple regression analysis again, which would not change the model’s fitness [ 43 ]. Second, we conducted quantile regression analysis to provide a more detailed analysis of the relationships between the communication features and economic returns of physicians. The results are reported in Multimedia Appendices 3 and 4 , and they demonstrated the robustness of our multiple linear regression results.

Moderating Effect

On the basis of the regression results, we further tested the moderating effects of communication media (text vs voice) using PROCESS Model 1 in SPSS (version 22) [ 44 ]. Table 3 displays the results of the moderating effects of communication media on significant influencing paths presented in Table 2 . The findings reveal that communication media function significantly in moderating the influencing paths from the use of insight (B=0.169; P =.09), causation (B=0.403; P =.005), tentativeness (B=0.268; P <.001), and certainty (B=0.839; P <.001). However, communication media do not influence the use of words related to discrepancy (B=0.097; P =.21) and positive emotion (B=−0.159; P =.12). Figure 3 summarizes the results of the testing of our hypotheses.

Table 3 note: 2-tailed tests.


Figure 4 further illustrates the moderating effects. We found that the use of voice messages could magnify the impact of insight-related words (simple slope=0.535; t(40,551)=5.324; P<.001), causation-related terms (simple slope=0.713; t(40,551)=5.055; P<.001), tentativeness-related terms (simple slope=0.409; t(40,551)=4.967; P<.001), and certainty-related terms (simple slope=1.425; t(40,551)=7.434; P<.001) on the economic returns of physicians. However, the use of textual and vocal messages did not vary the effects of terms communicating discrepancy and positive emotion.
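The moderation test can be approximated with an interaction term in an ordinary regression, as sketched below; this mirrors the logic of PROCESS Model 1 but is not the macro itself, and the column names are the same hypothetical ones used earlier.

```python
# Sketch of the moderation analysis: a media x feature interaction term in
# an OLS model (conceptually similar to PROCESS Model 1). Column names are
# hypothetical; voice_media is coded 1 for voice and 0 for text messages.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("consultations.csv")            # hypothetical data file
model = smf.ols(
    "log_return ~ certainty * voice_media + working_years + C(designation)",
    data=df,
).fit()
print(model.params["certainty:voice_media"])     # the moderation coefficient
```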

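Simple slopes of this kind can be derived from an interaction model: the slope of a feature under a given medium equals the feature's main-effect coefficient plus the interaction coefficient multiplied by the medium code. A sketch under the same hypothetical naming as the previous block:

```python
# Simple slope of certainty for each medium, derived from the interaction model.
b_feature = mod.params["certainty"]
b_interaction = mod.params["certainty:voice_media"]

slope_text = b_feature                    # medium coded 0 (text)
slope_voice = b_feature + b_interaction   # medium coded 1 (voice)
print(f"slope (text)={slope_text:.3f}, slope (voice)={slope_voice:.3f}")

# Standard errors and t tests for these simple slopes would additionally
# require the coefficient covariance matrix, available via mod.cov_params().
```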

Communication Patterns

The regression results indicated that communication features exerted heterogeneous effects on the economic returns of physicians. In many cases, physicians offered both instrumental and affective responses to patients at different stages of the consultations. We therefore investigated the compounding effects of the discrete communication features to gain deeper insight into communication patterns. First, we transformed the physician responses into a sequence of messages according to the number of replies; the longest sequence comprised 16 messages. Second, we calculated the values of the 9 psychological linguistic features for each message in the sequence and constructed a data set of 9×16 dimensions for every consultation instance. For consultations with a sequence length of <16, we entered 0 values for the remaining dimensions because no information was offered after the last message. The data set thus encompassed the 9 communication features at different stages of the consultations. Third, we used the K-means clustering method to cluster the consultation instances into multiple groups corresponding to discrete communication patterns. We selected the optimal number of clusters by using ANOVA to compare the economic returns of physicians across the resulting groups. Tamhane's T2 test [ 45 ] was used for the post hoc comparisons because the group variances were not homogeneous. The analysis identified 4 clusters as the optimal number. We therefore obtained 4 groups of consultation instances associated with different levels of economic returns, as depicted in Multimedia Appendix 5 .
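
The sequence construction and clustering steps can be sketched as follows, assuming two hypothetical inputs: `consultation_sequences` (per-consultation lists of per-message arrays of 9 LIWC scores) and `economic_returns` (a NumPy array of log-transformed returns aligned with the sequences). Each consultation is zero-padded to a 9 × 16 matrix, flattened, clustered with K-means, and the clusters are compared with a one-way ANOVA; Tamhane's T2 post hoc test is not shown because it is not part of the standard SciPy or statsmodels toolboxes.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.stats import f_oneway

N_FEATURES, MAX_LEN = 9, 16

def to_padded_vector(message_features):
    """message_features: list of per-message arrays of 9 LIWC scores.
    Returns a flattened 9*16 vector, zero-padded after the last message."""
    mat = np.zeros((MAX_LEN, N_FEATURES))
    arr = np.asarray(message_features)[:MAX_LEN]
    mat[: len(arr)] = arr
    return mat.ravel()

X = np.vstack([to_padded_vector(seq) for seq in consultation_sequences])

# Cluster consultations into k groups of communication patterns
# (in practice, k would be varied and the best value selected).
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_

# Compare physicians' economic returns (log values) across the clusters.
groups = [economic_returns[labels == k] for k in range(4)]
f_stat, p_value = f_oneway(*groups)
print(f"F={f_stat:.3f}, P={p_value:.4f}")
```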

Table 4 presents the detailed results of Tamhane's T2 test [ 45 ]. The results show that the third group achieved the highest economic returns, whereas the second group received the lowest. The mean revenue from consultations in the third group was 12.2% higher than that of the first group, 21.2% higher than that of the second group, and 14.6% higher than that of the fourth group.

a The values in the table represent the difference (%) between the vertical and horizontal category labels.

b F 3 =87.971 ( P <.001).

c N/A: not applicable.

To further elucidate the communication patterns of each group, we divided the 9 linguistic features into 3 dimensions according to the regression results: instrumental interactions, affective communication using positive emotions, and affective communication using negative emotions. We then visualized the mean communication feature values for each group, as displayed in Multimedia Appendix 5 . The sequence length was set to 16, corresponding to the longest sequence of physician messages. Multimedia Appendix 6 presents the visualization results for all 9 linguistic features.
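
One possible way to produce such group-level trajectories, reusing the padded matrix `X` and the cluster `labels` from the previous sketch (all names remain hypothetical), is to average the padded feature matrices within each cluster and plot one feature per dimension:

```python
import matplotlib.pyplot as plt

# Mean feature trajectory per cluster: reshape back to (16 messages, 9 features).
for k in range(4):
    mean_traj = X[labels == k].mean(axis=0).reshape(MAX_LEN, N_FEATURES)
    plt.plot(mean_traj[:, 0], label=f"group {k + 1}")  # e.g., column 0 = insight
plt.xlabel("Message position in consultation")
plt.ylabel("Mean feature value")
plt.legend()
plt.show()
```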

Figure 5 shows that consultations in the third group tended to include longer sequences than those in the other groups, indicating that physicians in the third group replied more frequently. These physicians also used more terms related to positive emotions at the beginning of their consultations and later offered more instrumental suggestions. In contrast, physicians in the second group used more terms related to both positive and negative emotions, offered instrumental suggestions in their second reply, and then tended to end the consultation with a short response. Surprisingly, physicians in the fourth group used few emotion-related terms and provided a moderate amount of instrumental suggestions at the beginning of their consultations, and their economic returns were slightly higher than those of physicians in the second group.


Principal Findings

This study aimed to investigate, at a more granular level, the associations between communication behaviors (instrumental and affective) and the economic returns of physicians. We also explored the moderating effects of communication media (text vs voice). Our results indicate that the use of words indicating insight, causation, tentativeness, and certainty, as well as words expressing positive emotion, was positively associated with the economic returns of physicians. In contrast, the use of terms conveying discrepancy was negatively related to physicians' economic returns. The use of voice media by physicians intensified the impact of terms related to insight, causation, tentativeness, and certainty. The pattern analysis indicates that physicians who responded to patients more frequently, communicated positive emotions at the beginning of the consultations, and provided more instrumental suggestions afterward achieved the highest economic returns.

Our findings align with the results reported in previous studies [ 2 , 6 , 7 , 14 , 46 ] and offer additional feature-level details. The provision of logical and assured explanations helps reduce uncertainty in patients, who gain knowledge about the causes and consequences of their diseases. Specificity and clarity in expression and information delivery also improve communication quality [ 47 ]. It is therefore reasonable that physicians with superior instrumental communication skills achieve high economic returns. Moreover, patients may hold greater expectations of physicians with high consultation fees [ 6 ], which could drive physicians to devote more cognitive effort before providing answers. However, the increased use of discrepancy-related words (eg, should and would) could suggest that physicians notice divergences between their suggestions and patient behaviors and highlight them during conversations; in doing so, they may unintentionally send signals of dissatisfaction to patients. Hence, instrumental communication using discrepancy-related words was negatively associated with the economic returns of physicians.

Our results also indicate that affective communication containing more terms related to positive emotions (eg, happy, love, and nice) was positively linked with higher economic returns of physicians. This result is congruent with previous findings that enhancing service quality requires the delivery of emotional support to patients [ 8 , 11 , 33 ], which is also a vital patient-centered communication skill [ 48 ]. Patients' psychological needs for empathy and care are satisfied when physicians demonstrate compassion and encouragement [ 7 ]. Consequently, physicians with better affective communication skills tend to receive higher economic returns. However, the articulation of negative emotions such as anger, anxiety, and sadness was not related to the economic returns attained by physicians. Perhaps the use of words conveying anger, anxiety, and sadness could send negative signals to patients and undermine their confidence in their physicians, even though such terms could also transmit empathy. Physicians are therefore advised to express empathy using language that conveys understanding, respect, and support [ 47 ].

The moderating effects analysis revealed that the choice of text or voice media for communication can moderate the influence exerted by certain linguistic features. The media synchronicity theory posits that media supporting high synchronicity and multiplicity of cues are better suited to complex communication [ 37 ]. Explaining the causes of diseases or syndromes during web-based consultations may be regarded as an information transmission process with high requirements for media synchronicity. Therefore, such messages are more influential when delivered through voice media, which are characterized by high synchronicity [ 38 ]. These findings suggest that physicians who make judicious use of voice media to communicate with patients tend to obtain higher economic returns.

Our pattern analysis results unveiled the compounding effects of multiple communication features. Providing positive emotional support at the beginning of the consultation can fulfill the psychological needs of patients before satisfying their knowledge requirements [ 47 ]. This process enhances the outcomes of instrumental communication because patients may find it difficult to comprehend disease knowledge and treatment suggestions if they are emotionally preoccupied. Therefore, consultations that first delivered positive emotional support and later supplied instrumental comments yielded high levels of economic returns. This finding partially aligns with the previous finding that interaction frequency positively affects patient satisfaction [ 19 ]. In contrast, consultations that offered emotional support and instrumental comments simultaneously in a limited number of replies yielded low levels of economic returns. These findings imply that physicians who address the emotional needs of patients before offering professional advice are more likely to obtain high economic returns.

Limitations and Future Directions

Despite the contributions of our study, a few limitations must be acknowledged. First, our data were collected from the Dingxiang Doctor website in China; generalizing our findings to other countries would require data from platforms in those countries. Second, although our data contain a relatively large set of consultation cases, future work should collect data sets spanning longer time periods and construct a panel data set to further verify the robustness of our findings. Third, because of limited access to patients' personal information, we did not include patient characteristics such as education level in our conceptual model. Future studies that investigate the impact of patients' preferences could contribute novel insights into this research issue. Finally, our study could be extended by relating the communication features of physicians to their medical domain knowledge to provide deeper insight into the service quality of web-based health care consultations.

Conclusions

This study demonstrates that the economic returns of physicians are associated with their communication features and the media used for web-based health care consultations. By adopting a psychological and linguistic perspective, the study offers a methodological reference for future research on web-based physician-patient interactions. Moreover, it supplements the limited literature on the economic returns received by physicians through web-based health care platforms. The findings provide practical directions for improving the quality of web-based consultation services provided by physicians.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (71901150 and 72334004), Guangdong Basic and Applied Basic Research Foundation (2022A1515012077 and 2023A1515012515), Guangdong Philosophy and Social Science Planning Project (GD23XGL113), and Guangdong Province Innovation Team (2021WCXTD002).

Conflicts of Interest

None declared.

Linguistic features related to cognitive and affective processes.

Pair-wise correlations (Spearman correlation).

Regression results (independent variables×100).

Results of quantile regression analysis.

Levels of economic returns across the 4 groups.

Distribution of communication features.

  • Lu X, Zhang R. Association between eHealth literacy in online health communities and patient adherence: cross-sectional questionnaire study. J Med Internet Res. Sep 13, 2021;23(9):e14908. [ https://www.jmir.org/2021/9/e14908/ ] [ CrossRef ] [ Medline ]
  • Guo S, Guo X, Fang Y, Vogel D. How doctors gain social and economic returns in online health-care communities: a professional capital perspective. J Manag Inf Syst. 2017;34(2):487-519. [ CrossRef ]
  • The cyberspace administration of China released the "digital China development report (2022)". Cyberspace Administration of China. May 23, 2023. URL: http://www.cac.gov.cn/2023-05/22/c_1686402318492248.htm [accessed 2023-06-18]
  • Dingxiang Doctor. URL: https://dxy.com/ [accessed 2023-06-18]
  • Haodf. URL: https://www.haodf.com/ [accessed 2023-06-18]
  • Chiu YL, Wang JN, Yu H, Hsu YT. Consultation pricing of the online health care service in China: hierarchical linear regression approach. J Med Internet Res. Jul 14, 2021;23(7):e29170. [ CrossRef ] [ Medline ]
  • Roongruangsee R, Patterson P, Ngo LV. Professionals’ interpersonal communications style: does it matter in building client psychological comfort? J Serv Mark. 2022;36(3):379-397. [ CrossRef ]
  • Atanasova S, Kamin T, Petrič G. The benefits and challenges of online professional-patient interaction: comparing views between users and health professional moderators in an online health community. Comput Hum Behav. Jun 2018;83:106-118. [ CrossRef ]
  • Cao B, Huang W, Chao N, Yang G, Luo N. Patient activeness during online medical consultation in China: multilevel analysis. J Med Internet Res. May 27, 2022;24(5):e35557. [ https://www.jmir.org/2022/5/e35557/ ] [ CrossRef ] [ Medline ]
  • Wu D, Lowry PB, Zhang D, Tao Y. Patient trust in physicians matters-understanding the role of a mobile patient education system and patient-physician communication in improving patient adherence behavior: field study. J Med Internet Res. Dec 20, 2022;24(12):e42941. [ https://www.jmir.org/2022/12/e42941/ ] [ CrossRef ] [ Medline ]
  • Ben-Sira Z. Affective and instrumental components in the physician-patient relationship: an additional dimension of interaction theory. J Health Soc Behav. Jun 1980;21(2):170-180. [ Medline ]
  • Ben-Sira Z. The function of the professional's affective behavior in client satisfaction: a revised approach to social interaction theory. J Health Soc Behav. Mar 1976;17(1):3-11. [ Medline ]
  • Webster C, Sundaram DS. Effect of service provider's communication style on customer satisfaction in professional services setting: the moderating role of criticality and service nature. J Serv Mark. 2009;23(2):103-113. [ CrossRef ]
  • Tan H, Yan M. Physician-user interaction and users' perceived service quality: evidence from Chinese mobile healthcare consultation. Inf Technol People. 2020;33(5):1403-1426. [ CrossRef ]
  • Vermeir P, Vandijck D, Degroote S, Peleman R, Verhaeghe R, Mortier E, et al. Communication in healthcare: a narrative review of the literature and practical recommendations. Int J Clin Pract. Nov 2015;69(11):1257-1267. [ https://europepmc.org/abstract/MED/26147310 ] [ CrossRef ] [ Medline ]
  • Lee D. HEALTHQUAL: a multi-item scale for assessing healthcare service quality. Serv Bus. Jun 13, 2016;11:491-516. [ CrossRef ]
  • Izumi N, Matsuo T, Matsukawa Y. Associations among physician-patient communication, patient satisfaction, and clinical effectiveness of overactive bladder medication: a survey of patients with overactive bladder. J Clin Med. Jul 14, 2022;11(14):4087. [ https://www.mdpi.com/resolver?pii=jcm11144087 ] [ CrossRef ] [ Medline ]
  • Chen L, Tang H, Guo Y. Effect of patient-centered communication on physician-patient conflicts from the physicians' perspective: a moderated mediation model. J Health Commun. Mar 04, 2022;27(3):164-172. [ CrossRef ] [ Medline ]
  • Yang H, Guo X, Wu T. Exploring the influence of the online physician service delivery process on patient satisfaction. Decis Support Syst. Oct 2015;78:113-121. [ CrossRef ]
  • Yang Y, Zhang X, Lee PK. Improving the effectiveness of online healthcare platforms: an empirical study with multi-period patient-doctor consultation data. Int J Prod Econ. Jan 2019;207:70-80. [ CrossRef ]
  • Yuan C, Hong Y, Wu J. Personality expression and recognition in Chinese language usage. User Model User Adap Inter. Aug 25, 2020;31:121-147. [ CrossRef ]
  • Guan L, Hao B, Cheng Q, Yip PS, Zhu T. Identifying Chinese microblog users with high suicide probability using internet-based profile and linguistic features: classification model. JMIR Ment Health. May 12, 2015;2(2):e17. [ https://mental.jmir.org/2015/2/e17/ ] [ CrossRef ] [ Medline ]
  • Andy A, Andy U. Understanding communication in an online cancer forum: content analysis study. JMIR Cancer. Sep 07, 2021;7(3):e29555. [ https://cancer.jmir.org/2021/3/e29555/ ] [ CrossRef ] [ Medline ]
  • Donnellan WJ, Warren JG. Emotional word use in informal carers of people living with dementia: linguistic analysis of online discussion forums. JMIR Aging. Jun 17, 2022;5(2):e32603. [ https://aging.jmir.org/2022/2/e32603/ ] [ CrossRef ] [ Medline ]
  • Marengo D, Azucar D, Longobardi C, Settanni M. Mining Facebook data for quality of life assessment. Behav Inf Technol. 2021;40(6):597-607. [ CrossRef ]
  • Jha D, Singh R. Analysis of associations between emotions and activities of drug users and their addiction recovery tendencies from social media posts using structural equation modeling. BMC Bioinformatics. Dec 30, 2020;21(Suppl 18):554. [ https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03893-9 ] [ CrossRef ] [ Medline ]
  • Liu Y, Ren C, Shi D, Li K, Zhang X. Evaluating the social value of online health information for third-party patients: is uncertainty always bad? Inf Process Manag. Sep 2020;57(5):102259. [ CrossRef ]
  • Seraj S, Blackburn KG, Pennebaker JW. Language left behind on social media exposes the emotional and cognitive costs of a romantic breakup. Proc Natl Acad Sci U S A. Mar 16, 2021;118(7):e2017154118. [ https://europepmc.org/abstract/MED/33526594 ] [ CrossRef ] [ Medline ]
  • Meggiolaro E, Berardi MA, Andritsch E, Nanni MG, Sirgo A, Samorì E, et al. Cancer patients' emotional distress, coping styles and perception of doctor-patient interaction in European cancer settings. Palliat Support Care. Jun 2016;14(3):204-211. [ CrossRef ] [ Medline ]
  • Geng S, Niu B, Feng Y, Huang M. Understanding the focal points and sentiment of learners in MOOC reviews: a machine learning and SC‐LIWC‐based approach. Br J Educ Technol. Sep 2020;51(5):1785-1803. [ CrossRef ]
  • Gao R, Hao B, Li H, Gao Y, Zhu T. Developing simplified Chinese psychological linguistic analysis dictionary for microblog. In: Proceedings of the International Conference, BHI 2013. Presented at: BHI 2013; October 29-31, 2013; Maebashi, Japan. [ CrossRef ]
  • Huang CL, Chung CK, Hui N, Lin YC, Seih YT, Lam BC, et al. The development of the Chinese linguistic inquiry and word count dictionary. Chin J Psychol. 2012;54(2):185-201.
  • Chen S, Guo X, Wu T, Ju X. Exploring the online doctor-patient interaction on patient satisfaction based on text mining and empirical analysis. Inf Process Manag. Sep 2020;57(5):102253. [ CrossRef ]
  • Kelly KM, Ellington L, Schoenberg N, Agarwal P, Jackson T, Dickinson S, et al. Linking genetic counseling content to short-term outcomes in individuals at elevated breast cancer risk. J Genet Couns. Oct 2014;23(5):838-848. [ https://europepmc.org/abstract/MED/24671341 ] [ CrossRef ] [ Medline ]
  • Soós MJ, Coulson NS, Davies EB. Exploring social support in an online support community for tourette syndrome and tic disorders: analysis of postings. J Med Internet Res. Oct 04, 2022;24(10):e34403. [ https://www.jmir.org/2022/10/e34403/ ] [ CrossRef ] [ Medline ]
  • Keating DM. Spirituality and support: a descriptive analysis of online social support for depression. J Relig Health. Sep 2013;52(3):1014-1028. [ CrossRef ] [ Medline ]
  • Dennis AR, Fuller RM, Valacich JS. Media, tasks, and communication processes: a theory of media synchronicity. MIS Q. Sep 2008;32(3):575-600. [ CrossRef ]
  • Liu S, Zhang M, Gao B, Jiang G. Physician voice characteristics and patient satisfaction in online health consultation. Inf Manag. Jul 2020;57(5):103233. [ CrossRef ]
  • China statistical yearbook 2021. Institute for Health Metrics and Evaluation. URL: https://ghdx.healthdata.org/record/china-statistical-yearbook-2021 [accessed 2022-12-25]
  • Yifa GW. Notice of the national health commission on issuing the accreditation standards for tertiary hospitals (2020 edition). China Government Website. Dec 21, 2022. URL: http://www.gov.cn/zhengce/zhengceku/2020-12/28/content_5574274.htm [accessed 2023-02-01]
  • China’s hospital and specialty reputation rankings. CN-Healthcare. URL: http://rank.cn-healthcare.com/fudan/national-general [accessed 2023-02-01]
  • Petter S, Straub D, Rai A. Specifying formative constructs in information systems research. MIS Q. Dec 2007;31(4):623-656. [ CrossRef ]
  • Wooldridge JM. Introductory Econometrics: A Modern Approach. Mason, OH. South-Western Cengage Learning; 2012.
  • Hayes AF. Introduction to Mediation, Moderation, and Conditional Process Analysis: A Regression-Based Approach. New York, NY. The Guilford Press; 2013.
  • Tamhane AC. Multiple comparisons in model I one-way ANOVA with unequal variances. Commun Stat Theory Methods. 1977;6(1):15-32. [ CrossRef ]
  • Li J, Zheng H, Duan X. Factors influencing the popularity of a health-related answer on a Chinese question-and-answer website: case study. J Med Internet Res. Sep 28, 2021;23(9):e29885. [ https://www.jmir.org/2021/9/e29885/ ] [ CrossRef ] [ Medline ]
  • Hashim MJ. Patient-centered communication: basic skills. Am Fam Physician. Jan 01, 2017;95(1):29-34. [ https://www.aafp.org/link_out?pmid=28075109 ] [ Medline ]
  • Jiang S. How does patient-centered communication improve emotional health? An exploratory study in China. Asian J Commun. 2018;28(3):298-314. [ CrossRef ]

Abbreviations

Edited by G Eysenbach; submitted 21.09.22; peer-reviewed by W Zhang, E Nichele, Y Yang; comments to author 23.11.22; revised version received 24.02.23; accepted 17.11.23; published 11.01.24

©Shuang Geng, Yuqin He, Liezhen Duan, Chen Yang, Xusheng Wu, Gemin Liang, Ben Niu. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 11.01.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
