A Survey: Identification of Throat Cancer by Machine Learning

R. Akshara1, T. P. Latchoumi2,*

1Research Scholar, Department of Computer Science and Engineering, VFSTR (Deemed to be University), Guntur, India.

2Assistant Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Tamilnadu, India.

*Corresponding Author: [email protected], +918754830690

Abstract: The application of Machine Learning (ML) algorithms in medicine is well established for analyzing the attributes that strongly influence the onset of illness. Cancer is one of the human diseases for which researchers are still struggling to find a complete cure, and its course is unpredictable. Cancer is a heterogeneous disease; its treatment varies from one type to another and can involve different phases. Throat cancer is a tumor that develops in the voice box (larynx), tonsils, or throat (pharynx). Diagnosing throat cancer at an early stage and starting proper treatment is strongly recommended. Deep Learning (DL) image processing techniques and ML techniques, specifically supervised classification algorithms, are used to predict throat cancer effectively. This paper reviews the ML- and DL-based research undertaken to classify throat cancer.

Keywords: Deep Learning, Image Processing, Machine Learning, Supervised Classification Algorithms, Throat Cancer.

I. Introduction

Cancer is a deadly disease in which abnormal cells grow and spread uncontrollably, destroying tissue in the body [1]. It is a complex disease whose symptoms and treatment vary between specific forms of cancer, and treatment can involve modalities such as chemotherapy, radiation, and surgery. There are more than 100 separate types of cancer, which fall under four groups, Carcinoma, Sarcoma, Leukemia, and Lymphoma, depending on where the cancer begins. Throat cancer has two basic types: squamous cell carcinoma (frequent in the US and affecting the flat cells lining the throat) and adenocarcinoma (uncommon, affecting glandular cells) [2]. Throat cancer is divided into two groups, pharyngeal and laryngeal cancer. Pharyngeal cancer develops in the pharynx, a muscular tube that connects the oral and nasal cavities with the larynx and esophagus. It includes cancers of the nasopharynx, oropharynx, and hypopharynx. Laryngeal cancer develops in the larynx, the voice box, and is also associated with certain types of Human Papillomavirus (HPV), a sexually transmitted infection that can affect many parts of the body. Beyond its own risk factors, throat cancer is associated with other cancers, such as lung or bladder cancer [3].

Laryngoscopy, which provides a closer view of the throat, is the first procedure performed when throat cancer is suspected. If any abnormalities are found, a biopsy, such as a conventional Fine Needle Aspiration (FNA) biopsy or an endoscopic biopsy, is done as suggested by specialists [4]. If cancerous cells are detected, additional imaging of the head, neck, and chest is done to determine the stage of the disease, which ranges from 0 to 4. Early diagnosis is associated with a high recovery rate; the disease becomes difficult to treat once malignant cells spread to other parts of the throat [5].


Cancers of the larynx, oropharynx, and hypopharynx are the 21st, 24th, and 25th most common tumors in the world, respectively. South-Central Asia has a heightened risk of pharynx cancers due to exposure to risk factors [6]. According to current throat cancer statistics for India, 76,400 per 200,000 people are affected per year. Detecting the illness at an early stage is strongly recommended because the prospect of a cure is unpredictable. Extensive research into all aspects of cancer has produced large volumes of data. Image processing, DL, and ML play a key role in recognizing throat cancer [7].

II. Related Work

Diagnostic imaging is a tool by which the different parts of the body are captured and visualized [8]. Worldwide, a large number of diagnostic imaging procedures are performed every week, which increases the use of image processing techniques. Many image processing techniques, such as segmentation and texture analysis, are used for cancer diagnosis. The objective of segmentation is to partition an image into meaningful regions, using either local or global segmentation [9]. The objective of texture analysis is to maximize the information obtainable from diagnostic images. Image processing plays a crucial role in improving diagnostic performance [10].

ML enables a system to learn and improve its performance with almost no human effort. Applying ML to data involves collecting, examining, preparing, and training on the data, as well as evaluating and improving model performance [11]. Depending on the learning paradigm, ML is divided into three categories: reinforcement, unsupervised, and supervised learning. Supervised learning performs effectively in the detection of several diseases. It takes a known (labeled and categorized) dataset as input, with the corresponding responses as output, and trains a model to predict responses for new data. The model can then be adjusted, depending on the difference between the result obtained and the expected output [12].
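The supervised learning loop described above (fit on labeled data, predict on new data, compare with the expected output) can be sketched as follows. This is a minimal illustration only: the nearest-centroid classifier, the two-feature vectors, and the labels are assumptions chosen for brevity and are not taken from any of the surveyed studies.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Return {label: centroid} computed from labeled feature vectors."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical, made-up training data: two features per sample.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = ["normal", "normal", "abnormal", "abnormal"]
model = fit_centroids(train_x, train_y)

# Held-out samples with known labels, used to compare output and expectation.
test_x = [[0.15, 0.15], [0.85, 0.85]]
test_y = ["normal", "abnormal"]
predictions = [predict(model, x) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

If the accuracy on held-out data were unsatisfactory, the adjustment step would consist of changing the features or the classifier and repeating the comparison.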

The essential aim of diagnostic imaging is to detect abnormalities, and DL is an efficient tool for in-depth diagnosis. Medical applications of DL offer solutions for a wide range of problems, from diagnosis to personalized treatment counseling. Throat cancer is a deadly disease that spreads to other parts of the body, and a complete cure cannot be guaranteed. It is therefore important to detect the cancer at an early stage, a task for which image processing, ML, and DL are well suited [13].

From 2008 to 2018, videos in an interleaved video format were analyzed to detect abnormalities in the radix linguae (tongue root), epiglottis, glottis, and hypopharynx, with the aim of improving the diagnosis of early-stage larynx cancer. This work focused on color space analysis, image enhancement, threshold selection, and region-based and dynamic information-based segmentation. In 2019, a new detection system was developed that applies various image processing techniques to the structures found in the larynx to overcome the remaining obstacles. It was evaluated on video laryngoscope recordings provided by the Tri-Service General Hospital in Taiwan, which included 359 specimens of diagnosed throat cancer. It obtained 97 percent accuracy with high-speed detection [14].

A diagnostic framework based on a Deep Convolutional Neural Network (DCNN) has been developed to detect laryngeal cancer automatically from laryngoscopic images. The data used in this research were provided by five tertiary hospitals in China. In a comparison with seasoned human specialists with 10-20 years of work experience, it achieved an accuracy of 0.77 [15]. Another model improves the performance of radiomics for head and neck cancer by applying DL techniques to data obtained from The Cancer Imaging Archive (TCIA). In this model, a Convolutional Neural Network (CNN) used pre-treatment tomography images of 194 patients for training and of 106 patients for testing. The model achieved an Area Under the Curve (AUC) of 0.88, compared with a traditional radiomic framework, and the AUC improved to 0.92 when the model was combined with the previous model [16].
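Because the AUC is the headline metric here, a short sketch of how it can be computed may help. The function below uses the rank-based definition: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are made-up illustrative values, not data from the cited study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = event, 0 = no event) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

The pairwise loop is quadratic in the number of samples; it is used here for clarity rather than efficiency.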

A DL framework for detecting head and neck cancer was implemented with H2O using CT images taken from TCIA. A total of 26,019 images were collected and preprocessed with Wiener filtering in MATLAB R2017a to enhance the image features. Fuzzy C-means clustering is then used to partition the images for analysis, and the Gray Level Co-occurrence Matrix (GLCM) is used to extract the necessary features. The H2O DL method is then applied to the extracted features, with 80 percent of the dataset used for training and 20 percent for testing. RStudio 3.4.4 is used for classification. The confusion matrix and experimental results report 99.5 percent accuracy, along with figures of 98.9 percent accuracy, 98.3 percent specificity, and 99.8 percent overall accuracy [17].
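To make the GLCM feature-extraction step concrete, the sketch below builds a GLCM (commonly expanded as gray-level co-occurrence matrix) for one pixel offset and derives the contrast feature from it. The 4x4 image with four gray levels is a toy input, and the choice of a horizontal offset and of contrast as the feature is an assumption for illustration; the cited study does not state which offsets or features it used.

```python
def glcm(image, levels, d_row=0, d_col=1):
    """Count how often gray level i has gray level j at offset (d_row, d_col)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast feature: sum of p(i, j) * (i - j)^2 over the normalized matrix."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        count * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total

# Toy 4x4 image with gray levels 0..3 (illustrative values only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
cooccurrence = glcm(image, levels=4)   # horizontal neighbor, distance 1
texture_contrast = contrast(cooccurrence)
print(cooccurrence, texture_contrast)
```

A feature vector built from several such statistics, computed per image, is what would then be passed to a classifier.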

Automated detection of potentially precancerous lesions was designed based on the theory of human voice production, since early detection of lesions in the vocal folds plays a crucial role in detecting laryngeal cancer at an early stage. The technique is non-invasive and relies on capturing the vocal fold waveform from recorded utterances. First, Iterative Adaptive Inverse Filtering (IAIF) is applied to continuous speech from two databases, the Massachusetts Eye and Ear Infirmary (MEEI) database and the Saarbruecken Voice Database (SVD), to capture the vocal fold waveform. Relevant features are then generated from the glottal flow pulses, and the features are analyzed and selected using statistical tools such as boxplots and Principal Component Analysis (PCA). A Support Vector Machine (SVM) is then applied to distinguish between normal and premalignant cases. Approximately 92 percent accuracy is achieved when PCA is combined with the selected features, which helps in the early detection of laryngeal cancer [18].

Digital image processing techniques have been developed in MATLAB to detect and identify affected cancer cells in the oral region. First, the Magnetic Resonance Image (MRI) is preprocessed to remove salt-and-pepper noise. Second, the Firefly feature extraction algorithm is applied, which allows simple visualization of cancerous tumors, and the cancer cells are then identified by the Expectation-Maximization (EM) algorithm. The approach offers an efficient way of initiating care, as it reliably measures the total clustered pixels and determines the percentage of affected cancer cells [19].

A model for laryngeal tumor detection using endoscopic images was developed. First, endoscopic Narrow-Band Imaging (NBI) images are pre-processed with a bilateral filter to remove noise. Small vessels affected by the denoising step are restored by anisotropic diffusion. The blood vessels are then segmented using a Matched Filter (MF), which also produces a high number of false positives that must be eliminated. For this purpose, an MF based on the First-Order Derivative of Gaussian (FODG) was used, which selects the MF results according to a FODG threshold. Lesion classification is then carried out on the basis of a statistical study of blood vessel characteristics such as thickness, tortuosity, and density. The algorithm was applied to 50 NBI endoscopic laryngeal images and achieved an overall classification accuracy of 84.3 percent [20].

A model was developed to classify vocal fold disorders by examining the blood vessels that appear on the top surface of the vocal folds, as well as the shape of the vocal folds, using image processing and ML techniques. First, the vocal folds were identified in video laryngoscope images using a Histogram of Oriented Gradients (HOG) descriptor. Then, the structure of the vocal fold edges was examined, and a new vessel centerline extraction technique specialized for the vascular structure of the vocal folds was created. The extracted vessel centerlines were then evaluated to obtain the vascular properties. Finally, classification was performed using a binary decision tree architecture. The model was tested on laryngeal images from 70 patients and had a sensitivity of 86%, 94%, 80%, 73%, and 76% for the healthy, polyp, nodule, laryngitis, and sulcus vocalis classes, respectively. The results suggested that vocal fold vessels, together with features of vocal fold shape, are well suited to detecting vocal fold pathologies and play a crucial role in effective diagnosis [21].
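Per-class sensitivity figures such as those above are derived from a multi-class confusion matrix: for each class, the correctly classified cases are divided by all cases that truly belong to that class. The sketch below shows the calculation; the three class names are borrowed from the text, but the counts are invented for illustration and do not reproduce the cited results.

```python
def per_class_sensitivity(confusion, class_names):
    """Sensitivity (recall) per class; rows are true classes, columns are predictions."""
    result = {}
    for i, name in enumerate(class_names):
        actual = sum(confusion[i])            # all samples whose true class is i
        result[name] = confusion[i][i] / actual if actual else float("nan")
    return result

# Hypothetical confusion matrix for three classes (made-up counts).
class_names = ["healthy", "polyp", "nodule"]
confusion = [
    [8, 1, 1],   # true healthy
    [2, 6, 2],   # true polyp
    [0, 1, 9],   # true nodule
]
sensitivity = per_class_sensitivity(confusion, class_names)
print(sensitivity)  # {'healthy': 0.8, 'polyp': 0.6, 'nodule': 0.9}
```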

A model combining feature selection and ML was proposed to predict oral cancer, using 31 oral cancer records from the Malaysian Oral Cancer Database and Tissue Bank System (MOCDTBS). In the first phase, five feature selection methods were examined on the dataset. In the second phase, four classifiers, namely Adaptive Neuro-Fuzzy Inference System (ANFIS), Artificial Neural Network (ANN), SVM, and Logistic Regression, were applied to the features selected by each method. Since the sample size is small, k-fold cross-validation was used for all classifiers [22].
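With a sample as small as 31 records, k-fold cross-validation lets every record serve as test data exactly once. The sketch below shows how the index sets for each fold can be generated. The value k = 5 is an assumption for illustration (the text does not state the k that was used), and the split is sequential; in practice the indices would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(n_samples=31, k=5)   # k = 5 is an assumed value
for train, test in folds:
    print(len(train), len(test))
```

A classifier would be trained on each `train` set and scored on the matching `test` set, and the k scores would then be averaged.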

A model for the efficient detection of Oral Squamous Cell Carcinoma (OSCC) was developed using an SVM that analyzed a total of 34 clinical and molecular variables from 69 OSCC patients. Its classification capacity was assessed, and the most important prognostic factors were identified as the number of recurrences and the Tumor, Nodes, Metastases (TNM) stage. It achieved an overall accuracy of 98.55% [23-26].

A larynx cancer prediction system was proposed and implemented in two stages: first, transforming and selecting data; second, predicting larynx cancer from information on human Laryngeal Carcinoma (LaCa) with a collection of classifiers, using databases from the Centro Medico Nacional Siglo XXI. With this system, the authors found that an increase in the Cellular Retinol Binding Protein-1 (CRBP-1) gene correlated with patient survival rate. They implemented a Hybrid Classifier of Decision Rules (HCDR) that achieved 90% prediction accuracy using CRBP-1 genes, indicating a high degree of reliability for the Prediction System of Larynx Cancer (PSLC) [27-31].

For malignant tumors in the human larynx, a soft computing-based non-invasive screening method was proposed that uses questionnaire data and digital voice recordings, applying a non-linear mapping with the property of local data ordering for data exploration and classification. The screening aims to classify the data into healthy, cancerous, and non-cancerous classes. Experimental investigations have shown that the questionnaire data carry more discriminative information than the voice signal. For questionnaire data obtained from 240 subjects, an accuracy of 92% was achieved, which can be further improved by adding more questionnaires.

III. Methods used for each region

Epiglottis: First, a median filter removes noise. Then, Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied to even out image brightness. Finally, an Active Contour Model (ACM) is used to obtain the complete epiglottis region, starting from a seed point on which the ACM depends.
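The noise-removal filtering step named first in this pipeline can be sketched as a 3x3 median filter, which replaces each pixel with the median of its neighborhood and thereby suppresses isolated outliers. The window size and the edge handling (replicating border pixels) are assumptions for illustration; the surveyed work does not specify them.

```python
from statistics import median

def median_filter_3x3(image):
    """Apply a 3x3 median filter; border pixels are handled by edge replication."""
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        # Clamp coordinates so that out-of-range neighbors reuse the nearest edge pixel.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return image[r][c]

    return [
        [
            median(pixel(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Toy image: uniform background with one bright noise pixel in the center.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
denoised = median_filter_3x3(noisy)
print(denoised)  # the outlier is replaced by the neighborhood median
```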

Glottis: First, Otsu's method is used to segment the image. After the bright texture of most tissues has been retained, a negative conversion is applied to obtain the edge of the darker region, and the result is combined with the negative slice to obtain the glottal image. A glottal mask is then used to cover and connect the glottis to avoid an incomplete glottal area. Finally, morphological operations are used to smooth the texture.
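Otsu's method, used in the first step here, picks the gray-level threshold that maximizes the between-class variance of the two resulting pixel groups. The sketch below is a minimal histogram-based version; the eight pixel values are made up to represent a dark region and a bright region and are not taken from laryngoscopic data.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (class 0: value <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of gray levels in class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t

# Hypothetical pixel values: a dark cluster and a bright cluster.
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
threshold = otsu_threshold(pixels)
bright_mask = [p > threshold for p in pixels]
print(threshold, bright_mask)
```

Inverting `bright_mask` would give the dark region, which corresponds to the negative conversion mentioned in the text.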


Radix linguae: Edges are detected with the Sobel operator to obtain a gradient image. The watershed algorithm is then applied to extract the details of the gradient image, and the ACM is used to obtain the complete tumor appearance. A reflection mask is applied to overcome reflection effects, and finally the tongue root tumor region is obtained.
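The Sobel step can be sketched as follows: two 3x3 kernels estimate the horizontal and vertical intensity derivatives, and their magnitude gives the edge strength at each pixel. The 4x4 input with a single vertical step edge is a toy example, and leaving the border pixels at zero is a simplification assumed here.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical intensity change

def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0.0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(
                SOBEL_X[i][j] * image[r + i - 1][c + j - 1]
                for i in range(3) for j in range(3)
            )
            gy = sum(
                SOBEL_Y[i][j] * image[r + i - 1][c + j - 1]
                for i in range(3) for j in range(3)
            )
            out[r][c] = math.hypot(gx, gy)
    return out

# Toy image with a vertical step edge between columns 1 and 2.
step = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
edges = sobel_magnitude(step)
print(edges[1])  # [0.0, 40.0, 40.0, 0.0]
```

A watershed or contour-based method would then operate on this gradient image rather than on the raw intensities.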

Hypopharynx and Epiglottis: The L*a*b* color space is used as the principal color space for tumor segmentation. Morphological closing and filling are then applied to repair texture defects in the tumor area. Finally, the ACM is used to obtain the complete appearance of the tumor.

Vocal: Fuzzy C-Means (FCM) was used to obtain the contours of the glottis and vocal cord. The glottis mask was then used to restrict the region around the glottis. Finally, the complete glottal and vocal cord contours were obtained after morphological opening and filling operations.

IV. Analysis

Group 2 (clinicopathologic and genomic variables) outperformed Group 1 (clinicopathologic variables only) and gave more accurate diagnostic results. With the Group 2 ANFIS classification model, the Kruskal-Wallis test showed a significant difference between the Relief F-GA 3-input model and the GA, CC, Relief F, and CC 3-input models. Drinking, invasion, and p63 were identified as the best markers for the diagnosis of oral cancer. Among the DL approaches, GLCM achieved the best accuracy of all classifiers, as shown in Table 1. A model with fewer inputs achieved higher accuracy.

Table 1: Comparison results of techniques

S.No.   Technique       Accuracy
1       DCNN            0.77
2       AUC             0.88
3       AUC in CNN      0.92
4       GLCM in DL      99.5
5       PCA             92.0
6       FODG            84.3
7       OSCC in SVM     98.5
8       HCDR            90.0
9       ANFIS           93.8

V. Conclusion and Future work

This article reviewed earlier studies on the diagnosis of throat cancer. The area has attracted considerable research because of its importance in medical diagnosis.

Based on this review, the central challenge worldwide is the diagnosis of all forms of throat cancer. Video laryngoscopy plays an important role in diagnosing the laryngeal area, as it examines the interior of the larynx and possible abnormalities in order to identify tumors. Further research in this field is needed because success is not guaranteed even with a well-visualized glottis, and the procedure can damage dental and pharyngeal structures.

References

[1] Reddy, D. K. N., “Prior Prediction and Impediment of Cancer using Machine Learning Process”, International Journal of Psychosocial Rehabilitation, 24(4), 2020


[2] Alabi, R. O., Elmusrati, M., Sawazaki-Calone, I., Kowalski, L. P., Haglund, C., Coletta, R. D., Almangush, A., “Machine learning application for prediction of locoregional recurrences in early oral tongue cancer: a Web-based prognostic tool”, Virchows Archiv, 475(4), 489-497, 2019.

[3] Liu, Huicong, Wei Dong, Yunfei Li, Fanqi Li, Jiangjun Geng, Minglu Zhu, Tao Chen, Hongmiao Zhang, Lining Sun, Chengkuo Lee, "An epidermal sEMG tattoo-like patch as a new human–machine interface for patients with loss of voice", Microsystems & Nanoengineering 6(1), 1-13, 2020.

[4] Latchoumi, T. P., & Sunitha, R. Multi agent systems in distributed datawarehousing. In 2010 International Conference on Computer and Communication Technology (ICCCT) (pp. 442-447). IEEE, 2010.

[5] Loganathan, J., Janakiraman, S., & Latchoumi, T. P. (2017). A Novel Architecture for Next Generation Cellular Network Using Opportunistic Spectrum Access Scheme. Journal of Advanced Research in Dynamical and Control Systems,(12), 1388-1400.

[6] Tran, B. X., Latkin, C. A., Sharafeldin, N., Nguyen, K., Vu, G. T., Tam, W. W., Ho, R. C., “Characterizing Artificial Intelligence Applications in Cancer Research: A Latent Dirichlet Allocation Analysis”, JMIR Medical Informatics, 7(4), e14401, 2019.

[7] Korach, Z. T., Cato, K. D., Collins, S. A., Kang, M. J., Knaplund, C., Dykes, P. C., Chang, F., “Unsupervised Machine Learning of Topics Documented by Nurses about Hospitalized Patients Prior to a Rapid-Response Event”, Applied Clinical Informatics, 10(05), 952-963, 2019.

[8] G. Castellano, L. Bonilha, L.M. Li, F. Cendes, “Texture analysis of medical images”, Clinical Radiology, 59, 1061–1069, 2004.

[9] Pereda, M., Estrada, E., “Visualization and machine learning analysis of complex networks in hyperspherical space”, Pattern Recognition, 86, 320-331, 2019.

[10] Feltes, B. C., Chandelier, E. B., Grisci, B. I., Dorn, M., “CuMiDa: An Extensively Curated Microarray Database for Benchmarking and Testing of Machine Learning Approaches in Cancer Research”, Journal of Computational Biology, 26(4), 376-386, 2019.

[11] Srinivas, S., “A Machine Learning-Based Approach for Predicting Patient Punctuality in Ambulatory Care Centers”, International Journal of Environmental Research and Public Health, 17(10), 3703, 2020.

[12] Gu, J. T., Schindler, J. S., Karle, W. E., “An Unusual Sore Throat”, JAMA Otolaryngology–Head & Neck Surgery, 145(7), 678-679, 2019.

[13] Chung-Feng Jeffrey Kuo, Yu-Ching Li, Wei-Han Weng, Kathya Belen Pinos Leon, Yueng-Hsiang Chu. Applied image processing techniques in video laryngoscope for occult tumor detection. Biomedical Signal Processing and Control Journal Volume 55, January 2020. https://doi.org/10.1016/j.bspc.2019.101633

[14] Latchoumi, T. P., Ezhilarasi, T. P., & Balamurugan, K. Bio-inspired weighed quantum particle swarm optimization and smooth support vector machine ensembles for identification of abnormalities in medical data. SN Applied Sciences, 1(10), 1137, 2019.

[15] André Diamant, Avishek Chatterjee, Martin Vallières, George Shenouda & Jan Seuntjens. Deep learning in head & neck cancer outcome prediction, Scientific Reports (2019). https://doi.org/10.1038/s41598-019-39206-1

[16] Pooja Gupta, Avleen Kaur Malhi. Using deep learning to enhance head and neck cancer diagnosis and classification. 2018 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN). https://doi.org/10.1109/ICSCAN.2018.8541142


[17] Anis Ben Aicha. Noninvasive Detection of Potentially Precancerous Contusions of Vocal Fold Based on Glottal Wave Signal and SVM Approaches. International Conference on Knowledge Based and Intelligent Information and Engineering Systems (KES2018). https://doi.org/10.1016/j.procs.2018.07.293

[18] C.R. Muzakkir Ahmed, M. Narayanan, S. Kalaivanan, K. Sathya Narayanan, A.K. Reshmy. To speculate and classify oral cancer in mri image using firefly algorithm and expectation maximization algorithm. International Journal of Pure and Applied Mathematics 116(21):149-154, October 2017.

[19] Corina Barbalata and Leonardo S. Mattos. Laryngeal Tumor Detection and Classification in Endoscopic Video. IEEE Journal of Biomedical and Health Informatics. https://doi.org/10.1109/JBHI.2014.2374975

[20] Balamurugan, K., Uthayakumar, M., Sankar, S., Hareesh, U. S., & Warrier, K. G. K. (2018). Effect of abrasive waterjet machining on LaPO4/Y2O3 ceramic matrix composite. Journal of the Australian Ceramic Society, 54(2), 205-214.

[21] Siow-Wee Chang, Sameem Abdul-Kareem, Amir Feisal Merican and Rosnah Binti Zain. Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods. BMC Bioinformatics 2013. https://doi.org/10.1186/1471-2105-14-170

[22] Bhasha, A. C., & Balamurugan, K. (2019). Fabrication and property evaluation of Al 6061 + x% (RHA + TiC) hybrid metal matrix composite. SN Applied Sciences, 1(9), 1-9.

[23] McMullen, C., Chung, C. H., & Hernandez-Prera, J. C., “Evolving role of human papillomavirus as a clinically significant biomarker in head and neck squamous cell carcinoma”, Expert review of molecular diagnostics, 19(1), 63-70, 2019.

[24] Grzelczyk, W. L., Szemraj, J., Kwiatkowska, S., Józefowicz-Korczyńska, M., “Serum expression of selected miRNAs in patients with laryngeal squamous cell carcinoma (LSCC)”, Diagnostic pathology, 14(1), 49, 2019.

[25] Hart, G. R., Roffman, D. A., Liang, Y., Nartowt, B. J., Muhammad, W., Deng, J., “Multi-parameterized models for early cancer detection and prevention”, Big Data in Radiation Oncology, 265, 2019.

[26] Gowthaman, S., Balamurugan, K., Kumar, P. M., Ali, S. A., Kumar, K. M., & Gopal, N. V. R. (2018). Electrical discharge machining studies on monel-super alloy. Procedia Manufacturing, 20, 386-391.

[27] Zhang, L., Wu, Y., Zheng, B., Su, L., Chen, Y., Ma, S., Chen, L., “Rapid histology of laryngeal squamous cell carcinoma with deep-learning based stimulated Raman scattering microscopy”, Theranostics, 9(9), 2541, 2019.

[28] Aravidan, M. K., & Balamurugan, K. (2016). Tribological and corrosion behaviour of Al6063 metal matrix composites. t J. of Adv. Engr & Tech, 7(2), 994-999.

[29] H. Irem Turkmen, M. Elif Karsligil, Ismail Kocak. Classification of laryngeal disorders based on shape and vascular defects of vocal folds. Computers in Biology and Medicine. http://dx.doi.org/10.1016/j.compbiomed.2015.02.001

[30] Balamurugan, K., Uthayakumar, M., Ramakrishna, M., & Pillai, U. T. S. (2020). Air jet Erosion studies on mg/SiC composite. Silicon, 12(2), 413-423.

[31] Wang, H. Y., Chen, C. H., Shi, S., Chung, C. R., Wen, Y. H., Wu, M. H., Lu, J. J., “Improving Multi-Tumor Biomarker Health Check-up Tests with Machine Learning Algorithms”, Cancers, 12(6), 1442, 2020.
