Enhancing Sketch-Based Image Retrieval By CNN Semantic Re-Ranking

Anisha R¹, N. Anusha², G. Kavya³

¹PG Student, Department of ECE, SA Engineering College, Chennai, India

²Professor, Department of ECE, SA Engineering College, Chennai, India

³Professor, Department of ECE, SA Engineering College, Chennai, India

Abstract – This paper presents a re-ranking system to enhance the performance of sketch-based image retrieval (SBIR). Unlike existing approaches, the proposed system leverages category information provided by CNNs to support similarity measurement between images. To achieve effective classification, one CNN model is trained to classify sketches and another to classify natural images. By training two CNN models, the semantic information of both sketches and natural images is captured by deep learning. To quantify the category similarity between images, a category similarity measurement method is proposed. The category information is then used for re-ranking.

The re-ranking operation first infers the retrieval category of the query sketch and then uses the category similarity measurement to quantify the category similarity between the query and each initial retrieval result. Finally, the initial retrieval results are re-ranked.

Keywords: convolutional neural network, re-ranking.

1. Introduction

Text-based image retrieval systems retrieve images by keywords. Although they are widely applied, keywords are sometimes not sufficient to express the desired pictures clearly. For this reason, content-based image retrieval (CBIR) systems, which retrieve images by exemplar images, have emerged. Algorithms for content-based video retrieval and processing also exist. However, an exemplar picture is not always available as the query image. In such cases, a sketch-based image retrieval (SBIR) system is useful. SBIR systems only need users to draw a simple sketch with a few lines or shapes as the query image. Lines and shapes tend to reflect the primary outlines of a desired object and carry little redundant information. Therefore, sketches are sometimes better at expressing users' search intentions. An effective and time-saving SBIR re-ranking method can significantly improve the performance of an SBIR system without adding much time cost.

2. Convolutional Neural Network

In deep learning, there are different types of models, such as artificial neural networks (ANN) and recurrent neural networks (RNN), alongside related approaches such as reinforcement learning. One particular model has contributed a great deal to computer vision and image analysis: the convolutional neural network (CNN), or ConvNet. CNNs are a class of deep neural networks that can recognize and classify particular features in images and are widely used for analyzing visual imagery. Their applications include image and video recognition, image classification, medical image analysis, computer vision, and natural language processing.

2.1 Architecture Of CNN

A CNN architecture has two main components.

• A convolution tool that isolates and identifies the various features of the image for analysis, in a process called feature extraction.

• A fully connected layer that uses the output of the convolution process and predicts the class of the image based on the features extracted in the previous stages.

Figure: 1

2.2 Convolutional Layer

This is the first layer, and it is used to extract various features from the input images. In this layer, the mathematical operation of convolution is performed between the input image and a filter of a particular size M×M. As the filter slides over the input image, the dot product is taken between the filter and the part of the input image that the filter currently covers (M×M).

The resulting output is called the feature map, which gives information about the image such as corners and edges. This feature map is then fed to later layers, which learn further features of the input image.
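The sliding dot product described above can be illustrated with a minimal pure-Python sketch. The function name, the toy 4×4 image, and the 2×2 filter values are assumptions chosen for illustration; the sketch uses no padding and a stride of 1, and, as is common in CNN libraries, it does not flip the filter.

```python
def convolve2d_valid(image, kernel):
    """Slide an m x m kernel over a 2-D image and take the dot product
    at every position (no padding, stride 1, kernel not flipped)."""
    m = len(kernel)
    out_h = len(image) - m + 1
    out_w = len(image[0]) - m + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the filter and the m x m patch at (i, j)
            total = 0
            for a in range(m):
                for b in range(m):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy image with a vertical edge between columns 1 and 2
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical filter that responds to a left-to-right intensity increase
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the resulting feature map marks the position of the edge, which is the kind of detail a feature map is said to capture.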

2.3 Pooling Layer

A convolutional layer is usually followed by a pooling layer. The primary aim of this layer is to reduce the size of the convolved feature map in order to lower the computational cost. It does so by reducing the connections between layers, and it operates independently on each feature map. Depending on the method used, there are several types of pooling operations.

In max pooling, the largest element is taken from each section of the feature map.

In average pooling, the average of the elements in a section of predefined size is calculated. In sum pooling, the total sum of the elements in the predefined section is calculated. The pooling layer usually serves as a bridge between the convolutional layer and the FC layer.
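The three pooling operations can be compared on the same feature map. The following is a minimal sketch, assuming non-overlapping sections of size 2×2; the function name `pool2d` and the sample values are illustrative only.

```python
def pool2d(feature_map, size, mode="max"):
    """Apply non-overlapping size x size pooling to a 2-D feature map."""
    reducers = {
        "max": max,
        "sum": sum,
        "average": lambda values: sum(values) / len(values),
    }
    reducer = reducers[mode]
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Collect the elements of the current size x size section
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 2, 8, 6],
    [4, 2, 6, 4],
]
max_pooled = pool2d(feature_map, 2, "max")      # [[7, 4], [4, 8]]
avg_pooled = pool2d(feature_map, 2, "average")  # [[4.0, 2.0], [2.0, 6.0]]
sum_pooled = pool2d(feature_map, 2, "sum")      # [[16, 8], [8, 24]]
```

In each case the 4×4 map shrinks to 2×2, which is the size reduction that lowers the cost of the layers that follow.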

2.4 Fully Connected Layer

The fully connected (FC) layer consists of weights and biases along with neurons and is used to connect the neurons of two different layers. These layers are normally placed before the output layer and form the last few layers of a CNN architecture.

Here, the output of the previous layers is flattened and fed to the FC layer. The flattened vector then passes through a few more FC layers, where the mathematical operations normally take place. At this stage, the classification process begins.
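The flatten-then-classify step can be sketched as follows. This is a minimal illustration, not the configuration used in the experiments: the weights, biases, and the two-class setup are hypothetical, and the softmax output stage is an assumption that the text above does not specify.

```python
import math


def flatten(feature_maps):
    """Flatten a list of 2-D feature maps into a single 1-D vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def fully_connected(inputs, weights, biases):
    """Each output neuron is a weighted sum of all inputs plus its bias."""
    return [sum(w * x for w, x in zip(w_row, inputs)) + b
            for w_row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


pooled_maps = [[[7, 4], [4, 8]]]    # one 2x2 pooled feature map
x = flatten(pooled_maps)             # [7, 4, 4, 8]
weights = [[0.1, 0.0, 0.0, 0.1],    # hypothetical weights for class 0
           [0.0, 0.1, 0.1, 0.0]]    # hypothetical weights for class 1
biases = [0.0, 0.5]
scores = fully_connected(x, weights, biases)
probs = softmax(scores)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

The class with the highest probability is taken as the prediction; in a trained network the weights and biases would be learned rather than set by hand.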

3. Working And Analysis

The framework of the proposed SBIR re-ranking system is given in the following figure.

Query refers to a query sketch, and Image Dataset contains all the natural images used in this paper. The query sketch and the natural images are fed into the SBIR system to generate the initial retrieval results. Q-Net and N-Net are two CNNs. Once they are trained, they provide the category information of the query sketch and of the initial retrieval results, respectively. Finally, with the help of this category information, the initial retrieval results are re-ranked in the re-ranking step.
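The re-ranking step can be illustrated with a short sketch. The paper does not give the exact similarity measure or scoring rule in this section, so the following are assumptions made only for illustration: the category information is taken to be a class-probability vector from each network, the category similarity is their dot product, and the final score is a weighted blend controlled by a hypothetical parameter `alpha`.

```python
def category_similarity(query_probs, image_probs):
    """Assumed measure: probability that query and image share a category."""
    return sum(q * p for q, p in zip(query_probs, image_probs))


def rerank(initial_results, query_probs, alpha=0.5):
    """Re-rank (image_id, initial_score, class_probs) tuples.

    The final score blends the initial retrieval score with the category
    similarity; alpha is a hypothetical trade-off weight.
    """
    scored = []
    for image_id, initial_score, image_probs in initial_results:
        sim = category_similarity(query_probs, image_probs)
        scored.append((alpha * initial_score + (1 - alpha) * sim, image_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


# Stand-in for the Q-Net output on the query sketch: [cat, car]
query_probs = [0.9, 0.1]
# Stand-ins for the initial SBIR scores and N-Net outputs
initial_results = [
    ("car_01", 0.80, [0.10, 0.90]),
    ("cat_07", 0.75, [0.95, 0.05]),
    ("cat_03", 0.60, [0.80, 0.20]),
]
reranked = rerank(initial_results, query_probs)
print(reranked)  # ['cat_07', 'cat_03', 'car_01']
```

In this toy case the image that ranked first initially drops to last because its predicted category disagrees with that of the query sketch.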

Figure: 2

In this paper, we test the performance of VGGNet, AlexNet, and GoogLeNet. The retrieved images and a comparison of accuracy and elapsed time are given below.

Figure: 3

Figure 3 shows the images retrieved using the VGG CNN architecture.

Figure: 4

Figure 4 shows the images retrieved using the AlexNet CNN architecture.

Figure: 5

Figure 5 shows the images retrieved using the GoogLeNet CNN architecture.

All three architectures retrieve images, but their accuracy and elapsed time differ, as compared in the following table. The image retrievals are performed in MATLAB, and the outputs are viewed on a personal computer.

Architecture    Accuracy    Elapsed time
VGGNet          57.4%       26 sec
AlexNet         28.57%      24 sec
GoogLeNet       28.57%      4 min 28 sec

Table: 1
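Accuracy and elapsed time of the kind reported in Table 1 could be measured along the following lines. This is only a sketch in Python, not the MATLAB procedure used for the experiments: the criterion that a query counts as correct when the top retrieved category matches is an assumption, and `dummy_retrieve` is a hypothetical stand-in for a CNN-based retrieval call.

```python
import time


def evaluate(retrieve, queries):
    """Return (accuracy in percent, elapsed seconds) for a retrieval function.

    queries is a list of (sketch, expected_category) pairs; a query counts
    as correct when the retrieved category matches (an assumed criterion).
    """
    start = time.perf_counter()
    correct = sum(1 for sketch, expected in queries
                  if retrieve(sketch) == expected)
    elapsed = time.perf_counter() - start
    return 100.0 * correct / len(queries), elapsed


def dummy_retrieve(sketch):
    """Hypothetical stand-in for a CNN-based retrieval call."""
    return "cat" if "whiskers" in sketch else "car"


queries = [
    ("whiskers ears", "cat"),
    ("wheels", "car"),
    ("tail", "cat"),
    ("whiskers", "cat"),
]
accuracy, elapsed = evaluate(dummy_retrieve, queries)
print(f"accuracy = {accuracy:.2f}%, elapsed = {elapsed:.4f} sec")
```

Running the same query set through each architecture with such a harness would give directly comparable accuracy and timing figures.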

4. Results And Conclusions

We propose a re-ranking-based SBIR system to enhance the performance of SBIR systems.

First, we train two CNNs separately: one for sketch classification and the other for natural image classification. In this way, the CNN models learn the semantic information of sketches and natural images. Next, CNN-based image classification is carried out on a sketch and its initial retrieval results, yielding category information for both. Finally, the initial retrieval results are re-ranked by measuring the similarity between the category information of the query sketch and that of the initial retrieval results. Experiments show that the proposed re-ranking-based SBIR system improves the performance of SBIR systems. The performance of different CNN architectures for image retrieval has also been compared: VGGNet gives the highest accuracy (57.4%), while AlexNet has the shortest elapsed time (24 sec).
