
Effective Deep Learning approach based on VGG-Mini Architecture for Iris Recognition

Kranthi Kumar K 1, a)*, Rahul Bharadwaj M 2, b)*, Sheshank Ch 3, c)*, Sujana S 4, d)

1, 2, 3, 4 Department of ECE, Vardhaman College of Engineering, Hyderabad, India

a) [email protected], b) [email protected], c) [email protected], d) [email protected]


A biometric system is a pattern recognition system that works by collecting biometric data from a user, extracting a feature set from that data, and comparing that feature set to a template set in a database. In this paper, we propose a biometric recognition system based on iris recognition. The iris is among the most secure and unique biometric traits.

In our work, we propose a modified Hough Transform and use the Mini-VGG Net model without pre-trained weights, training the network to obtain the best features. Using the neural network, we perform classification and obtain an accuracy of 98% and precision and recall of 0.99 each. Our experiments were performed on the CASIA Version-1 database, which includes 756 iris samples in 108 folders with 7 samples each, each image having dimensions 280 x 320.


Keywords: Iris, Iris Recognition system, Hough Transform, Integro-Differential, Daugman Rubber Sheet model, VGG-Mini, Deep Learning, Neural Network.


Biometrics is a robust and accurate authentication mechanism for systems that restrict access to physical assets by identifying a person from behavioral or physiological features [1]. The iris, palm print, and face are physiological characteristics, while accent, signature, and ECG are behavioral characteristics. Unlike conventional authentication methods such as passwords or PINs, biometric identification techniques rely on traits that cannot be forgotten, destroyed, revealed, or lost. Among biometric modalities, the iris is regarded as one of the most efficient and reliable for authentication.

The iris is a thin, annular structure in the eye; it is a protected internal organ that is largely unaffected by environmental factors [2]. Because of its uniqueness, durability, and long-term stability, iris recognition is among the most advanced biometric identification technologies currently available. Iris textures differ even between genetically identical twins [3].

Iris recognition has many uses, including basic user access management (homes, workplaces, and research labs), protected financial transfers, internet access, payment card authentication, and secure access to banking, among others [4].

During the verification step, the iris identification or verification system acquires a sample of the eye, extracts the region of interest to obtain the distinctive features that identify an individual, and compares them with the database generated during enrolment. As a result, determining an individual's identity becomes easier, more accurate, and more reliable.

The remainder of this paper is organized as follows. Section 2 reviews existing work. Section 3 details the workflow of the proposed system. Section 4 elaborates on the segmentation technique. Section 5 presents the normalization technique (Daugman Rubber Sheet model). Section 6 describes the feature extraction and classification model (Mini-VGG). Sections 7 and 8 present the results and discussion and conclude the paper.

Literature Review

Maram G. Alaslani and Lamiaa A. Elrefaei [5] suggested a CNN (VGG-16) transfer learning technique for iris recognition. They applied a pre-trained VGG-16 to the IITD and CASIA iris databases, with good performance. The proposed model, however, does not fit other publicly accessible databases well and does not address other biometric issues.

Kien Nguyen, et al. [6] suggested a CNN deep learning method for iris recognition. They tested various CNN feature extraction techniques, such as AlexNet, VGG, Google Inception, ResNet, and DenseNet, on two datasets, the LG2200 dataset and the CASIA-Iris-Thousand dataset, and found that the DenseNet method performed better on both databases. They did not, however, clarify why the proposed model using AlexNet was less accurate.


J. Jayanthi, et al. [7] proposed an integrated framework using deep learning features for iris detection and recognition. They combined techniques such as Black Hat filtering, median filtering, gamma correction, the Hough Circle Transform, and R-CNN. The dataset used is CASIA-Iris-Thousand. They achieved significant results with this approach. However, they did not discuss hyperparameter tuning for the implemented DL model.

Shervin Minaee, et al. [8] presented an experimental study of deep convolutional features for iris recognition. They proposed a scheme based on features extracted with VGG Net and achieved good results on two well-known databases, CASIA-Iris-1000 and IIT Delhi. The architecture they used was originally designed for object recognition, which suggests that an architecture designed specifically for iris recognition might have given more accurate results.

Maram G. Alaslani, et al. [9] proposed CNN-based feature extraction for an iris recognition system. They used a pre-trained AlexNet model for feature extraction and an SVM model for classification, and evaluated on public datasets such as the IITD and CASIA iris databases. The results show that the performance of the pre-trained AlexNet model is unsatisfactory on some datasets and needs to improve across a wider range of datasets.

Shervin Minaee, et al. [10] developed a face recognition system based on a scattering convolutional architecture. They used the scattering transform for feature extraction and an SVM for classification, and evaluated on the Yale Face, Georgia Tech Face, and Extended Yale Face databases. However, they did not use scale-invariant scattering features, which could improve accuracy.

Muhammad Arsalan, et al. [11] proposed an iris recognition system using deep learning-based iris segmentation in a visible light environment. They used a two-stage CNN-based iris segmentation approach that is capable of precise segmentation in the extremely noisy conditions of iris capture with a visible light camera sensor. The datasets used are NICE-II, UBIRIS.v2, and MICHE. The results do not address datasets with other biometric problems, such as NIR light environments, and they indicate the need for an SSN technique after post-processing.

Tianming Zhao, et al. [12] proposed an iris recognition system based on a capsule network architecture. They introduced a capsule network in which layers of different depths are created, and tested it on datasets such as JluIrisV3.1, JluIrisV4, and CASIA-V4 Lamp, obtaining excellent results. However, the proposed method may fail on larger datasets because its complexity is too high.

Lili Hsieh, et al. [13] proposed an iris recognition system using Embedded Zerotree Wavelet coding. They used the UBIRIS iris image database and concluded that 100% accuracy can be achieved. The paper also states that the proposed system fails for larger datasets and for small feature sizes.

Sue Chin Yow and Ahmad Nazri Ali [14] proposed an iris recognition system using a deep learning technique. They used a CNN together with an SVM, data augmentation, and Bayesian optimization. CASIA-VI is the dataset used in the paper. The results are good, but the proposed method does not fit other publicly available datasets.

Kai Yang, et al. [15] proposed an iris recognition system using the DualSANet architecture. They used DualSANet, ResNet-18, SAFFM techniques, and the OSIRIS code on the CASIA and IITD databases and achieved good results. Unlike the prevailing plain CNN designs, they proposed an encoder-decoder type architecture.

From the above survey, we draw several conclusions. A few authors used conventional segmentation approaches without modification, which led to lower accuracies, even though preprocessing of the iris is one of the most important steps for an effective system. Moreover, many authors tested their systems on multiple databases but obtained high accuracies only on particular databases rather than on all of them. Additionally, a couple of authors used traditional classification algorithms, with a resulting loss in accuracy. With increasing security demands, accuracy and liveness detection play a significant part today. To meet these demands, transfer learning and pre-trained models, which provide better results in a shorter time, have become a necessity.



The database used in experimentation is CASIA-V1. It contains 756 iris images from 108 eyes. Each file is saved in BMP format with a resolution of 320 x 280 pixels. Seven images of each eye were captured in two sessions using the self-developed CASIA close-up iris camera system: three images in the first session and four in the second. Pre-processing is performed after the image has been collected or prepared.

Filtering is one of the most basic computer vision and image processing operations. The value of the filtered image at a given location is a function of the input image's values in a small neighborhood of the same location. The median filter is a non-linear filter that is often used to eliminate noise [16]. This form of noise reduction is commonly used as a preprocessing phase to improve the results of subsequent processing.

Median filtering is widely used in digital image processing because, under some conditions, it retains edges while eliminating noise. The median filter sorts all pixel values in the window and replaces the centre pixel with their median, so sharp edges are preserved. Since there is usually noise in iris images, we use a Gaussian filter (a low pass filter) before performing iris localization to reduce the impact of noise [17]. This filtering operation must enhance the structural information of the iris image while also eliminating noise. The iris image is sub-sampled after low pass filtering to reduce computation; compared to processing a full-size iris image, this step gives a faster processing speed. The sub-sampled image can only be used for iris localization; after that, the original iris image must be used for feature extraction and subsequent matching or other processing.

A bilateral filter is a non-linear image smoothing filter that preserves edges while reducing noise. It replaces the intensity of each pixel with a weighted average of the intensity values of neighboring pixels, where the weights can be computed from a Gaussian distribution. The bilateral filter is a simple and effective extension of the regular Gaussian filter with several useful properties, including connections to robust estimation of local structure, mean shift, and local mode estimation, as well as efficient algorithms. After noise has been removed with this filter, the filtered image is passed to the segmentation process. The overall layout of the proposed iris recognition method is represented in Fig. 1.
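As an illustration of the median filtering step described above, the following is a minimal sketch in plain Python, not the implementation used in this work. The function name `median_filter`, the 3 x 3 window, the replicated-border policy, and the toy image are all assumptions made for the example.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list of grey levels.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption made for this sketch).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(image[rr][cc])
            # The centre pixel is replaced by the median of its neighbourhood.
            out[r][c] = median(window)
    return out


# Toy image: flat background, one isolated noisy pixel, one bright block.
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
]
smoothed = median_filter(noisy)
```

In this toy case the isolated bright pixel is removed while the interior of the bright block keeps its value, which is the behaviour that makes the median filter attractive before localization.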

Figure 1. The proposed framework for iris recognition



Figure 2. Segmentation process: (a) input image, (b) preprocessed image, (c) Canny edge detector output, and (d) Hough transform output.

Segmentation is the process of extracting the iris region from the image of an eye. Since the whole image is then no longer processed, this step helps reduce computation time. The iris area is often obstructed by eyelashes and other hair-like structures around the eye, and specular reflections can corrupt the iris pattern. In all iris recognition techniques, correctly identifying the inner and outer boundaries of the iris is essential. Segmentation should separate the iris texture from the remaining pixels. For boundary detection, well-known methods such as the integro-differential operator, active contours, and the Hough transform [18] have proven to be efficient.

The Hough transform is used as the segmentation technique in our work. It is a well-known image analysis technique for locating curves that can be described parametrically, such as lines, polynomials, and circles.

To detect the iris and pupil boundaries, the Circular Hough transform [19] is applied. The Circular Hough transform computes the centre coordinates and radii of the pupil and iris. A circle is found by a "voting" procedure in the Hough parameter space, followed by selecting the largest local maximum in a matrix called the "accumulator". However, the Hough transform on its own requires considerable computation time to detect circles effectively.

Combining the Hough transform with the Canny edge detector allows the circles to be detected more quickly. First, the image is passed to the Canny edge detector, which outputs an edge map; the edge map is then passed to the Hough transform to determine the iris boundaries more quickly and more accurately. The output of the Hough transform is shown in Fig 2.
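The voting procedure can be sketched as follows. This is a simplified illustration in plain Python, not the modified Hough transform of this work: it assumes the edge pixels have already been produced by an edge detector such as Canny, and the function names, the angular sampling, and the synthetic circle are hypothetical choices for the example.

```python
import math


def circle_edge_points(cx, cy, r, samples=360):
    """Synthetic stand-in for an edge map: pixels lying on one circle."""
    pts = set()
    for k in range(samples):
        t = 2 * math.pi * k / samples
        pts.add((int(round(cx + r * math.cos(t))),
                 int(round(cy + r * math.sin(t)))))
    return sorted(pts)


def hough_circles(edge_points, width, height, radii, n_angles=360):
    """Vote for circle centres; return (votes, cx, cy, r) of the best cell."""
    best = (0, None, None, None)
    for r in radii:
        acc = {}  # sparse accumulator for this radius: (cx, cy) -> votes
        for (x, y) in edge_points:
            seen = set()  # each edge pixel votes at most once per cell
            for k in range(n_angles):
                t = 2 * math.pi * k / n_angles
                cx = int(round(x - r * math.cos(t)))
                cy = int(round(y - r * math.sin(t)))
                inside = 0 <= cx < width and 0 <= cy < height
                if inside and (cx, cy) not in seen:
                    seen.add((cx, cy))
                    acc[(cx, cy)] = acc.get((cx, cy), 0) + 1
        # Keep the accumulator cell with the most votes over all radii.
        for (cx, cy), votes in acc.items():
            if votes > best[0]:
                best = (votes, cx, cy, r)
    return best


edges = circle_edge_points(20, 15, 8)
votes, cx, cy, r = hough_circles(edges, width=40, height=30,
                                 radii=range(5, 12))
```

Because only edge pixels cast votes, the cost grows with the number of edge pixels rather than with the full image size, which is why running an edge detector first reduces the computation.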


Once the iris region has been accurately segmented from an image of the eye, the next step is to transform it into a fixed shape [20]. Because of the distance from the camera, variations in illumination, and other factors, the size of the iris of the same eye can vary. Normalization converts the segmented iris into fixed dimensions and prepares it for feature extraction.

A proper normalization procedure is required to compensate for these variations. Daugman introduced the homogeneous rubber sheet model for normalization. Within the iris area, this model converts the image from Cartesian to polar form [21]. According to Daugman’s rubber sheet model, each point is mapped to a pair of polar coordinates (r, θ), where r lies in the interval [0, 1] and θ in the interval [0, 2π], as shown in Fig 3.
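The mapping can be illustrated with a short sketch. It assumes circular pupil and iris boundaries given as (centre x, centre y, radius), interpolates linearly between them along each angle, and uses nearest-neighbour sampling; the function name `rubber_sheet` and the synthetic test image are hypothetical and only serve the example.

```python
import math


def rubber_sheet(image, pupil, iris, radial_res=64, angular_res=512):
    """Unwrap the annulus between two circles into a rectangular image.

    pupil and iris are (cx, cy, radius) tuples. Row i corresponds to
    r = i / (radial_res - 1) in [0, 1]; column j to theta in [0, 2*pi).
    """
    rows, cols = len(image), len(image[0])
    px, py, pr = pupil
    ix, iy, ir = iris
    out = []
    for i in range(radial_res):
        r = i / (radial_res - 1)
        row = []
        for j in range(angular_res):
            theta = 2 * math.pi * j / angular_res
            # Boundary points on the pupil and iris circles at this angle.
            xp, yp = px + pr * math.cos(theta), py + pr * math.sin(theta)
            xi, yi = ix + ir * math.cos(theta), iy + ir * math.sin(theta)
            # Linear interpolation between the two boundaries.
            x = (1 - r) * xp + r * xi
            y = (1 - r) * yp + r * yi
            xx = min(max(int(round(x)), 0), cols - 1)
            yy = min(max(int(round(y)), 0), rows - 1)
            row.append(image[yy][xx])  # nearest-neighbour sample
        out.append(row)
    return out


# Synthetic image whose pixel value is the distance from (30, 30).
size = 60
image = [[round(math.hypot(x - 30, y - 30)) for x in range(size)]
         for y in range(size)]
polar = rubber_sheet(image, pupil=(30, 30, 5), iris=(30, 30, 20))
```

With the default resolutions the output has 64 rows and 512 columns, so the first row follows the pupil boundary and the last row follows the outer iris boundary regardless of the iris size in the input image.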

Figure 3. Daugman’s Rubber Sheet Model


Figure 4. (a) Normalized image and (b) enhanced normalized image.

The normalization output is a fixed-size rectangular image of dimension (64, 512), shown in Fig 4(a). The image produced has low contrast and is not clear. To compensate for these effects and enhance the quality of the image, we apply histogram equalization. The enhanced normalized image is shown in Fig 4(b) and is then sent to the feature extraction process.
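Histogram equalization can be sketched as below for an 8-bit grey-level image stored as a 2-D list. This is the standard cumulative-histogram formulation and is given only as an illustration; the function name and the toy low-contrast image are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Spread the grey levels of a 2-D integer image over [0, levels - 1]."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1

    # Cumulative distribution of grey levels.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Look-up table mapping each old level to its equalized level.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
           for c in cdf]
    return [[lut[v] for v in row] for row in image]


low_contrast = [
    [100, 100, 101, 101],
    [101, 102, 102, 103],
]
enhanced = equalize_histogram(low_contrast)
```

The four closely spaced input levels are spread across the full range while their ordering is preserved, which is the contrast gain sought for the normalized iris strip.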


For feature extraction, a variety of techniques have been introduced, including conventional techniques such as PCA and Gabor filters, as well as modern techniques such as CNNs and RNNs. CNN models such as AlexNet, ResNet, and VGG Net are widely used as pretrained models in a variety of applications. In our work, we used a structure called Mini VGG Net [22], which is a reduced version of the VGG Net model. The structure of Mini VGG Net is shown in Fig 5. Two main characteristics define the VGG family of convolutional neural networks. First, only 3 x 3 filters are used in all convolution layers of the network. Second, several convolutional and ReLU layer sets are stacked before a pooling operation is performed.

The Mini-VGG Net consists of two sets of convolutional and ReLU layers, each followed by a pooling layer, and then a set of fully connected layers, as shown in Fig 5. The first two convolutional layers learn 32 filters each, with kernel size 3 x 3. The next two convolutional layers learn 64 filters each, with kernel size 3 x 3. The pooling layers perform max pooling over a 2 x 2 window with a 2 x 2 stride. We also add a batch normalization layer after each activation, as well as dropout layers after the pooling and fully connected layers. Table 1 describes the network architecture in detail. The dropout probability is p = 0.25, meaning that during training each node of the pooling layer is randomly disconnected from the next layer with a probability of 25%. We use the SGD optimizer with a learning rate of 0.01 and a momentum term of 0.9. A learning rate scheduler is used to reduce overfitting and obtain higher classification accuracy. Before feeding the normalized images to the network, we split them in the ratio 80:20, with 80 percent used for training and 20 percent used for testing, to validate the model's performance on unseen data. The performance of the model during training is shown in Fig 6. The softmax function in the last layer of Mini-VGG Net classifies the features obtained from the previous layer: it outputs the probability of the input belonging to each of the 108 classes.
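The role of the final softmax layer can be illustrated with a small sketch. The scores below are made up for the example (they are not outputs of the trained network), and `NUM_CLASSES` simply mirrors the 108 classes mentioned above.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


NUM_CLASSES = 108

# Hypothetical scores from the last fully connected layer:
# class index 41 has the largest score.
scores = [0.0] * NUM_CLASSES
scores[41] = 5.0

probs = softmax(scores)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
```

The predicted identity is the class with the highest probability; the remaining probabilities indicate how confidently the other classes were rejected.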

Figure 5. Architecture of Mini VGG Net


Table 1. Summary of the Mini VGG Net architecture.

Layer Type     Output Size       Filter Size / Stride
Input Image    200 x 150 x 3
CONV           200 x 150 x 32    3 x 3, k = 32
ACT            200 x 150 x 32
BN             200 x 150 x 32
CONV           200 x 150 x 32    3 x 3, k = 32
ACT            200 x 150 x 32
BN             200 x 150 x 32
POOL           100 x 75 x 32     2 x 2
DROPOUT        100 x 75 x 32
CONV           100 x 75 x 64     3 x 3, k = 64
ACT            100 x 75 x 64
BN             100 x 75 x 64
CONV           100 x 75 x 64     3 x 3, k = 64
ACT            100 x 75 x 64
BN             100 x 75 x 64
POOL           50 x 37 x 64      2 x 2
DROPOUT        50 x 37 x 64
FC             512
ACT            512
BN             512
FC             109
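The output sizes in Table 1 can be checked with a few lines of arithmetic. The sketch below assumes that the convolutions preserve the spatial size (as the CONV rows of the table indicate) and that pooling uses a 2 x 2 window with stride 2 and no padding; the helper names are hypothetical.

```python
def conv_same(shape, filters):
    """A size-preserving convolution keeps the spatial size, changes depth."""
    a, b, _ = shape
    return (a, b, filters)


def max_pool(shape, window=2, stride=2):
    """Max pooling without padding; partial windows are dropped."""
    a, b, c = shape
    return ((a - window) // stride + 1, (b - window) // stride + 1, c)


shape = (200, 150, 3)            # input size from Table 1
shape = conv_same(shape, 32)     # CONV, k = 32
shape = conv_same(shape, 32)     # CONV, k = 32
block1 = max_pool(shape)         # POOL
shape = conv_same(block1, 64)    # CONV, k = 64
shape = conv_same(shape, 64)     # CONV, k = 64
block2 = max_pool(shape)         # POOL

# Number of values fed into the first fully connected layer.
flattened = block2[0] * block2[1] * block2[2]
```

The second pooling step halves 75 to 37 because the last partial window is dropped, which explains the odd dimension in the table.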



We present the performance analysis of the proposed system as well as a comparison with previous research on this dataset, which consists of 756 iris images in 108 folders with 7 samples each. The proposed work was implemented on Google Colab with a GPU, and the model accuracies were visualized using the matplotlib Python library. The confusion matrix is one of the simplest tools for assessing a classifier's correctness. It is used for classification problems with two or more output classes. The confusion matrix is a performance indicator in itself, and nearly all other performance metrics are derived from the numbers it contains. The confusion matrix obtained after classification is shown in Fig 7.


Figure 6. The recognition rate of Mini VGG.

Figure 7. The confusion matrix after the classification of data.

Table 2. Results obtained from the confusion matrix.

Precision    Recall    Accuracy
0.99         0.99      0.98


Table 3. Proposed Iris Recognition System vs Existing Systems

Method                                Accuracy (%)
Lemmouchi Mansoura, et al. [23]       77.50
Aniket S. Buddharpawar et al. [24]    85
Asim Ali Khan et al. [25]             90.25
Manjunath M [26]                      90
Megha Dua et al. [27]                 97
Proposed Method                       98

The model achieves high accuracy on the CASIA V1 database. The comparison of the model's performance with existing models is shown in Table 3. We computed precision, recall, and accuracy from the confusion matrix; the values are shown in Table 2. Precision, also called positive predictive value, is the fraction of predictions for a class that truly belong to that class. Recall is the fraction of the samples of a class in the dataset that are correctly predicted. Fig 6 shows the training and testing performance of the model; within very few epochs the model reaches a high, saturated accuracy.
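The way these metrics follow from a confusion matrix can be sketched as below. The 3-class matrix is a made-up toy example, not the matrix of Fig 7, and macro averaging over classes is an assumption, since the averaging scheme is not stated here.

```python
def metrics_from_confusion(cm):
    """Accuracy and macro-averaged precision/recall.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    precisions, recalls = [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
    }


# Toy matrix: one sample of class 1 is misclassified as class 0.
toy_cm = [
    [5, 0, 0],
    [1, 4, 0],
    [0, 0, 5],
]
scores = metrics_from_confusion(toy_cm)
```

For the toy matrix, a single misclassification lowers the precision of the class that received the wrong prediction and the recall of the class that lost the sample, while accuracy counts only the diagonal.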


In our work, we have proposed an efficient iris recognition scheme based on the Circular Hough transform in combination with a Mini-VGG CNN. The Circular Hough Transform is employed for the segmentation of the iris. For normalization, we used the Daugman Rubber Sheet Model to obtain images in a unified form. The normalized images are then fed into a Mini-VGG CNN that is trained from scratch, without pre-trained weights. We used the CASIA-IrisV1 database, consisting of 756 images, to perform our experiments, and the proposed framework obtained an accuracy of 98% and precision and recall of 0.99.

In future work, we will evaluate on multiple databases and tune the existing Mini VGG model to obtain a more accurate and efficient system across all of them.



[1] Pradhan, M. (2015). Next generation secure computing: biometric in secure e-transaction. International Journal of Advance Research in Computer Science and Management Studies, 3(4), 473-489.

[2] Trader, J. (2012). M2SYS Blog On Biometric Technology. Delta ID, 11.

[3] Daugman, J. G. (1993). High confidence visual recognition of persons by a test of statistical independence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), 1148-1161.

[4] Kalaiselvi, S., Jothi, R. A., & Palanisamy, V. (2018). Biometric security with iris recognition techniques: A review. International Journal of Pure and Applied Mathematics, 118(8), 567-572.

[5] Alaslani, M. G., & Elrefaei, L. A. (2019). Transfer learning with convolutional neural networks for iris recognition. Int. J. Artif. Intell. Appl, 10(5), 47-64.

[6] Nguyen, K., Fookes, C., Ross, A., & Sridharan, S. (2017). Iris recognition with off-the-shelf CNN features: A deep learning perspective. IEEE Access, 6, 18848-18855.

[7] Jayanthi, J., Lydia, E. L., Krishnaraj, N., Jayasankar, T., Babu, R. L., & Suji, R. A. (2020). An effective deep learning features based integrated framework for iris detection and recognition. Journal of Ambient Intelligence and Humanized Computing, 1-11.

[8] Minaee, S., Abdolrashidiy, A., & Wang, Y. (2016, December). An experimental study of deep convolutional features for iris recognition. In 2016 IEEE Signal Processing in Medicine and Biology Symposium (SPMB) (pp. 1-6). IEEE.

[9] Alaslani, M. G. (2018). Convolutional neural network based feature extraction for iris recognition. International Journal of Computer Science & Information Technology (IJCSIT) Vol, 10.

[10] Minaee, S., Abdolrashidi, A., & Wang, Y. (2017, December). Face recognition using scattering convolutional network. In 2017 IEEE Signal Processing in Medicine and Biology Symposium (SPMB) (pp. 1-6). IEEE.

[11] Arsalan, M., Hong, H. G., Naqvi, R. A., Lee, M. B., Kim, M. C., Kim, D. S., ... & Park, K. R. (2017). Deep learning-based iris segmentation for iris recognition in visible light environment. Symmetry, 9(11), 263.

[12] Zhao, T., Liu, Y., Huo, G., & Zhu, X. (2019). A deep learning iris recognition method based on capsule network architecture. IEEE Access, 7, 49691-49701.

[13] Hsieh, L., Chen, W. S., & Li, T. H. (2010, September). Personal Authentication Using Human Iris Recognition Based on Embedded Zerotree Wavelet Coding. In 2010 Fifth International Multi-conference on Computing in the Global Information Technology (pp. 99-103). IEEE.

[14] Yow, S. C., & Ali, A. N. (2019). Iris Recognition System (IRS) Using Deep Learning Technique. Journal of Engineering, 15(2), 125-144.

[15] Yang, K., Xu, Z., & Fei, J. (2021). DualSANet: Dual Spatial Attention Network for Iris Recognition. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 889-897).

[16] Gui, F., & Qiwei, L. (2007, December). Iris localization scheme based on morphology and gaussian filtering. In 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System (pp. 798-803). IEEE.

[17] Umer, S., Dhara, B. C., & Chanda, B. (2015). Iris recognition using multiscale morphologic features. Pattern Recognition Letters, 65, 67-74.

[18] Matveev, I. A. (2012). Iris center location using Hough transform with two-dimensional parameter space. Journal of Computer and Systems Sciences International, 51(6), 785-791.

[19] Umer, S., & Dhara, B. C. (2015, January). A fast iris localization using inversion transform and restricted circular Hough transform. In 2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR) (pp. 1-6). IEEE.

[20] Johar, T., & Kaushik, P. (2015). Iris segmentation and normalization using Daugman’s rubber sheet model. International Journal of Scientific and Technical Advancements, 1(1), 11-14.

[21] Ivins, J. P., & Porrill, J. (1998). A deformable model of the human iris for measuring small three-dimensional eye movements. Machine Vision and Applications, 11(1), 42-51.

[22] Khrisne, D. C., & Suyadnya, I. M. (2018, October). Indonesian herbs and spices recognition using smaller VGGNet-like network. In 2018 International Conference on Smart Green Technology in Electrical and Information Systems (ICSGTEIS) (pp. 221-224). IEEE.

[23] Mansoura, L., Noureddine, A., Assas, O., & Yassine, A. (2019, April). Biometric recognition by multimodal face and iris using FFT and SVD methods With Adaptive Score Normalization. In 2019 4th World Conference on Complex Systems (WCCS) (pp. 1-5). IEEE.

[24] Buddharpawar, A. S., & Subbaraman, S. (2015). Iris recognition based on pca for person identification. International Journal of Computer Applications, 975, 8887.

[25] Dhouib, M., & Masmoudi, S. (2016). Advanced Multimodal Fusion for Biometric Recognition System based on Performance Comparison of SVM and ANN Techniques. International Journal of Computer Applications, 148(11).

[26] Manjunath, M., & Kulkarni, H. B. (2018). Analysis of unimodal and multimodal biometric system using iris and fingerprint. Perspectives in Communication, Embedded-systems and Signal-processing-PiCES, 2(8), 333-337.

[27] Dua, M., Gupta, R., Khari, M., & Crespo, R. G. (2019). Biometric iris recognition using radial basis function neural network. Soft Computing, 23(22), 11801-11815.



