Publication Date: 2024
Neural Computing And Applications (09410643)36(10)pp. 5447-5469
Several state-of-the-art convolutional neural network (CNN)-based methods are available for image denoising tasks. CNNs are typically trained using the backpropagation algorithm, which requires all operations in the network to be differentiable. Most CNN operations satisfy this requirement and can be used in backpropagation-based training. However, some transforms, including the wavelet transform, which is useful for speeding up CNN computations as well as performing multi-resolution analysis, are not strictly differentiable. This paper addresses this challenge by proposing a wavelet-like transform that is differentiable. This new design is, in fact, a new CNN architecture named semi-wavelet, specific edge convolutional neural network (SW/SE-CNN), consisting of three newly designed layers. The first layer is a Semi-Wavelet (SW)-based layer, which is a differentiable down-sampling operator for wavelet approximation. That is, the SW layer converts the input image into four channels. Three of these channels are estimates of the vertical, horizontal, and diagonal edges of the original image, and the fourth channel is a down-sampled version of it. The second proposed layer, called Semi-Wavelet Inverse (SWI), restores the original image from the four SW output channels. Additionally, a specific edge extractor (SE), as another new layer, is designed on the basis of the well-known Sobel operator to extract specific edges of the image. The SE layer is proposed to provide more edge information to the network, and the SW layer is included to speed up the network and to enable multi-resolution analysis. The new SW/SE-CNN architecture is then implemented for Gaussian image denoising.
The experimental results show that the new SW/SE-CNN outperformed the state-of-the-art methods for Gaussian image denoising based on the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) measurements for grayscale as well as color images. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. corrected publication 2024.
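The four-channel decomposition and its inverse described in this abstract can be illustrated with an ordinary one-level Haar transform. This is only a sketch under that assumption: it is not the authors' learnable SW/SWI layers, and the function names `haar_forward` and `haar_inverse` are hypothetical. Every step is a sum or difference of pixels, so the whole mapping is linear.

```python
import numpy as np

def haar_forward(img):
    """Split an even-sized 2-D image into four half-resolution channels."""
    img = np.asarray(img, dtype=float)
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0  # down-sampled (low-pass) channel
    horiz = (a + b - c - d) / 2.0   # top minus bottom: horizontal edges
    vert = (a - b + c - d) / 2.0    # left minus right: vertical edges
    diag = (a - b - c + d) / 2.0    # diagonal detail
    return approx, horiz, vert, diag

def haar_inverse(approx, horiz, vert, diag):
    """Rebuild the full-resolution image from the four channels."""
    h, w = approx.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (approx + horiz + vert + diag) / 2.0
    out[0::2, 1::2] = (approx + horiz - vert - diag) / 2.0
    out[1::2, 0::2] = (approx - horiz + vert - diag) / 2.0
    out[1::2, 1::2] = (approx - horiz - vert + diag) / 2.0
    return out

image = np.arange(16.0).reshape(4, 4)
channels = haar_forward(image)
restored = haar_inverse(*channels)
```

Each output channel has half the height and width of the input, which is where the speed-up from working at lower resolution would come from.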
Publication Date: 2024
Neural Computing And Applications (09410643)36(10)pp. 5471-5471
In this article the note of Table 3 was incorrectly given as ‘The bold font indicates the related layers caption in Fig. 7, and italic font demonstrates the minimum receptive field requirement’ and should have read ‘The italic font indicates the related layers caption in Fig. 7, and bold font demonstrates the minimum receptive field requirement’. In this article the notes for Tables 4 and 5 were incorrectly given as ‘Bold indicates the best results, and the second-best results are italic’ but should have been ‘Italic indicates the best results, and the second-best results are bold’. The original article has been corrected. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
Adversarial attacks on images, where intentional noise is added to deceive machine learning models, have emerged as both a significant security concern and a beneficial tool for enhancing privacy, protecting intellectual property, and innovating in creative industries. This paper introduces a novel information-theoretic approach to crafting effective adversarial attacks, focusing on parts of the image that contain information relevant to decision-making processes in typical deep learning models for object detection and classification. Our method, Fisher-CAM, generates class activation maps (CAM) using Fisher information to identify significant regions in input images. The adversarial noise is crafted by augmenting images based on these regions and iteratively updating perturbation patterns through gradient calculation and momentum updates. Extensive experiments on a subset of the ImageNet dataset demonstrate that our method surpasses state-of-the-art attack performance in both white-box and black-box settings. © 2024 IEEE.
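The "gradient calculation and momentum updates" step can be illustrated with a generic momentum-based iterative sign-gradient attack on a toy logistic model. This is a sketch under stated assumptions, not the Fisher-CAM method: the model, the function name `momentum_attack`, and the optional `mask` argument (a stand-in for a map of significant regions) are all hypothetical.

```python
import numpy as np

def momentum_attack(x, y, w, b, eps=0.1, steps=10, mu=1.0, mask=None):
    """Perturb x (within an L-infinity ball of radius eps) to raise the
    cross-entropy loss of the logistic model p = sigmoid(w.x + b)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    mask = np.ones_like(x) if mask is None else np.asarray(mask, dtype=float)
    x_adv = x.copy()
    g = np.zeros_like(x)          # accumulated (momentum) gradient
    alpha = eps / steps           # per-step size
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w * mask             # d(loss)/dx, restricted to the mask
        norm = np.abs(grad).sum()
        if norm > 0:
            grad = grad / norm                # L1-normalise before accumulating
        g = mu * g + grad
        x_adv = x_adv + alpha * np.sign(g)    # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

x = np.array([1.0, -1.0, 1.0])
w = np.array([1.0, -2.0, 0.5])
x_adv = momentum_attack(x, y=1, w=w, b=0.0, eps=0.1)
```

For the true label `y = 1`, the attack lowers the logit `w @ x_adv` while keeping every coordinate within `eps` of the original input.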
Publication Date: 2024
Aut Journal Of Modeling And Simulation (25882953)56(1)pp. 69-86
Skeleton-based action recognition has attracted significant attention in the field of computer vision. In recent years, Transformer networks have improved action recognition as a result of their ability to capture long-range dependencies and relationships in sequential data. In this context, a novel approach is proposed to enhance skeleton-based activity recognition by introducing Transformer self-attention alongside Convolutional Neural Network (CNN) architectures. The proposed method capitalizes on the 3D distances between pair-wise joints, utilizing this information to generate Joint Distance Images (JDIs) for each frame. These JDIs offer a relatively view-independent representation, allowing the model to discern intricate details of human actions. To further enhance the model’s understanding of spatial features and relationships, the extracted JDIs from different frames are processed. They can be directly input into the Transformer network or first fed into a CNN, enabling the extraction of crucial spatial features. The obtained features, combined with positional embeddings, serve as input to a Transformer encoder, enabling the model to reconstruct the underlying structure of the action from the training data. Experimental results showcase the effectiveness of the proposed method, demonstrating performance comparable to other state-of-the-art transformer-based approaches on benchmark datasets such as NTU RGB+D and NTU RGB+D120. The incorporation of Transformer networks and Joint Distance Images presents a promising avenue for advancing the field of skeleton-based human action recognition, offering robust performance and improved generalization across diverse action datasets. © 2024, Amirkabir University of Technology. All rights reserved.
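A per-frame matrix of pair-wise 3D joint distances, as used for the Joint Distance Images above, can be sketched in a few lines. The function name `joint_distance_image` and the random five-joint skeleton are illustrative assumptions; the paper's normalisation and image encoding are not reproduced here.

```python
import numpy as np

def joint_distance_image(joints):
    """joints: (J, 3) array of 3-D joint coordinates for one frame.
    Returns a (J, J) symmetric matrix of pair-wise Euclidean distances."""
    joints = np.asarray(joints, dtype=float)
    diff = joints[:, None, :] - joints[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
skeleton = rng.normal(size=(5, 3))            # hypothetical 5-joint skeleton

# Rotate the skeleton about the z-axis to mimic a change of camera viewpoint.
theta = np.pi / 3
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])
rotated = skeleton @ rotation.T

jdi = joint_distance_image(skeleton)
jdi_rotated = joint_distance_image(rotated)
```

Because rotations and translations preserve Euclidean distances, `jdi` and `jdi_rotated` agree up to floating-point error, which is the sense in which such a representation is view-independent.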
Publication Date: 2022
Neural Computing And Applications (09410643)34(24)pp. 22449-22464
Finding a proper kernel for support vector machines (SVMs) and adjusting the involved parameters for better classification remain immense challenges. This paper addresses both challenges in two parts. In part one, a new kernel, called the Frequency Component Kernel (FCK), is presented; in part two, two techniques for forming objective functions are introduced to estimate its shape parameter. In designing the FCK, a new Frequency-Based Regressor Matrix (FBRM) is designed based on data structure discovery through curve fitting. The inner product of this regressor matrix with itself produces an intermediary kernel. The FCK is a smoothed version of this intermediary kernel. The FCK’s classification accuracy with a 95% confidence interval is compared to that of well-known kernels, namely the Gaussian, Linear, Polynomial, and Sigmoid kernels, on fifteen datasets. A grid search method is employed for parameter assignments in all kernels. This comparison shows the superiority of the FCK in most cases. In part two, the first technique to form an objective function is based on variances of data groups, distances between the centers of data groups, and upper bound classification errors; the second technique is based on distances between all data, the SVM margin, and distances between the centers of data groups. Both techniques take advantage of the FCK development so that all data are converted to the new space via the FBRM. Then, the data distances in this space are calculated. The comparative results show that both suggested techniques to form objective functions outperform the current state-of-the-art parameter estimation methods. Overall, the results show that the combination of our FCK with our two automatic shape parameter estimation methods can be an excellent choice in many SVM applications. © 2022, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
Publication Date: 2021
Applied Intelligence (0924669X)51(6)pp. 3581-3599
Camera pose estimation in robotic applications is paramount. Most recent algorithms based on convolutional neural networks demonstrate that they are able to predict the camera pose adequately. However, they usually suffer from computational complexity, which prevents them from running in real time. Additionally, they are not robust to perturbations such as partial occlusion when they have not been trained on such cases beforehand. To address these limitations, this paper presents a fast and robust end-to-end Siamese convolutional model for robot-camera pose estimation. Two color frames are fed to the model at the same time, and the generic features are produced mainly through transfer learning. The extracted features are then concatenated, and the relative pose is obtained directly from them at the output. Furthermore, a new dataset is generated, which includes several videos taken in various situations for the model evaluation. The proposed technique shows robust performance even in challenging scenes that were not seen during the training phase. In experiments conducted with an eye-in-hand KUKA robotic arm, the presented network renders fairly accurate results on camera pose estimation despite scene-illumination changes. Also, the pose estimation is conducted with reasonable accuracy in the presence of partial camera occlusion. The results are enhanced by defining a new dynamic weighted loss function. The proposed method is further exploited in a visual servoing scenario. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
Publication Date: 2020
Neural Computing And Applications (09410643)32(8)pp. 4073-4091
This paper presents a video object segmentation method which jointly uses motion boundary and convolutional neural network (CNN)-based class-level maps to carry out the co-segmentation of the frames. The key characteristic of the proposed approach is a combination of those two sources of information to create initial object and background regions. These regions are employed within the co-segmentation energy function. The motion boundary map detects the areas which contain the object movement, and the CNN-based class saliency map determines the regions with more impact on acquiring the correct network classification. The proposed approach can be implemented on unconstrained natural videos which include changes in an object’s appearance, rapidly moving background, object deformation in non-rigid moving, rapid camera motion and even the existence of a static object. Experimental results on two challenging datasets (i.e., Davis and SegTrackv2 datasets) demonstrate the competitive performance of the proposed method compared with the state-of-the-art approaches. © 2019, Springer-Verlag London Ltd., part of Springer Nature.
Publication Date: 2019
Multimedia Tools and Applications (13807501)78(22)pp. 31319-31345
A novel class-dependent joint weighting method is proposed to mine the key skeletal joints for human action recognition. Existing deep learning methods, or those based on hand-crafted features, may not adequately capture the joints that are most relevant for recognizing each action. In the proposed method, for each class of human actions, each joint is weighted according to its temporal variations and its inherent ability in extension or flexion. These weights can be used as prior knowledge in skeletal joints-based methods. Here, a novel human action recognition algorithm is also proposed in order to use these weights in two different ways. First, for each frame of a skeletal sequence, the histogram of 3D joints is weighted according to the contribution of joints in the corresponding class of human action. Second, a weighted motion energy function is defined to dynamically divide the temporal pyramid of actions. Experimental results on three benchmark datasets show the efficiency of the proposed weighting method, especially when occlusion occurs. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
Publication Date: 2019
Multidimensional Systems And Signal Processing (15730824)30(1)pp. 175-193
A probabilistic video content analysis method called extended histogram (EH) is proposed for modelling temporal evolutions of a set of histograms extracted from video frames. In EH, the number of counts for each histogram bin is considered as a random variable (instead of a single value) to account for bin variations. This representation is especially suitable for modelling the dynamic behaviour of a tracked video content of interest in a general manner. The pitfall of such modelling is that it neglects the temporal order of observations in the collection. To overcome that problem, a hierarchical approach called hierarchical extended histogram (HEH) is proposed for extracting EHs at different levels of the temporal pyramid. Once these generative models are identified for each video, an information-based metric is proposed for defining the similarity of two EHs. With this metric, EHs can be used in many different tasks including video retrieval, classification, summarization, and so forth. In particular, for discriminant learning, probabilistic kernels based on this metric are also defined so that EHs/HEHs can be used alongside machine learning models such as the SVM. Person re-identification and human action recognition are used as pilot applications to show the capabilities of the proposed representations. Experimental results show the effectiveness of the proposed models. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
Publication Date: 2019
International Journal of Knowledge-Based and Intelligent Engineering Systems (13272314)23(3)pp. 191-201
Pedestrian detection has been a crucial issue over the last decades. Existing pedestrian detection methods still face challenges such as abrupt illumination changes, partial occlusion, different human poses, and cluttered backgrounds. Consequently, the significance of pedestrian detection systems encourages us to propose a new method that addresses some of these challenges and offers a higher accuracy rate. Different kinds of features have different strengths, and a single type of feature cannot capture comprehensive information about the human shape. Taking this fact into consideration, we combine practical and useful features in order to detect pedestrians more accurately. Specifically, we combine the histogram of oriented gradients (HOG), a proposed modified local binary pattern (M-LBP), and proposed modified Haar-like features (M-Haar). With the proposed method, it is possible to extract various information on human shapes, including edge information, texture information, and local shape information. After feature extraction, a cascade AdaBoost classifier is used to distinguish pedestrian images from non-pedestrian ones. The experiments use the INRIA, Daimler, and ETH datasets. The extensive experimental results demonstrate that our approach outperforms the traditional methods in terms of accuracy and robustness. © 2019 - IOS Press and the authors. All rights reserved.
Publication Date: 2018
Applied Intelligence (0924669X)48(12)pp. 5019-5036
This paper introduces a novel iterative approach for interactive single or multiple foreground co-segmentation using semantic information. A quadratic cost function based on a graph model is proposed. The cost function includes a ‘smoothness’ term and a ‘label-information’ term. The ‘label-information’ term propagates the feature-level and contextual information. This information is updated based on the features and neighborhood patterns of all the images after each iteration. The approach can be easily implemented with a few scribbles on a few random images. The paper also proposes a model called Neighborhood Pattern Model (NPM) for contextual information. Along with feature-level information, NPM helps to give semantic meanings to the labels (i.e., foreground(s) and background). Moreover, in the case of insufficient features (i.e., same features for different labels), NPM can be effective in distinguishing the labels. Experimental results on two benchmark datasets, iCoseg and FlickrMFC, illustrate the better performance of the proposed approach over the current state-of-the-art co-segmentation methods. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
Publication Date: 2018
Journal of Visual Communication and Image Representation (10473203)55pp. 201-214
This paper addresses the co-segmentation problem using feature visualization for CNNs. Visualization is exploited as auxiliary information to discriminate salient image regions (dubbed “heat-regions”) from non-salient ones. Region occlusion sensitivity is proposed for feature visualization. The co-segmentation problem is formulated as a convex quadratic optimization that is initialized by the heat-regions. The information obtained through the visualization is considered as an extra energy term in the cost function. The results of the visualization demonstrate that some heat-regions are not productive in the co-segmentation. To detect the helpful regions among them, an adaptive strategy in the form of an iterative algorithm is proposed according to the consistency among all images. Comparison experiments conducted on two benchmark datasets, iCoseg and MSRC, illustrate the superior performance of the proposed approach over state-of-the-art algorithms. © 2018
Publication Date: 2018
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization (21681163)6(2)pp. 170-181
Purpose: Coronary Computed Tomography Angiography (CCTA) is a promising alternative for high accuracy detection of a wide range of coronary artery diseases. To obtain the anatomical and pathological features of intramuscular coronary arteries with minimal user interaction, we need an automated coronary artery centerline extraction algorithm. Method: This article presents a fully automatic coronary artery centerline tracking algorithm. First, a complex continuous wavelet transform with Gaussian kernels is used to reduce the effect of noise. Then, a multiple hypothesis tracking approach is applied to segment 3-D vessel structures. Finally, the tracking procedure is completed by applying a newly presented branch searching approach based on a region growing algorithm and a mathematical morphology operation. Results: The performance of the presented method is measured on the publicly available Rotterdam Coronary Artery Algorithm Evaluation Framework. The extraction ability of the algorithm was computed by overlap measures averaged over 32 data-sets: overall overlap, overlap until the first error, and overlap with the clinically relevant part of the vessel were OV = 85.2%, OF = 75.7% and OT = 98.5%, respectively. Also, the average accuracy measurement was 0.26 mm, which shows high extraction accuracy with respect to the mean voxel size of 0.32 × 0.32 × 0.4 mm³. Conclusion: The average coronary artery extraction time was about 8 min per data-set. The experimental results show that our newly developed algorithm achieved high efficiency in coronary artery centerline tracking for CCTA images. © 2016 Informa UK Limited, trading as Taylor & Francis Group.
Publication Date: 2017
IET Computer Vision (17519640)11(8)pp. 683-690
Although there is increasing interest in employing depth data in computer vision applications, the spatial resolution of depth maps is still limited compared with typical visible-light images. A novel method is proposed to synthetically improve the spatial resolution of a single depth image. It integrates higher-order terms into the Markov random field (MRF) formulation of example-based methods in order to improve the representational power of those methods. The inference is performed by approximately minimising the higher-order multi-label MRF energies. In addition, to improve the efficiency of the inference algorithm, a hierarchical scheme on the number of MRF states is proposed. First, a large number of states are used to obtain an initial labelling by solving the minimisation problem of inference for only the first-order energies. Then, the problem is solved for the higher-order energies with a smaller number of states. Performance comparisons show that the proposed method improves on the results of first-order approaches that are based on a simple four-connected MRF graph structure, both qualitatively and quantitatively. © The Institution of Engineering and Technology 2017.
Publication Date: 2017
Computers in Biology and Medicine (00104825)91pp. 181-190
Background and objective: To diagnose infertility in men, semen analysis is conducted in which sperm morphology is one of the factors that are evaluated. Since manual assessment of sperm morphology is time-consuming and subjective, automatic classification methods are being developed. Automatic classification of sperm heads is a complicated task due to the intra-class differences and inter-class similarities of class objects. In this research, a Dictionary Learning (DL) technique is utilized to construct a dictionary of sperm head shapes. This dictionary is used to classify the sperm heads into four different classes. Methods: Square patches are extracted from the sperm head images. Columnized patches from each class of sperm are used to learn class-specific dictionaries. The patches from a test image are reconstructed using each class-specific dictionary and the overall reconstruction error for each class is used to select the best matching class. Average accuracy, precision, recall, and F-score are used to evaluate the classification method. The method is evaluated using two publicly available datasets of human sperm head shapes. Results: The proposed DL based method achieved an average accuracy of 92.2% on the HuSHeM dataset, and an average recall of 62% on the SCIAN-MorphoSpermGS dataset. The results show a significant improvement compared to a previously published shape-feature-based method. We have achieved high-performance results. In addition, our proposed approach offers a more balanced classifier in which all four classes are recognized with high precision and recall. Conclusions: In this paper, we use a Dictionary Learning approach in classifying human sperm heads. It is shown that the Dictionary Learning method is far more effective in classifying human sperm heads than classifiers using shape-based features. Also, a dataset of human sperm head shapes is introduced to facilitate future research. © 2017 Elsevier Ltd
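The classification rule described in the Methods (reconstruct the test patches with each class-specific dictionary and pick the class with the smallest overall reconstruction error) can be sketched as follows. Two simplifications are assumed and are not from the paper: the dictionaries are built with a truncated SVD rather than a dictionary learning algorithm, and patches are coded by ordinary least squares rather than sparse coding. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def learn_dictionary(patches, n_atoms):
    """patches: (d, n) matrix of columnized training patches.
    Returns (d, n_atoms) orthonormal atoms (truncated SVD stand-in)."""
    u, _, _ = np.linalg.svd(patches, full_matrices=False)
    return u[:, :n_atoms]

def reconstruction_error(dictionary, patches):
    """Total squared error after least-squares coding of every patch."""
    coeffs, *_ = np.linalg.lstsq(dictionary, patches, rcond=None)
    return float(np.sum((patches - dictionary @ coeffs) ** 2))

def classify(dictionaries, patches):
    """Return the label whose dictionary reconstructs the patches best."""
    errors = {label: reconstruction_error(d, patches)
              for label, d in dictionaries.items()}
    return min(errors, key=errors.get)

# Synthetic demo: each class lives in its own 2-D subspace of a 16-D patch space.
rng = np.random.default_rng(0)
basis_a = rng.normal(size=(16, 2))
basis_b = rng.normal(size=(16, 2))
dictionaries = {
    "a": learn_dictionary(basis_a @ rng.normal(size=(2, 50)), n_atoms=2),
    "b": learn_dictionary(basis_b @ rng.normal(size=(2, 50)), n_atoms=2),
}
test_patches = basis_a @ rng.normal(size=(2, 5))
predicted = classify(dictionaries, test_patches)
```

In this toy setting the test patches lie in class "a"'s subspace, so that dictionary reconstructs them almost exactly while the other leaves a clear residual.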
Developing techniques to retrieve video contents with regard to their impact on viewers' emotions is the main goal of affective video retrieval systems. Existing systems mainly apply a multimodal approach that fuses information from different modalities to specify the affect category. In this paper, the effect of exploiting two types of textual information to enrich the audio-visual content of music videos is evaluated: subtitles or songs' lyrics, and texts obtained from viewers' comments on video sharing websites. In order to specify the emotional content of texts, an unsupervised lexicon-based method is applied. This method does not need any human-coded corpus for training and is much faster than supervised approaches. In order to integrate these modalities, a new information fusion method is proposed based on the Dempster-Shafer theory of evidence. Experiments are conducted on the video clips of the DEAP dataset and their associated viewers' comments on YouTube. Results show that incorporating songs' lyrics with the audio-visual content has no positive effect on the retrieval performance, whereas exploiting viewers' comments significantly improves the affective retrieval system. This could be justified by the fact that viewers' affective responses depend not only on the video itself but also on its context. © 2017 IEEE.
Publication Date: 2017
Intelligent Data Analysis (1088467X)21(2)pp. 427-441
Affective video retrieval systems seek to retrieve video contents concerning their impact on viewers' emotions. These systems typically apply a multimodal approach that fuses information from different modalities to specify the affect category. The main drawback of existing information fusion methods exploited in affective video retrieval systems is that they consider all modalities equally important; hence they ignore conflicts among modalities. In order to address this drawback, a new information fusion method is proposed based on the Dempster-Shafer theory of evidence. This proposed method assigns different weights to modalities based on their correlation and their level of confidence. Experiments are run on the video clips of the DEAP dataset. Results indicate that the proposed method outperforms existing evidential information fusion methods significantly. © 2017 - IOS Press and the authors. All rights reserved.
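For readers unfamiliar with evidential fusion, the classical building blocks can be sketched as below: Shafer discounting to down-weight a modality, followed by Dempster's rule of combination. This is the textbook formulation only; the weighting scheme proposed in the paper (based on correlation and confidence) is not reproduced, and the affect labels, mass values, and weight 0.9 are illustrative assumptions.

```python
from itertools import product

def discount(mass, weight, frame):
    """Shafer discounting: keep `weight` of each mass and move the
    remainder to the full frame (total ignorance)."""
    out = {s: weight * m for s, m in mass.items() if s != frame}
    out[frame] = 1.0 - weight + weight * mass.get(frame, 0.0)
    return out

def combine(m1, m2):
    """Dempster's rule: multiply masses of intersecting focal sets and
    renormalise by the non-conflicting mass."""
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

frame = frozenset({"happy", "sad"})
audio = {frozenset({"happy"}): 0.8, frame: 0.2}    # hypothetical audio evidence
visual = {frozenset({"sad"}): 0.6, frame: 0.4}     # hypothetical visual evidence

fused = combine(audio, visual)
fused_weighted = combine(discount(audio, 0.9, frame), visual)
```

Treating all modalities as equally important corresponds to calling `combine` directly; discounting one source first shifts part of its belief to ignorance, so it contributes less to the fused result.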
Publication Date: 2016
Journal of Information Science (01655515)42(4)pp. 524-538
Affective video retrieval systems aim at finding video contents matching the desires and needs of users. Existing systems typically use the information contained in the video itself to specify its affect category. These systems either extract low-level features or build up higher-level attributes to train classification algorithms. However, using low-level features ignores global relations in data and constructing high-level features is time consuming and problem dependent. To overcome these drawbacks, an external source of information may be helpful. With the explosive growth and availability of social media, users' comments could be such a valuable source of information. In this study, a new method for incorporating social media comments with the audio-visual contents of videos is proposed. Furthermore, for the combination stage a decision-level fusion method based on the Dempster-Shafer theory of evidence is presented. Experiments are carried out on the video clips of the DEAP (Database for Emotion Analysis using Physiological signals) dataset and their associated users' comments on YouTube. Results show that the proposed system significantly outperforms the baseline method of using only the audio-visual contents for affective video retrieval. © The Author(s) 2015.
Publication Date: 2016
Computer Methods and Programs in Biomedicine (01692607)132pp. 11-20
Background and objective: Manual assessment of sperm morphology is subjective and error prone so developing automatic methods is vital for a more accurate assessment. The first step in automatic evaluation of sperm morphology is sperm head detection and segmentation. In this paper a complete framework for automatic sperm head detection and segmentation is presented. Methods: After an initial thresholding step, the histogram of the Hue channel of HSV color space is used, in addition to size criterion, to discriminate sperm heads in microscopic images. To achieve an improved segmentation of sperm heads, an edge-based active contour method is used. Also a novel tail point detection method is proposed to refine the segmentation by locating and removing the midpiece from the segmented head. An algorithm is also proposed to separate the acrosome and nucleus using morphological operations. Dice coefficient is used to evaluate the segmentation performance. The proposed methods are evaluated using a publicly available dataset. Results: The proposed method has achieved segmentation accuracy of 0.92 for sperm heads, 0.84 for acrosomes and 0.87 for nuclei, with the standard deviation of 0.05, which significantly outperforms the current state-of-the-art. Also our tail detection method achieved true detection rate of 96%. Conclusions: In this paper we presented a complete framework for sperm detection and segmentation which is totally automatic. It is shown that using active contours can improve the segmentation results of sperm heads. Our proposed algorithms for tail detection and midpiece removal further improved the segmentation results. The results indicate that our method achieved higher Dice coefficients with less dispersion compared to the existing solutions. © 2016 Elsevier Ireland Ltd.
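The Dice coefficient used for evaluation above is 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. A minimal sketch follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / total

pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_coefficient(pred_mask, true_mask)  # 2 * 1 / (2 + 1)
```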
In this paper, a new steganography algorithm that combines two different steganography methods, namely Matrix Pattern (MP) and Least Significant Bit (LSB), is presented for RGB images. Both methods use the spatial domain of images for hiding secret messages; however, they differ from each other fundamentally. The MP method first divides the "Cover-Image" into non-overlapping B×B blocks. Then, it hides the data in the 4th through 7th bit layers of the blue layer of the "Cover-Image" by generating unique t1×t2 matrix patterns for each character in each block. The LSB method hides data in the least significant bit of the "Cover-Image" pixels, which has the least visible effect on the transparency of the "Stego-Image". In the proposed algorithm, the first three bit layers and the 4th to 7th bit layers of the blue layer of the RGB "Cover-Image" are used for hiding the "Message" with the LSB and MP methods, respectively. The algorithm has two inputs for the "Message": the first, "Text Message", can only be text and is hidden with the MP method; the second, "Binary Message", can be any digital media and is hidden with the LSB method. Our simulation and evaluation results show that the capacity of this new method exceeds that of the LSB and MP methods by factors of more than 1.265 and 4.77, respectively. Our results also indicate that the final "Stego-Image" has a high PSNR. © 2016 IEEE.
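The LSB half of such a scheme can be illustrated with a minimal sketch that writes one message bit into the least significant bit of each blue-channel pixel. It assumes a single bit layer and a row-major pixel order; the Matrix Pattern method and the three-layer variant described above are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def lsb_embed(cover, bits):
    """cover: (H, W, 3) uint8 RGB image; bits: sequence of 0/1 values.
    Returns a stego image with the bits in the blue channel's LSBs."""
    stego = cover.copy()
    blue = stego[..., 2].flatten()               # copy of the blue channel
    bits = np.asarray(list(bits), dtype=np.uint8)
    if bits.size > blue.size:
        raise ValueError("message is longer than the cover capacity")
    blue[:bits.size] = (blue[:bits.size] & np.uint8(0xFE)) | bits
    stego[..., 2] = blue.reshape(cover.shape[:2])
    return stego

def lsb_extract(stego, n_bits):
    """Read back the first n_bits message bits from the blue channel."""
    return stego[..., 2].flatten()[:n_bits] & np.uint8(1)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = lsb_embed(cover, message)
recovered = lsb_extract(stego, len(message))
```

Each modified pixel value changes by at most 1, which is why LSB embedding has little visible effect on the image.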
Publication Date: 2015
Journal Of Medical Signals And Sensors (22287477)5(1)pp. 12-20
In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of premature convergence to a local minimum, which is associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is evaluated on three classification datasets with different feature vector dimensions, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer, and heart-statlog. The proposed algorithm is compared with different classifier algorithms, including k-nearest neighbor as a conventional classifier, and particle swarm-classifier, genetic algorithm, and imperialist competitive algorithm-classifier as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity, and Matthews correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization performs better than all the compared algorithms.
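The four evaluation measures named above follow directly from the entries of a binary confusion matrix. The sketch below uses the standard definitions; the function name, the zero fallback for empty denominators, and the example counts are illustrative assumptions, not results from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation
    coefficient from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0   # convention: 0 when undefined

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }

metrics = binary_metrics(tp=90, fp=10, tn=80, fn=20)
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts in both numerator and denominator, so it stays informative when the two classes are imbalanced.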
Publication Date: 2014
Multimedia Systems (14321882)20(2)pp. 215-226
In this paper, a data hiding method is proposed based on the combination of a secret sharing technique and a novel steganography method using the integer wavelet transform. In the encoding phase, a secret image is first shared into n shares using a secret sharing technique. Then, the shares and the Fletcher-16 checksums of the shares are hidden in n cover images using the proposed wavelet-based steganography method. In the decoding phase, t out of n stego images are required to recover the secret image. In this phase, t shares and their checksums are first extracted from t stego images. Then, the secret image is revealed from the t shares by Lagrange interpolation. The proposed method is stable against serious attacks, including RS and supervisory training steganalysis methods, and it has the lowest detection rate under global feature extraction classifier examination compared to the state-of-the-art techniques. Experimental results on a set of benchmarks showed that this method outperforms conventional methods in offering a highly secure and robust mechanism for joining secret image sharing and steganography. © 2013 Springer-Verlag Berlin Heidelberg.
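The (t, n) sharing, Lagrange recovery, and Fletcher-16 checksum mentioned above can be sketched for a single byte value. This is a generic Shamir-style illustration, assuming arithmetic modulo the prime 257 and Python 3.8+ for the modular inverse via `pow(x, -1, p)`; the paper's image sharing scheme and wavelet-domain embedding are not reproduced, and the function names are hypothetical.

```python
import random

PRIME = 257  # smallest prime above 255, so any byte value can be a secret

def make_shares(secret, t, n, rng=random):
    """Evaluate a random degree-(t-1) polynomial with constant term
    `secret` at x = 1..n. Share values lie in 0..256."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 from any t distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

def fletcher16(data):
    """Fletcher-16 checksum of a bytes-like object."""
    s1 = s2 = 0
    for byte in data:
        s1 = (s1 + byte) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1

shares = make_shares(200, t=3, n=5, rng=random.Random(1))
pixel = recover(shares[:3])            # any 3 of the 5 shares suffice
checksum = fletcher16(b"abcde")
```

A checksum stored alongside each share lets the decoder detect a corrupted share before interpolation, since a single wrong share would silently yield a wrong secret.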
Publication Date: 2014
Pattern Analysis and Applications (1433755X)17(1)pp. 69-81
The local binary patterns (LBP) operator is a powerful multi-resolution micro-texture descriptor, which can be applied to many image-processing applications. However, existing LBP operators cannot use the information of non-uniform patterns efficiently. This paper presents a general extension of the LBP operator to extract all uniform and non-uniform pattern types by using a suitable rotation-invariant labeling scheme. Since the proposed LBP operator can extract all micro-texture structures, we combined it with artificial neural networks (ANN) to present a new supervised technique for automatic blood vessel enhancement and detection. The thin and thick blood vessels are detected by applying a proper top-hat transform and length filtering to the enhanced blood vessels. The performance of the proposed method is evaluated on manually labeled images of the publicly available DRIVE and STARE databases and compared with several state-of-the-art approaches. The obtained results show the high accuracy of the proposed method in detecting thin and thick blood vessels. © 2011 Springer-Verlag London Limited.
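For readers unfamiliar with the terms used here, the sketch below shows the classic 8-neighbor LBP code, a common rotation-invariant label (the minimum over circular bit rotations), and the transition count that separates uniform from non-uniform patterns. It illustrates the standard operator only; the extended labeling scheme proposed in the paper is not reproduced, and all function names are assumptions.

```python
# Neighbor positions in a 3x3 patch, clockwise from the top-left corner.
OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch: bit k is set if neighbor k >= center."""
    center = patch[1][1]
    return sum(1 << k for k, (r, c) in enumerate(OFFSETS) if patch[r][c] >= center)


def rotation_invariant(code, bits=8):
    """Smallest value among all circular rotations of the bit pattern."""
    mask = (1 << bits) - 1
    return min(((code >> s) | (code << (bits - s))) & mask for s in range(bits))


def transitions(code, bits=8):
    """Number of 0/1 changes around the circle; at most 2 means 'uniform'."""
    return sum(
        ((code >> k) & 1) != ((code >> ((k + 1) % bits)) & 1) for k in range(bits)
    )


if __name__ == "__main__":
    patch = [[1, 1, 9],
             [1, 5, 1],
             [1, 1, 1]]
    code = lbp_code(patch)
    print(code, rotation_invariant(code), transitions(code))
```

Conventional rotation-invariant uniform LBP keeps a separate label only for patterns with at most two transitions and merges the rest into one bin, which is the information loss the abstract refers to.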
Publication Date: 2014
Journal Of Medical Signals And Sensors (22287477)4(1)pp. 1-9
In this paper, a novel matched filter based on a new kernel function with Cauchy distribution is introduced to improve the accuracy of automatic retinal vessel detection compared with other available matched filter-based methods, most notably the methods built on the Gaussian distribution function. Several experiments are conducted to pick the best values of the parameters for the newly designed filter, including the Cauchy function parameters as well as the matched filter parameters such as the threshold value. Moreover, the thresholding phase is enhanced with a two-step procedure. Experimental results on the DRIVE retinal image database confirm that the proposed method has higher accuracy compared with other available matched filter-based methods.
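As a rough illustration of the idea, the sketch below builds a one-dimensional, zero-mean matched-filter kernel whose cross-section follows an inverted Cauchy profile and applies it to a synthetic intensity profile containing a dark vessel. The exact kernel, parameter values, rotation over orientations, and two-step thresholding used in the paper are not given in the abstract, so the scale `gamma`, the kernel half-width, and both function names are assumptions.

```python
import math


def cauchy_profile_kernel(half_width, gamma):
    """Zero-mean 1-D kernel with an inverted Cauchy (Lorentzian) cross-section.

    Vessels appear darker than the background, so the profile is negated;
    subtracting the mean makes the response to a flat background zero.
    """
    xs = range(-half_width, half_width + 1)
    profile = [-gamma / (math.pi * (x * x + gamma * gamma)) for x in xs]
    mean = sum(profile) / len(profile)
    return [p - mean for p in profile]


def filter_response(signal, kernel):
    """Correlate `signal` with `kernel` at every position where it fully fits."""
    h = len(kernel) // 2
    return [
        sum(kernel[k] * signal[i + k - h] for k in range(len(kernel)))
        for i in range(h, len(signal) - h)
    ]


if __name__ == "__main__":
    kernel = cauchy_profile_kernel(half_width=4, gamma=1.5)
    # Synthetic profile: bright background with a dark vessel centered at index 8.
    signal = [1.0] * 7 + [0.2, 0.0, 0.2] + [1.0] * 7
    response = filter_response(signal, kernel)
    print(response.index(max(response)))  # position of the strongest response
```

The Cauchy profile has heavier tails than a Gaussian of comparable width, which is one plausible reason it could fit vessel cross-sections differently; the abstract itself reports only the accuracy gain.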
Publication Date: 2014
Journal of Information Science (01655515)40(3)pp. 313-328
Sentiment analysis is used to extract people's opinions from their online comments in order to help automated systems provide more precise recommendations. Existing sentiment analysis methods often assume that the comments of any single reviewer are independent of each other, and so they do not take advantage of significant information that may be extracted from reviewers' comment histories. Using psychological findings and the theory of negativity bias, we propose a method for exploiting reviewers' comment histories to improve sentiment analysis. Furthermore, to use more fine-grained information about the content of a review, our method predicts the overall ratings by aggregating sentence-level scores. In the proposed system, the Dempster-Shafer theory of evidence is utilized for score aggregation. The results from four large and diverse social Web datasets establish the superiority of our approach in comparison with the state-of-the-art machine learning techniques. In addition, the results show that the suggested method is robust to the size of the training dataset. © The Author(s) 2014.
Publication Date: 2014
Mathematical Problems In Engineering (1024123X)2014
Sentiment prediction techniques are often used to assign numerical scores to free-text reviews written by people on online review websites. In order to exploit the fine-grained structural information of textual content, a review may be considered as a collection of sentences, each with its own sentiment orientation and score. In this manner, a score aggregation method is needed to combine sentence-level scores into an overall review rating. While recent work has concentrated on designing effective sentence-level prediction methods, there remains the problem of finding efficient algorithms for score aggregation. In this study, we investigate different aggregation methods, as well as the cases in which they perform poorly. Based on the analysis of existing methods, we propose a new score aggregation method based on the Dempster-Shafer theory of evidence. In the proposed method, we first detect the polarity of reviews using a machine learning approach and then consider sentence scores as evidence for the overall review rating. The results from two public social web datasets show the higher performance of our method in comparison with existing score aggregation methods and state-of-the-art machine learning approaches. © 2014 Mohammad Ehsan Basiri et al.
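The core operation of the Dempster-Shafer theory mentioned above is Dempster's rule of combination, which merges two bodies of evidence and renormalizes away their conflict. The sketch below combines two sentence-level mass functions over a hypothetical two-element frame {"pos", "neg"}; how the paper maps sentence scores to masses is not stated in the abstract, so the numbers and the function name `combine` are illustrative assumptions.

```python
def combine(m1, m2):
    """Dempster's rule: combine two mass functions keyed by frozenset focal sets."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    return {focal: w / (1.0 - conflict) for focal, w in combined.items()}


if __name__ == "__main__":
    POS, NEG = frozenset({"pos"}), frozenset({"neg"})
    EITHER = POS | NEG  # mass left on the whole frame expresses ignorance
    sentence_1 = {POS: 0.6, EITHER: 0.4}  # hypothetical evidence from sentence 1
    sentence_2 = {NEG: 0.3, EITHER: 0.7}  # hypothetical evidence from sentence 2
    for focal, mass in combine(sentence_1, sentence_2).items():
        print(sorted(focal), round(mass, 4))
```

Because the rule is associative and commutative, evidence from any number of sentences can be folded in one at a time, in any order, to reach a review-level belief.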