Multimedia Systems (14321882) 31(2)
In recent years, action recognition has witnessed significant advancements. However, most existing approaches heavily depend on the availability of large amounts of video data, which can be computationally expensive and time-consuming to process, especially in real-time applications with limited computational resources. Utilizing too few frames, on the other hand, may lead to the loss of crucial information. Therefore, selecting a few frames in a way that preserves essential information poses a challenge. To address this issue, this paper proposes a novel video clip embedding technique called Hybrid Embedding. This technique combines the advantages of uniform frame sampling and tubelet embedding to enhance recognition with few frames. By employing a transformer-based architecture, the approach captures both spatial and temporal information from limited video frames. Furthermore, a keyframe extraction method is introduced to select more informative and diverse frames, which is crucial when only a few frames are available. In addition, the region of interest (ROI) in each RGB frame is cropped using skeletal data to enhance spatial attention. The study also explores the impact of the number of frames, different modalities, various transformer models, and the effect of pretraining in few-frame human action recognition. Experimental results demonstrate the effectiveness of the proposed embedding technique in few-frame action recognition. These findings contribute to addressing the challenge of action recognition with limited frames and shed light on the potential of transformers in this domain. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
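The following is a minimal PyTorch sketch of one plausible way to realize such a hybrid clip embedding: frames are sampled uniformly and then embedded with a 3D (tubelet-style) convolution, so each token mixes spatial and local temporal information. All names, layer sizes, and the sampling rate are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of a hybrid clip embedding: uniformly sampled frames are
# grouped into short tubelets and embedded with a 3D convolution, so each token
# carries both spatial and (local) temporal information.
import torch
import torch.nn as nn


class HybridEmbedding(nn.Module):
    def __init__(self, in_ch=3, embed_dim=192, tubelet_t=2, patch=16):
        super().__init__()
        # 3D conv = tubelet embedding; with tubelet_t=1 it degenerates to
        # per-frame (uniform sampling) patch embedding.
        self.proj = nn.Conv3d(in_ch, embed_dim,
                              kernel_size=(tubelet_t, patch, patch),
                              stride=(tubelet_t, patch, patch))

    def forward(self, clip):                     # clip: (B, C, T, H, W)
        x = self.proj(clip)                      # (B, D, T', H', W')
        return x.flatten(2).transpose(1, 2)      # (B, N_tokens, D)


def uniform_sample(video, num_frames=8):
    """Pick `num_frames` evenly spaced frames from a (C, T, H, W) video."""
    idx = torch.linspace(0, video.shape[1] - 1, num_frames).long()
    return video[:, idx]


video = torch.randn(3, 64, 224, 224)             # a dummy 64-frame clip
clip = uniform_sample(video).unsqueeze(0)        # (1, 3, 8, 224, 224)
tokens = HybridEmbedding()(clip)
print(tokens.shape)                              # torch.Size([1, 784, 192])
```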
Journal of Signal Processing Systems (19398115) 96(12)pp. 763-777
Deep learning models often lack robustness against adversarial attacks, which may deceive classifiers and limit their use in safety-critical applications, such as pedestrian detection. The robustness of pedestrian detection methods remains underexplored, and current defenses often prove unsuccessful due to their reliance on deep learning architectures. This paper introduces a pedestrian detection approach specifically designed to enhance robustness against adversarial attacks. Our method’s resilience stems from three key elements: First, it employs a novel hand-crafted feature extraction method that is less susceptible to minor perturbations compared to the irrelevant and vulnerable features extracted by deep learning models. Second, our non-deep model lacks gradients, thereby rendering many gradient-based adversarial attacks, such as FGSM and PGD attacks, ineffective. Third, it employs a novel ant colony optimization technique with a tailored evaluation function that selects resilient feature subsets. Extensive experiments demonstrate that our approach maintains comparable detection accuracy to state-of-the-art methods on clean data while exhibiting robustness against adversarial attacks. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
Neural Computing And Applications (09410643) 36(10)pp. 5471-5471
In this article the note of Table 3 was incorrectly given as ‘The bold font indicates the related layers caption in Fig. 7, and italic font demonstrates the minimum receptive field requirement’ and should have read ‘The italic font indicates the related layers caption in Fig. 7, and bold font demonstrates the minimum receptive field requirement’. In this article the notes for Tables 4 and 5 were incorrectly given as ‘Bold indicates the best results, and the second-best results are italic’ but should have been ‘Italic indicates the best results, and the second-best results are bold’. The original article has been corrected. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
Adversarial attacks on images, where intentional noise is added to deceive machine learning models, have emerged as both a significant security concern and a beneficial tool for enhancing privacy, protecting intellectual property, and innovating in creative industries. This paper introduces a novel information-theoretic approach to crafting effective adversarial attacks, focusing on parts of the image that contain information relevant to decision-making processes in typical deep learning models for object detection and classification. Our method, Fisher-CAM, generates class activation maps (CAM) using Fisher information to identify significant regions in input images. The adversarial noise is crafted by augmenting images based on these regions and iteratively updating perturbation patterns through gradient calculation and momentum updates. Extensive experiments on a subset of the ImageNet dataset demonstrate that our method surpasses state-of-the-art attack performance in both white-box and black-box settings. © 2024 IEEE.
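As a rough illustration of the attack loop described above, the hedged sketch below applies a standard momentum-iterative gradient attack restricted to a saliency mask; the mask stands in for the Fisher-information class activation map, whose construction is not reproduced here. The step sizes and the masking strategy are assumptions of this sketch, not the paper's exact procedure.

```python
# Illustrative masked momentum-iterative attack: perturbations are accumulated
# with momentum from the loss gradient, but applied only inside a saliency mask
# (a placeholder for the Fisher-information CAM).
import torch
import torch.nn.functional as F


def masked_momentum_attack(model, x, y, mask, eps=8 / 255, steps=10, mu=1.0):
    """x: inputs in [0, 1]; y: labels; mask: saliency mask broadcastable to x."""
    alpha = eps / steps
    g = torch.zeros_like(x)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # momentum on the L1-normalized gradient (MI-FGSM style update)
        g = mu * g + grad / (grad.abs().mean() + 1e-12)
        # take a signed step only inside the salient region, then project back
        x_adv = x_adv.detach() + alpha * g.sign() * mask
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```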
Aut Journal Of Modeling And Simulation (25882953) 56(1)pp. 69-86
Skeleton-based action recognition has attracted significant attention in the field of computer vision. In recent years, Transformer networks have improved action recognition as a result of their ability to capture long-range dependencies and relationships in sequential data. In this context, a novel approach is proposed to enhance skeleton-based activity recognition by introducing Transformer self-attention alongside Convolutional Neural Network (CNN) architectures. The proposed method capitalizes on the 3D distances between pair-wise joints, utilizing this information to generate Joint Distance Images (JDIs) for each frame. These JDIs offer a relatively view-independent representation, allowing the model to discern intricate details of human actions. To further enhance the model’s understanding of spatial features and relationships, the extracted JDIs from different frames are processed. They can be directly input into the Transformer network or first fed into a CNN, enabling the extraction of crucial spatial features. The obtained features, combined with positional embeddings, serve as input to a Transformer encoder, enabling the model to reconstruct the underlying structure of the action from the training data. Experimental results showcase the effectiveness of the proposed method, demonstrating performance comparable to other state-of-the-art transformer-based approaches on benchmark datasets such as NTU RGB+D and NTU RGB+D120. The incorporation of Transformer networks and Joint Distance Images presents a promising avenue for advancing the field of skeleton-based human action recognition, offering robust performance and improved generalization across diverse action datasets. © 2024, Amirkabir University of Technology. All rights reserved.
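A small NumPy sketch of how a Joint Distance Image could be formed for a single frame is given below: the symmetric matrix of pairwise 3D joint distances, scaled to [0, 1]. The normalization and joint ordering are assumptions of this sketch; the paper may use a different convention.

```python
import numpy as np


def joint_distance_image(joints):
    """joints: (J, 3) array of 3D joint coordinates for a single frame."""
    diff = joints[:, None, :] - joints[None, :, :]    # (J, J, 3) pairwise offsets
    jdi = np.linalg.norm(diff, axis=-1)               # (J, J) pairwise distances
    return jdi / (jdi.max() + 1e-8)                   # scale to [0, 1] (assumed)


frame = np.random.rand(25, 3)                  # e.g. 25 NTU RGB+D joints
print(joint_distance_image(frame).shape)       # (25, 25)
```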
Artificial Intelligence Review (02692821) 57(7)
Vision-based Human Action Recognition (HAR) is a hot topic in computer vision. Recently, deep-based HAR has shown promising results. HAR using a single data modality is a common approach; however, the fusion of different data sources essentially conveys complementary information and improves the results. This paper comprehensively reviews deep-based HAR methods using multiple visual data modalities. The main contribution of this paper is categorizing existing methods into four levels, which provides an in-depth and comparable analysis of approaches in various aspects. So, at the first level, proposed methods are categorized based on the employed modalities. At the second level, methods categorized in the first level are classified based on the employment of complete modalities or working with missing modalities at the test time. At the third level, complete and missing modality branches are categorized based on existing approaches. Finally, similar frameworks in the third category are grouped together. In addition, a comprehensive comparison is provided for publicly available benchmark datasets, which helps to compare and choose suitable datasets for a task or to develop new datasets. This paper also compares the performance of state-of-the-art methods on benchmark datasets. The review concludes by highlighting several future directions. © The Author(s) 2024.
Neural Computing And Applications (09410643) 36(10)pp. 5447-5469
Several state-of-the-art convolutional neural networks (CNNs)-based methods are available for image denoising tasks. CNNs are typically trained using the backpropagation algorithm, which requires all operations in the network to be differentiable. Most CNN operations satisfy this requirement and can be applied to backpropagation-based training algorithms. However, some transforms, including wavelet transform, which is useful for speeding up CNN computations as well as performing multi-resolution analysis, are not strictly differentiable. This paper addresses this challenge by proposing a wavelet-like transform that is differentiable. This new design is, in fact, a new CNN architecture named semi-wavelet, specific edge convolutional neural network (SW/SE-CNN), consisting of three newly designed layers. The first layer is a Semi-Wavelet (SW)-based layer, which is a differentiable down-sampling operator for wavelet approximation. That is, the SW layer converts the input image into four channels. Three of these channels are estimations of the vertical, horizontal, and diagonal edges of the original image; and the fourth channel is a down-sampled version of it. The second proposed layer, called Semi-Wavelet Inverse (SWI), is designed to restore the original image by using the four SW output channels. Additionally, a specific edge extractor (SE), as another new layer, is designed on the basis of the well-known Sobel operator to extract specific edges of the image. The reason behind proposing the SE layer is to provide more edge information for the network; and the motive for including the SW layer is to speed up the network as well as to enable multi-resolution analysis. Then, the new SW/SE-CNN architecture is implemented for Gaussian image denoising. The experimental results show that the new SW/SE-CNN outperformed the state-of-the-art methods for Gaussian image denoising based on the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) measurements for grayscale as well as color images. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. corrected publication 2024.
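A minimal NumPy sketch of a Haar-like decomposition in the spirit of the SW/SWI layers is shown below: the image is split into four half-resolution channels (a down-sampled approximation plus horizontal, vertical, and diagonal edge estimates) and can be restored exactly from them. The actual learned layers in the paper may differ from this fixed transform.

```python
import numpy as np


def semi_wavelet(img):
    """Haar-like 2x2 decomposition into 4 half-resolution channels."""
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    approx = (a + b + c + d) / 4.0          # down-sampled image
    horiz = (a + b - c - d) / 4.0           # horizontal-edge estimate (row diff)
    vert = (a - b + c - d) / 4.0            # vertical-edge estimate (column diff)
    diag = (a - b - c + d) / 4.0            # diagonal-edge estimate
    return np.stack([approx, horiz, vert, diag])


def semi_wavelet_inverse(ch):
    """Recover the original image from the 4 channels (exact for this sketch)."""
    approx, horiz, vert, diag = ch
    h, w = approx.shape
    img = np.empty((2 * h, 2 * w), dtype=approx.dtype)
    img[0::2, 0::2] = approx + horiz + vert + diag
    img[0::2, 1::2] = approx + horiz - vert - diag
    img[1::2, 0::2] = approx - horiz + vert - diag
    img[1::2, 1::2] = approx - horiz - vert + diag
    return img


x = np.random.rand(8, 8)
assert np.allclose(semi_wavelet_inverse(semi_wavelet(x)), x)
```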
IEEE Access (21693536) 11pp. 17555-17568
Computerized tomography (CT) scan images are widely used in automatic lung cancer detection and classification. The lung nodules' texture distribution throughout the CT scan volume can vary significantly, and accurate identification and consideration of discriminative information in this volume can greatly help the classification process. Deep stacks of recurrent and convolutional operations cannot entirely represent such variations, especially in the size and location of the nodules. To model this complex pattern of inter/intra dependencies in the CT slices of each nodule, a multi-orientation-based guided-attention module (MOGAM) is proposed in this paper, which provides high flexibility in concentrating on the relevant information extracted from different regions of the nodule in a non-local manner. Moreover, to provide the model with finer-grained discriminative information from the nodule volume, specifically-designed local texture feature descriptors (TFDs) are extracted from the nodule slices in multiple orientations. These TFDs not only represent the distribution of textural information across multiple slices of a nodule but also encode and approximate this distribution within each slice. The extended experimentation has shown the effectiveness of the non-local combination of these local TFDs through the proposed guided attention mechanism. According to the classification results obtained on the standard LIDC-IDRI dataset, the proposed approach has outperformed other counterparts in terms of accuracy and AUC evaluation metrics. Also, a detailed explainability analysis of the results is provided, demonstrating the correct functioning of the proposed attention-based fusion approach, which is required by medical experts. © 2013 IEEE.
Expert Systems with Applications (09574174) 223
Image captioning is a difficult problem in which machine learning algorithms must compress the rich content of images into descriptive language. Recurrent models are popularly used as the decoder and achieve significant performance, but they are complicated and inherently sequential over time. Recently, transformers have provided modeling of long dependencies and support parallel processing of sequences compared to recurrent models. However, recent transformer-based models assign attention weights to all candidate vectors based on the assumption that all vectors are relevant and ignore the intra-object relationships. Besides, the complex relationships between key and query vectors cannot be captured using a single attention mechanism. In this paper, a new transformer-based image captioning structure without recurrence and convolution is proposed to address these issues. To this end, a generator network and a selector network are designed to generate textual descriptions collaboratively. Our work contains three main steps: (1) Design a transformer-based generator network as word-level guidance to generate next words based on the current state. (2) Train a latent space to learn the mapping of captions and images into the same embedding space to learn the text-image relation. (3) Design a selector network as sentence-level guidance to evaluate next words by assigning fitness scores to the partial captions through the embedding space. Compared with the architecture of existing methods, the proposed approach contains an attention mechanism without time dependencies. It executes each state to select the next best word using local–global guidance. In addition, the proposed model maintains dependencies between the sequences and can be trained in parallel. Several experiments on the COCO and Flickr datasets demonstrate that the proposed approach can outperform various state-of-the-art models over well-known evaluation measures. © 2023 Elsevier Ltd
Neural Computing And Applications (09410643) 34(24)pp. 22449-22464
Finding a proper kernel for the support vector machine (SVM) and adjusting the involved parameters for a better classification remain immense challenges. This paper addresses both challenges in two parts. In part one, a new kernel, called the Frequency Component Kernel (FCK), is presented; and in the second part, a couple of techniques to form objective functions are introduced to estimate its shape parameter. In designing the FCK, a new Frequency-Based Regressor Matrix (FBRM) is designed based on data structure discovery through curve fitting. The inner product of this regressor matrix with itself produces an intermediary kernel. FCK is a smoothed version of this intermediary kernel. The FCK’s classification accuracy with a 95% confidence interval is compared to well-known kernels, namely Gaussian, Linear, Polynomial, and Sigmoid kernels, for fifteen sets of data. A grid search method is employed for parameter assignments in all kernels. This comparison shows the superiority of FCK in most cases. In part two, the first technique to form an objective function is based on variances of data groups, distances between the centers of data groups, and upper bound classification errors; and the second technique is based on distances between all data, the SVM margin, and the distance between the centers of data groups. Both techniques take advantage of the FCK development so that all data are converted to the new space via the FBRM. Then, the data distances in this space are calculated. The comparative results show that both suggested techniques to form objective functions outperform the current state-of-the-art parameter estimation methods. The inclusive results show that the combination of our FCK with our two automatic shape parameter estimation methods could be used as a superior choice in many related SVM applications. © 2022, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
Applied Intelligence (0924669X) 51(6)pp. 3581-3599
Camera pose estimation in robotic applications is paramount. Most recent algorithms based on convolutional neural networks demonstrate that they are able to predict the camera pose adequately. However, they usually suffer from computational complexity, which prevents them from running in real time. Additionally, they are not robust to perturbations such as partial occlusion when they have not been trained on such cases beforehand. To study these limitations, this paper presents a fast and robust end-to-end Siamese convolutional model for robot-camera pose estimation. Two colored frames are fed to the model at the same time, and the generic features are produced mainly based on transfer learning. The extracted features are then concatenated, from which the relative pose is directly obtained at the output. Furthermore, a new dataset is generated, which includes several videos taken in various situations for the model evaluation. The proposed technique shows a robust performance even in challenging scenes, which have not been rehearsed during the training phase. Through the experiments conducted with an eye-in-hand KUKA robotic arm, the presented network renders fairly accurate results on camera pose estimation despite scene-illumination changes. Also, the pose estimation is conducted with reasonable accuracy in the presence of partial camera occlusion. The results are enhanced by defining a new dynamic weighted loss function. The proposed method is further exploited in a visual servoing scenario. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
Imanpour, N., Naghsh nilchi, A.R., Monadjemi, A., Karshenas, H., Nasrollahi, K., Moeslund, T.B.
IET Signal Processing (17519675) 15(2)pp. 141-152
Dense connections in convolutional neural networks (CNNs), which connect each layer to every other layer, can compensate for mid/high-frequency information loss and further enhance high-frequency signals. However, dense CNNs suffer from high memory usage due to the accumulation of concatenating feature-maps stored in memory. To overcome this problem, a two-step approach is proposed that learns the representative concatenating feature-maps. Specifically, a convolutional layer with many more filters is used before concatenating layers to learn richer feature-maps. Therefore, the irrelevant and redundant feature-maps are discarded in the concatenating layers. The proposed method results in 24% and 6% less memory usage and test time, respectively, in comparison to single-image super-resolution (SISR) with the basic dense block. It also improves the peak signal-to-noise ratio by 0.24 dB. Moreover, the proposed method, while producing competitive results, decreases the number of filters in concatenating layers by at least a factor of 2 and reduces the memory consumption and test time by 40% and 12%, respectively. These results suggest that the proposed approach is a more practical method for SISR. © 2021 The Authors. IET Signal Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.
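A hedged PyTorch sketch of the idea is given below: before each concatenation, a wider convolution learns a rich set of feature-maps which a 1x1 convolution then compresses, so only a small, representative set of maps is stored and concatenated. Layer widths and the growth rate are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn


class CompactDenseLayer(nn.Module):
    def __init__(self, in_ch, wide=128, growth=16):
        super().__init__()
        # learn many candidate feature-maps first...
        self.wide = nn.Sequential(nn.Conv2d(in_ch, wide, 3, padding=1),
                                  nn.ReLU(inplace=True))
        # ...then keep only a compact, representative subset for concatenation
        self.reduce = nn.Conv2d(wide, growth, 1)

    def forward(self, x):
        out = self.reduce(self.wide(x))
        return torch.cat([x, out], dim=1)      # only `growth` new maps are stored


block = nn.Sequential(CompactDenseLayer(64), CompactDenseLayer(64 + 16))
print(block(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 96, 32, 32])
```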
Neural Computing And Applications (09410643) 32(8)pp. 4073-4091
This paper presents a video object segmentation method which jointly uses motion boundary and convolutional neural network (CNN)-based class-level maps to carry out the co-segmentation of the frames. The key characteristic of the proposed approach is a combination of those two sources of information to create initial object and background regions. These regions are employed within the co-segmentation energy function. The motion boundary map detects the areas which contain the object movement, and the CNN-based class saliency map determines the regions with more impact on acquiring the correct network classification. The proposed approach can be implemented on unconstrained natural videos which include changes in an object’s appearance, rapidly moving background, object deformation in non-rigid moving, rapid camera motion and even the existence of a static object. Experimental results on two challenging datasets (i.e., Davis and SegTrackv2 datasets) demonstrate the competitive performance of the proposed method compared with the state-of-the-art approaches. © 2019, Springer-Verlag London Ltd., part of Springer Nature.
Multimedia Tools and Applications (13807501) 78(22)pp. 31319-31345
A novel class-dependent joint weighting method is proposed to mine the key skeletal joints for human action recognition. Existing deep learning methods or those based on hand-crafted features may not adequately capture the relevant joints of different actions, which are important for recognizing the actions. In the proposed method, for each class of human actions, each joint is weighted according to its temporal variations and its inherent ability in extension or flexion. These weights can be used as prior knowledge in skeletal joints-based methods. Here, a novel human action recognition algorithm is also proposed in order to use these weights in two different ways. First, for each frame of a skeletal sequence, the histogram of 3D joints is weighted according to the contribution of joints in the corresponding class of human action. Second, a weighted motion energy function is defined to dynamically divide the temporal pyramid of actions. Experimental results on three benchmark datasets show the efficiency of the proposed weighting method, especially when occlusion occurs. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
Expert Systems with Applications (09574174) 119pp. 476-490
Influence maximization is an important issue in the social network analysis domain, which concerns finding the most influential nodes. Determining the influential nodes is made with respect to information diffusion models. Most of the existing models only contain trust relationships, while distrust exists in social networks as well. There are some drawbacks in the limited studies where distrust relationships are involved. The most outstanding drawback is the lack of assessment of the validity of the schemes presented on how influence propagates through distrust relationships, in comparison with real-world propagation in social networks. In this paper, two schemes are proposed, where based on each, some new models are proposed in two classes: cascade-based and threshold-based. All models of concern here are evaluated in comparison with the benchmark models through two real data sets, the Epinions and Bitcoin OTC. Results obtained indicate the superiority of one of the proposed schemes: when a distrusted user performs an action or adopts an opinion, the target users may tend not to do it. © 2018 Elsevier Ltd
Multidimensional Systems And Signal Processing (15730824) 30(1)pp. 175-193
A probabilistic video content analysis method called extended histogram (EH) is proposed for modelling temporal evolutions of a set of histograms extracted from video frames. In EH, the number of counts for each histogram bin is considered as a random variable (instead of a single value) to account for bin variations. This representation is especially suitable for modelling the dynamic behaviour of a tracked video content of interest in a general manner. The pitfall of such modelling is its neglect of the temporal order of observations in the collection. To overcome that problem, a hierarchical approach called hierarchical extended histogram (HEH) is proposed for extracting EHs in different levels of the temporal pyramid. Once these generative models are identified for each video, an information-based metric is proposed for defining the similarity of two EHs. Having this metric, EHs can be used in many different tasks including video retrieval, classification, summarization, and so forth. Especially in the case of discriminant learning, probabilistic kernels based on this metric are also defined to be able to use EHs/HEHs alongside machine learning models such as the SVM. Person re-identification and human action recognition are used as pilot applications to show the capabilities of the proposed representations. Experimental results show the significant effectiveness of the proposed models. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
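A minimal NumPy sketch of the extended-histogram idea follows: per-frame histograms are collected and each bin is summarized as a random variable, here simply by its mean and variance across frames. The exact distributional model and the hierarchical (HEH) pyramid in the paper may be richer than this summary.

```python
import numpy as np


def extended_histogram(frames_features, bins=16, value_range=(0.0, 1.0)):
    """frames_features: iterable of 1-D feature arrays, one per frame."""
    per_frame = np.stack([np.histogram(f, bins=bins, range=value_range)[0]
                          for f in frames_features]).astype(float)
    # each bin becomes a random variable described by its mean and variance
    return per_frame.mean(axis=0), per_frame.var(axis=0)


video = [np.random.rand(500) for _ in range(30)]   # 30 dummy frames
mu, var = extended_histogram(video)
print(mu.shape, var.shape)                          # (16,) (16,)
```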
International Journal of Knowledge-Based and Intelligent Engineering Systems (13272314) 23(3)pp. 191-201
Pedestrian detection has been a crucial issue over the last decades. Existing pedestrian detection methods still face challenges such as abrupt illumination, partial occlusion, different human poses, and cluttered backgrounds. Consequently, the significance of pedestrian detection systems encourages us to propose a new method to address some of these challenges and offer a higher accuracy rate. The power of various kinds of features differs, and a single type of feature cannot extract comprehensive information about human shape. Taking this fact into consideration, we combine pragmatic and useful features in order to detect pedestrians more accurately. Indeed, we combine the histogram of oriented gradients (HOG), a proposed modified local binary pattern (M-LBP), and proposed modified Haar-like features (M-Haar) to achieve these goals. By applying the proposed method, it is possible to extract various information on human shapes, including edge information, texture information, and local shape information. After feature extraction, a Cascade AdaBoost classifier is used to separate pedestrian images from non-pedestrian ones. In the experiments, the INRIA, Daimler, and ETH datasets are used. The extensive experimental results demonstrate that our approach outperforms the traditional methods in terms of accuracy and robustness. © 2019 - IOS Press and the authors. All rights reserved.
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization (21681163) 6(2)pp. 170-181
Purpose: Coronary Computed Tomography Angiography (CCTA) is a promising alternative for high accuracy detection of a wide range of coronary artery diseases. To achieve the anatomical and pathological features of intramuscular coronary arteries with minimal user interaction, we need an automated coronary artery centerline extraction algorithm. Method: This article presents a fully automatic coronary artery centerline tracking algorithm. First, a complex continuous wavelet transform with the Gaussian kernels is used to reduce noise effect. Then, a multiple hypothesis tracking approach is applied to segment 3-D vessel structures. Finally, the tracking procedure is completed by applying a newly presented branch searching approach based on region growing algorithm and a mathematical morphology operation. Results: The performance of the presented method is measured on the publicly available Rotterdam Coronary Artery Algorithm Evaluation Framework. The extraction ability of the algorithm computed by overlap measures averaged over 32 data-sets including overall overlap, overlap until the first error and overlap with the clinically relevant part of the vessel were OV = 85.2%, OF = 75.7% and OT = 98.5%, respectively. Also the average accuracy measurement was 0.26 mm which shows high extraction accuracy with respect to mean voxel size 0.32 × 0.32 × 0.4 mm3. Conclusion: The average coronary artery extraction time was about 8 min per data-set. The experiment results show that our newly developed algorithm achieved high efficiency in coronary artery centerlines tracking for CCTA images. © 2016 Informa UK Limited, trading as Taylor & Francis Group.
Naghsh nilchi, A.R., Haamer R.E., Kulkarni K., Imanpour, N., Haque M.A., Avots E., Breisch M., Nasrollahi, K., Escalera S., Ozcinar C., Baro X., Moeslund, T.B., Anbarjafari G.
2025 29th International Computer Conference, Computer Society of Iran, CSICC 2025 pp. 621-628
Facial dynamics can be considered as unique signatures for discrimination between people. These have become an important topic since many devices can be unlocked using face recognition or verification. In this work, we evaluate the efficacy of the transition frames of emotional video, as compared to the peak emotion frames, for identification. For experiments with transition frames, we extract features from each frame of the video from a fine-tuned VGG-Face Convolutional Neural Network (CNN) and geometric features from facial landmark points. To model the temporal context of the transition frames, we train a Long-Short Term Memory (LSTM) on the geometric and the CNN features. Furthermore, we employ two fusion strategies: first, an early fusion, in which the geometric and the CNN features are stacked and fed to the LSTM. Second, a late fusion, in which the predictions of the LSTMs, trained independently on the two features, are stacked and used with a Support Vector Machine (SVM). Experimental results show that the late fusion strategy gives the best results and the transition frames give better identification results as compared to the peak emotion frames. © 2018 IEEE.
Journal of Visual Communication and Image Representation (10473203) 55pp. 201-214
This paper addresses the co-segmentation problem using feature visualization for CNNs. Visualization is exploited as an auxiliary information to discriminate salient image regions (dubbed as “heat-regions”) from non-salient ones. Region occlusion sensitivity is proposed for feature visualization. The co-segmentation problem is formulated via a convex quadratic optimization which is initialized by the heat-regions. The information obtained through the visualization is considered as an extra energy term in the cost function. The results of the visualization demonstrate that there exist some heat-regions which are not productive in the co-segmentation. To detect helpful regions among them, an adaptive strategy in the form of an iterative algorithm is proposed according to the consistency among all images. Comparison experiments conducted on two benchmark datasets, iCoseg and MSRC, illustrate the superior performance of the proposed approach over state-of-the-art algorithms. © 2018
Applied Intelligence (0924669X) 48(12)pp. 5019-5036
This paper introduces a novel iterative approach for interactive single or multiple foreground co-segmentation using semantic information. A quadratic cost function based on a graph model is proposed. The cost function includes a ‘smoothness’ and a ‘label-information’ term. The ‘label-information’ term propagates the feature-level and contextual information. This information is updated based on the features and neighborhood patterns of all the images after each iteration. The approach can be easily implemented with a few scribbles on a few random images. The paper also proposes a model called the Neighborhood Pattern Model (NPM) for contextual information. Along with feature-level information, NPM helps to give semantic meanings to the labels (i.e., foreground(s) and background). Moreover, in the case of insufficient features (i.e., the same features for different labels), NPM can be effective in distinguishing the labels. Experimental results on two benchmark datasets, iCoseg and FlickrMFC, illustrate the better performance of the proposed approach over the current state-of-the-art co-segmentation methods. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
Journal of Information Science (01655515) 43(2)pp. 204-220
One of the important issues concerning the spreading process in social networks is influence maximization. This is the problem of identifying the set of the most influential nodes in order to begin the spreading process based on an information diffusion model in social networks. In this study, two new methods considering the community structure of the social networks and an influence-based closeness centrality measure of the nodes are presented to maximize the spread of influence under the multiplication threshold, minimum threshold and linear threshold information diffusion models. The main objective of this study is to improve the efficiency with respect to the run time while maintaining the accuracy of the final influence spread. Efficiency improvement is obtained by reducing the number of candidate nodes subject to evaluation in order to find the most influential. Experiments consist of two parts: first, the effectiveness of the proposed influence-based closeness centrality measure is established by comparing it with available centrality measures; second, evaluations are conducted to compare the two proposed community-based methods with well-known benchmarks in the literature on real datasets, with the results demonstrating the efficiency and effectiveness of these methods in maximizing the influence spread in social networks. © Chartered Institute of Library and Information Professionals.
Computers in Biology and Medicine (00104825) 91pp. 181-190
Background and objective: To diagnose infertility in men, semen analysis is conducted in which sperm morphology is one of the factors that are evaluated. Since manual assessment of sperm morphology is time-consuming and subjective, automatic classification methods are being developed. Automatic classification of sperm heads is a complicated task due to the intra-class differences and inter-class similarities of class objects. In this research, a Dictionary Learning (DL) technique is utilized to construct a dictionary of sperm head shapes. This dictionary is used to classify the sperm heads into four different classes. Methods: Square patches are extracted from the sperm head images. Columnized patches from each class of sperm are used to learn class-specific dictionaries. The patches from a test image are reconstructed using each class-specific dictionary and the overall reconstruction error for each class is used to select the best matching class. Average accuracy, precision, recall, and F-score are used to evaluate the classification method. The method is evaluated using two publicly available datasets of human sperm head shapes. Results: The proposed DL-based method achieved an average accuracy of 92.2% on the HuSHeM dataset, and an average recall of 62% on the SCIAN-MorphoSpermGS dataset. The results show a significant improvement compared to a previously published shape-feature-based method. We have achieved high-performance results. In addition, our proposed approach offers a more balanced classifier in which all four classes are recognized with high precision and recall. Conclusions: In this paper, we use a Dictionary Learning approach in classifying human sperm heads. It is shown that the Dictionary Learning method is far more effective in classifying human sperm heads than classifiers using shape-based features. Also, a dataset of human sperm head shapes is introduced to facilitate future research. © 2017 Elsevier Ltd
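A hedged scikit-learn sketch of the per-class dictionary-learning classifier described above follows: one dictionary is learned per class from (columnized) patches, a test patch is sparsely coded with each dictionary, and the class with the smallest reconstruction error wins. Patch extraction, atom counts, and sparsity levels are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning


def train_dictionaries(patches_per_class, n_atoms=32):
    dicts = {}
    for label, patches in patches_per_class.items():        # patches: (n, d)
        dl = MiniBatchDictionaryLearning(n_components=n_atoms,
                                         transform_algorithm='omp',
                                         transform_n_nonzero_coefs=5,
                                         random_state=0)
        dicts[label] = dl.fit(patches)
    return dicts


def classify(patch, dicts):
    errors = {}
    for label, dl in dicts.items():
        code = dl.transform(patch[None, :])                  # sparse code (1, k)
        recon = code @ dl.components_                        # reconstruction (1, d)
        errors[label] = float(np.linalg.norm(patch - recon[0]))
    return min(errors, key=errors.get)                       # smallest error wins


rng = np.random.default_rng(0)
data = {c: rng.normal(loc=i, size=(200, 64)) for i, c in enumerate("ABCD")}
dicts = train_dictionaries(data)
print(classify(rng.normal(loc=2, size=64), dicts))           # most likely 'C'
```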
Intelligent Data Analysis (1088467X) 21(2)pp. 427-441
Affective video retrieval systems seek to retrieve video contents concerning their impact on viewers' emotions. These systems typically apply a multimodal approach that fuses information from different modalities to specify the affect category. The main drawback of existing information fusion methods exploited in affective video retrieval systems is that they consider all modalities equally important; hence they ignore conflicts among modalities. In order to address this drawback, a new information fusion method is proposed based on the Dempster-Shafer theory of evidence. This proposed method assigns different weights to modalities based on their correlation and their level of confidence. Experiments are run on the video clips of DEAP dataset. Results indicate that the proposed method outperforms existing evidential information fusion methods significantly. © 2017 - IOS Press and the authors. All rights reserved.
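A compact sketch of Dempster's rule of combination, which underlies this kind of evidential fusion, is shown below; the paper's correlation- and confidence-based weights are represented here only by a simple discounting factor, which is an assumption of this sketch.

```python
def discount(mass, alpha, frame):
    """Shafer discounting: scale masses by alpha, move the rest to full ignorance."""
    m = {k: alpha * v for k, v in mass.items()}
    m[frame] = m.get(frame, 0.0) + (1.0 - alpha)
    return m


def combine(m1, m2):
    """Dempster's rule of combination over focal elements given as frozensets."""
    joint, conflict = {}, 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            inter = a & b
            if inter:
                joint[inter] = joint.get(inter, 0.0) + va * vb
            else:
                conflict += va * vb                 # mass assigned to the empty set
    return {k: v / (1.0 - conflict) for k, v in joint.items()}


frame = frozenset({'positive', 'negative', 'neutral'})        # frame of discernment
audio = {frozenset({'positive'}): 0.6, frozenset({'negative'}): 0.3, frame: 0.1}
video = {frozenset({'positive'}): 0.5, frame: 0.5}
fused = combine(discount(audio, 0.9, frame), discount(video, 0.7, frame))
print(max(fused, key=fused.get))                              # frozenset({'positive'})
```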
Developing techniques to retrieve video contents with regard to their impact on viewers' emotions is the main goal of affective video retrieval systems. Existing systems mainly apply a multimodal approach that fuses information from different modalities to specify the affect category. In this paper, the effect of exploiting two types of textual information to enrich the audio-visual content of music videos is evaluated: subtitles or songs' lyrics, and texts obtained from viewers' comments on video sharing websites. In order to specify the emotional content of texts, an unsupervised lexicon-based method is applied. This method does not need any human-coded corpus for training and is much faster than a supervised approach. In order to integrate these modalities, a new information fusion method is proposed based on the Dempster-Shafer theory of evidence. Experiments are conducted on the video clips of the DEAP dataset and their associated viewers' comments on YouTube. Results show that incorporating songs' lyrics with the audio-visual content has no positive effect on the retrieval performance, whereas exploiting viewers' comments significantly improves the affective retrieval system. This could be justified by the fact that viewers' affective responses depend not only on the video itself but also on its context. © 2017 IEEE.
IET Computer Vision (17519640) 11(8)pp. 683-690
Although there is an increasing interest in employing the depth data in computer vision applications, the spatial resolution of depth maps is still limited compared with typical visible-light images. A novel method is proposed to synthetically improve the spatial resolution of a single depth image. It integrates the higher-order terms into the Markov random field (MRF) formulation of example-based methods in order to improve the representational power of those methods. The inference is performed by approximately minimising the higher-order multi-label MRF energies. In addition, to improve the efficiency of the inference algorithm, a hierarchical scheme on the number of MRF states is proposed. First, a large number of states are used to obtain an initial labelling by solving the minimisation problem of inference for only the first-order energies. Then, the problem is solved for the higher-order energies in a smaller number of states. Performance comparisons show that proposed method improves the results of first-order approaches that are based on simple four-connected MRF graph structure, both qualitatively and quantitatively. © The Institution of Engineering and Technology 2017.
In this paper, a new steganography algorithm that combines two different steganography methods, namely Matrix Pattern (MP) and Least Significant Bit (LSB), is presented for RGB images. These two methods use the spatial domain of images for hiding secret messages; however, they differ from each other fundamentally. The MP method is an algorithm which, firstly, divides the "Cover-Image" into non-overlapping B×B blocks. Then, it hides the data in the 4th through 7th bit layers of the blue layer of the "Cover-Image", by generating unique t1×t2 matrix patterns for each character in each block. The LSB method is an algorithm that hides data in the least significant bit of the "Cover-Image" pixels, which has the least visible effect on the transparency of the "Stego-Image". In the proposed algorithm, the first three bit layers, and the 4th to 7th bit layers of the blue layer of the RGB "Cover-Image", are used for hiding the "Message", with the LSB and MP methods, respectively. This algorithm has two entrances for the "Message"; one of them can be only text, the "Text Message", which is hidden with the MP method. The other one, the "Binary Message", can be any digital media, and is hidden with the LSB method. Our simulation and evaluation results show that this new method has a better capacity than the LSB and MP methods, by more than 1.265 and 4.77 times, respectively. Our results also indicate that the final "Stego-Image" has a high PSNR quality. © 2016 IEEE.
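A minimal NumPy sketch of the LSB half of the scheme is given below: message bits replace the least significant bit of the cover pixels, and extraction simply reads those bits back. The MP (matrix-pattern) half and the exact split across bit layers follow the paper and are not reproduced here.

```python
import numpy as np


def lsb_embed(cover, message_bits):
    """Replace the LSB of the first len(message_bits) pixels with the message."""
    flat = cover.flatten()                                  # copy of the cover
    n = len(message_bits)
    flat[:n] = (flat[:n] & 0xFE) | message_bits             # clear LSB, set bit
    return flat.reshape(cover.shape)


def lsb_extract(stego, n_bits):
    return stego.flatten()[:n_bits] & 1


cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # e.g. the blue layer
bits = np.random.randint(0, 2, 100, dtype=np.uint8)
stego = lsb_embed(cover, bits)
assert np.array_equal(lsb_extract(stego, 100), bits)
```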
Computer Methods and Programs in Biomedicine (01692607) 132pp. 11-20
Background and objective: Manual assessment of sperm morphology is subjective and error prone so developing automatic methods is vital for a more accurate assessment. The first step in automatic evaluation of sperm morphology is sperm head detection and segmentation. In this paper a complete framework for automatic sperm head detection and segmentation is presented. Methods: After an initial thresholding step, the histogram of the Hue channel of HSV color space is used, in addition to size criterion, to discriminate sperm heads in microscopic images. To achieve an improved segmentation of sperm heads, an edge-based active contour method is used. Also a novel tail point detection method is proposed to refine the segmentation by locating and removing the midpiece from the segmented head. An algorithm is also proposed to separate the acrosome and nucleus using morphological operations. Dice coefficient is used to evaluate the segmentation performance. The proposed methods are evaluated using a publicly available dataset. Results: The proposed method has achieved segmentation accuracy of 0.92 for sperm heads, 0.84 for acrosomes and 0.87 for nuclei, with the standard deviation of 0.05, which significantly outperforms the current state-of-the-art. Also our tail detection method achieved true detection rate of 96%. Conclusions: In this paper we presented a complete framework for sperm detection and segmentation which is totally automatic. It is shown that using active contours can improve the segmentation results of sperm heads. Our proposed algorithms for tail detection and midpiece removal further improved the segmentation results. The results indicate that our method achieved higher Dice coefficients with less dispersion compared to the existing solutions. © 2016 Elsevier Ireland Ltd.
Journal of Information Science (01655515) 42(4)pp. 524-538
Affective video retrieval systems aim at finding video contents matching the desires and needs of users. Existing systems typically use the information contained in the video itself to specify its affect category. These systems either extract low-level features or build up higher-level attributes to train classification algorithms. However, using low-level features ignores global relations in data and constructing high-level features is time consuming and problem dependent. To overcome these drawbacks, an external source of information may be helpful. With the explosive growth and availability of social media, users' comments could be such a valuable source of information. In this study, a new method for incorporating social media comments with the audio-visual contents of videos is proposed. Furthermore, for the combination stage a decision-level fusion method based on the Dempster-Shafer theory of evidence is presented. Experiments are carried out on the video clips of the DEAP (Database for Emotion Analysis using Physiological signals) dataset and their associated users' comments on YouTube. Results show that the proposed system significantly outperforms the baseline method of using only the audio-visual contents for affective video retrieval. © The Author(s) 2015.
Intelligent Data Analysis (1088467X) 20(1)pp. 199-218
Influence maximization in a social network involves identifying an initial subset of nodes with a pre-defined size in order to begin the information diffusion with the objective of maximizing the influenced nodes. In this study, a sign-aware cascade (SC) model is proposed for modeling the effect of both trust and distrust relationships on activation of nodes with positive or negative opinions towards a product in signed social networks. It is proved that positive influence maximization is NP-hard in the SC model and the influence function is neither monotone nor submodular. For solving this NP-hard problem, a particle swarm optimization (PSO) method is presented which applies the random keys representation technique to convert the continuous search space of the PSO to the discrete search space of this problem. To improve the performance of this PSO method against premature convergence, a re-initialization mechanism for a portion of particles with poorer fitness values and a heuristic mutation operator for the global best particle are proposed. Experiments establish the effectiveness of the SC in modeling the real-world cascades. In addition, the PSO method is compared with the well-known algorithms in the literature on two real-world data sets. The evaluation results demonstrate that the proposed method outperforms the compared algorithms significantly in the SC model. © 2016 - IOS Press and the authors. All rights reserved.
Journal Of Medical Signals And Sensors (22287477) 5(1)pp. 12-20
In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence into a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, the mutation-based classifier particle swarm optimization, it is evaluated on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, and particle swarm-classifier, genetic algorithm, and Imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms.
Three-dimensional modeling of organs plays a crucial role in the treatment of cancer and vascular diseases. The purpose of this work is 3D modeling of breast vessels using only two uncalibrated two-dimensional mammography images, in order to reduce the patient's exposure to X-ray radiation. In the proposed method, we first optimize the internal and external parameters using a nonlinear optimization framework. To this end, we use the data stored in the header of files and key features in the mammography images. Using the optimized parameters, a 3D active contour method is proposed for 3D modeling of the vessels. Then, using the parameters obtained from the previous step, an initial active curve gradually evolves until the energy of the active curve is minimized. The surface reconstruction of the vessels is done by employing methods that convert a set of surface points to a lattice surface. The proposed method is applied to a set of mammography images. Assuming optimized parameters are achieved, the method can yield a promising 3D reconstruction. © 2014 IEEE.
Multimedia Systems (14321882) 20(2)pp. 215-226
In this paper, a data hiding method is proposed based on the combination of a secret sharing technique and a novel steganography method using the integer wavelet transform. In the encoding phase of this method, a secret image is first shared into n shares, using a secret sharing technique. Then, the shares and the Fletcher-16 checksum of the shares are hidden into n cover images using the proposed wavelet-based steganography method. In the decoding phase, t out of n stego images are required to recover the secret image. In this phase, first t shares and their checksums are extracted from t stego images. Then, by using Lagrange interpolation the secret image is revealed from the t shares. The proposed method is stable against serious attacks, including RS and supervisory training steganalysis methods, and it has the lowest detection rate under global feature extraction classifier examination compared to the state-of-the-art techniques. Experimental results on a set of benchmarks showed that this method outperforms conventional methods in offering a highly secure and robust mechanism for joining secret image sharing and steganography. © 2013 Springer-Verlag Berlin Heidelberg.
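As a small illustration, the Fletcher-16 checksum that the scheme stores alongside each share can be computed as below, so a corrupted stego image can be detected before reconstruction; the dummy share is only for demonstration.

```python
def fletcher16(data: bytes) -> int:
    """Standard Fletcher-16 checksum over a byte string."""
    sum1, sum2 = 0, 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


share = bytes(range(32))          # a dummy share extracted from a stego image
print(hex(fletcher16(share)))     # compare against the embedded checksum
```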
Journal Of Medical Signals And Sensors (22287477) 4(1)pp. 1-9
In this paper, a novel matched filter based on a new kernel function with the Cauchy distribution is introduced to improve the accuracy of automatic retinal vessel detection compared with other available matched filter-based methods, most notably, the methods built on the Gaussian distribution function. Several experiments are conducted to pick the best values of the parameters for the newly designed filter, including both the Cauchy function parameters as well as the matched filter parameters such as the threshold value. Moreover, the thresholding phase is enhanced with a two-step procedure. Experimental results on the DRIVE retinal image database confirm that the proposed method has higher accuracy compared with other available matched filter-based methods.
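A hedged NumPy/SciPy sketch of a matched filter whose cross-section follows a Cauchy (Lorentzian) profile instead of a Gaussian is shown below: the kernel is made zero-mean, rotated over several orientations, and the maximum response per pixel is kept. Parameter values are illustrative, not the tuned values from the paper.

```python
import numpy as np
from scipy import ndimage


def cauchy_kernel(gamma=1.5, half_width=6, length=9):
    x = np.arange(-half_width, half_width + 1, dtype=float)
    profile = -1.0 / (1.0 + (x / gamma) ** 2)      # dark vessel on bright background
    kernel = np.tile(profile, (length, 1))          # extend along the vessel direction
    return kernel - kernel.mean()                   # zero-mean matched filter


def matched_filter_response(image, n_angles=12, **kw):
    base = cauchy_kernel(**kw)
    responses = [ndimage.correlate(image,
                                   ndimage.rotate(base, angle, reshape=True, order=1),
                                   mode='nearest')
                 for angle in np.arange(0, 180, 180 / n_angles)]
    return np.max(responses, axis=0)                # best orientation per pixel


img = np.random.rand(64, 64)                        # stand-in for a retinal image
print(matched_filter_response(img).shape)           # (64, 64); threshold afterwards
```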
Journal of Information Science (01655515) 40(3)pp. 313-328
Sentiment analysis is used to extract people's opinions from their online comments in order to help automated systems provide more precise recommendations. Existing sentiment analysis methods often assume that the comments of any single reviewer are independent of each other and so they do not take advantage of significant information that may be extracted from reviewers' comment histories. Using psychological findings and the theory of negativity bias, we propose a method for exploiting reviewers' comment histories to improve sentiment analysis. Furthermore, to use more fine-grained information about the content of a review, our method predicts the overall ratings by aggregating sentence-level scores. In the proposed system, the Dempster-Shafer theory of evidence is utilized for score aggregation. The results from four large and diverse social Web datasets establish the superiority of our approach in comparison with the state-of-the-art machine learning techniques. In addition, the results show that the suggested method is robust to the size of the training dataset. © The Author(s) 2014.
Pattern Analysis and Applications (1433755X) 17(1)pp. 69-81
The local binary patterns (LBP) operator is a powerful multi-resolution micro-texture descriptor, which can be applied to many image-processing applications. However, existing LBP operators cannot use the information of non-uniform patterns efficiently. This paper presents a general extension of LBP operator to extract all uniform and non-uniform pattern types by using suitable rotation-invariant labeling scheme. Since the proposed LBP operator can extract all micro-texture structures, we combined it with artificial neural networks (ANN) to present a new supervised technique for automatic blood vessel enhancement and detection. The thin and thick blood vessels are detected by applying proper top-hat transform and length filtering on the enhanced blood vessels. The performance of the proposed method is evaluated on manually labeled images of the publicly available DRIVE and STARE databases and compared with several state-of-the-art approaches. The obtained results show the high accuracy of the proposed method on detecting thin and thick blood vessels. © 2011 Springer-Verlag London Limited.
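Below is a minimal NumPy sketch of the underlying idea of rotation-invariant LBP labeling on a 3x3 neighbourhood: every 8-bit pattern is mapped to the smallest value among its circular rotations, so all rotated versions of a micro-texture (uniform or non-uniform) share one label. The paper's operator is multi-resolution and uses its own labeling scheme; this sketch only illustrates the principle.

```python
import numpy as np


def rotation_invariant_label(pattern):
    """Map an 8-bit LBP code to its smallest circular rotation."""
    return min(((pattern >> i) | (pattern << (8 - i))) & 0xFF for i in range(8))


def lbp_image(img):
    # offsets of the 8 neighbours, in circular order around the center pixel
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offs):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neighbour >= center).astype(np.uint8) << bit
    # look-up table that collapses all rotations of a pattern to one label
    lut = np.array([rotation_invariant_label(p) for p in range(256)], dtype=np.uint8)
    return lut[out]


img = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print(np.unique(lbp_image(img)).size, "distinct rotation-invariant patterns")
```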
Mathematical Problems In Engineering (1024123X) 2014
Sentiment prediction techniques are often used to assign numerical scores to free-text format reviews written by people in online review websites. In order to exploit the fine-grained structural information of textual content, a review may be considered as a collection of sentences, each with its own sentiment orientation and score. In this manner, a score aggregation method is needed to combine sentence-level scores into an overall review rating. While recent work has concentrated on designing effective sentence-level prediction methods, there remains the problem of finding efficient algorithms for score aggregation. In this study, we investigate different aggregation methods, as well as the cases in which they perform poorly. According to the analysis of existing methods, we propose a new score aggregation method based on the Dempster-Shafer theory of evidence. In the proposed method, we first detect the polarity of reviews using a machine learning approach and then, consider sentence scores as evidence for the overall review rating. The results from two public social web datasets show the higher performance of our method in comparison with existing score aggregation methods and state-of-the-art machine learning approaches. © 2014 Mohammad Ehsan Basiri et al.
Image processing software, like all software, needs to be both verified and validated. Synthetic images are very useful during the medical software development process to verify the accuracy of algorithms. In this paper, we introduce the process of generating synthetic 2D medical X-ray images in addition to ground truth imaging parameters. First, a 3D model of an organ (e.g., vessels) is made in a 3D-modeling software. Then, this volume model is voxelized based on the specified resolution in order to create a 3D CT image of that organ by assigning a proper Hounsfield unit to each voxel. The obtained 3D CT image volume is used as the input to a DRR program. Geometry parameters such as internal and external parameters are adjusted to take some images from different views. We demonstrate this process with three examples to confirm its usage in the validation of medical image processing applications. © 2014 IEEE.
Computers in Biology and Medicine (00104825) 43(5)pp. 587-593
Automatic measurement and quantification of blood vessels' features and detection of vessel landmarks are key steps in the computer-aided diagnosis and diseases monitoring. This work proposes a novel and robust method for detecting vessel landmarks, i.e. bifurcation and crossovers, and measurement of different features, i.e. vessel orientation and vessel diameter as well as bifurcation angle, from the detected vessel network using simple and efficient local vessel pattern operator. The proposed method is applied to the publicly available DRIVE, STARE and ARIA databases and compared with existing state-of-the-art approaches. It shows higher accuracy in detection of vessel landmark and estimation of vessel features. © 2013 Elsevier Ltd.
Biomedical Signal Processing and Control (17468108) 8(1)pp. 71-80
Automatic detection of retinal blood vessels and measurement of vessel diameter are important steps in computer-aided diagnosis in ophthalmology. Here, we present a new multi-scale vessel enhancement method based on the complex continuous wavelet transform (CCWT). The parameters of CCWT are optimized to represent line structures in all directions and separate them from simple edges. The final vessel network is obtained by applying an adaptive histogram-based thresholding process along with a proper length filtering method. An efficient circular structure operator is employed on the centerline of vessels to estimate their diameters. The performance of the proposed method is measured on the publicly available DRIVE and STARE databases and compared with several state-of-the-art methods as well as the second observer. The proposed method shows much higher accuracy (95%) and sensitivity (79%) in the same range of specificity (97%). Its predictive value is higher than 72.9%. The vessel diameter estimation process also shows a lower root mean square error compared to the existing methods and the second observer. © 2012 Elsevier Ltd.
International Journal of Innovative Computing, Information and Control (13494198) 9(3)pp. 939-953
To improve transparency, as an important parameter in watermarking, and maintain robustness, a new quantization scheme for the coefficients of the third- and fourth-level sub-bands of the Discrete Wavelet Transform (DWT) is proposed. In this method, all coefficients of the four-level Haar DWT sub-bands of a host image (HL4, LH4, HL3 and LH3) are divided into different non-overlapping blocks. Then, each block is divided into some sets consisting of several wavelet coefficients as their members. Depending on whether a zero or one needs to be embedded, one or all sets are selected. By quantizing the first and second largest coefficient values in each selected set, a watermark bit is embedded. In the decoding stage, the lowest difference between the first and second largest coefficient values in each block is compared with an empirical threshold, in order to estimate the watermark bit. In comparison with other methods, the implementation results show that the proposed method significantly improves transparency, while enhancing the robustness for most attacks. Moreover, the proposed method establishes a trade-off between transparency and robustness by tuning a threshold value in the decoding stage. © 2013 ICIC International.
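A much-simplified, hedged sketch of the embedding idea follows: within a block of a DWT detail sub-band, the gap between the largest and second-largest coefficient magnitudes is widened to embed a '1' and collapsed to embed a '0'; decoding compares that gap with a threshold. The paper's scheme uses several sub-bands, multiple sets per block, and tuned thresholds; those details are omitted here, and the parameter values are assumptions.

```python
import numpy as np
import pywt


def embed_bit(block, bit, delta=8.0):
    flat = block.flatten()
    order = np.argsort(np.abs(flat))
    i1, i2 = order[-1], order[-2]              # largest / second-largest magnitude
    sign = np.sign(flat[i1]) or 1.0
    if bit:                                    # enforce a large magnitude gap
        flat[i1] = sign * (np.abs(flat[i2]) + delta)
    else:                                      # collapse the gap
        flat[i1] = sign * np.abs(flat[i2])
    return flat.reshape(block.shape)


def decode_bit(block, threshold=4.0):
    mags = np.sort(np.abs(block.flatten()))
    return int(mags[-1] - mags[-2] > threshold)


img = np.random.rand(64, 64) * 255
coeffs = pywt.wavedec2(img, 'haar', level=4)
cH4 = coeffs[1][0].copy()                      # one level-4 detail sub-band
cH4[:2, :2] = embed_bit(cH4[:2, :2], bit=1)
print(decode_bit(cH4[:2, :2]))                 # -> 1
```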
Neural Computing And Applications (09410643) 22(SUPPL.1)pp. 163-174
Automatic extraction of blood vessels is an important step in computer-aided diagnosis in ophthalmology. The blood vessels have different widths, orientations, and structures. Therefore, extracting a proper feature vector is a critical step, especially in classifier-based vessel segmentation methods. In this paper, a new multi-scale rotation-invariant local binary pattern operator is employed to extract an efficient feature vector for the different types of vessels in retinal images. To estimate the vesselness value of each pixel, the obtained multi-scale feature vector is applied to an adaptive neuro-fuzzy inference system. Then, by applying a proper top-hat transform, thresholding, and length filtering, the thick and thin vessels are highlighted separately. The performance of the proposed method is measured on the publicly available DRIVE and STARE databases. The average accuracy of 0.942, along with a true positive rate (TPR) of 0.752 and a false positive rate (FPR) of 0.041, is very close to the manual segmentation rates obtained by the second observer. The proposed method is also compared with several state-of-the-art methods and shows a higher average TPR in the same range of FPR and accuracy. © 2012 Springer-Verlag London Limited.
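A minimal sketch of the feature-extraction stage, assuming scikit-image's rotation-invariant ('ror') LBP as a stand-in for the paper's operator: per-pixel codes at several radii are stacked into a multi-scale feature vector that a neuro-fuzzy or other pixel classifier could consume.

import numpy as np
from skimage.feature import local_binary_pattern

def multiscale_lbp_features(green_channel, scales=((8, 1), (16, 2), (24, 3))):
    # One rotation-invariant LBP map per (neighbours P, radius R) pair.
    maps = [local_binary_pattern(green_channel, P, R, method='ror')
            for P, R in scales]
    return np.stack(maps, axis=-1)     # shape: (H, W, n_scales), one vector per pixel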
Iranian Conference on Machine Vision and Image Processing, MVIP (21666776) pp. 381-386
Segmentation of moving objects in a video sequence is a primary step in many computer vision tasks. However, shadows extracted along with the objects can cause large errors in object localization and recognition. We propose a novel method of moving shadow detection using wavelets and the watershed segmentation algorithm, which can effectively separate the cast shadows of moving objects in a scene obtained from a video sequence. The wavelet transform is used to de-noise and enhance the edges of the foreground image and to obtain an enhanced version of the gradient image. Then, the watershed transform is applied to the gradient image to segment the different parts of the object, including shadows. Finally, a post-processing step marks segmented parts whose chromaticity is close to the background reference as shadows. Experimental results on two datasets demonstrate the efficiency and robustness of the proposed approach. © 2013 IEEE.
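The watershed stage alone can be sketched as follows, assuming a denoised grayscale foreground image; the marker selection by intensity quantiles is an illustrative choice, not the paper's procedure.

import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def watershed_segments(foreground_gray, marker_quantiles=(0.05, 0.95)):
    gradient = sobel(foreground_gray)              # gradient image to be flooded
    lo, hi = np.quantile(foreground_gray, marker_quantiles)
    markers = np.zeros(foreground_gray.shape, dtype=np.int32)
    markers[foreground_gray < lo] = 1              # dark seeds (candidate shadow/background)
    markers[foreground_gray > hi] = 2              # bright seeds (object)
    return watershed(gradient, markers)            # labelled regions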
International Journal of Innovative Computing, Information and Control (13494198) 8(2)pp. 1205-1220
In this paper, digital image watermarking based on parameter amelioration of the parametric slant-Hadamard transform using a genetic algorithm is presented. In the watermarking procedure, the image is divided into separate blocks and the parametric slant-Hadamard transform is applied to each block individually. Then, the watermark is embedded in the transform domain and the inverse transform is carried out. The main advantage of selecting the parametric slant-Hadamard transform is the availability of transform parameters that can be used to ameliorate fidelity and robustness. In general, the fidelity and robustness properties of watermarking schemes are in conflict. Here, a genetic algorithm is introduced to ameliorate the transform parameters so that both fidelity and robustness improve simultaneously. Additionally, to increase the security of our algorithm, different sets of parameters are sought for each block individually. Experimental results show that the introduced watermarking scheme produces remarkably high fidelity with a highly robust watermark against various attacks. © 2012 ICIC International.
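A generic genetic-algorithm loop for tuning per-block transform parameters might look like the sketch below; the fitness function combining fidelity (e.g., PSNR) and robustness terms is a placeholder the reader must supply, and the parametric slant-Hadamard transform itself is not implemented here.

import numpy as np

rng = np.random.default_rng(0)

def genetic_search(fitness, n_params, pop_size=30, generations=50,
                   mutation_rate=0.1, low=0.0, high=1.0):
    pop = rng.uniform(low, high, size=(pop_size, n_params))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][:pop_size // 2]]   # keep the fittest half
        # Uniform crossover between randomly paired parents.
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size, n_params)) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        # Gaussian mutation on a small fraction of genes.
        mutate = rng.random(children.shape) < mutation_rate
        children[mutate] += rng.normal(0, 0.05, size=mutate.sum())
        pop = np.clip(children, low, high)
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]

# Hypothetical usage: fitness = weighted sum of fidelity and robustness terms.
# best = genetic_search(lambda p: 0.5 * psnr_term(p) + 0.5 * robustness_term(p), n_params=8)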
IEEE Transactions on Image Processing (10577149) 21(9)pp. 3981-3990
This paper proposes a statistically optimum adaptive wavelet packet (WP) thresholding function for image denoising based on the generalized Gaussian distribution. It applies a computationally efficient multilevel WP decomposition to noisy images to obtain the best tree, or optimal wavelet basis, utilizing Shannon entropy. It selects an adaptive threshold value which is level- and subband-dependent, based on analyzing the statistical parameters of the subband coefficients. In the utilized thresholding function, which is based on a maximum a posteriori estimate, the modified version of the dominant coefficients is estimated by optimal linear interpolation between each coefficient and the mean value of the corresponding subband. Experimental results on several test images under different noise intensity conditions show that the proposed algorithm, called OLI-Shrink, yields a better peak signal-to-noise ratio and superior visual image quality, measured by the universal image quality index, compared to standard denoising methods, especially in the presence of high noise intensity. It also outperforms some of the best state-of-the-art wavelet-based denoising techniques. © 2012 IEEE.
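A simplified sketch of subband-adaptive wavelet-packet thresholding is shown below; it uses plain BayesShrink-style soft thresholding in place of the paper's MAP-based optimal linear interpolation, and the wavelet, decomposition level, and noise estimator are assumptions.

import numpy as np
import pywt

def wp_denoise(noisy, wavelet='db4', level=2):
    wp = pywt.WaveletPacket2D(data=noisy.astype(np.float64),
                              wavelet=wavelet, maxlevel=level)
    # Robust MAD noise estimate from the deepest diagonal subband.
    sigma = np.median(np.abs(wp['d' * level].data)) / 0.6745
    for node in wp.get_level(level):
        if node.path == 'a' * level:
            continue                               # keep the approximation subband
        coeffs = node.data
        sigma_x = np.sqrt(max(coeffs.var() - sigma ** 2, 1e-12))
        thr = sigma ** 2 / sigma_x                 # BayesShrink-style subband threshold
        node.data = pywt.threshold(coeffs, thr, mode='soft')
    return wp.reconstruct(update=False)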
International Journal for Numerical Methods in Biomedical Engineering (20407947) 28(8)pp. 838-860
The numerical solution of D-bar integral equations is the key to the inverse scattering solution of many complex problems in science and engineering, including conductivity imaging. Recently, two methodologies have been considered for the numerical solution of the D-bar integral equation, namely product integrals and multigrid. The first involves high computational complexity, and the other suffers from a low convergence rate. In this paper, a new and efficient sinc-convolution algorithm is introduced to solve the two-dimensional D-bar integral equation, overcoming both of these disadvantages and resolving the singularity problem not effectively tackled before. The method of sinc-convolution is based on using collocation to replace multidimensional convolution-form integrals, including the two-dimensional D-bar integral equation, by a system of algebraic equations. Separation of variables in the proposed method allows eliminating the formulation of huge full matrices and therefore reduces the computational complexity drastically. In addition, the sinc-convolution method converges exponentially with a convergence rate of O(e^(-cN)). Simulation results on solving a test electrical impedance tomography problem confirm the efficiency of the proposed sinc-convolution-based algorithm. © 2012 John Wiley & Sons, Ltd.
Computers in Biology and Medicine (00104825) 42(7)pp. 743-750
With the ever increasing use of medical ultrasound (US) images, a challenge exists in dealing with the storage and transmission of these images while still maintaining high diagnostic quality. In this article, a context-based method called contextual vector quantization (CVQ) is proposed to overcome this challenge. In this method, a contextual region is defined as the region containing the most important information, which must be encoded without considerable quality loss. This region is encoded with a high-priority, high-resolution (low compression ratio and high bit rate) CVQ algorithm, and the background, which has a lower priority, is separately encoded with a low-resolution (high compression ratio and low bit rate) version of the CVQ algorithm. Finally, the encoded contextual and background regions are merged to reconstruct the output image. As a result, very good diagnostic image quality with a lower image size and enhanced performance parameters, including mean square error (MSE), peak signal-to-noise ratio (PSNR) and coefficient of correlation (CoC), is achieved. The experimental results show that the proposed CVQ methodology is superior to other existing methods (general methods such as JPEG and JPEG2K, and ROI-based methods such as EBCOT and CSPIHT) in terms of the measured performance parameters. This makes the CVQ compression method a feasible technique for overcoming storage and transmission limitations. © 2012 Elsevier Ltd.
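The two-resolution idea can be illustrated with k-means vector quantization, coding ROI blocks with a large codebook and background blocks with a small one; the block size and codebook sizes below are illustrative, and the actual CVQ codebook design is not reproduced.

import numpy as np
from scipy.cluster.vq import kmeans2

def blockify(img, bs=4):
    # Split the image into flattened bs x bs blocks (rows of a matrix).
    h, w = img.shape
    return (img[:h - h % bs, :w - w % bs]
            .reshape(h // bs, bs, w // bs, bs)
            .transpose(0, 2, 1, 3)
            .reshape(-1, bs * bs)
            .astype(np.float64))

def contextual_vq(image, roi_mask, bs=4, k_roi=256, k_bg=16):
    blocks = blockify(image, bs)
    roi_blocks = blockify(roi_mask.astype(np.float64), bs).mean(axis=1) > 0.5
    out = np.empty_like(blocks)
    for sel, k in ((roi_blocks, k_roi), (~roi_blocks, k_bg)):
        if sel.any():
            k_eff = int(min(k, sel.sum()))
            codebook, labels = kmeans2(blocks[sel], k_eff, minit='++')
            out[sel] = codebook[labels]            # reconstruct blocks from codewords
    return out   # quantized blocks; reshape back to an image as needed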
Pattern Recognition Letters (01678655) 33(9)pp. 1093-1100
The local binary pattern (LBP) operator is a very effective multi-resolution texture descriptor that can be applied in many image processing applications. However, existing LBP operators cannot use the information of non-uniform patterns efficiently and are also sensitive to noise. This paper proposes a noise-tolerant extension of the LBP operator to extract statistical and structural image features for efficient texture analysis. The proposed LBP operator uses a circular majority voting filter and a suitable rotation-invariant labeling scheme to obtain more regular uniform and non-uniform patterns that have better discrimination ability and more robustness against noise. Experimental results on the Brodatz, CUReT and MeasTex databases show the improved performance of the proposed LBP operator, especially when a large number of neighbors is used for extracting texture patterns. © 2012 Elsevier B.V. All rights reserved.
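A minimal sketch of the idea follows: the eight neighbour bits are smoothed with a circular three-bit majority vote before a rotation-invariant label is formed as the minimum over circular shifts; the nearest-neighbour 8-sampling here is an assumption, not the paper's exact operator.

import numpy as np

OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def majority_vote_lbp(img):
    img = img.astype(np.float64)
    h, w = img.shape
    centre = img[1:h - 1, 1:w - 1]
    # Binary pattern: neighbour >= centre, for the 8 circular neighbours.
    bits = np.stack([(img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] >= centre)
                     for dy, dx in OFFSETS], axis=-1).astype(np.uint8)
    # Circular majority vote over each bit and its two circular neighbours.
    voted = (np.roll(bits, 1, axis=-1) + bits + np.roll(bits, -1, axis=-1)) >= 2
    # Rotation-invariant label: minimum code over all circular shifts.
    weights = 1 << np.arange(8)
    codes = np.stack([np.sum(np.roll(voted, s, axis=-1) * weights, axis=-1)
                      for s in range(8)], axis=-1)
    return codes.min(axis=-1)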
BioMedical Engineering Online (1475925X) 11
Background: Electrical Impedance Tomography (EIT) is used as a fast clinical imaging technique for monitoring the health of human organs such as the lungs, heart, brain and breast. Every practical EIT reconstruction algorithm should be sufficiently efficient in terms of convergence rate and accuracy. The main objective of this study is to investigate the feasibility of precise empirical conductivity imaging using a sinc-convolution algorithm in the D-bar framework. Methods: In the first step, synthetic and experimental data were used to compute an intermediate object named the scattering transform. Next, this object was used in a two-dimensional integral equation, which was precisely and rapidly solved via the sinc-convolution algorithm to find the square root of the conductivity for each pixel of the image. For the purpose of comparison, the multigrid and NOSER algorithms were implemented under a similar setting. The quality of reconstructions of synthetic models was tested against GREIT-approved quality measures. To validate the simulation results, reconstructions of a phantom chest and a human lung were used. Results: Evaluation of the synthetic reconstructions shows that the quality of sinc-convolution reconstructions is considerably better than that of each of its competitors in terms of amplitude response, position error, ringing, resolution and shape deformation. In addition, the results confirm near-exponential and linear convergence rates for sinc-convolution and multigrid, respectively. Moreover, the lowest relative errors and the highest degree of truth were found in sinc-convolution reconstructions from the experimental phantom data. Reconstructions of clinical lung data show that the related physiological effect is well recovered by the sinc-convolution algorithm. Conclusions: Parametric evaluation demonstrates the efficiency of sinc-convolution in reconstructing accurate conductivity images from experimental data. Excellent results in phantom and clinical reconstructions using sinc-convolution support the parametric assessment results and suggest that sinc-convolution be used for precise clinical EIT applications. © 2012 Abbasi and Naghsh-Nilchi; licensee BioMed Central Ltd.
Scientific Research and Essays (19922248) 6(10)pp. 2119-2128
Digital image watermarking is one of the most important techniques for copyright protection. Robustness and imperceptibility are basic but contradictory requirements of digital image watermarking, and the key factor that affects both is the watermark strength. This paper presents a new method to determine the watermark strength using Reinforcement Learning (RL) in the Discrete Cosine Transform (DCT) domain; finding the watermark strength is thus formulated as an RL problem. In our study, the defined reinforcement function has two contradictory aspects: a positive term related to the similarity between the host and watermarked images, and a negative term related to the robustness of the watermark. Therefore, a novel adaptive methodology is introduced to estimate the watermark strength so as to ameliorate both imperceptibility and robustness at the same time. The experimental results show that the proposed RL algorithm for watermark strength estimation simultaneously improves the robustness and imperceptibility of the watermarking scheme. © 2011 Academic Journals.
Journal of Circuits, Systems and Computers (17936454) 20(5)pp. 801-819
In this paper, an adaptive digital image watermarking technique using a fuzzy gradient in the DCT domain is presented. In our approach, the image is divided into separate blocks and the DCT is applied to each block individually. Then, the watermark is inserted in the transform domain and the inverse transform is carried out. We increase the robustness of the watermark by increasing the watermark strength; however, this reduces the fidelity of the watermarking scheme, because fidelity and robustness in watermarking are generally in conflict with each other. To improve the fidelity, a new fuzzy-based method is introduced in which a fuzzy gradient-based mask is generated from the host image. Then, as a post-processing stage, the generated mask is combined with the watermarked image. Experimental results show that the proposed technique has high fidelity as well as high robustness against a variety of attacks. © 2011 World Scientific Publishing Company.
Journal of Circuits, Systems and Computers (17936454) 19(2)pp. 451-477
In this paper, a novel watermarking technique based on the parametric slant-Hadamard transform is presented. Our approach embeds a pseudo-random sequence of real numbers in a selected set of the parametric slant-Hadamard transform coefficients. By exploiting statistical properties of the embedded sequence, the mark can be reliably extracted without resorting to the original uncorrupted image. The presented method increases the flexibility of the watermarking scheme, as changes in the parameter set help improve fidelity and robustness against a number of attacks. Experimental results show that the proposed technique is secure and indeed highly robust to these attacks. © 2010 World Scientific Publishing Company.
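Because the parametric slant-Hadamard transform is not available in common libraries, the sketch below uses a 2-D DCT as a stand-in to illustrate blind spread-spectrum embedding with correlation-based detection; the keyed coefficient selection and embedding strength are assumptions.

import numpy as np
from scipy.fft import dctn, idctn

def _key_stream(shape, key, n_coeffs):
    rng = np.random.default_rng(key)
    total = shape[0] * shape[1]
    # Crude mid-band proxy: skip the lowest-index (low-frequency) coefficients.
    positions = rng.choice(np.arange(total // 16, total // 2),
                           size=n_coeffs, replace=False)
    sequence = rng.standard_normal(n_coeffs)       # pseudo-random real-valued mark
    return positions, sequence

def embed(image, key=0, n_coeffs=2000, alpha=2.0):
    C = dctn(image.astype(np.float64), norm='ortho')
    pos, w = _key_stream(image.shape, key, n_coeffs)
    C.ravel()[pos] += alpha * w                    # additive spread-spectrum embedding
    return idctn(C, norm='ortho')

def detect(image, key=0, n_coeffs=2000):
    C = dctn(image.astype(np.float64), norm='ortho')
    pos, w = _key_stream(image.shape, key, n_coeffs)
    # Blind detection: correlation of received coefficients with the keyed sequence.
    return float(np.corrcoef(C.ravel()[pos], w)[0, 1])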
Biomedical Signal Processing and Control (17468108) 5(2)pp. 147-157
In this paper, a new approach based on eigen-system pseudo-spectral estimation methods, namely Eigenvector (EV) and MUSIC, and a Multilayer Perceptron (MLP) neural network is introduced. In this approach, the calculated EEG (electroencephalogram) spectrum is divided into smaller frequency sub-bands. Then, a set of features, {maximum, entropy, average, standard deviation, mobility}, is extracted from these sub-bands. Next, a feature vector is formed by combining a set of EEG time-domain features, {standard deviation, complexity measure}, with the spectral feature set. The feature vector is then fed into an MLP neural network to classify the signal into one of the following three states: normal (healthy), epileptic patient signal in a seizure-free interval (inter-ictal), and epileptic patient signal in a full seizure interval (ictal). The experimental results show that classification of the EEG signals may be achieved with approximately 97.5% accuracy and a variance of 0.095% using an available public EEG signal database. The results are among the best reported for classifying the three aforementioned states. The method is fast and accurate, with a low misclassification rate, which makes practical, real-time detection of this chronic disease feasible. © 2010 Elsevier Ltd. All rights reserved.
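A hedged sketch of the feature-vector stage follows, using a Welch periodogram in place of the EV/MUSIC pseudospectrum; the sub-band edges and sampling rate are illustrative assumptions, while Hjorth mobility and complexity supply the time-domain measures.

import numpy as np
from scipy.signal import welch

def hjorth(x):
    # Hjorth mobility and complexity of a 1-D signal.
    dx, ddx = np.diff(x), np.diff(np.diff(x))
    mobility = np.sqrt(dx.var() / x.var())
    complexity = np.sqrt(ddx.var() / dx.var()) / mobility
    return mobility, complexity

def eeg_features(x, fs=173.61,   # illustrative sampling rate
                 bands=((0.5, 4), (4, 8), (8, 13), (13, 30), (30, 45))):
    f, pxx = welch(x, fs=fs, nperseg=min(len(x), 1024))
    feats = []
    for lo, hi in bands:
        p = pxx[(f >= lo) & (f < hi)]
        p_norm = p / p.sum()
        feats += [p.max(),                                   # maximum
                  -(p_norm * np.log(p_norm + 1e-12)).sum(),  # spectral entropy
                  p.mean(), p.std()]                         # average, standard deviation
    mobility, complexity = hjorth(x)
    feats += [x.std(), complexity, mobility]                 # time-domain features
    return np.array(feats)    # this vector would feed an MLP classifier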
European Signal Processing Conference (22195491) pp. 2377-2381
A new approach based on the root-MUSIC frequency estimation method and a Multilayer Perceptron neural network is introduced. In this method, a feature vector is formed using power frequency, entropy, and standard deviation, as well as the complexity of the time-domain electroencephalography (EEG) signal. The power frequency values are estimated using the root-MUSIC algorithm. The resulting feature vector is then classified into three categories, namely healthy, inter-ictal (epileptic during a seizure-free interval), and ictal (full epileptic condition during a seizure interval), using a Multilayer Perceptron Neural Network (MLPNN). The experimental results show that EEG state classification may be achieved with approximately 94.53% accuracy and a variance of 0.063% when applying the method to an available public database. The method is fast and accurate, with a low misclassification rate. © EURASIP, 2009.
International Journal of Imaging Systems and Technology (10981098) 19(3)pp. 179-186
In recent years, active contour models (ACM) have been considered powerful tools for image segmentation and object tracking in computer vision and image processing applications. This article presents a new tracking method based on parametric active contour models. In the proposed method, a new pressure energy called "texture pressure energy" is added to the energy function of the parametric active contour model to detect and track a textured target object against a textured background. In this scheme, the texture features of the contour are calculated by a moment-based method. Then, by comparing these features with the texture features of the target object, the contour curve is expanded or contracted to adapt to the object boundaries. Experimental results show that the proposed method is more efficient and accurate in tracking objects than traditional methods when both the object and the background are textured. © 2009 Wiley Periodicals, Inc.
Block matching has been widely used for block motion estimation; however, most block matching algorithms impose a heavy computational load on the system and require much time for execution, which prevents their use in time-critical applications. In this paper, a new approach to block matching is presented, which has low computational complexity as well as high accuracy. The main assumption of the algorithm is that all pixels of a block move equally with a linear motion. Experimental results show the feasibility and effectiveness of the proposed algorithm. © 2008 IEEE.
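Under the stated linear-motion assumption, a block's motion can be predicted from its previous vector and refined with a small SAD search around that prediction; the sketch below illustrates this predict-and-refine idea and is not the paper's exact algorithm.

import numpy as np

def sad(a, b):
    # Sum of absolute differences between two equally sized blocks.
    return np.abs(a.astype(np.int32) - b.astype(np.int32)).sum()

def match_block(prev_frame, curr_frame, top, left, bs=16, predicted=(0, 0), radius=2):
    h, w = prev_frame.shape
    block = curr_frame[top:top + bs, left:left + bs]
    best, best_mv = None, (0, 0)
    # Small search window centred on the motion vector predicted by linear motion.
    for dy in range(predicted[0] - radius, predicted[0] + radius + 1):
        for dx in range(predicted[1] - radius, predicted[1] + radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= h - bs and 0 <= x <= w - bs:
                cost = sad(block, prev_frame[y:y + bs, x:x + bs])
                if best is None or cost < best:
                    best, best_mv = cost, (dy, dx)
    return best_mv, best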
In this paper, a new robust digital image watermarking algorithm based on a joint DWT-DCT transformation is proposed, providing imperceptibility as well as higher robustness against common signal processing attacks. A binary watermark image is embedded in certain sub-bands of a 3-level DWT transform of the host image. Then, the DCT of each selected DWT sub-band is computed, and the PN-sequences of the watermark bits are embedded in the coefficients of the corresponding DCT middle frequencies. In the extraction stage, the watermarked image, which may have been attacked, is first preprocessed by sharpening and Laplacian of Gaussian filters. Then, the same approach as in the embedding process is used to extract the DCT middle frequencies of each sub-band. Finally, the correlation between the mid-band coefficients and the PN-sequences is calculated to determine the watermark bits. Experimental results show that the proposed method improves the performance of watermarking algorithms based on joint DWT-DCT transformation. © 2008 IEEE.
This paper presents a new robust digital image watermarking technique based on the Discrete Cosine Transform (DCT) and a neural network, specifically a Full Counterpropagation Neural Network (FCNN). The FCNN is used to simulate the perceptual and visual characteristics of the original image. The perceptual features of the original image are used to determine the highest changeable threshold values of the DCT coefficients, which in turn are used to embed the watermark in the DCT coefficients of the original image. The watermark is a binary image, and its pixel values are inserted as zeros and ones in the DCT coefficients of the image. The implementation results show that this watermarking algorithm has acceptable robustness against different kinds of watermarking attacks. © 2008 IEEE.
Eurasip Journal on Advances in Signal Processing (16876172) 2008
An electrocardiogram (ECG) beat classification scheme based on the multiple signal classification (MUSIC) algorithm, morphological descriptors, and neural networks is proposed for discriminating nine ECG beat types. These are normal, fusion of ventricular and normal, fusion of paced and normal, left bundle branch block, right bundle branch block, premature ventricular contraction, atrial premature contraction, paced beat, and ventricular flutter. ECG signal samples from the MIT-BIH arrhythmia database are used to evaluate the scheme. The MUSIC algorithm is used to calculate the pseudospectrum of the ECG signals. The low-frequency samples are picked because they carry the most valuable heartbeat information. These samples, along with two morphological descriptors that capture the characteristics of all parts of the heartbeat, form the input feature vector. This vector is used for training a classifier neural network designed with nine outputs corresponding to the nine beat types. Two neural network schemes, namely a multilayered perceptron (MLP) neural network and a probabilistic neural network (PNN), are employed. The experiments achieved a promising accuracy of 99.03% for classifying the beat types using the MLP neural network. In addition, our scheme recognizes the NORMAL class with 100% accuracy and never misclassifies any other class as NORMAL.
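The classification stage alone can be sketched with a scikit-learn MLP over precomputed feature vectors; the feature matrix X (pseudospectrum samples plus morphological descriptors) and the nine-class labels y are assumed to be already extracted.

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def train_beat_classifier(X, y):
    # Hold out a stratified test set so every beat type appears in both splits.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=0)
    clf = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(40,), max_iter=500,
                                      random_state=0))
    clf.fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)   # nine-class beat accuracy on held-out data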
The reliable execution of a mobile agent is a very important design issue in building a mobile agent system, and many fault-tolerant schemes have been proposed so far. Security is a major problem of mobile agent systems, especially when money transactions are concerned. Security for the partners involved is handled by encryption methods based on a public key authentication mechanism and by secret key encryption of the communication. In this paper, we qualitatively examine the security considerations and challenges in application development with the mobile code paradigm. We identify a simple but crucial security requirement for the general acceptance of the mobile code paradigm and evaluate the current status of mobile code development in meeting this requirement. We find that the mobile agent approach is the most interesting and challenging branch of mobile code in the security context. Therefore, we built a simple agent-based information retrieval application, the Traveling Information Agent system, and discuss the security issues of this system in particular. ©2008 IEEE.
Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers,Series A/Chung-kuo Kung Ch'eng Hsuch K'an (02533839) 31(4)pp. 649-657
In recent years, Active Contour Models (ACMs) have become powerful tools for object detection and image segmentation in computer vision and image processing applications. This paper presents a new energy function in parametric active contour models for object detection and image segmentation. In the proposed method, a new pressure energy called "texture pressure energy" is added to the energy function of the parametric active contour model to detect and segment a textured object against a textured background. In this scheme, the texture features of the contour are calculated by a moment-based method. Then, by comparing these features with the texture features of the object, the contour curve is expanded or contracted in order to adapt to the object boundaries. Experimental results show that the proposed method segments more efficiently and accurately than the traditional method when both the object and the background have texture properties. © 2008, Taylor & Francis Group, LLC.
Electronic Transactions on Numerical Analysis (10689613) 23pp. 251-262
A new and efficient sinc-convolution algorithm is introduced for the numerical solution of the radiosity equation. This equation has many applications, including the production of photorealistic images. The method of sinc-convolution is based on using collocation to replace multi-dimensional convolution-type integrals, such as the two-dimensional radiosity integral equation, by a system of algebraic equations. The developed algorithm solves for the illumination of a surface or a set of surfaces when both the reflectivity and emissivity of those surfaces are known. It separates the radiosity equation's variables to approximate its solution. The separation of variables allows the elimination of the formulation of huge full matrices and therefore reduces the required storage as well as the computational complexity compared with classical approaches. Also, the highly singular nature of the kernel, which causes great difficulties for classical numerical methods, poses no difficulty for sinc-convolution. In addition, the new algorithm can be readily adapted for parallel computation for even faster computational speed. The results show that the developed algorithm clearly reveals the color bleeding phenomenon, a natural phenomenon not revealed by many other methods. These advantages should make real-time photorealistic image production feasible. Copyright © 2006, Kent State University.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (15206149) 5pp. 309-312
This paper introduces an efficient algorithm that jointly estimates the differential time delays and frequency offsets between two signals. The approach is a two-step procedure. First, the differential frequency offsets are estimated from measurements of the autocorrelation functions of the received and transmitted signals. The time delays are then estimated from estimates of the higher-order statistics of the two signals involved. The major advantage of the approach is its remarkably reduced computational complexity compared with traditional approaches. The experimental results indicate that the algorithm performs better than traditional methods in most cases of interest, in spite of its reduced computational complexity. © 1992 IEEE.
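A simpler second-order sketch of the same two-step idea is given below: the frequency offset is read from the phase slope of the received signal multiplied by the conjugate of the transmitted signal (assuming the signals are coarsely aligned), and the delay from the cross-correlation peak; this is a stand-in for the paper's autocorrelation and higher-order-statistics estimators.

import numpy as np
from scipy.signal import correlate

def estimate_offsets(tx, rx, fs):
    # Frequency offset: slope of the unwrapped phase of rx * conj(tx), in Hz.
    phase = np.unwrap(np.angle(rx * np.conj(tx)))
    f_offset = np.polyfit(np.arange(len(phase)) / fs, phase, 1)[0] / (2 * np.pi)
    # Time delay: location of the cross-correlation maximum, in seconds.
    xcorr = correlate(rx, tx, mode='full')
    delay_samples = np.argmax(np.abs(xcorr)) - (len(tx) - 1)
    return f_offset, delay_samples / fs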