Expert Systems with Applications (ISSN 0957-4174), vol. 235
Domain adaptation provides the possibility of utilizing the knowledge gained in an auxiliary domain to accomplish a task in another, related domain. In this paper, a common feature representation of these domains is learned through a newly proposed approach that reduces inter-domain differences more precisely, resulting in higher accuracy for unsupervised domain adaptation. To decrease these divergences, the proposed method finds a subspace that reduces the data distribution discrepancy between domains. To achieve this goal, the divergence between the class-conditional probability distributions is minimized. By considering class-conditional distributions, more discriminative information in the data is preserved than when only marginal distributions are used. In addition, an explicit parametric distribution of the source and target domains is considered to reduce the discrepancy between the data of the two domains, which results in higher accuracy compared with other relevant domain adaptation methods. Experimental studies on benchmark image classification tasks confirmed our assumptions and showed a significant improvement of the proposed method over other state-of-the-art methods. © 2023 Elsevier Ltd
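To make the subspace idea concrete, the following minimal NumPy sketch projects out the directions along which the class-conditional means of the two domains differ most, using labels (or pseudo-labels) for both domains. It is a toy illustration of matching class-conditional statistics in a learned subspace, not the paper's parametric-distribution formulation; all data and names are placeholders.

```python
import numpy as np

def conditional_alignment_subspace(x_src, y_src, x_tgt, y_tgt, n_drop=1):
    """Project out the directions along which the class-conditional means of the
    source and target domains differ most, so the remaining subspace has a
    smaller class-conditional distribution discrepancy."""
    gaps = [x_src[y_src == c].mean(0) - x_tgt[y_tgt == c].mean(0)
            for c in np.unique(y_src)]
    _, _, vt = np.linalg.svd(np.stack(gaps), full_matrices=True)
    return vt[n_drop:].T                      # (dim, dim - n_drop) projection matrix

rng = np.random.default_rng(0)
x_src = rng.standard_normal((100, 6))
y_src = rng.integers(0, 2, 100)
x_tgt = x_src + np.array([2.0, 0, 0, 0, 0, 0])   # domain shift along one direction
w = conditional_alignment_subspace(x_src, y_src, x_tgt, y_src)
gap_before = np.linalg.norm(x_src.mean(0) - x_tgt.mean(0))
gap_after = np.linalg.norm((x_src @ w).mean(0) - (x_tgt @ w).mean(0))
print(round(gap_before, 3), round(gap_after, 3))  # discrepancy shrinks after projection
```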
Journal of Information Science (ISSN 0165-5515)
Recent developments in the information ecosystem and changes in knowledge organisations have resulted in a growing tendency towards a new generation of libraries. This study intends to reflect the characteristics, necessities and challenges of smart libraries using the documentary research method. In total, 78 research articles from the top 17 databases were reviewed. A total of 128 concepts were identified across different aspects, such as technology (n = 53), services (n = 36), people (n = 19), management (n = 7), space and place (n = 9), governance (n = 2), and moral and legal matters (n = 2). The characteristics, necessities, reasons, challenges and obstacles of smart libraries are multidimensional, complex and varied. Smart libraries employ various technologies to facilitate the interaction between people and resources and between people and libraries, while also enabling intelligent administration. This work assists researchers, designers and librarians in developing and improving smart libraries. © The Author(s) 2024.
IEEE Access (ISSN 2169-3536), vol. 11, pp. 17555-17568
Computerized tomography (CT) scan images are widely used in automatic lung cancer detection and classification. The texture distribution of lung nodules throughout the CT scan volume can vary significantly, and accurate identification and consideration of the discriminative information in this volume can greatly help the classification process. Deep stacks of recurrent and convolutional operations cannot entirely represent such variations, especially in the size and location of the nodules. To model this complex pattern of inter/intra dependencies in the CT slices of each nodule, a multi-orientation-based guided-attention module (MOGAM) is proposed in this paper, which provides high flexibility in concentrating on the relevant information extracted from different regions of the nodule in a non-local manner. Moreover, to provide the model with finer-grained discriminative information from the nodule volume, specifically designed local texture feature descriptors (TFDs) are extracted from the nodule slices in multiple orientations. These TFDs not only represent the distribution of textural information across multiple slices of a nodule but also encode and approximate this distribution within each slice. Extended experimentation has shown the effectiveness of the non-local combination of these local TFDs through the proposed guided-attention mechanism. According to the classification results obtained on the standard LIDC-IDRI dataset, the proposed approach has outperformed its counterparts in terms of the accuracy and AUC evaluation metrics. A detailed explainability analysis of the results, of the kind required by medical experts, is also provided, demonstrating the correct functioning of the proposed attention-based fusion approach. © 2013 IEEE.
Knowledge-Based Systems (ISSN 0950-7051), vol. 282
Recent progress in deep learning has led to the successful use of encoder–decoder frameworks, inspired by machine translation, in image captioning models. The stacking of layers in encoders and decoders has made it possible to use several modules in each. However, stacked models have used only one type of module in the encoder or decoder. In this research, we propose a parallel encoder–decoder framework that aims to take advantage of multiple types of modules in encoders and decoders simultaneously. This framework contains augmented parallel blocks, which include stacked or non-stacked modules. The results of the blocks are then integrated to extract higher-level semantic concepts. This general idea is not limited to image captioning and can be customized for many applications that utilize encoder–decoder frameworks. We evaluated our proposed method on the MS-COCO dataset and achieved state-of-the-art results, obtaining a CIDEr-D score of 149.92 and outperforming state-of-the-art image captioning models. © 2023 Elsevier B.V.
Tabealhojeh, H., Adibi, P., Karshenas, H., Roy, S.K., Harandi, M.
Pattern Recognition (ISSN 0031-3203), vol. 140
Meta-learning is the core capability that enables intelligent systems to rapidly generalize their prior experience to learn new tasks. In general, optimization-based methods formalize meta-learning as a bi-level optimization problem, that is, a nested optimization framework in which the meta-parameters are optimized (or learned) at the outer level, while the inner level optimizes the task-specific parameters. In this paper, we introduce RMAML, a meta-learning method that enforces orthogonality constraints on the bi-level optimization problem. We develop a geometry-aware framework that generalizes the bi-level optimization problem to the Riemannian (constrained) setting. Using Riemannian operations such as orthogonal projection, retraction and parallel transport, the bi-level optimization is reformulated so that it respects the Riemannian geometry. Moreover, we observe that more stable optimization and improved generalization can be achieved when the parameters and meta-parameters of the method are modeled on a Stiefel manifold. We empirically show that RMAML reaches competitive performance against several state-of-the-art algorithms for few-shot classification and consistently outperforms its Euclidean counterpart, MAML. For example, by using the geometry of the Stiefel manifold to structure the fully connected layers in a deep neural network, a 7% increase in single-domain few-shot classification accuracy is achieved. For cross-domain few-shot learning, RMAML outperforms MAML by up to 9% in accuracy. Our ablation study also demonstrates the effectiveness of RMAML over MAML in terms of higher accuracy with a reduced number of tasks and/or inner-level updates. © 2023 Elsevier Ltd
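For reference, the Riemannian operations mentioned above can be illustrated with a small NumPy sketch of one gradient step on the Stiefel manifold, using the standard tangent-space projection and a QR-based retraction. This is a generic textbook construction, not RMAML itself; the matrices and step size are placeholders.

```python
import numpy as np

def sym(a):
    """Symmetric part of a square matrix."""
    return 0.5 * (a + a.T)

def stiefel_project(x, g):
    """Project a Euclidean gradient g onto the tangent space of the
    Stiefel manifold {x : x.T @ x = I} at the point x."""
    return g - x @ sym(x.T @ g)

def stiefel_retract(x, xi):
    """QR-based retraction: map a tangent vector xi back onto the manifold."""
    q, r = np.linalg.qr(x + xi)
    # Fix column signs so the retraction is well defined (diag(r) > 0).
    return q * np.sign(np.sign(np.diag(r)) + 0.5)

# One Riemannian gradient step for a parameter matrix w constrained to the manifold.
rng = np.random.default_rng(0)
w, _ = np.linalg.qr(rng.standard_normal((8, 3)))   # random feasible point
euclid_grad = rng.standard_normal((8, 3))          # stand-in for a task-loss gradient
step = 0.1
w_new = stiefel_retract(w, -step * stiefel_project(w, euclid_grad))
print(np.allclose(w_new.T @ w_new, np.eye(3), atol=1e-8))  # stays on the manifold
```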
Computers and Industrial Engineering (ISSN 0360-8352), vol. 172
Electric vehicles (EVs) are receiving increasing attention as a way to address global warming, since they replace fossil fuels with fuel cell technology. Hence, new challenges arise as the demand for EVs increases. One of these challenges is the long time EVs spend waiting in charging queues, especially during peak hours. In this study, we therefore propose an efficient method for the electric vehicle charging scheduling problem (EVCSP), inspired by an actual charging station. The most important constraint in this problem is balancing power consumption between charging lines, which limits the number of devices that can be charged simultaneously. Also, in this problem, EVs may have interrelationships with each other during the scheduling procedure. Therefore, the estimation of distribution algorithm (EDA), a method competent at handling possible relations among decision variables, is applied in our proposed hybrid EDA-based solving method. Our proposed method comprises two EDAs, a Markov network-based EDA and a Mallows model-based EDA. It achieves an appropriate schedule and charging line assignment simultaneously while minimizing the total tardiness under the problem constraints. We compared our method with a constraint programming (CP) model and state-of-the-art meta-heuristic methods in terms of the objective function value by simulation on a benchmark dataset. Results from the experimental study show significant improvement in solving the introduced EVCSPs. © 2022
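As a rough illustration of the EDA idea behind such a scheduler, the toy sketch below evolves a job-to-position probability model for a single charger and samples charging orders from it to minimize total tardiness. It uses a simple positional model on synthetic data rather than the paper's Markov-network and Mallows-model EDAs, and it ignores the charging-line power constraints.

```python
import numpy as np

def total_tardiness(perm, proc, due):
    """Total tardiness of jobs charged in the given order on a single charger."""
    t, tard = 0.0, 0.0
    for j in perm:
        t += proc[j]
        tard += max(0.0, t - due[j])
    return tard

def sample_perm(prob, rng):
    """Sample a permutation position by position from a job-to-position model."""
    n = prob.shape[0]
    remaining, perm = list(range(n)), []
    for pos in range(n):
        p = prob[remaining, pos]
        p = p / p.sum()
        perm.append(remaining.pop(rng.choice(len(remaining), p=p)))
    return perm

rng = np.random.default_rng(1)
n, pop_size, n_sel = 8, 60, 15
proc = rng.uniform(1, 5, n)               # toy charging durations
due = rng.uniform(5, 20, n)               # toy due times
prob = np.full((n, n), 1.0 / n)           # probability of job j at position p
for _ in range(50):
    pop = sorted((sample_perm(prob, rng) for _ in range(pop_size)),
                 key=lambda s: total_tardiness(s, proc, due))
    counts = np.full((n, n), 1e-3)        # Laplace smoothing
    for perm in pop[:n_sel]:
        for pos, job in enumerate(perm):
            counts[job, pos] += 1.0
    prob = counts / counts.sum(axis=0, keepdims=True)
print(round(total_tardiness(pop[0], proc, due), 3))
```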
Iranian Journal of Information Processing Management (ISSN 2251-8231), vol. 38, no. 4, pp. 1367-1393
Keyword extraction is one of the most important issues in text processing and analysis, providing a high-level and accurate summary of the text. Therefore, choosing the right method to extract keywords from text is important. The aim of the present study was to compare the performance of three approaches in discovering and extracting the subject keywords of e-books using text mining and machine learning techniques. In this regard, three experimental approaches were introduced and compared: (1) successive implementation of the clustering process, improving the semantic quality of clusters and enriching the stop words of a specific field; (2) use of a specialized keyword template; and (3) use of important parts of the text in discovering and extracting keywords and important topics. The statistical population includes 1000 e-book titles from the subject fields of library and information science based on the Library of Congress classification system. Bibliographic information of the e-books was obtained from the Library of Congress database, and the original texts were then prepared. The extraction of topic keywords and clustering of the training data were performed using the non-negative matrix factorization algorithm with the three experimental approaches. The quality and performance of the resulting subject clusters in the automatic classification of test data were compared using a support vector machine. The findings showed that the Hamming loss (0.020), in other words the error rate in correctly classifying the test texts, is far lower in the third approach than in the other approaches.
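A minimal scikit-learn sketch of the NMF-based topic clustering step is shown below on a toy corpus; the stop-word enrichment, keyword templates, and SVM evaluation of the study are not reproduced.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for the e-book texts (the real study used 1000 titles).
docs = [
    "library cataloguing classification metadata records",
    "information retrieval search engines indexing queries",
    "digital library repositories preservation access",
    "machine learning text mining classification models",
]
tfidf = TfidfVectorizer(stop_words="english")
x = tfidf.fit_transform(docs)

nmf = NMF(n_components=2, init="nndsvd", random_state=0)
doc_topic = nmf.fit_transform(x)             # document-to-cluster weights
terms = tfidf.get_feature_names_out()
for k, weights in enumerate(nmf.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"cluster {k}: {', '.join(top)}")  # candidate subject keywords
```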
PLoS ONE (ISSN 1932-6203), vol. 17, no. 9 (September)
Lung cancer is a serious threat to human health, with millions dying because of late diagnosis. The computerized tomography (CT) scan of the chest is an efficient method for early detection and classification of lung nodules. The requirement for high accuracy in analyzing CT scan images is a significant challenge in detecting and classifying lung cancer. In this paper, a new deep fusion structure based on long short-term memory (LSTM) is introduced, which is applied to the texture features computed from lung nodules through new volumetric grey-level co-occurrence matrices (GLCMs), classifying the nodules into benign, malignant, and ambiguous. Also, an improved Otsu segmentation method combined with the water strider optimization algorithm (WSA) is proposed to detect the lung nodules. WSA-Otsu thresholding can overcome the fixed-threshold and time-requirement restrictions of previous thresholding methods. Extended experiments are used to assess this fusion structure by considering 2D-GLCMs based on 2D slices and approximating the proposed 3D-GLCM computations with volumetric 2.5D-GLCMs. The proposed methods are trained and assessed on the LIDC-IDRI dataset. The accuracy, sensitivity, and specificity obtained for 2D-GLCM fusion are 94.4%, 91.6%, and 95.8%, respectively. For 2.5D-GLCM fusion, the accuracy, sensitivity, and specificity are 97.33%, 96%, and 98%, respectively. For 3D-GLCM, the accuracy, sensitivity, and specificity of the proposed fusion structure reached 98.7%, 98%, and 99%, respectively, outperforming most state-of-the-art counterparts. The results and analysis also indicate that the WSA-Otsu method requires a shorter execution time and yields a more accurate thresholding process. Copyright: © 2022 Saihood et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
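For illustration, the sketch below computes 2D GLCM texture features with scikit-image (spelled greycomatrix in older versions) on a synthetic patch, the kind of per-slice descriptors that could feed such a fusion model; the volumetric 2.5D/3D GLCMs, the WSA-Otsu segmentation, and the LSTM fusion itself are not reproduced here.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Toy 8-bit "CT slice"; in practice this would be a cropped nodule patch.
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# 2D GLCMs for four orientations at distance 1 (0, 45, 90, 135 degrees).
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

# Haralick-style texture features per orientation, usable as inputs to a fusion model.
features = {prop: graycoprops(glcm, prop).ravel()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
print({k: np.round(v, 3) for k, v in features.items()})
```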
IEEE Transactions on Fuzzy Systems (ISSN 1063-6706), vol. 30, no. 9, pp. 3918-3927
An inherent property of natural languages is the possibility of distinct meanings for the same word in different sentences. Word sense induction (WSI) is the unsupervised process of discovering the meanings of a word. The meanings form a sense inventory, which is used for word sense disambiguation (WSD). Fuzzy logic's capability at uncertainty representation makes it well suited to handling the vague information processed in natural languages for WSI and WSD. In this article, a novel fuzzy-based methodology is proposed for extracting meaningful information from ambiguous words, where both word senses and sense inventories are modeled as linguistic variables. The proposed method aims to gather a term set of level-2 fuzzy values for the variables representing words' meanings, to achieve WSI. The values in the term set are then used for linguistic approximation using a fuzzy inference system designed for WSD based on the word's context. The fuzzy word senses are extracted from an input corpus by word substitution, i.e., predicting words suitable as substitutes for the target word using masked language models. These fuzzy substitute sets are then clustered to discover similarities in the semantics they represent. Finally, each cluster is reformed into a sense value and added to the term set for the target word. The experimental results show that the proposed system outperforms the systems submitted to the standard SemEval 2010 and 2013 WSI and WSD tasks and achieves comparable performance with other fuzzy and non-fuzzy state-of-the-art methods. © 1993-2012 IEEE.
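The substitute-prediction step can be illustrated with the Hugging Face transformers fill-mask pipeline, as sketched below; the fuzzification, clustering, and fuzzy inference stages of the proposed method are not shown, and the model name and example sentence are placeholders.

```python
from transformers import pipeline

# Fill-mask pipeline as a stand-in for the substitute-prediction step.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
sentence = "He deposited the money in the [MASK]."
substitutes = unmasker(sentence, top_k=5)
for s in substitutes:
    # Each substitute and its score could seed a fuzzy substitute set.
    print(s["token_str"], round(s["score"], 3))
```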
Iranian Journal of Information Processing Management (ISSN 2251-8231), vol. 38, no. 2, pp. 369-396
Identification of hot topics in research areas has always been of interest. Making smart decisions about what needs to be studied is fundamental for researchers and can be challenging. The goal of this study is to identify hot topics and analyze thematic trends of articles indexed in the Scopus database in the field of Knowledge and Information Science (KIS) between 2010 and 2019, using text mining techniques. The population consists of 50,995 articles published in 249 journals indexed in the Scopus database in the field of KIS from 2010 to 2019. To identify thematic clusters, the Latent Dirichlet Allocation (LDA) technique was used and the data were analyzed using Python libraries. To do this, words were weighted using the TF-IDF method to form a text matrix, from which the topics in the documents and the coefficients for assigning each document to each topic (theta) were determined. The output of the LDA algorithm led to the identification of an optimal number of 260 topics. Each topic was labeled based on the words with the highest weights assigned to it and by considering experts' opinions. Topic clustering and the identification of keywords and topics were then carried out. By performing calculations at 95% confidence, 63 topics were selected from the 260 main topics. By calculating the average theta over the years, 24 topics with a positive trend or slope (hot topics) and 39 topics with a negative trend or slope (cold topics) were determined. According to the results, measurement studies, e-management/e-marketing, content retrieval, data analysis and e-skills are considered hot topics, while training, archives, knowledge management, organization and librarians' health were identified as cold topics in the field of KIS in the period 2010 to 2019. The analysis of the findings shows that, because most researchers in the last 10 years have been interested in using emerging technologies, technology-based topics have attracted more attention. In contrast, fundamental topics have received less attention. © 2022 Iranian Research Institute for Scientific Information and Documentation. All rights reserved.
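A minimal scikit-learn sketch of the topic-modeling step is given below on a toy corpus; it uses raw term counts rather than the study's TF-IDF weighting, and the trend analysis over theta is only indicated in a comment.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy abstracts standing in for the 50,995 KIS articles.
docs = [
    "altmetrics citation analysis bibliometrics impact",
    "information literacy training library instruction",
    "open access repositories scholarly publishing",
    "social media data analysis user engagement",
]
vec = CountVectorizer(stop_words="english")
x = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(x)               # document-topic coefficients (theta)
terms = vec.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {k}: {', '.join(top)}")
# Averaging theta per publication year and fitting a slope would flag
# rising (hot) versus declining (cold) topics.
```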
Handling lesion size and location variance in lung nodules is one of the main shortcomings of traditional convolutional neural networks (CNNs). The pooling layer within CNNs reduces the resolution of the feature maps, causing the loss of small local details that then need to be processed by the following layers. In this article, we propose a new pooling method based on stochastic neighbor embedding (SNE-pooling) that is able to handle the long-range dependency property of lung nodules. Further, an attention-based SNE-pooling model is proposed that performs spatial and channel attention. The experimental results conducted on the LIDC and LUNGx datasets show that the attention-based SNE-pooling model significantly improves performance over the state of the art. © 2022 IEEE.
Journal of Biomedical Informatics (ISSN 1532-0480), vol. 115
Taking multiple drugs at the same time can increase or decrease each drug's effectiveness or cause side effects. These drug-drug interactions (DDIs) may lead to an increase in the cost of medical care or even threaten patients' health and life. Thus, automatic extraction of DDIs is an important research field for improving patient safety. In this work, a deep neural network model is presented for extracting DDIs from medical texts. This model utilizes a novel attention mechanism for distinguishing important words from others, based on word similarities and their relative positions with respect to the candidate drugs. This approach is applied to calculating the attention weights for the outputs of a bi-directional long short-term memory (Bi-LSTM) model in the deep network structure before detecting the type of DDI. The proposed method was tested on the standard DDI Extraction 2013 dataset and, according to the experimental results, was able to achieve an F1-score of 78.30, which is comparable to the best results reported for state-of-the-art methods. A detailed study of the proposed method and its components is also provided. © 2021 Elsevier Inc.
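The following schematic NumPy sketch shows one way similarity- and position-based attention weights could be formed over Bi-LSTM outputs for a drug pair; the tensors are random placeholders and the exact weighting scheme of the paper is not reproduced.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
seq_len, hidden = 12, 16
h = rng.standard_normal((seq_len, hidden))        # Bi-LSTM outputs (one per token)
drug1_pos, drug2_pos = 2, 9                       # positions of the candidate drugs

# Similarity of each token to the candidate-drug representations.
sim = h @ h[drug1_pos] + h @ h[drug2_pos]

# Distance-based decay: tokens far from both drugs are down-weighted.
positions = np.arange(seq_len)
dist = np.minimum(np.abs(positions - drug1_pos), np.abs(positions - drug2_pos))
decay = 1.0 / (1.0 + dist)

alpha = softmax(sim * decay)                      # attention weights
context = alpha @ h                               # weighted sentence representation
print(alpha.round(3), context.shape)
```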
Imanpour, N., Naghsh Nilchi, A.R., Monadjemi, A., Karshenas, H., Nasrollahi, K., Moeslund, T.B.
IET Signal Processing (ISSN 1751-9675), vol. 15, no. 2, pp. 141-152
Dense connections in convolutional neural networks (CNNs), which connect each layer to every other layer, can compensate for mid/high-frequency information loss and further enhance high-frequency signals. However, dense CNNs suffer from high memory usage due to the accumulation of concatenated feature-maps stored in memory. To overcome this problem, a two-step approach is proposed that learns representative concatenated feature-maps. Specifically, a convolutional layer with many more filters is used before the concatenating layers to learn richer feature-maps, so that irrelevant and redundant feature-maps are discarded in the concatenating layers. The proposed method results in 24% and 6% less memory usage and test time, respectively, in comparison to single-image super-resolution (SISR) with the basic dense block. It also improves the peak signal-to-noise ratio by 0.24 dB. Moreover, the proposed method, while producing competitive results, decreases the number of filters in the concatenating layers by at least a factor of 2 and reduces memory consumption and test time by 40% and 12%, respectively. These results suggest that the proposed approach is a more practical method for SISR. © 2021 The Authors. IET Signal Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.
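One plausible PyTorch reading of this two-step idea, a wide convolution to learn richer feature-maps followed by a narrow convolution so that only a compact set is concatenated, is sketched below; channel sizes and layer counts are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CompactDenseLayer(nn.Module):
    """Wide convolution to learn rich features, then a narrow convolution so
    only a compact representation is concatenated to the running feature stack."""
    def __init__(self, in_ch, wide_ch=128, growth=16):
        super().__init__()
        self.wide = nn.Conv2d(in_ch, wide_ch, 3, padding=1)
        self.narrow = nn.Conv2d(wide_ch, growth, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.narrow(self.act(self.wide(x)))
        return torch.cat([x, out], dim=1)   # dense connection with fewer new maps

class CompactDenseBlock(nn.Module):
    def __init__(self, in_ch=32, n_layers=4, growth=16):
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(n_layers):
            layers.append(CompactDenseLayer(ch, growth=growth))
            ch += growth
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

x = torch.randn(1, 32, 48, 48)               # toy low-resolution feature map
print(CompactDenseBlock()(x).shape)          # torch.Size([1, 96, 48, 48])
```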
Khalilian, S., Hallaj, Y., Balouchestani, A., Karshenas, H., Mohammadi, A.
Iranian Conference on Machine Vision and Image Processing, MVIP (ISSN 2166-6776), 2020
Printed circuit board (PCB) production is one of the most important stages in making electronic products. A small defect in a PCB can cause significant flaws in the final product. Hence, detecting all defects in PCBs and locating them is essential. In this paper, we propose an approach based on denoising convolutional autoencoders for detecting defective PCBs and locating the defects. Denoising autoencoders take a corrupted image and try to recover the intact image. We trained our model with defective PCBs and forced it to repair the defective parts. Our model not only detects all kinds of defects and locates them, but can also repair them. By subtracting the repaired output from the input, the defective parts are located. The experimental results indicate that our model detects defective PCBs with high accuracy (97.5%) compared to state-of-the-art works. © 2020 IEEE.
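A minimal PyTorch sketch of the detect-by-subtraction idea is given below: an untrained toy autoencoder stands in for the trained repair network, and the absolute difference between input and reconstruction is thresholded to localize candidate defects; the architecture, image size and threshold are placeholders.

```python
import torch
import torch.nn as nn

class PCBAutoencoder(nn.Module):
    """Small convolutional autoencoder; trained to map defective boards to their
    defect-free versions, so reconstruction differences highlight defects."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = PCBAutoencoder()                      # untrained stand-in for the repair network
defective = torch.rand(1, 1, 128, 128)        # placeholder defective-board image
repaired = model(defective)                   # would approximate the intact board
defect_map = (defective - repaired).abs()     # large values mark candidate defects
mask = defect_map > 0.5                       # threshold to localize defects
print(repaired.shape, mask.sum().item())
```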
The rapid evolution of data has challenged traditional machine learning methods and has led to the failure of many learning models. As a possible solution to the lack of sufficient labeled data, transfer learning aims to exploit the knowledge accumulated in an auxiliary domain to develop new predictive models. This article studies a specific type of transfer learning called domain adaptation, which works based on subspace learning in order to minimize the distance between the class-conditional probability distributions of the source and target domains and to preserve source discriminative information. An SVM classifier trained on source domain data is used to predict target domain data labels to facilitate subspace learning. In this work, subspace learning is formulated as an optimization problem and experiments are carried out on real-world datasets. The results of the experiments indicate that the proposed method outperforms several existing methods in this field in terms of accuracy on two object recognition benchmarks: the Office-Caltech10 and Office datasets. © 2020 IEEE.
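The pseudo-labeling step can be illustrated with a small scikit-learn sketch: an SVM trained on the labeled source domain predicts labels for the unlabeled target domain, after which class-conditional statistics of both domains can be compared or matched; the data are synthetic and the subspace optimization itself is not shown.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy source domain (labeled) and target domain (unlabeled, slightly shifted).
x_src = rng.standard_normal((100, 5))
y_src = (x_src[:, 0] + x_src[:, 1] > 0).astype(int)
x_tgt = rng.standard_normal((80, 5)) + 0.5           # covariate shift

# Step 1: an SVM trained on the source domain provides pseudo-labels for the target.
clf = SVC(kernel="linear").fit(x_src, y_src)
y_tgt_pseudo = clf.predict(x_tgt)

# Step 2 (schematic): with pseudo-labels available, class-conditional statistics
# of both domains can be matched when learning the shared subspace.
for c in np.unique(y_src):
    gap = x_src[y_src == c].mean(axis=0) - x_tgt[y_tgt_pseudo == c].mean(axis=0)
    print(f"class {c}: conditional mean gap = {np.linalg.norm(gap):.3f}")
```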
Journal of Applied Security Research (ISSN 1936-1629), vol. 14, no. 2, pp. 169-190
Binary feature descriptors require a considerable amount of information to be applicable under wide appearance variations, which conflicts with the single sample per person (SSPP) problem. To address this challenge, a novel binary feature learning method called discriminative binary feature mapping is presented. Based on a number of precisely selected objectives, a feature mapping is learned by projecting all of the extracted vectors into a lower-dimensional feature space. The resulting feature vectors are then used to obtain a holistic face representation based on dictionary learning. Extensive experimental results show that the proposed method is able to obtain superior performance. © 2019, © 2019 Taylor & Francis Group, LLC.
Neurocomputing (ISSN 0925-2312), vol. 322, pp. 177-186
A common approach to solving multi-label classification problems is the transformation method, in which a multi-label problem is converted into multiple single-label representations. With an efficient implementation of single-label algorithms, and considering the dependency between labels and the fact that similar samples often share the same labels, we can expect highly effective classification in multi-label datasets. In this paper, to tackle the multi-label classification problem, an improved twin support vector machine classifier is first used to find the hyperplanes containing the structural information of the samples and the local information of each class label. Then the prior probability of each hyperplane and the sample points that are located in the margins of the hyperplanes are extracted. For the prediction phase, several facts are applied to help find the sets of relevant labels of a sample: (1) samples with similar labels share the same information, (2) local information has a great impact on the performance and efficiency of a multi-label algorithm, and (3) the samples that are most important in classification are located in the margins of the hyperplanes. To obtain the set of relevant labels for a test sample, its k nearest samples in the margin space of the hyperplanes are first found. Then, relevant labels are extracted using statistical and membership counting methods. The nonlinear version of the algorithm is also developed through the kernel trick. The experimental results obtained on different datasets and with different measures indicate the good performance of the proposed algorithm compared to several relevant methods. © 2018
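The prediction step described above can be sketched as follows: given (precomputed) margin samples and their label vectors, a test sample's labels are obtained by membership counting over its k nearest margin neighbours; the twin-SVM training and prior-probability computation are omitted, and all data are toy placeholders.

```python
import numpy as np

def predict_labels(test_point, margin_samples, margin_labels, k=5, threshold=0.5):
    """Assign labels to a test sample from its k nearest neighbours among the
    samples lying in the hyperplane margins (membership-counting style)."""
    dists = np.linalg.norm(margin_samples - test_point, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = margin_labels[nearest].mean(axis=0)     # fraction of neighbours per label
    return (votes >= threshold).astype(int)

rng = np.random.default_rng(0)
margin_samples = rng.standard_normal((30, 4))       # toy samples from margin regions
margin_labels = rng.integers(0, 2, size=(30, 3))    # toy multi-label matrix (3 labels)
print(predict_labels(rng.standard_normal(4), margin_samples, margin_labels))
```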
Image representation is a long-standing activity in computer vision. The rich context and large amount of information in images make image recognition hard, so image features must be extracted and learned correctly. Obtaining good image descriptors is greatly challenging. In recent years, learning binary features has been applied to many image representation tasks, but it has been shown to be efficient and effective only on face images. Therefore, designing a method that can simultaneously succeed in representing texture and face images as well as other types of images is very important. Moreover, advanced binary feature methods need strong prior knowledge as they are hand-crafted. In order to address these problems, a method is proposed here that applies a pattern called the Multi Cross Pattern (MCP) to extract image features, which calculates the difference between all the pattern neighbor pixels and the pattern center pixel in a local square. In addition, a Multi-Objective Binary Feature method, named MOBF for short, is presented to address the aforementioned problems through the following four objectives: (1) maximize the variance of the learned codes, (2) increase the information capacity of the binary codes, (3) prevent overfitting, and (4) decrease the difference between binary codes of neighboring pixels. Experimental results on standard datasets such as FERET, CMU-PIE, and KTH-TIPS show the superiority of the MOBF descriptor on texture images as well as face images compared with other descriptors developed in the literature for image representation. © 2017 IEEE.
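One plausible reading of the MCP difference computation is sketched below: for every interior pixel, the differences between the neighbouring pixels of a local square and the centre pixel are collected into a pixel difference vector; the multi-objective learning of binary codes (MOBF) on top of these vectors is not shown.

```python
import numpy as np

def cross_pattern_features(img, radius=1):
    """Difference between each neighbour pixel and the centre pixel over a
    (2*radius+1)^2 local square, one difference vector per interior pixel."""
    h, w = img.shape
    img = img.astype(np.float32)
    feats = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            neigh = img[radius + dy:h - radius + dy, radius + dx:w - radius + dx]
            center = img[radius:h - radius, radius:w - radius]
            feats.append(neigh - center)
    return np.stack(feats, axis=-1)        # (h-2r, w-2r, 8) for radius=1

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(16, 16))
pdvs = cross_pattern_features(patch)       # pixel difference vectors
print(pdvs.shape)                          # (14, 14, 8)
```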
Decomposition into smaller sub-problems is a general approach in problem solving. Many real-world problems can be decomposed into a number of sub-problems that may be solved more easily. Appropriate decomposition is a significant issue, especially for optimization problems, where the optimal solution is usually obtained by combining the solutions of sub-problems. Estimation of distribution algorithms (EDAs) are a type of evolutionary algorithm that learns a model of the problem from the population of candidate solutions. This model is intended to capture the interactions between problem variables, thus facilitating problem decomposition, and is used to generate new solutions. In this paper, a novel type of problem is presented that is designed to challenge the model building process in discrete EDAs. The main idea is to propose a set of problems whose candidate solutions can be simultaneously decomposed into different sub-problems. This means that a candidate solution of a problem may be interpreted by two or more different structures where only one is true, resulting in the optimal solution to that problem. Some of these decompositions or structures may be more likely according to the low-order statistics collected from the population of candidate solutions, but may not necessarily lead to the optimal solution. Learning the correct structure/decomposition is a challenge for the model building process in EDAs. The experimental results show that the proposed problems are indeed difficult for EDAs even when expressive models such as Bayesian networks are used to capture the interactions in the problem. © 2016 IEEE.
Computational Optimization and Applications (ISSN 1573-2894), vol. 61, no. 2, pp. 517-555
As one of the most competitive approaches to multi-objective optimization, evolutionary algorithms have been shown to obtain very good results for many real-world multi-objective problems. One of the issues that can affect the performance of these algorithms is the uncertainty in the quality of the solutions, which is usually represented as noise in the objective values. Therefore, handling noisy objectives in evolutionary multi-objective optimization algorithms becomes very important and has gained more attention in recent years. In this paper we present the α-degree Pareto dominance relation for ordering the solutions in multi-objective optimization when the values of the objective functions are given as intervals. Based on this dominance relation, we propose an adaptation of the non-dominated sorting algorithm for ranking the solutions. This ranking method is then used in a standard multi-objective evolutionary algorithm and a recently proposed novel multi-objective estimation of distribution algorithm based on joint variable-objective probabilistic modeling, and applied to a set of multi-objective problems with different levels of independent noise. The experimental results show that using the proposed method for solution ranking allows Pareto sets to be approximated that are considerably better than those obtained with the dominance probability-based ranking method, which is one of the main methods for noise handling in multi-objective optimization. © 2014, Springer Science+Business Media New York.
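The sketch below shows one possible interval-based dominance check in this spirit, estimating by Monte Carlo the probability that one objective interval is no worse than another and requiring it to exceed α on every objective; this is an illustrative reading, not necessarily the paper's exact definition of α-degree Pareto dominance.

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_less(x_int, y_int, n=10_000):
    """Monte Carlo estimate of P(X <= Y) for independent uniform intervals."""
    x = rng.uniform(*x_int, n)
    y = rng.uniform(*y_int, n)
    return np.mean(x <= y)

def alpha_dominates(a, b, alpha=0.8):
    """a and b are lists of (low, high) objective intervals (minimization).
    a dominates b at degree alpha if it is at least alpha-probably no worse
    on every objective and better in expectation on at least one."""
    probs = [prob_less(ai, bi) for ai, bi in zip(a, b)]
    better = any(np.mean(ai) < np.mean(bi) for ai, bi in zip(a, b))
    return min(probs) >= alpha and better

a = [(0.10, 0.30), (0.20, 0.50)]   # noisy objective values of solution a
b = [(0.40, 0.60), (0.45, 0.70)]   # noisy objective values of solution b
print(alpha_dominates(a, b, alpha=0.8))
```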
Many real-world problems can be decomposed into a number of sub-problems for which solutions can be found more easily. However, proper decomposition of large problems remains a challenging issue, especially in optimization, where we need to find the optimal solutions more efficiently. Estimation of distribution algorithms (EDAs) are a class of evolutionary optimization algorithms that try to capture the interactions between problem variables when learning a probabilistic model from the population of candidate solutions. In this paper, we propose a type of synthesized problem specially designed to challenge this specific ability of EDAs. These problems are based on the central idea that each candidate solution may be simultaneously interpreted by two or more different structures where only one is true, resulting in the best solution to that problem. Of course, some of these structures may be more likely according to the statistics collected from the population of candidate solutions, but may not necessarily lead to the best solution. The experimental results show that the proposed benchmarks are indeed difficult for EDAs even when they use expressive models such as Bayesian networks to capture the interactions in the problem.
IEEE Transactions on Evolutionary Computation (ISSN 1089-778X), vol. 18, no. 4, pp. 519-542
This paper proposes a new multiobjective estimation of distribution algorithm (EDA) based on joint probabilistic modeling of objectives and variables. This EDA uses the multidimensional Bayesian network as its probabilistic model. In this way, it can capture the dependencies between objectives, and between variables and objectives, as well as the dependencies between variables learned in other Bayesian network-based EDAs. This model leads to a problem decomposition that helps the proposed algorithm find better tradeoff solutions to the multiobjective problem. In addition to Pareto set approximation, the algorithm is also able to estimate the structure of the multiobjective problem. To apply the algorithm to many-objective problems, it includes four different ranking methods proposed in the literature for this purpose. The algorithm is first applied to the set of Walking Fish Group (WFG) problems, and its optimization performance is compared with a standard multiobjective evolutionary algorithm and another competitive multiobjective EDA. The experimental results show that on several of these problems, and for different objective space dimensions, the proposed algorithm performs significantly better, and on some others it achieves comparable results, when compared with the other two algorithms. The algorithm is then tested on the set of CEC09 problems, where the results show that multiobjective optimization based on joint model estimation is able to obtain considerably better fronts for some of the problems compared with the search based on conventional genetic operators in state-of-the-art multiobjective evolutionary algorithms. © 1997-2012 IEEE.
Information Sciences (ISSN 0020-0255), vol. 233, pp. 109-125
Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine learning and decision making tasks, especially in uncertain domains. Their capabilities, such as representation, inference and learning, if used effectively, can greatly help build intelligent systems that are able to act accordingly in different problem domains. Bayesian networks are one of the most widely used classes of these models. Some of the inference and learning tasks in Bayesian networks involve complex optimization problems that require the use of meta-heuristic algorithms. Evolutionary algorithms, as successful problem solvers, are promising candidates for this purpose. This paper reviews the application of evolutionary algorithms for solving some NP-hard optimization tasks in Bayesian network inference and learning. © 2013 Elsevier Inc. All rights reserved.
Applied Soft Computing (ISSN 1568-4946), vol. 13, no. 5, pp. 2412-2432
Regularization is a well-known technique in statistics for model estimation, used to improve the generalization ability of the estimated model. Some regularization methods can also be used for variable selection, which is especially useful in high-dimensional problems. This paper studies the use of regularized model learning in estimation of distribution algorithms (EDAs) for continuous optimization based on Gaussian distributions. We introduce two approaches to regularized model estimation and analyze their effect on the accuracy and computational complexity of model learning in EDAs. We then apply the proposed algorithms to a number of continuous optimization functions and compare their results with other Gaussian distribution-based EDAs. The results show that the optimization performance of the proposed RegEDAs is less affected by the increase in problem size than that of other EDAs, and they are able to obtain significantly better optimization values for many of the functions in high-dimensional settings. © 2012 Elsevier B.V. All rights reserved.
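A toy Gaussian EDA using scikit-learn's graphical lasso as the regularized model estimator is sketched below on the sphere function; it illustrates the idea of regularized model learning in a continuous EDA, not the specific RegEDA variants studied in the paper.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def sphere(x):
    return np.sum(x ** 2, axis=1)

rng = np.random.default_rng(0)
dim, pop_size, n_sel = 10, 100, 30
pop = rng.uniform(-5, 5, size=(pop_size, dim))
for _ in range(10):
    selected = pop[np.argsort(sphere(pop))[:n_sel]]
    # Regularized (sparse inverse covariance) Gaussian model of the selected set.
    model = GraphicalLasso(alpha=0.1).fit(selected)
    pop = rng.multivariate_normal(model.location_, model.covariance_, size=pop_size)
print(round(sphere(pop).min(), 4))
```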
Journal of Heuristics (ISSN 1381-1231), vol. 18, no. 5, pp. 795-819
Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine learning and decision making tasks, especially in uncertain domains. Their capabilities, such as representation, inference and learning, if used effectively, can greatly help build intelligent systems that are able to act accordingly in different problem domains. The field of evolutionary algorithms is one such discipline that has employed probabilistic graphical models to improve the search for optimal solutions in complex problems. This paper shows how probabilistic graphical models have been used in evolutionary algorithms to improve their performance in solving complex problems. Specifically, we give a survey of probabilistic model building-based evolutionary algorithms, called estimation of distribution algorithms, and compare different methods for probabilistic modeling in these algorithms. © Springer Science+Business Media, LLC 2012.
Adaptation, Learning, and Optimization (ISSN 1867-4542), vol. 14, no. 1, pp. 157-173
Because of their intrinsic properties, the majority of the estimation of distribution algorithms proposed for continuous optimization problems are based on the Gaussian distribution assumption for the variables. This paper reviews the relation between the general multivariate Gaussian distribution and the popular undirected graphical model of Markov networks, and discusses how they can be employed in estimation of distribution algorithms for continuous optimization. A number of learning and sampling techniques for these models, including the promising regularized model learning, are also reviewed, and their application to function optimization in the context of estimation of distribution algorithms is studied. © Springer-Verlag Berlin Heidelberg 2012.
Lecture Notes in Computer Science (ISSN 0302-9743), vol. 6594, part 2, pp. 98-107
N-grams are the basic features commonly used in sequence-based malicious code detection methods in computer virology research. The empirical results from previous works suggest that, while short n-grams are easier to extract, the characteristics of the underlying executables are better represented by lengthier n-grams. However, by increasing the length of an n-gram, the feature space grows exponentially and considerable space and computational resources are demanded. Therefore, feature selection has turned out to be the most challenging step in establishing an accurate detection system based on byte n-grams. In this paper we propose an efficient feature extraction method in which, in order to gain more information, both adjacent and non-adjacent bi-grams are used. Additionally, we present a novel boosting feature selection method based on a genetic algorithm. Our experimental results indicate that the proposed detection system detects virus programs far more accurately than the best earlier known methods. © 2011 Springer-Verlag.
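The adjacent and non-adjacent bi-gram extraction can be sketched in a few lines of Python, keying each feature by its gap and byte pair; the GA-based boosting feature selection is not shown, and the sample bytes are placeholders.

```python
from collections import Counter

def byte_bigrams(data: bytes, max_gap: int = 2) -> Counter:
    """Count adjacent (gap 0) and non-adjacent (gap 1..max_gap) byte bi-grams.
    Each feature is keyed by (gap, first_byte, second_byte)."""
    counts = Counter()
    for gap in range(max_gap + 1):
        step = gap + 1
        for i in range(len(data) - step):
            counts[(gap, data[i], data[i + step])] += 1
    return counts

# Toy byte sequence standing in for an executable's contents.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03, 0x4D, 0x5A, 0x90])
features = byte_bigrams(sample, max_gap=2)
print(features.most_common(5))
```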
Lecture Notes in Computer Science (ISSN 0302-9743), vol. 6576, pp. 298-312
Information about the objective values can be incorporated into evolutionary algorithms based on probabilistic modeling in order to capture the relationships between objectives and variables. This paper investigates the effects of joining the objective and variable information on the performance of an estimation of distribution algorithm for multi-objective optimization. A joint Gaussian Bayesian network of objectives and variables is learnt and then sampled using the information about the currently best obtained objective values as evidence. The experimental results obtained on a set of multi-objective functions and in comparison to two other competitive algorithms are presented and discussed. © 2011 Springer-Verlag.
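The core sampling idea, conditioning a joint Gaussian over variables and an objective on a good objective value, can be sketched with standard Gaussian conditioning as below; a full Gaussian Bayesian network with structure learning is not shown, and the data are synthetic.

```python
import numpy as np

def condition_gaussian(mean, cov, obs_idx, obs_val):
    """Mean and covariance of the free variables given observed components
    (standard Gaussian conditioning formulas)."""
    free_idx = [i for i in range(len(mean)) if i not in obs_idx]
    m_f, m_o = mean[free_idx], mean[obs_idx]
    s_ff = cov[np.ix_(free_idx, free_idx)]
    s_fo = cov[np.ix_(free_idx, obs_idx)]
    s_oo = cov[np.ix_(obs_idx, obs_idx)]
    k = s_fo @ np.linalg.inv(s_oo)
    return m_f + k @ (obs_val - m_o), s_ff - k @ s_fo.T

rng = np.random.default_rng(0)
# Joint model over [x1, x2, f] estimated from selected solutions (toy data).
data = rng.standard_normal((200, 3))
data[:, 2] = data[:, 0] + 0.5 * data[:, 1] + 0.1 * rng.standard_normal(200)
mean, cov = data.mean(axis=0), np.cov(data, rowvar=False)

# Sample new solutions with the best objective value seen so far as evidence.
best_f = data[:, 2].min()
cond_mean, cond_cov = condition_gaussian(mean, cov, obs_idx=[2],
                                         obs_val=np.array([best_f]))
new_solutions = rng.multivariate_normal(cond_mean, cond_cov, size=5)
print(new_solutions.round(2))
```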
This paper shows that statistical algorithms proposed for the quantitative trait loci (QTL) mapping problem, and the equation of the multivariate response to selection, can be applied in multi-objective optimization. We introduce conditional dominance relationships between the objectives and propose the use of results from QTL analysis and G-matrix theory in the analysis of multi-objective evolutionary algorithms (MOEAs). © 2011 Authors.
K-order Markov models have been introduced to estimation of distribution algorithms (EDAs) to solve a particular class of optimization problems in which each variable depends on its previous k variables in a given, fixed order. In this paper we investigate the use of regularization as a way to approximate k-order Markov models when k is increased. The introduced regularized models are used to balance the complexity and accuracy of the k-order Markov models. We investigate the behavior of the EDAs in several instances of the hydrophobic-polar (HP) protein problem, a simplified protein folding model. Our preliminary results show that EDAs that use regularized approximations of the k-order Markov models offer a good compromise between complexity and efficiency, and could be an appropriate choice when the number of variables is increased. Copyright 2011 ACM.
The Bayesian Optimization Algorithm (BOA) has been used with different local structures to represent more complex models and with a variety of scoring metrics to evaluate Bayesian networks. However, the combined effects of these elements on the performance of BOA have not been investigated yet. In this paper the performance of BOA is studied using two criteria: the number of fitness evaluations and the structural accuracy of the model. It is shown that simple exact local structures such as CPTs, in conjunction with the complexity-penalizing BIC metric, outperform others in terms of model accuracy. However, considering the number of fitness evaluations (efficiency) of the algorithm, CPTs with another complexity-penalizing metric, K2P, perform better. Copyright 2009 ACM.
Estimation of distribution algorithms, especially those using a Bayesian network as their probabilistic model, have been able to competently solve many challenging optimization problems, including the class of hierarchical problems. Since model building constitutes an important part of these algorithms, finding ways to improve the quality of the models built during optimization is very beneficial. This in turn requires mechanisms to evaluate the quality of the models, as each problem has a large space of possible models. The efforts in this field have mainly concentrated on single-level problems, due to the complex structure of hierarchical problems, which makes them hard to treat. In order to extend model analysis to hierarchical problems, a model evaluation algorithm is proposed in this paper that can be applied to different problems. The results of applying the algorithm to two common hierarchical problems are also presented and discussed. ©2009 IEEE.
Background estimation is one of the most challenging phases in extracting foreground objects from video sequences. In this paper we present a background modeling approach that uses the similarity of frames to extract background areas from the video sequence. We use a window over the frame history and compute the similarity between the selected frames of this window, forming a similarity window. The properties of the similarity window depend on the characteristics of the scene and can be adjusted parametrically. Our preliminary results show that if proper parameters are chosen, this method can give a good approximation of the background model. ©2008 IEEE.
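A minimal NumPy sketch of the idea is given below: frames in a history window that are sufficiently similar to a reference frame are kept, and their per-pixel median is taken as the background model; the similarity measure, window size and threshold are placeholder choices, not the paper's exact formulation.

```python
import numpy as np

def estimate_background(frames, window=10, sim_threshold=15.0):
    """Keep frames in the window that are similar to the window's first frame
    (low mean absolute difference) and take their per-pixel median as background."""
    recent = frames[-window:]
    ref = recent[0].astype(np.float32)
    similar = [f for f in recent
               if np.mean(np.abs(f.astype(np.float32) - ref)) < sim_threshold]
    stack = np.stack(similar if similar else recent)
    return np.median(stack, axis=0).astype(frames[0].dtype)

# Synthetic grayscale sequence: a static background plus a moving bright block.
frames = []
for t in range(20):
    frame = np.full((48, 64), 100, dtype=np.uint8)
    frame[10:20, 3 * t % 50:3 * t % 50 + 10] = 255
    frames.append(frame)
background = estimate_background(frames)
print(background.shape, int(background[15, 30]))   # the moving block is mostly removed
```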