Articles
Multilingual neural machine translation (MNMT) is a novel machine translation approach that benefits from large multilingual resources. However, its performance drops significantly when trained on low-resource languages because of its reliance on parameter sharing and data size. In this paper, a new method is proposed to improve the performance of MNMT for a language pair whose target language is low-resource. The main idea of this study is to identify important nodes whose associated parameters negatively affect an MNMT model and to split each of those nodes into two sub-nodes. The model then selects the important sub-node that affects the specific language pair and creates a twin of it. This twin sub-node helps strengthen the translation quality of the specific language pair without harming the other languages. The proposed method works in four steps: 1) training an MNMT model with parameter sharing over multiple languages, 2) selecting important nodes that negatively affect the MNMT model, 3) splitting the important nodes into sub-nodes, and 4) twinning the important sub-nodes. The proposed method has been evaluated on several multilingual datasets, including TED 2013, TED 2020, and BIBLE, using English-Persian translation as a case study. The results show that the proposed method yields the best results for one-to-many and many-to-many models in terms of average BLEU score and semantic similarity. The results also show that the proposed method outperforms well-known large language models, such as ChatGPT, BING GPT4, and the Google Neural Machine Translation (GNMT) model, when applied to a low-resource language. © 2025 Elsevier B.V.
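To make steps 3 and 4 concrete, the following is a minimal sketch of how an important unit of a linear layer could be split and twinned so that only the chosen language pair updates the twin copy. It assumes PyTorch; the importance scores, layer sizes, class names, and routing flag are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of node splitting and twinning, assuming PyTorch.
# The importance criterion (random scores here) is a hypothetical stand-in
# for a real measure such as accumulated gradient magnitude per unit.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwinnedLinear(nn.Module):
    """Linear layer in which selected 'important' units are duplicated:
    the twin copies are used (and updated) only for the chosen language
    pair, so the other languages keep using the shared originals."""
    def __init__(self, base: nn.Linear, important: torch.Tensor):
        super().__init__()
        self.base = base                # shared weights for all languages
        self.important = important      # indices of units to twin
        # Twin sub-nodes start as copies of the important units' weights.
        self.twin_weight = nn.Parameter(base.weight[important].clone())
        self.twin_bias = nn.Parameter(base.bias[important].clone())

    def forward(self, x, use_twin: bool):
        out = self.base(x)
        if use_twin:
            # Route the chosen language pair through the twin sub-nodes.
            twin_out = F.linear(x, self.twin_weight, self.twin_bias)
            out = out.clone()
            out[..., self.important] = twin_out
        return out

# Hypothetical usage: twin the top-k units ranked by an importance score.
base = nn.Linear(512, 512)
scores = torch.rand(512)                   # placeholder importance measure
important = scores.topk(k=8).indices
layer = TwinnedLinear(base, important)
y_pair = layer(torch.randn(2, 512), use_twin=True)    # e.g., English-Persian
y_rest = layer(torch.randn(2, 512), use_twin=False)   # all other languages
```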
2025 29th International Computer Conference, Computer Society of Iran (CSICC 2025), pp. 253-259
This study presents a method for the automatic identification of micro-cracks in photovoltaic solar modules using deep learning techniques. The main challenge in this research is the lack of labeled data and the class imbalance in micro-crack detection. The proposed method employs a multi-stage approach. Initially, 10% of the dataset is manually labeled to train a simple convolutional neural network model. This model is then used to generate pseudo-labels for the unlabeled data in a semi-supervised fashion. The pseudo-labels are manually reviewed to increase the number of micro-crack samples in the training set. Data augmentation techniques are also applied to increase the size and diversity of the training dataset. Finally, a pre-trained ResNet-50 model is fine-tuned on the expanded labeled dataset for accurate detection of micro-cracks. Advanced preprocessing steps, including solar cell segmentation, cropping, and data augmentation, are performed, and the class imbalance problem is addressed through undersampling and weighted loss functions. The experimental results demonstrate the effectiveness of the proposed method, achieving an accuracy of 0.978 and an F1-score of 0.797 for micro-crack detection in electroluminescence images of solar panels. This study provides insights into using limited labeled data to train robust deep learning models for defect identification in solar modules. © 2024 IEEE.
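The following is a minimal PyTorch/torchvision sketch of the pseudo-labeling and weighted fine-tuning stages described above. The confidence threshold, class weights, and data loaders are hypothetical placeholders, not the study's exact configuration.

```python
# A sketch of confidence-based pseudo-labeling followed by weighted
# fine-tuning of ResNet-50, assuming PyTorch and torchvision.
import torch
import torch.nn as nn
from torchvision import models

def pseudo_label(model, unlabeled_loader, threshold=0.95, device="cpu"):
    """Assign pseudo-labels to unlabeled EL images and keep only the
    confident ones (these would then be manually reviewed, as above)."""
    model.eval()
    kept_images, kept_labels = [], []
    with torch.no_grad():
        for images in unlabeled_loader:          # loader yields image batches
            probs = torch.softmax(model(images.to(device)), dim=1)
            conf, labels = probs.max(dim=1)
            mask = (conf >= threshold).cpu()
            kept_images.append(images[mask])
            kept_labels.append(labels.cpu()[mask])
    return torch.cat(kept_images), torch.cat(kept_labels)

# Fine-tune a pre-trained ResNet-50 on the expanded labeled set, using a
# weighted loss to counter the scarcity of micro-crack samples
# (the weights below are example values, not the paper's).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)    # intact vs. micro-crack
class_weights = torch.tensor([1.0, 5.0])         # up-weight the rare class
criterion = nn.CrossEntropyLoss(weight=class_weights)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```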
International Journal of Machine Learning and Cybernetics (ISSN 1868-8071), (12), pp. 5509-5529
Federated semi-supervised learning (Fed-SSL) algorithms have been developed to address the challenges of decentralized data access, data confidentiality, and costly data labeling in distributed environments. Most existing Fed-SSL algorithms are based on the federated averaging approach, which uses an identical model on all machines and replaces local models during the learning process. However, these algorithms suffer from significant communication overhead when transferring the parameters of local models. In contrast, knowledge distillation-based Fed-SSL algorithms reduce communication costs by transferring only the outputs of local models on shared data. However, these algorithms assume that all local data on the machines are labeled and that a large set of shared unlabeled data is available for training. These assumptions are not always feasible in real-world applications. In this paper, a knowledge distillation-based Fed-SSL algorithm is presented that makes no assumptions about how the data are distributed among machines and instead artificially generates the shared data required for the learning process. The learning process employs a semi-supervised GAN on the local machines and has two stages. In the first stage, each machine trains its local model independently. In the second stage, each machine generates some artificial data at each step and propagates it to the other machines; each machine then trains its discriminator on these data using the average output of all machines on them. The effectiveness of the algorithm has been examined in terms of accuracy and the amount of communication among machines using several datasets with different distributions. The evaluations reveal that, on average, the presented algorithm is 15% more accurate than state-of-the-art methods, especially in the case of non-IID data. In addition, in most cases, it yields better results than existing studies in terms of the amount of data communicated among machines. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
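The following is a minimal sketch of the second stage, in which each machine broadcasts generated data and every machine distills the averaged soft predictions into its discriminator. It assumes PyTorch; the model architectures, distillation loss, and hyperparameters are simplified assumptions standing in for the algorithm described above.

```python
# A sketch of the distillation exchange over generated shared data,
# assuming PyTorch. Architectures and hyperparameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Machine:
    def __init__(self, generator: nn.Module, discriminator: nn.Module):
        self.gen = generator        # trained locally in stage 1 (semi-supervised GAN)
        self.disc = discriminator   # outputs class logits

    def generate(self, n, z_dim=64):
        # Produce artificial shared data instead of exchanging parameters.
        return self.gen(torch.randn(n, z_dim)).detach()

def distillation_step(machines, n_shared=32, lr=0.01):
    opts = [torch.optim.SGD(m.disc.parameters(), lr=lr) for m in machines]
    for sender in machines:
        shared = sender.generate(n_shared)
        # Average the soft predictions of all machines on the shared batch.
        with torch.no_grad():
            avg = torch.stack([F.softmax(m.disc(shared), dim=1)
                               for m in machines]).mean(dim=0)
        # Each machine distills the ensemble average into its discriminator.
        for m, opt in zip(machines, opts):
            loss = F.kl_div(F.log_softmax(m.disc(shared), dim=1), avg,
                            reduction="batchmean")
            opt.zero_grad(); loss.backward(); opt.step()

# Hypothetical usage with toy generators/discriminators on 3 machines.
machines = [Machine(nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32)),
                    nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)))
            for _ in range(3)]
distillation_step(machines)
```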