Panoramic viewport prediction is crucial for 360-degree video streaming: forecasting a user's future viewing region enables efficient bandwidth management. To improve prediction accuracy, existing frameworks exploit multi-modal inputs that combine trajectory, visual, and audio data. However, they process every modality through the same standardized pipeline and fuse features by concatenation, regardless of modality characteristics. Because this uniform design applies computationally intensive Transformer architectures without modification, it incurs substantial computational overhead. Moreover, concatenation-based fusion cannot model global dependencies or explicit interactions between modalities, which limits prediction accuracy. To overcome these issues, we introduce a lightweight Modality Diversity-Aware (MDA) framework with two primary components: a lightweight feature refinement module and a cross-modal attention module. The feature refinement module uses compact latent tokens to sequentially process audio-visual data, filtering out irrelevant background signals and reducing model parameters. The cross-modal attention module then fuses trajectory features with the refined audio-visual features by allocating attention weights to the most informative features, improving prediction accuracy. Experimental results on a standard 360-degree video benchmark show that our MDA framework achieves higher prediction accuracy than current multi-modal frameworks while requiring up to 50% fewer parameters.
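To make the fusion idea concrete, the following is a minimal NumPy sketch of cross-modal attention in which trajectory features act as queries over a small set of refined audio-visual latent tokens. All dimensions, token counts, projection matrices, and function names here are illustrative assumptions for exposition, not the paper's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(traj, av_tokens, d_k=32, rng=None):
    """Fuse trajectory features with audio-visual latent tokens.

    traj:      (T, D) trajectory features, used as attention queries.
    av_tokens: (K, D) compact refined audio-visual tokens, used as
               keys and values (K is small, which keeps cost low).
    Returns (T, d_k) fused features and the (T, K) attention weights.
    Projection weights are random here purely for illustration; in a
    trained model they would be learned parameters.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    D = traj.shape[1]
    Wq = rng.standard_normal((D, d_k)) / np.sqrt(D)
    Wk = rng.standard_normal((D, d_k)) / np.sqrt(D)
    Wv = rng.standard_normal((D, d_k)) / np.sqrt(D)
    Q, K_, V = traj @ Wq, av_tokens @ Wk, av_tokens @ Wv
    # Scaled dot-product attention: each trajectory step allocates
    # weight across the K latent tokens, then mixes their values.
    weights = softmax(Q @ K_.T / np.sqrt(d_k))
    return weights @ V, weights

# Toy example: 8 head-position steps, 4 latent audio-visual tokens.
traj = np.random.default_rng(1).standard_normal((8, 64))
tokens = np.random.default_rng(2).standard_normal((4, 64))
fused, weights = cross_modal_attention(traj, tokens)
print(fused.shape, weights.shape)  # (8, 32) (8, 4)
```

Because the number of latent tokens K is much smaller than a full spatial feature map, the attention matrix is only T x K, which is one way such a design can stay lightweight relative to full self-attention over raw audio-visual features.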