To address the significant decline in the positioning accuracy of traditional algorithms under indoor non-line-of-sight (NLOS) conditions and low beacon deployment density, this paper proposes a fused positioning method based on ranging correction that combines Bluetooth Low Energy (BLE) and pedestrian dead reckoning (PDR). First, BLE received signal strength (RSS) data are rapidly constructed using SketchUp indoor 3D modeling software integrated with a ray-tracing algorithm, eliminating the need for tedious manual RSS field collection. Subsequently, a variational autoencoder based on a convolutional neural network (VAE-CNN) is designed to predict and correct BLE ranging errors, thereby improving BLE positioning accuracy. Finally, an extended Kalman filter (EKF) is employed to fuse the positioning results from BLE and PDR. Experimental results demonstrate that the proposed ranging-corrected BLE positioning and EKF-based fusion positioning achieve superior performance in environments with NLOS interference and low beacon deployment density.
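The fusion step can be illustrated with a minimal sketch. It assumes a 2D position state, a PDR step (length and heading) as the prediction input, and one already-corrected BLE range per beacon as the measurement; the state model, noise values, and function names are hypothetical and are not taken from the paper.

```python
import math

def ekf_predict(x, P, step_len, heading, q):
    """PDR prediction: move the 2D position by one step and inflate the covariance."""
    x_new = [x[0] + step_len * math.cos(heading),
             x[1] + step_len * math.sin(heading)]
    P_new = [[P[0][0] + q, P[0][1]],
             [P[1][0], P[1][1] + q]]
    return x_new, P_new

def ekf_update_range(x, P, beacon, measured_range, r):
    """EKF update with a single BLE range to a beacon at a known position."""
    dx, dy = x[0] - beacon[0], x[1] - beacon[1]
    predicted = math.hypot(dx, dy)
    if predicted < 1e-9:  # the range Jacobian is undefined at the beacon itself
        return x, P
    H = [dx / predicted, dy / predicted]       # 1x2 Jacobian of the range function
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r      # scalar innovation variance
    K = [PHt[0] / S, PHt[1] / S]               # 2x1 Kalman gain
    innovation = measured_range - predicted
    x_new = [x[0] + K[0] * innovation, x[1] + K[1] * innovation]
    # Covariance update: P <- (I - K H) P
    A = [[1.0 - K[0] * H[0], -K[0] * H[1]],
         [-K[1] * H[0], 1.0 - K[1] * H[1]]]
    P_new = [[A[i][0] * P[0][j] + A[i][1] * P[1][j] for j in range(2)]
             for i in range(2)]
    return x_new, P_new

def run_demo():
    """Walk 10 steps east with noise-free ranges; return (initial, final) error."""
    beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]   # assumed beacon layout
    truth = [2.0, 3.0]
    x, P = [2.5, 3.5], [[1.0, 0.0], [0.0, 1.0]]        # deliberately offset start
    initial_error = math.hypot(x[0] - truth[0], x[1] - truth[1])
    for _ in range(10):
        truth = [truth[0] + 0.7, truth[1]]
        x, P = ekf_predict(x, P, step_len=0.7, heading=0.0, q=0.01)
        for b in beacons:
            z = math.hypot(truth[0] - b[0], truth[1] - b[1])
            x, P = ekf_update_range(x, P, b, z, r=0.25)
    final_error = math.hypot(x[0] - truth[0], x[1] - truth[1])
    return initial_error, final_error
```

Processing ranges one at a time keeps the innovation scalar, so no matrix inversion is needed; this works even when only one or two beacons are in view, which matches the low-density setting the abstract targets.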
A novel ±45° dual-polarized microstrip patch antenna based on “L”-shaped probe feeding is designed and optimized through simulation. Two orthogonal “L”-shaped probes with a height difference are used for mutual coupling feeding to achieve dual polarization, significantly increasing the antenna's channel capacity. The copper metal columns between the two dielectric substrates serve both feeding and radiation functions. The microstrip radiation patches are connected through metalized vias, and symmetrical “τ”-shaped grooves with inward-facing openings are adopted to increase the resonance frequency point; the polarization patch is designed in an “S” shape to expand the bandwidth. Simulation results show that the S11 of the antenna is less than -10 dB within the frequency bands of 2.08~2.77 GHz and 3.66~5.36 GHz, the relative impedance bandwidth is 66.15%, the gain is not less than 6 dBi, the radiation efficiency is above 90%, the isolation between ports is greater than 10 dB, and the cross-polarization level is greater than 20 dB. Physical fabrication and measurements demonstrate good agreement between the measured and simulated results at port 1, while minor deviations at port 2 are attributed to fabrication and testing conditions but remain within acceptable limits. Compared with similar studies, the proposed antenna features wide bandwidth, compact structure, and ease of fabrication, making it suitable for C-band (3700~4200 MHz) and WLAN (2400~2484 MHz and 5150~5350 MHz) wireless transceiver communication systems.
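The abstract does not say how the 66.15% relative impedance bandwidth is obtained. One reading that reproduces the figure, inferred here from the numbers alone, is the sum of the fractional bandwidths of the two bands, each computed as bandwidth divided by center frequency:

```python
def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Fractional bandwidth: (f_high - f_low) divided by the band center."""
    center = (f_low_ghz + f_high_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / center

# The two -10 dB bands quoted in the abstract, in GHz.
bands = [(2.08, 2.77), (3.66, 5.36)]

per_band = [fractional_bandwidth(lo, hi) for lo, hi in bands]
total = sum(per_band)

for (lo, hi), fb in zip(bands, per_band):
    print(f"{lo}~{hi} GHz: {fb * 100:.2f}%")
print(f"sum: {total * 100:.2f}%")
```

Under this assumption the lower band contributes about 28.45% and the upper band about 37.69%.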
The integration of mobile edge computing (MEC) and wireless power transfer (WPT) can effectively alleviate the constraints of limited computing resources and battery capacity in wireless devices. To address the dynamic energy-efficient offloading problem in wireless-powered MEC systems under a nonlinear energy harvesting model, this paper proposes an energy consumption optimization algorithm based on Lyapunov optimization theory. By jointly optimizing the server's computing frequency, the energy station's transmit power, task offloading time, device transmit power, and local computing frequency, the algorithm minimizes the system's long-term average energy consumption while ensuring system stability. The stochastic optimization problem is transformed into a time-slot-based deterministic subproblem using Lyapunov optimization and solved through the Lagrange multiplier method and an improved whale optimization algorithm. Simulation results show that, compared with benchmark schemes, the proposed offloading strategy significantly reduces system energy consumption while maintaining long-term task queue stability.
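The per-slot transformation can be sketched with the generic drift-plus-penalty rule: in each slot, choose the action that minimizes V times the energy minus the queue backlog times the service. The action set, arrival rate, and energy costs below are hypothetical toy values, and the per-slot subproblem is solved by plain enumeration rather than by the Lagrange multiplier method and improved whale optimization algorithm used in the paper.

```python
def drift_plus_penalty_action(queue, actions, V):
    """Pick the (served, energy) pair minimizing V * energy - queue * served."""
    return min(actions, key=lambda a: V * a[1] - queue * a[0])

def simulate(V, arrivals_per_slot=2, slots=100):
    """Run the per-slot rule; return (average energy per slot, peak backlog)."""
    # Hypothetical options: (task units served, energy spent), convex in service.
    actions = [(0, 0.0), (2, 1.0), (4, 4.0)]
    queue, total_energy, max_queue = 0, 0.0, 0
    for _ in range(slots):
        served, energy = drift_plus_penalty_action(queue, actions, V)
        total_energy += energy
        # Standard queue dynamics: Q(t+1) = max(Q(t) - b(t), 0) + a(t)
        queue = max(queue - served, 0) + arrivals_per_slot
        max_queue = max(max_queue, queue)
    return total_energy / slots, max_queue

avg_energy_small_v, backlog_small_v = simulate(V=1)
avg_energy_large_v, backlog_large_v = simulate(V=10)
```

With these toy numbers, raising V from 1 to 10 lowers the average energy per slot while the peak backlog grows from 2 to 6 and then stays bounded, which is the energy versus queue-length trade-off that the weight V controls.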
This study focuses on the problem of fractional channel parameters affecting channel estimation performance in intelligent reflecting surface-orthogonal time frequency space (IRS-OTFS) communication systems. A channel estimation method for IRS-OTFS systems is proposed by leveraging the sparsity of OTFS channels in the delay-Doppler (DD) domain. First, the joint sparsity channel estimation problem among channel parameters is transformed into a sparse signal recovery problem. Next, the fast iterative shrinkage-thresholding algorithm (FISTA) is introduced to solve this problem. The input-output relationship of the IRS-OTFS communication system is then derived. To address the issue of manual parameter tuning in traditional FISTA, a network architecture based on the FISTA algorithm is proposed. This architecture unfolds the iterative process of the sparse signal recovery algorithm into a neural network. The network is designed to automatically learn the optimal hyperparameters and nonlinear functions within the algorithm. Theoretical analysis and simulation results demonstrate that, under the same channel transmission conditions, the proposed algorithm achieves lower estimation error compared to the benchmark algorithm.
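For reference, the classical FISTA iteration that the network unfolds can be sketched as follows. This is a real-valued toy version for the problem of minimizing 0.5·||Ax − b||² + λ·||x||₁; the actual channel model is complex-valued, and the fixed step size and threshold here are exactly the hand-tuned quantities that the unfolded network is said to learn. The step size uses the squared Frobenius norm as a conservative bound on the Lipschitz constant, which is an implementation choice of this sketch.

```python
import math

def soft_threshold(v, tau):
    """Element-wise shrinkage: sign(v) * max(|v| - tau, 0)."""
    return [math.copysign(max(abs(vi) - tau, 0.0), vi) for vi in v]

def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def fista(A, b, lam, n_iter=500):
    """FISTA for min_x 0.5 * ||A x - b||^2 + lam * ||x||_1 (real-valued)."""
    n = len(A[0])
    # ||A||_F^2 upper-bounds the largest eigenvalue of A^T A, so 1/L is a safe step.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    y = x[:]
    t = 1.0
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)
        x_next = soft_threshold([yi - g / L for yi, g in zip(y, grad)], lam / L)
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum (extrapolation) step that distinguishes FISTA from ISTA.
        y = [xn + ((t - 1.0) / t_next) * (xn - xo) for xn, xo in zip(x_next, x)]
        x, t = x_next, t_next
    return x
```

In an unfolded network, each loop iteration becomes one layer, and quantities such as `1 / L` and `lam / L` become trainable per-layer parameters.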
To address the security challenges of relay communication in complex environments with potential eavesdroppers, this paper proposes a multi-UAV-assisted relay communication network that provides secure communication services for users. A multi-agent deep reinforcement learning (MARL) algorithm based on the Q-mixing network (QMIX) is employed to jointly optimize UAV trajectories and power allocation. The goal is to guarantee the minimum transmission rate of low-security-sensitivity users (secondary users) while enhancing the communication security and data rate of high-security-sensitivity users (primary users). Simulation results demonstrate that, compared with the Double Deep Q-Network (Double DQN) and the Dueling Deep Q-Network (Dueling DQN), the proposed algorithm improves the cumulative reward by approximately 15.5% and 1.26%, respectively. Moreover, the proposed rate-splitting multiple access (RSMA) technique significantly outperforms space-division multiple access (SDMA) and non-orthogonal multiple access (NOMA) in terms of overall system performance and information security. The proposed method provides an effective solution for achieving secure and efficient communication in multi-user wireless networks.
This study proposes a wood board recognition method that integrates laser speckle technology with deep learning. Conventional photography and laser speckle imaging were employed to capture wood board images before and after modification treatments under both normal lighting and adverse conditions (including darkness and defocusing). A corresponding dataset was then constructed. Classification experiments were conducted using the ResNet34 deep learning model. The results show that the ResNet34 model achieves high recognition accuracy when classifying laser speckle datasets and maintains good performance even under adverse environmental conditions. Furthermore, by introducing a convolutional block attention module (CBAM) to optimize the ResNet34 convolutional neural network, the classification accuracy for laser speckle images reached 93.29%. The combination of laser speckle technology and deep learning provides a low-environmental-requirement, efficient, and promising approach for wood board classification.
Generative text summarization models can produce novel expressions in summaries, but even the most advanced models may generate content that contradicts the source text or lacks factual verifiability, a phenomenon known as hallucination. To address this issue, this paper proposes an intrinsic hallucination optimization method to improve the summarization generation process. The proposed approach mitigates hallucinations from three perspectives: data-level optimization, model training-level optimization, and summary generation strategy-level optimization. Experiments conducted on two benchmark datasets demonstrate the superior performance of the proposed method. Compared with baseline models, the proposed approach achieves an average improvement of 8.58% in R-1 score on the CNNDM dataset and 7.26% on the XSUM dataset. The results indicate that the method not only enhances summary quality but also effectively reduces hallucinations, providing a valuable reference for the practical deployment of generative text summarization models.
The localization and recognition of key symbols in engineering drawings have long been essential applications in computer vision. Compared with traditional methods, deep learning-based text detection approaches offer higher detection efficiency and accuracy. It is therefore necessary to apply existing text detection algorithms to engineering drawing recognition tasks. This paper proposes a deep learning-based method for the localization and recognition of key symbols in engineering drawings, focusing on the detection and recognition of index symbols and dimension symbols. For index symbol localization, the drawings are cropped to a uniform size, and non-maximum suppression is used to remove redundant candidate boxes. For dimension symbol localization, a complete detection is performed on the masked drawings, and the intersection-over-union between each detected box and index symbol location is calculated to filter out partial data. Experimental results demonstrate that the proposed method achieves high precision and recall in both the localization and recognition of index and dimension symbols in engineering drawings.
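The two box-filtering steps described above can be sketched as follows. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, the thresholds are illustrative, and `drop_overlapping` reflects one plausible reading of the IoU-based filter (discarding dimension-symbol detections that overlap an index-symbol location); the abstract does not specify these details.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes to keep."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

def drop_overlapping(detections, index_boxes, iou_threshold=0.3):
    """Remove detections that overlap any known index-symbol box (assumed rule)."""
    return [d for d in detections
            if all(iou(d, ib) <= iou_threshold for ib in index_boxes)]
```

For example, `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` keeps the first and third boxes, because the second overlaps the first with an IoU of about 0.68.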
Temporal knowledge graph reasoning, which predicts events absent from the graph, has seen significant applications in recommendation systems, question answering, and healthcare. The lack of background knowledge in temporal knowledge graphs hinders reasoning, with existing methods relying on external graphs while overlooking implicit data within the graph. To fully exploit the graph's implicit background information, this paper extracts cross-temporal features to define entity backgrounds and proposes a temporal knowledge graph reasoning model incorporating cross-time commonality features (TR-CTC). TR-CTC uses a graph neural network to extract cross-temporal commonality from multi-hop paths, integrating it as background information into the graph representation learning process, enhancing reasoning performance. Experimental results show that TR-CTC generally outperforms baseline models in link prediction tasks.
To enhance the performance of 4-DoF grasp detection, this paper improves the grasp representation and proposes a depth-guided multi-scale grasp detection framework (DGM-Grasp) for robotic manipulators. Built upon an encoder-decoder architecture, the framework integrates a multi-scale cross-spatial attention down-sampling module to better focus on grasp-relevant features. To extract semantic information at different scales, a progressive multi-scale feature fusion and decoding module is designed. In addition, a depth-guided grasp filtering module is introduced to address collision problems during the grasping process. Experimental results show that DGM-Grasp achieves accuracies of 98.6% and 95.25% on the Cornell and Jacquard single-object datasets, respectively, while reducing detection time to 21 ms. The method also performs effectively on multi-object datasets, achieving a 96% success rate in ablation and real-world grasping experiments. These results demonstrate the superior generalization ability and performance of DGM-Grasp.
To address the poor performance of underwater object detection caused by light attenuation and scattering, this paper proposes an enhanced underwater object detection framework based on YOLOv8, named ERMS-YOLOv8, aiming to improve detection accuracy. The backbone is replaced with an efficient vision transformer (EfficientViT) to strengthen feature extraction of underwater organisms and reduce information loss. The neck adopts a reparameterized generalized-directional feature pyramid network (RepGFPN) to enhance the fusion of high-level semantic and low-level spatial features, enabling richer feature representation. A mixed local channel attention for object detection (MLCA) is introduced to integrate channel, spatial, local, and global channel information, thereby boosting the model's representational capacity. Additionally, a scalable intersection over union loss (SIoU) is employed to improve boundary prediction accuracy. Experimental results demonstrate that the proposed method achieves mAP values of 83.9% on the UPRC2021 dataset and 84.4% on the DUO dataset, outperforming the original YOLOv8 and exhibiting superior performance in underwater object detection.
A long short-term memory (LSTM) neural network accelerator based on a distributed systolic array architecture is proposed for resource-limited edge computing devices. The design distributes input data storage to reduce data movement and power consumption, while transmitting data in a systolic manner minimizes the idle rate of computing units and enhances computational efficiency. Experimental validation on a VU13P field-programmable gate array (FPGA) shows that the proposed LSTM accelerator achieves an effective computing power of 179.2 GOPS at an operating frequency of 200 MHz, with a dynamic power consumption of 0.343 W and an energy efficiency of 522.4 GOPS/W. Compared with typical existing designs, the proposed accelerator improves energy efficiency by more than 34%.
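The reported energy efficiency follows directly from the other two figures, as this short check shows. The last line reads "more than 34%" as a ratio of at least 1.34 over the compared designs, which is an interpretation rather than something the abstract states.

```python
throughput_gops = 179.2     # effective computing power reported at 200 MHz
dynamic_power_w = 0.343     # reported dynamic power consumption

# Energy efficiency is throughput divided by power.
efficiency_gops_per_w = throughput_gops / dynamic_power_w
print(f"{efficiency_gops_per_w:.1f} GOPS/W")

# Implied upper bound on the compared designs' efficiency under the 1.34x reading.
implied_baseline_max = efficiency_gops_per_w / 1.34
print(f"compared designs: below about {implied_baseline_max:.0f} GOPS/W")
```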
To address the computational bottlenecks faced by classical neural networks under the explosive growth of data scale, quantum convolutional neural networks (QCNNs) based on quantum computing have become a research hotspot. This study constructs a QCNN for image classification within the limited resources provided by noisy intermediate-scale quantum (NISQ) devices. The model employs angle encoding and designs a convolutional layer based on a data re-uploading classifier, followed by a four-qubit pooling layer. Two different architectures of quantum fully connected layers are designed to perform image classification, and the impact of their structures on QCNN classification performance is analyzed. Simulation results show that the proposed QCNN achieves high classification accuracy and good generalization in binary classification tasks, with a maximum accuracy of 100.00%, a minimum of 94.55%, and an average of 97.29%. Furthermore, increasing the circuit depth improves model performance, enabling the QCNN to achieve over 90% accuracy in four-class classification tasks.
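Angle encoding can be illustrated with a small classical simulation. The sketch assumes one common convention, in which each pixel value in [0, 1] is scaled to an angle θ = π·pixel and applied as an RY rotation to a qubit starting in |0⟩; the abstract does not specify the rotation axis or scaling, so both are assumptions.

```python
import math

def angle_encode(pixels):
    """Map each pixel in [0, 1] to the single-qubit state RY(pi * pixel)|0>.

    Returns a list of (amplitude_of_0, amplitude_of_1) pairs, one per qubit.
    """
    states = []
    for p in pixels:
        theta = math.pi * p
        states.append((math.cos(theta / 2.0), math.sin(theta / 2.0)))
    return states

def prob_one(state):
    """Probability of measuring |1> for a real-amplitude single-qubit state."""
    return state[1] ** 2

encoded = angle_encode([0.0, 0.5, 1.0])
probabilities = [prob_one(s) for s in encoded]
```

A black pixel (0.0) leaves the qubit in |0⟩, a white pixel (1.0) rotates it to |1⟩, and a mid-gray pixel gives an equal superposition, so each pixel costs one qubit and one rotation gate.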
Current dialogue emotion recognition models often overlook coherence features and discourse structure information in context modeling. Therefore, this paper proposes a dialogue emotion recognition model based on coherence features and discourse structure. First, discourse coherence detection is conducted to eliminate weak or incoherent discourse information, and both local and global coherence information are obtained by constructing a coherence matrix. Second, a dialogue parser is used to establish discourse structure relations, and a directed acyclic graph is employed to model the discourse structure while conveying both discourse structure information and speaker information. Finally, through interactive attention, coherence information and discourse information are integrated to generate emotion labels. The proposed model is validated on two public datasets, and the results indicate that it improves on existing models in the reported performance metrics.
To alleviate the scarcity of data, this paper collects and annotates two infrared tracking datasets for sequential small target detection, named ATR-ISTD and UAV-ISTD. It also proposes a sequential small target detection network integrating a memory pool, which exploits the correlation between preceding and subsequent frames and reads memory information through memory matching between the query frame and the memory frame, addressing the problems of high false alarm rates and low accuracy in infrared small target detection under high-clutter backgrounds. To reduce the loss of small target features caused by downsampling, a forward semantic guided fusion module (PSGF) is designed to integrate features of different scales. In the memory vector encoder, a pseudo label guided feature enhancement module (PLG-FE) is designed to enhance the local feature expression ability of small targets. Experimental results show that, compared with mainstream single-frame detection methods, the proposed method significantly reduces false alarm rates, achieving improvements of 16.87% and 10.49% on the ATR-ISTD and UAV-ISTD datasets, respectively. Target-level F1 scores increased by 4.89% and 6.54%, and pixel-level F1 scores improved by 7.69% and 11.63%.