My research interests include Human-Computer Interaction (2020-) and AI for Science (2025-).
I also engage in collaborative research on interesting topics.
My hobbies are composing music and working out. I am also the lead guitarist of Tokamak Disruption, an alternative metal band.
The Hong Kong Polytechnic University
PhD Student, Research Institute for Intelligent Wearable Systems
Northwestern Polytechnical University
Master, Mechatronics Engineering
Northwestern Polytechnical University
Bachelor, Honors College
Summary: Pretrained equivariant graph neural networks based on spherical harmonics offer efficient and accurate alternatives to computationally expensive ab-initio methods, yet adapting them to new tasks and chemical environments still requires fine-tuning. Conventional parameter-efficient fine-tuning (PEFT) techniques, such as Adapters and LoRA, typically break symmetry, making them incompatible with those equivariant architectures. ELoRA, recently proposed, is the first equivariant PEFT method. It achieves improved parameter efficiency and performance on many benchmarks. However, the relatively high degrees of freedom it retains within each tensor order can still perturb pretrained feature distributions and ultimately degrade performance. To address this, we present Magnitude-Modulated Equivariant Adapter (MMEA), a novel equivariant fine-tuning method which employs lightweight scalar gating to modulate feature magnitudes on a per-order and per-multiplicity basis. We demonstrate that MMEA preserves strict equivariance and, across multiple benchmarks, consistently improves energy and force predictions to state-of-the-art levels while training fewer parameters than competing approaches. These results suggest that, in many practical scenarios, modulating channel magnitudes is sufficient to adapt equivariant models to new chemical environments without breaking symmetry, pointing toward a new paradigm for equivariant PEFT design.
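The idea of per-order, per-multiplicity scalar gating can be illustrated with a minimal NumPy sketch. This is not the MMEA implementation: the function names, the dictionary layout of the features, and the `1 + tanh(logit)` parameterization are all assumptions made for illustration. The point it shows is that multiplying each channel by a single scalar commutes with rotation, so equivariance is preserved.

```python
import numpy as np

def magnitude_gate(features, gate_logits):
    """Scale every channel of every tensor order by its own scalar gate.

    features:    dict mapping order l -> array of shape (multiplicity, 2l + 1)
    gate_logits: dict mapping order l -> array of shape (multiplicity,)
    """
    gated = {}
    for order, x in features.items():
        # Hypothetical parameterization: gate = 1 + tanh(logit), so zero
        # logits reproduce the pretrained features exactly.
        gate = 1.0 + np.tanh(gate_logits[order])
        gated[order] = gate[:, None] * x  # one scalar per (order, channel)
    return gated

def rotate_order1(x, rotation):
    """Rotate order-1 (vector) channels, stored in Cartesian form, by a 3x3 matrix."""
    return x @ rotation.T
```

Because the gate touches only the magnitude of each channel and never mixes components within an order, gating followed by rotation gives the same result as rotation followed by gating.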
Summary: Graph neural networks (GNNs) have achieved remarkable success in molecular property prediction. However, traditional graph representations struggle to effectively encode the inherent 3D spatial structures of molecules, as molecular orientations in 3D space introduce significant variability, severely limiting model generalization and robustness. Existing approaches primarily focus on rotation-invariant and rotation-equivariant methods. Invariant methods often rely heavily on prior knowledge and lack sufficient generalizability, while equivariant methods suffer from high computational costs. To address these limitations, this paper proposes a novel plug-and-play 3D encoding module leveraging rotational sampling. By computing the expectation over the SO(3) rotational group, the method naturally achieves approximate rotational invariance. Furthermore, by introducing a carefully designed post-alignment strategy, strict invariance can be achieved without compromising performance. Experimental evaluations on the QM9 and C10 Datasets demonstrate superior predictive accuracy, robustness, and generalization performance compared to existing methods. Moreover, the proposed approach maintains low computational complexity and enhanced interpretability, providing a promising direction for efficient and effective handling of 3D molecular information in drug discovery and material design.
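The rotational-sampling idea can be sketched as a Monte Carlo average of an orientation-dependent encoder over random rotations. The code below is only an illustration under stated assumptions: `toy_encoder`, the weight vector `W`, and the sample count are hypothetical and are not the paper's module, and the post-alignment step that yields strict invariance is not shown.

```python
import numpy as np

W = np.array([1.0, 2.0, -1.0])  # hypothetical fixed weights of a toy encoder

def random_rotations(n, rng):
    """Draw n rotation matrices uniformly from SO(3) via normalized Gaussian quaternions."""
    q = rng.normal(size=(n, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q.T
    return np.stack([
        np.stack([1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)], axis=-1),
        np.stack([2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)], axis=-1),
        np.stack([2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)], axis=-1),
    ], axis=-2)

def toy_encoder(coords):
    """Orientation-dependent toy feature: squared projection of the summed coordinates."""
    return (coords.sum(axis=-2) @ W) ** 2

def rotation_averaged(encoder, coords, n_samples, rng):
    """Estimate the expectation of encoder(R applied to coords) over R in SO(3)."""
    rotations = random_rotations(n_samples, rng)
    # rotated[n, a, k] = sum_d rotations[n, k, d] * coords[a, d]
    rotated = np.einsum("ad,nkd->nak", coords, rotations)
    return encoder(rotated).mean()
```

For this toy encoder the exact expectation is |s|^2 |W|^2 / 3, where s is the sum of the coordinates, so the estimate depends only on rotation-invariant quantities. The sampled average is invariant only approximately, with an error that shrinks as the number of samples grows.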
Summary: Adaptive interaction systems in flight control aim to enhance the pilot’s situation awareness (SA) to achieve human-in-the-loop control. Most such systems are activated according to the pilot’s current workload state. However, the pilot may already have lost important information during a period of high workload, so the system’s response lags. Moreover, most adaptive interaction systems use expert knowledge as a reference to generate information, yet the tacit knowledge that reveals the interrelationships among pieces of information is seldom studied, despite being the foundation of interactive information display. To solve these problems, we propose an adaptive interaction system architecture with three subsystems. First, a workload level prediction subsystem uses physiological parameters to predict future workload levels, thus avoiding interaction system lag. Second, a tacit expert knowledge mining subsystem discovers the interrelationships hidden in the expert’s perceived information, which guide the interactive information interface. Third, a tips information inference subsystem provides the lost SA information based on expert knowledge and the pilot’s online perceived information. The effectiveness of the proposed system is verified via a comparative experiment using the control interface of a remotely piloted aircraft.
Summary: Attention allocation reflects how humans filter and organize information. On the one hand, different task scenarios strongly affect how humans distribute attention; on the other hand, visual attention reflects cognitive and psychological processes. Most previous studies of visual attention allocation are based on cognitive models, predictive models, or statistical analysis of eye movement data or visual images. However, these methods cannot provide an inside view of gaze behavior that reveals the attention distribution pattern within the scenario context, and they seldom study the association rules of these patterns. Therefore, we adopt a big data mining approach to discover the paradigm of visual attention distribution.
Summary: Visual attention is one of the brain’s most important cognitive functions: it filters the rich information of the outside world so that limited cognitive resources are used efficiently. The tacit knowledge hidden in human attention allocation is context-dependent and hard for experts to express, but it is essential for novice operator training and interaction system design. Traditional models of visual attention allocation and the corresponding analysis methods seldom involve task context or present tacit knowledge in an explicit, quantified way. It is therefore difficult to pass an expert’s tacit knowledge on to novices, or to use it to construct an interaction system, with traditional methods. This paper first proposes a new model based on graph theory, the visual cognitive graph model, to represent visual attention allocation together with the task context. Based on this graph model, we then use data mining to reveal attention patterns within context and quantitatively analyze the operator’s tacit knowledge during operation tasks. We introduce three physical quantities derived from graph theory to describe the tacit knowledge, which can be used directly for interaction system construction or operator training: the essential information within the task context, the relevant information affecting critical information, and the bridge information revealing the decision-making process. We tested the proposed method on flight operation. Comparison with the traditional eye movement graph model demonstrates that the proposed visual cognitive model can incorporate the task context, and comparison with statistical analysis demonstrates that our tacit knowledge mining method can reveal the underlying knowledge hidden in the visual information.
Finally, we give practical applications in operator training guidance and adaptive interaction systems. Our method can explore more in-depth knowledge of visual information, such as the correlations among different pieces of acquired information and the way the operator acquires information, most of which operators themselves do not even notice.
Summary: Adaptive interactive systems can be divided into two categories: semantic adaptations affect the system’s function, and syntactic adaptations affect the presentation of information through the operator interface without modifying the system behavior. Although automation is becoming more and more important, human operators still play an irreplaceable role in many situations, making the development of syntactic adaptations indispensable. However, existing syntactic adaptations usually do not relate to the task context, which is critical to achieving them. Moreover, existing studies do not fully consider the operator’s information perception state under highly dynamic scenarios, and thus cannot dynamically prompt accurate information to the operator. To address these problems, this paper proposes a syntactic adaptive interactive system architecture based on perception state estimation and context segmentation. For real-time context segmentation, we propose a time series segmentation algorithm that estimates the scene in real time and produces highly interpretable results. For dynamic perception state estimation, we propose, for the first time, the concept of a perception stack, which estimates the perception bias in the operator’s working memory from eye movement data and generates proper prompt information according to the current context. The proposed design method was verified in a flight simulator. Through these mechanisms, the system improves the accuracy of information prompts and reduces redundancy. Comparative experiments confirm significant improvements in the operator’s control accuracy and control stability according to our evaluation criteria.
Summary: Eye movement data can, to a certain extent, show the cognitive process involved in performing tasks. Existing research on eye movement analysis is usually based on statistics, which makes it difficult to show the correlations among pieces of information associated with the scene. Other probabilistic algorithms usually focus on user feature recognition based on eye movement representation. This paper proposes the concept of time-domain and frequency-domain analysis of eye movement areas of interest, within which a frequent pattern mining method and a visual cognitive graph model are constructed to mine the relationships between the areas of interest. Finally, some application examples of this model in the novice-expert paradigm are presented.
Summary: Flight activities are highly dynamic operations, and adaptive systems that can assist human pilots are urgently needed. Existing adaptive system structures mainly focus on enhancing the visual information of the display or on monitoring the human state. However, they do not monitor which specific visual information the user missed, and thus cannot achieve information enhancement. An adaptive system must always make three main decisions: what to adapt, how to adapt, and when to adapt. This paper proposes a new adaptive system architecture that aims to answer the first two questions. The architecture consists of two parts. The first, tips information generation, analyzes and generates the visual information that the pilot did not acquire or missed. The second, information relation extraction, extracts the relations between pieces of visual information in order to find a suitable way to present the information to the pilot. Finally, the outputs of these steps are fed into the adaptive information alert system, which integrates them and presents them on the display in a proper way at a proper time to achieve visual information enhancement.
Summary: With the rapid development of complex equipment such as airplanes, human-machine interfaces are often upgraded, and many methods have emerged to evaluate whether such an upgrade is effective. Most research evaluates the interface through the time accumulation effect of the human state during the interaction. However, in aviation, the pilot’s instantaneous reactions also reveal the design efficiency of the interface, since the difficulty of obtaining useful information severely influences the reaction time in some voice command tasks or emergency situations. In addition, many flight scenarios cannot be simulated in experiments or in a laboratory environment, and voice commands are too numerous to simulate exhaustively. This paper introduces predicted auditory reaction time as an index to evaluate human-machine interface design. The proposed method has two advantages. First, it measures the pilot’s auditory reaction time based on eye movement tracking; thus, the data can be collected in flight task scenarios, and the experiment does not interfere with the subjects. Second, it includes a prediction model that estimates the pilot’s reaction time under more general voice commands from a small sample set.
Summary: In 1953, Enrico Fermi criticized Dyson's model by quoting Johnny von Neumann: "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk." So far, there have been several attempts to fit an elephant using four parameters, but as the problem has not been well-defined, the current methods do not completely satisfy the requirements. This paper defines the problem and presents an attempt.
Summary: Transfer entropy is used to establish a measure of causal relationships between two variables. Symbolic transfer entropy, as an estimation method for transfer entropy, is widely applied due to its robustness against non-stationarity. This paper investigates the embedding dimension parameter in symbolic transfer entropy and proposes optimization methods for high complexity in extreme cases with complex data. Additionally, it offers some perspectives on estimation methods for transfer entropy.
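As a rough illustration of how symbolic transfer entropy depends on the embedding dimension, the sketch below symbolizes each series into ordinal patterns of length `m` and applies a plug-in estimate of transfer entropy to the symbol sequences. It is a minimal textbook-style version, not the optimized method the paper proposes; the function names and the default `m=3` are assumptions.

```python
from collections import Counter
from math import log2

import numpy as np

def ordinal_symbols(series, m, delay=1):
    """Map each window of m delayed samples to its ordinal (rank-order) pattern."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (m - 1) * delay
    windows = np.stack([series[i * delay: i * delay + n] for i in range(m)], axis=1)
    ranks = np.argsort(windows, axis=1, kind="stable")
    return [tuple(int(r) for r in row) for row in ranks]

def symbolic_transfer_entropy(source, target, m=3, delay=1):
    """Plug-in estimate (in bits) of transfer entropy from source to target symbols."""
    s = ordinal_symbols(source, m, delay)
    t = ordinal_symbols(target, m, delay)
    # Count (target_next, target_now, source_now) and its marginals.
    triples = Counter(zip(t[1:], t[:-1], s[:-1]))
    pairs_ts = Counter(zip(t[:-1], s[:-1]))
    pairs_tt = Counter(zip(t[1:], t[:-1]))
    singles = Counter(t[:-1])
    total = sum(triples.values())
    te = 0.0
    for (t_next, t_now, s_now), count in triples.items():
        p_joint = count / total
        p_given_both = count / pairs_ts[(t_now, s_now)]
        p_given_self = pairs_tt[(t_next, t_now)] / singles[t_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te
```

The number of possible symbols grows as m!, so the joint distribution over symbol triples has up to (m!)^3 cells. This is why large embedding dimensions quickly become expensive and data-hungry.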
AISI, Beijing
Research Intern
The Hong Kong Polytechnic University
Research Assistant, Research Institute for Intelligent Wearable Systems
Keyanquan, Global Science
Academic News Intern
ChenJiaGou Hope Elementary School
Volunteer Teaching
Tokamak Disruption
Lead Guitarist