A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition
Abstract—Neuroscience research [1] has revealed a discrepancy in emotion expression between the left and right hemispheres of the human brain. Inspired by this finding, in this paper we propose a novel bi-hemispheric discrepancy model (BiHDM) to learn the asymmetric differences between the two hemispheres for electroencephalograph (EEG) emotion recognition. Concretely, we first employ four directed recurrent neural networks (RNNs) based on two spatial orientations to traverse electrode signals on two separate brain regions, which enables the model to obtain the deep representations of all the EEG electrodes' signals while keeping the intrinsic spatial dependence. Then we design a pairwise subnetwork to capture the discrepancy information between the two hemispheres and extract higher-level features for final classification. Besides, to reduce the domain shift between training and testing data, we use a domain discriminator that adversarially induces the overall feature learning module to generate emotion-related but domain-invariant features, which further promotes EEG emotion recognition. We conduct experiments on three public EEG emotional datasets, and the results show that our method achieves new state-of-the-art performance.
Index Terms—EEG emotion recognition, bi-hemispheric discrepancy, spatial-temporal network
I. INTRODUCTION
Emotion, as a common mental phenomenon, is closely related to our daily life. Although it is easy for humans to sense other people's emotions in human-human interaction, it is still difficult for machines to understand the complicated emotions of human beings [2]. As the first step toward enabling machines to capture human emotions, emotion recognition has received substantial attention from the human-machine-interaction (HMI) and pattern recognition research communities in recent years [3], [4], [5].
Human emotional expression mostly relies on verbal behaviors (e.g., speech) and nonverbal behaviors (e.g., facial expressions). Thus, a large body of literature concentrates on learning the emotional components
Yang Li and Tengfei Song are with the Key Laboratory of Child Development and Learning Science (Ministry of Education), and the Department of Information Science and Engineering, Southeast University, Nanjing, Jiangsu, 210096, China. Wenming Zheng and Yuan Zong are with the Key Laboratory of Child Development and Learning Science (Ministry of Education), School of Biological Sciences and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China. (∗Corresponding author: Wenming Zheng (E-mail: wenming_zheng@seu.edu.cn).) Lei Wang is with the School of Computing and Information Technology, University of Wollongong, NSW, 2500, Australia. Lei Qi is with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu, 210096, China. Tong Zhang and Zhen Cui are with the School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu, 210096, China.
from speech and facial expression data. However, from the viewpoint of neuroscience, human emotion originates from a variety of brain cortex regions, such as the orbital frontal cortex, ventral medial prefrontal cortex, and amygdala [6], which provides a potential approach to decoding emotion by recording the continuous brain activity signals over these regions. For example, by placing EEG electrodes on the scalp, we can record the neural activities of the brain, which can then be used to recognize human emotions.
Most existing EEG emotion recognition methods focus on two fundamental challenges. One is how to extract discriminative features related to emotions. Typically, EEG features can be extracted from the time domain, frequency domain, and time-frequency domain. In [7], Jenke et al. evaluated a broad range of existing features by using machine learning techniques on a self-recorded dataset. The other challenge is how to classify the features correctly. Many EEG emotion recognition models and methods have been proposed over the past years [8], [9]. For example, Zheng et al. [10] proposed a group sparse canonical correlation analysis method for simultaneous EEG channel selection and emotion recognition. Li et al. [11] fused the information propagation patterns and activation differences in the brain to improve the performance of emotion recognition. These techniques have shown excellent performance on several EEG emotional datasets.
Recently, many researchers have attempted to use neuroscience findings about emotion as prior knowledge to extract features or develop models, effectively enhancing the performance of EEG emotion recognition. For example, Hinrikus et al. [12] used an EEG spectral asymmetry index for depression detection. Neuroscience studies have established that although the anatomy of the human brain appears symmetric, the left and right hemispheres respond differently to emotions. For example, Dimond et al. [1], Davidson et al. [13], and Herrington et al. [14] have studied the asymmetry of emotion expression, while Schwartz et al. [15], Wager et al. [16], and Costanzo et al. [17] have discussed emotion lateralization. Furthermore, the EEG emotion recognition literature has seen the use of asymmetry to classify EEG emotional signals. Lin et al. [4] investigated the relationships between emotional states and brain activities, and extracted power spectrum density, differential asymmetry power, and rational asymmetry power as features. Motivated by their previous findings of critical brain areas for emotion recognition, Zheng et al. [5] selected six symmetrical temporal lobe electrodes as the critical channels for EEG emotion recognition. Li et al. [18] separately extracted features from the two brain hemispheres and achieved state-of-the-art classification performance. The above studies demonstrate that integrating the unique characteristics of EEG signals into machine learning algorithms is a promising and fruitful direction. How to utilize this discrepancy property of the two brain hemispheres to improve EEG emotion recognition is therefore an interesting and meaningful topic.
Thus, in this paper, we propose a novel neural network model, BiHDM, to learn the bi-hemispheric discrepancy for EEG emotion recognition. BiHDM aims to obtain the deep discrepant features between the left and right hemispheres, which are expected to contain more discriminative information for recognizing EEG emotion signals. To achieve this goal, we need to solve two major problems, i.e., how to extract the features of each hemisphere's EEG data and, meanwhile, how to measure the difference between them. Unlike other data structures such as skeletal action data, in which the position of each node varies with time, EEG data consist of several electrodes set at predefined coordinates on the scalp. Hence, to avoid losing this intrinsic graph structural information of EEG data, we can simplify the graph structure learning process by using horizontally and vertically traversing RNNs, which construct a complete relationship graph and generate discriminative deep features for all the EEG electrodes. After obtaining these deep features for each electrode, we can extract the asymmetric discrepancy information between the two hemispheres by performing specific pairwise operations on any pair of symmetric electrodes. The concrete process is as follows:
(1) Firstly, we employ two individual RNN modules to separately scan all spatial electrodes' data on the left and right hemispheres and generate deep feature representations for all the EEG electrodes. Herein, when an RNN module traverses the spatial regions, it walks under two predefined stack strategies determined with respect to the horizontal and vertical direction streams;
(2) Then, we perform pairwise operations on the deep features of the paired symmetric electrodes to capture the discrepancy information between the two hemispheres, and extract higher-level discrepancy features for the final classification;

(3) Finally, a classifier and a domain discriminator cooperatively induce the feature learning module to generate emotion-related but domain-invariant features, so that the domain shift between training and testing data is reduced.
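As an illustration of the two directional stack strategies in step (1), the following sketch generates horizontal (row-major) and vertical (column-major) scan orders for one hemisphere's electrodes. The electrode names and grid coordinates are hypothetical placeholders, not the actual 62-channel montage.

```python
# Hypothetical sketch of the two traversal orders used by the directional RNNs.
# Electrode names and (row, col) grid coordinates are illustrative only.
left_coords = {"FP1": (0, 0), "F3": (1, 0), "F7": (1, 1), "C3": (2, 0)}

def scan_order(coords, direction):
    """Return electrode names sorted for a directional traversal."""
    if direction == "horizontal":      # row-major: sweep each row in turn
        key = lambda name: coords[name]
    elif direction == "vertical":      # column-major: sweep each column in turn
        key = lambda name: (coords[name][1], coords[name][0])
    else:
        raise ValueError(direction)
    return sorted(coords, key=key)

print(scan_order(left_coords, "horizontal"))  # ['FP1', 'F3', 'F7', 'C3']
print(scan_order(left_coords, "vertical"))    # ['FP1', 'F3', 'C3', 'F7']
```

Each ordering fixes the predecessor relation used by the corresponding directional RNN, so the two scans together cover both spatial orientations.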
To the best of our knowledge, this is the first work to integrate the electrodes' discrepancy relation between the two hemispheres into deep learning models to improve EEG emotion recognition. The experimental results verify the discriminability and effectiveness of this differential information between the left and right hemispheres for EEG emotion recognition.
The remainder of this paper is organized as follows: In Section II, we specify the BiHDM method as well as its application to EEG emotion recognition. In Section III, we conduct extensive experiments to evaluate the proposed method. Sections IV and V present the discussion and the conclusion, respectively.
II. THE PROPOSED MODEL FOR EEG EMOTION RECOGNITION
A. The BiHDM model
To specify the proposed method clearly, we illustrate the framework of the BiHDM model in Fig. 1. Its goal is to capture the asymmetric differential information between the two hemispheres, which we achieve in three steps. First, we obtain the deep representations of all the electrodes' data. Second, we characterize the relationship between the identified paired electrodes on the two hemispheres and generate a more discriminative, higher-level discrepancy feature for final classification. Third, we leverage a classifier and a discriminator that cooperatively induce the above process to generate emotion-related but domain-invariant features. The overall process is described as follows.
- Obtaining the deep representation for each electrode: In BiHDM, we attempt to separately extract the EEG electrodes' deep features on the left and right brain hemispheres by using two independent RNN modules. To avoid losing the intrinsic graph structural information of EEG data, for each hemisphere we build an RNN module that traverses the spatial regions under two predefined stacks, determined with respect to the horizontal and vertical directions. These two directional RNNs are complementary, and together they provide a simple way to construct a complete relationship graph over the electrodes' locations. Concretely, an EEG sample is written as $\mathbf{X}_{t}=[\mathbf{X}_{t}^{l},\mathbf{X}_{t}^{r}]=[\mathbf{x}_{1}^{l},\cdots,\mathbf{x}_{N/2}^{l},\mathbf{x}_{1}^{r},\cdots,\mathbf{x}_{N/2}^{r}]\in\mathbb{R}^{d\times N}$, where $\mathbf{X}_{t}^{l}$ and $\mathbf{X}_{t}^{r}$ denote the data on the left and right hemispheres, $d$ is the dimension of each EEG electrode's data, and $N$ is the number of electrodes. When modeling spatial dependencies, two graphs, i.e., $\mathrm{G}^{l}=\{\mathrm{N}^{l},\mathrm{E}^{l}\}$ and $\mathrm{G}^{r}=\{\mathrm{N}^{r},\mathrm{E}^{r}\}$, are used to separately represent the electrodes' spatial relations on the left and right hemispheres, where $\mathrm{N}^{l}=\{\mathbf{x}_{i}^{l}\}$ and $\mathrm{N}^{r}=\{\mathbf{x}_{i}^{r}\},(i=1,2,\cdots,\frac{N}{2})$ denote the electrode sets, while $\mathrm{E}^{l}=\{e_{ij}^{l}\}$ and $\mathrm{E}^{r}=\{e_{ij}^{r}\}$ represent the edges between spatially neighboring electrodes. Then we traverse $\mathrm{G}^{l}$ and $\mathrm{G}^{r}$ separately with a predefined forward evolution sequence so that the input state and the previous states can be defined for an RNN unit. This process can be formulated as
$$
\begin{aligned}
\mathbf{s}_{i}^{l} &= \sigma\Big(\mathbf{U}^{l}\mathbf{x}_{i}^{l}+\sum_{j=1}^{N/2}e_{ij}^{l}\,\mathbf{V}^{l}\mathbf{s}_{j}^{l}+\mathbf{b}^{l}\Big)\in\mathbb{R}^{d_{l}\times 1},\\
\mathbf{s}_{i}^{r} &= \sigma\Big(\mathbf{U}^{r}\mathbf{x}_{i}^{r}+\sum_{j=1}^{N/2}e_{ij}^{r}\,\mathbf{V}^{r}\mathbf{s}_{j}^{r}+\mathbf{b}^{r}\Big)\in\mathbb{R}^{d_{r}\times 1},
\end{aligned}
$$

Fig. 1: The framework of BiHDM. BiHDM consists of four RNN modules that capture each hemisphere's EEG electrode information from the horizontal and vertical streams. All the electrodes' data representations then interact to construct the final vector for the classifier and discriminator.
where $\mathbf{s}_{i}^{l},\mathbf{s}_{i}^{r}$ and $d_{l},d_{r}$ are the hidden units and the dimensions of the RNN modules on the left and right hemispheres, respectively; $\sigma(\cdot)$ denotes a nonlinear operation such as the sigmoid function; $\{\mathbf{U}^{l}\in\mathbb{R}^{d_{l}\times d},\mathbf{V}^{l}\in\mathbb{R}^{d_{l}\times d_{l}},\mathbf{b}^{l}\in\mathbb{R}^{d_{l}\times1}\}$ and $\{\mathbf{U}^{r}\in\mathbb{R}^{d_{r}\times d}$, $\mathbf{V}^{r}\in\mathbb{R}^{d_{r}\times d_{r}}$, $\mathbf{b}^{r}\in\mathbb{R}^{d_{r}\times1}\}$ are the learnable transformation matrices of the two hemispheric RNN modules; and $\mathcal{N}(\mathbf{x}_{i}^{\cdot})$ denotes the set of predecessors of the node $\mathbf{x}_{i}^{\cdot}$, so that $e_{ij}^{\cdot}$ is nonzero only when $\mathbf{x}_{j}^{\cdot}\in\mathcal{N}(\mathbf{x}_{i}^{\cdot})$. Here $d_{l}=d_{r}$. As the RNN modules traverse all the nodes in $\mathrm{N}^{l}$ and $\mathrm{N}^{r}$, the obtained hidden states $\mathbf{s}_{i}^{l}$ and $\mathbf{s}_{i}^{r}$ can be used as the deep features representing the $i$-th electrode's data on the two hemispheres.
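The hidden-state recursion above can be sketched in NumPy as follows. The dimensions, the sigmoid nonlinearity, and the chain-like predecessor structure encoded in the adjacency matrix `E_l` are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_l, n = 5, 4, 3                     # input dim, hidden dim, electrodes per hemisphere
U = rng.standard_normal((d_l, d))       # plays the role of U^l
V = rng.standard_normal((d_l, d_l))     # plays the role of V^l
b = rng.standard_normal((d_l, 1))       # plays the role of b^l

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hemisphere_states(X, E):
    """X: (d, n) electrode data; E: (n, n) with E[i, j] != 0 iff electrode j
    is a predecessor of electrode i in the traversal. Returns states (d_l, n)."""
    S = np.zeros((d_l, n))
    for i in range(n):                  # traversal order: predecessors have index < i
        recur = np.zeros((d_l, 1))
        for j in range(n):
            recur += E[i, j] * (V @ S[:, j:j+1])
        S[:, i:i+1] = sigmoid(U @ X[:, i:i+1] + recur + b)
    return S

X_l = rng.standard_normal((d, n))       # one hemisphere of a toy EEG sample
E_l = np.tril(np.ones((n, n)), k=-1)    # chain-like predecessor structure
S_l = hemisphere_states(X_l, E_l)
print(S_l.shape)  # (4, 3)
```

The same routine, run with the horizontal and vertical traversal orders on each hemisphere, would yield the four feature sets used below.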
Particularly, the left and right hemispheric RNN modules traverse the spatial regions under the two predefined horizontal and vertical stacks. Therefore, we obtain two paired deep feature sets, i.e., $(\mathbf{S}_{t}^{lh},\mathbf{S}_{t}^{rh})$ and $(\mathbf{S}_{t}^{lv},\mathbf{S}_{t}^{rv})$, where $\mathbf{S}_{t}^{lh}=\{\mathbf{s}_{i}^{lh}\}\in\mathbb{R}^{d_{l}\times(N/2)}$ and $\mathbf{S}_{t}^{rh}=\{\mathbf{s}_{i}^{rh}\}\in\mathbb{R}^{d_{r}\times(N/2)}$ represent the left and right hemispheric electrodes' deep features under the horizontal direction, while $\mathbf{S}_{t}^{lv}=\{\mathbf{s}_{i}^{lv}\}\in\mathbb{R}^{d_{l}\times(N/2)}$ and $\mathbf{S}_{t}^{rv}=\{\mathbf{s}_{i}^{rv}\}\in\mathbb{R}^{d_{r}\times(N/2)}$ represent the deep features under the vertical direction. So far, we have obtained the deep representation of each electrode, which carries emotional discriminative information while keeping the locational structural relation.
- Interaction between the paired electrodes on the two hemispheres: After obtaining the deep features of every electrode above, i.e., $(\mathbf{S}_{t}^{lh},\mathbf{S}_{t}^{rh})$ and $(\mathbf{S}_{t}^{lv},\mathbf{S}_{t}^{rv})$, we perform a specific pairwise operation on the paired electrodes at symmetric locations on the scalp to identify the asymmetric differential information between the two hemispheres. This operation can be expressed as
$$
\begin{aligned}
\hat{\mathbf{S}}_{t}^{h} &= \mathcal{F}(\mathbf{S}_{t}^{lh},\mathbf{S}_{t}^{rh})=\mathcal{F}(\{\mathbf{s}_{i}^{lh}\},\{\mathbf{s}_{i}^{rh}\})\in\mathbb{R}^{d_{p}\times(N/2)},\\
\hat{\mathbf{S}}_{t}^{v} &= \mathcal{F}(\mathbf{S}_{t}^{lv},\mathbf{S}_{t}^{rv})=\mathcal{F}(\{\mathbf{s}_{i}^{lv}\},\{\mathbf{s}_{i}^{rv}\})\in\mathbb{R}^{d_{p}\times(N/2)},
\end{aligned}
$$
where $\hat{\mathbf{S}}_{t}^{h}=\{\hat{\mathbf{s}}_{i}^{h}\}$ and $\hat{\mathbf{S}}_{t}^{v}=\{\hat{\mathbf{s}}_{i}^{v}\}$ are the deep asymmetric differential features, and $\mathcal{F}(\cdot)$ denotes the pairwise operation between any two paired electrodes' data representations.
where $d_{p_{1}}=d_{p_{2}}=d_{l}$ and $d_{p_{3}}=1$.
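The pairwise operation $\mathcal{F}$ is constrained here only through its output dimensions ($d_{p_{1}}=d_{p_{2}}=d_{l}$, $d_{p_{3}}=1$). The sketch below shows three natural candidates consistent with those shapes (subtraction, division, and inner product); these are assumptions for illustration, not the authors' definitive choices.

```python
import numpy as np

def pairwise(s_left, s_right, op):
    """Candidate pairwise operations on one symmetric electrode pair."""
    if op == "subtract":                  # difference feature, (d_l,) -> (d_l,)
        return s_left - s_right
    if op == "divide":                    # elementwise ratio, (d_l,) -> (d_l,)
        return s_left / (s_right + 1e-8)  # small epsilon avoids division by zero
    if op == "inner":                     # scalar similarity, (d_l,) -> (1,)
        return np.array([s_left @ s_right])
    raise ValueError(op)

a = np.array([1.0, 2.0])                  # toy left-hemisphere feature s_i^l
b = np.array([0.5, 4.0])                  # toy right-hemisphere feature s_i^r
print(pairwise(a, b, "subtract"))  # [ 0.5 -2. ]
print(pairwise(a, b, "inner"))     # [8.5]
```

Applying the chosen operation column by column to the paired feature sets yields the differential matrices $\hat{\mathbf{S}}_{t}^{h}$ and $\hat{\mathbf{S}}_{t}^{v}$.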
To further capture higher-level discrepancy discriminative features, we utilize another RNN module that operates on the differential asymmetric features $\{\hat{\mathbf{s}}_{i}^{h}\}$ and $\{\hat{\mathbf{s}}_{i}^{v}\}$ obtained from the horizontal and vertical streams. Formally, the operations on them can be written as
$$
\begin{aligned}
\tilde{\mathbf{s}}_{i}^{h} &= \sigma(\mathbf{U}^{h}\hat{\mathbf{s}}_{i}^{h}+\mathbf{V}^{h}\tilde{\mathbf{s}}_{i-1}^{h}+\mathbf{b}^{h})\in\mathbb{R}^{d_{g}\times 1},\\
\tilde{\mathbf{s}}_{i}^{v} &= \sigma(\mathbf{U}^{v}\hat{\mathbf{s}}_{i}^{v}+\mathbf{V}^{v}\tilde{\mathbf{s}}_{i-1}^{v}+\mathbf{b}^{v})\in\mathbb{R}^{d_{g}\times 1},
\end{aligned}
$$
where $\{\mathbf{U}^{h}\in\mathbb{R}^{d_{g}\times d_{p}}$, $\mathbf{V}^{h}\in\mathbb{R}^{d_{g}\times d_{g}}$, $\mathbf{b}^{h}\in\mathbb{R}^{d_{g}\times1}\}$ and $\{\mathbf{U}^{v}\in\mathbb{R}^{d_{g}\times d_{p}}$, $\mathbf{V}^{v}\in\mathbb{R}^{d_{g}\times d_{g}}$, $\mathbf{b}^{v}\in\mathbb{R}^{d_{g}\times1}\}$ are the learnable parameter matrices, and $d_{g}$ is the hidden unit dimension of the high-level RNN module. Moreover, to automatically detect the salient emotion-related information among these paired differential features, projection matrices are applied to the higher-level discrepancy discriminative features $\{\tilde{\mathbf{s}}_{i}^{h}\}$ and $\{\tilde{\mathbf{s}}_{i}^{v}\}$ obtained by Eqs. (7) and (8). Denoting the projection matrices for the horizontal and vertical traversing directions by $\mathbf{W}^{h}=[w_{ik}^{h}]_{(N/2)\times K}$ and $\mathbf{W}^{v}=[w_{ik}^{v}]_{(N/2)\times K}$, the projection can be written as
$$
\begin{aligned}
\bar{\mathbf{s}}_{k}^{h} &= \sigma\Big(\sum_{i=1}^{N/2}w_{ik}^{h}\tilde{\mathbf{s}}_{i}^{h}+\hat{\mathbf{b}}^{h}\Big)\in\mathbb{R}^{d_{g}\times 1},\quad k=1,2,\cdots,K,\\
\bar{\mathbf{s}}_{k}^{v} &= \sigma\Big(\sum_{i=1}^{N/2}w_{ik}^{v}\tilde{\mathbf{s}}_{i}^{v}+\hat{\mathbf{b}}^{v}\Big)\in\mathbb{R}^{d_{g}\times 1},\quad k=1,2,\cdots,K.
\end{aligned}
$$
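Each output slot $k$ of this projection is a learned weighted sum over the $N/2$ high-level features, passed through the nonlinearity. A minimal NumPy sketch with illustrative shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
d_g, n_half, K = 4, 3, 2                      # hidden dim, N/2, number of output slots
S_tilde = rng.standard_normal((d_g, n_half))  # columns play the role of the s~_i
W = rng.standard_normal((n_half, K))          # projection weights w_ik
b_hat = rng.standard_normal((d_g, 1))         # bias term

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Column k of S_bar is sigma(sum_i w_ik * s~_i + bias), i.e. a learned
# weighted sum over the N/2 high-level features.
S_bar = sigmoid(S_tilde @ W + b_hat)          # (d_g, K)
print(S_bar.shape)  # (4, 2)
```

Writing the sum as a single matrix product `S_tilde @ W` makes the whole projection one dense operation per direction stream.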
Finally, we use two learnable mapping matrices $\mathbf{G}^{h}\in\mathbb{R}^{d_{o}\times d_{g}}$ and $\mathbf{G}^{v}\in\mathbb{R}^{d_{o}\times d_{g}}$ to summarize the stimuli $\bar{\mathbf{S}}_{t}^{h}=\{\bar{\mathbf{s}}_{k}^{h}\}\in\mathbb{R}^{d_{g}\times K}$ and $\bar{\mathbf{S}}_{t}^{v}=\{\bar{\mathbf{s}}_{k}^{v}\}\in\mathbb{R}^{d_{g}\times K}$ from the two directional streams, namely,
$$
\mathbf{S}_{t}^{hv}=\mathbf{G}^{h}\bar{\mathbf{S}}_{t}^{h}+\mathbf{G}^{v}\bar{\mathbf{S}}_{t}^{v}\in\mathbb{R}^{d_{o}\times K}.
$$
Thus far, for an input EEG sample $\mathbf{X}_{t}$, the output feature $\mathbf{S}_{t}^{hv}$ has been obtained.
- Discriminative prediction and domain adversarial strategy: Like most supervised models, we add a supervision term into the network by applying the softmax function to the output feature $\mathbf{S}_{t}^{hv}=\{\mathbf{s}_{k}^{hv}\}$, $(k=1,\cdots,K)$ to predict the class label.
Let $\mathbf{o}=[(\mathbf{s}_{1}^{hv})^{\mathrm{T}},(\mathbf{s}_{2}^{hv})^{\mathrm{T}},\cdots,(\mathbf{s}_{K}^{hv})^{\mathrm{T}}]\in\mathbb{R}^{1\times Kd_{o}}$ denote the output feature vector; then
$$
\mathbf{y}=\mathbf{o}\mathbf{P}+\mathbf{b}^{c}=\{y_{1},y_{2},\cdots,y_{C}\}\in\mathbb{R}^{1\times C},
$$
where $\mathbf{P}\in\mathbb{R}^{Kd_{o}\times C}$ is the transformation matrix, $\mathbf{b}^{c}\in\mathbb{R}^{1\times C}$ is the bias term, and $C$ is the number of emotion types.
Finally, the output vector of BiHDM is fed into the softmax layer for emotion classification, which can be written as
$$
P(c|\mathbf{X}_{t})=\exp(y_{c})\Big/\sum_{i=1}^{C}\exp(y_{i}),
$$
where $P(c|\mathbf{X}_{t})$ denotes the predicted probability that the input sample $\mathbf{X}_{t}$ belongs to the $c$-th class. As a result, the label $\tilde{l}_{t}$ of sample $\mathbf{X}_{t}$ is predicted as
$$
\tilde{l}_{t}=\arg\operatorname*{max}_{c}\,P(c|\mathbf{X}_{t}).
$$
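The prediction path, from the flattened feature vector $\mathbf{o}$ through the linear map and softmax to the predicted label, can be sketched compactly; all matrix values below are toy numbers chosen for illustration.

```python
import numpy as np

def predict(o, P, b_c):
    """o: (1, K*d_o) feature vector; P: (K*d_o, C) transform; b_c: (1, C) bias.
    Returns class probabilities and the predicted label index."""
    y = o @ P + b_c                       # y = oP + b^c
    e = np.exp(y - y.max())               # numerically stable softmax
    p = e / e.sum()
    return p, int(p.argmax())             # argmax_c P(c | X_t)

o = np.array([[1.0, -1.0]])               # toy flattened feature, K*d_o = 2
P = np.array([[2.0, 0.0, -1.0],
              [0.0, 1.0, 0.5]])           # C = 3 emotion classes
b_c = np.zeros((1, 3))
probs, label = predict(o, P, b_c)
print(label)  # -> 0
```

Subtracting `y.max()` before exponentiating leaves the softmax value unchanged while avoiding overflow for large logits.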
The classifier is then trained by minimizing the cross-entropy loss over the source (training) samples,
$$
L_{c}(\mathbf{X}^{S};\theta_{f},\theta_{c})=-\sum_{t=1}^{M_{1}}\log P(l_{t}|\mathbf{X}_{t}^{S}),
$$
where $\theta_{f}$ and $\theta_{c}$ denote the learnable parameters of the feature extraction module and the classifier, while $l_{t}$ and $M_{1}$ are the ground-truth label of sample $\mathbf{X}_{t}$ and the number of training samples. By minimizing the above loss function, discriminative features can be extracted for emotion recognition.
To align the feature distributions between the source and target domains, we adopt a domain adversarial strategy by adding a discriminator into the network. It works cooperatively with the classifier to induce the feature extraction process to generate emotion-distinguishable but domain-invariant features.
Concretely, we predefine the source domain label set $D_{S}=\{0,0,\cdots,0\}\in\mathbb{Z}^{M_{1}\times1}$ and the target domain label set $D_{T}=\{1,1,\cdots,1\}\in\mathbb{Z}^{M_{2}\times1}$, where $M_{2}$ is the number of testing samples. Then, through maximizing the loss function of the discriminator, which can be denoted as
$$
\begin{aligned}
&L_{d}(\mathbf{X}_{t}^{S},\mathbf{X}_{t^{\prime}}^{T};\theta_{f},\theta_{d})\\
&\quad=-\sum_{t=1}^{M_{1}}\log P(0|\mathbf{X}_{t}^{S})-\sum_{t^{\prime}=1}^{M_{2}}\log P(1|\mathbf{X}_{t^{\prime}}^{T}),
\end{aligned}
$$
the feature extraction process is expected to generate data representations that confuse the discriminator's attempt to distinguish which domain the input comes from (i.e., domain-invariant features). Here $\mathbf{X}_{t}^{S}$ and $\mathbf{X}_{t^{\prime}}^{T}$ denote the $t$-th and $t^{\prime}$-th samples in the source and target data sets, respectively, and $\theta_{d}$ represents the learnable parameters of the discriminator.
B. The optimization of BiHDM
The overall optimization of BiHDM can be expressed as
$$
\begin{aligned}
\operatorname*{min}\;&L(\mathbf{X};\theta_{f},\theta_{c},\theta_{d})\\
&=\operatorname*{min}\,L_{c}(\mathbf{X}^{S};\theta_{f},\theta_{c})+\operatorname*{max}\,L_{d}(\mathbf{X}^{S},\mathbf{X}^{T};\theta_{f},\theta_{d}),
\end{aligned}
$$
where $L(\cdot)$ is the loss function of the overall model, and $\mathbf{X}$ denotes the entire data set consisting of the source data set $\mathbf{X}^{S}$ and the target data set $\mathbf{X}^{T}$, i.e., $\mathbf{X}=[\mathbf{X}^{S},\mathbf{X}^{T}]\in\mathbb{R}^{d\times N\times(M_{1}+M_{2})}$.
This min-max loss function forces the parameters of the feature extraction module to generate emotion-related but domain-invariant data representations, which benefits EEG emotion recognition because EEG emotional signals exhibit a tremendous data distribution shift, especially in the subject-independent task where the source and target data come from different subjects.
Specifically, the maximization problem can be converted into a minimization problem by inserting a gradient reversal layer (GRL) [19] before the discriminator, which can then be easily optimized with the stochastic gradient descent (SGD) algorithm [20]. The GRL acts as an identity transform during forward propagation but reverses the sign of the gradient during back-propagation. The overall optimization process follows the rules below
$$
\theta_{c}\leftarrow\theta_{c}-\alpha\frac{\partial L_{c}}{\partial\theta_{c}},\quad
\theta_{d}\leftarrow\theta_{d}-\alpha\frac{\partial L_{d}}{\partial\theta_{d}},\quad
\theta_{f}\leftarrow\theta_{f}-\alpha\left(\frac{\partial L_{c}}{\partial\theta_{f}}-\frac{\partial L_{d}}{\partial\theta_{f}}\right),
$$
where $\alpha$ is the learning rate. In this way, we can iteratively train the classifier and the discriminator, updating the parameters via the chain rule in the same manner as standard deep learning methods.
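The GRL and the alternating updates above can be sketched in plain Python. This is a conceptual sketch, not the authors' TensorFlow implementation; the scalar parameters, toy gradients, and the `lam` scaling factor are illustrative assumptions:

```python
class GradientReversal:
    """Identity in the forward pass; flips the gradient sign (scaled by lam)
    in the backward pass, so maximizing L_d becomes a standard minimization."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # identity transform

    def backward(self, grad_out):
        return -self.lam * grad_out  # reversed gradient


def sgd_step(theta_f, theta_c, theta_d, dLc_df, dLc_dc, dLd_dd, dLd_df, alpha=0.003):
    """One update following the rules above: classifier and discriminator
    descend their own losses; the feature extractor descends L_c but ascends L_d."""
    theta_c -= alpha * dLc_dc
    theta_d -= alpha * dLd_dd
    theta_f -= alpha * (dLc_df - dLd_df)
    return theta_f, theta_c, theta_d
```

In a framework with automatic differentiation, only the reversal layer is needed; the feature extractor's opposing update then falls out of ordinary back-propagation.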
III. EXPERIMENTS
A. Experimental setup
To evaluate the proposed BiHDM model, in this section we conduct experiments on three public EEG emotional datasets. All three datasets were collected while the participants sat comfortably in front of a monitor and watched emotional video clips. The EEG signals were recorded from 62 electrode channels using an ESI NeuroScan system with a sampling rate of $1000~\mathrm{Hz}$. The electrode locations follow the international 10-20 system. Thus, in the experiments, we perform the pairwise operation on the 31 electrode pairs at symmetric locations on the left and right hemispheric scalps. The detailed information of these datasets is as follows:
(1) SEED [21]. The SEED dataset contains 15 subjects, each with three sessions. During the experiment, the participants watched three kinds of emotional film clips, i.e., happy, neutral and sad, with 5 film clips per emotion. Consequently, there are 15 trials in total, and each trial has 185-238 samples per session of each subject, giving about 3400 samples per session;
(2) SEED-IV [5]. The SEED-IV dataset also contains 15 subjects, each with three sessions. Compared with SEED, it includes four emotion types, adding the emotion fear, with 6 film clips per emotion. Thus there are 24 trials in total, and each trial has 12-64 samples per session of each subject, giving about 830 samples per session;
(3) MPED [22]. The MPED dataset contains 30 subjects, each with one session. It includes seven refined emotion types, i.e., joy, funny, neutral, sad, fear, disgust and anger, with 4 film clips per emotion. There are 28 trials in total, and each trial has 120 samples, giving 3360 samples per subject.
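The symmetric pairing described above can be sketched as follows. The five 10-20 electrode names and the random 5-band features are illustrative assumptions; the real montage has 31 pairs over 62 channels:

```python
import numpy as np

# A few illustrative symmetric 10-20 pairs (left, right); the real montage has 31 pairs.
pairs = [("F3", "F4"), ("C3", "C4"), ("P3", "P4"), ("O1", "O2"), ("T7", "T8")]

# Toy per-electrode features, e.g. 5-band differential entropy values.
rng = np.random.default_rng(0)
feats = {ch: rng.standard_normal(5) for pair in pairs for ch in pair}

# Pairwise (subtraction) discrepancy features, one row per symmetric pair.
diffs = np.stack([feats[left] - feats[right] for left, right in pairs])
```

In the actual model the pairwise operation is applied to the electrodes' learned deep representations rather than to the raw handcrafted features.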
To evaluate the proposed BiHDM model adequately, we design two kinds of experiments, namely subject-dependent and subject-independent ones. We use the released handcrafted features, i.e., the differential entropy (DE) for SEED and SEED-IV, and the short-time Fourier transform (STFT) features for MPED, as the input to our model. Thus the sizes $d\times N$ of the input sample $\mathbf{X}_{t}$ are $5\times62$, $5\times62$ and $1\times62$ for the three datasets, respectively. Moreover, in the experiments, we set the dimension $d_{l}$ of each electrode's deep representation to 32, the parameters $d_{g}$ and $K$ of the global high-level feature to 32 and 6, and the dimension $d_{o}$ of the output feature to 16, without elaborate tuning. Specifically, we implemented BiHDM using TensorFlow on one Nvidia 1080Ti GPU. The learning rate, momentum and weight decay rate are set to 0.003, 0.9 and 0.95, respectively. The network is trained using SGD with a batch size of 200. In addition, we adopt subtraction as the pairwise operation of the BiHDM model in this section, and discuss the other two types of operations in Section III-D.
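For reference, the hyperparameters just listed can be collected in one place. The values are taken from the text; the dictionary layout itself is only a sketch, not the authors' configuration format:

```python
# Hyperparameters reported in the text (the dict structure is our own convention).
config = {
    "d_l": 32,           # dimension of each electrode's deep representation
    "d_g": 32,           # global high-level feature parameter
    "K": 6,              # global high-level feature parameter
    "d_o": 16,           # output feature dimension
    "lr": 0.003,         # SGD learning rate
    "momentum": 0.9,
    "weight_decay": 0.95,
    "batch_size": 200,
}

# Input sample sizes d x N per dataset (DE for SEED/SEED-IV, STFT for MPED).
input_shape = {"SEED": (5, 62), "SEED-IV": (5, 62), "MPED": (1, 62)}
```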
B. The EEG emotion recognition experiments
- The subject-dependent experiment: In this experiment, we adopt the same protocols as [21], [5] and [22]. Namely, for SEED, we use the first nine trials of EEG data per session of each subject as source (training) domain data and the remaining six trials per session as target (testing) domain data; for SEED-IV, we use the first sixteen trials per session of each subject as training data, and the last eight trials, which cover all emotions (two trials per emotion), as testing data; for MPED, we use twenty-one trials of EEG data as training data and the remaining seven trials, covering the seven emotions, as testing data for each subject. The mean accuracy (ACC) and standard deviation (STD) over all the subjects in each dataset are used as the final evaluation metrics.
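Under the assumption of zero-based trial indexing, the protocol above corresponds to the following trial splits:

```python
# Trial-index splits per dataset, following the protocol in the text
# (zero-based indexing is our own assumption).
splits = {
    "SEED":    {"train": list(range(0, 9)),  "test": list(range(9, 15))},   # 9 / 6 trials
    "SEED-IV": {"train": list(range(0, 16)), "test": list(range(16, 24))},  # 16 / 8 trials
    "MPED":    {"train": list(range(0, 21)), "test": list(range(21, 28))},  # 21 / 7 trials
}
```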
To validate the superiority of BiHDM, we also conduct the same experiments using twelve methods: linear support vector machine (SVM) [23], random forest (RF) [24], canonical correlation analysis (CCA) [25], group sparse canonical correlation analysis (GSCCA) [10], deep belief network (DBN) [21], graph regularized sparse linear regression (GRSLR) [26], graph convolutional neural network (GCNN) [27], dynamical graph convolutional neural network (DGCNN) [28], domain adversarial neural network (DANN) [19], bi-hemisphere domain adversarial neural network (BiDANN) [29], EmotionMeter [5], and attention-long short-term memory (A-LSTM) [22]. All the compared methods are representative ones from previous studies. We directly take (or reproduce) their results from the literature to ensure a convincing comparison with the proposed method. The results are summarized in Table I.
TABLE I: The classification performance for subject-dependent EEG emotion recognition on SEED, SEED-IV and MPED datasets.
$^*$ indicates the experiment results are based on our own implementation. — indicates the experiment results are not reported on that dataset.
| Method | SEED | SEED-IV | MPED |
|---|---|---|---|
| SVM [23] | 83.99/09.72 | 56.61/20.05* | 32.39/09.53* |
| RF [24] | 78.46/11.77 | 50.97/16.22* | 23.83/06.82* |
| CCA [25] | 77.63/13.21 | 54.47/18.48* | 29.08/07.96* |
| GSCCA [10] | 82.96/09.95 | 69.08/16.66* | 36.78/07.76* |
| DBN [21] | 86.08/08.34 | 66.77/07.38* | 35.07/11.25* |
| GRSLR [26] | 87.39/08.64 | 69.32/19.57* | 34.58/08.41* |
| GCNN [27] | 87.40/09.20 | 68.34/15.42* | 33.26/06.44* |
| DGCNN [28] | 90.40/08.49 | 69.88/16.29* | 32.37/06.08* |
| DANN [19] | 91.36/08.30 | 63.07/12.66* | 35.04/06.52* |
| BiDANN [29] | 92.38/07.04 | 70.29/12.63* | 37.71/06.04* |
| EmotionMeter [5] | — | 70.59/17.01 | — |
| A-LSTM [22] | 88.61/10.16* | 69.50/15.65* | 38.99/07.53* |
| BiHDM | 93.12/06.06 | 74.35/14.09 | 40.34/07.59 |
From Table I, we can see that the proposed BiHDM model outperforms all the compared methods on all three public EEG emotional datasets, which verifies the effectiveness of BiHDM. In particular, on SEED-IV, the proposed method improves over the state-of-the-art method EmotionMeter by 4%. Besides, the compared method BiDANN, which also considers bi-hemispheric asymmetry, achieves comparable performance. The main difference between BiDANN and BiHDM is that the former adopts two hemispheric local discriminators to separately narrow the left and right hemispheric data distribution gaps between the source and target domains, but does not directly capture the discrepancy information; in contrast, the latter (i.e., the proposed BiHDM) focuses on constructing a model that learns the discrepancy relation between the two hemispheres, and these differential components are beneficial for emotion recognition. Meanwhile, the results of both BiHDM and BiDANN indicate the importance of considering the difference between left and right cerebral hemispheric data for EEG emotion recognition.
To test whether the proposed BiHDM is statistically significantly better than the baseline method, paired t-test statistical analysis is conducted at the significance level of 0.05. When the improvement of BiHDM over a method is statistically significant, the result is underlined in the table. Table II shows the t-test statistical analysis results, from which we can see that BiHDM is significantly better than the baseline method.
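The paired t statistic used here can be computed with the standard library alone. This is a sketch: in practice the p-value is read from the t distribution with $n-1$ degrees of freedom (e.g. via `scipy.stats.ttest_rel`), and the accuracy lists below are hypothetical:

```python
import math
import statistics

def paired_t_statistic(acc_a, acc_b):
    """Paired t statistic over per-subject accuracies of two methods."""
    d = [a - b for a, b in zip(acc_a, acc_b)]  # per-subject differences
    n = len(d)
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
```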
TABLE II: The t-test statistical analysis between BiHDM and the baseline method at the significance level of 0.05. When the improvement of BiHDM over a method is statistically significant, the result is underlined.
$^{a}$ and $^b$ indicate the subject-dependent and independent experiment results respectively.
| Method (p-value) | SEED | SEED-IV | MPED |
|---|---|---|---|
| BiHDM vs. BiDANN | 0.0580$^{a}$ | 0.0344$^{a}$ | 0.0488$^{a}$ |
|  | 0.0451$^{b}$ | 0.0188$^{b}$ | 0.0091$^{b}$ |
Besides, although the representative methods DANN and BiDANN in Table I use the unlabelled testing data to enhance their performance, some compared baseline methods use only the labelled training data to learn the model. For a fair comparison with them, we follow their setting by removing the discriminator and using only the labelled training data in the same experiments. The accuracy becomes 91.07%, 72.22% and 38.55% on the SEED, SEED-IV and MPED datasets, respectively, which is still comparable. This indicates that our differential features are indeed more discriminative.
- The subject-independent experiment: In this experiment, we adopt the leave-one-subject-out (LOSO) cross-validation strategy [30] to evaluate the proposed BiHDM model. LOSO strategy uses the EEG signals of one subject as testing data and the rest subjects’ EEG signals as training data. This procedure is repeated such that the EEG signals of each subject will be used as testing data once. Again, the mean accuracy (ACC) and standard deviation (STD) are used as the evaluation metrics.
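The LOSO protocol can be expressed as a simple split generator (a sketch; zero-based subject indexing is our own assumption):

```python
def loso_splits(n_subjects):
    """Yield (train_subjects, test_subject) pairs; each subject is held out once."""
    for test in range(n_subjects):
        train = [s for s in range(n_subjects) if s != test]
        yield train, test
```

For SEED and SEED-IV this produces 15 folds; for MPED, 30 folds.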
In addition, for comparison purposes, we use twelve methods, including the Kullback-Leibler importance estimation procedure (KLIEP) [31], unconstrained least-squares importance fitting (ULSIF) [32], the selective transfer machine (STM) [33], linear SVM, transfer component analysis (TCA) [34], subspace alignment (SA) [35], the geodesic flow kernel (GFK) [36], DANN, DGCNN, the deep adaptation network (DAN) [37], BiDANN, and A-LSTM, to conduct the same experiments. Note that the distribution gap in the subject-independent task is much larger than in the subject-dependent one, so transfer learning methods often achieve promising performance.
Therefore, in the subject-independent task, we include many domain adaptation methods in the comparison. By doing so, we can effectively validate the state-of-the-art performance of our method. The results are shown in Table III$^{3}$.
TABLE III: The classification performance for subject-independent EEG emotion recognition on SEED, SEED-IV and MPED datasets.
$^*$ indicates the experiment results are based on our own implementation. — indicates the experiment results are not reported on that dataset.
All entries are ACC/STD (%).

| Method | SEED | SEED-IV | MPED |
|---|---|---|---|
| KLIEP [31] | 45.71/17.76 | 31.46/09.20* | 18.92/04.54* |
| ULSIF [32] | 51.18/13.57 | 32.99/11.05* | 19.63/03.81* |
| STM [33] | 51.23/14.82 | 39.39/12.40* | 20.89/03.62* |
| SVM [23] | 56.73/16.29 | 37.99/12.52* | 19.66/03.96* |
| TCA [34] | 63.64/14.88 | 56.56/13.77* | 19.50/03.61* |
| SA [35] | 69.00/10.89 | 64.44/09.46* | 20.74/04.17* |
| GFK [36] | 71.31/14.09 | 64.38/11.41* | 20.27/04.34* |
| A-LSTM [22] | 72.18/10.85* | 55.03/09.28* | 24.06/04.58* |
| DANN [19] | 75.08/11.18 | 47.59/10.01* | 22.36/04.37* |
| DGCNN [28] | 79.95/09.02 | 52.82/09.23* | 25.12/04.20* |
| DAN [37] | 83.81/08.56 | 58.87/08.13 | — |
| BiDANN [29] | 83.28/09.60 | 65.59/10.39* | 25.86/04.92* |
| BiHDM | 85.40/07.53 | 69.03/08.66 | 28.27/04.99 |
From Table III, it can be clearly seen that the proposed BiHDM method achieves the best performance on the three public datasets, which verifies the effectiveness of BiHDM for subject-independent EEG emotion recognition. On the three datasets, the improvements in accuracy over the existing state-of-the-art methods are 2.2%, 3.5% and 2.4%, respectively. On the other hand, we also perform the paired t-test between BiHDM and the baseline method at the significance level of 0.05 to see whether BiHDM significantly improves the recognition rate. Table II shows the t-test statistical analysis results, from which we can see that BiHDM is significantly better than the baseline method.
C. Confusion matrix
To examine the confusions of BiHDM in recognizing different emotions, we depict the confusion matrices of the above two experiments in Fig. 2, from which we have two observations:
(1) From Fig. 2(a) and Fig. 2(d), which correspond to the SEED dataset, we can see that the emotions happy and neutral are much easier to recognize than sad. Comparing the results of the two kinds of experiments, it is easy to see that in the subject-independent experiment, where the training and testing data come from different people, the recognition rates of the emotions neutral and sad decrease by about 10% and 9%, while happy decreases by only 3%. We can also observe the same case
$^{3}$Note that subspace-based methods such as TCA, SA and GFK cannot handle a large amount of EEG data due to computer memory limitations and computational cost. Therefore, to compare with them, we randomly select 3000 EEG feature samples from the training set to train these methods.
in the two confusion matrices of the SEED-IV dataset in Fig. 2(b) and Fig. 2(e). This shows that the emotion happy elicits more similar brain responses across different people than neutral and sad;
(2) MPED, which consists of seven emotion types, is much more complicated than the other two datasets. From the subject-dependent experimental result in Fig. 2(c), we find that the emotions funny, neutral and sad are much easier to recognize than the other four emotions. However, comparing it with the subject-independent confusion matrix in Fig. 2(f), we can see that the recognition rate of sad decreases significantly, as in point (1) above. This is possibly because the pattern of the emotion sad varies considerably from one subject to another. Moreover, it is interesting to see that the recognition rate of the emotion funny decreases significantly while that of anger increases, which may be because the participants share a common response to the anger emotional videos but interpret funny differently.
D. Different pairwise operations
In this section, we investigate the performance of different pairwise operations in BiHDM, as shown in Eq. (6). Here, we denote the subtraction, division and inner-product variants as BiHDM-S, BiHDM-D and BiHDM-I, respectively. The results are shown in Fig. 3. As seen, the subtraction operation achieves the best performance among the three pairwise operations. This may be because the subtraction operation directly measures the discrepancy between the two hemispheres, whereas the other two operations describe the difference from other aspects. However, both BiHDM-I and BiHDM-D achieve performance comparable to the other methods shown in Tables I and III, which shows the effectiveness of considering the differential information between the two cerebral hemispheres. We will explore more pairwise operations, such as nonlinear kernel functions, in future work.
To further verify the performance of pairwise operations, we conduct additional subject-dependent EEG emotion recognition experiments on the SEED, SEED-IV and MPED datasets by replacing subtraction with concatenation, obtaining accuracies of 90.52%, 72.68% and 37.89%. This is clearly inferior to the proposed subtraction operation (93.12%, 74.35% and 40.34%), showing that using a pairwise operation to explicitly extract the discrepancy indeed helps EEG emotion recognition.
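The three pairwise operations compared above can be sketched as follows. The epsilon guard in the division variant is our own assumption for numerical stability; the paper does not specify one:

```python
import numpy as np

def pairwise_discrepancy(left, right, op="sub"):
    """Discrepancy between deep representations of symmetric electrode pairs.
    left, right: arrays of shape (n_pairs, d_l)."""
    if op == "sub":    # BiHDM-S: direct difference
        return left - right
    if op == "div":    # BiHDM-D: element-wise ratio (epsilon is an assumption)
        return left / (right + 1e-8)
    if op == "inner":  # BiHDM-I: one inner product per pair
        return np.sum(left * right, axis=-1, keepdims=True)
    raise ValueError(f"unknown pairwise operation: {op}")
```

Note that concatenation, by contrast, keeps both representations side by side and leaves the network to discover the discrepancy implicitly.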
IV. DISCUSSION
A. The activity maps of the paired EEG electrodes
To explore the contribution of the differential information from various brain areas to emotion expression, we depict the electrode activity maps in Fig. 4. The contribution is evaluated by computing each column's 2-norm of the asymmetric differential features $\hat{\mathbf{S}}_{t}^{h}$ and $\hat{\mathbf{S}}_{t}^{v}$ in Eqs. (4) and (5) over all the testing data and mapping these values onto the corresponding electrodes. The two electrodes in a pair share the same value.
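The contribution score described above amounts to a column-wise 2-norm. A sketch with toy data (the real $\hat{\mathbf{S}}_{t}$ features come from the trained model, and the sample count is illustrative):

```python
import numpy as np

# Toy differential features: n_samples x n_pairs (31 symmetric pairs).
rng = np.random.default_rng(0)
S_hat = rng.standard_normal((100, 31))

# One contribution value per electrode pair: the 2-norm of each column.
contribution = np.linalg.norm(S_hat, axis=0)
```

Each value is then assigned to both electrodes of its pair, which is why the resulting activity maps are left-right symmetric.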
Fig. 2: The confusion matrices of the experiments.
From Fig. 4, we can see that frontal EEG asymmetry appears to play a more important role in emotion recognition on all three datasets, which is consistent with cognitive observations in biological psychology [38]. Moreover, for the MPED dataset, which consists of more emotion types, the temporal lobe asymmetry also makes an important contribution, as does the frontal asymmetry.
Specifically, to explore where the differential information comes from for each expressed emotion, we separately depict the electrode activity maps corresponding to each emotion in Fig. 5. Although they look quite similar to Fig. 4, i.e., the asymmetry on the frontal and temporal lobes contributes more to discriminating different emotions, we can observe some delicate distinctions among the maps of different emotions:
(1) For the positive emotions (happy in SEED and SEED-IV, joy and funny in MPED), we can see that the asymmetry on the temporal lobe is as active as (or even more active than) that on the frontal lobe;

Fig. 3: The experimental results of the BiHDM-S, BiHDM-D and BiHDM-I models on the three datasets.

Fig. 4: The EEG electrode activity maps of the subject-dependent experiments. Darker red denotes more significant contribution. Note that these maps are symmetric because they show the value computed by the pairwise operation, which is shared by the paired electrodes. (Best viewed in color.)
B. Electrode reduction
For emotion recognition systems in real-world applications, fewer electrodes are preferred considering feasibility and comfort. Thus, in this section, we investigate how the performance varies with relatively small numbers of electrodes. Motivated by the results in Figs. 4 and 5, we select the paired electrodes in four brain areas according to the locations of the frontal and temporal lobes, denoted by Frontal (6), Frontal (10), Temporal (6) and Temporal (9)$^{4}$. The experimental results are shown in Table IV. We can make two observations from this table:
TABLE IV: The classification performance based on frontal and temporal lobe EEG data for subject-dependent and subject-independent EEG emotion recognition on SEED, SEED-IV and MPED datasets.
(a) Subject-dependent experiment results
All entries are ACC/STD (%).

| Electrode region | SEED | SEED-IV | MPED |
|---|---|---|---|
| Frontal (6) | 80.15/09.86 | 57.93/13.88 | 29.02/05.68 |
| Frontal (10) | 84.49/08.83 | 63.02/16.95 | 32.37/06.79 |
| Temporal (6) | 88.16/08.03 | 64.88/15.76 | 33.61/07.19 |
| Temporal (9) | 90.16/07.44 | 65.19/16.03 | 33.13/07.06 |
| All (31) | 93.12/06.06 | 74.35/14.09 | 40.34/07.59 |
(b) Subject-independent experiment results
All entries are ACC/STD (%).

| Electrode region | SEED | SEED-IV | MPED |
|---|---|---|---|
| Frontal (6) | 74.33/08.70 | 67.28/08.19 | 23.54/02.73 |
| Frontal (10) | 80.28/09.94 | 68.16/07.85 | 25.44/04.95 |
| Temporal (6) | 85.04/07.13 | 65.07/08.74 | 26.07/04.32 |
| Temporal (9) | 84.09/07.78 | 66.92/08.74 | 26.43/04.55 |
| All (31) | 85.40/07.53 | 69.03/08.66 | 28.27/04.99 |
C. The performance based on single hemispheric EEG data
From the above discussion, we can see that the discrepancy information between the left and right hemispheres indeed contributes to the EEG emotion recognition task. On the other hand, it is interesting to investigate which hemisphere is more tightly associated with emotion recognition. Therefore, in this section, we focus on this problem and conduct the same experiments by separately feeding BiHDM with the left and right hemispheric data. The experimental results are shown in Table V, from which we can see that the left hemisphere is superior to the right for EEG emotion recognition, especially on the SEED-IV and MPED datasets. Besides, comparing them with the results in Table IV(a), which are based on feeding the model with fewer symmetric electrodes' data, we can observe that those results are comparable to or even better than this experiment based on single-hemisphere data. This verifies the effectiveness of the discrepancy information for EEG emotion recognition from another aspect.
Fig. 5: The EEG electrode activity maps for different emotions based on the subject-dependent experimental results. Darker red denotes more significant contribution. (a)-(c), (d)-(g) and (h)-(n) are the results on the SEED, SEED-IV and MPED datasets, respectively. Note that these maps are symmetric because they show the value computed by the pairwise operation, which is shared by the paired electrodes. (Best viewed in color.)
TABLE V: The classification performance based on single-hemisphere EEG data for subject-dependent EEG emotion recognition on SEED, SEED-IV and MPED datasets.
All entries are ACC/STD (%).

| Hemisphere | SEED | SEED-IV | MPED |
|---|---|---|---|
| BiHDM-left | 86.63/08.88 | 64.48/15.34 | 35.92/07.26 |
| BiHDM-right | 86.39/07.54 | 60.11/13.53 | 33.08/08.30 |
| BiHDM-overall | 93.12/06.06 | 74.35/14.09 | 40.34/07.59 |
D. The effect of two directional RNNs for extracting spatial information

In BiHDM, horizontal and vertical RNNs are adopted to model the structural relations between the electrodes. To evaluate the effect of this spatial information extraction on emotion recognition, we modified the BiHDM framework to use a single directional RNN, denoted BiHDM-h and BiHDM-v respectively, and conducted the same experiments. The results are summarized in Table VI, from which we can see that the predefined strategy of traversing the spatial region with both horizontal and vertical RNNs achieves much better performance than a single directional RNN. This shows that the proposed spatial feature learning method helps extract discriminative information for EEG emotion recognition.

TABLE VI: The classification performance of different spatial feature extraction methods for EEG emotion recognition on SEED, SEED-IV and MPED datasets. All entries are ACC/STD (%).

(a) Subject-dependent experiment results

| Method | SEED | SEED-IV | MPED |
|---|---|---|---|
| BiHDM-h | 87.47/09.17 | 62.06/15.01 | 36.24/08.41 |
| BiHDM-v | 86.75/07.09 | 65.57/15.43 | 36.69/08.20 |
| BiHDM | 93.12/06.06 | 74.35/14.09 | 40.34/07.59 |

(b) Subject-independent experiment results

| Method | SEED | SEED-IV | MPED |
|---|---|---|---|
| BiHDM-h | 82.38/09.33 | 66.80/08.22 | 28.05/04.98 |
| BiHDM-v | 81.03/10.28 | 66.96/08.28 | 27.86/05.06 |
| BiHDM | 85.40/07.53 | 69.03/08.66 | 28.27/04.99 |
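The two traversal orientations can be illustrated on a toy rectangular electrode grid. This is a simplification under assumed dimensions: the real 62-electrode 10-20 layout is not a perfect grid, and the exact visiting order used by the horizontal and vertical RNNs is defined in the model, not here:

```python
import numpy as np

# Toy 3x4 layout of electrode indices (assumption; the real montage differs).
grid = np.arange(12).reshape(3, 4)

# Horizontal RNN input order: row by row; vertical RNN input order: column by column.
horizontal_order = grid.flatten(order="C").tolist()  # row-major traversal
vertical_order = grid.flatten(order="F").tolist()    # column-major traversal
```

Feeding both orderings to separate RNNs lets each electrode's representation depend on its neighbors along both spatial axes.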
V. CONCLUSION
In this paper, we propose a novel bi-hemispheric discrepancy model (BiHDM) for EEG emotion recognition. The proposed framework is easy to implement and achieves state-of-the-art performance, which shows the effectiveness of incorporating asymmetric differential information into EEG emotion recognition. In future work, we will investigate more left and right hemispheric differential operations to further explore the potential efficacy of cerebral hemisphere asymmetry in EEG emotion recognition.