[Paper Translation] Fill the K-Space and Refine the Image: Prompting for Dynamic and Multi-Contrast MRI Reconstruction


Original paper: https://arxiv.org/pdf/2309.13839v1


Fill the K-Space and Refine the Image: Prompting for Dynamic and Multi-Contrast MRI Reconstruction


Abstract. The key to dynamic or multi-contrast magnetic resonance imaging (MRI) reconstruction lies in exploring inter-frame or inter-contrast information. Currently, the unrolled model, an approach combining iterative MRI reconstruction steps with learnable neural network layers, stands as the best-performing method for MRI reconstruction. However, there are two main limitations to overcome: firstly, the unrolled model structure and GPU memory constraints restrict the capacity of each denoising block in the network, impeding the effective extraction of detailed features for reconstruction; secondly, the existing model lacks the flexibility to adapt to variations in the input, such as different contrasts, resolutions or views, necessitating the training of separate models for each input type, which is inefficient and may lead to insufficient reconstruction. In this paper, we propose a two-stage MRI reconstruction pipeline to address these limitations. The first stage involves filling the missing $k$-space data, which we approach as a physics-based reconstruction problem. We first propose a simple yet efficient baseline model, which utilizes adjacent frames/contrasts and channel attention to capture the inherent inter-frame/-contrast correlation. Then, we extend the baseline model to a prompt-based learning approach, PromptMR, for all-in-one MRI reconstruction from different views, contrasts, adjacent types, and acceleration factors. The second stage is to refine the reconstruction from the first stage, which we treat as a general video restoration problem to further fuse features from neighboring frames/contrasts in the image domain. Extensive experiments show that our proposed method significantly outperforms previous state-of-the-art accelerated MRI reconstruction methods.


Keywords: MRI reconstruction · Prompt-based learning · Dynamic · Multi-contrast · Two-stage approach


1 Introduction


Cardiovascular disease, including conditions such as coronary artery disease, heart failure, and arrhythmias, remains the leading cause of death globally. Cardiac magnetic resonance (CMR) imaging is the most accurate and reliable non-invasive technique for assessing cardiac anatomy, function, and pathology [13]. In the field of accelerated MR imaging (MRI) reconstruction, unrolled networks have achieved state-of-the-art performance. This is attributed to their ability to incorporate the known imaging degradation process, the under-sampling operation in k-space, into the network and to learn image priors from large-scale data [16,2]. As transformers have become predominant in general image restoration tasks [18,9], there is a noticeable trend towards incorporating transformer-based denoising blocks into the unrolled network [2], which enhances reconstruction quality. However, the adoption of transformer blocks concurrently increases the network parameters and computational complexity. The stacking of denoising blocks, in an unrolled manner, further exacerbates this complexity, making the network training challenging. Therefore, one challenging question is how to design efficient denoising blocks within an unrolled model while fully leveraging the k-space information. Another challenge arises from the versatility of MRI, which enables the acquisition of multi-view, multi-contrast, multi-slice, and dynamic image sequences, given specific clinical demands. While there is a prevailing trend towards designing all-in-one models for natural image restoration [7,12], existing MRI reconstruction models cannot offer a unified solution for diverse input types. We thus endeavor to address these challenges with the following contributions:


2 Preliminaries


Consider reconstructing a complex-valued MR image $x$ from the multi-coil under sampled measurements $y$ in k-space, such that,


$$
y = A x + \epsilon, \tag{1}
$$

where $A$ is the linear forward complex operator, constructed from multiplication with the sensitivity maps $S$, application of the 2D Fourier transform $F$, and under-sampling of the k-space data with a binary mask $M$; $\epsilon$ is the acquisition noise. According to compressed sensing theory [1], we can estimate $x$ by formulating an optimization problem:


$$
\min_{x}\frac{1}{2}\|y-A x\|_{2}^{2}+\lambda R(x), \tag{2}
$$


Fig. 1: The proposed two-stage MRI reconstruction pipeline. The first stage solves a physics-based inverse problem to fill the missing k-space data, which are then transformed to the image domain by the inverse fast Fourier transform (IFFT), and root-sum-of-squares (RSS) combination is applied to obtain the first-stage reconstructed image. The second stage solves a general denoising problem to further refine the image reconstruction result.


where $||\boldsymbol{y}-\boldsymbol{A x}||_{2}^{2}$ is the data consistency term, $R(x)$ is a sparsity regularization term on $x$ (e.g., total variation) and $\lambda$ is a hyper-parameter which controls the contribution weights of the two terms. E2E-VarNet [16] solves the problem in Eq. 2 by applying an iterative gradient descent method in the k-space domain. In the $t$-th step, the k-space is updated from $k^{t}$ to $k^{t+1}$ using:


$$
\boldsymbol{k}^{t+1}=\boldsymbol{k}^{t}-\eta^{t}\boldsymbol{M}(\boldsymbol{k}^{t}-\boldsymbol{y})+\boldsymbol{G}(\boldsymbol{k}^{t}), \tag{3}
$$

where $\eta^{t}$ is a learned step size and $G$ is a learned function representing the gradient of the regularization term $R$. We can unroll the iterative updating algorithm to a sequence of sub-networks, where each cascade represents an unrolled iteration in Eq. 3. The regularization term is applied in the image domain:


$$
G(k)=F({\mathcal{E}}(\mathbf{D}({\mathcal{R}}(F^{-1}(k))))), \tag{4}
$$

where $\mathcal{R}(x_{1},\ldots,x_{N})=\sum_{i=1}^{N}\hat{S}_{i}^{*}x_{i}$ is the reduce operator that combines $N$ coil images $\{x_{i}\}_{i=1}^{N}$ via estimated sensitivity maps $\{\hat{S}_{i}\}_{i=1}^{N}$, $\hat{S}_{i}^{*}$ is the complex conjugate of $\hat{S}_{i}$, and $\mathcal{E}(x)=(\hat{S}_{1}x,\ldots,\hat{S}_{N}x)$ is the expand operator that computes coil images from image $x$. Therefore, the linear forward operator $A$ is computed as $A=M F{\mathcal{E}}$. $\mathbf{D}$ is a denoising neural network used to refine the complex image. $\hat{S}=\mathrm{SME}(y_{\mathrm{ACS}})$ is computed by a sensitivity map estimation (SME) network from the low-frequency region of k-space $y_{\mathrm{ACS}}$, called the Auto-Calibration Signal (ACS), which is typically fully sampled. The final updated multi-coil k-space is converted to the image domain by applying an inverse Fourier transform followed by a root-sum-of-squares (RSS) reduction [14] for each pixel.

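As a concrete sketch of the operators just defined, the following hypothetical NumPy code (ours, not the authors' implementation; function names are assumptions) implements the expand/reduce operators, the forward operator $A=MF\mathcal{E}$, the RSS combination, and the data-consistency part of the Eq. 3 update:

```python
import numpy as np

def expand_op(x, smaps):
    """Expand operator E: per-coil images S_i * x from image x."""
    return smaps * x[None, ...]

def reduce_op(coil_imgs, smaps):
    """Reduce operator R: combine coil images with conjugate sensitivity maps."""
    return np.sum(np.conj(smaps) * coil_imgs, axis=0)

def forward_A(x, smaps, mask):
    """Forward operator A = M F E: coil expansion, orthonormal 2D FFT per
    coil, then k-space under-sampling with the binary mask M."""
    return mask[None, ...] * np.fft.fft2(expand_op(x, smaps), norm="ortho")

def rss(coil_imgs):
    """Root-sum-of-squares combination over the coil axis, per pixel."""
    return np.sqrt(np.sum(np.abs(coil_imgs) ** 2, axis=0))

def dc_step(k, y, mask, eta):
    """Data-consistency part of the Eq. 3 update: k - eta * M (k - y)."""
    return k - eta * mask * (k - y)

# toy example: 4 coils, 6x6 image, keep every other phase-encoding column
rng = np.random.default_rng(1)
smaps = rng.standard_normal((4, 6, 6)) + 1j * rng.standard_normal((4, 6, 6))
smaps /= np.sqrt(np.sum(np.abs(smaps) ** 2, axis=0, keepdims=True))
x = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
mask = np.zeros((6, 6))
mask[:, ::2] = 1
y = forward_A(x, smaps, mask)   # (4, 6, 6); masked-out columns are zero
```

Note that when the sensitivity maps are normalized so that $\sum_i |\hat{S}_i|^2 = 1$, the reduce operator inverts the expand operator, $\mathcal{R}(\mathcal{E}(x)) = x$, and $\mathrm{RSS}(\mathcal{E}(x)) = |x|$.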


Fig. 2: Overview of PromptMR in Stage I: an all-in-one unrolled model for MRI reconstruction. Adjacent inputs, depicted in image domain for visual clarity, provide neighboring k-space information for reconstruction. To accommodate different input varieties, the input-type adaptive visual prompt is integrated into each cascade of the unrolled architecture to guide the reconstruction process.


3 Method


We propose a two-stage pipeline for dynamic and multi-contrast MRI reconstruction, as shown in Fig. 1. Below, we give more details of each stage.


3.1 Stage I: Filling the K-Space


The center of the k-space preserves image contrast, and the periphery of the k-space contains edge information. In the first stage, we fill the missing k-space data constrained by the existing k-space acquisition and learned image priors.


Baseline Model We follow the implementation of E2E-VarNet [16] to construct an unrolled model in Stage I. Inspired by the adjacent slice reconstruction (ASR) method [2], which learns inter-slice information by jointly reconstructing a set of adjacent slices instead of relying on a single k-space to be reconstructed, we devise the following new method. We generalize ASR to adjacent k-space reconstruction along any dimension, e.g., temporal/slice/view/contrast dimension, and the updating formula of Eq. 3 is improved as follows:


$$
k_{adj}^{t+1}=k_{adj}^{t}-\eta^{t}M(k_{adj}^{t}-y_{adj})+G(k_{adj}^{t}), \tag{5}
$$

where $\boldsymbol{k}_{adj}^{t}=[k_{c-a}^{t},\ldots,k_{c-1}^{t},k_{c}^{t},k_{c+1}^{t},\ldots,k_{c+a}^{t}]$ is the concatenation of the central k-space $k_{c}^{t}$ with its $2a$ adjacent k-spaces along a specific dimension. To efficiently extract features from adjacent inputs, we design a Unet-style network [15] with channel attention [3,4], namely CAUnet, for both the denoising network $\mathbf{D}$ and the sensitivity map estimation network, as shown in Appendix A.1. The CAUnet has a 3-level encoder-decoder structure. Each level consists of a DownBlock,



Fig. 3: Overview of the PromptUnet architecture in PromptMR, featuring a 3-level encoder-decoder design. Each level comprises a DownBlock, UpBlock and Prompt Block. The Prompt Block in the $i$-th level encodes input-specific context into fixed prompt $P_{i}$, producing adaptively learned prompt $\hat{P}_{i}$. These prompts, across multiple levels, integrate with decoder features $F_{d,i}$ in the UpBlocks to allow rich hierarchical context learning.


UpBlock, and corresponding skip connection. The architecture integrates a Bottleneck Block for high-level semantic feature capturing and employs Channel Attention Blocks (CABs) within each block. The overall unrolled architecture is shown in Appendix A.2.

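The adjacent input $k_{adj}$ used by the baseline model, i.e. the central frame concatenated with its $2a$ neighbors along a chosen dimension, can be sketched as follows (hypothetical code; clamping indices at the sequence boundaries is our assumption, since the paper does not state its boundary handling):

```python
import numpy as np

def adjacent_stack(frames, c, a):
    """Stack frame c with its 2a neighbors along a chosen dimension
    (temporal/slice/view/contrast), clamping indices at the sequence
    boundaries (an assumed edge convention)."""
    n = len(frames)
    idx = [min(max(i, 0), n - 1) for i in range(c - a, c + a + 1)]
    return np.stack([frames[i] for i in idx], axis=0)

# e.g. 5 temporal frames, center c=2, a=1 -> frames [1, 2, 3] are stacked
frames = [np.full((4, 4), t, dtype=float) for t in range(5)]
k_adj = adjacent_stack(frames, c=2, a=1)   # shape (3, 4, 4)
```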

PromptMR Considering various image types (e.g., different views, different contrasts) with different adjacent types (e.g., dynamic, multi-contrast) under different under-sampling rates (e.g., $\times4$, $\times8$, $\times10$), instead of training separate models for each specific input, we propose to learn an all-in-one unified model for all possible adjacent inputs. The image structure remains consistent for multi-contrast adjacent input, while only the contrast varies. Conversely, the contrast remains constant for dynamic adjacent input, but the image structure shifts. To achieve effective performance on diverse input types, the unified model should be able to encode the contextual information conditioned on the input type. Inspired by the recent development of visual prompt learning [5,6] and the prompt learning-based image restoration method [12], we introduce PromptMR, an all-in-one approach for MRI reconstruction, as illustrated in Fig. 2. While PromptMR retains the same unrolled architecture as the baseline model, it extends CAUnet to PromptUnet by integrating Prompt Blocks to learn input-type adaptive prompts, which then interact with decoder features in the UpBlocks at multiple levels to enrich the input-specific context, as shown in Fig. 3. The Prompt Block at the $i$-th level takes features $F_{d,i}\in\mathbb{R}^{H_{f}\times W_{f}\times C_{f}}$ from the decoder and an $N_{p}$-component fixed prompt $P_{i}\in\mathbb{R}^{N_{p}\times H_{p}\times W_{p}\times C_{p}}$ as input. Then, $F_{d,i}$ is processed by a global average pooling (GAP) layer, followed by a linear layer and a softmax layer to generate the normalized prompt weights $\{\omega_{ij}\}_{j=1}^{N_{p}}$. These weights linearly combine the prompt components:


$$
\hat{P}_{i}=\operatorname{Conv}_{3\times3}\Big(\operatorname{Interp}\Big(\sum_{j=1}^{N_{p}}\omega_{ij}P_{ij}\Big)\Big),\quad \omega_{i}=\operatorname{Softmax}(\operatorname{Linear}(\operatorname{GAP}(F_{d,i}))). \tag{6}
$$

The generated prompts by the Prompt Blocks at multiple levels can learn hierarchical input-type contextual representations, which are integrated with the decoder features to guide the all-in-one MRI reconstruction.

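The prompt-weighting path of Eq. 6 can be sketched as follows (hypothetical NumPy code; the final interpolation and $3\times3$ convolution are omitted, and `W`, `b` stand in for the learned linear-layer parameters):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def prompt_weights(feat, W, b):
    """GAP over spatial dims, then linear layer and softmax -> weights."""
    gap = feat.mean(axis=(0, 1))             # (H_f, W_f, C_f) -> (C_f,)
    return softmax(W @ gap + b)              # (N_p,), sums to 1

def prompt_block(feat, prompts, W, b):
    """Weighted sum of the N_p prompt components (Interp/Conv3x3 omitted)."""
    w = prompt_weights(feat, W, b)
    return np.tensordot(w, prompts, axes=1)  # (H_p, W_p, C_p)

# toy sizes: C_f=16 decoder channels, N_p=5 prompt components of size 4x4x8
rng = np.random.default_rng(2)
feat = rng.standard_normal((8, 8, 16))
prompts = rng.standard_normal((5, 4, 4, 8))
W, b = rng.standard_normal((5, 16)), np.zeros(5)
p_hat = prompt_block(feat, prompts, W, b)    # (4, 4, 8)
```

Because the weights are input-conditioned, the same fixed prompt bank is combined differently for, say, a SAX cine frame versus a T1-weighted mapping image, which is what lets one model adapt to all input types.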

3.2 Stage II: Refining the Image


After the first stage, the missing k-space data have been filled, and image aliasing artifacts have been largely removed. However, due to the unrolled structure and memory limitations, the capacity of the denoising blocks we can use is constrained, which may prevent the full exploitation of dynamic and multi-contrast information. In Stage II, we further explore the inter-frame/-contrast coherence in the image domain for multi-frame/-contrast feature aggregation by using a powerful restoration model, ShiftNet [8], as the refinement network. This network employs stacked Unets and grouped spatio-temporal shift operations to expand the effective receptive fields. We do not detail ShiftNet here, since it is not the core part of this paper, and it can be replaced by any state-of-the-art video restoration model.

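As an illustration of the shift idea (a generic temporal channel shift in the spirit of shift-based video models, not ShiftNet's exact grouped spatio-temporal shift), neighboring-frame features can be mixed at zero parameter cost by displacing channel groups along the time axis:

```python
import numpy as np

def temporal_shift(feats, shift_frac=0.25):
    """Shift one channel group forward and one backward along the time
    axis, zero-padding at the sequence ends. feats: (T, C, H, W)."""
    T, C = feats.shape[:2]
    n = int(C * shift_frac)
    out = feats.copy()
    out[1:, :n] = feats[:-1, :n]             # group 1: shifted forward in time
    out[0, :n] = 0
    out[:-1, n:2 * n] = feats[1:, n:2 * n]   # group 2: shifted backward
    out[-1, n:2 * n] = 0
    return out                               # remaining channels untouched
```

After the shift, an ordinary 2D convolution over each frame sees channels from the previous and next frames, so stacking such blocks grows the effective temporal receptive field.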

4 Experiments


In this section, we first provide experimental details and results of our proposed method on the CMRxRecon dataset. We use SSIM, PSNR, and NMSE to compare the performance of different reconstruction methods under various acceleration factors ($\times4$, $\times8$, $\times10$). Then, we conduct extensive ablation studies of our proposed method and also benchmark on another large-scale MRI dataset, the fastMRI multi-coil knee dataset. For experiments on the fastMRI dataset, we refer readers to Appendix B.

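For reference, common definitions of two of these metrics are sketched below (consistent with widely used MRI evaluation code; exact normalization conventions can vary between benchmarks, so treat this as an assumption rather than the paper's evaluation script):

```python
import numpy as np

def nmse(gt, pred):
    """Normalized mean squared error: ||pred - gt||^2 / ||gt||^2."""
    return np.linalg.norm(pred - gt) ** 2 / np.linalg.norm(gt) ** 2

def psnr(gt, pred, data_range=None):
    """Peak signal-to-noise ratio in dB; data_range defaults to gt's range."""
    if data_range is None:
        data_range = gt.max() - gt.min()
    mse = np.mean((pred - gt) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)
```

SSIM relies on windowed local statistics and is typically taken from a library implementation such as `skimage.metrics.structural_similarity`.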

4.1 CMRxRecon Dataset


The CMRxRecon Dataset [17] includes 120 cardiac MRI cases of fully sampled dynamic cine and multi-contrast raw k-space data obtained on 3 Tesla magnets. The dynamic cine images in each case include short-axis (SAX), two-chamber (2-CH), three-chamber (3-CH), and four-chamber (4-CH) long-axis (LAX) views. Typically $5\sim10$ slices were acquired for SAX cine, while a single slice was acquired for each LAX view. The cardiac cycle was segmented into $12\sim25$ phases with a temporal resolution of 50 ms. The multi-contrast cardiac MRI in each case is in the SAX view, which contains 9 T1-weighted (T1w) images conducted using a modified look-locker inversion recovery (MOLLI) sequence and 3 T2-weighted (T2w) images performed using T2-prepared FLASH sequence.


The shape of each k-space data is [time phases/contrasts, slices, coils, readouts, phase encodings]. All data were compressed into 10 virtual coils. We split the cases in an 8:2 ratio, resulting in 14,964 dynamic images and 6,516 multi-contrast images for training, and 2,940 dynamic images and 1,272 multi-contrast images for testing.


4.2 Results


We assessed the performance of our proposed baseline model, PromptMR, and two-stage reconstruction pipeline using the CMRxRecon dataset. In the first stage, we compared E2E-VarNet [16] and HUMUS-Net-L [2] with our baseline in a one-by-one setup, in which we trained four separate models from scratch for the SAX/LAX/T1w/T2w reconstruction tasks, respectively. Then we compared our PromptMR and PromptIR [12] in an all-in-one configuration. In the second stage, we deployed ShiftNet to refine the images reconstructed by PromptMR. In our experiments, we minimize the SSIM loss between the target image and the reconstructed image; all unrolled models consist of 12 cascades, except for HUMUS-Net-L, which only has 8 cascades due to its large parameter size; we trained networks using the AdamW [10] optimizer with a weight decay of 0.01 for 12 epochs; the learning rate was set to $2\times10^{-4}$ for the first 11 epochs and $2\times10^{-5}$ for the last epoch.


The results are shown in Table 1. Notably, our baseline model outperforms E2E-VarNet and HUMUS-Net-L across all tasks. Moreover, our PromptMR demonstrates significant enhancement in the all-in-one setup when compared to the baseline model trained for individual tasks. PromptIR performs poorly due to the fact that it is not tailored to account for the MRI forward model. The refinement in the second stage offers a marginal boost to the SSIM, but provides considerable improvements for NMSE and PSNR. The qualitative results are shown in Fig. 4. More qualitative comparisons can be found in Appendix C. These qualitative comparisons show that our method can recover finer details of small anatomical structures in the reconstructed images.


4.3 Ablation Study


Single MRI Reconstruction Task We started with an ablation study on two single MRI reconstruction tasks, dynamic cine SAX image reconstruction and multi-contrast T1-weighted image reconstruction, both under $\times10$ acceleration, to investigate the impact of adjacent reconstruction and the prompt module in the proposed PromptMR. We changed the number of adjacent images to 1, 3, and 5, where '1' indicates the absence of adjacent input. The results, shown in Table 2, underscore the utility of incorporating adjacent input to enhance the reconstruction quality. Moreover, the inclusion of Prompt Blocks proves beneficial for individual MRI reconstruction tasks.


Table 1: Comparison of NMSE ($\times10^{-2}$)/PSNR/SSIM of different MRI reconstruction methods on the CMRxRecon dataset under $\times10$ acceleration. The best and second-best results are highlighted in red and blue, respectively.


Stage Task Method Cine SAX Cine LAX Mapping T1w Mapping T2w
I One-by-One E2E-VarNet [16]
I One-by-One HUMUS-Net-L [2]
I One-by-One Baseline (Ours) 1.3/42.96/0.9791 2.0/40.07/0.9689 1.1/43.68/0.9814 1.9/40.38/0.9705
I All-in-One PromptIR [12]
I All-in-One PromptMR (Ours) 2.5/40.16/0.9659 2.7/38.62/0.9581 2.3/41.10/0.9726 1.4/41.10/0.9784
II All-in-One PromptMR+ShiftNet [8]


Fig. 4: The reconstruction results and absolute error maps of different methods for the LAX 2-CH cine image of case P101 under $\times10$ acceleration. The bottom two rows show the zoomed area. Red arrows show the difference in recovery of the mitral valve structure for different reconstruction methods.


All-In-One MRI Reconstruction Task To investigate the impact of the Prompt Block in the all-in-one MRI reconstruction task, we trained both our baseline and PromptMR models using all possible input data in the CMRxRecon dataset. As depicted in Table 3, the integration of the Prompt Block into our baseline model enables PromptMR to achieve significant improvements across all individual reconstruction tasks. We also used t-SNE [11] to visualize the learned prompts in the 12-th cascade at multiple decoder levels from different types of data in the test set. Fig. 5 shows that the prompts can learn to encode discriminative information for different input types at the lower levels.


Table 2: Impact of the adjacent input number and the Prompt Block in PromptMR for two single MRI reconstruction tasks: dynamic cine SAX and multi-contrast T1-weighted (T1w) reconstruction under $\times10$ acceleration.


Adjacent number Prompt Block SAX PSNR/SSIM T1w PSNR/SSIM
1 ✓ 43.19/0.9798 44.36/0.9845
3 ✓ 43.96/0.9822 44.78/0.9856
5 ✓ 43.87/0.9820 44.75/0.9856
5 ✗ 43.68/0.9814 44.14/0.9839

Table 3: Impact of Prompt Block in all-in-one task. Results are reported on the CMRxRecon dataset under $\times10$ acceleration.


Method SAX PSNR/SSIM LAX PSNR/SSIM T1w PSNR/SSIM T2w PSNR/SSIM
Baseline (Ours) 43.97/0.9825 42.11/0.9786 44.90/0.9862 44.45/0.9874
PromptMR (Ours) 45.58/0.9865 43.72/0.9836 46.84/0.9899 46.24/0.9903


Fig. 5: Visualization of the learned prompts at each decoder level in the 12-th cascade in PromptMR using t-SNE.


5 Conclusion


In this work, we introduce a robust baseline model for MRI reconstruction that utilizes neighboring information of adjacent k-space. To accommodate various input types, adjacent configurations, and under sampling rates within a unified model, we enhance our baseline with prompt-based learning blocks, creating an all-in-one MRI reconstruction model, PromptMR. Finally, to overcome the model capacity constraints of unrolled architectures, we propose a second stage of image refinement to delve deeper into the adjacent information, which is particularly useful when immediate reconstruction latency is not a priority.


A.1 CAUnet



Fig. A.1: Overview of the CAUnet architecture in the proposed baseline model.


A.2 Unrolled model architecture



Fig. A.2: Overview of the unrolled model architecture for both our baseline model and PromptMR. The primary distinction is in the denoiser $D$ and sensitivity map estimation (SME) networks: the baseline employs CAUnet, whereas PromptMR utilizes PromptUnet. Each cascade represents an updating step in Eq. 5 in the main text. The red module indicates the learnable part in the unrolled model.


B Experiments on FastMRI Multi-Coil Knee Dataset


Benchmark on FastMRI Multi-Coil Knee Dataset To assess the performance of our proposed method across different anatomies, we benchmarked it on another large-scale MRI reconstruction dataset, the fastMRI multi-coil knee dataset [19]. Since the online evaluation platform for the fastMRI test set is unavailable$^{1}$, we divided the original 199 validation cases into 99 for validation and 100 for testing. The results of other methods are reported using their officially pretrained models. As presented in Table B.1, our models outperform all previous state-of-the-art methods, without significantly increasing the number of network parameters compared to E2E-VarNet.


Table B.1: Performance of state-of-the-art accelerated MRI reconstruction techniques on the fastMRI knee multi-coil $\times8$ test dataset. The best and second-best results are highlighted in red and blue, respectively.


Method #Params NMSE ($\times10^{-2}$)(↓) PSNR(↑) SSIM(↑)
E2E-VarNet [16] 30M 0.8690 ± 0.9279 37.30 ± 4.925 0.8936 ± 0.1157
HUMUS-Net [2] 109M 0.8974 ± 0.9743 37.20 ± 5.009 0.8946 ± 0.1162
HUMUS-Net-L [2] 228M 0.8587 ± 0.9930 37.45 ± 5.067 0.8955 ± 0.1161
Baseline (ours) 47.5M 0.8321 ± 0.9258 37.57 ± 5.143 0.8964 ± 0.1162
PromptMR (ours) 79.6M 0.8344 ± 0.9648 37.63 ± 5.319 0.8970 ± 0.1168

Effectiveness of Two-Stage Pipeline We employed ShiftNet to refine the images reconstructed by the pretrained E2E-VarNet on the fastMRI multi-coil knee test dataset with $\times8$ under-sampling. Table B.2 shows that the second-stage refinement substantially improves the reconstruction quality, which implies that the multi-slice information in the fastMRI dataset might not be comprehensively utilized by the single-stage unrolled model.


Table B.2: Effectiveness of the second-stage image refinement on the fastMRI knee multi-coil $\times8$ test dataset.


Stage Method #Params NMSE ($\times10^{-2}$)(↓) PSNR(↑) SSIM(↑)
I E2E-VarNet [16] 30M 0.8690 ± 0.9279 37.30 ± 4.925 0.8936 ± 0.1157
II ShiftNet [8] 2M 0.8415 ± 0.9131 37.46 ± 4.973 0.8953 ± 0.1157


Fig. C.1: Visual comparison of reconstructions from the CMRxRecon dataset with $\times10$ acceleration. PromptMR can recover fine details (highlighted in red box) on reconstructed images that other state-of-the-art methods may miss.

