Abstract
Keywords
Introduction
Segmentation is crucial to medical image analysis because it produces useful data for the identification, monitoring, and treatment of brain tumors.1,2 Brain tumor segmentation is a central task in medical image analysis, especially in neuroimaging.3 The aim is to identify and delineate regions of interest linked to brain tumors in medical images, such as magnetic resonance imaging (MRI) scans. Precise segmentation is crucial for diagnosing, treating, and monitoring patients with brain tumors.4 Before being identified, brain tumors can grow quite large and take on a variety of shapes and sizes.5 Tumors are abnormal cells that proliferate inside the brain; if not found early, they can be fatal.6 The two primary categories of brain tumors are malignant, a severe form of cancer, and benign, which is not cancerous.7 Gliomas are the most common type of adult brain tumor and are classified into two grades: low-grade gliomas (LGG), which develop slowly, and high-grade gliomas (HGG), which proliferate rapidly.8 Glioma segmentation is commonly evaluated over three subregions: tumor core (TC), enhancing core (EC), and whole tumor (WT). Glioma is one of the most prevalent malignant brain tumors today.9 Every year, over 90,000 patients are diagnosed with primary brain tumors, and malignant brain tumors kill roughly 17,200 people each year. One analysis found that 25 persons out of 100,000 have malignancies, with 33% of those being serious.10 Medical imaging and neurology focus heavily on brain segmentation, especially for neurological disorders such as brain tumors.11 The assessment of brain segments is a crucial component across the range of neurological care, covering diagnostic and therapeutic approaches, surgical procedures, disease surveillance, and the research and development of personalized therapies.12
Recent advances in deep learning have introduced attention mechanisms into segmentation models. Attention mechanisms allow models to focus selectively on important features within the image, enhancing segmentation performance by prioritizing relevant spatial information.13–15 In comparison to manual techniques and conventional tools, the deep U-Net architecture has transformed fMRI segmentation by using deep learning to increase performance and efficiency.16 Deep U-Net automates the segmentation process by learning from vast datasets and extracting complex patterns in fMRI images, in contrast to manual segmentation, which is laborious and subjective.17 This deep learning approach allows for more precise delineation of brain tumor boundaries, enabling clinicians to make more informed decisions about diagnosis and treatment planning.18,19 Existing methods in brain tumor segmentation face several challenges, as highlighted in recent literature. For instance, Saeed et al.20 proposed RMU-Net, which, despite achieving high performance on the BraTS 2018 dataset, may encounter difficulties in effectively segmenting heterogeneous tumor types across different subregions. Furthermore, Ali et al.21 introduced an ensemble of U-Net and 3D convolutional neural network (CNN) models, yet computational complexity remains a concern, potentially limiting its clinical applicability. Tataei et al.22 utilized CNN engineering and the ResNet-50 architecture, but the reliance on basic convolutional layers may restrict the model’s capacity to capture intricate tumor variations. Additionally, evaluation metrics such as the Dice similarity coefficient, emphasized in several studies, may not fully encompass segmentation quality, thus warranting a more comprehensive assessment approach.23,24 These challenges underscore the necessity for novel methodologies to address the limitations of existing approaches and advance the field of brain tumor segmentation.
The proposed attention-based convolutional U-Net (ACU-Net) model overcomes the limitations of conventional brain tumor segmentation approaches by incorporating attention mechanisms that selectively emphasize critical features while maintaining spatial context. This novel architecture enables enhanced segmentation accuracy across diverse tumor types and subregions in fMRI data from the BraTS 2018 and BraTS 2020 datasets. Rigorous evaluation of ACU-Net demonstrates substantial improvements in segmentation performance, with the model consistently outperforming state-of-the-art methods. Notably, it achieves superior results in key metrics, including the Dice similarity coefficient, sensitivity, and specificity, underscoring its effectiveness in accurately delineating brain tumors.
The key contributions of this article are outlined as follows:
The remainder of the article is organized as follows: section "Literature review" offers a thorough analysis of the pertinent research. The methodology used in this study is described in depth in section "Methodology." The performance analysis and discussion of the findings are presented in sections "Result analysis" and "Discussion." Finally, the conclusions drawn from this research are summarized in section "Conclusion."
Literature review
The segmentation of brain tumors from multimodal MRI images has seen significant progress with the advent of deep learning models. The Brain Tumor Segmentation (BraTS) challenge datasets from 2018 and 2020 have been pivotal in evaluating the performance of segmentation methods. This section reviews the major techniques employed on the BraTS 2018 and BraTS 2020 datasets, providing an in-depth analysis of their strengths, limitations, and how these methods compare to the proposed ACU-Net.
Study on BraTS 2018
The BraTS 2018 dataset has been a key benchmark for evaluating brain tumor segmentation models, with numerous contributions showcasing diverse approaches to improve segmentation accuracy.
Saeed et al.20 proposed RMU-Net, a hybrid model combining MobileNetV2 and U-Net architectures. This end-to-end segmentation framework demonstrated good performance on the BraTS 2018 dataset, achieving Dice scores ranging from 79.36% to 90.80% across the different tumor areas. However, the model’s reliance on MobileNetV2, a lightweight architecture, might limit its ability to capture more complex tumor features compared to more traditional CNN-based models. Additionally, the integration of MobileNetV2 introduces additional challenges in fine-tuning and achieving consistent accuracy across all tumor subregions.
Ali et al. 21 presented an ensemble model combining U-Net and 3D convolutional neural networks (CNNs) for tumor segmentation. On the BraTS 2018 dataset, their model produced competitive Dice scores: 0.750 for ET, 0.906 for the WT, and 0.846 for the TC. While the ensemble approach enhances model robustness, it also increases computational complexity and can be harder to implement in real-time applications due to the increased processing requirements.
Tataei et al.22 utilized a CNN and ResNet-50 architecture to segment gliomas in the BraTS 2018 dataset. Their approach achieved promising Dice scores across different tumor regions, but the CNN-based model’s ability to generalize may be hindered by its dependence on handcrafted features. This limitation is addressed in more recent models that incorporate deep learning techniques capable of learning features automatically.
Gull et al. 25 introduced a CNN-based model that achieved an average performance rating of 96.50% for brain tumor segmentation on the BraTS 2018 dataset. However, while the model performed well, it lacked the robustness to handle variations in tumor appearance, which remains a challenge in medical imaging tasks.
Ullah et al.23 applied 3D U-Net for tumor segmentation, achieving Dice scores of 0.91 for the whole tumor, 0.86 for the TC, and 0.70 for the enhancing tumor. Although the model performed well overall, its segmentation of enhancing tumors still faces challenges due to irregularities in tumor shapes and low-contrast areas that require further enhancement.
Cao et al. 24 introduced MBANet, a 3D convolutional neural network with multi-branch attention for brain tumor segmentation. Their method achieved competitive results on the BraTS 2018 dataset with Dice scores of 78.21%, 89.79%, and 83.04% for ET, WT, and TC, respectively. However, attention mechanisms, while helpful, often suffer from computational overhead and may not fully capture the complex structures in the tumor images.
Zia et al.26 presented an attentional residual dropout U-Net (ARDUNet), which achieved excellent results, including a Dice score of 0.92 for enhancing tumors. Despite its strong performance, the method still faces challenges with segmentation consistency and may require more sophisticated techniques to handle a wider variety of tumor types and MR imaging conditions.
Sun et al.27 created a special model based on a three-dimensional fully convolutional network to segment brain tumors. By separately segmenting the complete tumor and the enhancing tumor areas, the suggested approach achieved Dice similarity coefficient values of 0.90, 0.79, and 0.77 on the dataset. A novel residual network-based U-Net architecture for brain tumor segmentation was proposed by Pedada et al.9 Testing their proposed U-Net model on benchmark datasets, such as the BraTS Challenge 2018, demonstrated segmentation accuracies of 92.20%. Additionally, enhancing core (EC), whole tumor (WT), and tumor core (TC) classifications were developed based on the sub-regions of the tumor. Despite its good performance, the method still requires improvement to achieve consistent segmentation.
Study on BraTS 2020
The BraTS 2020 dataset builds upon the success of BraTS 2018 by offering a more complex and diverse set of MRI scans, providing additional challenges for tumor segmentation models. Many of the most successful methods from BraTS 2018 have been adapted and enhanced for BraTS 2020, with new approaches addressing the growing need for more robust and accurate models.
Isensee et al. 28 introduced nnU-Net, which achieved impressive results on BraTS 2020, securing first place in the challenge with Dice scores of 88.95% for WT, 85.06% for TC, and 82.03% for ET. This model is notable for its baseline configuration, which achieves competitive results without any task-specific adjustments, such as post-processing or region-based training. However, nnU-Net’s lack of specialized optimizations for different tumor types and imaging conditions may limit its applicability to more diverse datasets.
Kataria et al. 29 proposed the HybriCSF model, a hybrid approach combining CNNs with SVMs and fuzzy C-means clustering. Their model achieved competitive results on BraTS 2020 with Dice scores of 87% for WT, 81% for TC, and 63% for ET. While this hybrid approach offers a novel perspective by combining machine learning techniques, it struggled with segmenting enhancing tumors, which remain challenging due to their irregular shapes and low contrast in MRI scans.
Magadza et al. 30 extended the U-Net architecture with bottleneck units and a shuffle attention mechanism, improving the model’s performance on BraTS 2020 with Dice scores of 91.2% for WT, 84.8% for TC, and 79.2% for ET. Despite these enhancements, the method’s performance on enhancing tumors is still less than ideal, suggesting that attention mechanisms need further refinement to address this issue effectively.
Susanto et al. 31 introduced a spatial transformation-based data augmentation pipeline that significantly improved model robustness. The augmented approach achieved Dice scores of 90.91% for WT, 86.89% for TC, and 87.10% for ET. However, while their augmentation strategy helped improve segmentation performance, it remains to be seen whether it can generalize across more diverse datasets or handle more severe image artifacts.
Zhang et al. 32 proposed a multi-encoder model for 3D MRI segmentation, incorporating a novel Categorical Dice loss function to address voxel imbalance. Their method demonstrated notable results, with Dice scores of 70.24% for WT, 88.26% for TC, and 73.86% for ET. Despite the promising results, the model’s performance on whole tumor segmentation is still lower than that of other methods, indicating room for further improvement in segmentation accuracy.
These studies highlight the diversity of approaches for segmenting brain tumors, particularly the challenges in accurately segmenting enhancing tumors, which remain an area for further research. Hybrid models and attention mechanisms have shown promise, but combining these techniques with novel architectures like ACU-Net might offer further improvements in both segmentation accuracy and computational efficiency.
Rajinikanth et al. 33 developed a Computer-Aided Disease Diagnosis (CADD) system for classifying brain tumors in 2D MRI slices as Glioblastoma or Glioma. The system used CNNs for segmentation with VGG-UNet, feature extraction, and selection through the Firefly algorithm, achieving over 98% accuracy, especially with the SVM-Cubic classifier, demonstrating enhanced diagnostic accuracy. Khan et al. 34 introduced an automated brain tumor detection and classification system using saliency maps and deep learning for feature optimization. The system enhanced images through fusion techniques, fine-tuned a pre-trained EfficientNetB0 model, and used a dragonfly algorithm for feature optimization. The system achieved accuracies of 95.14%, 94.89%, and 95.94% on three public datasets, surpassing other neural networks.
Kurdi et al. 35 employed a Harris Hawks optimized convolutional network (HHOCNN) to improve brain tumor detection in MRI. The process involved noise removal, tumor region identification, and CNN-based feature classification. With Harris Hawks optimization, the system achieved 98% tumor recognition accuracy on the Kaggle dataset, reducing misclassification errors. Badjie et al. 36 developed a DL algorithm to enhance brain tumor detection in MR images. Using a transfer learning model based on AlexNet’s CNN, the system automated the diagnostic process, improving accuracy, efficiency, and robustness in tumor classification across various stages and sizes.
Methodology
In our experiment, we developed an ACU-Net architecture for brain tumor segmentation employing fMRI images. The methodology includes brain fMRI image collection, image preprocessing, building the ACU-Net architecture, training on the dataset, and finally evaluating the segmentation performance of our model on fMRI images. The architecture of our ACU-Net is depicted in Figure 1.

The proposed ACU-Net architecture for brain tumor segmentation.
The research focused on enhancing brain tumor segmentation in fMRI images using the ACU-Net model, specifically by integrating an attention mechanism and convolutional blocks into the U-Net model. This study was conducted in 2024 in Bangladesh (IUBAT, JnU) and Saudi Arabia (KSU).
Data collection
The BraTS 2018 dataset 37 is a widely used benchmark for glioma segmentation, including both high-grade and low-grade gliomas. It consists of multi-institutional MRI scans from 285 patients, divided into four modalities: T1-weighted, T1-weighted with contrast (T1CE), T2-weighted, and FLAIR. These modalities capture different tumor regions, including the enhancing tumor, core, and surrounding edema. The dataset has been preprocessed to a uniform resolution with skull stripping to focus only on the brain, and the pixel-wise annotations enable precise segmentation of tumor sub-regions, making it ideal for deep learning model development.
The BraTS 2020 dataset 38 includes 368 MRI scans, with 265 for training, 65 for validation, and 47 for testing. It also offers the same four MRI modalities and focuses on glioma sub-region segmentation. Improvements include more accurate annotations and a wider range of clinical cases, making it suitable for both segmentation and survival prediction tasks. This dataset remains crucial in developing and evaluating deep learning models for clinical use, with rigorous comparisons made possible through standardized challenges.
In our study, we used the BraTS 2018 dataset, which consists of 285 patients: 210 with high-grade glioma (HGG) and 75 with low-grade glioma (LGG). We combined both HGG and LGG cases into a unified dataset for training, validation, and testing, without explicitly differentiating between the two tumor types. The dataset was split into 80% for training (228 patients), 10% for validation (28 patients), and 10% for testing (29 patients). Our primary goal was to achieve accurate brain tumor segmentation, and this unified approach allowed our ACU-Net model to generalize well across both tumor grades. The model performed consistently in segmenting key regions such as WT, TC, and ET, demonstrating high performance. The merging of HGG and LGG into a single dataset did not adversely affect segmentation performance, as our model maintained robust performance across all regions of interest.
Figure 2 shows some sample images from the BraTS 2018 and BraTS 2020 datasets.

Sample images of BraTS 2018 and BraTS 2020 dataset.
Data preprocessing
In order to prepare the fMRI images for effective brain tumor segmentation, a rigorous preprocessing pipeline is employed, encompassing several essential steps:
By adopting adaptive slicing, we avoided the need to experiment with different slice resolutions and orientations, which can be computationally expensive and may vary significantly between datasets. Additionally, this technique minimizes the risk of excluding important anatomical structures that might be present in slices outside a manually defined range. Overall, adaptive slicing enhances the segmentation performance by providing a comprehensive and consistent representation of the 3D volume without the need for extensive pre-slice experimentation.
While the attention mechanism in ACU-Net is designed to focus on pertinent areas of the input image and can adapt to some extent, relying solely on this mechanism for managing inconsistencies would be insufficient. Preprocessing techniques like denoising and standardization are crucial in preparing the data and improving segmentation performance. By combining these approaches, we ensure that our model can effectively handle variations in the dataset and deliver reliable segmentation results. While the attention mechanism adds value in emphasizing relevant features, preprocessing techniques play a vital role in ensuring data consistency and quality, ultimately enhancing the performance of the ACU-Net model in brain tumor segmentation tasks.
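As a concrete illustration of the standardization and adaptive-slicing steps described above, the following minimal NumPy sketch shows one common way to implement them (the function names and the nonzero-voxel convention are our assumptions, not the paper's released code):

```python
import numpy as np

def zscore_normalize(volume: np.ndarray) -> np.ndarray:
    """Z-score standardization over nonzero (brain) voxels, a common choice for
    skull-stripped BraTS volumes; background voxels stay at zero."""
    brain = volume[volume > 0]
    if brain.size == 0:
        return volume.astype(np.float32)
    mean, std = brain.mean(), brain.std()
    out = volume.astype(np.float32)
    out[volume > 0] = (brain - mean) / (std + 1e-8)
    return out

def adaptive_slices(volume: np.ndarray, axis: int = 2) -> np.ndarray:
    """Adaptive slicing: keep only slices containing brain tissue,
    instead of a manually fixed slice range."""
    other_axes = tuple(i for i in range(volume.ndim) if i != axis)
    nonempty = np.where(volume.sum(axis=other_axes) > 0)[0]
    return volume.take(nonempty, axis=axis)
```

With this approach, empty slices are discarded per volume rather than by a dataset-wide rule, which matches the motivation given above for avoiding a manually defined slice range.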
Data splitting
In this study, the dataset was divided into training, validation, and testing sets using an 80-10-10 split to ensure a balanced model evaluation. Both the BraTS 2018 and BraTS 2020 datasets were structured accordingly, with 80% of the data used for training the model, 10% reserved for validation, and the remaining 10% allocated for testing. This approach allowed for a comprehensive evaluation of the model’s performance while maintaining a sufficient amount of data for both the learning and testing phases. Table 1 outlines the specific data splits for each dataset, showing 228 training samples, 28 validation samples, and 29 test samples for BraTS 2018. Similarly, for BraTS 2020, 294 samples were used for training, 37 for validation, and 27 for testing. This consistent splitting approach ensured reliable performance comparisons across datasets.
Dataset splits for BraTS 2018 and BraTS 2020.
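The 80-10-10 patient-level split described above can be sketched as follows (a minimal illustration; the shuffling seed and function name are assumptions, but the counts reproduce the BraTS 2018 row of Table 1):

```python
import random

def split_patients(patient_ids, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle patient IDs and split them 80/10/10 into train/validation/test."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```

Applied to the 285 BraTS 2018 patients, this yields 228 training, 28 validation, and 29 test cases, matching the split reported above.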
The training setup included a specified epoch count, batch size, and early stopping criteria. The model was trained for up to 25 epochs, with early stopping implemented based on the validation loss to prevent overfitting; training halts if no improvement is observed after a set patience period. We utilized a
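The early-stopping behavior described above (halt when validation loss stops improving for a set patience period, up to 25 epochs) can be sketched as a plain loop; the patience value of 3 here is an assumption for illustration, and in Keras the equivalent is `tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=...)`:

```python
def train_with_early_stopping(val_losses, max_epochs=25, patience=3):
    """Minimal early-stopping loop: stop once validation loss fails to improve
    for `patience` consecutive epochs. Returns (epochs run, best val loss)."""
    best = float("inf")
    wait = 0
    for epoch, val_loss in enumerate(val_losses[:max_epochs], start=1):
        if val_loss < best:
            best, wait = val_loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best  # stopped early
    return min(len(val_losses), max_epochs), best
```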
Attention-based convolutional U-Net architecture
In our brain tumor segmentation approach, we leverage an Attention-based Convolutional U-Net Architecture (ACU-Net) to accurately detect and segment tumors within fMRI images. This architecture amalgamates several key components, each playing a crucial role in enhancing the segmentation performance.
We chose a symmetric U-Net design due to its proven effectiveness in biomedical image segmentation. The symmetry enables balanced capture of both local and global features, which is essential for accurately delineating tumor boundaries in fMRI images. Although experimenting with asymmetric U-Net designs or deeper networks could capture more hierarchical features, they often increase complexity and the risk of overfitting. Our experiments with both MHAU-Net and ACU-Net showed that the symmetric design provided robust segmentation performance while maintaining model stability and interpretability, delivering high performance without unnecessary complexity.
By combining these components, the ACU-Net architecture offers several benefits for brain tumor segmentation in fMRI images. Firstly, its attention mechanism enables the model to concentrate on relevant tumor regions, improving both sensitivity and specificity in segmentation results. Secondly, the convolutional layers capture discriminative features from the input images, promoting the model to distinguish between tumor and non-tumor regions with high performance. Lastly, the U-Net architecture facilitates the integration of contextual information, enabling the model to delineate tumor boundaries accurately even in challenging cases.
The attention mechanism we introduced is specifically designed to enhance feature representation by capturing long-range dependencies in the input data. The integration of our attention mechanism and convolutional layers within the U-Net architecture has proven to be instrumental in enhancing segmentation performance. This enables the model to better focus on relevant spatial features while suppressing irrelevant information, which is crucial in improving segmentation accuracy, particularly in complex tasks such as brain tumor delineation. Unlike standard attention modules, our approach prioritizes preserving the spatial integrity of feature maps to ensure the adaptive enhancement of segmentation performance. This design allows the model to emphasize the most informative regions in the image, leading to more precise identification of critical structures. To further evaluate its effectiveness, we conducted comparative experiments using a Multi-Head Attention U-Net (MHAU-Net) on the BraTS 2018 and BraTS 2020 datasets. These experiments revealed key strengths of our ACU-Net mechanism over multi-head attention, affirming its suitability for the task.
The model follows a U-Net-like architecture with an encoder-decoder structure. The encoder consists of four blocks, where each block includes two convolutional layers of size
The essential components of our ACU-Net with its internal architecture are presented in Figure 3.

Internal architecture of ACU-Net model.
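To make the attention gating concrete, the sketch below implements a generic additive attention gate in the style popularized by Attention U-Net: skip-connection features x are reweighted by a per-location coefficient computed from x and the decoder's gating signal g. This is an illustrative NumPy sketch under our own assumptions (projection shapes, naming), not the exact ACU-Net module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, psi):
    """Additive attention gate on skip features.
    x: (H, W, Cx) encoder skip features; g: (H, W, Cg) decoder gating signal.
    Wx, Wg project both inputs to a shared intermediate dimension;
    psi maps the joint features to one attention coefficient per location."""
    inter = np.maximum(x @ Wx + g @ Wg, 0.0)  # ReLU(W_x x + W_g g)
    alpha = sigmoid(inter @ psi)              # (H, W, 1) attention map in (0, 1)
    return x * alpha                          # suppress irrelevant skip features
```

Because alpha lies strictly between 0 and 1, the gate can only attenuate skip features, which is how irrelevant background regions are suppressed before the decoder fuses them.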
Our proposed ACU-Net model addresses several of these key challenges such as:
Brain tumor segmentation
In the process of brain tumor segmentation, we employ ACU-Net techniques to partition fMRI images into three distinct segments, namely, the whole tumor (WT), tumor core (TC), and enhancing tumor (ET). Each segment serves a specific purpose in characterizing different aspects of the tumor’s morphology and pathology, thereby facilitating comprehensive analysis and treatment planning.
This segmentation approach with ACU-Net offers detailed insights into tumor morphology and pathology, enabling precise treatment planning and monitoring.
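Assuming the standard BraTS label convention (1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor), the three evaluated regions are nested unions of these labels and can be derived as binary masks:

```python
import numpy as np

def tumor_regions(label_map: np.ndarray):
    """Derive the three evaluated regions from a BraTS-style label map:
    WT = labels {1, 2, 4}, TC = labels {1, 4}, ET = label {4}."""
    wt = np.isin(label_map, [1, 2, 4])  # whole tumor
    tc = np.isin(label_map, [1, 4])     # tumor core
    et = label_map == 4                 # enhancing tumor
    return wt, tc, et
```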
Result analysis
In this study, we employed the ACU-Net model for brain tumor segmentation and conducted a comprehensive evaluation against various state-of-the-art models using the BraTS 2018 and BraTS 2020 datasets. Our model was assessed on key performance metrics such as the Dice coefficient, sensitivity, and specificity. The results showed that ACU-Net significantly outperformed existing methods, particularly in segmenting challenging tumor regions, including the WT, TC, and ET.
Environment setup
The experiments were conducted on a development environment powered by NVIDIA Tesla P100 GPUs with 30 GB of dedicated GPU memory. Additionally, the system was equipped with 30 GB of RAM and 70 GB of disk space to support extensive model training. TensorFlow and Keras were the primary deep learning frameworks used, and essential libraries like Matplotlib, Pandas, and NumPy facilitated data processing, visualization, and evaluation of model performance throughout the study.
The performance metrics for brain tumor segmentation are as follows:
Dice similarity coefficient: Dice = 2TP / (2TP + FP + FN). Jaccard index (IoU): IoU = TP / (TP + FP + FN). Sensitivity (true positive rate): Sensitivity = TP / (TP + FN). Specificity (true negative rate): Specificity = TN / (TN + FP). Here, TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. A paired t-test was additionally used to compare the two models.
Importantly, we did not employ any specific threshold values in our calculations, as the Dice coefficient inherently does not require thresholds for its computation. Consequently, we did not perform cross-validation to optimize thresholds for enhancing tumor regions, since our approach focused on directly measuring the similarity between the predicted and actual segmentations without the need for thresholding. This streamlined our evaluation process and allowed us to obtain robust results based on the inherent characteristics of the segmentation outputs.
Performance analysis
We conducted a comprehensive performance evaluation of the ACU-Net model on the BraTS 2018 and BraTS 2020 fMRI datasets, focusing on its effectiveness in segmenting various tumor classes. The key metrics used in this evaluation included Dice coefficient, Jaccard index, Sensitivity, Specificity, and Intersection over Union (IoU). These metrics provide insight into the performance and robustness of the model in delineating different tumor regions: ET, WT, and TC.
As shown in Table 2, ACU-Net achieved outstanding results across all tumor classes. For the BraTS 2018 dataset, it attained a Dice score of 99.23 for ET, 99.27 for WT, and 96.99 for TC, outperforming MHAU-Net in each case. The Jaccard and IoU scores further emphasize ACU-Net’s high segmentation performance, with minimal discrepancies in tumor boundary predictions. Specifically, the Jaccard index for WT was 98.57, indicating precise overlap between predicted and ground truth regions. Similarly, on the BraTS 2020 dataset, ACU-Net continued to deliver strong performance, with Dice scores of 98.82 for ET, 98.4 for WT, and 97.66 for TC. The model demonstrated high Sensitivity and Specificity, which are crucial in medical imaging tasks, ensuring that the model not only correctly identifies tumor regions but also minimizes false positives.
Performance analysis of our ACU-Net model on the BraTS 2018 and BraTS 2020 fMRI datasets.
In comparison to MHAU-Net, ACU-Net consistently exhibited better or comparable results, particularly excelling in TC segmentation. These results confirm the model’s ability to accurately and efficiently segment brain tumors, making it a reliable tool for clinical applications. The attention mechanisms and robust preprocessing pipeline integrated into ACU-Net contribute significantly to its superior performance across diverse datasets.
The graphical representation of our model’s performance on fMRI images for BraTS 2018 and BraTS 2020 is illustrated in Figure 4. The bar chart provides a visual comparison of ACU-Net against MHAU-Net in terms of Dice, Jaccard, sensitivity, and specificity, reinforcing the robustness of our segmentation approach on both datasets.

Performance analysis of our ACU-Net model on fMRI image.
We evaluated the average performance of the ACU-Net model in comparison with MHAU-Net on the BraTS 2018 and BraTS 2020 fMRI datasets, using key metrics such as the Dice coefficient, Jaccard score, Sensitivity, Specificity, and Intersection over Union (IoU), together with a paired t-test.
Average performance analysis of our ACU-Net model on the BraTS 2018 and BraTS 2020 fMRI datasets.
For the BraTS 2018 dataset, ACU-Net achieved an average Dice coefficient of 97.82%, which is slightly higher than the 97.54% obtained by MHAU-Net. Similarly, ACU-Net outperformed MHAU-Net in terms of the Jaccard score (95.86% vs. 95.34%) and Sensitivity (97.82% vs. 97.54%), reflecting its superior capability in identifying and accurately segmenting tumor regions. Specificity and IoU followed a similar trend, with ACU-Net attaining a Specificity of 99.27% and an IoU of 95.86%, both marginally better than MHAU-Net’s results, highlighting the reliability of ACU-Net in medical image segmentation tasks. On the BraTS 2020 dataset, ACU-Net continued to excel, achieving a Dice coefficient of 98.59%, surpassing MHAU-Net’s 98.38%. The Jaccard score and IoU were also higher for ACU-Net, with values of 97.3% compared to MHAU-Net’s 96.89%. ACU-Net showed slightly higher Sensitivity and Specificity, with values of 98.59% and 99.53%, respectively.
Statistical analysis
For fMRI brain tumor segmentation, a statistical analysis was conducted to compare the performance of the proposed model, ACU-Net, with MHAU-Net, using the paired t-test on per-case metric scores.
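As a sketch of this comparison, the paired t statistic can be computed from per-case scores of the two models; the values below are illustrative, not the study's data (the p-value would then be looked up against a t distribution with n − 1 degrees of freedom, e.g. via `scipy.stats.ttest_rel`):

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic: t = mean(d) / (stdev(d) / sqrt(n)),
    where d are per-case score differences between the two models."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation
    return mean_d / (sd_d / math.sqrt(n))
```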
Figure 5 provides a visual representation of our model’s performance on fMRI images, illustrating the effectiveness of ACU-Net in accurately segmenting brain tumors. The segmented regions align closely with the ground truth annotations, validating the robustness of our approach.

The prediction of brain tumor segmentation using ACU-Net model.
Our proposal, ACU-Net, demonstrates better performance due to its innovative architecture leveraging advanced contextual understanding techniques. By integrating attention mechanisms and context aggregation units, ACU-Net effectively captures intricate spatial dependencies and subtle features crucial for accurate tumor segmentation. This results in more precise delineation of tumor boundaries and improved detection of subtle abnormalities, enhancing diagnostic accuracy and facilitating better treatment planning in clinical settings.
Our model, ACU-Net, addresses the borderline cases where tumors are poorly defined or exhibit non-standard shape issues through the integration of attention mechanisms. The attention modules help the model focus on critical features while suppressing irrelevant or noisy information, which is particularly beneficial in cases with ambiguous tumor boundaries or irregular shapes. For poorly defined tumor regions, the attention mechanism allows the model to enhance the representation of subtle differences in intensity or texture that might indicate tumor presence. Additionally, the multi-scale feature extraction employed by ACU-Net enables the model to capture both local and global contextual information, improving its ability to segment non-standard tumor shapes that deviate from typical patterns.
Complexity analysis
The complexity analysis of our proposed model, ACU-Net, focused on evaluating key performance metrics such as build time, inference time, and memory usage on fMRI images from the BraTS 2018 and BraTS 2020 datasets. These factors are crucial for assessing the feasibility of deploying the model in real-time clinical settings, where computational efficiency is essential.
As shown in Table 4, the build time for ACU-Net on the BraTS 2018 dataset was 159.01 seconds, while the inference time was 43.08 seconds. The model’s memory usage stood at 7119.84 MB. In comparison, the MHAU-Net model took significantly longer to build at 231.26 seconds, with an inference time of 58.02 seconds, and consumed more memory, requiring 8757.52 MB. These results demonstrate that ACU-Net is both faster and more memory-efficient than MHAU-Net on the BraTS 2018 dataset. On the BraTS 2020 dataset, ACU-Net also outperformed MHAU-Net in terms of build and inference times, with ACU-Net completing the build in 307.77 seconds and the inference in 68.71 seconds, while MHAU-Net took 417.88 seconds for build time and 69.2 seconds for inference. Memory usage followed a similar pattern, with ACU-Net using 11838.21 MB, compared to 13497.51 MB for MHAU-Net.
Complexity analysis of the ACU-Net model on fMRI images.
Despite its efficiency, ACU-Net maintains strong segmentation accuracy (Table 3), showcasing its ability to handle complex tasks with a lower computational burden. These results emphasize the balance ACU-Net achieves between high performance and computational efficiency, making it a practical choice for brain tumor segmentation in real-world applications. The reduced computational requirements of ACU-Net make it well suited for large-scale deployment, especially in resource-constrained environments, and this efficiency is achieved without compromising segmentation quality. The attention-based design of ACU-Net allows for effective feature extraction and segmentation while minimizing the computational overhead associated with deeper and more complex architectures.
Discussion
In the field of brain tumor segmentation, achieving high performance and robustness across various tumor regions is essential for improving diagnosis and treatment planning. Brain tumors exhibit complex and highly heterogeneous structures that vary widely in size, shape, and location, which poses significant challenges for automated segmentation. Recent advancements in deep learning have led to the development of numerous models designed to tackle this problem, each demonstrating varying degrees of success. In our study, we proposed an Attention-based Convolutional U-Net (ACU-Net) for brain tumor segmentation using the BraTS 2018 and BraTS 2020 datasets. The ACU-Net model leverages attention mechanisms to enhance the segmentation of intricate tumor regions. We evaluated its performance against several state-of-the-art approaches using key metrics such as the Dice coefficient, sensitivity, specificity, and overall performance. In this discussion, we explore how the ACU-Net model outperforms existing methods and analyze the factors contributing to its superior performance.
Comparison analysis on the BraTS 2018 dataset
The results of our experiments on the BraTS 2018 dataset clearly indicate the effectiveness of the ACU-Net model in achieving high segmentation performance. As shown in Table 5, the ACU-Net model achieved a Dice coefficient of 99.23% for the WT region, 99.27% for the TC, and 96.99% for the ET region. This performance significantly surpasses that of existing models, such as HTTU-Net and RMU-Net, which achieved Dice coefficients of 91.50% and 90.80%, respectively, for the whole tumor region. The remarkable performance of ACU-Net can be attributed to its architectural design, which incorporates attention mechanisms. Attention mechanisms enable the model to focus on relevant features and suppress irrelevant ones, improving segmentation accuracy, particularly in regions where tumor boundaries are indistinct or complex. This is crucial in the context of brain tumor segmentation, where subtle variations in pixel intensity can carry critical diagnostic information. Additionally, the ACU-Net's deep learning architecture enhances its ability to learn complex hierarchical features from the input images. By utilizing multi-scale feature extraction, the model effectively captures both global and local information, resulting in a more comprehensive understanding of the tumor's anatomy. The high accuracy achieved by ACU-Net in segmenting various tumor regions underscores its potential as a reliable tool for clinical applications, where precise segmentation is vital for accurate diagnosis and effective treatment planning.
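The Dice coefficient used throughout these comparisons measures the overlap between a predicted mask A and the ground-truth mask B as 2|A∩B| / (|A| + |B|). A minimal NumPy implementation for binary masks (the small example masks are illustrative):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient for binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps guards against division by zero when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0]])
target = np.array([[1, 1, 1, 0],
                   [1, 0, 0, 0]])
score = dice_coefficient(pred, target)  # 2*3 / (4 + 4) = 0.75
```

For multi-class outputs such as WT, TC, and ET, the same computation is applied per subregion after binarizing each label, which is how the per-region scores in Tables 5 and 6 are typically reported.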
Performance analysis (Dice coefficient) of our proposed ACU-Net model against state-of-the-art methods on BraTS 2018.
Comparison analysis on the BraTS 2020 dataset
Similarly, the performance analysis on the BraTS 2020 dataset further reinforces the efficacy of the ACU-Net model. As presented in Table 6, ACU-Net achieved a Dice coefficient of 98.72% for the WT region, 98.40% for the TC, and 97.66% for the ET region. These results position ACU-Net favorably against the current state-of-the-art models, such as the nnU-Net and the Spatial Transformer model, which achieved lower Dice coefficients ranging from 88.95% to 91.2%. The consistency of ACU-Net’s performance across both datasets highlights its robustness and generalizability in handling diverse tumor characteristics. The incorporation of attention mechanisms proves particularly beneficial in differentiating between tumor and non-tumor regions, ensuring that the model remains sensitive to subtle features that may be indicative of tumor presence.
Performance analysis (Dice coefficient) of our proposed ACU-Net model against state-of-the-art works on BraTS 2020.
Factors contributing to superior performance
Several factors contribute to the superior performance of the ACU-Net model compared to existing methods. First, the attention mechanism embedded within the ACU-Net architecture enhances its ability to capture intricate features while mitigating the impact of noise and irrelevant information. This results in a more accurate representation of the tumor’s morphology, enabling more precise segmentation. Second, the architectural depth and complexity of the ACU-Net allow it to learn a richer set of features from the training data. This deep learning approach is advantageous in medical imaging, where the intricacies of tumor structures demand sophisticated models capable of understanding complex patterns. Lastly, the robust training strategy employed during the development of the ACU-Net model, including data augmentation techniques, contributed to its resilience against overfitting and improved its performance on unseen data. This is particularly important in medical applications, where the availability of labeled data is often limited.
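The data augmentation mentioned above must transform the image and its segmentation mask jointly so the labels stay aligned with the voxels. A minimal sketch of geometric augmentation (the `augment` helper and the flip/rotation choices are illustrative, not the paper's exact pipeline):

```python
import numpy as np

def augment(image, mask, rng):
    """Apply one random flip and rotation jointly to an image and its mask,
    so the segmentation labels stay aligned with the underlying voxels."""
    if rng.random() < 0.5:                 # random horizontal flip
        image, mask = np.flip(image, axis=1), np.flip(mask, axis=1)
    k = int(rng.integers(0, 4))            # rotate by k * 90 degrees
    return np.rot90(image, k), np.rot90(mask, k)

rng = np.random.default_rng(42)
image = np.arange(16, dtype=float).reshape(4, 4)
mask = (image > 8).astype(int)             # toy "tumor" mask
aug_img, aug_mask = augment(image, mask, rng)
```

Because the identical transform is applied to both arrays, any relationship between intensities and labels is preserved, which is what lets augmentation expand the effective training set without corrupting supervision.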
Validation of research questions and hypotheses
Validation
The comparative analysis in Table 3 shows that ACU-Net achieves a Dice coefficient of 97.82%, outperforming traditional U-Net models such as 3D-UNet and other existing models. This significant improvement in the Dice score validates that ACU-Net improves segmentation accuracy by better capturing the intricate details and boundaries of brain tumors.
Validation
As demonstrated in Tables 5 and 6, ACU-Net achieves superior Dice scores for different tumor subregions: 99.23% for WT, 99.27% for TC, and 96.99% for ET for BraTS 2018 and 98.72% for WT, 98.40% for TC, and 97.66% for ET for BraTS 2020. These results surpass those of current state-of-the-art methods, confirming the model’s efficacy in accurately segmenting various tumor subregions.
Hypothesis validation
The empirical data presented in the performance analysis corroborates this hypothesis. The ACU-Net attains a Dice coefficient of 97.82% and a sensitivity of 97.82%, both of which indicate significant advancements over traditional U-Net models and other contemporary methodologies. The incorporation of attention mechanisms facilitates the model’s ability to selectively highlight pertinent features, resulting in elevated segmentation scores and improved tumor region detection. This validation through robust empirical results underscores the effectiveness of the ACU-Net in preserving spatial context and enhancing key metrics, thereby affirming the proposed hypothesis.
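The sensitivity and specificity figures cited above are voxel-wise classification rates: sensitivity is the fraction of true tumor voxels recovered, and specificity the fraction of non-tumor voxels correctly left out. A minimal NumPy sketch on toy masks (the example arrays are illustrative):

```python
import numpy as np

def sensitivity_specificity(pred, target):
    """Voxel-wise sensitivity (recall) and specificity for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()    # tumor voxels found
    tn = np.logical_and(~pred, ~target).sum()  # background correctly rejected
    fn = np.logical_and(~pred, target).sum()   # tumor voxels missed
    fp = np.logical_and(pred, ~target).sum()   # background falsely flagged
    return tp / (tp + fn), tn / (tn + fp)

pred = np.array([1, 1, 0, 0, 1, 0])
target = np.array([1, 0, 0, 0, 1, 1])
sens, spec = sensitivity_specificity(pred, target)  # 2/3 and 2/3 here
```

High sensitivity is clinically weighted toward not missing tumor tissue, while high specificity limits false positives that would inflate the apparent tumor extent; reporting both alongside Dice gives a fuller picture than any single metric.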
Practical challenges and implications
We acknowledge that deploying ACU-Net in clinical settings would require careful consideration of real-world constraints, such as variability in fMRI image quality across healthcare institutions. ACU-Net has been designed with an adaptive framework that leverages attention mechanisms, which enhance its ability to focus on critical regions within the imaging data, potentially improving its robustness to variations in image quality. However, to fully address inter-institutional differences, future work could incorporate domain adaptation techniques to further enhance generalizability.
Regarding misclassification risks, we recognize that any inaccuracies could have significant clinical implications. To mitigate these risks, ACU-Net could incorporate confidence scoring for each prediction, allowing clinicians to interpret model outputs with an understanding of their reliability. Additionally, integrating ACU-Net outputs into a clinical decision support system, where final decisions remain with trained professionals, would further minimize risks associated with misclassification.
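The confidence-scoring idea above can be sketched with per-voxel softmax probabilities: voxels whose top-class probability falls below a threshold are flagged for clinician review. The two-class logits, the 0.8 threshold, and the helper names are illustrative assumptions, not part of ACU-Net:

```python
import numpy as np

def softmax(logits, axis=0):
    z = logits - logits.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def flag_low_confidence(logits, threshold=0.8):
    """Return predicted labels plus a mask of voxels whose top-class
    probability falls below `threshold`, for clinician review."""
    probs = softmax(logits, axis=0)   # shape (classes, voxels)
    labels = probs.argmax(axis=0)
    confidence = probs.max(axis=0)
    return labels, confidence < threshold

# Two classes (background, tumor) over four voxels; logits are illustrative
logits = np.array([[4.0, 0.2, -1.0, 0.0],
                   [0.0, 0.1,  3.0, 0.1]])
labels, review = flag_low_confidence(logits)
# Voxels 1 and 3 have near-tied logits, so they are flagged for review
```

Routing only the flagged voxels (or cases) to a human reader keeps the final decision with trained professionals while concentrating their attention where the model is least certain.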
Implications for clinical applications
The ACU-Net architecture has significant implications for clinical applications, particularly in neuro-oncology. By enhancing the accuracy of brain tumor segmentation from fMRI data, ACU-Net facilitates precise tumor characterization, which is crucial for effective treatment planning. Accurate segmentation allows clinicians to differentiate between tumor subregions—WT, TC, and ET, enabling tailored therapeutic strategies that can improve patient outcomes. Furthermore, the ability to efficiently delineate tumor boundaries aids in the assessment of tumor progression and response to treatment, thereby supporting timely clinical decisions. The model’s high performance on the BraTS 2018 and BraTS 2020 datasets demonstrates its robustness, making it a valuable tool for both research and clinical settings. As a result, ACU-Net not only advances the field of brain tumor imaging but also holds the potential to contribute to personalized medicine, ultimately enhancing patient care in neuro-oncology.
Limitations of the study
While the ACU-Net model has shown substantial improvements in brain tumor segmentation, several limitations should be considered. Firstly, ACU-Net relies on a single imaging modality (fMRI), which may restrict its performance as it could benefit from integrating multiple imaging modalities. Incorporating data from modalities like CT or PET alongside fMRI could provide complementary information, potentially enhancing segmentation accuracy by capturing a broader range of anatomical and functional details. Another limitation is the absence of Explainable AI (XAI) techniques within our approach. Implementing XAI methods would improve the interpretability of ACU-Net, offering insights into the model’s decision-making process. Finally, our study does not employ pre-trained models, such as SAM-MED2D or SAM-MED3D, which have shown notable segmentation performance due to large-scale pretraining. Incorporating such models could further enhance ACU-Net’s segmentation accuracy and robustness, particularly in scenarios with limited training data.
Future directions
In future work, we plan to address these issues by incorporating XAI techniques to improve model interpretability and transparency, ensuring that the decisions made by the model can be understood and trusted by clinicians. We will explore feature fusion strategies to combine data from different imaging modalities, such as combining fMRI with other modalities like MRI or PET scans, which may provide complementary information and improve segmentation accuracy. These future directions aim to enhance the clinical applicability and robustness of the ACU-Net model, allowing it to better handle challenges such as data variability and the limitations associated with relying on a single imaging modality. Besides, we will consider including a formal ablation study to further clarify the specific contributions of each component within the model. Moreover, we intend to explore integrating pre-trained models like SAM-MED to further improve ACU-Net’s segmentation performance while maintaining its efficiency and adaptability in specialized datasets such as brain tumor fMRI data.
Conclusion
In this study, we introduced ACU-Net, an advanced deep learning architecture designed for brain tumor segmentation from fMRI images. The proposed model integrates attention mechanisms with a convolutional U-Net framework, enabling it to effectively capture intricate features and accurately delineate tumor subregions. Extensive evaluations on the BraTS 2018 and BraTS 2020 datasets demonstrated that ACU-Net significantly outperforms traditional U-Net models and state-of-the-art approaches. For the BraTS 2018 dataset, ACU-Net achieved Dice coefficients of 99.23%, 99.27%, and 96.99% for WT, TC, and ET, respectively. On the BraTS 2020 dataset, the model attained Dice scores of 98.72% for WT, 98.40% for TC, and 97.66% for ET.
These results affirm that ACU-Net’s innovative approach enhances segmentation accuracy by selectively emphasizing critical features while preserving spatial context. This advancement is particularly crucial in clinical applications, where precision in tumor segmentation can directly impact treatment planning and patient outcomes. Our findings underscore the potential of ACU-Net as a robust tool in the medical imaging domain, paving the way for further research into hybrid models that leverage attention mechanisms.
Despite the promising results, our study has some limitations. We did not explore explainable AI (XAI) techniques, feature fusion, or hybrid segmentation methods. In future work, we will address these limitations by incorporating XAI to enhance model transparency and interpretability. We will also investigate feature fusion strategies to improve segmentation scores and explore hybrid segmentation techniques to further advance brain tumor segmentation.
