Abstract

Network biology has become crucial to understanding the complex structural characteristics of biological systems. Consequently, advanced visualization approaches are needed to support the investigation of such structures, and several network visualization tools have subsequently been developed to help researchers analyze intricate biological networks. While these tools support a range of analytical and interactive features, it is sometimes unclear to a data analyst or visualization designer which features are of most relevance to biologists. Thus, this study investigates and identifies essential factors for the visualization of complex biological networks using a mixed methodology approach. Based on the findings, essential factors were categorized as either generic or heuristic, where the former concern different analytical and interactive functionalities, such as an efficient layout, advanced search capabilities, plugin availability, graph analysis and user-friendliness, while the latter concern usability, such as information coding, flexibility, orientation and help.¹ Furthermore, the findings indicate that 12 of the 15 generic factors identified were moderately important, while all 10 heuristic factors identified herein were moderately important.

Keywords

Visualization tools, biological networks, graph visualization, evaluation, data analysis

Introduction

Network biology has become a vital research concept in revealing the structural features of biological systems,² but the study and modeling of advanced biological processes necessitate the creation of highly integrated networks that can handle heterogeneous and complex data.³ As a result, numerous biologists and bioinformaticians routinely study and elucidate biological networks using interactive graphs, enabling the mapping and classification of signaling pathways, as well as anticipating the functions of unidentified proteins.⁴ Meanwhile, the ubiquitous nature of big data across many fields, including biology, as well as advancements in computation have led to the emergence of many complex networks, the primary goals of which include modeling and understanding real, complex systems.⁵,⁶ Special properties, such as being small-world⁷ and scale-free,⁸ are key indicators of complex networks.⁹ A network is small-world when its average path length scales logarithmically with the number of nodes and its clustering coefficient is higher than that of a random network of the same size.¹⁰ A scale-free network, conversely, is characterized by a functional form that changes only by a multiplicative factor when its independent variable is rescaled.¹¹
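The two quantities named in the small-world definition above can be computed directly on a small graph. The following is a minimal sketch in plain Python, assuming an undirected, unweighted network stored as a dictionary that maps each node to the set of its neighbors; the function names and the ring-lattice test graph are illustrative choices, not taken from any of the cited works.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than the source
    return total / pairs if pairs else 0.0


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with < 2 neighbors count as 0."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        nbr_list = list(nbrs)
        # Count the links that exist among the neighbors of this node.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbr_list[j] in adj[nbr_list[i]]
        )
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


lattice = ring_lattice(10, 2)
path_length = average_path_length(lattice)
clustering = average_clustering(lattice)
```

Comparing these two values against those of a random network with the same number of nodes and edges is what the definition requires; that comparison is left out here to keep the sketch deterministic.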

Visualization is an essential concept used to understand and analyze data.¹² Several visualization methods and tools have been introduced in the literature, but due to the magnitude and intricacy of biological datasets, it is difficult to obtain useful information from interaction networks.¹³,¹⁴ Furthermore, Cromar et al.¹⁵ mentioned that due to this difficulty, knowledge of big molecular assemblies and physiologically active fragments has not been well-captured in the research. Therefore, the literature has introduced a wide range of methods and techniques that can be used to develop, represent and evaluate biological networks.¹⁶,¹⁷ Further, in response to Pavlopoulos et al.’s¹² assertion that advanced methods are required for the visualization of biological networks given their complexity, several network visualization tools have been introduced to assist researchers in studying complex biological networks, some of which include Cytoscape,¹⁸ Gephi,¹⁹ Medusa,²⁰ Ondex,²¹ Osprey,²² Pajek,²³ and ProViz.²⁴

Given the large number of network visualization tools made available to biologists, as well as the consequential challenge of reviewing and choosing the right one, a hypothesis was generated to determine which qualities are most critical to the efficient and effective visualization of complex networks. This list of quality factors can facilitate the analysis and comparison of complex network visualizations and enable investigators to grasp each tool’s critical components. Thus, this study focuses on identifying appropriate factors that can be utilized to assist designers and users of existing tools for the visualization of complex biological networks in improving and selecting the most suitable tools for different purposes.

Background

Biological networks

Complex network theory spans many disciplines, from computer science to the biological and molecular sciences, and within these disciplines, many biological networks exist, including protein–protein interaction (PPI) networks,²⁵ gene-regulatory networks (GRNs),²⁶ signal transduction or metabolic networks²⁷ and biomedical networks.²⁸ Concerning PPI networks, in computational biology and bioinformatics, such methods as affinity purification,²⁹ pull-down assays,³⁰ yeast two-hybrid (Y2H),³¹ mass spectrometry, microarrays³² and phage display³³ are used to identify protein functions from their relationships and interactions with other biomolecules.¹²

Further, concerning GRNs, control over gene expression in cells is assigned to the regulatory network. As such, the study of gene regulatory networks on a large scale is now feasible with the help of data collection, analysis and visualization tools.³⁴

Signal transduction networks, alternatively, use multi-edged directed graphs to visualize and represent interactions among various bioentities (proteins, chemicals or macromolecules).³⁵,³⁶ In addition, signal transmission can be studied either from the outside toward the inside of the cell or within the cell.¹²

Metabolic and biochemical networks are powerful tools for investigating and studying metabolism patterns in various organisms. Similar to bacteria in humans, modifications to the biomedical reaction network can be done via modern techniques for sequence operation.³⁷ Further, biological networks can be described using computer-readable formats, such as Systems Biology Markup Language,³⁸ Proteomics Standards Initiative Molecular Interaction (PSI-MI),³⁹ Chemical Markup Language,⁴⁰ Cell Markup Language and the Resource Description Framework.¹²

Data visualization and visualization tools

The emergence of big data and its associated challenges has recently piqued the interest of researchers across many sectors, including healthcare, academia, information technology (IT) and government.⁴¹–⁴³ Other manual operations are now being digitalized,⁴⁴ producing another set of data requiring suitable visualization tools.⁴⁵ Thus, analyzing, interpreting and presenting the results in a meaningful way also poses serious challenges.⁴⁶ One of big data’s greatest challenges is visualization, as the best tool for structured data may be unsuitable for unstructured data.⁴⁷ For instance, visualizing complex biological data requires advanced techniques to identify patterns in the data structure, which then aid in making decisions that fit the data content.

Data visualization can be defined as a method of unveiling data content by graphically presenting and conveying messages. According to Munzner,⁴⁸ visualization provides visual representations of datasets designed to help people carry out their work more effectively. While visualization has been used for centuries to communicate data, the associated challenges and opportunities have greatly changed with the emergence of big data. Conventional data visualization methods are becoming inefficient and obsolete, considering the rate at which data are generated.⁴³ Big data have five main characteristics, known as the “5Vs”: huge volume, high velocity, high variety, low veracity and high value.⁴⁶ The main problem relates not to the processing of huge amounts of data but to the diversity among the data.⁴⁶ For instance, biological data present numerous complexities, and only network-based approaches can assist in visualizing such data. Examining the content of a genome goes beyond what can be visualized via bar charts and histograms: first, the data structure must be investigated, and then suitable visualization tools with relevant features must be developed. Hence, choosing the right tools and features is crucial to obtain accurate information from the data.⁴⁹

Major network analysis factors include structure and dynamics, which are the by-products of network science.⁵⁰ Concerning the network structure, the importance of a node is measured by nodal centrality,⁵¹ and network communities can detect similar nodes.⁵² Whether nodal centrality or network communities are used, both approaches endeavor to identify essential nodes in complex networks.⁵³
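Nodal centrality can be measured in several ways, and the text above does not single one out. As an illustration only, the sketch below uses degree centrality (the fraction of other nodes to which a node is directly connected), which is one common choice; the graph representation and all names are assumptions made for this example.

```python
def degree_centrality(adj):
    """Degree centrality: neighbors of a node divided by (n - 1)."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


# Hypothetical star-shaped network: one hub connected to three leaves.
star = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}

scores = degree_centrality(star)
most_central = max(scores, key=scores.get)
```

In this toy network the hub receives the highest score, which matches the intuition that an essential node is one on which many connections depend.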

Further, the scope of data visualization has widened significantly with the development of digital technology and internet advancements, primarily through such visuals as graphs and graphic diagrams.⁵⁴ Thus, visualization has become a great tool for information analysis and sharing,⁵⁵ and it is essential to the scientific process because, no matter the research significance, if policymakers, other experts or the public cannot grasp the science presented to them, society will not profit from its results.⁵⁶ As such, visualization uses images to represent data to give viewers a clear understanding,⁵⁷ and it enables the comprehension of highly complex biological data,⁵⁸ making it possible for administrators to understand large amounts of data quickly and easily at first glance on the state-of-the-art network. In addition, it offers decision-makers the power to visualize analytics to help them comprehend complex ideas and patterns.⁵⁹

Network visualization

Network visualization focuses primarily on interpreting, interacting, identifying and exploring the patterns within a dataset,⁶⁰ for which several tools have been developed. The investigated tools are classified into two major sections: 2D and 3D visualization tools. By investigating them, it is possible to understand their algorithms and the data structures they utilize, as well as to explore their application domains and learn about their capabilities and features. These features are reviewed in this section to enhance our understanding of them.

2D network visualization tools

Among the 2D visualization tools analyzed are Cytoscape.js, Osprey, Medusa, ProViz, Pajek, ONDEX, Gephi and Tulip, and overviews and key feature analyses of the selected tools are provided in Tables 1 and 2 respectively.

Table 1.

Review of fundamental features of widely used 2D network visualization tools.

Tools	Open-Source?	File Formats	Community Support	System	URL
Cytoscape.js	Yes	Supports different input formats, such as GML, SIF, NNF, BioPAX, PSI-MI, etc.	Yes	Web	https://js.cytoscape.org/
Osprey	Yes	Supports raw and processed formats from major MRI vendors, such as GE, Siemens, and Philips.	Yes	Standalone	http://tinyurl.com/osprey1/
Pajek	Freely accessible for academic use.	Only supports network in Pajek (.net) (Strict file input formats).	Yes	Standalone	http://vlado.fmf.uni-lj.si/pub/networks/pajek/
ONDEX	Freely accessible with a GNU Public License.	OBO ontologies, BioPAX (.owl, .bpn), Excel Workbook (.xlsx, .xls), OWL (Web Ontology Language) (.owl), FASTA (.fasta), GFF (Generic Feature Format) (.gff), ONDEX Graph Format (.oxl), XGMML (eXtensible Graph Markup and Modeling Language) and many others	No	Web	http://www.ondex.org/
Gephi	Yes	Supports CSV, GEXF, GDF, GML, GraphML, Pajek NET, GraphViz DOT, UCINET DL, Tulip TPL, NetDraw VNA and spreadsheet formats.	Yes	Web	http://gephi.org/
Tulip	Yes	GraphML (.graphml), CSV (.csv), GEXF, JSON, UCINet (.dl), Pajek (.net) and Tulip (.tlp)	Yes	Standalone	https://tulip.labri.fr/site/

Table 2.

Review of functional features of widely used 2D network visualization tools.

Visualization tools	Layouts	Input customizations	Scalability	Editing
Cytoscape.js	Wide variety of simple grids and sophisticated algorithms.	Data import/export, custom extensions, layout options, style customization, custom attributes, data structure, nodes and edges, labels and tooltips, zooming and panning, styling and visual themes and many others	Can visualize large networks with nodes and edges. Cannot effectively scale the analysis.	Offers predefined visual styles and color schemes, academic viewer, etc.
Osprey	Circular, concentric, spoke and dual ring layouts.	Data import, node and edge customizations, layout algorithms, node filtering and highlighting, interactive exploration, data integration and many others	Can use two or more added datasets: filter options.	Automated identification of input file formats; uses GRID to build interaction networks.
Pajek	Graph layout, neighborhood detection, clique finding, node merging, etc.	Node and edge attributes, network layout, visualization styles, partitioning and clustering, layout algorithms and others	Can visualize a million nodes with over a billion connections.	Manual graph editing.
ONDEX	FastCirculae, Ycircle, JungFR layouts.	API customization, user interface customization, workflow customization, visualization styles, custom analysis tools, grouping and clustering, custom annotations and others	Text mining, graph analysis.	Graph filters.
Gephi	Force-directed layout algorithm (Force Atlas).	Network file formats, visualization styles, data filters, layout algorithms, interactive exploration and others	Can visualize large networks (over 20,000 nodes).	Graphical modules, designs of nodes, edges and labels. Options for increasing network clarity and readability.
Tulip	Force-directed layouts, hierarchical layouts, circular layouts, grid-based layouts, random layouts and custom layouts.	Data import, data processing, layout algorithms, visual representation, interaction and exploration, analysis algorithms and others.	With regard to the graph size (number of nodes and edges) that might be managed and visualized, Tulip’s basic framework and low-level data structures are optimized to meet challenging objectives.	Node and edge creation, node and edge removal, node and edge editing, selection and grouping, drag-and-drop, copy and paste and undo and redo.

Su et al.⁶¹ explored the use of Cytoscape 3 for biological network data, the main advantage of which is that it offers users an interactive and versatile visualization interface with which they can easily navigate available features to explore network data. Their work highlighted other features added to Cytoscape 3, which only advanced users can access. As rendering interactive graphs in a web browser is among its most frequent use cases as a visualization software component, it can be implemented in this capacity easily, and it can also be utilized headlessly, which is helpful for graph operations on a server, such as in Node.js.⁶¹

Oeltzschner et al.⁶² analyzed Osprey as an open-source processing approach toward reconstructing and estimating magnetic resonance spectroscopy (MRS) data, and they used it to load a series of MRS data formats and carry out phased-array coil combinations, as well as to determine the frequencies and phase corrections of transients. An MRS voxel co-registers an anatomical image, so it was found that Osprey has the capacity to load, process, model and quantify MRS data successfully using different conventional and spectral editing methods.

Meersche et al.⁶³ explored Medusa’s ability to predict protein flexibility in sequences, having derived protein homologous sequences and amino acid physiochemical features from evolutionary trends to serve as inputs for a convolutional neural network. This was possible using the Medusa tool due to its flexibility, as its output was found to allow users to identify highly deformable protein regions and the general dynamics of protein properties, though it is important to note that the tool is currently inactive.

Jehl et al.²⁴ discussed ProViz, a web-based visualization tool, to investigate the functional and evolutionary features of protein sequences. With the goal of streamlining the study of proteins’ operational and developmental characteristics, ProViz, a potent browser-based tool, was created to assist biologists in developing concepts and designing experiments. Resources outlining the modular architecture of proteins, sequence variations, post-translational modifications, structures and experimental characterizations of functional areas are used to derive feature information automatically. The data are presented via a user-friendly, interactive visualization medium, made available via a straightforward protein search tool, enabling people with modest bioinformatic expertise to obtain appropriate information quickly for their research. User-defined data can also be added to visualizations via manual customization or by using a representational state transfer (REST) application programming interface (API).

The Windows program Pajek evaluates and visualizes huge networks with dozens – sometimes even millions – of vertices,⁶⁴ and its primary objectives are to offer users a powerful visualization tool and to develop various effective algorithms for investigating huge networks. The tool was primarily built on the experiences gained while developing the libraries of the graph data structures and the algorithms for graph and X-graph, which are network analysis and visualization programs that identify transformations, numbering, partitions, maximum flow, random networks, hierarchical components, decompositions, citation weights, k-neighbors, the critical path method (CPM), paths between two vertices, vectors and counts in NET.⁶⁴

Concerning heterogeneous biological networks, Taubert et al.²¹ visualized and explored Ondex Web, an updated version of the Ondex data integration platform that includes new network visualization and research characteristics. The appearance of heterogeneous biological networks may be explored and altered easily by users thanks to such novel capabilities as context-sensitive menus and annotation tools. Further, the open-source, Java-based Ondex Web is effortlessly embeddable as an applet into websites, and data can be uploaded onto Ondex Web in a variety of network formats, including Pajek, XGMML, OXL and NWB.

Jayamohan and Chatterjee⁶⁵ analyzed Multiviz, a Gephi plugin that uses a multi-layer network-scalable tool to visualize complex networks that are also multi-layered. They discovered the availability of different settings that can be used to transform extant multi-layered networks, which shows that the Gephi plugin can visualize multi-layered data in complex real-life situations.

The TULIP framework was created to foster extensibility and reusability.⁶⁶ Generally, it encourages the implementation of new technologies and scientific collaborations, and it gives users the option to build rapidly and browse via cluster trees or graph hierarchies (nested subgraphs). These methods have served as a key visual framework for the research team, as they frequently supply data analysts with the necessary answers.

Allegri et al.⁶⁷ designed and developed a new network-based visualization tool called CompositeView, an open-source application developed in Python. It mainly improves the visualization and extraction of complex interactive networks, increasing the chances of obtaining actionable insights. The authors found that although CompositeView was developed to visualize network data using ranking properties, it functions better on non-network datasets.

3D network visualization tools

Some of the 3D visualization tools analyzed include Arena3Dweb, CellNetVis, Graphia and OmicsNet, overviews and key feature analyses of which are provided in Tables 3 and 4 respectively. The first web program to enable the visualization of multi-layered graphs in 3D space was Arena3Dweb, which is entirely dynamic and independent.⁶⁸ Users of Arena3Dweb can combine numerous networks with their intra- and inter-layer connections into a single view, and a wide variety of inter- and intra-layer layouts and network indicators is available for node scalability, with easy use by beginners on a web browser. Moreover, it was created using R, Shiny and JavaScript, and it supports weighted and unweighted undirected graphs.

Table 3.

Review of fundamental features of widely used 3D network visualization tools.

Tools	Open-Source?	File Formats	Community Support	System	URL
Arena3Dweb	Free for academic use.	Supports text file formats.	No	Web	http://bib.fleming.gr:3838/Arena3D
CellNetVis	Freely accessible with a GPLv.3 license.	XGMML formats.	No	Web	http://www.lge.ibi.unicamp.br/cellnetvis
Graphia	Yes	Graphia Graph Format (.ggf), GraphML (.graphml), CSV, XLSX, GEXF and JSON.	Yes	Standalone	https://graphia.app/
OmicsNet	No	TXT, CSV, Excel (XLS, XLSX), Simple Interaction File (SIF), Cytoscape Formats (XGMML, GML), Open Biomedical Ontologies (OBO), Gene Association File (GAF) and Kyoto Encyclopedia of Genes and Genomes (KEGG).	No	Web	https://www.omicsnet.ca/

Table 4.

Review of functional features of widely used 3D network visualization tools.

Tools	Layouts	Input customizations	Scalability	Editing
Arena3Dweb	Inter-/intra-layer layouts, circle, grid, random, star, Fruchterman–Rheingold layout algorithms, etc.	Annotations and markers, interaction and navigation, structure interactions, structure labeling, structure coloring, structure highlighting and others	Can visualize large-scale complex biological networks.	Control buttons for easy 3D navigation, zooming, panning and orbiting.
CellNetVis	Force-directed layout algorithm.	Network data import, node customization, edge customization, layout algorithms, filtering and subsetting, interactive exploration and others	Can visualize both small and large networks.	Search for nodes by label, manual creation of shapes and position nodes.
Graphia	A robust open-source visual analytics program called Graphia was created to assist with analyzing vast and complicated datasets.	Graph data import, node customization, edge customization, layout algorithms, filtering and subsetting, interactive exploration, node annotation and labeling and others.	Developed to visualize the typically huge graphs producing 2D or 3D space quickly.	Users can add additional nodes and edges right within the visualization. Usually, users can choose either certain nodes or edges to change in the graph.
OmicsNet	Force-directed layout, grid layout, spherical, multi-layered perspective layouts and others.	Data integration, interactive exploration, node and edge attributes, omics data input, network layout and others.	Created to simplify the creation, visualization and analyses of multi-omics networks to examine complex interactions across lists of relevant ‘omics traits. OmicsNet 2.0.	Able to modify a network’s edges or connections between nodes and enables users to alter specific network nodes.

CellNetVis creates an adaptive network structure where nodes are organized into flexible cellular components using an iterative force-directed process,¹ where a correctly documented network in the XGMML format serves as the tool’s input. It provides some capabilities that are crucial to modern biological network analysis and that are not offered by other tools, including simultaneously being web-based, supporting enormous networks and automatically displaying nodes within their cellular components.

The open-source platform Graphia was developed for the graph-based analysis of the massive volumes of quantitative and qualitative data currently being produced from research on cells, genes, proteins and metabolites.⁶⁹ Computing the correlation matrices of any tabular matrix, whether of discrete or continuous values, is at the heart of Graphia’s capabilities, and the program is built to render quickly, in 2D or 3D space, the often enormous graphs that result.
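The general idea of turning a tabular matrix into a graph through correlation can be sketched as follows. This is not Graphia's implementation: it is a minimal illustration in plain Python that assumes Pearson correlation between rows and a fixed cutoff, and the function names, sample values and the 0.9 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant row has no defined correlation; treat as 0
    return cov / (spread_x * spread_y)


def correlation_graph(rows, threshold):
    """Connect every pair of rows whose correlation reaches the threshold."""
    names = list(rows)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(rows[first], rows[second])
            if r >= threshold:
                edges.append((first, second, r))
    return edges


# Hypothetical measurements: each row is one entity across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

edges = correlation_graph(profiles, threshold=0.9)
```

Here only geneA and geneB are linked, because geneC varies in the opposite direction. The number of pairs grows quadratically with the number of rows, which is why rendering speed matters for graphs produced this way.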

Another web-based application, called OmicsNet, was designed to simplify for users the creation, visualization and analysis of multi-omics networks for the exploration of intricate correlations between lists of relevant ‘omics traits.⁷⁰ Some highlights include a new 3D module called layout and improved network visual analytics, with 11 2D graph layout possibilities. It includes steps to enhance study reproducibility by introducing the companion OmicsNetR package, linking R command history and creating ongoing links for the exchange of interactive network views.

Comparison of 2D and 3D network visualization tools

From the in-depth analyses presented in Tables 2 and 4, it is clear that 3D visualization tools provide an enhanced user experience, offering interactive features and making it easy to explore a 3D space. Conversely, 2D network visualization tools provide several layout algorithms that can be used to provide a wide range of visualizations. Therefore, depending on the requirements, the analyst must select the proper tool to maximize their output.

Factors for evaluating visualization tools

The factors that can be adopted in the evaluation and selection of visualization tools for different purposes are classified into generic and heuristic. This section will describe these factors, including how they were derived and their use among the reviewed network visualization tools.

Generic factors

Fifteen generic factors were derived from the literature (see Table 5). The phrase “Factors in evaluating visualization tools” was used as a keyword to search for scientific publications in such online databases as Google Scholar, ResearchGate, JSTOR, IEEE and ScienceDirect, among others. The publications discovered were sorted using the keyword “generic factors,” and the factors in each publication were identified and ranked based on the number of publications in which they appeared. Each of the 15 factors appeared in more than three journals; hence, they were included in the study.

Table 5.

Generic factors.

Generic Factors
Filtering Tools	Scalability
Plugins	Different File Formats
Visual Styles	Text Mining
Advanced Search	User Input and Customization
Free/Open Source	Runtime performance
Graph Analysis	User-friendliness
Feedback to Users	Strengths
Efficient Layout

A brief discussion of the generic factors is included below.

Filtering Tools (GF_1): According to Heberle et al.,¹ this is one of the options used to improve network layouts. Rusch et al.⁷¹ also used the filtering tool to eliminate redundant probe sets in the genetic locus. Filtering tools are also used to enable compatibility between different file formats and visualization systems and to enable easy network manipulation. Baitaluk et al.⁷² evaluated Cytoscape and VisANT as filtering tools, indicating that while Cytoscape has flexible filters with different nodes and edge attributes, VisANT has several available “select” filters. Furthermore, Kohl et al.⁷³ investigated the filtering features of Cytoscape, ProViz and VANTED, indicating that the tools probe session-relevant features of the networks. In addition, Yeung et al.⁷⁴ stated that in Cytoscape, the numerical attributes were filtered to determine the minimum and maximum values, and nodes and edges within this range were identified.
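The range-based filtering described for numerical attributes can be illustrated with a short sketch. It does not use the Cytoscape API; it is plain Python under the assumption that nodes carry a numeric attribute and that an edge is kept only when both of its endpoints pass the filter. The attribute name and sample values are hypothetical.

```python
def filter_by_range(nodes, edges, attribute, minimum, maximum):
    """Keep nodes whose attribute lies in [minimum, maximum] and the edges between them."""
    kept_nodes = {
        name: attrs
        for name, attrs in nodes.items()
        if attribute in attrs and minimum <= attrs[attribute] <= maximum
    }
    # An edge survives only if both endpoints survived the node filter.
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges


# Hypothetical network with an "expression" value on each node.
nodes = {
    "P1": {"expression": 0.2},
    "P2": {"expression": 1.5},
    "P3": {"expression": 2.8},
    "P4": {"expression": 1.1},
}
edges = [("P1", "P2"), ("P2", "P3"), ("P2", "P4")]

kept_nodes, kept_edges = filter_by_range(nodes, edges, "expression", 1.0, 2.0)
```

With the range 1.0 to 2.0, only P2 and P4 and the single edge between them remain, which shows how a filter can reduce a dense network to the part relevant to a given question.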

Plugins (GF_2): The presence of plugins in network visualization and analysis is a major factor, as they are important means through which advanced users can extend and customize applications. Millán⁷⁵ stated that Cytoscape helped to access PPI repositories and the BINGO plugin for the GO enrichment analysis of the resulting network. Furthermore, Koh et al.⁷⁶ mentioned that Cytoscape offers visualization features that may be combined to display complex information. Moreover, Cline et al.⁷⁷ investigated Cytoscape, Osprey, VisANT, GenMAPP, BioLayout Express3D, PATIKA, CellDesigner, PIANA and ProViz, and they found that additional functionality in such areas as download services and data integration was provided. Furthermore, Gerasch et al.⁷⁸ stated that, compared to other systems, the network visualization model of BINA is based on hierarchical graphs and focuses on interactive and comprehensive visualizations for signaling high-quality networks.

Visual Styles (GF_3): Due to their increased perceptibility and ability to highlight patterns in complex information, visual styles are essential for data visualization.⁷⁹ Moreover, they play a key role in ensuring effectiveness, as they avoid confusion.⁸⁰ Visual styles are linked to themes, each of whose files contains a section defining which visual style will be used while it is active. A visual style comprises numerous components that come together to form a seamless whole that supersedes the sum of its parts.

Advanced Search (GF_4): The advanced search feature is important in a visualization tool, as it allows narrowing of the search query’s scope to exclude unrelated information, so a user can easily find the specific content they want. However, with Google, this feature limits the results of complex searches.⁸¹ It is a straightforward application that enables a recursive search for files utilizing a basic filter, and it allows a file filter to batch-move files from one folder to another. Baitaluk et al.⁷² stated that Cytoscape enables node name searches on the graph, while no search option is available in VisANT. In addition, after investigating Ondex Web and Cytoscape Web, Taubert et al.²¹ stated that an advanced search helped the user obtain information relevant to the network inputs.

Free/Open Source (GF_5): Including this feature in a visualization tool enables its users to overcome limitations, as well as to examine and validate source code independently and to rely on the user and volunteer community. No permission is required to access, study or use such tools; moreover, they are available free of charge. An open-source application is one in which the executable binaries that constitute the program are distributed together with the source code that generates the application.⁸² Faysal and Arifuzzaman⁸³ stated that SocNetV, Cytoscape, Gephi, TULIP and Pajek are all free and open-source software, though Pajek has commercial and non-commercial versions.

Graph Analysis (GF_6): The embedding feature is critical, as it requires less time and effort to organize data while combining more data points or sources, rendering it easier to work with. The data it analyses can also be modeled, stored and retrieved. In many contexts, identifying, visualizing and analyzing links between items are referred to as graph analysis, a process Kohler et al.⁸⁴ identified as supported by Ondex, which maps and automatically links data from various heterogeneous sources, unlike other graph-based systems.

Feedback to Users (GF_7): This is an important feature of a visualization tool, as it measures its ease of use. Users may be asked to share their opinions on the system under assessment as part of the usability evaluation process, and considerations for this may relate to the system’s appropriateness to the usage context, expectations of usability issues and design recommendations.⁸⁵ Faysal and Arifuzzaman⁸³ stated that only Gephi and SocNetV were identified as having good reporting strategies, while TULIP and Cytoscape offer limited options for users to save the resultant graphs from operations in specific formats.

Efficient Layout (GF_8): This is among the most important features of visualization tools.⁸⁶ A layout algorithm is a specific tool that helps expedite the production of various diagram types, and it automatically organizes diagram elements.⁸⁷ The algorithm determines the placement of diagram shapes and connectors based on predetermined principles and arranges them so that even the most complicated diagrams are understandable and instructive.⁸⁸ Yeung et al.⁷⁴ investigated Cytoscape and stated that applying the layout to the network produced a more vivid visual representation of the data and rendered the network structure more interpretable. Moreover, Pavlopoulos et al.⁸⁹ stated that most of the tools have several sophisticated layout algorithms, although TULIP is highly recommended.
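To make the notion of a layout algorithm concrete, the sketch below implements a circular layout, one of the simplest layouts listed for the reviewed tools. It is a generic illustration in plain Python, not the implementation used by any particular tool; the function name and the default radius are assumptions.

```python
from math import cos, pi, sin


def circular_layout(nodes, radius=1.0):
    """Place nodes at equal angular intervals on a circle centred at the origin."""
    ordered = list(nodes)
    count = len(ordered)
    return {
        node: (
            radius * cos(2 * pi * index / count),
            radius * sin(2 * pi * index / count),
        )
        for index, node in enumerate(ordered)
    }


positions = circular_layout(["A", "B", "C", "D"], radius=1.0)
```

The rule here is purely geometric and ignores the edges. Force-directed layouts, by contrast, use the edges to pull connected nodes together, which is why they tend to reveal cluster structure that a circular layout does not.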

Scalability (GF_9): The capacity of a system to adapt its performance and costs to changes in application and system processing demands is known as scalability,⁹⁰ and the capacity to scale allows a visualization tool to grow without compromising its user effectiveness or caliber. Pavlopoulos et al.⁸⁹ stated that while TULIP is preferred for medium-scale networks, it is not as scalable as Gephi. Further, Cytoscape does not scale well for analysis, while Pajek outperforms other tools as the most scalable for network visualizations. Furthermore, Faysal and Arifuzzaman⁸³ identified Gephi and Cytoscape as effective tools for scaling vast networks due to the tools’ ability to read and visualize all of the networks presented in their table.

Different File Formats (GF_10): It is important that visualization tools support different file formats, as these determine how data can be utilized. File formats are standard methods of encoding data for storage within a computer file; they specify how bits are used to encode data on a digital storage medium, and both proprietary and open formats are available. Cline et al.⁷⁷ stated that creating an image file of network data through Cytoscape is comparatively easy, while Pavlopoulos et al.⁸⁹ stated that Cytoscape is the most suitable tool, as it accepts several input file formats compared to Gephi and TULIP, while Pajek is the least suitable, as it is not flexible in its input file format.
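The following sketch illustrates why file-format support matters in practice: it converts a plain tab-separated edge list into GraphML, an open XML-based graph format. GraphML is chosen here only as an example of an open format; the studies cited above do not specify it, and the function name `edge_list_to_graphml` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def edge_list_to_graphml(text):
    """Convert 'source<TAB>target' lines into a GraphML document string."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) < 2:
            raise ValueError(f"expected two tab-separated columns: {line!r}")
        edges.append((parts[0], parts[1]))

    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")

    # Declare every node once, in order of first appearance.
    nodes = {}
    for source, target in edges:
        nodes.setdefault(source, None)
        nodes.setdefault(target, None)
    for name in nodes:
        ET.SubElement(graph, "node", id=name)

    for index, (source, target) in enumerate(edges):
        ET.SubElement(graph, "edge", id=f"e{index}", source=source, target=target)

    return ET.tostring(root, encoding="unicode")


# Hypothetical two-edge network, for illustration only.
EXAMPLE_EDGE_LIST = "A\tB\nB\tC\n"
GRAPHML_TEXT = edge_list_to_graphml(EXAMPLE_EDGE_LIST)
```

A tool that reads only one of these two representations would require such a conversion step before the data could be visualized.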

Text Mining (GF_11): The ability to analyze vast amounts of information quickly makes text mining an essential visualization process,⁹¹ where important connections between elements that could not have otherwise been discovered can be revealed by mining. Text mining is a method for obtaining insightful and underlying features from textual data sources to discover knowledge.⁹² The data are in the form of images, videos, audio and texts. Kohl et al.⁷³ stated that Cytoscape, VANTED and ProViz provide flexible and advanced text-mining capabilities.
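One simple way in which mining text can reveal connections between elements is sentence-level co-occurrence: two entities mentioned in the same sentence are linked, and the link is weighted by how often this happens. The sketch below assumes this technique for illustration; it is not the method implemented in Cytoscape, VANTED or ProViz, and the sentences and entity names (`GENEA`, `GENEB`, `GENEC`) are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentences, entities):
    """Count how often each pair of known entities shares a sentence."""
    entity_set = set(entities)
    counts = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        present = sorted(tokens & entity_set)
        # Every unordered pair of entities in the sentence becomes an edge.
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical sentences and entity names, for illustration only.
SENTENCES = [
    "GENEA binds GENEB in stressed cells.",
    "GENEB regulates GENEA stability.",
    "GENEC interacts with GENEA.",
]
EDGE_WEIGHTS = cooccurrence_edges(SENTENCES, ["GENEA", "GENEB", "GENEC"])
```

The resulting weighted pairs can be loaded into a network visualization tool as an edge list.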

User Input and Customization (GF_12): A user input comprises whatever data a computer receives for processing.⁹³ In contrast, user customization entails adapting a basic product design concept to a single customer’s wants.⁹⁴

User input is critical for visualization tools because it is essential to the identification and comprehension of the functional and technical requirements a product must meet. This information also guides less obvious but often equally significant qualities, such as fulfilment, acceptance or esthetics. Customization allows users to choose what they want to see or to set preferences for the information arrangement or presentation process. Because it gives consumers power over their interactions, it can improve the user experience. Suderman and Hallett⁹⁵ stated that while most tools support improved functionality in graphic user interfaces (UIs), most tools’ functionality is often insufficient for specified tasks.

Runtime performance (GF_13): This concerns how well a page functions when active. Because every browser utilizes a different JavaScript engine, runtime performance figures may vary for every browser.⁹⁶ Thus, runtime performance is critical to consider in a visualization tool.

User-friendliness (GF_14): This concerns the ability of a system to offer an environment in which its users may complete tasks safely, effectively and cost-effectively using straightforward, logical, dependable and effortless features, among its many shared qualities. Generally, user-friendliness is desired in terms of the qualities considered stimulating and engaging to users,⁹⁷ and it is critical to visualization tools, as it ensures they will be well-designed and easy to use. A visualization tool can increase its use by being more user-friendly, as users will keep it in mind and use it again the next time they need it if they believe the tool was created with them in mind. Pavlopoulos et al.⁸⁹ stated that TULIP is the strongest tool in terms of user-friendliness, particularly compared to Gephi and Cytoscape, which are good and medium, respectively, while Pajek is weak.

Strengths (GF_15): Data visualization is critical for firms to identify data trends, relationships and structures quickly, processes that would ordinarily be time-consuming.⁹⁸ Moreover, analysts can perceive concepts and new patterns thanks to the graphical depiction of datasets.⁹⁸ In addition, a proper understanding of a quintillion bytes of data is difficult without data proliferation, which incorporates data visualization into the rising rush of daily data.⁴⁴ Pavlopoulos et al.⁸⁹ stated that Pajek’s strength is in its variety of layout algorithms, while PIVOT, Medusa and ProViz are best suited for PPI visualization. PATIKA enables the efficient visualization of transitions, BioLayout Express3D offers various approaches to microarray data analysis and Osprey’s filtering capabilities make it a powerful tool for network manipulation. Meanwhile, Ondex’s strength is in combining heterogeneous data types into one network, and Cytoscape’s strength is in visualizing molecular networks.

Heuristic factors

A brief discussion of heuristic factors¹⁰ is provided here (see Table 6), where “Factors in evaluating visualization tools” was the keyword used to search for scientific publications in such online databases as Google Scholar, ResearchGate, JSTOR, IEEE and ScienceDirect. Several journals were identified, and they were narrowed down using “heuristics factors” as keywords. Different factors in each journal were identified and ranked based on the number of journals in which they appeared. Each of the 10 factors appeared in more than two journals, so they were included in the study.

Table 6.

Heuristic factors.

Heuristic Factors
Information Coding	Consistency
Flexibility	Spatial Organization
Orientation and Help	Recognition Rather than Recall
Minimal Actions	Remove the Extraneous
Prompting	Dataset Reduction

Information Coding (HF_1): This requires changing information, such as a gesture, image, sound, word or letter, into a different form, sometimes abbreviated, for transmission across a communication channel or storage on a medium. Forsell and Johansson⁹⁹ stated that realistic techniques and added symbols improve information perception, while Vaataja et al.¹⁰⁰ agreed that they are useful for information visualization through the mapping of data objects into visual elements, such as graphics, symbols and visual cues. Lastly, Williams et al.¹⁰¹ stated that dataset mapping is utilized for visual elements.

Flexibility (HF_2): This entails the capacity to adapt to user requirements,¹⁰² as flexibility is important to visualization tools because it enables the easy design of controlled visualization stylings.¹⁰³ Vaataja et al.¹⁰⁰ stated that flexibility in network visualization refers to easy access to and available means for users to customize the interfaces of visualization tools to understand the processes, working strategies and task requirements.

Orientation and Help (HF_3): Orientation summarizes information related to the tool, and it is important because it provides accurate and concise information concerning visualization tools. The help feature aids users in obtaining support from the tool’s developers, and it is important to a visualization tool because it enhances the user experience. Williams et al.¹⁰¹ stated that orientation and help are functions that provide support for users to control the level of detail, action and representation of additional information.

Minimal Actions (HF_4): This involves testing actions, where the action accepts a file as an input and outputs an identical file with the prefix “bak” before the final suffix, which is the bare minimum it can perform. Vaataja et al.¹⁰⁰ stated that minimal actions refer to the extent of workloads based on the number of actions required to complete a task.
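The file-copying action described above can be sketched as follows. This is one reading of the description, in which "bak" is inserted before the final suffix (so that `network.sif` becomes `network.bak.sif`); the function name `backup_copy` and the file names are hypothetical.

```python
import shutil
from pathlib import Path


def backup_copy(path):
    """Copy a file to a sibling whose name has '.bak' before the final suffix."""
    source = Path(path)
    # 'network.sif' -> 'network.bak.sif'; a file without a suffix gets '.bak'.
    target = source.with_suffix(".bak" + source.suffix)
    shutil.copyfile(source, target)
    return target
```

The action requires a single call and no further input, which is the sense in which it represents the bare minimum a user must do.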

Prompting (HF_5): This alludes to the text or symbols used to indicate a system’s preparedness to execute the instructions that follow. Alternatively, the prompt may be a textual depiction of the user’s location. This feature is vital to a visualization tool because it helps elicit an action. Forsell and Johansson⁹⁹ stated that prompting refers to using a guide or prompts to support users in taking specific actions and providing alternatives within the system, done through data entry or by serving as a guide in the performance of other tasks.

Consistency (HF_6): This feature of distributed systems guarantees that all nodes or replicas have a similar data view at any given time, regardless of which user has modified the data.¹⁰⁴ This feature is important to visualization because it gives the viewer a sense of organization, which aids in a better understanding of visualized data. Vaataja et al.¹⁰⁰ stated that consistency enables the system to become more predictable and improves learning and generalizations, including errors in using the system.

Spatial Organization (HF_7): This feature emphasizes identifying and categorizing the geographic space in which human activities take place and in which spatial structures are created,⁹³ and it is crucial to incorporate into visualization tools because it helps interpret what is visualized. Forsell and Johansson⁹⁹ stated that spatial organization refers to the orientation available to users in the information space, the efficiency in space usage, the distribution of layout elements, legibility and precision and the alteration of visual elements.

Recognition rather than recall (HF_8): This implies that users can identify the data, object or event as familiar from memory.¹⁰⁵ Conversely, recall means locating relevant knowledge in memory.¹⁰⁶ Recognition is relevant to visualization tools because it makes actions, objects and options visible to reduce users’ memory load. Meanwhile, Vaataja et al.¹⁰⁰ stated that recognition rather than recall is used to reduce users’ memory load by making actions, instructions and objects more visible and easier to recognize or retrieve.

Removing the extraneous (HF_9): This involves presenting the largest amount of data possible using only a small amount of ink by determining whether additional information will serve as a distraction or will limit the visualization process.⁹⁹

Dataset reduction (HF_10): This minimizes large datasets to hasten the testing process,¹⁰⁷ which is crucial for visualization tools because it reduces storage costs and improves performance and storage capability. Forsell and Johansson⁹⁹ stated that dataset reduction would help users redirect their focus to areas of greater interest or relevance and toward an understanding of the available datasets.

Methodology

This study gathered information through a mixed methodology, combining quantitative and qualitative research (see Figure 1) to study the subject matter,¹⁰⁸ allowing the researcher to avoid the constraints associated with employing a single approach and enhancing the knowledge gained in relation to the stated research problems.¹⁰⁹ Further, a mixed methodology can leverage the benefits and offset the limitations of both strategies, and it is especially helpful when dealing with complicated, multidimensional challenges.¹¹⁰

Figure 1.

Method adopted.

Data collection

Qualitative research involves gathering descriptive opinions and experiences using various methods, including interviews, observations, focus groups and case studies;¹¹¹ of these, interviews were utilized in this study. Conversely, quantitative research produces numerical data or data that can be converted into useful statistics to measure a specific concept using questionnaires, surveys, etc.;¹¹² of these, a survey was adopted in this study. The interviews consisted of open-ended questions, whereas Likert-scale questions were utilized in the survey,⁵ which was created and distributed via the Newcastle University Online Surveys tool to gather responses. The survey and interviews were carried out independently, and the interview participants were not required to complete the survey before participation. The survey captured the participants’ demographic information and asked them to rate five general statements related to each factor identified in the literature. In contrast, the interview questions were more dynamic and related to the factors, including participants’ opinions about them, particularly their importance when adopting a specific network visualization tool.

Data analysis

The qualitative data were analyzed using thematic analysis, an approach that focuses on the discovery, description, rationalization of, as well as interconnections between themes.¹¹³

The steps utilized for the thematic analysis were derived from Dawadi¹¹⁴ and are as follows:

Step 1 (Becoming acquainted with the information): The first step (data familiarization) begins with the investigator’s desire to get to know their data, as the name suggests. The types (and quantity) of themes that could be discovered through the data are determined during this step.

Step 2 (Producing the initial codes): Investigators can begin making notes on prospective data items of interest, on inquiries, on linkages between data items and on other initial ideas after the familiarization process of Step 1.

Step 3 (Searching for themes): The main goal of this phase is to identify trends and connections present among the complete data collected.

Step 4 (Reviewing themes): All themes are deliberately gathered to improve the initial grouping of topics and to display them in a more organized manner.

Step 5 (Defining and naming themes): After the thematic map is improved, the investigator proceeds to Step 5, where each theme is characterized and narratively described by explaining its significance to the larger study.

Step 6 (Producing the report/manuscript): A write-up of the final review and a summary of the results is the last step.

Participants’ demographic information

Interview responses were gathered from five participants and survey responses from 98. Even though the number of participants gathered for the study is relatively low, the participants are broadly balanced with respect to age, job position, domain, experience with visualization tools, etc.

Age

It is important to note that the 18–24-year age group was not represented among the interview participants; however, this group accounted for 34% of the survey participants. The statistics in Table 7 show that the largest share of survey participants (37%) was aged 25–34 years, an age group that also accounted for 40% of interview participants. Participants aged 45–54 years comprised the smallest share of survey participants (7%) but, at 40%, tied with the 25–34-year group as the largest share of interview participants.

Table 7.

Age statistics.

Age Range	Survey	Interview
18–24	34%	0%
25–34	37%	40%
35–44	22%	20%
45–54	7%	40%

Job position

The various job positions held by the participants are outlined in Table 8. The specialist category includes computer specialists, customer service agents, data analysts, economists, writers and engineers, constituting 7% of survey participants, according to the table, while 1% were biomedical scientists from Newcastle University. Meanwhile, lecturers comprised the largest share of survey participants (40%), while postdocs comprised the largest share of interview participants (40%).

Table 8.

Job position statistics.

Job Position	Survey	Interview
Specialist	7%	0%
Biomedical Scientist	1%	20%
Postdoc	1%	40%
Research Associate	2%	20%
Soldier	4%	0%
Student	36%	0%
Lecturer	40%	20%
Unemployed	4%	0%
Other	5%	0%

Experience

For the survey, participant selection focused on individuals with varying years of experience, but for the interviews, selection focused on individuals with relatively more experience. Table 9 shows the experience statistics of the participants, where those with 1–3 years’ experience comprised most survey participants (51%) and those with 6–10 years comprised the majority of interview participants.

Table 9.

Experience statistics.

Experience	Survey	Interview
1–3 years	51%	0%
4–5 years	35%	20%
6–10 years	8%	60%
>10 years	6%	20%

Knowledge and experience with visualization tools

Apart from the tools specified in Table 10, participants mentioned additional visualization tools: MATLAB, Spike, QuPath, PowerBI, Tableau, NetworkX, etc. In the survey, Pajek tops the list, followed by Cytoscape, even though during the interviews, Cytoscape was more widely discussed than Pajek. Table 10 shows, from top to bottom, the order in which the respondents rated the visualization tools.

Table 10.

Visualization tools.

Rating	Visualization tools specified in the survey	Discussed visualization tool during the interview
1	Pajek	Cytoscape
2	Cytoscape	Pajek
3	ProViz	Osprey
4	Osprey	ProViz
5	Gephi	Osprey
6	Medusa	Medusa
7	ONDEX	ONDEX
8	Other	Other

Findings

The factors identified from the survey and interviews with the participants were divided into two major categories: generic and heuristic.

Generic factors

The survey respondents’ ratings of the importance of the generic factors are provided in Figure 2. The overall score indicates how important each factor is for network visualization tools in general, according to all participants. Figure 2 shows that the respondents consider filtering tools to be the most important factor, at 73%, followed by user input and customization, at 58%. Graph analysis and the benefit of the tool to its users follow closely, at 57%, and user-friendliness comes next, at 56%. Scalability scored 55%, while efficient layout, plugin availability and runtime performance each scored 54%, followed closely by visual style, text mining and user feedback, at 53%. Different file formats scored 51%, advanced search scored 48% and open source (free) had the lowest score, at 44%. Thus, of the 15 generic factors, the majority (12) are considered of moderate importance by the survey participants, as their scores range between 50% and 60%. The factors “Advanced Search” and “Open Source (free)” were considered less important (below average) than the other factors, whereas the factor “filter tools” was considered key (73%) in assessing and selecting a visualization tool for complex biological networks.
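The grouping of the 15 generic factors can be reproduced with a short sketch using the percentages reported in the paragraph above. The 50% to 60% band for "moderate" comes from the text; treating any score above 60% as "key" is an assumption made here for illustration, as are the function and variable names.

```python
from collections import Counter

# Percentages as reported in the text for Figure 2.
GENERIC_SCORES = {
    "Filtering tools": 73,
    "User input and customization": 58,
    "Graph analysis": 57,
    "Benefit of the tool to its users": 57,
    "User-friendliness": 56,
    "Scalability": 55,
    "Efficient layout": 54,
    "Plugin availability": 54,
    "Runtime performance": 54,
    "Visual style": 53,
    "Text mining": 53,
    "User feedback": 53,
    "Different file formats": 51,
    "Advanced search": 48,
    "Open source (free)": 44,
}


def importance_band(score):
    """Map a percentage score to an importance band."""
    if score > 60:  # assumed cut-off for a "key" factor
        return "key"
    if score >= 50:  # 50-60% is described as moderate importance
        return "moderate"
    return "less important"


BANDS = {name: importance_band(score) for name, score in GENERIC_SCORES.items()}
BAND_COUNTS = Counter(BANDS.values())
```

Under these thresholds, the counts match the text: 12 moderate factors, two less important factors and one key factor.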

Figure 2.

Importance of the generic factors.

From Table 11, it is clear that the mean values of the generic factors range between 3.41 and 3.88, indicating that the ratings received from participants are mostly moderately positive. Moreover, the standard deviation ranges from 0.52 to 0.65, which indicates that the variation between the ratings was relatively low. Therefore, it is possible to assume that the opinions of the participants were broadly similar.

Table 11.

Descriptive analysis of the generic factors.

Factors	Mean	Std. Dev	Factors	Mean	Std. Dev
Filtering Tools	3.88	0.52	Text Mining	3.42	0.56
Plugins	3.57	0.59	User Input & Customization	3.59	0.64
Visual Styles	3.50	0.54	Graph Analysis	3.54	0.57
Advanced Search	3.44	0.53	Feedback to Users	3.52	0.64
Free and Open Source	3.41	0.58	Strength	3.59	0.62
Efficient Layout Algorithm	3.51	0.56	Runtime Performance	3.50	0.65
Scalability	3.55	0.56	User Friendliness	3.57	0.62
Different File Formats	3.51	0.62
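The ranges quoted for Table 11 can be checked directly from the tabulated values, as in the sketch below. The figures are copied from Table 11; the variable names are illustrative.

```python
# (mean, standard deviation) per generic factor, copied from Table 11.
TABLE_11 = {
    "Filtering Tools": (3.88, 0.52),
    "Plugins": (3.57, 0.59),
    "Visual Styles": (3.50, 0.54),
    "Advanced Search": (3.44, 0.53),
    "Free and Open Source": (3.41, 0.58),
    "Efficient Layout Algorithm": (3.51, 0.56),
    "Scalability": (3.55, 0.56),
    "Different File Formats": (3.51, 0.62),
    "Text Mining": (3.42, 0.56),
    "User Input & Customization": (3.59, 0.64),
    "Graph Analysis": (3.54, 0.57),
    "Feedback to Users": (3.52, 0.64),
    "Strength": (3.59, 0.62),
    "Runtime Performance": (3.50, 0.65),
    "User Friendliness": (3.57, 0.62),
}

means = [mean for mean, _ in TABLE_11.values()]
std_devs = [sd for _, sd in TABLE_11.values()]

MEAN_RANGE = (min(means), max(means))
SD_RANGE = (min(std_devs), max(std_devs))

# Factors ordered from highest to lowest mean rating.
RANKED = sorted(TABLE_11, key=lambda name: TABLE_11[name][0], reverse=True)
```

Ranking by mean places filtering tools first and free and open source last, which agrees with the ordering of the percentage scores.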

The interview responses substantiated the survey responses, indicating that open-source or free licensing is not compulsory or preferred if the visualization tools are suitable for analyzing a complex biological network. The interviewees also indicated that analysts are ready to pay any price for a license for a visualization tool to analyze complex biological networks. Moreover, the interview participants stated that the basic search feature is sufficient and that there is no need to perform an advanced search. Filter tools are also considered essential for visualizing complex biological networks, as they assist the analyst in extracting the relevant segment from a large volume of data.
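A filter of the kind the interviewees describe can be as simple as keeping only the edges that satisfy a condition, together with the nodes they touch. The sketch below assumes a confidence-weight threshold as the filter criterion; the criterion, the function name `filter_by_weight` and the data are hypothetical and do not describe the filter of any specific tool.

```python
def filter_by_weight(edges, min_weight):
    """Keep edges with weight >= min_weight and the nodes they connect."""
    kept_edges = [(s, t, w) for s, t, w in edges if w >= min_weight]
    kept_nodes = {node for s, t, _ in kept_edges for node in (s, t)}
    return kept_nodes, kept_edges


# Hypothetical weighted interactions, for illustration only.
WEIGHTED_EDGES = [
    ("P1", "P2", 0.9),
    ("P2", "P3", 0.4),
    ("P3", "P4", 0.7),
    ("P4", "P5", 0.1),
]
FILTERED_NODES, FILTERED_EDGES = filter_by_weight(WEIGHTED_EDGES, 0.5)
```

Only the subnetwork that passes the filter then needs to be laid out and drawn, which is how filtering helps with large networks.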

Heuristic factors

The importance of the heuristic factors identified by the respondents is outlined in Figure 3.

Figure 3.

Importance of the heuristics factors.

The survey responses indicate that all 10 heuristic factors are of moderate importance, as the scores range between 50% and 60%. The interview participants specified that all heuristic factors should be considered when developing a good graph visualization tool for a complex biological network. Figure 3 shows that the heuristic factors with the highest percentage (59%) are information coding, flexibility and consistency, closely followed by prompting and recognition rather than recall, at 57%. Orientation and help, minimal actions, spatial organization and dataset reduction follow, at 54%, while the heuristic factor with the lowest percentage is removing the extraneous, at 53%. Consequently, from the interview responses, it is safe to assume that even though all heuristic factors are of moderate importance, information coding, flexibility and consistency are considered the most crucial.

From Table 12, it is clear that the mean values of the heuristic factors range between 3.49 and 3.61, indicating that the ratings received from the participants are mostly moderately positive. Moreover, the standard deviation ranges from 0.55 to 0.68, which indicates that the variation among the ratings was relatively low but slightly higher than for the generic factors. Therefore, it is possible to assume that the opinions of the participants were broadly similar.

Table 12.

Descriptive analysis of the heuristics factors.

Factors	Mean	Std. Dev	Factors	Mean	Std. Dev
Information coding	3.61	0.55	Consistency	3.57	0.59
Flexibility	3.59	0.61	Spatial Organization	3.49	0.56
Orientation and Help	3.53	0.62	Recognition Rather than Recall	3.52	0.67
Minimal Action	3.52	0.61	Removing the Extraneous	3.49	0.68
Prompting	3.56	0.67	Dataset Reduction	3.49	0.57
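For readers unfamiliar with how values such as those in Table 12 are obtained, the sketch below computes the mean and standard deviation of a set of five-point Likert ratings. The ratings are hypothetical and are not the study's data, and the use of the sample standard deviation (rather than the population form) is an assumption, as the article does not state which was used.

```python
import statistics

# Hypothetical ratings from eight respondents on a 1-5 Likert scale.
RATINGS = [4, 3, 4, 3, 4, 4, 3, 4]

MEAN_RATING = statistics.mean(RATINGS)
# Sample standard deviation (n - 1 in the denominator); an assumption here.
SD_RATING = statistics.stdev(RATINGS)
```

A mean above the scale midpoint of 3 with a small standard deviation is what the text describes as a moderately positive rating with low variation.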

Remarkably, the survey and interview participants were conversant in the graph visualization tools and the generic and heuristic factors. For example, the interview participants mentioned that Cytoscape is one of the most widely used tools, and several issues related to Cytoscape factors were discussed, among which was the tool’s filter feature, which the survey responses also confirmed was the most important among the generic features of the visualization tools. Moreover, all participants confirmed that it is difficult to identify a solution that adequately handles large graphs.

Conclusion

This research studied essential factors for evaluating and selecting a visualization tool for complex biological networks. It employed a mixed research approach to gather responses from participants having a wide range of backgrounds and experience using graph-based visualization tools. In total, 98 participants responded to the survey questions, and five were interviewed to obtain detailed responses.

Importantly, the responses received from the survey and interviews corresponded with each other. From the interviews, it is clear that the users prefer 3D to 2D visualization and that some network visualization tools, such as Cytoscape.js, Gephi and others, which are primarily known for 2D visualization support, offer 3D visualization via plugins. In addition, the interviews provided detailed opinions about and justifications for the factors. This study divided the 25 factors identified into two major categories: generic and heuristic, where the former total 15: efficient layout, advanced search, plugin availability, graph analysis, user friendliness, runtime performance, visual style, text mining, different file format, filtering tools, benefits of the tool to its users, user feedback, user input and customization, scalability and open source (free), and the latter 10: information coding, flexibility, orientation and help, minimal actions, prompting, consistency, spatial organization, recognition rather than recall, removing the extraneous and dataset reduction. The findings indicate that all generic factors except advanced search, open source (free) and filtering tools are moderately important. Furthermore, the advanced search and open source (free) factors are less important compared to others, whereas filtering tools are a key consideration for network visualization tools.

The findings indicate that all heuristic factors were essential, and the interview respondents added that they should be considered when developing a visualization tool for a complex biological network to increase the tool’s user-friendliness. Future studies should assess the different factors of visualization tools to rate their applicability to complex biological networks.

Supplemental Material

Supplemental material, sj-pdf-1-ivi-10.1177_14738716231181545 for An investigation into various visualization tools for complex biological networks by Hanin Alzahrani and Sara Fernstad in Information Visualization

Footnotes

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Hanin Alzahrani

Sara Fernstad

Supplemental material

Supplemental material for this article is available online.

References

1. Heberle, Carazzolle, Telles, et al. Cellnetvis: A web tool for visualization of biological networks using force-directed layout constrained by cellular components. BMC Bioinform 2017; 18: 1–9. DOI: 10.1186/s12859-017-1787-5

2. Green, Şerban, Scholl, et al. Network analyses in systems biology: New strategies for dealing with biological complexity. Synthese 2018; 195: 1751–1777.

3. Gao, Wang, et al. Control principles for complex biological networks. Brief Bioinform 2019; 20: 2253–2266.

4. Muzio, O’Bray, Borgwardt. Biological network analysis with deep learning. Brief Bioinform 2021; 22(2): 1515–1530.

5. Sivarajah, Kamal, Irani, et al. Critical analysis of big data challenges and analytical methods. J Bus Res 2017; 70: 263–286.

6. Yang, Huang, et al. Big data and cloud computing: Innovation opportunities and challenges. Int J Digit Earth 2017; 10(1): 13–53.

7. Cheng, Chen, Xue, et al. The scale-free and small-world properties of complex networks on a Sierpinski-type hexagon. Fractals 2020; 28: 2050054.

8. Tsiotas. Detecting different topologies immanent in scale-free networks with the same degree distribution. Proc Natl Acad Sci USA 2019; 116(14): 6701–6706.

9. Gao. Complex systems, emergence, and multiscale analysis: A tutorial and brief survey. Appl Sci 2021; 11: 5736.

10. Estrada. Introduction to complex networks: Structure and dynamics. In: Banasiak (ed.) Evolutionary equations with applications to natural sciences. New York, NY: Springer, 2015, pp.93–131.

11. Konstantin, Victor. Growing scale-free networks with small-world behavior. Phys Rev E 2002; 65(5): 057102.

12. Pavlopoulos, Secrier, Moschopoulos, et al. Using graph theory to analyze biological networks. BioData Min 2011; 4(1): 1–27.

13. Milano, Agapito, Cannataro. Challenges and limitations of biological network analysis. Biotech 2022; 11: 24.

14. Guo, Sosa, Altman. Challenges and opportunities in network-based solutions for biological questions. Brief Bioinform 2022; 23(1): bbab437.

15. Cromar, Zhao, Yang, et al. Hyperscape: visualization for complex biological networks. Bioinformatics 2015; 31: 3390–3391.

16. Omony. Biological network inference: A review of methods and assessment of tools and techniques. Annu Res Rev Biol 2014; 4(4): 577–601.

17. Hetzel, Fischer, Günnemann, et al. Graph representation learning for single-cell biology. Curr Opin Syst Biol 2021; 28: 100347.

18. Shannon, Markiel, Ozier, et al. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res 2003; 13(11): 2498–2504.

19. Bastian, Heymann, Jacomy. Gephi: An Open Source Software for exploring and manipulating networks. Proc Int AAAI Conf Web Soc Media 2009; 3(1): 361–362.

20. Wilk. Medusa: Solving the mystery of the Gorgon. New York, NY: Oxford University Press, 2000.

21. Taubert, Hassani-Pak, Castells-Brooke, et al. Ondex web: Web-based visualization and exploration of heterogeneous biological networks. Bioinformatics 2014; 30(7): 1034–1035.

22. Breitkreutz B-J, Stark, Tyers. Osprey: A network visualization system. Genome Biol 2003; 3: eprint0012.1.

23. Batagelj, Mrvar. Pajek: Analysis and visualisation of large networks. In: Junger, Mutzel (eds) Graph drawing software. Heidelberg: Springer Verlag, 2004, pp.77–103.

24. Jehl, Manguy, Shields, et al. ProViz-a web-based visualization tool to investigate the functional and evolutionary features of protein sequences. Nucleic Acids Res 2016; 44(W1): W11–W15.

25. Athanasios, Charalampos, Vasileios, et al. Protein-Protein Interaction (PPI) network: Recent advances in Drug Discovery. Curr Drug Metab 2018; 18: 5–10.

26. Safari-Alighiarloo, Taghizadeh, Rezaei-Tavirani, et al. Protein-protein interaction networks (PPI) and complex diseases. Gastroenterol Hepatol Bed Bench 2014; 7(1): 17–31.

27. Jang, Heras, Lee. M6A in the signal transduction network. Mol Cells 2022; 45(7): 435–443.

28. Yue, Wang, Huang, et al. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics 2020; 36(4): 1241–1251.

29. Fung, D’Orso. Tandem affinity purification of protein complexes from eukaryotic cells. J Viz Exp 2017; 119: e55236.

30. Louche, Salcedo, Bigot. Protein-protein interactions: pull-down assays. Methods Mol Biol 2017; 1615: 247–255.

31. Liu, et al. Yeast two-hybrid screening for proteins that interact with PFT in wheat. Sci Rep 2019; 9(1): 1–10.

32. Syu, Dunn, Zhu. Developments and applications of functional protein microarrays. Mol Cell Proteom 2020; 19(6): 916–927.

33. Jaroszewicz, Morcinek-Orłowska, Pierzynowska, et al. Phage display and other peptide display technologies. FEMS Microbiol Rev 2022; 46(2): fuab052.

34. Emmert-Streib, Dehmer, Haibe-Kains. Gene regulatory networks and their applications: Understanding biological and medical problems in terms of networks. Front Cell Dev Biol 2014; 2: 38.

35. Hahn. Modelling and analysis of signal transduction networks. Basel: MDPI AG, 2016.

36. Koutrouli, Karatzas, Paez-Espino, et al. Corrigendum: A guide to conquer the biological network era using graph theory. Front Bioeng Biotechnol 2023; 11: 1182500.

37. Feist, Herrgård, Thiele, et al. Reconstruction of biochemical networks in microorganisms. Nat Rev Microbiol 2018; 7: 129–143.

38. Sordo Vieira, Laubenbacher. Computational models in systems biology: standards, dissemination and best practices. Curr Opin Biotechnol 2022; 75: 102702.

39. Deutsch, Orchard, Binz, et al. Proteomics standards initiative: Fifteen years of progress and future work. J Proteome Res 2017; 16: 4288–4298.

40.

Barrows

Martin

Smith

. Markup language for chemical process control and simulation. Comput Chem Eng 2022; 160: 107702.

41.

Jin

Wah

Cheng

, et al. Significance and challenges of big data research. Big Data Res 2015; 2(2): 59–64.

42.

Kumar

Tiwari

Zymbler

. Internet of Things is a revolutionary approach for future technology enhancement: A review. J Big Data 2019; 6(1): 1–21.

43.

Garcia

ACB

Bentes

de Melo

RHC

, et al. Sensor data analysis for equipment monitoring. Knowl Inf Syst 2011; 28: 333–364.

44.

Ali

Gupta

Nayak

, et al. Big data visualisation: Tools and challenges. In: 2nd International Conference on Contemporary Computing and Informatics (IC3I), 2016, Noida, India. DOI: 10.1109/ic3i.2016.7918044.

45.

Agrawal

Nyamful

. Challenges of big data storage and management. Glob J Inf Technol 2016; 6(1): 1–10. DOI: 10.18844/gjit.v6i1.383

46. Ali, Qadir, Rasool, et al. Big data for development: Applications and techniques. Big Data Anal 2016; 1: 2.

47. Hajirahimova, Ismayilova. Big data visualisation: Existing approaches and problems. Prob Inf Technol 2018; 09(1): 65–74.

48. Munzner. Visualization analysis and design. 1st ed. Boca Raton, FL: AK Peters/CRC Press, 2014.

49. Midway. Principles of effective data visualisation. Patterns 2020; 1(9): 100141.

50. Ujwary-Gil, Potoczek. A dynamic, network and resource-based approach to the sustainable business model. Electron Mark 2020; 30(4): 717–733.

51. Lü, Chen, Ren X-L, et al. Vital nodes identification in complex networks. Phys Rep 2016; 650: 1–63.

52. Fortunato, Hric. Community detection in networks: A user guide. Phys Rep 2016; 659: 1–44.

53. Mester, Pop, Mursa B-E-M, et al. Network analysis based on important node selection and community detection. Mathematics 2021; 9(18): 2294.

54. Jena. A review of data visualisation tools used for big data. Int Res J Eng Technol 2017; 4(01): 2395–0056.

55. Pradeep, Kallimani. The different tools and techniques to handle challenges in big data. In: Proceedings of the 3rd international conference on trends in electronics and informatics (ICOEI), Tirunelveli, India, 23–25 April 2019, pp.964–967. New York: IEEE.

56. Parish, Edmondson. Data visualization heuristics for the physical sciences. Mater Des 2019; 179: 107868.

57. Siddiqui. Data visualisation: A study of tools and challenges. Asian J Technol Manag Res 2021; 2249: 0892.

58. Villaveces, Koti, Habermann. Tools for visualization and analysis of molecular networks, pathways, and -omics data. Adv Appl Bioinform Chem 2015; 8: 11–22.

59. Hariharan, Krithivasan. Data visualisation tools: A case study. Int J Comput Sci Inf Secur 2016; 14(9): 834.

60. Baltoumas, Zafeiropoulou, Karatzas, et al. Biomolecule and bioentity interaction databases in systems biology: A comprehensive review. Biomolecules 2021; 11(8): 1245.

61. Morris, Demchak, et al. Biological network exploration with Cytoscape 3. Curr Protoc Bioinform 2014; 47(1): 8–13.

62. Oeltzschner, Zöllner, Hui SCN, et al. Osprey: open-source processing, reconstruction & estimation of magnetic resonance spectroscopy data. J Neurosci Methods 2020; 343: 108827.

63. Vander Meersche, Cretin, de Brevern, et al. Medusa: Prediction of protein flexibility from sequence. J Mol Biol 2021; 433: 166882.

64. Thangaraj, Amutha. Description of GNP (Gephi, NodeXL, pajek) social network analysis tools. Int J Sci Res 2016; 5(12): 846–852.

65. Jayamohan, Chatterjee, et al. Multiviz: A Gephi plugin for scalable visualisation of multi-layer networks. arXiv:2209.03149 2022. DOI: 10.48550/arXiv.2209.03149.

66. Auber, Archambault, Bourqui, et al. TULIP 5. In: Alhajj, Rokne (eds) Encyclopaedia of social network analysis and mining. Berlin: Springer, 2017, pp.1–28.

67. Allegri, McCoy, Mitchell. CompositeView: A network-based visualization tool. Big Data Cogn Comput 2022; 6: 66.

68. Karatzas, Baltoumas, Panayiotou, et al. Arena3Dweb: interactive 3D visualization of multilayered networks. Nucleic Acids Res 2021; 49(W1): W36–W45.

69. Freeman, Horsewell, Patir, et al. Graphia: A platform for the graph-based visualisation and analysis of high dimensional data. PLoS Comput Biol 2022; 18(7): e1010310.

70. Zhou, Pang, et al. OmicsNet 2.0: A web-based platform for multi-omics integration and network visual analytics. Nucleic Acids Res 2022; 50(W1): W527–W533.

71. Rusch, Halpern, Sutton, et al. The Sorcerer II global ocean sampling expedition: Northwest Atlantic through eastern tropical Pacific. PLoS Biol 2007; 5(3): e77.

72. Baitaluk, Sedova, Ray, et al. BiologicalNetworks: visualization and analysis tool for systems biology. Nucleic Acids Res 2006; 34: W466–W471.

73. Kohl, Wiese, Warscheid. Cytoscape: Software for visualisation and analysis of biological networks. In: Hamacher, Eisenacher, Stephen (eds) Data mining in proteomics: from standards to applications, methods in molecular biology, Vol. 696. Totowa, NJ: Humana Press, 2011.

74. Yeung, Cline, Kuchinsky, et al. Exploring biological networks with Cytoscape software. Curr Protoc Bioinform 2008; 23: 8.13.1–8.13.20.

75. Millan. Chapter 4: Visualisation and analysis of biological networks. In: Schneider (ed.) In silico systems biology. New York, NY: Humana Press, 2013, pp.63–87.

76. Koh, Porras, Aranda, et al. Analyzing protein-protein interaction networks. J Proteome Res 2012; 11(4): 2014–2031.

77. Cline, Smoot, Cerami, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc 2007; 2: 2366–2382.

78. Gerasch, Faber, Küntzer, et al. Bina: A visual analytics tool for biological network data. PLoS One 2014; 9: e87397.

79. Kennedy, Hill, Allen, et al. Engaging with (big) data visualizations: factors that affect engagement and resulting new definitions of effectiveness. First Monday 2016; 21(11): 1–20.

80. Franconeri, Padilla, Shah, et al. The science of visual data communication: What works. Psychol Sci Public Interest 2021; 22: 110–161.

81. Parte, Sardà Carbasse, Meier-Kolthoff, et al. List of prokaryotic names with standing in nomenclature (LPSN) moves to the DSMZ. Int J Syst Evol Microbiol 2020; 70(11): 5607–5612.

82. Heron, Hanson, Ricketts. Open source and accessibility: advantages and limitations. J Interact Sci 2013; 1(1): 1–10.

83. Faysal, Arifuzzaman. A comparative analysis of large-scale network visualisation tools. In: IEEE international conference on big data (Big Data), 2018, pp.4837–4843. New York: IEEE.

84. Köhler, Baumbach, Taubert, et al. Graph-based analysis and visualisation of experimental results with ONDEX. Bioinformatics 2006; 22(11): 1383–1390.

85. Følstad. Users’ design feedback in usability evaluation: A literature review. Hum Centric Comput Inf Sci 2017; 7(1): 1–19.

86. Saraiya, Shaffer, Scott McCrickard, et al. Effective features of algorithm visualisations. In: Proceedings of the 35th SIGCSE technical symposium on computer science education. New York, NY: Association for Computing Machinery, 2004, pp.382–386.

87. Filipova, Nikiforova. Definition of the criteria for layout of the UML use case diagrams. Appl Comput Syst 2019; 24: 75–81.

88. Gurevich. What is an algorithm. In: International conference on current trends in theory and practice of computer science. Berlin, Heidelberg: Springer, 2012.

89. Pavlopoulos, Paez-Espino, Kyrpides, et al. Empirical comparison of visualization tools for larger-scale network analysis. Adv Bioinformatics 2017; 2017: 1–8.

90. Bortolini, Galizia, Mora. Reconfigurable manufacturing systems: Literature review and research trend. J Manuf Syst 2018; 49: 93–106.

91. Sumathy, Chidambaram. Text mining: Concepts, applications, tools and issues: an overview. Int J Comput Appl 2013; 80(4): 29–32.

92. Talib, Kashif, Ayesha, et al. Text mining: Techniques, applications and issues. Int J Adv Comput Sci Appl 2016; 7(11): 414–418.

93. Wang, Zou, Upadhyaya, et al. An empirical study on categorising user input parameters for user input reuse. In: Proceedings of the 14th International Conference, ICWE 2014, Toulouse, France, July 1–4, 2014, Springer International Publishing, 2014, pp.21–39.

94. Hanus. Distinguishing user experience when customizing in a user-generated content advertising campaign and subsequent effects on product attitudes, reactance, and source credibility. J Interact Advert 2019; 19(1): 74–85.

95. Suderman, Hallett. Tools for visually exploring biological networks. Bioinformatics 2007; 23(20): 2651–2659.

96. Barker. Pro JavaScript performance: Monitoring and visualisation. Scholars Portal, 2012.

97. Kayode, Tella, Akande. Ease-of-use and user-friendliness of cloud computing adoption for web-based services in academic libraries in Kwara State, Nigeria. Internet Ref Serv Q 2020; 23(3–4): 89–117.

98. Zheng. Data visualization in business intelligence. In: Global business intelligence, Routledge, 2017, pp.67–81.

99. Forsell, Johansson. An heuristic set for evaluation in information visualisation. In: Proceedings of the international conference on advanced visual interfaces, 2010, Epub ahead of print 2010. DOI: 10.1145/1842993.1843029.

100. Väätäjä, Varsaluoma, Heimonen, et al. Information visualisation heuristics in practical expert evaluation. In: Proceedings of the sixth workshop on beyond time and errors on novel evaluation methods for visualization, 2016, Epub ahead of print 2016. DOI: 10.1145/2993901.2993918.

101. Williams, Scholtz, Blaha, et al. Evaluation of visualisation heuristics. In: Kurosu (ed.) Human–computer interaction: theories, methods, and human issues, Vol. 10901. Cham: Springer, 2018, pp.208–224.

102. Günsel, Açikgöz, Tükel, et al. The role of flexibility on software development performance: an empirical study on software development teams. Procedia Soc Behav Sci 2012; 58: 853–860.

103. Hoffswell, Liu. Techniques for flexible responsive visualisation design. In: Conference on human factors in computing systems. Epub ahead of print 2020. DOI: 10.1145/3313831.3376777.

104. van Steen, Tanenbaum. A brief introduction to distributed systems. Computing 2016; 98(10): 967–1009.

105. Nielsen Norman Group. Memory recognition and recall in user interfaces, https://www.nngroup.com/articles/recognition-and-recall/ (2022, accessed 6 September 2022).

106. Towse, Cowan, Hitch, et al. The recall of information from working memory: Insights from behavioural and chronometric perspectives. Exp Psychol 2008; 55(6): 371–383.

107. Chandrasekaran, Feng, Lei, et al. Effectiveness of dataset reduction in testing machine learning algorithms. 2020 IEEE Int Conf Artif Intell Test (AITest), Epub ahead of print 2020. DOI: 10.1109/aitest49225.2020.00027.

108. Mackey, Bryfonski. Mixed methodology. In: Burns, Pennycook (eds) The Palgrave handbook of applied linguistics research methodology. London: Palgrave Macmillan, 2018, pp.5–7.

109. Alasmari. Mixed methods research: an overview. Int J Soc Sci Hum Res 2020; 03: 03.

110. Smajic, Avdic, Pasic, et al. Mixed methodology of scientific research in healthcare. Acta Inform Med 2022; 30(1): 57–60.

111. Aspers, Corte. What is qualitative in qualitative research. Qual Sociol 2019; 42: 139–160.

112. Abuhamda, Ismail, Bsharat. Understanding quantitative and qualitative research methods: A theoretical perspective for young researchers. Int J Res 2021; 8: 71–87.

113. Lochmiller. Conducting thematic analysis with qualitative data. Qual Report 2021; 26(6): 2029–2044.

114. Dawadi. Thematic analysis approach: A step by step guide for ELT research practitioners. J NELTA 2020; 25(1–2): 62–71.

