API Reference
This page contains the API reference for the NeKo package.
Network Class
The main class in NeKo is the Network class. You can import it as follows:
from neko.core.network import Network
Below is a summary of the methods available in the Network class:
- class neko.core.network.Network(initial_nodes: list[str] | None = None, sif_file=None, resources=None)
Bases: object
A molecular interaction network.
The Network object is the central organizing component of the neko module. It is the subject of all operations implemented here, including topological algorithms, graph analysis, network visualization and integration of database knowledge.
- Args:
initial_nodes: A list of initial nodes to be added to the network.
sif_file: A SIF (Simple Interaction Format) file to load the network from.
resources: A pandas DataFrame containing the resources database.
Methods:
- add_edge(edge: DataFrame) None
This method adds an interaction to the list of interactions while converting it to the NeKo-network format. It checks whether the edge represents inhibition or stimulation and sets the effect accordingly. It also checks whether the nodes involved in the interaction are already present in the network and adds them if they are not.
- Args:
edge: A pandas DataFrame representing the interaction. The DataFrame should contain the columns ‘source’, ‘target’, ‘type’, and ‘references’. The ‘source’ and ‘target’ columns represent the nodes involved in the interaction. The ‘type’ column represents the type of interaction. The ‘references’ column contains the references for the interaction.
- Returns:
None
- add_node(node: str, from_sif: bool = False) None
Adds a node to the network. The node is added to the nodes DataFrame of the network. The function checks the syntax for the genesymbol to ensure it is correct. If the node is a complex, it is added with the ‘Genesymbol’ as the complex string and ‘Uniprot’ as the node. Otherwise, it is added with the ‘Genesymbol’ as the genesymbol and ‘Uniprot’ as the uniprot. The ‘Type’ is set as ‘NaN’ for all new nodes.
- Args:
node: A string representing the node to be added. The node can be represented by either its Genesymbol or Uniprot identifier.
- Returns:
None.
- bfs_algorithm(node1: str, node2: str, maxlen: int, only_signed: bool, consensus: bool, connect_with_bias: bool) None
This function uses the Breadth-First Search (BFS) algorithm to find paths between two nodes in the network. It starts from the target node and searches for paths of increasing length until it finds a path of length maxlen. If the only_signed flag is set to True, it filters out unsigned paths. If the connect_with_bias flag is set to True, it connects the nodes when first introduced.
- Args:
node1: A string representing the source node.
node2: A string representing the target node.
maxlen: An integer representing the maximum length of the paths to be searched for. Default is 2.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is False.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
connect_with_bias: A boolean flag indicating whether to connect the nodes when first introduced. Default is True.
- Returns:
None
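The bounded breadth-first path search described above can be illustrated with a small standalone sketch. It does not use NeKo: the function `bfs_paths` and the adjacency dictionary `toy_graph` are hypothetical stand-ins for the package's internals, and the sketch only shows how BFS enumerates simple paths of at most `maxlen` edges, shortest first.

```python
from collections import deque


def bfs_paths(graph, source, target, maxlen):
    """Return all simple paths from source to target with at most maxlen edges.

    `graph` is a plain dict mapping a node to the list of its successors.
    This is an illustrative helper, not part of the NeKo API.
    """
    paths = []
    queue = deque([[source]])  # each queue entry is a partial path
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == target and len(path) > 1:
            paths.append(path)
            continue
        if len(path) - 1 >= maxlen:
            continue  # path already uses maxlen edges, do not extend it
        for neighbor in graph.get(last, []):
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                queue.append(path + [neighbor])
    return paths


# Hypothetical toy network: A -> B -> D, A -> C -> D, D -> E
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
short_paths = bfs_paths(toy_graph, "A", "D", maxlen=2)
no_paths = bfs_paths(toy_graph, "A", "E", maxlen=2)
```

Because the queue is processed in first-in, first-out order, shorter paths are always found before longer ones, which is why BFS suits searches that grow the path length step by step.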
- check_node(node: str) bool
This function checks if a node exists in the resources’ database.
- Args:
node: A string representing the node to be checked.
- Returns:
A boolean indicating whether the node exists in the resources’ database.
- check_nodes(nodes: list[str]) list[str]
This function checks if the nodes exist in the resources database and returns the nodes that are present.
- Args:
nodes: A list of node identifiers (strings). These are the nodes to be checked.
- Returns:
A list[str] of node identifiers that are present in the resources database.
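As a rough illustration of the filtering that check_nodes describes, the sketch below keeps only the identifiers that appear in a toy interaction list. The names `filter_known_nodes` and `toy_resources` are hypothetical, and the real method looks nodes up in the resources DataFrame rather than in a list of tuples.

```python
def filter_known_nodes(nodes, resource_interactions):
    """Return the nodes that appear as source or target in the resources.

    `resource_interactions` is a list of (source, target) tuples standing in
    for the resources database. Illustrative helper, not part of NeKo.
    """
    known = set()
    for source, target in resource_interactions:
        known.add(source)
        known.add(target)
    # Preserve the order of the input list.
    return [node for node in nodes if node in known]


# Hypothetical toy resources
toy_resources = [("EGFR", "GRB2"), ("GRB2", "SOS1")]
present = filter_known_nodes(["EGFR", "FAKE1", "SOS1"], toy_resources)
```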
- complete_connection(maxlen: int = 2, algorithm: Literal['bfs', 'dfs'] = 'dfs', minimal: bool = True, only_signed: bool = False, consensus: bool = False, connect_with_bias: bool = False) None
This function attempts to connect all nodes of a network object using one of the methods presented in the Connection object. This is a core feature of the package: users can choose among different methods to enrich their Network object.
- Args:
maxlen: The maximum length of the paths to be searched for. Default is 2.
algorithm: The search algorithm to be used. It can be ‘bfs’ (Breadth-First Search) or ‘dfs’ (Depth-First Search). Default is ‘dfs’.
minimal: A boolean flag indicating whether to reset the object connect_network, updating the possible list of paths. Default is True.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is False.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
connect_with_bias: A boolean flag indicating whether to connect nodes when first introduced. Default is False.
- Returns:
None
- connect_as_atopo(strategy: Literal['radial', 'complete', None] | None = None, max_len: int = 1, loops: bool = False, outputs=None, only_signed: bool = True, consensus: bool = False) None
This method attempts to connect all nodes of a network object in a topological manner. It iteratively connects upstream nodes and checks if the network is connected. If not, it increases the search depth and repeats the process. It also removes any nodes that do not have a source in the edge dataframe and are not in the output nodes.
- Args:
strategy: The strategy to use to connect the network. It can be ‘radial’ or ‘complete’. Default is None.
max_len: The maximum length of the paths to be searched for. Default is 1.
loops: A boolean flag indicating whether to allow loops in the network. Default is False.
outputs: A list of output nodes to connect to. Default is None.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is True.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
- Returns:
None
- connect_component(comp_A: str | list[str], comp_B: str | list[str], maxlen: int = 2, mode: Literal['OUT', 'IN', 'ALL'] = 'OUT', only_signed: bool = False, consensus: bool = False) None
This function attempts to connect subcomponents of a network object using one of the methods presented in the Connection object. This is a core feature of the package: users can choose among different methods to enrich their Network object.
- Args:
comp_A: A string or list of strings representing the first component to connect.
comp_B: A string or list of strings representing the second component to connect.
maxlen: The maximum length of the paths to be searched for. Default is 2.
mode: The search mode, which can be ‘OUT’, ‘IN’, or ‘ALL’. Default is ‘OUT’.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is False.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
- Returns:
None
- connect_genes_to_phenotype(phenotype: str | None = None, id_accession: str | None = None, sub_genes: list[str] | None = None, maxlen: int = 2, only_signed: bool = False, compress: bool = False) None
This function connects genes to a phenotype based on the provided arguments. It retrieves phenotype markers, identifies unique Uniprot genes, and connects them to the network. It can also compress the network by substituting specified genes with the phenotype name.
- Args:
phenotype: The phenotype to connect to. If not provided, it will be retrieved using the id_accession.
id_accession: The accession id of the phenotype. If not provided, the phenotype parameter must be given.
sub_genes: A list of genes to be considered for connection. If not provided, all nodes in the network are considered.
maxlen: The maximum length of the paths to be searched for.
only_signed: A boolean flag to indicate whether to filter unsigned paths.
compress: A boolean flag to indicate whether to substitute the specified genes with the phenotype name.
- Returns:
None
- connect_network_radially(max_len: int = 1, direction: Literal['OUT', 'IN', None] | None = None, loops: bool = False, consensus: bool = False, only_signed: bool = True) None
This function connects all nodes of a network object in a radial manner. It iteratively connects upstream and downstream nodes of the initial nodes. The function also removes any nodes that do not have a source in the edge dataframe and are not in the initial nodes.
- Args:
max_len: The maximum length of the paths to be searched for. Default is 1.
direction: The direction of the search. It can be ‘OUT’, ‘IN’, or None. Default is None.
loops: A boolean flag indicating whether to allow loops in the network. Default is False.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is True.
- Returns:
None
- connect_nodes(only_signed: bool = False, consensus_only: bool = False) None
Basic node connections. It adds all the interactions found in the OmniPath database. Once an interaction is found, it is added to the list of edges. The only_signed flag ensures that only signed interactions are added to the network, while consensus_only ensures that only signed interactions with consensus among references are included.
- Args:
only_signed: A boolean flag indicating whether to only add signed interactions to the network.
consensus_only: A boolean flag indicating whether to only add signed interactions with consensus among references to the network.
- Returns:
None
- connect_subgroup(group: str | DataFrame | list[str], maxlen: int = 1, only_signed: bool = False, consensus: bool = False) None
This function is used to connect all the nodes in a particular subgroup. It iterates over all pairs of nodes in the subgroup and finds paths between them in the resources’ database. If a path is found, it is added to the edge list of the network. The function also filters out unsigned paths if the only_signed flag is set to True.
- Args:
group: A list of nodes representing the subgroup to connect. Nodes can be represented as strings, pandas DataFrame, or list of strings.
maxlen: The maximum length of the paths to be searched for in the resources’ database. Default is 1.
only_signed: A boolean flag indicating whether to only add signed interactions to the network. Default is False.
consensus: A boolean flag indicating whether to only add signed interactions with consensus among references to the network. Default is False.
- Returns:
None
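The pairwise iteration that connect_subgroup describes can be sketched as follows for the simplest case of direct interactions (a path length of 1). Everything here is an assumption made for illustration: the helper `connect_subgroup_sketch`, the tuple-based `toy_interactions` list, and the effect labels ‘stimulation’, ‘inhibition’ and ‘undefined’ are not taken from the NeKo implementation.

```python
from itertools import permutations


def connect_subgroup_sketch(group, interactions, only_signed=False):
    """Collect direct interactions between every ordered pair in `group`.

    `interactions` is a list of (source, target, effect) tuples standing in
    for the resources database. Illustrative helper, not part of NeKo.
    """
    lookup = {(source, target): effect for source, target, effect in interactions}
    found = []
    # permutations yields ordered pairs, so both A->B and B->A are checked.
    for source, target in permutations(group, 2):
        if (source, target) not in lookup:
            continue
        effect = lookup[(source, target)]
        # Assumed convention: only these two labels count as "signed".
        if only_signed and effect not in ("stimulation", "inhibition"):
            continue
        found.append((source, target, effect))
    return found


# Hypothetical toy resources
toy_interactions = [
    ("A", "B", "stimulation"),
    ("B", "C", "undefined"),
    ("C", "D", "inhibition"),
]
all_edges = connect_subgroup_sketch(["A", "B", "C"], toy_interactions)
signed_edges = connect_subgroup_sketch(["A", "B", "C"], toy_interactions, only_signed=True)
```

The edge from C to D is never returned because D is not part of the subgroup.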
- connect_to_upstream_nodes(nodes_to_connect: List[str] | None = None, depth: int = 1, rank: int = 1, only_signed: bool = True, consensus: bool = False) None
This function connects the provided nodes to their upstream nodes in the network.
- Args:
nodes_to_connect: A list of nodes to connect. If not provided, all nodes in the network are considered.
depth: The depth of the search for upstream nodes.
rank: The rank of the search for upstream nodes.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is True.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
- Returns:
None
- convert_edgelist_into_genesymbol() DataFrame
This function generates a new edges DataFrame with the source and target identifiers translated (where possible) into Genesymbol format.
- Args:
None
- Returns:
A pandas DataFrame containing the edges with the source and target identifiers translated into Genesymbol format.
- copy()
- dfs_algorithm(node1: str, node2: str, maxlen: int, only_signed: bool, consensus: bool, connect_with_bias: bool) None
This function uses the Depth-First Search (DFS) algorithm to find paths between two nodes in the network. It starts from the target node and searches for paths of increasing length until it finds a path of length maxlen. If the only_signed flag is set to True, it filters out unsigned paths. If the connect_with_bias flag is set to True, it connects the nodes when first introduced.
- Args:
node1: A string representing the source node.
node2: A string representing the target node.
maxlen: An integer representing the maximum length of the paths to be searched for. Default is 2.
only_signed: A boolean flag indicating whether to filter unsigned paths. Default is False.
consensus: A boolean flag indicating whether to check for consensus among references. Default is False.
connect_with_bias: A boolean flag indicating whether to connect the nodes when first introduced. Default is True.
- Returns:
None
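For comparison with the breadth-first variant, the depth-bounded depth-first search described above can be sketched without NeKo. The function `dfs_paths` and the dictionary `toy_network` are hypothetical; the sketch only shows how DFS follows one branch to the end before backtracking, while respecting a maximum of `maxlen` edges.

```python
def dfs_paths(graph, source, target, maxlen):
    """Return all simple paths from source to target with at most maxlen edges.

    `graph` is a plain dict mapping a node to the list of its successors.
    This is an illustrative helper, not part of the NeKo API.
    """
    paths = []

    def visit(path):
        last = path[-1]
        if last == target and len(path) > 1:
            paths.append(list(path))  # store a copy, `path` is mutated below
            return
        if len(path) - 1 >= maxlen:
            return  # depth limit reached
        for neighbor in graph.get(last, []):
            if neighbor not in path:  # avoid revisiting nodes on this path
                path.append(neighbor)
                visit(path)
                path.pop()  # backtrack

    visit([source])
    return paths


# Hypothetical toy network: A -> B -> C -> D and a shortcut A -> D
toy_network = {"A": ["B", "D"], "B": ["C"], "C": ["D"]}
deep_first = dfs_paths(toy_network, "A", "D", maxlen=3)
bounded = dfs_paths(toy_network, "A", "D", maxlen=2)
```

Unlike BFS, the order of the results depends on the order in which successors are listed, so the longer path through B and C is reported before the direct edge.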
- is_connected() bool
This function checks if the network is connected. It uses Depth-First Search (DFS) to traverse the network and check if all nodes are visited. If all nodes are visited, the network is connected.
- Args:
None
- Returns:
A boolean indicating whether the network is connected.
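The traversal-based connectivity check can be sketched as below. This is a minimal standalone version that assumes edge direction is ignored when deciding whether the network is connected; whether NeKo makes the same choice is not stated here. The name `is_connected_sketch` and the tuple-based edge list are hypothetical.

```python
def is_connected_sketch(nodes, edges):
    """Return True if every node is reachable from the first one.

    `edges` is a list of (source, target) tuples whose endpoints are all in
    `nodes`. Edge direction is ignored (an assumption of this sketch).
    """
    if not nodes:
        return True
    neighbors = {node: set() for node in nodes}
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    start = nodes[0]
    visited = {start}
    stack = [start]  # explicit stack gives an iterative depth-first traversal
    while stack:
        current = stack.pop()
        for neighbor in neighbors[current]:
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)
    return len(visited) == len(set(nodes))


connected = is_connected_sketch(["A", "B", "C"], [("A", "B"), ("B", "C")])
split = is_connected_sketch(["A", "B", "C"], [("A", "B")])
```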
- modify_node_name(old_name: str, new_name: str, type: Literal['Genesymbol', 'Uniprot', 'both'] = 'Genesymbol') None
This function modifies the name of a node in the network. It takes the old name of the node and the new name as input and modifies the name of the node in the nodes and in the edges DataFrame. If type is set to ‘Genesymbol’, it modifies the genesymbol name of the node in the nodes DataFrame. If type is set to ‘Uniprot’, it modifies the uniprot name of the node in the edges DataFrame. If type is set to ‘both’, it modifies both the genesymbol and uniprot names of the node in the nodes and edges DataFrame.
- Args:
old_name: A string representing the old name of the node.
new_name: A string representing the new name of the node.
type: A string indicating the type of name to be modified. It can be ‘Genesymbol’, ‘Uniprot’, or ‘both’. Default is ‘Genesymbol’.
- Returns:
None
- print_my_paths(node1: str, node2: str, maxlen: int = 2, genesymbol: bool = True) None
This function prints all the paths between two nodes in the network. It uses the find_paths method from the Connections class to find all the paths between the two nodes in the Network object. If no paths are found, it prints a warning message. If one of the selected nodes is not present in the network, it prints an error message.
- Args:
node1: A string representing the source node.
node2: A string representing the target node.
maxlen: An integer representing the maximum length of the paths to be searched for. Default is 2.
genesymbol: A boolean flag indicating whether to print the paths in genesymbol format. Default is True.
- Returns:
None
- remove_disconnected_nodes() None
This function removes nodes from the network that are not connected to any other nodes.
- Returns:
None. The function modifies the network object in place by removing the disconnected nodes from the nodes DataFrame.
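The idea behind removing disconnected nodes can be shown with plain lists. In this sketch, `drop_disconnected` is a hypothetical helper that returns a new list instead of modifying a nodes DataFrame in place, and edges are simple (source, target) tuples.

```python
def drop_disconnected(nodes, edges):
    """Return the nodes that take part in at least one edge.

    Illustrative helper, not part of NeKo: the real method edits the
    network's nodes DataFrame in place.
    """
    connected = {node for edge in edges for node in edge}
    return [node for node in nodes if node in connected]


kept = drop_disconnected(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
```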
- remove_edge(node1: str, node2: str) None
This function removes an edge from the network. It takes the source node and target node as input and removes the edge from the edges DataFrame.
- Args:
node1: A string representing the source node of the edge.
node2: A string representing the target node of the edge.
- Returns:
None
- remove_node(node: str) None
Removes a node from the network. The node is removed from both the list of nodes and the list of edges.
- Args:
node: A string representing the node to be removed. The node can be represented by either its Genesymbol or Uniprot identifier.
- Returns:
None
- remove_path(path: list[str]) None
This function removes a path from the network. It takes a list of nodes representing the path and removes all the edges between the nodes in the path.
- Args:
path: A list of nodes representing the path to be removed. The nodes can be represented as strings or tuples.
- Returns:
None
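Removing a path amounts to removing the edges between consecutive nodes of that path. The sketch below shows this on a list of (source, target) tuples; `remove_path_edges` is a hypothetical helper and does not reflect how NeKo stores or updates its edges DataFrame.

```python
def remove_path_edges(edges, path):
    """Return `edges` without the edges linking consecutive nodes of `path`.

    Illustrative helper, not part of NeKo.
    """
    # zip(path, path[1:]) pairs each node with its successor on the path.
    to_remove = set(zip(path, path[1:]))
    return [edge for edge in edges if edge not in to_remove]


toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
remaining = remove_path_edges(toy_edges, ["A", "B", "C"])
```

Edges that touch the path's nodes but are not part of the path itself, such as the one from C to D, are left untouched.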
- remove_undefined_interactions()
This function removes all undefined interactions from the network.
- Args:
None
- Returns:
None
Method Details
- add_edge: This method adds an interaction to the list of interactions while converting it to the NeKo-network format.
- remove_node: Removes a node from the network.
- add_node: Adds a node to the network.
- complete_connection: This function attempts to connect all nodes of a network object using one of the methods presented in the Connection object.
- connect_component: This function attempts to connect subcomponents of a network object using one of the methods presented in the Connection object.
- connect_genes_to_phenotype: This function connects genes to a phenotype based on the provided arguments.
- connect_nodes: Basic node connections.
- connect_subgroup: This function is used to connect all the nodes in a particular subgroup.
- connect_to_upstream_nodes: This function connects the provided nodes to their upstream nodes in the network.
- convert_edgelist_into_genesymbol: This function generates a new edges DataFrame with the source and target identifiers translated (where possible) into Genesymbol format.
- is_connected: This function checks if the network is connected.
NetworkVisualizer Class Methods
- Color the nodes based on their expression in the tissue of interest (based on data from The Human Protein Atlas).
- Render the graph.