Integration
This module contains functions that help add more data to the network.
-
pybel_tools.integration.overlay_data(graph, data, label=None, overwrite=False)
Overlays tabular data on the network.
-
pybel_tools.integration.overlay_type_data(graph, data, func, namespace, label=None, overwrite=False, impute=None)
Overlay tabular data on the network when the data comes from a data set whose identifiers lack namespaces.
For example, if you want to overlay differential gene expression data from a table, that table probably has HGNC identifiers, but no specific annotations that they are in the HGNC namespace or that the entities to which they refer are RNA.
- Parameters
graph (BELGraph) – A BEL graph
data (dict) – A dictionary of {name: data}
func (str) – The function of the keys in the data dictionary
namespace (str) – The namespace of the keys in the data dictionary
label (Optional[str]) – The annotation label to put in the node dictionary
overwrite (bool) – Should old annotations be overwritten?
impute (Optional[float]) – The value to use for missing data
- Return type
None
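The behaviour described by these parameters can be illustrated with a minimal sketch. This is not the pybel_tools implementation: plain dictionaries stand in for the nodes of a BELGraph, the name overlay_type_data_sketch and the node keys "function", "namespace", and "name" are hypothetical, and the handling of overwrite and impute is an assumption based only on the parameter descriptions above.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for graph nodes: each node is a plain dict.
Node = Dict[str, object]


def overlay_type_data_sketch(
    nodes: List[Node],
    data: Dict[str, float],
    func: str,
    namespace: str,
    label: str,
    overwrite: bool = False,
    impute: Optional[float] = None,
) -> None:
    """Annotate nodes of the given function and namespace with values from data."""
    for node in nodes:
        # Only nodes of the requested type are candidates for annotation.
        if node.get("function") != func or node.get("namespace") != namespace:
            continue
        # Assumption: an existing annotation is kept unless overwrite is set.
        if label in node and not overwrite:
            continue
        name = node.get("name")
        if name in data:
            node[label] = data[name]
        elif impute is not None:
            # Assumption: impute fills in matching nodes that are absent from data.
            node[label] = impute


nodes = [
    {"function": "RNA", "namespace": "HGNC", "name": "AKT1"},
    {"function": "RNA", "namespace": "HGNC", "name": "EGFR"},
    {"function": "Protein", "namespace": "HGNC", "name": "AKT1"},
]
overlay_type_data_sketch(
    nodes, {"AKT1": 1.5}, func="RNA", namespace="HGNC", label="logFC", impute=0.0
)
```

In this sketch the AKT1 RNA node receives the value from the table, the EGFR RNA node receives the imputed value, and the AKT1 protein node is left untouched because its function does not match.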
-
pybel_tools.integration.load_differential_gene_expression(path, gene_symbol_column='Gene.symbol', logfc_column='logFC', aggregator=None)
Load and pre-process differential gene expression data.
- Parameters
path (str) – The path to the CSV
gene_symbol_column (str) – The header of the gene symbol column in the data frame
logfc_column (str) – The header of the log-fold-change column in the data frame
aggregator (Optional[Callable[[List[float]], float]]) – A function that aggregates a list of differential gene expression values. Defaults to numpy.median(). Could also use: numpy.mean(), numpy.average(), numpy.min(), or numpy.max()
- Returns
A dictionary of {gene symbol: log fold change}
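The pre-processing described above (collect every log fold change reported for a gene symbol, then reduce each list with the aggregator) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: load_dge_sketch is a hypothetical name, it reads from an open file handle rather than a path so the example is self-contained, statistics.median stands in for numpy.median(), and skipping rows with an empty gene symbol is an assumption.

```python
import csv
import io
from collections import defaultdict
from statistics import median
from typing import Callable, Dict, List, Optional, TextIO


def load_dge_sketch(
    handle: TextIO,
    gene_symbol_column: str = "Gene.symbol",
    logfc_column: str = "logFC",
    aggregator: Optional[Callable[[List[float]], float]] = None,
) -> Dict[str, float]:
    """Return {gene symbol: aggregated log fold change} from CSV text."""
    if aggregator is None:
        aggregator = median  # stdlib stand-in for numpy.median()
    values: Dict[str, List[float]] = defaultdict(list)
    for row in csv.DictReader(handle):
        symbol = (row.get(gene_symbol_column) or "").strip()
        if not symbol:
            continue  # Assumption: rows without a gene symbol are skipped.
        values[symbol].append(float(row[logfc_column]))
    # Reduce the list of measurements for each gene to a single value.
    return {symbol: aggregator(logfcs) for symbol, logfcs in values.items()}


example_csv = io.StringIO(
    "Gene.symbol,logFC\n"
    "AKT1,1.0\n"
    "AKT1,3.0\n"
    "EGFR,-0.5\n"
    ",2.0\n"
)
result = load_dge_sketch(example_csv)
```

A dictionary of this shape is what the data argument of the overlay functions expects: gene symbols as keys and one number per gene as values.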