bionty.base.Protein

class bionty.base.Protein(organism=None, source=None, version=None, **kwargs)

Bases: PublicOntology

Protein.

1. UniProt https://www.uniprot.org/

Parameters:
  • organism (Literal['human', 'mouse'] | None, default: None) – Name of the Organism entity.

  • source (Literal['uniprot'] | None, default: None) – The key of the source in the local.yml versions file. Get all available databases with .display_available_sources().

  • version (Literal['2023-02', '2023-03', '2024-03'] | None, default: None) – The version of the ontology. Typically a date or an actual version. Get available versions with .display_available_sources().

Attributes

property fields: set

All PublicOntology entity fields.

property organism

The name of Organism.

property source

Name of the source.

property version

Version of the source.

Methods

df()

Pandas DataFrame of the ontology.

Return type:

DataFrame

Returns:

A Pandas DataFrame of the ontology.

Examples

>>> import bionty.base as bt_base
>>> bt_base.Gene().df()
diff(compare_to, **kwargs)

Determines a diff between two PublicOntology objects’ ontologies.

Parameters:
  • compare_to (PublicOntology) – PublicOntology object that must be of the same class as the calling object.

  • kwargs – Passed to pd.DataFrame.compare().

Return type:

tuple[DataFrame, DataFrame]

Returns:

A tuple of two DataFrames

  1. New entries.

  2. A pd.DataFrame.compare result that denotes all changes between self and compare_to.

Examples

>>> import bionty.base as bt_base
>>> public_1 = bt_base.Disease(source="mondo", version="2023-04-04")
>>> public_2 = bt_base.Disease(source="mondo", version="2023-02-06")  # a different version, so the diff is not empty
>>> new_entries, modified_entries = public_1.diff(public_2)
>>> print(new_entries.head())
>>> print(modified_entries.head())
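
The two return values of diff() can be pictured with a small, library-free sketch. Plain dicts stand in for two versions of an ontology; the IDs and names are made up, and treating "new" as "present in the second dict but not in the first" is an assumption, since the description above does not say which side counts as new. This is not how bionty computes the diff.

```python
# Conceptual sketch only: plain dicts stand in for two ontology versions.
# All IDs and names below are made up for illustration.
old = {"EX:1": "alpha", "EX:2": "beta"}
new = {"EX:1": "alpha", "EX:2": "beta (renamed)", "EX:3": "gamma"}

# Entries whose ID exists in `new` but not in `old`
# (assumption: this is what "new entries" means).
new_entries = {key: value for key, value in new.items() if key not in old}

# Entries present in both versions whose value changed, as (old, new) pairs.
modified_entries = {
    key: (old[key], new[key])
    for key in old.keys() & new.keys()
    if old[key] != new[key]
}

print(new_entries)
print(modified_entries)
```
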
inspect(values, field, *, mute=False, **kwargs)

Inspect a list of values against a field of entity reference.

Parameters:
  • values (Iterable) – Identifiers that will be checked against the field.

  • field (PublicOntologyField) – The PublicOntologyField of the ontology to compare against. Examples are ‘ontology_id’ to map against the source ID or ‘name’ to map against the ontology’s name field.

  • return_df – Whether to return a Pandas DataFrame.

  • mute (bool, default: False) – Whether to suppress logging. Defaults to False.

  • kwargs – Used for backwards compatibility and return types.

Return type:

InspectResult

Returns:

  • A dictionary of “validated” and “not_validated” identifiers.

  • If return_df: a DataFrame indexed by identifiers with a boolean __validated__ column indicating compliance validation.

Examples

>>> import bionty.base as bt_base
>>> public = bt_base.Gene()
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> public.inspect(gene_symbols, field=public.symbol)
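
The shape of the “validated” / “not_validated” result can be sketched without the library. The helper name inspect_sketch and the reference list are hypothetical, and the real InspectResult may carry more information than this plain dictionary.

```python
def inspect_sketch(values, reference):
    """Split values into those found in the reference field and those not found."""
    reference = set(reference)
    result = {"validated": [], "not_validated": []}
    for value in values:
        key = "validated" if value in reference else "not_validated"
        result[key].append(value)
    return result


# Hypothetical reference values standing in for one field of an ontology.
reference_symbols = ["A1CF", "A1BG"]
result = inspect_sketch(["A1CF", "A1BG", "FANCD1", "FANCD20"], reference_symbols)
print(result)
```
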
lookup(field=None)

An auto-complete object for a PublicOntology field.

Parameters:

field (PublicOntologyField | str | None, default: None) – The field to look up the values for. Defaults to ‘name’.

Return type:

tuple

Returns:

A NamedTuple of lookup information of the field values.

Examples

>>> import bionty.base as bt_base
>>> lookup = bt_base.CellType().lookup()
>>> lookup.cd103_positive_dendritic_cell
>>> lookup_dict = lookup.dict()
>>> lookup_dict['CD103-positive dendritic cell']
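
To see why both attribute access and dictionary access are possible, here is a rough sketch built on collections.namedtuple. The helper names, the identifier normalization rule, and the EX: IDs are assumptions made for illustration; the library's own lookup object may be constructed differently.

```python
import re
from collections import namedtuple


def to_identifier(name):
    """Turn a free-text name into a lowercase, underscore-separated identifier."""
    return re.sub(r"\W+", "_", name).strip("_").lower()


def make_lookup(entries):
    """Build a namedtuple whose attributes are the normalized entry names."""
    Lookup = namedtuple("Lookup", [to_identifier(name) for name in entries])
    return Lookup(*entries.values())


# Made-up entries: display name -> placeholder ID.
entries = {
    "CD103-positive dendritic cell": "EX:0001",
    "T cell": "EX:0002",
}

lookup = make_lookup(entries)
print(lookup.cd103_positive_dendritic_cell)  # attribute access for auto-complete

lookup_dict = dict(entries)
print(lookup_dict["CD103-positive dendritic cell"])  # access by original name
```

The sketch assumes every normalized name is a valid, unique Python identifier; names that start with a digit or collide after normalization would need extra handling.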
map_synonyms(values, *, return_mapper=False, case_sensitive=False, keep='first', synonyms_field='synonyms', field=None)

Maps input synonyms to standardized names.

Return type:

dict[str, str] | list[str]

search(string, *, field=None, limit=None, case_sensitive=False)

Search a given string against a PublicOntology field or fields.

Parameters:
  • string (str) – The input string to match against the field values.

  • field (PublicOntologyField | str | list[PublicOntologyField | str], default: None) – The PublicOntologyField, or several fields, of the ontology that the input string is matched against. By default, all fields containing strings are searched.

  • limit (int | None, default: None) – Maximum number of top results to return. If None, return all results.

  • case_sensitive (bool, default: False) – Whether the match is case sensitive.

Returns:

Ranked search results.

Examples

>>> import bionty.base as bt_base
>>> public = bt_base.CellType()
>>> public.search("gamma delta T cell")
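
How limit and case_sensitive shape a ranked result can be sketched with the standard library. The scoring below uses difflib.SequenceMatcher purely as a stand-in; the ranking method bionty actually uses is not described here, and search_sketch and the sample names are hypothetical.

```python
from difflib import SequenceMatcher


def search_sketch(string, names, limit=None, case_sensitive=False):
    """Rank names by similarity to string (stand-in scoring, not bionty's)."""

    def score(name):
        if case_sensitive:
            a, b = string, name
        else:
            a, b = string.lower(), name.lower()
        return SequenceMatcher(None, a, b).ratio()

    ranked = sorted(names, key=score, reverse=True)
    # limit=None returns every candidate; otherwise keep only the top results.
    return ranked if limit is None else ranked[:limit]


# A few hand-picked names standing in for one field of an ontology.
names = ["B cell", "alpha-beta T cell", "gamma-delta T cell"]
results = search_sketch("gamma delta T cell", names, limit=2)
print(results)
```
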
standardize(values, field=None, *, return_field=None, return_mapper=False, case_sensitive=False, mute=False, keep='first', synonyms_field='synonyms')

Convert into standardized names.

Parameters:
  • values (Iterable) – Synonyms that will be standardized.

  • field (PublicOntologyField | str | None, default: None) – The field representing the standardized names.

  • return_field (str, default: None) – The field to return. Defaults to field.

  • return_mapper (bool, default: False) – If True, returns {input_synonym1: standardized_name1}.

  • case_sensitive (bool, default: False) – Whether the mapping is case sensitive.

  • keep (Literal['first', 'last', False], default: 'first') –

    When a synonym maps to multiple standardized values, determines which of them is kept (same options as the keep argument of pandas.DataFrame.duplicated).

    • “first”: returns the first mapped standardized value

    • “last”: returns the last mapped standardized value

    • False: returns all mapped standardized values

  • mute (bool, default: False) – Whether to mute logging. Defaults to False.

  • synonyms_field (PublicOntologyField | str, default: 'synonyms') – A field containing the concatenated synonyms.

Return type:

dict[str, str] | list[str]

Returns:

If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.

Examples

>>> import bionty.base as bt_base
>>> public = bt_base.Gene()
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> standardized_symbols = public.standardize(gene_symbols, public.symbol)
to_pronto()

The Pronto Ontology object.

See: https://pronto.readthedocs.io/en/stable/api/pronto.Ontology.html

validate(values, field, *, mute=False, **kwargs)

Validate a list of values against a field of entity reference.

Parameters:
  • values (Iterable) – Identifiers that will be checked against the field.

  • field (PublicOntologyField) – The PublicOntologyField of the ontology to compare against. Examples are ‘ontology_id’ to map against the source ID or ‘name’ to map against the ontology’s name field.

  • mute (bool, default: False) – Whether to suppress logging. Defaults to False.

  • kwargs – Used for backwards compatibility and return types.

Return type:

ndarray

Returns:

A boolean array indicating compliance validation.

Examples

>>> import bionty.base as bt_base
>>> public = bt_base.Gene()
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> public.validate(gene_symbols, field=public.symbol)
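
A boolean result of this kind is typically used as a mask over the input. The sketch below uses a plain list of booleans instead of an ndarray, and validate_sketch and the reference list are hypothetical.

```python
def validate_sketch(values, reference):
    """Return one boolean per input value: True if it occurs in the reference."""
    reference = set(reference)
    return [value in reference for value in values]


# Hypothetical reference values standing in for one field of an ontology.
reference_symbols = ["A1CF", "A1BG"]
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]

mask = validate_sketch(gene_symbols, reference_symbols)
print(mask)

# Use the mask to pick out the values that failed validation.
failed = [symbol for symbol, ok in zip(gene_symbols, mask) if not ok]
print(failed)
```
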