Metadata-Version: 2.1
Name: stringdb_alias
Version: 1.0
Summary: A python package for working with string-db.org aliases (gene and protein ID mapping).
Home-page: https://github.com/Craven-Biostat-Lab/stringdb_alias
Author: Yuriy Sverchkov
Author-email: yuriy.sverchkov@wisc.edu
License: MIT
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.6
Description-Content-Type: text/x-rst
License-File: LICENSE

stringdb_alias
==============

A Python package for working with string-db.org aliases (gene and
protein ID mapping).

This package is intended for working offline with downloaded files.
To access the STRINGdb API instead, see, for example, the
`stringdb <https://pypi.org/project/stringdb/>`__ package.

Usage
-----

Mapping HGNC symbols
~~~~~~~~~~~~~~~~~~~~

First, download the info and aliases files for human (NCBI taxonomy ID
9606) from string-db.org:

::

   $ wget https://stringdb-static.org/download/protein.info.v11.5/9606.protein.info.v11.5.txt.gz
   $ wget https://stringdb-static.org/download/protein.aliases.v11.5/9606.protein.aliases.v11.5.txt.gz

Then initialize the mapper object with the downloaded files and map a
list of IDs:

::

   from stringdb_alias import HGNCMapper

   mapper = HGNCMapper('9606.protein.info.v11.5.txt.gz', '9606.protein.aliases.v11.5.txt.gz')

   print(mapper.get_string_ids(['ADCK2', 'TOMM7', 'PRODH']))

The mapper always returns a `pandas
Series <https://pandas.pydata.org/pandas-docs/stable/reference/series.html>`__.
This is convenient for directly mapping a column in a
`DataFrame <https://pandas.pydata.org/pandas-docs/stable/reference/frame.html>`__.
Moreover, if the input list is a pandas Series, the index is preserved
in the output.
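
As a minimal sketch of the DataFrame use case described above, the
helper below adds a column of STRING IDs to a DataFrame. The function
name ``add_string_ids`` and the column names ``symbol`` and
``string_id`` are hypothetical and not part of this package. The only
assumption about the mapper is the behavior stated above:
``get_string_ids`` returns a Series whose index matches the input
Series.

```python
import pandas as pd


def add_string_ids(df, mapper, symbol_column="symbol", id_column="string_id"):
    """Return a copy of ``df`` with an extra column of mapped STRING IDs.

    ``mapper`` is any object with a ``get_string_ids`` method, such as an
    ``HGNCMapper`` instance. The helper and column names are illustrative.
    """
    result = df.copy()
    # The input is a Series, so the returned Series is expected to carry
    # the same index; pandas then aligns the assignment row by row.
    result[id_column] = mapper.get_string_ids(df[symbol_column])
    return result
```

With the ``mapper`` created earlier, this could be called as
``add_string_ids(df, mapper)`` on a DataFrame that holds HGNC symbols in
a ``symbol`` column. The input DataFrame is left unmodified.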


