Metadata-Version: 1.0
Name: pyCGA
Version: 1.3.0
Summary: A REST client for OpenCGA web services
Home-page: https://github.com/opencb/opencga/tree/develop/opencga-client/src/main/python
Author: antonior,dapregi,ernesto-ocampo
Author-email: antonio.rueda-martin@genomicsengland.co.uk,daniel.perez-gil@genomicsengland.co.uk,kenan.mcgrath@genomicsengland.co.uk
License: Apache Software License
Description: .. contents::
        
        PyCGA
        ==========
        
        - This Python package makes use of the exhaustive RESTful Web service API that has been implemented for the `OpenCGA`_ database.
        
        - It provides easy access to OpenCGA, an open-source project that aims to provide a Big Data storage engine and analysis framework for genomic scale data analysis of hundreds of terabytes or even petabytes.
        
        - More information about this project is available in the `OpenCGA Wiki`_.
        
        Installation
        ------------
        
        Cloning
        ```````
        PyCGA can be cloned to your local machine by running the following command in your terminal::
        
           $ git clone https://github.com/opencb/opencga.git
        
        Once you have downloaded the project you can install the library::
        
           $ cd opencga/opencga-client/src/main/python
           $ python setup.py install
        
        Usage
        -----
        
        Getting started
        ```````````````
        The first step is to set up the OpenCGA server configuration:
        
        .. code-block:: python
        
            >>> configuration = {
            ...     "version": "v1",
            ...     "rest": {
            ...         "hosts": ["http://100.15.26.35:8080/opencga"]
            ...     }
            ... }
        
        The configuration can also be stored in a JSON or YAML file:
        
        .. code-block:: python
        
            >>> configuration = '/path/to/config/opencga_configuration.json'
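        
        As a minimal sketch, the configuration dictionary shown earlier could be written to such a file with the standard ``json`` module. The temporary directory and file location here are hypothetical; only the dictionary keys come from the example above:
        
        .. code-block:: python
        
            import json
            import os
            import tempfile
        
            # Same structure as the dictionary in the first example
            configuration = {
                "version": "v1",
                "rest": {
                    "hosts": ["http://100.15.26.35:8080/opencga"]
                }
            }
        
            # Hypothetical location; use any path readable by your script
            config_path = os.path.join(tempfile.mkdtemp(), 'opencga_configuration.json')
            with open(config_path, 'w') as handle:
                json.dump(configuration, handle, indent=4)
        
            # Reading the file back should give an identical dictionary
            with open(config_path) as handle:
                loaded = json.load(handle)
        
        The resulting path can then be passed as the ``configuration`` argument in place of the dictionary.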
        
        The second step is to import the module and initialize the OpenCGAClient. Configuration, user and password must be specified:
        
        .. code-block:: python
        
            >>> from pyCGA.opencgarestclients import OpenCGAClient
            >>> oc = OpenCGAClient(configuration=configuration, user='user_example', pwd='pass_example')
        
        If you prefer not to write the user and password in a script, a session ID can be used instead:
        
        .. code-block:: python
        
            >>> from pyCGA.opencgarestclients import OpenCGAClient
            >>> oc = OpenCGAClient(configuration=configuration, user='user_example', pwd='pass_example')  # Remove after getting session id
            >>> print(oc.session_id)  # Remove after getting session id
            "I4MG3fXJIZARl1LhwZ"
            >>> oc = OpenCGAClient(configuration=configuration, session_id='I4MG3fXJIZARl1LhwZ')
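        
        Another way to keep credentials out of a script, sketched here under the assumption that they are exported as environment variables, is to read them at run time. The variable names ``OPENCGA_USER`` and ``OPENCGA_PASSWORD`` and the helper ``read_credentials`` are hypothetical and not part of pyCGA:
        
        .. code-block:: python
        
            import os
        
            def read_credentials(environ=None):
                """Return (user, pwd) read from hypothetical environment variables."""
                environ = os.environ if environ is None else environ
                user = environ.get('OPENCGA_USER')
                pwd = environ.get('OPENCGA_PASSWORD')
                if not user or not pwd:
                    raise ValueError('OPENCGA_USER and OPENCGA_PASSWORD must be set')
                return user, pwd
        
        The returned pair could then be passed as the ``user`` and ``pwd`` arguments of ``OpenCGAClient``.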
        
        The next step is to create the specific client for the data we want to query:
        
        .. code-block:: python
        
           >>> samples = oc.samples  # Query for samples
           >>> files = oc.files  # Query for files
           >>> cohorts = oc.cohorts  # Query for cohorts
        
        Now you can start querying the OpenCGA RESTful service by providing a query ID:
        
        .. code-block:: python
        
           >>> sample_search = samples.search(study='study1', name='sample1').get()
           >>> print(sample_search)
           "[{'acl': [{'member': '@gel', 'permissions': ['VIEW', 'VIEW_ANNOTATIONS']}..."
        
        Responses are retrieved as JSON formatted data. Therefore, fields can be queried by key:
        
        .. code-block:: python
        
            >>> creation_date = oc.samples.search(study='study1', name='sample1').get()[0]['creationDate']
            "20170204122738"
        
        First levels in the JSON output can be accessed as attributes:
        
        .. code-block:: python
        
            >>> creation_date = samples.search(study='study1', name='sample1').get().creationDate
            "20170204122738"
        
            >>> annotation = cohorts.search(study='study1', name='cohort1').get().annotationSets
            >>> print(annotation[0]['annotations'][0]['value']['sex'])
            "F"
        
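        The ``creationDate`` values shown in these examples look like timestamps in ``YYYYMMDDHHMMSS`` form. Assuming that reading is right (the format is inferred from the examples, not confirmed here), such a string can be parsed with the standard library. The helper name ``parse_creation_date`` is hypothetical:
        
        .. code-block:: python
        
            from datetime import datetime
        
            def parse_creation_date(value):
                """Parse a string assumed to be in YYYYMMDDHHMMSS format."""
                return datetime.strptime(value, '%Y%m%d%H%M%S')
        
            created = parse_creation_date('20170204122738')
            print(created.isoformat())  # 2017-02-04T12:27:38
        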
        Regular expressions are allowed in some fields. This is especially useful when searching by name:
        
        .. code-block:: python
        
            >>> cohort_name = cohorts.search(study='study1', name='~LP3000506-DNA_J01').get().name
            >>> print(cohort_name)
            "LP3000506-DNA_J01_LP3000924-DNA_Z02_0"
        
        Data can be accessed specifying comma-separated IDs or a list of IDs:
        
        .. code-block:: python
        
            >>> creation_date = oc.samples.search(study='study1', name='sample1,sample2').get()[0]['creationDate']
            "20170204122738"
        
            >>> creation_date = oc.samples.search(study='study1', name='sample1,sample2').get()[1]['creationDate']
            "20170204123049"
        
            >>> creation_date = samples.search(study='study1', name='sample1,sample2').get().creationDate
            ["20170204122738", "20170204123049"]
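        
        When the IDs are already held in a Python list, the comma-separated form can be built with ``str.join``. This is a small sketch; ``to_csv_param`` is a hypothetical helper and not part of pyCGA:
        
        .. code-block:: python
        
            def to_csv_param(values):
                """Join a list of IDs with commas; pass a plain string through."""
                if isinstance(values, str):
                    return values
                return ','.join(str(value) for value in values)
        
            names = to_csv_param(['sample1', 'sample2'])
            print(names)  # sample1,sample2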
        
        Optional filters and extra options can be added as key-value parameters (each value can be a comma-separated string or a list):
        
        .. code-block:: python
        
            >>> # e.g. "exclude" parameter
            >>> attributes = oc.files.search(study='study1', name='~sample', bioformat='VARIANT', status='READY', exclude='attributes').get().attributes
            >>> print(attributes)
            [{}, {}, {}, {}, {}, {}, {}, {}]
        
            >>> # e.g. "limit" parameter
            >>> files = oc.files.search(study='study1', name='~sample', bioformat='VARIANT', status='READY', limit=1).get()
            >>> print(len(files))
            1
        
        The "analysis_variant" endpoint deserves special mention because it returns an iterator:
        
        .. code-block:: python
        
            >>> variant_iterator = oc.analysis_variant.query(pag_size=100, data={'studies': 'study1', 'gene': 'BRCA2'}, limit=1)
            >>> for variant in variant_iterator:
            ...     print(variant.get().type)
            "SNV"
        
        What can I ask for?
        ```````````````````
        The best way to find out which data can be retrieved for each client is to check either the `RESTful web services`_ section of the OpenCGA Wiki or the `OpenCGA web services`_ page.
        
        
        .. _OpenCGA: https://github.com/opencb/opencga
        .. _OpenCGA Wiki: https://github.com/opencb/opencga/wiki
        .. _RESTful web services: https://github.com/opencb/opencga/wiki/RESTful-Web-Services
        .. _OpenCGA web services: http://bioinfodev.hpc.cam.ac.uk/opencga/webservices/
        
Keywords: opencb opencga bioinformatics genomic database
Platform: UNKNOWN
