Metadata-Version: 1.1
Name: molml
Version: 0.9.0
Summary: An interface between molecules and machine learning
Home-page: https://github.com/crcollins/molml/
Author: Chris Collins
Author-email: chris@crcollins.com
License: MIT
Description: MolML
        =====
        
        |Build Status| |Coverage Status| |Documentation Status| |PyPI version|
        |License|
        
        A library to interface molecules and machine learning. The goal of this
        library is to be a simple way to convert molecules into a vector
        representation for later use with libraries such as
        `scikit-learn <http://scikit-learn.org/>`__. This is done using a
        similar API scheme.
        
        All of the coordinates are assumed to be in angstroms.
        
        Features
        ========
        
        ::
        
            - Simple interface to many common molecular descriptors and their variants
                - Molecule
                    - Coulomb Matrix
                    - Bag of Bonds
                    - Encoded Bonds
                    - Encoded Angles
                    - Connectivity
                    - Connectivity Tree
                    - Autocorrelation
                - Atom
                    - Shell
                    - Local Encoded Bonds
                    - Local Encoded Angles
                    - Local Coulomb Matrix
                    - Behler-Parrinello
                - Kernel
                    - Atom/Summation Kernel
                - Fragment
                    - FragmentMap
                - Crystal
                    - Generalized Crystal
                    - Ewald Sum Matrix
                    - Sine Matrix
            - Parallel feature generation
            - Ability to save/load fit models
            - Multiple input formats supported (and ability to define your own)
            - Supports both Python 2 and Python 3
        
        Example Usage
        =============
        
        .. code:: python
        
                >>> from molml.features import CoulombMatrix
                >>> feat = CoulombMatrix()
                >>> H2 = (
                ...         ['H', 'H'],
                ...         [
                ...             [0.0, 0.0, 0.0],
                ...             [1.0, 0.0, 0.0],
                ...         ]
                ... )
                >>> HCN = (
                ...         ['H', 'C', 'N'],
                ...         [
                ...             [-1.0, 0.0, 0.0],
                ...             [ 0.0, 0.0, 0.0],
                ...             [ 1.0, 0.0, 0.0],
                ...         ]
                ... )
                >>> feat.fit([H2, HCN])
                CoulombMatrix(input_type='list', n_jobs=1, sort=False, eigen=False, drop_values=False, only_lower_triangle=False)
                >>> feat.transform([H2])
                array([[ 0.5,  1. ,  0. ,  1. ,  0.5,  0. ,  0. ,  0. ,  0. ]])
                >>> feat.transform([H2, HCN])
                array([[  0.5      ,   1.       ,   0.       ,   1.       ,   0.5      ,
                        0.       ,   0.       ,   0.       ,   0.       ],
                        [  0.5      ,   6.       ,   3.5      ,   6.       ,  36.8581052,
                        42.       ,   3.5      ,  42.       ,  53.3587074]])
                >>>
                >>> # Example loading from files directly
                >>> feat2 = CoulombMatrix(input_type='filename')
                >>> paths = ['data/qm7/qm-%04d.out' % i for i in range(2)]
                >>> feat2.fit_transform(paths)
                array([[ 36.8581052 ,   5.49459021,   5.49462885,   5.4945    ,
                          5.49031286,   0.        ,   0.        ,   0.        ,
                          5.49459021,   0.5       ,   0.56071947,   0.56071656,
                          0.56064037,   0.        ,   0.        ,   0.        ,
                          5.49462885,   0.56071947,   0.5       ,   0.56071752,
                          0.56064089,   0.        ,   0.        ,   0.        ,
                          5.4945    ,   0.56071656,   0.56071752,   0.5       ,
                          0.56063783,   0.        ,   0.        ,   0.        ,
                          5.49031286,   0.56064037,   0.56064089,   0.56063783,
                          0.5       ,   0.        ,   0.        ,   0.        ,
                          0.        ,   0.        ,   0.        ,   0.        ,
                          0.        ,   0.        ,   0.        ,   0.        ,
                          0.        ,   0.        ,   0.        ,   0.        ,
                          0.        ,   0.        ,   0.        ,   0.        ,
                          0.        ,   0.        ,   0.        ,   0.        ,
                          0.        ,   0.        ,   0.        ,   0.        ],
                       [ 36.8581052 ,  23.81043959,   5.48396427,   5.48394941,
                          5.4837656 ,   2.78378686,   2.78375582,   2.78376439,
                          23.8104396,  36.8581052 ,   2.78378953,   2.78375777,
                          2.78375823,   5.4839846 ,   5.48393324,   5.48376877,
                          5.48396427,   2.78378953,   0.5       ,   0.56363019,
                          0.56362464,   0.40019757,   0.39971446,   0.3261774 ,
                          5.48394941,   2.78375777,   0.56363019,   0.5       ,
                          0.56362305,   0.39971429,   0.32617621,   0.40019524,
                          5.4837656 ,   2.78375823,   0.56362464,   0.56362305,
                          0.5       ,   0.32617702,   0.40019469,   0.3997145 ,
                          2.78378686,   5.4839846 ,   0.40019757,   0.39971429,
                          0.32617702,   0.5       ,   0.56362996,   0.56362587,
                          2.78375582,   5.48393324,   0.39971446,   0.32617621,
                          0.40019469,   0.56362996,   0.5       ,   0.56362278,
                          2.78376439,   5.48376877,   0.3261774 ,   0.40019524,
                          0.3997145 ,   0.56362587,   0.56362278,   0.5       ]])
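        
        The ``H2`` and ``HCN`` rows above are consistent with the standard
        Coulomb matrix definition from the Rupp et al. reference cited below:
        ``0.5 * Z_i ** 2.4`` on the diagonal and ``Z_i * Z_j / |R_i - R_j|``
        off the diagonal, zero-padded to the largest molecule seen during
        ``fit`` and then flattened. The following is a minimal pure-Python
        sketch of that formula, not MolML's actual implementation. The names
        ``coulomb_matrix`` and ``ATOMIC_NUMBERS`` are made up for this
        illustration, and using the distances without any unit conversion is
        an assumption inferred from the printed output.
        
        .. code:: python
        
                import math

                # Atomic numbers for the elements used in this sketch only.
                ATOMIC_NUMBERS = {'H': 1, 'C': 6, 'N': 7}


                def coulomb_matrix(elements, coords, size):
                    """Return a flattened, zero-padded size x size Coulomb matrix."""
                    numbers = [ATOMIC_NUMBERS[e] for e in elements]
                    matrix = [[0.0] * size for _ in range(size)]
                    for i, (zi, ri) in enumerate(zip(numbers, coords)):
                        for j, (zj, rj) in enumerate(zip(numbers, coords)):
                            if i == j:
                                # Diagonal term: 0.5 * Z ** 2.4
                                matrix[i][j] = 0.5 * zi ** 2.4
                            else:
                                # Off-diagonal term: Z_i * Z_j / distance
                                dist = math.sqrt(
                                    sum((a - b) ** 2 for a, b in zip(ri, rj)))
                                matrix[i][j] = zi * zj / dist
                    # Flatten row by row.
                    return [value for row in matrix for value in row]


                H2 = (['H', 'H'], [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
                HCN = (
                    ['H', 'C', 'N'],
                    [[-1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                )
                # Pad every molecule to the largest atom count, as fit() does.
                size = max(len(mol[0]) for mol in (H2, HCN))
                h2_vector = coulomb_matrix(H2[0], H2[1], size)
                hcn_vector = coulomb_matrix(HCN[0], HCN[1], size)
        
        Under these assumptions, ``h2_vector`` and ``hcn_vector`` match the
        two rows returned by ``feat.transform([H2, HCN])`` above.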
        
        For more examples, look in the
        `examples <https://github.com/crcollins/molml/tree/master/examples>`__
        directory. Note: some of the examples require scikit-learn>=0.16.0.
        
        For the full documentation, refer to the
        `docs <http://molml.readthedocs.io>`__ or the docstrings in the code.
        
        Dependencies
        ============
        
        MolML works with both Python 2 and Python 3. It has been tested with the
        versions listed below, but newer versions should work.
        
        ::
        
            python>=2.7/3.5/3.6
            numpy>=1.9.1
            scipy>=0.15.1
            pathos>=0.2.0
            bidict>=0.17.5
            future  # For python 2
        
        NOTE: Due to an issue with multiprocess (a pathos dependency), the
        minimum version of Python that will work is 2.7.4. For full details see
        `this link <https://github.com/uqfoundation/multiprocess/issues/11>`__.
        On older Python versions, the parallel computation of features will
        fail.
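        
        One way to guard against this is to check the interpreter version
        before asking for more than one job. The snippet below is only a
        sketch: ``parallel_supported`` and ``MIN_PARALLEL_VERSION`` are
        hypothetical names, not part of MolML, and passing the resulting
        value as ``n_jobs`` to a feature class is an assumption based on the
        ``n_jobs=1`` shown in the ``CoulombMatrix`` repr in the example usage.
        
        .. code:: python
        
                import sys

                # Minimum Python version stated in the note above.
                MIN_PARALLEL_VERSION = (2, 7, 4)


                def parallel_supported(version=None):
                    """Return True if the version meets the 2.7.4 minimum."""
                    if version is None:
                        version = sys.version_info[:3]
                    return tuple(version) >= MIN_PARALLEL_VERSION


                # Fall back to a single job on interpreters older than 2.7.4.
                n_jobs = 2 if parallel_supported() else 1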
        
        Install
        =======
        
        Once ``numpy`` and ``scipy`` are installed, the package can be installed
        with pip.
        
        ::
        
            $ pip install molml
        
        Or for the bleeding edge version, you can use
        
        ::
        
            $ pip install git+https://github.com/crcollins/molml
        
        Development
        ===========
        
        To install a development version, just clone the git repo.
        
        ::
        
            $ git clone https://github.com/crcollins/molml
            $ # cd to molml and setup some virtualenv
            $ pip install -r requirements-dev.txt
        
        `Pull requests <https://github.com/crcollins/molml/pulls>`__ and `bug
        reports <https://github.com/crcollins/molml/issues>`__ are welcome!
        
        To build the documentation, you just need to install the documentation
        dependencies. These are already included in the dev install.
        
        ::
        
            $ cd docs/
            $ pip install -r requirements-docs.txt
            $ make html
        
        Testing
        =======
        
        To run the tests, make sure that ``nose`` is installed and then run:
        
        ::
        
            $ nosetests
        
        To include coverage information, make sure that ``coverage`` is
        installed and then run:
        
        ::
        
            $ nosetests --with-coverage --cover-package=molml --cover-erase
        
        Citation
        ========
        
        Currently, there is not a dedicated publication for MolML. Instead, feel
        free to cite the work that spawned this library.
        
        ::
        
            @article{collins2018constant,
                title={Constant size descriptors for accurate machine learning models of molecular properties},
                author={Collins, Christopher R and Gordon, Geoffrey J and von Lilienfeld, O Anatole and Yaron, David J},
                journal={The Journal of Chemical Physics},
                volume={148},
                number={24},
                pages={241718},
                year={2018},
                publisher={AIP Publishing}
            }
        
        In addition, each feature extraction method has its own main reference
        listed in the docstring. These can also be accessed as follows:
        
        .. code:: python
        
                >>> from molml.features import CoulombMatrix
                >>> print(CoulombMatrix().get_citation())
                Rupp, M.; Tkatchenko, A.; Muller, K.-R.; von Lilienfeld, O. A. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. 2012, 108, 058301.
                Hansen, K.; Montavon, G.; Biegler, F.; Fazli, S.; Rupp, M.; Scheffler, M.; von Lilienfeld, O. A.; Tkatchenko, A.; Muller, K.-R. Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies. J. Chem. Theory Comput. 2013, 9, 3404-3419.
        
        .. |Build Status| image:: https://travis-ci.org/crcollins/molml.svg?branch=master
           :target: https://travis-ci.org/crcollins/molml
        .. |Coverage Status| image:: https://coveralls.io/repos/github/crcollins/molml/badge.svg?branch=master
           :target: https://coveralls.io/github/crcollins/molml?branch=master
        .. |Documentation Status| image:: https://readthedocs.org/projects/molml/badge/?version=latest
           :target: http://molml.readthedocs.io/en/latest/?badge=latest
        .. |PyPI version| image:: https://img.shields.io/pypi/v/MolML.svg?style=flat
           :target: http://pypi.python.org/pypi/MolML
        .. |License| image:: https://img.shields.io/pypi/l/MolML.svg?style=flat
           :target: https://github.com/crcollins/molml/blob/master/LICENSE.txt
        
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Physics
