Metadata-Version: 2.1
Name: embeddingdb
Version: 0.0.1
Summary: A package for storing and querying knowledge graph embeddings
Home-page: UNKNOWN
License: MIT
Description: Embedding Database
        ==================
        This package provides a database schema and Python wrapper
        for storing the embeddings generated through various representation
        learning packages.
        
        Currently, this package focuses on using a SQL database with SQLAlchemy,
        but might be extended to use a NoSQL database as an alternative.
        
        Installation
        ------------
        Install ``embeddingdb`` directly from GitHub with:
        
        .. code-block:: sh
        
           $ pip install git+https://github.com/cthoyt/embeddingdb
        
        Set the environment variable ``EMBEDDINGDB_CONNECTION`` to a valid
        SQLAlchemy connection string for a PostgreSQL instance, as this package uses
        the PostgreSQL-specific ``ARRAY`` type.
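        
        As a minimal sketch, the connection string can also be assembled and set from
        Python. The user, password, host, port, and database name below are placeholders,
        and ``build_postgres_url`` is a hypothetical helper, not part of this package.
        Whether the package reads the variable at import time is not stated here, so
        setting it before importing ``embeddingdb`` is the safer order.
        
        .. code-block:: python
        
           import os
           from urllib.parse import quote_plus
        
        
           def build_postgres_url(user, password, host, port, database):
               """Return a SQLAlchemy-style PostgreSQL connection string."""
               # Percent-encode the credentials so characters like "@" or "/" are safe.
               return "postgresql://{}:{}@{}:{}/{}".format(
                   quote_plus(user), quote_plus(password), host, port, database
               )
        
        
           # All values here are placeholders for illustration only.
           url = build_postgres_url("alice", "p@ss/word", "localhost", 5432, "embeddings")
           os.environ["EMBEDDINGDB_CONNECTION"] = url
           print(url)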
        
        Command Line Interface
        ----------------------
        This package installs an entry point, ``embeddingdb``, that can be used directly
        from the shell.
        
        Uploading Entity Embeddings
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~
        Entity embeddings from various types of representation learning can be stored,
        including network representation learning, knowledge graph embedding, and textual
        learning.
        
        Upload embeddings generated by ``word2vec`` by specifying the file path with:
        
        .. code-block:: sh
        
           $ embeddingdb upload --fmt word2vec --path ~/path/to/file.txt
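        
        For orientation, the conventional ``word2vec`` text layout is a header line with
        the number of vectors and their dimension, followed by one line per entity with
        its name and space-separated values. The reader below is a sketch of that layout
        only. It is not this package's loader, the entity names are made up, and whether
        ``embeddingdb upload`` requires the header line is an assumption.
        
        .. code-block:: python
        
           import io
        
        
           def read_word2vec_text(handle):
               """Parse word2vec text format into a dict of entity -> list of floats."""
               # The first line holds the vocabulary size and the vector dimension.
               count, dim = (int(part) for part in handle.readline().split())
               vectors = {}
               for line in handle:
                   parts = line.rstrip().split(" ")
                   if len(parts) != dim + 1:
                       continue  # skip blank or malformed lines
                   vectors[parts[0]] = [float(value) for value in parts[1:]]
               if len(vectors) != count:
                   raise ValueError(
                       "expected {} vectors, found {}".format(count, len(vectors))
                   )
               return vectors
        
        
           # A tiny in-memory file with two 3-dimensional vectors.
           example = io.StringIO("2 3\nP12345 0.1 0.2 0.3\nQ67890 0.4 0.5 0.6\n")
           vectors = read_word2vec_text(example)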
        
        Upload embeddings generated by ``pykeen`` by specifying the output directory
        with:
        
        .. code-block:: sh
        
           $ embeddingdb upload --fmt keen --path ~/path/to/directory/
        
        Listing Entity Embeddings
        ~~~~~~~~~~~~~~~~~~~~~~~~~
        After uploading, the collections can be listed with:
        
        .. code-block:: sh
        
           $ embeddingdb ls
        
        Analyzing Entity Embeddings' Correlations
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        One of the motivations for building this repository was to provide a convenient way to
        compare the embeddings for entities generated through orthogonal embedding techniques.
        For example, we wanted to know to what extent the embeddings for proteins generated from
        their sequences with ``ratvec`` contained the same information as the embeddings generated
        from protein-protein interaction networks with ``pykeen`` or ``nrl``.
        
        Run ``embeddingdb analyze`` with two positional arguments, which are the identifiers
        of the collections in the database to compare:
        
        .. code-block:: sh
        
           $ embeddingdb analyze 1 2
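        
        How ``analyze`` measures correlation is not described here. One common way to
        compare two embedding spaces of different dimension is to compute the pairwise
        cosine similarities among the shared entities in each space and correlate the two
        lists. The sketch below illustrates that idea under this assumption; the function
        names and toy data are hypothetical, and it needs at least three shared entities
        whose similarities are not all equal.
        
        .. code-block:: python
        
           import math
           from itertools import combinations
        
        
           def cosine(u, v):
               """Cosine similarity of two equal-length vectors."""
               dot = sum(a * b for a, b in zip(u, v))
               norm_u = math.sqrt(sum(a * a for a in u))
               norm_v = math.sqrt(sum(b * b for b in v))
               return dot / (norm_u * norm_v)
        
        
           def pearson(xs, ys):
               """Pearson correlation coefficient of two equal-length lists."""
               mean_x = sum(xs) / len(xs)
               mean_y = sum(ys) / len(ys)
               cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
               var_x = sum((x - mean_x) ** 2 for x in xs)
               var_y = sum((y - mean_y) ** 2 for y in ys)
               return cov / math.sqrt(var_x * var_y)
        
        
           def similarity_correlation(first, second):
               """Correlate pairwise cosine similarities over the shared entities."""
               shared = sorted(set(first) & set(second))
               pairs = list(combinations(shared, 2))
               sims_first = [cosine(first[a], first[b]) for a, b in pairs]
               sims_second = [cosine(second[a], second[b]) for a, b in pairs]
               return pearson(sims_first, sims_second)
        
        
           # Toy collections with different dimensions; the second is the first
           # with a zero coordinate appended, so the similarities are identical.
           first = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
           second = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [1.0, 1.0, 0.0]}
           score = similarity_correlation(first, second)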
        
        Running with Docker
        -------------------
        After installing Docker, the entire web application can be instantiated with:
        
        .. code-block:: sh
        
           $ docker-compose up
        
        Send a GET request to the ``/test`` endpoint to instantiate the database and add a
        test collection.
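        
        The request to ``/test`` can be made from Python with the standard library, as in
        this sketch. The base URL is an assumption, since the host and port published by
        the Docker setup are not given here, and ``initialize_test_collection`` is a
        hypothetical helper.
        
        .. code-block:: python
        
           from urllib.parse import urljoin
           from urllib.request import urlopen
        
           # Assumed address; check docker-compose.yml for the port actually published.
           BASE_URL = "http://localhost:5000"
        
        
           def initialize_test_collection(base_url=BASE_URL):
               """Send a GET request to /test and return the HTTP status code."""
               with urlopen(urljoin(base_url, "/test")) as response:
                   return response.status
        
        Call ``initialize_test_collection()`` once the containers are running.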
        
Keywords: Knowledge Graph Embeddings,Machine Learning,Data Mining,Linked Data
Platform: UNKNOWN
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.7
Provides-Extra: web
Provides-Extra: docs
