Metadata-Version: 2.1
Name: auto-mapper
Version: 0.1.2
Summary: An auto mapper that accepts a list of strings and a list of objects of the format {'code', 'name'} and returns a list of objects where each 'code' is mapped to the most similar strings from the list of strings
Home-page: https://pypi.org/project/auto-mapper/
Author: Ilyasse Benrkia
Author-email: benrkyailyass@gmail.com
License: MIT
Description: # Auto Mapper
        
        A package that **maps** a set of strings to another set of objects where each object is described as:
        ```python
        {
          'code': 'unique_code',
          'name': 'given name'
        }
        ```
        
        ## How it works
        The process starts with text cleaning (a form of text preprocessing), which relies on external dependencies ([NLTK](https://www.nltk.org/) in the current release).
        \
        The cleaning operation transforms each string into a list of tokens. Running a text similarity algorithm on the resulting token vectors lets us map some fields to their most similar columns.
        \
        The second mapping phase applies an additional text processing step, `stemming`, combined with a lower similarity threshold, to the fields that are still unmapped.
        \
        The final step is to measure the semantic similarity between the remaining unmapped fields and columns. Thanks to [Datamuse](http://datamuse.com) and their great API, we were able to externalize this operation.
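        
        The first two phases can be illustrated with a minimal, standard-library-only sketch. It is not the package's implementation: the Jaccard measure, the `crude_stem` helper, and both thresholds (0.8 and 0.5) are assumptions chosen for illustration, and the Datamuse step is left out.
        ```python
        import re
        
        def tokenize(text):
            """Lowercase the text and split it into alphanumeric tokens."""
            return re.findall(r"[a-z0-9]+", text.lower())
        
        def crude_stem(token):
            """Toy stand-in for a real stemmer: strip a trailing plural 's'."""
            if len(token) > 3 and token.endswith("s"):
                return token[:-1]
            return token
        
        def jaccard(a, b):
            """Jaccard similarity between two token collections."""
            a, b = set(a), set(b)
            if not a or not b:
                return 0.0
            return len(a & b) / len(a | b)
        
        def best_match(field_name, columns, threshold, stem=False):
            """Return the column most similar to field_name, or None if below threshold."""
            def prep(text):
                tokens = tokenize(text)
                return [crude_stem(t) for t in tokens] if stem else tokens
            field_tokens = prep(field_name)
            scored = [(jaccard(field_tokens, prep(col)), col) for col in columns]
            score, col = max(scored, default=(0.0, None))
            return col if score >= threshold else None
        
        columns = ["city", "Location Name"]
        # First pass: plain tokens, strict threshold (0.8 is an arbitrary choice).
        first = best_match("location names", columns, threshold=0.8)
        # Second pass: stemmed tokens, lower threshold (0.5 is equally arbitrary).
        second = best_match("location names", columns, threshold=0.5, stem=True)
        print(first, second)
        ```
        In this sketch the strict pass leaves `location names` unmapped because `names` and `name` are different tokens, while the stemmed pass matches it to `Location Name`. A pair such as `Town` and `city` shares no tokens at all, which is the kind of case the semantic step is meant to handle.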
        
        ## Installation
        First install the package, then run the setup script, which downloads the necessary NLTK packages:
        ```
        $ pip install auto-mapper
        $ setup-nltk
        ```
        
        `NOTE:` if you are using a virtual environment, activate it before running the NLTK setup. **It downloads the packages to the environment folder.**
        
        ## Usage
        It's pretty straightforward:
        ```pycon
        >>> from mapper import AutoMapper
        >>> mapper = AutoMapper()
        >>> cols = ['city', 'Location Name']
        >>> fields = [{'code': 'loc_name', 'name': 'location names'}, {'code': 'town', 'name': 'Town'}]
        >>> mapping_result, unmapped_columns_indices, unmapped_fields_indices = mapper.map(column_names=cols, fields=fields)
        >>> print(mapping_result)
        [{'source': ['city'], 'target': 'town'}, {'source': ['Location Name'], 'target': 'loc_name'}]
        >>> print(unmapped_columns_indices)
        set()
        >>> print(unmapped_fields_indices)
        set()
        ```
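        
        Since each entry lists its source columns under `source` and the matched field code under `target`, the result can be flattened into a column-to-code lookup. The snippet below is a small sketch that works on the literal output shown above; the name `column_to_code` is only illustrative.
        ```python
        # Literal copied from the session above.
        mapping_result = [
            {'source': ['city'], 'target': 'town'},
            {'source': ['Location Name'], 'target': 'loc_name'},
        ]
        
        # Invert the result: one entry per source column, pointing at its field code.
        column_to_code = {
            column: entry['target']
            for entry in mapping_result
            for column in entry['source']
        }
        print(column_to_code)
        # {'city': 'town', 'Location Name': 'loc_name'}
        ```
        A dictionary of this shape can be handy when renaming the columns of a tabular dataset to their field codes.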
Keywords: auto,mapper,mapping,text processing,text similarity
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.7
Description-Content-Type: text/markdown
