Metadata-Version: 2.1
Name: makeup
Version: 0.1.2
Summary: Make your models look pretty.
Home-page: https://github.com/trevorgrayson/mlf
Author: trevor grayson
Author-email: trevor@dave.com
License: UNKNOWN
Description: # makeup Dependency Framework
        
        Run Machine Learning/AI models, reproducibly, from ideation to production.
        
        `makeup` strives to help Data Scientists write model code and not be obliged to much else.
        `makeup` is the connective tissue that plugs the different stages of building a model together.
        
        This is not a processor library, it's a code organizational framework made to make your development easier.
        
        Additional features may be developed in this library to provide services, like a web API to host models. 
        
        # Why?
        
        - write less code.
        - promote a "model interface" for interoperable models.
        - target caching for reproducible and expedient execution. 
        - simplified debugging, without hacking code.
        - production deployable code that's easy to test.
        - artifact rendering, to help with deployments.  
        
        
        ## How?
        
        We're going to try to break our ML code down into smaller functional parts.  These parts will be simple python functions,
        and we will refer to them as targets.  How big should we make these targets?  A good rule of thumb is to make a new 
        target anywhere you may want to `print`, save, or inspect variables or results.  
        
        Some example targets may be: `load`, `prep` (or `feature` generation), `split`, `train`
        
        ## Getting Started
        
        Take an ML project, like the [Sklearn Iris Example](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html).
        Let's start by making a module called `iris` to name our model.  The following code will be added to `iris/__init__.py`.
        
        ### loading data
        
        No matter what you're doing, you'll want to load some data first.  I'm hard pressed to find a example where 
        the program should hardcode the data source in the code, but this seems to happen in every jupyter notebook I've ever 
        seen, so let's write a method to do it.  This will be our default data, but you will be able to change data 
        sources/sets at run time.
        
        To implement that here we will use the `dataset.load_iris` function. Keep in mind, this block of code could just as
        easily load a csv, call a database, or load any other data source. More on this later. 
        
        ```python
        # iris/__init__.py
        from sklearn import datasets
        
        def load():
            """Returns reasonable "default data" for executon. Use in Juypter Notebooks.""" 
            iris = datasets.load_iris()
            return iris.data[:, :2], iris.target
        
        # load = "data/yourdataset.tsv"
        ```
        
        The Iris Example is loading an object and extracts two useful components from it:  
        a data frame, and target numbers.
        
        Notice the loaded `iris` variable wasn't returned, though it could've been.  By returning a generic tuple of 
        python primatives you can avoid coupling your code to a data object. By explicitly stating your data requirements 
        in the function's arguments, it will make it much easier to plug in different data sources, and
        unit test method separately.
        
        ### training on data
        
        Now that you have your data, you will want to train your model against it.
        Rather than procedurally continuing our code, let's make another method which takes the previous
        function's returned values. Let's name those returned values sensibly: `data` and `target`. 
        
        ```python
        def train(data, target):
            """
            Further describing the inputs here will help later.
            data: a DataFrame with x, y, z column requirements.
            target: a list of numbers
            """
            clf = SVC()
            clf.fit(data, target)
        
            return clf
        ```
        
        ### prediction
        
        You have your `SVC` model at this point.  Here we finish up with making a prediction.
        
        ```python
        def predict(clf, row):
            return clf.predict(row)
        ```
        
        This is using a generic row like our example is, but getting more explicit with your parameters may suit you better.
        
        
        ## Running the code...
        
        We've defined three methods: `load`, `train`, and `predict`.  There are implicit dependencies between these functions 
        which we could write some code to execute, but that's where `makeup` comes in.
        
        ### in a notebook
        
        ```python
        import iris
        from makeup import run, target
        
        target(iris.train, requires=iris.load)
        run(iris, 'train')
        ```
        
        On the command line, this could be executed with:
        
        ```sh
        python -m makeup iris train
        ```
        
        You may also [override](docs/OVERRIDES.md) the data source with a URL.
        
        ```sh
        python -m makeup iris train --load file://./data.csv
        ``` 
        
        You could imagine dependencies getting more intricate:
        
        ```python
        from makeup import target
        import examples.iris as iris
        
        target(iris.features, requires=iris.load)
        target(plot, requires=iris.features)
        
        target(iris.split, requires=iris.features)
        target(iris.train, requires=iris.split)
        ```
        
        ```
        load -> features |-> plot
                         \-> split -> train
        ```
        
        OR, in abbreviated form:
        
        ```python
        from makeup import workflow
        import examples.iris as iris
        
        workflow({
            iris.features: iris.load,
            plot: iris.features,
            iris.split: iris.features,
            iris.train: iris.split,
        })
        ```
        
        
        
        
        
Keywords: machine learning,data science,make,models
Platform: UNKNOWN
Description-Content-Type: text/markdown
