Metadata-Version: 1.1
Name: pyPiper
Version: 0.2.2
Summary: A pipelining framework designed for data analysis that can also be useful in other applications
Home-page: https://github.com/daniyall/pyPiper
Author: daniyall
Author-email: daniyal.l@outlook.com
License: LICENSE
Download-URL: https://github.com/daniyall/pyPiper/archive/0.2.0.tar.gz
Description: A pipelining framework for Python. Developers can create nodes and chain
        them together to create pipelines.
        
        Classes that extend ``Node`` must implement a ``run`` method, which is
        called whenever new data is available.
        
        A simple example:
        
        .. code:: python
        
            from pyPiper import Node, Pipeline
        
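            class Generate(Node):
                def setup(self):
                    # Next value to emit
                    self.pos = 0
        
                def run(self, data):
                    # size is passed as a keyword argument when the node is created
                    if self.pos < self.size:
                        self.emit(self.pos)
                        self.pos = self.pos + 1
                    else:
                        # Nothing left to generate, so close this node
                        self.close()
        
            class Square(Node):
                def run(self, data):
                    # Pass the square of each incoming value downstream
                    self.emit(data**2)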
        
        
            pipeline = Pipeline(Generate("gen", size=10) | Square("square"))
            print(pipeline)
            pipeline.run()
        
        Nodes can also specify a batch size that dictates how much data is
        pushed to the node at a time. The following builds on the previous
        example. Here ``batch_size`` is set in the node's ``setup`` method;
        alternatively, it can be set when creating the node (e.g.
        ``Printer("print", batch_size=5)``).
        
        .. code:: python
        
            class Printer(Node):
                def setup(self):
                    self.batch_size = Node.BATCH_SIZE_ALL
        
                def run(self, data):
                    print(data)
        
            pipeline = Pipeline(Generate("gen", size=10) | Square("square") | Printer("print"))
            print(pipeline)
            pipeline.run()
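        
        As a rough illustration of what a batch size means in general, the
        standalone sketch below groups the ten squared values from the example
        into batches. It does not use pyPiper and is not its implementation: the
        ``batches`` helper is hypothetical, and treating a batch as a list is an
        assumption made only for this sketch.
        
        .. code:: python
        
            from itertools import islice
        
            def batches(stream, batch_size=None):
                """Yield lists of up to batch_size items (hypothetical helper).
        
                batch_size=None means "everything in one batch".
                """
                it = iter(stream)
                if batch_size is None:
                    yield list(it)
                    return
                while True:
                    # Take the next batch_size items; stop when nothing is left
                    chunk = list(islice(it, batch_size))
                    if not chunk:
                        break
                    yield chunk
        
            squares = [n ** 2 for n in range(10)]
            by_five = list(batches(squares, 5))
            everything = list(batches(squares))
        
        In this sketch, a batch size of 5 splits the ten values into two groups
        of five, while no limit produces a single group holding all of them.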
        
Keywords: data-science,pipelining,stream-processing,data-analysis
Platform: UNKNOWN
