Metadata-Version: 1.1
Name: scrapydo
Version: 0.2.2
Summary: Crochet-based blocking API for Scrapy.
Home-page: https://github.com/rolando/scrapydo
Author: Rolando Espinoza La fuente
Author-email: rndmax84@gmail.com
License: MIT
Description: ScrapyDo
        ========
        
        Crochet_-based blocking API for Scrapy_.
        
        This module provides helper functions to run Scrapy_ in a blocking fashion. See
        the `scrapydo-overview.ipynb <http://nbviewer.ipython.org/github/darkrho/scrapydo/blob/master/notebooks/scrapydo-overview.ipynb>`_
        notebook for a quick overview of this module.
        
        
        Installation
        ============
        
        Using ``pip``::
        
          pip install scrapydo
        
        
        Usage
        =====
        
        The function ``scrapydo.setup`` must be called once to initialize the reactor.
        
        Example:
        
        .. code:: python
        
            import scrapydo
            from scrapy import Request
            scrapydo.setup()
        
            scrapydo.default_settings.update({
                'LOG_LEVEL': 'DEBUG',
                'CLOSESPIDER_PAGECOUNT': 10,
            })
        
            # Enable logging display
            import logging
            logging.basicConfig(level=logging.DEBUG)
        
            # Fetch a single URL.
            response = scrapydo.fetch("http://example.com")
        
            # Crawl a URL with the given callback.
            def parse_page(response):
                yield {
                    'title': response.css('title').extract(),
                    'url': response.url,
                }
                # Extract the href strings; urljoin expects a string, not a selector.
                for href in response.css('a::attr(href)').extract():
                    url = response.urljoin(href)
                    yield Request(url, callback=parse_page)
        
            items = scrapydo.crawl('http://example.com', parse_page)
        
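            # Run an existing spider class (MySpider is assumed to be defined
            # or imported elsewhere). Extra keyword arguments go to the spider.
            spider_args = {'foo': 'bar'}
            items = scrapydo.run_spider(MySpider, **spider_args)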
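        
        The call-``setup``-once requirement comes from the Crochet model: the event
        loop runs in a background thread, and each blocking helper hands work to that
        thread and waits for the result. The sketch below illustrates this pattern with
        the standard library's ``asyncio`` in place of Twisted and Crochet. It is an
        analogy only, not ScrapyDo's implementation; ``run_blocking`` and
        ``fake_fetch`` are hypothetical names introduced for illustration.
        
        .. code:: python
        
            import asyncio
            import threading
        
            _loop = None  # background event loop, created once by setup()
        
        
            def setup():
                """Start the event loop in a daemon thread; repeat calls are no-ops."""
                global _loop
                if _loop is not None:
                    return
                _loop = asyncio.new_event_loop()
                threading.Thread(target=_loop.run_forever, daemon=True).start()
        
        
            def run_blocking(coro, timeout=5.0):
                """Submit a coroutine to the background loop and wait for its result."""
                if _loop is None:
                    raise RuntimeError("call setup() first")
                future = asyncio.run_coroutine_threadsafe(coro, _loop)
                # Blocks the calling thread; raises a timeout error if exceeded.
                return future.result(timeout)
        
        
            async def fake_fetch(url):
                # Stand-in for real network I/O (assumption: no actual request is made).
                await asyncio.sleep(0.01)
                return "response for " + url
        
        
            setup()
            print(run_blocking(fake_fetch("http://example.com")))
        
        The caller sees an ordinary function call that returns a value, while the
        asynchronous work happens on the loop thread.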
        
        
        Available Functions
        ===================
        
        ``scrapydo.setup()``
            Initializes the reactor.
        
        ``scrapydo.fetch(url, spider_cls=DefaultSpider, capture_items=True, return_crawler=False, settings=None, timeout=DEFAULT_TIMEOUT)``
            Fetches a URL and returns the response.
        
        ``scrapydo.crawl(url, callback, spider_cls=DefaultSpider, capture_items=True, return_crawler=False, settings=None, timeout=DEFAULT_TIMEOUT)``
            Crawls a URL with the given callback and returns the scraped items.
        
        ``scrapydo.run_spider(spider_cls, capture_items=True, return_crawler=False, settings=None, timeout=DEFAULT_TIMEOUT, **kwargs)``
            Runs a spider and returns the scraped items.
        
        ``highlight(code, lexer='html', formatter='html', output_wrapper=None)``
            Highlights the given code using Pygments. This function is suitable for use in an IPython notebook.
        
        
        .. _Scrapy: http://scrapy.org
        .. _Crochet: https://github.com/itamarst/crochet
        
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
