Metadata-Version: 2.1
Name: noizze-crawler
Version: 12
Summary: A web page crawler which returns (title, og:image, og:description).
Home-page: https://github.com/YUChoe/noizze_crawler
Author: Tom YU Choe
Author-email: yonguk.choe@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: System Administrators
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.6
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Description-Content-Type: text/markdown
Requires-Dist: beautifulsoup4 (>=4.5.0)

[![Latest Version](https://img.shields.io/pypi/v/noizze-crawler.svg)](https://pypi.org/project/noizze-crawler/)
[![PEP8](https://img.shields.io/badge/code%20style-pep8-orange.svg)](https://www.python.org/dev/peps/pep-0008/)


# NOIZZE Crawler

A web page crawler PyPI Package which returns (title(og, head), image(og, meta), description(og, meta)).

## Dependency
* BeautifulSoup4

## Installation
Run the folowing to install:

```shell
pip install noizze-crawler
```

## Usage

```python
import noizze_crawler as nc
import sys


if __name__ == '__main__':
    url = 'https://dvdprime.com/g2/bbs/board.php?bo_table=comm&wr_id=20525678'

    try:
        (title, desc, image_url, html) = nc.crawler(url)

    except nc.HostNotFound as e:
        print("Host Not Found")
        sys.exit(1)
    except nc.HTTPError as e:
        print("HTTP {}".format(e))
        sys.exit(1)

    print(title, desc, image_url)  # html
```

## ChangeLog
* v11: Fixed bugs #3 #8 
* v10: Fixed bugs
* v9: Youtube crawler with Google API #4
* v8: Changed PyPI dependency - bs4
* v7: PEP8 passed codes
* v6: HostNotFound, HTTPError exceptions

