Metadata-Version: 1.0
Name: jparser
Version: 0.0.14
Summary: A robust parser that extracts the title, content, and images from HTML pages
Home-page: UNKNOWN
Author: Sun, Junyi
Author-email: UNKNOWN
License: MIT
Description: 
        Usage Example:
        ^^^^^^^^^^^^^^^^^^^^^
        ::
        
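            try:
                from urllib2 import urlopen  # Python 2
            except ImportError:
                from urllib.request import urlopen  # Python 3
            from jparser import PageModel
            # Fetch the example page and decode it as GB18030.
            html = urlopen("http://news.sohu.com/20170512/n492734045.shtml").read().decode('gb18030')
            # Build the page model, then extract the title and content blocks.
            pm = PageModel(html)
            result = pm.extract()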
            
            print("==title==")
            print(result['title'])
            print("==content==")
            # Handle the two item types used in this example: 'text' and 'image'.
            for x in result['content']:
                if x['type'] == 'text':
                    print(x['data'])
                elif x['type'] == 'image':
                    print("[IMAGE] %s" % x['data']['src'])
            
        
Platform: UNKNOWN
