Metadata-Version: 1.1
Name: sketchtml
Version: 0.0.2
Summary: Helper library to experiment with HTML fingerprinting.
Home-page: https://github.com/redapple/sketchtml
Author: Paul Tremberth
Author-email: paul.tremberth@gmail.com
License: MIT license
Description: =========
        SketcHTML
        =========
        
        
        .. image:: https://img.shields.io/pypi/v/sketchtml.svg
                :target: https://pypi.python.org/pypi/sketchtml
        
        .. image:: https://img.shields.io/travis/redapple/sketchtml.svg
                :target: https://travis-ci.org/redapple/sketchtml
        
        .. image:: https://readthedocs.org/projects/sketchtml/badge/?version=latest
                :target: https://sketchtml.readthedocs.io/en/latest/?badge=latest
                :alt: Documentation Status
        
        .. image:: https://pyup.io/repos/github/redapple/sketchtml/shield.svg
             :target: https://pyup.io/repos/github/redapple/sketchtml/
             :alt: Updates
        
        
        Helper library to experiment with HTML fingerprinting.
        
        
        * Free software: MIT license
        * Documentation: https://sketchtml.readthedocs.io.
        
        
        Features
        --------
        
        * TODO
        
        References
        ----------
        
        * `Locality Sensitive Hashing for Scalable Structural Classification and Clustering of Web Documents (2013)
          <https://www.researchgate.net/publication/256004161_Locality_Sensitive_Hashing_for_Scalable_Structural_Classification_and_Clustering_of_Web_Documents>`__
        * `Enforcing k-anonymity in Web Mail Auditing -- Mail-Hash (2016) <http://dl.acm.org/citation.cfm?id=2835803>`__
          (`patent <http://www.freepatentsonline.com/y2017/0169251.html>`__)
        * `Structural Clustering of Machine-Generated Mail (2016) <http://dl.acm.org/citation.cfm?id=2983350>`__
        * `Web-Scale Information Extraction with Vertex (2011) <http://dl.acm.org/citation.cfm?id=2005642>`__
        
        Credits
        -------
        
        This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.
        
        .. _Cookiecutter: https://github.com/audreyr/cookiecutter
        .. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage
        
        
        
        =======
        History
        =======
        
        0.0.2 (2017-06-23)
        ------------------
        
        * Add lxml.etree.iterparse-based tag sequence iterator
        * Add implementation for `MailHash`_
        * Add implementation for `Stripped-XPath-lists`_
        
        .. _MailHash: http://dl.acm.org/citation.cfm?id=2835803
        .. _Stripped-XPath-lists: http://dl.acm.org/citation.cfm?id=2983350
        
        
        0.0.1 (2017-06-19)
        ------------------
        
        * Add Hachenberg & Gottron HTML tag fingerprint based on LZW
        
Keywords: sketchtml
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
