Metadata-Version: 2.0
Name: filehash
Version: 0.1.dev1
Summary: Module to wrap around hashlib and facilitate generating checksums / hashes of files and directories.
Home-page: https://github.com/leonidessaguisagjr/filehash
Author: Leonides T. Saguisag Jr.
Author-email: leonidessaguisagjr@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Utilities
Description-Content-Type: text/x-rst

``filehash``
============

Python module to facilitate calculating the checksum or hash of a file.  Tested against Python 2.7, Python 3.6, PyPy 2.7 and PyPy 3.5.

``FileHash`` class
------------------

The ``FileHash`` class wraps the ``hashlib`` and ``zlib`` modules and provides the following methods:

- ``hash_file(filename)`` - Calculate the file hash for a single file.  Returns a string with the hex digest.
- ``hash_dir(path, pattern='*')`` - Calculate the file hashes for an entire directory.  Returns a list of tuples where each tuple contains the filename and the calculated hash.
- ``verify_sfv(sfv_filename)`` - Reads the specified SFV (Simple File Verification) file and calculates the CRC32 checksum for the files listed, comparing the calculated CRC32 checksums against the specified expected checksums.  Returns a list of tuples where each tuple contains the filename and a boolean value indicating if the calculated CRC32 checksum matches the expected CRC32 checksum.  To find out more about SFV files, see the `Simple file verification entry in Wikipedia <https://en.wikipedia.org/wiki/Simple_file_verification>`_.
- ``verify_checksums(checksum_filename)`` - Reads the specified file and calculates the hashes for the files listed, comparing the calculated hashes against the specified expected hashes.  Returns a list of tuples where each tuple contains the filename and a boolean value indicating if the calculated hash matches the expected hash.
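
As a rough illustration of the input ``verify_sfv`` works with, the sketch below parses one line of an SFV file.  It assumes the common SFV layout, where each entry is a filename followed by whitespace and a hexadecimal CRC32 value, and lines starting with ``;`` are comments.  ``parse_sfv_line`` is a hypothetical helper, not part of ``filehash``, and the module's own parser may behave differently::

   def parse_sfv_line(line):
       """Return (filename, expected_crc32), or None for comments and blank lines."""
       line = line.strip()
       # Assumption: blank lines and lines starting with ';' carry no entry.
       if not line or line.startswith(";"):
           return None
       # Split on the last run of whitespace so filenames may contain spaces.
       filename, expected_crc32 = line.rsplit(None, 1)
       # Normalize case so later comparisons are case-insensitive.
       return filename, expected_crc32.upper()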

The checksum file is expected to be a plain text file in which each line is an entry formatted as follows::

   {hash}[SPACE][ASTERISK]{filename}

This is the format used by the ``sha1sum`` family of tools when generating checksum files.  Here is an example generated by ``sha1sum``::

   f7ef3b7afaf1518032da1b832436ef3bbfd4e6f0 *lorem_ipsum.txt
   03da86258449317e8834a54cf8c4d5b41e7c7128 *lorem_ipsum.zip
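
A line in this format can be split into its hash and filename roughly as sketched below.  ``parse_checksum_line`` is a hypothetical helper, not part of ``filehash``, and treating the asterisk as optional is an assumption made here for leniency::

   def parse_checksum_line(line):
       """Split one checksum-file line into (expected_hash, filename)."""
       # Remove only the line ending so the filename itself is left intact.
       line = line.rstrip("\r\n")
       # The hash ends at the first space; everything after it is the filename part.
       expected_hash, _, filename = line.partition(" ")
       # Assumption: tolerate an extra separating space and a missing asterisk.
       filename = filename.lstrip(" ")
       if filename.startswith("*"):
           filename = filename[1:]
       return expected_hash, filename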

The ``FileHash`` constructor has two optional arguments:

- ``hash_algorithm='sha256'`` - Specifies the hashing algorithm to use.  ``hashlib.algorithms_available`` gives the names of the hashing algorithms that ``hashlib`` supports on the current interpreter.  In addition to those, ``crc32`` checksums are supported by way of ``zlib.crc32``.  Defaults to SHA256.
- ``chunk_size=4096`` - Integer specifying the chunk size (in bytes) to use when reading the file.  This is useful when processing very large files, since it avoids reading the entire file into memory at once.  Defaults to 4096 bytes.
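
To show what the two arguments control, here is a minimal sketch of chunked hashing written directly against ``hashlib`` and ``zlib``.  The function names ``hash_in_chunks`` and ``crc32_in_chunks`` are hypothetical, and this is not necessarily how ``FileHash`` is implemented internally; the uppercase 8-digit CRC32 formatting is also an assumption::

   import hashlib
   import zlib


   def hash_in_chunks(filename, hash_algorithm="sha256", chunk_size=4096):
       """Return the hex digest of a file, reading chunk_size bytes at a time."""
       hasher = hashlib.new(hash_algorithm)
       with open(filename, "rb") as f:
           # iter() with a sentinel stops once read() returns empty bytes (EOF).
           for chunk in iter(lambda: f.read(chunk_size), b""):
               hasher.update(chunk)
       return hasher.hexdigest()


   def crc32_in_chunks(filename, chunk_size=4096):
       """Return the CRC32 of a file as eight uppercase hex digits."""
       checksum = 0
       with open(filename, "rb") as f:
           for chunk in iter(lambda: f.read(chunk_size), b""):
               # zlib.crc32 takes the running checksum as its second argument.
               checksum = zlib.crc32(chunk, checksum)
       # Mask to an unsigned 32-bit value; Python 2 can return a signed integer.
       return "{:08X}".format(checksum & 0xFFFFFFFF)

Either way, only one chunk is held in memory at a time, so memory use stays bounded regardless of file size.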

Example usage
-------------

The library can be used as follows::

   >>> import os
   >>> from filehash import FileHash
   >>> md5hasher = FileHash('md5')
   >>> md5hasher.hash_file("./testdata/lorem_ipsum.txt")
   '72f5d9e3a5fa2f2e591487ae02489388'
   >>> sha1hasher = FileHash('sha1')
   >>> sha1hasher.hash_dir("./testdata", "*.zip")
   [FileHashResult(filename='lorem_ipsum.zip', hash='03da86258449317e8834a54cf8c4d5b41e7c7128')]
   >>> sha512hasher = FileHash('sha512')
   >>> os.chdir("./testdata")
   >>> sha512hasher.verify_checksums("./hashes.sha512")
   [VerifyHashResult(filename='lorem_ipsum.txt', hashes_match=True), VerifyHashResult(filename='lorem_ipsum.zip', hashes_match=True)]
   >>> crc32hasher = FileHash('crc32')
   >>> crc32hasher.verify_sfv("./lorem_ipsum.sfv")
   [VerifyHashResult(filename='lorem_ipsum.txt', hashes_match=True), VerifyHashResult(filename='lorem_ipsum.zip', hashes_match=True)]

License
-------

This is released under an MIT license.  See the ``LICENSE`` file in this repository for more information.


