Metadata-Version: 2.1
Name: cityhash
Version: 0.3.3.post0
Summary: Python bindings for CityHash and FarmHash
Home-page: https://github.com/escherba/python-cityhash
Author: Alexander [Amper] Marshalov
Author-email: alone.amper+cityhash@gmail.com
Maintainer: Eugene Scherba
Maintainer-email: escherba+cityhash@gmail.com
License: MIT
Download-URL: https://github.com/escherba/python-cityhash/tarball/master/0.3.3.post0
Description: # CityHash/FarmHash
        
        Python wrapper for [FarmHash](https://github.com/google/farmhash) and
        [CityHash](https://github.com/google/cityhash), a family of fast
        non-cryptographic hash functions.
        
        ![Build status](https://github.com/escherba/python-cityhash/actions/workflows/build.yml/badge.svg?branch=master)
        
        [![Latest
        Version](https://img.shields.io/pypi/v/cityhash.svg)](https://pypi.python.org/pypi/cityhash)
        
        [![Downloads](https://img.shields.io/pypi/dm/cityhash.svg)](https://pypi.python.org/pypi/cityhash)
        
        [![License](https://img.shields.io/pypi/l/cityhash.svg)](https://opensource.org/licenses/mit-license)
        
        [![Supported Python
        versions](https://img.shields.io/pypi/pyversions/cityhash.svg)](https://pypi.python.org/pypi/cityhash)
        
        ## Getting Started
        
        To use this package in your program, first install it from PyPI:
        
        ``` bash
        pip install cityhash
        ```
        
        This package exposes Python APIs for CityHash and FarmHash under
        `cityhash` and `farmhash` namespaces, respectively. Each provides 32-,
        64- and 128-bit implementations.
        
        ## Usage Examples
        
        ### Stateless hashing
        
        Usage example for FarmHash:
        
        ``` python
        >>> from farmhash import FarmHash32, FarmHash64, FarmHash128
        >>> FarmHash32("abc")
        1961358185
        >>> FarmHash64("abc")
        2640714258260161385
        >>> FarmHash128("abc")
        76434233956484675513733017140465933893
        ```
        
        ### Hardware-independent fingerprints
        
        Fingerprints are seedless hashes which are guaranteed to be hardware-
        and platform-independent. This can be useful for networking applications
        which require persisting hashed values.
        
        ``` python
        >>> from farmhash import Fingerprint128
        >>> Fingerprint128("abc")
        76434233956484675513733017140465933893
        ```
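        
        When a fingerprint has to be written to disk or sent over the wire, a
        fixed-width byte encoding avoids ambiguity. The sketch below is only an
        illustration using the standard library: it takes the 128-bit value
        shown above as a literal (so it runs without this package installed) and
        round-trips it through 16 bytes. The helper names and the choice of
        big-endian byte order are assumptions, not part of this package's API.
        
        ``` python
        # Value copied from the Fingerprint128("abc") example above.
        FINGERPRINT = 76434233956484675513733017140465933893


        def fingerprint_to_bytes(value, width=16):
            """Encode an unsigned hash value as fixed-width big-endian bytes (hypothetical helper)."""
            return value.to_bytes(width, byteorder="big")


        def fingerprint_from_bytes(data):
            """Decode bytes produced by fingerprint_to_bytes back into an int (hypothetical helper)."""
            return int.from_bytes(data, byteorder="big")


        encoded = fingerprint_to_bytes(FINGERPRINT)
        decoded = fingerprint_from_bytes(encoded)
        print(len(encoded), decoded == FINGERPRINT)
        ```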
        
        ### Incremental hashing
        
        CityHash and FarmHash do not support incremental hashing and thus are
        not ideal for hashing streams. If you require incremental hashing, use
        [MetroHash](https://github.com/escherba/python-metrohash) or
        [xxHash](https://github.com/ifduyue/python-xxhash) instead, both of
        which support it.
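        
        To make the distinction concrete, the sketch below shows what an
        incremental interface looks like. It uses `hashlib.blake2b` from the
        standard library purely as a stand-in (it is a cryptographic hash, not
        one of the libraries recommended above), and the function name
        `hash_stream` is hypothetical. The point is that chunks are fed one at
        a time and the result matches hashing the concatenated data in a single
        call.
        
        ``` python
        import hashlib


        def hash_stream(chunks, digest_size=8):
            """Feed byte chunks to an incremental hasher and return an integer digest."""
            hasher = hashlib.blake2b(digest_size=digest_size)
            for chunk in chunks:
                hasher.update(chunk)  # internal state is carried across calls
            return int.from_bytes(hasher.digest(), byteorder="big")


        chunks = [b"hello, ", b"streaming ", b"world"]
        incremental = hash_stream(chunks)
        one_shot = hash_stream([b"".join(chunks)])
        print(incremental == one_shot)
        ```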
        
        ### Fast hashing of NumPy arrays
        
        The Python [Buffer
        Protocol](https://docs.python.org/3/c-api/buffer.html) allows Python
        objects to expose their data as raw byte arrays to other objects, for
        fast access without copying to a separate location in memory. Among
        others, NumPy is a major framework that supports this protocol.
        
        All hashing functions in this package will read byte arrays from objects
        that expose them via the buffer protocol. Here is an example showing
        hashing of a 3D NumPy array:
        
        ``` python
        >>> import numpy as np
        >>> from farmhash import FarmHash64
        >>> arr = np.zeros((256, 256, 4))
        >>> FarmHash64(arr)
        1550282412043536862
        ```
        
        The arrays need to be contiguous for this to work. To convert a
        non-contiguous array, use NumPy's `ascontiguousarray()` function.
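        
        Contiguity is a concept of the buffer protocol itself, not something
        specific to NumPy. As a small illustration that runs with only the
        standard library, the sketch below uses `memoryview` as a stand-in for
        an array: a strided slice is reported as non-contiguous, and copying it
        yields a contiguous buffer, which is roughly the role
        `ascontiguousarray()` plays for NumPy arrays.
        
        ``` python
        data = bytes(range(8))

        view = memoryview(data)   # exposes the bytes without copying
        strided = view[::2]       # every second byte: a non-contiguous view

        print(view.contiguous)     # True
        print(strided.contiguous)  # False

        # Copying the strided view produces a new, contiguous buffer.
        packed = memoryview(strided.tobytes())
        print(packed.contiguous)   # True
        print(bytes(packed))       # b'\x00\x02\x04\x06'
        ```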
        
        ### SSE4.2 support
        
        On CPUs that support the SSE4.2 instruction set, FarmHash-64 has an
        advantage over its non-optimized version and over vanilla CityHash-64,
        as can be seen below. The numbers were recorded on a 2.4 GHz Intel Xeon
        CPU (E5-2620), and the task was to hash a 512x512x3 NumPy array.
        
        <table style="width:88%;">
        <colgroup>
        <col style="width: 31%" />
        <col style="width: 27%" />
        <col style="width: 27%" />
        </colgroup>
        <thead>
        <tr class="header">
        <th>Method</th>
        <th>Time (64-bit)</th>
        <th>Time (128-bit)</th>
        </tr>
        </thead>
        <tbody>
        <tr class="odd">
        <td>FarmHash / SSE4.2</td>
        <td>373 µs ± 48.3 µs</td>
        <td>480 µs ± 15.3 µs</td>
        </tr>
        <tr class="even">
        <td>FarmHash</td>
        <td>464 µs ± 19.2 µs</td>
        <td>490 µs ± 23.0 µs</td>
        </tr>
        <tr class="odd">
        <td>CityHashCrc / SSE4.2</td>
        <td>N/A</td>
        <td>377 µs ± 21.7 µs</td>
        </tr>
        <tr class="even">
        <td>CityHash</td>
        <td>492 µs ± 16.7 µs</td>
        <td>487 µs ± 22.0 µs</td>
        </tr>
        </tbody>
        </table>
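        
        Timings of this kind depend heavily on the machine, so it may be worth
        repeating the measurement locally. The following is a minimal sketch of
        one way to do that with `timeit`; it is not the benchmark used to
        produce the table. The helper `time_hash` is hypothetical, and
        `zlib.crc32` is used only as a stand-in so the snippet runs without this
        package; any hash callable, such as `FarmHash64`, could be passed
        instead. The buffer has as many elements as a 512x512x3 array, at one
        byte each (an assumption, since the element type used for the table is
        not stated).
        
        ``` python
        import timeit
        import zlib


        def time_hash(func, payload, repeat=5, number=20):
            """Return the best observed time per call, in microseconds (hypothetical helper)."""
            timings = timeit.repeat(lambda: func(payload), repeat=repeat, number=number)
            return min(timings) / number * 1e6


        # 512 * 512 * 3 zero bytes, standing in for the array used in the table.
        payload = bytes(512 * 512 * 3)

        best_us = time_hash(zlib.crc32, payload)
        print(f"crc32 (stand-in): {best_us:.1f} µs per call")
        ```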
        
        SSE4.2 support in CityHash is available under the `cityhashcrc` module.
        To use SSE4.2-optimized CityHash in a platform-independent way, you can
        use the following:
        
        ``` python
        try:
            from cityhashcrc import CityHashCrc128 as CityHash128
        except Exception:
            from cityhash import CityHash128
        ```
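        
        If several alternative implementations need to be tried in order, the
        same fallback idea can be written as a small helper. This is only a
        sketch: `first_available` is a hypothetical name, not something this
        package provides, and it catches only `ImportError` and
        `AttributeError`, which is narrower than the `except Exception` used
        above. The demonstration uses standard-library modules so that it runs
        anywhere; in practice the candidates would be pairs such as
        `("cityhashcrc", "CityHashCrc128")` and `("cityhash", "CityHash128")`.
        
        ``` python
        import importlib


        def first_available(candidates):
            """Return the first importable attribute from (module, name) pairs."""
            for module_name, attr_name in candidates:
                try:
                    module = importlib.import_module(module_name)
                    return getattr(module, attr_name)
                except (ImportError, AttributeError):
                    continue  # try the next candidate
            raise ImportError("none of the candidates could be imported")


        # Stand-in demonstration: the first module does not exist, so the
        # second candidate is returned.
        crc = first_available([("no_such_module_xyz", "missing"), ("zlib", "crc32")])
        print(crc(b"abc"))
        ```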
        
        ## Development
        
        ### Local workflow
        
        For those who want to contribute, here is a quick start using some
        makefile commands:
        
        ``` bash
        git clone https://github.com/escherba/python-cityhash.git
        cd python-cityhash
        make env           # create a Python virtualenv
        make test          # run Python tests
        make cpp-test      # run C++ tests
        make shell         # enter IPython shell
        ```
        
        The Makefiles provided have self-documenting targets. To find out which
        targets are available, type:
        
        ``` bash
        make help
        ```
        
        ### Distribution
        
        The wheels are built using
        [cibuildwheel](https://cibuildwheel.readthedocs.io/) and are published
        to PyPI by GitHub Actions via [this
        workflow](.github/workflows/publish.yml). The wheels contain compiled
        binaries and are available for the following platforms: windows-amd64,
        ubuntu-x86, linux-x86\_64, linux-aarch64, and macosx-x86\_64.
        
        ## See Also
        
        For other fast non-cryptographic hash functions available as Python
        extensions, see
        [MetroHash](https://github.com/escherba/python-metrohash),
        [MurmurHash](https://github.com/hajimes/mmh3), and
        [xxHash](https://github.com/ifduyue/python-xxhash).
        
        ## Authors
        
        The original Python bindings were written by Alexander \[Amper\]
        Marshalov, then were largely rewritten for more flexibility by Eugene
        Scherba. The CityHash and FarmHash algorithms and their C++
        implementation are by Google.
        
        ## License
        
        This software is licensed under the [MIT
        License](http://www.opensource.org/licenses/mit-license). See the
        included LICENSE file for details.
        
Keywords: google,hash,hashing,cityhash,farmhash,murmurhash
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: C++
Classifier: Programming Language :: Cython
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Distributed Computing
Description-Content-Type: text/markdown
