Metadata-Version: 2.1
Name: pyhtnorm
Version: 1.0.0
Summary: Fast Simulation of Hyperplane-Truncated Multivatiate Normal Distributions
Home-page: https://github.com/zoj613/htnorm/
License: BSD-3-Clause
Keywords: statistical sampling,multivariate gaussian distribution,hyperplane truncated multivariate normal,structured precision multivariate normal,sampling distribution,posterior sampling
Author: Zolisa Bleki
Requires-Python: >=3.6.1,<4.0.0
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: numpy (>=1.17.0,<2.0.0)
Project-URL: Repository, https://github.com/zoj613/htnorm/
Description-Content-Type: text/markdown

# htnorm

This repo provides a C implementation of a fast and exact sampling algorithm for a 
multivariate normal distribution (MVN) truncated on a hyperplane as described [here][1]

this repo implements the following from the paper:

- efficient Sampling from a MVN truncated on a hyperplane: 

    ![hptrunc](https://latex.codecogs.com/svg.latex?%5Cmathbf%7Bx%7D%20%5Csim%20%5Cmathcal%7BN%7D_%7B%5Cmathcal%7BS%7D%7D%28%5Cmathbf%7B%5Cmu%7D%2C%20%5Cmathbf%7B%5CSigma%7D%29%3B%20%5Chspace%7B2mm%7D%20%5Cmathcal%7BS%7D%20%3D%20%5C%7B%5Cmathbf%7Bx%7D%20%3A%20%5Cmathbf%7BG%7D%5Cmathbf%7Bx%7D%20%3D%20%5Cmathbf%7Br%7D%5C%7D%2C%20%5Cmathbf%7BG%7D%20%5Cin%20%5Cmathcal%7BR%7D%5E%7Bk_2%20%5Ctimes%20k%7D%2C%20rank%28%5Cmathbf%7BG%7D%29%20%3D%20k_2%20%3C%20k)

- efficient sampling from a MVN with a stuctured precision matrix that is a sum of an invertible matrix and a row rank matrix: 

    ![struc](https://latex.codecogs.com/svg.latex?%5Cmathbf%7Bx%7D%20%5Csim%20%5Cmathcal%7BN%7D%5C%5B%5Cmathbf%7B%5Cmu%7D%2C%20%28%5Cmathbf%7BA%7D%20&plus;%20%5Cmathbf%7B%5CPhi%7D%5ET%5Cmathbf%7B%5COmega%7D%5Cmathbf%7B%5CPhi%7D%29%5E%7B-1%7D%5C%5D%3B%20%5Chspace%7B2mm%7D%20%5Cmathbf%7B%5CPhi%7D%20%5Cin%20%5Cmathcal%7BR%7D%5E%7Bn%20%5Ctimes%20p%7D%2C%20%5Cmathbf%7B%5COmega%7D%20%5Cin%20%5Cmathcal%7BR%7D%5E%7Bn%20%5Ctimes%20n%7D%2C%20%5Cmathbf%7BA%7D%20%5Cin%20%5Cmathcal%7BR%7D%5E%7Bp%20%5Ctimes%20p%7D)

- efficent sampling frfom a MVN with a structured precision and mean:

    ![strucmean](https://latex.codecogs.com/svg.latex?%5Cmathbf%7Bx%7D%20%5Csim%20%5Cmathcal%7BN%7D%5CBig%5C%5B%28%5Cmathbf%7BA%7D%20&plus;%20%5Cmathbf%7B%5CPhi%7D%5ET%5Cmathbf%7B%5COmega%7D%5Cmathbf%7B%5CPhi%7D%29%5E%7B-1%7D%5Cmathbf%7B%5CPhi%7D%5ET%5Cmathbf%7B%5COmega%7D%5Cmathbf%7Bt%7D%2C%20%28%5Cmathbf%7BA%7D%20&plus;%20%5Cmathbf%7B%5CPhi%7D%5ET%5Cmathbf%7B%5COmega%7D%5Cmathbf%7B%5CPhi%7D%29%5E%7B-1%7D%5CBig%5C%5D%3B%20%5Chspace%7B2mm%7D%20%5Cmathbf%7B%5COmega%7D%20%5Cin%20%5Cmathcal%7BR%7D%5E%7Bn%20%5Ctimes%20n%7D%2C%20%5Cmathbf%7BA%7D%20%5Cin%20%5Cmathcal%7BR%7D%5E%7Bp%20%5Ctimes%20p%7D)

The algorithms implemented have the following practical applications:
- Topic models when unknown parameters can be interpreted as fractions.
- Admixture models
- discrete graphical models
- Sampling from posterior distribution of an Intrinsic Conditional Autoregressive prior [icar][8]
- Sampling from posterior conditional distributions of various bayesian regression problems.


## Dependencies

- A C compiler that implements the C99 standard or later
- An installation of `LAPACK`.

## Usage

Building a shared library of `htnorm` can be done with the following:
```bash
# optionally set path to LAPACK shared library
$ export LIBS_DIR="some/path/to/lib/"
$ make lib
```
Afterwards the shared library will be found in a `lib/` directory of the project root,
and the library can be linked dynamically via `-lhtnorm`.

The puplic API exposes the samplers through the function declarations
```C
 int htn_hyperplane_truncated_mvn(rng_t* rng, const ht_config_t* conf, double* out);
 int htn_structured_precision_mvn(rng_t* rng, const sp_config_t* conf, double* out);
```

The details of the parameters are documented in ther header files ["htnorm.h"][4].

Random number generation is done using [PCG64][2] or [Xoroshiro128plus][3] bitgenerators. 
The API allows using a custom generator, and the details are documented in the header file 
["rng.h"][5].

## Examples
```C
#include "htnorm.h"

int main (void)
{
    ...
    // instantiate a random number generator
    rng_t* rng = rng_new_pcg64_seeded(12345);
    ht_config_t config;
    init_ht_config(&config, ...);
    double* out = ...; // array to store the samples
    int res = htn_hyperplane_truncated_mvn(rng, &config, out);
    // res contains a number that indicates whether sampling failed or not.
    ...
    // finally free the RNG pointer at some point
    rng_free(rng);
    ...
    return 0;
}
```

## Python API

### Dependencies
- NumPy >= 1.17

A high level python interface to the library is also provided. Linux users can 
install it using wheels via pip (thus not needing to worry about availability of C libraries),
```bash
pip install -U pyhtnorm
```
Wheels are not provided for MacOS. To install via pip, one can run the following commands:
```bash
#set the path to LAPACK shared library
export LIBS_DIR=<some directory>
pip install -U pyhtnorm
```
Alternatively, one can install it from source. This requires an installation of [poetry][7] and the following shell commands:

```bash
$ git clone https://github.com/zoj613/htnorm.git
$ cd htnorm/
$ poetry install
# add htnorm to python's path
$ export PYTHONPATH=$PWD:$PYTHONPATH
```

Below is an example of how to use htnorm in python to sample from a multivariate
gaussian truncated on the hyperplane ![sumzero](https://latex.codecogs.com/svg.latex?%5Cmathbf%7B1%7D%5ET%5Cmathbf%7Bx%7D%20%3D%200) (i.e. making sure the sampled values sum to zero). The python
API is such that the code can be easily integrated into other existing libraries.
Since `v1.0.0`, it supports passing a `numpy.random.Generator` instance.
Thus, one can reuse the same generator without having to declare one specifically for `htnorm`.

```python
from pyhtnorm import hyperplane_truncated_mvnorm
import numpy as np

rng = np.random.default_rng()

# generate example input
k1, k2 = 1000, 1
temp = rng.random((k1, k1))
cov = temp @ temp.T
G = np.ones((k2, k1))
r = np.zeros(k2)
mean = rng.random(k1)

# passing `random_state` is optional. If the argument is not used, a fresh
# random generator state is instantiated internally using system entropy.
o = hyperplane_truncated_mvnorm(mean, cov, G, r, random_state=rng)
print(o.sum())  # verify if sampled values sum to zero
# alternatively one can pass an array to store the results in
hyperplane_truncated_mvnorm(mean, cov, G, r, out=o)
```

For more information about the function's arguments, refer to its docstring.

A pure numpy implementation is demonstrated in this [example script][9].


## R API

One can also access the API in R. To install the package, use one the following 
commands:
```R
devtools::install_github("zoj613/htnorm")
pak::pkg_install("zoj613/htnorm")
```
Below is an R translation of the above python example:

```R
library(htnorm)

# make dummy data
mean <- rnorm(1000)
cov <- matrix(rnorm(1000 * 1000), ncol=1000)
cov <- cov %*% t(cov)
G <- matrix(rep(1, 1000), ncol=1000)
r <- c(0)
# initialize the Generator instance
rng <- HTNGenerator(seed=12345, gen="pcg64")
samples <- rng$hyperplane_truncated_mvnorm(mean, cov, G, r)
#verify if sampled values sum to zero
sum(samples)

# alternatively one can pass a vector to store the results in
out <- rep(0, 1000)
rng$hyperplane_truncated_mvnorm(mean, cov, G, r, out = out)
sum(out)  #verify
```

## Licensing

`htnorm` is free software made available under the BSD-3 License. For details
see the [LICENSE][6] file.


## References
- Cong, Yulai; Chen, Bo; Zhou, Mingyuan. Fast Simulation of Hyperplane-Truncated 
   Multivariate Normal Distributions. Bayesian Anal. 12 (2017), no. 4, 1017--1037. 
   doi:10.1214/17-BA1052.
- Bhattacharya, A., Chakraborty, A., and Mallick, B. K. (2016). 
  “Fast sampling with Gaussian scale mixture priors in high-dimensional regression.” 
  Biometrika, 103(4):985. 


[1]: https://projecteuclid.org/euclid.ba/1488337478
[2]: https://www.pcg-random.org/
[3]: https://en.wikipedia.org/wiki/Xoroshiro128%2B
[4]: ./include/htnorm.h 
[5]: ./include/rng.h
[6]: ./LICENSE
[7]: https://python-poetry.org/docs/pyproject/
[8]: https://www.sciencedirect.com/science/article/abs/pii/S1877584517301600
[9]: ./examples/numpy_implementation.py

