Metadata-Version: 2.1
Name: relplot
Version: 1.0
Summary: Compute and plot reliability diagrams based on calibration distance.
Author: Preetum Nakkiran, Jarosław Błasiok
License: Copyright (C) 2023 Apple Inc. All Rights Reserved.
        
        IMPORTANT:  This Apple software is supplied to you by Apple
        Inc. ("Apple") in consideration of your agreement to the following
        terms, and your use, installation, modification or redistribution of
        this Apple software constitutes acceptance of these terms.  If you do
        not agree with these terms, please do not use, install, modify or
        redistribute this Apple software.
        
        In consideration of your agreement to abide by the following terms, and
        subject to these terms, Apple grants you a personal, non-exclusive
        license, under Apple's copyrights in this original Apple software (the
        "Apple Software"), to use, reproduce, modify and redistribute the Apple
        Software, with or without modifications, in source and/or binary forms;
        provided that if you redistribute the Apple Software in its entirety and
        without modifications, you must retain this notice and the following
        text and disclaimers in all such redistributions of the Apple Software.
        Neither the name, trademarks, service marks or logos of Apple Inc. may
        be used to endorse or promote products derived from the Apple Software
        without specific prior written permission from Apple.  Except as
        expressly stated in this notice, no other rights or licenses, express or
        implied, are granted by Apple herein, including but not limited to any
        patent rights that may be infringed by your derivative works or by other
        works in which the Apple Software may be incorporated.
        
        The Apple Software is provided by Apple on an "AS IS" basis.  APPLE
        MAKES NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION
        THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY AND FITNESS
        FOR A PARTICULAR PURPOSE, REGARDING THE APPLE SOFTWARE OR ITS USE AND
        OPERATION ALONE OR IN COMBINATION WITH YOUR PRODUCTS.
        
        IN NO EVENT SHALL APPLE BE LIABLE FOR ANY SPECIAL, INDIRECT, INCIDENTAL
        OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
        SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
        INTERRUPTION) ARISING IN ANY WAY OUT OF THE USE, REPRODUCTION,
        MODIFICATION AND/OR DISTRIBUTION OF THE APPLE SOFTWARE, HOWEVER CAUSED
        AND WHETHER UNDER THEORY OF CONTRACT, TORT (INCLUDING NEGLIGENCE),
        STRICT LIABILITY OR OTHERWISE, EVEN IF APPLE HAS BEEN ADVISED OF THE
        POSSIBILITY OF SUCH DAMAGE.
Project-URL: Homepage, https://github.com/apple/ml-calibration
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: pandas
Requires-Dist: matplotlib
Requires-Dist: scikit-learn
Requires-Dist: seaborn
Requires-Dist: deprecation

# relplot: Principled Reliability Diagrams

`relplot` is a Python package for plotting reliability diagrams and measuring calibration error
in a theoretically principled way.
The package generates reliability diagrams like the one shown below
(reproduced in [notebooks/figure1.ipynb](./notebooks/figure1.ipynb)):
![](imgs/hero.png)

The density of predictions $f_i \in [0, 1]$ is visualized as the
thickness of the red regression line, and the gray band shows
bootstrapped confidence bands around the regression.

The reliability diagram is obtained by kernel smoothing with a careful choice of parameters, and the associated calibration measure is called the *SmoothECE* (abbreviated smECE).
The SmoothECE is roughly equal to the standard ECE of the smoothed reliability diagram.
The reliability diagram for a toy dataset of 8 points is shown below;
more theoretical details are available in the accompanying preprint
*Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing.*

![](imgs/smoothing.png)
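
As a rough illustration of the idea (not the package's implementation), the sketch below kernel-smooths the $(f_i, y_i)$ pairs with a Gaussian kernel of fixed bandwidth and reports a density-weighted average of the smoothed residual. The function name `smoothed_reliability_sketch`, the bandwidth `sigma=0.05`, and the grid size are assumptions made for this example; `relplot.smECE` chooses its parameters more carefully and may handle the boundaries of $[0, 1]$ differently.

```python
import numpy as np

def smoothed_reliability_sketch(f, y, sigma=0.05, num_points=201):
    """Kernel-smooth (f, y) pairs on a grid over [0, 1] (illustrative only).

    Returns the grid, the smoothed curve mu(t) ~ E[y | f = t], and the
    density-weighted average of the smoothed residual |E[y - f | f = t]|.
    """
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.linspace(0.0, 1.0, num_points)
    # weights[t, i]: Gaussian kernel between grid point t and prediction f_i.
    weights = np.exp(-0.5 * ((grid[:, None] - f[None, :]) / sigma) ** 2)
    mass = weights.sum(axis=1)  # unnormalized density of predictions per grid point
    # Nadaraya-Watson estimate; the floor avoids 0/0 far from any prediction.
    mu = (weights @ y) / np.maximum(mass, np.finfo(float).tiny)
    # sum_t density(t) * |residual(t)|, with density(t) = mass(t) / sum(mass).
    ce = float(np.abs(weights @ (y - f)).sum() / mass.sum())
    return grid, mu, ce
```

Smoothing the residual $y_i - f_i$ rather than comparing the curve to the diagonal on the grid keeps the error at zero when predictions are calibrated on average at a single point, which a naive grid comparison would not.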


## Installation

Install with pip:
```sh
> pip install relplot
```

Or, clone the repo and install with:
```sh
> cd relplot
> pip install .
```

## Getting Started 

Basic usage (on sample data):

```python
import relplot as rp
import numpy as np

## generate toy data (miscalibrated)
N = 5000
f = np.random.rand(N)
y = (np.random.rand(N) > 1-(f + 0.2*np.sin(2*np.pi*f)))*1

## compute calibration error (smECE) and plot
print('calibration error:', rp.smECE(f, y))
fig = rp.rel_diagram(f, y)
fig.show()
```
This is reproduced in [notebooks/demo.ipynb](notebooks/demo.ipynb).

For more control, one can compute the calibration data with `relplot.prepare_rel_diagram`, and then plot it later with `relplot.plot_rel_diagram`.
For example:
```python
import matplotlib.pyplot as plt
...
diagram = rp.prepare_rel_diagram(f, y) # compute calibration data (dictionary)
print('calibration error:', diagram['ce'])
plt.plot(diagram['mesh'], diagram['mu']) # plot the calibration curve manually
fig, ax = rp.plot_rel_diagram(diagram) # plot the diagram in a new figure
```


### Data Format
Methods expect inputs in the form
of a 1D array of predicted probabilities (f) and a 1D array of binary labels (y),
where $f_i \in [0, 1]$ and $y_i \in \{0, 1\}$.
We then consider the calibration of the
distribution $(f_i, y_i)$ of prediction-outcome pairs.
This package primarily considers the binary outcome setting, but can be used
to measure multi-class confidence calibration as shown below.
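
The expected format can be checked before calling into the package. The helper below is a minimal sketch that is not part of `relplot`; its name, `check_binary_calibration_inputs`, is hypothetical.

```python
import numpy as np

def check_binary_calibration_inputs(f, y):
    """Validate that f is a 1D array in [0, 1] and y is a 1D array of 0/1 labels."""
    f = np.asarray(f, dtype=float)
    y = np.asarray(y)
    if f.ndim != 1 or y.ndim != 1:
        raise ValueError("f and y must both be 1D arrays")
    if f.shape != y.shape:
        raise ValueError("f and y must have the same length")
    # Written so that NaN entries fail the check as well.
    if not np.all((f >= 0) & (f <= 1)):
        raise ValueError("predictions f must lie in [0, 1]")
    if not np.all(np.isin(y, [0, 1])):
        raise ValueError("labels y must be binary (0 or 1)")
    return f, y.astype(int)
```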

### Multi-class Calibration
In the multi-class setting, *confidence calibration* can be measured by expressing it as the binary
calibration of the distribution on (confidence, accuracy) pairs.
A convenience function for this common use case is provided:
```python
# f: [N, C] array of logits over C classes
# y: [N, 1] array of true class labels
conf, acc = relplot.multiclass_logits_to_confidences(f, y) # reduce to binary setting
relplot.rel_diagram(f=conf, y=acc) # plot confidence calibration diagram
relplot.smECE(f=conf, y=acc) # compute smECE of confidence calibration
```
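
To make the reduction concrete, here is a NumPy sketch of how (confidence, accuracy) pairs are conventionally formed. It assumes `labels` holds the reference class of each example. The function name is hypothetical, and the package's own `multiclass_logits_to_confidences` may differ in its details.

```python
import numpy as np

def logits_to_confidence_accuracy(logits, labels):
    """Reduce multi-class predictions to binary (confidence, accuracy) pairs."""
    logits = np.asarray(logits, dtype=float)   # shape [N, C]
    labels = np.asarray(labels).reshape(-1)    # shape [N] (accepts [N, 1] too)
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    conf = probs.max(axis=1)                   # probability of the top class
    acc = (probs.argmax(axis=1) == labels).astype(int)  # 1 if top class is correct
    return conf, acc
```

Each `conf` value plays the role of $f_i$ and each `acc` value the role of $y_i$ in the binary setting described above.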

### Customization
The plot made by `relplot.rel_diagram` can be customized in various ways, as shown below.
See this notebook for examples of more options: [notebooks/figure1.ipynb](./notebooks/figure1.ipynb)

![](imgs/simple_plot.png)


## Additional Notebooks and Features
- The header image (Figure 1 of the paper) is generated in [notebooks/figure1.ipynb](./notebooks/figure1.ipynb)
- The experiments in the paper are reproduced in [notebooks/paper_experiments.ipynb](./notebooks/paper_experiments.ipynb)
- `relplot.metrics` contains implementations of various alternative calibration measures, including binnedECE and Laplace kernel calibration. These are in addition to the recommended calibration measure, SmoothECE (`relplot.smECE`).
- `relplot.rel_diagram_binned` plots the "binned" reliability diagram. It is not recommended for use and is included only for comparison.
- `relplot.config.use_tex_fonts` can be set to True if you have $\LaTeX$ installed.
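
For reference, the conventional equal-width binned ECE that such comparisons are usually made against can be sketched as follows. This is an illustrative version with a hypothetical name and a default of 10 bins; it is not necessarily what `relplot.metrics` implements.

```python
import numpy as np

def binned_ece_sketch(f, y, num_bins=10):
    """Equal-width binned ECE: sum over bins of (n_b / N) * |mean(y) - mean(f)|."""
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assign each prediction to a bin; f == 1.0 is placed in the last bin.
    bins = np.minimum((f * num_bins).astype(int), num_bins - 1)
    sum_f = np.bincount(bins, weights=f, minlength=num_bins)
    sum_y = np.bincount(bins, weights=y, minlength=num_bins)
    # (n_b / N) * |mean_y_b - mean_f_b| simplifies to |sum_y_b - sum_f_b| / N.
    return float(np.abs(sum_y - sum_f).sum() / len(f))
```

The result depends on `num_bins`, which is one reason the binned estimate is offered only as a point of comparison.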






## Citation
If you use relplot in your work, please consider citing:


```bibtex
@misc{relplot2023,
      title={Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing},
      author={Jarosław Błasiok and Preetum Nakkiran},
      year={2023},
}
```
