Metadata-Version: 2.1
Name: medplexity
Version: 0.1.2
Summary: medplexity helps with evaluation of LLMs for medical use-cases.
License: MIT
Author: MaksymPetyak
Author-email: petyak.mi@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Provides-Extra: google-search-results
Provides-Extra: langchain
Requires-Dist: datasets (>=2.14.5,<3.0.0)
Requires-Dist: google-search-results (>=2.4.2,<3.0.0) ; extra == "google-search-results"
Requires-Dist: langchain (>=0.0.306,<0.0.307) ; extra == "langchain"
Requires-Dist: openai (>=0.28.0,<0.29.0)
Requires-Dist: pydantic (>=2.3.0,<3.0.0)
Description-Content-Type: text/markdown

## Medplexity 

[![Documentation Status](https://readthedocs.org/projects/medplexity/badge/?version=latest)](https://medplexity.readthedocs.io/en/latest/?badge=latest)
[![Discord](https://dcbadge.vercel.app/api/server/jUKkgqVzQ?style=flat&compact=true)](https://discord.gg/jUKkgqVzQ)
![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)


Medplexity is a Python library for evaluating LLMs in medical applications.

<img src="images/medplexity-logo.png" alt="medplexity-logo" width="512px" style="border-radius: 16px;"/>

It is designed to help with the following tasks:
- Evaluating the performance of LLMs on existing medical datasets and benchmarks, e.g. MedQA and PubMedQA.
- Comparing the performance of different prompts, models, and architectures.
- Exporting evaluation results for visualisation and further analysis.

The goal is to help answer questions like "How much better would GPT-4 perform given a vector database to load certain resources?".
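
As a rough illustration of the kind of comparison described above, the sketch below scores two prompt templates on a toy multiple-choice set. It does not use medplexity's API: `MultipleChoiceQuestion`, `build_prompt`, `accuracy`, and `stub_model` are hypothetical names made up for this example, and the two questions are toy data rather than items from any benchmark. See the documentation for the library's actual interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MultipleChoiceQuestion:
    """Hypothetical container for one multiple-choice item."""
    question: str
    options: dict[str, str]  # option letter -> option text
    answer: str              # letter of the correct option


def build_prompt(template: str, item: MultipleChoiceQuestion) -> str:
    """Fill a prompt template with the question and its lettered options."""
    options = "\n".join(f"{letter}. {text}" for letter, text in item.options.items())
    return template.format(question=item.question, options=options)


def accuracy(
    model: Callable[[str], str],
    template: str,
    items: list[MultipleChoiceQuestion],
) -> float:
    """Fraction of items where the first letter of the reply matches the answer."""
    if not items:
        return 0.0
    correct = 0
    for item in items:
        reply = model(build_prompt(template, item)).strip().upper()
        if reply[:1] == item.answer:
            correct += 1
    return correct / len(items)


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): always answers "A"."""
    return "A"


# Toy items for illustration only.
ITEMS = [
    MultipleChoiceQuestion(
        "Which vitamin deficiency causes scurvy?",
        {"A": "Vitamin C", "B": "Vitamin D"},
        "A",
    ),
    MultipleChoiceQuestion(
        "Which organ produces insulin?",
        {"A": "Liver", "B": "Pancreas"},
        "B",
    ),
]

TEMPLATES = {
    "plain": "{question}\n{options}\nAnswer with a single letter.",
    "expert": "You are a physician.\n{question}\n{options}\nAnswer with a single letter.",
}

# One accuracy score per prompt template, using the same model and items.
results = {name: accuracy(stub_model, template, ITEMS) for name, template in TEMPLATES.items()}
print(results)
```

Swapping `stub_model` for a function that calls a real model, and `ITEMS` for a loaded benchmark, gives a like-for-like comparison across prompts, since the model and the items are held fixed while only the template varies.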


## 🔧 Quick install
```bash
pip install medplexity
```

The package metadata also declares two optional extras, `langchain` and `google-search-results`, which can be installed with pip's extras syntax, e.g. `pip install "medplexity[langchain]"`.

## 📖 Documentation

Documentation can be found [here](https://medplexity.readthedocs.io/en/latest/).


## Example
See the [MedQA notebook](notebooks/MedQA.ipynb) for a full example with the MedQA dataset.

## Contributions

Contributions are welcome! Check out the todos below, and feel free to open a pull request.
Remember to install the `pre-commit` hooks so that your changes comply with our standards:

```bash
pre-commit install
```

Feel free to raise any questions on [Discord](https://discord.gg/jUKkgqVzQ).

## Todos
Some initial todos include:
- [x] Multiple-Choice datasets
  - [x] Add MedMCQA dataset
  - [x] Add PubMedQA dataset
  - [x] Add MedQA dataset
  - [x] Add MMLU dataset
- [ ] Long-form question-answering datasets
  - [x] Add HealthSearchQA dataset
  - [x] Add MedicationQA dataset
  - [ ] Add LiveQA dataset
- [ ] Explore datasets for multi-modality, specifically vision tasks for GPT-4V.
- [ ] LLMs
  - [x] Wrapper for OpenAI
  - [x] Wrapper for deepinfra
  - [ ] Wrapper for Google PaLM
  - [ ] Wrapper for HuggingFace text-gen
- [ ] Jupyter notebook quickstart
- [x] Example with langchain integration
- [ ] Visualisation of results
  - [ ] Add export of evaluations
  - [ ] Frontend for exploring exported results


## 📜 License
Medplexity is licensed under the MIT License. See the LICENSE file for more details.

