Metadata-Version: 2.1
Name: deltacat
Version: 1.1.23
Summary: A scalable, fast, ACID-compliant Data Catalog powered by Ray.
Home-page: https://github.com/ray-project/deltacat
Author: Ray Team
License: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.9
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aws-embedded-metrics==3.2.0
Requires-Dist: boto3~=1.34
Requires-Dist: numpy==1.21.5
Requires-Dist: pandas==1.3.5
Requires-Dist: pyarrow==12.0.1
Requires-Dist: pydantic==1.10.4
Requires-Dist: ray>=2.20.0
Requires-Dist: s3fs==2024.5.0
Requires-Dist: tenacity==8.1.0
Requires-Dist: typing-extensions==4.4.0
Requires-Dist: pymemcache==4.0.0
Requires-Dist: redis==4.6.0
Requires-Dist: getdaft==0.3.6
Requires-Dist: schedule==1.2.0

# DeltaCAT

DeltaCAT is a Pythonic Data Catalog powered by Ray.

Its data storage model lets you define and manage fast, scalable,
ACID-compliant data catalogs through git-like stage/commit APIs. It has been
used to host exabyte-scale enterprise data lakes.
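
To make the stage/commit idea concrete, here is a toy, in-memory sketch of the general pattern: writes are staged first and only become visible to readers once committed. This is not DeltaCAT's API. `ToyTable` and its `stage`, `commit`, and `read` methods are hypothetical names invented for illustration, and the sketch ignores persistence, concurrency, and distribution entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyTable:
    """Hypothetical in-memory table for illustration; NOT part of DeltaCAT."""

    committed: List[dict] = field(default_factory=list)  # rows visible to readers
    staged: Dict[int, List[dict]] = field(default_factory=dict)  # pending writes by stage id
    _next_id: int = 0

    def stage(self, rows: List[dict]) -> int:
        """Record rows as pending; readers cannot see them yet."""
        stage_id = self._next_id
        self._next_id += 1
        self.staged[stage_id] = list(rows)
        return stage_id

    def commit(self, stage_id: int) -> None:
        """Publish a staged write in one step: readers see all of it or none."""
        rows = self.staged.pop(stage_id)  # KeyError for unknown or already-committed ids
        self.committed = self.committed + rows  # swap in a new list instead of mutating

    def read(self) -> List[dict]:
        """Return a copy of the committed rows only."""
        return list(self.committed)


table = ToyTable()
sid = table.stage([{"id": 1}, {"id": 2}])
before = table.read()  # staged rows are not visible yet
table.commit(sid)
after = table.read()   # both rows become visible together
```

In this sketch `before` is empty and `after` holds both rows, which is the all-or-nothing visibility that a stage/commit workflow is meant to provide.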

DeltaCAT uses the Ray distributed compute framework together with Apache Arrow
for common table management tasks, including petabyte-scale
change-data-capture, data consistency checks, and table repair.

## Getting Started

### Install

```
pip install deltacat
```
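
After installing, a quick sanity check can confirm that the interpreter meets the package's `Requires-Python: >=3.9` constraint and that the distribution is visible to Python. The sketch below uses only the standard library; the helper names `installed_version` and `python_is_supported` are illustrative and not part of DeltaCAT.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(minimum: Tuple[int, int] = (3, 9)) -> bool:
    """Check the running interpreter against a minimum (major, minor) version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("deltacat version:", installed_version("deltacat") or "not installed")
```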

### Running Tests

```
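# Create and activate an isolated virtual environment
pip3 install virtualenv
virtualenv test_env
source test_env/bin/activate

# Install dependencies (expects requirements.txt in the current directory)
pip3 install -r requirements.txt

# Run the test suite
pytest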
```


