Metadata-Version: 2.1
Name: date-a-scientist
Version: 0.1.7
Summary: Query dataframes and find issues in your notebook snippets, as if a professional data scientist were pair coding with you
License: MIT
Author: IMPRV Dev Team
Author-email: dev@imprv.ai
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: numpy (==1.26.4)
Requires-Dist: pandasai (>=2.2.8,<3.0.0)
Requires-Dist: pygments (>=2.18.0,<3.0.0)
Project-URL: Homepage, https://github.com/imprv-ai/date-a-scientist
Project-URL: Issues, https://github.com/imprv-ai/date-a-scientist/issues
Description-Content-Type: text/markdown


# Date a Scientist

Query dataframes and find issues in your notebook snippets, as if a professional data scientist were pair coding with you.

Currently, this is just a thin wrapper around `pandas-ai`, an amazing library by sinaptik-ai!

## How to use it?

```python
from date_a_scientist import DateAScientist
import pandas as pd

df = pd.DataFrame(
    [
        {"name": "Alice", "age": 25, "city": "New York"},
        {"name": "Bob", "age": 30, "city": "Los Angeles"},
        {"name": "Charlie", "age": 35, "city": "Chicago"},
    ]
)
ds = DateAScientist(
    df=df,
    llm_openai_api_token=...,  # your OpenAI API token goes here
    llm_model_name="gpt-3.5-turbo",  # by default, it uses "gpt-4o"
)

# should return "Alice"
ds.chat("What is the name of the first person?")
```
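To keep the API token out of notebook cells, one option is to read it from an environment variable. The helper below is a minimal sketch: `read_openai_token` is a hypothetical name, and using `OPENAI_API_KEY` as the variable name is an assumption borrowed from the OpenAI SDK convention, not something this package is known to read on its own.

```python
import os


def read_openai_token(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the token stored in `env_var`, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of date-a-scientist.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return token
```

The returned value can then be passed as `llm_openai_api_token=read_openai_token()` when constructing `DateAScientist`.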

Additionally, you can pass a description of each column so that more meaningful questions can be asked:

```python
ds = DateAScientist(
    df=df,
    llm_openai_api_token=...,  # your OpenAI API token goes here
    llm_model_name="gpt-3.5-turbo",  # by default, it uses "gpt-4o"
    column_descriptions={
        "name": "The name of the person",
        "age": "The age of the person",
        "city": "The city where the person lives",
    },
)

# should return a DataFrame with the Chicago rows
ds.chat("Who lives in Chicago?")
```
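Since the answers are generated by an LLM, it can be useful to check them against plain pandas. The snippet below is a sketch of equivalent queries for the two questions above; it only illustrates the expected results and is not how the library computes its answers.

```python
import pandas as pd

df = pd.DataFrame(
    [
        {"name": "Alice", "age": 25, "city": "New York"},
        {"name": "Bob", "age": 30, "city": "Los Angeles"},
        {"name": "Charlie", "age": 35, "city": "Chicago"},
    ]
)

# "What is the name of the first person?"
first_name = df.iloc[0]["name"]

# "Who lives in Chicago?"
chicago_rows = df[df["city"] == "Chicago"]

print(first_name)
print(chicago_rows)
```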

## Inspirations

- https://github.com/sinaptik-ai/pandas-ai
- https://levelup.gitconnected.com/create-copilot-inside-your-notebooks-that-can-chat-with-graphs-write-code-and-more-e9390e2b9ed8

