Metadata-Version: 2.1
Name: ts-task-script-utils
Version: 1.7.0
Summary: Python utility for Tetra Task Scripts
License: Apache-2.0
Author: Tetrascience
Author-email: developers@tetrascience.com
Requires-Python: >=3.7.2,<3.11
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Dist: arrow (>=1.2.2,<2.0.0)
Requires-Dist: dateparser (>=1.1.1,<2.0.0)
Requires-Dist: fastparquet (>=0.8.1)
Requires-Dist: fsspec (<=2023.1.0)
Requires-Dist: numpy (>=1.17.0,<2.0.0)
Requires-Dist: pandas (>=1.3.5)
Requires-Dist: pendulum (>=2.1.2,<3.0.0)
Requires-Dist: pydash (>=5.1.0,<6.0.0)
Requires-Dist: python-dateutil (>=2.8.2,<3.0.0)
Description-Content-Type: text/markdown

# ts-task-script-utils <!-- omit in toc -->

[![Build Status](https://travis-ci.com/tetrascience/ts-task-script-utils.svg?branch=master)](https://travis-ci.com/tetrascience/ts-task-script-utils)

## Version <!-- omit in toc -->

v1.7.0

## Table of Contents <!-- omit in toc -->

- [Summary](#summary)
- [Installation](#installation)
- [Usage](#usage)
  - [Parsing Numbers](#parsing-numbers)
  - [Parsing Datetimes](#parsing-datetimes)
    - [`parse` Usage](#parse-usage)
  - [Generating Random UUIDs for Task Scripts](#generating-random-uuids-for-task-scripts)
  - [Creating datacube Parquet files](#creating-datacube-parquet-files)
  - [Using Python's `logging` module in Task Scripts](#using-pythons-logging-module-in-task-scripts)
- [Changelog](#changelog)
  - [v1.7.0](#v170)
  - [v1.6.0](#v160)
  - [v1.5.0](#v150)
  - [v1.4.0](#v140)
  - [v1.3.1](#v131)
  - [v1.3.0](#v130)
  - [v1.2.0](#v120)
  - [v1.1.1](#v111)
  - [v1.1.0](#v110)

## Summary

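Utility functions for Tetra Task Scripts: number and datetime parsing, reproducible UUID generation, datacube Parquet file creation, and logging.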

## Installation

`pip install ts-task-script-utils`

## Usage

### Parsing Numbers

```python
from task_script_utils.parse import to_int

string_value = '1.0'
int_value = to_int(string_value)

# `int_value` now has the parsed value of the string
assert isinstance(int_value, int)
assert int_value == 1

# it returns `None` if the value is unparseable
string_value = 'not an int'
int_value = to_int(string_value)

assert int_value is None
```

### Parsing Datetimes

**DEPRECATION WARNING!**

- Do not use the old datetime parser:
  `convert_datetime_to_ts_format` (from `task_script_utils.convert_datetime_to_ts_format`)
- Instead, use the newer `parse` from `task_script_utils.datetime_parser`

#### `parse` Usage

```python
from task_script_utils.datetime_parser import parse

# `<format_list>` and `<datetime_config>` are placeholders for the optional arguments
parse("2004-12-23T12:30 AM +05:30")
parse("2004-12-23T12:30 AM +05:30", <format_list>)
parse("2004-12-23T12:30 AM +05:30", <format_list>, <datetime_config>)
```

`parse()` returns a `TSDatetime` object. Use `TSDatetime.tsformat()` or
`TSDatetime.isoformat()` to get the datetime as a string, or
`TSDatetime.datetime()` to access the underlying Python `datetime` object.

You can read more about the datetime parser [here](task_script_utils/datetime_parser/README.md).

### Generating Random UUIDs for Task Scripts

Python's `uuid` module can generate ordinary UUIDs (`uuid1` for time- and host-based UUIDs, `uuid4` for random ones).
However, to get UUIDs that are reproducible for a given task script and input file, a custom UUID generator is provided:
`task_script_utils.random.TaskScriptUUIDGenerator`.

```python
from pathlib import Path
from task_script_utils.random import TaskScriptUUIDGenerator

input_file = Path(...)
file_content = input_file.read_bytes()
rand = TaskScriptUUIDGenerator("common/my-task-script:v1.0.0", file_content)

# Get 3 random bytes:
random_bytes = rand.randbytes(3)

# Get a random UUID:
uuid = rand.uuid()
```
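
To illustrate what "reproducible for a given task script and input file" means, here is a minimal standard-library sketch of the general idea. It is not the implementation used by `TaskScriptUUIDGenerator`; the helper names `make_seeded_rng` and `next_uuid` are hypothetical, and the choice of SHA-256 for seeding is an assumption made for this example.

```python
import hashlib
import random
import uuid


def make_seeded_rng(task_script_identifier: str, file_content: bytes) -> random.Random:
    """Hypothetical helper: build a PRNG seeded only by the identifier and file bytes."""
    digest = hashlib.sha256(
        task_script_identifier.encode("utf-8") + b"\x00" + file_content
    ).digest()
    return random.Random(int.from_bytes(digest, "big"))


def next_uuid(rng: random.Random) -> uuid.UUID:
    """Hypothetical helper: draw 128 bits and mark them as a version-4 UUID."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# Same identifier and same content give the same sequence of UUIDs
rng_a = make_seeded_rng("common/my-task-script:v1.0.0", b"example file content")
rng_b = make_seeded_rng("common/my-task-script:v1.0.0", b"example file content")
# Different content gives a different sequence
rng_c = make_seeded_rng("common/my-task-script:v1.0.0", b"different file content")

first_a = next_uuid(rng_a)
first_b = next_uuid(rng_b)
first_c = next_uuid(rng_c)

assert first_a == first_b
assert first_a != first_c
```

Because the seed depends only on the inputs, re-running the same task script version on the same file yields the same UUIDs, which plain `uuid4` cannot offer.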

It's also possible to use a class method and provide the task script identifiers separately:

```python
from pathlib import Path
from task_script_utils.random import TaskScriptUUIDGenerator

input_file = Path(...)
file_content = input_file.read_bytes()
rand = TaskScriptUUIDGenerator.from_task_script_identifier_parts("common", "my-task-script", "v1.0.0", file_content)
```

This class behaves as a singleton per set of arguments: constructing it again with the same arguments returns
the identical object, e.g.:

```python
from pathlib import Path
from task_script_utils.random import TaskScriptUUIDGenerator

input_file = Path(...)
file_content = input_file.read_bytes()
rand1 = TaskScriptUUIDGenerator("common/my-task-script:v1.0.0", file_content)
rand2 = TaskScriptUUIDGenerator("common/my-task-script:v1.0.0", file_content)

assert rand1 is rand2
```

It's also possible to get the most-recently-created instance through the `get_last_created` method:

```python
from pathlib import Path
from task_script_utils.random import TaskScriptUUIDGenerator

input_file = Path(...)
file_content = input_file.read_bytes()
rand1 = TaskScriptUUIDGenerator("common/my-task-script:v1.0.0", file_content)

rand2 = TaskScriptUUIDGenerator.get_last_created()

assert rand1 is rand2
```
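
The instance-caching behaviour described above can be sketched with a small stand-alone class. This is only an illustration of the pattern under stated assumptions, not the library's code: `CachedGenerator` is a hypothetical class, and it assumes that `get_last_created` tracks the most recent newly constructed instance rather than the most recent cache hit.

```python
from typing import ClassVar, Dict, Optional, Tuple


class CachedGenerator:
    """Hypothetical class that returns the same instance for the same arguments."""

    _instances: ClassVar[Dict[Tuple[str, bytes], "CachedGenerator"]] = {}
    _last_created: ClassVar[Optional["CachedGenerator"]] = None

    def __new__(cls, identifier: str, file_content: bytes) -> "CachedGenerator":
        key = (identifier, file_content)
        instance = cls._instances.get(key)
        if instance is None:
            # First time these arguments are seen: build and remember the instance
            instance = super().__new__(cls)
            instance.identifier = identifier
            instance.file_content = file_content
            cls._instances[key] = instance
            cls._last_created = instance
        return instance

    @classmethod
    def get_last_created(cls) -> Optional["CachedGenerator"]:
        """Return the most recently constructed instance, or None if there is none."""
        return cls._last_created


first = CachedGenerator("common/my-task-script:v1.0.0", b"content")
second = CachedGenerator("common/my-task-script:v1.0.0", b"content")
other = CachedGenerator("common/my-task-script:v1.0.0", b"other content")

assert first is second
assert other is not first
assert CachedGenerator.get_last_created() is other
```

Returning the cached object means code deep in a call stack can obtain the generator without having the identifier and file content passed down to it.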

### Creating datacube Parquet files

[See the docs](task_script_utils/datacubes/README.md).

### Using Python's `logging` module in Task Scripts

Task Scripts can write workflow logs which are visible to users on TDP, but only if the logs are written via the logger provided by the `context` object. The `context` logger is documented here: [context.get_logger](https://developers.tetrascience.com/docs/context-api#contextget_logger).

This utility is a wrapper for the `context` logger which allows Task Scripts to use Python's `logging` module for creating TDP workflow logs, instead of directly using the `context` logger object. This means the `context` logger object doesn't need to be passed around to each function which needs to do logging, and Task Script code can benefit from other features of the Python `logging` module such as [integration with `pytest`](https://docs.pytest.org/en/7.1.x/how-to/logging.html).

To log warning messages on the platform from a Task Script, do the following:

- Import the handler setup function in `main.py`:

```python
from task_script_utils.logging import setup_ts_log_handler
```

- Then, within the function called by the protocol, set up the log handler:

```python
setup_ts_log_handler(context.get_logger(), "main")
```

- In a module where you wish to log a warning:

```python
import logging
logger = logging.getLogger("main." + __name__)
```

- Log a warning message with:

```python
logger.warning("This is a warning message")
```
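
The steps above rely on Python's logger hierarchy: a logger named `"main.<module>"` is a child of `"main"`, so its records propagate to any handler attached to `"main"`. The following standard-library sketch shows that mechanism with a stand-in for the `context` logger. `FakeContextLogger`, its `log` method, and `ForwardingHandler` are hypothetical names used for illustration; they are not the actual `context` logger API or the internals of `setup_ts_log_handler`.

```python
import logging


class FakeContextLogger:
    """Hypothetical stand-in for the logger returned by context.get_logger()."""

    def __init__(self) -> None:
        self.messages = []

    def log(self, data) -> None:
        # Assumed interface: accept one payload per log call
        self.messages.append(data)


class ForwardingHandler(logging.Handler):
    """Hypothetical handler that forwards logging records to a context-style logger."""

    def __init__(self, context_logger: FakeContextLogger) -> None:
        super().__init__()
        self._context_logger = context_logger

    def emit(self, record: logging.LogRecord) -> None:
        self._context_logger.log(
            {"level": record.levelname.lower(), "message": record.getMessage()}
        )


context_logger = FakeContextLogger()

# Attach the handler once to the parent logger, as main.py would
parent_logger = logging.getLogger("main")
parent_logger.setLevel(logging.WARNING)
parent_logger.addHandler(ForwardingHandler(context_logger))

# Any module can then obtain a child logger without receiving `context`
module_logger = logging.getLogger("main.my_module")
module_logger.warning("This is a warning message")
module_logger.debug("Not forwarded: below the WARNING level set on the parent")

assert context_logger.messages == [
    {"level": "warning", "message": "This is a warning message"}
]
```

In this sketch only the entry point touches the context-style logger; every other module uses the standard `logging` API.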

## Changelog

### v1.7.0

- Add `task_script_utils.logging` for logging warning messages in task scripts

### v1.6.0

- Add `task_script_utils.datacubes.parquet` for creating Parquet file representations of datacubes

### v1.5.0

- Add `TaskScriptUUIDGenerator` class for generating random UUIDs and random bytes

### v1.4.0

- Add `extract-to-decorate` functions

### v1.3.1

- Update datetime parser usage in README.md

### v1.3.0

- Add string parsing functions

### v1.2.0

- Add boolean config parameter `require_unambiguous_formats` to `DatetimeConfig`
- Add logic to `parser._parse_with_formats` to be used when `DatetimeConfig.require_unambiguous_formats` is set to `True`
  - `AmbiguousDatetimeFormatsError` is raised if mutually ambiguous formats are detected and differing datetimes are parsed
- Add parameter typing throughout repository
- Refactor `datetime_parser` package
- Add base class `DateTimeInfo`
- Segregate parsing logic into `ShortDateTimeInfo` and `LongDateTimeInfo`

### v1.1.1

- Remove `convert_to_ts_iso8601()` method

### v1.1.0

- Add `datetime_parser` package

