Metadata-Version: 2.1
Name: superb-data-klient
Version: 1.2.2
Summary: A Python API wrapping services of the Superb Data Kraken (SDK)
Author-email: "Team SDK | e:fs TechHub GmbH" <sdk@efs-techhub.com>
Maintainer: Team SDK
Maintainer-email: sdk@efs-techhub.com
License: Apache-2.0
Keywords: sdk,superb data kraken,superbdataklient,super data klient,superb data klient,superb-data-klient,superb,data,klient
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.31.0
Requires-Dist: opensearch-py>=2.3.0
Requires-Dist: PyJWT>=2.8.0
Requires-Dist: azure-core>=1.29.0
Requires-Dist: azure-storage-blob>=12.17.0
Requires-Dist: msrest>=0.7.0
Requires-Dist: TerevintoSoftware.PkceClient>=0.1.0

![PyPI - License](https://img.shields.io/pypi/l/superb-data-klient)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/superb-data-klient)
![PyPI](https://img.shields.io/pypi/v/superb-data-klient?label=version)
![PyPI - Downloads](https://img.shields.io/pypi/dm/superb-data-klient)


# superb-data-klient


**superb-data-klient** offers a streamlined interface to access various services of the *Superb Data Kraken platform* (**SDK**). With the library, you can
effortlessly fetch and index data, manage indices, spaces and organizations on the **SDK**.

Although designed primarily for the Jupyter Hub environment within the platform, it can also be set up in other environments.


## Installation and Supported Versions

```console
$ python -m pip install superb-data-klient
```

Python 3.7 or later is required.

## Usage


### Authentication


To begin, authenticate against the SDK's OIDC provider. Authentication happens when the client object is instantiated, using one of the following methods:

1. **System Environment Variables** (recommended for Jupyter environments):
    ```python
    import superbdataklient as sdk
    client = sdk.SDKClient()
    ```
   This approach leverages environment variables **SDK_ACCESS_TOKEN** and **SDK_REFRESH_TOKEN**.


2. **Login Credentials**:
    ``` python
    import superbdataklient as sdk
    client = sdk.SDKClient(username='hasslethehoff', password='lookingforfreedom')
    ```

3. **Authorization Code Flow**:

   If neither of the methods above applies, the client authenticates via the authorization code flow.

   **CAUTION:** This method only works in an environment where a browser is available.
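
Option 1 depends on both token variables being present in the environment. The following is a minimal sketch, not part of the library, that reports which of them are missing before the client is created. The helper name `missing_token_vars` is hypothetical.

```python
import os

# Variable names taken from the description of option 1.
TOKEN_VARS = ("SDK_ACCESS_TOKEN", "SDK_REFRESH_TOKEN")


def missing_token_vars(environ=None):
    """Return the token variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in TOKEN_VARS if not env.get(name)]


missing = missing_token_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

If the list is empty, instantiating `SDKClient()` without arguments should be able to pick up the tokens; otherwise one of the other methods is likely needed.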

**NOTE:** If your user account was linked from an external identity provider, your account in the SDK identity provider (Keycloak) does not have a password by default. To enable login via basic authentication, you need to set a password through self-service first.

Follow these steps to set your password:

1. Go to the self-service portal for your environment:
   - [https://{domain}/auth/realms/{realm}/account/](https://{domain}/auth/realms/{realm}/account/).
   - e.g. [https://app.sdk-cloud.de/auth/realms/efs-sdk/account/](https://app.sdk-cloud.de/auth/realms/efs-sdk/account/).
2. Set a password for your account.
3. Once the password is set, you can log in using basic authentication (option 2).
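
The self-service URL in step 1 follows a fixed pattern, so it can be assembled from the domain and realm of your environment. A small sketch, with `account_url` as a hypothetical helper name:

```python
def account_url(domain, realm):
    """Build the self-service URL from the pattern shown in step 1 (hypothetical helper)."""
    return f"https://{domain}/auth/realms/{realm}/account/"


print(account_url("app.sdk-cloud.de", "efs-sdk"))
# prints https://app.sdk-cloud.de/auth/realms/efs-sdk/account/
```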

### Configuration


While the default settings cater to the standard SDK instance, configurations for various other instances are also available.


#### Setting Environment

``` python
import superbdataklient as sdk
client = sdk.SDKClient(env='sdk-dev')
client = sdk.SDKClient(env='sdk')
```

#### Overwriting Settings

``` python
client = sdk.SDKClient(domain='mydomain.ai', realm='my-realm', client_id='my-client-id', api_version='v13.37')
```


#### Proxy
To use the SDK client behind a company proxy, add the following parameters to the constructor.

**NOTE**: The environment variables `http_proxy` and `https_proxy` override the settings passed to `SDKClient`, so unset them before configuring the client.

```python
import superbdataklient as sdk

client = sdk.SDKClient(username='hasslethehoff',
                       password='lookingforfreedom',
                       proxy_http="http://proxy.example.com:8080",
                       proxy_https="https://proxy.example.com:8080",
                       proxy_user="proxyusername",
                       proxy_pass="proxyuserpassword")
```
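
Since the proxy environment variables take precedence, one way to unset them for the current Python process is sketched below. The helper `clear_proxy_env` is hypothetical, and including the uppercase variants is an assumption (many HTTP libraries read both spellings).

```python
import os

# Lowercase names come from the note on proxy variables; the uppercase
# variants are an assumption, since many HTTP libraries also read them.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def clear_proxy_env(environ=None):
    """Remove proxy variables from the mapping and return the names removed."""
    env = os.environ if environ is None else environ
    removed = [name for name in PROXY_VARS if name in env]
    for name in removed:
        env.pop(name, None)  # pop() tolerates case-insensitive environments
    return removed
```

Calling `clear_proxy_env()` once before constructing the client affects only the running process, not the surrounding shell.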

---
### Examples


#### Organizations


Get details of all organizations, or retrieve by ID or name:

``` python
client.organization_get_all()
client.organization_get_by_id(1337)
client.organization_get_by_name('my-organization')
```

#### Spaces


To retrieve spaces related to an organization:

``` python
organization_id = 1234
space_id = 5678  # placeholder ID
space = 'my-space'  # placeholder name
client.space_get_all(organization_id)
client.space_get_by_id(organization_id, space_id)
client.space_get_by_name(organization_id, space)
```

#### Index


<!--
TODO: implement after search service works without all_access ()

List all accessible indices:

``` python
indices = client.index_get_all()
```
-->

Retrieve a specific document:

``` python
document = client.index_get_document(index_name, doc_id)
``` 

Fetch all documents within an index:

``` python
documents = client.index_get_all_documents("index_name")
```

Iterate through documents using a generator:

``` python
documents = client.index_get_documents("index-name")
for document in documents:
   print(document)
```
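
Because the documents arrive through a generator, standard iterator tools apply. The sketch below uses `itertools.islice` to look at only the first few documents. `fake_documents` is a stand-in so the example runs without an SDK connection; whether the real generator fetches lazily from the index is an assumption.

```python
from itertools import islice


def fake_documents():
    """Stand-in for client.index_get_documents(...); yields dicts lazily."""
    for i in range(1000):
        yield {"_id": i, "name": f"document{i:02d}"}


# Take only the first three documents; the stand-in never produces the rest.
first_three = list(islice(fake_documents(), 3))
print(first_three)
```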

Index multiple documents:

``` python
documents = [
   {"_id": 123, "name": "document01", "value": "value"},
   {"_id": 1337, "name": "document02", "value": "value"}
]
index_name = "index"
client.index_documents(documents, index_name)
``` 

Note: The optional **_id** field is used as the document ID for indexing in OpenSearch.
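
Since **_id** determines the document ID, two documents with the same value in one batch may end up overwriting each other in OpenSearch. A minimal sketch for spotting such collisions before indexing, with `duplicate_ids` as a hypothetical helper that is not part of the library:

```python
from collections import Counter


def duplicate_ids(documents):
    """Return the _id values that occur more than once (hypothetical helper)."""
    # Documents without an _id are skipped, since the field is optional.
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [doc_id for doc_id, n in counts.items() if n > 1]


documents = [
    {"_id": 123, "name": "document01", "value": "value"},
    {"_id": 1337, "name": "document02", "value": "value"},
]
print(duplicate_ids(documents))  # prints []
```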

Filter indices by organization, space, and type:

``` python
client.index_filter_by_space("my-organization", "my-space", "index-type")
```

For all spaces in an organization, pass `*` instead of a space name. Available **index_type** values are **ANALYSIS** and **MEASUREMENTS**.

Create an application index:

``` python
mapping = {
   ...
}
client.application_index_create("my-application-index", "my-organization", "my-space", mapping)
```

Remove an application index by its name:

``` python
client.application_index_delete("my-organization_my-space_analysis_my-application-index")
```
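
The index name in the delete example looks like a combination of organization, space, index type and application index name. The sketch below composes a name under that assumption. The pattern is inferred from the single example only, and `application_index_name` is a hypothetical helper, so verify the actual name before deleting anything.

```python
def application_index_name(organization, space, index_name, index_type="analysis"):
    """Compose a full index name (hypothetical helper).

    Assumed pattern, inferred from the example only:
    <organization>_<space>_<index_type>_<index_name>
    """
    return f"{organization}_{space}_{index_type}_{index_name}"


print(application_index_name("my-organization", "my-space", "my-application-index"))
# prints my-organization_my-space_analysis_my-application-index
```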

#### Storage


List files in Storage:

``` python
files = client.storage_list_blobs("my-organization", "my-space")
```

Download specific files from Storage:

``` python
files = ['file01.txt', 'directory/file02.json']
client.storage_download_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')
```

Use regex patterns for file downloads:

``` python
files = ['file01.txt', 'directory/file02.json']
client.storage_download_files_with_regex(organization='my-organization', space='my-space', files=files, local_dir='tmp', regex=r'.*json$')
```
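
To preview locally which names a pattern would select, the same regex can be applied with Python's `re` module. This is only a sketch: how the library applies the pattern internally is not stated here, so the use of `re.match` is an assumption.

```python
import re

files = ["file01.txt", "directory/file02.json"]
pattern = re.compile(r".*json$")

# re.match anchors at the start of the string; treat the result as a local
# preview, not as a guarantee of what the library downloads.
matching = [name for name in files if pattern.match(name)]
print(matching)  # prints ['directory/file02.json']
```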

Upload files from a local directory. Ensure the presence of a valid `meta.json` if the `metadataGenerate` property on the space is not set to `true`:

``` python
files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files_to_loadingzone(organization='my-organization', space='my-space', files=files, local_dir='tmp')
```
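
The `meta.json` requirement can be checked before the upload starts. A minimal sketch, assuming `meta.json` is expected at the top level of the file list as in the example; `check_upload_list` and its `metadata_generated` flag are hypothetical and not part of the library.

```python
def check_upload_list(files, metadata_generated=False):
    """Return files unchanged, or raise if meta.json is required but missing.

    metadata_generated mirrors the space's `metadataGenerate` property and
    must be supplied by the caller (hypothetical helper).
    """
    if not metadata_generated and "meta.json" not in files:
        raise ValueError("meta.json is missing from the upload list")
    return files


files = check_upload_list(["meta.json", "file01.txt", "file02.txt"])
```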

