Metadata-Version: 2.1
Name: oscars_toolbox
Version: 0.1.7
Summary: A package for optimization. See the GitHub repo for instructions and version notes.
Author-email: Oscar Scholin <scholinoscar@gmail.com>
Project-URL: Homepage, https://github.com/oscars47/oscars-toolbox
Project-URL: Issues, https://github.com/oscars47/oscars-toolbox/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE

# oscars-toolbox
A package for helpful general algorithms I've developed. See the PyPI release: https://pypi.org/project/oscars-toolbox/. See also my website, https://oscars47.github.io/.

## Current functions as of latest version:
## ```trabbit```
This repository contains a custom gradient descent algorithm called "Tortoise and Rabbit" (trabbit) implemented in Python. The algorithm aims to perform double optimization to determine the best parameters for a given loss function by incorporating both gradient descent and random input generation strategies.

### Functions and Their Usage

#### `trabbit`

The `trabbit` function implements the custom gradient descent algorithm. It optimizes the parameters of a given loss function using a combination of gradient descent and random input generation.

- **Parameters:**
  - `loss_func`: A function to minimize. Assumes all arguments are already passed through `partial`.
  - `random_gen`: A function to generate random inputs.
  - `bounds`: Bounds for each parameter (default: `None`). If `None`, no bounds are implemented.
  - `x0_ls`: List of initial parameter guesses to try before gradient descent begins (default: `None`). If `None`, `random_gen` supplies the starting point.
  - `num`: Number of iterations (default: 1000).
  - `alpha`: Learning rate (default: 0.3).
  - `temperature`: Fraction of iterations allowed without improvement before the algorithm hops to a new random input from `random_gen` (default: 0.1).
  - `tol`: Tolerance for convergence. The algorithm stops if the loss is less than `tol` (default: 1e-5).
  - `grad_step`: Step size to estimate the gradient (default: 1e-8).
  - `verbose`: Whether to print out the loss at each iteration (default: `True`).

- **Returns:**
  - `x_best`: Best parameters found.
  - `loss_best`: Best loss achieved.

- **Example Usage:**
  ```python
  import numpy as np
  from oscars_toolbox.trabbit import trabbit
  
  # Define a sample loss function
  def sample_loss(x):
      return np.sum(x**2)
  
  # Define a random input generator
  def random_gen():
      return np.random.uniform(-10, 10, size=3)
  
  # Run the trabbit algorithm
  best_params, best_loss = trabbit(sample_loss, random_gen)
  print(f'Best parameters: {best_params}')
  print(f'Best loss: {best_loss}')
  ```
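
Because `loss_func` is expected to take only the parameter vector, any extra arguments have to be bound beforehand. The following is a minimal sketch of that step using `functools.partial`. The loss `shifted_quadratic` and its `target` argument are hypothetical names chosen for illustration, and the `trabbit` call is shown only as a comment.

```python
from functools import partial

# Hypothetical loss with an extra argument besides the parameters `x`
def shifted_quadratic(x, target):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))

# Bind `target` so the resulting callable takes only `x`
loss_func = partial(shifted_quadratic, target=[1.0, -2.0, 0.5])

print(loss_func([1.0, -2.0, 0.5]))  # 0.0 at the minimum
print(loss_func([0.0, 0.0, 0.0]))   # 1 + 4 + 0.25 = 5.25

# loss_func can then be handed to the optimizer, e.g. trabbit(loss_func, random_gen)
```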

### Detailed Description

The `trabbit` function incorporates a combination of gradient descent and random input generation to optimize a loss function. The algorithm proceeds as follows:

1. **Initial Guess**:
   - If `x0_ls` is provided, each initial guess is evaluated using a minimization function (`min_func`). If `x0_ls` is `None`, random inputs are generated using `random_gen`.

2. **Minimization Function**:
   - The `min_func` uses the Nelder-Mead algorithm (or bounded optimization if `bounds` are provided) to minimize the loss function and return the optimal parameters.

3. **Gradient Descent with Random Hopping**:
   - The algorithm performs gradient descent with a specified learning rate (`alpha`). If no improvement is seen for a specified fraction of iterations (`temperature`), the algorithm hops out and uses a new random input generated by `random_gen`.

4. **Convergence Check**:
   - The algorithm checks whether the gradient is too small or the loss is below the tolerance level (`tol`). If the gradient is too small, it hops to a new random input; if the loss is below `tol`, it terminates.

5. **Verbose Output**:
   - If `verbose` is `True`, the algorithm prints the current loss, best loss, and iteration details at each step.

6. **Keyboard Interrupt Handling**:
   - The algorithm gracefully handles keyboard interrupts and prints the best parameters and corresponding loss found so far.
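
The gradient-descent-with-hopping idea in steps 3 and 4 can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not the package's implementation: the helper names `numerical_gradient` and `descend_with_hops` are hypothetical, the gradient is a simple forward difference using `grad_step`, and the Nelder-Mead pre-minimization, bounds, verbose output, and interrupt handling are all omitted.

```python
import random

def numerical_gradient(loss_func, x, grad_step=1e-8):
    """Forward-difference estimate of the gradient of loss_func at x."""
    base = loss_func(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += grad_step
        grad.append((loss_func(shifted) - base) / grad_step)
    return grad

def descend_with_hops(loss_func, random_gen, num=1000, alpha=0.3,
                      temperature=0.1, tol=1e-5, grad_step=1e-8):
    """Toy gradient descent that restarts from a random point when stuck."""
    patience = max(1, int(temperature * num))  # iterations allowed without improvement
    x = random_gen()
    x_best, loss_best = list(x), loss_func(x)
    stale = 0
    for _ in range(num):
        if loss_best < tol:
            break  # loss is below the tolerance: stop
        grad = numerical_gradient(loss_func, x, grad_step)
        x = [xi - alpha * gi for xi, gi in zip(x, grad)]
        loss = loss_func(x)
        if loss < loss_best:
            x_best, loss_best, stale = list(x), loss, 0
        else:
            stale += 1
        if stale >= patience:
            x, stale = random_gen(), 0  # hop to a fresh random input
    return x_best, loss_best

random.seed(0)
x_best, loss_best = descend_with_hops(
    lambda x: sum(xi ** 2 for xi in x),
    lambda: [random.uniform(-10, 10) for _ in range(3)],
)
print(x_best, loss_best)
```

On this quadratic loss each step shrinks the parameters by a constant factor, so the loop stops on the tolerance check long before any hop is needed; hops matter for losses with local minima.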

## ```implement_torch```
This module provides functions for training a model, evaluating its performance, and counting its trainable parameters. The functions rely on `torch`, `torch.nn`, `torch.optim`, `tqdm`, and `sklearn`.

### Functions

#### 1. `train_only`

This function trains a given model on the provided training data and evaluates it on the validation data.

- **Parameters:**
  - `model`: The neural network model to be trained.
  - `device`: The device to use for computation (`cpu` or `cuda`).
  - `train_loader`: DataLoader for the training dataset.
  - `val_loader`: DataLoader for the validation dataset.
  - `num_epochs`: Number of training epochs (default: 5).
  - `learning_rate`: Learning rate for the optimizer (default: 1e-3).
  - `weight_decay`: Weight decay (L2 regularization) factor for the optimizer (default: 1e-4).
  - `loss_func`: Loss function to use (default: `nn.CrossEntropyLoss()`).

- **Returns:**
  - `model`: The trained model.
  - `train_accuracy`: Training accuracy.
  - `val_accuracy`: Validation accuracy.
  - `train_acc_ls`: List of training accuracies for each epoch.
  - `val_acc_ls`: List of validation accuracies for each epoch.

- **Example Usage:**
  ```python
  model, train_accuracy, val_accuracy, train_acc_ls, val_acc_ls = train_only(model, device, train_loader, val_loader)
  ```

#### 2. `train_model`

This function constructs and trains a model based on the provided architecture and training settings. Unlike `train_only`, which takes an already-instantiated model, it builds the model from `model_func` and returns only the trained model.

- **Parameters:**
  - `model_func`: Function to create the model.
  - `device`: The device to use for computation (`cpu` or `cuda`).
  - `train_loader`: DataLoader for the training dataset.
  - `val_loader`: DataLoader for the validation dataset.
  - `input_size`: Size of the input layer.
  - `output_size`: Size of the output layer.
  - `neurons_ls`: List specifying the number of neurons in each hidden layer.
  - `num_epochs`: Number of training epochs (default: 5).
  - `learning_rate`: Learning rate for the optimizer (default: 1e-3).
  - `weight_decay`: Weight decay (L2 regularization) factor for the optimizer (default: 1e-4).
  - `use_cnn`: Boolean flag to indicate if a convolutional neural network (CNN) should be used (default: False).
  - `loss_func`: Loss function to use (default: `nn.CrossEntropyLoss()`).
  - `img_channels`: Number of image channels (default: 3).

- **Returns:**
  - `model`: The trained model.

- **Example Usage:**
  ```python
  model = train_model(my_model_func, device, train_loader, val_loader, input_size, output_size, neurons_ls)
  ```

#### 3. `evaluate`

This function evaluates a trained model on the test data and returns the confusion matrix and, optionally, other metrics. Use it only with the models defined in `torch_models.py`.

- **Parameters:**
  - `model`: The trained neural network model.
  - `test_loader`: DataLoader for the test dataset.
  - `num_classes`: Number of classes in the dataset.
  - `device`: The device to use for computation (`cpu` or `cuda`).
  - `return_extra_metrics`: Boolean flag to indicate if additional metrics (accuracy, precision, recall, F1 score) should be returned (default: False).

- **Returns:**
  - `conf_matrix`: Confusion matrix.
  - `accuracy`: Accuracy score (if `return_extra_metrics` is True).
  - `precision`: Precision score (if `return_extra_metrics` is True).
  - `recall`: Recall score (if `return_extra_metrics` is True).
  - `f1`: F1 score (if `return_extra_metrics` is True).

- **Example Usage:**
  ```python
  conf_matrix, accuracy, precision, recall, f1 = evaluate(model, test_loader, num_classes, device, return_extra_metrics=True)
  ```
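
For readers unfamiliar with the quantities returned here, the sketch below shows how a confusion matrix can be accumulated from true and predicted labels and how accuracy follows from it. It is a conceptual illustration in plain Python, not the body of `evaluate`; the helper names and the label lists are hypothetical, and the rows-are-actual, columns-are-predicted layout is an assumption.

```python
def build_confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the actual class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

def accuracy_from_matrix(matrix):
    """Correct predictions lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for a 3-class problem
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]

conf_matrix = build_confusion_matrix(y_true, y_pred, num_classes=3)
print(conf_matrix)                        # [[1, 1, 0], [0, 3, 0], [1, 0, 2]]
print(accuracy_from_matrix(conf_matrix))  # 0.75
```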

#### 4. `count_parameters_torch`

This function counts the number of trainable parameters in a model.

- **Parameters:**
  - `model`: The neural network model.

- **Returns:**
  - `num_params`: The number of trainable parameters.

- **Example Usage:**
  ```python
  num_params = count_parameters_torch(model)
  print(f'The model has {num_params} trainable parameters.')
  ```

### Example Workflow

```python
import torch
import matplotlib.pyplot as plt
from oscars_toolbox.implement_torch import train_only, evaluate, count_parameters_torch

# Assuming `train_loader`, `val_loader`, `test_loader` are defined DataLoader objects
# and `model` is an instantiated neural network (e.g. one of the classes in `torch_models.py`)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Train the model
model, train_accuracy, val_accuracy, train_acc_ls, val_acc_ls = train_only(model, device, train_loader, val_loader, num_epochs=20)

# Evaluate the model
conf_matrix, accuracy, precision, recall, f1 = evaluate(model, test_loader, num_classes=10, device=device, return_extra_metrics=True)

# Plot confusion matrix
fig, ax = plt.subplots(1, 1, figsize=(7, 5))
cm = ax.imshow(conf_matrix, cmap='viridis')
fig.colorbar(cm, ax=ax)

print(f"Test Accuracy: {accuracy}")

# Count the number of trainable parameters
num_params = count_parameters_torch(model)
print(f'The model has {num_params} trainable parameters.')
```

## `torch_models.py`

This module contains Python code for implementing various neural network architectures using PyTorch, including standard Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), and Kolmogorov-Arnold Network (KAN) layers. The code also includes a modified CNN architecture with KAN layers integrated.

### Functions and Classes

#### 1. `MLP`

Implements a standard Multi-Layer Perceptron (MLP) with configurable hidden layers and neurons.

- **Parameters:**
  - `neurons_ls`: List of integers representing the number of neurons in each layer, including the input and output layers.

- **Example Usage:**
  ```python
  model = MLP([784, 128, 64, 10])
  ```
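
To make the meaning of `neurons_ls` concrete, the sketch below pairs consecutive entries into layer shapes and counts the resulting weights and biases. It assumes plain fully connected layers with bias terms, which the description above suggests but does not guarantee; the helper names are hypothetical and no PyTorch is involved.

```python
def mlp_layer_shapes(neurons_ls):
    """Pair consecutive entries into (in_features, out_features) tuples."""
    return list(zip(neurons_ls[:-1], neurons_ls[1:]))

def mlp_parameter_count(neurons_ls):
    """Weights plus biases, assuming plain fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in mlp_layer_shapes(neurons_ls))

neurons_ls = [784, 128, 64, 10]
print(mlp_layer_shapes(neurons_ls))     # [(784, 128), (128, 64), (64, 10)]
print(mlp_parameter_count(neurons_ls))  # 109386
```

Under the same assumption, this count is the figure one would expect `count_parameters_torch` to report for such a model.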

#### 2. `TestCNN`

Implements a standard Convolutional Neural Network (CNN) for image classification tasks.

- **Parameters:**
  - `num_classes`: Number of output classes (default: 10).
  - `num_channels`: Number of input channels (default: 3).

- **Example Usage:**
  ```python
  model = TestCNN(num_classes=10, num_channels=3)
  ```

#### 3. `TestCNNKAN`

Implements a modified CNN with a KAN layer for image classification tasks.

- **Parameters:**
  - `num_classes`: Number of output classes (default: 10).
  - `num_channels`: Number of input channels (default: 3).

- **Example Usage:**
  ```python
  model = TestCNNKAN(num_classes=10, num_channels=3)
  ```

#### 4. `CNN`

Implements a generalized CNN for any number of convolutional layers followed by linear layers.

- **Parameters:**
  - `img_size`: Tuple containing the height and width of the input images.
  - `in_channels`: Number of channels in the input data.
  - `num_classes`: Number of output classes.
  - `conv_layers`: List of tuples containing the number of out_channels, kernel_size, and stride for each convolutional layer.

- **Example Usage:**
  ```python
  model = CNN((32, 32), 3, 10, [(32, 3, 1), (64, 3, 1)])
  ```
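
The `conv_layers` tuples determine how the image shrinks before it reaches the linear layers. The sketch below applies standard convolution arithmetic to the example above. It assumes zero padding, no dilation, and no pooling between layers, none of which is confirmed by this README, and the helper names are hypothetical.

```python
def conv_output_size(size, kernel_size, stride, padding=0):
    """Standard convolution arithmetic: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

def feature_map_shapes(img_size, conv_layers):
    """Track (channels, height, width) after each convolutional layer."""
    height, width = img_size
    shapes = []
    for out_channels, kernel_size, stride in conv_layers:
        height = conv_output_size(height, kernel_size, stride)
        width = conv_output_size(width, kernel_size, stride)
        shapes.append((out_channels, height, width))
    return shapes

shapes = feature_map_shapes((32, 32), [(32, 3, 1), (64, 3, 1)])
print(shapes)  # [(32, 30, 30), (64, 28, 28)]

channels, height, width = shapes[-1]
print(channels * height * width)  # 50176 features entering the linear layers
```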

#### 5. `KCNN`

Implements a modified CNN with KAN layers for image classification tasks.

- **Parameters:**
  - `img_size`: Tuple containing the height and width of the input images.
  - `in_channels`: Number of channels in the input data.
  - `num_classes`: Number of output classes.
  - `conv_layers`: List of tuples containing the number of out_channels, kernel_size, and stride for each convolutional layer.

- **Example Usage:**
  ```python
  model = KCNN((32, 32), 3, 10, [(32, 3, 1), (64, 3, 1)])
  ```

### Example Workflow

Here is an example workflow using the provided classes and functions to train and evaluate a model:

1. **Import Necessary Libraries**:
   ```python
   import torch
   from torch import nn, optim
   from torch.utils.data import DataLoader, TensorDataset
   from oscars_toolbox.torch_models import MLP, CNN
   from oscars_toolbox.implement_torch import train_only, evaluate, count_parameters_torch
   ```

2. **Prepare Data Loaders**:
   ```python
   # Example data
   train_data = TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 10, (100,)))
   val_data = TensorDataset(torch.randn(20, 3, 32, 32), torch.randint(0, 10, (20,)))
   test_data = TensorDataset(torch.randn(20, 3, 32, 32), torch.randint(0, 10, (20,)))

   train_loader = DataLoader(train_data, batch_size=10)
   val_loader = DataLoader(val_data, batch_size=10)
   test_loader = DataLoader(test_data, batch_size=10)
   ```

3. **Define Model and Training Parameters**:
   ```python
   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
   neurons_ls = [3 * 32 * 32, 128, 64, 10]  # Example for MLP; the input size matches the 3x32x32 example images (assumes the MLP flattens its input)
   conv_layers = [(32, 3, 1), (64, 3, 1)]  # Example for CNN
   ```

4. **Train the Model**:
   ```python
   # Using MLP
   model = MLP(neurons_ls)
   trained_model, train_acc, val_acc, train_acc_ls, val_acc_ls = train_only(model, device, train_loader, val_loader)

   # Using CNN
   model = CNN((32, 32), 3, 10, conv_layers)
   trained_model, train_acc, val_acc, train_acc_ls, val_acc_ls = train_only(model, device, train_loader, val_loader)
   ```

5. **Evaluate the Model**:
   ```python
   conf_matrix, accuracy, precision, recall, f1 = evaluate(trained_model, test_loader, num_classes=10, device=device, return_extra_metrics=True)
   print(f'Confusion Matrix:\n{conf_matrix}')
   print(f'Test Accuracy: {accuracy}')
   print(f'Test Precision: {precision}')
   print(f'Test Recall: {recall}')
   print(f'Test F1 Score: {f1}')
   ```

6. **Count Trainable Parameters**:
   ```python
   num_params = count_parameters_torch(trained_model)
   print(f'The model has {num_params} trainable parameters.')
   ```


## `implement_xgb.py`
### `evaluate_xgb`
The `evaluate_xgb` function evaluates the performance of an XGBoost model on a validation dataset. Here is a detailed description of its functionality:

#### Parameters:
- `xgb_model`: The trained XGBoost model to be evaluated.
- `X_val`: The validation input features.
- `y_val`: The true labels for the validation set.
- `return_extra_metrics` (default is `False`): A boolean flag indicating whether to return additional evaluation metrics beyond the confusion matrix.

#### Functionality:
1. **Time Tracking**:
   - The function records the start time using `time.time()`.

2. **Prediction**:
   - The model makes predictions on the validation data `X_val` and stores them in `y_pred`.

3. **Time Tracking**:
   - The function records the end time using `time.time()`.

4. **Confusion Matrix**:
   - The function computes the confusion matrix for `y_val` and `y_pred` using `confusion_matrix(y_val, y_pred, normalize='true')`.

5. **Return Basic Metrics**:
   - If `return_extra_metrics` is `False`, the function returns only the confusion matrix.

6. **Return Extra Metrics**:
   - If `return_extra_metrics` is `True`, the function calculates additional performance metrics:
     - `accuracy`: The accuracy score of the predictions.
     - `precision`: The weighted precision score.
     - `recall`: The weighted recall score.
     - `f1`: The weighted F1 score.
     - `roc_auc`: The weighted ROC-AUC score.
   - It also calculates the time taken per sample by dividing the total evaluation time by the number of samples in `y_val`.
   - The function returns a tuple containing the confusion matrix, accuracy, precision, recall, F1 score, ROC-AUC score, and time per sample.

#### Returns:
- If `return_extra_metrics` is `False`: The confusion matrix.
- If `return_extra_metrics` is `True`: A tuple containing the confusion matrix, accuracy, precision, recall, F1 score, ROC-AUC score, and time per sample.
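
The "weighted" metrics listed above average the per-class scores using each class's share of the true labels as its weight. The sketch below computes them from a confusion matrix of raw counts in plain Python. It is an illustration of the averaging scheme, not the function's actual code (which the `confusion_matrix` call above suggests relies on scikit-learn); the helper name and the counts are hypothetical.

```python
def weighted_scores(matrix):
    """Support-weighted precision, recall, and F1 from a count confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    precision = recall = f1 = 0.0
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum (support)
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = actual_k / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1

# Hypothetical counts: rows are actual classes, columns are predicted classes
counts = [[1, 1, 0], [0, 3, 0], [1, 0, 2]]
precision, recall, f1 = weighted_scores(counts)
print(precision, recall, f1)
```

With support weights, the weighted recall equals the overall accuracy (here 6 correct out of 8).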

### `plot_confusion_matrix`
This function plots and saves confusion matrices for the validation and test datasets, along with their respective accuracies. Here is a detailed description of its functionality:

#### Parameters:
- `confusion_matrix_val`: The confusion matrix for the validation set (numpy array).
- `confusion_matrix_test`: The confusion matrix for the test set (numpy array).
- `allowed_categories`: A list of category labels for the axes of the confusion matrices.
- `accuracy_val`: The accuracy of the model on the validation set (float).
- `accuracy_test`: The accuracy of the model on the test set (float).
- `save_path`: The path where the resulting plot will be saved (string).
- `show_percentages` (default is `False`): A boolean flag indicating whether to display percentages in the cells of the confusion matrices.

#### Functionality:
1. **Plotting Setup**:
   - The function creates a figure with two subplots arranged horizontally, each for one of the confusion matrices (validation and test).
   - It sets the size of the figure to be 18 by 9 inches.

2. **Validation Confusion Matrix Plot**:
   - The first subplot (`ax[0]`) displays the confusion matrix for the validation set using a colormap (`viridis`).
   - A colorbar is added to the subplot for reference.
   - The x-axis and y-axis ticks are set according to the number of categories in `allowed_categories`.
   - The tick labels are set to `allowed_categories`, with the x-axis labels rotated 90 degrees for better readability.
   - The subplot is labeled with "Predicted" on the x-axis and "Actual" on the y-axis.
   - The title of the subplot includes the validation accuracy as a percentage.

3. **Test Confusion Matrix Plot**:
   - The second subplot (`ax[1]`) displays the confusion matrix for the test set in a similar manner to the validation subplot.
   - It includes a colorbar, tick marks, tick labels, axis labels, and a title with the test accuracy.

4. **Percentage Annotations**:
   - If `show_percentages` is `True`, the function adds text annotations to each cell in both confusion matrices.
   - Each annotation shows the percentage of predictions for that cell relative to the total predictions for the corresponding actual category.

5. **Layout and Save**:
   - The function adjusts the layout to ensure that subplots and labels fit well within the figure.
   - The resulting plot is saved to the specified `save_path`.
   - The plot is closed to free up memory.

#### Returns:
- The function does not return any values. It saves the generated plot to the specified file path.
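
The percentage annotations described in step 4 are taken relative to each row, that is, each actual category. A small sketch of that calculation in plain Python follows; the helper name `row_percentages` and the counts are hypothetical, and the exact text formatting used by the function may differ.

```python
def row_percentages(matrix):
    """Percentage of each actual class (row) assigned to each predicted class."""
    result = []
    for row in matrix:
        row_total = sum(row)
        result.append([100 * value / row_total if row_total else 0.0 for value in row])
    return result

# Hypothetical counts: rows are actual classes, columns are predicted classes
confusion_matrix_val = [[50, 2, 1], [3, 45, 2], [1, 2, 47]]
for row in row_percentages(confusion_matrix_val):
    print("  ".join(f"{value:5.1f}%" for value in row))
```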

#### Example Workflow

1. **Import Necessary Libraries**:
   Ensure you have the required libraries imported.
   ```python
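   import xgboost as xgb
   from sklearn.datasets import load_iris
   from sklearn.model_selection import train_test_split
   from oscars_toolbox.implement_xgb import evaluate_xgb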
   ```

2. **Load Dataset**:
   Load and prepare your dataset. In this example, we'll use the Iris dataset.
   ```python
   # Load the Iris dataset
   data = load_iris()
   X = data.data
   y = data.target

   # Split the dataset into training and validation sets
   X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
   ```

3. **Train the XGBoost Model**:
   Train an XGBoost model on the training data.
   ```python
   # Initialize the XGBoost classifier
   xgb_model = xgb.XGBClassifier(use_label_encoder=False, eval_metric='mlogloss')

   # Train the model
   xgb_model.fit(X_train, y_train)
   ```

4. **Evaluate the Model**:
   Use the `evaluate_xgb` function to evaluate the trained model on the validation set.
   ```python
   # Evaluate the model and get only the confusion matrix
   conf_matrix = evaluate_xgb(xgb_model, X_val, y_val)
   print("Confusion Matrix:\n", conf_matrix)

   # Evaluate the model and get all the metrics
   conf_matrix, accuracy, precision, recall, f1, roc_auc, time_per_sample = evaluate_xgb(xgb_model, X_val, y_val, return_extra_metrics=True)
   print("Confusion Matrix:\n", conf_matrix)
   print(f"Accuracy: {accuracy}")
   print(f"Precision: {precision}")
   print(f"Recall: {recall}")
   print(f"F1 Score: {f1}")
   print(f"ROC AUC Score: {roc_auc}")
   print(f"Time per sample: {time_per_sample} seconds")
   ```

### Example Output

This example workflow would produce output similar to the following:

```
Confusion Matrix:
 [[1.  0.  0. ]
  [0.  1.  0. ]
  [0.  0.  1. ]]
Confusion Matrix:
 [[1.  0.  0. ]
  [0.  1.  0. ]
  [0.  0.  1. ]]
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1 Score: 1.0
ROC AUC Score: 1.0
Time per sample: 0.0012345678901234567 seconds
```

This workflow provides a comprehensive way to evaluate an XGBoost model's performance on a validation dataset, using both basic and extended metrics depending on the user's needs.


Here is an example workflow demonstrating how to use the `plot_confusion_matrix` function:

### Example Workflow

1. **Import Necessary Libraries**:
   Ensure you have the required libraries imported.
   ```python
   import numpy as np
   from oscars_toolbox.implement_xgb import evaluate_xgb, plot_confusion_matrix
   ```

2. **Generate Confusion Matrices**:
   Generate or load confusion matrices for validation and test sets. For this example, we'll create synthetic data.
   ```python
   # Example confusion matrices
   confusion_matrix_val = np.array([[50, 2, 1], [3, 45, 2], [1, 2, 47]])
   confusion_matrix_test = np.array([[48, 4, 1], [2, 46, 2], [3, 1, 46]])

   # Categories
   allowed_categories = ['Category 1', 'Category 2', 'Category 3']

   # Example accuracies
   accuracy_val = 0.95
   accuracy_test = 0.94
   ```

3. **Plot and Save the Confusion Matrices**:
   Use the `plot_confusion_matrix` function to visualize and save the confusion matrices.
   ```python
   # Plot and save the confusion matrices
   save_path = 'confusion_matrices.png'
   plot_confusion_matrix(confusion_matrix_val, confusion_matrix_test, allowed_categories, accuracy_val, accuracy_test, save_path, show_percentages=True)
   ```

#### Example Output

Running the above workflow will generate a saved image (`confusion_matrices.png`) containing two confusion matrices, one for the validation set and one for the test set, with the respective accuracies displayed in the titles. If `show_percentages` is set to `True`, the cell values will be annotated with percentages.

This workflow demonstrates how to prepare confusion matrices and accuracies and pass them to `plot_confusion_matrix` to visualize and save them.
