Metadata-Version: 2.1
Name: pelutils
Version: 0.6.0
Summary: Utility functions that are often useful
Home-page: https://github.com/peleiden/pelutils
Author: Søren Winkel Holm, Asger Laurits Schultz
Author-email: swholm@protonmail.com
License: BSD-3-Clause
Download-URL: https://pypi.org/project/pelutils/
Description: # pelutils
        
        Various utilities useful for Python projects. Features include
        
        - Feature-rich logger using `Rich` for colourful printing
        - Parsing for combining config files and command-line arguments - especially useful for parametric methods
        - Time taking and profiling
        - Easy-to-use data storage class for saving and loading data
        - Table formatting
        - Miscellaneous standalone functions providing various functionalities - see `pelutils/__init__.py`
        - Data-science submodule with extra utilities for statistics, plotting, and machine learning using `PyTorch`
        - `unique` function similar to `np.unique` but in linear time (currently Linux x86_64 only)
        
        `pelutils` supports Python 3.7+.
        
        ## Logging
        
        An easy-to-use logger that fits common needs.
        
        ```py
        # Configure logger for the script
        log.configure("path/to/save/log.log", "Title of log")
        
        # Start logging
        for i in range(70):  # Nice
            log("Execution %i" % i)
        
        # Sections
        log.section("New section in the logfile")
        
        # Verbose logging for less important things
        log.verbose("Will be logged")
        with log.unverbose:
            log.verbose("Will not be logged")
        
        # Error handling
        # The zero-division error and stacktrace is logged
        with log.log_errors:
            0 / 0
        # Entire chained stacktrace is logged
        with log.log_errors:
            try:
                0 / 0
            except ZeroDivisionError as e:
                raise ValueError("Denominator must be non-zero") from e
        
        # Disable printing while looping over a tqdm object so the progress bar is not printed repeatedly
        # Do not do this if the loop may be ended by a break statement!
        for elem in log.tqdm(tqdm(range(5))):
            log(elem)  # Will be logged, but not printed
        
        # User input
        inp = log.input("WHAT... is your favourite colour? ")
        
        # Log all logs from a function at the same time
        # This is especially useful when using multiple processes so logging does not get mixed up
        def fun(arg):  # Called once for each element in args
            log("Hello there")
            log("General Kenobi!")
        with mp.Pool() as p:
            p.map(collect_logs(fun), args)
        ```
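        
        The idea behind `log.log_errors` can be illustrated with the standard library alone. The following is only a sketch and not how `pelutils` implements it: the helper `log_errors` and the logger name `log_errors_demo` are hypothetical, and the built-in `logging` module stands in for `log`.
        
        ```py
        import contextlib
        import io
        import logging
        
        @contextlib.contextmanager
        def log_errors(logger: logging.Logger):
            """Log any exception raised inside the block, then re-raise it."""
            try:
                yield
            except Exception:
                # logger.exception records the message and the full traceback,
                # including exceptions chained with `raise ... from ...`
                logger.exception("Unhandled exception")
                raise
        
        # Write log records to an in-memory stream so they can be inspected
        stream = io.StringIO()
        demo_logger = logging.getLogger("log_errors_demo")
        demo_logger.addHandler(logging.StreamHandler(stream))
        demo_logger.propagate = False
        
        try:
            with log_errors(demo_logger):
                try:
                    0 / 0
                except ZeroDivisionError as e:
                    raise ValueError("Denominator must be non-zero") from e
        except ValueError:
            pass  # The error still propagates after being logged
        
        logged = stream.getvalue()  # Contains both the ZeroDivisionError and the ValueError
        ```
        
        Re-raising after logging keeps the behaviour of the surrounding program unchanged, so the context manager only adds a record of what went wrong.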
        
        ## Time Taking and Profiling
        
        Simple time taker inspired by MATLAB's `tic` and `toc`, with additional profiling tooling.
        
        ```py
        TT.tick()
        <some task>
        seconds_used = TT.tock()
        
        for i in range(100):
            TT.profile("Repeated code")
            <some task>
            TT.profile("Subtask")
            <some subtask>
            TT.end_profile()
            TT.end_profile()
        print(TT)  # Prints a table view of profiled code sections
        
        # Alternative syntax using with statement
        with TT.profile("The best task"):
            <some task>
        
        # Profile a loop
        # Do not do this if the loop may be ended by a break statement!
        for elem in TT.profile_iter(range(100), "The second best task"):
            <some task>
        
        # When using multiprocessing, it can be useful to simulate multiple hits of the same profile
        with mp.Pool() as p, TT.profile("Processing 100 items on multiple processes", hits=100):
            p.map(<some function>, <100 items>)
        ```
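        
        To make the tick/tock and profiling concepts concrete, here is a minimal standard-library sketch. It is not the `TickTock` implementation: the class `SimpleTimer` and its attributes `totals` and `hits` are hypothetical names used only for illustration.
        
        ```py
        import time
        from collections import defaultdict
        from contextlib import contextmanager
        
        class SimpleTimer:
            """Toy timer with tick/tock and named profiles."""
        
            def __init__(self):
                self._start = None
                self.totals = defaultdict(float)  # Accumulated seconds per profile
                self.hits = defaultdict(int)      # Number of hits per profile
        
            def tick(self):
                self._start = time.perf_counter()
        
            def tock(self) -> float:
                return time.perf_counter() - self._start
        
            @contextmanager
            def profile(self, name: str, hits: int = 1):
                start = time.perf_counter()
                try:
                    yield
                finally:
                    # Time is recorded even if the block raises
                    self.totals[name] += time.perf_counter() - start
                    self.hits[name] += hits
        
        timer = SimpleTimer()
        timer.tick()
        sum(range(10_000))
        seconds_used = timer.tock()
        
        for _ in range(3):
            with timer.profile("Repeated code"):
                sum(range(1_000))
        ```
        
        Recording the elapsed time in a `finally` clause means a profile is closed properly even when the profiled code throws.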
        
        ## Data Storage
        
        A data class that saves/loads its fields from disk.
        Anything that can be saved to a `json` file will be.
        Other data types will be saved to relevant file formats.
        
        ```py
        @dataclass
        class Person(DataStorage):
            name: str
            age: int
            numbers: np.ndarray
            subfolder = "older"
            json_name = "yoda.json"
        
        yoda = Person(name="Yoda", age=900, numbers=np.array([69, 420]))
        yoda.save("old")
        # Saved data at old/older/yoda.json
        # {
        #     "name": "Yoda",
        #     "age": 900
        # }
        # There will also be a file named numbers.npy
        yoda = Person.load("old")
        ```
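        
        The split between `json` fields and other file formats could look roughly like the sketch below. It is an illustration under stated assumptions, not the `DataStorage` implementation: the base class `SketchStorage` and the helper `_is_jsonable` are hypothetical, a `set` stands in for the `numpy` array, and `pickle` stands in for the "relevant file formats".
        
        ```py
        import json
        import pickle
        import tempfile
        from dataclasses import dataclass, fields
        from pathlib import Path
        
        def _is_jsonable(value) -> bool:
            try:
                json.dumps(value)
                return True
            except (TypeError, ValueError):
                return False
        
        @dataclass
        class SketchStorage:
            json_name = "data.json"  # No annotation, so this is a class attribute and not a field
        
            def save(self, folder):
                folder = Path(folder)
                folder.mkdir(parents=True, exist_ok=True)
                json_fields = {}
                for f in fields(self):
                    value = getattr(self, f.name)
                    if _is_jsonable(value):
                        json_fields[f.name] = value
                    else:
                        # Anything json cannot handle gets its own file
                        with open(folder / f"{f.name}.pkl", "wb") as fh:
                            pickle.dump(value, fh)
                (folder / self.json_name).write_text(json.dumps(json_fields, indent=4))
        
            @classmethod
            def load(cls, folder):
                folder = Path(folder)
                values = json.loads((folder / cls.json_name).read_text())
                for f in fields(cls):
                    if f.name not in values:
                        with open(folder / f"{f.name}.pkl", "rb") as fh:
                            values[f.name] = pickle.load(fh)
                return cls(**values)
        
        @dataclass
        class Person(SketchStorage):
            name: str
            age: int
            numbers: set  # Sets are not json serialisable
            json_name = "yoda.json"
        
        with tempfile.TemporaryDirectory() as tmp:
            yoda = Person(name="Yoda", age=900, numbers={69, 420})
            yoda.save(tmp)
            saved_files = sorted(p.name for p in Path(tmp).iterdir())
            loaded = Person.load(tmp)
        ```
        
        Keeping the human-readable fields in a single `json` file makes saved results easy to inspect, while larger or more complex objects live next to it in their own files.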
        
        ## Parsing
        
        Combines parsing of CLI and config file arguments, which allows for a powerful, easy-to-use workflow.
        Useful for parametric methods such as machine learning.
        
        A file `main.py` could contain:
        ```py
        options = {
            "learning-rate": { "default": 1.5e-3, "help": "Controls size of parameter update", "type": float },
            "gamma": { "default": 1, "help": "Use of generator network in updating", "type": float },
            "initialize-zeros": { "help": "Whether to initialize all parameters to 0", "action": "store_true" },
        }
        parser = Parser(options)
        location = parser.location  # Experiments are stored here
        experiments = parser.parse()
        parser.document_settings()  # Save a config file to reproduce the experiment
        ```
        
        This could then be run by
        `python main.py data/my-big-experiment --learning-rate 1e-5`
        or by
        `python main.py data/my-big-experiment --config cfg.ini`
        where `cfg.ini` could contain
        
        ```
        [DEFAULT]
        gamma = 0.95
        [RUN1]
        learning-rate = 1e-4
        initialize-zeros
        [RUN2]
        learning-rate = 1e-5
        gamma = 0.9
        ```
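        
        To illustrate how such a config file maps to individual runs, the sketch below reads it with the standard library's `configparser`. This only shows the file semantics, namely that `[DEFAULT]` values are inherited by every run and that a bare key acts as a flag. How `Parser` reads the file internally is not shown here, and the names `cfg_text` and `runs` are made up for the example.
        
        ```py
        import configparser
        
        cfg_text = "\n".join([
            "[DEFAULT]",
            "gamma = 0.95",
            "[RUN1]",
            "learning-rate = 1e-4",
            "initialize-zeros",
            "[RUN2]",
            "learning-rate = 1e-5",
            "gamma = 0.9",
        ])
        
        # allow_no_value=True lets a bare key such as initialize-zeros act as a boolean flag
        config = configparser.ConfigParser(allow_no_value=True)
        config.read_string(cfg_text)
        
        runs = {}
        for section in config.sections():  # DEFAULT is not listed as a section
            values = config[section]
            runs[section] = {
                "learning-rate": float(values["learning-rate"]),
                "gamma": float(values["gamma"]),  # Falls back to [DEFAULT] when absent
                "initialize-zeros": "initialize-zeros" in values,
            }
        ```
        
        Here `RUN1` inherits `gamma = 0.95` from `[DEFAULT]` and sets the flag, whereas `RUN2` overrides `gamma` and leaves the flag unset.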
        
        # pelutils.ds
        
        This submodule contains various utility functions for data science and machine learning. To make sure the necessary requirements are installed, install using
        ```
        pip install pelutils[ds]
        ```
        Note that in some shells, such as `zsh`, you will instead have to write
        ```
        pip install pelutils\[ds\]
        ```
        
        ## PyTorch
        
        All PyTorch functions work regardless of whether CUDA is available.
        
        ```py
        # Clear CUDA cache and synchronize
        reset_cuda()
        
        # Inference only: No gradients should be tracked in the following function
        # Same as putting entire function body inside with torch.no_grad()
        @no_grad
        def infer():
            <code that includes feedforwarding>
        
        # Feed forward in batches to prevent using too much memory
        # Every time a memory allocation error is encountered, the number of batches is doubled
        # Same as using y = net(x), but without risk of running out of memory
        bff = BatchFeedForward(net, len(x))
        y = bff(x)
        # Change to another network
        bff.update_net(net2)
        ```
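        
        The batch-doubling strategy described in the comments above can be sketched without PyTorch. This is an illustration only and not the `BatchFeedForward` implementation: `batched_apply` and `toy_net` are hypothetical, plain lists stand in for tensors, and `MemoryError` stands in for whatever out-of-memory error the real backend raises.
        
        ```py
        from typing import Callable, List, Sequence, Tuple
        
        def batched_apply(
            fn: Callable[[Sequence[float]], List[float]],
            x: Sequence[float],
            max_batches: int = 1024,
        ) -> Tuple[List[float], int]:
            """Apply fn to x in batches, doubling the number of batches on MemoryError."""
            n_batches = 1
            while True:
                batch_size = max(1, -(-len(x) // n_batches))  # Ceiling division
                try:
                    out: List[float] = []
                    for start in range(0, len(x), batch_size):
                        out.extend(fn(x[start:start + batch_size]))
                    return out, n_batches
                except MemoryError:
                    if batch_size == 1 or n_batches >= max_batches:
                        raise  # Cannot split any further
                    n_batches *= 2
        
        def toy_net(batch: Sequence[float]) -> List[float]:
            # Pretend that more than 30 elements at once do not fit in memory
            if len(batch) > 30:
                raise MemoryError("Batch too large")
            return [2 * v for v in batch]
        
        x = list(range(100))
        y, n_batches_used = batched_apply(toy_net, x)
        ```
        
        With 100 elements and a limit of 30 per batch, one and two batches both fail, and the call succeeds with four batches of 25.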
        
        ## Statistics
        
        Includes various commonly used statistical functions.
        
        ```py
        # Get one sided z value for exponential(lambda=2) distribution with a significance level of 1 %
        # scipy parametrises the exponential distribution by scale = 1 / lambda
        zval = z(alpha=0.01, two_sided=False, distribution=scipy.stats.expon(scale=1/2))
        
        # Get correlation, confidence interval, and p value for two vectors
        a, b = np.random.randn(100), np.random.randn(100)
        r, lower_r, upper_r, p = corr_ci(a, b, alpha=0.01)
        ```
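        
        As a reference for what such z values look like, the sketch below computes them with the standard library only (Python 3.8+ for `statistics.NormalDist`). It assumes the usual definition of the z value as the upper quantile of the distribution. The functions `z_value` and `exponential_quantile` are hypothetical and are not part of `pelutils`.
        
        ```py
        import math
        from statistics import NormalDist
        
        def z_value(alpha: float, two_sided: bool = True) -> float:
            """Upper quantile of the standard normal distribution for significance level alpha."""
            tail = alpha / 2 if two_sided else alpha
            return NormalDist().inv_cdf(1 - tail)
        
        def exponential_quantile(alpha: float, rate: float) -> float:
            """Inverse CDF of the exponential distribution with the given rate, evaluated at 1 - alpha."""
            return -math.log(alpha) / rate
        
        z_one_sided = z_value(0.01, two_sided=False)   # Approximately 2.326
        z_two_sided = z_value(0.01)                    # Approximately 2.576
        q_exponential = exponential_quantile(0.01, rate=2)  # Approximately 2.303
        ```
        
        The two-sided value is larger because the significance level is split evenly between both tails.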
        
        ## Matplotlib
        
        Contains predefined rc params, colours, and figure sizes.
        
        ```py
        # Set wide figure size
        plt.figure(figsize=figsize_wide)
        
        # Use larger font for larger figures - works well with predefined figure sizes
        update_rc_params(rc_params)
        
        # 15 different, unique colours
        c = iter(colours)
        for i in range(15):
            plt.plot(x[i], y[i], color=next(c))
        ```
        
        
        
        # History
        
        ## 0.6.0 - Breaking changes
        
        - A global instance of `TickTock`, `TT`, has been added - similar to `log`
        - Added `TickTock.profile_iter` for performing profiling over a for loop
        - Fixed wrong error being thrown when keyboard interrupting within `with TT.profile(...)`
        - All collected logs are now logged upon an exception being thrown when using `log.log_errors` and `collect_logs`
        - Made `log.log_errors` capable of handling chained exceptions
        - Made `log.throw` private, as it had little use and could be exploited
        - `get_repo` no longer throws an error if a repository has not been found
        - Added utility functions for reading and writing `.jsonl` files
        - Fixed incorrect `torch` installations breaking importing `pelutils`
        
        ## 0.5.9
        
        - Add `split_path` function which splits a path into components
        - Fix bug in `MainTest` where test files were not deleted
        
        ## 0.5.7
        
        - Logger prints to `stderr` instead of `stdout` at level WARNING or above
        - Added `log.tqdm` that disables printing while looping over a `tqdm` object
        - Fixed `from __future__ import annotations` breaking `DataStorage`
        
        ## 0.5.6
        
        - DataStorage can save all picklable formats + `torch.Tensor` specifically
        
        ## 0.5.5
        
        - Test logging now uses `Levels.DEBUG` by default
        - Added `TickTock.fuse_multiple` for combining several `TickTock` instances
        - Fixed bugs when using multiple `TickTock` instances
        - Allow multiple hits in single profile
        - Now possible to profile using `with` statement
        - Added method to logger to parse boolean user input
        - Added method to `Table` for adding vertical lines manually
        
        ## 0.5.4 - Breaking changes
        
        - Change log error colour
        - Replace default log level with print level that defaults to `Levels.INFO`
        
          `__call__` now always defaults to `Levels.INFO`
        - Print microseconds as `us` instead of `mus`
        
        ## 0.5.3
        
        - Fixed missing regex requirement
        
        ## 0.5.2
        
        - Allowed disabling printing by default in logger
        
        ## 0.5.1
        
        - Fixed accidental rich formatting in logger
        - Fixed logger crashing when not configured
        
        ## 0.5.0 - Breaking changes
        
        - Added np.unique-style unique function to `ds` that runs in linear time but does not sort
        - Replaced verbose/non-verbose logging with logging levels similar to built-in `logging` module
        - Added `with_print` option to `log.__call__`
        - Undid change from 0.3.4 such that `None` is now logged again
        - Added `format` module. Currently supports tables
        - Updated stringification of profiles to include percentage of parent profile
        - Added `throws` function that checks if a function throws an exception of a specific type
        - Use `Rich` for printing to console when logging
        
        ## 0.4.1
        
        - Added append mode to logger to append to old log files instead of overwriting
        
        ## 0.4.0
        
        - Added `ds` submodule for data science and machine learning utilities
        
          This includes `PyTorch` utility functions, statistics, and `matplotlib` default values
        
        ## 0.3.4
        
        - Logger now raises errors normally instead of using `throw` method
        
        ## 0.3.3
        
        - `get_repo` now accepts a custom path from which to search for the repository, as opposed to always using the working directory
        
        ## 0.3.2
        
        - `log.input` now also accepts iterables as input
        
          For such inputs, it will return a generator of user inputs
        
        ## 0.3.1 - Breaking changes
        
        - Added functionality to logger for logging repository commit
        - Removed function `get_commit`
        - Added function `get_repo` which returns repository path and commit
        
          It attempts to find a repository by searching from working directory and upwards
        - Updates to examples in `README` and other minor documentation changes
        - `set_seeds` no longer returns seed, as this is already given as input to the function
        
        ## 0.3.0 - Breaking changes
        
        - Only works for Python 3.7+
        
        - If logger has not been configured, it now does no logging instead of crashing
        
          This prevents dependencies that use the logger from crashing the program if it is not used
        - `log.throw` now also logs the actual error rather than just the stack trace
        - `log` now has public property `is_verbose`
        - Fixed `with log.log_errors` always throwing errors
        - Added code samples to `README`
        - `Parser` no longer automatically determines if experiments should be placed in subfolders
        
          Instead, this is given explicitly as an argument to `__init__`
        
          It also supports boolean flags in the config file
        
        ## 0.2.13
        
        - Re-add clean method to logger
        
        ## 0.2.12 - Breaking changes
        
        - The logger is now solely a global variable
        
          Different loggers are handled internally in the global _Logger instance
        
        ## 0.2.11
        
        - Add catch property to logger to allow automatically logging errors using a `with` statement
        - All code is now indented using spaces
        
        ## 0.2.10
        
        - Allow finer verbosity control in logger
        - Allow multiple log commands to be collected and logged at the same time
        - Add decorator for aforementioned feature
        - Change thousand_seps from TickTock method to stand-alone function in `__init__`
        - Verbose logging now has same signature as normal logging
        
        ## 0.2.8
        
        - Add utility for executing code with specific environment variables
        
        ## 0.2.7
        
        - Fix error where the full stacktrace was not printed by log.throw
        - `set_seeds` now checks if torch is available
        
          This means torch seeds are still set without needing it as a dependency
        
        ## 0.2.6 - Breaking changes
        
        - Make Unverbose class private and update documentation
        - Update formatting when using .input
        
        ## 0.2.5
        
        - Add input method to logger
        
        ## 0.2.4
        
        - Better logging of errors
        
        ## 0.2.1 - Breaking changes
        
        - Removed torch as dependency
        
        ## 0.2.0 - Breaking changes
        
        - Logger is now a global variable, `log`
        
          Logging should happen by importing the log variable and calling `.configure` to set it up
        
          To reset the logger, `.clean` can be called
        - It is still possible to just import `Logger` and use it in the traditional way, though `.configure` should be called first
        - Changed timestamp function to give a cleaner output
        - `get_commit` now returns `None` if `gitpython` is not installed
        
        ## 0.1.2
        
        - Update documentation for logger and ticktock
        - Fix bug where separator was not an argument to `Logger.__call__`
        
        ## 0.1.0
        
        - Include `DataStorage`
        - Logger can throw errors and handle separators
        - TickTock includes time handling and units
        - Minor parser path changes
        
        ## 0.0.1
        
        - Logger, Parser, and TickTock added from previous projects
        
Keywords: utility,logger,parser,profiling
Platform: UNKNOWN
Description-Content-Type: text/markdown
Provides-Extra: ds
