Metadata-Version: 2.1
Name: pyjls
Version: 0.2.1
Summary: Joulescope™ file format
Home-page: https://joulescope.readthedocs.io
Author: Jetperch LLC
Author-email: joulescope-dev@jetperch.com
License: Apache 2.0
Project-URL: Bug Reports, https://github.com/jetperch/jls/issues
Project-URL: Funding, https://www.joulescope.com
Project-URL: Twitter, https://twitter.com/joulescope
Project-URL: Source, https://github.com/jetperch/jls/
Description: <!--
        # Copyright 2021 Jetperch LLC
        #
        # Licensed under the Apache License, Version 2.0 (the "License");
        # you may not use this file except in compliance with the License.
        # You may obtain a copy of the License at
        #
        #     http://www.apache.org/licenses/LICENSE-2.0
        #
        # Unless required by applicable law or agreed to in writing, software
        # distributed under the License is distributed on an "AS IS" BASIS,
        # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        # See the License for the specific language governing permissions and
        # limitations under the License.
        -->
        
        # JLS
        
        main: [![Build Status](https://travis-ci.org/jetperch/jls.svg?branch=main)](https://travis-ci.org/jetperch/jls)
        develop: [![Build Status](https://travis-ci.org/jetperch/jls.svg?branch=develop)](https://travis-ci.org/jetperch/jls)
        
        Welcome to the [Joulescope®](https://www.joulescope.com) File Format project.
        The goal of this project is to provide performant data storage for huge, 
        simultaneous, one-dimensional signals. This repository contains:
        
        * The JLS file format specification
        * The implementation in C
        * Language bindings for Python
        
        > **⚠ CAUTION ⚠**  
        > We are actively developing this library.  Many features are not 
        > implemented and the API is subject to rapid change.
        
        
        ## Features
        
        * Cross-platform
          * Microsoft Windows x64
          * Apple macOS x64
          * Apple macOS ARM 🔜  
          * Linux x64
        * Support for multiple, simultaneous data sources
        * Support for multiple, simultaneous signal waveforms
        * Fixed sample rate signals (FSR)
          * Handles missing samples gracefully (interpolate) 🔜
          * Multiple data types including:
            - Floating point: f32
            - Floating point: f64 🔜
            - Unsigned integers in nibble (4 bit) increments 🔜 
            - Signed integers in nibble (4 bit) increments 🔜
            - Fixed-point, signed integers in nibble (4 bit) increments 🔜
            - Boolean (digital) 1-bit signals 🔜
        * Variable sample rate (VSR) signals 🔜
        * Fast read performance
          * Signal Summaries
            * "Zoomed out" view with mean, min, max, standard deviation
            * Provides fast waveform load without any additional processing steps
          * Automatic load by summary level
          * Fast seek, next, previous access
        * Sample ID to Wall-clock time (UTC) for FSR signals 🔜
        * Annotations
          * Global VSR annotations
          * Signal annotations, timestamped to sample_id for FSR and UTC time for VSR
          * Support for text, marker, and user-defined (text, binary, JSON)
        * User data
          * Arbitrary data included in the same file
          * Support for text, binary, and JSON
        * Reliability
          * Integrated integrity checks using CRC32C
          * File data still accessible in the case of improper program termination 🔜
          * Uncorrupted data is still accessible in presence of file corruption 🔜
          * Write once, except for indices and the doubly-linked list pointers
        * Compression options 🔜
          * lossless 🔜
          * lossy 🔜
          * lossy with downsampling below threshold 🔜
        
        Items marked with 🔜 are under development and coming soon.
        As of Mar 2021, the JLS v2 file structure is well-defined.
        However, the datatype and compression storage formats are not 
        yet defined, and the software still needs to grow to support 
        the target feature set.
        
        
        ## Why JLS?
        
        The world is already full of file formats, and we would rather not create 
        another one.  However, we could not identify a solution that met these
        requirements.  [HDF5](https://www.hdfgroup.org/solutions/hdf5/) meets the
        large storage requirements, but not the reliability and rapid load requirements.
        The [Saleae binary export file format v2](https://support.saleae.com/faq/technical-faq/binary-export-format-logic-2)
        is also not suitable since it stores single, contiguous blocks.
        [Sigrok v2](https://sigrok.org/wiki/File_format:Sigrok/v2) is similar.
        The [Sigrok v3](https://sigrok.org/wiki/File_format:Sigrok/v3) format
        (under development as of Mar 2021) is better in that it stores sequences of
        "packets" containing data blocks, but it still does not allow for
        fast seek or summaries.
        
        Timeseries databases, such as [InfluxDB](https://www.influxdata.com/), are 
        powerful tools.  However, they are not well-designed for fast sample-rate
        data.
        
        Media containers are another option, especially the ISO base media file format
        used by MPEG4 and many others:
          * [ISO/IEC 14496-14:2020 Specification](https://www.iso.org/standard/79110.html)
          * [Overview](https://mpeg.chiariglione.org/standards/mpeg-4/iso-base-media-file-format)
        
        However, the standard does not include the ability to store the signal summaries
        and our specific signal types.
        
        
        ## Why JLS v2?
        
        This file format is based upon JLS v1 designed for
        [pyjoulescope](https://github.com/jetperch/pyjoulescope) and used by the
        [Joulescope](https://www.joulescope.com/) test instrument.  We are leveraging
        the lessons learned from v1 to make v2 better, faster, and more extensible.
        
        The JLS v1 format has been great for the Joulescope ecosystem and has
        accomplished the objective of long data captures (days) with fast
        sampling rates (MHz).  However, it now has a long list of issues that are difficult
        to address without a significant restructuring.  The issues include:
        
        - Inflexible storage format (always current, voltage, power, current range, GPI0, GPI1).
        - Unable to store from multiple sources.
        - Unable to store other sources and signals.
        - No annotation support: 
          [41](https://github.com/jetperch/pyjoulescope_ui/issues/41),
          [93](https://github.com/jetperch/pyjoulescope_ui/issues/93).
        - Inflexible user data support.
        - Inconsistent performance across sampling rates, zoom levels, and file sizes: 
          [48](https://github.com/jetperch/pyjoulescope_ui/issues/48),
          [103](https://github.com/jetperch/pyjoulescope_ui/issues/103).
        - Unable to correlate sample times with UTC:
          [55](https://github.com/jetperch/pyjoulescope_ui/issues/55).
        
        The JLS v2 file format will address all of these issues, dramatically 
        improve performance, and add new capabilities, such as signal compression.
        
        
        ## How?
        
        At its lowest layer, JLS is an enhanced 
        [tag-length-value](https://en.wikipedia.org/wiki/Type-length-value) (TLV)
        format. TLV files form the foundation of many reliable image and video formats, 
        including MPEG4 and PNG.  The enhanced header contains additional fields
        to speed navigation and improve reliability.  The JLS file format calls 
        each TLV a **chunk**.  The enhanced tag-length component is the **chunk header**
        or simply **header**.  The file also contains a **file header**, not to be 
        confused with the **chunk header**.  A **chunk** may have zero payload length,
        in which case the next header follows immediately.  Otherwise, a 
        **chunk** consists of a **header** followed by a **payload**. 
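        
        The chunk concept above can be illustrated with a minimal sketch of a
        plain TLV stream.  The header layout used here (1-byte tag, 4-byte
        little-endian payload length) is an assumption made for illustration
        only and is not the actual JLS chunk header, which carries additional
        fields.  The helper names `write_chunk` and `read_chunks` are
        likewise hypothetical.
        
        ```python
        import io
        import struct
        
        # Hypothetical header for illustration: 1-byte tag + 4-byte payload length.
        # The real JLS chunk header contains more fields than this.
        HEADER = struct.Struct("<BI")
        
        
        def write_chunk(stream, tag, payload=b""):
            """Write one chunk: header followed by an optional payload."""
            stream.write(HEADER.pack(tag, len(payload)))
            stream.write(payload)
        
        
        def read_chunks(stream):
            """Yield (tag, payload) pairs until the stream ends or is truncated."""
            while True:
                raw = stream.read(HEADER.size)
                if len(raw) < HEADER.size:
                    return  # end of stream, or truncated header
                tag, length = HEADER.unpack(raw)
                payload = stream.read(length)
                if len(payload) < length:
                    return  # truncated payload
                yield tag, payload
        
        
        buf = io.BytesIO()
        write_chunk(buf, 1, b"hello")
        write_chunk(buf, 2)               # zero payload length: next header follows
        write_chunk(buf, 3, b"\x00\x01")
        buf.seek(0)
        chunks = list(read_chunks(buf))
        ```
        
        Because every header states its payload length, a reader can skip
        chunks it does not understand without parsing their payloads.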
        
        The JLS file format supports sources that produce data.  The file allows
        the application to clearly define and label the source.  Each source
        can have any number of associated signals.
        
        Signals are 1-D sequences of values over time consisting of a single,
        fixed data type.  Each signal can have multiple tracks that contain
        data associated with that signal. The JLS file supports two signal types: 
        fixed sample rate (FSR) and variable sample rate (VSR).  FSR signals
        store their sample data in the FSR track using FSR_DATA and FSR_SUMMARY.
        FSR time is denoted in samples using the timestamp.  FSR signals also support:
        
        * Sample time to UTC time mapping using the UTC track.
        * Annotations with the ANNOTATION track. 
        
        VSR signals store their sample data in the VSR track.  VSR signals
        specify time in UTC (wall-clock time).  VSR signals also
        support annotations with the ANNOTATION track.
        The JLS file format supports VSR signals that only use the 
        ANNOTATION track and not the VSR track.  Such signals are commonly 
        used to store UART text data where each line contains a UTC timestamp. 
        
        Signals support DATA chunks and SUMMARY chunks.
        The DATA chunks store the actual sample data.  The SUMMARY chunks
        store the reduced statistics, where each statistic entry represents
        multiple samples.  This file format stores the mean, min, max, 
        and standard deviation.  Although standard deviation requires the
        writer to compute the square root, standard deviation keeps the
        same units and bit depth requirements as the other fields.  Variance
        requires twice the bit size for integer types since it is squared.
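        
        As a rough sketch of the reduction described above, the following
        computes one (mean, min, max, standard deviation) entry per block of
        samples.  The function names are hypothetical, and the use of the
        population standard deviation (dividing by n) is an assumption; this
        document does not state which estimator the JLS implementation uses.
        
        ```python
        import math
        
        
        def summarize(samples):
            """Return (mean, min, max, std) for one block of samples."""
            n = len(samples)
            mean = sum(samples) / n
            # Population variance (divide by n); an assumption for this sketch.
            variance = sum((x - mean) ** 2 for x in samples) / n
            return mean, min(samples), max(samples), math.sqrt(variance)
        
        
        def summarize_blocks(samples, step):
            """Reduce every `step` consecutive samples to one summary entry."""
            return [summarize(samples[i:i + step])
                    for i in range(0, len(samples), step)]
        
        
        entries = summarize_blocks([1, 3, 1, 3, 10, 10, 10, 10], 4)
        
        # Worst case for 16-bit signed samples: std stays near the sample range,
        # while the variance (std squared) needs roughly twice as many bits.
        extreme = summarize([-32768, 32767])
        extreme_variance = extreme[3] ** 2
        ```
        
        For the 16-bit worst case, the standard deviation is 32767.5, while
        the variance is about 1.07e9, which needs roughly 30 bits.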
        
        Before each SUMMARY chunk, the JLS file will contain the INDEX chunk
        which contains the starting time and offset for each chunk that 
        contributed to the summary.  This INDEX chunk enables fast O(log n)
        navigation of the file. 
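        
        The lookup that an INDEX chunk makes possible can be sketched as a
        binary search over starting times.  The payload shape used here, a
        sorted list of (starting sample_id, file offset) pairs, and the
        name `find_chunk` are assumptions for illustration, not the actual
        JLS payload encoding.
        
        ```python
        import bisect
        
        # Hypothetical INDEX payload: (starting sample_id, file offset) per chunk,
        # sorted by starting sample_id.
        index_entries = [(0, 4096), (1000, 8192), (2000, 12288), (3000, 16384)]
        starts = [start for start, _ in index_entries]
        
        
        def find_chunk(sample_id):
            """Return the (start, offset) entry of the chunk containing sample_id."""
            pos = bisect.bisect_right(starts, sample_id) - 1
            if pos < 0:
                raise ValueError("sample_id precedes the first chunk")
            return index_entries[pos]
        
        
        entry = find_chunk(2500)  # falls in the chunk that starts at sample 2000
        ```
        
        In a real file, the same idea would be applied level by level, from
        the highest index down to the DATA chunks.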
        
        The JLS file format design supports SUMMARY of SUMMARY.  It supports
        the DATA and up to 15 layers of SUMMARIES.  The timestamp is given as a
        64-bit integer, which allows each summary to include only 20 samples
        and still support the full 64-bit integer timestamp space.  In practice, the
        first level summary increases a single value to 4 values, so summary
        steps are usually 50 or more.
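        
        The figure of 20 samples per summary can be checked with a short
        calculation, assuming that every one of the 15 summary levels applies
        the same reduction factor.  The helper name `min_step` is
        hypothetical.
        
        ```python
        LEVELS = 15
        TIMESTAMP_SPACE = 2 ** 64
        
        
        def min_step(levels, space):
            """Smallest integer factor whose `levels`-th power covers `space`."""
            step = 1
            while step ** levels < space:
                step += 1
            return step
        
        
        step = min_step(LEVELS, TIMESTAMP_SPACE)
        ```
        
        Since 19^15 is about 1.5e19, which is less than 2^64 (about 1.8e19),
        and 20^15 is about 3.3e19, a factor of 20 is the smallest that spans
        the full timestamp space in 15 levels.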
        
        Many applications, including the Joulescope UI, prioritize read performance,
        especially visualizing the waveform quickly following open, 
        over write performance.   Waiting to scan through a 1 TB file is not a 
        valid option.  The reader opens the file and scans for sources and signals.
        The application can then quickly load the highest summary of summaries 
        for every signal of interest.  The application can very quickly display this
        data, and then start to retrieve more detailed information as requested.
        
        
        ## Example file structure
        
        ```
        sof
        header
        USER_DATA(0, NULL)    // Required, point to first real user_data chunk
        SOURCE_DEF(0)         // Required, internal, reserved for global annotations
        SIGNAL_DEF(0, 0.VSR)  // Required, internal, reserved for global annotations
        TRACK_DEF(0.VSR)
        TRACK_HEAD(0.VSR)
        TRACK_DEF(0.ANNO)
        TRACK_HEAD(0.ANNO)
        SOURCE_DEF(1)         // input device 1
        SIGNAL_DEF(1, 1, FSR) // our signal, like "current" or "voltage"
        TRACK_DEF(1.FSR)
        TRACK_HEAD(1.FSR)
        TRACK_DEF(1.ANNO)
        TRACK_HEAD(1.ANNO)
        TRACK_DEF(1.UTC)
        TRACK_HEAD(1.UTC)
        USER_DATA           // just because
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_INDEX(1.FSR, lvl=0)
        TRACK_SUMMARY(1.FSR, lvl=1)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_INDEX(1.FSR, lvl=0)
        TRACK_SUMMARY(1.FSR, lvl=1)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_DATA(1.FSR)
        TRACK_INDEX(1.FSR, lvl=0)
        TRACK_SUMMARY(1.FSR, lvl=1)
        TRACK_INDEX(1.FSR, lvl=1)
        TRACK_SUMMARY(1.FSR, lvl=2)
        USER_DATA           // just because
        eof
        ```
        
        Note that TRACK_HEAD(1.FSR) points to the first TRACK_INDEX(1.FSR, lvl=0) and
        TRACK_INDEX(1.FSR, lvl=1). 
        Each TRACK_DATA(1.FSR) is in a doubly-linked list with its next and previous
        neighbors.  Each TRACK_INDEX(1.FSR, lvl=0) is likewise in a separate doubly-linked
        list, and the payload of each TRACK_INDEX points to the summarized TRACK_DATA
        instances.  TRACK_INDEX(1.FSR, lvl=1) points to each TRACK_INDEX(1.FSR, lvl=0) instance.
        As more data is added, the TRACK_INDEX(1.FSR, lvl=1) will also get added to
        the INDEX chunks at the same level.
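        
        The doubly-linked structure described above can be sketched with
        chunks that refer to their neighbors by file offset.  The `Chunk`
        class, its field names, and the in-memory dictionary standing in for
        the file are all assumptions for illustration, not the JLS header
        layout.
        
        ```python
        from dataclasses import dataclass
        from typing import Optional
        
        
        @dataclass
        class Chunk:
            name: str
            offset: int                        # position of the chunk in the file
            prev_offset: Optional[int] = None  # previous chunk in the same list
            next_offset: Optional[int] = None  # next chunk in the same list
        
        
        def append(chunks, tail, chunk):
            """Link `chunk` after `tail`; only the old tail's next pointer changes."""
            chunks[chunk.offset] = chunk
            if tail is not None:
                chunk.prev_offset = tail.offset
                tail.next_offset = chunk.offset
            return chunk
        
        
        def walk_forward(chunks, head_offset):
            """Yield chunks in list order, starting from the head offset."""
            offset = head_offset
            while offset is not None:
                yield chunks[offset]
                offset = chunks[offset].next_offset
        
        
        chunks = {}
        tail = None
        for i, offset in enumerate([100, 500, 900]):
            tail = append(chunks, tail, Chunk(f"TRACK_DATA #{i}", offset))
        
        forward = [c.offset for c in walk_forward(chunks, 100)]
        ```
        
        Appending a chunk only rewrites the next pointer of the previous
        tail, which fits the "write once, except for indices and the
        doubly-linked list pointers" goal listed under Features.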
        
        
        ## References
        
        * JLS v1: 
          [lower-layer](https://github.com/jetperch/pyjoulescope/blob/master/joulescope/datafile.py),
          [upper-layer](https://github.com/jetperch/pyjoulescope/blob/master/joulescope/data_recorder.py).
        * [Sigrok/v3](https://sigrok.org/wiki/File_format:Sigrok/v3), which shares
          many of the same motivations.
        * Tag-length-value: [Wikipedia](https://en.wikipedia.org/wiki/Type-length-value).
        * Doubly linked list: [Wikipedia](https://en.wikipedia.org/wiki/Doubly_linked_list).
        
Keywords: JLS,Joulescope
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: Microsoft :: Windows :: Windows 10
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Embedded Systems
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: System :: Hardware :: Hardware Drivers
Classifier: Topic :: Utilities
Requires-Python: ~=3.8
Description-Content-Type: text/markdown
Provides-Extra: dev
