Metadata-Version: 1.0
Name: matricks
Version: 0.3.20
Summary: manipulate datasets encoded as 2-D matrices with annotation (first) row and (first) column
Home-page: http://pypi.python.org/pypi/matricks/
Author: Nick Seidenman
Author-email: seidenman@wehi.edu.au
License: UNKNOWN
Description: ===========================================
        Matricks: Manipulating Datasets as Matrices
        ===========================================
        
        Class for importing and querying expression datasets organized as a column-
        and row-annotated matrix.
        
        Expression datasets contain the numeric results of one or more samples
        derived from microarray assays.   Common to each of the assays is the
        specific platform (microarray).   The dataset can be regarded as a table
        with rows and columns.  Each column represents a single assay, and each row
        contains the assay results for a specific probe on the assay platform.  Thus,
        the values in any given row are those obtained from the same probe location
        on the platform.  Each such row is referred to as an `expression profile`.
        
        A dataset can be regarded as a table, such as this one:
        
        +----------+-------+-------+-------+-------+
        | probe_id | HSC 1 | HSC 2 | NK 1  | NK 2  | 
        +==========+=======+=======+=======+=======+
        | 45283    | 10.14 |  9.31 |   8.9 |  8.78 |
        +----------+-------+-------+-------+-------+
        | 45284    | 12.52 | 12.63 | 12.55 | 11.96 |
        +----------+-------+-------+-------+-------+
        | 45285    |  6.78 |  6.91 |  7.83 |  7.86 |
        +----------+-------+-------+-------+-------+
        | 45286    |  5.58 |  5.06 |  6.69 |  6.64 |
        +----------+-------+-------+-------+-------+
        | 45287    |  7.85 |  8.13 |  8.47 |  8.56 |
        +----------+-------+-------+-------+-------+
        | 45288    |  8.12 |  7.17 |  8.71 |  8.08 |
        +----------+-------+-------+-------+-------+
        | 45289    |  6.82 |  6.15 |  5.87 |  5.32 |
        +----------+-------+-------+-------+-------+
        | 45290    | 10.55 | 10.39 |  10.7 |  9.93 |
        +----------+-------+-------+-------+-------+
        
        
        Expression datasets, with rare exception, are stored in text (i.e. flat) files
        that have the following format:
        
        * two or more rows of data, delimited by ASCII newline (\\x0a) characters.
          (Strictly speaking, there needn't be any data at all, but what's the point of that?)
        
        * each line or row consists of two or more columns of data, delimited by ASCII TAB (\\x09) characters.
        
        * the first column contains the key, or `probe ID`, for the probe; it is assumed to be alpha-numeric.
        
        * the first row consists of labels identifying the probe ID and sample columns.  These, too, are assumed
          to be alpha-numeric.
        
        * the second through last rows contain expression values which, aside from the first column
          (the probe ID), are assumed to be floating point numbers.  In microarray parlance,
          each row is typically referred to as an `expression profile`.
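        
        The layout described in the list above can be illustrated with a minimal sketch that uses
        only the Python standard library.  It does not use or represent the `Matricks` API; the
        function name ``read_expression_table`` and the sample text are hypothetical and exist
        purely for illustration.  The sketch assumes a label row is present and that every value
        after the first column parses as a float.

        ```python
        import csv
        import io

        # Sample text in the layout described above: TAB-delimited columns,
        # newline-delimited rows, labels in the first row, probe IDs in the
        # first column.  (Hypothetical excerpt of the table shown earlier.)
        SAMPLE = (
            "probe_id\tHSC 1\tHSC 2\tNK 1\tNK 2\n"
            "45283\t10.14\t9.31\t8.9\t8.78\n"
            "45284\t12.52\t12.63\t12.55\t11.96\n"
        )


        def read_expression_table(handle):
            """Return (labels, profiles) read from a TAB-delimited dataset."""
            reader = csv.reader(handle, delimiter="\t")
            labels = next(reader)  # first row: probe ID label and sample labels
            profiles = []
            for row in reader:
                if not row:
                    continue  # skip blank lines
                # The probe ID stays a string; remaining columns become floats.
                profiles.append([row[0]] + [float(value) for value in row[1:]])
            return labels, profiles


        labels, profiles = read_expression_table(io.StringIO(SAMPLE))
        for profile in profiles:
            # Each profile is a plain list: probe ID first, then expression values.
            print(profile[0], profile[1:])
        ```

        A dataset without a label row, or with non-float values, would need different handling
        than this sketch provides.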
        
        Some datasets may differ from this format.  For instance, there may be no (first) row of labels,
        or the data may be of some format other than floating point.  Provision is made for handling these
        arguably special cases.  However, the default settings for instantiating a `Matricks` object
        make the foregoing assumptions about the contents of raw source data.  It is further assumed that
        the source dataset is encoded as ASCII strings, so all numeric data must be converted
        to ``float`` objects.
        
        `Matricks` selection operations generally return `Matricks` objects.   These can be iterated,
        row-wise, much like lists or tuples, to access individual expression profiles, the contents of which
        can be retrieved using list / tuple semantics.
        
        
Keywords: dataset manipulation algebra bioinformatics
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: License :: OSI Approved :: BSD License
