Metadata-Version: 2.0
Name: pysaurio
Version: 0.2.1
Summary: A tool for searching & extracting information from multiple text files.
Home-page: https://pypi.python.org/pypi/pysaurio
Author: Antonio Suárez Jiménez
Author-email: pherkad13@gmail.com
License: GNU GPLv3
Keywords: pysaurio search extract text csv collect data merge join pyraptor
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5

Pysaurio
========

This package contains two tools: Raptor and Reptar.

Raptor
------

**Raptor** extracts and displays information from a set of files of the same type, and creates a single file with all the selected information.

The information in the files may be in multiple rows::

    PC01.txt:
    User=ms123
    Name=Mayra Sanz
    OS=GNU/Linux
    IP=10.226.140.1

The information may also be arranged in several columns, and multiple fields can be read from a single line::

    PC01.log:
    User: ms123     Name: Mayra Sanz
    OS: GNU/Linux   IP: 10.226.140.1        
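
As a rough illustration of reading several fields from one line, the plain-Python sketch below does not use Pysaurio at all. The helper name ``extract_fields`` and the cut-off rule (a value ends where the next search string on the line begins) are assumptions made for this example, not a description of how Raptor works internally.

```python
def extract_fields(line, search_strings):
    """Return {field: value} for every search string found in line.

    Assumption: a value runs from the end of its search string to the
    start of the next search string on the line (or to the line end).
    """
    # Locate each search string that occurs in the line, ordered by position.
    hits = sorted(
        (line.find(text), name, text)
        for name, text in search_strings.items()
        if text in line
    )
    values = {}
    for i, (pos, name, text) in enumerate(hits):
        start = pos + len(text)
        end = hits[i + 1][0] if i + 1 < len(hits) else len(line)
        values[name] = line[start:end].strip()
    return values


fields = {"user": "User:", "name": "Name:", "os": "OS:", "ip": "IP:"}
record = {}
for line in ["User: ms123     Name: Mayra Sanz",
             "OS: GNU/Linux   IP: 10.226.140.1"]:
    record.update(extract_fields(line, fields))
print(record)
```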

**Example:** data from the following files::

    PC01.txt:
    User=ms123
    Name=Mayra Sanz
    OS=GNU/Linux
    IP=10.226.140.1

    PC02.txt:
    User=lt001
    Name=Luis Toribio
    OS=GNU/Linux
    IP=10.226.140.2

    PC03.txt:
    User=co205
    Name=Clara Osto
    OS=Win
    IP=10.226.140.3

... You can create a CSV file with the following information::

    users.csv:
    User,Name,OS,IP
    MS123,Mayra Sanz,GNU/Linux,10.226.140.1
    LT001,Luis Toribio,GNU/Linux,10.226.140.2
    CO205,Clara Osto,Win,10.226.140.3

To achieve this, create a template (.rap) with Raptor. The template is similar to an INI file and contains the following information::

    users.rap:
    [General]
    description = Get list of users
    extension = txt
    prefix = PC
    output_folder = txt
    input_folder = txt
    output_file = users.csv
    delimiter = ,
    quotechar = "
    include_header = 1
    include_file = 0
    include_record_num = 0
    include_empty_record = 0
    search_multiple = 0
    alternate_header =

    [Fields]
    user = User=
    name = Name=
    os = OS=
    ip = IP=

    [Rules]
    rule1 = ('user', 'UPPER')                                       
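
Because the template resembles an INI file, its contents can be inspected with the standard library. The sketch below is only an illustration: it assumes the file can be parsed by ``configparser`` and that each rule is written as a Python tuple literal, and it is not necessarily how Pysaurio reads templates. The ``TEMPLATE`` string is a shortened copy of the example above.

```python
import ast
import configparser

# Shortened copy of the users.rap example, kept inline so the sketch runs alone.
TEMPLATE = """\
[General]
description = Get list of users
extension = txt
prefix = PC
delimiter = ,
quotechar = "
include_header = 1
alternate_header =

[Fields]
user = User=
name = Name=

[Rules]
rule1 = ('user', 'UPPER')
"""

# interpolation=None keeps any '%' characters in search strings literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(TEMPLATE)

general = dict(parser["General"])
fields = dict(parser["Fields"])
# Assumption: every rule value is a valid Python tuple literal.
rules = [ast.literal_eval(value) for value in parser["Rules"].values()]

print(general["delimiter"], fields, rules)
```

Only the first ``=`` on a line separates the key from the value, so a search string such as ``User=`` survives parsing unchanged.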

To create a .rap template (if the .rap template already exists, it is not saved)::

    from pysaurio import Raptor  
    rap1 = Raptor()       
    rap1.description = 'Get list of users'
    rap1.extension = 'txt'
    rap1.prefix = 'PC'
    rap1.input_folder = 'txt'
    rap1.output_folder = 'txt'
    rap1.output_file = 'users.csv'
    rap1.delimiter = ','
    rap1.quotechar = '"'
    rap1.include_header = '1'
    rap1.include_file = '0'
    rap1.include_record_num = '0'
    rap1.include_empty_record = '0'
    rap1.search_multiple = '0'
    rap1.alternate_header = ''
    rap1.fields['user'] = 'User='
    rap1.fields['name'] = 'Name='
    rap1.fields['os'] = 'OS='
    rap1.fields['ip'] = 'IP='
    rap1.rules.append(('user', 'UPPER'))
    rap1.rules.append(('name', 'REMOVEFROM', ' '))     
    rap1.Save("users.rap")  
    del rap1

**Attribute List:**

- description: short description of the .rap template
- extension: extension of the files to read
- prefix: files must begin with this string
- input_folder: folder of files to read
- output_folder: output folder to save file with result
- output_file: output filename
- delimiter: delimiter character
- quotechar: quote character
- include_header: '0' or '1'
- include_file: '0' or '1'
- include_record_num: '0' or '1'
- include_empty_record: '0' or '1'
- search_multiple: '0' or '1'
- alternate_header: alternative text of the report header
- fields: dictionary with fieldnames and search string (read template)
- record: dictionary with fieldnames and values (read template)
- rules: list of rules (read template)
- list_files: list of filenames to read (auto)
- record_counter: number of records (auto) 
- errors: list of errors (auto)
- number_errors: number of errors after you open or save a template                 

**Functions available for rules:**

- rule1 = (fieldname, 'SUBSTR', initial_position, length)
- rule1 = (fieldname, 'REPLACE', search_string, replace_string)
- rule1 = (fieldname, 'REPLACEALL', search_string, replace_string)
- rule1 = (fieldname, 'UPPER')
- rule1 = (fieldname, 'LOWER')
- rule1 = (fieldname, 'REVERSE')
- rule1 = (fieldname, 'REMOVE')
- rule1 = (fieldname, 'FIELDISDATA')
- rule1 = (fieldname, 'REMOVEFROM', 'string')
- rule1 = (fieldname, 'REMOVETO', 'string')
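
To make the tuple format concrete, here is a minimal sketch of how rules of this shape could be applied to a record. It is not Pysaurio's code: the function ``apply_rules`` is hypothetical, only four of the rule names are covered, and their behaviour is guessed from the names (for example, 'REPLACEALL' is assumed to replace every occurrence).

```python
def apply_rules(record, rules):
    """Apply (fieldname, function, *args) rules to a copy of record."""
    # Assumed meaning of each rule name; the real library may differ.
    handlers = {
        "UPPER": lambda value: value.upper(),
        "LOWER": lambda value: value.lower(),
        "REVERSE": lambda value: value[::-1],
        "REPLACEALL": lambda value, old, new: value.replace(old, new),
    }
    result = dict(record)
    for fieldname, function, *args in rules:
        # Unknown fields or functions are skipped in this sketch.
        if fieldname in result and function in handlers:
            result[fieldname] = handlers[function](result[fieldname], *args)
    return result


record = {"user": "ms123", "name": "Mayra Sanz", "os": "GNU/Linux"}
rules = [("user", "UPPER"), ("os", "REPLACEALL", "GNU/", "")]
print(apply_rules(record, rules))
```

Rules are applied in list order, so a later rule sees the result of an earlier one on the same field.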

To open a template (.rap) and create a CSV file from the data read from multiple text files::

    from pysaurio import Raptor
    import csv

    rap2 = Raptor()
    rap2.Open('users.rap')
    if rap2.number_errors == 0:
        # newline='' lets the csv module control line endings
        with open(rap2.output_file, 'w', newline='') as file_csv:
            csv_output = csv.writer(file_csv,
                                    delimiter=rap2.delimiter,
                                    quotechar=rap2.quotechar,
                                    quoting=csv.QUOTE_MINIMAL)
            if rap2.include_header == '1':
                fields_list = rap2.BuildHeader()
                print(fields_list)
                csv_output.writerow(fields_list)

            # Build one CSV row per input file
            for row in rap2.list_files:
                valid_record, new_record = rap2.BuildRow(row)
                new_record = rap2.ApplyRules(new_record)
                if valid_record:
                    new_record = list(new_record.values())
                    print(new_record)
                    csv_output.writerow(new_record)
    else:
        print(rap2.ShowError())
    del rap2


Reptar
------

**Reptar** merges files, including only the lines you need.

**Example:** data from the following files::

    PCS01.txt:
    User,Name,OS,IP
    ms123,Mayra Sanz,GNU/Linux,10.226.140.1
    lt001,Luis Toribio,GNU/Linux,10.226.140.2
    co205,Clara Osto,Win,10.226.140.3

    PCS02.txt:
    User,Name,OS,IP
    nn345,Nadia Pacheco,Win,10.226.140.4
    jm401,Juan Madrid,GNU/Linux,10.226.140.5

... You can create a file with the following information::

    Linux.csv:
    User,Name,OS,IP
    MS123,MAYRA SANZ,GNU/LINUX,10.226.140.1
    LT001,LUIS TORIBIO,GNU/LINUX,10.226.140.2
    CO205,CLARA OSTO,WIN,10.226.140.3
    JM401,JUAN MADRID,GNU/LINUX,10.226.140.5

In this example, lines that contain the text "Linux" or begin with the text "co205" are included::

    from pysaurio import Reptar
    rep1 = Reptar()       
    rep1.description = 'Get list of Linux users'
    rep1.extension = 'txt'
    rep1.prefix = 'PCS'
    rep1.input_folder = 'txt'
    rep1.output_folder = 'txt'
    rep1.output_file = 'Linux.csv'
    rep1.include_header = '1'
    rep1.include_file = '0'
    rep1.include_record_num = '0'
    rep1.alternate_header = ''
    rep1.lines.append(('INCLUDE', 'Linux'))
    rep1.lines.append(('INCLUDRE', '^co205'))
    rep1.rules.append(('line', 'UPPER'))            
    rep1.Save("linux.rep")
    del rep1

    # Open the .rep template and create the output file

    import os

    rep2 = Reptar()
    rep2.Open('linux.rep')
    if rep2.number_errors == 0:
        with open(rep2.output_file, 'w') as file_csv:
            if rep2.include_header == '1':
                header = rep2.BuildHeader(rep2.list_files[0])
                print(header)
                file_csv.write(header + '\n')

            for row in rep2.list_files:
                # os.path.join adds the path separator if input_folder lacks one
                with open(os.path.join(rep2.input_folder, row), 'rb') as current_file:
                    for raw_line in current_file:
                        # Decode each line, ignoring invalid UTF-8 bytes
                        new_record = raw_line.decode("utf-8", "ignore")
                        valid_record, new_record = rep2.BuildRow(new_record, row)
                        if valid_record:
                            new_record = rep2.ApplyRules(new_record)
                            print(new_record)
                            file_csv.write(new_record + '\n')
    else:
        print(rep2.ShowError())
    del rep2


**Functions available for including and excluding lines:**

- line1 = ('EXCLUDE', 'string')
- line1 = ('INCLUDE', 'string')
- line1 = ('EXCLUDRE', 'regex')  # See module re
- line1 = ('INCLUDRE', 'regex')  # See module re
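
The filter tuples can be pictured with the following plain-Python sketch, which does not call Pysaurio. The helpers ``matches`` and ``keep_line`` are hypothetical. The sketch assumes that a line is kept when any include filter matches, as in the example above, and additionally assumes that any matching exclude filter drops the line and that a list with no include filters keeps everything not excluded.

```python
import re


def matches(kind, pattern, line):
    """Plain substring test, or re.search for the *RE variants."""
    if kind.endswith("RE"):
        return re.search(pattern, line) is not None
    return pattern in line


def keep_line(line, filters):
    includes = [f for f in filters if f[0].startswith("INCLUD")]
    excludes = [f for f in filters if f[0].startswith("EXCLUD")]
    # Assumption: exclusion wins over inclusion.
    if any(matches(kind, pattern, line) for kind, pattern in excludes):
        return False
    # Assumption: with no include filters, every remaining line is kept.
    if not includes:
        return True
    return any(matches(kind, pattern, line) for kind, pattern in includes)


filters = [("INCLUDE", "Linux"), ("INCLUDRE", "^co205")]
lines = [
    "ms123,Mayra Sanz,GNU/Linux,10.226.140.1",
    "lt001,Luis Toribio,GNU/Linux,10.226.140.2",
    "co205,Clara Osto,Win,10.226.140.3",
    "nn345,Nadia Pacheco,Win,10.226.140.4",
    "jm401,Juan Madrid,GNU/Linux,10.226.140.5",
]
kept = [line for line in lines if keep_line(line, filters)]
print(kept)
```

With these filters the line for ``nn345`` is the only one dropped, which agrees with the Linux.csv listing in the example.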

The package contains more examples and data files for testing.

Changelog
---------

- Pysaurio 0.2.0 - Initial release (continued "Pyraptor").
- Pysaurio 0.2.1 - Reptar now supports rules, and regular expressions can be used in the 'Lines' section.



