Example
=======

Core of *data-migrator* is the unix pipe and filter paradigm to build data transformers. Source data
is read from a database or some other source. It is piped to a filter written with *data-migrator*,
which emits for example SQL insert statements. These can be piped to a target client.

Consider the most basic datapump in mysql, which is direct pipe between source and target database:

.. code-block:: bash

  $ mysqldump -u [uname] -p[pass] source_db table  | mysql target_db

In this example ``mysqldump`` will export the table as SQL statements and the new database will process
them. Now if you want to do something extra and repeatable with respect to the data, you could
use all kinds of unix filtering with sed, awk, or another favorite poison. It is not very hard to
imagine what Pythonista's would do especially if extra transformations or simplifications are needed.
The basic packages are quite expressive and one would easily setup something like:

.. code-block:: bash

  $ mysql source_db -E 'select * from table' -B  | python my_filter.py | mysql target_db

With `my_filter.py` a basic implementation of `csv`_:

.. code-block:: python

  import sys, csv

  reader = csv.DictReader(sys.stdin)

  for row in reader:
    print 'INSERT INTO `table` (a,b) VALUES ("%(a)s", %(b)s)' % row


To see the options for manipulation is left as an exercise to the reader, but do accept that
as soon things become just a little more complex (think: splitting in two tables, column
reverses, renaming of columns, mixing, joining, filtering, transforming), more declarative support
is helpful. That is why we came up with *data-migrator*. One could simply replace this with:

.. code-block:: python

  from data_migrator import models, transform
  from data_migrator.emitters import MySQLEmitter

  def parse_b(v):
    if v == 'B':
      return 'transformed_B'
    else:
      return v.lower()

  class Result(models.Model):
    id   = models.IntField(pos=0) # keep id
    uuid = models.UUIDField()     # generate new uuid4 field
    # replace NULLs and trim
    a    = models.StringField(pos=1, default='NO_NULL', max_length=5, nullable='NULL', replacement=lambda x:x.upper())
    # parse this field
    b    = models.StringField(pos=2, parse=parse_b, name='my_b')

  class Meta:
    table_name = 'new_table_name'

  # django-esc like creating and saving (to a manager)
  Result(a='my a', b='my b').save()

  if __name__ == "__main__":
    transform.Transformer(models=[Result], emitter=MySQLEmitter).process()

    assert(len(Result.objects) > 1)


And have a nice self explaining transformer which will generate something like:

.. code-block:: sql

  -- transformation for Result to table new_table_name
  -- input headers: id,a,b
  -- stats: in=10,dropped=0,out=10

  SET SQL_SAFE_UPDATES = 0; -- you need this to delete without WHERE clause
  DELETE FROM `new_table_name`;
  ALTER TABLE `new_table_name` AUTO_INCREMENT = 1;

  INSERT INTO `new_table_name` (`id`, `uuid`, `a`, `my_b`) VALUES (0, "ac7100b9-c9ad-4069-8ca5-8db1ebd36fa3", "MY A", "my b");
  INSERT INTO `new_table_name` (`id`, `uuid`, `a`, `my_b`) VALUES (1, "38211712-0eb2-4433-b28f-e3fe33492e7a", "NO_NULL", "some value");
  INSERT INTO `new_table_name` (`id`, `uuid`, `a`, `my_b`) VALUES (2, "a3478903-aed9-462c-8f47-7a89013bc6ea", "CHOPP", "transformed_B");

If you can see the value of this package, continue with the :doc:`installation </install>` of *data-migrator*.
After which there are some more pages in the :doc:`Tutorial </tutorial/tutorial0>` for you to get a grasp of this library.

.. _csv: https://docs.python.org/2/library/csv.html
