Metadata-Version: 2.1
Name: pandas-diff
Version: 1.4.7
Summary: Python utility to extract differences between two pandas dataframes.
Home-page: https://github.com/jaimevalero/pandas_diff
Author: Jaime Valero
Author-email: jaimevalero78@gmail.com
License: MIT license
Keywords: pandas_diff
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.6
License-File: LICENSE
License-File: AUTHORS.rst

Pandas Diff
===========

|CodeFactor| |Python 3|

Installation
------------

Install pandas_diff with pip:

.. code:: bash

   pip install pandas_diff

Usage/Examples
--------------

.. code:: python

   import pandas as pd
   import pandas_diff as pd_diff

   # Create two example dataframes
   df_infinity_war = pd.DataFrame([
       {"hero": "hulk", "power": "strength"},
       {"hero": "black_widow", "power": "spy"},
       {"hero": "thor", "hammers": 0},
       {"hero": "thor", "hammers": 1},
   ])
   df_endgame = pd.DataFrame([
       {"hero": "hulk", "power": "smart"},
       {"hero": "captain marvel", "power": "strength"},
       {"hero": "thor", "hammers": 2},
   ])

   # Get the differences, using the column "hero" as the key
   df = pd_diff.get_diffs(df_infinity_war, df_endgame, "hero")

   df

   #    operation  object_keys  object_values   object_json                                        attribute_changed  old_value  new_value
   # 0  create     [hero]       captain marvel  {'hero': 'captain marvel', 'power': 'strength'...  NaN                NaN        NaN
   # 1  delete     [hero]       black_widow     {'hero': 'black_widow', 'power': 'spy', 'hamme...  NaN                NaN        NaN
   # 2  modify     [hero]       thor            {'hero': 'thor', 'power': nan, 'hammers': 2.0}     hammers            1          2
   # 3  modify     [hero]       hulk            {'hero': 'hulk', 'power': 'smart', 'hammers': ...  power              strength   smart

Why pandas_diff? Use cases
--------------------------

Migrating from batch to an event driven architecture
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At my work, we use many data pipelines to pull information from external
platforms (Active Directory, GitHub, Jira). Each run loads the new data
by replacing the entire table.

By using pandas_diff, we detect how the infrastructure changes between
executions and stream those change events into a Kafka cluster, so
other teams can subscribe to the events they care about. Also, by defining
a pandas_diff step in the master pipeline, every item in our project has
its life cycle events tracked.
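
As a rough illustration of the streaming step, the sketch below turns diff
rows into JSON messages. It assumes the rows are plain dictionaries with the
columns shown in the usage output (for instance obtained with
``df.to_dict("records")``). The function ``diff_rows_to_events`` and the
``<source>.<operation>`` topic naming are hypothetical, not part of
pandas_diff, and the hand-off to a Kafka producer is left out.

.. code:: python

   import json

   def diff_rows_to_events(rows, source):
       """Turn diff rows into (topic, json_message) pairs, one per change."""
       events = []
       for row in rows:
           # Hypothetical topic naming: "<source>.<operation>"
           topic = f"{source}.{row['operation']}"
           payload = {
               "key": row["object_values"],
               "operation": row["operation"],
           }
           if row["operation"] == "modify":
               payload["attribute"] = row["attribute_changed"]
               payload["old"] = row["old_value"]
               payload["new"] = row["new_value"]
           # default=str keeps non-JSON types (e.g. numpy scalars) serializable
           events.append((topic, json.dumps(payload, sort_keys=True, default=str)))
       return events

   # Rows shaped like the usage example output (assumed layout)
   rows = [
       {"operation": "create", "object_values": "captain marvel"},
       {"operation": "modify", "object_values": "hulk",
        "attribute_changed": "power", "old_value": "strength", "new_value": "smart"},
   ]
   events = diff_rows_to_events(rows, source="heroes")
   for topic, message in events:
       # A real pipeline would pass these to a Kafka producer instead
       print(topic, message)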

Events log
~~~~~~~~~~

By using pandas_diff, you get an event log for every item in a table,
which lets you audit how resources are being consumed.
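
One possible way to build such a log is to append the diff rows of each run
to a per-item history. This is only a sketch: ``append_to_event_log`` is a
hypothetical helper, the rows are assumed to be dictionaries with the columns
from the usage output, and the ``vm-01`` data is made up for illustration.

.. code:: python

   from collections import defaultdict
   from datetime import datetime, timezone

   def append_to_event_log(log, rows, run_time):
       """Append one entry per diff row to the history of the affected item."""
       for row in rows:
           entry = {"at": run_time.isoformat(), "operation": row["operation"]}
           if row["operation"] == "modify":
               entry["change"] = (
                   row["attribute_changed"], row["old_value"], row["new_value"]
               )
           # str() keeps the key hashable whatever type object_values has
           log[str(row["object_values"])].append(entry)
       return log

   event_log = defaultdict(list)

   # Diff rows from two consecutive pipeline runs (illustrative data)
   run_1 = [{"operation": "create", "object_values": "vm-01"}]
   run_2 = [{"operation": "modify", "object_values": "vm-01",
             "attribute_changed": "cpus", "old_value": 2, "new_value": 4}]

   append_to_event_log(event_log, run_1, datetime(2023, 9, 1, tzinfo=timezone.utc))
   append_to_event_log(event_log, run_2, datetime(2023, 9, 2, tzinfo=timezone.utc))

   history = [entry["operation"] for entry in event_log["vm-01"]]
   print(history)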

Reconciliation
~~~~~~~~~~~~~~

To reconcile one data source against the source of truth. E.g., you have a CMDB holding information about virtual machines. As there are several ways to create those VMs, you use pandas_diff to replicate the state of the infrastructure into the CMDB.
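
A minimal sketch of that replication step follows. It assumes the diff was
computed with the CMDB contents as the first dataframe and the observed
infrastructure as the second, that the rows are dictionaries with the columns
from the usage output, and that ``object_json`` holds a dictionary. The
in-memory ``cmdb`` dict and the ``apply_diffs`` helper are stand-ins for
illustration, not part of pandas_diff.

.. code:: python

   def apply_diffs(cmdb, rows):
       """Apply diff rows to `cmdb`, a dict mapping item key -> attributes."""
       for row in rows:
           key = row["object_values"]
           if row["operation"] == "create":
               # The item exists in the infrastructure but not in the CMDB
               cmdb[key] = dict(row["object_json"])
           elif row["operation"] == "delete":
               # The item is in the CMDB but no longer exists
               cmdb.pop(key, None)
           elif row["operation"] == "modify":
               cmdb.setdefault(key, {})[row["attribute_changed"]] = row["new_value"]
       return cmdb

   # Illustrative CMDB state and diff rows
   cmdb = {
       "vm-01": {"name": "vm-01", "cpus": 2},
       "vm-02": {"name": "vm-02", "cpus": 1},
   }
   rows = [
       {"operation": "create", "object_values": "vm-03",
        "object_json": {"name": "vm-03", "cpus": 8}},
       {"operation": "delete", "object_values": "vm-02"},
       {"operation": "modify", "object_values": "vm-01",
        "attribute_changed": "cpus", "old_value": 2, "new_value": 4},
   ]
   apply_diffs(cmdb, rows)
   print(sorted(cmdb))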

Features
--------

-  Filtering of columns

Roadmap
-------

-  Support for running as a standalone app

Documentation
-------------

`Documentation <https://pandas-diff.readthedocs.io/en/latest/>`__

.. |CodeFactor| image:: https://www.codefactor.io/repository/github/jaimevalero/pandas_diff/badge
   :target: https://www.codefactor.io/repository/github/jaimevalero/pandas_diff
.. |Python 3| image:: https://pyup.io/repos/github/jaimevalero/pandas_diff/python-3-shield.svg
   :target: https://pyup.io/repos/github/jaimevalero/pandas_diff/




History
-------

0.7.18 (2021-12-05)
-------------------

\* Add codacy badge 

0.7.19 (2021-12-05)
-------------------

\* Feat filter column 

0.7.20 (2021-12-05)
-------------------

\* Feat filter column 

0.7.21 (2021-12-05)
-------------------

\* Add filter test

0.7.22 (2021-12-06)
-------------------

\* Add condition: keys must exist in both dataframes


1.1.0 (2021-12-06)
------------------

\* Add condition: keys must exist in both dataframes

1.2.0 (2021-12-06)
------------------

\* Improve doc 

1.3.0 (2021-12-06)
--------------------

\* Remove workflows 

1.4.0 (2021-12-06)
--------------------

\* Remove workflows 

1.4.0 (2023-09-01)
--------------------

\* Improve doc 

1.4.1 (2023-09-01)
--------------------

\* Improve doc

1.4.2 (2023-09-17)
--------------------

\* Bugfix version string

1.4.3 (2023-09-17)
--------------------

\* bugfix version tag 

1.4.4 (2023-09-17)
--------------------

\* bugfix version tag 

1.4.5 (2023-09-17)
--------------------

\* bugfix history string

1.4.6 (2023-09-17)
--------------------

\* bugfix history string 

1.4.7 (2023-09-17)
--------------------

\* bugfix release description 



