Metadata-Version: 2.1
Name: github2pandas
Version: 1.1.0
Summary: github2pandas supports the aggregation of project activities in a GitHub repository and makes them available in pandas dataframes
Home-page: https://github.com/TUBAF-IFI-DiPiT/github2pandas
Author: Maximilian Karl & Sebastian Zug
License: BSD 2
Download-URL: https://github.com/user/reponame/archive/v_01.tar.gz
Keywords: git,github,collaborative code development,git mining
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: pygit2 (==1.5.0)
Requires-Dist: pyyaml (==5.4.1)
Requires-Dist: requests (==2.25.1)
Requires-Dist: datetime (==4.3)
Requires-Dist: pygithub (==1.54.1)
Requires-Dist: argparse (==1.4.0)
Requires-Dist: pydriller (==1.15.2)
Requires-Dist: git2net (==1.4.10)
Requires-Dist: pysqlite3 (==0.4.6)
Requires-Dist: selenium (==3.141.0)
Requires-Dist: python-dotenv (==0.17.0)
Requires-Dist: pandas (==1.2.4)
Requires-Dist: jupyter (==1.0.0)
Requires-Dist: sphinx (==3.5.4)
Requires-Dist: m2r2 (==0.2.7)
Requires-Dist: human-id (==0.1.0.post3)
Requires-Dist: psutil (==5.8.0)
Requires-Dist: sphinx-rtd-theme (==0.5.2)
Requires-Dist: pypiwin32 (==223) ; sys_platform == "win32"

# Transform GitHub Activities to Pandas Dataframes

## General information

This package is being developed by the participating partners (TU Bergakademie Freiberg, OVGU Magdeburg and HU Berlin) as part of the [DiP-iT project](http://dip-it.ovgu.de/).

The package implements Python functions for
+ aggregating and preprocessing GitHub activities (commits, actions, issues, pull requests) and
+ generating project progress summaries according to different metrics (e.g. ratio of changed lines, ratio of aggregated Levenshtein distances).
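
To make the Levenshtein-based metric more concrete, here is a minimal sketch of the underlying edit distance in plain Python. It is an illustration only and not necessarily how `github2pandas` computes the value. The helper `distance_ratio` and its parameter names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row that is stored in memory as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (char_a != char_b)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]


def distance_ratio(author_distance: int, total_distance: int) -> float:
    """Hypothetical helper: share of the aggregated distance caused by one author."""
    return author_distance / total_distance if total_distance else 0.0
```

For example, `levenshtein("kitten", "sitting")` evaluates to `3`. Summing such distances per author and dividing by the project total gives a ratio in the spirit of the metric named above.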

`github2pandas` stores the collected information in a collection of pandas DataFrames below a user-defined root folder. The layout below that folder (file and folder names) is defined in member variables of the corresponding classes and can be overridden. The default configuration results in the following file structure.

```
|-- My_Github_Repository_0               <- Repository name
|   |- Repo.json                         <- JSON file containing user and repo name
|   |- Issues
|   |   |- pdIssuesComments.p
|   |   |- pdIssuesEvents.p
|   |   |- pdIssues.p
|   |   |- pdIssuesReactions.p
|   |- PullRequests
|   |   |- pdPullRequestsComments.p
|   |   |- pdPullRequestsEvents.p
|   |   |- pdPullRequests.p
|   |   |- pdPullRequestsReactions.p
|   |   |- pdPullRequestsReviews.p
|   |- Users.p
|   |- Versions
|   |   |- pdCommits.p
|   |   |- pdEdits.p
|   |   |- pdBranches.p
|   |   |- pVersions.db
|   |   |- repo                         <- Repository clone
|   |   |   |- ..
|   |- Workflows
|       |- pdWorkflows.p
|-- My_Github_Repository_1
...
```
The internal structure and relations of the data frames are included in the project's [wiki](https://github.com/TUBAF-IFI-DiPiT/github2pandas/wiki).
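
As a rough sketch of how the stored files might be read back, the snippet below builds a path from the default layout and loads it with `pandas.read_pickle`. It assumes that the `.p` files are ordinary pickled DataFrames. The root folder `data` and the helper names `frame_path` and `load_frame` are hypothetical. The package may offer its own loading functions, which are described in the documentation.

```python
from pathlib import Path


def frame_path(root, repo_name, *parts):
    """Build the path to a stored file below the root folder (default layout assumed)."""
    return Path(root, repo_name, *parts)


def load_frame(root, repo_name, *parts):
    """Load one stored DataFrame, assuming it is a regular pandas pickle."""
    import pandas as pd  # imported lazily so that path handling works without pandas

    return pd.read_pickle(frame_path(root, repo_name, *parts))
```

A call such as `load_frame("data", "My_Github_Repository_0", "Issues", "pdIssues.p")` would then return the issues DataFrame. Pickle files can execute code on loading, so only read files that you created yourself or otherwise trust.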

## Installation

`github2pandas` is available on [PyPI](https://pypi.org/project/github2pandas/). Use pip to install the package.

```
pip3 install github2pandas
```


## Application examples 

A GitHub token is required for authentication. The [GitHub documentation](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) describes how to generate one for your account. Adapt the user name and project name in the examples to explore any public or private repository that your account can access.
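
To keep the token out of notebooks and version control, one common approach is to read it from an environment variable. The sketch below uses only the standard library. The variable name `GITHUB_TOKEN` and the helper `read_token` are assumptions for illustration and are not defined by `github2pandas`.

```python
import os


def read_token(env=None, name="GITHUB_TOKEN"):
    """Return the token stored in the given environment mapping.

    `name` is a hypothetical variable name; use whatever your setup defines.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable to a GitHub token.")
    return token
```

If the token is kept in a local `.env` file, calling `load_dotenv()` from the `python-dotenv` package before `read_token()` should copy it into the process environment.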

| Aspect              | Example                                                                                                                        | Executable notebook | 
|:------------------- |:------------------------------------------------------------------------------------------------------------------------------ |:------------------- |
| Overview Example    | [Overview_Example.ipynb](https://github.com/TUBAF-IFI-DiPiT/github2pandas/blob/main/notebooks/Overview_Example.ipynb)          | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/TUBAF-IFI-DiPiT/github2pandas/HEAD?filepath=%2Fnotebooks)  |
| Commits & Edits     | [Version_Example.ipynb](https://github.com/TUBAF-IFI-DiPiT/github2pandas/blob/main/notebooks/Version_Example.ipynb)            |                     |
| Workflows / Actions | [Workflow_Example.ipynb](https://github.com/TUBAF-IFI-DiPiT/github2pandas/blob/main/notebooks/Workflow_Example.ipynb)          |                     |
| Issues              | [Issues_Example.ipynb](https://github.com/TUBAF-IFI-DiPiT/github2pandas/blob/main/notebooks/Issues_Example.ipynb)              |                     |
| Pull-Requests       | [Pull_Requests_Example.ipynb](https://github.com/TUBAF-IFI-DiPiT/github2pandas/blob/main/notebooks/Pull_Requests_Example.ipynb)|                     | 


The documentation of the module is available at [https://github2pandas.readthedocs.io/](https://github2pandas.readthedocs.io/).

# For Contributors

Naming conventions: https://namingconvention.org/python/

## Working with pipenv


| Process                                     | Command                                                 |
| ------------------------------------------- | ------------------------------------------------------- |
| Installation                                | `pipenv install --dev`                                  |
| Run specific script                         | `pipenv run python file.py`                             |
| Run all tests                               | `pipenv run python -m unittest`                         |
| Run all tests in a specific folder          | `pipenv run python -m unittest discover -s 'tests'`     |
| Run all tests with specific filename        | `pipenv run python -m unittest discover -p 'test_*.py'` |
| Start Jupyter server in virtual environment | `pipenv run jupyter notebook`                           | 
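
The test commands above pick up files that match the `test_*.py` pattern. As a minimal sketch of such a file, the module below could be saved as `tests/test_example.py`. The function `normalise_repo_name` is a hypothetical stand-in for the code under test and is not part of `github2pandas`.

```python
import unittest


def normalise_repo_name(name: str) -> str:
    """Hypothetical helper under test; replace it with an import from the package."""
    return name.strip().replace(" ", "_")


class TestNormaliseRepoName(unittest.TestCase):
    def test_spaces_become_underscores(self):
        self.assertEqual(
            normalise_repo_name("My Github Repository"), "My_Github_Repository"
        )

    def test_surrounding_whitespace_is_removed(self):
        self.assertEqual(normalise_repo_name("  repo  "), "repo")
```

No `unittest.main()` call is needed, because `python -m unittest` discovers and runs the test case itself.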

