Metadata-Version: 2.1
Name: pdf-orientation-corrector
Version: 0.1.2
Summary: A Python module to automatically detect and correct the orientation of pages in PDF documents.
Author: James Standbridge
Author-email: james.standbridge.git@gmail.com
Requires-Python: >=3.8,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: Pillow (>=10.1.0,<11.0.0)
Requires-Dist: PyPDF2 (>=3.0.1,<4.0.0)
Requires-Dist: pdf2image (>=1.16.3,<2.0.0)
Requires-Dist: pytesseract (>=0.3.10,<0.4.0)
Description-Content-Type: text/markdown

# PDF Orientation Corrector

## Overview

The PDF Orientation Corrector is a Python module that automatically detects and corrects the orientation of pages in PDF documents. It combines PyPDF2, pytesseract, and pdf2image with image preprocessing from PIL (Pillow) to analyze and adjust the orientation of each page in a PDF file.
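
As a rough illustration of how these libraries can fit together, the sketch below rasterizes each page with pdf2image, asks Tesseract's orientation and script detection (through `pytesseract.image_to_osd`) how far the page should be rotated, and applies that rotation with PyPDF2. It is not the module's actual implementation: the function names `parse_rotation` and `correct_orientation` are hypothetical, and the sketch assumes the `Rotate` value reported by Tesseract is the clockwise correction in degrees.

```python
import re


def parse_rotation(osd_text: str) -> int:
    """Extract the 'Rotate' value (in degrees) from Tesseract OSD output."""
    match = re.search(r"Rotate:\s*(\d+)", osd_text)
    return int(match.group(1)) % 360 if match else 0


def correct_orientation(input_path: str, output_path: str, dpi: int = 300) -> None:
    """Hypothetical helper: rotate each page so that its text reads upright."""
    # Third-party imports are local so parse_rotation works without them.
    import pytesseract
    from pdf2image import convert_from_path
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(input_path)
    writer = PdfWriter()
    images = convert_from_path(input_path, dpi=dpi)

    for page, image in zip(reader.pages, images):
        try:
            angle = parse_rotation(pytesseract.image_to_osd(image))
        except pytesseract.TesseractError:
            angle = 0  # OSD can fail on pages with too little text; leave as-is
        if angle:
            page.rotate(angle)  # PyPDF2 rotates clockwise, in multiples of 90
        writer.add_page(page)

    with open(output_path, "wb") as handle:
        writer.write(handle)
```

Keeping the OSD parsing in its own small function makes the text-handling part easy to check without a Tesseract installation.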

## Key Features

- **Automated Page Orientation Correction**: Detects and corrects the orientation of text on each page of a PDF.
- **Batch Processing**: Enhances performance when dealing with large documents by processing pages in batches.
- **Image Preprocessing**: Uses PIL to enhance the accuracy of OCR results.
- **Versatile Usage**: Can be run as a standalone script or imported into other Python scripts for PDF processing.
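
Batch processing can be sketched as follows, assuming pages are rasterized with pdf2image's `convert_from_path` and its `first_page` / `last_page` parameters. The helpers `batch_ranges` and `render_in_batches` are hypothetical names used for illustration, not functions confirmed to exist in this module.

```python
from typing import Iterator, Tuple


def batch_ranges(total_pages: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield 1-indexed, inclusive (first_page, last_page) pairs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def render_in_batches(pdf_path: str, total_pages: int,
                      batch_size: int = 20, dpi: int = 300):
    """Hypothetical helper: yield page images one batch at a time."""
    from pdf2image import convert_from_path  # only needed when rendering

    for first, last in batch_ranges(total_pages, batch_size):
        # Only this slice of pages is rasterized and held in memory at once.
        yield from convert_from_path(
            pdf_path, dpi=dpi, first_page=first, last_page=last
        )
```

For a 45-page document and a batch size of 20, `batch_ranges` produces the ranges 1-20, 21-40, and 41-45, so no more than 20 page images need to be in memory at a time.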

## Prerequisites

Before you begin, ensure you have met the following requirements:

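- Python 3.8 or later (below 4.0)
- Libraries: PyPDF2, pytesseract, pdf2image, Pillow (PIL), concurrent.futures (part of the standard library)
- System tools: the Tesseract OCR engine (used by pytesseract) and Poppler (used by pdf2image)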

## Installation

Clone the repository or download the source code. Install the required dependencies via pip:

```bash
pip install PyPDF2 pytesseract pdf2image Pillow
```

## Usage

The module can be used in two ways:

### As a Script

Run the script from the command line, providing the necessary arguments:

```bash
python pdf_orientation_corrector.py input.pdf output.pdf --batch_size 20 --dpi 300 --verbose
```
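
The command above implies a parser along the following lines. This is a minimal sketch built with the standard library's `argparse`: the positional argument names, the default values, and the help strings are assumptions, and only the flags `--batch_size`, `--dpi`, and `--verbose` are taken from the example command.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the example command."""
    parser = argparse.ArgumentParser(
        description="Detect and correct page orientation in a PDF."
    )
    parser.add_argument("input_pdf", help="path to the PDF to analyze")
    parser.add_argument("output_pdf", help="path for the corrected PDF")
    # Default values below are assumptions, not the script's documented defaults.
    parser.add_argument("--batch_size", type=int, default=10,
                        help="number of pages processed per batch")
    parser.add_argument("--dpi", type=int, default=300,
                        help="resolution used when rasterizing pages")
    parser.add_argument("--verbose", action="store_true",
                        help="print progress information")
    return parser


# Parse the same arguments as the example command line.
args = build_parser().parse_args(
    ["input.pdf", "output.pdf", "--batch_size", "20", "--dpi", "300", "--verbose"]
)
```

Numeric options are converted to `int` by the parser, so the processing code can use `args.batch_size` and `args.dpi` directly.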

### As a Library

Import the module in your Python script:

```python
import pdf_orientation_corrector
# Use the module functions as needed
```

## Author

James Standbridge

Email: james.standbridge.git@gmail.com

