Metadata-Version: 2.1
Name: UKLawCaseScraper
Version: 0.4.4
Summary: A package for scrape case law informations.
Home-page: https://github.com/mrjoneidi/UKLawCaseScraper
Author: Mohammadreza Joneidi Jafari
Author-email: m.r.joneidi.02@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Framework :: Scrapy
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE

# **UKLawCaseScraper**
------
UKLawCaseScraper is a module, allows you to retrieve judgments and decisions information from 2003 onwards, from Find case law in a friendly, Pythonic way without having to solve CAPTCHAs.

## Installation
-----
UKLawCaseScraper can be installed with pip. To install using pip, simply run:
```python
pip install UKLawCaseScraper
```
This package has 3 main modules:
* CaseInfoScraper
* CaseHeaderScraper
* FullTextScraper
* Save_to_json (For version <= 0.4.2)

In versions prior to 0.4.2, it is necessary to use the `save_to_json` function to manually save each output separately. However, starting from version 0.4.3, each output class automatically records its data to a JSON file.

### 1- CaseInfoScraper
---
This module contains 2 functions. These functions scrape Link, Name of the Case, judgment-listing__court, judgment-listing__neutralcitation and Datetime for a page or multipages.
Funcrions:
* scrape_judgments
* scrape_all_judgments_info
* get_response

### 2- CaseHeaderScraper
---
This module has 3 functions. First module give us just page's urls that use in second and third functions. (So you have to run **scrape_judgment_urls** first) then, second function gives us direct download case's PDFs and the last one, scrape case header info :)
Functions:
* scrape_judgment_urls
* judgment_Dlink
* scrape_header_info
* get_response

### 3- FullTextScraper
---
This module has 3 function. It gets output of scrape_header_info and scrape all text of each cases.
Functions:
* load_json
* scrape_full_text_and_headers
* OutputScraper
* get_response

### 4- get_response
----
The **get_response** function is an essential component of the all classes. Here's an explanation of why it is included and its role in the class:
* Network Requests Handling
* Retry Mechanism

The **get_response** function is crucial for managing HTTP requests within the all classes. It ensures that web pages are fetched reliably, handles network errors gracefully, and retries requests when necessary. This design improves the robustness, maintainability, and readability of the scraper code.

#### More information in github repository page in [UKLawCaseScraper](https://github.com/mrjoneidi/UKLawCaseScraper)
----

