Metadata-Version: 2.1
Name: papers-dl
Version: 0.0.23
Summary: A command line application for downloading scientific papers
Author-email: Ben Muthalaly <benmuthalaly@gmail.com>
Project-URL: Homepage, https://github.com/benmuth/papers-dl
Project-URL: Issues, https://github.com/benmuth/papers-dl/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiohttp==3.9.5
Requires-Dist: beautifulsoup4==4.12.3
Requires-Dist: bs4==0.0.2
Requires-Dist: certifi==2024.2.2
Requires-Dist: cffi==1.16.0
Requires-Dist: charset-normalizer==3.3.2
Requires-Dist: cryptography==42.0.5
Requires-Dist: easygui==0.98.3
Requires-Dist: feedparser==6.0.11
Requires-Dist: google==3.0.0
Requires-Dist: idna==3.6
Requires-Dist: loguru==0.7.2
Requires-Dist: pdf2doi==1.5.1
Requires-Dist: pdfminer.six==20221105
Requires-Dist: pdftitle==0.11
Requires-Dist: pycparser==2.21
Requires-Dist: PyMuPDF==1.23.26
Requires-Dist: PyMuPDFb==1.23.22
Requires-Dist: PyPDF2==2.0.0
Requires-Dist: pyperclip==1.8.2
Requires-Dist: requests==2.31.0
Requires-Dist: retrying==1.3.4
Requires-Dist: sgmllib3k==1.0.0
Requires-Dist: six==1.16.0
Requires-Dist: soupsieve==2.5
Requires-Dist: urllib3==2.2.1
Requires-Dist: w3lib==2.1.2

### Overview
`papers-dl` is a command-line application for downloading scientific papers.

### Usage
```shell
# parse DOI identifiers from a file:
papers-dl parse -m doi --path pages/my-paper.html

# parse ISBN identifiers from a file, output matches as CSV:
papers-dl parse -m isbn --path pages/my-paper.html -f csv

# fetch paper with given identifier from any known provider:
papers-dl fetch "10.1016/j.cub.2019.11.030"

# fetch paper from any known Sci-Hub URL with verbose logging on, and store in "papers" directory:
papers-dl -v fetch -p "scihub" -o "papers" "10.1107/s0907444905036693"

# fetch paper from specific Sci-Hub URL:
papers-dl fetch -p "sci-hub.ee" "10.1107/s0907444905036693"

# fetch paper from SciDB (Anna's Archive):
papers-dl fetch -p "scidb" "10.1107/s0907444905036693"
```
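
To give a rough idea of what DOI parsing involves, here is a minimal Python sketch. It is not the implementation used by `papers-dl`: the `find_dois` helper and the regular expression are assumptions made for illustration, and the pattern only approximates the usual DOI shape (`10.`, a registrant code, a slash, and a suffix).

```python
import re

# Assumed pattern (not taken from papers-dl): "10.", a 4-9 digit registrant
# code, "/", then a suffix that stops at whitespace, quotes, or angle brackets.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_dois(text):
    """Return the unique DOI-like strings in text, in order of first appearance."""
    seen = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing sentence punctuation. This is a heuristic: a real DOI
        # that ends in one of these characters would be truncated.
        doi = match.group(0).rstrip(".,;)")
        if doi not in seen:
            seen.append(doi)
    return seen


if __name__ == "__main__":
    sample = (
        'See <a href="https://doi.org/10.1016/j.cub.2019.11.030">the paper</a> '
        "and 10.1107/s0907444905036693."
    )
    print(find_dois(sample))
```

A pattern like this trades precision for simplicity, so expect occasional false positives on text that merely looks like a DOI.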

### About

`papers-dl` aims to be a comprehensive tool for gathering research papers from popular open libraries. Other tools exist for this (see "Other tools" below), but `papers-dl` tries to fill its own niche:

- comprehensive: other tools usually work with a single library, while `papers-dl` aims to support a collection of popular libraries.
- performant: `papers-dl` tries to reduce search and retrieval times by using concurrency where possible.
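
As an illustration of the concurrency idea in the list above, the following Python sketch queries several providers at once and keeps the first successful result. It is a sketch under stated assumptions, not the code `papers-dl` runs: `fetch_from` and `fetch_first` are hypothetical names, and the network request is simulated with `asyncio.sleep` so the example runs offline.

```python
import asyncio


async def fetch_from(provider, identifier, delay, succeeds):
    """Hypothetical stand-in for a provider request; sleeps instead of using the network."""
    await asyncio.sleep(delay)
    if not succeeds:
        raise LookupError(f"{provider} has no copy of {identifier}")
    return provider, f"<pdf bytes for {identifier}>"


async def fetch_first(identifier, providers):
    """Query every provider at once and return the first successful result, or None."""
    tasks = [
        asyncio.create_task(fetch_from(name, identifier, delay, ok))
        for name, delay, ok in providers
    ]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except LookupError:
                continue  # this provider failed; wait for the next one
        return None
    finally:
        for task in tasks:
            task.cancel()  # stop requests that are still in flight


if __name__ == "__main__":
    # Provider names come from the usage examples; delays and outcomes are made up.
    providers = [("scihub", 0.05, False), ("scidb", 0.02, True)]
    print(asyncio.run(fetch_first("10.1016/j.cub.2019.11.030", providers)))
```

With this approach the total wait is bounded by the fastest provider that succeeds, rather than by the sum of all attempts made one after another.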

That said, `papers-dl` may not be the best choice for your specific use case right now. For example, if you require features supported by a specific library, one of the more mature and specialized tools listed below may be a better option.

`papers-dl` was initially created to serve as an extractor for [ArchiveBox](https://archivebox.io), a powerful solution for self-hosted web archiving.

This project started as a fork of [scihub.py](https://github.com/zaytoun/scihub.py).

### Other tools

- [Scidownl](https://pypi.org/project/scidownl/)
- [arxiv-dl](https://pypi.org/project/arxiv-dl/)
- [Anna's Archive API](https://github.com/dheison0/annas-archive-api)

### Roadmap

The `papers-dl` CLI is not yet stable, so commands and options may change between releases.

Short-term roadmap:

**parsing**
- add support for parsing additional identifier types, such as PMID and ISSN

**fetching**
- add support for downloading formats other than PDF, such as HTML or EPUB

**searching**
- add a CLI command for searching libraries for papers and metadata

