Metadata-Version: 2.1
Name: fintool
Version: 1.0.2
Summary: All-in-one tools for financial analysis
Home-page: https://github.com/YoshioYamauchi/financialtoolkit
Author: Yoshio Yamauchi == SPARKLE
Author-email: sparkle.official.01@gmail.com
License: MIT
Description: Author : Yoshio Yamauchi 山内義生 == SPARKLE  
        Twitter : [@sparkle_twtt](https://twitter.com/sparkle_twtt)  
        Medium : [@sparkle_mdm](https://sparkle-mdm.medium.com/contents-list-f89c9700ba8)  
        Email : sparkle.official.01@gmail.com  
        Feel free to ask me anything about using scrapingtools.
        
        
        ## Description
          This module helps you scrape the web without revealing your identity. It
        also supports parallel processing, so you can send requests concurrently
        from several processes. All programs are written in Python 3 and require
        Ubuntu 18.04 or later.  
        
          Anonymity is provided by the Tor network, a free network of chained
        relays that anyone can use without registration. We assume that you use
        a Linux operating system; if you do, setting up Tor is not difficult.
        The steps are shown below.  
        
          Parallelism is provided by the Python module "multiprocessing". It differs
        from the similar module "threading" in that "multiprocessing" runs tasks in
        separate processes, which can execute on multiple cores at the same time,
        while in CPython the threads created with "threading" share one interpreter
        lock and so execute Python code one at a time.
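        
          To illustrate the point about separate processes, here is a minimal sketch
        that uses only the standard library. The helper names `worker_pid` and
        `collect_worker_pids` are made up for this example and are not part of
        scrapingtools; the sketch only shows that work handed to a `Pool` runs
        under process IDs different from the parent's.
        
        ```python
        import os
        from multiprocessing import Pool
        
        
        def worker_pid(_):
            # Runs inside whichever process executes the task; report its ID.
            return os.getpid()
        
        
        def collect_worker_pids(num_processes=2, num_tasks=8):
            # Hypothetical helper: spread num_tasks calls over num_processes workers
            # and return the set of process IDs that handled them.
            with Pool(processes=num_processes) as pool:
                return set(pool.map(worker_pid, range(num_tasks)))
        
        
        if __name__ == "__main__":
            # The guard keeps worker processes from re-running this block on import.
            print("parent:", os.getpid(), "workers:", collect_worker_pids())
        ```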
        
        
        ## Required System
        Ubuntu-18.04 or later  
        Python3
        
        ## Python Dependencies
        `stem`, `random-user-agent`, `numpy`, `requests_html`, `lxml`, `requests`, `bs4`
        
        ## Install Tor and Privoxy
        ### install tor and start
        ```
        $ sudo apt update
        $ sudo apt install tor
        $ sudo service tor start
        ```
        
        ### set the control password of tor
        ```
        $ sudo kill $(pidof tor)
        $ sudo bash -c 'echo "ControlPort 9051" >> /etc/tor/torrc'
        $ sudo bash -c 'echo HashedControlPassword $(tor --hash-password "password" | tail -n 1) >> /etc/tor/torrc'
        $ sudo service tor restart
        ```
        
        ### install privoxy
        ```
        $ sudo apt update
        $ sudo apt install privoxy
        $ sudo bash -c 'echo "forward-socks5t / 127.0.0.1:9050 ." >> /etc/privoxy/config'
        $ sudo service privoxy restart
        ```
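        
          Once both services are running, you may want to check that traffic really
        goes through Privoxy. The following is a rough sketch using only the standard
        library, not something scrapingtools provides. It assumes Privoxy listens on
        its default address `127.0.0.1:8118`; `build_privoxy_opener` and
        `fetch_through_proxy` are hypothetical names, and the check URL is an
        assumption that you can replace with any page that echoes your IP.
        
        ```python
        import urllib.request
        
        # Assumption: Privoxy listens on its default address.
        PRIVOXY = "http://127.0.0.1:8118"
        
        
        def build_privoxy_opener(proxy_url=PRIVOXY):
            # Send both HTTP and HTTPS requests to the local Privoxy instance,
            # which forwards them to Tor's SOCKS port (9050) as configured above.
            handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
            return urllib.request.build_opener(handler)
        
        
        def fetch_through_proxy(opener, url="https://check.torproject.org/api/ip", timeout=30):
            # Returns the raw response body as text; needs tor and privoxy running.
            with opener.open(url, timeout=timeout) as response:
                return response.read().decode("utf-8")
        ```
        
        Calling `fetch_through_proxy(build_privoxy_opener())` should then report an
        address other than your own if the chain is set up correctly.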
        
        
        ## Usage
        ### definition
        ```
        class AnonymizedConcurrentRequest():
           def __init__(self, tor_password, proxies, port=9051, max_rpm=45, ipchange_interval=1,
                         num_processes=1, replace=True, verbose=False):
        ```
        `tor_password` : the control password of the tor server  
        `proxies` : the proxy addresses (IP and port) used for HTTP and HTTPS requests  
        `port` : tor control port (9051 by default)  
        `max_rpm` : maximum number of requests sent per minute  
        `ipchange_interval` : interval for checking the IP address  
        `num_processes` : number of subprocesses == degree of parallelization  
        `replace` : if a destination file already exists, replace it with the new one  
        `verbose` : show progress  
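        
          To give a feel for what a cap like `max_rpm` means in practice, here is a
        small sketch of the arithmetic. This is not how scrapingtools implements
        the limit internally; `min_interval_seconds` and `seconds_to_wait` are
        hypothetical helpers that simply assume requests are spaced evenly.
        
        ```python
        import time
        
        
        def min_interval_seconds(max_rpm):
            # Smallest gap between two requests that keeps the rate at or below max_rpm.
            if max_rpm <= 0:
                raise ValueError("max_rpm must be positive")
            return 60.0 / max_rpm
        
        
        def seconds_to_wait(last_sent, max_rpm, now=None):
            # How long to sleep before the next request; 0.0 if enough time has passed.
            # last_sent and now are timestamps on the time.monotonic() clock.
            now = time.monotonic() if now is None else now
            return max(0.0, last_sent + min_interval_seconds(max_rpm) - now)
        ```
        
        With `max_rpm=60` the gap is one second, so a request sent 0.25 seconds
        after the previous one would have to wait another 0.75 seconds.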
        
        
        ### runtest.py
        Restart tor and privoxy
        ```
        $ sudo /etc/init.d/tor restart
        $ sudo /etc/init.d/privoxy restart
        ```
        
        
        Import the module first
        ```
        from scrapingtools import utils
        ```
        Then define the dict of proxies, the tor control port, and the tor password  
        ```
        PROXIES = {"https":"127.0.0.1:8118",
                   "http":"127.0.0.1:8118"} # default
        PORT = 9051 # default
        PROXY_PASSWORD="password" # default
        ```
        The tasks are given as a list of lists, each of which is a pair of the
        destination file for saving and the URL to request  
        ```
        TASKS = [["results_apple.txt","https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch"],
                 ["results_nvidia.txt","https://finance.yahoo.com/quote/NVDA?p=NVDA&.tsrc=fin-srch"]]
        ```
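        
          For a longer list of symbols, the same structure could be generated instead
        of typed by hand. A minimal sketch: `build_tasks` is a hypothetical helper,
        not part of scrapingtools, and the URL pattern and file naming are simply
        taken from the example above.
        
        ```python
        def build_tasks(tickers):
            # Hypothetical helper: one [destination_file, url] pair per ticker symbol.
            tasks = []
            for ticker in tickers:
                filename = "results_{}.txt".format(ticker.lower())
                url = "https://finance.yahoo.com/quote/{0}?p={0}&.tsrc=fin-srch".format(ticker)
                tasks.append([filename, url])
            return tasks
        
        
        TASKS = build_tasks(["AAPL", "NVDA"])
        ```
        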
        Then run the program, specifying the number of processes and the maximum number of requests sent per minute
        
        ```
        ACR = utils.AnonymizedConcurrentRequest(PROXY_PASSWORD, max_rpm=60, ipchange_interval=1,
                                                num_processes=1, replace=True, proxies=PROXIES,
                                                port=PORT, verbose=True)
        ACR.concurrent_request(TASKS)
        ```
        
Keywords: pandas,finance,pandas datareader
Platform: any
Requires-Python: >=3.6
Description-Content-Type: text/markdown
