Metadata-Version: 2.1
Name: get-free-proxy
Version: 0.1.2
Summary: A package to get free proxy
Home-page: https://github.com/zwzw911/get-free-proxy
Author: zwzw911
Author-email: zwzw911110@163.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: Microsoft :: Windows
Requires-Python: >=3.6
Description-Content-Type: text/markdown

#get_free_proxy
get_free_proxy is a tool to get free proxy from website
### install
`pip install get-free-proxy`
### usage
get_free_proxy depends on [gen_browser_header](https://github.com/zwzw911/gen-browser-header).   

create gen_browser_header setting  

`import gen_browser_header.setting.Setting as gbh_setting`   
`import gen_browser_header.self.SelfEnum as gbh_self_enum`     
`cur_gbh_setting = gbh_setting.GbhSetting()`    
`cur_gbh_setting.proxy_ip = ['10.11.12.13:8080']`    
`cur_gbh_setting.browser_type = {gbh_self_enum.BrowserType.All}`        
`cur_gbh_setting.firefox_ver = {'min': 74, 'max': 75}`    
`cur_gbh_setting.chrome_type = {gbh_self_enum.ChromeType.Stable}`    
`cur_gbh_setting.chrome_max_release_year = 1`    
`cur_gbh_setting.os_type = {gbh_self_enum.OsType.Win64}`

create get_free_proxy setting    

`import get_free_proxy.self.SelfEnum as gfp_self_enum`    
`import get_free_proxy.setting.Setting as gfp_setting`    
`cur_gfp_setting = gfp_setting.GfpSetting()`    
`cur_gfp_setting.proxy_type = {gfp_self_enum.ProxyType.HIGH_ANON}`    
`cur_gfp_setting.protocol = {gfp_self_enum.ProtocolType.HTTP,
                            gfp_self_enum.ProtocolType.HTTPS}`    
`cur_gfp_setting.country = {gfp_self_enum.Country.All}`    
`cur_gfp_setting.storage_type = {gfp_self_enum.StorageType.All}`    
`cur_gfp_setting.mysql = {
     'host': '127.0.0.1',
     'port': 3306,
     'user': 'root',
     'pwd': '1234',
     'db_name': 'db_proxy',
     'tbl_name': 'tbl_proxy',
     'charset': 'utf8mb4'}`    
`cur_gfp_setting.redis = {
    'host': '127.0.0.1',
    'port': 6379,
    'db': 0,  # 0~15
    'pwd': None
}`    
`cur_gfp_setting.result_file_path = os.path.join(tempfile.gettempdir(), 'result.json')`    
`cur_gfp_setting.valid_time_in_db = 86400`    
`cur_gfp_setting.site_max_page_no = 2`    
`cur_gfp_setting.site = {gfp_self_enum.SupportedWeb.Xici}`    

start to get free proxy    

`mainOp = MainOp(cur_gfp_setting, cur_gbh_setting) `       
 首先清空数据库(反正都要全部重新读取网页) 
`mainOp.del_proxy()`   
检测url是否需要使用代理    
`mainOp.check_if_site_need_proxy()`    
从可以直连的网站获得代理    
`tmp_proxies = mainOp.get_proxy_without_proxy()`    
验证代理是否可用    
`first_validate_proxies = mainOp.async_validate_proxies(tmp_proxies, 'https://www.baidu.com')`    
有可用的代理，则使用这些代理，来连接需要代理连接的代理完整；否则，使用固定的cur_gbh_setting.proxy_ip    
`if len(first_validate_proxies) > 0:`    
`    tmp_proxies = mainOp.get_proxy_with_proxy(proxies=first_validate_proxies)`    
`else:`    
`    tmp_proxies = mainOp.get_proxy_with_proxy(proxies=None)`    
获得结果，再次进行验证，是否可以使用   
`second_validate_proxies = mainOp.async_validate_proxies(tmp_proxies, 'https://www.baidu.com')    `    
合并所有可用的代理    
`all_validate_proxies = first_validate_proxies+second_validate_proxies`    
`print('最终有效代理%s' % all_validate_proxies)`    
保存代理    
`mainOp.save_proxy(proxies=all_validate_proxies)`



### gfp_setting  
1. **proxy_type**    
type: ***set, element is enum=>gfp_self_enum.ProxyType***    
default: ***{gfp_self_enum.ProxyType.HIGH_ANON}***   
description:  proxy has 3 type: transparent/anonymous/high_anonymous, TRANS/ANON/HIGH_ANON. There is an addition one All, 
if set, will be replace by  TRANS+ANON+HIGH_ANON   
2. **protocol**    
type: ***set, element is enum=>gfp_self_enum.ProtocolType***
default: {gfp_self_enum.ProtocolType.HTTP, gfp_self_enum.ProtocolType.HTTPS}    
description:  proxy protocol has 4 type: HTTP, HTTPS, SOCK4, SOCK5. There is an addition one All, is set, will be replace by
HTTP+HTTPS+SOCK4+SOCK5.    
3. **country**    
type: ***set, element is enum=>gfp_self_enum.Country***    
default: {gfp_self_enum.Country.China}    
description: some web provide proxy form all countries, the parameter will filter the country. There is an addition one 
All, is set, will ignore country.    
4. **storage_type**
type: ***set, element is enum=>gfp_self_enum.StorageType***    
default: {gfp_self_enum.StorageType.All}    
description: current support 3 storage type: Mysql/Redis/File. There is an addition one All, is set, will be replace by
Mysql+Redis+File    
5. **mysql**   
type: ***dict***    
default:   
{  
'host': '127.0.0.1',  
     'port': 3306,  
     'user': 'root',  
     'pwd': '1234',  
     'db_name': 'db_proxy',  
     'tbl_name': 'tbl_proxy',  
     'charset': 'utf8mb4'     
     }    
description: if **storage_type** include Mysql, set this parameter to connect mysql.    
5. **redis**   
type: ***dict***
default:    
{    
    'host': '127.0.0.1',    
    'port': 6379,    
    'db': 0,  # 0~15    
    'pwd': None    
}     
description: if **storage_type** include Redis, set this parameter to connect redis.
6. **result_file_path**    
type: ***string***    
default: ***os.path.join(tempfile.gettempdir(), 'result.json')***     
description: if **storage_type** include File, all get proxy will be save into the file defined by **result_file_path**    
7. **valid_time_in_db**    
type: ***int***  
default: ***86400***   
unit: ***second***    
description: since all got proxy are free, not sure when these proxy will expire. So set this parameter, it a proxy expire this duration, will not delete/not_choose    
8. **site_max_page_no**    
type: ***int***    
default: ***2***   
description: min:2, max:9. The web site which provide free proxy, the content are pagationed. This parameter determine how many page 
will be handled to extract proxy.    
9. **site**    
type: ***set, enum=>gfp_self_enum.SupportedWeb***   
default: ***{gfp_self_enum.SupportedWeb.Xici}***      
description: this parameter determine which site will be used to extract proxy. currently only support 4 site:
https://www.xicidaili.com, https://www.kuaidaili.com/free, https://hidemy.name/en/proxy-list/#list, https://proxy-list.org/english.
and if All is set, will be replace by above 4 site.   


### change history
0.1.0  use requests-html replace requests    
0.1.1  match gen_browser 0.1.3: when gen_header, add host base on parameter url
0.1.2  add support for zh-cn in setup.py by add encoding="utf-8"    

