Metadata-Version: 1.1
Name: drafttopic
Version: 0.2.0
Summary: A library for automatic detection of topics of new drafts on Wikipedia based on WikiProjects.
Home-page: https://github.com/wikimedia/drafttopic
Author: Aaron Halfaker, Sumit Asthana
Author-email: ahalfaker@wikimedia.org, asthana.sumit23@gmail.com
License: MIT
Description: # Draft topic
        
        Predicts topics for new drafts on English Wikipedia, based on WikiProjects.
        
        ## Setting up
        
        Make sure you have a working Python 3 environment.
        Install the requirements using:
        
        ```
        pip install -r requirements.txt
        ```
        
        Install the library using:
        
        ```
        python setup.py install
        ```
        
        ## Generating machine-readable WikiProjects data
        
        Run the following utility from the repository root to generate machine-readable WikiProjects data:
        
        ```
        ./utility fetch_wikiprojects --output <output_file_name.json>
        ```
        
        ## Generating mid-level category to WikiProjects mapping
        
        Run the following utility from the repository root to generate a mapping from each mid-level topic category to the list of WikiProjects it contains:
        
        ```
        ./utility trim_wikiprojects --wikiprojects wp --output outmid
        ```
        
        ## Labeling a list of page-ids with the WikiProjects and mid-level categories each page belongs to
        
        Run the following utility from the repository root to label a list of page-ids with the WikiProjects and mid-level categories each page belongs to:
        
        ```
        ./utility fetch_page_wikiprojects --api-host=https://en.wikipedia.org/ --input=wikiproject_page_ids.json --output=enwiki.labeled_wikiprojects.json --mid_level_wp=outmid.json --verbose
        ```
        
        In the command above, the input to the script should be a JSON file containing
        a list of observations, each with a **page\_id: \<page-id\>** mapping.
        Also pass the mid-level WikiProjects JSON file so that the script can map
        WikiProjects to mid-level categories. The script augments the given list with
        these fields and writes the result to the new file specified by
        **"output"**.
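        
        The input format described above can be sketched as follows. This is a minimal
        illustration, not part of the library: the helper names `build_observations`
        and `write_observations` are hypothetical, the page IDs are made up, and the
        file is assumed to hold a single JSON array. If the utility expects one JSON
        object per line instead, write each observation on its own line.
        
        ```python
        import json
        
        
        def build_observations(page_ids):
            """Wrap each page ID in an observation dict with a page_id field."""
            return [{"page_id": int(page_id)} for page_id in page_ids]
        
        
        def write_observations(page_ids, path):
            """Write the observations to `path` as one JSON array (assumed layout)."""
            with open(path, "w") as f:
                json.dump(build_observations(page_ids), f)
        
        
        if __name__ == "__main__":
            # Hypothetical page IDs, for illustration only.
            write_observations([12, 25, 39], "wikiproject_page_ids.json")
        ```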
        
        ## Generating predictions for a set of page-ids on Wikipedia
        
        To generate topic predictions for a set of revision-ids, download the relevant model and use revscoring's [score](https://github.com/wikimedia/revscoring/blob/master/revscoring/utilities/score.py) utility
        to generate predictions. The revision-ids must be in a file in the format that utility expects. For a good prediction, use the revision ID of each page's most recent revision.
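        
        One way to look up the most recent revision ID of each page is the MediaWiki
        Action API (`action=query` with `prop=revisions`). The sketch below is an
        illustration under stated assumptions, not part of drafttopic or revscoring:
        the function names are hypothetical, the API is assumed to live at
        `/w/api.php` on the given host, and the response is assumed to use
        `formatversion=2`. It builds the request URL and parses an already-decoded
        response, leaving the HTTP call itself to the reader.
        
        ```python
        from urllib.parse import urlencode
        
        
        def latest_revision_query(api_host, page_ids):
            """Build a URL asking for the current revision ID of each page."""
            params = {
                "action": "query",
                "prop": "revisions",
                "rvprop": "ids",
                "pageids": "|".join(str(p) for p in page_ids),
                "format": "json",
                "formatversion": "2",
            }
            return api_host.rstrip("/") + "/w/api.php?" + urlencode(params)
        
        
        def latest_rev_ids(response):
            """Map page ID to latest revision ID from a parsed API response."""
            rev_ids = {}
            for page in response.get("query", {}).get("pages", []):
                revisions = page.get("revisions")
                if revisions:  # missing pages carry no revisions
                    rev_ids[page["pageid"]] = revisions[0]["revid"]
            return rev_ids
        ```
        
        The resulting revision IDs can then be written to a file in whatever layout
        the score utility expects.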
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Utilities
Classifier: Topic :: Scientific/Engineering
