Metadata-Version: 2.0
Name: fathom-web
Version: 3.0
Summary: Commandline tools for training Fathom rulesets
Home-page: https://mozilla.github.io/fathom/
Author: Erik Rose
Author-email: erik@mozilla.com
License: MPL
Keywords: machine learning,ml,semantic extraction
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)
Classifier: Programming Language :: Python :: 3
Requires-Dist: click (<8.0,>=7.0)
Requires-Dist: tensorboardX (>=1.6,<2.0)
Requires-Dist: torch (>=1.0,<2.0)

==================================
The Fathom Trainer and Other Tools
==================================

This is the commandline trainer for `Fathom <https://mozilla.github.io/fathom/>`_, a supervised-learning system for recognizing parts of web pages. It also includes other commandline tools for ruleset development, such as ``fathom-unzip`` and ``fathom-pick``. See the `trainer documentation <http://mozilla.github.io/fathom/training.html#running-the-trainer>`_ for usage.

Version History
===============

3.0
  * Move to Fathom repo.
  * Add ``fathom-unzip`` and ``fathom-pick``.
  * Switch to the Adam optimizer, which needs far less tuning: its learning-rate decay no longer has to be set manually.
  * Tolerate pages for which no candidate nodes were collected.
  * Add a 95% confidence interval for per-page training accuracy.
  * Add validation-guided early stopping.
  * Revise per-page accuracy calculation and display.
  * Shuffle training samples before training.
  * Add false-positive and false-negative numbers to per-tag metrics.

3.0a1
  * First release, intended for use with Fathom itself, version 3.0 or later.


