Metadata-Version: 2.1
Name: tokenize-rt
Version: 3.0.1
Summary: A wrapper around the stdlib `tokenize` which roundtrips.
Home-page: https://github.com/asottile/tokenize-rt
Author: Anthony Sottile
Author-email: asottile@umich.edu
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
Description-Content-Type: text/markdown

[![Build Status](https://travis-ci.org/asottile/tokenize-rt.svg?branch=master)](https://travis-ci.org/asottile/tokenize-rt)
[![Coverage Status](https://coveralls.io/repos/github/asottile/tokenize-rt/badge.svg?branch=master)](https://coveralls.io/github/asottile/tokenize-rt?branch=master)

tokenize-rt
===========

The stdlib `tokenize` module does not properly roundtrip.  This wrapper
around the stdlib provides two additional tokens `ESCAPED_NL` and
`UNIMPORTANT_WS`, and a `Token` data type.  Use `src_to_tokens` and
`tokens_to_src` to roundtrip.

This library is useful if you're writing a refactoring tool based on
Python's tokenization.

## Installation

`pip install tokenize-rt`

## Usage

### datastructures

#### `tokenize_rt.Offset(line=None, utf8_byte_offset=None)`

A token offset, useful as a key when cross referencing the `ast` and the
tokenized source.

#### `tokenize_rt.Token(name, src, line=None, utf8_byte_offset=None)`

Construct a token.

- `name`: one of the token names listed in `token.tok_name` or
  `ESCAPED_NL` or `UNIMPORTANT_WS`
- `src`: token's source as text
- `line`: the line number that this token appears on.  This will be `None` for
  `ESCAPED_NL` and `UNIMPORTANT_WS` tokens.
- `utf8_byte_offset`: the UTF-8 byte offset at which this token starts within
  its line.  This will be `None` for `ESCAPED_NL` and `UNIMPORTANT_WS` tokens.

#### `tokenize_rt.Token.offset`

Retrieves an `Offset` for this token.

### converting to and from `Token` representations

#### `tokenize_rt.src_to_tokens(text) -> List[Token]`

#### `tokenize_rt.tokens_to_src(Sequence[Token]) -> text`

### additional tokens added by `tokenize-rt`

#### `tokenize_rt.ESCAPED_NL`

#### `tokenize_rt.UNIMPORTANT_WS`

### helpers

#### `tokenize_rt.NON_CODING_TOKENS`

A `frozenset` containing tokens which may appear between others while not
affecting control flow or code:
- `COMMENT`
- `ESCAPED_NL`
- `NL`
- `UNIMPORTANT_WS`

#### `tokenize_rt.parse_string_literal(text) -> Tuple[str, str]`

Parse a string literal into its prefix and string content.

```pycon
>>> parse_string_literal('f"foo"')
('f', '"foo"')
```

#### `tokenize_rt.reversed_enumerate(Sequence[Token]) -> Iterator[Tuple[int, Token]]`

Yields `(index, token)` pairs in reverse order.  Useful for rewriting source.

## Differences from `tokenize`

- `tokenize-rt` adds `ESCAPED_NL` for a backslash-escaped newline "token"
- `tokenize-rt` adds `UNIMPORTANT_WS` for whitespace (discarded in `tokenize`)
- `tokenize-rt` normalizes string prefixes, even if they are not parsed -- for
  instance, this means you'll see `Token('STRING', "f'foo'", ...)` even in
  Python 2.

## Sample usage

- https://github.com/asottile/add-trailing-comma
- https://github.com/asottile/future-fstrings
- https://github.com/asottile/pyupgrade
- https://github.com/asottile/yesqa


