Update word tokenizer
b8669b258fa5 (Unpublished)

Description

Update word tokenizer

Summary:
Switch to the Treebank tokenizer to split a sentence into words.

PunktWordTokenizer is now considered an internal part of punkt,
not a public API class.

See https://github.com/nltk/nltk/commit/0b91a7160717faa2fe93d42f7c6bba735f6dd48a

Test Plan: Tested with Ada Palmer's The Will to Battle.

Reviewers: dereckson

Reviewed By: dereckson

Differential Revision: https://devcentral.nasqueron.org/D1353
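The change described above might look roughly like the following. This is a minimal sketch, not the code from the commit: the helper name `tokenize_words` and the regex fallback are illustrative assumptions. Only `TreebankWordTokenizer` and its `tokenize` method come from NLTK itself.

```python
import re

try:
    # TreebankWordTokenizer is a public NLTK class; it is regex-based
    # and does not require any downloaded model data.
    from nltk.tokenize import TreebankWordTokenizer
    _tokenizer = TreebankWordTokenizer()
except ImportError:
    # Assumption for this sketch only: keep it runnable without NLTK.
    _tokenizer = None


def tokenize_words(sentence):
    """Split one sentence into word tokens.

    Hypothetical helper, not taken from the repository.
    """
    if _tokenizer is not None:
        # Replaces the former PunktWordTokenizer().tokenize(sentence) call.
        return _tokenizer.tokenize(sentence)
    # Crude fallback: runs of word characters, or single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)


if __name__ == "__main__":
    print(tokenize_words("The will to battle is strong."))
```

Because the Treebank tokenizer expects one sentence at a time, a caller would typically split the text into sentences first and then pass each sentence to the helper.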

Details

Provenance
dereckson authored on Feb 27 2018, 21:16
dereckson pushed on Feb 28 2018, 21:34
Reviewer
dereckson
Differential Revision
D1353: Update word tokenizer
Parents
rEPNa5e9c45b524e: Shift to Python 3
Branches
Unknown
Tags
Unknown