text

AutoTextCleaner

A quick, convenient transformer that streamlines basic, everyday text-cleaning tasks.

Lexicon

The pybear lexicon of words in the English language.

NGramMerger

Join specified adjacent words into an N-gram unit, to be handled as a single "word".
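
To make the idea concrete, here is a minimal plain-Python sketch of merging adjacent words into one token. It does not use or reproduce the NGramMerger API; the function name merge_ngrams, its parameters, and the "_" separator are assumptions chosen for illustration only.

```python
def merge_ngrams(tokens, ngram, sep="_"):
    """Replace each run of adjacent tokens equal to `ngram` with one joined token.

    Hypothetical helper for illustration; not part of pybear.
    """
    ngram = tuple(ngram)
    n = len(ngram)
    if n == 0:
        # Nothing to merge; return a copy of the input unchanged.
        return list(tokens)

    merged = []
    i = 0
    while i < len(tokens):
        if tuple(tokens[i:i + n]) == ngram:
            # Matched the full n-gram: emit it as a single "word".
            merged.append(sep.join(ngram))
            i += n
        else:
            merged.append(tokens[i])
            i += 1
    return merged


line = ["I", "LOVE", "NEW", "YORK", "CITY"]
print(merge_ngrams(line, ("NEW", "YORK")))
# ['I', 'LOVE', 'NEW_YORK', 'CITY']
```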

StopRemover

Remove stop words from text data.

TextJoiner

Join each row of a (possibly ragged) 2D array-like of strings into a single string, using the sep string(s) as the separator.
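
As a rough illustration of row-wise joining, the sketch below collapses each row of a ragged list of lists into one string. It is plain Python rather than the TextJoiner API; join_rows and its single-string sep argument are hypothetical names used only for this example.

```python
def join_rows(rows, sep=" "):
    """Join the strings in each row with `sep`, returning one string per row.

    Hypothetical helper for illustration; not part of pybear.
    """
    return [sep.join(row) for row in rows]


ragged = [["GOOD", "MORNING"], ["HELLO"], []]
print(join_rows(ragged))
# ['GOOD MORNING', 'HELLO', '']
```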

TextJustifier

Justify text as closely as possible to the number of characters per line given by the user.

TextLookup

Handle words in a 2D array-like body of text that are not in the pybear Lexicon.

TextLookupRealTime

Handle words in a 2D array-like body of text that are not in the pybear Lexicon.

TextNormalizer

Normalize all text in a dataset to upper-case, lower-case, or leave unchanged.

TextPadder

Map ragged text data to a shaped array, using a fill value to fill out any ragged area.
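
The padding step can be sketched in plain Python as follows. This is only an illustration of the concept, not the TextPadder API; pad_rows and its fill argument are assumed names.

```python
def pad_rows(rows, fill=""):
    """Pad every row with `fill` so that all rows share the longest row's length.

    Hypothetical helper for illustration; not part of pybear.
    """
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


ragged = [["A", "B", "C"], ["D"], []]
print(pad_rows(ragged, fill="-"))
# [['A', 'B', 'C'], ['D', '-', '-'], ['-', '-', '-']]
```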

TextRemover

Remove full strings (not substrings) from text data.
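
The distinction between full strings and substrings is easy to show with a small plain-Python sketch. It is not the TextRemover API; remove_strings and its targets argument are hypothetical.

```python
def remove_strings(rows, targets):
    """Drop every string that exactly equals one of `targets`.

    Strings that merely contain a target as a substring are kept.
    Hypothetical helper for illustration; not part of pybear.
    """
    targets = set(targets)
    return [[word for word in row if word not in targets] for row in rows]


rows = [["a", "cat", "sat"], ["category", "a"]]
print(remove_strings(rows, {"a", "cat"}))
# [['sat'], ['category']]
```

Note that "category" survives even though it contains "cat", because only whole-string matches are removed.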

TextReplacer

Search 1D vectors or (possibly ragged) 2D arrays of text data for character substrings and make one-to-one replacements.

TextSplitter

Split a dataset of strings on the given separator(s).
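
Splitting on more than one separator can be sketched with the standard library's re.split. This is a conceptual example, not the TextSplitter API; split_lines and its seps argument are assumed names, and seps is assumed to be non-empty.

```python
import re


def split_lines(lines, seps=(" ",)):
    """Split each string on any of the literal separators in `seps`.

    Hypothetical helper for illustration; not part of pybear.
    Assumes `seps` contains at least one non-empty string.
    """
    # Escape each separator so it is matched literally, and try longer
    # separators first so that ", " wins over "," when both are given.
    ordered = sorted(seps, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)
    return [re.split(pattern, line) for line in lines]


print(split_lines(["a,b c", "d"], seps=(",", " ")))
# [['a', 'b', 'c'], ['d']]
```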

TextStatistics

Generate summary information about the strings and characters in text data.
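
The kind of summary meant here can be approximated with collections.Counter. The sketch below is plain Python and makes no claim about what TextStatistics actually reports; text_stats and the keys of its result are hypothetical.

```python
from collections import Counter


def text_stats(lines):
    """Return simple word and character counts for a list of strings.

    Hypothetical helper for illustration; not part of pybear.
    """
    # Split on whitespace to get the individual words.
    words = [word for line in lines for word in line.split()]
    return {
        "n_words": len(words),
        "n_unique_words": len(set(words)),
        "word_frequency": Counter(words),
        "char_frequency": Counter("".join(words)),
    }


stats = text_stats(["to be or", "not to be"])
print(stats["n_words"], stats["n_unique_words"])
# 6 4
```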

TextStripper

Strip leading and trailing spaces from 1D or 2D text data.