text
A quick, convenient transformer that streamlines basic everyday data cleaning needs.

The pybear lexicon of words in the English language.

Join specified adjacent words into an N-gram unit, to be handled as a single "word".

Remove stop words from text data.

Join a (possibly ragged) 2D array-like of strings across rows with the sep character string(s).

Justify text as closely as possible to the number of characters per line given by the user.

Handle words in a 2D array-like body of text that are not in the pybear Lexicon.

Handle words in a 2D array-like body of text that are not in the pybear Lexicon.

Normalize all text in a dataset to upper-case, lower-case, or leave unchanged.

Map ragged text data to a shaped array, using a fill value to fill out any ragged area.

Remove full strings (not substrings) from text data.

Search 1D vectors or (possibly ragged) 2D arrays of text data for character substrings and make one-to-one replacements.

Split a dataset of strings on the given separator(s).

Generate summary information about the strings and characters in text data.

Strip leading and trailing spaces from 1D or 2D text data.
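
To make three of the operations summarized above concrete (stripping, padding ragged data with a fill value, and joining across rows), here is a minimal plain-Python sketch. It does not use or reproduce the pybear API: the function names `strip_rows`, `pad_rows`, and `join_rows`, and their parameters, are hypothetical and chosen only for illustration.

```python
# Illustrative only: plain-Python stand-ins, NOT the pybear API.
# All function names and parameters here are hypothetical.

def strip_rows(rows):
    """Strip leading and trailing whitespace from every string in 2D data."""
    return [[word.strip() for word in row] for row in rows]


def pad_rows(rows, fill=""):
    """Pad ragged rows to a common width using a fill value."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


def join_rows(rows, sep=" "):
    """Join each row of strings into a single string using sep."""
    return [sep.join(row) for row in rows]


ragged = [["  Hello", "world  "], ["pybear "]]

clean = strip_rows(ragged)      # whitespace removed, still ragged
padded = pad_rows(clean, "")    # every row now has the same length
joined = join_rows(clean, " ")  # one string per row
```

Each helper returns a new list and leaves its input unchanged, so the steps can be chained in any order.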