Erase punctuation from text and documents
syntax
description
newDocuments = erasePunctuation(documents) erases punctuation and symbols from documents. If a word is empty after removing punctuation and symbol characters, then the function removes it. For tokenized document input, the function erases punctuation only from tokens with type 'punctuation' and 'other'. For example, the function does not erase punctuation and symbol characters from URLs and email addresses.
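The behavior described above can be sketched as follows. This is a minimal illustration, not an official example: it assumes Text Analytics Toolbox is installed, and the sample text and variable names (str, newStr) are made up for the sketch.

```matlab
% Sketch only: requires Text Analytics Toolbox; the sample text is hypothetical.

% String input: punctuation and symbol characters are removed from the text.
str = "it's one and/or two.";
newStr = erasePunctuation(str);
disp(newStr)

% Tokenized input: a token that becomes empty after erasing (here the
% trailing ".") is removed from the document.
documents = tokenizedDocument("An example of a short sentence.");
newDocuments = erasePunctuation(documents);
disp(doclength(documents))     % token count before erasing
disp(doclength(newDocuments))  % token count after erasing
```

Comparing the two token counts is a quick way to confirm that empty tokens were dropped rather than left behind as empty strings.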
newDocuments = erasePunctuation(documents,'TokenTypes',types) erases punctuation and symbols from only the specified token types.
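As a hedged sketch of the 'TokenTypes' syntax, the code below first inspects the token types that tokenizedDocument detected and then erases punctuation from tokens of type 'punctuation' only. The sample sentence and the example.com address are placeholders, and the sketch assumes Text Analytics Toolbox with tokenDetails available.

```matlab
% Sketch only: requires Text Analytics Toolbox; the sample text is hypothetical.
documents = tokenizedDocument("Send feedback to https://www.example.com/contact, please!");

% List each token next to its detected type to see what would be affected.
details = tokenDetails(documents);
disp(details(:, {'Token','Type'}))

% Restrict erasing to tokens of type 'punctuation'; tokens of other types
% are left unchanged.
punctOnly = erasePunctuation(documents,'TokenTypes','punctuation');
disp(joinWords(punctOnly))
```

Checking the Type column before choosing types avoids guessing how a particular token was classified.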
examples
input arguments
output arguments
more about
tips
For string input, erasePunctuation removes punctuation characters from URLs and HTML tags. This behavior can prevent eraseURLs and other functions that operate on URLs and HTML markup from working as expected. If you want to use these functions to preprocess your text, then use them before using erasePunctuation.
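The ordering advice in this tip might look like the following in practice. This is an illustrative sketch that assumes Text Analytics Toolbox; the sample string and the example.com address are placeholders.

```matlab
% Sketch only: requires Text Analytics Toolbox; the sample text is hypothetical.
str = "See https://www.example.com/page for details!";

% Recommended order: erase the URL while it is still intact, then erase
% the remaining punctuation.
cleaned = erasePunctuation(eraseURLs(str));
disp(cleaned)

% Reversed order: erasePunctuation strips ":", "/" and "." from the URL
% first, so eraseURLs may no longer recognize it as a URL.
reversed = eraseURLs(erasePunctuation(str));
disp(reversed)
```

Printing both results side by side makes it easy to see whether URL fragments survive when the calls are made in the wrong order.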
version history
Introduced in R2017b
see also
eraseURLs | tokenizedDocument