removeStopWords
Remove stop words from documents
Since R2018b
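Syntax
newDocuments = removeStopWords(documents)
newDocuments = removeStopWords(documents,'IgnoreCase',false)
Description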
Words like "a", "and", "to", and "the" (known as stop words) can add noise to data. Use this function to remove stop words before analysis.
The function supports English, Japanese, German, and Korean text. To learn how to use removeStopWords for other languages, see Language Considerations.
newDocuments = removeStopWords(documents) removes the stop words from the tokenizedDocument array documents. By default, the function uses the stop word list given by the stopWords function according to the language details of documents, and is case insensitive.
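The default behavior can be sketched as follows. This is a minimal illustration, assuming Text Analytics Toolbox is installed; the sample sentences are invented for the example.

```matlab
% Build a tokenizedDocument array from two short, made-up sentences.
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);

% Remove stop words using the default list from stopWords.
% Matching is case insensitive by default.
newDocuments = removeStopWords(documents);

% Display the remaining tokens of each document.
disp(newDocuments)
```

Words such as "an", "of", and "a" should no longer appear in newDocuments.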
To remove a custom list of words, use the removeWords function.
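For a custom word list, a sketch along these lines may help. It assumes the removeWords function from Text Analytics Toolbox; the sentence and the customWords list are hypothetical.

```matlab
% A made-up document and a hypothetical custom list of words to remove.
documents = tokenizedDocument("the quick brown fox jumps over the lazy dog");
customWords = ["quick" "lazy"];

% removeWords deletes exactly the listed words; it does not consult
% the built-in stop word list.
newDocuments = removeWords(documents, customWords);

disp(newDocuments)
```

Because removeWords only removes the words it is given, "the" and "over" remain in this example. Call removeStopWords as well if the built-in stop words should also go.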
newDocuments = removeStopWords(documents,'IgnoreCase',false) removes stop words with case matching the stop word list given by the stopWords function.
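The effect of the 'IgnoreCase' option can be compared side by side. This is a sketch with an invented sentence; it assumes the default English stop word list is lowercase, so capitalized tokens are expected to survive case-sensitive removal.

```matlab
% A made-up sentence mixing lowercase and capitalized stop words.
documents = tokenizedDocument("The cat and THE dog");

% Default: case-insensitive matching against the stop word list.
defaultResult = removeStopWords(documents);

% Case-sensitive matching: only tokens whose case matches the list
% are removed.
caseSensitiveResult = removeStopWords(documents,'IgnoreCase',false);

disp(defaultResult)
disp(caseSensitiveResult)
```

Under that assumption, "The" and "THE" are kept in caseSensitiveResult while "and" is removed from both results.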
Tip
Use removeStopWords before using the normalizeWords function, as removeStopWords uses information that is removed by this function.
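The ordering described in this tip can be sketched as a short preprocessing pipeline. The sentence is invented, and normalizeWords is called with its default settings.

```matlab
% A made-up sentence to preprocess.
documents = tokenizedDocument("the runners were running quickly to the finish");

% Step 1: remove stop words while the token details are still intact.
documents = removeStopWords(documents);

% Step 2: normalize the remaining words (default normalization style).
documents = normalizeWords(documents);

disp(documents)
```

Reversing the two steps would mean removeStopWords runs on documents whose details have already been stripped by normalizeWords.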
Examples
Input Arguments
Output Arguments
More About
Algorithms
Version History
Introduced in R2018b
See Also
tokenizedDocument | normalizeWords