Celebal Technologies interview question

Stemming, lemmatization, and tokenization

Interview answer

Anonymous

Sep 14, 2022

Tokenization: the process of breaking a text into smaller units called tokens. Words, numbers, and punctuation marks can all be tokens. Stemming: the process of reducing a word to its stem, typically by stripping suffixes with heuristic rules. The result is not always a real word (for example, "studies" may become "studi"). Lemmatization: the process of reducing a word to its dictionary form, the lemma (for example, "studies" becomes "study"). Unlike stemming, it relies on a vocabulary and morphological analysis, so it usually takes longer to compute.
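To make the three terms concrete, here is a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the function names `tokenize`, `stem`, and `lemmatize`, the `SUFFIXES` list, and the tiny `LEMMAS` lookup table are hypothetical, and the stemmer is a toy suffix stripper, so its output differs from real stemmers such as Porter's. In practice a library such as NLTK or spaCy would be used instead.

```python
import re

def tokenize(text):
    # Each run of word characters (words, numbers) is one token;
    # each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", text)

# Toy suffix list, checked in order (assumption, not a real stemming algorithm).
SUFFIXES = ("ing", "ies", "ed", "es", "s")

def stem(word):
    # Strip the first matching suffix, keeping at least 3 characters.
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Tiny hand-written dictionary standing in for a real vocabulary (assumption).
LEMMAS = {
    "studies": "study",
    "studying": "study",
    "mice": "mouse",
    "better": "good",
}

def lemmatize(word):
    # Look the word up in the dictionary; fall back to the word itself.
    word = word.lower()
    return LEMMAS.get(word, word)

print(tokenize("Cats run, 2 dogs sleep."))
print(stem("studies"), lemmatize("studies"))
print(stem("mice"), lemmatize("mice"))
```

The stemmer only applies string rules, so it turns "studies" into the non-word "stud" and leaves "mice" unchanged, while the lemmatizer needs a dictionary lookup to map them to "study" and "mouse". That lookup is the extra work that makes lemmatization slower in real systems.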