Z-score as a parameter for text classification

Author: Viatcheslav Yatsko

Journal: Грани познания @grani-vspu

Section: Information technology

Published in issue: 6 (77), 2021.

Free access

The article examines the specific features of text document classification and the operation of a classifier program. It describes an algorithm for computing the Z-score as a classification parameter. The author tested its efficiency on the authorship attribution task using full texts, aligned texts, and aligned texts combined with the deviation from the Zipfian distribution. The tests showed that using the Z-score as a standalone parameter gives a negative result. By contrast, using the score on the basis of the deviation from the Zipfian distribution proved highly efficient, which made it possible to develop a variant of the previously proposed Y-method of text classification.

Automatic text document classification, authorship attribution, methods and algorithms, classifier program, Z-score, Zipfian distribution, Y-method, efficiency test

Short URL: https://sciup.org/148322543

IDR: 148322543

Research article