On the Importance of Data Representation for the Success of Text Classification
Text mining approaches use natural language processing to automatically extract patterns from texts. Tasks such as topic labeling, news classification, question answering, named entity recognition, and sentiment analysis usually require elaborate and effective document representations. In this context, w...
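| Main authors: | Cuello, Carolina Y.; Jofre Caradonna, Vanessa; Garciarena Ucelay, María José; Cagnina, Leticia |
|---|---|
| Format: | Conference object |
| Language: | English |
| Published: | 2022 |
| Subjects: | Computer Science; text mining; text representations; text classification; movie reviews; sentiment analysis; polarity analysis |
| Online access: | http://sedici.unlp.edu.ar/handle/10915/149536 |
| Contributed by: | |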

| id | I19-R120-10915-149536 |
|---|---|
| record_format | dspace |
| institution | Universidad Nacional de La Plata |
| institution_str | I-19 |
| repository_str | R-120 |
| collection | SEDICI (UNLP) |
| language | English |
| topic | Computer Science; text mining; text representations; text classification; movie reviews; sentiment analysis; polarity analysis |
| description | Text mining approaches use natural language processing to automatically extract patterns from texts. Tasks such as topic labeling, news classification, question answering, named entity recognition, and sentiment analysis usually require elaborate and effective document representations. In this context, word representation models in general, and vector-based word representations in particular, have gained increasing interest as a way to alleviate some of the limitations of Bag of Words. In this article, we analyze the use of several vector-based word representations, in addition to the classical ones, in a polarity analysis task on movie reviews. Experimental results show that more elaborate representations are more effective than Bag of Words. In particular, the Concise Semantic Analysis representation appears to be very robust and effective, since it yields good results regardless of the classifier used. The dimensionality of the representations and the time needed to obtain them are also reported, and they show that classifiers are efficient when Concise Semantic Analysis is used. |
| format | Conference object |
| author | Cuello, Carolina Y.; Jofre Caradonna, Vanessa; Garciarena Ucelay, María José; Cagnina, Leticia |
| title | On the Importance of Data Representation for the Success of Text Classification |
| publishDate | 2022 |
| url | http://sedici.unlp.edu.ar/handle/10915/149536 |
| bdutipo_str | Repositories |
| _version_ | 1764820462952513537 |