Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Software . 2021
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Software . 2021
License: CC BY
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Software . 2021
License: CC BY
Data sources: ZENODO
versions View all 3 versions
addClaim

This Research product is the result of merged Research products in OpenAIRE.

You have already added 0 works in your ORCID record related to the merged Research product.
addClaim

This Research product is the result of merged Research products in OpenAIRE.

You have already added 0 works in your ORCID record related to the merged Research product.

The TSL machine: parser, lemma analysis, sentiment analysis and autocoding for Telegram chats

Authors: Giovanni Spitale; Federico Germani; Nikola Biller - Andorno;

The TSL machine: parser, lemma analysis, sentiment analysis and autocoding for Telegram chats

Abstract

The purpose of this tool is performing NLP analysis on Telegram chats. Telegram chats can be exported as .json files from the official client, Telegram Desktop (v. 2.9.2.0). The files are parsed, the content is used to populate a message dataframe, which is then anonymized. The software calculates and displays the following information: user count (n of users, new users per day, removed users per day); message count (n and relative frequency of messages, messages per day); autocoded messages (anonymized message dataframe with code weights assigned to each message based on a customizable set of regex rules); prevalence of codes (n and relative frequency); prevalence of lemmas (n and relative frequency); prevalence of lemmas segmented by autocode (n and relative frequency); mean sentiment per day; mean sentiment segmented by autocode. The software outputs: messages_df_anon.csv - an anonymized file containing the progressive id of the message, the date, the univocal pseudonym of the sender, and the text; usercount_df.csv - user count dataframe; user_activity_df.csv - user activity dataframe; messagecount_df.csv - message count dataframe; messages_df_anon_coded.csv - an anonymized file containing the progressive id of the message, the date, the univocal pseudonym of the sender, the text, the codes, and the sentiment; autocode_freq_df.csv - general prevalence of codes; lemma_df.csv - lemma frequency; autocode_freq_df_[rule_name].csv - lemma frequency in coded messages, one file per rule; daily_sentiment_df.csv - daily sentiment; sentiment_by_code_df.csv - sentiment segmented by code; messages_anon.txt - anonymized text file generated from the message data frame, for easy import in other software for text mining or qualitative analysis; messages_anon_MaxQDA.txt - anonymized text file generated from the message data frame, formatted specifically for MaxQDA (to track speakers and codes). Dependencies: pandas (1.2.1) json random os re tqdm (4.62.2) datetime (4.3) matplotlib (3.4.3) Spacy (3.1.2) + it_core_news_md wordcloud (1.8.1) Counter feel_it (1.0.3) torch (1.9.0) numpy (1.21.1) transformers (4.3.3) This code is optimized for Italian, however: Lemma analysis is based on spaCy, which provides several other models for other languages ( https://spacy.io/models ) so it can easily be adapted. Sentiment analysis is performed using FEEL-IT: Emotion and Sentiment Classification for the Italian Language (Kudos to Federico Bianchi <f.bianchi@unibocconi.it>; Debora Nozza <debora.nozza@unibocconi.it>; and Dirk Hovy <dirk.hovy@unibocconi.it>). Their work is specific for Italian. To perform sentiment analysis in other languages one could consider nltk.sentiment The code is structured in a Jupyter-lab notebook, heavily commented for future reference. The software comes with a toy dataset comprised of Wikiquotes copy-pasted in a chat created by the research group. Have fun exploring it.

{"references": ["Bianchi F, Nozza D, Hovy D. FEEL-IT: Emotion and Sentiment Classification for the Italian Language. In: Proceedings of the 11th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Association for Computational Linguistics; 2021. https://github.com/MilaNLProc/feel-it"]}

Keywords

telegram, covid-19, vaccine, social listening, freedom, natural language processing, NLP, ethics, green pass

  • BIP!
    Impact byBIP!
    citations
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
    OpenAIRE UsageCounts
    Usage byUsageCounts
    visibility views 413
    download downloads 20
  • citations
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
    Powered byBIP!BIP!
  • 413
    views
    20
    downloads
    Powered byOpenAIRE UsageCounts
Powered by OpenAIRE graph
Found an issue? Give us feedback
visibility
download
citations
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
views
OpenAIRE UsageCountsViews provided by UsageCounts
downloads
OpenAIRE UsageCountsDownloads provided by UsageCounts
0
Average
Average
Average
413
20
Related to Research communities
Digital Humanities and Cultural Heritage
moresidebar

Do the share buttons not appear? Please make sure, any blocking addon is disabled, and then reload the page.