Research software, 2021, English, Zenodo
Authors: Spitale, Giovanni; Germani, Federico; Biller-Andorno, Nikola
DOI: 10.5281/zenodo.5533907

The purpose of this tool is to perform NLP analysis on Telegram chats. Telegram chats can be exported as .json files from the official client, Telegram Desktop (v. 2.9.2.0). The files are parsed and their content is used to populate a message dataframe, which is then anonymized.

The software calculates and displays the following information:
- user count (n of users, new users per day, removed users per day);
- message count (n and relative frequency of messages, messages per day);
- autocoded messages (anonymized message dataframe with code weights assigned to each message based on a customizable set of regex rules);
- prevalence of codes (n and relative frequency);
- prevalence of lemmas (n and relative frequency);
- prevalence of lemmas segmented by autocode (n and relative frequency);
- mean sentiment per day;
- mean sentiment segmented by autocode.
The software outputs:
- messages_df_anon.csv - an anonymized file containing the sequential id of the message, the date, the unique pseudonym of the sender, and the text;
- usercount_df.csv - user count dataframe;
- user_activity_df.csv - user activity dataframe;
- messagecount_df.csv - message count dataframe;
- messages_df_anon_coded.csv - an anonymized file containing the sequential id of the message, the date, the unique pseudonym of the sender, the text, the codes, and the sentiment;
- autocode_freq_df.csv - general prevalence of codes;
- lemma_df.csv - lemma frequency;
- autocode_freq_df_[rule_name].csv - lemma frequency in coded messages, one file per rule;
- daily_sentiment_df.csv - daily sentiment;
- sentiment_by_code_df.csv - sentiment segmented by code;
- messages_anon.txt - anonymized text file generated from the message dataframe, for easy import into other software for text mining or qualitative analysis;
- messages_anon_MaxQDA.txt - anonymized text file generated from the message dataframe, formatted specifically for MaxQDA (to track speakers and codes).

Dependencies:
- pandas (1.2.1)
- json
- random
- os
- re
- tqdm (4.62.2)
- datetime (4.3)
- matplotlib (3.4.3)
- spaCy (3.1.2) + it_core_news_md
- wordcloud (1.8.1)
- Counter
- feel_it (1.0.3)
- torch (1.9.0)
- numpy (1.21.1)
- transformers (4.3.3)

This code is optimized for Italian. However:
- Lemma analysis is based on spaCy, which provides models for several other languages ( https://spacy.io/models ), so it can easily be adapted.
- Sentiment analysis is performed using FEEL-IT: Emotion and Sentiment Classification for the Italian Language (kudos to Federico Bianchi <f.bianchi@unibocconi.it>; Debora Nozza <debora.nozza@unibocconi.it>; and Dirk Hovy <dirk.hovy@unibocconi.it>). Their work is specific to Italian. To perform sentiment analysis in other languages, one could consider nltk.sentiment.

The code is structured as a JupyterLab notebook, heavily commented for future reference.
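As a rough illustration of how lemma prevalence (n and relative frequency) could be computed, and of where a different spaCy model might be swapped in, consider the sketch below. It is not the notebook's code: `frequency_table` and `simple_lemmatize` are hypothetical names, and the default lemmatizer is a naive stand-in so that the example runs without any model installed. The commented lines show one way a spaCy pipeline could be plugged in instead.

```python
from collections import Counter


def simple_lemmatize(text):
    """Naive stand-in lemmatizer (assumption): lowercase, strip punctuation."""
    tokens = (tok.strip(".,;:!?").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


# With spaCy installed, a real lemmatizer could be passed in instead, e.g.:
#   import spacy
#   nlp = spacy.load("it_core_news_md")  # or a model for another language
#   def spacy_lemmatize(text):
#       return [t.lemma_.lower() for t in nlp(text) if t.is_alpha and not t.is_stop]


def frequency_table(messages, lemmatize=simple_lemmatize):
    """Return (lemma, n, relative frequency) tuples, most frequent first."""
    counts = Counter()
    for text in messages:
        counts.update(lemmatize(text))
    total = sum(counts.values())
    if total == 0:
        return []
    return [(lemma, n, n / total) for lemma, n in counts.most_common()]
```

Passing the lemmatizer as a parameter keeps the counting logic language-agnostic: adapting to another language would then only mean supplying a different callable.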
The software comes with a toy dataset composed of quotes from Wikiquote copy-pasted into a chat created by the research group. Have fun exploring it.

References:
Bianchi F, Nozza D, Hovy D. FEEL-IT: Emotion and Sentiment Classification for the Italian Language. In: Proceedings of the 11th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Association for Computational Linguistics; 2021. https://github.com/MilaNLProc/feel-it