Advanced search in Research products
Research products
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.

  • Digital Humanities and Cultural Heritage
  • 2012-2021
  • Research data
  • Dataset
  • VTechWorks

  • Authors: Hamilton, Leah; Robb, Esther; Fitzpatrick, April; Goel, Akshay; +1 Authors

    Files included:
      • FacebookBreachSummarization_FinalReport.pdf - The final report covering the approaches, results, and lessons learned in this project. Includes the resulting summaries as appendices, the results from earlier natural language approaches as tables and figures, and a breakdown of the file structures of the included code archives.
      • FacebookBreachSummarization_FinalReport.docx - An editable version of the final report. May not display correctly on all systems.
      • FacebookBreachSummarization_FinalPresentation.pdf - The slides used to give the final presentation for CS 4984/5984. An overview of important results and takeaways, intended to be presented in 10 minutes.
      • FacebookBreachSummarization_FinalPresentation.pptx - An editable version of the final presentation. May not display correctly on all systems.
      • FacebookBreachSummarization_CodeAndResults.zip - The majority of the code used to obtain the results covered in the report and presentation, along with input data files and results, where the results are saved to a file instead of written to the console. If looking to replicate results, start here. See the report for details on the file structure.
      • FacebookBreachSummarization_FastAbsRLFork.zip - A clone of the original fast_abs_rl repository (https://github.com/ChenRocks/fast_abs_rl), modified to work with our corpus. One method of generating abstractive summaries that was tested in this project.
      • FacebookBreachSummarization_FilesForAbsSum.zip - A directory containing cleaned article text from the Facebook corpus, saved as .story files for use with the abstractive summarizers.
      • FacebookBreachSummarization_PySparkExtPkgs.zip - A zipped archive of the packages needed to run the PySpark code included in this project that are not part of base Python. Provided for ease of running the included PySpark code on Python 2.7.
      • FacebookBreackSummarization_PretrainedPGN.zip - A copy of the pretrained pointer-generator network model, which is also available through a Google Drive link on https://github.com/abisee/pointer-generator. Works with TensorFlow 1.2.1. Can be used to rapidly generate single-document abstractive summaries without having to train a new model.

    Summarization is often a time-consuming task for humans. Automated methods can summarize a larger volume of source material in a shorter amount of time, but creating a good summary with these methods remains challenging. This submission contains all work related to a semester-long project in CS 4984/5984 to generate the best possible summary of a collection of 10,829 web pages about the Facebook-Cambridge Analytica data breach, with some early prototyping done on 500 web pages about the 2017 Solar Eclipse. A final report, a final presentation, and several archives of code, input data, and results are included.

    The work implements basic natural language processing techniques such as word frequency, lemmatization, and part-of-speech tagging, working up to a complete human-readable summary at the end of the course. Extractive, abstractive, and combination methods were used to generate the final summaries, all of which are included, and the results are compared. The summary subjectively evaluated as best was a purely extractive summary built by concatenating summaries of document categories. This method produced a coherent and thorough summary, but it involved manual tuning to select categories and still had some redundancy. All attempted methods are described, and the less successful summaries are also included. The report presents a framework for summarizing complex document collections with multiple relevant topics. The summary itself identifies the information that was most covered about the Facebook-Cambridge Analytica data breach and is a reasonable introduction to the topic.

    Global Event and Trend Archive Research (GETAR) project. NSF: IIS-1619028
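The abstract above describes the best-rated summary as a purely extractive one built by concatenating summaries of document categories. The following is only a minimal sketch of that general idea, not the project's code: the frequency-based sentence scoring, the function names, and the category dictionary are all assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(text):
    """Lowercased alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarize(text, k=1):
    """Keep the k sentences with the highest average word frequency (hypothetical scoring)."""
    sents = split_sentences(text)
    freq = Counter(words(text))

    def score(i):
        toks = words(sents[i])
        return sum(freq[w] for w in toks) / max(1, len(toks))

    ranked = sorted(range(len(sents)), key=lambda i: -score(i))
    keep = sorted(ranked[:k])  # restore original sentence order
    return " ".join(sents[i] for i in keep)


def summarize_by_category(categories, k=1):
    """Summarize each category's documents separately, then concatenate the results."""
    return " ".join(summarize(" ".join(docs), k) for docs in categories.values())
```

In this sketch the categories are supplied by hand, which mirrors the manual tuning the abstract mentions; how the project actually chose and scored categories is described in its final report.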

    VTechWorks
    Dataset . 2018
    License: CC BY NC
    Data sources: VTechWorks
    This Research product is the result of merged Research products in OpenAIRE.
    Citations: 0; popularity: Average; influence: Average; impulse: Average
  • Authors: Cheng, Junjie

    This is the Neural Network Document Summarization project for the Multimedia, Hypertext, and Information Access (CS 4624) course at Virginia Tech in the Spring 2018 semester. The purpose of this project is to generate a summary of a long document through deep learning, so that the outcome can replace part of a human’s work. The implementation consists of four phases: data preprocessing, model building, training, and testing.

    In the data preprocessing phase, the data set is split into a training set, a validation set, and a testing set in a 3:1:1 ratio. In each set, articles and abstracts are tokenized and then transformed into indexed documents. In the model building phase, a sequence-to-sequence model is implemented in PyTorch to transform articles into abstracts. The model contains an encoder and a decoder, both implemented as recurrent neural networks with long short-term memory (LSTM) units. Additionally, an MLP attention model is applied to the decoder to improve its performance. In the training phase, the model iteratively loads data from the training set and learns from it. In each iteration, the model generates a summary of the input document and compares it with the real summary. The difference between them is represented by a loss value, which the model uses to perform backpropagation and improve its accuracy. In the testing phase, the validation and testing sets are used to test the accuracy of the trained model. The model generates a summary of each input document, and the similarity between the generated summary and the human-produced summary is evaluated with PyRouge.

    All of the above tasks were completed during the semester. With the trained model, users can generate CNN/Daily Mail style highlights for an input article.

    DocSummarization.zip contains all source code of the project, the training data set, and a trained model. The DocSummarizationReport, in PDF and DOC versions, describes the project design and all technical details. It also includes a user manual and a developer manual. The DocSummarizationPresentation, in PDF and PPT versions, contains the slides used for the final presentation of the project. It shows the general design and phases of the project.
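The preprocessing phase described above (a 3:1:1 split, then tokenizing and indexing documents) can be illustrated with a short sketch. This is not the project's code: the whitespace tokenizer, the `<pad>` and `<unk>` special tokens, the fixed shuffle seed, and all function names are assumptions chosen for the example.

```python
import random


def split_3_1_1(items, seed=0):
    """Shuffle and split items into train/validation/test sets in a 3:1:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 3 // 5
    n_val = len(items) // 5
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(token_lists, specials=("<pad>", "<unk>")):
    """Assign each distinct token an integer index, reserving the special tokens first."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_indices(tokens, vocab):
    """Turn a token list into an indexed document; unseen tokens map to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in tokens]
```

Building the vocabulary from the training set only, as the `<unk>` fallback suggests, is a common design choice; the abstract does not say which sets the project used for its vocabulary.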

    VTechWorks
    Dataset . 2018
    Data sources: VTechWorks
    This Research product is the result of merged Research products in OpenAIRE.
    Citations: 0; popularity: Average; influence: Average; impulse: Average
  • Authors: Bartolome, Abigail; Bock, Matthew; Vinayagam, Radha Krishnan; Krishnamurthy, Rahul

    The IDEAL (Integrated Digital Event Archiving and Library) and Global Event and Trend Archive Research (GETAR) projects have collected over 1.5 billion tweets and webpages from social media and the World Wide Web, and indexed them so that they can be easily retrieved and analyzed. This gives researchers an extensive library of documents that reflect the interests and sentiments of the public in reaction to an event. By applying topic analysis to collections of tweets, researchers can learn the topics of most interest or concern to the general public. Adding a layer of sentiment analysis to those topics illustrates how the public felt about the topics that were found.

    The Sentiment and Topic Analysis team has designed a system that joins topic analysis and sentiment analysis for researchers who are interested in learning more about public reaction to global events. The tool runs topic analysis on a collection of tweets, and the user can select a topic of interest and assess the sentiments with regard to that topic (i.e., positive vs. negative). This submission covers the background, requirements, design, and implementation of our contributions to this project. Furthermore, we include data, scripts, source code, a user manual, and a developer manual to assist in any future work.

    Files included:
      • Sentiment_and_Topic_Analysis.pdf: final report
      • Sentiment_and_Topic_Analysis_Presentation.pdf: PDF of final presentation
      • Sentiment_and_Topic_Analysis_Presentation.pptx: PowerPoint of final presentation
      • Sentiment_and_Topic_Analysis_LaTeX.zip: zip file of LaTeX files used to write final report
      • Sentiment_and_Topic_Analysis_Work_Files.tar: source code and data files, contents listed below:
          • AT0412.txt: tested dataset
          • Word2VecSentimentAnalysis.scala: sentiment classifier
          • topics_1 through topics_4: result files
          • Topic analysis/MainWindow.scala: UI code
          • Topic analysis:/pom.xml: used for UI
          • Sentiment analysis/final_sentiment_analysis.py: reads tweet collection for sentiment analysis
          • Sentiment analysis/first3.sh: passes tweet into syntaxnet
          • Sentiment analysis/parse_tree.py: renders parse tree to represent file returned by syntaxnet
          • Sentiment analysis/reverse_polarity_file: polarity reversal and negation words from General Inquirer

    NSF: IIS-1619028. NSF: IIS-1319578

    VTechWorks
    Dataset . 2017
    License: CC BY
    Data sources: VTechWorks
    This Research product is the result of merged Research products in OpenAIRE.
    Citations: 0; popularity: Average; influence: Average; impulse: Average
  • Authors: Bialousz, Kenneth; Kokal, Kevin; Orleans-Pobee, Kwamina; Wakeley, Christopher

    CS4984 is a newly offered class at Virginia Tech with a unit-based, project/problem-based learning curriculum. This class style is based on NSF-funded work on curriculum for the field of digital libraries and related topics, and in this class it is used to guide a student-based investigation of computational linguistics. The specific problem this report addresses is how to automatically generate a short summary of a corpus of articles about earthquakes. Such a summary should represent the texts as well as possible and include all relevant information about the earthquakes. For our analysis, we operated on two corpora: one about a 5.8 magnitude earthquake in Virginia in August 2011, and another about a 6.6 magnitude earthquake in April 2013 in Lushan, China. Techniques used to analyze the articles include clustering, lemmatization, frequency analysis of n-grams, and regular expression searches. Included are PDF and Word versions of the final report, a ZIP file of source code, and a PDF and PowerPoint of the final presentation. NSF DUE-1141209 and IIS-1319578
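Two of the techniques named above, frequency analysis of n-grams and regular expression searches, can be sketched in a few lines. This is an illustrative example only, not the project's code: the tokenizer, the magnitude pattern, and the function names are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercased words and numbers (decimals such as 5.8 are kept whole)."""
    return re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower())


def ngrams(tokens, n):
    """All contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(docs, n=2):
    """Count n-grams across a collection of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(tokenize(doc), n))
    return counts


# Hypothetical pattern: a number next to the word "magnitude", in either order.
MAGNITUDE = re.compile(
    r"(\d+(?:\.\d+)?)[\s-]*magnitude|magnitude[\s-]*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def find_magnitudes(text):
    """Return the magnitude values mentioned in the text, as strings."""
    return [before or after for before, after in MAGNITUDE.findall(text)]
```

Frequent n-grams point to phrases that recur across articles, while a targeted pattern like the one above pulls out a specific fact (here, the magnitude) that a summary template could reuse.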

    VTechWorks
    Dataset . 2014
    License: CC 0
    Data sources: VTechWorks
    This Research product is the result of merged Research products in OpenAIRE.
    Citations: 0; popularity: Average; influence: Average; impulse: Average
  • Authors: Roble, Benjamin; Cheng, Justin; Sbitani, Marwan

    The goal of this project was to associate existing data in the Virtual Town Square database from the New River Valley area with topical metadata. We took a database of approximately 360,000 tweets and 15,000 RSS news stories collected in the last two years and associated each RSS story and tweet with topics. The open-source natural language processing library Mallet was used to perform topic modeling on the data using Latent Dirichlet Allocation, and the results were then used to create a Solr instance of searchable tweets and news stories. Topic modeling was not done around specific events; instead, the entire tweet data set (and the entire RSS data set) was used as the corpus. The tweet data was analyzed separately from the RSS stories, so the generated topics are specific to each data set. This report details the methodology used in our work in the Methodology section and contains a detailed Developer’s Guide and User’s Guide so that others may continue our work. The client was satisfied with the outcome of this project because, even though tweets have generally been considered too short to be run through a topic modeling process, we generated topics for each tweet that appear to be relevant and accurate.

    This collection contains the source code, programs, documentation, and example data used in the project. Please review the "Final Report and Technical Manual" for a comprehensive overview of the project. The open-source library Mallet was used and is referenced here: McCallum, Andrew Kachites. "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu. 2002.

    Virginia Tech Center for Human-Computer Interaction Associate Director: Dr. Kavanaugh, kavan@vt.edu; Virginia Tech PhD Student: Ji Wang (InfoVis Lab), wji@cs.vt.edu; Virginia Tech PhD Student: Mohamed Magdy, mmagdy@vt.edu; Virginia Tech Professor: Dr. Edward Fox, fox@vt.edu

    VTechWorks
    Dataset . 2014
    License: CC BY
    Data sources: VTechWorks
    This Research product is the result of merged Research products in OpenAIRE.
    Citations: 0; popularity: Average; influence: Average; impulse: Average
Powered by OpenAIRE graph
Advanced search in Research products
Research products
arrow_drop_down
Searching FieldsTerms
Any field
arrow_drop_down
includes
arrow_drop_down
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
  • image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
    Authors: Hamilton, Leah; Robb, Esther; Fitzpatrick, April; Goel, Akshay; +1 Authors

    Files Included:
    FacebookBreachSummarization_FinalReport.pdf - The final report covering the approaches, results, and lessons learned in this project. Includes the resulting summaries as appendices, the results from earlier natural language approaches as tables and figures, and a breakdown of the file structures of the included code archives.
    FacebookBreachSummarization_FinalReport.docx - An editable version of the final report. May not display correctly on all systems.
    FacebookBreachSummarization_FinalPresentation.pdf - The slides used to give the final presentation for CS 4984/5984. An overview of important results and takeaways, intended to be presented in 10 minutes.
    FacebookBreachSummarization_FinalPresentation.pptx - An editable version of the final presentation. May not display correctly on all systems.
    FacebookBreachSummarization_CodeAndResults.zip - The majority of the code used to obtain the results covered in the report and presentation, along with input data files and any results that are saved to a file rather than written to the console. If looking to replicate results, start here. See the report for details on the file structure.
    FacebookBreachSummarization_FastAbsRLFork.zip - A clone of the original fast_abs_rl repository (https://github.com/ChenRocks/fast_abs_rl), modified to work with our corpus. One of the methods of generating abstractive summaries tested in this project.
    FacebookBreachSummarization_FilesForAbsSum.zip - A directory containing cleaned article text from the Facebook corpus, saved as .story files for use with the abstractive summarizers.
    FacebookBreachSummarization_PySparkExtPkgs.zip - A zipped archive of the packages, not part of base Python, that are needed to run the PySpark code included in this project. Provided for ease of running the included PySpark code on Python 2.7.
    FacebookBreackSummarization_PretrainedPGN.zip - A copy of the pretrained pointer-generator network model, which is also available through a Google Drive link on https://github.com/abisee/pointer-generator. Works with TensorFlow 1.2.1. Can be used to rapidly generate single-document abstractive summaries without having to train a new model.

    Summarization is often a time-consuming task for humans. Automated methods can summarize a larger volume of source material in less time, but creating a good summary with these methods remains challenging. This submission contains all work related to a semester-long project in CS 4984/5984 to generate the best possible summary of a collection of 10,829 web pages about the Facebook-Cambridge Analytica data breach, with some early prototyping done on 500 web pages about the 2017 Solar Eclipse. A final report, a final presentation, and several archives of code, input data, and results are included. The work starts with basic natural language processing techniques such as word frequency, lemmatization, and part-of-speech tagging, and works up to a complete human-readable summary at the end of the course. Extractive, abstractive, and combination methods were used to generate the final summaries, all of which are included and compared. The summary subjectively evaluated as best was a purely extractive summary built by concatenating summaries of document categories. This summary was coherent and thorough, but it required manual tuning to select categories and still contained some redundancy. All attempted methods are described, and the less successful summaries are also included. The report presents a framework for summarizing complex document collections with multiple relevant topics. The summary itself identifies the information that was most covered about the Facebook-Cambridge Analytica data breach and is a reasonable introduction to the topic.

    Global Event and Trend Archive Research (GETAR) project, NSF: IIS-1619028
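As a rough illustration of the extractive, per-category approach described in this record, the sketch below scores sentences by average word frequency and concatenates one short summary per category. It is not the project's code: the stopword list, the function names (`summarize`, `summarize_by_category`), and the scoring rule are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real pipelines use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on",
             "is", "was", "that", "for", "by", "with"}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freqs = Counter(t for t in tokenize(text) if t not in STOPWORDS)

    def score(sentence):
        # Average corpus frequency of the sentence's non-stopword tokens.
        words = [t for t in tokenize(sentence) if t not in STOPWORDS]
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)

def summarize_by_category(docs_by_category, n_sentences=1):
    """Summarize each category's documents separately, then concatenate."""
    parts = [summarize(" ".join(docs), n_sentences)
             for docs in docs_by_category.values()]
    return " ".join(parts)
```

In this sketch the categories are supplied by the caller as a dictionary, which mirrors the manual category selection the abstract mentions; redundancy between categories is not handled.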

    VTechWorks
    Dataset . 2018
    License: CC BY NC
    Data sources: VTechWorks
  • Authors: Cheng, Junjie

    This is the Neural Network Document Summarization project for the Multimedia, Hypertext, and Information Access (CS 4624) course at Virginia Tech in the Spring 2018 semester. The purpose of this project is to generate a summary of a long document using deep learning, so that the outcome can replace part of a human's work. The implementation consists of four phases: data preprocessing, model building, training, and testing.

    In the data preprocessing phase, the data set is separated into a training set, a validation set, and a testing set in a 3:1:1 ratio. In each set, articles and abstracts are tokenized and then transformed into indexed documents. In the model building phase, a sequence-to-sequence model is implemented in PyTorch to transform articles into abstracts. The model contains an encoder and a decoder, both implemented as recurrent neural networks with long short-term memory (LSTM) units. Additionally, an MLP attention model is applied to the decoder to improve its performance. In the training phase, the model iteratively loads data from the training set and learns from it. In each iteration, the model generates a summary of the input document and compares it with the real summary. The difference between them is represented by a loss value, which the model uses to perform backpropagation and improve its accuracy. In the testing phase, the validation and testing sets are used to test the accuracy of the trained model. The model generates a summary of each input document, and the similarity between the generated summary and the human-produced summary is evaluated with PyRouge. All of the above tasks were completed during the semester. With the trained model, users can generate CNN/Daily Mail style highlights from an input article.

    DocSummarization.zip contains all source code of the project, the training data set, and a trained model. The DocSummarizationReport, in PDF and DOC versions, describes the project design and all technical details, and includes a user manual and a developer manual. The DocSummarizationPresentation, in PDF and PPT versions, contains the slides used for the final presentation and shows the general design and phases of the project.
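The preprocessing phase described in this record (a 3:1:1 split followed by tokenizing and indexing) can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the project's code: whitespace tokenization, the special tokens `<pad>` and `<unk>`, the shuffle seed, and all function names are hypothetical.

```python
import random

def split_dataset(pairs, ratio=(3, 1, 1), seed=0):
    """Shuffle (article, abstract) pairs and split them by the given ratio."""
    items = list(pairs)
    random.Random(seed).shuffle(items)
    total = sum(ratio)
    n_train = len(items) * ratio[0] // total
    n_val = len(items) * ratio[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# Hypothetical special tokens for padding and out-of-vocabulary words.
PAD, UNK = "<pad>", "<unk>"

def build_vocab(texts):
    """Map each distinct lowercase token to an integer index."""
    vocab = {PAD: 0, UNK: 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def to_indices(text, vocab):
    """Turn a text into a list of token indices, using UNK for unseen tokens."""
    return [vocab.get(token, vocab[UNK]) for token in text.lower().split()]
```

The vocabulary would normally be built from the training set only, so that validation and test documents exercise the unknown-token path the same way unseen input does.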

    VTechWorks
    Dataset . 2018
    Data sources: VTechWorks
  • Authors: Bartolome, Abigail; Bock, Matthew; Vinayagam, Radha Krishnan; Krishnamurthy, Rahul

    The IDEAL (Integrated Digital Event Archiving and Library) and Global Event and Trend Archive Research (GETAR) projects have collected over 1.5 billion tweets and webpages from social media and the World Wide Web and indexed them for easy retrieval and analysis. This gives researchers an extensive library of documents that reflect the interests and sentiments of the public in reaction to an event. By applying topic analysis to collections of tweets, researchers can learn which topics are of most interest or concern to the general public. Adding a layer of sentiment analysis to those topics shows how the public felt about them. The Sentiment and Topic Analysis team has designed a system that joins topic analysis and sentiment analysis for researchers who are interested in learning more about public reaction to global events. The tool runs topic analysis on a collection of tweets; the user can then select a topic of interest and assess the sentiments with regard to that topic (i.e., positive vs. negative). This submission covers the background, requirements, design, and implementation of our contributions to this project. We also include data, scripts, source code, a user manual, and a developer manual to assist in any future work.

    Sentiment_and_Topic_Analysis.pdf: final report
    Sentiment_and_Topic_Analysis_Presentation.pdf: PDF of final presentation
    Sentiment_and_Topic_Analysis_Presentation.pptx: PowerPoint of final presentation
    Sentiment_and_Topic_Analysis_LaTeX.zip: zip file of LaTeX files used to write final report
    Sentiment_and_Topic_Analysis_Work_Files.tar: source code and data files, contents listed below:
      AT0412.txt: tested dataset
      Word2VecSentimentAnalysis.scala: sentiment classifier
      topics_1 through topics_4: result files
      Topic analysis/MainWindow.scala: UI code
      Topic analysis:/pom.xml: used for UI
      Sentiment analysis/final_sentiment_analysis.py: reads tweet collection for sentiment analysis
      Sentiment analysis/first3.sh: passes tweet into syntaxnet
      Sentiment analysis/parse_tree.py: renders parse tree to represent file returned by syntaxnet
      Sentiment analysis/reverse_polarity_file: polarity reversal and negation words from General Inquirer

    NSF: IIS-1619028, NSF: IIS-1319578
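The file list in this record mentions polarity reversal and negation words. The sketch below shows one simple way such a list could be used in lexicon-based scoring: a negation word flips the polarity of the next sentiment-bearing word. It is an illustration only. The word sets are tiny hypothetical stand-ins, not the General Inquirer lists, and the team's actual classifier (Word2VecSentimentAnalysis.scala) is not reproduced here.

```python
# Tiny hypothetical lexicons, for illustration only.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "sad", "terrible", "hate"}
NEGATIONS = {"not", "never", "no"}

def sentiment_score(tweet):
    """Sum word polarities; a negation flips the next sentiment-bearing word."""
    score = 0
    negate = False
    for token in tweet.lower().split():
        word = token.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return score

def label(tweet):
    """Map the numeric score to a coarse sentiment label."""
    s = sentiment_score(tweet)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"
```

Restricting the input to tweets already assigned to one topic would give the per-topic positive vs. negative view the abstract describes.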

    VTechWorks
    Dataset . 2017
    License: CC BY
    Data sources: VTechWorks
  • Authors: Bialousz, Kenneth; Kokal, Kevin; Orleans-Pobee, Kwamina; Wakeley, Christopher

    CS4984 is a newly offered class at Virginia Tech with a unit-based, project/problem-based learning curriculum. This class style is based on NSF-funded work on curriculum for the field of digital libraries and related topics, and in this class it is used to guide a student-based investigation of computational linguistics. The specific problem this report addresses is how to automatically generate a short summary of a corpus of articles about earthquakes. Such a summary should best represent the texts and include all relevant information about the earthquakes. For our analysis, we operated on two corpora: one about a 5.8 magnitude earthquake in Virginia in August 2011, and another about a 6.6 magnitude earthquake in April 2013 in Lushan, China. Techniques used to analyze the articles include clustering, lemmatization, frequency analysis of n-grams, and regular expression searches. Included are PDF and Word versions of the final report, a ZIP file of source code, and a PDF and PowerPoint of the final presentation. NSF DUE-1141209 and IIS-1319578
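Two of the techniques named in this record, regular expression searches and n-gram frequency analysis, can be sketched as follows. The pattern and function names are assumptions made for illustration; the report's own expressions are not shown in this listing.

```python
import re
from collections import Counter

# Hypothetical pattern: a number directly before or after the word "magnitude".
MAGNITUDE_RE = re.compile(
    r"(\d+(?:\.\d+)?)[\s-]*magnitude|magnitude[\s-]*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)

def find_magnitudes(text):
    """Return every magnitude value mentioned in the text, as floats."""
    return [float(before or after) for before, after in MAGNITUDE_RE.findall(text)]

def ngram_counts(text, n=2):
    """Count n-grams over lowercase alphabetic tokens (numbers are dropped)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(zip(*(tokens[i:] for i in range(n))))
```

For example, `find_magnitudes("A 5.8 magnitude earthquake struck Virginia.")` returns `[5.8]`, and `ngram_counts(text).most_common(10)` lists the ten most frequent bigrams in a text.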

    VTechWorks
    Dataset . 2014
    License: CC 0
    Data sources: VTechWorks
  • Authors: Roble, Benjamin; Cheng, Justin; Sbitani, Marwan

    The goal of this project was to associate existing data in the Virtual Town Square database from the New River Valley area with topical metadata. We took a database of approximately 360,000 tweets and 15,000 RSS news stories collected in the last two years and associated each RSS story and tweet with topics. The open-source natural language processing library Mallet was used to perform topic modeling on the data using Latent Dirichlet Allocation, and the results were then used to create a Solr instance of searchable tweets and news stories. Topic modeling was not done around specific events; instead, the entire tweet data set (and the entire RSS data set) was used as the corpus. The tweet data was analyzed separately from the RSS stories, so the generated topics are specific to each data set. The report details the methodology used in our work in its Methodology section and contains a detailed Developer's Guide and User's Guide so that others may continue our work. The client was satisfied with the outcome of this project: even though tweets have generally been considered too short to be run through a topic modeling process, we generated topics for each tweet that appear to be relevant and accurate.

    This collection contains the source code, programs, documentation, and example data used in the project. Please review the "Final Report and Technical Manual" for a comprehensive overview of the project. The open-source library Mallet was used and is referenced here: McCallum, Andrew Kachites. "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu. 2002.

    Virginia Tech Center for Human-Computer Interaction Associate Director: Dr. Kavanaugh, kavan@vt.edu; Virginia Tech PhD Student: Ji Wang (InfoVis Lab), wji@cs.vt.edu; Virginia Tech PhD Student: Mohamed Magdy, mmagdy@vt.edu; Virginia Tech Professor: Dr. Edward Fox, fox@vt.edu
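The step of associating each tweet or story with topics before indexing can be sketched as below. This is a hedged illustration, not the project's pipeline: it assumes per-document topic weights have already been produced by a topic model, and the function names, record fields, and weight threshold are all hypothetical. It does not call Mallet or Solr.

```python
def top_topics(weights, k=2, min_weight=0.1):
    """Return indices of the k heaviest topics whose weight is at least min_weight."""
    ranked = sorted(range(len(weights)), key=lambda t: weights[t], reverse=True)
    return [t for t in ranked[:k] if weights[t] >= min_weight]

def build_records(docs, doc_topic_weights, topic_keywords, k=2):
    """Attach keyword labels of each document's top topics to a searchable record."""
    records = []
    for doc_id, text in docs.items():
        topics = top_topics(doc_topic_weights[doc_id], k)
        records.append({
            "id": doc_id,
            "text": text,
            # Each topic is labeled by joining its keywords into one string.
            "topics": [" ".join(topic_keywords[t]) for t in topics],
        })
    return records
```

Records shaped like this could then be sent to a search index so that documents are retrievable by topic as well as by text.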

    VTechWorks
    Dataset . 2014
    License: CC BY
    Data sources: VTechWorks
Powered by OpenAIRE graph