Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. To view more results, visit OpenAIRE - Explore.
4,595 Research products, page 1 of 460

  • Digital Humanities and Cultural Heritage
  • Research data
  • Other research products
  • 2013-2022
  • Audiovisual

  • Research data . Audiovisual . 2021 . Embargo End Date: 25 May 2021
    Authors: 
    NAACL 2021 2021; Chubarian, Karine; Khan, Abdul; Sidiropoulos, Anastasios; Xu, Jia;
    Publisher: Underline Science Inc.

    Read the paper on the following link: https://www.aclweb.org/anthology/2021.naacl-main.257/ Abstract: Deep Learning-based NLP systems can be sensitive to unseen tokens and hard to learn with high-dimensional inputs, which critically hinder learning generalization. We introduce an approach by grouping input words based on their semantic diversity to simplify input language representation with low ambiguity. Since the semantically diverse words reside in different contexts, we are able to substitute words with their groups and still distinguish word meanings relying on their contexts. We design several algorithms that compute diverse groupings based on random sampling, geometric distances, and entropy maximization, and we prove formal guarantees for the entropy-based algorithms. Experimental results show that our methods generalize NLP models and demonstrate enhanced accuracy on POS tagging and LM tasks and significant improvements on medium-scale machine translation tasks, up to +6.5 BLEU points.

  • Research data . Audiovisual . 2021 . Embargo End Date: 25 May 2021
    Authors: 
    NAACL 2021 2021; Bansal, Mohit; Hannan, Darryl; Maharana, Adyasha;
    Publisher: Underline Science Inc.

    Read the paper on the following link: https://www.aclweb.org/anthology/2021.naacl-main.194/ Abstract: Story visualization is an underexplored task that falls at the intersection of many important research directions in both computer vision and natural language processing. In this task, given a series of natural language captions which compose a story, an agent must generate a sequence of images that correspond to the captions. Prior work has introduced recurrent generative models which outperform text-to-image synthesis models on this task. However, there is room for improvement of generated images in terms of visual quality, coherence and relevance. We present a number of improvements to prior modeling approaches, including (1) the addition of a dual learning framework that utilizes video captioning to reinforce the semantic alignment between the story and generated images, (2) a copy-transform mechanism for sequentially-consistent story visualization, and (3) MART-based transformers to model complex interactions between frames. We present ablation studies to demonstrate the effect of each of these techniques on the generative power of the model for both individual images as well as the entire narrative. Furthermore, due to the complexity and generative nature of the task, standard evaluation metrics do not accurately reflect performance. Therefore, we also provide an exploration of evaluation metrics for the model, focused on aspects of the generated frames such as the presence/quality of generated characters, the relevance to captions, and the diversity of the generated images. We also present correlation experiments of our proposed automated metrics with human evaluations.

  • Research data . Audiovisual . 2020 . Embargo End Date: 04 Dec 2020
    Authors: 
    The 28th International Conference on Computational Linguistics 2020;
    Publisher: Underline Science Inc.
  • Authors: 
    Huber, Catrin; Haynes, Ian; Turner, Alexander; Ravesi, Thea; Morris, Rosie;
    Publisher: Newcastle University

    3D scans of objects from Herculaneum that were printed and included as part of Catrin Huber's installation at the House of the Beautiful Courtyard. These files are in .OBJ format with a .MTL file and a texture file that is either .JPG or .PNG format

  • Research data . Audiovisual . 2020 . Embargo End Date: 25 Nov 2020
    Authors: 
    The 28th International Conference on Computational Linguistics 2020; Gong, Ming; Jiang, Daxin; Pei, Jian; Shou, Linjun; Wen, Lijie; Zhang, Xingyao;
    Publisher: Underline Science Inc.

    Generating texts which express complex ideas spanning multiple sentences requires a structured representation of their content (document plan), but these representations are prohibitively expensive to manually produce. In this work, we address the problem of generating coherent multi-sentence texts from the output of an information extraction system, and in particular a knowledge graph. Graphical knowledge representations are ubiquitous in computing, but pose a significant challenge for text generation techniques due to their non-hierarchical nature, collapsing of long-distance dependencies, and structural variety. We introduce a novel graph transforming encoder which can leverage the relational structure of such knowledge graphs without imposing linearization or hierarchical constraints. Incorporated into an encoder-decoder setup, we provide an end-to-end trainable system for graph-to-text generation that we apply to the domain of scientific text. Automatic and human evaluations show that our technique produces more informative texts which exhibit better document structure than competitive encoder-decoder methods.

  • Research data . Audiovisual . 2021 . Embargo End Date: 15 Oct 2021
    Authors: 
    The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Wang, Jack;
    Publisher: Underline Science Inc.

    Anthology paper link: https://aclanthology.org/2021.emnlp-main.484/ Abstract: We study the problem of generating arithmetic math word problems (MWPs) given a math equation that specifies the mathematical computation and a context that specifies the problem scenario. Existing approaches are prone to generating MWPs that are either mathematically invalid or have unsatisfactory language quality. They also either ignore the context or require manual specification of a problem template, which compromises the diversity of the generated MWPs. In this paper, we develop a novel MWP generation approach that leverages i) pre-trained language models and a context keyword selection model to improve the language quality of the generated MWPs and ii) an equation consistency constraint for math equations to improve the mathematical validity of the generated MWPs. Extensive quantitative and qualitative experiments on three real-world MWP datasets demonstrate the superior performance of our approach compared to various baselines.

  • Authors: 
    Dabağcı, Esra;
    Publisher: Loughborough University

    This is an audio recording of one of the interviews we conducted for our project "Women's Media and Memory". In this project, we aim to record and archive women's memories in Turkey by conducting oral history interviews. In the light of this research, we want to look at feminist understanding of Turkey's history through women's narratives. Please visit the project website for more information: https://www.bizimhikayemiz.org/

  • Authors: 
    Kartomi, Margaret J.; Amir, Iwan Dzulvan;
    Publisher: Monash University

    AV13.2 Audiovisual Example 2 in Chapter 13 of book: Margaret Kartomi, ‘Musical Journeys in Sumatra’, Champaign-Urbana: University of Illinois Press, 2012. For male participants at the Festival of Acehnese Dance and Oboes and based on traditional village performances of the dabôh genre. The two lead dancers formally greet the two kalipah (spiritual leaders) as the solo vocalist sings and the cross-legged seated men play their rapa’i frame drums, after which the clip shows the dancers stabbing themselves with an awl, with no pain felt or wounds received due to their believed religious concentration. Meanwhile the singers sing the most beautiful names of Allah, the prophet Muhammad, and other prophets. The group, named Sanggar Aceh Barat Daya, was led by Bp M Johar of Blangpidie. These recordings of traditional Acehnese music and dance were made during Margaret Kartomi’s Australian Research Council-funded field trip to the Festival Tari dan Seuruné Kalée (Festival of Dance and Oboes) in Lhokseumawe, North Aceh, Indonesia, 28 February to 2 March 2003. Margaret Kartomi was invited to record at the Festival by the Bupati (district head) of North Aceh, Bp Tarmizi Karim, and by Bp Noerdin Daood, lecturer in Acehnese dance in the Jakarta Arts Institute (IKJ), who was a member of the jury. She was accompanied by her husband, Mas Kartomi, and Monash PhD ethnomusicology alumnus, Iwan Dzulvan Amir. Camera by Iwan Dzulvan Amir. Recorded by permission of the Bupati of North Aceh, Bp Tarmizi Karim, and his appointed artist and 2003 festival organiser, Bp Rizal, on behalf of the troupes who performed in the Festival. Copyright 2003 Margaret J. Kartomi.

  • Authors: 
    Kartomi, Margaret;
    Publisher: Monash University

    Recordings by RRI copied by Margaret Kartomi, Bali, Indonesia, 1971. Recorded November 1971. Copyright Margaret J. Kartomi. Original format: 1 sound tape reel (0:37:24) : analog; 3¾ ips, 7½ ips

  • Research data . Audiovisual . 2021 . Embargo End Date: 25 May 2021
    Authors: 
    NAACL 2021 2021; Devaraj, Ashwin; Li, Junyi Jessy; Marshall, Iain; Wallace, Byron;
    Publisher: Underline Science Inc.

    Read the paper on the following link: https://www.aclweb.org/anthology/2021.naacl-main.395/ Abstract: We consider the problem of learning to simplify medical texts. This is important because most reliable, up-to-date information in biomedicine is dense with jargon and thus practically inaccessible to the lay audience. Furthermore, manual simplification does not scale to the rapidly growing body of biomedical literature, motivating the need for automated approaches. Unfortunately, there are no large-scale resources available for this task. In this work we introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts. We show that this automated measure better differentiates between technical and lay summaries than existing heuristics. We introduce and evaluate baseline encoder-decoder Transformer models for simplification and propose a novel augmentation to these in which we explicitly penalize the decoder for producing "jargon" terms; we find that this yields improvements over baselines in terms of readability.
