software . 2022

FreEM-corpora/FreEM-automatic-normalisation: normalisation models for Early Modern French

Bawden, Rachel; Poinhos, Jonathan; Kogkitsidou, Eleni; Gambette, Philippe; Sagot, Benoît; Gabay, Simon
Open Access French
  • Published: 30 May 2022
  • Publisher: Zenodo
Abstract
Models to automatically normalise Early Modern French texts (i.e. from the 17th c.) into contemporary French spelling norms. Several models are provided: an LSTM model using Fairseq, a transformer model also using Fairseq, and an SMT model trained using Moses. All models are trained on the FreEM_norm parallel corpus.

For more information, please refer to:
  • the paper detailing the motivations, methods and training [HAL]
  • the code repository [github]

If you use these models, please cite the following paper:

Rachel Bawden, Jonathan Poinhos, Eleni Kogkitsidou, Philippe Gambette, Benoît Sagot and Simon Gabay. 2022. Automatic Normalisation of Early Modern French. In Proceedings of the 13th Language Resources and Evaluation Conference, Marseille, France. European Language Resources Association.

Bibtex:
@inproceedings{bawden-etal-2022-automatic,
  title = {{Automatic Normalisation of Early Modern French}},
  author = {Bawden, Rachel and Poinhos, Jonathan and Kogkitsidou, Eleni and Gambette, Philippe and Sagot, Beno{\^i}t and Gabay, Simon},
  url = {https://hal.inria.fr/hal-03540226},
  booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference},
  publisher = {European Language Resources Association},
  year = {2022},
  address = {Marseille, France},
  note = {To appear}
}
Subjects
free text keywords: Modern French, Early Modern French, Normalisation, Normalization, Digital humanities, Machine Translation, Historical
Communities
  • Digital Humanities and Cultural Heritage
  • Social Science and Humanities
Providers: ZENODO