research data . Dataset . 2016 . Embargo end date: 12 Jul 2017

Brexit stance annotated tweets

Grčar, Miha; Cherepnalkoski, Darko; Mozetič, Igor; Kralj Novak, Petra
Open Access
  • Published: 01 Jan 2016
  • Publisher: Jožef Stefan Institute
The corpus contains over 4.5 million tweets (tweet IDs) automatically labeled by a machine learning program with stance regarding Brexit: Positive (supporting Brexit), Negative (opposing Brexit), or Neutral (uncommitted). The Brexit referendum was held on June 23, 2016, to decide whether the UK should leave or remain in the EU. In the weeks before the referendum, starting on May 12, Brexit-related tweets geo-located in the UK were collected continuously, resulting in a dataset of around 4.5 million (4,508,440) tweets from almost one million (998,054) users. A large sample of the collected tweets (35,000) was manually labeled for the stance of their authors regarding Brexit, using the same three classes. The labeled tweets were used to train a classifier, which then automatically labeled all the remaining tweets. The corpus contains tweet IDs and stance labels. The tweets are grouped into files, one hour per file. In each file, one row represents one entry (twitter_id, sentiment_label). Rows are ordered by tweet time. The data collection, annotation, model training, and performance estimation are described in detail in: Miha Grčar, Darko Cherepnalkoski, Igor Mozetič, Petra Kralj Novak: Stance and influence of Twitter users regarding the Brexit referendum. Computational Social Networks 4/6. 2017.
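As an illustration of the row layout described above, the following minimal Python sketch tallies stance labels from rows of the form (twitter_id, sentiment_label). The comma delimiter, the exact label spellings, the function name count_stances, and the sample tweet IDs are all assumptions made for this example; the description does not specify the delimiter or file naming, so check the actual files before relying on it.

```python
import csv
from collections import Counter
from io import StringIO


def count_stances(lines, delimiter=","):
    """Count stance labels in rows shaped like (twitter_id, sentiment_label).

    `lines` is any iterable of text lines, e.g. an open file handle.
    The delimiter is an assumption; adjust it to match the real files.
    """
    counts = Counter()
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        label = row[1].strip()
        counts[label] += 1
    return counts


# Hypothetical sample standing in for one hourly file.
# The tweet IDs and label spellings below are invented for illustration.
sample = StringIO(
    "743000000000000001,Positive\n"
    "743000000000000002,Negative\n"
    "743000000000000003,Neutral\n"
    "743000000000000004,Negative\n"
)

stance_counts = count_stances(sample)
for label, n in sorted(stance_counts.items()):
    print(label, n)
```

To process a real hourly file, pass an open file handle instead of the in-memory sample, for example `with open(path, newline="") as f: count_stances(f)`, where `path` is a placeholder for the file location. Because the corpus holds only tweet IDs and labels, the tweet text itself would have to be retrieved separately.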
Persistent Identifiers
  • Digital Humanities and Cultural Heritage
Funded by
Distributed Global Financial Systems for Society
  • Funder: European Commission (EC)
  • Project Code: 640772
  • Funding stream: H2020 | RIA