The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022)



Call for Papers and Shared Task Participation (CASE @ EMNLP 2022): Challenges and Applications of Automated Extraction of Socio-political Events from Text



Important Dates


September 7, 2022: Submission deadline on Softconf

July 15, 2022: Latest ARR submission deadline

October 2, 2022: Latest ARR commitment deadline 

October 9, 2022: Notification of Acceptance

October 16, 2022: Camera-ready papers due

Workshop dates: December 7-8, 2022

Location: Hybrid (Abu Dhabi and online)

Please see below for the important dates of the shared tasks.


There are two options for submission: directly on Softconf (START) or through ACL Rolling Review (ARR).



Nowadays, the unprecedented quantity of easily accessible data on social, political, and economic processes offers ground-breaking potential in guiding data-driven analysis in social and human sciences and in driving informed policy-making processes. Governments, multilateral organizations, and local and global NGOs present an increasing demand for high-quality information about a wide variety of events, ranging from political violence, environmental catastrophes, and conflict to international economic and health crises (Coleman et al. 2014; Della Porta and Diani, 2015), in order to prevent or resolve conflicts, provide relief for those who are afflicted, or improve the lives of and protect citizens in a variety of ways. Black Lives Matter protests and conflicts in Syria are only two examples where we must understand, analyze, and improve the real-life situations using such data. Finally, these efforts also respond to “growing public interest in up-to-date information on crowds”.

Event extraction has long been a challenge for the natural language processing (NLP) community, as it requires sophisticated methods for defining event ontologies, creating language resources, and developing algorithmic approaches (Pustejovsky et al. 2003; Boroş, 2018; Chen et al. 2021). Social and political scientists have been working for decades to create socio-political event (SPE) databases such as ACLED, EMBERS, GDELT, ICEWS, MMAD, PHOENIX, POLDEM, SPEED, TERRIER, and UCDP, following similar steps. These projects and newer ones increasingly rely on machine learning (ML), deep learning (DL), and NLP methods to deal better with the vast amount and variety of data in this domain (Hürriyetoğlu et al. 2020). Automation offers scholars not only the opportunity to improve existing practices, but also to vastly expand the scope of data that can be collected and studied, thus potentially opening up new research frontiers within the field of SPEs, such as political violence and social movements. However, automated approaches also suffer from major issues such as bias, limited generalizability, class imbalance, training data limitations, and ethical concerns, which have the potential to affect the results and their use drastically (Lau and Baldwin 2020; Bhatia et al. 2020; Chang et al. 2019). Moreover, the results of automated systems for SPE information collection have neither been comparable to each other nor been of sufficient quality (Wang et al. 2016; Schrodt 2020).

SPEs are varied and nuanced. Both the political context and the local language used may affect whether and how they are reported. Therefore, all steps of information collection (event definition, language resources, and manual or algorithmic steps) may need to be constantly updated, leading to a series of challenging questions: Are events related to minority groups represented well? Are new types of events covered? Are the event definitions and their operationalization comparable across systems? This workshop aims to seek answers to these questions, to inspire innovative technological and scientific solutions for tackling these issues, and to quantify the quality of the results.



Call for Papers

We invite contributions from researchers in computer science, NLP, ML, DL, AI, socio-political sciences, conflict analysis and forecasting, and peace studies, as well as from computational social science scholars involved in the collection and utilization of SPE data. Social and political scientists will be interested in reporting and discussing their approaches and observing what state-of-the-art text processing systems can achieve for their domain. Computational scholars will have the opportunity to illustrate the capacity of their approaches in this domain and benefit from being challenged by real-world use cases. Academic workshops specific to tackling event information in general or for analyzing text in specific domains such as health, law, finance, and biomedical sciences have significantly accelerated progress in these topics and fields, respectively. However, there has not been a comparable effort for handling SPEs. We fill this gap. We invite work on all aspects of automated coding and analysis of SPEs and events in general from mono- or multi-lingual text sources. This includes (but is not limited to) the following topics:

  • Extracting events in and beyond a sentence, event coreference resolution
  • New datasets, training data collection and annotation for event information
  • Event-event relations, e.g., subevents, main events, causal relations
  • Event dataset evaluation in light of reliability and validity metrics
  • Defining, populating, and facilitating event schemas and ontologies
  • Automated tools and pipelines for event collection related tasks
  • Lexical, syntactic, discursive, and pragmatic aspects of event manifestation
  • Methodologies for development, evaluation, and analysis of event datasets
  • Applications of event databases, e.g., early warning, conflict prediction, policymaking
  • Estimating what is missing in event datasets using internal and external information
  • Detection of new SPE types, e.g., creative protests, cyber activism, COVID-19-related events
  • Release of new event datasets
  • Bias and fairness of the sources and event datasets
  • Ethics, misinformation, privacy, and fairness concerns pertaining to event datasets
  • Copyright issues on event dataset creation, dissemination, and sharing

We encourage submissions of new system description papers on our available benchmarks (ProtestNews @ CLEF 2019, AESPEN @ LREC 2020, and CASE 2021). Please contact the organizers if you would like to access the data.



This call solicits short and long papers reporting original and unpublished research on the topics listed above. The papers should emphasize obtained results rather than intended work and should indicate clearly the state of completion of the reported results. The page limits and content structure announced on the ACL ARR page should be followed for both short and long papers.

Papers should be submitted in PDF format on the START page of the workshop or on the ARR page (TBA on the workshop website), in compliance with the ACL author guidelines.

The reviewing process will be double-blind, and papers should not include the authors’ names and affiliations. Each submission will be reviewed by at least three members of the program committee. The workshop proceedings will be published in the ACL Anthology.


Tasks 1 & 2: Multilingual Protest Event Detection

Task 1 - Multilingual protest news detection: This is the same shared task organized at CASE 2021, but this time there will be additional data and languages at the evaluation stage. Contact person: Ali Hürriyetoğlu.

Task 2 - Automatically replicating manually created event datasets: Participants of Task 1 will be invited to run the systems they develop for Task 1 on a news archive. Contact person: Hristo Tanev.



Task 3 - Event causality identification: Causality is a core cognitive concept and appears in many natural language processing (NLP) works that aim to tackle inference and understanding. We are interested in studying event causality in news and therefore introduce the Causal News Corpus. The Causal News Corpus consists of 3,559 event sentences, extracted from protest event news, that have been annotated with sequence labels indicating whether each sentence contains causal relations or not. Causal sentences are additionally annotated with Cause, Effect, and Signal spans. Our two subtasks (Sequence Classification and Span Detection) work on the Causal News Corpus, and we hope that accurate, automated solutions may be proposed for the detection and extraction of causal events in news. Contact person: Fiona Anting Tan.

Please follow the workshop page for updates, or contact the person responsible for the task you are interested in.



Participants in the Shared Task are expected to submit a paper to the workshop, although submitting a paper is not mandatory for participation. Papers must follow the CASE 2022 workshop submission instructions (ACL 2022 style template: T.B.D.) and will undergo regular peer review. Their acceptance will depend not on the results obtained in the shared task but on the quality of the paper. Authors of accepted papers will be informed about the evaluation results of their systems prior to the paper submission deadline (see the important dates).

Important Dates for the Shared Tasks:

** Tasks 1 & 2:

Training data available: The training data from CASE 2021 is used.

New test data available: Sep 15, 2022

Test end: Sep 25, 2022

System Description Paper submissions due: Oct 2, 2022

Notification to authors after review: Oct 09, 2022

Camera-ready: Oct 16, 2022

** Task 3:

Training data available: Apr 15, 2022

Validation data available: Apr 15, 2022

Validation labels available: Aug 01, 2022

Test data available: Aug 01, 2022

Test start: Aug 01, 2022

Test end: Aug 31, 2022 (extended from Aug 15)

System Description Paper submissions due: Sep 07, 2022

Notification to authors after review: Oct 09, 2022

Camera-ready: Oct 16, 2022



Three prominent scholars have accepted our invitation as keynote speakers:

  1. J. Craig Jenkins is Academy Professor Emeritus of Sociology at The Ohio State University. He directed the Mershon Center for International Security Studies from 2011 to 2015 and is now a senior research scientist. Jenkins is the author of more than 100 refereed articles and book chapters, as well as author or editor of several books, including The Politics of Insurgency: The Farm Worker’s Movement of the 1960s (1986); The Politics of Social Protest: Comparative Perspectives on States and Social Movements, with Bert Klandermans (University of Minnesota Press, 1995); Identity Conflicts: Can Violence be Regulated?, with Esther Gottlieb (Transaction Publishers, 2007); and Handbook of Politics: State and Society in Global Perspective, with Kevin T. Leicht (Springer, 2010). He has received numerous awards, including the Robin M. Williams Jr. Award for Distinguished Contributions to Scholarship, Teaching and Service from the Section on Peace, War and Social Conflict of the American Sociological Association (2015), fellow of the American Association for the Advancement of Science (2009), Joan Huber Faculty Fellow (2003), chair of the Section on Committees of the American Sociological Association (1998-2000), chair of the Section on Political Sociology, ASA (1995-96), and chair of the Section on Collective Behavior and Social Movements, ASA (1994-95). He was elected to the Sociological Research Association in 1993 and was a national security fellow at the Mershon Center for International Security at Ohio State in 1988, a Mershon Center professor from 2003 to 2006, and chair of the Sociology Department from 2006 to 2010. Jenkins has received numerous grants from funding agencies, including the National Science Foundation, National Endowment for Humanities, and Russell Sage Foundation. In 2010-11, he received a Liev Eriksson Mobility Grant from the Norway Research Council. In 2011-12, Jenkins was a Fulbright Fellow to Norway and a visiting professor at the Peace Research Institute of Oslo (PRIO) in Oslo, Norway. In 2017, Jenkins and co-investigator Maciek Slomczynski received a $1.4 million grant from the National Science Foundation for a four-year project on “Survey Data Recycling: New Analytic Framework, Integrated Database and Tools for Cross-National Social, Behavioral and Economic Research.” Jenkins has served as deputy editor of American Sociological Review (1986-1989) and on the editorial boards of Journal of Political and Military Sociology, International Studies Quarterly, Sociological Forum, and Sociological Quarterly.
  2. Scott Althaus is Merriam Professor of Political Science, Professor of Communication, and Director of the Cline Center for Advanced Social Research at the University of Illinois Urbana-Champaign. He also has faculty appointments with the School of Information Sciences and the National Center for Supercomputing Applications. His work with the Cline Center applies text analytics methods and Artificial Intelligence algorithms to extract insights from millions of news stories in ways that produce new forms of knowledge that advance societal well-being around the world. His own research interests explore the communication processes that support political accountability in democratic societies and that empower political discontent in non-democratic societies. His interests focus on four areas of inquiry: (1) how journalists construct news coverage about public affairs, (2) how leaders attempt to shape news coverage for political advantage, (3) how citizens use news coverage for making sense of public affairs, and (4) how the opinions of citizens are communicated back to leaders. He has particular interests in popular support for war, data science methods for extreme-scale analysis of news coverage, cross-national comparative research on political communication, the psychology of information processing, and communication concepts in democratic theory. His current projects include using data mining methods to help journalists cover terrorist attacks in responsible ways, a solo-authored book manuscript to be published by Cambridge University Press about the dynamics of popular support for war in the United States, and a co-authored book manuscript (with Tamir Sheafer and Gadi Wolfsfeld) in press with Oxford University Press on understanding the role of media in supporting governmental accountability and increasing the government’s responsiveness to citizen needs.
  3. Thien Huu Nguyen is an assistant professor in the Department of Computer and Information Science at the University of Oregon. He obtained his Ph.D. in natural language processing (NLP) at New York University (working with Ralph Grishman) and did a postdoc at the University of Montreal (working with Yoshua Bengio). Thien’s research areas involve information extraction, language grounding, and deep learning, where he developed one of the first deep learning models for entity recognition, relation extraction, and event extraction. His current research explores multi-domain and multilingual NLP that aims to learn transferable representations to perform information extraction tasks over different domains and languages. Thien is the director of the NSF IUCRC Center for Big Learning (CBL) at the University of Oregon. His research has been supported by NSF, IARPA, Army Research Office, Adobe Research, and IBM Research.






Organization Committee 

Ali Hürriyetoğlu (KNAW Humanities Cluster DHLab, the Netherlands)

Hristo Tanev (European Commission, Joint Research Centre (EU JRC), Italy)

Vanni Zavarella (EU JRC, Italy)

Reyyan Yeniterzi (Sabancı University, Turkey)

Erdem Yörük (KU, Turkey)

Deniz Yüret (KU, Turkey)

Osman Mutlu (KU, Turkey)

Fırat Duruşan (KU, Turkey)

Ali Safaya (KU, Turkey)

Bharathi Raja Asoka Chakravarthi (Insight SFI Centre for Data Analytics, Ireland)

Benjamin J. Radford (UNC Charlotte, United States)

Francielle Vargas (University of São Paulo, Brazil)

Farhana Ferdousi Liza (University of East Anglia, United Kingdom)

Milena Slavcheva (Bulgarian Academy of Sciences, Bulgaria)

Ritesh Kumar (Dr. Bhimrao Ambedkar University, India)

Daniela Cialfi (The ‘Gabriele d’Annunzio’ University, Italy)

Tiancheng Hu (ETH Zürich, Switzerland)

Niklas Stöhr (ETH Zürich, Switzerland)

Fiona Anting Tan (National University of Singapore, Singapore)

Tadashi Nomoto (National Institute of Japanese Literature, Japan)



Program Committee 

Fatih Beyhan (Sabancı University, Turkey)

Elizabeth Boschee (Information Sciences Institute, United States)

Tommaso Caselli (University of Groningen, the Netherlands)

Xingran Chen (University of Michigan – Ann Arbor, United States)

Martin Fajcik (IDIAP Research Institute, Switzerland)

Andrew Halterman (Michigan State University, United States)

Hansi Hettiarachchi (Birmingham City University, United Kingdom)

Li Zhuoqun (Chinese Academy of Sciences, China)

Pasquale Lisena (EURECOM, France)

Arka Mitra (ETH Zürich, Switzerland)

Manolito Octaviano Jr. (National University, Manila, Philippines)

Fabiana Rodrigues de Góes (University of São Paulo, Brazil)

Surendrabikram Thapa (Virginia Tech, United States)

Paul Trust (University College Cork, Ireland)

Onur Uca (Mersin University, Turkey)

Yongjun Zhang (Stony Brook University, United States)

Ge Zhang (University of Michigan – Ann Arbor, United States)

Juan Pablo Zuluaga-Gomez (IDIAP Research Institute, Switzerland)






Bhatia, S., Lau, J. H., & Baldwin, T. (2020). You are right. I am ALARMED–But by Climate Change Counter Movement. arXiv preprint arXiv:2004.14907.

Boroş, E. (2018). Neural Methods for Event Extraction. Ph.D. thesis, Université Paris-Saclay.

Chang, K. W., Prabhakaran, V., & Ordonez, V. (2019, November). Bias and fairness in natural language processing. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): Tutorial Abstracts.

Chen, M., Zhang, H., Ning, Q., Li, M., Ji, H., & Roth, D. (2021). Event-centric Natural Language Understanding. In Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021): Tutorial.

Coleman, P. T., Deutsch, M., & Marcus, E. C. (Eds.). (2014). The handbook of conflict resolution: Theory and practice. John Wiley & Sons.

Della Porta, D., & Diani, M. (Eds.). (2015). The Oxford handbook of social movements. Oxford University Press.

Hürriyetoğlu, A., Zavarella, V., Tanev, H., Yörük, E., Safaya, A., & Mutlu, O. (2020, May). Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report. In Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020 (pp. 1-6).

Lau, J. H., & Baldwin, T. (2020, July). Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 2908-2913).

Pustejovsky, J., Castano, J. M., Ingria, R., Sauri, R., Gaizauskas, R. J., Setzer, A., … & Radev, D. R. (2003). TimeML: Robust specification of event and temporal expressions in text. New directions in question answering, 3, 28-34.

Schrodt, P. A. (2020, May). Keynote Abstract: Current Open Questions for Operational Event Data. In Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020.

Wang, W., Kennedy, R., Lazer, D., & Ramakrishnan, N. (2016). Growing pains for global monitoring of societal events. Science, 353(6307), 1502-1503.





