The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2022) 

Nowadays, the unprecedented quantity of easily accessible data on social, political, and economic processes offers ground-breaking potential for guiding data-driven analysis in the social and human sciences and for driving informed policy-making processes. Governments, multilateral organizations, and local and global NGOs present an increasing demand for high-quality information about a wide variety of events, ranging from political violence, environmental catastrophes, and conflict to international economic and health crises (Coleman et al., 2014; Della Porta and Diani, 2015), in order to prevent or resolve conflicts, provide relief for those who are afflicted, or protect and improve the lives of citizens in a variety of ways. The Black Lives Matter protests [1] and the conflict in Syria [2] are only two examples of real-life situations that must be understood, analyzed, and improved using such data. These efforts also respond to a “growing public interest in up-to-date information on crowds”.

Event extraction has long been a challenge for the natural language processing (NLP) community, as it requires sophisticated methods for defining event ontologies, creating language resources, and developing algorithmic approaches (Pustejovsky et al., 2003; Boroş, 2018; Chen et al., 2021). Social and political scientists have been following similar steps for decades to create socio-political event (SPE) databases such as ACLED, EMBERS, GDELT, ICEWS, MMAD, PHOENIX, POLDEM, SPEED, TERRIER, and UCDP. These projects, and newer ones, increasingly rely on machine learning (ML), deep learning (DL), and NLP methods to better handle the vast amount and variety of data in this domain (Hürriyetoğlu et al., 2020). Automation offers scholars not only the opportunity to improve existing practices, but also to vastly expand the scope of data that can be collected and studied, thus potentially opening up new research frontiers within the field of SPEs, such as political violence and social movements. However, automated approaches also suffer from major issues such as bias, generalizability, class imbalance, training data limitations, and ethical concerns, all of which can drastically affect the results and their use (Lau and Baldwin, 2020; Bhatia et al., 2020; Chang et al., 2019). Moreover, the results of automated systems for SPE information collection have been neither comparable to each other nor of sufficient quality (Wang et al., 2016; Schrodt, 2020).

SPEs are varied and nuanced. Both the political context and the local language used may affect whether and how they are reported. Therefore, all steps of information collection (event definition, language resources, and manual or algorithmic steps) may need to be constantly updated, leading to a series of challenging questions: Are events related to minority groups represented well? Are new types of events covered? Are the event definitions and their operationalization comparable across systems? This workshop seeks answers to these questions and aims to inspire innovative technological and scientific solutions for tackling these issues and for quantifying the quality of the results.

 

Call for Papers

We invite contributions from researchers in computer science, NLP, ML, DL, AI, socio-political sciences, conflict analysis and forecasting, and peace studies, as well as computational social science scholars involved in the collection and utilization of SPE data. Social and political scientists will be interested in reporting and discussing their approaches and in observing what state-of-the-art text processing systems can achieve for their domain. Computational scholars will have the opportunity to illustrate the capacity of their approaches in this domain and to benefit from being challenged by real-world use cases. Academic workshops dedicated to event information in general, or to analyzing text in specific domains such as health, law, finance, and biomedical sciences, have significantly accelerated progress in their respective fields. However, there has not been a comparable effort for handling SPEs. This workshop aims to fill that gap. We invite work on all aspects of automated coding and analysis of SPEs, and of events in general, from mono- or multilingual text sources. This includes (but is not limited to) the following topics:

  • Extracting events in and beyond a sentence, event coreference resolution
  • New datasets, training data collection and annotation for event information
  • Event-event relations, e.g., subevents, main events, causal relations
  • Event dataset evaluation in light of reliability and validity metrics
  • Defining, populating, and facilitating event schemas and ontologies
  • Automated tools and pipelines for event collection related tasks
  • Lexical, syntactic, discursive, and pragmatic aspects of event manifestation
  • Methodologies for development, evaluation, and analysis of event datasets
  • Applications of event databases, e.g. early warning, conflict prediction, policymaking 
  • Estimating what is missing in event datasets using internal and external information
  • Detection of new SPE types, e.g. creative protests, cyber activism, COVID-19-related events
  • Release of new event datasets
  • Bias and fairness of the sources and event datasets
  • Ethics, misinformation, privacy, and fairness concerns pertaining to event datasets
  • Copyright issues on event dataset creation, dissemination, and sharing

 

Finally, we encourage submissions of new system description papers based on our available benchmarks (ProtestNews @ CLEF 2019, AESPEN @ LREC 2020, and CASE 2021).

 

Submissions

This call solicits full papers reporting original and unpublished research on the topics listed above. The papers should emphasize obtained results rather than intended work and should indicate clearly the state of completion of the reported results. Submissions should be between 4 and 8 pages in total, plus unlimited pages of references. Final versions of the papers will be given one additional page of content (up to 9 pages plus references) so that reviewers’ comments can be taken into account.

Authors are also invited to submit short papers not exceeding 4 pages (plus two additional pages for references). Short papers should describe:

  • a small, focused contribution;
  • work in progress;
  • a negative result;
  • a position paper; or
  • a report on shared task participation.

Papers should be submitted via the START page of the workshop (T.B.D.) in PDF format, in compliance with the ACL author guidelines (T.B.D.).

The reviewing process will be double-blind and papers should not include the authors’ names and affiliations. Each submission will be reviewed by at least three members of the program committee. If you do include any author names on the title page, your submission will be automatically rejected. In the body of your submission, you should eliminate all direct references to your own previous work.

The workshop proceedings will be published in the ACL Anthology.

 

Shared Tasks 

We have prepared i) a multilingual SPE information classification and extraction task that extends the list of languages covered in CASE 2021 (English, Spanish, Portuguese, and Hindi) with at least Arabic, Turkish, Chinese, German, and Dravidian languages, ii) a challenge on replicating the spatio-temporal distribution of protests pertaining to COVID-19 [3], and iii) a task on event causality detection. Participating teams (15 teams participated in CASE 2021) will be required to submit a system description report, which will be peer-reviewed by the program committee. We expect a comparable number of participants this year.
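
For orientation, below is a minimal sketch of the kind of sentence-level classifier a participating system for the classification subtask might start from. The toy training examples and the binary label set (protest event vs. none) are illustrative assumptions, not the official CASE 2022 data format, language coverage, or baseline.

    # Minimal illustrative sketch of a protest-event sentence classifier.
    # The tiny in-line examples and labels are assumptions for demonstration,
    # not the official shared task data.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train_texts = [
        "Thousands marched downtown demanding higher wages.",
        "Police dispersed demonstrators outside the parliament.",
        "The central bank kept interest rates unchanged.",
        "The new stadium will open to visitors next spring.",
    ]
    train_labels = [1, 1, 0, 0]  # 1 = reports a protest event, 0 = does not

    # Real submissions typically replace the TF-IDF features with a multilingual
    # pretrained encoder; this pipeline only illustrates the task setup.
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(train_texts, train_labels)

    # Likely predicts [1], since the sentence shares vocabulary with the protest examples.
    print(model.predict(["Demonstrators marched in the capital demanding lower rents."]))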

Publication

Participants in the shared tasks are expected, though not required, to submit a paper to the workshop. Papers must follow the CASE 2022 workshop submission instructions (ACL 2022 style template: T.B.D.) and will undergo regular peer review. Acceptance will depend on the quality of the paper rather than on the results obtained in the shared task. Authors of accepted papers will be informed about the evaluation results of their systems prior to the paper submission deadline (see the important dates).

 

Important Dates for the Shared Task

T.B.D.

 

Keynotes

Two keynote talks will be delivered by

i) J. Craig Jenkins (https://sociology.osu.edu/people/jenkins.12) &
Scott Althaus (https://pol.illinois.edu/directory/profile/salthaus), and

ii) Thien Huu Nguyen (https://ix.cs.uoregon.edu/~thien/). 

 

Program

T.B.D.

 

Organization Committee

Ali Hürriyetoğlu (Koç University (KU), Turkey),

Hristo Tanev (European Commission, Joint Research Centre (EU JRC), Italy), 

Vanni Zavarella (EU JRC, Italy),

Reyyan Yeniterzi (Sabancı University, Turkey),

Erdem Yörük (KU, Turkey), 

Deniz Yüret (KU, Turkey),

Osman Mutlu (KU, Turkey),

Fırat Duruşan (KU, Turkey),

Ali Safaya (KU, Turkey),

Bharathi Raja Asoka Chakravarthi (Insight SFI Centre for Data Analytics, Ireland),

Benjamin J. Radford (UNC Charlotte, United States),

Francielle Vargas (University of São Paulo, Brazil),

Farhana Ferdousi Liza (University of Essex, UK),

Milena Slavcheva (Bulgarian Academy of Sciences, Bulgaria),

Ritesh Kumar (Dr. Bhimrao Ambedkar University, India),

Daniela Cialfi (The ‘Gabriele d’Annunzio’ University, Italy),

Tiancheng Hu (ETH Zürich, Switzerland),

Niklas Stöhr (ETH Zürich, Switzerland),

Fiona Anting Tan (National University of Singapore, Singapore).

 

 

Program Committee

(T.B.D.)

 

 

References

Bhatia, S., Lau, J. H., & Baldwin, T. (2020). You are right. I am ALARMED–But by Climate Change Counter Movement. arXiv preprint arXiv:2004.14907.

Boroş, E. (2018). Neural Methods for Event Extraction. Ph.D. thesis, Université Paris-Saclay.

Chang, K. W., Prabhakaran, V., & Ordonez, V. (2019, November). Bias and fairness in natural language processing. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): Tutorial Abstracts.

Chen, M., Zhang, H., Ning, Q., Li, M., Ji, H., & Roth, D. (2021). Event-centric Natural Language Understanding. Tutorial at the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021). URL: https://blender.cs.illinois.edu/paper/eventtutorial2021.pdf

Coleman, P. T., Deutsch, M., & Marcus, E. C. (Eds.). (2014). The handbook of conflict resolution: Theory and practice. John Wiley & Sons.

Della Porta, D., & Diani, M. (Eds.). (2015). The Oxford handbook of social movements. Oxford University Press.

Hürriyetoğlu, A., Zavarella, V., Tanev, H., Yörük, E., Safaya, A., & Mutlu, O. (2020, May). Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report. In Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020 (pp. 1-6).

Lau, J. H., & Baldwin, T. (2020, July). Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 2908-2913).

Pustejovsky, J., Castano, J. M., Ingria, R., Sauri, R., Gaizauskas, R. J., Setzer, A., … & Radev, D. R. (2003). TimeML: Robust specification of event and temporal expressions in text. New directions in question answering, 3, 28-34.

Schrodt, P. A. (2020, May). Keynote Abstract: Current Open Questions for Operational Event Data. In Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020.

Wang, W., Kennedy, R., Lazer, D., & Ramakrishnan, N. (2016). Growing pains for global monitoring of societal events. Science, 353(6307), 1502-1503.

 

[1] http://protestmap.raceandpolicing.com, accessed on September 28, 2020.

[2] https://www.cartercenter.org/peace/conflict_resolution/syria-conflict-resolution.html, accessed on September 28, 2020.

[3] https://en.wikipedia.org/wiki/Protests_over_responses_to_the_COVID-19_pandemic, accessed on September 28, 2020.



RELATED EVENTS:

CASE 2021

AESPEN 2020

COPE 2019

CLEF PROTESTNEWS 2019