The 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ RANLP 2023, on September 7, 2023)
The unprecedented quantity of easily accessible data on social, political, and economic processes offers ground-breaking potential for guiding data-driven analysis in the social and human sciences and for driving informed policy-making. Governments, multilateral organizations, and local and global NGOs increasingly demand high-quality information about a wide variety of events, ranging from political violence, environmental disasters, and conflict to international economic and health crises (Coleman et al. 2014; Della Porta and Diani 2015), in order to prevent or resolve conflicts, provide relief for those who are afflicted, or improve the lives of and protect citizens in a variety of ways. Citizen actions against the COVID measures in the period 2020-2022 and the Russia-Ukraine war are only two examples where event-centered data can contribute to a better understanding of real-life situations. Finally, these efforts also respond to “growing public interest in up-to-date information on crowds”.
Event extraction has long been a challenge for the natural language processing (NLP) community, as it requires sophisticated methods for defining event ontologies, creating language resources, and developing algorithmic approaches and ML models (Pustejovsky et al. 2003; Boroş 2018; Chen et al. 2021). Previous editions of the CASE workshop series have featured work that uses BERT and other deep learning models, syntactic parsing, semantic argument structure analysis, temporal and spatial reasoning, lexical learning, and other NLP methods and algorithms. Detecting and extracting information about socio-political events is a complex NLP task: events can be described via elaborate syntactic and semantic language structures, and event descriptions may enter into different semantic relations with each other, such as coreference, causality, inclusion, spatio-temporal proximity, and others.
Events as linguistic phenomena are usually modeled through frames and ontologies, and event types are often represented via elaborate taxonomies. Detecting socio-political events in real-world texts poses problems that originate from the dynamics of governments, political parties, movements, and other socially active groups, which may frequently change their leading figures, strategies, and organization. These factors can make existing statistical models and knowledge bases less relevant as time passes and require the development of methods that rely on limited data, such as few-shot learning, man-in-the-loop, or other specific learning strategies.
Social and political scientists have been working for decades to create socio-political event (SPE) databases such as ACLED, EMBERS, GDELT, ICEWS, MMAD, PHOENIX, POLDEM, SPEED, TERRIER, and UCDP, following similar steps. These projects and newer ones increasingly rely on machine learning (ML), deep learning (DL), and NLP methods to deal better with the vast amount and variety of data in this domain (Hürriyetoğlu et al. 2020). Unfortunately, automated approaches suffer from major issues such as bias, limited generalizability, class imbalance, training data limitations, and ethical issues that have the potential to affect the results and their use drastically (Lau and Baldwin 2020; Bhatia et al. 2020; Chang et al. 2019). Moreover, the results of automated systems for SPE information collection have neither been comparable to each other nor been of sufficient quality (Wang et al. 2016; Schrodt 2020). SPEs are varied and nuanced. Both the political context and the local language used may affect whether and how they are reported.
We invite contributions from researchers in computer science, NLP, ML, DL, AI, socio-political sciences, conflict analysis and forecasting, and peace studies, as well as computational social science scholars involved in the collection and utilization of SPE data. Academic workshops dedicated to event information in general, or to text analysis in specific domains such as health, law, finance, and biomedical sciences, have significantly accelerated progress in those topics and fields. However, there has not been a comparable effort for handling SPEs. This workshop aims to fill that gap. We invite work on all aspects of automated coding and analysis of SPEs and events in general from mono- or multi-lingual text sources. This includes (but is not limited to) the following topics:
1) Extracting events and their arguments in and beyond a sentence or document, and event coreference resolution.
2) Research in NLP technologies related to event detection, such as geocoding, temporal reasoning, argument structure detection, syntactic and semantic analysis of event structures, text classification for event type detection, learning event-related lexica, coreference in event descriptions, machine translation for multilingual event detection, named entity recognition, fake news analysis, text similarity, and others, with a focus on real or potential event detection applications.
3) New datasets, training data collection, and annotation for event information.
4) Event-event relations, e.g., subevents, main events, spatio-temporal relations, causal relations.
5) Event dataset evaluation in light of reliability and validity metrics.
6) Defining, populating, and facilitating event schemas and ontologies.
7) Automated tools and pipelines for event collection related tasks.
8) Lexical, syntactic, semantic, discursive, and pragmatic aspects of event manifestation.
9) Methodologies for development, evaluation, and analysis of event datasets.
10) Applications of event databases, e.g., early warning, conflict prediction, policymaking.
11) Estimating what is missing in event datasets using internal and external information.
12) Detection of new and emerging SPE types, e.g., creative protests.
13) Release of new event datasets.
14) Bias and fairness of the sources and event datasets.
15) Ethics, misinformation, privacy, and fairness concerns pertaining to event datasets.
16) Copyright issues on event dataset creation, dissemination, and sharing.
17) Cross-lingual, multilingual, and multimodal aspects in event analysis.
18) Climate change and conflict-related resources, and approaches related to contentious politics around climate change.
Moreover, we encourage submissions of new system description papers on our available benchmarks.
Bhatia, S., Lau, J. H., & Baldwin, T. (2020). You are right. I am ALARMED–But by Climate Change Counter Movement.
Boroş, E. (2018). Neural Methods for Event Extraction.
Chang, K. W., Prabhakaran, V., & Ordonez, V. (2019, November). Bias and fairness in natural language processing.
Chen, M., Zhang, H., Ning, Q., Li, M., Ji, H., & Roth, D. (2021). Event-centric Natural Language Understanding.
Coleman, P. T., Deutsch, M., & Marcus, E. C. (Eds.). (2014). The handbook of conflict resolution: Theory and practice.
Della Porta, D., & Diani, M. (Eds.). (2015). The Oxford handbook of social movements.
Hürriyetoğlu, A., Zavarella, V., Tanev, H., Yörük, E., Safaya, A., & Mutlu, O. (2020, May). Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report.
Lau, J. H., & Baldwin, T. (2020, July). Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?
Pustejovsky, J., Castano, J. M., Ingria, R., Sauri, R., Gaizauskas, R. J., Setzer, A., … & Radev, D. R. (2003). TimeML: Robust specification of event and temporal expressions in text.
Schrodt, P. A. (2020, May). Keynote Abstract: Current Open Questions for Operational Event Data.
Wang, W., Kennedy, R., Lazer, D., & Ramakrishnan, N. (2016). Growing pains for global monitoring of societal events.