Many recent important events, such as political elections or the coronavirus (COVID-19) outbreak, have been characterized by widespread diffusion of misinformation. How can AI help?
The tutorial slides are available here. There will be two live Q&A sessions on November 19th.
The rise of social media has democratized content creation and has made it easy for anybody to share and spread information online. On the positive side, this has given rise to citizen journalism, enabling much faster dissemination of information than was possible with newspapers, radio, and TV. On the negative side, stripping traditional media of their gatekeeping role has left the public unprotected against the spread of disinformation, which can now travel at breaking-news speed over the same democratic channels. This situation has given rise to the proliferation of false information specifically created to affect individual people's beliefs and, ultimately, to influence major events such as political elections; it has also marked the dawn of the Post-Truth Era, in which appeal to emotions has become more important than the truth. More recently, with the emergence of the COVID-19 pandemic, a new blend of medical and political misinformation and disinformation has given rise to the first global infodemic. Limiting the impact of these negative developments has become a major focus for journalists, social media companies, and regulatory authorities.
The tutorial offers an overview of the emerging and interconnected research areas of fact-checking, misinformation, disinformation, “fake news”, propaganda, and media bias detection, with a focus on text and on computational approaches. It further explores the general fact-checking pipeline and important elements thereof, such as check-worthiness estimation, detecting previously fact-checked claims, stance detection, and source reliability estimation. Finally, it covers some recent developments, such as the emergence of large-scale pre-trained language models and the challenges and opportunities they offer.
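To make the pipeline structure concrete, here is a minimal, hypothetical sketch of how its stages could be chained. All function names and heuristics below are illustrative stand-ins (not from the tutorial or any real system): real check-worthiness estimators, claim-retrieval modules, and stance detectors are trained models, not keyword rules.

```python
# Illustrative fact-checking pipeline sketch. Every component is a toy
# stand-in with a hypothetical name; real systems use trained classifiers
# and semantic retrieval at each stage.

def is_check_worthy(claim):
    # Toy check-worthiness estimator: flag claims containing a number or a
    # strong causal verb (a real estimator would be a trained classifier).
    return any(ch.isdigit() for ch in claim) or " causes " in claim

def find_previously_checked(claim, database):
    # Naive lookup of a previously fact-checked claim by exact match;
    # real systems use semantic similarity search over a claim database.
    return database.get(claim)

def stance_of_source(claim, article):
    # Toy stance detection: does the article text support the claim?
    return "support" if claim in article else "unknown"

def check(claim, database, article):
    # Chain the stages: filter, retrieve a prior verdict, else assess stance.
    if not is_check_worthy(claim):
        return "not check-worthy"
    verdict = find_previously_checked(claim, database)
    if verdict is not None:
        return verdict
    return "needs review (stance: {})".format(stance_of_source(claim, article))

# Example run against a tiny database of prior fact-checks.
fact_checks = {"Vaccine X causes condition Y": "false"}
print(check("Vaccine X causes condition Y", fact_checks, ""))  # -> false
print(check("I enjoyed the concert", fact_checks, ""))  # -> not check-worthy
```

The point of the sketch is the ordering: cheap filtering (check-worthiness) comes first, reuse of prior fact-checks second, and only then the more expensive evidence-based assessment.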
Prior knowledge of natural language processing, machine learning, and deep learning is needed to follow large parts of this tutorial.
Dr. Preslav Nakov is a Principal Scientist at the Qatar Computing Research Institute (QCRI), HBKU. His research interests include computational linguistics, “fake news” detection, fact-checking, machine translation, question answering, sentiment analysis, lexical semantics, Web as a corpus, and biomedical text processing. He received his PhD degree from the University of California at Berkeley (supported by a Fulbright grant), and he was a Research Fellow at the National University of Singapore, an honorary lecturer at Sofia University, and a research staff member at the Bulgarian Academy of Sciences. At QCRI, he leads the Tanbih project, developed in collaboration with MIT, which aims to limit the effect of “fake news”, propaganda, and media bias by making users aware of what they are reading. Dr. Nakov is President of ACL SIGLEX, Secretary of ACL SIGSLAV, and a member of the EACL advisory board. He is a member of the editorial boards of TACL, CS&L, NLE, AI Communications, and Frontiers in AI. He is also on the Editorial Board of the Language Science Press Book Series on Phraseology and Multiword Expressions. He co-authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and many research papers in top-tier conferences and journals. He received the Young Researcher Award at RANLP’2011. Moreover, he was the first to receive the Bulgarian President’s John Atanasoff award, named after the inventor of the first automatic electronic digital computer. Dr. Nakov’s research on “fake news” has been featured by over 100 news outlets, including Forbes, Boston Globe, Aljazeera, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget.
Giovanni Da San Martino is a Senior Assistant Professor at the University of Padova, Italy. His research interests lie at the intersection of machine learning and natural language processing. He has been researching these topics for more than 10 years, with over 60 publications in top-tier conferences and journals. He has worked on several NLP tasks, including paraphrase recognition, stance detection, and community question answering. Currently, he is actively involved in research on disinformation and propaganda detection, for which he is a co-organiser of the CheckThat! labs at CLEF 2018-2020, the NLP4IF workshops on censorship, disinformation, and propaganda and their shared tasks, the 2019 Hack the News Datathon, and the SemEval-2020 Task 11 on “Detection of propaganda techniques in news articles.”