

February, 2020
Our partners at the first project meeting in Ljubljana, Slovenia!




1. Ljubljana, Slovenia in January

2. Tallinn, Estonia in March

3. La Rochelle, France in June

4. Zagreb, Croatia in December

What have we been up to?

Institut Jožef Stefan (JSI), Slovenia

Developed an online demo for cross-lingual offensive speech detection and EmbViz, an online toolkit for analysing neural attention heads. Working on keyword extraction, viewpoint identification and cross-lingual sentiment analysis of news.
Queen Mary University of London (QMUL), United Kingdom
Building classifiers to help comment moderators automatically find comments that need blocking, evaluating the results and generating data to improve them, and developing a public shared task to evaluate and compare the context-sensitive embeddings and language models needed to build those tools.
Univerza v Ljubljani (UL), Slovenia
Developed state-of-the-art ELMo contextual embeddings for seven EMBEDDIA languages, investigated different neural network architectures to adapt them to the specifics of the morphologically rich languages addressed in the EMBEDDIA project, and tested different cross-lingual embedding approaches. Disseminated the project's results at several events and meetings and maintained a presence on our social media channels and website.
Universite de la Rochelle (ULR), France
Looking for strong benchmarks covering the EMBEDDIA languages that can be used to measure the performance of our semantic enrichment tools. Presented the first results of our experiments on the automatic detection of places, events and people at the 7th Workshop on Balto-Slavic Natural Language Processing, and polished the code of the developed tools to make it publicly available on EMBEDDIA's GitHub.
Our partners at the third project meeting in La Rochelle, France!
Helsingin Yliopisto (UH), Finland
Working on automated generation of news reports, designing a news automation architecture that is as widely applicable as possible, and developing technology that identifies when news stories in several languages are linked by their content and topic. Also working on a revised partner user needs report and a number of publications, including a book on how the platform companies in Silicon Valley impact news media.
The University of Edinburgh (UE), United Kingdom
Working on visualization systems for two of the project tasks: model interpretability and cross-lingual news summarization. We aim to create a tool to aid in understanding and exploring the core technologies of the project. Working on prototypes for exploring temporal changes in topic frequency in a collection of news articles, cross-lingual visual comparison of search results, and metadata-based exploration of news article clusters.
Styria Medijski Servisi DOO (STY), Croatia
As a technology provider for the Styria Media Group, Trikoder provides insight into the challenges of running a successful media house. Articles and user comments from Styria's online news portals are made available to the researchers. We are preparing the data, creating additional annotations and sharing the results of analyses. The end goal is to include the project results in the systems empowering our journalists in their everyday tasks.
OY Suomen Tietotoimisto (STT), Finland
Shared some 2.5 million Finnish and 0.5 million Swedish news articles and their metadata with our EMBEDDIA partners, and we are eager to see what embeddings and new solutions this old news can help create. Also discussed our ideas and views on how new technologies could help news agencies and other media outlets produce fast, diverse, quality journalism.
Texta OU (TEXTA), Estonia
Building the TEXTA Toolkit and the EMBEDDIA dashboard. The TEXTA Toolkit is meant for analysts and is intended to serve the EMBEDDIA Media Assistant Toolkit. By the end of the first year, users can browse and aggregate data and build machine learning models for automatic text classification (both conventional models and neural networks). Several Estonian state institutions are currently testing the Toolkit.
As Ekspress Media (EKSP), Estonia
Crawled and exported a large volume of articles and comments so that Texta can analyse them and use them in the system under development.


  • Workshop Showcasing LT Agenda, Brussels, Belgium
  • Lecture Between Euphoria and Dystopia: AI, journalism and perceptions in leading newsrooms, Toronto, Canada
  • Digital Local Public Sphere, Wroclaw, Poland
  • European Youth Science & Media Days (European Parliament), Strasbourg, France
  • META-FORUM 2019, Brussels, Belgium
  • International Conference on Statistical Language and Speech Processing, Ljubljana, Slovenia
  • IPTC Autumn Meeting, Ljubljana, Slovenia
  • News Automation at Work – Practices and Perspectives Workshop, Dublin, Ireland
  • Festival Grounded, Ljubljana, Slovenia
  • Festival Naprej/Forward, Ljubljana, Slovenia
  • Conference by the Council of Europe and the Slovenian Ministry of Culture "(Last) Call for Quality Journalism", Ljubljana, Slovenia
  • Open Day at the Faculty of Engineering and Computing (FER), Zagreb, Croatia


We have published more than twenty scientific papers in top conference proceedings and several journals, including:
  • Machine Learning and Knowledge Extraction (MAKE),
  • Frontiers in Psychology,
  • Contributions to Contemporary History,
  • Natural Language Engineering,
  • Language Resources and Evaluation,
  • Journalism.
Copyright © 2020 Department of Knowledge Technologies – EMBEDDIA project, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia. All rights reserved.
