Robnik Šikonja, M. (2025, April 16-17). Veliki jezikovni modeli za slovenščino in prevajanje [Conference presentation]. Proofreading and Translation Conference 2025: The Impact of Digital Transformation on Translation, Ljubljana, Slovenia. https://lektornica.si/delavnice/jezikovne/translation-conference-2025-the-impact-of-digital-transformation-in-translation/
Arhar Holdt, Š. (2025, April 16-17). Lektoriranje v času umetne inteligence: Kdo bo postavljal piko na UI? [Conference presentation]. Proofreading and Translation Conference 2025: The Impact of Digital Transformation on Translation, Ljubljana, Slovenia. https://lektornica.si/delavnice/jezikovne/translation-conference-2025-the-impact-of-digital-transformation-in-translation/
Robnik Šikonja, M. (2025, June 10). Projekt PoVeJMo, Gravitacija in ERA Chair projekt AI4DH [Conference presentation]. 4. Nacionalna konferenca Umetna inteligenca - nove smeri razvoja in izzivi za Slovenijo. Mengeš, Slovenia. https://dogodki.vlada.si/umetna-inteligenca-digitalna-preobrazba-prijava
Robnik Šikonja, M. (2025, June 13). Large Language Models for Analysis of Complex Phenomena [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
Arčon, T., Robnik Šikonja, M. and Tratnik, P. (2025, June 13). Motif Detection Using Large Language Models: The Cinderella Case Study [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
Horvat, M., Koražija, J. and Tratnik, P. (2025, June 13). Modeling Deliberative Values in Narrative Culture Using LLMs [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
Babnik, J. and Tratnik, P. (2025, June 13). The Dragon-Slayer’s Narrative: Structural Kinship and Discursive Divergence [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
Robnik Šikonja, M. (2025, June 17). The importance of language data for the development of LT solutions - future steps [Conference presentation]. EU LDS Country Workshop. Ljubljana, Slovenia. https://language-data-space.ec.europa.eu/events/lds-country-workshop-slovenia-2025-06-17_en
Hüll, N. and Dobrovoljc, K. (2025). Word Order Variation in Spoken and Written Corpora: A Cross-Linguistic Study of SVO and Alternative Orders. Proceedings of the Eighth International Conference on Dependency Linguistics (Depling, SyntaxFest 2025). Ljubljana, Slovenia.
Terčon, L. and Dobrovoljc, K. (2025). ComparaTree: A Multi-Level Comparative Treebank Analysis Tool. Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025). Ljubljana, Slovenia.
Krsnik, L. and Dobrovoljc, K. (2025). STARK: A Toolkit for Dependency (Sub)Tree Extraction and Analysis. Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025). Ljubljana, Slovenia.
Munda, T. and Arhar Holdt, Š. (2025). First Insights into the Syntax of Slovene Student Writing: A Statistical Analysis of Šolar 3.0 vs. Učbeniki 1.0. Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025). Ljubljana, Slovenia.
Arčon, T., Kosem, I. and Arhar Holdt, Š. (2025, September 24). Using large language models to generate distractors for language games [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
Klemen, M., Dobrovoljc, K., Terčon, L., Hüll, N., Arčon, T., and Robnik Šikonja, M. (2025, September 24). Agentic Large Language Models for Grammatical Analysis of Multilingual Corpora [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
Jelovčan, G., Robnik Šikonja, M., Arhar Holdt, Š., and Vreš, D. (2025, September 24). Attempt to Create Synthetic Dataset for Grammar Error Correction in Slovenian Language [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
Pretnar Žagar, A. (2025, September 24). Evaluating LLMs on Value Annotation Task [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
Arčon, T., Robnik Šikonja, M., and Tratnik, P. (2025, September 24). Automatic detection of folkloristic motifs with large language models: the Cinderella tale [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
Robnik Šikonja, M. (2025, September 17). Trends and challenges in artificial intelligence [Conference presentation]. SNC’25 Sinapsa neuroscience conference 2025, Ljubljana, Slovenia. https://www.sinapsa.org/SNC25/programme
Robnik Šikonja, M. (2025, November 18). Large language models for lexicography [Invited keynote speech at the conference]. eLex 2025: Electronic lexicography in the 21st century: Intelligent Lexicography, Bled, Slovenia. https://elex.link/elex2025/keynote-speakers/
Kosem, I. and Arhar Holdt, Š. (2025). Using Large Language Models to Generate Distractors for Language Games [Conference presentation]. eLex 2025: Electronic lexicography in the 21st century: Intelligent Lexicography, Bled, Slovenia. https://elex.link/elex2025/wp-content/uploads/elex2025_book_of_abstracts.pdf
Robnik Šikonja, M. (2025). What are open LLMs and how do we build them? / Kaj so odprti LLMs in kako jih gradimo? [Conference presentation]. ERA Knowledge Rights 21 Conference, Ljubljana, Slovenia. https://www.odipi.si/era-kr21-konferenca-slovenija-2025/program-era-kr21-konference-2025/
Publications in conference proceedings / workshops
Piryani, B., Mozafari, J., Abdallah, A., Doucet, A., & Jatowt, A. (2025). MultiOCR-QA: Dataset for Evaluating Robustness of LLMs in Question Answering on Multilingual OCR Texts. https://arxiv.org/abs/2502.16781
Nguyen, N.N., Hamdi, A., Doucet, A., Jatowt, A., Coustaty, M. (2026). Rethinking OCR Evaluation for Information Extraction in Business Documents. In: Oh, S., Doucet, A., Buranarach, M., Buenrostro-Cabbab, I., Liu, Y., Olgado, B.S. (eds) Intelligence and Equity: Shaping the Future of Knowledge. ICADL 2025. Lecture Notes in Computer Science, vol 16242. Springer, Singapore. https://doi.org/10.1007/978-981-95-4861-3_21
Sun, W., Girdhar, N., Tran, H.T.H., González-Gallardo, CE., Coustaty, M., Doucet, A. (2026). Ar-Q-Former: Historical Newspaper Article Separation Based on Multimodal Transformer Structure. In: Yin, XC., Karatzas, D., Lopresti, D. (eds) Document Analysis and Recognition – ICDAR 2025. ICDAR 2025. Lecture Notes in Computer Science, vol 16025. Springer, Cham. https://doi.org/10.1007/978-3-032-04624-6_28
Publications
Klemen, M., Božič, M., Arhar Holdt, Š., & Robnik-Šikonja, M. (2025). Grammatical error correction of Slovenian school essays using large language models. Journal of Contemporary Educational Studies/Sodobna Pedagogika, 76(3).
Ulčar, M., Žagar, A., Armendariz, C.S., Repar, A., Pollak, S., Purver, M., and Robnik Šikonja, M. (2026). Mono- and cross-lingual evaluation of representation language models on less-resourced languages, Computer Speech & Language, 95, 101852. https://doi.org/10.1016/j.csl.2025.101852
Ivačič, N., Škrlj, B., Koloski, B., Pollak, S., Lavrač, N., & Purver, M. (2025). Extreme Multi-Label Text Classification for Less-Represented Languages and Low-Resource Environments: Advances and Lessons Learned. Machine Learning and Knowledge Extraction, 7(4), 142. https://doi.org/10.3390/make7040142
Vobič, I., Robnik Šikonja, M., Žagar, A., & Mance, B. (2025). Watchdog or Copycat? Examining News Diversity in Slovenian Journalism System. Medijska istraživanja, 31(2), 5-34. https://doi.org/10.22572/mi.30.2.1
Pretnar Žagar, A. (2025). Computational Analysis of Slovenian Historical Newspapers (1771–1914): Linguistic, Thematic, and Nation-Building Insights. Contributions to the Contemporary History, 65(3), 42-66. https://doi.org/10.51663/pnz.65.3.02
Research datasets
Žagar, A., Dobrovoljc, K., Munda, T., Brglez, M., and Robnik Šikonja, M. (2024). Knowledge-Enhanced Winograd Schema Challenge KE-WSC 1.0, Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/1988.