Open Science

Conference Presentations

  • Robnik Šikonja, M. (2025, April 16-17). Veliki jezikovni modeli za slovenščino in prevajanje [Large language models for Slovene and translation] [Conference presentation]. Proofreading and Translation Conference 2025: The Impact of Digital Transformation on Translation, Ljubljana, Slovenia. https://lektornica.si/delavnice/jezikovne/translation-conference-2025-the-impact-of-digital-transformation-in-translation/
  • Arhar Holdt, Š. (2025, April 16-17). Lektoriranje v času umetne inteligence: Kdo bo postavljal piko na UI? [Proofreading in the age of artificial intelligence: Who will dot the AI?] [Conference presentation]. Proofreading and Translation Conference 2025: The Impact of Digital Transformation on Translation, Ljubljana, Slovenia. https://lektornica.si/delavnice/jezikovne/translation-conference-2025-the-impact-of-digital-transformation-in-translation/
  • Robnik Šikonja, M. (2025, June 10). Projekt PoVeJMo, Gravitacija in ERA Chair projekt AI4DH [The PoVeJMo project, Gravitacija, and the ERA Chair project AI4DH] [Conference presentation]. 4. Nacionalna konferenca Umetna inteligenca - nove smeri razvoja in izzivi za Slovenijo, Mengeš, Slovenia. https://dogodki.vlada.si/umetna-inteligenca-digitalna-preobrazba-prijava
  • Robnik Šikonja, M. (2025, June 13). Large Language Models for Analysis of Complex Phenomena [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
  • Arčon, T., Robnik Šikonja, M., and Tratnik, P. (2025, June 13). Motif Detection Using Large Language Models: The Cinderella Case Study [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
  • Horvat, M., Koražija, J. and Tratnik, P. (2025, June 13). Modeling Deliberative Values in Narrative Culture Using LLMs [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
  • Babnik, J. and Tratnik, P. (2025, June 13). The Dragon-Slayer’s Narrative: Structural Kinship and Discursive Divergence [Conference presentation]. AI Methods for Research of Folkloristic Narratives, Ljubljana, Slovenia. https://cjvt.si/llm4dh/en/blog/workshop-ai-methods-for-research-of-folkloristic-narratives/
  • Robnik Šikonja, M. (2025, June 17). The importance of language data for the development of LT solutions - future steps [Conference presentation]. EU LDS Country Workshop, Ljubljana, Slovenia. https://language-data-space.ec.europa.eu/events/lds-country-workshop-slovenia-2025-06-17_en
  • Hüll, N. and Dobrovoljc, K. (2025). Word Order Variation in Spoken and Written Corpora: A Cross-Linguistic Study of SVO and Alternative Orders. Proceedings of the Eighth International Conference on Dependency Linguistics (Depling, SyntaxFest 2025). Ljubljana, Slovenia.
  • Terčon, L. and Dobrovoljc, K. (2025). ComparaTree: A Multi-Level Comparative Treebank Analysis Tool. Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025). Ljubljana, Slovenia.
  • Krsnik, L. and Dobrovoljc, K. (2025). STARK: A Toolkit for Dependency (Sub)Tree Extraction and Analysis. Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025). Ljubljana, Slovenia.
  • Munda, T. and Arhar Holdt, Š. (2025). First Insights into the Syntax of Slovene Student Writing: A Statistical Analysis of Šolar 3.0 vs. Učbeniki 1.0. Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025). Ljubljana, Slovenia.
  • Arčon, T., Kosem, I. and Arhar Holdt, Š. (2025, September 24). Using large language models to generate distractors for language games [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
  • Klemen, M., Dobrovoljc, K., Terčon, L., Hüll, N., Arčon, T., and Robnik Šikonja, M. (2025, September 24). Agentic Large Language Models for Grammatical Analysis of Multilingual Corpora [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
  • Jelovčan, G., Robnik Šikonja, M., Arhar Holdt, Š., and Vreš, D. (2025, September 24). Attempt to Create Synthetic Dataset for Grammar Error Correction in Slovenian Language [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
  • Pretnar Žagar, A. (2025, September 24). Evaluating LLMs on Value Annotation Task [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
  • Arčon, T., Robnik Šikonja, M., and Tratnik, P. (2025, September 24). Automatic detection of folkloristic motifs with large language models: the Cinderella tale [Conference presentation]. 28th International Conference, Discovery Science AI 4 Science Conference, Ljubljana, Slovenia. https://ds2025.ijs.si/assets/files/978-3-032-05461-6_Book_OnlinePDF.pdf
  • Robnik Šikonja, A. (2025, September 17). Trends and challenges in artificial intelligence [Conference presentation]. SNC’25 Sinapsa neuroscience conference 2025, Ljubljana, Slovenia. https://www.sinapsa.org/SNC25/programme
  • Robnik Šikonja, M. (2025, November 18). Large language models for lexicography [Invited keynote speech at the conference]. eLex 2025: Electronic lexicography in the 21st century: Intelligent Lexicography, Bled, Slovenia. https://elex.link/elex2025/keynote-speakers/
  • Kosem, I. and Arhar Holdt, Š. (2025). Using Large Language Models to Generate Distractors for Language Games [Conference presentation]. eLex 2025: Electronic lexicography in the 21st century: Intelligent Lexicography, Bled, Slovenia. https://elex.link/elex2025/wp-content/uploads/elex2025_book_of_abstracts.pdf
  • Robnik Šikonja, M. (2025). What are open LLMs and how do we build them? / Kaj so odprti LLMs in kako jih gradimo? [Conference presentation]. ERA Knowledge Rights 21 Conference, Ljubljana, Slovenia. https://www.odipi.si/era-kr21-konferenca-slovenija-2025/program-era-kr21-konference-2025/

Publications in conference proceedings / workshops

  • Piryani, B., Mozafari, J., Abdallah, A., Doucet, A., & Jatowt, A. (2025). MultiOCR-QA: Dataset for Evaluating Robustness of LLMs in Question Answering on Multilingual OCR Texts. https://arxiv.org/abs/2502.16781
  • Nguyen, N.N., Hamdi, A., Doucet, A., Jatowt, A., Coustaty, M. (2026). Rethinking OCR Evaluation for Information Extraction in Business Documents. In: Oh, S., Doucet, A., Buranarach, M., Buenrostro-Cabbab, I., Liu, Y., Olgado, B.S. (eds) Intelligence and Equity: Shaping the Future of Knowledge. ICADL 2025. Lecture Notes in Computer Science, vol 16242. Springer, Singapore. https://doi.org/10.1007/978-981-95-4861-3_21
  • Sun, W., Girdhar, N., Tran, H.T.H., González-Gallardo, CE., Coustaty, M., Doucet, A. (2026). Ar-Q-Former: Historical Newspaper Article Separation Based on Multimodal Transformer Structure. In: Yin, XC., Karatzas, D., Lopresti, D. (eds) Document Analysis and Recognition – ICDAR 2025. ICDAR 2025. Lecture Notes in Computer Science, vol 16025. Springer, Cham. https://doi.org/10.1007/978-3-032-04624-6_28

Publications

  • Klemen, M., Božič, M., Arhar Holdt, Š., & Robnik-Šikonja, M. (2025). Grammatical error correction of Slovenian school essays using large language models. Journal of Contemporary Educational Studies/Sodobna Pedagogika, 76(3).
  • Ulčar, M., Žagar, A., Armendariz, C. S., Repar, A., Pollak, S., Purver, M., & Robnik-Šikonja, M. (2026). Mono- and cross-lingual evaluation of representation language models on less-resourced languages. Computer Speech & Language, 95, 101852. https://doi.org/10.1016/j.csl.2025.101852
  • Ivačič, N., Škrlj, B., Koloski, B., Pollak, S., Lavrač, N., & Purver, M. (2025). Extreme Multi-Label Text Classification for Less-Represented Languages and Low-Resource Environments: Advances and Lessons Learned. Machine Learning and Knowledge Extraction, 7(4), 142. https://doi.org/10.3390/make7040142
  • Vobič, I., Robnik Šikonja, M., Žagar, A., & Mance, B. (2025). Watchdog or Copycat? Examining News Diversity in Slovenian Journalism System. Medijska istraživanja, 31(2), 5-34. https://doi.org/10.22572/mi.30.2.1
  • Pretnar Žagar, A. (2025). Computational Analysis of Slovenian Historical Newspapers (1771–1914): Linguistic, Thematic, and Nation-Building Insights. Contributions to the Contemporary History, 65(3), 42-66. https://doi.org/10.51663/pnz.65.3.02

Research datasets

  • Žagar, A., Dobrovoljc, K., Munda, T., Brglez, M., and Robnik Šikonja, M. (2024). Knowledge-Enhanced Winograd Schema Challenge KE-WSC 1.0. Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/1988

Software repositories

Public deliverables