On a mission to make Large Language Models safe and reliable for managing IaC

January 5, 2024 - Words by Daniel Vladušič

We’re excited to be part of the new national project PoVeJMo, dedicated to Large Language Models constrained by a smaller input dataset, in this case the Slovenian language. XLAB takes on the R&D role, contributing to the project and exploiting the similarities between the problems posed by the Slovenian language and by Infrastructure as Code. We will build resource- and data-efficient models designed for generating infrastructure code, a domain where we excel with Steampunk Spotter, our product for secure and reliable IT automation.

Over the next three years, we’ll join forces with the Faculty of Computer and Information Science, University of Ljubljana, the Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU), the Institute of Contemporary History, Semantika d.o.o., Vitasis d.o.o., Better d.o.o. and Špica International.

“The PoVeJMo project is of great importance to XLAB in several respects. Not only are we collaborating with the Faculty of Computer and Information Science in Ljubljana, the alma mater of most of the engineers at XLAB, but also with like-minded experts who have the same problem as we do: the lack of sufficient training data and computing resources to build comprehensive, high-quality large language models,” explains Daniel Vladušič, PhD, XLAB Project Manager.

The primary outcome of the consortium will be a large language model (LLM) focused on the Slovene language. The knowledge gained from this LLM will be freely available for application in various areas, such as museums. XLAB’s focus in LLM development lies in overcoming limitations such as the scarcity of training data and the limited availability of computing resources. Our team will actively participate in building this LLM and contribute to methods designed to work around these constraints.

“Our output is an LLM focused on Infrastructure as Code (IaC). The knowledge and insights gained while constructing the LLM for the Slovene language will be transferred to the construction of the entire pipeline for building a specialized LLM focused on IaC, emphasizing the additional constraints of robustness and accuracy,” adds Vladušič.

We look forward to the challenges the project will bring and believe that together we will achieve incredible results. Let’s Get IT done!