Abstract
We present the results of applying Large Language Models (LLMs) to extract meteorological data from weather forecasts provided in a variety of formats. Processing information expressed in natural language is a difficult task, even more so when the goal is to extract specific numerical values from sources whose structure is unknown at design time. We verify the usefulness of LLMs for this task on various data formats that describe the forecast in natural language and in condensed tabular and/or numerical representations. All the data was sourced from real meteorological systems, and the output was a fixed XML structure. We show that all the models tested in the paper succeed, with varying degrees of efficiency, in extracting basic data from the source forecasts and in encoding the extracted information into the predefined XML structure. Finally, we pinpoint the main types of errors encountered in the transformation process.
Keywords: Text Processing, Data Extraction, Large Language Model, LLM