


Hybrid Event
Event program
Thursday, 5/23/2024 9:00 AM - 1:00 PM,
Camelia 1, Grand hotel Adriatic, Opatija
9:00 AM - 10:30 AM Artificial Intelligence Theory and Explainability
Chair: Alan Jović 
1.T. Gale, M. Guid (University of Ljubljana, Ljubljana, Slovenia)
Argument-Based Regression Trees 
The advancements in the machine learning domain have compelled numerous developments and applications of artificial intelligence in diverse spheres, which—with the increasing complexity of techniques targeting better performance—introduced a deterioration of understandability in the utilized models. However, the explainability remains imperative for critical applications in domains such as medicine. With the modelling process in inductive learning aiming to find a well-generalizable hypothesis, a common problem is that multiple hypotheses may be consistent with training data or that the correct hypothesis cannot be induced from the data. One option for alleviating such shortcomings is to introduce expert knowledge into the modeling process, consequently constraining the hypothesis search space. A demonstrated effective option for eliciting and incorporating expert knowledge is represented by argument-based machine learning, an approach based on argumentation. In this paper, we propose and evaluate an argument-based regression tree (ABRT) method and implement the proposed method as part of the ABTree tool. We show that with the ABRT method, the regression tree structure resembles more of that expected by a domain expert. Additionally, we show that even assigning random arguments will likely not drastically reduce model performance if the model parameters are selected intelligently.
2.D. Vukadin, M. Šilić, G. Delač, K. Vladimir (Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia)
Evaluating Harmony: Neural Network Explanation Metrics and Human Perception 
Recent strides in deep neural network performance have catalyzed the development of state-of-the-art approaches across diverse domains. However, the inherent black-box nature of neural networks poses challenges in contexts where model explainability and transparency are paramount. In response, researchers have proposed algorithms over the years to enhance neural network understanding, offering additional insights for human experts. The surge in attribution method research underscores the need to rigorously assess and evaluate their performance. Although numerous metrics have emerged, each targeting specific properties of attribution methods, a notable gap persists in researching the alignment between these metrics and human judgment. Despite the continuous competition among researchers to produce attribution methods achieving new state-of-the-art results on established metrics, it remains unclear whether these advancements positively, negatively, or negligibly impact human comprehension of the model. This paper introduces a novel approach to evaluating the alignment between human judgment and evaluation metrics. The proposed method systematically assesses several well-established metrics across the faithfulness, robustness, and localization categories. Through this evaluation, the paper aims to shed light on the effectiveness of existing metrics in capturing human-perceived model understanding, thereby providing valuable perspectives for further advancements in the field. Our code and data are available at: https://github.com/davor10105/harmony_app
3.G. Oparin, V. Bogdanova, A. Pashinin (Institute for System Dynamics and Control Theory of SB RAS, Irkutsk, Russian Federation)
Dynamics of Bipartite Logical Networks 
Several problems of qualitative analysis of trajectories behavior dynamics of bipartite logical networks over a finite time interval are solved using the method of Boolean constraints. A method is proposed for splitting a bipartite trajectory into two independent trajectories based on deterministic block-sequential updating of the state vectors of each part. The conditions for given part existence, isolation, and attraction are obtained as Boolean constraints. The solvability of such equations reduces to the well-known Boolean satisfiability problem, for which the efficient modern solution algorithms provide dynamic analysis for systems with a high-dimensional state vector over large discrete time intervals.
4.A. Andreev, K. Chukharev, S. Kochemazov, A. Semenov (ITMO, St.Petersburg, Russian Federation)
Solving Influence Maximization Problem under Deterministic Linear Threshold Model Using Metaheuristic Optimization 
In the paper we consider the discrete variant of the well-known Influence Maximization Problem (IMP). Given some influence model, it consists in finding a so-called seed set of influential users of fixed size that maximizes the total spread of influence over the network. We limit our study to the influence model called Deterministic Linear Threshold Model (DLTM). It is well known that IMP under DLTM is computationally hard and that there are no approximation algorithms for solving it with a constant approximation ratio if P ≠ NP. Therefore, it makes sense to apply metaheuristic algorithms to this problem. In the present research we propose new algorithms for solving IMP under DLTM, which are based on a technique that combines evolutionary and genetic strategies for pseudo-Boolean optimization with a greedy algorithm which is used to find some initial approximation. We also study the behavior of the proposed metaheuristic strategies for IMP and compare it with the one for the Target Set Selection (TSS) under DLTM.
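As a rough illustration of the setting in this abstract, the sketch below simulates deterministic linear threshold activation on a toy network and picks a seed set greedily. It is not the authors' algorithm: the network, the integer weights and thresholds, and the names `spread` and `greedy_seed_set` are all assumptions made for the example, and the greedy step only stands in for the "initial approximation" the abstract mentions.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy network: WEIGHTS[(u, v)] is the influence of u on v.
WEIGHTS: Dict[Tuple[str, str], int] = {
    ("a", "b"): 6,
    ("a", "c"): 3,
    ("b", "c"): 4,
    ("c", "d"): 10,
}
# Hypothetical activation thresholds, one per node.
THRESHOLDS: Dict[str, int] = {"a": 10, "b": 5, "c": 7, "d": 9}


def spread(seeds: Set[str],
           weights: Dict[Tuple[str, str], int],
           thresholds: Dict[str, int]) -> Set[str]:
    """Activate nodes until a fixed point is reached.

    A node becomes active once the summed weight of its active
    in-neighbours reaches its threshold; active nodes stay active.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, theta in thresholds.items():
            if node in active:
                continue
            incoming = sum(w for (u, v), w in weights.items()
                           if v == node and u in active)
            if incoming >= theta:
                active.add(node)
                changed = True
    return active


def greedy_seed_set(k: int,
                    weights: Dict[Tuple[str, str], int],
                    thresholds: Dict[str, int]) -> Set[str]:
    """Add, k times, the node whose inclusion yields the largest spread."""
    seeds: Set[str] = set()
    for _ in range(k):
        candidates = sorted(set(thresholds) - seeds)
        best = max(candidates,
                   key=lambda n: len(spread(seeds | {n}, weights, thresholds)))
        seeds.add(best)
    return seeds
```

On this toy network, seeding `a` alone activates every node, while seeding `c` reaches only `c` and `d`; a metaheuristic such as those proposed in the paper would then try to improve on the greedy seed set.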
5.S. Kochemazov, O. Zaikin (ISDCT SB RAS, Irkutsk, Russian Federation)
Towards Better SAT Encodings for Hash Function Inversion Problems  
Inversion of reduced-round hash functions is one of the areas of cryptography in which Boolean satisfiability (SAT) solvers show good performance. Recent results on the inversion of 43-step MD4 using SAT make it possible to believe that more progress can be achieved by careful solver engineering and SAT encodings manipulation. In the present paper we consider possible ways to improve the SAT encodings for inversion of hash functions from the MD and SHA families, in particular, MD4, MD5 and SHA-1. We study the available encodings, including the ones proposed by Vegard Nossum and those made by automatic encoding tools. We then show that it is possible to further improve the encodings by eliminating some of the auxiliary variables via Boolean minimization. In the computational experiments we consider a variety of benchmarks, which encode reduced-round variants of the considered hash functions.
6.M. Duvnjak (PBZ, Zagreb, Croatia), A. Merćep, Z. Kostanjcar (Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
Intrinsically Interpretable Models for Credit Risk Assessment 
In the domain of credit risk modeling, Logistic Regression remains the de facto industry standard. Yet, machine learning and deep learning techniques hold the promise of significantly enhancing current predictive capabilities. Employing these more complex models in credit risk assessment presents unique challenges, notably the need for model explainability, which is crucial for stakeholders such as national regulators, risk analysts, and senior management. In this paper, we evaluate XAI (Explainable AI) methods, providing insight into the advantages and disadvantages of advanced credit risk modeling techniques. Our findings demonstrate that XAI methods enhance prediction accuracy, while maintaining a high level of insight into the models' inner workings.
10:30 AM - 11:00 AM Break
11:00 AM - 1:00 PM Computer Vision
Chair: Marina Ivašić-Kos
11:00 AM - 11:45 AM Invited Lecture
Karla Brkić (Meta Reality Labs, Zürich, Switzerland)
Extended Reality on Meta Quest Devices 
11:45 AM - 1:00 PM Papers
1.M. Tropčić (Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia), M. Rašić (Clinic for Tumors, Clinical Hospital Center “Sisters of Mercy”, Zagreb, Croatia), M. Subašić (Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia)
YOLOv8 Unleashed on Orthopantomograms: Deep Learning Approach for Mandibular Cyst Diagnosis 
Detecting and diagnosing mandibular lesions on radiological images poses a significant challenge for physicians in related fields. This study presents a YOLOv8-based deep neural network designed for the analysis of orthopantomograms. With a dataset comprising 200 images, annotated with 226 lesions of five types, the model is evaluated across four tasks: detection, multi-class detection, segmentation, and multi-class segmentation. Orthopantomograms, annotated by a team of experts, were augmented to enhance dataset quality and diversity. Evaluation metrics such as precision, recall, mAP@50, and mAP@50-95 were employed to assess the model's performance. Results indicate the model's excellence in detection (precision = 1, recall = 0.945, mAP@50 = 0.966, mAP@50-95 = 0.714) and segmentation (precision = 1, recall = 0.945, mAP@50 = 0.966, mAP@50-95 = 0.714), with slightly lower performance in multi-class tasks (detection: precision = 0.64, recall = 0.662, mAP@50 = 0.735, mAP@50-95 = 0.598; segmentation: precision = 0.64, recall = 0.662, mAP@50 = 0.735, mAP@50-95 = 0.595). This research highlights the potential of deep neural networks in automating cyst diagnosis, offering a valuable tool for physicians in decision-making and treatment planning. Furthermore, the study identifies opportunities for future enhancements, addressing challenges in multi-class tasks, and striving to improve model robustness.
2.E. Turajlic (Faculty of Electrical Engineering, University of Sarajevo, Sarajevo, Bosnia and Herzegovina)
Multilevel Image Thresholding Based on Otsu's Method and Multi-swarm Particle Swarm Optimization Algorithm 
In this paper, a multilevel thresholding method for image segmentation based on Otsu’s between-class variance and multi-swarm particle swarm optimization algorithm with dynamic learning strategy is presented. The considered multilevel image thresholding method is assessed on various standard test images and for different numbers of thresholds. For each test image and a considered number of thresholds, the mean and the standard deviation of Otsu’s objective function over a number of independent runs are evaluated. The experimental results showcased that this method can be successfully employed in multilevel image thresholding.
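To make the objective in this abstract concrete, the sketch below evaluates Otsu's between-class variance for a given set of thresholds on a gray-level histogram. The toy histogram and the function names are assumptions for illustration, and the search is a plain exhaustive scan, not the multi-swarm particle swarm optimization used in the paper; a swarm-based optimizer would maximize the same objective without enumerating every threshold combination.

```python
from itertools import combinations
from typing import Sequence, Tuple

# Hypothetical 8-level histogram with three clusters of gray levels.
HIST = [5, 5, 0, 5, 5, 0, 5, 5]


def between_class_variance(hist: Sequence[int],
                           thresholds: Sequence[int]) -> float:
    """Otsu's objective: sum over classes of w_k * (mu_k - mu_total)^2.

    Class k holds the gray levels in [bounds[k], bounds[k + 1]).
    """
    total = sum(hist)
    mu_total = sum(level * count for level, count in enumerate(hist)) / total
    bounds = [0, *thresholds, len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        mass = sum(hist[lo:hi])
        if mass == 0:
            continue  # an empty class contributes nothing
        mean = sum(level * hist[level] for level in range(lo, hi)) / mass
        variance += (mass / total) * (mean - mu_total) ** 2
    return variance


def best_thresholds(hist: Sequence[int], n_thresholds: int) -> Tuple[int, ...]:
    """Exhaustively search all increasing threshold tuples (small inputs only)."""
    levels = range(1, len(hist))
    return max(combinations(levels, n_thresholds),
               key=lambda t: between_class_variance(hist, t))
```

The exhaustive search grows combinatorially with the number of thresholds and gray levels, which is the usual motivation for replacing it with a metaheuristic on full 256-level images.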
3.S. Delalić, Z. Kadrić, E. Selmanović, E. Mulaimović (Faculty of Science, University of Sarajevo, Sarajevo, Bosnia and Herzegovina), E. Kadušić (Faculty of Educational Sciences, University of Sarajevo, Sarajevo, Bosnia and Herzegovina)
Selecting Symbols for Object Marking in Computer Vision Tasks 
Deep learning techniques in computer vision (CV) tasks such as object detection, classification, and tracking can be facilitated by using predefined markers on those objects. Selecting markers is an objective that can potentially affect the performance of the algorithms used for tracking as the algorithm might swap similar markers more frequently and therefore require more training data and training time. Still, the issue of marker selection has not been explored in the literature fully and seems to be glossed over in the whole process of designing CV solutions. In this research, the effects of symbol selection for 2D-printed markers on the performance of the neural network were considered. We considered more than 300 ALT code symbols which are readily available on most consumer PCs and provided a go-to selection for tracking n objects effectively. To this end, a neural network was trained to classify all the symbols and their augmentations after which the confusion matrix was analyzed to extract symbols which the network distinguished the most. We showed that selecting symbols in this way performs better than the random selection, and the selection of common symbols. Furthermore, the methodology presented in this paper can easily be applied to a different set of symbols and different neural network architectures.
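The abstract does not say how symbols are extracted from the confusion matrix, so the following is only one plausible sketch: a greedy selection that starts from the best-recognized symbol and repeatedly adds the symbol least confused with those already chosen. The confusion counts, the symbol set, and the names `pair_confusion` and `select_symbols` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical confusion counts: CONFUSION[true][predicted] = count.
CONFUSION: Dict[str, Dict[str, int]] = {
    "O": {"O": 80, "0": 20},
    "0": {"O": 25, "0": 75},
    "X": {"X": 95, "+": 5},
    "+": {"+": 90, "X": 8, "O": 2},
}


def pair_confusion(confusion: Dict[str, Dict[str, int]], a: str, b: str) -> int:
    """Number of times a was predicted as b or b was predicted as a."""
    return confusion[a].get(b, 0) + confusion[b].get(a, 0)


def select_symbols(confusion: Dict[str, Dict[str, int]], n: int) -> List[str]:
    """Greedily pick n symbols with little mutual confusion."""
    symbols = sorted(confusion)

    def recall(symbol: str) -> float:
        row_total = sum(confusion[symbol].values())
        return confusion[symbol].get(symbol, 0) / row_total if row_total else 0.0

    # Start with the symbol the classifier recognizes most reliably.
    chosen = [max(symbols, key=recall)]
    while len(chosen) < n:
        remaining = [s for s in symbols if s not in chosen]
        # Least confusion with the chosen set first; higher recall breaks ties.
        nxt = min(remaining,
                  key=lambda s: (sum(pair_confusion(confusion, s, c)
                                     for c in chosen), -recall(s)))
        chosen.append(nxt)
    return chosen
```

With these toy counts, the look-alike pair `O` and `0` is never selected together for small n, which is the kind of outcome the confusion-matrix analysis is meant to achieve.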
4.E. Selmanović, E. Mulaimović, S. Delalić, Z. Kadrić, Z. Šabanac (Faculty of Science, University of Sarajevo, Sarajevo, Bosnia and Herzegovina)
Neural Network Impact on Marker Performance in Computer Vision Tasks  
Many deep-learning computer vision systems analyze objects not previously observed by the system. However, when possible such tasks can be simplified if the objects are marked beforehand. A straightforward method for marking is printing 2D symbols and attaching them to the objects. Selecting these symbols can affect the performance of the CV system, as similar symbols may require extended training time, and a larger training dataset. It is possible to find good symbols which are differentiated by the given neural network easily. Still, there were no efforts to generalize such findings in the literature, and it is not known if the symbols optimal for one network would work just as well in the other. We explored how transferable symbol selection is between the networks. To this end, 20 sets of randomly selected and augmented symbols were classified by 5 neural networks. Each network was given the same training dataset and the same amount of training time. Results were ranked and compared, which allowed the identification of networks which performed similarly, so the symbol selection could be generalised between them.
5.I. Klabučar (Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia), I. Pilaš (Croatian Forest Research Institute, Jastrebarsko, Croatia), M. Subašić (Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia)
Forest Segmentation with U-Net on Satellite Images 
Identification of different types of forests holds potential for land management applications. For instance, precise identification of forest types could aid in targeted conservation efforts or facilitate sustainable resource utilization. For example, the ability to distinguish between ash and oak forests may inform land managers about the biodiversity and ecological characteristics of specific areas, guiding more effective conservation strategies and land-use planning. This paper explores using UNET models on WorldView-2 satellite images for such purposes. The dataset is comprised of 8-channel satellite images and masks labeling each pixel as one of 12 forest types. UNET models were trained to segment the images according to those same 12 classes. As the ground truth labels were generated semi-automatically, they are not entirely reliable, further complicating the task of image segmentation. The impact of image resolution was also examined by comparing two UNET models: one trained on full-resolution images, and another on reduced-resolution images. Despite limitations posed by the unreliable ground truth, the results are promising for some classes. Additionally, the accuracy did not significantly deteriorate with lower resolution.
Thursday, 5/23/2024 3:00 PM - 6:00 PM,
Camelia 1, Grand hotel Adriatic, Opatija
3:00 PM - 4:30 PM Natural Language Processing - I
Chair: Darko Huljenić 
1.M. Keber, I. Grubišić, A. Barešić (Ruđer Bošković Institute, Zagreb, Croatia), A. Jović (Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
A Review on Neuro-Symbolic AI Improvements to Natural Language Processing 
Symbolic artificial intelligence reflects the knowledge of domain experts and adheres to domain logic, rules, or any relations between entities. Connectionist (neuro) approaches, which are based on artificial neural networks, are excellent for extracting abstract features, contextualization and embedding interaction between features. If connectionist and symbolic approaches are properly aligned within a model, they benefit from complementary strengths; the combination is referred to as a hybrid or neuro-symbolic artificial intelligence (NSAI) model or approach. Hybrid models are designed to solve task-specific challenges in different fields, such as computer vision, natural language processing (NLP), and time series analysis. Many different approaches to the topic of NSAI have been investigated in recent years, but they offer only limited insight into the extent of NLP gain from NSAI approaches. Therefore, in this review, we focus on neuro-symbolic approaches in standard NLP tasks, i.e., text classification, information extraction, machine translation, and language understanding. Research articles from Scopus, Web of Science and Google Scholar are reviewed using appropriate keywords (e.g., neuro-symbolic, logic, machine translation, …), in the period from 2019 to 2024. The aim of the review is to show the types of NSAI systems in use today, identify the motivation for using NSAI, evaluate the use of additional annotations for content description, and briefly describe how the neuro-symbolic connection improves the methodology or enables trustworthy and explainable applications. The review also points out application areas and improvements achieved in the benchmarks for the reviewed papers.
2.C. Mihai, C. Mocan, C. Nandra, E. Chifu (Technical University of Cluj-Napoca, Cluj-Napoca, Romania)
Different Approaches for Reading Comprehension of Abstract Meaning 
In natural language processing, grasping abstract meaning in text is pivotal. This challenge involves identifying abstract words and concepts to accurately interpret input data implications. Two supervised machine learning models are employed for this purpose, with a focus on comparing results against baseline models. The first model is Bidirectional Encoder Representations from Transformers (BERT), known for achieving state-of-the-art results through fine-tuning on various language tasks. Utilizing an attention mechanism, BERT captures contextual meaning and dependencies within the input and output. The second approach employs Logistic Regression (LR), a significant tool for data prediction and classification in machine learning. With binary variation chosen for this task, LR models the probability of an event occurring based on historical information and the correlations between dependent and independent variables. Implementing two approaches introduces risks, but diverse interpretations of results may inspire further improvements. The model aims to emulate human-like interpretation of article summaries, discerning mechanisms and patterns used by the human brain. Potential enhancements include expanding the dataset for broader learning and adopting different hyperparameter optimization strategies. Overall, this model strives to understand text in an abstract manner, offering potential insights for future refinements.
3.M. Sandor, C. Mocan, C. Nandra, E. Chifu (Technical University of Cluj-Napoca, Cluj-Napoca, Romania)
Different Approach for Induction of Unsupervised Lexical Semantic Frames 
The challenge of dealing with the inherent ambiguity and constant evolution of human language could be addressed through semantic role labeling, a process of assigning labels to words or phrases to indicate their roles in a sentence. The goal is to discern sentence meanings by detecting and assigning specific roles to arguments associated with predicates or verbs. This paper emphasizes the impact of lexicons, particularly VerbNet and FrameNet, on labeling and classification. To overcome language ambiguity, we look into solving two tasks: grouping verbs into frame type clusters and clustering verb arguments into frame-specific slots or generic roles. The research effort described herein employs a fully unsupervised approach, utilizing Agglomerative clustering and extracting information from the CoNLL-U format. Three models—BERT, ELMo, and Word2Vec—generate embeddings, and the results are analyzed through agglomerative clustering with optimized hyperparameters. The paper suggests potential enhancements, such as incorporating a small set of annotated data for semi-supervised learning, expanding the dataset, and assessing system performance across different languages by training language models on new language samples. Overall, the research strives to provide a comprehensive solution to the multifaceted challenges of understanding and interpreting evolving human language.
4.K. Altwlkany, S. Delalić (Faculty of Science, University of Sarajevo & Infobip, Sarajevo, Bosnia and Herzegovina), E. Selmanović, A. Alihodžić (Faculty of Science, University of Sarajevo, Sarajevo, Bosnia and Herzegovina), I. Lovrić (Infobip, Zagreb, Croatia)
A Recurrent Neural Network Approach to the Answering Machine Detection Problem 
In the field of telecommunications and cloud communications platforms, accurately and in real-time detecting whether an outbound call has been answered by a human or an answering machine is of paramount importance. This problem is of particular significance during campaigns as it enhances service quality, efficiency, and cost reduction through precise caller identification. Despite the significance of the field, it remains inadequately explored in the existing literature. This paper presents an innovative approach to answering machine detection that leverages transfer learning through the YAMNet model for feature extraction. The YAMNet architecture facilitates the training of a recurrent-based classifier, enabling real-time processing of audio streams, as opposed to fixed-length recordings. The results demonstrate an accuracy of over 96% on the test set. Furthermore, we conduct an in-depth analysis of misclassified samples and reveal that an accuracy exceeding 98% can be achieved with the integration of a silence detection algorithm, such as the one provided by FFMPEG.
5.K. Altwlkany, S. Delalić (Faculty of Science, University of Sarajevo & Infobip, Sarajevo, Bosnia and Herzegovina), A. Alihodžić, E. Selmanović, D. Hasić (Faculty of Science, University of Sarajevo, Sarajevo, Bosnia and Herzegovina)
Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization 
The majority of existing literature and research on audio fingerprinting and retrieval is centered around music, with popular applications like Apple's Shazam or Google's Now Playing designed for individual audio recognition on mobile devices. However, the spectral content of speech differs from that of music, necessitating modifications to current audio fingerprinting approaches. This paper offers fresh insights into adapting existing techniques to address the specialized challenge of speech retrieval in telecommunications and cloud communications platforms. The focus is on achieving rapid and accurate audio retrieval in batch processing instead of facilitating single requests, typically on a centralized server. Moreover, the paper demonstrates how this approach can be utilized to support audio clustering based on speech transcripts without undergoing actual speech-to-text conversion. This optimization enables significantly faster processing without the need for GPU computing, a requirement for real-time operation that is typically associated with state-of-the-art speech-to-text tools, such as OpenAI's Whisper.
6.N. Frid, M. Đurasević (Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia)
Application of Evolutionary Optimization in Task Mapping and Scheduling for Heterogeneous Mobile-Edge Computing 
The growing demand for computational power in mobile and Internet of Things (IoT) applications, particularly in real-time contexts like autonomous driving, requires efficient task offloading to the cloud. However, challenges arise due to the need for stable internet connectivity and uncertainties in cloud responsiveness, especially in real-time systems. To overcome these challenges, partial task offloading to edge devices, which are placed between mobile or IoT devices and the cloud, is employed. This paper investigates task mapping and scheduling within a heterogeneous mobile-edge-cloud architecture with limited connectivity between nodes, and constrained task executability at the mobile and edge layers. Based on similarities with challenges in heterogeneous multiprocessor embedded systems, the paper explores the application of NSGA-II-based algorithms that were previously successfully applied in task mapping and scheduling for sparsely connected heterogeneous multiprocessor platforms. The algorithms are evaluated in the heterogeneous mobile-edge-cloud setting, and based on their results, possibilities for optimizing computational task allocation are discussed.
4:30 PM - 4:45 PM Break
4:45 PM - 6:00 PM Natural Language Processing - II
Chair: Darko Huljenić 
1.J. Šturm, P. Zajec, M. Škrjanc, D. Mladenić (Jožef Stefan International Postgraduate School | Jožef Stefan Institute, Ljubljana, Slovenia), M. Grobelnik (Jožef Stefan Institute, Ljubljana, Slovenia)
Enhancing Cognitive Digital Twin Interaction Using a LLM Agent 
This paper presents the development of a Large Language Model (LLM) agent designed to augment analysts' interaction with a cognitive digital twin representation of a country. The agent primarily focuses on intelligent data retrieval and summarization from various sources, including environmental sensors and textual data streams. It leverages advanced LLM capabilities to access and analyze real-time weather conditions, surface and ground water levels, and other environmental factors. Additionally, it integrates text data from news sources, SOS emergency events, and weather predictions and alerts, offering a comprehensive view of the country's current state. The agent's innovative approach not only simplifies data aggregation but also provides insightful summaries, enhancing the analytical process. Furthermore, it engages interactively with users, responding to specific queries by fetching relevant data from diverse databases, thus facilitating informed decision-making in real-time. This paper discusses the agent's architecture, data integration mechanisms, and the benefits of its application in managing and understanding complex country-wide data sets.
2.D. Fischer, G. Hagel (University of Applied Sciences Kempten, Kempten, Germany)
Enhancing NLP-Based Educational Assessment: A Node-Based Graph Approach for Analyzing Freeform Student Texts 
This paper introduces a software tool designed to elevate the assessment of student-submitted freeform text through a node-based graph approach leveraging large language models and customizable natural language processing (NLP) techniques. It combines directed feedback for students with a scoring system to provide a comprehensive understanding of the submitted responses. The node-based approach further allows educators to create tailored assignments with reusable evaluation strategies. This tool aims to scale up individualized feedback efficiently, addressing the challenges posed by large student numbers, thus alleviating the time investment of educators. This capability is particularly vital in scenarios where manual evaluation of numerous freeform submissions is impractical or unfeasible. This paper focuses on the software’s architecture and the integration of node-based NLP practices in freeform text analysis.
3.M. Krajčí, M. Napravnik, I. Štajduhar (University of Rijeka – Faculty of Engineering, Rijeka, Croatia)
Processing Medical Diagnostic Reports using Machine Learning 
Given the increasing number of narrative radiology reports, the task of linking these to diagnoses is becoming progressively more challenging, making classification and keyword analysis inevitable. Computational methods, including unsupervised learning and natural language processing, are a tool for automatic discovery of patterns and keywords in a given text. This paper focuses on the application of a transformer model and hierarchical density-based spatial clustering of applications with noise classification (HDBSCAN) of unstructured radiology reports. The transformer model was trained using masked language modeling (MLM), and used to obtain latent embeddings of the reports, which were reduced using uniform manifold approximation and projection (UMAP), and clustered using HDBSCAN. The keywords for the individual clusters were determined using the term frequency-inverse document frequency (TF-IDF) and support vector machine (SVM) weights to explain and analyze the clustering. The results show the diversity of clusters in the dataset, with a high classification accuracy of the linear support vector machine. However, shortcomings in model evaluation indicate that further research and optimization of evaluation metrics are needed. Keyword extraction shows that the process of extracting relevant words needs to be improved for more accurate classification.
4.A. Nedbaylo, D. Hristovski (University of Ljubljana, Ljubljana, Slovenia)
Implementing Literature-based Discovery (LBD) with ChatGPT 
Literature-based discovery (LBD) is a methodology for generating research hypotheses by identifying hidden connections within the scientific literature. While its application has been predominantly in the field of biomedicine, particularly through the use of Medline, the largest freely available biomedical bibliographic database, traditional LBD methodologies have relied heavily on rule-based approaches. These include utilizing co-occurrences among biomedical concepts or extracting semantic relations through natural language processing (NLP) methods. This study ventures into novel territory by exploring the use of advanced tools like ChatGPT for LBD, with a focus on leveraging prompt engineering to enhance hypothesis generation. We employed Large Language Models (LLMs) such as GPT-3.5 and GPT-4 to simulate the discovery of relationships between medical concepts. The study specifically examines the effectiveness of these models in autonomously replicating well-established medical correlations and generating potentially novel hypotheses. Our preliminary findings suggest that while LLMs show promise in generating hypotheses that occasionally deviate from established medical knowledge, challenges persist in consistently directing these models to produce truly innovative and less-explored connections. The study highlights the potential of LLMs in enriching the LBD process, yet also underscores the need for cautious evaluation and further research to optimize their application in this domain.
5.A. Andrijašević, B. Vukelić (Polytechnic of Rijeka, Rijeka, Croatia)
Generating Speech Material for Auditory Training Exercises using ChatGPT Chatbot 
Hearing rehabilitation is a complex process comprised of various types of interventions. One of them, auditory training, is commonly offered to adults with hearing loss who either underwent cochlear implant surgery or adapt with difficulty to the prescribed behind-the-ear hearing aid. During this type of intervention, speech material saturated with phonemes of different spectral characteristics is presented to the patient. In this paper, text analysis and generation capabilities offered by OpenAI’s state-of-the-art GPT-4 model-based ChatGPT chatbot were assessed under a set of explicitly stated Croatian language rules and strict phonetic criteria, with the final objective of novel speech material generation. The results indicate that the chatbot can successfully generate isolated words of standard Croatian language for two target phoneme classes at saturation levels up to 80 %, and for the remaining three classes at levels up to 60 %. Moreover, as much as 90 % of sentences that it generated at the 50 % saturation level satisfy the stated phonetic criteria and, even though generally void of meaning, adhere to the Croatian language rules and, thus, can be used for auditory training exercises.
Friday, 5/24/2024 9:00 AM - 12:30 PM,
Camelia 1, Grand hotel Adriatic, Opatija
9:00 AM - 10:30 AM Machine Learning Applications and Other Topics - I
Chair: Alan Jović 
1.D. Kinaneva, G. Hristov, G. Georgiev, P. Zahariev (University of Ruse, Ruse, Bulgaria)
Machine Learning Algorithms for Data Mining and Predictive Analytics in Precision Agriculture 
Artificial intelligence is a global topic that has gained significant importance in recent years. It sparks debates about its usage, but despite the controversies that arise, there are many benefits to applying it in various areas. In this paper, the authors investigate the possibilities of implementing machine learning algorithms to help optimize crop management for precision agriculture. Meeting current food demands becomes an increasingly challenging task as the population grows. Applying machine learning to IoT data analytics in the agricultural sector will bring additional advantages, increasing not only the quantity but also the quality of production from crop fields to meet the rising food demands. The main objective of the paper is to employ various machine learning algorithms for predictive modeling and subsequently develop a robust model capable of making accurate predictions on unseen agricultural data. The research methodology involves preprocessing the acquired dataset, including data cleaning and feature engineering to enhance model performance. The authors systematically apply a range of machine learning algorithms, such as regression, decision trees, random forests, and neural networks, to identify the most effective approach for crop yield prediction. For better model performance, boosting techniques are also implemented. Cross-validation and performance metrics such as Mean Absolute Error (MAE), Root Mean Square Error (RMSE), accuracy, confusion matrix, precision, recall and F1 score are used to evaluate the models, aiming to highlight the strengths and weaknesses of each algorithm in the context of precision agriculture.
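The two regression metrics named in this abstract, MAE and RMSE, can be illustrated with a minimal sketch. The function names and the toy yield values below are hypothetical and are not taken from the paper; only the standard metric definitions are assumed.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# Hypothetical crop-yield values (illustrative only, not data from the paper).
observed = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 7.0]

print(mae(observed, predicted))   # 0.75
print(rmse(observed, predicted))  # sqrt(1.25), about 1.118
```

RMSE penalizes the single error of 2 more heavily than MAE does, which is why the two metrics are often reported together.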
2.S. Miovska, C. Martinovska Bande, N. Stojkovik (Goce Delcev University, Stip, Macedonia)
Predicting Wine Properties Based on Weather Conditions Using Machine Learning Techniques 
Wine quality depends on different factors from cultivation to production. The main factors affecting the quality are weather and climate, growing practices of the vineyard and techniques used by winemakers. Chemical properties of wines are not sufficient to predict quality and the price is not the best indicator of quality. We analyze the impact of weather conditions on red and white wines from different wine-producing regions with specific climatic and soil conditions in North Macedonia. The wine data are gathered from a certification and quality assessment laboratory and consist of physicochemical characteristics such as alcohol, volatile acids, total extracts, sugar residue, etc. Weather conditions, such as the amount of precipitation, daily average temperature, temperatures above 10°C and relative air humidity, have a different impact on the vineyards in different periods of growth, so they are considered in a phenologically relevant way. The paper describes a model for predicting wine properties based on weather data using local kernel regression. The results show that high temperatures without precipitation during the ripening period have a positive impact on quality.
3.M. Klepo (A1 Hrvatska, Zagreb, Croatia), B. Novoselnik (Sveučilište u Zagrebu Fakultet elektrotehnike i računarstva, Zagreb, Croatia)
Product Demand Forecasting for Shelf Space Allocation in Retail via Machine Learning 
Since shelf space is a limited resource in retail and because of the increasing variety of items being sold, retailers often must make decisions on which products to include in the assortment. The shelf space allocation problem refers to determining how much space should be allocated to each product on the shelf to increase the retailer’s profitability. In this paper, a shelf space allocation problem is described, as well as different shelf space optimization approaches and a unique approach that summarizes all previous research. Special emphasis is placed on product demand forecasting, where traditional and machine learning approaches are presented. Machine learning models are implemented in the practical part of the paper in order to forecast demand. The first algorithm used is XGBoost, a highly scalable ensemble of decision trees based on gradient boosting. The model is trained to forecast one step ahead, and multi-step predictions are made using a recursive iterative technique. The second algorithm is long short-term memory, a type of recurrent neural network with the ability to remember values from earlier steps for future use. The models’ performances were compared using several evaluation metrics on publicly available realistic datasets.
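The recursive technique mentioned in this abstract can be sketched as follows: a model that predicts only one step ahead is applied repeatedly, with each prediction fed back in as an input for the next step. This is a minimal illustration under stated assumptions, not the authors' implementation; `recursive_forecast`, `mean_of_lags` and the demand values are hypothetical, and a simple averaging function stands in for a trained one-step model such as XGBoost.

```python
def recursive_forecast(one_step_model, history, horizon, n_lags):
    """Forecast `horizon` steps ahead by feeding predictions back as inputs."""
    window = list(history[-n_lags:])      # most recent observations
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return forecasts

def mean_of_lags(window):
    """Hypothetical stand-in for a trained one-step-ahead model."""
    return sum(window) / len(window)

# Illustrative demand history (not data from the paper).
demand_history = [10.0, 12.0, 14.0]
forecast = recursive_forecast(mean_of_lags, demand_history, horizon=2, n_lags=2)
print(forecast)  # [13.0, 13.5]
```

Because later steps depend on earlier predictions rather than on observed values, errors can accumulate over the forecast horizon.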
4.Á. Pérez-García, A. Martín Lorenzo, J. López Feliciano (University of Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain)
Spectral Band Selection Methodology for Future Sensor Development 
Hyperspectral imaging (HSI) significantly impacts diverse Earth observation applications, including precision agriculture, land use change monitoring, climate research, and natural resources management. By measuring hundreds of wavelengths along the electromagnetic spectrum, hyperspectral sensors can discriminate surfaces accurately, identify objects in fine detail and detect changes precisely. Nevertheless, HSI has some drawbacks, such as the challenge of handling vast amounts of information, the redundancy of spectral data, and the high price often associated with it. Therefore, a new band reduction model based on artificial intelligence classifiers with hyperparameter optimisation and feature selectors is proposed. To test the model, it is applied to three popular HSI datasets: Pavia University, Salinas Valley, and Indian Pines. The results are promising: between 75% and 90% accuracy can be achieved with only four wavelengths. The best performance is for Salinas Valley, and the poorest is for Indian Pines, coinciding with the ranking of dataset complexity regarding the number of classes and the similarity of spectral signatures. The lowest-ranked Indian Pines classes are woods and stone-steel-towers. The findings of this study suggest that it is feasible to develop multispectral sensors that require reduced spectral information, making them cost-effective and tailored to classify specific scenarios.
5.I. Znika (Zagreb University of Applied Sciences, Zagreb, Croatia), A. Radovan (Algebra University College, Zagreb, Croatia)
Personal Physical Fitness Modeling through Real-Time Predictive Models 
Committing to sports as a vital component of a healthy lifestyle necessitates ongoing awareness of one's body, emphasizing physical constraints relative to current health and activity levels. This research focuses on developing real-time predictive models using machine learning algorithms based on real-time and accumulated personal data, encompassing vital functions and other physical and mental parameters. The objective is to optimize the recognition and prediction of an individual's physical fitness for more effective participation in sports. A key element is the analysis of personal data to craft an individualized physical fitness model. By integrating data on vital functions and relevant parameters, the system aims to support decisions on optimal body load during sports activities. The quality of predicting an individual's physical fitness relies on precise and comprehensive data, placing requirements on the monitoring and analysis system for bodily parameters. The data, collected through various wearable devices, are presented to the user through an interactive iOS and Android application. In this work, practical examples were created for several individuals, illustrating how their actual physical fitness values relate both without predictive models and with predictive models that can be customized according to individual preferences.
6.J. Nalić, Z. Mašetić (International Burch University, Faculty of Engineering and Information Technologies, Sarajevo, Bosnia and Herzegovina), I. Djedović (International Burch University, Faculty of Economics and Social Sciences, Sarajevo, Bosnia and Herzegovina)
Building Ensemble Models for Enhanced Credit Scoring: A Case Study from a Bosnian MicroFinancial Institution 
This research paper presents the development of a comprehensive credit scoring model for a MicroFinancial institution in Bosnia and Herzegovina. The study employs diverse machine learning and deep learning algorithms to analyze a real-life dataset, aiming to establish a reliable and high-performing credit scoring model. We explored the efficacy of single classifiers, including Decision Tree, K-Nearest Neighbors (KNN), Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM) networks, alongside advanced ensemble models such as XGBoost and CatBoost. To enhance predictive accuracy, we also experimented with creating additional ensemble combinations of these models. The performance of each model was rigorously evaluated using a range of metrics: Accuracy, Precision, Recall, F1 score, Type I error, and Type II error. The findings reveal that integrating multiple, varied classifiers significantly boosts performance compared to individual classifiers. This approach not only contributes to the field of credit risk assessment but also offers practical insights for financial institutions seeking robust credit scoring solutions. The results underscore the potential of ensemble methodologies in harnessing the strengths of different machine and deep learning models for improved decision-making in credit scoring.
11:00 AM - 12:30 PM Machine Learning Applications and Other Topics - II
Chair: Marko Horvat 
1.Z. Lončarević, M. Luštrek, A. Gams (Jožef Stefan Institute, Ljubljana, Slovenia)
Evaluation of Classical and Deep Learning Approaches for Human Activity Recognition 
The main idea of the internet of things is to have devices connected and working together for the benefit of humans, and it is being realized more with each day as the number of devices with built-in processors, sensors and wireless connectivity increases. One of the most widespread applications is human activity recognition, which enables applications to make personalized recommendations about healthy living habits. To accomplish that, algorithms that are able to recognize human activities have evolved. Traditionally, this is done through the extraction of hand-crafted features for specific sets of activities and use cases (classical approaches). Although these methods achieve great performance, the expansion of neural networks is shifting this task away from feature engineering towards neural network engineering. In this paper we compare several existing classical approaches, deep learning approaches, and the newly developed AutoPyTorch library, which enables a combination of traditional and neural network approaches for human activity recognition tasks. The algorithms were not fine-tuned and results might vary depending on the hyperparameters, but they provide insight into general patterns as well as the upsides and downsides of using different approaches for human activity recognition.
2.D. Georgiev, M. Toshevska, S. Gievska (Faculty of Computer Science and Engineering, University of Ss. Cyril and Methodius - Skopje, Republi, Skopje, Macedonia)
Identification of HIV Inhibitors using Graph Neural Networks 
Graph neural networks (GNN), primed to extract knowledge and discover patterns in graph-structured data, have received particularly increased attention in biomedical research. By integrating information from a variety of biomedical knowledge repositories, they offer a fast and efficient computational alternative to the costly and time-consuming process of drug development and research. The core contributions of this paper include the design and empirical evaluation of several GNN-based models for the identification of potential HIV (Human Immunodeficiency Virus) inhibitors. In particular, the predictive power of model variants based on Graph Attention Network (GAT), Graph Isomorphism Network (GIN) and Continuous Kernel-Based Graph Convolutional Network, specifically developed to handle molecular data, has been investigated. To assess the effectiveness of the proposed models, the ogbg-molhiv molecular dataset from the Stanford Open Graph Benchmark was used. Furthermore, two types of molecular fingerprints have been proposed to augment the molecular representation in the proposed graph neural models, leading to better performance compared to the original models. The paper provides a detailed description of the proposed models for identifying HIV inhibitors, followed by a comparative analysis of the experimental results, focusing on a discussion of the challenges we face and future research directions that could be investigated.
3.L. Mrčela, Z. Kostanjčar (Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
Probabilistic Deep Learning Approach to Credit Card Fraud Detection 
Credit card fraud detection deals with algorithms that can determine in real time whether an ongoing transaction is fraudulent, thereby improving the prevention of such transactions. Fraud detection approaches are often based on defining rigid boundaries between normal and fraudulent transactions. However, one of the challenges in establishing fraud detection rules is that the share of frauds in the total number of observed transactions is often not sufficient to enable good generalization when recognizing new frauds. In this article, we present a probabilistic deep learning approach to credit card fraud detection, which efficiently detects fraud based only on learned properties of normal transactions. Furthermore, we show a simple and natural interpretation of the results obtained by such models that can provide additional reasoning as to why a particular transaction can be considered fraudulent. We confirm our findings with tests on a synthetic set of transaction data and compare them with the results of other commonly used approaches.
4.S. Dumenčić, I. Lučin, M. Alvir, J. Lerga, L. Kranjčević (University of Rijeka, Faculty of Engineering, Rijeka, Croatia)
Detecting Water Surface Borders on Satellite Images 
In the process of climate change, methods used for environmental monitoring are becoming more and more necessary. Changes in water surfaces can be used as an indicator of climate change, including drought and flood. In this paper, we investigated methods for detecting water surface borders on satellite images using manual and automatic approaches. The approaches were tested on true color images and Normalized Difference Water Index (NDWI) images with enhanced water surfaces. The manual algorithm (MSDA) marks pixel values within a specified tolerance from the initial pixel. Additionally, two automatic Shoreline Detection Algorithms (SDA) were developed. The first one is based on the Suzuki algorithm and the second one is based on the Canny algorithm in combination with the Suzuki algorithm. The results indicate minimal differences between SDA1 and SDA2 in marking water surface borders, with better performance observed with NDWI images compared to true color images. While MSDA is suitable for precise and adjustable marking, it should be noted that manual marking is more time-consuming. The automatic algorithms can be used for the analysis of a greater number of satellite images with minimal user input and as such give a good overview of possible changes of the water surface borders.
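The ideas of an NDWI image and of marking pixels within a tolerance of an initial pixel can be illustrated with a small sketch. It is not the authors' MSDA implementation. It assumes the common green/near-infrared NDWI formulation, (Green - NIR) / (Green + NIR), and it assumes that marking spreads from the seed pixel to 4-connected neighbours (a flood fill), which the abstract does not specify. All function names and band values are hypothetical.

```python
from collections import deque

def ndwi(green, nir):
    """NDWI for one pixel, assuming the (Green - NIR) / (Green + NIR) form."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total

def neighbours(r, c):
    return ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))

def mark_region(image, seed, tolerance):
    """Assumed flood fill: mark connected pixels within `tolerance` of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    marked, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in neighbours(r, c):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in marked:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    marked.add((nr, nc))
                    queue.append((nr, nc))
    return marked

def border_pixels(marked):
    """Marked pixels with at least one unmarked (or off-image) neighbour."""
    return {p for p in marked if any(n not in marked for n in neighbours(*p))}

# Hypothetical 3x3 reflectance bands: water in the top-left 2x2 corner.
green = [[0.6, 0.6, 0.2], [0.6, 0.6, 0.2], [0.2, 0.2, 0.2]]
nir = [[0.2, 0.2, 0.6], [0.2, 0.2, 0.6], [0.6, 0.6, 0.6]]
index = [[ndwi(g, n) for g, n in zip(g_row, n_row)]
         for g_row, n_row in zip(green, nir)]

water = mark_region(index, seed=(0, 0), tolerance=0.2)
print(sorted(water))  # the four top-left pixels
```

In this toy image water pixels have an NDWI near 0.5 and land pixels near -0.5, so a tolerance of 0.2 separates them cleanly; on real imagery the tolerance would need tuning, which matches the abstract's point that manual marking is adjustable but more time-consuming.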
5.R. Šajina (Sveučilište Jurja Dobrile u Puli, Pula, Croatia)
Evaluation of Knowledge Generalization of Decentralized Agents in the Context of Data Heterogeneity 
This paper investigates the impact of data heterogeneity on the decentralized learning of intelligent agents. Although various approaches exist for simulating heterogeneity by partitioning data among agents, we focus on the importance of evaluating an individual agent's general knowledge on the overall representation of all the data. Existing approaches often rely on test sets that reflect the distribution of the training data, but an analysis of the generality of agents' knowledge across the entire spectrum of data is lacking. Our study therefore addresses this gap by analyzing and assessing the general knowledge of an individual agent on the complete dataset. We expect this analysis to provide deeper insight into the generalization ability of agents in the context of decentralized learning, going beyond training and testing on similar data distributions.
6.R. Šajina (Sveučilište Jurja Dobrile u Puli, Fakultet informatike, Pula, Croatia)
Evaluation and Analysis of Deep Neural Network Models for Multi-Person Motion Prediction in a Scene 
Predicting the motion of multiple people in a scene is a challenging task with broad application potential. It is applied in autonomous vehicles, where it is used to predict pedestrian movement so that appropriate actions can be taken based on those predictions. Likewise, in sports analysis such models are used to predict player movement, while in robot mobility they are used to anticipate the behavior of the environment and adapt the robot's behavior accordingly. This paper investigates models for multi-person motion prediction, with a particular focus on analyzing their performance on a new, previously unexplored dataset. The paper analyzes the latest models, which mostly rely on the Transformer architecture, and also covers approaches based on simpler architectures. Through this analysis, the paper contributes to a deeper understanding of the diversity of multi-person motion prediction models, providing insight into how the latest architectures, such as the Transformer, address this problem compared with previous approaches. The research contributes to the development of real-time motion prediction techniques, with potential for improving autonomous systems in various environments and application scenarios.

Basic information:

Darko Huljenić (Croatia), Alan Jović (Croatia)

Steering Committee:

Andrea Budin (Croatia), Bojan Cukic (United States), Marko Đurasević (Croatia), Marina Ivašić-Kos (Croatia), Domagoj Jakobović (Croatia), Ruizhe Ma (United States), Neeta Nain (India), Stjepan Picek (Netherlands), Slobodan Ribarić (Croatia), Vitomir Štruc (Slovenia)

Program Committee:

Mario Brčić (Croatia), Karla Brkić (Croatia), Marko Čupić (Croatia), Marko Đurasević (Croatia), Marko Horvat (Croatia), Ivo Ipšić (Croatia), Domagoj Jakobović (Croatia)

Registration / Fees:

Price in EUR (up to 6 May 2024 / from 7 May 2024):
Members of MIPRO and IEEE: 243 / 270
Students (undergraduate and graduate), primary and secondary school teachers: 130 / 150
Others: 270 / 300

The discount doesn't apply to PhD students.

NOTE FOR AUTHORS: In order to have your paper published, it is required that you pay at least one registration fee for each paper. Authors of 2 or more papers are entitled to a 10% discount.


Darko Huljenić

Faculty of Electrical Engineering and Computing
Unska 3
HR-10000 Zagreb, Croatia


Alan Jović

Faculty of Electrical Engineering and Computing
Unska 3
HR-10000 Zagreb, Croatia

Phone: +385 1 612 9548

The best papers will get a special award.
Accepted papers will be published in the ISSN registered conference proceedings. Papers in English presented at the conference will be submitted for inclusion in the IEEE Xplore Digital Library. 

Selected scientific papers, with some further modification and refinement, may be published in the following journals: Journal of Computing and Information Technology (CIT), MDPI Applied Science, MDPI Information Journal, Frontiers, and EAI Endorsed Transaction on Scalable Information Systems.


Opatija is the leading seaside resort of the Eastern Adriatic and one of the most famous tourist destinations on the Mediterranean. With its aristocratic architecture and style, Opatija has been attracting artists, kings, politicians, scientists, sportsmen, as well as business people, bankers and managers for more than 170 years.

The tourist offer in Opatija includes a vast number of hotels, excellent restaurants, entertainment venues, art festivals, superb modern and classical music concerts, beaches and swimming pools – this city satisfies all wishes and demands.

Opatija, the Queen of the Adriatic, is also one of the most prominent congress cities in the Mediterranean, particularly important for its ICT conventions, one of which is MIPRO, which has been held in Opatija since 1979, and attracts more than a thousand participants from over forty countries. These conventions promote Opatija as one of the most desirable technological, business, educational and scientific centers in South-eastern Europe and the European Union in general.


News about event
  • 4/4/2024

    Invited Lecture: 

    Karla Brkic, PhD
    Meta Reality Labs, Zurich, Switzerland


    Extended Reality on Meta Quest Devices 


    What is augmented reality? What is mixed reality? What is extended reality? This talk will introduce mixed and virtual reality on the Meta Quest 3 platform, show some use cases, and discuss the enabling computer vision and computer graphics technologies in the Meta Presence Platform that support development for mixed reality, including scene models, scene anchors and scene mesh. 

Patrons
Sveučilište u Zagrebu, Sveučilište u Rijeci, FER Zagreb, Pomorski fakultet Rijeka, Tehnički fakultet Rijeka