|M. Malyy (Skolkovo Institute of Science and Technology, Moscow, Russian Federation), Z. Tekic (Graduate School of Business, HSE University, Moscow, Russian Federation), T. Podladchikova (Skolkovo Institute of Science and Technology, Moscow, Russian Federation)
Hey Google, How Valuable is This Startup? Internet Search Queries and New Ventures’ Valuation – Insights from B2B and B2C Sectors
We introduce, test, and discuss a data-driven approach to startup valuation based on Google Trends (GT), a big data instrument. We collected GT data for US startups from two different contexts: B2C food delivery and B2B money lending. By normalizing and pairing Google Trends data points with existing valuation points for each identified company, and then applying regression analysis, we obtained two similar quadratic models with high explanatory (average adjusted R2 = 0.986) and predictive (average predictive R2 = 0.964) power. However, the outcomes obtained do not come without limitations: the standard errors of the models are significantly high (SE_b2c = $262M, SE_b2b = $52M), decreasing their practical applicability. These results suggest that, with further enhancement, our approach has the potential to estimate the market value of a particular new venture by relating its own and its competitors’ search query data and adding a few known valuation points. Our relatively simple method offers first insights into the potential of Google Trends data, a limitless source of high-quality data, for solving the considerably challenging task of valuing nascent ventures. It makes several important contributions and opens the door to leveling the playing field in the valuation efforts of different parties, providing grounds for decreasing the level of information asymmetry among parties interested in valuation.
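The regression step described above can be sketched as follows. This is a minimal illustration only: the interest values and valuation points below are invented placeholders, not the authors' dataset, and the quadratic fit with adjusted R2 merely mirrors the kind of model the abstract reports.

```python
import numpy as np

# Hypothetical normalized Google Trends interest values (0-100 scale),
# paired with known valuation points (in $M) for one illustrative startup.
gt_interest = np.array([10.0, 25.0, 40.0, 60.0, 80.0, 100.0])
valuation_m = np.array([50.0, 180.0, 400.0, 900.0, 1600.0, 2500.0])

# Fit a quadratic model: valuation = a*x^2 + b*x + c.
coeffs = np.polyfit(gt_interest, valuation_m, deg=2)
model = np.poly1d(coeffs)

# Adjusted R^2 as a measure of explanatory power.
residuals = valuation_m - model(gt_interest)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((valuation_m - valuation_m.mean()) ** 2)
n, k = len(valuation_m), 2
r2 = 1.0 - ss_res / ss_tot
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
```

Predictive power (the paper's predictive R2) would additionally require held-out valuation points rather than in-sample residuals.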
|M. Fertalj, L. Brkić, I. Mekterović (Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
A Systematic Review of Peer Assessment Approaches to Evaluation of Open-Ended Student Assignments
Students enrolled in computer science courses should develop logical thinking during their studies and learn to apply theoretical knowledge to open-ended problems, where subjective assessment is expected and a single exact solution does not exist. As enrollment grows, manually grading such assignments imposes an ever greater workload on the teaching staff. Peer assessment is suitable for grading open-ended assignments; it benefits students’ learning process and relieves the teachers’ workload.
This paper provides an overview of existing peer assessment strategies and methodologies, along with an analysis of published performance results for grading-evaluation algorithms that are already in use in (under)graduate courses or still under development. We compare calibration approaches and systematically catalog problems and improvements published by various authors researching the same or similar problems. We produce a status report of current research and review ideas recommended for future implementation, problems whose solutions have been put on hold, and the overall challenges of implementing peer assessment. The aim of this paper is to highlight areas already marked for possible improvement, as well as areas that have been overlooked but warrant further study.
|T. Hlupić, D. Oreščanin (Poslovna inteligencija d.o.o., Zagreb, Croatia), D. Ružak (Visoko učilište Algebra, Zagreb, Croatia), M. Baranović (Fakultet elektrotehnike i računarstva, Zagreb, Croatia)
An Overview of Current Data Lake Architecture Models
As Data Lakes have gained a significant presence in the data world over the previous decade, several main approaches to building Data Lake architectures have been proposed. From the initial architecture to the novel ones, omnipresent layers have become established, while new architecture layers continue to evolve. The evolution of the Data Lake is mirrored in its architectures, giving each layer a distinctive role in data processing and consumption. Moreover, evolving architectures tend to incorporate established approaches, such as Data Vaults, into their layers for more refined usage. In this article, several well-known architecture models are presented and compared with the goal of identifying their advantages. In addition to the architecture models, the topic of Data Governance in the context of the Data Lake is covered in order to expound its impact on Data Lake modeling.
|D. Mlinarić, V. Mornar, J. Dončević (Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
Ranking Model for Dormitory Admission Process
The process of enrolling students in dormitories corresponds to the stable marriage problem, where students and dormitories are matched as two sets. Applying the Gale-Shapley deferred acceptance algorithm ensures stability: students enroll in their highest-priority dormitories, while dormitories admit students according to their scores. In contrast to admission to schools and colleges, the dormitory admission process divides the set of students by gender into two subsets. Therefore, each dormitory has two sub-quotas, one for each subset of students. In this paper, we present a ranking model for the admission process based on a modified deferred acceptance algorithm. The experimental results show that the proposed model satisfies the requirements of the admission process.
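The gender-split variant of deferred acceptance described above can be sketched as a student-proposing loop in which each dormitory keeps a separate tentative pool per gender. The names, scores, preferences, and quotas below are illustrative assumptions; the paper's actual ranking model may differ in its details.

```python
# Minimal student-proposing deferred acceptance with per-gender sub-quotas.
# All data is illustrative, not taken from the paper.

students = {
    # name: (gender, score, preference list over dormitories)
    "ana":   ("F", 92, ["D1", "D2"]),
    "boris": ("M", 88, ["D1", "D2"]),
    "clara": ("F", 75, ["D1", "D2"]),
    "david": ("M", 81, ["D2", "D1"]),
}
# Each dormitory holds a separate quota for each gender subset.
quotas = {"D1": {"F": 1, "M": 1}, "D2": {"F": 1, "M": 1}}

def assign(students, quotas):
    next_choice = {s: 0 for s in students}          # index into each preference list
    held = {d: {"F": [], "M": []} for d in quotas}  # tentatively accepted students
    free = list(students)                           # students still proposing
    while free:
        s = free.pop()
        gender, _score, prefs = students[s]
        if next_choice[s] >= len(prefs):
            continue  # list exhausted; student stays unassigned
        d = prefs[next_choice[s]]
        next_choice[s] += 1
        pool = held[d][gender]
        pool.append(s)
        # Within the gender's sub-quota, keep the highest-scoring students.
        pool.sort(key=lambda name: -students[name][1])
        while len(pool) > quotas[d][gender]:
            free.append(pool.pop())  # lowest-scoring is rejected, proposes again
    return {s: d for d in held for g in held[d] for s in held[d][g]}

matching = assign(students, quotas)
```

In this toy run, ana (score 92) displaces clara from D1's female sub-quota, so clara moves to her second choice, D2; the result is stable with respect to the stated scores and preferences.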
|L. Petricioli, K. Fertalj (University of Zagreb Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
Agile Software Development Methods and Hybridization Possibilities beyond Scrumban
Traditional software development methods can no longer cope with the fast-growing software development market or the increasing complexity of the software being developed. Using outdated practices is no longer viable, as it can lead to reduced competitiveness and lost revenue. Therefore, many companies have turned to agile software development methods to keep up with the fast-paced market. Agile allows them to develop quickly, consistently, and flexibly. However, different approaches are better suited to different projects, and sometimes an approach combining appropriate aspects of several others is the best option. This paper describes five popular agile software development approaches, four of which can be considered “pure”, namely Scrum, Lean, Kanban, and Kaizen, and one hybrid approach, Scrumban. The paper then assesses other possible combinations of these approaches, both those found in the literature and those not yet considered.
|A. Šarčević, M. Vranić, D. Pintar, A. Krajna (University of Zagreb, Faculty of Electrical Engineering and Computing, Zagreb, Croatia)
Predictive Modeling of Tennis Matches: A Review
Predicting the outcome of sporting events has always been a popular area of interest for sports professionals as well as the general public. Although domain experts have traditionally been the main contributors to sports predictions, relying solely on human expertise and intuition is neither a scalable, time-effective, nor cost-effective solution. Hence, computers are taking the leading role in predicting the outcomes of sporting events, both in terms of estimating the final score or picking the winner and in predicting the occurrence of various events during the game. Predictive analytics has been effectively applied to a wide range of sports, including popular team sports as well as individual sports. The nature of the sport plays a significant role in the adaptation of different predictive techniques based on diverse statistical and mathematical models. Tennis has a strongly defined structure and a rigid scoring system which, along with the sport’s popularity and its large, easily accessible datasets, makes tennis match modeling a hot topic in the scientific literature. This paper provides an overview of scientific papers on the prediction of tennis match outcomes, from the first regression-based models all the way to complex models based on machine learning.
|V. Fomichov (Moscow Aviation Institute (National Research University), Institute No. 3 “Control systems, informa, Moscow, Russian Federation)
Semantic Mapping of Definitions for Constructing Ontologies of Business Processes
The paper establishes the relevance of developing natural language (NL) processing systems for constructing models of business processes, in particular their ontologies. It introduces an original algorithm for the semantic mapping of definitions of notions in NL, using English as an example. A definition in NL is transformed into its semantic representation, which is an expression of an SK-language (standard knowledge language). The class of SK-languages is introduced in the author’s theory of K-representations (knowledge representations). The suggested algorithm AlgSemDef1 is able to process definitions of notions that include attributive clauses, infinitives, and the logical connectives “not”, “and”, and “or”.
|A. Stojanović (Zagreb University of Applied Sciences, Zagreb, Croatia), M. Horvat (Faculty of Electrical Engineering and Computing, Zagreb, Croatia), Ž. Kovačević (Zagreb University of Applied Sciences, Zagreb, Croatia)
An Overview of Data Integration Principles for Heterogeneous Databases
Modern large-scale information systems often use multiple database management systems, not all of which are necessarily relational. In recent years, NoSQL databases have gained acceptance in certain domains, while relational databases remain the de facto standard in many others. Many “legacy” information systems also use relational databases. Unlike relational database systems, NoSQL databases do not share a common data model or query language, making it difficult for users to access data in a uniform manner when using a combination of relational and NoSQL databases, or simply several different NoSQL database systems. Therefore, the need for uniform access to such a variety of data sources becomes one of the central problems of data integration. In this paper, we provide an overview of the main problems, methods, and solutions for data integration between relational and NoSQL databases, as well as between different NoSQL databases. We focus mainly on the problems of structural, syntactic, and semantic heterogeneity, and on proposed solutions for uniform data access, emphasizing some of the more recent proposals.
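The idea of uniform data access over heterogeneous stores can be sketched as a small mediator that exposes one lookup operation over two backends: a relational source (SQLite) and a document-style source (a plain dict standing in for a NoSQL store). The wrapper interface, table, and field names are illustrative assumptions, not a method from the surveyed literature.

```python
import sqlite3

class RelationalSource:
    """Relational backend; rows are translated into a common dict record shape."""
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.db.execute("INSERT INTO users VALUES (1, 'alice')")

    def get_by_id(self, key):
        row = self.db.execute(
            "SELECT id, name FROM users WHERE id = ?", (key,)
        ).fetchone()
        return {"id": row[0], "name": row[1]} if row else None

class DocumentSource:
    """Stand-in for a schema-free NoSQL document store."""
    def __init__(self):
        self.docs = {2: {"id": 2, "name": "bob", "tags": ["admin"]}}

    def get_by_id(self, key):
        return self.docs.get(key)

def federated_lookup(sources, key):
    """Return the first record found for key across all sources, or None."""
    for src in sources:
        record = src.get_by_id(key)
        if record is not None:
            return record
    return None

sources = [RelationalSource(), DocumentSource()]
```

Both backends answer through the same `get_by_id` interface, so the caller never sees whether a record came from SQL or from the document store; resolving schema conflicts between overlapping sources is the harder, semantic part the paper surveys.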
|P. Soma Reddy, J. Pelletier (Rochester Institute of Technology, Rochester, NY, United States)
The Pentest Method for Business Intelligence
Information transactions and data retention are critical inputs to Business Intelligence processes. However, despite ongoing data-driven Business Intelligence process improvements, many companies discover they are vulnerable to a cyber-attack only after a breach materializes the risk. In this study, we propose that compliance regimes such as the global Payment Card Industry Data Security Standard (PCI-DSS), the federal Gramm-Leach-Bliley Act (GLBA), and the regional 23-NYCRR-500 standard provide externally imposed risk-discovery opportunities that should be part of managerial decision-making. This paper describes the penetration test (pentest) method relative to those regulatory regimes. We then consider the potential of the pentest method to yield predictive Business Intelligence data sources in five historical cases: the 2017 Equifax Breach, the 2014 J.P. Morgan Chase Breach, the 2012 Global Payments Breach, the 2010 Nasdaq Hack, and the 2009 Heartland Payments Breach. Our findings suggest that the pentest method, especially relative to PCI-DSS compliance, is a promising inclusion in Business Intelligence processes.