## List of publications

## Journal papers

**Levente Hunyadi** and István Vajk. Constrained quadratic errors-in-variables fitting. *The Visual Computer* 30(12):1347–1358, Springer Berlin Heidelberg, December 2014. DOI: 10.1007/s00371-013-0885-2

**Levente Hunyadi** and István Vajk. Modeling by fitting a union of polynomial functions to data in an errors-in-variables context. *International Journal of Pattern Recognition and Artificial Intelligence* 27(2), 2013. Article ID 1350004. PDF: http://hunyadi.info.hu/research/ijprai/hunyadi2013modeling.pdf, DOI: 10.1142/S0218001413500043

**Levente Hunyadi** and István Vajk. Reconstructing a model of implicit shapes from an unorganized point set. *Scientific Bulletin of “Politehnica” University of Timişoara, Transactions on Automatic Control and Computer Science (Buletinul Stiintific al Universitatii “Politehnica” din Timişoara, Romania, Seria Automatica si Calculatoare)* 56(70):57–64, 2011.

**Levente Hunyadi** and István Vajk. Identifying Dynamic Systems with Polynomial Nonlinearities in the Errors-in-Variables Context. *WSEAS Transactions on Systems* 8(7):793–802, July 2009.

**Levente Hunyadi** and István Vajk. An Errors-in-Variables Parameter Estimation Method with Observation Separation. *Scientific Bulletin of “Politehnica” University of Timişoara, Transactions on Automatic Control and Computer Science* 54(68)(2):93–100, 2009.

**Levente Hunyadi** and István Vajk. An identification approach to dynamic errors-in-variables systems with a preliminary clustering of observations. *Periodica Polytechnica Electrical Engineering* 52(3-4):127–135, 2008. URL: http://www.pp.bme.hu/ee/2008_3/pdf/ee2008_3_01.pdf, DOI: 10.3311/pp.ee.2008-3-4.01

**Levente Hunyadi**. Prosper: Developing web applications strongly integrated with Prolog. *Acta Cybernetica* 18(4), 2008. Selected papers from IRFIX 2007. URL: http://www.inf.u-szeged.hu/actacybernetica/edb/vol18n4/pdf/Hunyadi_2008_ActaCybernetica.pdf

## Estimation methods in the errors-in-variables context

**Summary.** Constructing a computer model from a large set of data, typically contaminated with noise, is a central problem in fields such as computer vision, pattern recognition, data mining, system identification and time series analysis. In these areas, the objective is often to capture the internal laws that govern a system with a succinct parametric representation. Despite the amount and high dimensionality of the data, the equation that relates data points is usually expressed in a compact manner. Unfortunately, nonlinearity in the system under study and the presence of noise mean that conventional tools in statistics cannot be directly applied to estimate unknown system parameters.

The dissertation explores estimation methods focused on three related areas of errors-in-variables systems: fitting a nonlinear function to data where the fit is subject to constraints; fitting a union of several elementary nonlinear functions to a data set; and estimating the parameters of discrete-time dynamic systems.

Curve and surface fitting is a well-studied problem, but the presence of noise and nonlinearity in the function that relates data points introduces bias and increases estimation variance, which is typically addressed with costly iterative methods. The thesis introduces non-iterative direct methods that fit data subject to constraints, with emphasis on fitting quadratic curves and surfaces; the resulting estimates are nevertheless close to those obtained by maximum likelihood methods.
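To make the idea of a non-iterative direct fit concrete, the following minimal sketch fits a circle, one of the simplest quadratic curves, by solving a single linear least-squares problem. It is the plain algebraic (Kåsa-type) fit, not the constrained, noise-compensated method of the thesis; the function names `solve3` and `fit_circle` are illustrative assumptions.

```python
import math


def solve3(m, v):
    """Solve the 3x3 system m * x = v by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):  # back substitution
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def fit_circle(points):
    """Algebraic fit of x^2 + y^2 + D*x + E*y + F = 0; returns (cx, cy, radius)."""
    rows = [(x, y, 1.0) for x, y in points]
    rhs = [-(x * x + y * y) for x, y in points]
    # Normal equations of the linear least-squares problem in (D, E, F).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(3)]
    d, e, f = solve3(m, v)
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)
```

The fit needs no starting guess and no iteration, which is what makes direct methods cheap. This plain version does not correct for the noise-induced bias that the paragraph above describes.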

Partitioning a data set into groups whose members are captured by the same relationship in an unsupervised manner is a common task in machine learning, referred to as clustering. While most approaches use a single point as a cluster representative, or cluster data into subspaces, less attention has been paid to nonlinear functions, or manifold clustering. The thesis applies constrained and unconstrained fitting and projection methods in the errors-in-variables context to construct an iterative and a non-iterative algorithm for manifold clustering, which incurs modest computational cost.
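The iterative flavor of such clustering can be sketched as a k-means-style alternation in which each cluster is represented by a fitted curve rather than a point. The sketch below uses straight lines fitted by orthogonal regression as a stand-in for the nonlinear functions treated in the thesis; the initial line guesses and all names (`fit_line`, `cluster_lines`) are illustrative assumptions, not the thesis algorithm.

```python
import math


def fit_line(points):
    """Orthogonal-regression line a*x + b*y + c = 0 with unit normal (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of largest spread
    a, b = -math.sin(theta), math.cos(theta)
    return a, b, -(a * mx + b * my)


def distance(line, point):
    """Perpendicular distance of a point from the line."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c)


def cluster_lines(points, initial_lines, max_iter=50):
    """Alternate between assigning points to the nearest line and refitting."""
    lines = list(initial_lines)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(lines)), key=lambda j: distance(lines[j], p))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for j in range(len(lines)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if len(members) >= 2:  # keep the old line if a cluster is too small
                lines[j] = fit_line(members)
    return labels, lines
```

Like k-means, this alternation depends on its initialization and may settle in a local optimum.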

Identification of discrete-time dynamic systems is a well-understood problem, but its errors-in-variables formulation, in which both input and output are polluted by noise, introduces interesting challenges. Several papers discuss system identification of linear errors-in-variables systems, but the estimation problem is more difficult in the nonlinear setting. The thesis combines the generalization of the Koopmans–Levin method, an approach to estimate parameters of a linear dynamic system with a scalable balance between accuracy and computational cost, with the nonlinear extension to the original Koopmans method, which gives a non-iterative approach to estimate parameters of a static system described by a polynomial function. The result is an effective system identification method for dynamic errors-in-variables systems with polynomial nonlinearities.
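Why noise on both input and output is a challenge can be seen on a first-order linear system. The sketch below simulates y[k] = a·y[k-1] + b·u[k-1], observes both signals with white noise, and compares ordinary least squares with a textbook bias-compensated variant that subtracts the noise variances from the regressor covariance. The variances are assumed known here. This is not the generalized Koopmans–Levin method of the thesis, and all names and parameter values are illustrative.

```python
import random


def simulate(a, b, n, std_y, std_u, seed=0):
    """Simulate y[k] = a*y[k-1] + b*u[k-1]; return noisy (output, input)."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0] * n
    for k in range(1, n):
        y[k] = a * y[k - 1] + b * u[k - 1]
    z = [v + rng.gauss(0.0, std_y) for v in y]  # observed output
    w = [v + rng.gauss(0.0, std_u) for v in u]  # observed input
    return z, w


def estimate(z, w, var_y=0.0, var_u=0.0):
    """Least-squares estimate of (a, b); nonzero variances enable compensation."""
    triples = list(zip(z[:-1], w[:-1], z[1:]))  # (z[k-1], w[k-1], z[k])
    n = len(triples)
    # Sample covariance of the regressors, minus the assumed noise variances.
    r11 = sum(zp * zp for zp, _, _ in triples) / n - var_y
    r12 = sum(zp * wp for zp, wp, _ in triples) / n
    r22 = sum(wp * wp for _, wp, _ in triples) / n - var_u
    # Cross-covariance with the current output (unaffected by white noise).
    q1 = sum(zp * zc for zp, _, zc in triples) / n
    q2 = sum(wp * zc for _, wp, zc in triples) / n
    det = r11 * r22 - r12 * r12
    return (q1 * r22 - q2 * r12) / det, (q2 * r11 - q1 * r12) / det


if __name__ == "__main__":
    z, w = simulate(a=0.8, b=1.0, n=50000, std_y=0.5, std_u=0.5)
    print("plain least squares:", estimate(z, w))
    print("bias-compensated:   ", estimate(z, w, var_y=0.25, var_u=0.25))
```

Plain least squares treats the noisy regressors as exact, which pulls both estimates toward zero. Subtracting the noise variances removes that bias when the noise is white and its variances are known.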

## Crawler.NET: A component-based distributed framework for web traversal

**Abstract.** In web search engines, collecting source documents is an indispensable function to be performed periodically at high speeds. To this end, an extensible, component-based, loosely coupled distributed architecture for the .NET platform is presented that facilitates efficient parallel crawling. It combines the flexibility that is key to research with the scalability that is a must for high performance. The architecture comprises a lower layer that constitutes an execution environment and an upper layer that realizes a distributed crawler with a central coordinator.

## Prolog Server Pages Extensible Architecture: Providing web interface for Prolog applications

**Abstract.** The expressive power of Prolog enables its use as a general programming language. However, in the case of applications with web interfaces, the console-oriented, question–answer nature of the language is a considerable drawback. Even though the series of questions and answers may be embedded in a series of HTTP requests and responses, this approach, while acceptable for simpler cases, does not integrate into the web user interface model. First, the paper provides an overview of the Prolog language extensions for web programming that are at the user's disposal, in particular the PiLLoW library and foreign language interfaces. Based on these, it sketches a two-layered system architecture that allows constructing complex applications. On the one hand, the architecture supports accessing HTTP in a transparent manner integrated into the language, preserving state between requests and serving concurrent requests efficiently. On the other hand, the architecture provides an extensible template technology with an expression language and a structured JSP-like formalism, which encourages the separation of view and business logic. It aims to make it possible to implement applications with web interfaces purely in Prolog, thereby making it unnecessary to embed the application into a system architecture with support for the web. The HTTP protocol and low-level communication are hidden from the programmer by Prolog Web Container, which connects to the web server. It provides an environment for executing user programs and takes care of request handling, preserving state between requests and load balancing, thereby giving background support for the Prolog Server Pages technology. Prolog Server Pages, a simple yet extensible structural mechanism, makes it possible to unambiguously separate view and logic.
The paper concludes with a reference implementation in SWI-Prolog that realizes the presented system architecture, utilizing the multi-threading capabilities and built-in low-level external data access support of SWI-Prolog. A sample application is provided for demonstration.
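The separation of view and logic that a template technology with an expression language encourages can be illustrated in a few lines. The sketch below is in Python rather than Prolog, and its `${name}` placeholder syntax is a hypothetical stand-in, not the actual Prolog Server Pages syntax: the template holds only markup, while the values come from a model computed elsewhere.

```python
import html
import re

# Hypothetical placeholder syntax: ${name} is replaced by a model value.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render(template, model):
    """Fill placeholders from the model, escaping values for safe HTML output."""
    def substitute(match):
        return html.escape(str(model[match.group(1)]))
    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    view = "<p>Hello, ${user}! You have ${count} new messages.</p>"
    print(render(view, {"user": "Ann", "count": 3}))
```

Because the template never computes anything itself, the view can change without touching the code that produces the model.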

## Master's Thesis

**Abstract.** The Internet is one of today's primary information sources, yet information available through this medium is scattered and is not necessarily present in its original storage format. In fact, information is available as a set of interconnected documents, each with a possibly different terminology. Nevertheless, it is a typical task to return a set of documents that match a given query, exemplified by the widespread use of web search engines. Such retrieval of documents, however, is only possible through a periodic traversal of the Web.

For the traversal to be effective, that is, for frequently updated documents to be visited regularly, parallelization is indispensable. However, operating a distributed system that supports parallelization across multiple machines is a substantially more complex job than operating a centralized system. Other crucial aspects include easy tracing of job completion and a flexible architecture. It is essential that the system support monitoring how jobs are processed and that it adapt to various crawling tasks (e.g. traversal of specific domains, language-dependent behavior, reference structure analysis).

To meet these demands, the author presents an architecture comprising units that are loosely coupled by means of well-defined interfaces and communicate with one another by exchanging messages. Loose coupling allows units to run on different machines and caters for extension with new functionality. Communication between units is transparent regardless of whether it is in fact a local or a remote message exchange, and the way units are interconnected is declaratively defined in XML descriptors. In general, the services offered by an underlying framework greatly reduce the complexity that would otherwise be related to coordinating a distributed system. Unit development may subsequently focus on other aspects of web robot construction.
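The combination of message-passing units and declarative wiring can be sketched as follows. The descriptor format, the unit types and the canned page table are all hypothetical and written in Python for brevity; the actual framework targets .NET and its descriptor schema is not given here. Each unit only knows how to receive and emit messages, and the XML alone decides which unit feeds which.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical descriptor: each unit names its type and its downstream unit.
DESCRIPTOR = """
<crawler>
  <unit name="fetcher" type="Fetcher" next="parser"/>
  <unit name="parser" type="LinkParser" next="sink"/>
  <unit name="sink" type="Collector"/>
</crawler>
"""

# Canned pages stand in for real downloads in this sketch.
PAGES = {"http://example.org/": '<a href="http://example.org/a">a</a> '
                                '<a href="http://example.org/b">b</a>'}


class Unit:
    def __init__(self, name):
        self.name = name
        self.send = None  # set by the wiring; could equally be a remote proxy

    def emit(self, message):
        if self.send is not None:
            self.send(message)


class Fetcher(Unit):
    def receive(self, message):
        body = PAGES.get(message["url"], "")
        self.emit({"url": message["url"], "body": body})


class LinkParser(Unit):
    def receive(self, message):
        for link in re.findall(r'href="([^"]+)"', message["body"]):
            self.emit({"source": message["url"], "link": link})


class Collector(Unit):
    def __init__(self, name):
        super().__init__(name)
        self.received = []

    def receive(self, message):
        self.received.append(message)


UNIT_TYPES = {"Fetcher": Fetcher, "LinkParser": LinkParser, "Collector": Collector}


def build(descriptor):
    """Instantiate the units and connect them as the descriptor declares."""
    root = ET.fromstring(descriptor.strip())
    units = {el.get("name"): UNIT_TYPES[el.get("type")](el.get("name"))
             for el in root.findall("unit")}
    for el in root.findall("unit"):
        if el.get("next") is not None:
            units[el.get("name")].send = units[el.get("next")].receive
    return units
```

Since a unit talks to its successor only through `send`, replacing the local call with a network transport would not require changes to the unit itself, which is the transparency the paragraph above refers to.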