Portuguese Local E-government: A Study on the Most Adopted Databases in 2019

The selected database engine significantly impacts the performance of electronic services. This impact also affects governmental e-services. Based on several inquiries to all Portugal city halls, this study ranks the different available databases (commercial and non-commercial) in the year 2019.

cloud computing, the databases must cope with the large volumes of data efficiently, requiring good performance for both read and write operations (Han et al., 2011). Abadi, Boncz (Abadi et al., 2013) mention, in their study, that the performance of the database systems is directly related with the efficiency of the system to store data, and also its ability to move them to the CPU registers for processing. In all sectors, the databases have a critical role, especially in business, electronic commerce, engineering, health-care, genetics, education, library science and others (Elmasri, 2017). Sharing the beforementioned point of view, Abramova, Bernardino (Abramova et al., 2014) consider the databases the organization's vital point. The selection and decision of a database must follow strict approaches, methodologies and rules, which abide organizational growth without harming performance. Even though when data is structured, its size is small, and databases operate relatively well. The increase of data volume requires performance tuning and maintenance by specialized technicians, for instance, normalization and indexation (Sz´arnyas et al., 2018). This paper presents a study of the most used databases in the Portuguese public town halls, taking into account the main reasons to select the adopted databases. Conclusions of this study consisted of several quizzes sent to the IT managers responsible for the IT town halls, aiming to find more about how and why they used their databases. Presented results allow to study the quality factors and pondered reasons to acquire a database management system, in the scope of the Portuguese government. Additionally, it allows concluding what the most adopted database management systems are.
This document contains five sections. Section 2 presents the state of the art review. Section 3 describes how the quiz was performed, organized and distributed to obtain answers regarding the motivations to select and use specific software databases. Section 4 shows the obtained results. Finally, in Section 5, conclusions are drawn, followed by the introduction of future work guidelines.

State of the Art
Information systems comprehend several organizational activities which concern data; these activities include: acquiring data; storage; recovering; manipulation; and transmitting information to other users (Lopes et al., 2005). Almost all information systems include data storage, which is done internally using databases, relational or non-relational. These databases present themselves as essential for managing modern society's information (Abramova et al., 2014).

According
to Silberschatz, Korth (Silberschatz et al., 1997), database management systems differ significantly from storage systems and also from programming environments. A DBMS consists of a collection of programs and services that allow the user to insert, update, delete and query the stored data, therefore, making it easier to maintain and access the data (Sharma et al., 2017). It is essential to underline that relational databases were created as a technology to store structured data as tables, allowing to perform queries using SQL (Structured Query Language) (Fayech & Ounalli, 2012). For the authors, Viscusi, Batini (Viscusi et al., 2010), each DBMS represents a logical data schema, divided into classes that are related. At the same time, the authors mention that relational databases within governmental institutions are constantly in use not only in the central government but also at local municipalities, electronic stores, sensors, city halls and mobile devices. For all cases where databases fit, vast amounts of data are being produced with different data-rates (Fotache & Strimbei, 2015). As a proof of the importance of databases, according to a study in 1989, the number of estimated databases in the world was already five million (Frawley et al., 1992). The most known and utilized database model is the relational, developed in 1969 by E. F. Codd, who was working as a researcher at IBM. The model assumes a set of theories and predicates logic (Litwin, 2003). Despite the familiarization with relational databases, in the last decades, a large number of NoSQL databases were adopted in many large organizations (Okman et al., 2011). According to (Pore & Pawar, 2015), NoSQL databases were developed around 2000 to deal with the limitations of traditional SQL databases, especially concerning scalability, replication and the storage of non-structured data. For the author Vaish (Vaish, 2013), NoSQL is the terminology for the database class that does not follow the same principles of relational databases, concerning ACID (Atomicity, Consistency, Isolation and Durability) properties. NoSQL is equipped to manage with scalability vs. speed for enterprises such as Google, Facebook, Yahoo and Twitter. When comparing NoSQL with traditional relational databases, NoSQL databases are projected to store data with simple structures, or similar to the object-oriented programming languages (Strauch et al., 2011). In studies carried out by Bassil (Bassil, 2012), the author tested and compared between the database engines of SQL Server, Oracle, IBM DB2, MySQL and the MS Access. In this comparative study, the author concludes that IBM DB2 shows a better performance, despite consuming more memory. At the same time, MS Access uses less CPU. These conclusions are supported by other authors, such as Sing, Sharma (Singh et al., 2015) who also tested query performance. The 2019 ranking "DBengines" identifies 330 database engines. According to this ranking, in 2018 and 2019, the six most popular databases are Oracle, MySQL, SQL Server, PostgreSQL, MongoDB, and IBM DB2. Note that, among these top 6 databases, there are six relational databases, and only one nonrelational (MongoDB). The selection of a database engine has direct implications on the daily performance and organization of both enterprises and research (Wang & Santos, 2011). The authors Abbasi, S´a (Abbasi et al., 2018), in the context of smart-cities, consider the performance of database engines, test and suggest the use of Cassandra, which is a NoSQL oriented engine. Other authors, such as Liu and Luo (Liu & Luo 2010), use PostgreSQL in their project, and eGovMon in the field of eGovernment, justifying the choice with the resources that PostgreSQL provides warehousing data. Then authors Sourouni, Lampathaki (Sourouni et al., 2008) use SQL Server to implement ontology oriented to represent concept relations in the domain of eGovernment and the development of associated services. Authors claim that the decision relies on performance and scalability.
In turn, for Signore, Chesi (Signore et al., 2005), based on the eGovernment project citepl, Oracle DBMS was the choice. Baraka and Madoukh (Baraka & Madoukh, 2012) do not mention which DBMS engine is used to improve the Palestinian eGovernment, but suggest that the SOA oriented model surpasses the actual in-place model in performance, interoperability, flexibility, and management. Tsai, Choi (Tsai et al. 2009), for the implementation of a GIS web based project, decide to use Oracle or SQL Server for licensing reasons, aiming to reuse the organization DBMS's licenses and optimize costs.
The decision on which database to use is not consensual; it is not possible to say which is good or bad without analyzing the context first. The DBMS should be chosen very carefully, due to its cost and impact on the entire surrounding systems (Sibley, 1985). Based on all factors, this paper presents a study that allows assessing the most used DBMSs in the Portuguese government, taking into account the specificity of the city town halls.

Survey/Quiz
Based on a survey, the central objective of this study is to identify the most used DBMSs by the Portuguese city town halls. For Marconi and Lakatos (Marconi & Lakatos, 2004), a questionnaire is an instrument for collecting data with a set of ordered questions. These authors point out the negative aspect of having the questions and answers in a written format, without the presence of the interviewer. According to Gil (Gil, 2008), the survey is a technique based on a set of cross-questions to an individual with the aim to gather information regarding knowledge, beliefs, This study comprehends 308 Portuguese governmental city halls which were selected, aiming to maximize the number of answers and obtain a clear picture of the reality. Table 1 shows some statistical values collected from the obtained results regarding how many workers exist inside the inquired organizations.  Table 2 resumes the list of state-of-the-art DBMS most relevant features.
Additionally, an open field, "others", is also available so that participants can include new features.
After the generic database characteristics are selected, participants must identify which DBMS they have in the governmental institution where they are associated. The list with possible databases to select is set, presenting the thirty most used databases present in the dbengines.com (2018). Anyway, to include all After acquiring all the necessary data for the proposed study, with SPSS software, data was imported for analysis.

Results
The obtained data for this result analysis, results from a quiz to 308 Portuguese town halls, from which, 104 (33, 7%) replied, allowing a trust level of 95% with an error of 8%. This sample is considered a valid reality representation of the DBMSs utilization in the Portuguese central government institutions. From the sample, 11% of the participants are graduates, 18% have certifications, 3% have had higher education, 19% have a BSc, 15% have an MSc, and 3% have a PhD. Note that IT departments are gradually hiring better professionals with more academic formation. Moreover, most of the regional city councils have an average of 3 technicians. Finally, results show that larger cities have more significant IT departments. However, most of the participants have less than 50 thousand inhabitants in their city hall (Table 3).
Regarding the answers to the question "what are the most valued characteristics in a DBMS", 67% of the participants selected "recovery in case of disaster/crash". This identifies the important role that information plays in an organization. Understandably, the performance is the third most relevant characteristic with 57.5% (the response time is directly related to users' satisfaction). Compatibility and scalability got 50% of the votes. The price is the sixth identified characteristic, with 49% of the votes. Pleasantly, it is good to know that public institutions consider the cost of the tools. On the other side of the choices, characteristics such as NoSQL oriented, Global ranking, and memory oriented are the less selected options, with 0.9%, 1.8% and 1.8%, respectively. Table 4 shows all the results, with a variance of 87.72, and a standard deviation of 9.26. Table 5 shows the list of the most used databases. The gathered results present a variance of 39.67 and a standard deviation of 6.3. In general, the preferred database engine in this study is the SQL Server Express, which is present in 68.2% of the Portuguese town halls. Other results metrics show that participants evaluate the SQL Server Express with 3.51 points in a total of 5. It is important to highlight that the "price" is the characteristic with more relevance, showing an average of 4.17 points in 5.The characteristics of performance and security have an average evaluation of 3.4 points in 5. Performance and Technical Support have an evaluation of 3.5 points in 5.. Characteristics, easy to use and documentation are evaluated with 3.7 points in 5. In general, the features considered more relevant in this study when choosing the SQL Server Express, are the price, easy to install, and easy to use and integrate with an ERP software.
This study's second most used DBMS is IBM Informix, which is present in 65.4% of the inquired town halls. Technical assistance and security have 4.01 points in 5. Informix global evaluation is 3.94 points in 5, followed by the performance with 3.85 points in 5. Licensing price, easiness to use, maintenance, and performance stand between 3 and 4 points (in a total of 5). It is worth noting that most Informix users select this database based on ERP software that is widely used among the town halls that require this DBMS. As already mentioned before, technical assistance and licensing are essential factors when pondering a DBMS section.
The third DBMS in the list is MySQL, with 61% of the votes. Regarding the price for licensing, it has an evaluation of 4.14 points in 5. Easiness to use, maintenance and performance have an average evaluation of 3.9 points in 5. Despite having a proper punctuation, the worst detected characteristic with 3.22 points in 5, is the technical assistance. The fourth DBMS in the list is the SQL Server (commercial version) with 56.7%.

Conclusions
This research work reflects an initial study that intends to identify the most used DBMS by the Portuguese government in the city town halls. Simultaneously, it also recognized the reasons which lead to such selection/implementation, as well as the satisfaction level of each participant in the study.  In this study, based on a set of questions sent to 308 Portuguese town halls by email, 104 valid answers were collected, from which, the ranking in this study results. Each database system is evaluated regarding satisfaction, technical assistance, globally, price, documentation, easiness to use, maintenance, performance, security, and performance. In the same time, the characteristics that justify the inquired participants were also collected.
The raking of the most used DBMSs present in Table 5 show that the DBMS with more votes are, SQL Server Express, Informix, MySQL, SQL Server (commercial version), Access, and PostgreSQL. The lessused databases, last in rank, are DiSAM, PostGIS, Redis, and ElastichSearch. In general, it is possible to conclude that the database administrators across the studied institutions consider Technical Assistance, Security and Performance, as critical aspects.
Additionally, it is also visible that because of software requirements such as ERP's, specific DBMSs are required to be installed, obligating the municipalities to use/integrate such databases; Informix and SQL Server are examples of such cases.

Future Work
For future work, this study is going to be extended to understand the reasons that make users choose and acquire DBMSs. In line of this study's fulfillment, the quiz will be repeated annually, to understand the evolution of the Portuguese local government. Moreover, based on the gathered information, this study aims to help IT departments in the acquisition and selection of a DBMS. _____________________________________________________________________________________________________________