Marcin WYSKWARSKI and Dariusz ZDONEK

Silesian University of Technology in Gliwice, Poland

Abstract

The purpose of this work was to determine whether and how city halls in Poland use Twitter, and to identify the kinds of information they send through tweets. To this end, the Twitter profiles of city halls were identified, and the URLs and content of their tweets were acquired. A text mining analysis of the tweets was then performed. It included text pre-processing, creation of document corpora, creation of a document-term matrix, and application of classical data mining methods. Because of the number of cities, no requests concerning Twitter profiles were sent to city halls; the profiles were identified using the Google search engine. Only the content of the tweets and the time they were sent were analyzed.
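
The pre-processing and document-term matrix steps can be illustrated with a minimal sketch. It assumes each tweet is treated as one document; the stop-word list, the tokenization rule, and the function names `preprocess` and `document_term_matrix` are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list used in the study is not given in the text.
STOP_WORDS = {"the", "a", "in", "on", "at", "to", "of", "and"}


def preprocess(text):
    """Lowercase the text, drop URLs, and return letter-only tokens without stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    # [^\W\d_]+ matches runs of Unicode letters, so Polish diacritics are kept.
    tokens = re.findall(r"[^\W\d_]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def document_term_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(tokens).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix
```

Each row of the resulting matrix describes one tweet and each column one term, which is the form that frequency analysis and topic models usually take as input.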

The research established the number of users (city halls) and the number of tweets they published, the most frequently used hashtags, the number of tweets published per hour and per day of the week, and the number of hashtags, mentioned users, and links in tweets. Hidden, abstract topics describing the tweets were generated with the latent Dirichlet allocation (LDA) algorithm. The study found that text mining of tweet content helps determine which groups of information city halls provide. It was also determined whether they included links to external sources or hashtags in their tweets and whether they mentioned other users.
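
The descriptive counts listed above can be sketched as follows. This is an illustration only, assuming tweets are available as pairs of an ISO 8601 timestamp and the tweet text; the regular expressions and the names `tweet_features` and `summarize` are hypothetical simplifications, not the study's procedure.

```python
import re
from collections import Counter
from datetime import datetime

# Simplified patterns (assumptions); real tweet syntax has more edge cases.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
LINK = re.compile(r"https?://\S+")


def tweet_features(text):
    """Return the hashtags, mentioned users, and links found in one tweet."""
    links = LINK.findall(text)
    # Remove links first so URL fragments such as "#b" are not counted as hashtags.
    stripped = LINK.sub(" ", text)
    return {
        "hashtags": [h.lower() for h in HASHTAG.findall(stripped)],
        "mentions": MENTION.findall(stripped),
        "links": links,
    }


def summarize(tweets):
    """Count hashtags and tweets per hour and per weekday (0 = Monday)."""
    hashtags = Counter()
    per_hour = Counter()
    per_weekday = Counter()
    for timestamp, text in tweets:
        sent = datetime.fromisoformat(timestamp)
        per_hour[sent.hour] += 1
        per_weekday[sent.weekday()] += 1
        hashtags.update(tweet_features(text)["hashtags"])
    return hashtags, per_hour, per_weekday
```

Calling `hashtags.most_common(10)` on the first returned counter would give the ten most frequently used hashtags.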

Keywords: text mining, Twitter, public administration, LDA.

Introduction