In recent months, you may have participated in our Open Data Requirements Survey. Thank you very much for taking part. This document briefly reports the main findings of the survey. These findings are used to develop and further specify the requirements of the ENGAGE e-infrastructure for open data (see www.engagedata.eu). More results will be reported in a future journal paper. For further questions about this survey, you may contact Anneke Zuiderwijk, Delft University of Technology (a.m.g.zuiderwijk-vaneijk@tudelft.nl).
Results
In total, 307 persons started the survey and 151 persons completed it. Most respondents were actual open data users (84%) and some were potential open data users. The results reported below concern only the answers of actual users of open public sector data, because too few potential users answered the questionnaire to obtain valid results. The results include the answers of persons who finished the survey as well as persons who did not finish it.
Background of respondents
About three-quarters of the respondents who used open public sector data were men, and about three-quarters were between 26 and 50 years old. Most respondents work in the social sciences, mainly in political science, public administration, sociology and other social science domains.
Current use of open public sector data
The respondents mainly used social data (74%), geographic data (67%) and business data (45%). Most respondents use open data monthly or a few times per month, yearly or a few times per year, or weekly. Many different websites are used to gather open public sector data, but data.gov and data.gov.uk are used by many respondents.
Respondents were asked which purposes were important for their use of open public sector data. Most of the listed purposes were assessed as important or very important by the majority of the respondents. For instance, performing a statistical analysis and writing an academic publication were assessed as very important reasons to use open public sector data (by 44% and 42% of the respondents, respectively). News reporting and daily operations at work were viewed as less important.
User requirements
Respondents were asked to what extent they were able to perform a number of actions when they use open public sector data. The actions that were assessed as difficult by the majority of the respondents were: 1) discovering and browsing datasets across local, national and international datasets in their own language, 2) processing data by linking them to other data, 3) processing data by linking metadata, 4) providing feedback on the data by rating the data (e.g. rating the quality of the data), 5) providing feedback to the data producer by stating needs for open public sector data and 6) getting training on the use of open public sector data. The actions that were assessed as easy or very easy by the majority of the respondents were: 1) searching (e.g. searching for data by typing keywords in a search engine), 2) downloading open public sector data and 3) processing by analyzing the data. Other actions were assessed as neither easy nor difficult.
The answers to this question were compared with the actions that respondents assessed as useful. Actions assessed as very useful by the majority of the open data users include searching (e.g. searching for data by typing keywords in a search engine), searching by using an API, finding (getting the data you are looking for), finding by the use of metadata, finding linked publications and other linked material in which certain datasets are already used, discovering and browsing datasets across local, national and international datasets in their own language, downloading open public sector data, processing data, processing by linking the data, processing by linking metadata, processing by visualizing data in tables, maps and charts, and processing by analyzing the data. Actions that were assessed as useful include downloading supplementary open data (e.g. metadata), providing feedback to the data producer by stating needs for open data, uploading datasets, uploading processed, enhanced, extended, harmonised, anonymised, annotated and/or linked versions of existing datasets, viewing usage statistics and getting training on the use of open public sector data.
Comparing the difficult actions with the useful actions shows the most important user requirements for open public sector data, namely the actions that respondents find useful but difficult to perform: 1) discovering and browsing datasets across local, national and international datasets in their own language, 2) processing data by linking them to other data, 3) processing data by linking metadata, 4) providing feedback to the data producer by stating needs for open public sector data and 5) getting training on the use of open public sector data.
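The comparison described above can be read as a set intersection: the requirements are the actions that appear both in the "difficult" list and in the "useful or very useful" list. The following is a minimal sketch of that reading only. The short action labels are paraphrased from this report, and the set-based approach is an illustrative assumption, not necessarily how the analysis was actually carried out.

```python
# Short labels paraphrased from the report; they are illustrative only.
difficult = {
    "discover and browse datasets in own language",
    "process by linking data",
    "process by linking metadata",
    "rate the data",
    "state needs to data producer",
    "get training",
}

# Actions assessed as very useful or useful by the majority of users.
useful = {
    "search by keyword",
    "search by API",
    "find data",
    "find by metadata",
    "find linked publications",
    "discover and browse datasets in own language",
    "download data",
    "process data",
    "process by linking data",
    "process by linking metadata",
    "visualize data",
    "analyze data",
    "download supplementary data",
    "state needs to data producer",
    "upload datasets",
    "upload processed versions",
    "view usage statistics",
    "get training",
}

# Requirements: actions that are both difficult and useful.
requirements = sorted(difficult & useful)

for action in requirements:
    print(action)
```

Under this reading, rating the data drops out because it was assessed as difficult but does not appear among the useful actions, which leaves the five requirements listed above.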
Metadata
Approximately three-quarters of all open data users stated that they also used metadata (data about the data). The majority of the respondents stated that metadata always make reusing data easier, always make the interpretation of data easier, always make searching and browsing data easier and always make linking data easier. However, these benefits are often not obtained in practice, as respondents noticed several problems. Often there are insufficient metadata, which makes the data difficult to interpret. Often there is insufficient information about data quality and about how the data were gathered and measured. Often the metadata have no structure and are therefore difficult to search and browse. Over 70% of the respondents stated that they would like to have the following types of metadata when they use open data: description of the dataset, title of dataset, creator of dataset, country where the dataset was created, source of dataset, format of dataset, keywords/tags in dataset, geographical or spatial coverage of dataset, temporal coverage of dataset, linked datasets, data collection period and completeness of the dataset.
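As an illustration of how the requested metadata types could be used in practice, the sketch below checks which of them are missing from a dataset's metadata record. The field names, the `missing_metadata` helper and the example record are all hypothetical; they are not part of the survey or of the ENGAGE e-infrastructure.

```python
# Hypothetical field names for the metadata types requested by over 70%
# of the respondents; the names themselves are an assumption.
REQUESTED_FIELDS = (
    "description",
    "title",
    "creator",
    "country",
    "source",
    "format",
    "keywords",
    "spatial_coverage",
    "temporal_coverage",
    "linked_datasets",
    "collection_period",
    "completeness",
)


def missing_metadata(record):
    """Return the requested fields that are absent or empty in record."""
    return [field for field in REQUESTED_FIELDS if not record.get(field)]


# Hypothetical record with only three of the requested fields filled in.
example = {"title": "Road traffic counts", "format": "CSV", "country": "NL"}

print(missing_metadata(example))
```

A check like this makes the "insufficient metadata" problem visible per dataset: the longer the returned list, the harder the dataset is likely to be to interpret, search and link.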
