Transparency on the reporting of public procurement information: lessons from handling compiled procurement information

In this blog post, we summarise the key challenges affecting the transparency of public procurement information in the UK, including data quality issues such as a lack of unique identifiers, duplicated records, inconsistent dates, and missing data fields. We argue that improving data collection, quality, and availability in public procurement is important to support accountability and transparency and to inform policy reform. Finally, we will describe …

Analysis of two-mode networks

This blog post analyses a two-mode network of students’ enrolments in modules at the University. First, it shows how to visualise this two-mode network. Second, it demonstrates how to transform the network into a one-mode network to explore the similarities within each mode. The transformation is done using three methods: overlap count, Jaccard similarity and Simple …
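
The one-mode projection described in this excerpt can be sketched in Python. This is a minimal illustration under stated assumptions, not the post's own code: the enrolment data, the student and module names, and the helpers `overlap_count`, `jaccard` and `project` are all hypothetical. Only the overlap count and Jaccard similarity are shown, because the name of the third method is cut off in the excerpt.

```python
from itertools import combinations

# Hypothetical two-mode data: each student maps to the set of modules they take.
enrolments = {
    "student_a": {"stats", "networks", "python"},
    "student_b": {"stats", "networks"},
    "student_c": {"python", "ethics"},
}

def overlap_count(a, b):
    """Number of modules shared by two students."""
    return len(a & b)

def jaccard(a, b):
    """Shared modules divided by all modules taken by either student."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def project(two_mode, weight):
    """One-mode projection: weight every pair of nodes, keep non-zero ties."""
    edges = {}
    for u, v in combinations(sorted(two_mode), 2):
        w = weight(two_mode[u], two_mode[v])
        if w > 0:
            edges[(u, v)] = w
    return edges

# Student-by-student networks under the two similarity measures.
overlap_edges = project(enrolments, overlap_count)
jaccard_edges = project(enrolments, jaccard)
```

To project onto the other mode (modules linked by shared students), the same `project` helper could be applied to an inverted mapping from each module to the set of students enrolled in it.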

Article review: Exploring crime patterns in Mexico City

By Maria Fernanda Ibarra Gutiérrez

Big Data analysis is a research approach of growing importance for studying many aspects of society, as we live surrounded by governmental and private systems, technological devices and social media platforms that gather information from our daily activities, choices, purchases, searches, health patterns and other digital touchpoints. Therefore, there is a large amount of data suitable for …

Introduction to scatter plot

By Maria Fernanda Ibarra Gutiérrez

This blog post looks at how scatter plots can be used to visualise multiple sets of data and the relationships between several variables. It takes a data set and covers dealing with outliers, formatting the graphs for clarity, using bubbles to show a third variable, adding regression models and trend lines to the plots, and splitting the data into separate …

SDG Indicator Filtering Function

Handling missing values is a common task in the data preparation step before analysis. In R, missing values are represented by NA, and R provides many functions for dealing with them. Since we would like to cluster the SDG indicators later, it is highly recommended to construct a filtering function to guarantee that there are no NA values in the filtered data …