Transparency on the reporting of public procurement information: lessons from handling compiled procurement information

In this blog post, we will summarise the key challenges affecting the transparency of public procurement information in the UK, including data quality issues such as a lack of unique identifiers, duplicated records, inconsistent dates, and missing data fields. We argue that improving data collection, quality, and availability in public procurement is important to support accountability and transparency and to inform policy reform. Finally, we will describe …

Ball mapper over bank’s customers.

In this blog post, I will show an R application of a Topological Data Analysis tool called Ball Mapper (BM) to visualise how the bank’s customers who stayed with or exited the bank are distributed across the joint distribution of the customers’ characteristics. BM is a useful tool for visualising datasets with multiple dimensions. To do so, BM summarises points that are close to each …

Article review: Generalized measures for the evaluation of Community Detection methods.

In this blog post, I will summarise an article that proposes modified versions of three community detection assessment measures (Purity, Adjusted Rand Index and Normalized Mutual Information). The modified measures include network topological information to assess misclassification errors according to nodes’ integration into the network. This article was published in 2015 in the International Journal of Social Network Mining by Vincent Labatut (Labatut, 2015). …

Article review. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients.

In this post, I will summarise an article that proposes a redefinition of the clustering coefficients for two-mode networks. The new definition aims to solve some problems that arise when the clustering coefficient defined for one-mode networks is applied to projected two-mode networks. This article was published in 2013 in the journal Social Networks (Elsevier) by Tore Opsahl (Opsahl, 2013). The author introduced the article by explaining some …

Article review: The scales of human mobility.

In this blog post, I will summarise an article that proposes a new approach to modelling human mobility. This article was published in 2020 in the journal Nature by Laura Alessandretti, Ulf Aslak and Sune Lehmann (Alessandretti et al., 2020). The authors started the article by explaining that human mobility is key to understanding other phenomena such as people’s commuting flows, money’s …

Article review. Analyzing and Modeling Real-World Phenomena with Complex Networks: A Survey of Applications

This blog post reviews a survey of the applications of complex networks to real-world problems. In particular, six applications related to Social Networks, Economy, and Security and Surveillance will be summarised. This article was published in the journal Advances in Physics in 2008 by Luciano da Fontoura Costa, Osvaldo N. Oliveira Jr., Gonzalo Travieso, Francisco Aparecido Rodrigues, Paulino Ribeiro Villas Boas, Lucas Antiqueira, Matheus Palhares …

Analysis of two-mode networks

In this blog post, I will analyse a two-mode network of students’ enrolments in modules at the University. Firstly, I will show how to visualise this two-mode network. Secondly, I will demonstrate how to transform this network into a one-mode network to explore the similarities within each mode. The latter will be done using three methods: Overlaps count, Jaccard Similarity and Simple …
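As a rough illustration of the first two methods named in this excerpt, the sketch below projects a tiny two-mode network onto one mode in Python. The enrolment data, the `project` helper and the variable names are hypothetical and are not taken from the full post; it is a minimal sketch of overlap counting and Jaccard similarity, not the post's own implementation.

```python
from itertools import combinations

# Hypothetical enrolment data: student -> set of modules (illustrative only).
enrolments = {
    "s1": {"maths", "stats"},
    "s2": {"maths", "stats", "coding"},
    "s3": {"coding"},
}


def project(two_mode):
    """Return {(a, b): (overlap, jaccard)} for every pair of first-mode nodes."""
    weights = {}
    for a, b in combinations(sorted(two_mode), 2):
        shared = two_mode[a] & two_mode[b]   # modules both students take
        union = two_mode[a] | two_mode[b]    # modules either student takes
        overlap = len(shared)                # overlap count
        jaccard = overlap / len(union) if union else 0.0  # |A ∩ B| / |A ∪ B|
        weights[(a, b)] = (overlap, jaccard)
    return weights


# One-mode (student-to-student) network with both weights on each pair.
student_network = project(enrolments)
```

The overlap count grows with the number of modules a student takes, whereas the Jaccard similarity normalises by the size of the union, so the two weights can rank the same pairs differently.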

Article review: Modeling complex systems with adaptive networks

In this post, I will review an article that used adaptive networks to model complex systems in some real-world problems. This article was published in 2013 in the journal Computers & Mathematics with Applications (Elsevier) by Hiroki Sayama, Irene Pestov, Jeffrey Schmidt, Benjamin James Bush, Chun Wong, Junichi Yamanoi and Thilo Gross (Sayama et al., 2013). This article aimed to introduce fundamental concepts and properties of adaptive networks through a …

Forecast analysis with Random Forest for house property sales data.

In this blog post, I will perform a House Property Sales forecast using a Random Forest technique with a Linear Regression and a Time Series. Two datasets were used to build these models: the Raw data, with 29,580 observations of recorded sales from 2007 to 2019, and the MA data, with 347 observations of the Moving Average of Median Price grouped by quarterly intervals per property type …

Dataset: House Property Sales. Exploratory analysis.

By Maria Fernanda Ibarra Gutiérrez and Thu Trang Dinh

In this blog post, we will describe the House Property Sales database, which can be downloaded from: https://www.kaggle.com/htagholdings/property-sales?select=raw_sales.csv According to the first Figure, this database describes some characteristics of the property sales in 5 variables and 29,580 observations from the 7th of February 2007 to the 26th of July 2019. This database does not have …

The impact of Covid-19 in World’s Economy

By Maria Fernanda Ibarra Gutiérrez

The Coronavirus disease (Covid-19) is a worldwide health problem that, according to the World Health Organization (WHO), has spread to 213 countries. Up to the 13th of April 2020, there were 1,807,308 cases around the world according to the Our World in Data database (Ritchie, 2020). At the current moment, the United States has the highest number of cases …

Article review: Exploring crime patterns in Mexico City

By Maria Fernanda Ibarra Gutiérrez

Big Data analysis is a research approach that has been growing in importance for studying several aspects of society, as we live surrounded by governmental and private systems, technological devices and social media platforms that gather information from our daily activities, choices, purchases, searches, health patterns and other digital touchpoints. Therefore, there is a large amount of data suitable for …

Introduction to scatter plot

By Maria Fernanda Ibarra Gutiérrez

This blog looks at the ways in which scatter plots can be used to visualise multiple sets of data and the relationships between several variables. It takes a data set and deals with outliers, formatting the graphs for clarity, using bubbles to show a third variable, adding regression models and trend to the plots and splitting the data into separate …