Advances in technology over the years have made it possible to collect and store ever more data from many different sources. It has become clear that this mounting volume of data could yield valuable information if it were analyzed appropriately.
One challenge in collecting big data is its sheer volume, which makes the data difficult to store and retrieve without adequate technological resources. Another is the quality of the data itself, as much of it has been collected without proper editing and control. As a result, large amounts of “dirty data” are in circulation. The veracity of the data is therefore of prime importance.