Data Transformation
A tried and trusted data transformation methodology
Significant effort is required to improve the original raw
financial data so that it is accessible, relevant and of value to
the general public. Data transformation for spotlightonspend
uses the same tried and trusted methodology that Spikes Cavell uses to convert raw financial data from
a public body’s financial management systems into actionable
intelligence that supports its efforts to transform the way it
procures goods & services.
Cleanse
A validated data extract is processed using a specially
developed engine to standardise the data, remove duplicates,
identify and fix errors and prepare the file for subsequent
processing. We routinely trap more than 200 common data errors in
raw data extracts uploaded for processing – unresolved, these
errors would result in the provision of data to the public that is
inaccurate and potentially misleading.
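As an illustration only, the standardise and de-duplicate steps described above might be sketched as follows. The field names (`supplier`, `date`, `amount`) and the rules applied are assumptions made for this example; they are not the actual engine or its 200+ error checks.

```python
import re

def standardise(record):
    """Normalise the supplier name and parse the amount (assumed field names)."""
    # Collapse repeated whitespace and upper-case the name so variants compare equal.
    supplier = re.sub(r"\s+", " ", record["supplier"]).strip().upper()
    # Strip thousands separators and a currency symbol before parsing the amount.
    amount = round(float(str(record["amount"]).replace(",", "").replace("£", "")), 2)
    return {"supplier": supplier, "date": record["date"].strip(), "amount": amount}

def cleanse(records):
    """Standardise each record and drop exact duplicates, preserving order."""
    seen = set()
    cleaned = []
    for record in records:
        row = standardise(record)
        key = (row["supplier"], row["date"], row["amount"])
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned

# Hypothetical raw extract: the first two rows are the same payment, keyed differently.
raw = [
    {"supplier": "Acme  Ltd ", "date": "2010-04-01", "amount": "1,250.00"},
    {"supplier": "ACME LTD", "date": "2010-04-01", "amount": "£1250"},
    {"supplier": "Bolt & Co", "date": "2010-04-02", "amount": "99.5"},
]
cleaned = cleanse(raw)
```

In this sketch duplicates are only detected after standardisation, which is why the two differently formatted Acme rows collapse into one.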
Redact
To minimise the risk of inadvertent breach of Data Protection
legislation, the cleansed and standardised raw data is further
processed to identify payments made to individuals (for example
foster carers and vulnerable adults in local government) and payments
whose disclosure might compromise national security, personal
security or foreign relations.
Our redaction algorithms are sophisticated and leverage unique
reference databases and rule-sets designed to ensure that bona fide
payments are not inadvertently obscured. Once a redaction candidate
has been identified and validated by a data analyst, any
identifying information is overwritten to ensure that the recipient
of the payment cannot be identified.
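The final overwrite step can be pictured with a minimal sketch. It assumes each record carries two hypothetical flags, `redaction_candidate` (set by the algorithms) and `analyst_validated` (set by a data analyst), and that the identifying fields are `supplier` and `description`; none of these names come from the actual system.

```python
REDACTED = "REDACTED"

def redact(record, identifying_fields=("supplier", "description")):
    """Return a copy of the record, overwriting identifying fields only when
    the record is both a redaction candidate and validated by an analyst."""
    if not (record.get("redaction_candidate") and record.get("analyst_validated")):
        return dict(record)
    redacted = dict(record)
    for field in identifying_fields:
        if field in redacted:
            redacted[field] = REDACTED
    return redacted

# Hypothetical records: only the first has been confirmed for redaction.
confirmed = {"supplier": "J SMITH", "description": "Foster care allowance",
             "amount": 400.0, "redaction_candidate": True, "analyst_validated": True}
unconfirmed = {"supplier": "SMITH & SONS LTD", "description": "Roofing repairs",
               "amount": 900.0, "redaction_candidate": True, "analyst_validated": False}
confirmed_out = redact(confirmed)
unconfirmed_out = redact(unconfirmed)
```

Requiring both flags mirrors the text: a candidate that has not been validated is left untouched, so a bona fide supplier payment is not obscured.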
Classify
Public bodies’ financial management systems are broadly
similar, but when it comes to delivering meaningful visibility of
spending on goods & services there are significant differences
between them that make meaningful like-for-like
comparisons impossible (for example by department, cost centre or
subjective).
To overcome the absence of uniform and reliable classification,
a sophisticated matching & inference engine is used to match
the supplier record or item description to our unique reference
datasets and allocate the supplier to a primary category derived
from our universal taxonomy (called ‘The V Code’).
The V Code is a hierarchical classification system,
specific to the public sector, that facilitates the classification
of suppliers of goods & services to enable the comparison of
spend data across the public sector so that, for example, a local
authority could be meaningfully compared to a hospital trust.
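To show why a shared hierarchy makes such comparisons possible, here is a small sketch. The category names, their two-tier depth and the supplier mapping are invented for illustration; the real V Code labels and structure are not given here.

```python
from collections import defaultdict

# Hypothetical category paths (top tier first); not actual V Code entries.
SUPPLIER_CATEGORY = {
    "ACME CATERING": ("Catering", "Contract Catering"),
    "BOLT BUILDERS": ("Construction", "Building Repairs"),
    "FRESH FOODS": ("Catering", "Food Supplies"),
}

def spend_by_level(payments, level=1):
    """Sum spend per category, truncating each category path to `level` tiers."""
    totals = defaultdict(float)
    for supplier, amount in payments:
        path = SUPPLIER_CATEGORY.get(supplier, ("Unclassified",))
        totals[path[:level]] += amount
    return dict(totals)

# Hypothetical payments for two different kinds of public body.
council = [("ACME CATERING", 1000.0), ("BOLT BUILDERS", 500.0)]
hospital = [("FRESH FOODS", 300.0), ("ACME CATERING", 200.0)]
council_totals = spend_by_level(council)
hospital_totals = spend_by_level(hospital)
```

Because both bodies' suppliers map onto the same tree, their totals can be compared at any tier, regardless of how each body codes spend internally.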
Classifications covering 97% of spend by value are
validated by classification experts. This classification and
validation effort underpins the 'Spend by Category' analysis in
spotlightonspend.
spotlightonspend includes tools that allow you to easily
reallocate suppliers to a different category where that category
might better reflect what you purchase from them.
Enrich
Our matching & inference engine is also used to match the
supplier record to our unique reference datasets and append a range
of attributes to each record. Standard attributes include: the
Number of Employees, Annual Revenue (Actual or Modelled), Date of
Incorporation (Birth Year), Geographic Location and Risk
Classification (Modelled). Several of these attributes are used to
deliver 'Spend in Summary' in spotlightonspend.
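The append step in enrichment can be sketched as a simple lookup. The reference table, its keys and its values below are hypothetical, and the exact-name match stands in for the matching & inference engine, whose logic is not described here.

```python
# Hypothetical reference data keyed by standardised supplier name.
REFERENCE = {
    "ACME LTD": {"employees": 120, "birth_year": 1998, "location": "Leeds"},
}

def enrich(record, reference=REFERENCE):
    """Return a copy of the record with reference attributes appended.
    Unmatched suppliers are returned unchanged apart from the `matched` flag."""
    attributes = reference.get(record["supplier"], {})
    return {**record, **attributes, "matched": bool(attributes)}

enriched = enrich({"supplier": "ACME LTD", "amount": 1250.0})
unmatched = enrich({"supplier": "BOLT & CO", "amount": 99.5})
```

Keeping a `matched` flag on every record makes it easy to see how much of the spend has reference attributes attached.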