Data Management
Data management is the practice of collecting, keeping, and using data securely, efficiently, and cost-effectively. The goal of data management is to help people, organizations, and connected things optimize the use of data within the bounds of policy and regulation so that they can make decisions and take actions that maximize the benefit to the organization. A robust data management strategy is becoming more important than ever as organizations increasingly rely on intangible assets to create value.
Managing digital data in an organization involves a broad range of tasks, policies, procedures, and practices. The work of data management has a wide scope, covering factors such as how to:
Create, access, and update data across a diverse data tier
Store data across multiple clouds and on premises
Provide high availability and disaster recovery
Use data in a growing variety of apps, analytics, and algorithms
Ensure data privacy and security
Archive and destroy data in accordance with retention schedules and compliance requirements
A formal data management strategy addresses the activity of users and administrators, the capabilities of data management technologies, the demands of regulatory requirements, and the needs of the organization to obtain value from its data.
Types of Data Management
Data management is a broad discipline, and a strong data management strategy typically implements the following components to streamline operations throughout an organization:
Data processing
Within this stage of the data management lifecycle, raw data is ingested from a range of data sources, such as web APIs, mobile apps, Internet of Things (IoT) devices, forms, surveys, and more. It is then usually processed or loaded using data integration techniques such as extract, transform, load (ETL) or extract, load, transform (ELT). While ETL has historically been the standard method to integrate and organize data across different datasets, ELT has been growing in popularity with the emergence of cloud data platforms and the increasing demand for real-time data. Regardless of the data integration technique used, the data is usually filtered, merged, or aggregated during the data processing stage to meet the requirements of its intended purpose, which can range from a business intelligence dashboard to a predictive machine learning algorithm.
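The filter, merge, and aggregate steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the record fields (order_id, customer_id, amount, status, region) and the transform function are hypothetical names chosen for the example, and the in-memory lists stand in for data ingested from real sources.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for data ingested from an API and a form.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 40.0, "status": "complete"},
    {"order_id": 2, "customer_id": "c2", "amount": 15.5, "status": "cancelled"},
    {"order_id": 3, "customer_id": "c1", "amount": 10.0, "status": "complete"},
]
customers = [
    {"customer_id": "c1", "region": "EMEA"},
    {"customer_id": "c2", "region": "APAC"},
]


def transform(orders, customers):
    """Filter, merge, and aggregate raw records into revenue per region."""
    # Filter: keep only completed orders.
    completed = [o for o in orders if o["status"] == "complete"]

    # Merge: look up each order's region through its customer record.
    region_by_customer = {c["customer_id"]: c["region"] for c in customers}

    # Aggregate: sum order amounts per region.
    totals = defaultdict(float)
    for order in completed:
        region = region_by_customer.get(order["customer_id"], "unknown")
        totals[region] += order["amount"]
    return dict(totals)


revenue_by_region = transform(orders, customers)
print(revenue_by_region)
```

In an ETL flow, a step like transform would run before the result is loaded into the target store; in an ELT flow, the raw records would be loaded first and an equivalent transformation would run inside the target platform.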
Data storage
While data can be stored before or after data processing, the type of data and its purpose will usually dictate which storage repository is used. For example, data warehousing requires a defined schema to meet specific data analytics requirements for data outputs, such as dashboards, data visualizations, and other business intelligence tasks. These data requirements are usually directed and documented by business users in partnership with data engineers, who will ultimately execute against the defined data model. The underlying structure of a data warehouse is typically organized as a relational system (i.e., in a structured data format), sourcing data from transactional databases. However, other storage systems, such as data lakes, incorporate data from both relational and non-relational systems, becoming a sandbox for innovative data projects. Data lakes benefit data scientists in particular, as they allow them to incorporate both structured and unstructured data into their data science projects.
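The contrast between a schema-defined warehouse and a more permissive data lake can be illustrated with a small sketch. It assumes an in-memory SQLite database as a stand-in for a warehouse and a plain Python list as a stand-in for a lake; the table, column, and function names are hypothetical, and real platforms differ considerably in scale and features.

```python
import json
import sqlite3

# Warehouse-style storage: the schema is defined before any data is loaded.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        region  TEXT NOT NULL,
        amount  REAL NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO sales (sale_id, region, amount) VALUES (?, ?, ?)",
    [(1, "EMEA", 40.0), (2, "EMEA", 10.0), (3, "APAC", 15.5)],
)

# A row that violates the schema (missing region) is rejected at write time.
try:
    conn.execute("INSERT INTO sales (sale_id, region, amount) VALUES (4, NULL, 5.0)")
    schema_enforced = False
except sqlite3.IntegrityError:
    schema_enforced = True

# Each result row is a (region, total) pair, so it converts directly to a dict.
warehouse_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

# Lake-style storage: records are kept as they arrive, structured or not.
lake = [
    json.dumps({"type": "sale", "region": "EMEA", "amount": 40.0}),
    json.dumps({"type": "review", "stars": 5, "text": "Great service"}),
    "Free-form support chat transcript",
]


def read_sales(raw_records):
    """Apply a schema at read time: keep only records that parse as sales."""
    sales = []
    for raw in raw_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unstructured text stays in the lake, just not in this view
        if isinstance(record, dict) and record.get("type") == "sale":
            sales.append(record)
    return sales


lake_sales = read_sales(lake)
```

The warehouse side rejects data that does not fit the agreed model, which suits fixed reporting outputs. The lake side accepts everything and defers interpretation to whoever reads it, which suits exploratory work.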
Data governance
Data governance is a set of standards and business processes that ensure that data assets are leveraged effectively within an organization. This generally includes processes around data quality, data access, usability, and data security. For instance, data governance councils tend to align on taxonomies to ensure that metadata is added consistently across various data sources. This taxonomy should also be documented in a data catalog to make data more accessible to users, facilitating data democratization across the organization. Data governance teams also help to define roles and responsibilities to ensure that data access is provided appropriately; this is particularly important for maintaining data privacy.
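As a rough illustration of how a shared taxonomy and role definitions might work together, the sketch below models a tiny data catalog in Python. Everything in it is an assumption made for the example: the sensitivity labels, the role names, the CatalogEntry class, and the can_access function are hypothetical and do not describe any particular governance product.

```python
from dataclasses import dataclass

# Hypothetical taxonomy agreed on by a governance council.
ALLOWED_SENSITIVITY = {"public", "internal", "restricted"}

# Hypothetical mapping from role to the sensitivity levels that role may read.
ROLE_CLEARANCE = {
    "analyst": {"public", "internal"},
    "steward": {"public", "internal", "restricted"},
}


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata recorded for one dataset in the catalog."""

    name: str
    owner: str
    sensitivity: str

    def __post_init__(self):
        # Enforce the taxonomy so metadata stays consistent across sources.
        if self.sensitivity not in ALLOWED_SENSITIVITY:
            raise ValueError(f"unknown sensitivity label: {self.sensitivity!r}")


def can_access(role, entry):
    """Return True if the role is cleared for the entry's sensitivity level."""
    return entry.sensitivity in ROLE_CLEARANCE.get(role, set())


catalog = [
    CatalogEntry(name="web_traffic", owner="marketing", sensitivity="internal"),
    CatalogEntry(name="customer_pii", owner="crm", sensitivity="restricted"),
]

# List the datasets an analyst is allowed to see.
visible_to_analyst = [e.name for e in catalog if can_access("analyst", e)]
print(visible_to_analyst)
```

Validating labels when an entry is created keeps the taxonomy consistent, and an unknown role defaults to no access rather than full access.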
Data security
Data security puts guardrails in place to protect digital information from unauthorized access, corruption, or theft. As digital technology becomes an increasing part of our lives, more scrutiny is placed upon the security practices of modern businesses to ensure that customer data is protected from cybercriminals and from incidents that require disaster recovery. While data loss can be devastating to any business, data breaches, in particular, can carry costly consequences from both a financial and brand standpoint. Data security teams can better secure their data by leveraging encryption and data masking within their data security strategy.
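Data masking, one of the two techniques mentioned above, can be sketched with the Python standard library alone. The functions mask_email and mask_card are hypothetical helpers written for this example, and the masking rules (keep the first character of an email's local part, keep the last four digits of a card number) are assumptions rather than a standard. Encryption is not shown here, since it is better handled by a vetted cryptography library than by hand-written code.

```python
def mask_email(email):
    """Hide all but the first character of the local part of an email address."""
    local, sep, domain = email.partition("@")
    if not sep or not local or not domain:
        raise ValueError("not a valid email address")
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def mask_card(card_number, visible=4):
    """Replace all but the last `visible` digits of a card number with asterisks."""
    digits = [ch for ch in card_number if ch.isdigit()]
    if len(digits) <= visible:
        raise ValueError("card number too short to mask")
    return "*" * (len(digits) - visible) + "".join(digits[-visible:])


# Hypothetical customer record with sensitive fields.
record = {"email": "jane.doe@example.com", "card": "4111 1111 1111 1111"}

# Build a masked copy for use in reports or test environments.
masked = {
    "email": mask_email(record["email"]),
    "card": mask_card(record["card"]),
}
print(masked)
```

Masked copies like this let analysts and testers work with realistic-looking records without exposing the original values, while the unmasked source stays behind stricter access controls.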