
Use our data glossary to master the terms of the data world

What does running a data-driven business mean on a practical level? Why do we need data pipelines? What is the difference between a data warehouse and a data lake?

Roosa-Maria Säntti / May 27, 2020

Successful data usage in a company is based on effective communication and mutual understanding between the teams working with the data. However, you can easily get lost in the jungle of terminology if the terms and definitions aren't clear to all parties involved.

To help you out, we put together a data glossary related to the topics in our podcast so you can comfortably take a deep dive into the fascinating world of data. Below each term, you’ll find a link to the podcast episode, where we discuss the topic more with experts from well-known companies.

Master these data terms with our data glossary:

  1. API Monetization
  2. APIs and API Management
  3. Data and analytics platforms (data cloud)
  4. Data architecture
  5. Data ecosystem
  6. Data governance
  7. Data lifecycle
  8. Data management in manufacturing (installed base)
  9. Data pipelines
  10. Data-driven business transformation
  11. Data-driven organization
  12. DataOps
  13. Industry 4.0
  14. Predictive analytics
  15. The difference between a data warehouse and a data lake
  16. The difference between artificial intelligence and analytics

1. API Monetization

API Monetization is the process by which businesses create revenue from their existing data and APIs. Monetization allows API providers to move beyond their current business models, scale API programs, and create more opportunities for value creation for their customers, developers, and partners.

Learn more by listening to our Tietoa tulevasta podcast, which takes a deep dive into the world of data (in Finnish).

2. APIs and API Management

An API (application programming interface) allows parties to exchange data or initiate transactions in a system or service. An API can also be seen as a contract between the API provider and API consumers: consumers receive the service (data or transactions) as promised and documented. APIs enable faster development of new business services, improve efficiency, and decrease the costs of the integration landscape. They also bring more agility and flexibility to the IT landscape.

There are three types of APIs: private/internal APIs, customer/partner APIs, and public/open APIs. Internal APIs are used, for example, for internal application integrations, integrating microservices, and cloud integrations. Customer APIs enable new real-time services for customers, while partner APIs allow more agile information sharing and new business models with partners. Open APIs bring entirely new business possibilities for companies, such as third-party web and mobile applications.
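The "contract" aspect can be illustrated with a minimal sketch (all names and data here are hypothetical, not from any real system): an internal API endpoint promises a documented request and response shape, and consumers rely only on that promise, not on the implementation behind it.

```python
import json

def get_customer(customer_id: int) -> str:
    """Hypothetical internal API endpoint.

    Contract: given a customer id, return a JSON object with
    'id' and 'name' fields. Consumers depend only on this
    documented shape, not on how the data is produced.
    """
    # In a real service this would query a database or another system.
    customers = {1: "Acme Oy", 2: "Example Ltd"}
    name = customers.get(customer_id, "unknown")
    return json.dumps({"id": customer_id, "name": name})

# A consumer uses only the documented contract.
response = json.loads(get_customer(1))
print(response["name"])  # Acme Oy
```

Because the consumer depends only on the documented shape, the provider can later change the implementation (for example, swap the in-memory dictionary for a database query) without breaking any consumer.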

API Management platforms ensure centralized and smooth control over the full lifecycle of APIs.

API Management also ensures the security of APIs and provides the ability to optimize, monitor, and analyze them.

Learn more by listening to our Tietoa tulevasta podcast, which takes a deep dive into the world of data (in Finnish).

3. Data and Analytics Platforms (Data Cloud)

A modern, cloud-based data and analytics platform combines traditional reporting with modern analytics and data science services. It provides a platform, for example, for data-based applications that use artificial intelligence.

The platform uses all forms of data from within the organization, from partners, and from external parties. The data is processed almost in real time into various data products, allowing an up-to-date view of the organization's situation. In addition to basic information, these products can include predictions produced by machine learning algorithms.

The platform consists of modern public cloud services. The services' license models and technical features are flexible, so the service's cost is calculated according to usage. Moreover, the services can be sliced for different user groups, which can be very useful: for example, running a heavy analytical process does not interfere with normal operational data processes. Thus, an annual process run doesn't require reserving resources for a whole year.

Learn more by listening to our Tietoa tulevasta podcast (in Finnish).

4. Data Architecture

Data architecture is a part of the overall architecture and can refer to several perspectives. It often relates to the artifacts of data architecture on multiple abstraction levels, such as data models, definitions, and descriptions of information flows and metadata.

With the artifacts, a system project's data processing can be designed and implemented to support data reuse, quality, data security, and privacy, as well as to meet business requirements across functional silos.

Learn more about this topic by listening to our Tietoa Tulevasta podcast, which takes a deep dive into the world of data (in Finnish).

5. Data Ecosystem

A data ecosystem is an open or closed network whose actors exchange data with each other according to common rules, such as shared interfaces and data models. The members of an ecosystem have one thing in common: they all benefit from the data so much that it's worth sharing their own data with the network. The exchange can also involve monetary compensation.

A data ecosystem shares a vision of enabling more diverse data and solutions than a single actor could achieve alone. An ecosystem can have an owner, in which case it is about the dominance and benefit of one actor. Alternatively, ecosystem ownership can be decentralized to the members, making all ecosystem actors equal.

In some cases, a data ecosystem has a separate operator that handles communication between the actors and the data transfers without utilizing the data in its own operations.

Being involved in a suitable data ecosystem or owning one can, at best, be a significant competitive advantage.

Read or listen to the third episode of our Tietoa Tulevasta podcast to learn about a data ecosystem in international collaboration.

6. Data Governance

Basically, data governance is about data ownership. The owners of a company's business units, equipment, and properties manage the usage of those assets and strive to maximize their business benefits. The same should apply to company-owned data sets.

The owner of a data set is responsible for ensuring the data is of good quality and making sure the user rights comply with the set rules. Thus, corporate data governance should define the policies and tools for data owners and other users of data.

Data governance includes the idea of enabling access and visibility into the data for as many employees as possible—across organizational units. Data access should only be restricted for good reasons, such as privacy.

To comprehensively use the data and develop a data-driven business, the organization needs to have an existing and implemented data governance model. If the model doesn't exist yet, it's good to start from a data set that has the most business value and is prioritized by the organization's top management.

Often, the quickest results happen when the starting point is an analytics development project that generates a significant business advantage.

Learn more by reading or listening to the fourth episode of Tietoa Tulevasta podcast, which discusses how data is utilized in the financial world.

7. Data Lifecycle (Data Lineage)

A data lifecycle refers to the different stages of data elements and data resources from the creation of information to its destruction. The stages can include storing, warehousing, transferring, using, and archiving the information.

Due to data security and privacy requirements, it is important to set business requirements for the end of the data lifecycle as well. Those requirements can include rules, such as how long the information can/should be stored and why.

Metadata management systems visualize the data transfers between various systems and describe how the data transforms from the source to its users. Data lineage refers to the visualization of the data lifecycle.

Learn more by listening to our Tietoa tulevasta podcast, which takes a deep dive into the world of data (in Finnish).

8. Manufacturing Data Management (installed base)

The business of manufacturing companies depends on building equipment that is either sold or rented to a customer. Such companies collect plenty of information about their business operations, including sales (what has been sold and to whom), components used in production, equipment usage, and information about maintenance.

If this data is managed properly, the life cycle of devices can be accurately modeled. This enables the production of various services, such as financing solutions based on the use of equipment, proactive maintenance, and the sale of spare parts.

Without high-quality data, however, digitalizing the business is impossible.

Combining and enriching data from different basic systems and making predictions based on that data enables the automation of service processes related to all equipment delivered to customers.

Read more about the topic or listen to the second episode of our Tietoa Tulevasta podcast, which explores data management and digitization of a globally operating manufacturing company. The article and podcast are in Finnish.

9. Data Pipelines

A data pipeline is a controlled function for data processing and data product creation that brings business value. A data product can be, for example, a report or a prediction produced by a machine learning algorithm that’s used via an interface.

The data pipeline includes and combines several components. The components cover data source reading, editing, analyzing, storing the data in different data models, and activating the data through the processed data product. The components are based on a micro-service model, which means that individual components may have different developers and life cycles.
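The stages above can be sketched as a minimal toy pipeline (the data and function names are invented for illustration): a source is read, the data is transformed, stored in a target data model, and finally activated as a data product. In a microservice setup, each stage could be a separately developed and deployed component.

```python
def read_source():
    # Stand-in for reading from a source system.
    return [{"product": "A", "sales": 120}, {"product": "B", "sales": 80}]

def transform(rows):
    # Editing/analyzing: add a derived field (share of total sales).
    total = sum(r["sales"] for r in rows)
    return [{**r, "share": r["sales"] / total} for r in rows]

def store(rows, warehouse):
    # Storing in a target data model (here, a dict keyed by product).
    for r in rows:
        warehouse[r["product"]] = r
    return warehouse

def activate(warehouse):
    # The data product: a small report consumed via an interface.
    return {p: round(r["share"], 2) for p, r in warehouse.items()}

warehouse = store(transform(read_source()), {})
report = activate(warehouse)
print(report)  # {'A': 0.6, 'B': 0.4}
```

Because each stage has a clear input and output, the stages can evolve independently, which is exactly what the microservice model of the pipeline components enables.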

Learn more by listening to our Tietoa tulevasta podcast, which takes a deep dive into the world of data (in Finnish).

10. Data-driven Business Transformation

A business transformation aims for fundamental changes in a business or its processes. A data-driven business transformation uses data and analytics to enable those changes.

At the moment, organizations either use little or no data in addition to traditional financial reporting or use it only in certain operations. The new data-driven way of thinking harnesses data to improve business, management, and service production processes across the organization.

A data-driven business transformation means not only deploying the technology but also developing data availability, data quality, procedures, and a data-driven culture.

To learn more, read or listen to our Tietoa Tulevasta podcast. Both the article and podcast are in Finnish.

11. Data-driven Organization

A data-driven approach means that an organization makes decisions based on information. Making data-driven decisions requires reliable and accessible data. Having the technology and systems is not enough—success also takes people and cultural change. The data-driven approach creates new opportunities: used correctly, your data will not only streamline your operations but also improve results, give you a competitive advantage, and create new business opportunities.

Read more about data-driven businesses and listen to our first episode of Tietoa Tulevasta podcast, which explores the results of fruitful cooperation between business and information management. The article and podcast episodes are in Finnish.

12. DataOps

DataOps (data operations) refers to an operating model that uses various personnel roles and technologies to manage data pipelines automatically and to support data-driven business development.

Companies understand the value of data better than before, but commercializing business data for judicious use requires collaboration between business processes and organizations. As it's important to be able to quickly produce value-adding wholes (data products) from business data, this collaboration requires a new kind of approach. The goal of DataOps is to meet that need.

In practice, DataOps means an interdisciplinary team formed around a business problem. To produce the information needed, the team organizes the data, tools, code, and development environments while taking care of scalability, functionality, and changes in data pipelines. Following the principles of continuous delivery, the team strives to generate information from source data to support business operations.

Learn more by listening to our Tietoa tulevasta podcast (in Finnish).

13. Industry 4.0

Industry 4.0 is a vision of an advanced industry that leverages ecosystems, the industrial Internet, modern technologies, and new business models. The vision is based on the digital transformation of traditional manufacturing and production methods. It is driven by the explosive growth of intelligence and compatibility of machines and devices, as well as rapidly evolving technologies such as digital production chains, robotics, sensors, 3D printing, augmented reality, digital twins, Big Data platforms, artificial intelligence, and machine learning.

Cyber-physical systems are at the core of Industry 4.0. They describe intelligent, interconnected industrial production and logistics units that can communicate with each other, and operate and adapt independently in versatile conditions. The operation of such systems also requires and produces a lot of data, and using this data requires analyzing and processing it with the help of artificial intelligence and machine learning.

Proactive integration, information transparency, and transmission between companies, customers, and products are thus the key to harnessing the benefits of technological development. Hence, data-driven thinking, analytics, data ecosystems, and data management will play an even more significant role in business in the future.

Learn more by listening to our Tietoa tulevasta podcast (in Finnish).

14. Predictive Analytics

Machine learning and statistical methods allow us to model future events based on previous data. Such modeling is called predictive analytics. Typical applications include, for example, predicting customer churn, forecasting financial figures, and anticipating machinery maintenance needs.

Modeling that derives new information from existing data also falls under predictive analytics. Sentiment analysis is a good example: the tone of customer feedback is assessed automatically, enabling an immediate reaction to negative comments.

Predictive analytics is usually distinguished from descriptive analytics: instead of reporting the current situation based on available information, predictive analytics derives new information.
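As an illustrative sketch (using a simple least-squares fit rather than any particular production method, with invented numbers), past observations can be used to fit a trend and predict the next value:

```python
def fit_line(ys):
    """Fit y = a + b*x to equally spaced observations via least squares."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
        sum((x - x_mean) ** 2 for x in xs)
    a = y_mean - b * x_mean
    return a, b

# Hypothetical monthly maintenance costs; predict next month.
history = [100, 110, 120, 130]
a, b = fit_line(history)
prediction = a + b * len(history)
print(prediction)  # 140.0
```

Real predictive analytics replaces this toy trend line with richer models (regression with many variables, gradient boosting, neural networks), but the principle is the same: learn a pattern from past data and apply it to the future.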

Learn more by listening to our Tietoa tulevasta podcast (in Finnish).

15. The Differences Between a Data Warehouse and a Data Lake

A data warehouse supports the organization's traditional core functions and obtains answers to defined questions from known source data. A data lake supports a more predictive and experimental approach.

A data warehouse is mainly for structural information processing. A data lake enables the processing of all kinds of data in the organization. As the data warehouse and data lake are used for different purposes, they complement each other.

A data lake is often used together with a data warehouse to store all the raw data, and only an applicable part of it is transmitted to the data warehouse. Recently, we've seen new solutions on the market that combine a data lake and a data warehouse. Such a hybrid solution doesn't have a well-established term yet.
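This division of labor can be sketched as a toy illustration (the records and schema here are invented): the lake keeps all raw records as they arrive, while only a cleaned, structured subset is loaded into the warehouse's fixed schema.

```python
# Raw events of mixed shape land in the "lake" untouched.
data_lake = [
    {"type": "sale", "item": "A", "amount": 120},
    {"type": "log", "message": "sensor ping"},   # unstructured event
    {"type": "sale", "item": "B", "amount": 80},
]

def load_warehouse(lake):
    # Only a defined, structured subset is transformed into
    # the warehouse's fixed schema for known reporting questions.
    return [
        {"item": r["item"], "amount": r["amount"]}
        for r in lake
        if r["type"] == "sale"
    ]

warehouse = load_warehouse(data_lake)
print(warehouse)
# [{'item': 'A', 'amount': 120}, {'item': 'B', 'amount': 80}]
```

The log event stays available in the lake for exploratory or experimental analysis, even though it never enters the warehouse.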

Read or listen to the fifth episode of Tietoa tulevasta podcast, which explores the potential of technology and digitalization in the healthcare industry.

16. The Difference Between Artificial Intelligence and Analytics

Artificial intelligence is an umbrella term for solutions that are regarded as intelligent. Search engines, smart speakers, and self-driving cars are examples of artificial intelligence. It’s often associated with system autonomy and independence from human decision-making. Analytics, on the other hand, refers to data-based reporting and visualization produced for human decision-making.

The information obtained with artificial intelligence—which often means machine learning—can be utilized with analytics. At the same time, analytics on available data is often used to develop artificial intelligence. For example, we can find out what people want from a smart speaker and how the product meets customers' needs. Artificial intelligence-based decision-making systems likewise require accurate analysis of financial figures.

Compared to analytics, artificial intelligence takes many steps further towards independent data use.

Learn more by listening to our Tietoa tulevasta podcast (in Finnish).

Are you hungry to learn more?

Subscribe to our Tietoa Tulevasta podcast on Soundcloud and follow us on Instagram to stay tuned. Our podcast brings data glossary terms to life with practical examples from everyday business life. Once your business needs new technology to reach your goals, TietoEVRY’s in-depth expertise is at your service.

Roosa-Maria Säntti
Head of Cloud, Data and Insight

Roosa believes that data can truly change the world for the better and benefit organizations, society, and the environment. She has been supporting customers in building foundations for their data-driven transformation programs by helping to set up technical environments and sharpen data strategies.

