What is unstructured data? | NTT DATA

Thu, 28 July 2022

What is unstructured data?

Every time we use an electronic device, we generate data, which is classed as either structured or unstructured depending on its type. The importance of each type lies not in what it is or in its volume, but in what it reflects: trends, attitudes, conflicts and so on. How organisations analyse this data and turn their conclusions into strategic decisions is therefore of real importance.

The volume of data we are capturing is constantly increasing. The research firm Gartner estimates that unstructured data now accounts for 80 to 90% of all new data captured by businesses, and that it is growing three times faster than structured data. 

Structured data vs. unstructured data 

Structured data conforms to a specific database format. It is usually quantitative and may be generated by people or by algorithms. It allows information to be organised simply, and it is classified into categories such as names, credit card numbers or telephone numbers.

The other type of data is unstructured: it is stored in its native format until it is needed. In other words, although it has an internal structure, it is not predefined by a data model. This information is usually more qualitative, and it can be reused as needed because it does not have to meet any specific format requirements.

It is also very varied. Examples of unstructured data include social media data, surveillance data, geospatial data, audio data and meteorological data, as well as reports, invoices, records, emails and files from productivity applications.
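To make the contrast concrete, here is a minimal Python sketch. It assumes a hypothetical customer record as the structured side and an invented email body as the unstructured side. The field names, the sample text and the regular expressions are illustrative assumptions, not part of any real system.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerRecord:
    """Structured data: every field is predefined by the data model."""
    name: str
    phone: str


# Unstructured data: free text kept in its native format (invented sample).
EMAIL_BODY = (
    "Hello, this is Ana Ruiz. I was charged twice for my policy last month. "
    "Please call me on 555-0142 when you have a moment."
)


def extract_record(text: str) -> Optional[CustomerRecord]:
    """Pull the predefined fields out of free text, or return None if absent."""
    # Hypothetical patterns: a two-word capitalised name and a 3-4 digit phone.
    name = re.search(r"this is ([A-Z][a-z]+ [A-Z][a-z]+)", text)
    phone = re.search(r"\b(\d{3}-\d{4})\b", text)
    if name is None or phone is None:
        return None
    return CustomerRecord(name=name.group(1), phone=phone.group(1))


record = extract_record(EMAIL_BODY)
print(record)
```

The email carries far more information than the two extracted fields (the complaint itself, its tone, its urgency), which is exactly what a fixed schema cannot hold and what the processing techniques discussed in this article aim to capture.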

Although most organisations are unaware of the value of this data, those that analyse it will gain an advantage over their competitors. It is a major opportunity to obtain results and to improve the customer or user experience. The question is: how do we obtain those results?

How is unstructured data processed? 

The sheer volume of unstructured data means that automated collection and ingestion techniques must be used, so that the data can then be converted into forms suitable for efficient automated processing. This processing is based on two techniques:

  • The first is Optical Character Recognition (OCR), the process of digitizing text. It automatically identifies the symbols or characters of an alphabet in an image and then stores them as data.
  • The second technique is Natural Language Processing (NLP). This is a branch of Artificial Intelligence (AI) that analyses written and spoken content, interprets its meaning and predicts what is likely to follow it.
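The idea of predicting "what is likely to follow" can be illustrated with a toy bigram model in Python: it counts which word most often comes after another in a small corpus. This is a deliberately simple stand-in and far cruder than the models real NLP systems use. The corpus and the function names are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Invented sample corpus, e.g. text previously recovered from documents by OCR.
CORPUS = [
    "the claim was approved",
    "the claim was rejected",
    "the claim was approved quickly",
]

model = train_bigrams(CORPUS)
print(predict_next(model, "was"))
```

In this corpus "approved" follows "was" twice and "rejected" once, so the model suggests "approved". Production systems replace the raw counts with statistical or neural language models, but the principle of learning from observed text is the same.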

And how is unstructured data transformed into valuable data?

The value of unstructured data depends on how it is processed. A key requirement is an AI-powered unstructured data platform that automatically classifies a wide range of document types, and which analyses and extracts the most important information quickly and with a very high level of accuracy.  
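As a rough illustration of the classify-then-extract flow, the following Python sketch uses simple keyword matching in place of the AI models such a platform would rely on. The document types, keyword sets, sample text and the EUR amount pattern are all hypothetical.

```python
import re

# Hypothetical keyword vocabularies, one per document type.
KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "vat"},
    "claim": {"claim", "policy", "incident", "damage"},
    "contract": {"agreement", "party", "terms", "signature"},
}


def classify(text):
    """Return the document type whose keywords overlap most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {label: len(words & vocab) for label, vocab in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_amount(text):
    """Extract an amount written as '<number> EUR', or None if not found."""
    match = re.search(r"(\d+(?:\.\d{2})?)\s*EUR", text)
    return float(match.group(1)) if match else None


# Invented sample document.
DOC = "Invoice 2022-117: total amount due 1250.00 EUR, VAT included."

print(classify(DOC), extract_amount(DOC))
```

A rule-based approach like this breaks down quickly as document layouts and wording vary, which is why the text stresses AI-powered classification for handling a wide range of document types accurately.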

From that point on, automating repetitive manual processes enhances performance, reduces human error and enables employees to concentrate on tasks that add value.

Any sector can benefit from the management and analysis of unstructured data. Take, for example, the following industries:


Health

In the health sector, using information from unstructured documents is a major change that will improve the quality of care and support innovation and analysis in clinical processes.


Insurance

Another sector that may benefit significantly from the analysis and management of unstructured data is insurance. In addition to extracting all the details from the information that is generated every day, insurers can identify and resolve problems quickly. For example, travel insurance claims can be systematized to support decisions about services and to focus on the customer, offering more flexible and market-oriented processes.


Banking

There are also areas, such as banking, whose internal processes require an enormous amount of information to be analysed. Practical examples of procedures a bank might need range from analysing a person's employment record to looking for information in a public deed. A tool capable of extracting information about clients is particularly important in these cases.

Basic Services 

And finally, in basic services, the use of unstructured data means that customers' and users' aspirations, needs and desires can be identified in order to develop new services or improve existing ones, in addition to streamlining all internal personnel procedures. 

Dolffia is NTT DATA's AI-based document processing platform, which is capable of understanding, learning and suggesting actions to unlock information from any unstructured information source. It is a SaaS solution that offers speed (dozens of documents can be processed in the time it takes a person to read just one), adaptability (it can extract information from a wide range of structures), greater precision (it classifies documents with an accuracy rate of 95%), a return on investment (it significantly reduces operating costs and optimises payroll costs), the elimination of risks (arising from lost or incorrectly processed documents) and agility (during setup). It can also be easily integrated into any other documentation generation tool, including ERP, CRM, CMD and RPA.

Dolffia is a quantum leap in data management, offering increasingly improved customer/user experiences, thanks to informed decision-making that provides greater profitability and eliminates inefficiencies in processes. 
