Data Science

What it is

The Research Studio Data Science (DSc) is an R&D initiative for innovation and applied research that accelerates the development, deployment and adoption of Data Analytics and Data Science technology and related innovations.
The Research Studio Data Science applies scientifically rigorous data analytics to obtain business-relevant insights from large amounts of heterogeneous data. It uses, integrates and extends commonly used open source software for big data analytics and builds combinations of these software components.

Why is applied Data Science R&D needed?

To remain competitive in increasingly complex and global markets, and to deal with increasingly complex system issues, it is becoming indispensable for companies and actors in all sectors of the economy and society to take advantage of the key insights that can be gained by analysing the data they routinely collect and, where feasible, relating it to other data that is publicly available (open data) or can be readily obtained from providers (national statistics). Enterprises and public administrations are already using such insights to lower costs and improve their overall efficiency.

There are, however, several hurdles to analysing this data effectively.

The data is often stored in different formats on different systems, most of it in an unstructured form, and much of it of poor quality. But even once the data has been integrated into usable forms, it is necessary to ask the right questions of the data to obtain key insights.

Any data set can contain spurious structures and correlations that are artefacts of the way the data has been collected or processed, and decisions based on such artefacts should be avoided. Furthermore, even though there may be plenty of data available, only a small part of it may be pertinent to a desired analysis; processing all data simply because it exists is wasted effort.
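The point about spurious correlations can be illustrated with a small simulation. The sketch below is purely illustrative and is not part of the Research Studio's tooling: the function names, the number of variables, the number of observations and the random seed are all assumptions chosen for the example. It generates columns of mutually independent random numbers and reports the strongest correlation found between any two of them, which tends to be far from zero even though no real relationship exists.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_variables=100, n_observations=30, seed=0):
    """Return the largest |r| among pairs of mutually independent variables.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each column is drawn independently, so any correlation is pure chance.
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(n_observations)]
        for _ in range(n_variables)
    ]
    return max(
        abs(pearson(a, b)) for a, b in itertools.combinations(columns, 2)
    )


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation between unrelated variables: {r:.2f}")
```

With many variables and few observations, thousands of pairs are compared, so some pair will look strongly related by chance alone. This is one reason why background knowledge about how the data was collected matters before acting on a correlation.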

Finally, the data may simply not contain the necessary information: the desire to extract particular insights from a collection of data does not guarantee that these insights actually exist in the data! For these reasons, the characteristics of the data and the background knowledge about the data must be taken into account. A scientifically rigorous approach must be applied to investigate and analyse the data so as to arrive at results that are plausible, applicable to the real world and of benefit to business.

What the Research Studio DSc can do

The Research Studio Data Science builds on the following competences:

  • Advanced research on improving methodologies and tools for Data Science for specific business and social domains
  • Designing scientifically rigorous approaches for obtaining clear and plausible insights from data analytics
  • Developing innovative Data Science-based solutions to company and societal problems implemented in prototypes
  • Consulting on using Data Science to gain insights based on the analysis of existing data and the collection of new data


How does the Research Studio DSc do it?

Methodological approaches will be applied in the following areas:

  • Data Management: data cleaning, pre-processing, validation and curation
  • Data Integration: structured and unstructured data, multimodal data, data selection
  • Data Analytics: information retrieval, data mining, semantic analysis, natural language processing, machine learning, modelling
  • Data Interaction: visualisation, visual analytics, data exploration, decision support
  • High-Performance Computing: scalability of developed solutions

In order to provide Data Science solutions in an optimal way, the Research Studio Data Science is building up a multi-disciplinary team consisting of researchers specialised in statistics, data management, data mining, information retrieval, machine learning, natural language processing, visual analytics and high-performance computing, along with software developers.

For whom is it? 

The Research Studio Data Science will work together with and for companies and public administrations to develop innovative solutions for gaining maximum value from data. Main areas of expertise include telecommunications, banking and insurance, intelligent manufacturing, and health.

The solutions will be implemented as prototypes to enable efficient transfer to industry. Through research contracts or through licensing, data analytics solutions will be transferred to companies to be integrated into their business processes and information technology systems.

