semantic rationalization blog series: part 1 – philosophy and approach

Over the next few weeks, I’ll be posting a series of entries here to answer the question we are asked most often at expressor: “What is semantic rationalization?” It’s obviously a big differentiator for us – but the phrase doesn’t readily reveal its meaning. So we hope you find this helpful.

In this first installment, I’ll discuss the philosophy and approach behind our vision for semantic rationalization, then dive into more detail in subsequent entries.

Semantic rationalization, from the expressor point of view, consists of the mechanisms required to construct a business abstraction layer through which multiple user roles can contribute to delivering and maintaining a data integration application. Our branded marketing term for this is “smart semantics.” This is a very different concept from that employed by earlier-generation ETL tools – and it is fundamental to how we intend to make data integration simpler.

To understand our approach we need to first understand the business and technical goals that expressor is targeting and then dissect the functionality involved in building the solution. Our primary business goal is to make data integration much more affordable. Most people we talk to already agree that our revolutionary pricing model – which is based on the business value delivered and the hardware it runs on – has achieved that objective.

Our fundamental technical goals are to make data integration significantly easier and to allow more individuals with differing business experience to participate in the data integration process. An analogy here is the way that technology addressed the data volume explosion by including parallel processing in ETL tools. Just as pipeline parallelism allows multiple processors to work independently on their specific tasks while each contributes to the final product, allowing multiple participants to contribute their individual expertise in parallel during the development of a data integration application helps them build better applications more quickly.

If we look at the functionality involved in delivering a data integration application and where the current ETL / data integration technologies tend to bottleneck, we see that most often the ETL developer is the limiting resource in the process. So it’s not surprising that vendors like us are focused on increasing the developer’s ability to deliver solutions faster – very much like hardware and chip manufacturers are focused on developing faster processors. This is a good thing – improving the developer’s productivity is very important since these folks are usually very expensive and good ones are in limited supply. But it’s not the whole story.

Using another technology analogy, if we look at how processors were sped up, we find that inside the typical CPU there are a number of specialized units performing a myriad of support tasks for the instruction processor. There are units which decode addresses, pre-fetch instructions and data, predict branches, etc. – all of which are vital to improving the CPU’s overall performance.

The data integration process can be improved in the same way. We can envision having someone who is responsible for the data (a data steward role) defining the business name of a particular data item, as well as how that data is used in the business. This concept is remarkably similar to the idea of master data, and just as with master data management, a common mistake is assuming that all data is critical and needs to be very tightly managed – resulting in a massive effort before a project can begin.

We believe that a much better approach is to allow this ontology to grow organically over time. A key requirement for turning this approach into reality is the ability to change previous decisions and correct mistakes easily. This is actually a much more complex requirement than one might imagine. It brings up questions like “what happens to data integration decisions previously made when the definition of how a datum is used in the business changes?” Clearly a sophisticated impact analysis mechanism is required, along with a well-defined scope for implementing any updates necessitated by the change.
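To make the impact analysis idea a little more concrete, here is a minimal sketch in Python. It is purely illustrative and is not how expressor implements anything: the `SemanticCatalog` class, its method names, and the term, rule, mapping, and job names are all hypothetical. The assumption is simply that each business term, rule, and job records what depends on it, so that a change to one definition can be traced to everything downstream.

```python
from collections import defaultdict, deque


class SemanticCatalog:
    """Toy catalog (hypothetical) linking business terms to the artifacts that use them."""

    def __init__(self):
        # name of a term or artifact -> set of artifacts that depend on it
        self._dependents = defaultdict(set)

    def add_dependency(self, upstream, downstream):
        """Record that `downstream` relies on the definition of `upstream`."""
        self._dependents[upstream].add(downstream)

    def impacted_by(self, changed):
        """Return every artifact reachable from `changed`, in breadth-first order."""
        seen = set()
        order = []
        queue = deque([changed])
        while queue:
            node = queue.popleft()
            # sorted() keeps the result deterministic for a given catalog
            for dep in sorted(self._dependents.get(node, ())):
                if dep not in seen and dep != changed:
                    seen.add(dep)
                    order.append(dep)
                    queue.append(dep)
        return order


# Hypothetical example: one business term used by a rule and a mapping,
# both of which feed the same job.
catalog = SemanticCatalog()
catalog.add_dependency("customer_id", "rule:dedupe_customers")
catalog.add_dependency("customer_id", "mapping:crm_to_warehouse")
catalog.add_dependency("rule:dedupe_customers", "job:nightly_load")
catalog.add_dependency("mapping:crm_to_warehouse", "job:nightly_load")

impacted = catalog.impacted_by("customer_id")
print(impacted)
```

Here, changing the definition of `customer_id` flags the rule, the mapping, and the job that consumes them, each exactly once. A real mechanism would also need the "well-defined scope" mentioned above, for example limiting the walk to a single project or environment, which this sketch leaves out.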

On that point, let me wrap up this entry. Next time, I’ll discuss the creation of business rules and look at the connection between semantic rationalization and the semantic Web.

- Michael Ruland, field engineering
