December 3rd, 2009 by Ron Mulderij

Every year when autumn comes, the assistants in the sales department get a little nervous. They know what is about to happen: Christmas is coming, and the selections of contacts who will receive a Christmas card have to be made.
Every year it’s the same. First the selections for every account manager are made, and then the account managers have to check manually whether they are correct. This year will be no different, which means that:
- relevant companies and contacts are missing
- new companies and contact persons will be added
- contact persons will be deleted
- contact persons will be transferred to their new company
- addresses turn out to be out of date
Tags: address standardization, christmas cards, CRM, CRM-system, customer view, marketing, process management, validation
Posted in Data Quality | No Comments »
November 10th, 2009 by Ron Mulderij

With the rise of modern technology, more and more of our payments are processed electronically. Banks try to reduce costs and push their customers to carry out payments themselves. Internet banking has become the standard. Customers can no longer hand in written transfer orders at their bank, but have to make transfers through internet banking. It is easy to make a typing error in an account number and still end up with an existing account number. The risk lies entirely with the customer. Although banks are always willing to help customers get their money back, it is better to avoid these errors in the first place.
In my opinion, banks should be obliged to perform a name-number-check for every payment, or at least for every larger amount.
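To make the idea of a fault-tolerant name-number-check concrete, here is a minimal sketch. It is not how any bank actually does it: the `ACCOUNTS` registry, the function names and the 0.8 threshold are assumptions for illustration, and `difflib.SequenceMatcher` from the Python standard library stands in for a real matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical registry: account number -> registered holder name.
ACCOUNTS = {
    "123456789": "J. de Vries",
    "123456798": "M. Jansen",
}


def normalize(name: str) -> str:
    """Lower-case the name, drop full stops and collapse whitespace."""
    return " ".join(name.lower().replace(".", " ").split())


def name_number_check(account: str, payee: str, threshold: float = 0.8) -> bool:
    """Return True if the payee name is close enough to the account holder."""
    holder = ACCOUNTS.get(account)
    if holder is None:
        return False  # unknown account number
    ratio = SequenceMatcher(None, normalize(holder), normalize(payee)).ratio()
    return ratio >= threshold


# A small typo in the name is tolerated ...
print(name_number_check("123456789", "J. de Vreis"))
# ... but two swapped digits in the account number lead to a different holder.
print(name_number_check("123456798", "J. de Vries"))
```

The point of the sketch is the second call: the mistyped number still exists, but the name no longer fits, so the transfer can be stopped before the money leaves.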
Tags: fault-tolerant matching, money transfer, name-number-check, risk management, typing errors
Posted in Data Quality | 1 Comment »
October 28th, 2009 by Holger Wandt

The internet is on the verge of one of the most fundamental changes in its history. The Internet Corporation for Assigned Names and Numbers (ICANN) is expected to agree on the use of internet addresses in non-Latin characters during this week’s ICANN convention in Seoul. If all goes according to plan, it will be possible to use Greek, Cyrillic, Arabic, Chinese, Korean and many other characters in the internet browser’s address bar. More than half of the 1.6 billion internet users in the world use a character set that is not Latin. ICANN therefore expects that the number of non-Latin domain names, and thus the number of new internet users, will increase rapidly.
This far-reaching change in the use of the internet is based on a system that can “translate” or “convert” different writing systems (sometimes with different writing directions, e.g. Arabic and Hebrew). On a high level, it would look a little like this, I would imagine:
| عربي | 中文 | English | 日本語 | Deutsch | Français | Español | Русский | Português | 한국어 | Italiano |
| AR | ZH | EN | JA | DE | FR | ES | RU | PT | KO | IT |
Naturally, this phenomenon raises questions concerning the matching of internet addresses. Is ووو.هُمَنِنفِرِرِنسِ.كُم the same as www.humaninference.com? It appears that generic multilingual data matching issues also apply in this particular case. How do we handle these comparisons? For a couple of thoughts, please read this…
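As a small illustration of why this is a matching problem, here is a sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The helper names `to_ascii` and `same_dns_name` and the `example` domains are assumptions for illustration only. The sketch shows that two spellings are the same internet address only if they convert to the same ASCII form; a transliteration is a different name as far as the domain name system is concerned.

```python
def to_ascii(domain: str) -> str:
    """Return the ASCII (Punycode) form of a domain name, lower-cased."""
    # The built-in "idna" codec converts each non-ASCII label to "xn--..." form.
    return domain.encode("idna").decode("ascii").lower()


def same_dns_name(a: str, b: str) -> bool:
    """True only if both spellings resolve to the same ASCII domain name."""
    return to_ascii(a) == to_ascii(b)


print(to_ascii("bücher.example"))
# The Unicode and Punycode spellings are the same address ...
print(same_dns_name("bücher.example", "xn--bcher-kva.example"))
# ... but a transliterated spelling is a different address altogether.
print(same_dns_name("bücher.example", "buecher.example"))
```

Deciding whether two such addresses belong to the same organisation therefore has to happen on top of this technical conversion, with transliteration and matching knowledge.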
Tags: ICANN, international domain names, internet address, matching, Seoul, transliteration
Posted in Data Quality | No Comments »
October 21st, 2009 by Holger Wandt

On 28 January 2010 the next Human Inference Data Quality Summit will be held in the Evoluon in Eindhoven (NL). The theme – Value your data, value your future – is inspired by the idea that investments in data quality have become part of standard business and that vision, strategy and solutions are being synchronized with these investments. As data quality has reached a certain level of maturity, it is time to have an in-depth look at the (near) future of Data Quality.
The program is challenging, comprehensive and entertaining. Keynote speakers include Ted Friedman (vice president Gartner Research), Mathias Klier (professor at the University of Innsbruck) and Sabine Palinckx (CEO Human Inference). Additionally, the break-out sessions will address a wide variety of theme-related topics: maximising the business value of information, guiding a DQ project through migration, data quality maturity, marketing effectiveness and many more. In short, the Data Quality Summit is not to be missed!
Save the date and register by clicking this link!
Tags: data quality maturity, Data Quality Summit, DQS 2010, gartner, Human Inference, integration, migration, Sabine Palinckx, Ted Friedman
Posted in Data Quality | No Comments »
September 7th, 2009 by Holger Wandt

A major bank in Dongguan (China) refused a potential customer because his name is Li Jun. Apparently, there were already over 300 bank accounts assigned to the name Li Jun. Not that this particular Li Jun was responsible for opening all these accounts; there were just too many men with exactly the same name. The bank states that the refusal is nothing personal, since nobody with the name Li Jun will be accepted as a customer in the near future. In the meantime, Li Jun is taking legal action against the bank.
Tags: Banks, Chinese characters, customer view, Data Quality, deduplication, interpretation, knowledge, single customer view
Posted in Data Governance, Data Quality | No Comments »
August 24th, 2009 by Holger Wandt

The more a company knows about its customers’ wishes, needs and habits, and the better that company is able to tailor its proposition accordingly, the greater the value it will eventually provide for its customers. We all know that there are countless examples where defective, fragmented, or just plain poor customer data cause unnecessary costs, a decrease in revenue, employee dissatisfaction or frustration, damage to the corporate image and many other undesirable or painful consequences.
Customer data quality and integration problems impact every area of the value chain of organisations. Far too often, companies have multiple views of their customers. Customer Data Integration (or MDM for Customer Data) is the key to providing companies with a single view of their customer.
Tags: cdi, customer view, data processes, identification, intelligent matching, MDM for customer data
Posted in Data Quality, MDM for customer data | No Comments »
August 21st, 2009 by Ramon de Noronha

The term Golden Record is closely related to Customer Data Integration or MDM for Customer Data. It refers to the “single truth” which has been created or calculated from all those duplicate customer records from different systems. This post is not about finding or tagging all those duplicate records. There are all kinds of ways to find them, using advanced statistical methods, fuzzy matching etc.
But what do you do once you have found the duplicates? How do you create the best possible customer data out of all the gathered elements?
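One simple way to picture the question is a field-by-field survivorship rule. The sketch below is only an illustration under stated assumptions: the field names, the `updated` date and the rule itself (take the most complete value per field, and prefer the most recently updated record on a tie) are invented for this example, and all field values are assumed to be strings.

```python
from datetime import date


def golden_record(duplicates):
    """Merge duplicate records field by field into one 'golden' record."""
    # Newest record first, so that ties are won by the most recent value.
    ordered = sorted(duplicates, key=lambda r: r["updated"], reverse=True)
    fields = sorted({k for r in duplicates for k in r if k != "updated"})
    golden = {}
    for field in fields:
        candidates = [r[field] for r in ordered if r.get(field)]
        # The longest non-empty value is treated as the most complete one.
        golden[field] = max(candidates, key=len) if candidates else ""
    return golden


# Hypothetical duplicates of one customer from two source systems.
crm = {"first_name": "J.", "last_name": "Jansen", "city": "",
       "updated": date(2009, 6, 1)}
billing = {"first_name": "Jan", "last_name": "Jansen", "city": "Utrecht",
           "updated": date(2008, 3, 1)}

merged = golden_record([crm, billing])
print(merged)
```

Here the initial "J." loses to the full first name "Jan", and the empty city is filled from the older record. Real survivorship rules are usually richer (source reliability, validation against reference data), but the principle of deciding per element rather than per record is the same.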
Tags: ACCU, deduplication, first name, golden record, matching methods, MDM for customer data
Posted in MDM for customer data | 2 Comments »
August 21st, 2009 by Ramon de Noronha
Just a few days ago I wrote about the many standards we have for street names in the Netherlands. On top of that, new street names are constantly added for newly built neighbourhoods. Sometimes this also results in existing street names being changed. This was the case last week, when rescue workers were not able to find the exact location in Putten. An emergency call was made for a 60-year-old man who suffered from heart failure. The people who tried to resuscitate the man heard the ambulance passing by, but they didn’t see it. In the end the ambulance arrived after 19 minutes, too late to save the man’s life. This is a very unfortunate accident, and an investigation has been started to find out exactly what went wrong. Preliminary results show that the navigation systems of both the police and the ambulance were not up to date.
I have looked at the location using Google Maps. Normally you expect a street to consist of one thoroughfare. But in this case the street, named “Kraakweg”, consists of three different parts, which are clearly not in one direct line. I have marked them 1, 2 and 3. Number 4 marks another street with almost the same name, “De Kraak”.

Tags: address standardisation, naming confusion, street names, toponymics
Posted in Data Quality | No Comments »
August 17th, 2009 by Ramon de Noronha
Just before this summer the U.S. Department of Justice filed a report about the FBI Terrorist Watchlist. This watchlist serves as a critical tool for screening and law enforcement personnel, alerting them when they come across a known or suspected terrorist. It is used by personnel at airports, harbours and borders. When you apply for a visa, you are also matched against this watchlist. The Terrorist Screening Center, which is part of the FBI, is responsible for maintaining the watchlist.
This watchlist was created in 2004 from several other lists, and at that time it consisted of about 68,000 entries. I use the word entries because in the years that followed it became unclear whether one record corresponds to one individual. By the end of 2008 the list had grown to over 1.1 million entries. In 2008, after the American Civil Liberties Union (ACLU) pointed out that the list had passed the 1 million mark, the government came with an explanation: although over 1 million entries have been recorded in the database, these records correspond to about 400,000 individuals. Terrorists often use different and thus multiple identities, use several (falsified) passports etc. But adding entries with only the first initials and last name, while an entry with the full first names and last name already exists, will result in unwanted side effects.
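The side effect of initials-only entries can be shown with a few lines of code. This is a deliberately naive sketch, not the way any actual watchlist is matched: the functions `parse`, `given_match` and `compatible` and the example names are assumptions for illustration.

```python
def parse(name: str):
    """Split a name into lower-cased given names (or initials) and last name."""
    *given, last = name.replace(".", " ").split()
    return [g.lower() for g in given], last.lower()


def given_match(a: str, b: str) -> bool:
    """An initial matches any given name that starts with the same letter."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b


def compatible(entry: str, person: str) -> bool:
    """True if a watchlist entry could refer to the given person."""
    e_given, e_last = parse(entry)
    p_given, p_last = parse(person)
    return e_last == p_last and all(
        given_match(e, p) for e, p in zip(e_given, p_given)
    )


# The full-name entry only hits the person it describes ...
print(compatible("John Smith", "James Smith"))
# ... while the initials-only entry hits every J. Smith who passes by.
print(compatible("J. Smith", "John Smith"))
print(compatible("J. Smith", "James Smith"))
```

Every initials-only entry therefore widens the net: it adds nothing to the identification of the original suspect, but it does flag other people who happen to share an initial and a last name.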
Tags: compliance, identification, identity, interpretation, knowledge, persistent identification, processes, suspect list matching
Posted in Data Governance, Data Quality | No Comments »
August 17th, 2009 by Ramon de Noronha
Once in a while I visit Amsterdam and have a drink or two in the centre. Afterwards I take the tram back to the hotel. This weekend I was quite surprised to find that all the street names are announced in English at each stop. The easy and obvious one is of course Centraal Station, which was translated to Central Station. I can also see how they came up with Rembrandt Square instead of Rembrandtsplein. But translating “Spui” to “Courtyard with a chapel” doesn’t help any tourist find their destination.
Tags: address standardisation, interpretation, persistent identification, standardisation
Posted in Data Quality | 1 Comment »