<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Value Talk</title>
	<atom:link href="http://www.datavaluetalk.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.datavaluetalk.com</link>
	<description>Customer data is a valuable asset. Why not treat it that way?</description>
	<lastBuildDate>Tue, 09 Mar 2010 11:24:54 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Stop using Customer Relationship Management systems &#8211; and learn about possibilities to make dealing with customer information easier</title>
		<link>http://www.datavaluetalk.com/2010/03/09/stop-using-customer-relationship-management-systems-and-learn-about-possibilities-to-make-dealing-with-customer-information-easier/</link>
		<comments>http://www.datavaluetalk.com/2010/03/09/stop-using-customer-relationship-management-systems-and-learn-about-possibilities-to-make-dealing-with-customer-information-easier/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 11:24:54 +0000</pubDate>
		<dc:creator>Vincent van Hunnik</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Data Services]]></category>
		<category><![CDATA[campaign management]]></category>
		<category><![CDATA[contact details]]></category>
		<category><![CDATA[CRM-system]]></category>
		<category><![CDATA[mass mailing]]></category>
		<category><![CDATA[self service]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1433</guid>
		<description><![CDATA[
Have you ever tried to get contact details in and out of a CRM system, and ended up with a bigger mess? I have. The concept is easy: store all information about prospects and customers in one system, allowing you to have your communication efforts streamlined.
Reality, however, is harder: contact details entered on your website [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1436" title="knock_1691050" src="http://www.datavaluetalk.com/wp-content/uploads/2010/03/knock_1691050-150x150.jpg" alt="knock_1691050" width="150" height="150" /></p>
<p>Have you ever tried to get contact details in and out of a CRM system, and ended up with a bigger mess? I have. The concept is easy: store all information about prospects and customers in one system, allowing you to have your communication efforts streamlined.</p>
<p>Reality, however, is harder: contact details entered on your website should be fed to the system automatically. Sending your periodic newsletter should be based on the details in your CRM system. Not to mention dealing with information on bounces. Integrating your CRM system(s) with mass mailing, campaign management and self service portals is helpful, but for some reason the major means of transporting lead and customer information still seems to be Excel&#8230; Leaving you with the necessity to mass import results, new contacts and changed information.<span id="more-1433"></span></p>
<p>What you really want is the ability to throw whatever information you have at the system, and let the system determine whether you already know someone and take care of adding information to existing contacts. In addition, you do not want to spend time maintaining the information you have. Actually, wouldn’t it be nice if your website could automatically add leads to your CRM system (e.g. via the Salesforce Web2Lead function), but without adding duplicates?</p>
<p>The good news is that this can be done. There is tooling out there that keeps your contact data up-to-date, prevents adding duplicate records and makes dealing with contact data easy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/03/09/stop-using-customer-relationship-management-systems-and-learn-about-possibilities-to-make-dealing-with-customer-information-easier/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adieu Marcel &#8230;..</title>
		<link>http://www.datavaluetalk.com/2010/03/02/adieu-marcel/</link>
		<comments>http://www.datavaluetalk.com/2010/03/02/adieu-marcel/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 14:57:34 +0000</pubDate>
		<dc:creator>Jacques Baron</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Data Services]]></category>
		<category><![CDATA[civil registry]]></category>
		<category><![CDATA[French names]]></category>
		<category><![CDATA[processing French data]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1424</guid>
		<description><![CDATA[
Everybody who has ever been on holiday in France has probably had a neighbour named Gaston, Jacques, Louis, Claire or Françoise. We are used to those first names; they evoke the “France profonde”, sleepy villages at the end of a road, films of Pagnol or Rohmer. Walks along the Seine in the shadow of [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1431" title="french-waiter 3" src="http://www.datavaluetalk.com/wp-content/uploads/2010/03/french-waiter-3-150x150.jpg" alt="french-waiter 3" width="150" height="150" /></p>
<p>Everybody who has ever been on holiday in France has probably had a neighbour named Gaston, Jacques, Louis, Claire or Françoise. We are used to those first names; they evoke the “France profonde”, sleepy villages at the end of a road, films of Pagnol or Rohmer. Walks along the Seine in the shadow of “Notre Dame” in the spring. Coffee at a terrace on the Boulevard Saint-Germain where an obsequious garçon, named Marcel, is looking at your girlfriend or wife in a way you do not really appreciate. This particular image of France is in danger. In a few years our entire frame of reference could have disappeared.</p>
<p>Nowadays French parents give their imagination free rein when choosing first names for their children. Looking at recent entries in the civil registry, you will find rather unusual first names like Bulle, Héribert, Loeva, Hermès, Evolène, and Argan.<br />
These first names have all kinds of origins. For example, they can be a combination of first names (Timéo, which is derived from Timothée and Théo), or they are different spellings of known first names (Lilou becomes Lee-Lou). We can also find names from Greek or Celtic mythology or even from literature, like Arwen, a character from the novel Lord of the Rings.<span id="more-1424"></span></p>
<p>This interest in uncommon first names will of course have consequences for the processing of French data, especially if you take into consideration that these “new” first names, with a frequency less than 3000, are now in the majority. But this diversity will not necessarily be a curse. Maybe we will be delivered from ambiguous names, of which we never know whether they are a first name or a surname. Consider that the 3 most common surnames in France are Martin, Bernard and Thomas. <br />
But don’t worry; in order to keep the challenge going when you process French data, names like Jacqueline-Germain, Jean Marie Marie Luce, Louise Alexandrine will of course not disappear entirely.<br />
And next time you are in France, enjoying a salade Niçoise with a cold glass of rosé, overlooking a harbor where small fishing boats are dancing on the lazy waves, you will just have to get used to the fact that the waitress’s first name is not Marie but Fanchon or Eole.</p>
<p>Source:  Le Parisien 19-02-2010 and “Les 4 000 plus beaux prénoms rares “, de Stéphanie Rapoport, chez First, 8,90 €</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/03/02/adieu-marcel/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fødselsnummer &#8211; Crossing centuries in Norway</title>
		<link>http://www.datavaluetalk.com/2010/02/15/f%c3%b8dselsnummer-crossing-centuries-in-norway/</link>
		<comments>http://www.datavaluetalk.com/2010/02/15/f%c3%b8dselsnummer-crossing-centuries-in-norway/#comments</comments>
		<pubDate>Mon, 15 Feb 2010 12:57:54 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[personal identification number]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1332</guid>
		<description><![CDATA[
The Norwegian Fødselsnummer (Birthnumber) is an 11-digit number with 2 control digits. The 10th digit is a control digit calculated with a weighted modulo 11 variant over the first 9 digits. The 11th digit is a control digit calculated with another weighted modulo 11 variant over the first 9 digits combined with the 10th control [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datavaluetalk.com/wp-content/uploads/2010/02/Norwegianbirthnummer.jpg"><img class="alignleft size-medium wp-image-1333" src="http://www.datavaluetalk.com/wp-content/uploads/2010/02/Norwegianbirthnummer-300x177.jpg" alt="Norwegian Fødselsnummer examples" width="300" height="177" /></a></p>
<p>The <a href="http://no.wikipedia.org/wiki/F%C3%B8dselsnummer">Norwegian Fødselsnummer</a> (Birthnumber) is an 11-digit number with 2 control digits. The 10th digit is a control digit calculated with a weighted modulo 11 variant over the first 9 digits. The 11th digit is a control digit calculated with another weighted modulo 11 variant over the first 9 digits combined with the 10th control digit.</p>
<p>As in other countries, this number is based on the<a href="http://www.datavaluetalk.com/2010/02/01/is-270368a172x-a-correct-finnish-henkilotunnus/" target="_blank"> date of birth</a>. The first 6 digits represent the birth date as “ddmmyy”. The problem with a 6-digit date is that you cannot identify the century: is a Fødselsnummer starting with 121009 someone born in 1909 or 2009? The Norwegian government has solved this by dividing the following 3 digits (the individual number) into ranges that each represent a certain era. If you were born between 1854 and 1899, your individual number must be between 500 and 749; if you were born between 1900 and 1999, it lies between 000 and 499; and for those born between 2000 and 2039, it lies between 500 and 999. There are some exceptions for those with an individual number between 900 and 999.<span id="more-1332"></span></p>
<p><a href="http://www.datavaluetalk.com/2010/01/19/why-there-are-maximum-of-females-in-a-country/" target="_blank">As in other countries</a>, odd individual numbers are given to males and even ones to females.</p>
<p>Be aware that when validating national personal identification numbers that contain a date part, like the Fødselsnummer, you cannot rely on the control digits alone. The Fødselsnummer “31046812355” is completely valid if we look at the 10th and 11th control digits; however, the birth date April 31, 1968 never occurred!</p>
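<p>The two-step validation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights are the commonly documented ones for the two control digits (this post does not list them), the function names are made up for the example, and the exceptions for individual numbers 900 to 999 are deliberately left out.</p>

```python
from datetime import date

# Commonly documented weights for the two control digits
# (an assumption here; the post itself does not list them).
K1_WEIGHTS = (3, 7, 6, 1, 8, 9, 4, 5, 2)
K2_WEIGHTS = (5, 4, 3, 2, 7, 6, 5, 4, 3, 2)


def control_digit(digits, weights):
    """Weighted modulo 11 control digit; a result of 10 means no valid digit exists."""
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return (11 - remainder) % 11


def control_digits_ok(number):
    """Check the 10th and 11th digit of an 11-digit Fødselsnummer string."""
    if len(number) != 11 or not (number.isascii() and number.isdigit()):
        return False
    d = [int(c) for c in number]
    return (control_digit(d[:9], K1_WEIGHTS) == d[9]
            and control_digit(d[:10], K2_WEIGHTS) == d[10])


def birth_year(number):
    """Resolve the century from the individual number (digits 7 to 9)."""
    yy = int(number[4:6])
    individual = int(number[6:9])
    if individual in range(0, 500):
        return 1900 + yy
    if individual in range(500, 750) and yy in range(54, 100):
        return 1800 + yy
    if individual in range(500, 1000) and yy in range(0, 40):
        return 2000 + yy
    return None  # the 900-999 exceptions are not modelled in this sketch


def birth_date(number):
    """Return the birth date, or None if it is not a real calendar date."""
    year = birth_year(number)
    if year is None:
        return None
    try:
        return date(year, int(number[2:4]), int(number[0:2]))
    except ValueError:
        return None


def is_valid(number):
    """Control digits alone are not enough: the date part must exist too."""
    return control_digits_ok(number) and birth_date(number) is not None
```

<p>With these helpers, <code>control_digits_ok("31046812355")</code> is true while <code>is_valid("31046812355")</code> is false, because April 31 is rejected by the calendar check.</p>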
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/02/15/f%c3%b8dselsnummer-crossing-centuries-in-norway/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Matching Engines go beyond apples and oranges</title>
		<link>http://www.datavaluetalk.com/2010/02/11/new-matching-engines-go-beyond-apples-and-oranges/</link>
		<comments>http://www.datavaluetalk.com/2010/02/11/new-matching-engines-go-beyond-apples-and-oranges/#comments</comments>
		<pubDate>Thu, 11 Feb 2010 14:01:05 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[apples and oranges]]></category>
		<category><![CDATA[atomic string comparison]]></category>
		<category><![CDATA[cultural differences]]></category>
		<category><![CDATA[information retrieval]]></category>
		<category><![CDATA[intelligent matching methods]]></category>
		<category><![CDATA[Lucene]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1323</guid>
		<description><![CDATA[
Professional matching engines are becoming more and more intelligent. Within Human Inference, we also see that our matching techniques are capable of using more and more intelligence, and needless to say we incorporate and use this intelligence in our engines in order to adapt to the way that humans do their matching.
Traditional data quality or matching engines were based on atomic [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datavaluetalk.com/wp-content/uploads/2010/02/beyond-apples-and-oranges2.jpg"><img class="alignleft size-medium wp-image-1328" src="http://www.datavaluetalk.com/wp-content/uploads/2010/02/beyond-apples-and-oranges2-300x215.jpg" alt="Beyond apples and oranges" width="300" height="215" /></a></p>
<p>Professional matching engines are becoming more and more intelligent. Within Human Inference, we also see that our matching techniques are capable of using more and more intelligence, and needless to say we incorporate and use this intelligence in our engines in order to adapt to the way that humans do their matching.</p>
<p>Traditional data quality or matching engines were based on atomic string comparison functions like match-codes, phonetic comparison, Levenshtein string distance, n-gram comparisons or similar functions. These kinds of functions are relatively easy to implement and to use, although a significant amount of plumbing is needed to get reasonable results. Open source projects like the<a href="http://lucene.apache.org/java/docs/" target="_blank"> Lucene search engine</a>, and variants, provide a solid and proven set of these functions. The drawback of these functions is that it’s not always clear which function to use for which purpose. An even larger issue is the fact that these low-level DQ functions cannot distinguish between apples and oranges: you end up comparing family names with street names. We still see that BI vendors, for example, claim to provide data quality functionality while they only provide these atomic comparisons.<span id="more-1323"></span></p>
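<p>To make the apples-and-oranges problem concrete, here is a minimal Python sketch of one such atomic function, the Levenshtein distance, next to a wrapper that refuses to compare fields of different kinds. The field kinds (<code>family_name</code>, <code>street_name</code>) and the function names are hypothetical and only serve the illustration; this is not how any particular engine is built.</p>

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def compare_fields(field_a, field_b):
    """Compare (kind, value) pairs, but only if both are of the same kind."""
    kind_a, value_a = field_a
    kind_b, value_b = field_b
    if kind_a != kind_b:
        return None  # apples versus oranges: no meaningful score
    return levenshtein(value_a.lower(), value_b.lower())
```

<p>The bare function gives a family name “Martin” and a street name “Martin” a perfect distance of 0, whereas <code>compare_fields(("family_name", "Martin"), ("street_name", "Martin"))</code> returns no score at all.</p>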
<p>Within Human Inference we have for years been developing matching engines that look beyond these primary functions: engines capable of identifying given names, surnames, family names, postal codes, titles, initials, etc. The true benefit of this approach is that matching results are significantly better, because you are comparing apples with apples and oranges with oranges. The gluing or plumbing needed in this approach to validate street or family names is kept completely under the hood for the data stewards. With a correct set of reference data, the right mix of atomic functions and, not least, thorough domain knowledge, these matching engines are capable of quickly and adequately finding duplicates, beyond the ones that have simple typos.</p>
<p>The complexity in matching apples starts when you take into account the varieties of apples or, to speak in Data Quality terminology, when you take into account that per country or region people have more or less subtle differences in their use of names, streets, measurements and writing systems.</p>
<p>The moment you value these differences you also recognize new opportunities. You will notice that by looking at an apple, you get information on oranges. By looking at the name Белоусовa (Beloussowa), you might recognize a family name and that you’re dealing with a female. By looking at the number 681012-2355, you might recognize that this is a valid<a href="http://www.datavaluetalk.com/2010/01/19/why-there-are-maximum-of-females-in-a-country/" target="_blank"> Swedish personnummer</a>, and that the birth date of this male is October 12, 1968. By looking at an email address like <a href="mailto:Winfried.vanHolland@humaninference.com">Winfried.vanHolland@humaninference.com</a> you might recognize a given name “Winfried”, that you’re dealing with a male, that he has the surname “van Holland” and that he is working for a company called Human Inference, and I leave it up to you to guess which country he originates from&#8230;. By retrieving additional information out of obvious information, the matching moves beyond the apples and oranges, and becomes easier, faster and more accurate.</p>
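<p>As a small illustration of retrieving additional information from a single value, the Python sketch below takes the personnummer from the text apart. It assumes the usual Luhn check over the first nine digits and the convention that an odd ninth digit denotes a male; the function names and the returned dictionary are made up for this example, and the century is left unresolved because the 10-digit form does not carry it.</p>

```python
def luhn_check_digit(nine_digits):
    """Luhn check digit: weights 2,1,2,1,... with the digits of each product summed."""
    total = 0
    for index, char in enumerate(nine_digits):
        product = int(char) * (2 if index % 2 == 0 else 1)
        total += product // 10 + product % 10
    return (10 - total % 10) % 10


def parse_personnummer(number):
    """Return the facts hidden in a yymmdd-nnnc number, or None if the check fails."""
    digits = number.replace("-", "")
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return None
    if luhn_check_digit(digits[:9]) != int(digits[9]):
        return None
    return {
        "year": digits[0:2],    # two digits only; the century is not encoded here
        "month": digits[2:4],
        "day": digits[4:6],
        "gender": "male" if int(digits[8]) % 2 == 1 else "female",
    }
```

<p>Calling <code>parse_personnummer("681012-2355")</code> yields year 68, month 10, day 12 and gender male, which matches the reading given above.</p>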
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/02/11/new-matching-engines-go-beyond-apples-and-oranges/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is 270368A172X a correct Finnish Henkilötunnus?</title>
		<link>http://www.datavaluetalk.com/2010/02/01/is-270368a172x-a-correct-finnish-henkilotunnus/</link>
		<comments>http://www.datavaluetalk.com/2010/02/01/is-270368a172x-a-correct-finnish-henkilotunnus/#comments</comments>
		<pubDate>Mon, 01 Feb 2010 16:10:46 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Finland]]></category>
		<category><![CDATA[Human Inference]]></category>
		<category><![CDATA[National Identification Numbers]]></category>
		<category><![CDATA[personal identification number]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1311</guid>
		<description><![CDATA[
The Finnish national personal identification number, the Henkilötunnus (aka Hetu or Ht), has the following format &#8211; ddmmyyc999C. For details on how to calculate the control character, I refer to the overview blog on National Identification Numbers.
Validating the Hetu 270368A172X shows that it is indeed a correct number. The number 270368172 indeed generates 29 for [...]]]></description>
			<content:encoded><![CDATA[<div class="mceTemp"><img class="alignleft size-full wp-image-1320" title="FinlandHetu270368A172X-150x150" src="http://www.datavaluetalk.com/wp-content/uploads/2010/02/FinlandHetu270368A172X-150x150.jpg" alt="FinlandHetu270368A172X-150x150" width="150" height="150" /></div>
<p>The Finnish national personal identification number, the<a title="Henkiklötonnus" href="http://fi.wikipedia.org/wiki/Henkil%C3%B6tunnus" target="_blank"> Henkilötunnus</a> (aka Hetu or Ht), has the following format &#8211; ddmmyyc999C. For details on how to calculate the control character, I refer to the overview blog on <a href="http://www.datavaluetalk.com/2010/01/19/why-there-are-maximum-of-females-in-a-country/" target="_blank">National Identification Numbers</a>.</p>
<p>Validating the Hetu 270368A172X shows that it is indeed a correct number. The number 270368172 indeed generates 29 for the modulo 31 proof, represented by control character &#8220;X&#8221; in the checksum list. The number shows that this is the 86th girl born on the 27th of March 2068.</p>
<p>The latter is exactly where the discussion on validity starts. Although the number itself is well formed and passes all the automatic checks, dealing with this number in a data quality assessment will raise your digital eyebrow. In the data quality world we would nowadays say that this Hetu is wrong, that it cannot be correct.</p>
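<p>The difference between “well formed” and “plausible” can be sketched in Python as follows. The checksum list and the meaning of the century sign (+ for the 1800s, - for the 1900s, A for the 2000s) are assumptions taken from the commonly documented format, and the function names are invented for this example.</p>

```python
from datetime import date

# Assumed checksum list: index = nine-digit number modulo 31.
CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"
# Assumed meaning of the century sign in position 7.
CENTURIES = {"+": 1800, "-": 1900, "A": 2000}


def parse_hetu(hetu):
    """Return (birth date, individual number) for a well-formed Hetu, else None."""
    if len(hetu) != 11 or hetu[6] not in CENTURIES:
        return None
    digits = hetu[:6] + hetu[7:10]
    if not (digits.isascii() and digits.isdigit()):
        return None
    if CHECK_CHARS[int(digits) % 31] != hetu[10]:
        return None
    year = CENTURIES[hetu[6]] + int(hetu[4:6])
    try:
        born = date(year, int(hetu[2:4]), int(hetu[0:2]))
    except ValueError:
        return None
    return born, int(hetu[7:10])


def is_plausible(hetu, today):
    """Well formed AND not born in the future, seen from the given date."""
    parsed = parse_hetu(hetu)
    if parsed is None:
        return False
    born, _individual = parsed
    return born <= today
```

<p>Here <code>parse_hetu("270368A172X")</code> returns the date 2068-03-27 with individual number 172, so the automatic checks pass, but <code>is_plausible("270368A172X", date(2010, 2, 1))</code> is false.</p>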
<p>So always use a bit of human inference when dealing with Finnish national personal identification numbers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/02/01/is-270368a172x-a-correct-finnish-henkilotunnus/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Remarkable facts on Dutch National Personal Identification Number (Burgerservicenummer BSN)</title>
		<link>http://www.datavaluetalk.com/2010/01/19/remarkable-facts-on-dutch-national-personal-identification-number-burgerservicenummer-bsn/</link>
		<comments>http://www.datavaluetalk.com/2010/01/19/remarkable-facts-on-dutch-national-personal-identification-number-burgerservicenummer-bsn/#comments</comments>
		<pubDate>Tue, 19 Jan 2010 15:35:34 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[11-proof]]></category>
		<category><![CDATA[Personal Identification Numbers]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1293</guid>
		<description><![CDATA[
The national personal identification number in the Netherlands is called the Burgerservicenummer (abbreviated BSN, introduced in November 2007). It is a 9-digit number that can be validated by a weighted 11-proof. Basically every digit position gets a weighting factor, and after multiplying the successive digits by their weights the final result [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1307" title="bsn" src="http://www.datavaluetalk.com/wp-content/uploads/2010/01/bsn1-150x150.jpg" alt="bsn" width="150" height="150" /></p>
<p>The national <a href="http://en.wikipedia.org/wiki/Personal_identification_number" target="_blank">personal identification number</a> in the Netherlands is called the <a href="http://www.bprbzk.nl/BSN" target="_blank">Burgerservicenummer </a>(abbreviated BSN, introduced in November 2007). It is a 9-digit number that can be validated by a weighted 11-proof. Basically every digit position gets a weighting factor, and after multiplying the successive digits by their weights and adding up the products, the final result must be exactly divisible by 11.</p>
<p>A nice effect of this weighted 11-proof is that any 2 valid numbers differ in at least 2 digits. You need to make at least 2 changes to get from one number to another &#8211; it might be that 2 digits are completely different (e.g., 1126827<strong>65</strong> and 1126827<strong>77</strong>) or that you need to swap one digit and change another (e.g., 4270965<strong>0</strong><span style="color: #ff0000">9</span> and 4270965<span style="color: #ff0000">1</span><strong>0</strong>).</p>
<p>Mathematically there can still be two consecutive numbers like 4270961<strong>69</strong> and 4270961<strong>70</strong>, which still need 2 changes to get from one to the other.<span id="more-1293"></span></p>
<p>This effect helps prevent mistakes while typing these numbers: you need to make more than one mistake, plus some bad luck, to end up with exactly a number that matches the proof.</p>
<p>For those who like statistics, there are exactly 90909090 possible combinations &#8211; which in itself is a nice number but doesn&#8217;t match the proof. The first possible number is 000000012 (assuming that 000000000 is not used), the last is 999999990.</p>
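<p>For those who prefer code to words, the weighted 11-proof can be sketched in Python. The weights 9, 8, 7, 6, 5, 4, 3, 2 and -1 are the commonly documented ones for the BSN and are an assumption here, since they are not spelled out above; the function names are made up for the example.</p>

```python
# Assumed weights per digit position; note the -1 for the last digit.
BSN_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def bsn_eleven_proof(number):
    """True if the weighted digit sum of a 9-digit string is divisible by 11."""
    if len(number) != 9 or not (number.isascii() and number.isdigit()):
        return False
    total = sum(int(char) * weight for char, weight in zip(number, BSN_WEIGHTS))
    return total % 11 == 0


def digit_differences(a, b):
    """Count the positions in which two equally long numbers differ."""
    return sum(1 for x, y in zip(a, b) if x != y)
```

<p>All the example numbers above pass <code>bsn_eleven_proof</code>, and <code>digit_differences("112682765", "112682777")</code> gives 2. Note that this sketch also accepts 000000000, so excluding that number needs a separate rule.</p>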
<p>For more on Personal Identification Numbers I refer to another summary blog on European numbers or to a handsome <a href="http://prezi.com/csnv3cynv4ai/" target="_blank">presentation</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/01/19/remarkable-facts-on-dutch-national-personal-identification-number-burgerservicenummer-bsn/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why there is a maximum number of (fe)males in a country</title>
		<link>http://www.datavaluetalk.com/2010/01/19/why-there-are-maximum-of-females-in-a-country/</link>
		<comments>http://www.datavaluetalk.com/2010/01/19/why-there-are-maximum-of-females-in-a-country/#comments</comments>
		<pubDate>Tue, 19 Jan 2010 13:38:24 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[identification]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[privacy-sensitive]]></category>
		<category><![CDATA[social security number]]></category>
		<category><![CDATA[unique identification]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1288</guid>
		<description><![CDATA[Within Europe there is no such system as European Social Security Number or European Identification Number. A lot of countries have their own system, and other countries are struggling to get a system into place.
The struggle of some countries has to do with historical reasons and with privacy aspects. Unique identification is not always used in favour of [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" src="http://4.bp.blogspot.com/_jQS2yW8CbuY/Sb43ZcL28WI/AAAAAAAAADg/cNBvLb2bq6o/s320/CartaoCidadao_f.jpg" alt="" width="320" height="207" />Within Europe there is no such system as European Social Security Number or European Identification Number. A lot of countries have their own system, and other countries are struggling to get a system into place.</p>
<p>The struggle of some countries has to do with historical reasons and with privacy aspects. Unique identification is not always used in favour of the community. And some of the identification systems in use contain privacy-sensitive information, among others date of birth, gender and/or place of birth, where older systems might even contain religious or other privacy-sensitive information.</p>
<p>A wide range of countries use the combination of date of birth, gender identification and the political region where you were born. In such a mechanism it is most common that part of the identification number is a 2-digit or 3-digit serial number to identify the unique male or female born on a specific date (or in a specific month). Some countries provide odd serial numbers for males, and even ones for females. Bulgaria is the only one that wants &#8220;odd&#8221; females. Some countries like to divide by range (0-499 male, 500-999 female). And some countries like Norway make nice combinations to include the century of birth or period of birth in the serial number.<span id="more-1288"></span></p>
<p>This &#8216;number&#8217; generation has the effect that pretty soon you will hit the maximum number of citizens that the system can handle for a specific day. Some systems run out of numbers if more than 500 males or females are born on a day. The Danish system encountered that situation in 2007, when due to immigration the population exceeded the capacity of the system for January 1st 1965! The Danish system (CPR-nummer) has a 3-digit serial number where one of the digits is also the control digit (thus diminishing the possible numbers from 500 to fewer than 50).</p>
<p>It is remarkable to see what some countries do to solve the &#8216;century&#8217; issue, where people have the same ID but were born in the 19th, 20th or 21st century: they add 20 or 40 to the month. The same is true for the identification of foreigners, e.g. Sweden adds 60 to the day of birth. Sweden also adds 20 to the month to distinguish persons from organisations.</p>
<p>If you want to see the details of these systems you might look at <a href="http://prezi.com/csnv3cynv4ai/">http://prezi.com/csnv3cynv4ai/</a> or <a href="http://en.wikipedia.org/wiki/National_identification_number">http://en.wikipedia.org/wiki/National_identification_number</a>. Be prepared: there have definitely been PhDs around to invent these systems.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/01/19/why-there-are-maximum-of-females-in-a-country/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Attempted bombing Christmas Day could have been prevented!</title>
		<link>http://www.datavaluetalk.com/2010/01/13/attempted-bombing-christmas-day-could-have-been-prevented/</link>
		<comments>http://www.datavaluetalk.com/2010/01/13/attempted-bombing-christmas-day-could-have-been-prevented/#comments</comments>
		<pubDate>Wed, 13 Jan 2010 08:08:19 +0000</pubDate>
		<dc:creator>Eddy Reimerink</dc:creator>
				<category><![CDATA[Data Governance]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Blacklist Matching]]></category>
		<category><![CDATA[Customer Due Diligence]]></category>
		<category><![CDATA[Misspelled Name]]></category>
		<category><![CDATA[Terrorist]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1275</guid>
		<description><![CDATA[ 
Lack of understanding of the complexity of international names caused a near-accident successfully prevented by the Dutchman Jasper Schuringa.
On Flight 253, on its way from Amsterdam to Detroit, a passenger tried to blow up the airplane. This passenger was not called John Smith, or Peter Johnson. No, his name was a little more complicated: Umar Farouk [...]]]></description>
			<content:encoded><![CDATA[<p><strong><img class="alignleft size-thumbnail wp-image-1300" title="flight-253-suspect" src="http://www.datavaluetalk.com/wp-content/uploads/2010/01/flight-253-suspect-150x150.jpg" alt="flight-253-suspect" width="150" height="150" /> </strong></p>
<p><strong>Lack of understanding of the complexity of international names caused a near-accident successfully prevented by the Dutchman <em><a title="Jasper Schuringa" href="http://www.nydailynews.com/news/national/2009/12/27/2009-12-27_how_flying_dutchman_made_stop_he_was_getting_on_fire__i_just_jumped_over_the_sea.html" target="_self">Jasper Schuringa</a></em>.</strong></p>
<p>On Flight 253, on its way from Amsterdam to Detroit, a passenger tried to blow up the airplane. This passenger was not called John Smith, or Peter Johnson. No, his name was a little more complicated: <em><a title="Umar Farouk Abdulmutallab" href="http://en.wikipedia.org/wiki/Umar_Farouk_Abdulmutallab" target="_self">Umar Farouk Abdulmutallab</a></em>. Easy to misspell, and that is exactly what happened. A <a title="Misspelling of name was the cause" href="http://tpmmuckraker.talkingpointsmemo.com/2010/01/state_dept_didnt_think_abdulmutallab_had_visa_--_b_1.php" target="_self">misspelling of the name </a>of Umar Farouk Abdulmutallab resulted in the State Department believing he did not have a valid U.S. visa.</p>
<p> <strong>We love damage control, not prevention</strong></p>
<p><span id="more-1275"></span>Now we are introducing body scanners to detect at the airport what could have been detected earlier, if only we had applied proper technology to detect the misspelling of the name. Governments and also companies I speak with seem to be satisfied with simple old-fashioned comparison algorithms to “comply”. If we compare the name against the watchlist or against database X, we have done our duty…</p>
<p>Well, I don’t believe so! It is your responsibility as a governmental organization or company to act responsibly and do the best you can to prevent terrorism. This is not: “let’s do the absolute minimum to comply with the regulations”. This sad example shows once again that doing so is irresponsible. A safer world starts with responsible behavior and thinking about prevention. One of the keys to prevention is using <a title="Blacklist Matching" href="http://www.humaninference.com/en/Our%20Solutions/Solutions/~/media/0ABE15C8F3264C0CA3656BEAEE03FD04.ashx" target="_blank">sophisticated name-search technology</a>. This will keep us from introducing more expensive solutions for damage control, or worse: me having to fly naked the next time I visit the USA.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/01/13/attempted-bombing-christmas-day-could-have-been-prevented/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Matching persons with different official names</title>
		<link>http://www.datavaluetalk.com/2010/01/06/matching-persons-with-different-official-names/</link>
		<comments>http://www.datavaluetalk.com/2010/01/06/matching-persons-with-different-official-names/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 15:32:59 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[cultural differences]]></category>
		<category><![CDATA[fault-tolerant matching]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[names]]></category>
		<category><![CDATA[naming confusion]]></category>
		<category><![CDATA[nicknames]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1269</guid>
		<description><![CDATA[Dealing with matching of persons or contact data in general, we are all aware that individuals can make use of abbreviations or nicknames as kind of synonyms for their name. Classic examples are the usage of the name Bill for the actual name William, or like my own father is using the name Mans while [...]]]></description>
			<content:encoded><![CDATA[<p class="mceTemp"><img class="alignnone" title="what is the what?" src="http://img1.fantasticfiction.co.uk/images/n37/n185744.jpg" alt="" width="107" height="137" />When matching persons, or contact data in general, we are all aware that individuals can use abbreviations or nicknames as a kind of synonym for their name. Classic examples are the use of the name <em>Bill </em>for the actual name <em>William</em>, or my own father, who uses the name <em>Mans </em>while officially his name is <em>Hermanus</em>. Most matching engines use a kind of synonym table to take care of this. That works because within a culture or region nicknames are quite often linked to the same names, and people do not tend to use completely different officially registered names.</p>
<p>It becomes more challenging if there is no longer a link between nickname and official name. That may happen, for example, if people move from one cultural region to another where a different writing system is used. Take for example my Chinese friend<em> </em>高为民, whose Latin name would be Gao Weimin (family name first), but who uses the Latin variant William Gao the moment he works in Europe or the US. There is no common relation between the names William and Weimin, in either Latin or Chinese, and they are not phonetic variants of each other. <span id="more-1269"></span></p>
<p>Recently, I read a very impressive book by Dave Eggers, called &#8216;What is the What&#8217;. It gives you a good insight into one of the current problem areas of the world and how people try to survive there. Achak Deng is one of the so-called <a title="Valentino Achak Deng organization" href="http://www.valentinoachakdeng.org/" target="_blank">Lost Boys from Sudan</a>. During his life in Sudan, in refugee camps and finally in the US, he officially used different names. That has nothing to do with purposely trying to obscure his identity, but more with receiving an identity from your environment &#8211; at that time and place. He was born as Achak, baptized as Valentino, and later used the name Dominic or Dominic Arou and Marialdit. Of course there are people calling him nicknames such as &#8216;Sleeper&#8217; or &#8216;Gone Far&#8217;, but at certain periods in his life he officially used completely different names. This makes automatic matching of persons, or even manual matching, challenging and keeps it interesting.</p>
<p>I would recommend the book to everyone who wants to learn about what is happening in our world, and especially to those interested in names (don&#8217;t forget to study all the names in the last section of the book).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2010/01/06/matching-persons-with-different-official-names/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Let&#8217;s be honest &#8211; Solve your data quality before jumping into Pattern-Based Strategy</title>
		<link>http://www.datavaluetalk.com/2009/12/21/lets-be-honest-solve-your-data-quality-before-jumping-into-pattern-based-strategy/</link>
		<comments>http://www.datavaluetalk.com/2009/12/21/lets-be-honest-solve-your-data-quality-before-jumping-into-pattern-based-strategy/#comments</comments>
		<pubDate>Mon, 21 Dec 2009 14:35:43 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Business Activity Monitoring]]></category>
		<category><![CDATA[Business Inteligence]]></category>
		<category><![CDATA[gartner]]></category>
		<category><![CDATA[pattern-based strategy]]></category>
		<category><![CDATA[Yvonne Genovese]]></category>

		<guid isPermaLink="false">http://www.datavaluetalk.com/?p=1255</guid>
		<description><![CDATA[
In the evolution of information technology Gartner provided a new term as ultimate goal to reach: Pattern-Based Strategy.
As you were reaching for the  final destination in your ultimate journey to transform bits and bytes to real information, again you encounter a new optimum. Pattern-Based Strategy, as described by Yvonne Genovese et al. can be identified [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1267" title="pattern" src="http://www.datavaluetalk.com/wp-content/uploads/2009/12/pattern.jpg" alt="pattern" width="95" height="75" /></p>
<p>In the evolution of information technology, Gartner has provided a new term as the ultimate goal to reach: <a href="http://blogs.gartner.com/andrew_white/2009/08/10/what-is-your-pattern-based-strategy/" target="_blank">Pattern-Based Strategy</a>.</p>
<p>Just as you were reaching the final destination in your journey to transform bits and bytes into real information, you encounter yet another new optimum. Pattern-Based Strategy, as described by <a href="http://my.gartner.com/portal/server.pt?open=512&amp;objID=260&amp;mode=2&amp;PageID=3460706&amp;authorId=15631" target="_blank">Yvonne Genovese</a> <em>et al. </em>can be identified as the latest era in all the eras of IT value-add. Basically, the level of control identifies in which of the eras you currently operate &#8211; from tight control and pure automation in the &#8216;old&#8217; days via augmentation, e-commerce/Web 1.0 and Web 2.0 to the highest era, called Pattern-Based Strategy.<span id="more-1255"></span></p>
<p>In Pattern-Based Strategy you make use of collective intelligence: you not only collect data inside and outside your enterprise from those you know, but also proactively collect information from the crowd, analyse this information, and see whether you can recognize patterns to which you know how to respond. Try to act fast on weak signals so that you are the first in that market.</p>
<p>As in Business Intelligence (where you look in your rear-view mirror to make decisions for the future) and Business Activity Monitoring, Pattern-Based Strategy (where you look around and even let others look around while you make decisions) only works if you can compare data, if you can make information out of your data. In order to make the right company decisions, to steer well, you need to be able to rely on the information that you use. And although we like the new approach behind Pattern-Based Strategy, we at Human Inference still see enterprises struggle with their Business Intelligence because of their data quality. We see them start with Business Activity Monitoring and fail because of their data quality. And now we see them start dreaming of Pattern-Based Strategy. Before you act on that dream, solve the data quality &#8211; there are solutions available.</p>
<p>For those who have already tried to combine information from social communities: the lack of standards there prevents you from easily collecting that collective intelligence for your patterns. At Human Inference, we value your data; we have identified these issues and provide data quality tools to enrich this collective intelligence in order to transform it into trusted information for your strategy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datavaluetalk.com/2009/12/21/lets-be-honest-solve-your-data-quality-before-jumping-into-pattern-based-strategy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
