Continuing on the theme of semantic isolation and data siloing* somewhat, I was listening to the Diane Rehm Show on US public radio network NPR via Finland's YLE Mondo. Today's programme, "New Voter ID Laws and the 2012 Elections", was about the USA's laws regarding voter eligibility and the problems of keeping track of who is and who isn't allowed to vote in certain elections (the 2012 presidential election being of particular concern).
One part of the programme concentrated on the difficulty of cross-referencing between voting lists in states, counties and various government bodies such as the driver and vehicle licensing agencies (DMV). One of the major problems is actually identifying and subsequently cross-referencing people by ID numbers (plural!), by name (due to misspellings, usually involving punctuation) or by address and location. Making this even more interesting is the temporal aspect: over time people move, and records in differing data sets end up overlapping or being temporally disjoint.
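To make the name and temporal problems a little more concrete, here is a minimal sketch in Python. It is purely illustrative and not anything described in the programme: the function names (`normalise_name`, `name_similarity`, `periods_overlap`) are my own hypothetical helpers, and real voter-list matching would also need to weigh ID numbers, addresses and much more.

```python
import re
import unicodedata
from datetime import date
from difflib import SequenceMatcher


def normalise_name(name: str) -> str:
    """Lower-case a name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    no_punct = re.sub(r"[^\w\s]", "", no_accents.lower())
    return " ".join(no_punct.split())


def name_similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 of two names after normalisation."""
    return SequenceMatcher(None, normalise_name(a), normalise_name(b)).ratio()


def periods_overlap(start_a: date, end_a: date, start_b: date, end_b: date) -> bool:
    """True if two closed date ranges share at least one day."""
    return start_a <= end_b and start_b <= end_a
```

With this, two spellings that differ only in punctuation, such as "O'Brien, Mary-Anne" and "OBrien Maryanne", normalise to the same string, while `periods_overlap` tells us whether two records for the same address could refer to the same period of residence or are temporally disjoint. Deciding what similarity score counts as "the same person" is exactly the kind of judgement that still needs a human.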
One of the current solutions is to use what were termed "election geeks" - people with highly detailed knowledge of voter lists and of how to match together records in different formats from different states and agencies. That is, a group of people who are highly skilled in performing the manual task of de-isolating the semantics of each data set (of voters) and matching these together.
One of the presenters remarked that, while a technological solution was necessary, we need more so-called election geeks. Putting it another way, we need more highly skilled engineers specialising in information, information theory, semantics and ultimately semiotics. What a great idea!
*This links to part 1, part 2 and part 2 and a half.