
Two data standards, two database systems

Posted: Tue Feb 06, 2007 11:26 am
by huffkw
Raising the data standard AND increasing research output

We need a central system that encourages data stability

Based on recent discussions, I think one of the basic issues still under the radar of major Church genealogy systems is that we are using two very different data standards in different settings, but only officially recognize and assist one standard in centralized systems. We have one fairly relaxed standard for most temple ordinances, but a much more demanding unofficial standard for family histories (and related sealings).

It would be very helpful to officially recognize the higher standard for family histories and sealings and to provide central systems that encourage and support it. I believe the situation needs to change, so that instead of finding the best genealogy data on home PCs, as it is now, we find even higher quality data on a central system in the future. The extensive collections of complete, current, reliable, incrementally improvable, peer-reviewable, unduplicated, lineage-linked family genealogy data found on PCs should be found in a new centralized system in even more robust form.

If someone is considering creating a new “client-side” file system to replace PAF, I hope that before that happens they will consider leaving PC-PAF essentially as it is and using the available programming resources to add all the new functionality centrally, where in my opinion it is most needed. These new functions can only be fully effective at a single central site where all genealogists can contribute and benefit.

The home PC-PAF collections would continue to be the source, the original assembly point, for most new genealogy data on the central system, and might continue to be the gold standard for the currency and accuracy of some parts of it, such as new births into the family. But the central database could eventually integrate far more data than could ever be assembled on any PC, and would support the linking of many more types of data to individuals (using storage on many other servers). At some point, the PC-based database programs may evolve into being used mostly as places to store downloaded copies of the central data, available for family review in offline mode.

(My note gradually grew into nearly 4 pages, so I put the rest on my website)
MORE......http://www.genreg.com/study/20070204Two ... ndards.htm

Many different people

Posted: Tue Feb 06, 2007 2:15 pm
by ClarkeGJ
A few points to consider:

The Family History department is getting closer to releasing the New FamilySearch. This is a "centralized" pedigree, so everyone can look at and add to the same family tree. The collaborative environment will help contributors communicate with one another to improve accuracy and reduce duplication. While this browser-based application will accommodate most new and intermediate genealogists, there are still many reasons for the use of "client-side" file systems. There will be reasons for custom user interfaces to accommodate different user profiles and cultural differences. There will be reasons for off-line analysis and processing of genealogical data. There will be a need to keep private data separate from shared and public data.

If these "PC-PAF"-like products are upgraded to be compatible with the New FamilySearch, then we may one day have One Data Standard and One Data System, even though the data processing and data storage may be distributed for one reason or another. The key is for the Church to create a commanding benefit for playing in a common collaborative space.

Overall I think I'm in agreement with most of the principles you have presented.

Posted: Wed Feb 07, 2007 8:04 pm
by huffkw
Thanks for your thoughts. With all the design work being done on new genealogy computer systems, this should be a very interesting year.

A Long Term Goal

Posted: Thu Feb 08, 2007 11:17 pm
by huffkw
I have already said plenty on this forum, but I have one more item.

I want to suggest a worthy goal, well within our current capabilities, that I will use to judge progress over the next few years:
-----------------------------------------------------
Goal: A database that contains at least the 300 million deceased Americans in integrated, lineage-linked, well-documented form (mostly linked to relevant source documents, and, where available, their images), with no duplicates (except those few clearly identified as ambiguous, with all the alternate readings), which might be completely populated within 5 years.
-----------------------------------------------------
All these names should be verifiably tied in some way to LDS members as lineage-linked relatives (or it can be easily seen who is not so tied, if there are any such people), so that no one could argue in the future that we are just randomly doing names without a valid family interest, and are insensitive to the feelings of other religious traditions that see our temple work as an affront to them and their ancestors whom they see as “doing well enough without the Mormons’ help.”

The non-members who may have a tendency to be critical can also get a huge information benefit from this new system for their own genealogical work, so that they will have much more difficulty making their complaints. They will at least have to say “This system is very helpful to us, but.......” when they offer criticisms. Or perhaps they can be charged with any policing they feel is necessary, perhaps being able to mark but not remove people.

Creating this large, high-quality database will solve a host of practical problems, including the need for sufficient names to supply the busy temples. The finished database could provide a fifty-year supply of the highest quality temple names, including complete family relationships for all sealings to take place.

If we achieve anything less than this goal, I will just consider our interim work to be part of a learning process that will eventually get us there. We can do it straight out of the box, or we can meander until we get there. The only question is when we will be able to do the reengineering work and user training work to get us there.

One would expect the first responsibility for genealogical data accuracy to be on the individual members, but I think we have reached the point where it is no longer adequate to leave it just to them. Without better central storage and cooperation tools, I believe we have reached an impasse, where the ordinances that have already been done are so voluminous and accessible that they actually discourage many participants from doing any really new work.

Finally, if we reach this goal, I expect we will have developed a system that can easily be upgraded to make it adequate for worldwide use.

Posted: Sun Feb 11, 2007 2:28 pm
by mikael.ronstrom
Hi,
These notes are based on my personal experience of genealogy, which is entirely from Sweden, so they are not necessarily applicable globally.

In Sweden there are church books that record births, marriages, deaths and sometimes also when and where people moved. In addition, a great resource is the record of a yearly visit by the priest, called the husförhörslängd (HFL), freely translated as "house investigation record". The HFL records all people living in a village and when they were born, married and died. As such it is an invaluable resource for genealogists.

These records make the life of a genealogist fairly straightforward, so following your ancestors back in time becomes the natural way forward. There is also a surplus of data, so duplication should be a risk only when a genealogist doesn't record his data properly. Around 90% of all births were recorded (my own personal estimate), so when all these records exist side by side it is quite OK for genealogists to follow the scheme of searching for their own ancestors.

In well over 90% of all counties in Sweden, all these books exist back to at least the early 1800s, and in most cases they span back to the beginning or middle of the 1700s. Further back it becomes more difficult, and an approach where each person cares only for his own ancestors becomes inefficient. So another approach is required.

As an example, around 35% of my personal ancestors come from a county called Luleå. Here the birth, marriage and death records exist from around 1695, and the HFL from around 1720. This means that almost every line of my ancestors from here can be researched back to those born around 1700. I also have a gold mine here in that all deaths recorded between 1708 and 1756 also give the names of the parents, the birth place and the age at death. So personal research on ancestors here can take one pretty far, and it is pretty straightforward to avoid duplication using normal indexes.
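
To illustrate why a death record with age, parents and birth place is so useful, here is a minimal Python sketch that turns such an entry into a search window for the birth. The `DeathRecord` fields and the sample person are invented for illustration and are not taken from any real register; the age is assumed to be given in whole completed years.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeathRecord:
    """Hypothetical transcription of one death entry (field names are assumptions)."""
    name: str
    death_year: int
    age_at_death: int  # assumed to be whole completed years
    father: str
    mother: str
    birth_place: str


def birth_year_range(record: DeathRecord) -> tuple[int, int]:
    """Return the (earliest, latest) birth year consistent with the stated age."""
    latest = record.death_year - record.age_at_death
    # If the birthday had not yet passed in the year of death, birth was a year earlier.
    return (latest - 1, latest)


# Invented sample entry, for illustration only.
sample = DeathRecord("Anders Olofsson", 1730, 62, "Olof Andersson", "Karin Persdotter", "Luleå")
print(birth_year_range(sample))  # (1667, 1668)
```

The parents' names and birth place then narrow the search to one or two candidate entries in the earlier records, assuming the age stated at death is accurate.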

However, research beyond this is also possible: almost 75% of the people who lived between 1600 and 1700 can also be found. This requires a completely different approach, one that only works if the focus is to research all people in the county of Luleå, and possibly also people in the neighbouring counties.

The material accessible for those times is tax records, which were made yearly and contain only the head of the family and the number of sons, daughters, sons-in-law, daughters-in-law, elderly, servants and so forth. In addition there is a gold mine of court records. However, these records are more or less impossible to use if the focus is to research only one line. A better approach is for people with ancestors from this area to join forces: record all tax records, transcribe court records to make them indexable, and go through all other types of records. With this material at hand it is again possible to find ancestral trees. Some of the people found might well be impossible to link, so one simply handles them as probable ancestors.
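
As a rough sketch of what "recording all tax records" could look like once transcribed, the Python below groups yearly tax-roll lines by village and head of household so that one household can be followed across years. The `TaxEntry` fields, the function names and the sample rows are all assumptions made for illustration; real rolls will need more columns and tolerant name matching.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxEntry:
    """One yearly tax-roll line; all field names are illustrative assumptions."""
    year: int
    village: str
    head: str
    sons: int = 0
    daughters: int = 0
    servants: int = 0


def index_by_household(entries):
    """Group entries by (village, head of household), each group sorted by year."""
    index = defaultdict(list)
    for entry in entries:
        index[(entry.village, entry.head)].append(entry)
    for timeline in index.values():
        timeline.sort(key=lambda e: e.year)
    return dict(index)


def years_present(index, village, head):
    """Return (first_year, last_year) for a household, or None if it never appears."""
    timeline = index.get((village, head))
    if not timeline:
        return None
    return (timeline[0].year, timeline[-1].year)


# Invented sample rows, for illustration only.
rolls = [
    TaxEntry(1655, "Sunderbyn", "Nils Persson", sons=2),
    TaxEntry(1640, "Sunderbyn", "Nils Persson", sons=1, daughters=1),
    TaxEntry(1641, "Sunderbyn", "Per Nilsson", servants=1),
]
index = index_by_household(rolls)
print(years_present(index, "Sunderbyn", "Nils Persson"))  # (1640, 1655)
```

A head of household who disappears from the roll around the time a same-named son appears as a new head is the kind of pattern that suggests a probable, though unproven, link.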

So handling these records will require a more sophisticated approach to how we cooperate as genealogists. I personally think that most of the methods will be localised, but if we make tools available that are adaptable to local requirements, then genealogists in those areas can get together and cooperate on generating records for those counties all the way back to around 1600 in the case of Luleå. There are even some lines where one can continue back to around 1500, and some very rare ones back to 1400, but these are mostly based on tax records with very little information other than the name of the head of household.

Rgrds Mikael Ronstrom


Posted: Sun Feb 11, 2007 6:11 pm
by greenwoodkl
I heartily agree with mikron. The majority of my personal research in the last few years has taken me to French-speaking Switzerland. The birth records contain parent names and occasionally relative names of those who stood in as witnesses/godparents. As a practicality and to hopefully assist others in the future researching in the same villages, I'm attempting to record nearly everyone in the village records. This way I look for duplicate individuals and unique names to link up into family units. Any tool that could be created to help groups or individuals manage the research into vital records and help form families out of those individual records would be appreciated.
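
A minimal sketch of the kind of tool described here, assuming birth entries have already been transcribed with the parents' names: it groups children under the same father and mother into candidate family units. The `BirthEntry` fields and the sample names are invented, and the matching is deliberately crude (case and spacing only), so spelling variants would still need human review.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BirthEntry:
    """One transcribed birth or baptism entry (field names are assumptions)."""
    child: str
    year: int
    father: str
    mother: str


def normalize(name: str) -> str:
    """Lower-case a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def family_units(entries):
    """Group entries by normalized (father, mother); children sorted by birth year."""
    families = defaultdict(list)
    for entry in entries:
        key = (normalize(entry.father), normalize(entry.mother))
        families[key].append(entry)
    for children in families.values():
        children.sort(key=lambda e: e.year)
    return dict(families)


# Invented sample entries, for illustration only.
register = [
    BirthEntry("Jean", 1712, "Pierre Favre", "Marie  Rochat"),
    BirthEntry("Louise", 1709, "pierre favre", "Marie Rochat"),
    BirthEntry("Abram", 1710, "David Monnier", "Anne Favre"),
]
for parents, children in family_units(register).items():
    print(parents, [c.child for c in children])
```

Two couples with identical names in the same village would be merged by this key, so a real tool would also have to compare dates and witnesses before treating a group as one family.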

Waiting in the wings is the perfect answer........

Posted: Mon Feb 12, 2007 4:24 am
by huffkw
mikron and kgthunder:
I am very pleased to hear both of your reports on your research interests and possibilities.

All my design work and prototyping has been aimed at creating a place that would welcome all the kinds of data you describe, whether pure personal pedigree data, or more widely and generically valuable data consisting of whole villages or counties with all known relationships, including data arranged into descendant structures beginning with ancient progenitors. There may be thousands of researchers – descendants of those people in the villages you document – who would care about these broader groups of connected people, where your own personal pedigree may only be of special interest to a few people. My system would be happy to accept both, but gives separate visibility and emphasis to large collections of names in descendant form, since they are more generally useful, and give rise to few duplicates.

With the tool I wish to see going, a group of people could work together to vacuum up and interrelate whole counties, as you suggest. If one person or a cooperating group does a good job, no one need ever do it again, making it a great boon to all descendants, and hopefully it will encourage some of those ambitious descendants to do the same for other whole counties or villages. The net gain in overall research efficiency by this specialization is huge – hundreds of times more efficient, especially since the public record sets’ idiosyncrasies and languages do not need to be learned by thousands of different people. One locality expert can do it once for all. Those who follow behind can check it out for themselves, but it is a very simple matter at that point, especially if library references are given or actual images of documents are linked-to as well.

As far as I know, the usual means for cooperating these days consists of a place to store a static GEDCOM, and maybe get some indexing of it. That method does some good, but it also has a host of depressing problems and inefficiencies. What I want to get going in some way is an online, real-time database system that allows users to update their data every day if they wish, one name or one relationship at a time. There would also be the option to do bulk updates, including imports of selected segments of other online family structures in the same database, or simply to create links to those other structures. In short: all the typical PAF features, but online, plus all the extra cooperation features of a very specialized wiki. (The typical wiki would turn to confusing, highly duplicated mush in no time, I fear.) I have the complete prototype running, but it needs some real professional production-system people to make it fly, so that thousands of people could use it simultaneously. Polishing it to that level and paying the hosting bills will take a lot more cash than I have. But I keep hoping there will be a way. . . . . . . . .
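
The idea of updating a shared tree "one name or one relationship at a time" can be sketched in a few lines of Python. This is not the prototype mentioned above, whose design is not described here; the `SharedTree` class, its method names and the name-plus-birth-year duplicate key are all hypothetical, and a real system would need persistence, user accounts and far more careful matching.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    person_id: int
    name: str
    birth_year: int
    parent_ids: set = field(default_factory=set)


class SharedTree:
    """Toy in-memory stand-in for a shared online tree (hypothetical design)."""

    def __init__(self):
        self._people = {}   # person_id -> Person
        self._by_key = {}   # (normalized name, birth year) -> person_id

    def add_person(self, name, birth_year):
        """Add one person, or return the existing ID if the same key is already stored."""
        key = (" ".join(name.lower().split()), birth_year)
        if key in self._by_key:
            return self._by_key[key]
        person_id = len(self._people) + 1
        self._people[person_id] = Person(person_id, name.strip(), birth_year)
        self._by_key[key] = person_id
        return person_id

    def link_parent(self, child_id, parent_id):
        """Record one parent-child relationship between two existing people."""
        if child_id == parent_id:
            raise ValueError("a person cannot be their own parent")
        if child_id not in self._people or parent_id not in self._people:
            raise KeyError("both people must be added before they are linked")
        self._people[child_id].parent_ids.add(parent_id)

    def parents_of(self, child_id):
        """Return the sorted IDs of the recorded parents."""
        return sorted(self._people[child_id].parent_ids)


# Invented sample data: two contributors submit the same person.
tree = SharedTree()
first = tree.add_person("Anna Berg", 1701)
second = tree.add_person("  anna  berg ", 1701)  # resolves to the same record
father = tree.add_person("Erik Berg", 1670)
tree.link_parent(first, father)
print(first == second, tree.parents_of(first))
```

Returning the existing ID instead of creating a second record is the step a static GEDCOM upload lacks; the hard part in practice is deciding when two submissions really describe the same person.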

werelate.org

Posted: Tue Feb 13, 2007 7:39 pm
by dhanks-p40
huffkw wrote: . . . all the typical PAF features, but online, plus all the extra cooperation features of a very specialized wiki. (The typical wiki would turn to confusing, highly duplicated mush in no time, I fear). . .
http://werelate.org is a very interesting wiki-based genealogy tool

Posted: Wed Feb 14, 2007 12:11 pm
by dlongmore
I have also found WeRelate to be a very well designed and implemented wiki tool. They just added a GEDCOM upload feature as well as a Family Tree Explorer feature that is quite impressive. I would encourage you to give it a try.

The Holy Grail

Posted: Thu Feb 15, 2007 8:37 pm
by huffkw
Thanks for your observations.

WeRelate.org has some great features as a wiki, not the least of which is that it is free, but I am looking for the Holy Grail, the perfect worldwide system, and this is not it (as far as I can tell). The trouble is, I haven’t found anyone else who shares my interest in a serious worldwide system, suitable for building nation-sized genealogy databases that are well-documented and without duplicates, so it may never happen. Nobody may ever build one. For you database people out there: if I told you I was sure there was a general solution to the problem, could you come up with your own version of it? Then someone might understand my point. There are numerous logical problems to solve, but they can be solved reliably.

A Failure of Imagination
When I bring up this issue, it is like I said I had invented space travel, and could go to Venus and back in a week, but the only thing anyone can ever imagine, when I say those words “space travel,” is base jumping, parachuting off a cliff. That is a kind of space travel, I guess, but that is not what I meant. But no matter how I explain space travel, that is the only image they can form in their minds.