
Mass production and assembly of finished genealogy data -- up to 2000 times faster

Posted: Fri Aug 06, 2010 10:45 pm
by huffkw
In recent years, the Church has taken on some huge projects to streamline and speed up genealogy research. It is putting billions of images online, with an online indexing system to harness the cooperation of many people to make those records accessible. It has assembled in one database all the Church historical records related to temple work.

But still there is an important piece missing, in my opinion. Since the times of Henry Ford, the mass production techniques of specialization and cooperation have changed most of the industries in the United States to make them hundreds or thousands of times more productive, but the genealogy industry has largely escaped those radical reengineering efforts. It is still mostly a "cottage industry" in its work methods, barely touched by the "industrial revolution" described by Adam Smith.

It appears that the LDS Church needs an annual flow of new fully-researched names in the range of 10 million in order to avoid all use of unresearched names and duplicates from the past. I don't see another reliable way being developed to easily produce at least 10 million new fully-researched names each year, so I wish to describe such a system. It should be noted that there are some fee-exchange aspects of the system which are needed to ensure a certain level of fairness among the participants. I assume those necessary fee-exchange features make it unsuitable for direct Church sponsorship.

The inherent "mathematics of genealogy" strongly invite a change in approach to achieve vast gains in overall productivity through "industrial strength" cooperation. These mathematics have always been there but seem to have gone completely unnoticed. The solution I suggest is relatively easy to implement with today's computers, but it certainly could have been done 10 years ago, and probably could have been done 30 years ago with the first availability of general-purpose computers.

Below is a short description of the project I propose. All the important pieces of the system have been tested in research mode, and now are being assembled for general public use. A patent was issued in 2004. The system is actually operating at a very basic level in a beta testing mode at http://www.GenReg.com, and significant amounts of documentation appear on the site under the "about us" button, including a 30 minute narrated PPT video. It can accept registrations and genealogical data, but it is far from complete. Two people in southern Utah County are doing the work in PHP/MySQL. A noticeably larger staff with a wide mix of skills will be needed to make it fully successful, including programmers to flesh out all the necessary mechanisms. Anyone with the slightest interest is invited to inquire further. This is a bootstrap startup with a near-zero budget. It is an idealistic endeavor, and I assume that venture capital involvement and extreme profit-making goals would seriously warp the project from the beginning, although there is every reason to believe that the project could become at least self-supporting.

------------------------------------------------------------------
Genealogy Registry, http://www.GenReg.com

We provide the concepts, methods, and tools for researchers to cooperate in completing high quality genealogy on an industrial scale, up to 2,000 times faster than in the past. Researchers can "finish" data so that no one need ever do the basic research again, which is the best possible way to stop all unnecessary duplication. With this vast increase in efficiency, it becomes possible to finish the basic genealogy for whole nations in extremely short time periods. Theoretically, the entire US could be completed in two weeks: if each of the 4 million US genealogists completed and contributed 75 names, spending about 1 hour on each name (about 75 hours in all, or roughly two 40-hour work weeks), the 300 million names of those who have died in the US could be collected in a two-week period.

Participating researchers who do their part by completing a 10-generation single-surname descendency structure should receive back completed versions of the other 1023 single-surname descendencies they need to complete their full 10-generation pedigree, making it a 1000-to-1 gain for them. This would be an impossible, million-dollar product using normal historical methods, but participants get it essentially for free. The completed database can also be treated as a cooperative publishing project, where the royalties from distribution would be returned to the data suppliers who agree to the publishing of their data.
------------------------------------------------------------------
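
The figures in the description above can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope check using the post's own round estimates. The 40-hour work week is an added assumption, and reading "1023 other descendencies" as 2**10 lines minus the researcher's own is one interpretation that reproduces the quoted number, not something the post spells out.

```python
# Back-of-envelope check of the estimates quoted in the GenReg description.
# All inputs are the post's round numbers unless marked as an assumption.

US_GENEALOGISTS = 4_000_000   # post's estimate of US genealogists
NAMES_PER_PERSON = 75         # names each genealogist would contribute
HOURS_PER_NAME = 1            # research time per name
HOURS_PER_WORK_WEEK = 40      # assumption: a standard full-time week
GENERATIONS = 10              # depth of the pedigree


def total_names() -> int:
    """Names collected if every genealogist contributes a full quota."""
    return US_GENEALOGISTS * NAMES_PER_PERSON


def weeks_per_person() -> float:
    """Full-time weeks each genealogist would spend on the quota."""
    return NAMES_PER_PERSON * HOURS_PER_NAME / HOURS_PER_WORK_WEEK


def lines_received() -> int:
    """Surname lines supplied by others.

    Assumption: a 10-generation pedigree needs 2**10 single-surname
    lines, and the researcher completes exactly one of them.
    """
    return 2 ** GENERATIONS - 1


if __name__ == "__main__":
    print(f"total names:      {total_names():,}")
    print(f"weeks per person: {weeks_per_person():.2f}")
    print(f"lines received:   {lines_received():,}")
```

Under these inputs the total comes to 300 million names, each person spends a little under two working weeks, and the return is 1023 lines for every one contributed.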

An Efficient Human Genealogy Computer Alternative

Posted: Mon Aug 16, 2010 5:26 pm
by huffkw
I thought it might be interesting to add a note about what kind of a "human computer" it would take to do the genealogy cooperative research for the entire United States if modern digital computers were not available. The output would be the 300 million fully-researched names of all those people who have died in the United States. We could actually have done this within reasonable time and cost parameters in a pre-computer world, if someone had invented the process 100 years ago. The central "computer" would consist of about 500 people and 2000 four-drawer file cabinets, altogether filling up space equal to about one floor of the Genealogy Library in Salt Lake City. Working one shift a day, those 500 people could record all the ancestors and relationships in about 50 years. With three shifts a day of 500 people, it would take about 17 years. They would be receiving that information from millions of genealogists who prepared it in a format suitable for their filing system. For a more detailed description, see
http://www.genreg.com/presentations/HumanComputer.pdf.
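
The workload implied by these numbers can be sketched as follows. The totals (300 million names, 500 people per shift, 50 years on one shift, 2000 four-drawer cabinets) are taken from the post. The 250 working days per year and 8-hour shift are assumptions added here only to turn the totals into a per-person rate, and the function names are illustrative.

```python
# Rough throughput implied by the "human computer" estimate.
# Staffing, duration, and cabinet counts come from the post;
# the working calendar below is an assumption for illustration.

TOTAL_NAMES = 300_000_000
STAFF_PER_SHIFT = 500
YEARS_ONE_SHIFT = 50
FILE_CABINETS = 2000
DRAWERS_PER_CABINET = 4
WORK_DAYS_PER_YEAR = 250   # assumption
HOURS_PER_SHIFT = 8        # assumption


def years_needed(shifts_per_day: int) -> float:
    """Calendar years if the same staffing is repeated on each shift."""
    return YEARS_ONE_SHIFT / shifts_per_day


def names_per_clerk_hour() -> float:
    """Names each person must record per working hour."""
    clerk_hours = (STAFF_PER_SHIFT * YEARS_ONE_SHIFT
                   * WORK_DAYS_PER_YEAR * HOURS_PER_SHIFT)
    return TOTAL_NAMES / clerk_hours


def names_per_drawer() -> float:
    """Average number of names each file drawer would have to hold."""
    return TOTAL_NAMES / (FILE_CABINETS * DRAWERS_PER_CABINET)


if __name__ == "__main__":
    for shifts in (1, 3):
        print(f"{shifts} shift(s): {years_needed(shifts):.1f} years")
    print(f"names per clerk-hour: {names_per_clerk_hour():.1f}")
    print(f"names per drawer:     {names_per_drawer():,.0f}")
```

With the assumed calendar, each person would need to record about six names per hour, and three shifts bring the 50 years down to just under 17.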

Posted: Tue Aug 17, 2010 10:04 pm
by greenwoodkl
I would think more responses might be gathered if this thread were posted on the FamilySearch forums instead of the LDSTech forums.

Posted: Thu Aug 19, 2010 5:18 am
by huffkw
Thanks for your comment. I'm sorry to say I did not understand it correctly at first. I guess I don't keep up well enough with all that goes on in the various Church-related forums. I did finally take a look at the separate FamilySearch Forums (http://forums.familysearch.org), but it seemed to me like the topic might clash with the focused material that is already there. Perhaps the next step is to put this post on one of the more public genealogy blogs listed on Cyndislist.com, for example. If this project ever goes nationwide, it will probably mean that 95% of the users will be non-LDS anyway. The Church members might be kept so busy with all the new Church programs that they will not have much time to participate in this one.

Incidentally, I expect that many genealogists will have trouble accepting that this new idea might be workable, simply because no one else has come up with it, even though vast amounts of time, thought, and money have been spent by various parts of the genealogy industry over the last decade on software and database development. I suspect the reason it has not popped up before is that there are about 12 problems or issues that have to be recognized and solved simultaneously before it becomes obvious that there is a better way. Taking the issues two or three at a time, there appears to be no better answer than what we see. Maybe I finally found a use for all that calculus in college. :-)

In my opinion

Posted: Thu Aug 26, 2010 9:01 am
by dpenrod75
The greatest incentive for ensuring accurate genealogical data is the uniting of families eternally. Rewards of monetary gain will only expose corruption to the entire process.

Posted: Thu Aug 26, 2010 4:51 pm
by huffkw
You raise a serious question, of course, but I have several answers I can try out on you, and see if any of them help with your concerns.

My main interest is in simply getting very large amounts of genealogy work done quickly, and a little money in the right places could make a big difference, as I hope to show below. If we can change the incentives slightly, we might get a huge improvement in the outcome.
1. My first answer is that anyone who wants to do it for free is more than welcome to do so, so I don't think it will keep out the idealists. In any event, the database will have to be rather large before any of it could be considered for "publication," since the assumption is that individual pedigrees are the product being sold. Those pedigrees can only be constructed from a very large database, if any kind of general coverage is to be provided.
2. There is already a lot of money flowing around in the genealogy industry, perhaps $2 billion or $3 billion a year, to make a wild guess. We have lots of professional genealogists, librarians, archivists, microfilm specialists, etc., who make their livings doing genealogy work. Some of the best quality work could come from professionals, if they decide to get involved. Genealogy is certainly not the sole province of unpaid volunteers -- witness the rather large staff at Ancestry.com and other such organizations.
3. I would like to see this genealogy work done on a world scale. If you're looking at this problem strictly from an LDS Church viewpoint, then of course your interest in uniting families forever is a valid one. But probably less than 2% of the world's genealogists are concerned about that issue, even though we might share large blocks of ancestors in common. I am guessing that we have somewhere between 100,000 and 400,000 genealogists in the Church. There are perhaps 4 million genealogists in the United States and perhaps as many as 80 million genealogists worldwide. If we want to get some serious work done on a worldwide scale, then there are perhaps 200 to 800 times more people out there who could work with us if we have the correct system and methods.
4. There are many people who would like to do genealogy work, but do not have the resources to do it. For example, people in Russia or China might be happy to do work on a worldwide genealogy database (some of them work for United States genealogy companies already), but they would have to receive some compensation. So we might find United States genealogy hobbyists funding the work of people in Europe, Eastern Europe, and elsewhere, which would help with this project. In that situation, the genealogy system would end up doing some missionary work and welfare work as well, especially if some of these workers are members or friends of members.
5. Even in the United States, there are lots of people who would love to do genealogy work for a living, especially since their health may preclude them from doing more vigorous physical work.
6. If your fear is that people will just put garbage in the system in the hopes of collecting money, there need to be safeguards against that regardless of the source of the data or the motives of those who enter it. Even if we assume that people today have the best of motives, it is easy to see that well over half of the data floating around on the Internet is so bad as to be almost useless. One of the largest problems we have, and one of the largest causes of duplication, is simply that the data that is available is so bad as to be untrustworthy, so that people feel they have to do it again.
7. I believe there is a general "fairness" question involved in genealogy research, some of which can be partially solved through some kind of payment. There are some people who do a great deal of work, and then may feel pestered when many other people would like copies of part of their treasure trove. The result is that they may hide their life's work to avoid the hassle. If they can receive the proper recognition for the work they have done, and possibly receive enough compensation to pay some of their research expenses, they might feel a great deal more open about sharing their work with others.
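
The ratios in point 3 follow directly from the guesses given there. This small sketch just divides the post's estimates; every number is the author's rough guess, and the function names are illustrative.

```python
# Ratios implied by the estimates in point 3. Every figure is the
# post's own rough guess, not a measured count.

CHURCH_GENEALOGISTS_LOW = 100_000
CHURCH_GENEALOGISTS_HIGH = 400_000
WORLD_GENEALOGISTS = 80_000_000


def world_to_church_ratio(church_count: int) -> float:
    """How many genealogists exist worldwide per Church genealogist."""
    return WORLD_GENEALOGISTS / church_count


def church_share(church_count: int) -> float:
    """Church genealogists as a fraction of the worldwide total."""
    return church_count / WORLD_GENEALOGISTS


if __name__ == "__main__":
    for count in (CHURCH_GENEALOGISTS_LOW, CHURCH_GENEALOGISTS_HIGH):
        print(f"{count:>7,} in the Church: "
              f"{world_to_church_ratio(count):.0f}x worldwide, "
              f"{church_share(count):.3%} of the total")
```

The 800-to-1 figure corresponds to the low estimate of 100,000 Church genealogists; the high estimate of 400,000 gives 200-to-1. Either way the share stays well under the 2% mentioned in the post.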

Posted: Fri Aug 27, 2010 6:22 am
by dpenrod75
Well good luck. If it moves the good work forward then I hope it works out.