newFamilySearch Pedigree Display Record

garysturn · #31

But how often would the most submitted name be wrong? (10% of the time?) That is the only time someone would have to enter a personal opinion to change it, and if enough people entered a better opinion, that name would soon become the most submitted name and would then display for all users who have no personal submission.

Below I have included a screenshot of a database I created in Access that sorts the names like I described.

Add to that my other suggestion: display in red any name the person has submitted that disagrees with the most submitted name, and people will work with each other to get their submissions to all be the same. If you don't agree with the most submitted name, you can communicate with the other submitters to bring them over to your side.

One problem with a true Wiki-style approach in this type of application is that every time you log on to the system, someone may have changed the view. People would go in and change it back, only to find it changed again the next time they log on.

RussellHltn wrote: OK, so someone puts in just a first name. That will cause others to see just one name, so the first person to add a full name will now become the default for all. It's like a Wikipedia effect. The problem with the current situation (or an automatic situation that can only be overridden by your own Personal Opinion) is that each person who is interested in that name will have to add an opinion just to get their display to look right. That could create a very large number of records and a slow system. While my idea doesn't reduce it to the minimum, it does eliminate the need for 100 personal opinions to satisfy 100 users of that combined individual.

garysturn · #32

Some have proposed some type of voting system or some way for someone to choose a display record for all to see. I originally thought that would be a good solution, but then I realized that there would need to be a record to hold that choice in each folder. Then what happens if there are 5 duplicates, each with a record in it that contains this vote or choice, and they get combined into one folder? Which record becomes the new record with the choices in it, and what do you do with the other four records? That is when I came to the conclusion that the only way that will work is to have a sort that will elevate the most submitted version. We will just have to work with others to get disagreements resolved when we don't agree; the version that has the most submissions should still be the top record for those without a personal submission.
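
To illustrate the point about combining folders, here is a minimal Python sketch. It assumes each folder can be reduced to a count of how many times each version of a name was submitted; the `merge_folders` function and the sample names are hypothetical and not part of newFamilySearch. Under that assumption, combining five duplicates is just adding the counts, so no single "choice" record has to be picked as the survivor.

```python
from collections import Counter


def merge_folders(folders):
    """Combine duplicate folders by adding up per-name submission counts."""
    merged = Counter()
    for folder in folders:
        merged.update(folder)  # Counter.update adds counts together
    return merged


# Hypothetical data: five duplicate folders for the same individual.
duplicates = [
    Counter({"John Henry Smith": 3, "John": 1}),
    Counter({"John Henry Smith": 2}),
    Counter({"John H. Smith": 4}),
    Counter({"John": 1}),
    Counter({"John Henry Smith": 1, "John H. Smith": 1}),
]

merged = merge_folders(duplicates)
# The most submitted version across all five folders becomes the top record.
top_name = merged.most_common(1)[0][0]
print(top_name, merged[top_name])
```

The order in which the folders are combined does not matter here, which is the property a per-folder vote record lacks.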

russellhltn · #33

I think the biggest problem with voting is that you'd have to have some kind of tracking on the votes so each user could only vote once per combined individual. That would be quite a large database.
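
A rough Python sketch of what that tracking might involve, assuming votes are keyed by user and combined individual. The `VoteLedger` class and all of the identifiers are made up for illustration. The ledger needs one entry for every (user, individual) pair that has voted, which is why it could grow so large.

```python
from collections import Counter


class VoteLedger:
    """Hypothetical store that allows one vote per user per combined individual."""

    def __init__(self):
        # (user_id, individual_id) -> record_id the user voted for
        self._votes = {}

    def cast(self, user_id, individual_id, record_id):
        """Record a vote; voting again replaces the user's earlier vote."""
        self._votes[(user_id, individual_id)] = record_id

    def tally(self, individual_id):
        """Count votes per record for one combined individual."""
        return Counter(
            record
            for (_, individual), record in self._votes.items()
            if individual == individual_id
        )

    def __len__(self):
        return len(self._votes)


ledger = VoteLedger()
ledger.cast("user1", "person1", "record_a")
ledger.cast("user1", "person1", "record_b")  # replaces user1's first vote
ledger.cast("user2", "person1", "record_b")
print(len(ledger), ledger.tally("person1"))
```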

The_Earl · #34

Ok, so now I have my screenshots, but I still have no idea what I am looking at.

I think there are two basic problems with genealogy.

1. Genealogy is a time-series problem. What existed in 1400 is not what exists now.

2. Genealogy is a massively parallel task. What I am working on is tied to what you are working on; it is just a matter of time before we step on each other's work.

This thread is not really concerned with the first, so I will leave that for now.

2:

If we are going to merge your genealogy to mine, we need some mechanism to do so. We have to be able to identify and resolve conflicts, and knit parts of the trees together that match.

With nFS, this is a HUGE problem, since we have a massive amount of work that has been siloed for a long time. Now we need to merge this work together.

We can learn a lot about the way to solve this problem by looking at existing systems that solve similar problems. I offer a few examples.

Source code repositories are the industry standard way of getting large teams of developers coordinated on large code repositories. These systems can be very simple, from one-at-a-time, to auto-conflict resolution and submission filters that check the code for certain attributes before allowing submissions (whitespace comes to mind). Often these systems are also part of a larger system that runs a sanity check (nightly build) periodically to verify that no critical parts fail (nightly unit testing).

Wikis are large collaborative documents that almost anyone can modify. A few interesting trends have come from large wiki projects like Wikipedia. Most Wikipedia articles have an adoptive editor. These people protect the pages from malicious editing and obviously erroneous entries. These editors are often not given admin or moderator rights; they simply build a reputation for correctness. Editors also often report other users to moderators for more drastic corrections.

Wikipedia articles that are unstable are marked as such. Most of these pages do not allow edits to the main page without first reaching agreement on the 'talk' page. Articles are also marked if they contain usable but problematic information, or if other editing is needed.

Many open source projects have large repositories of bugs written by the user community. These bugs are often incomplete, vague, erroneous, or otherwise useless. Buried in these lists are bugs that are well written, complete and important.

Bugzilla uses a voting system to push good bugs up, and allow bad bugs to stagnate. Voting does not remove or add a bug, but only can elevate the status of existing bugs. Viewing voting statistics shows the relative importance of a bug.

All of these systems have a few things in common:

1. They all require the author to be identified. In rare cases, this is only by IP, or some other non-deterministic name. Generally, anonymous users are not given the same privileges or deference as identified users. Users build credibility through meaningful identified contributions.

2. All these systems require the contributor to justify their contribution with some sort of message. The contribution is judged by its content, AND its stated purpose at submission. Often, contributions that would otherwise be acceptable are rejected because their justification does not match the actual submission.

3. All of these systems keep a complete change history. Deletions are rare, and normally only permitted to privileged users. There is always a history of all changes made, and often an easy mechanism to go back to a previous version.

4. General consent is needed before accepting a submission. Sometimes (in about half of my examples, it seems) this is implied by lack of action reverting the change. Systems that offer simple methods to roll back changes seem to use lack of action. Systems that are more difficult to roll back use more overt methods of change acceptance.

So, I don't think that answers any questions as to the way sorting should be done. I do think it gives a good argument for the four points above to be included.

Personally, I will stick by my stance that genealogy needs to be built in a version control system like software. I think I should have my genealogy 'branch' and I should be responsible to resolve conflicts between it and the 'trunk' version of my genealogy. Only uncontested, well verified information should exist in 'trunk'.

So, if I were to sort the above, I would only show my submitted information. I would mark records that conflicted with or partially matched other submissions. I would then have the option of merging the two records, or staying with my own record. I might make it possible to adopt parts of another record without merging the whole record. If I wanted to see the other records submitted, I would select the contested record and run the conflict tool to see the other records and their conflicts with my own.
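
As a minimal sketch of that kind of conflict marking, the Python below compares one record against another field by field. It assumes a record is a plain dictionary of fields; the function names, field names, and status labels are all hypothetical, and a real conflict tool would need fuzzier matching than strict equality.

```python
def compare_records(mine, other):
    """Return the sets of shared fields that agree and that differ."""
    shared = mine.keys() & other.keys()
    agree = {field for field in shared if mine[field] == other[field]}
    return agree, shared - agree


def status(mine, other):
    """Label another submission relative to my own record."""
    agree, differ = compare_records(mine, other)
    if not differ:
        return "match" if agree else "no overlap"
    return "partial match" if agree else "conflict"


def adopt(mine, other, fields):
    """Copy only the chosen fields from another record into a copy of mine."""
    updated = dict(mine)
    for field in fields:
        updated[field] = other[field]
    return updated


# Hypothetical records for the same person from two submitters.
my_record = {"name": "George Leo", "birth_year": 1850}
their_record = {"name": "Geo", "birth_year": 1850, "birth_place": "Ohio"}

label = status(my_record, their_record)
# Take only the birth place from the other record, keeping my own name.
merged_record = adopt(my_record, their_record, ["birth_place"])
print(label, merged_record)
```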

Barring that, I think I would prefer the ability to filter out the records that I did not submit, and the ability to sort by submitter, or by some other criteria (like DOB or name).

I find it problematic that some information (like submitter) would be only shown in the record color. That may cause accessibility problems.

The Earl

thedqs · #35

Source Control would be great, as I have previously mentioned, but the problem as I see it is that we need to know WHAT to give to each user as they view the trunk. Even with 2 reliable sources (Naval Census and Marriage Record in a specific case) you get two versions of a name Geo and George Leo. In this example we do not know the actual name, so which one do we go with? Personally, I believe the sort explained previously by Russell is a great solution for those who are just trying to link and get names but do not have time to verify the info before merging it.

The_Earl · #36

thedqs wrote: <snip>
we need to know WHAT to give to each user as they view the trunk. Even with 2 reliable sources (Naval Census and Marriage Record in a specific case) you get two versions of a name Geo and George Leo.
<snip>

My point is that you would make a choice. Trunk would only contain the justified, uncontested result of a merge. If you could not resolve the two records, they would remain 'branched'.

I guess you would have a UUID that would identify Geo and George Leo as candidates for [UUID]. You would be able to see all candidates for [UUID] if you requested them, but you would not by default see all of those records, just your current branch, or the current trunk version.
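
A small Python sketch of the candidate idea, under the assumption that each person gets one UUID and that every submitter's version is kept as a separate branch under it. The `CandidateStore` class, its method names, and the submitter names are invented for illustration.

```python
import uuid


class CandidateStore:
    """Hypothetical store of trunk and branch records keyed by a person UUID."""

    def __init__(self):
        self.trunk = {}     # person_id -> accepted, uncontested record
        self.branches = {}  # person_id -> {submitter: record}

    def submit(self, person_id, submitter, record):
        """Keep the submitter's version as a candidate for this person."""
        self.branches.setdefault(person_id, {})[submitter] = record

    def default_view(self, person_id, user):
        """Show the user's own branch if there is one, else the trunk version."""
        own = self.branches.get(person_id, {}).get(user)
        return own if own is not None else self.trunk.get(person_id)

    def candidates(self, person_id):
        """List every candidate record, shown only when requested."""
        return list(self.branches.get(person_id, {}).values())


store = CandidateStore()
person_id = uuid.uuid4()
store.submit(person_id, "alice", "Geo")
store.submit(person_id, "bob", "George Leo")

print(store.default_view(person_id, "alice"))  # alice sees her own branch
print(store.default_view(person_id, "carol"))  # no branch and no trunk yet
print(store.candidates(person_id))
```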

My original post was only partially on-topic, and I think this is starting to wander away from usefulness to the original purpose of the thread. I think I will try to gel some of my ideas and get them into a presentable form before I follow this too much further. I will post my thoughts when they are well thought out and presentable.

Thanks
Barrie

garysturn · #37

In newFamilySearch all submissions, temple ordinance dates, LDS membership records and personal opinions are combined into a folder for each individual. The name displayed in the pedigrees and summary view is determined by a sort of these items in the folder. The current sort order is (1) personal submission of the current user (2) LDS membership record (3) alphabetical sort. Many have discussed different ways this should be changed to offer a more complete name in the pedigrees because when there is no personal submission or LDS membership record the alphabetical sort often only displays a first name. My suggestion was to just change the sort to (1) personal submission of the current user (2) LDS membership record (3) the most submitted version of the name. I have also recommended listing how many entries are included in a version of a name when the program says multiple (Example: [42] MULTIPLE). I have listed some reasons why I feel this is still the best method of the proposals I have seen.

-Displays a more complete name in the pedigree and summary view. If the most correct submission is not the most submitted, over time as people enter personal opinions and work together to correct their submissions the most correct name will become the most submitted and will elevate to the display record.
-Most of the time the most submitted name will already be the most correct name, whereas most of the time the alphabetically sorted name is not the most correct.
-This will still display the user’s personal opinion in the pedigree and summary view ahead of the most submitted if it is different. People will be able to see in the details view if their submission is different from the most submitted version of a name.
-This will also display the LDS Membership Record ahead of the most submitted if it is different. If the LDS Membership Record is wrong there is already a procedure to get it corrected. Thus it encourages correcting errors in membership records.
-This method would list from top to bottom the most submitted version of a name at the top to the least submitted at the bottom (after the personal submission and LDS membership record if any).
-It is a type of voting because personal opinions are counted toward the most submitted.
-Will eliminate the need for more personal opinions. There is no need for more people to enter another opinion once the correct name is displayed, so the folders will not continue to get full as everyone enters a personal opinion in order to get the most correct name to display.
-Provides the best method for working with others. If people see that their opinion is not the most popular, they will check their own submission and correct it or try to convince others of their opinion.
-Puts incomplete submissions at the bottom of the list. Incomplete submissions are usually only sent in by one or two people.
-Does not require a complicated software source ranking system. Individuals evaluate the sources and enter an opinion and that counts toward the most submitted name.
-Once all submissions are claimed and corrected we will have an acceptable conclusion. It will be the combined conclusion of the family members and not of a computer program.
-When the life browser is added and source images are attached, if the new information changes people’s opinions, all they need to do is edit their own submission and it counts toward the most submitted.
-The same procedures will work for other events (dates and places). Eventually they will want to display standardized place names but this will help until that method is perfected.
-No programming changes are required other than changing the sort order and displaying the number of multiple names submitted, so programmers can concentrate on other functions and don’t have to spend a lot of time coming up with some type of ranking system. It will also help stop the fast-growing number of entries in the folders, which keeps growing because at present every user must submit another opinion to elevate the most correct name.
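
The proposed sort order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a folder is modeled as a list of (submitter, name) pairs, and the `display_order` function, its parameters, and the sample names are hypothetical rather than anything from newFamilySearch.

```python
from collections import Counter


def display_order(submissions, current_user=None, membership_name=None):
    """Order names: (1) the current user's own submission,
    (2) the LDS membership record, (3) most submitted version first."""
    counts = Counter(name for _, name in submissions)
    own = None
    if current_user is not None:
        own = next(
            (name for user, name in submissions if user == current_user), None
        )
    ordered = []
    for name in (own, membership_name):
        if name is not None and name not in ordered:
            ordered.append(name)
    # most_common() sorts by count, highest first; ties keep first-seen order.
    for name, _ in counts.most_common():
        if name not in ordered:
            ordered.append(name)
    return ordered


# Hypothetical folder contents.
folder = [
    ("user1", "John"),
    ("user2", "John Henry Smith"),
    ("user3", "John Henry Smith"),
    ("user4", "John H. Smith"),
]

print(display_order(folder))                        # no personal submission
print(display_order(folder, current_user="user4"))  # own submission shown first
```

With this sample folder, an alphabetical sort would put the bare first name "John" on top, while the count-based sort puts the most submitted full name there for anyone without a personal submission.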

Gary Turner
