
indexing with CAPTCHAs

Posted: Tue Mar 11, 2014 8:18 am
by mndrix
A member of my ward recently suggested a way to use LDS Account login screens to increase the amount of FamilySearch indexing that goes on. The idea is basically the same as reCAPTCHA. On each LDS Account login screen, below the username and password inputs, show a small image taken from an indexing document (a name, a city, a year, etc.). The user must transcribe the image before he may log in.

I suppose there are dozens of variations on this theme, but it seemed like a great idea to me.

Re: indexing with CAPTCHAs

Posted: Tue Mar 11, 2014 11:12 am
by jdlessley
Sounds like forced indexing on members.

reCAPTCHA used for security and reCAPTCHA used for digitizing books are separate programs. When reCAPTCHA is used for security, the application generates the distorted image much the same as CAPTCHA does. This is necessary to ensure there is a correct answer to compare a user's input against. reCAPTCHA used for digitizing does not have a 'correct' answer. Instead, several humans are needed to provide answers. When enough matching answers have been provided, a reasonably correct answer is derived from them. This is similar to indexing in a way. Without a correct answer it is impossible to use reCAPTCHA for digitizing in conjunction with security. Users can input any answer because there is no 'correct' answer.

The primary purpose in developing reCAPTCHA was to help digitize books. From the description of reCAPTCHA, several humans need to decipher and digitize the text that a computer cannot through OCR (optical character recognition). [As a side note: reCAPTCHA was originally developed to help digitize printed books. OCR for printed characters is less complex than OCR for handwriting.] A large number of answers are needed in order to reach a high confidence that the number of same answers has produced a correct response. No single answer is considered correct in and of itself. If merely providing an answer is needed, whether correct or not, there is no validity to using this for log-in security. It is just an added level to the sign-in process that slows it down and complicates it.

I probably use the LDS Account log-in a dozen or more times on some days, and probably more on other days. A good number of times I am in a hurry to get to a site, get information or input information, and get off. Adding a step that can take several minutes, if not longer, while I attempt to transcribe data that I may or may not be able to read would just add an unnecessary layer and frustration.

The usefulness of using reCAPTCHA to force indexing on members and still ensure valid or accurate indexing is questionable. Once a user determines there is no 'correct' answer and they are in a hurry or just do not feel like doing indexing at the moment, they will just input any answer to get signed in. That answer could be anything whether related to the needed answer or not. This adds useless data to the aggregation of data to be correlated. Eventually the number of answers needed to ensure a high number of correct answers would have to be increased. Currently indexing uses a considerably smaller number of transcriptions to determine a 'correct' answer than reCAPTCHA does for digitizing books.
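To make the point about junk answers concrete, here is a minimal sketch of agreement-based acceptance, assuming a simple majority-style rule. The function names and the `min_votes` and `min_share` thresholds are hypothetical; they are not taken from FamilySearch indexing or reCAPTCHA.

```python
from collections import Counter


def normalize(answer):
    """Lowercase and collapse whitespace so trivial variants count as the same answer."""
    return " ".join(answer.lower().split())


def consensus(answers, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if agreement is not yet strong enough.

    min_votes and min_share are hypothetical thresholds chosen for illustration.
    """
    cleaned = [normalize(a) for a in answers if a.strip()]
    if not cleaned:
        return None
    top, votes = Counter(cleaned).most_common(1)[0]
    # Accept only when enough people agree AND they form a large enough share.
    if votes >= min_votes and votes / len(cleaned) >= min_share:
        return top
    return None


careful = consensus(["Smith", "smith ", "SMITH", "asdf"])
diluted = consensus(["Smith", "Smith", "Smith", "x", "y", "z"])
```

With these assumed thresholds, three matching answers out of four are accepted, but the same three matching answers mixed with three throwaway entries are not, so more transcriptions would have to be collected before the name is accepted.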

Re: indexing with CAPTCHAs

Posted: Tue Mar 11, 2014 11:20 am
by russellhltn
There are currently threads on the forum about how to get more members to use the calendar. Adding an indexing step to the login (particularly when one is forced to re-login so frequently) would just discourage use of the tools.

Re: indexing with CAPTCHAs

Posted: Tue Mar 11, 2014 1:23 pm
by mndrix
jdlessley wrote: Sounds like forced indexing on members.
One of the "variations" is to let users opt in/out for this experience. Those kinds of UI details aren't really the crux of this idea, though.

The interesting insight in this member's suggestion (and reCAPTCHA and Duolingo and vineyard.lds.org, etc.) is that people are willing to perform work in tiny quantities if it's incidental to another goal they care about.

It's the same reason retail stores have a "donate your change to charity" jar near the cash register. Most shoppers don't care enough about charity to write and mail a check, but they'll gladly contribute 30 cents in passing if presented with the opportunity.
jdlessley wrote: Eventually the number of answers needed to ensure a high number of correct answers would have to be increased.
I agree. Devoted, die-hard indexers will always be more efficient. Collecting dimes from retail donation jars is less efficient than big donations, but it still adds up to millions of dollars in charitable contributions each year.

Re: indexing with CAPTCHAs

Posted: Wed Mar 12, 2014 9:43 pm
by sbradshaw
I like this idea but it's true there are some problems that would need to be overcome. What if FamilySearch made a browser plugin/extension that made you index a name every time you visit sites you specify, or something like that?

Re: indexing with CAPTCHAs

Posted: Thu Mar 13, 2014 3:00 am
by marianomarini
Technically speaking, the main problem would be to "extract" a small image from a document to be presented to the user!
Unless you are thinking of showing the entire document to be read and indexed.

Re: indexing with CAPTCHAs

Posted: Wed Oct 14, 2015 1:59 pm
by tfantina
Sorry to bring this up from the dead, but I really like the idea and have been thinking about it for several months. In fact, I joined this board to see if anyone else had the same thoughts. I know that on a lot of FamilySearch indexing documents there are highlighted areas where the information is expected to be. They are not always accurate, but they are usually a good template. I imagine CAPTCHA images could be pulled in much the same way.
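
As a rough illustration of pulling snippets from highlighted areas, here is a minimal sketch that cuts fields out of a page using a per-layout template of boxes. The page is modeled as a plain list of pixel rows, and the `TEMPLATE` names and coordinates are hypothetical; the actual FamilySearch highlight data is not described in this thread.

```python
def crop(page, box):
    """Cut a rectangular field out of a page image.

    page is a list of pixel rows; box is (left, top, right, bottom) with
    right and bottom exclusive. Coordinates are clamped to the page edges.
    """
    height = len(page)
    width = len(page[0]) if page else 0
    left, top, right, bottom = box
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return [row[left:right] for row in page[top:bottom]]


# Hypothetical template for one document layout: field name -> box.
TEMPLATE = {"surname": (1, 0, 4, 2), "year": (4, 2, 6, 3)}

# Stand-in page: 4 rows by 6 columns, each pixel encodes its row and column.
page = [[10 * r + c for c in range(6)] for r in range(4)]

snippets = {field: crop(page, box) for field, box in TEMPLATE.items()}
```

Because the highlighted areas are not always accurate, a real version would probably pad each box by some margin before cropping; that margin is left out here.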