Search Errors

Discussions about the Notes and Journal tool on LDS.org. This includes the Study Toolbar as well as the scriptures and other content on LDS.org that is integrated with Notes and Journal.
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

Read ! as BUT NOT

#21

Post by dmaynes »

dmaynes wrote:I'm wondering if the ! (NOT) operator is being parsed and handled as a binary logical operator.
If you read the NOT (!) operator as "AND NOT" or "BUT NOT" you will get the correct search.

For example, if you want to find all the scriptures containing LAMAN but not LEMUEL, you would enter <<laman ! lemuel>>. If you want to find all the scriptures containing NEPHI and LEHI but not JERUSALEM, you would enter <<nephi & lehi ! jerusalem>>.

It makes no sense to say "but not JERUSALEM and NEPHI and LEHI." Therefore, this syntax is invalid.

Because the ! operator is read "BUT NOT," it makes no sense to combine "AND BUT NOT". It is already a conjunctive operator. Being a conjunctive operator, it is commutative with AND (as long as it is not first in the logical phrase). It is not commutative with OR. If you need to use an OR operator, you should use parentheses.

So, <<pillar & cloud & ! fire>> is invalid because you cannot read it as "pillar AND cloud AND BUT NOT fire."

Search priority or precedence appears to follow a strict left-to-right ordering. If you want precedence on the right, you need to use parentheses. The search <<pillar ! fire | cloud>> is read "pillar BUT NOT fire OR cloud." It is performed strictly left-to-right: (1) the verses containing pillar are found, (2) those containing fire are removed, and (3) any additional verses containing cloud are added. It cannot be reordered with parentheses. The search <<pillar | cloud ! fire>> has two possible orderings: <<(pillar | cloud) ! fire>> (this is the default ordering) and <<pillar | (cloud ! fire)>>. This second ordering is the same as <<cloud ! fire | pillar>>, and is read "cloud BUT NOT fire OR pillar."

A mathematician would view these operators as set operations. While there exist equivalences between set and logical operators, the exact operation is important. In terms of sets the AND (&) operation is equivalent to the intersection operator, the OR (|) operation is equivalent to the union operator, and the BUT NOT (!) operation is equivalent to the set subtraction operator.
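
A minimal sketch of this set-based reading, assuming a tiny made-up index (the verse IDs in INDEX are hypothetical, and the search() function is an illustration, not the site's actual parser). It evaluates a query strictly left to right, treating & as intersection, | as union, and ! as set subtraction. Parentheses are not parsed, and only explicit operators are handled.

```python
# Hypothetical verse IDs; real results would come from the scripture search.
INDEX = {
    "pillar": {1, 2, 3, 4},
    "cloud": {2, 3, 5},
    "fire": {3, 4, 6},
}

OPS = {
    "&": lambda a, b: a & b,  # AND     -> intersection
    "|": lambda a, b: a | b,  # OR      -> union
    "!": lambda a, b: a - b,  # BUT NOT -> set subtraction
}


def search(query):
    """Evaluate 'word op word op word ...' strictly left to right."""
    tokens = query.split()
    # Every operator is binary, so words and operators must alternate.
    if len(tokens) % 2 == 0 or any(t in OPS for t in tokens[0::2]):
        raise ValueError("operators must sit between words: " + query)
    result = set(INDEX.get(tokens[0], set()))
    for op, word in zip(tokens[1::2], tokens[2::2]):
        if op not in OPS:
            raise ValueError("unknown operator: " + op)
        result = OPS[op](result, INDEX.get(word, set()))
    return result


left_to_right = search("pillar | cloud ! fire")    # (pillar | cloud) ! fire
right_grouped = INDEX["pillar"] | search("cloud ! fire")
reordered = search("cloud ! fire | pillar")
```

With this toy data, right_grouped and reordered are the same set, while left_to_right differs from both. A query such as <<pillar & cloud & ! fire>> raises ValueError because two operators are adjacent.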

Thanks,
Dennis
WelchTC
Senior Member
Posts: 2085
Joined: Wed Sep 06, 2006 8:51 am
Location: Kaysville, UT, USA

#22

Post by WelchTC »

Alan_Brown wrote:With apologies to Tom, I must disagree.
No need to apologize as I may have misread or misunderstood the question. I'll forward this thread on to the developers, however.

Tom
mkmurray
Senior Member
Posts: 3266
Joined: Tue Jan 23, 2007 9:56 pm
Location: Utah

#23

Post by mkmurray »

dmaynes wrote:If you read the NOT (!) operator as "AND NOT" or "BUT NOT" you will get the correct search.

So, to put this explanation more briefly in computer science terms: the NOT operator (!) is implemented as a binary logical operator and not as a unary logical operator.

Of course, this is what the help documentation is trying to explain to non-technical folks when it says you must put the operator between words.
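
To make the binary/unary distinction concrete, here is a small sketch with made-up verse IDs (ALL_VERSES, laman, and lemuel are hypothetical values, not real search results). A unary NOT has to be taken against the whole collection of verses, whereas the binary "BUT NOT" only needs its two operands.

```python
# Hypothetical verse IDs, for illustration only.
ALL_VERSES = set(range(1, 11))  # a unary NOT needs the full universe
laman = {1, 2, 3, 4}
lemuel = {3, 4, 5}


def unary_not(verses):
    """Unary NOT: everything in the collection except these verses."""
    return ALL_VERSES - verses


def but_not(left, right):
    """Binary BUT NOT: verses in left that are not in right."""
    return left - right


# <<laman ! lemuel>> gives the same set as "laman AND (NOT lemuel)".
binary_result = but_not(laman, lemuel)
unary_result = laman & unary_not(lemuel)
```

Both expressions select the same verses, which is why reading ! as "AND NOT" works even though the operator itself takes two operands.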
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

Search which works and doesn't work, inconsistently

#24

Post by dmaynes »

tomw wrote:This forum will do fine. Thanks for pointing it out. I'll forward over to the team responsible for the online scriptures.
Here's the search URL: http://scriptures.lds.org/en/search?typ ... ecked&bw=1

This search for <<mary virgin>> should return two scriptures (Luke 1:27 and Alma 7:10). It doesn't always work. I printed the search results page as a PDF file and I am attaching it.

I have gotten the search to fail by toggling the "Sort by relevance" box.

Here's a test session:
1- Start at http://scriptures.lds.org
2- Enter <<virgin mary>> in the search box and press Enter
3- Failed
4- Toggle "sort by relevance" Failed
5- Toggle "sort by relevance" Succeeded
6- Toggle "sort by relevance" Succeeded
7- Toggle "sort by relevance" Failed
8- Toggle "sort by relevance" 4 times, Succeeded each time
9- Toggle "sort by relevance" Failed

I don't know why this is failing.

Could this be related to my browser? Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5

Thanks,
Dennis
Attachments
Search_for_mary_virgin.pdf
(34.77 KiB) Downloaded 179 times
aebrown
Community Administrator
Posts: 15153
Joined: Tue Nov 27, 2007 8:48 pm
Location: Draper, Utah

#25

Post by aebrown »

dmaynes wrote:This search for <<mary virgin>> should return two scriptures (Luke 1:27 and Alma 7:10). It doesn't always work. I printed the search results page as a PDF file and I am attaching it.

Could this be related to my browser? Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5

I can confirm this erratic behavior for both Firefox 3 and IE7.
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

#27

Post by dmaynes »

Alan_Brown wrote:I can confirm this erratic behavior for both Firefox 3 and IE7.
Thanks - I wasn't sure if it would be reproduced. The error is specific to the word "mary." It does not reproduce with other words that I tried.

Failure counts (20 toggles of "sort by relevance")

<<mary>> 5
<<mary virgin>> 15
<<virgin mary>> 2
<<mary born>> 5
<<born mary>> 7
<<virgin>> 0
<<born>> 0
<<virgin born>> 0
<<born joseph>> 0
<<joseph mary>> 0 (upon repeat, I did see failures)
<<mary jesus>> 2
<<jesus mary>> 13

1. The problem is associated with the word "mary"
2. "Sort by relevance" does not matter
3. Because of the variability and the runs of failures, the trials are not independent (i.e., the service is not stateless; some service state appears to cause this, probably a caching mechanism). I did a first-order Markov analysis on 90 trials, and the hypothesis of independence was rejected with a chi-square value of 18 on 1 d.f. (probability < .0005).
4. All of the above tests were performed with "search all word forms" checked.
5. The problem reproduces with "search all word forms" not checked.
6. I believe that the time between searches interacts with the runs of failures and successes. The longer the interval between searches, the more independent the failures and successes appear to be. This makes sense if the failures are stored in the service cache.
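
The first-order Markov check in item 3 can be sketched as follows. This is an illustration under stated assumptions, not the original analysis: the outcome sequence below is made up (the 90 recorded trials are not listed in this thread), and the function names are hypothetical. It tabulates (previous, current) outcome pairs into a 2x2 table and computes the Pearson chi-square, using the fact that for 1 d.f. the upper-tail probability is erfc(sqrt(x/2)).

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))


def markov_independence(outcomes):
    """Pearson chi-square on the 2x2 table of (previous, current) outcomes.

    Assumes both outcomes occur as a 'previous' and as a 'current' value;
    otherwise an expected count is zero and the division fails.
    """
    counts = [[0, 0], [0, 0]]
    for prev, cur in zip(outcomes, outcomes[1:]):
        counts[int(prev)][int(cur)] += 1
    n = sum(map(sum, counts))
    rows = [sum(r) for r in counts]
    cols = [counts[0][j] + counts[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (counts[i][j] - expected) ** 2 / expected
    return chi2, chi2_sf_1df(chi2)


# Made-up session with runs of failures (F) and successes (S).
session = [c == "F" for c in "FFFFFSSSSSSFFFFSSSSS"]
chi2, p_value = markov_independence(session)

# Tail probability for the chi-square value reported above (18 on 1 d.f.).
p_reported = chi2_sf_1df(18.0)
```

A small p_value means consecutive searches tend to repeat the previous outcome more often than independent trials would. The value p_reported is on the order of 2e-5, which is consistent with the "probability < .0005" quoted in item 3.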

Thanks,
Dennis
WelchTC
Senior Member
Posts: 2085
Joined: Wed Sep 06, 2006 8:51 am
Location: Kaysville, UT, USA

#28

Post by WelchTC »

Nice work! I've alerted the developers.

Tom
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

More strange words

#29

Post by dmaynes »

tomw wrote:Nice work! I've alerted the developers.
It didn't seem reasonable that "mary" was the only word where the search would have trouble.

I found several additional words. Please note that they do not all appear to fail with the same frequency. I had to click "Search" for merit many times before the failure reproduced. Only the word "merit" behaves like "mary," either failing outright (0 results) or succeeding (8 results); the other words return varying result counts.

I compiled this list by looking at words in the Topical Guide under "M" and searching them. I eventually started looking at only "short" words because those appeared to be the only ones that failed.

magic - returns 1 or 2 results (the "magics" hit, Morm. 1:19, is not always displayed).
"Search all word forms" is checked.
Toggle "Sort by relevance" twenty times.

maid - returns 24 or 31 results.
"Search all word forms" is checked.
"Sort by relevance" is not checked.
Click "search" twenty times.

man - returns 1071 or 1070 results.
"Search all word forms" is checked.
"Sort by relevance" is not checked.
Click "search" twenty times.

meat - returns 158 or 165 results.
"Search all word forms" is checked.
"Sort by relevance" is not checked.
Click "search" twenty times.

melt - returns 29 or 31 results.
"Search all word forms" is checked.
"Sort by relevance" is not checked.
Click "search" twenty times.

merit - returns 0 or 8 results.
"Search all word forms" is checked.
"Sort by relevance" is not checked.
Click "search" twenty times.

Maybe the examples where the number of results changes will help identify the problem, if it is the same problem.

Thanks,
Dennis
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

Dual mode scripture search - stemming inconsistency

#30

Post by dmaynes »

tomw wrote:This forum will do fine. Thanks for pointing it out. I'll forward over to the team responsible for the online scriptures.

Tom
Interesting inconsistencies exist because there are two different search engines for searching the scriptures:

1-- Search from the Scriptures home page (http://scriptures.lds.org)
2-- Search from the main website page (http://www.lds.org), with a "filter" on scriptures

I haven't explored this thoroughly, but one such inconsistency is in the way stemming works.

A search for <<treasure heart>> from the scripture search engine returns 10 references.

A search for <<treasure heart>> from the lds.org search engine, filtered on scriptures, returns 11 references. (You have to filter by the New Testament, Book of Mormon, and Doctrine and Covenants. Filtering by All Scriptures also returns Topical Guide hits.)

The extra reference from the lds.org search engine is Romans 2:5 -- "But after thy hardness and impenitent heart treasurest up unto thyself wrath against the day of wrath and revelation of the righteous judgment of God;"

The word "TREASUREST" is matched by the lds.org search engine, but it is not found using the scripture search engine.

If you search for <<TREASUREST>> (no quotes) using the scripture search engine, Romans 2:5 is the only reference returned. This is because the word "treasurest" is not recognized as having the same word form as "treasure."
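
One way such a difference could arise is sketched below with a toy suffix-stripping stemmer. This is purely an assumption for illustration: how either search engine actually stems words is not known from this thread, and the suffix lists and the stem() function are hypothetical. The point is only that a rule set without archaic endings such as "-est" and "-eth" would leave "treasurest" unrelated to "treasure."

```python
# Toy suffix-stripping stemmer; NOT the algorithm either search engine uses.
MODERN_SUFFIXES = ("ing", "ed", "es", "s")
ARCHAIC_SUFFIXES = ("est", "eth") + MODERN_SUFFIXES


def stem(word, suffixes):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so "treasure" and "treasur" share one stem.
    return word[:-1] if word.endswith("e") and len(word) > 3 else word


# Without archaic suffixes, the two words keep different stems.
modern_match = stem("treasurest", MODERN_SUFFIXES) == stem("treasure", MODERN_SUFFIXES)

# With "-est" in the rule set, both reduce to the same stem.
archaic_match = stem("treasurest", ARCHAIC_SUFFIXES) == stem("treasure", ARCHAIC_SUFFIXES)
```

In this sketch modern_match is False and archaic_match is True, mirroring a 10 versus 11 reference difference where only one engine links "treasurest" to "treasure."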

Is the development team interested in knowing about these stemming inconsistencies? Is there an appropriate way to report them?

Thanks,
Dennis