Search Errors

aebrown
Community Administrator
Posts: 15153
Joined: Tue Nov 27, 2007 8:48 pm
Location: Draper, Utah

#11

Post by aebrown »

dmaynes wrote:Is this working the way that it is intended to work? The scriptures that are returned will have the words "be" and "one" but sometimes there are intervening words.
tomw wrote:Yes, that is how it is designed.
With apologies to Tom, I must disagree. If the search were designed this way, it would be a horrible design (unlike any other phrase search I've ever seen) and completely inconsistent with the documentation regarding phrase searches. I'm quite confident that the scripture search is not designed this way. In most cases I have tried, a search for a quoted phrase will only find the phrase.

I think some incorrect conclusions were drawn because of the results of the "be one" search. I would speculate that "be one" allows for intervening words because "be" is a very common word and is excluded from the normal search if you search for it by itself. So I could see how spurious results might be returned based on this unindexed word (not that it should do this, but it might be a less tested code path, or a compromise based on index size or performance). But if you search for "chain neck" you will indeed find no matches, because that phrase does not exist in the scriptures. That is the correct behavior; it is the "be one" search that has questionable results.

I also think there is some mishandling of AND with NOT. Continuing with the chain/neck example, if we search with "search all word forms" turned off, we have:
chain = 13
chain neck = 4
chain !neck = 9

This is exactly what one would expect. But with AND, we have:
chain = 13
chain & neck = 4
chain & !neck = 0

This last result is unexpected and seems to be a bug.
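
To make the expectation concrete, here is a minimal Python sketch of how the three counts should relate if "!" is plain set exclusion. The verse identifiers are made-up placeholders, not actual search results; only the counts (13, 4, 9) come from the searches above.

```python
# Hypothetical verse IDs; only the counts mirror the results reported above.
chain = {f"verse{i}" for i in range(13)}                       # 13 verses contain "chain"
neck = {f"verse{i}" for i in range(4)} | {"other1", "other2"}  # 4 of them also contain "neck"

both = chain & neck             # chain & neck
chain_not_neck = chain - neck   # chain & !neck

# Every "chain" verse either contains "neck" or does not, so the two
# counts must add up to the total for "chain".
print(len(chain), len(both), len(chain_not_neck))
```

Under that reading, "chain & !neck" can only return 0 if every "chain" verse also contained "neck", which the second count rules out.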
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

Zero or one intervening words on quoted searches

#12

Post by dmaynes »

Alan_Brown wrote:With apologies to Tom, I must disagree. If the search were designed this way, it would be a horrible design (unlike any other phrase search I've ever seen) and completely inconsistent with the documentation regarding phrase searches. I'm quite confident that the scripture search is not designed this way. In most cases I have tried, a search for a quoted phrase will only find the phrase.

I think some incorrect conclusions were drawn because of the results of the "be one" search. I would speculate that "be one" allows for intervening words because "be" is a very common word and is excluded from the normal search if you search for it by itself. So I could see how spurious results might be returned based on this unindexed word (not that it should do this, but it might be a less tested code path, or a compromise based on index size or performance).
I agree that the apparent forgiveness of intervening words in the design can be confusing. It appears that in actuality the design is either 0 (zero) or 1 (one) intervening words. As an example, consider "tempt the Lord thy God."

"tempt God" returns 5 results. Three of those have the exact phrase "tempt God" and the other two have one intervening word. But "tempt the Lord thy God" is not returned by this search, because it has three intervening words.

"tempt Lord God" returns 4 results. All of these are variants of "tempt the Lord thy God".

The engine doesn't seem to care what the intervening word is. Returning to the "be one" example, in 2 Ne. 9:12 the intervening word is "restored."
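
The rule I am inferring can be sketched in Python as follows. This is only my guess at a matching rule that fits the observed results, not the search engine's actual code; the function name phrase_match and the max_gap parameter are made up for illustration.

```python
def phrase_match(text, phrase, max_gap=1):
    """Return True if the phrase words occur in order in text, with at most
    max_gap other words between each consecutive pair of phrase words."""
    words = [w.strip('.,;:!?"').lower() for w in text.split()]
    terms = phrase.lower().split()
    if not terms:
        return False

    def match_from(pos, term_index):
        # pos is the index in words where the previous term matched.
        if term_index == len(terms):
            return True
        # The next term may sit 0..max_gap words further along.
        for nxt in range(pos + 1, min(pos + 2 + max_gap, len(words))):
            if words[nxt] == terms[term_index] and match_from(nxt, term_index + 1):
                return True
        return False

    return any(words[i] == terms[0] and match_from(i, 1) for i in range(len(words)))


verse = "Thou shalt not tempt the Lord thy God."
print(phrase_match(verse, "tempt God"))          # three intervening words
print(phrase_match(verse, "tempt Lord God"))     # one intervening word per pair
print(phrase_match("be restored one", "be one")) # fragment with one intervening word
```

With max_gap=1 the first call does not match and the other two do, which is the same pattern as the search results above. Setting max_gap=0 would give the strict phrase search that the documentation describes.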
Alan_Brown wrote:This is exactly what one would expect. But with AND, we have:
chain = 13
chain & neck = 4
chain & !neck = 0

This last result is unexpected and seems to be a bug.
It looks like a bug to me also.
aebrown
Community Administrator
Posts: 15153
Joined: Tue Nov 27, 2007 8:48 pm
Location: Draper, Utah

#13

Post by aebrown »

dmaynes wrote:I agree that the apparent forgiveness of intervening words in the design can be confusing. It appears that in actuality the design is either 0 (zero) or 1 (one) intervening words. As an example, consider "tempt the Lord thy God."
The original statement was that "intervening words" would still allow a match, and that would not be acceptable in my opinion.

I guess it's a bit more reasonable to have a single intervening word, but this still seems odd, particularly if the intervening word is a significant word. I can see where it is reasonable that "forth fruit" would match "bring forth her fruit", since "her" is insignificant. But I have a hard time with a design that allows "forth fruit" to match "forth good fruit" "forth tame fruit" and "forth evil fruit". It allows for too many false positives, which may require you to sift through a lot of incorrect matches to find what you are looking for.
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

Are HONOR and HONOUR considered the same word form?

#14

Post by dmaynes »

When I search for <<honor father mother>> with "search all word forms" checked, two references are returned: 1 Ne. 17:55 and Mosiah 13:20. The Old Testament and New Testament references are not returned.

They are not returned because the Old and New Testaments use the King's English: the word "honor" is spelled "honour" in the King James Version of the Bible.

What does it take to bind these two words together?

I believe that spelling variations of the same word are definitely the same word form. There are a lot of these variations in the scriptures, and it would be helpful if you did not need to know whether British or American spellings were being used.

As another example, consider "plow" and "plough." I wanted to find Luke 9:62 (http://scriptures.lds.org/en/luke/9/62#62): "And Jesus said unto him, No man, having put his hand to the plough, and looking back, is fit for the kingdom of God." I couldn't remember the reference and I couldn't spell plough.
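
One way to bind such spellings together would be to normalize both the query and the indexed text to a single canonical spelling before comparing. The Python sketch below is only an illustration of that idea; the mapping table and the function names normalize and matches_all are hypothetical, and I have no idea how the real search engine stores word forms.

```python
# Illustrative mapping only; a real list of spelling variants would be much longer.
SPELLING_VARIANTS = {
    "honour": "honor",
    "plough": "plow",
}

def normalize(word):
    """Map a word to a canonical spelling so variant spellings compare equal."""
    w = word.lower()
    return SPELLING_VARIANTS.get(w, w)

def matches_all(query, verse):
    """True if every query word appears in the verse after normalization."""
    verse_words = {normalize(w.strip('.,;:!?"')) for w in verse.split()}
    return all(normalize(q) in verse_words for q in query.split())

print(matches_all("honor father mother", "Honour thy father and thy mother"))
print(matches_all("plow", "having put his hand to the plough, and looking back"))
```

Both calls succeed once the variants are folded together, whichever spelling the user types.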

Thanks,
Dennis
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

NOT does not appear to work properly

#15

Post by dmaynes »

Alan_Brown wrote:I also think there is some mishandling of AND with NOT. Continuing with the chain/neck example, if we search with "search all word forms" turned off, we have:
chain = 13
chain neck = 4
chain !neck = 9

This is exactly what one would expect. But with AND, we have:
chain = 13
chain & neck = 4
chain & !neck = 0

This last result is unexpected and seems to be a bug.
There is some sort of error with the NOT function, and it does not necessarily require the AND (&) to be explicitly listed. As an example, with "search all word forms" turned on (I don't think it matters for this example), search:

horse rider = 9 results, 10 references
horse rider heels = 1 result, 1 reference
horse rider !heels = 4 results, 5 references

This last result is also a bug, because 9 references should be returned.
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

#16

Post by dmaynes »

dmaynes wrote:There is some sort of error with the NOT function and it does not necessarily require the AND (&) to be explicitly listed.
Order of the search terms also matters with the NOT function.

myrrh returns 13 results (with 2 multi-verse results)
frankincense returns 12 results (with 3 multi-verse results)
myrrh frankincense returns 3 results (with 1 multi-verse result)
frankincense !myrrh returns 9 results (with 2 multi-verse results)
(!myrrh) frankincense returns 0 results --- This is an ERROR
frankincense myrrh returns 3 results (with 1 multi-verse result)
!frankincense myrrh returns 0 results -- This is an ERROR
myrrh !frankincense returns 10 results (with 1 multi-verse result)

Except for the errors when the NOT function is at the beginning, these results seem reasonable.
aebrown
Community Administrator
Posts: 15153
Joined: Tue Nov 27, 2007 8:48 pm
Location: Draper, Utah

#17

Post by aebrown »

dmaynes wrote:There is some sort of error with the NOT function and it does not necessarily require the AND (&) to be explicitly listed. As an example with "search all word forms" turned on (I don't think it matters for this example), search:

horse rider = 9 results, 10 references
horse rider heels = 1 result, 1 reference
horse rider !heels = 4 results, 5 references

This last result is also a bug, because 9 references should be returned.

Actually, this is working properly. Your assumption that "search all word forms" wouldn't matter is incorrect. If you do this with "search all word forms" turned off, you get the obviously correct results:

horse rider = 5
horse rider heels = 1
horse rider !heels = 4

The help page for searching tells you that if you do any kind of advanced search, the "search all word forms" setting is ignored. If you want to include word forms, you need to follow each search term with a tilde. Thus:

horse~ rider~ = 9
horse~ rider~ heels = 1
horse~ rider~ !heels = 8

These results all match the documentation and are internally consistent.
aebrown
Community Administrator
Posts: 15153
Joined: Tue Nov 27, 2007 8:48 pm
Location: Draper, Utah

#18

Post by aebrown »

dmaynes wrote:Order of the search terms also matters with the NOT function.

myrrh returns 13 results (with 2 multi-verse results)
frankincense returns 12 results (with 3 multi-verse results)
myrrh frankincense returns 3 results (with 1 multi-verse result)
frankincense !myrrh returns 9 results (with 2 multi-verse results)
(!myrrh) frankincense returns 0 results --- This is an ERROR
frankincense myrrh returns 3 results (with 1 multi-verse result)
!frankincense myrrh returns 0 results -- This is an ERROR
myrrh !frankincense returns 10 results (with 1 multi-verse result)

Except for the errors when the NOT function is at the beginning, these results seem reasonable.

I'm not so sure this is a problem, either. The documentation says:
Use the symbols “ & ” (and), “ | ” (or), or “ ! ” (not) between words for Boolean searches (the words themselves cannot be used to designate Boolean searching).

Note that it says that these special symbols are to be used between words. By putting the ! (NOT) operator at the beginning of the search string you are breaking that rule. In my testing, the order doesn't matter at all, as long as you always put the NOT operator between words.
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

#19

Post by dmaynes »

Alan_Brown wrote:Actually, this is working properly. Your assumption that "search all word forms" wouldn't matter is incorrect. ... The help page for searching tells you that if you want to do any kind of advanced search, that the "search all word forms" setting is ignored.
Ah, now I see. It is confusing that the default "search all word forms" mechanism is used when multiple words are provided, as long as none of the special search symbols are used. But when any of the special search symbols are present, the "search all word forms" setting is no longer applied.

I'm not sure how to eliminate the confusion. The current search results indicate that "all word forms" or "all occurrences" were returned. This is a very subtle distinction and not likely to be noticed.

I appreciate your time. I assume that if I have had trouble like this using the scripture search, others will have or have had trouble also. I suspect that several of my posts have been confused because of this issue. I would suggest that if a user enters a query with multiple words, the search engine append the "~" special character to all the words if the "search all word forms" box is checked. In other words, you were able to get the expected behavior by explicitly using the special tilde character.
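
A rough Python sketch of that suggestion follows. The function name apply_word_forms is made up, and this is just the query rewriting step as I picture it, not anything taken from the site. It does not treat quoted phrases specially, which a real version would need to decide about.

```python
import re

def apply_word_forms(query, all_word_forms):
    """Append "~" to every bare word when the checkbox is on.

    Words already followed by "~" are left alone, and the Boolean
    symbols (&, |, !) and parentheses are passed through untouched.
    """
    if not all_word_forms:
        return query
    return re.sub(r"\b([A-Za-z]+)\b(?!~)", r"\1~", query)

print(apply_word_forms("horse rider !heels", True))
print(apply_word_forms("horse~ rider", True))
print(apply_word_forms("horse rider !heels", False))
```

With the box checked, the first query becomes "horse~ rider~ !heels~", so the user gets word forms without having to know about the tilde.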
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

#20

Post by dmaynes »

Alan_Brown wrote:Note that it says that these special symbols are to be used between words. By putting the ! (NOT) operator at the beginning of the search string you are breaking that rule. In my testing, the order doesn't matter at all, as long as you always put the NOT operator between words.
I'm wondering if the ! (NOT) operator is being parsed and handled as a binary logical operator. This would explain why forms like <<rider & !horse>> or <<!horse & rider>> fail. The only way that the ! (NOT) operator appears to work is if it is placed between the words, exactly as stated in the documentation. Any other placement appears to result in a search that fails. For example, "(Nephi | Lehi) & Jerusalem" succeeds, but "(Nephi | Lehi) & !Jerusalem" fails.
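
Here is a small Python sketch of what I mean. It is purely a guess that reproduces the observed behavior, not the real parser: the function search, the toy index, and the verse numbers are all hypothetical, and parentheses are not handled. Every operator, including "!", needs a word on its left and a word on its right, and anything else yields no results.

```python
def search(query, index):
    """Evaluate a flat query left to right, treating "!" as a binary
    "and not" operator. Malformed input returns an empty set."""
    for op in "&|!":
        query = query.replace(op, f" {op} ")
    result = None    # running result set; None until the first word is seen
    pending = None   # operator waiting for its right-hand word
    for tok in query.split():
        if tok in ("&", "|", "!"):
            if result is None or pending is not None:
                return set()   # no left operand, or two operators in a row
            pending = tok
            continue
        hits = index.get(tok.lower(), set())
        if result is None:
            result = hits
        elif pending == "|":
            result = result | hits
        elif pending == "!":
            result = result - hits
        else:                  # "&" or plain adjacency both mean AND
            result = result & hits
        pending = None
    if result is None or pending is not None:
        return set()
    return result

# Toy index with made-up verse numbers.
index = {"frankincense": {1, 2, 3, 4}, "myrrh": {3, 4, 5}}
print(search("frankincense !myrrh", index))    # NOT between words
print(search("!myrrh frankincense", index))    # NOT at the start
print(search("frankincense & !myrrh", index))  # AND followed by NOT
```

In this sketch only the first query returns anything; the other two come back empty, just like the failing searches. If "!" were handled as a unary prefix operator instead, all three would return the same verses.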