Search Errors

Discussions about the Notes and Journal tool on LDS.org. This includes the Study Toolbar as well as the scriptures and other content on LDS.org that is integrated with Notes and Journal.
dmaynes
Member
Posts: 233
Joined: Sat Nov 01, 2008 10:50 am
Location: Pleasant Grove, Utah

Single word problem no longer reproduces

#31

Post by dmaynes »

dmaynes wrote:It didn't seem reasonable that "mary" was the only word where the search would have trouble.

I found several additional words. Please note that they do not all appear to fail with the same frequency. I had to click "Search" for merit a lot of times before the failure reproduced. Only the word "merit" fails (with 0 results) or succeeds (with 8 results) like "mary."

This problem now appears to be resolved. It is not reproducing for me. I had the problem with the word "hen" about a week ago, so the fix seems to be relatively recent. Good job!

Thanks,
Dennis

what desired thou father -- words ignored, stemming problem

#32

Post by dmaynes »

tomw wrote:This forum will do fine. Thanks for pointing it out. I'll forward over to the team responsible for the online scriptures.

Tom

I admit to being somewhat contrary at times. I wanted to see if the exact phrase search would span references. It does not. However, I did find two problems.

If you search for the words <<what desired thou father>> with "Search all word forms" turned off, 4 references are returned.

Here's the URL http://scriptures.lds.org/en/search?type=words&last=what+desired+thou+father&help=&wo=checked&search=what+desired+thou+father&iw=scriptures&tx=checked&hw=checked&sw=checked&bw=1

The problem is that the words <<thou>> and <<what>> appear to be ignored. The search should return no references. Is it desirable to ignore some words in the search? If words are ignored, it seems like the proper feedback to the user is that the words <<thou>> and <<what>> are frequent words and were ignored in the search. I could make an argument for words like <<and>>, <<or>>, and <<the>>, but I don't think these two words should be ignored.

By the way, the search from http://www.lds.org (restricted to The Old Testament) returns Daniel 2:23 with this search. The other references are not returned. The three words <<desired>>, <<fathers>>, and <<thou>> are highlighted. The word <<what>> does not appear in the reference. It is not highlighted. The http://www.lds.org search engine also appears to be ignoring this word.

The next problem appears to be related to the stemming ("Search all word forms") function. If you search for the words <<desirest father>> with "Search all word forms" turned on, no references are returned.

http://scriptures.lds.org/en/search?search=desirest+father&do=Search

However, Daniel 2:23 should be returned because <<desirest>> is a word form of <<desire>>. The scripture search function from http://www.lds.org finds Daniel 2:23 because it places <<desired>> and <<desirest>> in the same word form group.
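To illustrate what a "word form group" means here, this is a minimal sketch of a suffix-stripping stemmer that puts <<desire>>, <<desired>>, and <<desirest>> under one key. The suffix list and the `stem` function are my own assumptions for illustration. I do not know how either search engine actually builds its groups.

```python
# Hypothetical suffix list, checked in order; includes the archaic "-est" and "-eth".
SUFFIXES = ("est", "eth", "ing", "ed", "es", "s")

def stem(word):
    """Reduce a word to a crude stem so related word forms share one key."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters of stem so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so "desire" matches the stem of "desired".
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word

def same_group(a, b):
    """True when two words fall into the same word form group."""
    return stem(a) == stem(b)

print(same_group("desirest", "desired"))  # True
print(same_group("fathers", "father"))    # True
```

A rule this crude also merges unrelated words (it reduces "forest" to "for"), so a real engine would more likely rely on a curated table of word forms.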

Thanks,
Dennis

What did our father do when we were upon it?

#33

Post by dmaynes »

dmaynes wrote:I admit to being somewhat contrary at times.

Here's more contrariness. If you search for the words <<what did our father do when we were upon it>>, 759 hits are returned.

http://scriptures.lds.org/en/search?type=words&last=desired+father+things+it&help=&wo=checked&search=what+did+our+father+do+when+we+were+upon+it&do=Search&iw=scriptures&tx=checked&af=checked&hw=checked&sw=checked&bw=1

The problem is that a lot of little words are being ignored in the searches. It seems problematic that some of these words are ignored. Here's a partial list that I was able to discover. I'm sure there are more words.

who
thou
what
when
why
did
but
our
we
were
are
be
for
upon
on
is
was
it

If you include any of these words in your search, they are currently ignored.

Thanks,
Dennis

Mishandling of stop words

#34

Post by dmaynes »

dmaynes wrote:The problem is that a lot of little words are being ignored in the searches. It seems problematic that some of these words are ignored. Here's a partial list that I was able to discover. I'm sure there are more words.

The search I have been using is the three words <<Jesus wept STOP-WORD>>. I am assuming the mishandling of these little words is due to their presence on the stop-word list of the search engine. I don't have a great recommendation on how to handle these stop words, but I find it strange that the search for the words <<Jesus wept when you were there>> would return John 11:35 "Jesus wept." Here is a more complete list of words where this behavior is demonstrated:

a
about
all
also
an
and
any
are
as
at
be
but
by
can
did
for
from
get
has
have
he
her
hers
him
his
how
if
in
is
it
its
may
me
my
not
of
on
or
our
out
shall
she
so
that
the
thee
them
their
there
they
this
thou
thy
upon
us
was
we
were
what
when
who
why
with
ye
you
your

It seems that the specific algorithm for dealing with the stop words needs to be carefully evaluated. There seem to be three ways to deal with this problem: (1) Post-process all returned lists after the stop words are eliminated, (2) Treat stop words like any other word, or (3) Implement an alternate indexing probe function for the stop word list.
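Option (1) can be sketched roughly as follows. This is only an illustration under my own assumptions: the `STOP_WORDS` subset, the `VERSES` dictionary, and the `search` function are made up, and the "Sample 1:1" entry is placeholder text, not scripture.

```python
import re

# Hypothetical subset of the stop-word list reported above.
STOP_WORDS = {"when", "you", "were", "there", "what", "thou"}

# Toy corpus standing in for the indexed text.
VERSES = {
    "John 11:35": "Jesus wept.",
    "Sample 1:1": "Jesus wept when you were there.",  # placeholder, not scripture
}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

def search(query):
    """Look up candidates by significant words, then post-filter on every typed word."""
    terms = tokenize(query)
    significant = [t for t in terms if t not in STOP_WORDS]
    # Step 1: candidate lookup using only the words the index is assumed to hold.
    candidates = [ref for ref, text in VERSES.items()
                  if all(t in tokenize(text) for t in significant)]
    # Step 2: post-process, requiring every word in the query, stop words included.
    return [ref for ref in candidates
            if all(t in tokenize(VERSES[ref]) for t in terms)]

print(search("Jesus wept"))                      # ['John 11:35', 'Sample 1:1']
print(search("Jesus wept when you were there"))  # ['Sample 1:1']
```

With the post-filter in place, John 11:35 no longer comes back for <<Jesus wept when you were there>>, at the cost of scanning the text of each candidate.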

I do not believe that silently ignoring stop words in the query, without any feedback to the user, is good behavior. It returns a lot of false positives and can confuse the user.

Another possibility is to extend the current procedure. If only stop words are provided to the search (such as <<out there>>), the returned page states:
The words OUT and THERE are not significant words and were not used in the search.
If words are going to be ignored, then the user should be told which words are being ignored.
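Here is a minimal sketch of that feedback, assuming the engine can tell which typed words are on its stop-word list. The `STOP_WORDS` subset and the function names are hypothetical. The wording of the notice is modeled on the message the site already shows for stop-word-only searches.

```python
import re

# Hypothetical subset of the stop-word list reported above.
STOP_WORDS = {"out", "there", "when", "you", "were", "what", "thou"}

def split_query(query):
    """Return (kept, ignored) terms in the order the user typed them."""
    terms = re.findall(r"[a-z']+", query.lower())
    kept = [t for t in terms if t not in STOP_WORDS]
    ignored = [t for t in terms if t in STOP_WORDS]
    return kept, ignored

def feedback(ignored):
    """Build a notice naming the ignored words, or an empty string if there are none."""
    if not ignored:
        return ""
    # Remove duplicates while keeping the typed order.
    words = list(dict.fromkeys(w.upper() for w in ignored))
    if len(words) == 1:
        return f"The word {words[0]} is not a significant word and was not used in the search."
    listed = ", ".join(words[:-1]) + " and " + words[-1]
    return f"The words {listed} are not significant words and were not used in the search."

kept, ignored = split_query("Jesus wept when you were there")
print(kept)               # ['jesus', 'wept']
print(feedback(ignored))  # The words WHEN, YOU, WERE and THERE are not significant words and were not used in the search.
```

The same notice could be shown above the hit list whenever `ignored` is not empty, not only when the whole query consists of stop words.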

Thanks,
Dennis

Cannot search for "can not"

#35

Post by dmaynes »

tomw wrote:This forum will do fine. Thanks for pointing it out. I'll forward over to the team responsible for the online scriptures.

Tom

There is something strange with the search engine's interpretation of searching for <<can not>>.

If I search for <<Jesus wept can not>> I get a list of 264 hits. http://scriptures.lds.org/en/search?typ ... ecked&bw=1

If I search for <<Jesus wept>> I get a list of 3 hits. http://scriptures.lds.org/en/search?typ ... ecked&bw=1

Why would I get more hits when I add the words <<can not>>?
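The expectation behind that question is that, when every word in the query is required, adding words can only keep or shrink the hit count. Here is a minimal sketch of a check for that rule. The `check_monotonic` helper is hypothetical, and the `reported` table simply replays the counts from this post instead of calling the site.

```python
def check_monotonic(search_count, base_query, extra_terms):
    """Return (query, count, base_count) for each extended query with MORE hits than the base."""
    base = search_count(base_query)
    violations = []
    for extra in extra_terms:
        query = f"{base_query} {extra}"
        count = search_count(query)
        if count > base:
            violations.append((query, count, base))
    return violations

# Hit counts as reported in this post (a stand-in for a live search call).
reported = {
    "Jesus wept": 3,
    "Jesus wept can not": 264,
    "Jesus wept cannot": 0,
}

print(check_monotonic(reported.__getitem__, "Jesus wept", ["can not", "cannot"]))
# [('Jesus wept can not', 264, 3)]
```

Only the <<can not>> query is flagged, which matches the behavior described here.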

If I search for <<Jesus~ wept~ can~ not~>> I get 0 hits. http://scriptures.lds.org/en/search?type=words&last=jesus+wept&help=&wo=checked&search=jesus~+wept~+can~+not~&do=Search&iw=scriptures&tx=checked&af=checked&hw=checked&sw=checked&bw=1

And, that is correct.

If I search for <<Jesus wept cannot>> I get 0 hits. http://scriptures.lds.org/en/search?type=words&last=jesus~+wept&help=&wo=checked&search=jesus+wept+cannot&do=Search&iw=scriptures&tx=checked&af=checked&hw=checked&sw=checked&bw=1

And, again this is correct.

The problem seems to involve the stop words "can" and "not": some automatic combining of "can" and "not" appears to result in "Jesus" and "wept" being discarded from the search.

I don't think I have done anything wrong, but it sure seems strange.

Thanks,
Dennis
mkmurray
Senior Member
Posts: 3266
Joined: Tue Jan 23, 2007 9:56 pm
Location: Utah

#36

Post by mkmurray »

Perhaps it detects "not" as a keyword and tries to implement some additional logic.

#37

Post by dmaynes »

mkmurray wrote:Perhaps it detects "not" as a keyword and tries to implement some additional logic.

I don't believe that it is an error introduced by trying to implement search logic.

Instead, it appears to be some mishandling in the processing of the stop words.

Here are additional observations.

If you are searching for "all word forms," the words "can" and "not" get combined together as the word "cannot," and other words appear to be ignored.

Example: <<fountain>> returns 38 hits: http://scriptures.lds.org/en/search?type=words&last=not+can+fountain~&help=&wo=checked&search=fountain&do=Search&iw=scriptures&tx=checked&hw=checked&sw=checked

<<fountain cannot>> returns 1 hit: http://scriptures.lds.org/en/search?typ ... ecked&bw=1

<<fountain not can>> returns 38 hits: http://scriptures.lds.org/en/search?typ ... ecked&bw=1 (because "not" and "can" are ignored in the search - I reported this bug earlier in the thread)

<<fountain can not>> returns 297 hits: http://scriptures.lds.org/en/search?typ ... ecked&bw=1 (the word "fountain" appears to be ignored)

<<fountain~ can not>> returns 0 hits: http://scriptures.lds.org/en/search?type=words&last=fountain+can+not&help=&wo=checked&search=fountain~+can+not&do=Search&iw=scriptures&tx=checked&hw=checked&sw=checked&bw=1 (this is correct -- note the tilde affixed to the word "fountain" says "search all word forms" but it disables some sort of parsing function that is involved with the stop words. It is this parsing function that appears to have the problem.)
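I cannot see how the site's parsing function works, so the following is only a guess at what a correct combining step might look like, assuming the engine really does join an adjacent "can" and "not" into "cannot." The `merge_can_not` function is hypothetical. The point is that only the pair is rewritten and every other word, such as "fountain," stays in the query.

```python
def merge_can_not(terms):
    """Join an adjacent "can", "not" pair into "cannot"; leave every other term untouched."""
    merged = []
    i = 0
    while i < len(terms):
        if terms[i] == "can" and i + 1 < len(terms) and terms[i + 1] == "not":
            merged.append("cannot")
            i += 2  # skip both halves of the pair
        else:
            merged.append(terms[i])
            i += 1
    return merged

print(merge_can_not(["fountain", "can", "not"]))  # ['fountain', 'cannot']
print(merge_can_not(["fountain", "not", "can"]))  # ['fountain', 'not', 'can']
```

Under this sketch, <<fountain can not>> would be searched as <<fountain cannot>>, which returned 1 hit above, rather than 297.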

Thanks,
Dennis
hatchaz-p40
New Member
Posts: 1
Joined: Wed Feb 10, 2010 10:49 pm
Location: Holbrook, Arizona, USA

Reference error D&C 124:99b

#38

Post by hatchaz-p40 »

The online scriptures have the same error as my hard copy version. The error has been corrected in more recent print editions.

D&C 124:99b links to Isaiah 40:3, but should link to Isaiah 40:31.