dmaynes wrote: Dashes that are present in the text appear to not be parsed or handled correctly. ... I haven't tried to determine the behavior for the second word that follows the dash.
The dash is usually written as two hyphens. It is a grammatical construct used to set off modifying phrases. The search engine appears to process the dash like a word connector (the way it processes a hyphen), not like a word separator.
Here is a summary of searching involving dashes and hyphens.
- Dashes are treated like word connectors and not word separators, just like hyphens. This appears to be a design flaw.
- Phrase search cannot be used to find phrases that include dashes.
- Phrase search will find phrases that include hyphenated words but the hyphenated or compound form must be specified as written in the text for the search to succeed.
- Stemming is active for the second part of a hyphenated word. It is not active for the word that follows a dash.
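The connector-versus-separator distinction in the summary above can be illustrated with a minimal Python sketch. This is not the search engine's actual code, which I have not seen. The function names `tokenize_connector` and `tokenize_separator` are hypothetical, and the sketch models only tokenization (no stemming or phrase handling), so it does not reproduce every result in the tests.

```python
import re

# Hyphen, en dash, and em dash; runs such as "--" are treated as one dash.
DASHES = re.compile(r"[-\u2013\u2014]+")
WORDS = re.compile(r"[a-z0-9]+")

def tokenize_connector(text):
    """Assumed model of the observed behavior: dashes and hyphens are
    dropped, so the words on either side are joined into one term."""
    return WORDS.findall(DASHES.sub("", text.lower()))

def tokenize_separator(text):
    """Behavior one would expect for a dash: it splits words like a space."""
    return WORDS.findall(DASHES.sub(" ", text.lower()))

sample = "I had an answer\u2014mitochondrial myopathy"
print(tokenize_connector(sample))  # joined term: answermitochondrial
print(tokenize_separator(sample))  # separate terms: answer, mitochondrial
```

Under the connector model the word before the dash and the word after it never appear in the index as separate terms, which would be consistent with <<answermitochondrial>> returning a result.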
Tests were performed using the following three quotes:
Neil Anderson quote
We limped along the tree-lined country road in second gear. It would be impossible to drive to Bordeaux in this condition, and we looked for possible help. Our first hope was a convenience store just preparing to close. I asked about possible rental-car locations or train stations nearby. We were far from any city of any size, however, and my questions brought little response.
Neil L. Andersen, “Room in the Inn,” Ensign, Dec 2008, 12–15
Shari Pippen quote
After two more years of testing and one outpatient surgery, I had an answer—mitochondrial myopathy, a form of muscular dystrophy. To this day I do not understand what it fully means. I do understand a little about what causes it and what I can do to lessen its accompanying symptoms and complications. While it is not life threatening, there is no cure.
Shari Phippen, “Becoming Spiritually Whole,” Ensign, Dec 2008, 34–36
Thomas S. Monson quote
It has been my privilege, accompanied by my counselors and by other General Authorities, to dedicate three new temples: in Curitiba, Brazil; in Panama City, Panama; and in Twin Falls, Idaho—bringing to 128 the number of temples in operation throughout the world.
Thomas S. Monson, “Welcome to Conference,” Ensign, Nov 2008, 4–6
Search full word without quotes
<<treelined>> returns Anderson quote -- (Expected result)
<<answermitochondrial>> returns Pippen quote -- (Unexpected result)
<<idahobringing>> returns Monson quote -- (Unexpected result)
Search partial word without quotes (stemming in second word)
<<treelin>> returns Anderson quote
Note: <<lin>> does not return "lined" in a general search
<<answermitochondria>> does not return Pippen quote
<<idahobring>> does not return Monson quote
Note: <<bring>> returns "bringing" in a general search
Search partial word with asterisk (wildcard)
<<treelin*>> returns Anderson quote -- (Expected result)
<<answermito*>> returns Pippen quote -- (Unexpected result)
<<idahobring*>> returns Monson quote -- (Unexpected result)
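The wildcard results are what one would see if the index held the joined terms and a trailing asterisk were treated as a prefix match. Here is a minimal sketch of that idea. The `matches` function and the `INDEXED` list are hypothetical; the list is only inferred from the test results, and stemming is not modeled, so the sketch does not explain why <<treelin>> succeeds without an asterisk.

```python
def matches(query, indexed_terms):
    """Return the indexed terms matched by a single query term.
    A trailing * is treated as a prefix wildcard; otherwise the
    term must match exactly (no stemming in this sketch)."""
    q = query.lower()
    if q.endswith("*"):
        prefix = q[:-1]
        return [t for t in indexed_terms if t.startswith(prefix)]
    return [t for t in indexed_terms if t == q]

# Joined terms the tests suggest are in the index (an inference, not confirmed).
INDEXED = ["treelined", "answermitochondrial", "idahobringing"]

print(matches("answermito*", INDEXED))  # prefix match on the joined term
print(matches("idahobring", INDEXED))   # exact match only, so nothing is found
```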
Phrase search with hyphen or dash in the middle of the phrase
<<"We limped along the treelined">> does not return Anderson quote -- (Unexpected result)
<<"we limped along the tree-lined">> returns Anderson quote -- (Expected result)
As far as I can tell, there is no way to use a phrase search to find content where the phrase contains a dash.
<<"I had an answer mitochondrial myopathy">> does not return Pippen quote
<<"I had an answer-mitochondrial myopathy">> does not return Pippen quote
<<"I had an answer--mitochondrial myopathy">> does not return Pippen quote
<<"idahobringing">> does not return Monson quote
<<"idaho bringing">> does not return Monson quote
<<"idaho-bringing">> does not return Monson quote
<<"idaho--bringing">> does not return Monson quote
Phrase search with hyphen or dash at the end of the searched phrase
<<"We limped along the tree">> does not return Anderson quote -- (Expected result)
<<"I had an answer">> does not return Pippen quote -- (Unexpected result)
Phrase search with hyphen or dash at the beginning of the searched phrase
<<"lined country road in second gear">> does not return Anderson quote -- (Expected result)
<<"mitochondrial myopathy">> does not return Pippen quote -- (Unexpected result)
Unquoted search with hyphen or dash at the beginning of the searched phrase
<<lined country road in second gear>> returns Anderson quote -- (Unexpected result)
<<mitochondrial myopathy>> returns Pippen quote -- (Expected result)