"Not Found" returned on some Come Follow Me lesson pages (wget)

chrissv
New Member
Posts: 35
Joined: Fri Mar 02, 2007 9:17 am
Location: Dudley, MA

"Not Found" returned on some Come Follow Me lesson pages (wget)

#1

Post by chrissv »

I am creating calendar entries for the Come Follow Me Individual and Family lessons. To avoid having to manually type in the info for each weekly lesson, I want to programmatically extract it from the Church web page for each week.

Based on clicking around the web page, I see that the format of the URL for each week is uniform, and it looks like this:
https://www.lds.org/study/manual/come-follow-me-for-individuals-and-families-new-testament-2019/XX?lang=eng
where "XX" goes from 01, 02, etc. up to 50.
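
As a rough illustration of the URL pattern described above, the week addresses could be generated as follows. This is only a sketch: the names `BASE_URL` and `week_urls` are made up for the example, and the two-digit zero padding is taken from the "01, 02, etc." description.

```python
# URL pattern from the post; "XX" is replaced by a zero-padded week number.
BASE_URL = (
    "https://www.lds.org/study/manual/"
    "come-follow-me-for-individuals-and-families-new-testament-2019/"
    "{week:02d}?lang=eng"
)


def week_urls(first=1, last=50):
    """Return the lesson URLs for weeks first..last (inclusive)."""
    return [BASE_URL.format(week=week) for week in range(first, last + 1)]


if __name__ == "__main__":
    for url in week_urls(1, 3):
        print(url)
```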

Easy enough: I wrote a simple script that uses "wget" to pull down each week's page to my hard drive, where I can extract the information.

But I am not succeeding in getting all of the weeks. In fact, I only get weeks 01 through 05, and then 10. Every other week number gets me a small web page that says "This page is unavailable. Error code: 2-1919".

I know the URLs for all those weeks are correct, since if I put any of them in my Chrome browser window I get the correct web page. But retrieving them with "wget" doesn't work.

If I were receiving the "Not found" for every attempted access, I would suspect something wrong with my script. But since I get some (but not all) of the week pages, I suspect the problem is on the lds.org side.

I tried adding a "Referer:" header to the request (with an lds.org web page as the referer) but that didn't change anything.
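
For anyone repeating the Referer experiment outside of wget, a request carrying that header might be built like this in Python. It is a sketch only: `build_request` is a made-up helper, and the default referer value is an assumption, since the post does not say which lds.org page was used. As noted above, sending the header did not change the outcome.

```python
import urllib.request


def build_request(url, referer="https://www.lds.org/"):
    """Build a GET request that carries a Referer header.

    The default referer is an assumed placeholder, not a value from the post.
    """
    return urllib.request.Request(url, headers={"Referer": referer})


# Usage (performs a real network request, so it is left commented out):
# with urllib.request.urlopen(build_request(some_week_url), timeout=30) as resp:
#     html = resp.read().decode("utf-8", errors="replace")
```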

Does anyone have any suggestions on how to get all of the weekly web pages downloaded?

Thanks,

Steven
sbradshaw
Community Moderators
Posts: 6245
Joined: Mon Sep 26, 2011 9:42 pm
Location: Utah

Re: "Not Found" returned on some Come Follow Me lesson pages (wget)

#2

Post by sbradshaw »

I have seen the "This page is unavailable. Error code: 2-1919" error when I try to access too many pages on LDS.org in too short a period of time. I suspect that there is some sort of security restriction to prevent the server from getting overloaded. Maybe adding a pause between each page would help.
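
A pause between requests could look roughly like the sketch below. The function names, the 5-second default, and the use of Python's `urllib` in place of wget are all assumptions made for illustration; the actual threshold of any server-side limit is not known.

```python
import time
import urllib.request


def fetch_page(url):
    """Download one page and return its body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(urls, delay_seconds=5.0, fetch=fetch_page, sleep=time.sleep):
    """Fetch each URL in order, pausing between consecutive requests.

    `fetch` and `sleep` are injectable so the pacing logic can be tried
    without touching the network.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages[url] = fetch(url)
    return pages
```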
chrissv
New Member
Posts: 35
Joined: Fri Mar 02, 2007 9:17 am
Location: Dudley, MA

Re: "Not Found" returned on some Come Follow Me lesson pages (wget)

#3

Post by chrissv »

Good suggestion. I added a 30-second delay between each page request (should be plenty of time), and it didn't change the outcome. Still getting the errors for the same pages.
russellhltn
Community Administrator
Posts: 34417
Joined: Sat Jan 20, 2007 2:53 pm
Location: U.S.

Re: "Not Found" returned on some Come Follow Me lesson pages (wget)

#4

Post by russellhltn »

Ordinarily, I'd suggest ctrl-F5 on those pages to make sure you're not pulling from cache. But I'm not sure how to do that with wget.
chrissv
New Member
Posts: 35
Joined: Fri Mar 02, 2007 9:17 am
Location: Dudley, MA

Re: "Not Found" returned on some Come Follow Me lesson pages (wget)

#5

Post by chrissv »

The program I am using, "wget", doesn't have the concept of a cache; it goes out to the web page directly.

I did some more testing, and it looks like things are not consistent. One time I get the "unavailable" page, and another time the same page is served up just fine. I've tried it from multiple computers, so I think it is something with lds.org.
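
Since the same page sometimes fails and sometimes succeeds, one possible workaround is to detect the error page and simply try again. The sketch below assumes the error page can be recognized by the text "This page is unavailable" quoted earlier in the thread; `fetch_with_retry`, the attempt count, and the delay are illustrative choices, and `fetch` stands for any function that returns a page's HTML for a given URL.

```python
import time

# Text seen on the error page, as quoted earlier in the thread.
ERROR_MARKER = "This page is unavailable"


def is_error_page(html):
    """Return True if the body looks like the 'unavailable' error page."""
    return ERROR_MARKER in html


def fetch_with_retry(url, fetch, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Call fetch(url) until a non-error page comes back or attempts run out.

    Returns the page text on success, or None if every attempt returned
    the error page.
    """
    for attempt in range(1, attempts + 1):
        html = fetch(url)
        if not is_error_page(html):
            return html
        if attempt < attempts:
            sleep(delay_seconds)  # pause before the next try
    return None
```

Whether retrying is enough depends on why the server returns the error in the first place, which this thread does not establish.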

Note that when I actually use a browser, I get the pages every time with no problem. It's only with this alternate method that it fails most of the time.