Google Calendar Sync Issue (Robots.txt)

Posted: Mon Aug 18, 2014 9:12 pm
by azwheels
I'm trying to one-way sync my ward calendar to my Google calendar. It was working well last week, but suddenly it was not refreshing when calendar events changed. So I deleted that calendar sync from my Google calendar and requested a fresh url from the lds.org calendar site. When I tried to sync with the new url, I received the following error message: Could not fetch the url because robots.txt prevents us from crawling the url.

Re: Google Calendar Sync Issue

Posted: Mon Aug 18, 2014 9:23 pm
by russellhltn
Google only syncs about once a day. So changes to the calendar will not be reflected immediately.

You can use the URL only once every nine minutes. If you try to refresh/reread it more often than that, you will get an error. Creating a new URL does NOT reset the timer. Note that this includes all devices that use the URL, so if you've just added a device or service to sync to, that could be the problem.
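
For anyone who wants to picture the nine-minute limit, here is a minimal Python sketch of that kind of throttle. It is not the actual lds.org code; the class name, the injectable clock, and the choice that a rejected attempt does not restart the timer are all assumptions made for illustration.

```python
import time

# Nine minutes, as described in the post above.
MIN_INTERVAL_SECONDS = 9 * 60


class FetchThrottle:
    """Hypothetical throttle shared by every device that reads one sync URL."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the behaviour can be tested
        self.last_fetch = None      # time of the last successful fetch

    def seconds_until_allowed(self):
        """Return how long a caller must still wait (0.0 if a fetch is allowed)."""
        if self.last_fetch is None:
            return 0.0
        elapsed = self.clock() - self.last_fetch
        return max(0.0, self.min_interval - elapsed)

    def try_fetch(self):
        """Record a fetch and return True, or return False if it is too soon."""
        if self.seconds_until_allowed() > 0:
            # Assumption: a rejected attempt does not restart the timer.
            return False
        self.last_fetch = self.clock()
        return True
```

Because the state lives in one shared object, a second device asking a minute after the first is refused, which matches the point about all devices counting against the same URL.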

Re: Google Calendar Sync Issue

Posted: Mon Aug 18, 2014 9:27 pm
by azwheels
My Google calendar was not refreshing the lds.org calendar data at all. Even after a week, the changes made on lds.org did not show up in the Google calendar. That's the reason I tried to re-sync. It has been over an hour since I tried to sync using the new url, and I still get the same error.

Re: Google Calendar Sync Issue

Posted: Mon Aug 18, 2014 10:51 pm
by russellhltn
Sync is working for me, so if there's an outage, it's not across the board. Have you tried putting the URL in a web browser? You should get an ICS file. (And then you'll have to wait 9 minutes before doing any further testing.)
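
If you'd rather check the downloaded text than eyeball it, a rough Python sketch like the following can tell an iCalendar file apart from an error page. The function name is made up for this example, and the check is deliberately loose: it only looks at the first and last non-blank lines.

```python
def looks_like_ics(body):
    """Return True if the text is wrapped in a VCALENDAR block.

    This is a loose sanity check, not a full iCalendar validator.
    """
    # Ignore blank lines and surrounding whitespace (files usually use CRLF).
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    if not lines:
        return False
    first, last = lines[0].upper(), lines[-1].upper()
    return first == "BEGIN:VCALENDAR" and last == "END:VCALENDAR"
```

Paste the contents of the downloaded file into a string and pass it in; an HTML error page or an empty response returns False.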

Re: Google Calendar Sync Issue

Posted: Tue Aug 19, 2014 9:41 am
by azwheels
I put the url into the Google Chrome web browser. It downloaded a file, which I opened in Notepad. Looks like it contains info about each calendar event.
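
For what it's worth, the per-event info in a file like that can be pulled out with a few lines of Python. This is only a sketch: the function names are my own, and it reads just the SUMMARY (title) line of each VEVENT block rather than parsing the whole format.

```python
def unfold(text):
    """Join folded lines: iCalendar continues a long line on the next
    line by starting it with a single space or tab."""
    text = text.replace("\r\n", "\n")
    return text.replace("\n ", "").replace("\n\t", "")


def event_summaries(ics_text):
    """Return the SUMMARY value of every VEVENT block, in file order."""
    summaries = []
    in_event = False
    for line in unfold(ics_text).split("\n"):
        if line == "BEGIN:VEVENT":
            in_event = True
        elif line == "END:VEVENT":
            in_event = False
        elif in_event and line.startswith("SUMMARY"):
            # The name may carry parameters, e.g. SUMMARY;LANGUAGE=en:Title
            name, _, value = line.partition(":")
            if name.split(";")[0] == "SUMMARY":
                summaries.append(value)
    return summaries
```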

I wonder if there is a setting in Google that tells robots.txt to allow all? Remember the error message is:

Could not fetch the url because robots.txt prevents us from crawling the url.

When I google this issue, here's what I get:

URLs restricted by robots.txt errors

Google was unable to crawl the URL due to a robots.txt restriction. This can happen for a number of reasons. For instance, your robots.txt file might prohibit the Googlebot entirely; it might prohibit access to the directory in which this URL is located; or it might prohibit access to the URL specifically. Often, this is not an error. You may have specifically set up a robots.txt file to prevent us from crawling this URL. If that is the case, there's no need to fix this; we will continue to respect robots.txt for this file.

If a URL redirects to a URL that is blocked by a robots.txt file, the first URL will be reported as being blocked by robots.txt (even if the URL is listed as Allowed in the robots.txt analysis tool).
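
To see how a rule like that blocks a URL, here is a small Python sketch using the standard library's robots.txt parser. The robots.txt contents and the example.org URLs below are invented for illustration; I have not looked at what the real lds.org robots.txt says.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every crawler is kept out of /calendar/.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /calendar/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these made-up rules, `is_allowed(EXAMPLE_ROBOTS, "Googlebot", "https://example.org/calendar/feed.ics")` comes back False, while a URL outside `/calendar/` comes back True. That is the "prohibit access to the directory" case from the help text.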

Re: Google Calendar Sync Issue

Posted: Tue Aug 19, 2014 11:34 am
by aebrown
azwheels wrote: I put the url into the Google Chrome web browser. It downloaded a file, which I opened in Notepad. Looks like it contains info about each calendar event.

I wonder if there is a setting in Google that tells robots.txt to allow all?
The fact that you could download the iCAL file shows that the LDS.org calendar sync service is working properly.

I don't understand why Google Calendar would even care about a robots.txt file. After all, the only operation that is needed is to download a single file from the URL that you supply (the sync URL). There should be no web crawling involved.

I personally sync my LDS.org calendar to Google Calendar, and I've never seen that error message.

Re: Google Calendar Sync Issue

Posted: Tue Aug 19, 2014 11:36 am
by azwheels
OK. Do you think maybe some malware or virus could cause this?

Re: Google Calendar Sync Issue

Posted: Tue Aug 19, 2014 11:44 am
by aebrown
azwheels wrote: OK. Do you think maybe some malware or virus could cause this?
I would think that is quite unlikely. The message does seem like something Google could be reporting, and that report would be based on information that Google sees on the lds.org server as it tries to download the file. I doubt that malware would interfere with that part of the process.

I'm not sure what to suggest. You might try generating a new sync URL on the LDS.org calendar and see if that helps.

Re: Google Calendar Sync Issue

Posted: Tue Aug 19, 2014 11:48 am
by azwheels
I did that. Twice, in fact. If you think of anything else I can try, please let me know. I really miss seeing the church events on my family calendar.

Re: Google Calendar Sync Issue

Posted: Tue Aug 19, 2014 12:09 pm
by russellhltn
azwheels wrote: Could not fetch the url because robots.txt prevents us from crawling the url.
You might re-check your procedure for entering the URL. I don't think "robots.txt" comes into play unless you're asking Google to index the link to make it available for a web search.