Late Collisions and familysearch.org FT issues

Discussions about Internet service providers (ISPs), the Meetinghouse Firewall, wired and wireless networking, usage, management, and support of Meetinghouse Internet
jasondivy
New Member
Posts: 11
Joined: Wed Dec 22, 2010 11:21 pm


#1

Post by jasondivy »

I'm the assistant stake technology specialist for family history in my stake. I'm tasked with keeping the Internet running in the Stake Center, where our FHC is located. I have one issue, and one error message. These may or may not be related.

The problem started when the FM changed our ISP from a DSL to a T1. FamilySearch FamilyTree is badly broken. At first, I thought the FamilySearch developers had changed something and would eventually fix it (as they are in active development). However, the problem has not gone away, and seems to only occur in our building, not from home. This leads me to believe there is a networking problem of some type.

FamilyTree pages fail to load, partially or in full. It is far more common for a page to fail than to load fully, but the specific behavior is erratic. "Server Error" or "Failed to load..." messages for family relationships, change history, sources, discussions, ordinances, etc. are very common. This occurs irrespective of browser. Since other websites don't appear to have problems, I speculate that there could be a problem on the familysearch.org servers that is being compounded by a local problem.

When they changed our ISP, the FM also changed out our Cisco 800 series VPN (same model, different unit). We had all kinds of trouble getting the unit to register, primarily because we needed to configure a static IP on the WAN (and the HTML interface was being disagreeable).

Now, when logging into the unit, setting "terminal monitor", and visiting FamilyTree, I see 2-4 "LATE COLLISION" errors on FE4 (WAN). Since this is an Ethernet-level error, I can't see how it would cause more than a slowed signal on an HTTPS connection (requiring the TCP layer to resend lost packets). In other words, I have my doubts that this is the actual problem. So far, though, it's my only real lead. Online forums suggest that this error should never occur in a proper setup, and that it is usually caused by either a duplex mismatch or a bad cable run. Since the cable run is brand new for this transition (and is probably a fairly long run; I can't measure it), I fear this may be the problem, though I'd like to have some better assurance before bothering the FM about it.
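For context on why this error is called "late": on half-duplex Ethernet, a collision is only expected while the first 512 bits (64 bytes) of a frame are on the wire. The following is a minimal sketch of that arithmetic, assuming the standard 512-bit slot time used by 10 and 100 Mbit/s Ethernet; the function names are illustrative only.

```python
# Sketch: the collision window on half-duplex 10/100 Mbit/s Ethernet.
# 512 bit times is the standard slot time at these speeds; everything
# else here (function names, sample values) is illustrative.

SLOT_TIME_BITS = 512  # 64 bytes, the minimum Ethernet frame size


def collision_window_us(link_mbps: float) -> float:
    """Return the normal collision window in microseconds."""
    # bits divided by (megabits per second) gives microseconds
    return SLOT_TIME_BITS / link_mbps


def is_late_collision(bits_sent_before_collision: int) -> bool:
    """A collision is 'late' if it happens after the slot time has elapsed."""
    return bits_sent_before_collision > SLOT_TIME_BITS


if __name__ == "__main__":
    print(f"100 Mbit/s window: {collision_window_us(100):.2f} us")
    print(f"Collision after 4000 bits is late: {is_late_collision(4000)}")
```

A collision inside that window is normal half-duplex behavior; one outside it means the far end started transmitting when it should have been able to hear the frame, which is why a duplex mismatch or an out-of-spec cable run are the usual suspects.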

The FE4 interface is in half duplex 100 mode. I have tried setting it to full duplex, and the connection goes down (it comes back up when set to half or auto). I do not yet know if the ISP equipment is in half, full, or auto duplex. I have already tried substituting a crossover cable in the networking cabinet between the wall jack and FE4, with no change in behavior (many sites recommend this for cases where auto MDI/MDI-X fails after duplex is set manually on Cisco equipment).

Again, I don't know if the familysearch.org problems and the late collisions are related in any substantial way.

Any theories or diagnostic suggestions are welcome. I don't have a Fluke or access to the building attic (where the cable runs), and I may need a good rationale before I can get the FM involved. I have physical access to all other equipment, including the networking cabinet, and the electrical closet where the T1 comes into the building. I'm not entirely sure what to look for in Wireshark, but I've used the program before and it doesn't scare me.

Thanks in advance.
russellhltn
Community Administrator
Posts: 34417
Joined: Sat Jan 20, 2007 2:53 pm
Location: U.S.

Re: Late Collisions and familysearch.org FT issues

#2

Post by russellhltn »

I wonder if you have the right firewall. It should be a Cisco 881W configured by CHQ. I would not expect you to have any console access to it. When the FM Group swapped it out, it might not have been scripted properly for a meetinghouse. You'd need to call the Global Service Desk and have them take a look at it.

It does sound like it's not real happy with your Ethernet LAN connection. Either the cable or the device at the other end. However, connecting a computer direct to the firewall should test that theory.
jasondivy
New Member
Posts: 11
Joined: Wed Dec 22, 2010 11:21 pm

Re: Late Collisions and familysearch.org FT issues

#3

Post by jasondivy »

russellhltn wrote: "I wonder if you have the right firewall. It should be a Cisco 881W configured by CHQ."
Sounds about right. I don't remember the precise model number offhand, but it is something thereabouts.
"I would not expect you to have any console access to it."
Yes, I have the password. Yes, I know the powers that be would prefer I didn't. Yes, I know that things could go very badly wrong with the wrong command. No, I haven't changed the bootable configuration. No, I haven't typed in random commands. I've been extraordinarily careful, and will consult the Global Service Desk if any real changes are needed. No, I won't tell you how I got the password; that could potentially allow others access. It is handy, though, to see error messages and query the port configurations.
"When the FM Group swapped it out, it might not have been scripted properly for a meetinghouse."
It was set up for a meetinghouse, but there is always the possibility of a configuration error: something that should have been set but didn't take for some reason. If I ask them to take a look at it, and they decide to reload the unit and give me a different IP range again, I am not looking forward to setting up 3 printers on 9 computers again (27 combinations = a pain in the neck to configure more than once; I've done it about 5 times now). This may be the only option. Or I could go through this to no avail (if it is the cable or a hardware incompatibility).
"It does sound like it's not real happy with your Ethernet LAN connection."
WAN, actually. FE 0-3 are LAN ports. FE4 is WAN.
"Either the cable or the device at the other end."
Both are possibilities. Both are non-trivial to test.
"However, connecting a computer direct to the firewall should test that theory."
Worth a try next time I get down there: give my laptop the ISP's IP address as a static address, install a web server (ick), and watch for late collisions.

On the other hand, it is probably not that simple. The 100BASE-T standard supports runs up to 100 meters. I expect enterprise-grade hardware to handle the distance. Experience tells me, however, that most consumer-grade NICs won't get nearly as far. On a run of this distance (over the gym and past several classrooms), I would expect my laptop to have problems. It should be under 100 m, but I can't verify that (estimates range from 100 to 150 ft).
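As a quick sanity check on those estimates, the sketch below converts the 100-150 ft figures to meters and compares them with the 100 m limit. It only checks length; it says nothing about termination quality or interference, and the function name is illustrative.

```python
# Sketch: compare the estimated cable run against the 100 m segment limit.

FEET_TO_METERS = 0.3048   # exact by definition of the international foot
MAX_SEGMENT_M = 100.0     # maximum twisted-pair segment length for 100BASE-T


def run_length_check(feet: float) -> tuple[float, bool]:
    """Return (length in meters, whether it is within the segment limit)."""
    meters = feet * FEET_TO_METERS
    return meters, meters <= MAX_SEGMENT_M


if __name__ == "__main__":
    for estimate_ft in (100, 150):
        meters, within_limit = run_length_check(estimate_ft)
        print(f"{estimate_ft} ft = {meters:.2f} m, within limit: {within_limit}")
```

If the estimates are anywhere near right, the run is less than half the permitted length, so length alone is an unlikely cause.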

I can connect my laptop directly at the cabinet, but that will only tell me if FE4 is failing outright. If it works fine, then it may still be the cable run, or an incompatibility with the NetVanta installed by the ISP (duplex mismatch, failing to auto sense), or outright hardware failure on either end.

And this is still assuming the late collisions are causing the familysearch.org problems, or are indicative of the same root cause. Packets dropped at the Ethernet level should be retransmitted at the TCP level. (That's the whole point behind TCP, actually.)
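To put a rough number on "slowed but not broken", the sketch below uses the well-known Mathis approximation for the steady-state throughput ceiling of a single TCP connection under random loss. The MSS, round-trip time, and loss rates are assumed values for illustration, not measurements from this building, and loss from a duplex mismatch tends to be bursty and load-dependent, so treat the result as an order-of-magnitude guide only.

```python
# Sketch: Mathis approximation, rate <= (MSS / RTT) * (C / sqrt(p)).
# All inputs used below are hypothetical; nothing here was measured.
import math

MATHIS_C = math.sqrt(3 / 2)  # constant from the Mathis model, about 1.22


def tcp_throughput_ceiling_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on one TCP flow's throughput, in bits/second."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in the range (0, 1]")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Assumed values: 1460-byte MSS, 50 ms round trip.
    for loss in (0.001, 0.01, 0.1):
        ceiling = tcp_throughput_ceiling_bps(1460, 0.05, loss)
        print(f"loss {loss:.1%}: ceiling about {ceiling / 1e6:.2f} Mbit/s")
```

Under this model, loss lowers the ceiling but does not by itself make a connection fail, which is consistent with the doubt expressed above that late collisions alone explain pages failing outright.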

I'll be back down there either tomorrow or Monday, depending.
jasondivy
New Member
Posts: 11
Joined: Wed Dec 22, 2010 11:21 pm

Re: Late Collisions and familysearch.org FT issues

#4

Post by jasondivy »

The ISP support line remains open till 7, but the engineers had already left.
The GSC connectivity team has also left for the day.

The ISP help desk did however confirm that their equipment is running in full duplex. This explains the late collisions detected, but not why the Cisco 800 refuses to switch to full duplex, nor is it proof positive that this has anything to do with the errors on FamilyTree.
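One possible reading of the half-versus-full result, sketched below: when one side is hard-set and the other is left on auto, the auto side can detect speed but not duplex, and falls back to half duplex. This is a simplified model that assumes the ISP equipment is forced to full duplex with autonegotiation disabled, which has not been confirmed here. The function names are illustrative.

```python
# Sketch: simplified model of how duplex resolves on a 100 Mbit/s link.
# Assumption: a port set to "half" or "full" does not autonegotiate, and
# an "auto" port facing such a peer falls back to half duplex.

VALID_SETTINGS = ("auto", "half", "full")


def effective_duplex(setting: str, peer_setting: str) -> str:
    """Return the duplex a port ends up using, given both ends' settings."""
    if setting not in VALID_SETTINGS or peer_setting not in VALID_SETTINGS:
        raise ValueError("settings must be 'auto', 'half', or 'full'")
    if setting != "auto":
        return setting          # forced ports use what they were told
    if peer_setting == "auto":
        return "full"           # both negotiate; assume both support full
    return "half"               # no negotiation partner: fall back to half


def link_outcome(local: str, remote: str) -> tuple[str, str, bool]:
    """Return (local duplex, remote duplex, whether they mismatch)."""
    local_duplex = effective_duplex(local, remote)
    remote_duplex = effective_duplex(remote, local)
    return local_duplex, remote_duplex, local_duplex != remote_duplex


if __name__ == "__main__":
    for local, remote in (("auto", "full"), ("half", "full"), ("auto", "auto")):
        print(local, remote, link_outcome(local, remote))
```

In a mismatch, the half-duplex side typically logs late collisions while the full-duplex side logs CRC or frame errors. This model does not explain why the link drops entirely when FE4 is forced to full, so that part still needs a separate answer.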

I guess I'm dealing with this on Monday (probably).
john84601
New Member
Posts: 47
Joined: Sun Mar 11, 2012 2:24 pm

Re: Late Collisions and familysearch.org FT issues

#5

Post by john84601 »

-- I think the biggest clue you have is that it's negotiating down to half duplex. I would suspect a cabling issue. If you have access to the console, I would look at the error counters (CRC or frame). In a "good" environment, you will rarely if ever see these numbers change. If there are problems at Layer 1 (cabling), you will likely see them increment.

-- To use Wireshark you would need either to 1) set up a mirror or SPAN port on the router (I wouldn't recommend this on church equipment) or 2) obtain access to an Ethernet TAP (pricey). Once you had a pcap, you could send it to someone like myself who could help you interpret it quickly. It's probably easier for you to find someone who has access to a Fluke to test the line.

-- My money is that it's bad cabling. Probably a broken wire, a poorly terminated connector, a cable run too close to power, etc. When you get it tested, make sure that the test is more than just a "continuity test" and that the tester will test the signal "at frequency".
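For watching the counters mentioned above, a small script can compare two snapshots of the interface statistics taken a few minutes apart. This is a minimal sketch: the sample text is made up and only approximates the layout of an interface counter listing, so the pattern may need adjusting to match what the router actually prints.

```python
# Sketch: diff error counters between two snapshots of interface statistics.
# BEFORE and AFTER are hypothetical samples, not output from this router.
import re

COUNTER_RE = re.compile(
    r"(\d+) (input errors|CRC|frame|collisions|late collision)\b"
)

BEFORE = """\
     12 input errors, 7 CRC, 5 frame, 0 overrun, 0 ignored
     0 output errors, 30 collisions, 1 interface resets
     0 babbles, 8 late collision, 0 deferred
"""

AFTER = """\
     12 input errors, 7 CRC, 5 frame, 0 overrun, 0 ignored
     0 output errors, 33 collisions, 1 interface resets
     0 babbles, 11 late collision, 0 deferred
"""


def parse_counters(text: str) -> dict[str, int]:
    """Extract the counters of interest as a name -> value mapping."""
    return {name: int(value) for value, name in COUNTER_RE.findall(text)}


def counter_deltas(before: str, after: str) -> dict[str, int]:
    """Return only the counters that changed between the two snapshots."""
    old, new = parse_counters(before), parse_counters(after)
    return {
        name: new[name] - old.get(name, 0)
        for name in new
        if new[name] != old.get(name, 0)
    }


if __name__ == "__main__":
    print(counter_deltas(BEFORE, AFTER))
```

Counters that climb only while traffic is flowing point toward a duplex or cabling problem on that link; counters that never move suggest looking elsewhere.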