Monitoring And Troubleshooting Active Directory Replication

  • November 2019

Replication is the process of keeping duplicate copies of the same data on one or more systems. In a directory service such as Active Directory, every domain controller carries a copy of the directory database, so a client can always contact a nearby domain controller and requests do not have to cross the wide area network (WAN).

Active Directory replication operates within the directory service component of the security subsystem. This component is implemented in Ntdsa.dll and is accessed through the Lightweight Directory Access Protocol (LDAP). Ntdsa.dll runs as part of the local security authority (LSA), which runs as Lsass.exe. Updates are transported over Internet Protocol (IP) by the remote procedure call (RPC) protocol. The Simple Mail Transfer Protocol (SMTP) is also available, although RPC over IP is far more common.

During replication, changes to the Active Directory database are copied to all other participating domain controllers on your network. In a perfect world, every copy of the database is identical and all domain controllers are synchronized. When you install Active Directory, even with all the default settings, replication from domain controller to domain controller is automatic and practically transparent. For the most part, domain controllers handle replication without advanced configuration and without problems. Figure 1 shows a common network (two sites connected by a WAN link) with a domain controller in each location.

Placing a domain controller close to the PCs on each network segment has two benefits: requests are answered faster because they stay local, and if the WAN link drops, the local PCs can still find a domain controller to use. Keeping this traffic off the WAN and on the local area network (LAN) is a sound design practice.

Figure 1: A Common Wide Area Network (WAN)

As a systems administrator, you still need to monitor and analyze Active Directory performance. The health and performance of Active Directory depend on a smooth replication process. If replication is failing, you will know not only from the errors logged in Event Viewer but from poor performance as well. You cannot stop every problem from occurring, but after reading this article you should be better equipped to handle issues and keep your network optimized for the traffic that crosses it. Consider a common problem such as a failed network link. In figure 2, the main wide area network link has been broken.

Figure 2: A Failed Network Link

ISPs and telecom service providers occasionally have problems, and service can be interrupted. An outage stops communication between domain controllers and therefore halts replication. Information is then no longer synchronized between domain controllers, which can lead to inconsistent data and other problems. A good safeguard is a backup link (such as the ISDN link shown in figure 2). ISDN (Integrated Services Digital Network) is a digital WAN technology used to connect sites; today it is used mostly for disaster recovery. You are not limited to any one technology for backup links: a fractional or full T1, a DSL line, or any other technology that gives you redundancy will do. The goal is redundant links that keep your domain controllers in constant communication so that the Active Directory database stays synchronized and healthy.

A common symptom of replication problems is that information is not updated on some or all domain controllers. For example, a systems administrator creates a user account on one domain controller, but the change is not propagated to other domain controllers. In most environments, this is a potentially serious problem because it affects network security and can prevent authorized users from accessing the resources they require. You can take several steps to troubleshoot Active Directory replication; each of these is discussed in the following sections.

Verifying Network Connectivity

For replication to work properly in distributed environments, you must have network connectivity. Ideally, all domain controllers would be connected by high-speed, redundant LAN or WAN links, but this is rarely the case in larger deployments, and many companies rely on slow WAN links with no failover. Always make sure your network topology is documented and tested. Many tools can verify connectivity, such as Ping and Tracert, which ship with just about every operating system that runs TCP/IP.

In real-world deployments, analog/dial-up connections and slow connections are common. Once you have verified that your replication topology is set up properly, confirm that your servers are able to communicate over the network. A problem such as a failed dial-up connection attempt can prevent important Active Directory information from being replicated. The links section at the end of this article explains how to use ping and other ICMP-based troubleshooting tools.
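Ping only proves that a host answers ICMP; it does not prove that a particular service port is reachable. As a complement, here is a minimal Python sketch that attempts a TCP connection to a list of ports. The function names are illustrative and not part of any Windows tool, and the host name in the usage note is a placeholder.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host: str, ports, timeout: float = 2.0) -> dict:
    """Map each port in ports to True (reachable) or False (not reachable)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

For example, check_ports("dc2.example.com", [135, 389]) would report whether the RPC endpoint mapper and LDAP ports of a (hypothetical) remote domain controller accept connections. A False result can mean a firewall, a stopped service, or a DNS failure, so treat it as a starting point for further diagnosis.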

Verifying Router and Firewall Configurations

In a secure network, controls are usually placed on network devices to filter the traffic moving from place to place. The most common tool for this is a firewall, but a router with a firewall feature set, or any other device that applies access control between hosts, can do the same job. A firewall is usually dedicated to protecting the perimeter. Do not assume that a firewall removes all risk of attack; it only reduces that risk.

Firewalls restrict the types of traffic that can pass between networks. Their main purpose is to increase security by preventing unauthorized users from transferring information. In some cases, company firewalls block the types of network access that Active Directory replication needs. For example, if a router or firewall prevents data from being transferred using SMTP, replication that uses this protocol will fail.

Network Ports Used by Active Directory Replication

RPC replication uses dynamic port mapping by default. To connect to an RPC endpoint during Active Directory replication, the RPC client first contacts the RPC endpoint mapper on the server at the well-known TCP port 135. The endpoint mapper returns the port of the requested service, which RPC allocates dynamically from the high TCP ports, 1024 to 65535. Because of this arrangement, a client never needs to know in advance which port to use for Active Directory replication; it just takes place seamlessly. Other ports are also used by Active Directory replication. They are as follows:

Protocol                UDP port    TCP port
LDAP                    389         389
LDAP (SSL)              636         636
Kerberos                88          88
DNS                     53          53
SMB over IP             445         445
Global Catalog Server   (none)      3268, 3269
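When reviewing a firewall rule set against the ports listed above, it can help to express the requirement as data. The following Python sketch does that; the data structure and function names are assumptions made for illustration, and the rule format (sets of protocol/port pairs and inclusive port ranges) is hypothetical rather than that of any particular firewall product.

```python
# Fixed ports taken from the list above, plus the RPC endpoint mapper.
REQUIRED_PORTS = {
    ("tcp", 135): "RPC endpoint mapper",
    ("udp", 389): "LDAP", ("tcp", 389): "LDAP",
    ("udp", 636): "LDAP (SSL)", ("tcp", 636): "LDAP (SSL)",
    ("udp", 88): "Kerberos", ("tcp", 88): "Kerberos",
    ("udp", 53): "DNS", ("tcp", 53): "DNS",
    ("udp", 445): "SMB over IP", ("tcp", 445): "SMB over IP",
    ("tcp", 3268): "Global Catalog Server",
    ("tcp", 3269): "Global Catalog Server",
}

# Dynamic RPC ports, inclusive on both ends.
RPC_LOW, RPC_HIGH = 1024, 65535


def blocked_services(allowed):
    """List required (protocol, port) pairs missing from the allowed set."""
    return sorted(
        f"{proto}/{port} {name}"
        for (proto, port), name in REQUIRED_PORTS.items()
        if (proto, port) not in allowed
    )


def covers_dynamic_rpc(allowed_ranges):
    """True if the inclusive (low, high) TCP ranges cover 1024-65535 without gaps."""
    next_needed = RPC_LOW
    for low, high in sorted(allowed_ranges):
        if low > next_needed:
            break  # gap found before the range is fully covered
        next_needed = max(next_needed, high + 1)
    return next_needed > RPC_HIGH
```

An empty list from blocked_services together with True from covers_dynamic_rpc suggests that the rule set does not block the ports discussed in this section; it says nothing about other filtering the device may apply.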

Examining the Event Logs

Errors, if they occur, show up in the Event Viewer logs. At the end of this article, I have placed a link to the Microsoft Website so that you can learn how to use the Event Viewer, which can be very helpful when you are trying to locate and resolve a replication problem. Whenever an error in the replication configuration occurs, the computer writes events to the Directory Service and File Replication Service (FRS) event logs. With the Event Viewer administrative tool, you can quickly view the details of any replication problem. For example, if one domain controller is not able to communicate with another to transfer changes, a log entry is created.

You may receive events such as:

  • Event ID 1311 in the directory service log
  • Event ID 1265 with error "DNS Lookup Failure" or "RPC server is unavailable" in the directory service log, or "DNS Lookup Failure" or "Target account name is incorrect" returned by the repadmin command
  • Event ID 1265 "Access denied" in the directory service log, or "Access denied" returned by the repadmin command

Note: The link at the end of the article covers the explanation of these specific errors and more.
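If you export event log entries for offline review, a small filter can pull out the event IDs mentioned above. The sketch below is only an illustration: the record layout (log name, event ID, message) is a hypothetical format chosen for the example, not the format Event Viewer exports.

```python
# Event IDs named in this article as replication-related.
WATCHED_EVENT_IDS = {1311, 1265}


def replication_events(events):
    """Keep Directory Service entries whose ID is one of the watched IDs.

    events: iterable of (log_name, event_id, message) tuples (hypothetical layout).
    """
    return [
        (log_name, event_id, message)
        for log_name, event_id, message in events
        if log_name.lower() == "directory service" and event_id in WATCHED_EVENT_IDS
    ]
```

The log name is compared case-insensitively so that minor differences in how an export spells it do not hide a match.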

Verifying Site Links

Before domain controllers in different sites can communicate with each other, the sites must be connected by site links. If replication between sites is not occurring properly, verify that the proper site links are in place. The Replication diagnostics utility (Repadmin.exe) can help: use it to verify site links, to display inbound and outbound connections, and to display the replication queue. You can get the tool by using the link at the end of this article.

Verifying That Information Is Synchronized

It’s easy to forget to perform manual checks on the replication of Active Directory information. One reason is that each Active Directory domain controller has its own read/write copy of the Active Directory database. If connectivity between domain controllers is lost, you can still create new objects without seeing any failure, so nothing warns you that the change has not replicated. It is therefore important to verify periodically that objects have been synchronized between domain controllers. This can be as simple as logging on to a different domain controller and looking at the objects within a specific OU. The manual check may be tedious, but it can prevent inconsistencies in the information stored on domain controllers, which, over time, can become an administration and security nightmare.
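The manual comparison described above amounts to a set difference. Assuming you have already obtained the list of object names in an OU from each of two domain controllers (how you export them is up to you and is not shown), a Python sketch of the comparison could look like this. The function name and the sample names in the usage note are hypothetical.

```python
def compare_directory_views(names_on_dc_a, names_on_dc_b):
    """Return (only_on_a, only_on_b) for two lists of distinguished names.

    Names are compared case-insensitively, because distinguished names
    in Active Directory are not case-sensitive.
    """
    a = {name.strip().lower() for name in names_on_dc_a}
    b = {name.strip().lower() for name in names_on_dc_b}
    return sorted(a - b), sorted(b - a)
```

For instance, if a user account was created on the first domain controller and has not yet replicated, its name appears in the first list of the result. Two empty lists mean the two views match for that OU at the time of the export.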

Verifying Authentication Scenarios

A common replication configuration issue occurs when clients are forced to authenticate across slow network connections. The primary symptom is that users complain about how long it takes to log on to Active Directory, especially when many users authenticate at once, such as at the beginning of the workday. Usually, you can alleviate this problem by adding domain controllers or reconfiguring the site topology. A good way to test for it is to consider the possible scenarios for the various clients that you support. Walking through a configuration, such as “A client in Domain A is trying to authenticate using a domain controller in Domain B, which is located across a very slow WAN connection,” can be helpful in pinpointing potential problem areas.

Verifying the Replication Topology

The Active Directory Sites and Services tool allows you to verify that a replication topology is logically consistent. To perform this check, right-click the NTDS Settings within a Server object and choose All Tasks => Check Replication Topology. If any errors are present, a dialog box alerts you to the problem.

Besides ensuring that replication continues, you can also monitor it. There are several ways to monitor the behavior of Active Directory replication and troubleshoot the process if problems occur. In our next article we will look at the replication monitor, and part III of this article will cover the system monitor.

Viewing the Routing Tables

The routing tables are an important part of Windows’ TCP/IP protocol stack, but they aren’t something that the operating system normally displays to the casual user. To see the routing tables, open a Command Prompt window and enter the ROUTE PRINT command. You will see a screen similar to the one shown in Figure A.

Figure A: This is what the Windows routing tables look like

Before I delve into the routing tables, I recommend entering another command into the Command Prompt window. The command is:

IPCONFIG /ALL

I recommend IPCONFIG /ALL because it shows you how TCP/IP is really set up on the machine. Sure, you could look in the TCP/IP section of the network adapter’s properties sheet, but the information is more reliable if you get it from IPCONFIG. I have seen a couple of instances over the years in which IPCONFIG reported completely different information than what was entered into the machine’s TCP/IP configuration screen. This doesn’t happen often, but the right type of error can cause such a mismatch. To put it bluntly, the information keyed into the TCP/IP properties sheet reflects how you would like Windows to set up the TCP/IP protocol for the chosen network. The information presented by IPCONFIG shows how Windows has actually configured the protocol.

Even if you haven’t had some bizarre Windows error, it’s still useful to get your configuration information through IPCONFIG. If a machine has multiple network cards, it can be tough to remember which configuration is bound to which card. IPCONFIG lists the various configurations in an easy-to-read, per-NIC layout, as shown in Figure B.

Figure B: The IPCONFIG /ALL command displays the machine’s TCP/IP configuration on a per-NIC basis

Examining the Routing Tables

Right about now you might be wondering why I had you run IPCONFIG /ALL when this article is supposed to be discussing routing tables. The reason is that you normally never look at the routing tables unless you are having problems with your machine. If you are having problems, the best place to start troubleshooting is to compare the information provided by IPCONFIG to the information stored in the routing tables.

As you saw in Figure B, the IPCONFIG /ALL screen displays some basic TCP/IP information such as the IP address and the default gateway. The routing tables aren’t quite as intuitive, so I want to take some time to discuss how to read them and what the information in them means.

To understand what the columns mean, you need to understand a little bit about how a router works. A router’s job is to move traffic from one network to another. As such, a router contains multiple network interface cards, each connected to a different network segment. When a user sends a packet that’s destined for a different network segment than the one the PC is attached to, the packet is sent to the router. It is up to the router to figure out which network segment the packet should be forwarded to. It doesn’t matter whether the router is connected to two network segments or a dozen. The decision-making process is the same, and it’s all based on routing tables.

If you look at the Route Print screen, you will notice that the routing tables are divided into five columns:

  • Network Destination lists the networks (and individual hosts) that the machine knows how to reach.
  • Netmask provides the subnet mask not of the network interface that’s attached to the segment, but of the destination itself. The mask tells the router which part of a packet’s destination address has to match the Network Destination entry for the route to apply.
  • Gateway tells the router which IP address the packet should be forwarded through in order to reach the destination network.
  • Interface tells the router which NIC is connected to the appropriate destination network. Technically, the column only contains the IP address that has been assigned to that NIC, but the router knows which physical interface the address is bound to.
  • Metric is used to choose between multiple routes to the same destination.

Metrics are a science in themselves, but I will try to give you a brief explanation of what they do. The best way that I have ever heard metrics explained is in terms of an airport. Imagine for a moment that I needed to fly from Charlotte, NC (the closest major airport to my home in South Carolina) to Miami, Florida. Because the Charlotte airport is pretty big, I have a lot of choices for getting to Miami. I could hop a Northwest Airlines flight, which would take me to Detroit, Michigan, and then down to Miami (Detroit is a bit out of the way). Likewise, I could hop a Continental Airlines flight that would take me to Houston, TX, and then to Miami. Another option would be to take a US Airways flight nonstop to Miami. So which airline should I take? In real life, there are a lot of factors to consider, such as the price of the ticket and the departure times, but let’s assume that everything is equal.

If there were no differences between the airlines other than the route, I would fly the airline that makes the fewest stops. It would get me to my destination more quickly, and with fewer stops there would be less chance of a missed connection, lost luggage, and things like that.

Routing works the same way. Often there is more than one way that a router could send a packet. In such a case, it makes sense to send the packet along the shortest (or most reliable) path. This is where the metrics come into play. Windows does not even look at metrics unless there are multiple paths to a destination. If there are multiple paths, Windows checks the metrics and uses the route with the lowest one. This is an oversimplified explanation, but it gets the point across.
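The column descriptions above can be turned into a small model of how a route is chosen. The Python sketch below is a simplification under stated assumptions: the Route type and pick_route function are invented for illustration, and the rule that a more specific (longer) netmask wins before metrics are compared is standard IP routing behavior that the article’s simplified explanation does not spell out.

```python
import ipaddress
from typing import NamedTuple, Optional, Sequence


class Route(NamedTuple):
    """One row of a routing table, using the five ROUTE PRINT columns."""
    destination: str
    netmask: str
    gateway: str
    interface: str
    metric: int


def pick_route(routes: Sequence[Route], target: str) -> Optional[Route]:
    """Pick the route for target: most specific match first, then lowest metric."""
    addr = ipaddress.ip_address(target)
    candidates = []
    for route in routes:
        # Destination plus netmask together describe the network the row covers.
        network = ipaddress.ip_network(f"{route.destination}/{route.netmask}")
        if addr in network:
            candidates.append((network.prefixlen, route))
    if not candidates:
        return None
    # Longer prefix wins; among equal prefixes the lower metric wins.
    return min(candidates, key=lambda c: (-c[0], c[1].metric))[1]
```

With two default routes (destination 0.0.0.0, netmask 0.0.0.0) that differ only in metric, the function returns the one with the lower metric, which mirrors the nonstop-flight choice in the analogy.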

Additional Routing Options

Earlier, I showed you the Route Print command, but there are actually a lot of other things that you can do with the ROUTE command. The ROUTE command’s syntax is as follows:

ROUTE [-f] [-p] command [destination] [MASK netmask] [gateway] [METRIC metric] [IF interface]

The -f switch is optional. It tells Windows to clear the routing table of all gateway entries. If the -f switch is used in conjunction with other commands, all gateway entries are cleared before the rest of the command is executed.

The -p switch makes a specified route persistent. Normally, any routes that you specify via the ROUTE command are removed when the server is rebooted. The -p switch tells Windows to keep the route even if the system is rebooted.

The command portion of the syntax is relatively simple. The command set consists of four options: PRINT, ADD, DELETE, and CHANGE. I’ve already shown you the ROUTE PRINT command, but even it has other options. For example, you can use wildcards: if you only wanted to print routes whose destination begins with 192, you could use the command ROUTE PRINT 192*.

The ROUTE DELETE command works very similarly to the ROUTE PRINT command. Simply enter ROUTE DELETE followed by the destination that you want to delete from the routing table. For example, to remove the route to the 192.0.0.0 destination, you could enter the command ROUTE DELETE 192.0.0.0.

The ROUTE CHANGE and ROUTE ADD commands share the same basic syntax. When you enter either command, you usually specify the destination, subnet mask, and gateway. You might also specify a metric and an interface, but that’s optional. The subnet mask must be introduced by the MASK keyword; without it, Windows treats the value that follows the destination as the gateway. For example, to add a destination using the bare minimum syntax, you could enter:

ROUTE ADD 147.0.0.0 MASK 255.0.0.0 148.100.100.100

In this command, 147.0.0.0 is the new destination that you are adding, 255.0.0.0 is the subnet mask for the destination, and 148.100.100.100 is the gateway address. You can extend the command with the METRIC and IF parameters. Doing so would look something like this:

ROUTE ADD 147.0.0.0 MASK 255.0.0.0 148.100.100.100 METRIC 1 IF 1

The METRIC parameter is optional and specifies the cost metric (loosely, the number of hops) for the route. The IF parameter tells Windows which NIC to use. In this particular case, Windows would use the NIC listed as interface 1 in the interface list at the top of the ROUTE PRINT output. If you don’t use the IF parameter, Windows automatically searches for the best interface to use.
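If you add routes from a script, it is worth validating the addresses before handing them to ROUTE. The Python sketch below only assembles the argument list and does not run the command. The function name is hypothetical, and the MASK keyword it emits is the keyword the Windows ROUTE command uses to introduce the subnet mask.

```python
import ipaddress


def route_add_args(destination, netmask, gateway,
                   metric=None, interface=None, persistent=False):
    """Build the argument list for a Windows ROUTE ADD command."""
    # ip_address raises ValueError for malformed input, so typos fail early.
    for value in (destination, netmask, gateway):
        ipaddress.IPv4Address(value)
    args = ["route"]
    if persistent:
        args.append("-p")          # keep the route across reboots
    args += ["ADD", destination, "MASK", netmask, gateway]
    if metric is not None:
        args += ["METRIC", str(metric)]
    if interface is not None:
        args += ["IF", str(interface)]
    return args
```

Joining the result of route_add_args("147.0.0.0", "255.0.0.0", "148.100.100.100", metric=1, interface=1) with spaces yields the extended command discussed in the text. The list form can be passed to subprocess.run on a Windows machine with administrative rights, which is deliberately left out of the sketch.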

Conclusion

In this article, I have explained how to use the ROUTE command to display the Windows routing tables and make changes to those tables if necessary. If you need a little extra help, you can get more syntax examples by entering the ROUTE /? command.

Testing through Nslookup

The DNS protocol has been around for decades and is a stable and reliable protocol. Even so, DNS does occasionally have problems. These problems might stem from a loss of connectivity, an invalid DNS record, or a number of other issues. When a DNS server doesn’t behave in the way that it is expected to, many people turn to the PING command for help. PING is a great tool for DNS server diagnosis, and I use it quite frequently myself. However, sometimes PING just doesn’t give you enough information about the problem at hand. When you need more information about a DNS problem than PING provides, you can turn to the NSLOOKUP command. NSLOOKUP is a built-in DNS diagnostic utility that’s available to both Windows and UNIX administrators. In this article, I will show you how to use NSLOOKUP.

The Basics

NSLOOKUP has a fairly rich syntax and can be a bit confusing for those who have not worked with DNS a great deal. Therefore, I want to start out by showing you some of the basics. Although NSLOOKUP exists in both UNIX and Windows, there are some differences in the way that it behaves in the two operating systems. For the purposes of this article, I will be using the Windows version.

The first thing that you need to understand about NSLOOKUP is that it assumes you are querying a local domain on your private network. You can query an external domain, but NSLOOKUP will try to search for the domain internally first. For example, the brienposey.com domain is external to my network. If I perform an NSLOOKUP against brienposey.com, NSLOOKUP returns the information that’s shown in Figure A.

Figure A: This is what happens when NSLOOKUP queries an external domain

If you look at the figure, you will see non-existent domain error messages for the IP addresses 147.100.100.34 and 147.100.100.5. These are the addresses of my internal DNS servers. Below this information, however, is the non-authoritative answer. It means that my DNS server queried an external DNS server in an effort to resolve the IP address associated with the brienposey.com domain.

Now, let’s take a look at what happens when you query an internal domain. One of the local domains on my private network is production.com. If I perform an NSLOOKUP against production.com, I get the results shown in Figure B.

Figure B: This is what it looks like when I query an internal domain

If you look at the top portion of this screen, you will notice the same non-existent domain error messages that I got when I queried an external domain. At first, this may seem puzzling. I got the error messages because I performed the NSLOOKUP outside of the NSLOOKUP shell. I will talk more about the NSLOOKUP shell in the next section. For now, you need to know that you can enter the NSLOOKUP command by itself. When you do, you will see the familiar non-existent domain error messages, but you will then be taken to the NSLOOKUP prompt (the > sign). From there you can enter various NSLOOKUP commands. When you are done, use the EXIT command to return to the command prompt.

The other thing to notice about Figure B is the bottom portion of the output. Beneath the reference to production.com is a string of IP addresses. These are the IP addresses of all of the domain controllers within the domain. If multiple IP addresses are assigned to a single server, NSLOOKUP displays all of that server’s addresses.

The NSLOOKUP Shell

Now that I have shown you how to use the NSLOOKUP command to see the IP address or addresses associated with a domain, let’s do something a little more useful. One of the things that you can do with NSLOOKUP is look up a specific type of DNS record. A good example is the MX record. In case you aren’t yet familiar with all of the intricacies of DNS, the MX record points to the organization’s mail server.

For example, suppose that someone wanted to send an E-mail message to you. One of the first things that their mail server would have to do is find an IP address to deliver to, but a normal address resolution won’t usually work for this purpose. In Figure A, you saw that when I ran a DNS query against the brienposey.com domain, the domain resolved to the address 24.235.10.4. Keep in mind, though, that this is the IP address of the server that hosts my Web site, not the address of my mail server. If someone wanted to send me an E-mail message, their mail server would have to locate my domain’s mail server. This is where the MX record comes into play. The MX record is a record on a domain’s DNS server that names the domain’s mail server; the sender then resolves that host name to an IP address.

As you can see, the MX record is rather important. Suppose, however, that your domain was having trouble receiving E-mail and you suspected that a DNS server issue was to blame. You could use NSLOOKUP to confirm that the domain does indeed have an MX record and that the MX record points to the correct mail server.

Earlier I briefly mentioned that you could work within the NSLOOKUP shell. To troubleshoot an MX record problem, you pretty much have to work within this shell. Start by entering the NSLOOKUP command at the command prompt. Once the NSLOOKUP shell is open, tell NSLOOKUP which DNS server you want to query by entering the SERVER command followed by the DNS server’s IP address. You can also enter the server’s fully qualified domain name (assuming that it can be resolved) instead of its IP address.

Now that you have specified a DNS server for NSLOOKUP to use, you can query domains without receiving the non-existent domain error messages that you saw earlier (as long as you remain within the NSLOOKUP shell). To do so, simply type the domain name that you want to query. For example, Figure C shows where I have specified a particular DNS server and then queried an external and an internal domain.

Figure C: The error messages go away if you specify a DNS server

Now, let’s get back to the business of looking up a domain’s MX record. To do so, you need to issue a command that tells NSLOOKUP to query based on MX records. The command is:

SET QUERY=MX

Issuing this command by itself won’t give you any information about the domain’s MX record, though. For that, you have to actually query the domain by entering the domain name. If you look at Figure D, you will see that I have specified an MX query and then entered the production.com domain name. NSLOOKUP now returns a wealth of information pertaining to my domain’s MX record.

Figure D: When an MX query is specified, you can get a wealth of information about your domain’s MX record
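To make the effect of the MX query type more concrete, the following Python sketch builds the raw DNS question that a resolver sends when it asks for MX records. It is an illustration of the DNS wire format, not a description of how NSLOOKUP is implemented; the function name and the fixed query ID are arbitrary choices, and sending the packet (normally to UDP port 53 of the DNS server) is deliberately left out.

```python
import struct

QTYPE_MX = 15    # record type number assigned to MX in the DNS specification
QCLASS_IN = 1    # the Internet class


def build_query(domain: str, qtype: int = QTYPE_MX, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message containing one question for domain."""
    # Header: ID, flags (0x0100 = recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question
```

The only difference between this question and an ordinary address lookup is the type field: 15 for MX instead of 1 for an A record. That single field is what changes when you switch the query type in the NSLOOKUP shell.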

Conclusion

As you can see, NSLOOKUP can provide you with a wealth of DNS server diagnostic information. However, NSLOOKUP is not limited to providing the types of information that I have discussed. The NSLOOKUP shell is actually a fairly rich interface with a rather large command set. You can view a list of the available commands and their syntax by entering a question mark at the NSLOOKUP prompt (note: you cannot use NSLOOKUP /? to view the command set).
