File System Factory™                                                         

Novell File System Factory 1.2.1 Best Practices

January 2005

This document is intended as a guide in installing, configuring and using File System Factory™ in a variety of situations and environments. The comments here are directed at installations based on Version 1.2.1 or later.

1. Selecting an Engine Server

The focal point of File System Factory™ (FSF) is the Engine (FSFENGIN.NLM). This component makes decisions and carries out the policies and directives defined by the administrator.

 FSF is for the most part a background process that takes action based on events as they occur in the Directory. Even so, the FSF engineering team is sensitive to the general need to have transactions complete quickly. The Engine has been designed and tested so that it meets this need without requiring the resources of a high-end server, even in a larger environment.

 The Engine will have no problem or negative impact running on a general purpose server that is also serving files to users. However, in larger shops, the general industry trend toward smaller special purpose servers has proven to be beneficial in many cases.

 Therefore, the general recommendation is that you designate a small utility server to run the Engine. There are no known incompatibilities with other Novell or third-party software, so the Engine can share a server with other services if needed.

 The following suggested configuration will more than adequately support an FSF Engine providing services for 50,000 users:

  • A small general purpose NetWare certified server (Ex: Dell® PowerEdge 2650 or Compaq® Proliant DL320)
  • Single 2 GHz Processor
  • 1 Gig RAM
  • Disk requirements beyond those of the NetWare OS itself are modest: roughly 1 Meg of disk per 1000 users managed, or about 50 Meg to support 50,000 users.
  • NetWare 6.0 or later at the latest support pack.
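
The disk sizing rule of thumb above can be worked through as a small calculation. The following is only an illustrative sketch in Python; the function name and the choice to round up to the next whole Meg are assumptions, not part of FSF.

```python
import math

# Rule of thumb from this guide: roughly 1 Meg of disk per 1000 managed users.
MB_PER_1000_USERS = 1


def estimated_disk_mb(user_count: int) -> int:
    """Return the approximate disk space (in Meg) for the given number of users.

    Rounding up to the next whole Meg is an assumption made for this sketch.
    """
    if user_count < 0:
        raise ValueError("user_count must be non-negative")
    return math.ceil(user_count / 1000) * MB_PER_1000_USERS


print(estimated_disk_mb(50_000))  # 50, matching the figure quoted above
```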

2. Selecting a Volume on Which to Install FSF

Given the disk requirements of only about 1 Meg per 1000 users managed, it is fair to say that there is little cause for concern with regard to FSF being a burden on any volume on any modern server.

 FSF may be installed and is supported on any volume. FSFCONFIG.EXE is the utility program that the administrator may use to remotely distribute FSFEVENT.NLM, the event monitoring component of FSF, to other servers on the network. This utility will attempt to install on remote servers based on the volume name and path name of the master directory on the Engine server. In other words, if the Engine is installed on SERVER1 on the WORK volume in directory FACTORY, then the utility will expect a volume named WORK to exist on each server to which it distributes the NLM and other files. This restriction will be removed at some point in the future.

Although FSF is supported on any volume, FSF engineering recommends installation on the SYS volume in the default FACTORY directory for the following reasons:

  • Disk space usage and requirements are minimal.
  • As described above, a volume by the same name must exist on all servers if it is desired to use the automated distribution tool, which is recommended.
  • FSFEVENT.NLM should be loaded on each server as soon as possible in the AUTOEXEC.NCF. The SYS volume always mounts first in NetWare, thus allowing an earlier load.
  • The vast majority of FSF development, testing, and troubleshooting by technical support is done with FSF installed on the SYS volume.

3. Selecting Servers to run Event Monitoring

The event monitoring component of FSF (FSFEVENT.NLM) monitors and records events in the Directory and sends them in to the Engine for consideration and possible action. This allows FSF to take action based on any identity provisioning or management solution being employed on the network now or in the future.

 FSFEVENT is derived from a mature code base that was designed and proven to have a negligible impact on early Pentium class servers. During normal operation, FSFEVENT uses no disk space on the server. In the event of a communications problem, a small amount of disk is used to temporarily cache events that cannot be immediately communicated to the Engine. Therefore the performance and disk impact on any server running FSFEVENT is negligible.

 In general, FSFEVENT will only record and process events for objects held in replicas on the local server. The only exception to this is that the creation of back references on a server without a copy of a given object may trigger a create event for that object on the server. An example of this is giving a user direct trustee rights to the file system of a server with no replicas. This will trigger a create event for that user on that server. If FSFEVENT were running on that server, it would report the event to the Engine. However, the Engine is intelligent enough to know that this event is rogue.

At a minimum, FSFEVENT should be placed on enough servers to have complete coverage of the partitions of the tree containing any objects to be monitored. If one server has a replica of all partitions (whether or not it is the Engine server), it would be sufficient to run the event service on just that one server. If the tree is more distributed, you need to run the event service on as many servers as it takes to be sure that every partition has at least one server with a replica running the event service. Having more than one server that holds a replica of the same partition run the service is not a problem, because the global event subsystem is able to detect duplicate events and act only once.

 In fact, where possible, FSF Engineering suggests running FSFEVENT on at least two servers in each replica ring. If for some reason the tree is prone to long periods of delayed DS synchronization (days or weeks), it is suggested to only run FSFEVENT on one server in each replica ring where feasible.
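
The minimum coverage rule in this section (every partition needs at least one replica-holding server running FSFEVENT) can be checked on paper or with a short script. The sketch below is illustrative only: the partition and server names are hypothetical, and the replica layout is entered by hand rather than read from the Directory.

```python
def uncovered_partitions(replicas, event_servers):
    """Return the partitions that no FSFEVENT server holds a replica of.

    replicas:      dict mapping partition name -> set of servers holding a replica
    event_servers: set of servers on which FSFEVENT is loaded
    """
    running = set(event_servers)
    return sorted(
        partition
        for partition, holders in replicas.items()
        if not (set(holders) & running)
    )


# Hypothetical tree layout, for illustration only.
replicas = {
    "O=ACME": {"SERVER1", "SERVER2"},
    "OU=ATLANTA.O=ACME": {"SERVER2", "ATL1"},
    "OU=DENVER.O=ACME": {"DEN1"},
}

# FSFEVENT on SERVER2 alone leaves the DENVER partition unmonitored.
print(uncovered_partitions(replicas, {"SERVER2"}))          # ['OU=DENVER.O=ACME']
print(uncovered_partitions(replicas, {"SERVER2", "DEN1"}))  # []
```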

4. Replica Location Relative to the Engine Server and Target Servers

FSF makes no replica location requirements with regard to the server running the Engine or to any Target servers being pointed to by FSF Policies.

 Therefore, in general FSF engineering makes no recommendations in this area. However, the following facts are known:

  • The engine makes a considerable number of requests against the Directory in normal operation. Partitioning and replica location should take this into account, especially in a WAN situation. Every effort should be made to keep DS healthy and synchronizing within a few minutes.
  • Having replicas of partitions containing the managed user objects on the Target servers will improve Engine performance in that it will on average not have to wait as long to set or manage trustee assignments.
  • Having replicas of partitions containing the managed user objects on the Engine server will improve Engine performance to a moderate extent. This fact should not be the sole factor in selecting the Engine server however.
  • Having replicas of partitions containing the FSF Policy objects on the Engine server will improve performance. In a WAN situation, if feasible, it is suggested that one or more separate containers be created to hold Policy objects and partitioned if needed so that the Engine server can hold a replica locally. Again, this is not a recommendation, but rather a suggestion if you are concerned about performance.

 

5. Policy Location

FSF Policy objects may be created in any container in the tree. The location of the Policy object in the tree has no bearing on the objects or parts of the tree to which it may be assigned.

The only recommendation in this area from FSF engineering is that the administrators establish their own standard for selecting the location of Policy objects and adhere to that standard. This will improve overall manageability.

 

6. Policy Assignment

FSF Policies may be assigned to containers, groups, or individual users. In general, the decision of which of these to use is based on a combination of tree design and policy exception forecasting.

FSF engineering recommends that you use container policies wherever possible and rely on group or even individual user assignments to deal with exceptions. Consider the following scenarios.

Scenario 1

Problem: As the administrator, Sue would like to provision managed storage for all employees in the ATLANTA container. Their storage should reside on the server in the Atlanta office. As a rule, employees should be given a quota of 250 Megabytes on the Network. However, the System Engineers need a quota of 500 Megabytes. How can she easily accomplish this?

Solution: Sue should set up two policies: 1) ATL-DEFAULT and 2) ATL-LARGE. She should point both Policies to the same path(s) on the Atlanta server. The only difference between the two policies is the quota setting. Sue should then assign the ATL-DEFAULT policy to the ATLANTA container in the tree and the ATL-LARGE policy to the System Engineers group.

Result: Everyone is provisioned storage on the Atlanta server and given a quota of 250 Meg except the users in the System Engineers group, who have a quota of 500 Meg. If someone in the Atlanta office is added to the System Engineers group, their quota is automatically changed to 500 Meg. If someone in the Atlanta office is removed from the System Engineers group, their quota automatically reverts to the default of 250 Meg.

As an aside, FSF engineering also suggests that Sue assign the ATL-LARGE policy directly to Bob Smith’s user object. Bob happens to be the District Director.
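
The way Scenario 1 layers exceptions over a container default can be sketched as a simple lookup. This is an illustration only, not the Engine's logic: the precedence order (direct user assignment, then group, then container) is an assumption inferred from the scenario, and the user name "Alice Jones" is hypothetical.

```python
# Assignments from Scenario 1.
CONTAINER_POLICY = {"ATLANTA": "ATL-DEFAULT"}
GROUP_POLICY = {"System Engineers": "ATL-LARGE"}
USER_POLICY = {"Bob Smith": "ATL-LARGE"}
QUOTA_MB = {"ATL-DEFAULT": 250, "ATL-LARGE": 500}


def effective_policy(user, container, groups):
    """Pick a policy for a user.

    Assumed precedence for this sketch: user, then group, then container.
    """
    if user in USER_POLICY:
        return USER_POLICY[user]
    for group in groups:
        if group in GROUP_POLICY:
            return GROUP_POLICY[group]
    return CONTAINER_POLICY.get(container)


# An ordinary Atlanta employee gets the container default.
print(QUOTA_MB[effective_policy("Alice Jones", "ATLANTA", [])])                    # 250
# Membership in the System Engineers group raises the quota.
print(QUOTA_MB[effective_policy("Alice Jones", "ATLANTA", ["System Engineers"])])  # 500
# Bob Smith has the larger policy assigned directly.
print(QUOTA_MB[effective_policy("Bob Smith", "ATLANTA", [])])                      # 500
```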

Scenario 2

Problem: As the network administrator at a community college with 5,000 students, Fred would like to give all students 75 Meg of storage space on the network. He has two student servers and would like to have all students with usernames beginning with A-J on server STU-1 and all usernames beginning with K-Z on server STU-2. Fred has a flat tree containing user accounts created via a DirXML driver.

Solution: Fred should set up two policies: 1) STU-AtoJ and 2) STU-KtoZ. He should define both policies identically except for the paths, which should point to servers STU-1 and STU-2 respectively. If Fred had designed his tree such that he had an A-J container and a K-Z container, he could assign his policies directly to those containers and be done with it. However, he has a flat container populated by DirXML. Fred should create two groups called A-J and K-Z and assign the respective policies to them. Then he should modify the driver to add each user to the appropriate group when the user is created.

Result: Everyone is automatically provisioned storage on the appropriate server based on the logic in the DirXML driver which triggers the application of the appropriate storage management policy.
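
The group selection that Fred's driver has to perform in Scenario 2 amounts to a test on the first letter of the username. The Python sketch below only illustrates that logic; a real DirXML driver would express it in the driver's own rule format, and the function name and the decision to reject usernames that do not start with a letter are assumptions.

```python
def group_for_username(username: str) -> str:
    """Return the Scenario 2 group (A-J or K-Z) for a username."""
    first = username.strip()[:1].upper()
    if "A" <= first <= "J":
        return "A-J"
    if "K" <= first <= "Z":
        return "K-Z"
    # Assumption: usernames must begin with an ASCII letter.
    raise ValueError(f"username does not start with a letter A-Z: {username!r}")


print(group_for_username("jdoe"))    # A-J
print(group_for_username("ksmith"))  # K-Z
```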

One problem with this scenario is that Fred’s wish runs counter to one of the biggest assets of FSF: its ability to load balance storage across servers and volumes. Disk usage patterns as well as future advanced algorithms that will be made available in FSF may cause Fred to change his mind about this wish. See the section below on “Load Balancing and Distribution Algorithms” for more information.

Scenario 3

Problem: A school district would like to provide network disk storage for all students such that the disk storage is always available to the student as they are promoted and move from school to school in the district. Servers are located in each school with T-1 connectivity back to the district office. The data needs to be on the server at the school that each student attends.

Solution 1: Create a hierarchy in eDirectory such that each school is represented by its own container. Create a policy for each school and assign it to the school container and configure it so that it points to the server at the school. Create the students in the appropriate container based on the school. When students are promoted, say from Middle School A to High School A, move the user objects to High School A’s container and their data will move from the server at the middle school to the server at the high school. This will happen automatically with no backfill required since object moves trigger policy reevaluation and application.

Solution 2: Create a flat structure in eDirectory where all user objects live. Create a group representing each school. Create a policy for each school and assign it to the school group. Add the students to the appropriate group based on their school. When students are promoted, remove them from one school’s group and add them to another school’s group. When you are ready for the data to move for a particular school, perform a backfill on the group. The policy for the school will be applied and the data will be moved to the new school server. This method is advantageous since it allows us to maintain a flat Directory and have some manual control over the data move process. 

Solution 3: Create a flat structure in eDirectory where all user objects live. Create a policy for each school and point it to the server at the associated school. Set the policy attribute on the user as a part of your provisioning process, perhaps a DirXML driver connected to the student database. Then as users are promoted, the driver sets the policy attribute to point to the new school. File System Factory™ acts on this event by applying the new policy and moving the user data automatically to the new school server.

Scenario 4

Problem: Bob has home directories for 20,000 users on 8 NetWare 4.11 servers. He would like to move everyone to a new set of 4 NetWare 6.5 servers.

Solution: Bob should create a File System Factory™ policy that points to the 4 NetWare 6.5 servers. He should then assign that policy to the containers holding the user objects. Finally, he should perform a full backfill operation against the containers using the “Enforce Path” option.

All user home directories will be moved from the 4.11 servers and load balanced across the 6.5 servers. All trustee assignments and file system attributes will be moved seamlessly. Bob can even define a schedule for the migration and set bandwidth throttling parameters so that network and server performance will not be affected during normal business hours.

Scenario 5

Problem: Ann is the administrator of a network with 7,500 employee user accounts. Ann has not given the majority of her users home directory storage. She would like to do this now using 5 NetWare 5.1 servers that she runs. She is somewhat hesitant about doing this now given that she anticipates installing a 3-node NetWare 6.0 server cluster connected to a SAN in about eight months.

Solution Part 1: Ann should not wait. She should define a Policy pointing to her 5.1 servers and then associate the policy with her users. Lastly, she should issue a full backfill operation against the containers holding her users. In a flash, all of her users will have managed disk storage on the network where they can be productive for the next eight months.

Solution Part 2: Eight months later, Ann has her new cluster and SAN installed. She should then modify the Policy she defined in Part 1 above by adding path definitions that point to the SAN-connected cluster and removing the path pointers to the 5.1 servers. Finally, she should run a full backfill operation against the user containers using the “Enforce Path” option.

All user home directories will be moved from the 5.1 servers to the SAN-connected cluster seamlessly. Ann can then decommission the 5.1 servers or use them for some other purpose.

 

7. Load Balancing and Distribution Algorithms

In its current release, FSF supports 3 different algorithms in its policy definition:

  1. Random Distribution
  2. Distribution based on actual space remaining on the volume
  3. Distribution based on percentage space remaining on the volume

Unless there is a specific need otherwise, FSF engineering suggests the use of the default “Random Distribution” algorithm. This has proven to be the overall best option in almost every case involving real world customer environments.
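
The three selection strategies listed above can be pictured with a short sketch. This is not the Engine's implementation, which this guide does not describe; the `Volume` class, the function names, and the example paths and sizes are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Volume:
    path: str      # hypothetical policy path
    free_mb: int   # space remaining on the volume
    total_mb: int  # total size of the volume


def pick_random(volumes, rng=random):
    """1. Random Distribution: every path is equally likely."""
    return rng.choice(volumes)


def pick_most_free_space(volumes):
    """2. Actual space remaining: the path with the most free Meg wins."""
    return max(volumes, key=lambda v: v.free_mb)


def pick_highest_free_percentage(volumes):
    """3. Percentage space remaining: the path with the largest free fraction wins."""
    return max(volumes, key=lambda v: v.free_mb / v.total_mb)


volumes = [
    Volume("STU-1/VOL1:HOME", free_mb=400_000, total_mb=2_000_000),  # 20% free
    Volume("STU-2/VOL1:HOME", free_mb=300_000, total_mb=500_000),    # 60% free
]

print(pick_most_free_space(volumes).path)          # STU-1/VOL1:HOME
print(pick_highest_free_percentage(volumes).path)  # STU-2/VOL1:HOME
print(pick_random(volumes).path)                   # either path
```

The example shows why options 2 and 3 can disagree: a large volume may have more free space in absolute terms while being fuller in percentage terms.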

New FSF algorithms to be introduced later will employ advanced techniques to assure both disk allocation and disk usage based on both growth and performance metrics.

In shops where the administrators have been managing disk by hand, historically there has been a tendency to try to use predictability based somehow on the username to decide where to place the home directory for a user. This was necessary because the management was done manually and the administrator needed to be able to quickly find the home directory for a user. FSF engineering has seen numerous instances of customers attempting to continue to manage this way with FSF when it is not necessary and actually counterproductive to the load balancing mission of FSF.  

It is generally accepted that disk location predictability and load balancing are mutually exclusive. In the estimation of FSF engineering, the current and future benefits of load balancing far outweigh those of location predictability, and the customers we have seen make this transition in thinking have, without exception, been very successful.

 

8. Backfill Processing

A backfill operation, especially one issued against a container, may cause FSF to initiate different action events against a large number of users. The administrator may elect to issue the backfill request in "Check Mode", which instructs FSF to perform the analysis portion of the request and report on the action to be taken, but avoid taking any actions. This allows the administrator to preview the results of his actions and prevent unintended results.

FSF engineering recommends that administrators develop the habit of always running backfills in Check Mode first and using the resulting report to verify that the actions are appropriate before running the backfill live.
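
The idea behind Check Mode, analyze first and act only when told to, can be illustrated with a small dry-run sketch. This is not how FSF is implemented or invoked; the function, the action names, and the paths are hypothetical and exist only to show the pattern.

```python
def backfill(users, current_paths, target_path_for, check_mode=True):
    """Return a report of planned actions; apply them only when check_mode is False.

    users:           iterable of usernames covered by the request
    current_paths:   dict of username -> current home directory path (may lack entries)
    target_path_for: function giving the path the policy wants for a user
    """
    report = []
    for user in users:
        current = current_paths.get(user)
        target = target_path_for(user)
        if current is None:
            action = ("create", user, target)
        elif current != target:
            action = ("move", user, target)
        else:
            continue  # already compliant, nothing to report
        report.append(action)
        if not check_mode:
            current_paths[user] = target  # stand-in for the real work
    return report


paths = {"amy": "OLD:HOME/amy"}
wanted = lambda user: f"NEW:HOME/{user}"

# Check run: produces the report but changes nothing.
print(backfill(["amy", "raj"], paths, wanted, check_mode=True))
print(paths)  # still {'amy': 'OLD:HOME/amy'}

# Live run: the same actions are now applied.
backfill(["amy", "raj"], paths, wanted, check_mode=False)
print(paths)  # {'amy': 'NEW:HOME/amy', 'raj': 'NEW:HOME/raj'}
```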

 

9. Home Directory Rights and Templates

Members of less than sophisticated user populations tend to delete and/or rename their home directories either by accident or on purpose.

FSF engineering recommends always specifying a template in the policy definition, even if you do not want to pre-populate each home directory with a set of files and/or directories. FSF will use the attributes set on the template directory as a model in setting the attributes on each home directory covered by the policy. Setting the Rename Inhibit and Delete Inhibit flags on the template directory will cause the same flags to be set on each user’s home directory, which is a good practice.

 

10. Using Leveling Directories

The Engine contains an additional feature that enables the automatic creation of leveling directories within the specified policy paths. When a policy is first defined, the administrator can select one of three leveling directory algorithms:

  1. None - no leveling directories will be created (default)
  2. First Letter – the home directory will be placed in a subdirectory within the specified policy path based on the first letter of the user common name.
  3. Last Letter - the home directory will be placed in a subdirectory within the specified policy path based on the last letter of the user common name.

 This feature was added in response to customer requests. Placing too many directories in a single path can lead to problems with some administrator and end-user tools. However, in many ways, this option lends itself to the disadvantages of using predictability in placement of user storage as discussed above. In general, FSF engineering recommends using the default algorithm of NONE for leveling directories.

 The use of leveling directories is encouraged, however, when N is greater than 3000 in the following formula (assuming that the policy is using Random Distribution as the balancing algorithm):

N=U/P

where

P=number of paths defined to the policy

U=the number of users that will be managed by the policy
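
As a worked illustration of the formula above: 20,000 users spread over 4 paths gives N = 5000, which is above the 3000 threshold, while 7,500 users over 5 paths gives N = 1500, which is not. The sketch below computes N and shows how a leveled path might look. The function names, the "/" separator, the upper-casing of the letter, and the example path are assumptions; the exact directory names FSF creates are not specified in this guide.

```python
LEVELING_THRESHOLD = 3000  # from the guidance above


def users_per_path(users: int, paths: int) -> float:
    """N = U / P, the average number of home directories per policy path."""
    if paths <= 0:
        raise ValueError("a policy needs at least one path")
    return users / paths


def leveling_suggested(users: int, paths: int) -> bool:
    """True when N exceeds the threshold at which leveling is encouraged."""
    return users_per_path(users, paths) > LEVELING_THRESHOLD


def leveled_home_path(policy_path: str, common_name: str, algorithm: str = "none") -> str:
    """Build an illustrative home directory path for the chosen leveling algorithm."""
    if algorithm == "none":
        return f"{policy_path}/{common_name}"
    if algorithm == "first":
        letter = common_name[0]
    elif algorithm == "last":
        letter = common_name[-1]
    else:
        raise ValueError(f"unknown leveling algorithm: {algorithm!r}")
    return f"{policy_path}/{letter.upper()}/{common_name}"


print(users_per_path(20_000, 4), leveling_suggested(20_000, 4))  # 5000.0 True
print(users_per_path(7_500, 5), leveling_suggested(7_500, 5))    # 1500.0 False
print(leveled_home_path("VOL1:HOME", "jsmith", "first"))         # VOL1:HOME/J/jsmith
print(leveled_home_path("VOL1:HOME", "jsmith", "last"))          # VOL1:HOME/H/jsmith
```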

 

11. Use Consistency Check Storage Reports

The Engine User Interface can produce storage reports on all users or groups in a particular part of the tree. You are encouraged to use this reporting tool prior to the creation or application of any storage policy to determine where consistency deficiencies exist so that you can take appropriate steps to correct them. After policies are created and applied, these reports can be rerun at any time to confirm compliance as well as to determine storage distributions and workloads across storage subsystems.

 

Copyright © 2005 Condrey Consulting Corporation. All Rights Reserved.