Key ATG architecture principles
- by Glen Borkowski
Overview
The purpose of this article is to describe some of the important foundational concepts of ATG. This is not intended to cover all areas of the ATG platform, just the most important subset - the ones that allow ATG to be extremely flexible, configurable, high performance, etc. For more information on these topics, please see the online product manuals.
Modules
The first concept is called the 'ATG Module'. Simply put, you can think of modules as the building blocks for ATG applications. The ATG development team builds the out of the box product using modules (these are the 'out of the box' modules). Then, when a customer is implementing their site, they build their own modules that sit 'on top' of the out of the box ATG modules. Modules can be very simple - containing minimal definition, and perhaps a small amount of configuration. Alternatively, a module can be rather complex - containing custom logic, database schema definitions, configuration, one or more web applications, etc. Modules generally will have dependencies on other modules (the modules beneath them). For example, the Commerce Reference Store module (CRS) requires the DCS (out of the box commerce) module.
Modules have a ton of value because they provide a way to decouple a customer's implementation from the out of the box ATG modules. This allows for a much easier job when it comes time to upgrade the ATG platform. Modules are also a very useful way to group functionality into a single package which can be leveraged across multiple ATG applications.
One very important thing to understand about modules, or more accurately, ATG as a whole, is that when you start ATG, you tell it what module(s) you want to start. One of the first things ATG does is to look through all the modules you specified, and for each one, determine a list of modules that are also required to start (based on each module's dependencies). Once this final, ordered list is determined, ATG continues to boot up. Because each module can contain its own classes and configuration, the ordered list of modules drives the unified classpath and configpath during boot. This is what determines which classes override others, and which configuration overrides other configuration. Think of it as a layered approach.
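The dependency expansion described above can be sketched in a few lines of Java. This is only an illustration of the idea (a depth-first ordering in which every module lands after the modules it requires), not ATG's actual startup code. The class ModuleOrderSketch, the method resolve and the module name MyStore are made up for this example; the CRS-requires-DCS relationship is the one mentioned earlier.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not part of ATG, just a model of the ordering idea.
final class ModuleOrderSketch {

    /**
     * Expands the requested modules into a full startup list in which every
     * module appears after the modules it depends on (lowest layer first).
     */
    static List<String> resolve(List<String> requested, Map<String, List<String>> requires) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String module : requested) {
            visit(module, requires, ordered, new LinkedHashSet<>());
        }
        return new ArrayList<>(ordered);
    }

    private static void visit(String module, Map<String, List<String>> requires,
                              Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(module)) {
            return; // already placed in the list
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Circular dependency at " + module);
        }
        // Place everything this module requires before the module itself.
        for (String dependency : requires.getOrDefault(module, List.of())) {
            visit(dependency, requires, ordered, inProgress);
        }
        inProgress.remove(module);
        ordered.add(module);
    }

    public static void main(String[] args) {
        // Assumed dependency data; a real module declares this in its manifest.
        Map<String, List<String>> requires = Map.of(
            "MyStore", List.of("CRS"),
            "CRS", List.of("DCS"));
        System.out.println(resolve(List.of("MyStore"), requires)); // prints [DCS, CRS, MyStore]
    }
}
```

Reading the result from left to right gives the layers from bottom to top, so in this model a module further right would override classes and configuration from the modules to its left.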
The structure of a module is well defined. It simply looks like a folder in a filesystem that has certain other folders and files within it. Here is a list of items that can appear in a module:
MyModule:
META-INF - this is required, along with a file called MANIFEST.MF which describes certain properties of the module. One important property is what other modules this module depends on.
config - this is typically present in most modules. It defines a tree structure (folders containing properties files, XML, etc) that maps to ATG components (these are described below).
lib - this contains the classes (typically in jarred format) for any code defined in this module
j2ee - this is where any web-apps would be stored.
src - in case you want to include the source code for this module, it's standard practice to put it here
sql - if your module requires any additions to the database schema, you should place that schema here
Here's a screenshot of a module:
Modules can also contain sub-modules. A dot-notation is used when referring to these sub-modules (e.g. MyModule.Versioned, where Versioned is a sub-module of MyModule).
Finally, it is important to completely understand how modules work if you are going to leverage them effectively. There are many different ways to design the modules you want to create, and some approaches are better than others, especially if you plan to share functionality between multiple ATG applications.
Components
A component in ATG can be thought of as a single item that performs a certain set of related tasks. An example could be a ProductViews component - used to store information about what products the current customer has viewed. Components have properties (also called attributes). The ProductViews component could have properties like lastProductViewed (stores the ID of the last product viewed) or productViewList (stores the IDs of products viewed, in the order they were viewed). Properties like these would typically also offer get and set methods used to retrieve and store the property values. Components typically will also offer other types of useful methods aside from get and set. In the ProductViews component, we might want to offer a hasViewed method which will tell you whether the customer has viewed a certain product.
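As a rough sketch, the class behind such a ProductViews component could be an ordinary JavaBean like the one below. The property and method names lastProductViewed, productViewList and hasViewed come from the description above; the recordView helper is an assumption added so that the example has a way to update both properties together.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative bean only; a real component class would live in your own package.
class ProductViews {

    private String lastProductViewed;
    private List<String> productViewList = new ArrayList<>();

    /** ID of the most recently viewed product, or null if nothing was viewed yet. */
    public String getLastProductViewed() {
        return lastProductViewed;
    }

    public void setLastProductViewed(String lastProductViewed) {
        this.lastProductViewed = lastProductViewed;
    }

    /** IDs of viewed products, oldest first. */
    public List<String> getProductViewList() {
        return productViewList;
    }

    public void setProductViewList(List<String> productViewList) {
        this.productViewList = new ArrayList<>(productViewList);
    }

    /** Hypothetical helper: records a view and keeps both properties in sync. */
    public void recordView(String productId) {
        productViewList.add(productId);
        lastProductViewed = productId;
    }

    /** Tells you whether the customer has viewed the given product. */
    public boolean hasViewed(String productId) {
        return productViewList.contains(productId);
    }
}
```

Because the view history belongs to one customer, a component like this would most likely be given session scope (scopes are covered below).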
Components are organized in a tree like hierarchy called 'nucleus'. Nucleus is used to locate and instantiate ATG Components. So, when you create a new ATG component, it will be able to be found 'within' nucleus. Nucleus allows ATG components to reference one another - this is how components are strung together to perform meaningful work. It's also a mechanism to prevent redundant configuration - define it once and refer to it from everywhere.
Here is a screenshot of a component in nucleus:
Components can be extremely simple (e.g. a single property with a get method), or can be rather complex, offering many properties and methods. To be an ATG component, a few things are required:
a class - you can reference an existing out of the box class or you could write your own
a properties file - this is used to define your component
the above items must be located 'within' nucleus by placing them in the correct spot in your module's config folder
Within the properties file, you will need to point to the class you want to use:
$class=com.mycompany.myclass
You may also want to define the scope of the component (request, session, or global):
$scope=session
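To make the link between a properties file and a live object more concrete, here is a minimal sketch of the general idea: read the $class entry, instantiate that class, and copy the remaining entries onto matching set methods. It uses only standard Java (java.util.Properties and java.beans) and is not how Nucleus itself is implemented. ComponentLoaderSketch, Greeter and the greeting property are hypothetical names, the sketch ignores $scope, and it only handles String properties.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch of properties-driven instantiation; not the Nucleus API.
final class ComponentLoaderSketch {

    /** Stand-in for a custom component class. */
    public static class Greeter {
        private String greeting;

        public String getGreeting() {
            return greeting;
        }

        public void setGreeting(String greeting) {
            this.greeting = greeting;
        }
    }

    /** Builds an object from properties text containing a $class entry. */
    static Object load(String propertiesText) {
        try {
            Properties config = new Properties();
            config.load(new StringReader(propertiesText));

            String className = config.getProperty("$class");
            if (className == null) {
                throw new IllegalArgumentException("Missing $class entry");
            }
            Object component = Class.forName(className).getDeclaredConstructor().newInstance();

            // Copy each configured value onto the matching String setter, if any.
            for (PropertyDescriptor property
                    : Introspector.getBeanInfo(component.getClass()).getPropertyDescriptors()) {
                String value = config.getProperty(property.getName());
                if (value != null && property.getWriteMethod() != null
                        && property.getPropertyType() == String.class) {
                    property.getWriteMethod().invoke(component, value);
                }
            }
            return component;
        } catch (ReflectiveOperationException | IOException | IntrospectionException e) {
            throw new IllegalStateException("Could not create component", e);
        }
    }

    public static void main(String[] args) {
        String text = "$class=" + Greeter.class.getName() + "\n"
                + "$scope=global\n"
                + "greeting=Hello from configuration\n";
        Greeter greeter = (Greeter) load(text);
        System.out.println(greeter.getGreeting()); // prints Hello from configuration
    }
}
```

The point of the design is that the class stays generic while the properties file decides which class is used and how it is configured, which is what lets a higher module change a component without touching code.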
In summary, ATG Components live in nucleus, generally have links to other components, and provide some meaningful type of work. You can configure components as well as extend their functionality by writing code.
Repositories
Repositories (a.k.a. Data Anywhere Architecture) are the mechanism that ATG uses to access data primarily stored in relational databases, but also LDAP or other backend systems. ATG applications are required to be very high performance, and data access is critical: if not handled properly, it could create a bottleneck. ATG's repository functionality has been around for a long time - it's proven to be extremely scalable. Developers new to ATG need to understand how repositories work as this is a critical aspect of the ATG architecture.
Repositories essentially map relational tables to objects in ATG, as well as handle caching. ATG defines many repositories out of the box (e.g. user profile, catalog, orders, etc), each consisting of the underlying database schema along with the associated repository definition files (XML). It is fully expected that implementations will extend / change the out of the box repository definitions, so there is a prescribed approach to doing this. The first thing to be sure of is to encapsulate your repository definition additions / changes within your own module (as described above). The other important best practice is to never modify the out of the box schema - in other words, don't add columns to existing ATG tables, just create your own new tables. These practices will help ensure you can easily upgrade your application at a later date.
xml-combination
As mentioned earlier, when you start ATG, the order of the modules will determine the final configpath. Files within this configpath are 'layered' such that modules on top can override configuration of the modules below them. The same concept applies to repository definition files. If you want to add a few properties to the out of the box user profile, you simply need to create an XML file containing only your additions, and place it in the correct location in your module. At boot time, your definition will be combined (hence the term xml-combination) with the definitions from the lower, out of the box modules, with the result being a user profile that contains everything (out of the box, plus your additions). Aside from just adding properties, there are also ways to remove and change properties.
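The layering idea can be modelled in a deliberately simplified form. The sketch below treats each definition file as a flat map of property name to data type and combines the layers from bottom to top, so a higher layer can add, change or remove entries. Real xml-combination works on XML elements and attributes, so treat this purely as an analogy; LayeredDefinitionSketch, the REMOVE marker and the property names are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of layered definitions; not ATG's XML combination code.
final class LayeredDefinitionSketch {

    /** Marker a higher layer can use to drop a property defined below it. */
    static final String REMOVE = "<remove>";

    /** Combines layers bottom to top; each layer lists only its additions and changes. */
    static Map<String, String> combine(List<Map<String, String>> layersBottomToTop) {
        Map<String, String> combined = new LinkedHashMap<>();
        for (Map<String, String> layer : layersBottomToTop) {
            for (Map.Entry<String, String> entry : layer.entrySet()) {
                if (REMOVE.equals(entry.getValue())) {
                    combined.remove(entry.getKey());                // remove a property
                } else {
                    combined.put(entry.getKey(), entry.getValue()); // add or change one
                }
            }
        }
        return combined;
    }

    public static void main(String[] args) {
        // Assumed out of the box profile properties (illustrative only).
        Map<String, String> outOfTheBox = new LinkedHashMap<>();
        outOfTheBox.put("login", "string");
        outOfTheBox.put("email", "string");
        outOfTheBox.put("faxNumber", "string");

        // The custom module's file contains only its own differences.
        Map<String, String> myModule = new LinkedHashMap<>();
        myModule.put("loyaltyNumber", "string");
        myModule.put("faxNumber", REMOVE);

        System.out.println(combine(List.of(outOfTheBox, myModule)).keySet());
        // prints [login, email, loyaltyNumber]
    }
}
```

Note that the custom layer never repeats the out of the box entries, which is what keeps an upgrade of the lower modules from colliding with your changes.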
types of properties
Aside from the normal 'database backed' properties, there are a few other interesting types:
transient properties - these are properties that are in memory, but not backed by any database column. These are useful for temporary storage.
java-backed properties - by nature, these are transient, but in addition, when you access this property (by calling the get method), instead of looking up a piece of data, it performs some logic and returns the result. 'Age' is a good example - if you're storing a birth date on the profile, but your business rules are defined in terms of someone's age, you could create a simple java-backed property to look at the birth date, compare it to the current date, and return the person's age.
derived properties - this is what allows for inheritance within the repository structure. You could define a property at the category level, and have the product inherit its value as well as override it. This is useful for setting defaults, with the ability to override.
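The logic behind the last two property types is small enough to sketch in plain Java. The example below does not use the ATG repository API; it only shows the computation a java-backed 'age' property might perform and the fallback rule a derived property follows. PropertyKindsSketch, the method names and the shippingClass property are hypothetical.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.Map;

// Plain Java illustration of the two ideas; not the ATG repository API.
final class PropertyKindsSketch {

    /** Java-backed style: nothing is stored, the value is computed on every read. */
    static int age(LocalDate birthDate, LocalDate today) {
        return Period.between(birthDate, today).getYears();
    }

    /** Derived style: use the product's own value if set, otherwise inherit the category's. */
    static String derived(Map<String, String> product, Map<String, String> category, String name) {
        String own = product.get(name);
        return own != null ? own : category.get(name);
    }

    public static void main(String[] args) {
        // The day before the 34th birthday the computed age is still 33.
        System.out.println(age(LocalDate.of(1990, 6, 15), LocalDate.of(2024, 6, 14))); // prints 33

        Map<String, String> category = Map.of("shippingClass", "standard");
        Map<String, String> plainProduct = Map.of();
        Map<String, String> bulkyProduct = Map.of("shippingClass", "freight");

        System.out.println(derived(plainProduct, category, "shippingClass")); // prints standard
        System.out.println(derived(bulkyProduct, category, "shippingClass")); // prints freight
    }
}
```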
caching
There are a number of different caching modes which are useful at different times depending on the nature of the data being cached. For example, the simple cache mode is useful for things like user profiles. This is because the user profile will typically only be used on a single instance of ATG at one time. Simple cache mode is also useful for read-only types of data such as the product catalog. Locked cache mode is useful when you need to ensure that only one ATG instance writes to a particular item at a time - an example would be a customer's order. There are many options in terms of configuring caching which are outside the scope of this article - please refer to the product manuals for more details.
Other important concepts - out of scope for this article
There are a whole host of concepts that are very important pieces to the ATG platform, but are out of scope for this article. Here's a brief description of some of them:
formhandlers - these are ATG components that handle form submissions by users.
pipelines - these are configurable chains of logic that are used for things like handling a request (request pipeline) or checking out an order.
special kinds of repositories (versioned, files, secure, ...) - there are several different types of repositories that are used in various situations. See the manuals for more information.
web development - JSP / DSP tag library - ATG provides a traditional approach to developing web applications by providing a tag library called the DSP library. This library is used throughout your JSP pages to interact with all the ATG components.
messaging - a message sub-system used as another way for components to interact.
personalization - ability for business users to define a personalized user experience for customers. See the other blog posts related to personalization.