obiee memory usage
- by user554629
Heap memory is a frequent customer topic.
    Here's the quick refresher, oriented towards AIX, but the principles apply to other unix implementations.
    1. 32-bit processes have a maximum addressability of 4GB, with a usable application heap size of 2-3 GB.
       On AIX it is controlled by an environment variable:
         export LDR_CNTRL=....=MAXDATA=0x080000000   # 2GB ( the leading zero is deliberate, not required )
    1a. It is possible to get a 3.25GB heap size for a 32-bit process using @DSA (Dynamic Segment Allocation):
         export LDR_CNTRL=MAXDATA=0xd0000000@DSA  # 3.25 GB 32-bit only
        One side-effect of using AIX segments "c" and "d" is that shared libraries will be loaded privately, and not shared.
        If you need the additional heap space, this is worth the trade-off.  This option is frequently used for 32-bit java.
    1b. 64-bit processes have no need for the @DSA option.
    2. 64-bit processes can double the 32-bit heap size to 4GB using:
         export LDR_CNTRL=....=MAXDATA=0x100000000  # 1 with 8-zeros
    2a. But this setting would place the same memory limitations on obiee as a 32-bit process.
    2b. The major benefit of 64-bit is to break the binds of 32-bit addressing.  At a minimum, use 8GB:
         export LDR_CNTRL=....=MAXDATA=0x200000000  # 2 with 8-zeros
    2c. Many large customers are providing extra safety to their servers by using 16GB:
         export LDR_CNTRL=....=MAXDATA=0x400000000  # 4 with 8-zeros
    There is no performance penalty for providing virtual memory
    allocations larger than required by the application.
     - If the server only uses 2GB of space in 64-bit, specifying 16GB just provides an upper-bound cushion.
       When an unexpected user query causes a sudden memory surge, the extra memory keeps the server running.
    3.  The next benefit to 64-bit is that you can provide huge thread stack sizes for
        strange queries that might otherwise crash the server.
        nqsserver uses fast recursive algorithms to traverse complicated control structures.
        This means lots of thread space to hold the stack frames.
    3a. Stack frames mostly contain register values;  64-bit registers are twice as large as 32-bit.
        At a minimum you should quadruple the size of the server thread stacks in NQSConfig.INI
        when migrating from 32- to 64-bit, to prevent a rogue query from crashing the server.
        Allocate more than is normally necessary for safety.
    3b. There is no penalty for allocating more stack size than you need ...
        it is just virtual memory;  no real resources are consumed until the extra space is needed.
    3c. Increasing thread stack sizes may require the process heap size (MAXDATA) to be increased.
        Heap space is used for dynamic memory requests, and for thread stacks.
        There is no performance penalty to run with large heap and thread stack sizes.
        In a 32-bit world, this safety would require careful planning to avoid exceeding 2GB usable storage.
    3d. Increasing the number of threads also may require additional heap storage.
        Most thread stack frames on obiee are allocated when the server is started,
        and the real memory usage increases as threads run work.
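    The arithmetic behind the MAXDATA values and the thread stack budget can be sketched
    in a few lines of shell.  This is only an illustration:  the function names, the thread
    count, and the per-thread stack size are made-up inputs, not values taken from
    NQSConfig.INI or from any real server.

```shell
#!/bin/bash
# Sketch only: helper names and sample inputs are hypothetical.

# Convert a MAXDATA hex value (a byte count) to megabytes.
maxdata_mb() {
    echo $(( $1 / 1024 / 1024 ))
}

# Worst-case virtual memory reserved for thread stacks, in MB.
# $1 = number of threads, $2 = stack size per thread in KB.
stack_reserve_mb() {
    echo $(( $1 * $2 / 1024 ))
}

# Heap left for dynamic memory requests once every thread stack is reserved, in MB.
# $1 = MAXDATA hex value, $2 = number of threads, $3 = stack size per thread in KB.
heap_left_mb() {
    echo $(( $(maxdata_mb "$1") - $(stack_reserve_mb "$2" "$3") ))
}

maxdata_mb 0x080000000                 # 2048  (2GB)
maxdata_mb 0xd0000000                  # 3328  (3.25GB)
maxdata_mb 0x400000000                 # 16384 (16GB)
heap_left_mb 0x200000000 1000 2048     # 6192  (8GB heap, 1000 threads, 2MB stacks)
```

    With an 8GB MAXDATA the hypothetical 1000 threads at 2MB each still leave about 6GB
    for dynamic requests;  the same thread stacks would nearly fill a 2GB 32-bit heap.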
    Does 2.8GB sound like a lot of memory for an AIX application
    server? 
    - I guess it is what you are accustomed to seeing from "grandpa's
    applications".
    - One of the primary design goals of obiee is to trade memory for
    services ( db, query caches, etc)
    - 2.8GB is still well under the 4GB heap size allocated with
    MAXDATA=0x100000000
    - 2.8GB process size is also possible even on 32-bit Windows
    applications
    - It is not unusual to receive a sudden request for 30MB of contiguous storage on obiee.
    - This is not a memory leak;  eventually the nqsserver storage will stabilize, but it may take days to do so.
    vmstat is the tool of choice to observe memory usage.  On AIX vmstat will show something
    that may be startling to some people ... that available free memory ( the fre column )
    is always trending toward zero ... no available free memory.
    Some customers have concluded that "nearly zero memory free" means it is time to upgrade the server with more real memory.
    After the upgrade, the server again shows very little free memory available.
    Should you be concerned about this?   Many customers are !!  Here is
    what is happening: 
    - AIX filesystems are built on a paging model.    If you
    read/write a  filesystem block it is paged into memory ( no read/write system calls )
    - This filesystem "page" has its own "backing store" on disk, the original filesystem block.   When the system needs the real memory page holding the file block, there is no need to "page out".
       The page can be stolen immediately, because the original is still on disk in the filesystem.
    - The filesystem pages tend to collect ... every filesystem block that was ever seen since
       system boot is available in memory.  If another application needs the file block, it is retrieved with no physical I/O.
    What happens if the system does need the memory ... to satisfy a
    30MB heap request by nqsserver, for example?
    - Since the filesystem blocks have their own backing store ( not on
    a paging device ) 
      the kernel can just steal any filesystem block ... on a
    least-recently-used basis
      to satisfy a new real memory request for "computation pages". 
    No cause for alarm.   vmstat is accurately displaying whether all filesystem blocks have been touched, and now reside in memory.    
    
  Back to nqsserver:  when should you be worried about its memory footprint? 
    Answer:  Almost never.   Stop monitoring it ... stop fussing over it
    ... stop trying to optimize it.
    This is a production application, and nqsserver uses the memory it
    requires to accomplish the job, based on demand.
    C'mon ... never worry?   I'm from New York ... worry is what we do best. 
    Ok, here is the metric you should be watching, using vmstat: 
    - Are you paging?  There are several columns of vmstat output to watch:
      bash-2.04$ vmstat 3 3
      System configuration: lcpu=4 mem=4096MB
      kthr    memory               page               faults        cpu
      ----- ------------ ------------------------ ------------ -----------
       r  b    avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
       0  0 208492  2600   0   0   0   0    0   0  13   45  73  0  0 99  0
       0  0 208492  2600   0   0   0   0    0   0   9   12  77  0  0 99  0
       0  0 208492  2600   0   0   0   0    0   0   9   40  86  0  0 99  0
      fre  is the "available free memory" indicator that trends toward zero
           ( avm, the column before it, is active virtual memory, not free memory )
      re   is "re-page".  The kernel steals a real memory page for one process;  immediately repages back to original process
      pi   "page in".  A process memory page previously paged out, now paged back in because the process needs it
      po   "page out".  A process memory block was paged out, because it was needed by some other process
    Light paging activity ( re, pi, po ) is not a cause for worry.
    Processes get started, need some memory, go away.
    Sustained paging activity is cause for concern.   obiee users are having a terrible day if these counters are always changing.
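    A rough way to tell light paging from sustained paging is to count how many vmstat
    samples show page-ins or page-outs.  The sketch below assumes the AIX column order
    shown above ( pi is field 6, po is field 7, 17 fields per sample );  the function name
    is made up, and other platforms lay out their vmstat columns differently.

```shell
#!/bin/bash
# Sketch only: assumes AIX vmstat column order (r b avm fre re pi po ...).

# Read vmstat output on stdin and print the number of samples
# in which pi (field 6) or po (field 7) is non-zero.
count_paging_samples() {
    awk '
        # Data rows start with a number; header and separator rows do not.
        $1 ~ /^[0-9]+$/ && NF >= 17 {
            if ($6 + $7 > 0) paging++
        }
        END { print paging + 0 }
    '
}

# Typical use on an AIX host, 20 samples at 3-second intervals:
#   vmstat 3 20 | count_paging_samples
```

    A count near zero matches the "no cause for alarm" case.  A count close to the number
    of samples, repeated over several runs, is the sustained paging worth investigating.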
    Hang on ... if nqsserver needs that memory and I reduce MAXDATA to keep the process under control, won't the nqsserver process crash when the memory is needed?
    Yes it will.   It means that nqsserver is configured to require too much memory, and there are lots of options to reduce the real memory requirement:
     - number of threads
     - size of query cache
     - size of sort
    But I need nqsserver to keep running.  
  Real memory is over-committed.    Many things can cause this:
    - Running all application processes on a single server ... DB server, web servers, WebLogic/WebSphere, sawserver, nqsserver, etc.
      You could move some of those to another host machine and communicate over the network.
      The need for real memory doesn't go away, it's just distributed to other host machines.
    - The AIX LPAR is configured with too little memory.
      The AIX admin needs to provide more real memory to the LPAR running obiee.
    - More memory to this LPAR affects other partitions.
      Then it's time to visit your friendly IBM rep and buy more memory.