Heap memory is a frequent customer topic.
Here's a quick refresher, oriented toward AIX, though the principles apply to other Unix implementations.
1. 32-bit processes have a maximum addressability of 4GB, with a usable application heap size of 2-3GB. On AIX the heap size is controlled by an environment variable:
    export LDR_CNTRL=....=MAXDATA=0x080000000 # 2GB (the leading zero is deliberate, not required)
1a. It is possible to get a 3.25GB heap for a 32-bit process using @DSA (Dynamic Segment Allocation):
    export LDR_CNTRL=MAXDATA=0xd0000000@DSA # 3.25GB, 32-bit only
One side effect of using AIX segments "c" and "d" is that shared libraries are loaded privately rather than shared. If you need the additional heap space, this is worth the trade-off. This option is frequently used for 32-bit Java.
1b. 64-bit processes have no need for the @DSA option.
2. 64-bit processes can double the 32-bit heap size to 4GB using:
    export LDR_CNTRL=....=MAXDATA=0x100000000 # 1 with 8 zeros
2a. But this setting would place the same memory limitations on obiee as a 32-bit process.
2b. The major benefit of 64-bit is to break the binds of 32-bit addressing. At a minimum, use 8GB:
    export LDR_CNTRL=....=MAXDATA=0x200000000 # 2 with 8 zeros
2c. Many large customers give their servers extra safety by using 16GB:
    export LDR_CNTRL=....=MAXDATA=0x400000000 # 4 with 8 zeros
There is no performance penalty for providing virtual memory allocations larger than required by the application.
- If the server only uses 2GB of space in 64-bit, specifying 16GB just provides an upper-bound cushion. When an unexpected user query causes a sudden memory surge, the extra memory keeps the server running.
3. The next benefit of 64-bit is that you can provide huge thread stack sizes for strange queries that might otherwise crash the server. nqsserver uses fast recursive algorithms to traverse complicated control structures. This means lots of thread space to hold the stack frames.
3a. Stack frames mostly contain register values, and 64-bit registers are twice as large as 32-bit registers. At a minimum you should quadruple the server thread stack sizes in NQSConfig.INI when migrating from 32-bit to 64-bit, to prevent a rogue query from crashing the server. Allocate more than is normally necessary for safety.
3b. There is no penalty for allocating more stack size than you need. It is just virtual memory; no real resources are consumed until the extra space is needed.
3c. Increasing thread stack sizes may require the process heap size (MAXDATA) to be increased. Heap space is used for dynamic memory requests and for thread stacks. There is no performance penalty for running with large heap and thread stack sizes. In a 32-bit world, this safety would require careful planning to avoid exceeding 2GB of usable storage.
3d. Increasing the number of threads may also require additional heap storage. Most thread stacks on obiee are allocated when the server is started, and real memory usage increases as threads run work.
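The sizing arithmetic in points 1 through 3 can be sketched in shell. This is a minimal sketch, not an obiee or AIX tool: the function names, the 500-thread count, and the 1024KB stack size are hypothetical figures chosen for illustration. It assumes all thread stacks are carved out of the MAXDATA heap, as described in 3c, and it expects the hex value without the @DSA suffix.

```shell
#!/bin/sh
# Convert a MAXDATA hex value (as used in LDR_CNTRL) to megabytes.
# $1: hex value such as 0x100000000, without any @DSA suffix.
maxdata_to_mb() {
    # Shell arithmetic understands the 0x prefix directly.
    echo $(( $1 / 1048576 ))
}

# Virtual memory reserved for thread stacks, in megabytes.
# $1: number of threads (hypothetical), $2: stack size per thread in KB
stack_reservation_mb() {
    echo $(( $1 * $2 / 1024 ))
}

# Heap left for dynamic memory requests once thread stacks are reserved, in MB.
# $1: MAXDATA hex value, $2: thread count, $3: stack KB per thread
heap_headroom_mb() {
    echo $(( $(maxdata_to_mb "$1") - $(stack_reservation_mb "$2" "$3") ))
}

# The MAXDATA values quoted in this article
for value in 0x080000000 0xd0000000 0x100000000 0x200000000 0x400000000; do
    echo "MAXDATA=$value -> $(maxdata_to_mb "$value") MB"
done

# Illustrative only: 500 threads with 1024KB stacks under a 4GB MAXDATA
echo "headroom: $(heap_headroom_mb 0x100000000 500 1024) MB"
```

With these made-up numbers, 0xd0000000 works out to 3328MB (3.25GB), and the 4GB setting leaves 3596MB after 500MB of stack reservations. Quadrupling the stack size, as suggested in 3a, would raise the reservation to 2000MB, which is why a larger MAXDATA goes hand in hand with larger stacks.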
Does 2.8GB sound like a lot of memory for an AIX application server?
- I guess it depends on what you are accustomed to seeing from "grandpa's applications".
- One of the primary design goals of obiee is to trade memory for services (db, query caches, etc.).
- 2.8GB is still well under the 4GB heap size allocated with MAXDATA=0x100000000.
- A 2.8GB process size is also possible even for 32-bit Windows applications.
- It is not unusual to receive a sudden request for 30MB of contiguous storage on obiee.
- This is not a memory leak; eventually the nqsserver storage will stabilize, but it may take days to do so.
vmstat is the tool of choice to observe memory usage. On AIX, vmstat will show something that may be startling to some people: available free memory (the fre column, second in the memory group) is always trending toward zero. Some customers have concluded that "nearly zero memory free" means it is time to upgrade the server with more real memory. After the upgrade, the server again shows very little free memory available.
Should you be concerned about this? Many customers are! Here is what is happening:
- AIX filesystems are built on a paging model. If you read/write a filesystem block, it is paged into memory (no read/write system calls).
- This filesystem "page" has its own "backing store" on disk: the original filesystem block. When the system needs the real memory page holding the file block, there is no need to "page out". The page can be stolen immediately, because the original is still on disk in the filesystem.
- The filesystem pages tend to collect: every filesystem block that has been seen since system boot is available in memory. If another application needs the file block, it is retrieved with no physical I/O.
What happens if the system does need the memory, to satisfy a 30MB heap request by nqsserver, for example?
- Since the filesystem blocks have their own backing store (not on a paging device), the kernel can just steal any filesystem block, on a least-recently-used basis, to satisfy a new real memory request for "computation pages".
No cause for alarm. vmstat is accurately showing that the filesystem blocks touched since boot now reside in memory.
Back to nqsserver: when should you be worried about its memory footprint?
Answer: Almost never. Stop monitoring it ... stop fussing over it ... stop trying to optimize it.
This is a production application, and nqsserver uses the memory it requires to accomplish the job, based on demand.
C'mon ... never worry? I'm from New York ... worry is what we do best.
Ok, here is the metric you should be watching, using vmstat:
- Are you paging? There are several columns of vmstat output to watch:

bash-2.04$ vmstat 3 3
System configuration: lcpu=4 mem=4096MB
kthr     memory              page              faults        cpu
----- ------------ ------------------------ ------------ -----------
 r  b    avm   fre   re  pi  po  fr  sr  cy   in  sy  cs us sy id wa
 0  0 208492  2600    0   0   0   0   0   0   13  45  73  0  0 99  0
 0  0 208492  2600    0   0   0   0   0   0    9  12  77  0  0 99  0
 0  0 208492  2600    0   0   0   0   0   0    9  40  86  0  0 99  0

- fre is the "available free memory" indicator that trends toward zero.
- re is "re-page". The kernel steals a real memory page from one process, then immediately repages it back to the original process.
- pi is "page in". A process memory page that was previously paged out is now paged back in because the process needs it.
- po is "page out". A process memory block was paged out because it was needed by some other process.
Light paging activity (re, pi, po) is not a cause for worry. Processes get started, need some memory, go away. Sustained paging activity is cause for concern. obiee users are having a terrible day if these counters are always changing.
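The "are you paging" check can be automated with a few lines of shell and awk. This is a rough sketch under stated assumptions: count_paging_samples is a hypothetical helper name, and it assumes the 17-column vmstat layout used in this article, where pi is field 6 and po is field 7. Other AIX configurations may print additional columns, so check the header on your own system first.

```shell
#!/bin/sh
# Count vmstat samples that show paging-space activity (pi or po non-zero).
# Reads vmstat output on stdin and prints "<paging samples> <total samples>".
count_paging_samples() {
    awk '
        # Data rows have 17 fields and start with a number;
        # the header and separator lines do not match and are skipped.
        NF == 17 && $1 ~ /^[0-9]+$/ {
            samples++
            if ($6 > 0 || $7 > 0) paging++
        }
        END { printf "%d %d\n", paging, samples }
    '
}

# Sample data copied from the vmstat run in this article
sample='System configuration: lcpu=4 mem=4096MB
kthr memory page faults cpu
----- ------------ ------------------------ ------------ -----------
 r b avm fre re pi po fr sr cy in sy cs us sy id wa
 0 0 208492 2600 0 0 0 0 0 0 13 45 73 0 0 99 0
 0 0 208492 2600 0 0 0 0 0 0 9 12 77 0 0 99 0
 0 0 208492 2600 0 0 0 0 0 0 9 40 86 0 0 99 0'

printf '%s\n' "$sample" | count_paging_samples
```

On a live system the same function could read from a pipe, for example vmstat 3 20 | count_paging_samples. A result where most samples show paging is the sustained activity worth worrying about; an occasional non-zero sample is not.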
Hang on ... if nqsserver needs that memory and I reduce MAXDATA to keep the process under control, won't the nqsserver process crash when the memory is needed?
Yes, it will. It means that nqsserver is configured to require too much memory, and there are lots of options to reduce the real memory requirement:
- number of threads
- size of query cache
- size of sort
But I need nqsserver to keep running.
Real memory is over-committed. Many things can cause this:
- Running all application processes on a single server: DB server, web servers, WebLogic/WebSphere, sawserver, nqsserver, etc. You could move some of those to another host machine and communicate over the network. The need for real memory doesn't go away; it's just distributed to other host machines.
- The AIX LPAR is configured with too little memory. The AIX admin needs to provide more real memory to the LPAR running obiee.
- Giving more memory to this LPAR affects other partitions. Then it's time to visit your friendly IBM rep and buy more memory.