This chapter describes how the DIGITAL UNIX operating system uses the physical memory installed in the system. This chapter also describes how to configure and tune virtual memory, swap space, and buffer caches. Many of the tuning tasks described in this chapter require you to modify system attributes. See Section 2.11 for more information.
The total amount of physical memory is determined by the capacity of the memory boards installed in your system. The system distributes this memory in 8-KB units called pages.
The system distributes pages of physical memory among three areas:
Wired memory
At boot time, the operating system and the Privileged Architecture Library (PAL) code wire a contiguous portion of physical memory in order to perform basic system operations. Static wired memory is reserved for operating system data and text, system tables, the metadata buffer cache, which temporarily holds recently accessed UNIX File System (UFS) and CD-ROM File System (CDFS) metadata, and the Advanced File System (AdvFS) buffer cache. Static wired memory cannot be reclaimed through paging. You can reduce the amount of static wired memory only by removing subsystems.
In addition, the kernel uses
dynamically wired memory
for dynamically allocated data structures.
User processes also wire memory
for address space.
The amount of dynamically wired memory varies according
to the demand.
The maximum amount is specified by the value of the
vm-syswiredpercent
attribute (the default is 80 percent of physical
memory).
Memory that is dynamically wired cannot be reclaimed through paging.
You can reduce the amount of dynamically wired memory by allocating more kernel
resources to processes (for example, by increasing the value of the
maxusers
attribute).
Virtual memory
The virtual memory subsystem uses a portion of physical memory to cache processes' most-recently accessed anonymous memory and file-backed memory. The subsystem efficiently allocates memory to competing processes and tracks the distribution of all the physical pages. This memory can be reclaimed through paging.
Unified Buffer Cache
The Unified Buffer Cache (UBC) uses a portion of physical memory to cache most-recently accessed file system data. The UBC contains actual file data for reads and writes and for page faults from mapped file regions and also AdvFS metadata. By functioning as a layer between the operating system and the storage subsystem, the UBC can decrease the number of disk operations. This memory can be reclaimed through paging.
Figure 4-1 shows how physical memory is used.
The virtual memory subsystem and the UBC compete for the physical pages that are not wired. Pages are allocated to processes and to the UBC, as needed. When the demand for memory increases, the oldest (least-recently used) pages are reclaimed from the virtual memory subsystem and the UBC and reused. Various attributes control the amount of memory available to the virtual memory subsystem and the UBC and the rate of page reclamation. Wired pages are not reclaimed.
System performance depends on the total amount of physical memory and also the distribution of memory resources. DIGITAL UNIX allows you to control the allocation of memory (other than static wired memory) by modifying the values of system attributes. Tuning memory usually involves the following tasks:
Increasing system resource allocation to improve application performance
Modifying how the system allocates memory and the rate of page reclamation
Modifying how file system data is cached in memory
You can also configure your swap space for optimal performance. However, to determine how to obtain the best performance, you must understand your workload characteristics, as described in Chapter 1.
When programs are executed, the system moves data and instructions among various caches, physical memory, and disk swap space. Accessing the data and instructions occurs at different speeds, depending on the location. Table 4-1 describes the various hardware resources (in the order of fastest to slowest access time).
Resource | Description |
CPU caches | Various caches reside in the CPU chip and vary in size up to a maximum of 64 KB (depending on the type of processor). These caches include the translation lookaside buffer, the high-speed internal virtual-to-physical translation cache, the high-speed internal instruction cache, and the high-speed internal data cache. |
Secondary cache | The secondary direct-mapped physical data cache is external to the CPU, but usually resides on the main processor board. Block sizes for the secondary cache vary from 32 bytes to 256 bytes (depending on the type of processor). The size of the secondary cache ranges from 128 KB to 8 MB. |
Tertiary cache | The tertiary cache is not available on all Alpha CPUs; otherwise, it is identical to the secondary cache. |
Physical memory | The actual amount of physical memory varies. |
Swap space | Swap space consists of one or more disks or disk partitions (block special devices). |
The hardware logic and the PAL code control much of the movement of addresses and data among the CPU cache, the secondary and tertiary caches, and physical memory. This movement is transparent to the operating system. Figure 4-2 shows an overview of how instructions and data are moved among various hardware components during program execution.
Movement between caches and physical memory is significantly faster than movement between disk and physical memory, because of the relatively slow speed of disk I/O. Therefore, avoid paging and swapping operations, and applications should utilize caches when possible. Figure 4-3 shows the amount of time that it takes to access data and instructions from various hardware locations.
For more information on the CPU, secondary cache, and tertiary cache, see the Alpha Architecture Reference Manual.
The virtual memory subsystem performs the following functions:
Allocates memory to processes
Tracks and manages all the pages in the system
Uses paging and swapping to ensure that there is enough memory for processes to run and to cache file system I/O
The following sections describe these functions in detail.
For each process, the
fork
system call performs
the following tasks:
Creates a UNIX process body, which includes a set of data
structures that the kernel uses to track the process and a set of resource
limitations.
See
fork
(2)
for more information.
Allocates a contiguous block of
virtual address space,
which is the array of pages that an application can map into physical memory.
Virtual address space is used for
anonymous memory
(memory used for the stack, heap, or
malloc
function)
and for
file-backed memory
(memory used for program
text or shared libraries).
Pages of anonymous memory are paged in when needed
and paged out when pages must be reclaimed.
Pages of file-backed memory
are paged in when needed and released when pages must be reclaimed.
Creates one or more threads of execution. The default is one thread for each process. Multiprocessing systems support multiple process threads.
Because memory is limited, a process' entire virtual address space cannot be in physical memory at one time. However, a process can execute when only a portion of its virtual address space (its working set) is mapped to physical memory.
For each process, the virtual memory subsystem allocates a large amount of virtual address space but uses only part of this space. Only 4 TB is allocated for user space. User space is generally private and maps to a nonshared physical page. An additional 4 TB of virtual address space is used for kernel space. Kernel space usually maps to shared physical pages. The remaining space is not used for any purpose.
In addition, user space is sparsely
populated with valid pages.
Only valid pages are able to map to physical pages.
The
vm-maxvas
attribute specifies the maximum amount of
valid virtual address space for a process (that is, the sum of all the valid
pages).
The default is 128000 pages (1 GB).
Figure 4-4 shows the use of process virtual address space.
When a virtual page is touched or accessed, the virtual memory subsystem must locate the physical page and then translate the virtual address into a physical address. Each process has a page table, which is an array containing an entry for each current virtual-to-physical address translation. Page table entries have a direct relation to virtual pages (that is, virtual address 1 corresponds to page table entry 1) and contain a pointer to the physical page and protection information.
Figure 4-5 shows the translation of a virtual address into a physical address.
A process' resident set is the complete set of all the virtual addresses that have been mapped to physical addresses (that is, all the pages that have been accessed during process execution). Resident set pages may be shared among multiple processes. A process' working set is the set of virtual addresses that are currently mapped to physical physical addresses. The working set is a subset of the resident set and represents a snapshot of the process' resident set.
When a nonfile-backed virtual address is requested, the virtual memory subsystem locates the physical page and makes it available to the process. This process occurs at different speeds, depending on the location of the page (see Figure 4-3).
If a requested address is currently being used (active), it will have an entry in the page table. In this case, the PAL code loads the physical address into the translation lookaside buffer, which then passes the address to the CPU.
If a requested address is not active in the page table, the PAL lookup code issues a page fault, which instructs the virtual memory subsystem to locate the page and make the virtual-to-physical address translation in the page table.
If a requested virtual address is being accessed for the first time, the virtual memory subsystem performs the following tasks:
Allocates an available page of physical memory.
Fills the page with zeros.
Enters the virtual-to-physical address translation in the page table.
This is called a zero-filled-on-demand page fault.
If a requested virtual address has already been accessed, it will be in one of the following locations:
The virtual memory subsystem's internal data structures
If the physical address is located in the internal data structures (for example, the hash queue list or the page queue list), the virtual memory subsystem enters the virtual-to-physical address translation in the page table. This is called a short page fault.
Swap space
If the virtual address has already been accessed, but the physical page has been reclaimed, the page contents will be found in swap space. The virtual memory subsystem copies the contents of the page from swap space into the physical address and enters the virtual-to-physical address translation in the page table. This is called a page-in page fault.
If a process needs to modify a read-only virtual page, the virtual memory subsystem allocates an available page of physical memory, copies the read-only page into the new page, and enters the translation in the page table. This is called a copy-on-write page fault.
To improve process execution time and decrease the number of page faults, the virtual memory subsystem attempts to anticipate which pages the task will need next. Using an algorithm that checks which pages were most recently used, the number of available pages, and other factors, the subsystem maps additional pages, along with the page that contains the requested address.
The virtual memory subsystem also uses page coloring to reduce execution time. If possible, the subsystem attempts to map a process' entire resident set into the secondary cache. If the entire task, text, and data are executed within the cache, addresses do not have to be fetched from physical memory.
The
private-cache-percent
attribute specifies the percentage of the cache that is reserved
for anonymous (nonshared) memory.
The default is to reserve 50 percent of
the cache for anonymous memory and 50 percent for file-backed memory (shared).
To cache more anonymous memory, increase the value of the
private-cache-percent
attribute.
This attribute is primarily used for benchmarking.
The virtual memory subsystem allocates physical pages to processes and the UBC, as needed. Because physical memory is limited, these pages must be periodically reclaimed so that they can be reused.
The virtual memory subsystem uses page lists to track the location and age of all the physical memory pages. At any one time, each physical page can be found on one of the following lists:
Free list--Pages that are clean and are not being used (the size of this list controls when page reclamation occurs)
Active list--Pages that are being used by the virtual memory subsystem or the UBC
To determine which pages should be reclaimed first, the page-stealer daemon identifies the oldest pages on the active list and designates these least-recently used (LRU) pages as follows:
Inactive pages are the oldest pages that are being used by the virtual memory subsystem.
UBC LRU pages are the oldest pages that are being used by the UBC.
Use the
vmstat
command or
dbx
to
determine the number of pages that are on the page lists.
Remember that pages on the active list (the
act
field
in the
vmstat
output)
include both inactive and UBC LRU pages.
As physical pages are allocated to processes and the UBC, the free list becomes depleted, and pages must be reclaimed in order to replenish the list. To reclaim pages, the virtual memory subsystem does the following:
Prewrites the oldest dirty (modified) pages to swap space
Uses paging to reclaim individual pages
Uses swapping to suspend processes and reclaim a large number of pages
See Section 4.3.5, Section 4.3.6, Section 4.3.8, and Section 4.3.9 for more information about prewriting pages, paging, and swapping.
The virtual memory subsystem attempts to prevent a memory shortage by prewriting modified pages to swap space.
When the virtual memory subsystem anticipates that the pages on
the free list will soon be depleted, it prewrites to swap space
the oldest modified (dirty) inactive pages.
The value of the
vm-page-prewrite-target
attribute
determines the number of pages that the subsystem will prewrite
and keep clean.
The default value is 256 pages.
In addition, when the number of modified UBC LRU pages
exceeds the value of the
vm-ubcdirtypercent
attribute, the virtual memory subsystem prewrites to swap space the
oldest modified UBC LRU pages.
The default value of the
vm-ubcdirtypercent
attribute is
10 percent of the total UBC LRU pages.
To minimize the impact of
sync
(steady state flushes)
when prewriting UBC pages, the
ubc-maxdirtywrites
attribute specifies the maximum number of disk writes that the kernel
can perform each second.
The default value is 5.
See Section 4.7.13 for more information about prewriting dirty pages.
When the demand for memory depletes the free list, paging begins. The virtual memory subsystem takes the oldest inactive and UBC LRU pages, moves the contents of the modified pages to swap space, and puts the clean pages on the free list, where they can be reused.
If the free page list cannot be replenished by reclaiming individual pages, swapping begins. Swapping temporarily suspends processes and moves entire resident sets to swap space, which frees large amounts of physical memory.
The point at which paging and swapping start and stop depends on the values of some virtual memory subsystem attributes. Figure 4-6 shows the default values of these attributes.
Detailed descriptions of the attributes are as follows:
vm-page-free-target
--Paging starts when the number of pages on the free list is less than
this value (the default is 128 pages).
vm-page-free-min
--Specifies the threshold
at which a page must be reclaimed for each page allocated (the default
is 20 pages).
vm-page-free-swap
--Idle
task swapping starts when the number of pages on the free list
is less than this value for a period of time (the default is 74 pages).
vm-page-free-optimal
--Hard
swapping starts when the number of pages on the free list is less than
this value for five seconds (the default is 74 pages).
The first processes to be swapped out include those with the lowest
scheduling priority and those with the largest resident set size.
vm-page-free-hardswap
--Swapping
stops when the number of pages on the free list is more
than this value (the default is 1280 pages).
vm-page-free-reserved
--Only
privileged tasks can get memory when the number of pages on the
free list is less than this value (the default is 10 pages).
See Section 4.3.8 and Section 4.3.9 for information about paging and swapping operations.
Because the UBC shares with the virtual memory subsystem the physical pages that are not wired by the kernel, the allocation of memory to the UBC can affect file system performance and paging and swapping activity. The UBC is dynamic and consumes varying amounts of memory in order to respond to changing file system demands.
Figure 4-7 shows how memory is allocated to the UBC.
The following attributes control the amount of memory available to the UBC:
Specifies the minimum percentage of memory that the UBC can utilize. The default is 10 percent.
Specifies the maximum percentage of memory that the UBC can utilize. The default is 100 percent.
ubc-borrowpercent
attribute
Specifies the UBC borrowing threshold.
The default is 20 percent.
From the value of the
ubc-borrowpercent
attribute to the value of the
ubc-maxpercent
attribute, the UBC is only borrowing
memory from the virtual memory subsystem.
When paging starts,
pages are first reclaimed from the UBC until the amount of memory
allocated to the UBC reaches the value of the
ubc-borrowpercent
attribute.
When the memory demand is high and the number of pages on the free page
list reaches the value of the
vm-page-free-target
attribute, the virtual memory subsystem uses paging to replenish the free
page list.
The page reclamation code controls paging and swapping.
The page-out daemon and task swapper daemon are extensions of the page
reclamation code.
See
Section 4.3.6
for more
information about the attributes that control paging and swapping.
The page reclamation code activates the page-stealer daemon, which first reclaims the pages that the UBC has borrowed from the virtual memory subsystem, until the size of the UBC reaches the borrowing threshold (the default is 20 percent). If the reclaimed pages are dirty (modified), their contents must be written to disk before the pages can be moved to the free page list. Freeing borrowed UBC pages is a fast way to reclaim pages, because UBC pages are usually unmodified. See Section 4.3.7 for more information about UBC borrowed pages.
If freeing UBC borrowed memory does not sufficiently replenish the free list, a pageout occurs. The page-stealer daemon reclaims the oldest inactive and UBC LRU pages.
Paging becomes increasingly aggressive if the number of free pages
continues to decrease.
If the number of pages on the free page list falls below the value of
the
vm-page-free-min
attribute (the default is 20
pages), a page must be reclaimed for each page allocated.
To
prevent deadlocks, if the number of pages on the free page list falls below
the value of the
vm-page-free-reserved
attribute
(the default is 10 pages), only privileged tasks can get memory until the free page
list is replenished.
Paging stops when the number of pages on the free list reaches the
value of the
vm-page-free-target
attribute.
If paging individual pages does not replenish the free list, swapping is used to free a large amount of memory. See Section 4.3.9 for more information.
Figure 4-8 shows the movement of pages during paging operations.
If there is a high demand for memory, the virtual memory subsystem may be unable to replenish the free list by reclaiming pages. Swapping reduces the demand for physical memory by suspending processes, which dramatically increases the number of pages on the free list. To swap out a process, the task swapper suspends the process, writes its resident set to swap space, and moves the clean pages to the free list.
Idle task swapping begins when the number of pages on the free list falls
below the value of the
vm-page-free-swap
attribute for a period of time (the default is 74 pages).
The task swapper suspends all tasks that have been idle for 30 seconds or more.
If the number of pages on the
free list falls below the value of the
vm-page-free-optimal
attribute (the default is 74 pages) for more than five seconds, hard swapping begins.
The task swapper suspends, one at a time, the tasks with the lowest priority
and the largest resident set size.
Swapping stops when the number of pages on the free list
reaches the value of the
vm-page-free-hardswap
attribute
(the default is 1280).
A
swapin
occurs when the number of pages
on the free list reaches the value of the
vm-page-free-optimal
attribute for
a period of time.
The task's working set is paged in
from swap space and it can now execute.
The value of the
vm-inswappedmin
attribute specifies the
minimum amount of time, in seconds, that a task
must remain in the inswapped state before it can be outswapped.
The default value is 1 second.
Swapping has a serious impact on system performance. You can modify the attributes described in Section 4.3.6 to control when swapping starts and stops.
Increasing the rate of swapping (swapping earlier during page reclamation) increases throughput. As more processes are swapped out, fewer processes are actually executing and more work is done. Although increasing the rate of swapping moves long-sleeping threads out of memory and frees memory, it degrades interactive response time. When an outswapped process is needed, it will have a long latency.
If you decrease the rate of swapping (swap later during page reclamation), you will improve interactive response time, but at the cost of throughput.
To facilitate the movement of data between memory and disk, the virtual memory subsystem uses synchronous and asynchronous swap buffers. The virtual memory subsystem uses these two types of buffers to immediately satisfy a page-in request without having to wait for the completion of a page-out request, which is a relatively slow process.
Synchronous swap buffers are used for page-in page faults and for swap outs. Asynchronous swap buffers are used for asynchronous pageouts and for prewriting modified pages. See Section 4.7.15 and Section 4.7.16 for tuning information.
The DIGITAL UNIX operating system uses the Unified Buffer Cache (UBC) as a layer between the operating system and disk. The UBC holds actual file data, which includes reads and writes from conventional file activity and page faults from mapped file sections, and AdvFS metadata. The cache can improve I/O performance by decreasing the number of disk I/O operations.
The UBC shares with the virtual memory subsystem the physical pages
that are not wired by the kernel.
The maximum and minimum percentages of
memory that the UBC can utilize are specified by the
ubc-maxpercent
attribute (the default is 100 percent) and
the
ubc-minpercent
attribute (the default is 10 percent).
In addition, the
ubc-borrowpercent
attribute
specifies the percentage of
memory allocated to the UBC above which the memory is only borrowed from
the virtual memory subsystem.
The default is 20 percent of physical memory.
See
Section 4.3.7
for more information.
The UBC is dynamic and consumes
varying amounts of memory in order to respond to changing file system
demands.
For example, if file system activity is heavy,
pages will be allocated to the UBC up to the value of the
ubc-maxpercent
attribute.
In contrast, heavy process activity, such as large increases in the
working sets for large executables, will cause the virtual memory
subsystem to reclaim UBC borrowed pages.
Figure 4-7
shows the allocation of physical memory
to the UBC.
The UBC uses a hashed list to quickly locate the physical pages that it is holding. A hash table contains file and offset information that is used to speed lookup operations.
The UBC also uses a buffer to facilitate
the movement of data between memory and disk.
The
vm-ubcbuffers
attribute specifies maximum file system device I/O queue depth
for writes (that is, the number of UBC I/O requests that can be outstanding).
See
Section 4.7.17
for tuning information.
The metadata buffer cache is part of kernel wired memory and is used to cache only UFS and CDFS metadata, which includes file header information, superblocks, inodes, indirect blocks, directory blocks, and cylinder group summaries. The DIGITAL UNIX operating system uses the metadata buffer cache as a layer between the operating system and disk. The cache can improve I/O performance by decreasing disk I/O operations.
The metadata buffer cache is configured at boot time and uses
bcopy
routines to move data in and out of memory.
The size of the metadata buffer cache is specified by the value of the
bufcache
attribute.
See
Section 4.9
for tuning information.
The following sections describe how to configure memory and swap space, which includes the following tasks:
Determining how much physical memory your system requires (Section 4.6.1)
Determining how much swap space you need (Section 4.6.2)
Choosing a swap space allocation mode (Section 4.6.3)
This section describes how to determine your system's memory requirements. The amount of memory installed in your system must be able to provide an acceptable level of user and application performance.
To determine your system's memory requirements, you must gather the following information:
The amount of memory that will be wired
The amount of memory that the virtual memory subsystem requires to cache the anonymous regions of process data
The amount of memory that the UBC requires to cache file system data
See Section 4.6.2 for information about swap space requirements.
Your system's performance depends on the swap space configuration. DIGITAL recommends a minimum of 128 MB for swap space.
To calculate the swap space required by your system and workload, compare the total modifiable virtual address space (anonymous memory) required by your processes with the total amount of physical memory. Modifiable virtual address space holds data elements and structures that are modified during process execution, such as heap space, stack space, and data space.
To calculate swap space requirements if you are using immediate mode, total the anonymous memory requirements for all processes and then add 10 percent of that value. If you are using deferred mode, total the anonymous memory requirements for all processes and then divide by two.
Application messages, such as the following, usually indicate that not
enough swap space is configured into the system or that a process limit has
been reached:
Use multiple disks for swap space.
The page reclamation code
uses a form of disk striping (known as swap space
interleaving) so that pages can be written to the multiple disks.
To optimize swap space, ensure that all your swap disks are configured
when you boot the system, instead of adding swap space while the system is
running.
Use the
The following list describes how to configure swap space for high
performance:
Configure all of your swap space at boot time
Use fast disks for swap space to decrease page fault latency
Do not use busy disks for swap space
Spread out your swap space across multiple disks (never put
multiple swap partitions on the same disk)
Spread out your swap disks across multiple I/O buses to prevent a single
bus from becoming a bottleneck
Use the Logical Storage Manager (LSM) to stripe your swap
disks
See
Chapter 5
for more
information about configuring and tuning
swap disks for high performance and availability.
There are two methods that you can use to allocate swap space.
The methods
differ in the point in time at which the virtual memory subsystem reserves
swap space for a process.
There is no performance benefit attached to
either method; however, deferred mode is recommended
for very-large memory/very-large database (VLM/VLDB) systems.
The swap allocation methods are as follows:
Immediate mode--Swap space is
reserved when modifiable
virtual address space is created.
Immediate mode is often referred
to as
eager mode
and is the default swap
space allocation mode.
Anonymous memory is memory that is not backed by a file, but is
backed by swap space (for example, stack space, heap space, and memory
allocated by the
Deferred mode--Swap space is not
reserved until the virtual memory subsystem
needs to write a modified virtual page to swap space.
Deferred
mode is sometimes referred to as
lazy mode.
Deferred mode requires less swap space than immediate mode and causes
the system to run faster because it requires less swap space bookkeeping.
It postpones the reservation and allocation of swap space for anonymous
memory until it is needed.
However,
because deferred mode does not reserve swap space in advance, the swap space
may not be available when a task needs it, and the process may be killed
asynchronously.
You can enable the deferred swap space allocation mode
by removing or moving the
See the
System Administration
manual for more information on
swap space allocation methods.
The virtual memory subsystem is a primary source of performance problems.
Performance may degrade if the virtual memory subsystem cannot
keep up with the demand for memory and excessive paging and swapping occurs.
A memory bottleneck may cause a disk I/O bottleneck, because
excessive paging and swapping decreases performance
and indicates that the natural working set size has exceeded the
available memory.
The virtual memory subsystem runs at a high
priority when servicing page faults, which blocks the execution of other
processes.
If you have excessive page-in and page-out activity from a swap partition,
the system may have a high physical memory commitment ratio.
Excessive paging
also can increase the miss rate for the secondary cache, and may be
indicated by the following output:
The output of the
The output of the
The output of the
The tuning recommendations that will provide the best performance benefit
involve the following two areas:
System resource allocation
Increasing the available address space
Increasing the kernel resources available to processes
Memory allocation and page reclamation
Modifying the percentage of memory allocated to the UBC
Changing the rate of swapping
Changing how the system prewrites modified inactive pages
Table 4-2
describes the primary tuning tasks
guidelines and lists the performance benefits as well as tradeoffs.
If the previous tasks do not sufficiently improve performance, there
are
advanced tuning tasks that you can perform.
The advanced tuning tasks
include the following:
Modify the sizes of the page-in and page-out clusters
Modify the swap device I/O queue depth
Modify the amount of memory the UBC uses to cache large files
Increase the paging threshold
Enable aggressive task swapping
Decrease the size of the file system caches
Reserve memory at boot time for shared memory
Table 4-3
describes the advanced tuning tasks
guidelines and lists the performance benefits as well as tradeoffs.
The following sections describe these guidelines in detail.
You can improve performance and reduce the demand for memory by running
fewer applications simultaneously.
Use the
You can reduce the static size of the kernel by deconfiguring any unnecessary
subsystems.
Use the
Use the
If your applications are memory-intensive, you may want to
increase the available address space.
Increasing the address space
will cause only a small increase in the demand for memory.
However,
you may not want to increase the address space if your applications use
many forked processes.
The following attributes
determine the available address space for processes:
This attribute controls the maximum amount of virtual address space
available to a process.
The default value is 1 GB (1073741824).
For Internet servers, you may want to increase this value to 10 GB.
These attributes control the maximum amount of user process address
space, which is the maximum number of valid virtual regions.
The default value for both attributes is 1 GB.
These attributes control the maximum size of a user process stack.
The default value of the
These attributes control the maximum size of a user process data segment.
The default value of the
You can use the
If
your applications are memory-intensive, you may want to increase the system
resources that are available to processes.
Be careful when increasing
the system resources, because this will increase the amount of wired memory
in the system.
The following attributes affect system resources:
The
The default value assigned to the
If your system experiences a lack of resources (for example,
The
The
The
If your system experiences a lack of processes,
you can increase the value of the
The
If your system, especially a Web server, experiences a lack of threads,
you can increase the value of the
You can use the
The
You may want to increase the value of the
The
Changing the protection attributes of a single page within
a virtual memory region causes all pages within that region to be treated
as though they had individual protection attributes.
For example, each thread
of a multithreaded task has a user stack in the stack region for the process
in which it runs.
Because multithreaded tasks have guard pages
(that is, pages that do not have read/write access) inserted between the user
stacks for the threads, all pages in the stack region for the process are
treated as though they have individual protection attributes.
The default value of the
You may want to increase the value of the
If your applications are memory-intensive or you have a VLM system,
you may want to increase
the value of the
In addition, you may want to increase the
value of the
If your applications are memory-intensive or you have a VLM
system, you may want to increase the
value of the
In addition, you may want to increase the
value of the
If your applications are memory-intensive, you may want to increase the
value of the
You may want to reduce your applications' use of memory to free memory
for other purposes.
Follow these coding considerations to reduce your
applications' use of memory:
Configure and tune applications according to the guidelines
provided by
the application's installation procedure.
For example, you may be
able to reduce an application's anonymous memory requirements,
set parallel/concurrent processing attributes, size shared global areas and
private caches, and set the maximum number of open/mapped files.
Look for data cache collisions between heavily used data
structures, which occur when the distance between two data
structures allocated in memory is equal to the size of the primary
(internal) data cache.
If your data structures are small, you can
avoid collisions by allocating them contiguously in memory.
To do this,
use a single
If an application uses large amounts of data for a short time,
allocate the data dynamically with the
If an application uses the
If your application fits in a 32-bit address space and allocates
large
amounts of dynamic memory by using structures that contain many
pointers, you may be able to reduce memory usage by using
the
See the
Programmer's Guide
for more information on process memory allocation.
You may be able to improve performance by reducing the maximum
percentage of memory available for the UBC.
If you decrease the maximum size of the UBC, you increase the amount of memory
available to the virtual memory subsystem, which may reduce the paging and
swapping rate.
However, reducing the memory allocated to the UBC
may adversely affect I/O performance because the UBC will hold less
file system data, which results in more disk I/O operations.
Therefore, do not significantly decrease the maximum size of the UBC.
The maximum amount of memory that can be allocated to the UBC
is specified by the
If the page-out
rate is high and you are not using the file system heavily, decreasing
the value of the
Use the
You also may be able to prevent paging by increasing the percentage of
memory that the UBC borrows from the virtual memory subsystem.
To do this, decrease the value of the
Swapping has a drastic impact on system performance.
You can modify attributes to control when swapping begins and
ends.
Increasing the rate of swapping (swapping earlier during page
reclamation), moves long-sleeping threads out of
memory, frees memory, and increases throughput.
As more processes are
swapped out, fewer processes are actually executing and more work is done.
However, when an outswapped process is needed, it will have a long latency,
so increasing the rate of swapping will degrade interactive response time.
In contrast, if you decrease the rate of swapping (swap later during
page reclamation), you will improve interactive response time, but
at the cost of throughput.
To increase the rate of swapping, increase the value of the
To decrease the rate of swapping, decrease the value of the
The virtual memory subsystem attempts to prevent a memory shortage by
prewriting modified pages to swap space.
When the virtual memory subsystem anticipates that the pages on
the free list will soon be depleted, it prewrites to swap space
the oldest modified (dirty) pages on the inactive list.
To reclaim a page that has been
prewritten, the virtual memory subsystem only needs to validate the page.
Increasing the rate of dirty page prewriting will reduce peak workload
performance, but it will prevent
a drastic performance degradation when memory is exhausted.
Decreasing the
rate will improve peak workload performance, but it will cause a drastic
performance degradation when memory is exhausted.
You can control the rate of dirty page prewriting by modifying the
values of the
The
The
In addition, you may want to minimize the impact of I/O spikes caused by the
To minimize the impact of
The virtual memory
subsystem reads in and writes out additional pages in an attempt to anticipate
pages that it will need.
The
Decreasing the value of the
The
Decreasing the value of the
Synchronous swap buffers are used for page-in page faults and
for swapouts.
The
You can modify the value of the
Increasing the swap device I/O queue depth increases overall system
throughput, but consumes memory.
Decreasing the swap device I/O queue depth decreases memory demands
and improves interactive response time, but decreases overall system throughput.
Asynchronous swap buffers are used for asynchronous pageouts and for
prewriting modified pages.
The
The value of the
Increasing the queue depth will free memory and increase the overall
system throughput.
Decreasing the queue depth will use more memory, but will
improve the interactive response time.
If you are using LSM, you may want to increase the page-out rate.
Be careful if you increase the value of the
The UBC uses a buffer to facilitate
the movement of data between memory and disk.
The
Increasing the UBC write device queue depth frees memory and increases
the
overall file system throughput.
Decreasing the UBC write device queue depth increases memory demands,
but improves the interactive response time.
If a large file completely fills the UBC, it may take all of the pages
on the free page list, which may cause the system to page excessively.
The
The
Increasing the value of the
Decreasing the value of the
To force the system
to reuse the pages in the UBC instead of taking pages from the free
list, perform the following tasks:
Make the maximum size of the UBC greater than the size of
the UBC as a
percentage of percentage of memory.
That is, the value of the
Make the value of the
For example, using the default values, the UBC would have to be larger
than
50 percent of all memory and a file would have to be larger than 10 percent
of the UBC (that is, the file size would have to be at least 5 percent of
all memory) in order for the system to reuse the pages in the UBC.
On large-memory systems that are doing a lot of file system operations,
you may want to lower the
The
Increasing the value of the
Do not decrease the value of the
You can enable the
By default, the
The metadata buffer cache contains recently accessed UFS and
CDFS metadata.
On large-memory systems with a high cache hit rate,
you may want to decrease the size of
the metadata buffer cache.
This will increase the amount of memory
that is available to the virtual memory subsystem.
However, decreasing the size of the cache may degrade UFS performance.
The
For systems that use only AdvFS, set the value of the
The namei cache is used by all file systems to map file pathnames to
inodes.
Use
To free memory resources, decrease the number of elements in the namei
cache by decreasing the value of the
To free memory resources, you may want to decrease the percentage of
physical memory allocated to the AdvFS buffer cache.
The
Granularity hints allow you to reserve a portion of dynamically wired
physical memory at boot time for shared memory.
Granularity
hints allow the translation lookaside buffer to map more than a single page
and enable shared page table entry functionality, which will cause fewer
buffer misses.
On typical database servers, using granularity hints provides a 2 to
4
percent run-time performance gain that reduces the shared memory detach time.
In most cases, use the Segmented Shared
Memory (SSM) functionality (the default) instead of the granularity
hints functionality.
To enable granularity hints, you must specify a value for the
Section 4.7.24.1
and
Section 4.7.24.2
describe how to enable granularity hints.
To use granularity hints, you must specify the number of 4-MB chunks
of
physical memory to reserve for shared memory at boot time.
This memory cannot
be used for any other purpose and cannot be returned to the system or
reclaimed.
To reserve memory for shared memory, specify a nonzero value for the
The value you specify for the
You can determine if you have reserved the appropriate amount of memory.
For example, you can initially specify 512 for the value of the
The output shows the following:
The first number (402) specifies the number of 512-page chunks
(4 MB).
The second number (4) specifies the number of 64-page chunks.
The third number (0) specifies the number of 8-page chunks.
The fourth number (2) specifies the number of 1-page chunks.
To save memory, you can reduce the value of the
The following attributes also affect granularity hints:
Specifies the shared memory segment size above which memory is allocated
from the memory reserved by the
When set to 1 (the default), the
If the value of the
In addition, messages will display on the system console indicating
unaligned size and attach address requests.
The unaligned attach messages
are limited to one per shared memory segment.
You can make granularity hints more effective
by making both the shared memory segment starting address and size aligned
on an 8-MB boundary.
To share Level 3 page table entries, the shared memory segment attach
address (specified by the
The attach address and the shared memory segment size is specified
by the application.
In addition, System V shared memory semantics allow a
maximum shared memory segment size of 2 GB minus 1 byte.
Applications that
need shared memory segments larger than 2 GB can construct these regions by
using multiple segments.
In this case, the total shared memory size specified
by the user to the application must be 8-MB aligned.
In addition, the value
of the
If the total shared memory size specified to the application is greater
than 2 GB, you can specify a value of 2139095040 (or 0x7f800000) for the
value of the
Use the following
For the best performance, the
Because of how shared memory is divided into shared memory
segments, there may be some unshared segments.
This occurs when the
starting address or the size is aligned on an 8-MB boundary.
This condition may be unavoidable in some cases.
In many cases, the
value of
Shared memory locking changes a lock that was a single lock into a hashed
array of locks.
The size of the hashed array of locks can be modified by
modifying the value of the
The UBC and
the virtual memory subsystem compete for the physical memory that is not
wired by the kernel.
You may be able to improve file system performance by
tuning the UBC.
However, increasing the amount of memory available to the
UBC will affect the virtual memory subsystem and may increase the rate of
paging and swapping.
The amount of memory allocated to the UBC is determined by the
The following output may indicate that the size of the UBC is too small
for your configuration:
The output of the
The output of the
The UBC is flushed by the
You can improve UBC performance by following the guidelines
described in
Table 4-4.
You can also improve
file system performance by following the guidelines
described in
Chapter 5.
The following sections describe these guidelines in detail.
If there is an insufficient amount of memory allocated to the UBC,
I/O performance may be degraded.
If you allocate more memory to the UBC, you will improve the chance that
data will be found in the cache.
By preventing the
system from having to copy data from a disk, you may improve I/O performance.
However, allocating more memory to the UBC may cause excessive paging and
swapping.
To increase the maximum amount of memory allocated to the UBC, you
can increase the value of the
If
The UBC borrows all physical memory above the
value of the
Increasing the value of the
To ensure that the value of the
If the values of the
You may want to use the
A portion of physical memory is wired for use by the metadata buffer
cache, which is the traditional BSD buffer cache.
The file system code
that deals with UFS metadata, which includes directories, indirect blocks,
and inodes, uses this cache.
You may be able to improve UFS performance by following the guidelines
described in
Table 4-5.
The following sections describe these guidelines in detail.
The
You may want to increase the size of the
metadata buffer cache if you have a high cache miss rate (low hit rate).
In general, you do not have to increase the cache size.
Never increase the value of the
To determine whether to
increase the size of the metadata buffer cache, use
Allocating additional memory to the metadata buffer cache reduces the
amount of memory available to the virtual memory subsystem and the UBC.
In general, you do not have to increase the value of the
The hash chain table for the metadata buffer cache stores the heads of
the hashed buffer queues.
Increasing the size of the hash chain table
spreads out the buffers and may reduce linear searches, which
improves lookup speeds.
The
You can modify the value of the
lack of paging space
" swap space below 10 percent free
"swapon -s
command to display your swap space configuration.
The first line displayed
is the total allocated swap space.
Use the
iostat
to display disk usage.
4.6.3 Choosing a Swap Space Allocation Mode
malloc
or
sbrk
routines).
When anonymous memory is allocated,
the operating system reserves swap space for the memory.
Usually, this
results in an unnecessary amount of reserved swap space.
Immediate mode requires more swap space than deferred mode, but it ensures
that the swap space will be available to processes when it is needed.
/sbin/swapdefault
file.
4.7 Tuning Virtual Memory
vmstat
shows a very
low free page count or shows high page-in and page-out activity.
See
Section 2.4.2
for more information.
ps
command shows high
task swapping activity.
See
Section 2.4.1
for more information.
iostat
command shows
excessive swap disk I/O activity.
See
Section 2.5.1
for more information.
Table 4-2: Primary Virtual Memory Tuning Guidelines
Action
Performance Benefit
Tradeoff
Reduce the number of processes
running at the same time (Section 4.7.1)
Reduces demand for memory
None
Reduce the static size of the
kernel (Section 4.7.2)
Reduces demand for memory
None
Increase the available address
space (Section 4.7.3)
Improves performance for
memory-intensive processes
Slightly increases the
demand for memory
Increase the available system
resources (Section 4.7.4)
Improves performance for
memory-intensive processes
Increases wired memory
Increase the maximum number of
memory-mapped files that are available to a
process (Section 4.7.5)
Increases file mapping and improves
performance for memory-intensive
processes, such as Internet servers
Consumes memory
Increase the maximum number of
virtual pages within a process' address space that can have individual
protection attributes (Section 4.7.6)
Improves performance for memory-intensive
processes and for Internet servers that maintain large tables or resident
images
Consumes memory
Increase the size of a
System V message and queue (Section 4.7.7)
Improves performance for memory-intensive
processes
Consumes memory
Increase the maximum size of a
single System V shared memory region (Section 4.7.8)
Improves performance for memory-intensive
processes
Consumes memory
Increase the minimum size of a
System V shared memory segment (Section 4.7.9)
Improves performance for
VLM and VLDB systems
Consumes memory
Reduce process memory requirements
(Section 4.7.10)
Reduces demand for memory
None
Reduce the amount of physical
memory available to the UBC (Section 4.7.11)
Provides more memory resources
to processes
May degrade file system performance
Increase the rate of swapping
(Section 4.7.12)
Frees memory and increases
throughput
Decreases interactive response
performance
Decrease the rate of swapping
(Section 4.7.12)
Improves interactive response
performance
Decreases throughput
Increase the rate of dirty page
prewriting
(Section 4.7.13)
Prevents drastic performance
degradation when memory is exhausted
Decreases peak workload
performance
Decrease the rate of dirty page
prewriting
(Section 4.7.13)
Improves peak workload performance
May cause drastic performance
degradation when memory is exhausted
Table 4-3: Advanced Virtual Memory Tuning Guidelines
Action
Performance Benefit
Tradeoff
Increase the size of the page-in
and page-out clusters
(Section 4.7.14)
Improves peak workload performance
Decreases total system workload
performance
Decrease the size of the page-in
and page-out clusters
(Section 4.7.14)
Improves total system workload
performance
Decreases peak workload
performance
Increase the swap device I/O queue
depth for pageins and swapouts
(Section 4.7.15)
Increases overall system throughput
Consumes memory
Decrease the swap device I/O queue
depth for pageins and swapouts
(Section 4.7.15)
Improves the interactive response
time and frees memory
Decreases system throughput
Increase the swap device I/O
queue depth for pageouts
(Section 4.7.16)
Frees memory and increases throughput
Decreases interactive response
performance
Decrease the swap device I/O
queue depth for pageouts
(Section 4.7.16)
Improves interactive response
time
Consumes memory
Increase the UBC write device
queue depth (Section 4.7.17)
Increases overall file system
throughput and frees memory
Decreases interactive response
performance
Decrease the UBC write device
queue depth (Section 4.7.17)
Improves interactive response
time
Consumes memory
Increase the amount of UBC memory
used to cache a large file
(Section 4.7.18)
Improves large file performance
May allow a large file to consume
all the pages on the free list
Decrease the amount of UBC memory
used to cache a large file
(Section 4.7.18)
Prevents a large file from consuming
all the pages on the free list
May degrade large file performance
Increase the paging threshold
(Section 4.7.19)
Maintains performance when free
memory is exhausted
May waste memory
Enable aggressive swapping
(Section 4.7.20)
Improves system throughput
Degrades interactive
response performance
Decrease the size of the metadata
buffer cache (Section 4.7.21)
Provides more memory resources
to processes on large systems
May degrade UFS performance
Decrease the size of the namei
cache (Section 4.7.22)
Decreases demand for memory
May slow lookup operations and
degrade file system performance
Decrease the amount of memory
allocated to the AdvFS cache (Section 4.7.23)
Provides more memory resources
to processes
May degrade AdvFS performance
Reserve physical memory for
shared memory (Section 4.7.24)
Improves shared memory detach
time
Decreases the memory available
to
the virtual memory subsystem and the UBC
4.7.1 Reducing the Number of Processes Running Simultaneously
at
or the
batch
command to run applications at offpeak hours.
4.7.2 Reducing the Static Size of the Kernel
setld
command to display the installed
subsets and to delete subsets.
sysconfig
command to display the configured
subsystems and to delete subsystems.
4.7.3 Increasing the Available Address Space
per-proc-address-space
and
max-per-proc-address-size
per-proc-stack-size
and
max-per-proc-stack-size
per-proc-stack-size
attribute
is 2097152 bytes.
The default value of the
max-per-proc-stack-size
attribute is 33554432 bytes.
You may need to increase these values if you receive
cannot grow
stack
messages.
per-proc-data-size
and
max-per-proc-data-size
per-proc-data-size
attribute is
134217728 bytes.
The default value of the
max-per-proc-data-size
is 1 GB.
setrlimit
function to control the
consumption of system resources by a parent process and its child processes.
See
setrlimit
(2)
for information.
4.7.4 Increasing the Available System Resources
maxusers
attribute specifies the number of simultaneous
users that a system can support without straining system resources.
System
algorithms use the
maxusers
attribute to size various system
data structures, and to determine the amount of space allocated to system
tables, such as the system process table, which is used to determine how many
active processes can be running at one time.
maxusers
attribute
depends on the size of your system.
Increasing the value of the
maxusers
attribute allocates more system resources for use by the
kernel.
However, this also increases the amount of physical memory consumed
by the kernel.
Decreasing the value of the
maxusers
attribute
reduces kernel memory usage, but allocates less system resources to
processes.
Out of processes
messages), you can increase the value of the
maxusers
attribute to 512.
A lack of resources may also be indicated
by a
No more processes
error message.
If you have sufficient
memory on a heavily loaded system (for example, more than 96 MB), you can
increase the value of the
maxusers
attribute to 1024.
task-max
task-max
attribute specifies the maximum number of
tasks that can run simultaneously.
The default value is 20 + 8 *
maxusers
.
thread-max
thread-max
attribute specifies the maximum number of
threads.
The default value is 2 *
task-max
.
max-proc-per-user
attribute specifies the maximum
number of processes that can be allocated at any one time to each user,
except superuser.
The default value of the
max-proc-per-user
attribute is 64.
max-proc-per-user
attribute.
The value must be more than the maximum number of processes
that will be started by your system.
If you have a Web server, these
processes include CGI processes.
If you plan to run more than 64 Web server daemons
simultaneously, increase the attribute value to 512.
On a very
busy server with sufficient memory, you can use a higher value.
Increasing this value can improve the performance of multiprocess
Web servers.
max-threads-per-user
attribute specifies the
maximum
number of threads that can be allocated at any one time to each user,
except superuser.
The default value is 256.
max-threads-per-user
attribute.
The value must be more than the maximum number of threads
that will be started by your system.
You can increase the value of the
max-threads-per-user
attribute to 512.
On a very busy server
with sufficient memory, you can use a higher value, such as 4096.
Increasing this value can improve the performance of multithreaded
Web servers.
setrlimit
function to control the
consumption of system resources by a parent process and its child processes.
See
setrlimit
(2)
for information.
4.7.5 Increasing the Number of Memory-Mapped Files
vm-mapentries
attribute specifies the maximum
number of memory-mapped files in a user address.
Each map entry describes one unique disjoint portion
of a virtual address space.
The default value is 200.
vm-mapentries
attribute for VLM systems.
Because Web servers map files into memory,
for busy systems running multithreaded Web server
software, you may want to increase the value to 20000.
This will
increase the limit on file mapping.
This attribute affects all processes, and increasing its value
will increase the demand for memory.
4.7.6 Increasing the Number of Pages With Individual Protections
vm-vpagemax
attribute specifies the maximum
number of virtual pages within a process' address space that
can be given individual protection attributes.
These protection attributes
differ from the protection attributes associated
with the other pages in the address space.
vm-vpagemax
attribute is determined by dividing the value
of the
vm-maxvas
attribute (the address
space size in bytes) by 8192.
If a stack region for a multithreaded task
exceeds 16 KB pages, you may want to increase the value of the
vm-vpagemax
attribute.
For example, if the value
of the
vm-maxvas
attribute is 1 GB (the default), set
the value of
vm-vpagemax
to 131072 pages
(1073741824/8192=131072).
This value improves the efficiency of Web
servers that maintain large tables or resident images.
vm-vpagemax
attribute for VLM systems.
However, this attribute affects all processes, and increasing its value
will increase the demand for memory.
4.7.7 Increasing the Size of a System V Message and Queue
msg-max
attribute.
This attribute
specifies the maximum size of a single System V message.
However, increasing the value of this attribute
will increase the demand for memory.
The default value is 8192 bytes (1 page).
msg-tql
attribute.
This attribute
specifies the maximum number of messages that can be queued to a single
System V message queue at one time.
However, increasing the value of this attribute
will increase the demand for memory.
The default value is 40.
4.7.8 Increasing the Size of a System V Shared Memory Region
shm-max
attribute.
This attribute
specifies the maximum size of a single System V shared memory region.
However, increasing the value of this attribute
will increase the demand for memory.
The default value is 4194304 bytes (512 pages).
shm-seg
attribute.
This attribute
specifies the maximum number of System V shared memory regions that
can be attached to a single process at any point in time.
However, increasing the value of this attribute
will increase the demand for memory.
The default value is 32.
4.7.9 Increasing the Minimum Size of a System V Shared Memory Segment
ssm-threshold
attribute.
Page table sharing occurs when the size of a System V
shared memory segment reaches the value specified by this attribute.
However, increasing the value of this attribute
will increase the demand for memory.
4.7.10 Reducing Application Memory Requirements
malloc
call instead of multiple calls.
malloc
function instead of declaring it statically.
When you have finished
using dynamically allocated memory, it is freed for use by
other data structures
that occur later in the program.
If you have limited memory resources,
dynamically allocating data reduces an application's memory usage and
can substantially improve performance.
malloc
function
extensively,
you may be able to improve its processing
speed or decrease its memory utilization by using the function's control
variables to tune memory allocation.
See
malloc
(3)
for details on tuning
memory allocation.
-xtaso
flag.
The
-xtaso
flag is
supported by all versions of the C
compiler (-newc
,
-migrate
, and
-oldc
versions).
To use the
-xtaso
flag, modify your source code with a C-language pragma that controls
pointer size allocations.
See
cc
(1)
for details.
4.7.11 Reducing the Memory Available to the UBC
ubc-maxpercent
attribute.
The default is 100 percent.
The minimum amount of memory that can be allocated to the UBC is specified
by the
ubc-minpercent
attribute.
The default is 10
percent.
If you have an Internet server, use these default values.
ubc-maxpercent
attribute may reduce
the rate of paging and swapping.
Start with the default value
of 100 percent and decrease the value in increments of 10.
If the values of the
ubc-maxpercent
and
ubc-minpercent
attributes are close together, you may
seriously degrade I/O performance or cause the system to page excessively.
vmstat
command to determine whether the system is paging excessively.
Using
dbx
, periodically examine the
vpf_pgiowrites
and
vpf_ubcalloc
fields of the
vm_perfsum
kernel structure.
The page-out rate may shrink if pageouts greatly exceed
UBC allocations.
ubc-borrowpercent
attribute.
Decreasing the value of the
ubc-borrowpercent
attribute
allows less memory to remain in the UBC when page reclamation
begins.
This can reduce the UBC effectiveness,
but may improve the system response time when a low-memory condition
occurs.
The value of the
ubc-borrowpercent
attribute
can range from 0 to 100.
The default value is 20 percent.
4.7.12 Changing the Rate of Swapping
vm-page-free-optimal
attribute (the default is 74 pages).
Increase the value only by 2 pages at a time.
Do not specify a value
that is more than the value of the
vm-page-free-target
attribute (the default is 128).
vm-page-free-optimal
attribute by 2 pages at a time.
Do not specify a value that is less than the value of the
vm-page-free-min
attribute (the default is 20).
4.7.13 Controlling Dirty Page Prewriting
vm-page-prewrite-target
attribute and the
vm-ubcdirtypercent
attribute.
vm-page-prewrite-target
attribute
specifies the number of virtual memory pages that the subsystem will prewrite
and keep clean.
The default value is 256 pages.
To increase the rate of virtual memory dirty page prewriting, increase the
value of the
vm-page-prewrite-target
attribute from
the default value (256) by increments of 64 pages.
vm-ubcdirtypercent
attribute specifies the percentage
of
UBC LRU pages that can be modified before the virtual memory subsystem
prewrites the dirty UBC LRU pages.
The default value is
10 percent of the total UBC LRU pages (that is, 10 percent of the UBC LRU
pages must be dirty before the UBC LRU pages are prewritten).
To increase the rate of UBC LRU dirty page prewriting, decrease the value
of
the
vm-ubcdirtypercent
attribute by increments of 1 percent.
sync
function when prewriting UBC LRU dirty pages.
The value of the
ubc-maxdirtywrites
attribute
specifies the
maximum number of disk writes that the kernel can perform each second.
The default value of the
ubc-maxdirtywrites
attribute
is 5 I/O operations per second.
sync
(steady state flushes)
when prewriting dirty UBC LRU pages, increase the value
of the
ubc-maxdirtywrites
attribute.
4.7.14 Modifying the Size of the Page-In and Page-Out Clusters
vm-max-rdpgio-kluster
attribute specifies the
maximum size of an anonymous page-in cluster.
The default value is 16 KB
(2 pages).
If you increase the value of this attribute, the system will spend
less time page faulting because more pages will be in memory.
This will increase
the peak workload performance, but will consume more memory and decrease
the total system workload performance.
vm-max-rdpgio-kluster
attribute will conserve memory and increase the total system workload performance,
but will increase paging and decrease the peak workload performance.
vm-max-wrpgio-kluster
attribute specifies the maximum size of an
anonymous page-out cluster.
The default value is 32 KB (4 pages).
Increasing
the value of this attribute improves the peak workload performance and conserves
memory, but causes more pageins and decreases the total system workload performance.
vm-max-wrpgio-kluster
attribute improves the total system workload performance and decreases the
number of pageins, but decreases the peak workload performance and consumes
more memory.
4.7.15 Modifying the Swap I/O Queue Depth for Pageins and Swapouts
vm-syncswapbuffers
attribute specifies
the maximum swap device I/O queue depth for pageins and swapouts.
vm-syncswapbuffers
attribute.
The value should be equal to the approximate number of simultaneously
running processes that the system can easily handle.
The default is 128.
4.7.16 Modifying the Swap I/O Queue Depth for Pageouts
vm-asyncswapbuffers
attribute controls the
maximum depth of the swap device I/O queue for pageouts.
vm-asyncswapbuffers
attribute
should be the approximate number of I/O transfers
that a swap device can handle at one time.
The default value is 4.
vm-asyncswapbuffers
attribute, because this will cause page-in requests to lag
asynchronous page-out requests.
4.7.17 Modifying the UBC Write Device Queue Depth
vm-ubcbuffers
attribute specifies the maximum file system device I/O queue
depth for writes.
The default value is 256.
4.7.18 Controlling Large File Caching
vm-ubcseqpercent
attribute specifies the
maximum amount of memory allocated to the UBC that can be used to cache a
file.
The default value is 10 percent of memory allocated to the UBC.
vm-ubcseqstartpercent
attribute specifies the
size of the UBC as a percentage of physical memory, at which time the
virtual memory subsystem starts stealing the
UBC LRU pages for a file to satisfy the demand for pages.
The default is 50 percent of physical memory.
vm-ubcseqpercent
attribute will improve the performance of a large single file,
but decrease the remaining amount of memory.
vm-ubcseqpercent
attribute will increase the available memory, but will degrade the
performance of a large single file.
ubc-maxpercent
attribute (the default is 100 percent)
must be greater than the value of the
vm-ubcseqstartpercent
attribute (the default is 50 percent).
vm-ubcseqpercent
attribute,
which specifies the size of a file as a percentage of the UBC, greater
than a referenced file.
The default value of the
vm-ubcseqpercent
attribute is 10 percent.
vm-ubcseqstartpercent
value
to 30 percent.
Do not specify a lower value unless you decrease
the size of the UBC.
In this case, do not change the value of the
vm-ubcseqpercent
attribute.
4.7.19 Increasing the Paging Threshold
vm-page-free-target
attribute specifies the minimum
number of pages on the free list before paging starts.
The default
value is 128 pages.
vm-page-free-target
attribute will increase the
paging activity but may improve performance when free memory is exhausted.
If you increase the value, start at the default value (128 pages or 1 MB)
and then double the value.
Do not specify a value above 1025 pages or 8 MB.
A high value can waste memory.
vm-page-free-target
attribute unless you have a lot of memory or you experience a serious
performance degradation when free memory is exhausted.
4.7.20 Enabling Aggressive Task Swapping
vm-aggressive
attribute (set the value
to 1) to allow the virtual memory subsystem to aggressively swap out processes
when memory is needed.
This improves system throughput, but degrades the
interactive response performance.
vm-aggressive
attribute is disabled
(set
to 0), which results in less aggressive swapping.
In this case,
processes are swapped in at a faster rate than if aggressive swapping is
enabled.
4.7.21 Decreasing the Size of the Metadata Buffer Cache
bufcache
attribute specifies the percentage of
physical memory that the kernel wires for the metadata buffer cache.
The default size of the metadata buffer cache is 3 percent of physical
memory.
You can decrease the value of the
bufcache
attribute to a minimum of 1 percent.
bufcache
attribute to 1 percent.
4.7.22 Decreasing the Size of the namei Cache
dbx
to monitor the cache by examining the
nchstats
structure.
name-cache-size
attribute.
The default values are 2*nvnode
*11/10
(for 32-MB or larger systems) and 150 (for 24-MB systems).
The maximum value is 2*max-vnodes
*11/10.
4.7.23 Decreasing the Size of the AdvFS Buffer Cache
AdvfsCacheMaxPercent
attribute determines the maximum amount of physical memory that can be used
for the AdvFS buffer cache.
The default is 7 percent of memory.
However, decreasing the size of the AdvFS buffer cache may adversely affect
AdvFS I/O performance.
4.7.24 Reserving Physical Memory for Shared Memory
gh-chunks
attribute.
To make granularity hints more
effective, modify applications to ensure that both the shared
memory segment starting address and size are
aligned on an 8-MB boundary.
4.7.24.1 Tuning the Kernel to Use Granularity Hints
gh-chunks
attribute.
For example, if you want to reserve
4 GB of memory, specify 1024 for the value of
gh-chunks
(1024 * 4 MB = 4 GB).
If you specify a value of 512, you will reserve
2 GB of memory.
gh-chunks
attribute
depends on your database application.
Do not reserve an excessive
amount of memory, because reserving memory decreases the memory available
to the virtual memory subsystem and the UBC.
gh-chunks
attribute.
Then, invoke the
following sequence of
dbx
commands while running the
application that allocates shared memory:
# dbx -k /vmunix /dev/mem
(dbx) px &gh_free_counts
0xfffffc0000681748
(dbx) 0xfffffc0000681748/4X
fffffc0000681748: 0000000000000402 0000000000000004
fffffc0000681758: 0000000000000000 0000000000000002
(dbx)
gh-chunks
attribute until only one or two 512-page chunks are free while the application
that uses shared memory is running.
gh-chunks
attribute.
The
default is 8 MB.
shmget
function
returns a failure if the requested segment size is larger than the value
specified by the
gh-min-seg-size
attribute, and if there
is insufficient memory in the
gh-chunks
area to satisfy
the request.
gh-fail-if-no-mem
attribute is
0, the entire request will be satisfied from the pageable memory area if the
request is larger than the amount of memory reserved by the
gh-chunks
attribute.
4.7.24.2 Modifying Applications to Use Granularity Hints
shmat
function) and the shared
memory segment size (specified by the
shmget
function)
must be aligned on an 8-MB boundary.
This means that the lowest 23 bits
of both the address and the size must be zero.
shm-max
attribute, which specifies the maximum size
of a System V shared memory segment, must be 8-MB aligned.
shm-max
attribute.
This is the maximum value
(2 GB minus 8 MB) that you can specify for the
shm-max
attribute and still share page table entries.
dbx
command sequence to determine
if page table entries are being shared:
# dbx -k /vmunix /dev/mem
(dbx) p *(vm_granhint_stats *)&gh_stats_store
struct {
total_mappers = 21
shared_mappers = 21
unshared_mappers = 0
total_unmappers = 21
shared_unmappers = 21
unshared_unmappers = 0
unaligned_mappers = 0
access_violations = 0
unaligned_size_requests = 0
unaligned_attachers = 0
wired_bypass = 0
wired_returns = 0
}
(dbx)
shared_mappers
kernel variable should be equal to the number of shared memory segments,
and the
unshared_mappers
,
unaligned_attachers
, and
unaligned_size_requests
variables should
be 0 (zero).
total_unmappers
will be greater than
the value of
total_mappers
.
vm-page-lock-count
attribute.
The default value is 64.
4.8 Tuning the UBC
ubc-maxpercent,
ubc-minpercent,
and
ubc-borrowpercent
attributes.
You may be able to improve performance
by modifying the value of these attributes, which are described in
Section 4.4.
vmstat
or
monitor
command shows excessive file system page in activity but little
or no page out activity or shows a very low free page count.
iostat
command shows
little or no swap
disk I/O activity or shows excessive file system I/O activity.
update
daemon.
You can monitor the UBC usage lookup hit ratio by using
dbx
.
You
can view UBC statistics by using
dbx
and checking
the
vm_perfsum
structure.
You can also monitor the UBC by using
dbx -k
and examining the
ufs_getapage_stats
structure.
See
Chapter 2
for information about
monitoring the UBC.
Table 4-4: Guidelines for Tuning the UBC
Action
Performance Benefit
Tradeoff
Increase the memory allocated
to the UBC (Section 4.8.1)
Improves file system performance
May cause excessive paging and
swapping
Decrease the amount of memory
borrowed by the UBC (Section 4.8.2)
Improves file system performance
Decreases the memory available
for processes and may decrease system
response time
Increase the minimum size of the
UBC (Section 4.8.3)
Improves file system performance
Decreases the memory available
for processes
Modify the application to use
mmap
(Section 4.8.4)Decreases memory requirements
None
Increase the UBC write device
queue depth (Section 4.7.17)
Increases overall file system
throughput and frees memory
Decreases interactive response
performance
Decrease the UBC write device
queue depth (Section 4.7.17)
Improves interactive response
time
Consumes memory
4.8.1 Increasing the Maximum Size of the UBC
ubc-maxpercent
attribute.
The default value is 100 percent.
However, the performance of an application that generates a lot of
random I/O will not be improved by enlarging the UBC because the next
access location for random I/O cannot be predetermined.
See
Section 4.3.7
for information about UBC memory allocation.
4.8.2 Decreasing the Amount of Borrowed Memory
vmstat
output shows excessive paging but few or
no pageouts, you may want to increase the value of the
ubc-borrowpercent
attribute.
This situation can occur on low-memory systems (24-MB systems) because
they reclaim UBC pages more aggressively than systems with more memory.
ubc-borrowpercent
attribute and
up to the value of the
ubc-maxpercent
attribute.
Increasing the value of the
ubc-borrowpercent
attribute
allows more memory to remain in the UBC when page reclamation
begins.
This can increase the UBC cache effectiveness,
but may degrade system response time when a low-memory condition
occurs.
The value of the
ubc-borrowpercent
attribute
can range from 0 to 100.
The default value is 20 percent.
See
Section 4.3.7
for information about UBC memory allocation.
4.8.3 Increasing the Minimum Size of the UBC
ubc-minpercent
attribute
will prevent large programs from completely filling the UBC.
For I/O servers, you may want to raise the value of the
ubc-minpercent
attribute to ensure that memory is
available for the UBC.
The default value is 10 percent.
ubc-minpercent
is
appropriate, use the
vmstat
command to examine
the page-out rate.
ubc-maxpercent
and
ubc-minpercent
attributes are close together, you may
degrade I/O performance or cause the system to page excessively.
See
Section 4.3.7
for information about UBC memory allocation.
4.8.4 Using mmap in Your Applications
mmap
function instead of the
read
or
write
function in your
applications.
The
read
and
write
system calls require a page of buffer memory and a page of UBC memory, but
mmap
requires only one page of memory.
4.9 Tuning the Metadata Buffer Cache
Table 4-5: Guidelines for Tuning the Metadata Buffer Cache
Action
Performance Benefit
Tradeoff
Increase the memory allocated
to the metadata buffer cache
(Section 4.9.1)
Improves UFS performance
Reduces the memory available to
the virtual memory subsystem and the UBC
Increase the size of the hash
chain table (Section 4.9.2)
Improves lookup speed
Consumes memory
4.9.1 Increasing the Size of the Metadata Buffer Cache
bufcache
attribute specifies the size of the
kernel's metadata buffer cache as a percentage of physical memory.
The default is 3 percent.
bufcache
to more than 10 percent.
dbx
to examine the
bio_stats
structure.
The miss rate (block misses divided by the sum of the block misses and block
hits) should not be more than 3 percent.
bufcache
attribute.
4.9.2 Increasing the Size of the Hash Chain Table
buffer-hash-size
attribute specifies the size
of
the hash chain table for the metadata buffer cache.
The default
hash chain table size is 512 slots.
buffer-hash-size
attribute so that each hash chain has 3 or 4 buffers.
To determine a value
for the
buffer-hash-size
attribute, use
dbx
to examine the value of
nbuf
, then divide
the value by 3 or 4, and finally round the result to a power of 2.
For example, if
nbuf
has a value of 360, dividing
360 by 3 gives you a value of 120.
Based on this calculation,
specify 128 (2 to the power of 7) as the value of the
buffer-hash-size
attribute.