Friday, March 30, 2012

DB2 for z/OS: Trading Memory for MIPS (Part 1)

If you are a DB2 for z/OS DBA or systems programmer, chances are that a good bit of what you do is focused on improving the CPU efficiency of your organization's DB2-accessing application workload. There are various ways in which people go about reducing DB2-related CPU consumption, including SQL statement tuning, physical database design changes (e.g., adding or modifying indexes), shifting work to zIIP engines (for example, by going with native SQL procedures versus external stored procedures for data access via DDF), and exploiting new DB2 features and functions (such as hash organization of data in a table space -- a potential winner for tables for which single-row access via a unique key is the dominant data access mode). If you're doing these things, good -- keep doing them; but let me tell you something: I've been working with DB2 on the mainframe platform since the late 1980s, and to this day I am still taken aback by the number of sites at which one of the best DB2 efficiency-boosters of all is not fully leveraged (and sometimes is hardly leveraged at all). I'm talking about z/OS system memory. You should really be looking to use this resource for all it's worth, because a) System z servers these days are often configured with LOTS of the stuff (you can get 3 terabytes on a z196), b) mainframe memory costs a lot less than it did a few years ago, c) adding memory to your system doesn't affect the cost of your System z software, and d) thoughtful memory utilization offers a variety of ways to enable DB2 to do more while consuming fewer processor cycles. In this, the first entry of a three-part series, I'll look at maximizing the effectiveness of your DB2 buffer pools without making them larger. In part two, I'll examine configuring DB2 to use more memory for buffer pools in an unapologetic way -- meaning, when you want more memory resource for your DB2 subsystem, don't ask for it as though you're Oliver Twist ("Please, sir, I want some more"). In part three, I'll look at ways you can leverage mainframe memory for performance purposes via pools other than the buffer pools (e.g., the EDM pool, the sort pool, and the RID pool).

I want to start out this memory-for-MIPS series with information on using the buffer pool virtual storage resource that you already have more beneficially, from a performance perspective, for three reasons: 1) the buffer pool configuration is typically the primary DB2 consumer of a system's memory resource, 2) you can really do some good by tuning what you've got, and 3) before asking for more memory for DB2 buffers, you ought to be making good use of the memory you've got.

Following, then, are some of my favorite hints and tips for getting your DB2 buffer pool house in order:

A buffer pool dedicated to the work file table spaces is a different animal. Let me start by saying that it's a REALLY good idea to have buffer pools that are dedicated to your work file table spaces (one for the 4K table spaces, and one for the 32K table spaces -- the latter being, often, MUCH more heavily utilized in a DB2 9 or 10 environment, versus a DB2 V8 system). In configuring these buffer pools, keep a couple of things in mind:
  • Almost all of the reads of data in work file table spaces will be of the prefetch variety. Why should you care about this? Because the default value for the virtual pool sequential steal threshold, VPSEQT, is 80. Basically, that setting means that only about 80% of the buffers in a pool dedicated to work file table spaces will get a lot of use, with the other 20% being reserved for relatively infrequent random (single-page) reads. Raise that threshold value significantly, and it's like getting more memory for free (you should see fewer prefetch read I/Os for the pool in question, and fewer read I/Os means less CPU spent driving I/O operations). I've often seen a VPSEQT value of 95-99% for a work file-dedicated buffer pool used to good effect (sample commands follow this list).
  • The motivation for externalizing updated work file table space pages to disk is not the same as it is for application table spaces and indexes. For application data (in table spaces and indexes), you don't want too much time to pass between the updating of a page and the externalization of that changed page to disk. Why? Because if the DB2 subsystem should fail, restart processing will be elongated if a large number of pending writes have to be processed (part of restart following abnormal termination involves DB2 making sure that committed updates made to pages that were in memory and not yet externalized at the time of the failure are reflected in the data on disk; otherwise, the data on disk would be inconsistent). It's a different story for pages belonging to work file table spaces. Those table spaces are conceptually like scratch pads and don't have to be processed during restart as application table spaces, indexes, and the DB2 catalog do. So, for the work file table spaces, page externalization is about ensuring that the associated buffer pools don't get too full of non-stealable pages. That being the case, the deferred write and vertical deferred write queue thresholds (DWQT and VDWQT) can be raised for work file-dedicated buffer pools without adversely affecting DB2 restart performance. Raising these thresholds for the pools used for work file table spaces reduces disk write activity, and fewer write I/Os means less CPU consumption. Some folks specify 70 and 40 for DWQT and VDWQT, respectively, some go with 80 and 50, and some go with even higher values (the defaults are 30 for DWQT and 5 for VDWQT). A word of caution: at one site I saw that DWQT and VDWQT for a work file-dedicated buffer pool were both set to 90, and the data manager threshold (reached when 95% of a pool's buffers are unavailable) was hit a great many times per hour for that pool. That is REALLY BAD for performance (big spike in GETPAGE activity), so watch that you don't go overboard in choosing elevated values for the deferred write thresholds for your work file buffer pools (a DB2 monitor will tell you if the data manager threshold has been reached for a buffer pool; so, too, will the output of the DB2 command -DISPLAY BUFFERPOOL DETAIL).
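To put these recommendations into command terms, here is a sketch of the relevant -ALTER BUFFERPOOL commands for a hypothetical work file-dedicated 32K pool, BP32K1 (the pool name and the threshold values are illustrative, not prescriptive -- pick values that fit your environment and that steer well clear of the data manager threshold):

  -ALTER BUFFERPOOL(BP32K1) VPSEQT(98)
  -ALTER BUFFERPOOL(BP32K1) DWQT(70) VDWQT(40)

Threshold changes of this kind take effect right away for an allocated pool, so you can make the change and then watch your monitor numbers to see the result.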

Steal smart. By default, DB2 uses a least-recently-used (LRU) algorithm to identify buffers that are to be stolen to accommodate new pages being brought into memory from disk. That's the right algorithm in most cases; however, if you have a DB2 9 (or earlier) system and you "pin" some objects in memory by assigning those objects to a pool that has at least as many buffers as the objects have pages (e.g., you assign index A, with 200 pages, and index B, with 300 pages, to a pool that has 500 or more buffers), then you should use the first-in, first-out (FIFO) page steal algorithm for the pin-pool (example commands appear after the numbered list below). Why? Because FIFO is a simpler, and therefore less costly (CPU-wise), page steal algorithm than LRU. For a pool used for object pinning, little to no buffer stealing will occur (depending on whether you actually have enough buffers in the pool to hold all the pages of the objects assigned to the pool), so why go with the more expensive page steal algorithm?

In a DB2 10 environment you have a new choice here: PGSTEAL(NONE). When that option is specified for a buffer pool, here's what happens:
  1. When an object assigned to a PGSTEAL(NONE) pool is first accessed, DB2 will, in the background, prefetch every page of that object into the pool (the application process that first accessed the object doesn't have to wait for this prefetch-all process to finish -- it gets the pages that it needs and goes on from there).
  2. For further access to the now-fully-in-memory object, the DB2 optimizer will assume that no I/Os will be required, and this assumption will factor into SQL statement access path selection.
  3. If in fact the pool does not have enough buffers to hold all of the pages of an object assigned to the pool, buffers will be stolen as needed based on the FIFO page steal algorithm.
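In command terms, the two approaches look like this (BP8 being a hypothetical pool to which pinned objects are assigned). For a DB2 V8 or DB2 9 system:

  -ALTER BUFFERPOOL(BP8) PGSTEAL(FIFO)

And in a DB2 10 environment:

  -ALTER BUFFERPOOL(BP8) PGSTEAL(NONE)

These are sketches, not prescriptions -- PGSTEAL(NONE) only makes sense when the pool is sized to hold all of the pages of the objects assigned to it.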

Use page-fixing of buffers appropriately. Since DB2 V8, you've had the option of fixing a pool's buffers in memory via the PGFIX(YES) option of the -ALTER BUFFERPOOL command. Page-fixing a pool's buffers (meaning that the page frames holding those buffers can't be stolen by z/OS) delivers CPU savings by eliminating the need to fix a buffer in memory and then release it every time the buffer receives a page read in from disk or provides a page to be written out to disk (those page fix and release actions become unnecessary when the buffers are always in a fixed-in-memory state). Page-fixing, then, can boost a DB2 system's CPU efficiency when it's used for pools that have a high level of I/O activity (and the focus here is usually on read I/O activity). If there are 10 or 20 read I/O operations per second for a buffer pool, fixing that pool's buffers in memory won't have much of an impact on CPU consumption. If the rate of read I/O activity for a pool is hundreds or thousands per second, page-fixing buffers can deliver a nice CPU savings benefit. You might think that it would be easier to specify PGFIX(YES) for all of a DB2 subsystem's buffer pools, but unless you're really awash in mainframe memory I wouldn't recommend that approach. I think it's a good idea to have some pools -- those with lower rates of read I/O activity -- for which buffers are not page-fixed. That approach provides z/OS with something of a safety valve -- stealable page frames holding DB2 buffers -- that can be utilized if the system memory resource becomes overtaxed.
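In case you haven't used it, the command form is simple (BP3 standing in for a hypothetical high-read-I/O pool):

  -ALTER BUFFERPOOL(BP3) PGFIX(YES)

Keep in mind that a PGFIX change is a pending change: it takes effect the next time the pool is allocated (for example, following a stop and restart of the DB2 subsystem), rather than immediately.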

You may have heard that DB2 10 will use 1 MB page frames (if available) for page-fixed buffer pools. That's true, and this feature of DB2 10 further boosts the CPU savings effect of fixing buffers in memory by improving the efficiency of virtual-to-real storage address translation (versus a 4 KB page frame usage scenario). Keep this in mind, though: DB2 9 can't use 1 MB page frames for buffer pools, whether page-fixed or not. What this means is that you don't want to configure an overly large portion of your z/OS system's real storage resource to be managed using 1 MB page frames (that's done via the LFAREA parameter of the IEASYSxx member of PARMLIB) until you've migrated to DB2 10 and are likely past the point of needing to fall back to DB2 9. If you want to be conservative, migrate to DB2 10 and THEN configure your z/OS system for 1 MB page frames.
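When you do get there, the idea is to size LFAREA to cover your page-fixed pools, with some headroom, and not a great deal more (1 MB frames are generally usable only by large-page exploiters, so over-allocating them wastes real storage). A hypothetical back-of-the-envelope calculation: two page-fixed 4K pools of 100,000 and 50,000 buffers occupy 150,000 x 4 KB, which is a bit under 600 MB, so an IEASYSxx specification along these lines might be reasonable (the value shown is purely illustrative, and note that changing LFAREA requires an IPL):

  LFAREA=768M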

Consider robbing Peter to pay Paul, if Peter has more than he needs and Paul doesn't have enough. As I mentioned up front, this first of my three-part memory-for-MIPS blog entry is focused on making the best use of the memory already allocated for your DB2 buffer pools. That doesn't mean that you can't make a pool larger -- you can, if you decrease the size of another pool by the same amount. Suppose that buffer pools BP2 and BP3 are each sized at 40,000 buffers in your system. If the read I/O rate for BP2 is 20 per second and the rate for BP3 is 2000 per second, I'd seriously consider taking a big chunk of BP2's buffers away and giving them to BP3 (for example, you might take BP2 from 40,000 down to 20,000 buffers, and take BP3 from 40,000 buffers to 60,000). Such a move might put a serious dent in the high read I/O rate seen for BP3 without substantially increasing the read I/O rate associated with BP2.
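A sketch of the commands for that example rebalancing (pool names and sizes are illustrative; VPSIZE is expressed in buffers):

  -ALTER BUFFERPOOL(BP2) VPSIZE(20000)
  -ALTER BUFFERPOOL(BP3) VPSIZE(60000)

A VPSIZE change for an active pool takes effect without a recycle of the DB2 subsystem, so you can make a move like this and then watch the read I/O rates to see whether it had the intended effect.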

As for these read I/O rates I've mentioned, you can get those from a DB2 monitor statistics detail report, or perhaps from a monitor's online display of buffer pool activity. You can also get this information from the output of the DB2 command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL (and I specify ACTIVE in the command to get information for pools that are actually being used). To get per-second figures using the command output, do the following:
  1. Issue -DISPLAY BUFFERPOOL(ACTIVE) DETAIL at the beginning of some busy hour of the day (referring to workload being processed by the DB2 subsystem). The output of this issuance of the command does not have to be retained.
  2. Issue -DISPLAY BUFFERPOOL(ACTIVE) DETAIL a second time, one hour after issuing the command the first time. Retain the output from this issuance of the command.
  3. A few minutes after issuing the command the second time, issue it a third time. All you want from the output of this issuance of the command is the timestamp value that appears in the DSNB409I part of the output (e.g., "INCREMENTAL STATISTICS SINCE 10:00:48 MAR 19, 2012"). That's the "to" time for the interval of activity captured in the second issuance of the -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command. The "from" time for this interval is the timestamp value found in the output of the second issuance of the command. Subtract the "from" timestamp value from the "to" value, and you have the number of seconds in the interval. Divide the read I/O figures in the output of the second issuance of the -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command by this number of seconds, and you have per-second rates of activity.
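A worked example, with made-up numbers: suppose the DSNB409I timestamp in the output of the second issuance of the command shows 09:00:30, the timestamp in the output of the third issuance shows 10:00:45, and the second issuance's output shows, for a given pool, 1,446,000 synchronous read I/Os and 5,784,000 prefetch read I/Os. The interval is 3,615 seconds, so the pool's total read I/O rate is (1,446,000 + 5,784,000) / 3,615 = 2,000 per second.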

Whatever the source of your numbers (DB2 monitor or DB2 command output), what you want, for each pool, is the TOTAL read I/O rate per second. That's the sum of synchronous read I/Os (random + sequential) and asynchronous read I/Os (sequential prefetch + list prefetch + dynamic prefetch). This rate is your guide to appropriate use of buffer page-fixing, and your measuring stick to gauge the impact of, for example, increasing VPSEQT for a work file-dedicated pool, or increasing the size of a high-read-I/O pool by adding buffers taken from a low-read-I/O pool. In part two of my three-part memory-for-MIPS blog entry, we'll revisit read I/O rates in the context of boosting the aggregate size of a buffer pool configuration. Look for that next entry to be posted in a week or two.

Monday, March 12, 2012

APAR PM29226 and DB2 for z/OS Native SQL Procedure Source Code Management

Sometimes an APAR (in IBM-ese, a description of a software functional enhancement or defect correction effected via application of a fix) is not what it -- at first -- seems. Take DB2 for z/OS APAR PM29226, for example. The opening sentence of the brief (just one paragraph) APAR text reads as follows: "This APAR provides sample job DSNTEJ67 which initiates the process of converting source for an external SQL procedure into source for a native SQL procedure." A mainframe DB2 person reading those words might understandably conclude that the APAR is not relevant to his environment if either a) his organization has not made use of external SQL procedure functionality (external SQL procedures being what stored procedures written in SQL had to be prior to the introduction of native SQL procedures with DB2 9 for z/OS), or b) external SQL procedures have been deployed, but converting those to native SQL procedures is not currently a priority. In other words, you might think that PM29226 is about external-to-native SQL procedure conversion, period.

Folks, I have news for you: REGARDLESS of whether or not your organization has developed external SQL procedures, if you are using -- or thinking about using -- native SQL procedures in your DB2 for z/OS system, you really ought to take a look at the changes wrought by PM29226. External-to-native SQL procedure conversion is NOT the primary raison d'etre of this APAR; rather, the process of converting an external SQL procedure to a native SQL procedure is used to ILLUSTRATE a more important aspect of PM29226 -- that being the introduction of a number of services intended to facilitate source code management (SCM) in a DB2 for z/OS SQL procedure context.

Some background here: native SQL procedures are a big deal -- for my money, easily the most important enhancement since stored procedure functionality was introduced in the mid-1990s with DB2 Version 4 (I've blogged a lot on the subject, starting with an entry posted to the blog I maintained while working as an independent DB2 consultant). Native SQL procedures are advantageous in a number of ways, but they also introduce some new issues regarding development and deployment. One of these has to do with source code management. Here's why: for years, SCM as it pertains to stored procedures was an external-to-DB2 matter because stored procedure programs were external-to-DB2 objects. People would develop stored procedure programs in languages such as COBOL and Java, and they would use their SCM tool of choice to manage the associated source code (even an external SQL procedure has an outside-of-DB2 source code component, as the procedure is turned into a C-language program with embedded SQL statements as part of the execution preparation process). Along came native SQL procedures, and for the first time (as one DBA I met put it), "there's only one piece." In other words, there's not the DB2 piece and its corresponding external-to-DB2 piece -- the latter being the COBOL or Java (or whatever) source module (or the load module if you're talking about the executable). Instead, for a native SQL procedure there's just the DB2 piece: the CREATE PROCEDURE statement is the source code, period (and the stored procedure's package is the executable, period). DB2 people (DBAs and developers of DB2-accessing programs) have been accustomed to dealing with programs, written in various languages, that contain SQL statements. When the SQL statement (in this case, CREATE PROCEDURE) is the program, that introduces a new wrinkle with respect to SCM.
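If you haven't yet seen one, here is a minimal sketch of a native SQL procedure (the names and the logic are entirely made up for illustration); the point is that this CREATE PROCEDURE statement IS the source code:

  CREATE PROCEDURE DEVSCHEMA.UPDATE_SALARY
    (IN P_EMPNO CHAR(6),
     IN P_PCT   DECIMAL(5,2))
    LANGUAGE SQL
  BEGIN
    UPDATE DEVSCHEMA.EMP
      SET SALARY = SALARY * (1 + P_PCT / 100)
      WHERE EMPNO = P_EMPNO;
  END

(If you run something like this through SPUFI or DSNTEP2, remember to change the SQL statement terminator to something other than the semicolon, since the procedure body contains semicolons of its own.)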

As native SQL procedures are still somewhat new in the mainframe DB2 world (they are available in a DB2 9 new-function mode environment, with DB2 10 in CM9 mode, or in DB2 10 NFM for organizations migrating to DB2 10 from DB2 V8), leading SCM tools tend not to provide explicit support for the management of program source when the source language is SQL Procedure Language (aka SQL PL, the language used to create DB2 native SQL procedures). That situation will be addressed as more and more organizations utilize native SQL procedures and request support in this area from their SCM tool vendors. In the meantime, the fix for PM29226 (UK71501 for DB2 10, and UK71502 for DB2 9) provides a means of bridging the SQL PL SCM gap.

To specifics: as previously mentioned, PM29226 modifies DB2 sample job DSNTEJ67. With the application of the fix for the APAR, DSNTEJ67 makes available to you a number of services, in the form of sample REXX routines, that support important SCM tasks applicable to the development and use of native SQL procedures. These tasks include source extraction, source transformation, and source deployment. [The new services, initially packaged in a macro library, are much easier to read when they have been expanded into members of a PDS. Step three of the PM29226-modified DSNTEJ67 job does this, using a temporary PDS. You might want to expand the REXX services into your own PDS, outside of the DSNTEJ67 job stream.] Among the services provided by PM29226 are the following:
  • SQLPLSRC extracts the SQL PL source of a procedure from the DB2 catalog. Two other services enable one to save extracted SQL PL source code to either a RECFM F data set (fixed-length records, one logical record per physical block on disk) or a RECFM V data set (one variable-length logical record, stored as one physical block). The choice of record format might depend on the type of input that is suitable for an existing SCM tool or process: in some cases that could be a file; in others, a string.
  • CHKSQLPL inspects and validates SQL PL source code, using the precompiler with the HOST(SQLPL) option.
  • SQLPLTRN transforms a procedure's SQL PL source -- changing, adding, or removing options (e.g., schema, QUALIFIER, SQL PATH, DEBUG MODE) as specified by a user; a brief illustration of this kind of transformation follows this list.
  • CRSQLPL deploys a SQL procedure, with the SQL PL input coming from a file or a string.
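To give you a feel for the transformation piece, here is the sort of before-and-after change an SQLPLTRN-style transformation might apply when a procedure is promoted from test to production. The option names (QUALIFIER, DEBUG MODE) are standard CREATE PROCEDURE options for native SQL procedures, but the procedure and the schema names are hypothetical -- this reuses the made-up UPDATE_SALARY procedure sketched in the earlier entry, with its schema qualifier changing from DEVSCHEMA to PRODSCHEMA as well:

  QUALIFIER DEVSCHEMA   -->  QUALIFIER PRODSCHEMA
  ALLOW DEBUG MODE      -->  DISABLE DEBUG MODE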
Check these services out, and see if they can be of use to you as an aid in performing source code management tasks for your DB2 native SQL procedures.