Sunday, February 28, 2016

DB2 for z/OS: When Did You Last Check Your Subsystem's DSMAX Value?

As a DB2 for z/OS system grows and evolves, the value of certain ZPARM parameters should be reviewed to ensure that they are still appropriate in the context of the current environment. Failure to do this can result in a situation in which the operational efficiency (or some other characteristic) of the system is negatively impacted. One of these ZPARMs that should be periodically revisited is DSMAX.

DSMAX specifies the maximum number of data sets (these would be associated with table spaces and indexes) that can be open and allocated to DB2 at one time. For a long time, the maximum value you could specify for DSMAX was 10,000. That changed, about 15 years ago, with DB2 Version 6, which allowed the DSMAX specification to be as high as 32,767. The upper bound of DSMAX was further increased, to 65,041, with DB2 Version 8. It went to 100,000 with DB2 9; and to 200,000 with DB2 10. The default value of DSMAX has also steadily increased: it was 3000 with DB2 Version 7; went to 9960 with DB2 Version 8; and to 20,000 with DB2 10.

The rise in maximum and default DSMAX values was warranted by the increase in the number of data sets associated with a typical DB2 system, and THAT phenomenon owes to a number of factors, including growth over time in the size of DB2 for z/OS-managed databases, more table spaces for a given number of tables (a universal table space can hold one and only one table), increasing use of DB2 for z/OS as a database management system for vendor-supplied applications (several of which are characterized by databases that contain a great many objects), more table space partitions (DB2 Version 8 introduced table-controlled partitioning, which allows a table to have up to 4096 partitions, versus a maximum of 254 partitions for an index-controlled partitioned table space), and more index partitions (DB2 Version 8 table-controlled partitioning enabled partitioning of more than one index on a partitioned table).

Here's why you should periodically check your DSMAX value and its effect on a DB2 subsystem's operational efficiency: when the number of data sets open and allocated to a DB2 subsystem reaches 99% of that subsystem's DSMAX specification, DB2 will physically close some open data sets (300 of them, or 3% of DSMAX, whichever is smaller). Which data sets will be closed? Well, DB2 will first look for data sets belonging to objects defined with CLOSE YES, and will select some of those for closing on a least-recently-used basis. If these data set close operations do not sufficiently reduce the number of open data sets, DB2 will close data sets belonging to objects defined with CLOSE NO, again making selections on a least-recently-used basis. Some level of this data set close activity is typically not a problem, but too much can negatively impact system performance. I've seen a number of situations in which an overly-high level of DB2 data set close activity has been overlooked (this can lead to more data set open activity than you'd like, as data sets closed for DSMAX reasons have to be re-opened when next accessed).
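
To make the arithmetic concrete, here is a minimal Python sketch of the two figures just described: the open-data-set count at which DB2 starts closing data sets (99% of DSMAX), and how many it closes each time (300, or 3% of DSMAX, whichever is smaller). The function name and the rounding down to whole data sets are assumptions made for illustration; they are not taken from DB2 documentation.

```python
from typing import Tuple


def dsmax_close_behavior(dsmax: int) -> Tuple[int, int]:
    """Return (open-data-set count that triggers closes, data sets closed per trigger)."""
    if dsmax <= 0:
        raise ValueError("dsmax must be positive")
    # Close processing starts when open data sets reach 99% of DSMAX.
    # Rounding down is an assumption of this sketch.
    trigger = (dsmax * 99) // 100
    # DB2 closes 300 data sets or 3% of DSMAX, whichever is smaller.
    closed = min(300, (dsmax * 3) // 100)
    return trigger, closed


if __name__ == "__main__":
    for dsmax in (3000, 20000):
        trigger, closed = dsmax_close_behavior(dsmax)
        print(f"DSMAX={dsmax}: closes begin near {trigger} open data sets, "
              f"roughly {closed} closed at a time")
```

Note that for any DSMAX value of 10,000 or more, the 300-data-set figure is the smaller of the two.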

Where can you go to get a handle on data set close activity in your DB2 environment? That's easy: just get your hands on a DB2-monitor-generated statistics long report (ideally, one that captures activity for a particularly busy hour of the day) or an online display of DB2 subsystem statistics, and locate the information pertaining to "open/close activity." There you will find a number of useful fields. Among these is the high-water mark for the number of open DB2 data sets. If you see there a figure that is close to the DSMAX value for the DB2 subsystem, that's telling you something. More important, take a look at the field labeled DS CLOSED-THRESH REACHED (or something similar -- different DB2 monitor products can have slightly different labels for the same field). That shows you the number of times that DB2 data sets were physically closed due to the DSMAX threshold being reached. If that figure indicates a data set close rate of more than 1 per second (e.g., if you see that data sets were closed due to the DSMAX threshold being reached more than 3600 times in a one-hour elapsed time interval), my recommendation would be to increase the value of DSMAX.
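
The rate check is simple division, sketched below in Python. The inputs are the DS CLOSED-THRESH REACHED count and the length of the reporting interval in seconds; the function names and the strict "greater than" comparison are illustrative assumptions.

```python
def close_rate_per_second(closes_in_interval: int, interval_seconds: int) -> float:
    """Average DSMAX-driven data set closes per second over a report interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return closes_in_interval / interval_seconds


def should_raise_dsmax(closes_in_interval: int, interval_seconds: int,
                       threshold_per_second: float = 1.0) -> bool:
    """True when the close rate exceeds the 1-per-second rule of thumb."""
    rate = close_rate_per_second(closes_in_interval, interval_seconds)
    return rate > threshold_per_second


if __name__ == "__main__":
    # 5400 closes in a one-hour statistics interval is 1.5 per second.
    print(close_rate_per_second(5400, 3600))
    print(should_raise_dsmax(5400, 3600))
```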

How high should the DSMAX value be for a DB2 subsystem? Well, a value high enough to get the rate of data set close operations to the single digits per minute (or even to zero) is nice, but there is such a thing as too high; see, the increases in DSMAX maximum and default values that I mentioned previously were enabled largely by z/OS enhancements that reduced the amount of below-the-line (referring to the 16 MB level) virtual storage required for each open data set. The below-the-line virtual storage needed for an open data set, while smaller than before, is not zero. The 200,000 maximum DSMAX value allowed in a DB2 11 system might best be thought of as a theoretical limit -- the actual maximum value that would work in your environment would likely be considerably less than 200,000. If your current DSMAX value is resulting in a higher-than-desired rate of data set close operations, consider bumping the value up by 1 or 2 thousand to see if that gets the close rate down to where you want it to be. Generally speaking, I'd say that a value of up to 30,000 for DSMAX should not be problematic. If you want to go beyond that figure, you probably first ought to get a good idea as to the availability of virtual storage below the 16 MB line for the DB2 system of interest. More information on calculating an appropriate DSMAX value can be found in the DB2 for z/OS Installation and Migration Guide.
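
The incremental approach can be sketched as a small Python helper: raise DSMAX by 1000 to 2000 at a time, and flag any proposed value above 30,000 as one that calls for a look at below-the-line virtual storage first. The constant and function names are hypothetical, and the 30,000 and 200,000 figures are simply the rule of thumb and maximum cited in this entry.

```python
from typing import Tuple

CAUTION_LEVEL = 30_000   # rule-of-thumb value above which to check storage first
HARD_MAX = 200_000       # maximum DSMAX value cited in this entry


def next_dsmax(current: int, step: int = 1000) -> Tuple[int, bool]:
    """Return (proposed DSMAX, whether a below-the-line storage check is advised)."""
    if not 1000 <= step <= 2000:
        raise ValueError("step should be in the range 1000 to 2000")
    proposed = min(current + step, HARD_MAX)
    needs_storage_check = proposed > CAUTION_LEVEL
    return proposed, needs_storage_check


if __name__ == "__main__":
    print(next_dsmax(20000, 2000))
    print(next_dsmax(29500, 1000))
```

After each increase, the close rate would be checked again before any further change is made.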

To sum up the message of this blog entry, I'd urge you to first check, using your DB2 monitor (I particularly like to use monitor-generated statistics long reports), the rate of data set close activity resulting from the DSMAX threshold being reached for particular DB2 subsystems at your site. If that close rate exceeds 1 per second, bump the value of DSMAX up by something in the range of 1000 to 2000. If you feel a need to take DSMAX above 30,000, proceed with caution and with, ideally, good knowledge of the availability of below-the-line virtual storage in the target system.

Stay on top of this going forward. Don't let "data set creep" cause an overly high level of data set close activity for your DB2 subsystems.

Sunday, February 21, 2016

Statement-Level Control of DB2 for z/OS Query Parallelization

I'm writing about this now because of the increased interest I've seen, at various DB2 for z/OS sites, in using DB2 query parallelization as a means of boosting zIIP engine utilization when -- and this often happens during nighttime hours -- workloads that typically drive consumption of zIIP MIPS (e.g., DB2 DDF transactions, or Java programs running in WebSphere Application Server for z/OS) diminish. During periods of heavy overnight batch processing, people might observe very high utilization levels for general-purpose engines in a z/OS LPAR, while zIIP engines in the system are not at all busy. It might be nice, in such cases, if one could shift some batch work from general-purpose to zIIP engines, but how might that be accomplished (and here I'm talking about batch programs written in a language such as COBOL, versus Java -- Java batch programs, like any Java programs in a z/OS environment, would be zIIP-eligible)? Getting some DB2 query parallelization going for batch programs would be a means of moving a portion of their processing from general-purpose to zIIP engines, as the "child tasks" associated with a parallelized query can have up to 80% of their processing handled by zIIP engines. Problem was, getting this done required, until somewhat recently, that a batch program's DB2 package be bound with DEGREE(ANY). Binding a package with that option makes ALL of the program's queries candidates for parallelization. What if you only want a particular statement in the package to be parallelize-able?

People contemplating utilization of query parallelization for dynamic queries faced a similar conundrum. They could set the value of the DB2 ZPARM parameter CDSSRDEF, which establishes the default value of the CURRENT DEGREE special register, to ANY, and that would enable parallelism for dynamic queries, but it would make ALL dynamic queries candidates for parallelization by default. That might be OK for a DB2 subsystem dedicated to an analytics workload, but it probably isn't the situation you want for a DB2 subsystem that supports high-volume transaction and/or batch applications. True, setting the value of CURRENT DEGREE to ANY prior to issuance of a dynamic query by a program, and then changing CURRENT DEGREE back to 1, would be a way to limit parallelization to a particular query, but changing the CURRENT DEGREE special register value is not always going to be feasible. Suppose a query is generated by an analytics tool? How would CURRENT DEGREE be set in that case? What if the dynamic query of interest is issued by an application purchased from a vendor? That's not a scenario that would lend itself to changing a special register like CURRENT DEGREE.
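
For a program whose SQL you do control, the set-and-reset approach amounts to a three-statement sequence. The Python sketch below only assembles that sequence as text; the function name is hypothetical, and actually executing the statements would go through whatever database interface the program already uses.

```python
from typing import List


def bracket_with_current_degree(query: str) -> List[str]:
    """Return the statements that limit parallelism candidacy to one dynamic query."""
    return [
        "SET CURRENT DEGREE = 'ANY'",  # make the next query a candidate for parallelization
        query,                         # the one query of interest
        "SET CURRENT DEGREE = '1'",    # switch parallelism candidacy back off
    ]


if __name__ == "__main__":
    for statement in bracket_with_current_degree("SELECT C1 FROM T1 WHERE C2 = ?"):
        print(statement)
```

The sketch also makes the limitation plain: it requires wrapping the query at its source, which is exactly what is not possible for tool-generated or vendor-supplied SQL.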

And I've just been talking here about making a query parallelize-able by DB2. What about the degree of parallelization? The PARAMDEG parameter in ZPARM allows you to control the degree to which a parallelized query can be split, but that's at the DB2 subsystem level. A low degree of parallelization (e.g., 4) might be appropriate if the aim of parallelization is getting some work shifted from general-purpose to zIIP engines, and there is a desire to keep the number of parallel tasks for split queries from getting very high, but would that be a good degree of parallelization if the objective is to get a query generated and submitted via an analytics tool to complete as quickly as possible? Would not a higher degree of parallelization better support that aim?

These formerly sticky problems -- being selective about parallelization for both static and dynamic queries, and having a right-for-the-moment degree of parallelization -- were very nicely addressed via a new catalog table, SYSQUERYOPTS, and related enhancements that were delivered with DB2 10 for z/OS. Although DB2 10 has been out there for more than five years, I've found that SYSQUERYOPTS escaped the notice of many a DB2 DBA. I'm telling you now: notice this. It gives you statement-level control over both parallelization and degree of parallelization.

How is this done? It's pretty easy: you create, for your user ID, a DSN_USERQUERY_TABLE (the DDL for this table, and descriptions of the table's columns, can be found online in the Knowledge Center for DB2 10 and DB2 11). Then you insert into that table a row for the query for which you want to provide values for parallelization (i.e., whether the query is a candidate for parallelization, and the degree of parallelization). This inserted row will include the text of the query that is to be parallelized. You could include the query text in the insert statement that populates DSN_USERQUERY_TABLE, but it's recommended that, for static SQL, you instead pull the statement text from the SYSPACKSTMT catalog table (an example of an INSERT statement that targets DSN_USERQUERY_TABLE and gets a query's text from SYSPACKSTMT can be found in the Knowledge Center for DB2 10 and DB2 11). Similarly, for a dynamic query for which you want to provide parallelism parameters, it's best to get the statement text from the DB2 dynamic statement cache (this text, for each statement in the cache, is placed in the DSN_STATEMENT_CACHE_TABLE when you issue EXPLAIN STMTCACHE ALL).

In addition to the query text, the row you insert into DSN_USERQUERY_TABLE will contain the value 'ANY' in the DEF_CURR_DEGREE column (indicating that the query is a candidate for parallelization by DB2) and some integer value in the MAX_PAR_DEGREE column (this is your specification of the maximum degree of parallelism for the query). You can also indicate that these parallelism specifications (i.e., that this query is a candidate for parallelism, and, if parallelized, can be split into as many as X number of child tasks) apply to the statement whenever it is issued, or only when it is issued in association with a certain package in a certain collection.

After inserting this row into your DSN_USERQUERY_TABLE, you would issue the BIND QUERY command, and that would cause rows to be inserted into the SYSQUERY and SYSQUERYOPTS catalog tables (another catalog table introduced with DB2 10, SYSQUERYPLAN, can also be populated via execution of the BIND QUERY command, but that's done when you want to provide an access path for a query, and that's not the subject of this blog entry).

After you've executed the BIND QUERY command (and you might issue the command a second time, with the LOOKUP(YES) option specified, to verify that the first issuance of the command populated SYSQUERY and SYSQUERYOPTS as expected), the parallelism options you selected for the target query will go into effect when the query is next prepared for execution. That will happen, for a static query, when the package of which the statement is a part is rebound, and for a dynamic query when the statement next goes into the dynamic statement cache.
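
The sequence of steps can be outlined with a short Python sketch that builds the text of the INSERT and lists the commands that follow it. Treat it as an illustration only. The DEF_CURR_DEGREE and MAX_PAR_DEGREE columns are the ones named above; the QUERY_TEXT, COLLECTION and PACKAGE column names are assumptions to verify against the DSN_USERQUERY_TABLE DDL in the Knowledge Center, and the real table has additional columns (some of which may be required) that are omitted here. The helper names are hypothetical, and the sketch supplies the query text inline rather than pulling it from SYSPACKSTMT or the statement cache as recommended.

```python
from typing import Dict, List, Optional, Union

Value = Union[str, int]


def parallelism_row(query_text: str, max_degree: int,
                    collection: Optional[str] = None,
                    package: Optional[str] = None) -> Dict[str, Value]:
    """Build an abbreviated DSN_USERQUERY_TABLE row for statement-level parallelism."""
    if max_degree < 1:
        raise ValueError("max_degree must be at least 1")
    row: Dict[str, Value] = {
        "QUERY_TEXT": query_text,      # assumed column name; verify against the DDL
        "DEF_CURR_DEGREE": "ANY",      # query is a candidate for parallelization
        "MAX_PAR_DEGREE": max_degree,  # maximum degree of parallelism for this query
    }
    if collection and package:
        # Optional scoping to one package in one collection (assumed column names).
        row["COLLECTION"] = collection
        row["PACKAGE"] = package
    return row


def sql_literal(value: Value) -> str:
    """Render an int as-is and a string as a quoted SQL literal."""
    if isinstance(value, int):
        return str(value)
    return "'" + value.replace("'", "''") + "'"


def insert_statement(row: Dict[str, Value]) -> str:
    """Return INSERT text for the row (column list deliberately abbreviated)."""
    columns = ", ".join(row)
    values = ", ".join(sql_literal(v) for v in row.values())
    return f"INSERT INTO DSN_USERQUERY_TABLE ({columns}) VALUES ({values})"


def workflow(row: Dict[str, Value]) -> List[str]:
    """Ordered steps: insert the row, bind it, then verify with LOOKUP(YES)."""
    return [insert_statement(row), "BIND QUERY", "BIND QUERY LOOKUP(YES)"]


if __name__ == "__main__":
    for step in workflow(parallelism_row("SELECT C1 FROM T1", 4)):
        print(step)
```

For a static statement, the final step would be a rebind of the package so that the new options are picked up.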

A couple more comments. First, for all this to work, the value of the OPTHINTS parameter in ZPARM has to be YES (the default value is NO). Second, in addition to providing a means of activating and controlling query parallelism at the statement level, the SYSIBM.SYSQUERYOPTS table (and SYSIBM.SYSQUERY and DSN_USERQUERY_TABLE and the BIND QUERY command) enables specification of the REOPT bind option and activation of star join functionality at the statement level.

That's it in a nutshell. Through the mechanism I've described herein, a static SELECT issued by a nighttime batch program might be parallelized and split 4 ways, while on the same DB2 system a particular dynamic query generated by an analytics tool is parallelized and split 20 ways when it is executed in the middle of the online day. Such fine-grained control of query parallelism -- determining the individual statements for which it can be used, and the associated degree of parallel processing -- could open up new use cases for parallelism in your environment. Give it some thought.