Friday, March 28, 2014

DB2 for z/OS People: Wearing Your "Applications" Hat Versus Your "Systems" Hat

Not long ago, a colleague of mine forwarded to me an e-mail message he'd received from a person who supports DB2 for z/OS at a US-based company. In this message the IT professional wrote of some DB2-accessing application programs that were running into lock contention issues with page-level locking (the default) in effect. He knew that going to row-level locking for the target tables (accomplished by altering the associated table spaces with LOCKSIZE ROW) would likely resolve the locking conflicts that were negatively impacting throughput and response time for those applications; however, he was reluctant to pursue this physical database design change because it would be a DB2 response to what he regarded as an application problem.
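
For reference, the physical design change in question is a one-line ALTER -- a minimal sketch, with hypothetical database and table space names:

   ALTER TABLESPACE MYDB.MYTS
     LOCKSIZE ROW;

(LOCKSIZE PAGE and LOCKSIZE ANY -- the latter being the default, with DB2 generally choosing page-level locks -- are what you'd be moving away from.)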

This person, in a well-meaning way, was wearing his "systems" hat. Wrong choice of figurative headwear, I'd say. It might have been the right hat to wear 10 or 20 years ago, but times have changed. These days, it is more important than ever for a mainframe DB2 person to reach first for his or her "applications" hat. Hereinafter I'll endeavor to explain the reasoning behind this contention of mine.

First, I have a "micro" response regarding the particular matter of page- versus row-level locking, and I believe that this will serve as a perfect lead-in to my take on the larger, "macro" issue which has to do with striking the proper balance between system- and application-focused thinking.

Row- versus page-level locking -- think about it

IBM invented relational database technology, and DB2 for z/OS was, in essence, the first commercial-grade, enterprise-level relational DBMS on the market (it was called DB2 for MVS back then). Page-level locking was initially the finest granularity available in a mainframe DB2 environment, and for years that worked just fine for all kinds of DB2-using organizations. In cases where lock contention problems arose with page-level locking in effect, programmers were often ready and willing to make application code changes that eliminated or at least minimized the occurrence of timeouts and/or deadlocks. On occasion, when an application code change was not feasible, a DB2 DBA would reduce effective lock granularity by limiting the number of rows that could be stored in a table space's pages (this via the MAXROWS option of ALTER TABLESPACE). The marching orders given to most DB2 for z/OS DBAs were, "Focus on making the DB2 workload as CPU-efficient as possible." DBAs heeded these orders, and so routinely wore their "systems" hat.
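
If you haven't seen the MAXROWS approach in action, it's a simple change -- hypothetical database and table space names in this sketch, and note that a REORG of the table space is needed to apply the new limit to existing rows:

   ALTER TABLESPACE MYDB.MYTS
     MAXROWS 1;

With one row per page, a page lock effectively locks a single row -- at the cost of more pages (and thus more disk and buffer pool space) for the same data.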

Time went by, and while DB2 for MVS (and then OS/390, and then z/OS) continued to grow in terms of installations, data volumes managed, and transaction volumes processed, relational database management software for distributed systems servers made the scene and eventually became quite popular. For these DBMSs, row-level was (and still is) the default locking granularity. Page-level locking continued to be (and still is) the default for DB2 for z/OS, but a new index structure (the type 2 index) introduced with DB2 Version 4 made row-level locking an option for the mainframe platform.

Relational database technology evolved, and so did the ways in which people used it. An important development was the rise of vendor-supplied -- versus in-house-written -- application software. Companies producing this software understandably sought to develop code that would be suitable for use with a wide range of database management systems. That meant designing with row-level locking in mind -- the default in most of the relational database world.

The rise of vendor-supplied applications, of course, didn't mean the end of in-house application development. Organizations continued to build their own apps to address unique demands and opportunities in their respective markets, but in this sphere there was a pronounced shift towards multi-tiered, client-server applications and away from more monolithic application architectures. People developing these applications favored the use of data access interfaces, such as JDBC and ODBC, that worked across many different relational DBMSs. Developing in such a DBMS-agnostic way, programmers quite naturally gravitated towards coding with the most widely-used database locking granularity -- row-level -- in mind.

Fast-forward, then, to the present day. You have row-level-locking-using vendor-supplied applications, and you have in-house developers writing code with an implicit assumption of row-level locking granularity, and you have people who want to pair these purchased and in-house-written applications with DB2 for z/OS (which a friend of mine likes to call the "super-server"), and you have DB2 DBAs and systems programmers looking sternly out from under their "systems" hats and telling these people, "Row-level locking doesn't deliver optimal CPU efficiency. Come back to me after you've changed that application to work well with page-level locking."

EXCUSE ME? There are people coming to you, bearing applications that will drive workload growth on the platform you support, and you're giving them the cold shoulder because the preferred (and maybe required, absent code changes) locking mode might cause DB2-related CPU consumption to be a little higher than it otherwise would be? I have a different response to recommend. How about, "Row-level locking? No problem. Of course we can do that. We've been able to do that for about 20 years now. Works like a champ. Let's talk about setting up a testing environment for the application, and let's start putting a production implementation plan together."

On a macro level: technology enables applications

DB2 for z/OS features all kinds of great technology, and that story gets better with each new release of the software, but guess what? That advanced technology is valuable to an organization to the extent that it facilitates and accelerates the development of applications that make the organization money (enabling new revenue streams, upping business volumes from existing clients, growing the customer base, etc.) or reduce costs (better inventory management, reduced customer churn, reduction of fraud-related losses, etc.), or both. Your mindset as a DB2 support person, then, should be about rolling out the red carpet for application developers (and for those interested in storing the data related to vendor-supplied applications in a DB2 for z/OS database). You should aim to make DB2 a preferred DBMS for your organization's programmers, and you don't get there by making those folks do things differently than they do on other data-serving platforms with which they work.

When you wear your "applications" hat, you think "accommodate" more than "challenge" when working with programmers. Supporting row-level locking is just one example of such accommodation. Another example would be creating an index on a column expression in order to make an otherwise non-indexable predicate indexable (and stage 1). Maybe you utilize concentrate-statements-with-literals functionality (introduced with DB2 10) to have DB2 parameterize dynamic SQL statements that were coded with literal values, when application-side statement parameterization is not feasible. Maybe you create an "instead of" trigger to allow programmers to code data-changing SQL statements targeting what would otherwise be a read-only view. Maybe you create a user-defined function that replicates the capability of a function that was useful to a development team when they worked with some other DBMS. Maybe you define a check constraint for a table to help ensure that only certain data values can be inserted into a column -- this so that programmers won't have to code that data domain-restricting logic themselves. There are many, many more ways in which you can use DB2 functionality to make life easier for application developers -- I've just scratched the surface here. Put on your "applications" hat and get creative and be proactive -- go to programmers with ideas, instead of just waiting for them to bring problems and challenges to you.
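
To make a couple of those accommodations concrete, here's what two of them might look like in DDL -- hypothetical table, column, index, and constraint names, so a sketch rather than a recipe:

   -- An index on a column expression, which can make a predicate
   -- like WHERE UPPER(LASTNAME) = 'SMITH' indexable (and stage 1):
   CREATE INDEX EMPIX1
     ON EMP (UPPER(LASTNAME));

   -- A check constraint that puts a data domain rule in the
   -- database, so programmers don't have to code it themselves:
   ALTER TABLE EMP
     ADD CONSTRAINT STATUS_CK CHECK (STATUS IN ('A', 'L', 'T'));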

Now, reaching first for your "applications" hat doesn't mean throwing your "systems" hat away. Enhancing the CPU efficiency of the DB2 for z/OS workload at your site is a very worthwhile pursuit. That said, you should think about actions that could reduce per-transaction CPU consumption without requiring application code changes. Tuning steps of that nature include page-fixing buffer pools, combining persistent threads with RELEASE(DEALLOCATE) packages, and, in some cases, hash-organizing data in a table. Shoot, just rebinding packages when you go to a new release of DB2 can deliver significant CPU savings for your organization.
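
A quick sketch of two of those no-code-change tuning actions -- hypothetical buffer pool, collection, and package names:

   -ALTER BUFFERPOOL(BP8) PGFIX(YES)

   REBIND PACKAGE(MYCOLL.MYPKG) RELEASE(DEALLOCATE)

(The PGFIX change takes effect the next time the buffer pool is allocated, and RELEASE(DEALLOCATE) delivers its CPU benefit when paired with persistent threads, such as CICS protected entry threads.)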

DB2 is a tool. When people approach you about using that tool to enable a new in-house-developed application, or to support a vendor-supplied application, you should be delighted, not bothered. Such requests give you the opportunity to show what DB2 for z/OS can do as a data server. Consider DB2 features and functions while wearing your "applications" hat, and you'll be that much more effective in leveraging DB2 technology for the benefit of developers and in providing an ideal data-serving system for vendor-supplied applications. DB2 for z/OS really is the "super-server" -- tops in availability, scalability, and security. Don't stand between it and your organization's applications folks. Be a bridge, not a barrier.

Saturday, March 1, 2014

DB2 for z/OS SQL PL Routines: REBIND PACKAGE vs. ALTER REGENERATE

I have long been a big fan of DB2 for z/OS native SQL procedures, but I also recognized early on that putting this technology to use would require DB2 DBAs to learn some new things. Much of this newness has to do with the fact that for a native SQL procedure (and for its more recently arrived cousin, the "native" SQL scalar function, also known as a compiled SQL scalar function or a non-inline SQL scalar function), there is no external-to-DB2 executable -- no object module, no load module. There is just the SQL PL routine's package (SQL PL is the language in which native SQL procedures and non-inline SQL scalar functions are written). The package is the executable, period. That being the case, managing these packages is a different game versus managing a package that is tied to (for example) the external-to-DB2 load module of a COBOL stored procedure program.

Think about making a package "active" in a DB2 for z/OS system. In the context of a load module-tied package, that's familiar territory: you bind the package into the target DB2 environment, and it doesn't get "activated" (i.e., it's not used) until the associated load module (batch job, CICS transaction, external stored procedure, whatever) is executed in that environment. But what if the package itself is the sole executable (the case, as previously mentioned, for a native SQL procedure or non-inline SQL scalar function)? What then? Well, that's what the DEPLOY option of BIND PACKAGE, and the ACTIVATE VERSION option of ALTER PROCEDURE and ALTER FUNCTION, are for, as I pointed out in a blog entry (written while I was working as an independent consultant) that I posted back in 2009. In the blog entry I'm writing right now, I want to provide you with information about another new wrinkle associated with SQL PL routines: "full" versus "partial" package bind.
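
Just to make those two mechanisms a bit more concrete, here's a minimal sketch, with hypothetical location, schema, procedure, and version names: a BIND PACKAGE with DEPLOY that copies a tested version of a native SQL procedure's package to a production location, followed by an ALTER PROCEDURE that makes that version the active one:

   BIND PACKAGE(PRODLOC.MYSCHEMA) DEPLOY(MYSCHEMA.MYPROC) COPYVER(V2) ACTION(ADD)

   ALTER PROCEDURE MYSCHEMA.MYPROC ACTIVATE VERSION V2;

Now, back to that "full" versus "partial" business.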

"Say what?" you might be thinking. "You don't 'partially' bind a package. You bind the whole thing." That WAS true before SQL PL routines came along (and with "SQL routines" I am NOT referring to external SQL procedures, which are written in SQL PL but become C language programs with embedded SQL DML statements in the course of being prepared for execution). When the package in question is related to a native SQL procedure or a non-inline SQL scalar user-defined function (UDF), a REBIND PACKAGE operation in fact rebinds only part of the package. To understand why this is so, consider that the package for a SQL PL routine has a section that corresponds to the SQL "control" statements in the routine (i.e., the logic flow control statements such as IF, WHILE, ITERATE, and LOOP), and another section that pertains to the SQL DML statements (e.g., SELECT, INSERT, UPDATE, DELETE) in the native SQL procedure or non-inline SQL scalar UDF. Generally speaking, a REBIND PACKAGE is executed so as to drive reoptimization for SQL DML statements. If, say, a new index is added to a table to provide a better-performing access path for a static SQL statement embedded in a transaction program, DB2 will not use the new index in executing the statement unless the package of which the statement is a part is rebound. But what if the package is, essentially, the compiled form of a native SQL procedure or a non-inline SQL scalar UDF? In that case, if you want to drive reoptimization of DML statements included in the SQL procedure or UDF, you don't necessarily want the control statements in the SQL PL routine reworked, and so REBIND PACKAGE for a SQL PL routine leaves the control-statement section of the package alone and rebinds only the DML section.

What if you DO want a SQL PL routine's package to be rebound in its entirety? That can be done by way of an ALTER PROCEDURE (or ALTER FUNCTION) statement with the REGENERATE option specified. The effect of that statement's execution will be a rebinding of the whole of the package -- the SQL control statements as well as the DML statements. Why might you want to use ALTER PROCEDURE (or ALTER FUNCTION) with REGENERATE versus REBIND PACKAGE for the package associated with a SQL PL routine? One reason would be to rebind the control statements of a routine so as to get the advantage of a SQL PL performance enhancement (an example of such an enhancement would be the reduced path length of IF statement execution delivered with DB2 10 for z/OS). Here's another reason to use ALTER PROCEDURE (or FUNCTION) with REGENERATE: in a DB2 10 (or later) environment, it gets the bulk of the control section of the package, as well as the DML section, copied into above-the-bar virtual storage in the DB2 database services address space (DBM1) when the package is executed (i.e., when it's allocated to a thread). If a native SQL procedure created in a DB2 9 system is not either regenerated (via ALTER PROCEDURE with REGENERATE) or recreated in a DB2 10 (or later) system, the control section of the procedure's package will occupy more below-the-bar virtual storage when allocated to a thread.
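
Sticking with the hypothetical procedure sketched above, a full regeneration of the currently active version would look like this:

   ALTER PROCEDURE MYSCHEMA.RAISE_PAY
     REGENERATE ACTIVE VERSION;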

Note that the REBIND PACKAGE option APREUSE, which tells DB2 to reuse existing access paths (when possible) for SQL DML statements when a package is rebound, cannot be specified on an ALTER PROCEDURE (or FUNCTION) statement; so, when you execute ALTER PROCEDURE (or FUNCTION) with REGENERATE, access path changes for SQL DML statements in the procedure or function are a possibility.
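
In other words (and again with the hypothetical names used above), you can do this:

   REBIND PACKAGE(MYSCHEMA.RAISE_PAY.(V1)) APREUSE(ERROR)

but there is no equivalent access-path-preservation option for ALTER PROCEDURE (or FUNCTION) with REGENERATE.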

Now you know (if you didn't already): with respect to a package associated with a native SQL procedure or a non-inline SQL scalar UDF, use REBIND PACKAGE when all you want to do is rebind the part of the package pertaining to the SQL DML statements in the procedure or function. To rebind all of the package's statements -- control statements as well as DML statements -- use ALTER PROCEDURE or ALTER FUNCTION with the REGENERATE option.