Friday, December 28, 2012

DB2 for z/OS: Clearing Up a Native SQL Procedure Misconception

A few days ago I received a question that I believe reflects a rather widely held misconception about DB2 for z/OS native SQL procedures. In the interest of clearing the air, I'll provide in this entry the information that I communicated to my question-asking friend.

Native SQL procedures were introduced with DB2 9 for z/OS in new-function mode (or with DB2 10 in new-function mode, for those organizations that migrated or are migrating to DB2 10 from DB2 V8). Native SQL procedures changed the DB2 for z/OS stored procedure game in a big way, as I pointed out in an entry posted several years ago to the blog I maintained while working as an independent DB2 consultant. One of the important differences between a native SQL procedure and an external stored procedure (such as a COBOL stored procedure) is what I'd call the execution locale. Whereas an external stored procedure executes at least partially in a WLM-managed stored procedure address space (and I'll explain that "partially" in a moment), a native SQL procedure executes entirely in the DB2 database services address space (also known as DBM1). That difference prompted the question to which I referred in my opening: to what address space is the CPU time associated with execution of a native SQL procedure charged?

The person who asked this question was thinking that because a native SQL procedure executes in the DB2 DBM1 address space, the CPU cost of executing a native SQL procedure would show up in the CPU consumption of DBM1. While I can see how one could arrive at that conclusion, it is not correct. The CPU cost of executing a native SQL procedure is in fact charged to the address space through which the stored procedure call came to DBM1; so, if a native SQL procedure is called by a network-attached DRDA requester, the CPU cost of executing the native SQL procedure will be charged to the DB2 DDF address space. If the caller is a CICS transaction program, the CPU cost of executing a called native SQL procedure will be charged to the CICS region in which the transaction executes.

If you think about it, this makes sense. There are two key factors to keep in mind here. First, though a native SQL procedure executes in the DB2 database services address space, it executes under the task of the application process that issued the stored procedure call (e.g., a DDF enclave SRB or a CICS subtask TCB or a batch address space's TCB). This is true of any SQL statement: the statement executes in DBM1, but under the task of the statement-issuing process, and associated CPU consumption is charged back to the address space through which the statement came to DBM1. [There is a slight exception to this rule: if a SQL statement is issued by an external stored procedure, the CPU cost of executing the statement will be charged to the address space through which the stored procedure call came to DBM1, as opposed to being charged to the WLM-managed address space in which the SQL-issuing stored procedure runs.]

The second key factor is this: a native SQL procedure is comprised wholly of SQL statements. Some of these statements will likely be data manipulation statements (e.g., SELECT, INSERT, DELETE, UPDATE), and some will likely be "logic" statements (e.g., LOOP, IF, WHILE), but it's all SQL. Yes, a native SQL procedure executes in the DBM1 address space, but SQL statements ALWAYS execute in the DBM1 address space. That's why I stated earlier that an external DB2 stored procedure generally executes "partially" in a WLM-managed stored procedure address space: if that external DB2 stored procedure issues SQL statements (and a stored procedure program doesn't have to issue SQL statements), those SQL statements will execute -- as is true of all SQL statements -- in the DB2 DBM1 address space; so, it could be said that the external stored procedure will execute partially in its assigned WLM-managed stored procedure address space (for execution of program code other than SQL statements) and partially in the DB2 DBM1 address space (for execution of SQL statements issued by the stored procedure program). A native SQL procedure executes entirely in DBM1 because it is comprised entirely of SQL statements. And where does the cost of SQL statement execution show up? In the address space through which the SQL statements come to DBM1 (with the aforementioned slight exception related to external stored procedures). And that's why the CPU cost of native SQL procedure execution gets charged to the address space through which the associated CALL statement comes to DB2.

I hope that this information will be useful to you. Best wishes to all for a successful 2013.

Monday, December 17, 2012

DB2 for z/OS: Clone Tables and "Active Versioning"

A DB2 for z/OS DBA contacted me a few days ago and posed a seemingly simple question, the answer to which ended up being a little more involved than might be expected. The DBA's query: how can one alter a table that has a clone?

The specific need in this case was to add a column to a table for which a clone had been created. As is true of other ALTER TABLE operations, addition of a table column cannot be accomplished while the clone relationship exists. The in-a-nutshell solution is to drop the clone table (via ALTER TABLE DROP CLONE), perform the desired alteration of the base table (such as ALTER TABLE ADD COLUMN), and then re-create the clone table (ALTER TABLE ADD CLONE). The thing is, the aforementioned DBA was doing this, and it wasn't working. In this blog entry, I'll explain why that was so, and how we determined the procedural steps required for success.

First, a brief level-set regarding DB2 clone tables: this refers to a feature, introduced with DB2 9 for z/OS in new-function mode, whereby one can create a "clone" of a DB2 table. If you have a base table named TBX, an associated clone could be created with a statement that would look like this:

ALTER TABLE TBX
ADD CLONE TBX_CLONE;

[You don't have to have the word "clone" in the name of the clone table, but having such an identifier is helpful when it comes to accessing a clone table -- it helps to ensure that you're accessing a clone, rather than the base table, when clone access is what you want.]

The clone table created by the statement above would look, in a structural sense, just like the base table: same columns, in the same order, with the same names and data types; same indexes; same check constraints (if any); etc. The one difference between the clone and the base table (other than the table name) is the data in the two tables, and this difference is the essence of a clone table's raison d'etre. Initially, a clone table is empty (and it resides in another "instance" of the base table's table space, distinguished by the fifth-level qualifier in the name of the underlying data set, which at first will be I0002 for the clone, versus I0001 for the base). After being created, the clone table can be populated in the usual way: through SQL inserts or the LOAD utility. Adding data to, and changing data in, the clone table has no effect on the users and programs accessing data in the base table. Once the clone data contains data as desired, that data can be effectively "switched" with the data in the base table -- quickly and programmatically -- by way of the SQL statement EXCHANGE DATA. Users and applications continue to access table TBX (to use my example table-name), but now the data they're accessing is the data that -- before the "switch" -- was the data in TBX_CLONE. In effect, through the EXCHANGE DATA statement, the table that had been the clone in the relationship becomes the new base table and takes on the base table's name, while the table that had been the base table becomes the new clone and takes on the name of the clone table. To learn a little more about what happens when EXCHANGE DATA is executed, you can check out an entry that I posted a few years ago to the blog that I maintained while working as an independent DB2 consultant.

OK, back to the question asked by the DBA. He knew that an ALTER TABLE table statement (other than an ALTER TABLE DROP CLONE) could not be successfully executed for a target table with a clone; so, he dropped the clone and successfully executed an ALTER TABLE ADD COLUMN statement. After doing that, however, the DBA was not able to re-create a clone for the table -- the ALTER TABLE ADD CLONE statement failed with a -148 error code, reason code 9, indicating that the ALTER failed because "active versioning" was in effect for the table.

DB2 uses table space versioning to ensure data integrity and preserve data availability through database schema changes effected by way of ALTER TABLE statements. A new version for a table space is created by DB2 when (for example) a new column is added to a table in the table space via ALTER TABLE ADD COLUMN, and the table space is placed in an advisory REORG-pending state (AREO*). This state will be removed, and the associated schema change will be applied (i.e., physically implemented) when the table space in question is reorganized. You might think that versioning would no longer be "active" for the table space following a reorganization, but post-REORG, the ALTER TABLE ADD CLONE statement to which I've referred still didn't work for the DBA who contacted me for help: again, the error code received was -148, condition code 9 (failure due to active versioning).

Faced with this situation, the DBA ran the REPAIR utility for the affected table space, with the VERSIONS option. Well, that reset the version information for the table space in the SYSIBM.SYSTABLESPACE catalog table (the values of the OLDEST_VERSION and CURRENT_VERSION columns for the table space's row in SYSTABLESPACE were both changed to 0, indicating that versioning had NEVER been active for the table space), and after that the ALTER TABLE ADD CLONE statement could be successfully executed; however, there was a problem: the column added to the table was of the type INTEGER WITH DEFAULT NULL, but the null indicator was not set for the column (a SELECT for the table with an IS NULL predicate referencing the new column returned no results). On top of that, the value sometimes found in the new column (which was supposed to have NULL as its default value) was -2147483648.

That's pretty weird, but weird things can happen when you use REPAIR to "turn off" indicators (such as active versioning) that DB2 uses for data integrity purposes. To be sure, REPAIR has its place as a tool for DBAs, and indeed it is sometimes the ONE way to address an issue impacting data availability. Still, it's a utility with which you can shoot yourself in the foot, and when I hear of a situation in which REPAIR is used to "fix" a DB2 database problem, I am always motivated to find a means by which the problematic object indicator can be "turned off" or reset through actual removal of the circumstances prompting DB2 to set said indicator, versus sledgehammering the indicator with REPAIR. The presence of "active versioning," as indicated by non-equal values for OLDEST_VERSION and CURRENT_VERSION for a table space's row in SYSIBM.SYSTABLESPACE, is addressed by two actions: reorganizing the table space that is in AREO* status and deleting from SYSCOPY any records associated with image copies of the table space that were taken when a version other than the table's current version was in effect. My DBA friend took these two steps for the table space holding the altered table (image copy records associated with older versions of the table space were removed from SYSCOPY via execution of the MODIFY RECOVERY utility), and everything then worked as he wanted it to: a clone could be created for the the altered table, and there were none of the unexpected data retrieval results for the newly added column that he'd seen after using REPAIR to force a resetting of the table space's version number.

So, if you have a table with a clone, and you want to alter that table in some way (e.g., by adding a column to the table), do the following:
  1. Drop the clone table with ALTER TABLE DROP CLONE.
  2. Alter the table as needed (e.g., issue an ALTER TABLE ADD COLUMN statement).
  3. REORG the table space to apply the schema change specified in the preceding ALTER TABLE statement, and take a full image copy in the process (an inline copy is taken anyway if you run an online REORG). The REORG will ensure that only one version is associated with the data in the table space, and the full image copy will allow you to delete older image copy records (those taken when prior versions were in effect for the table space).
  4. Run MODIFY RECOVERY for the table space, with a specification that will remove from SYSCOPY records for image copies taken when a prior version was in effect for the table space. A specification of RETAIN LAST(1) should preserve the record for the image copy taken during the REORG (step 3 above) while deleting records for older image copies. Note that this is a special case: it's generally recommended that you retain SYSCOPY records for at least the two most recent full image copies of a table space. So, after running this MODIFY RECOVERY job with RETAIN LAST(1), for subsequent SYSCOPY "housekeeping" go back to a MODIFY RECOVERY specification that will retain records for two or more full image copies of the table space.
  5. With versioning no longer "active" for the table space, you can issue an ALTER TABLE ADD CLONE statement to create a clone for the table you altered in step 2.

And leave REPAIR out of this picture -- you don't need to run it to be able to create a clone for an altered table.