
Planning Your ASE 15.0 Migration

Tips, Tricks, Gotcha's & FAQ

ver 1.0



Table of Contents

Introduction & Pre-Upgrade Planning
High Level Upgrade Steps for ASE 15
General ASE 15 Migration FAQ
SySAM 2.0 Installation & Implementation
SySAM 2.0 Background
SySAM 2.0 Implementation Steps
ASE 15 & SySAM FAQ
Preparing for the Upgrade
Review Database Integrity
Find all currently partitioned tables
Increase HW Resource Requirements
Preparing for Post-Upgrade Monitoring for QP Changes
Review Trace Flags
Run Update Index Statistics
Installation & Upgrade
Installing the Software
Alternative to Dump/Load for ASE 12.5.1+
Cross Platform Dump/Load
Installation & Upgrade FAQ
Partitioned Tables
Partitions & Primary Keys/Unique Indices
Global vs. Local Indexes
Multiple/Composite Partition Keys & Range Partitioning
Semantic Partitions & Data Skew
Partitioned Tables & Parallel Query
Partitioning Tips
Dropping Partitions
Creating a Rolling Partition Scheme
Partitioning FAQ
Query Processing Changes
Query Processing Change Highlights
Determining Queries Impacted During Migration
Diagnosing and Fixing Issues in ASE 15.0
Query Processing FAQ
Storage & Disk IO Changes
Very Large Storage Support
DIRECTIO Support & FileSystem Devices
Tempdb & FileSystem Devices
Changes to DBA Maintenance Procedures
Space Reporting System Functions
Sysindexes vs. syspartitions & Storage
VDEVNO Column
DBA Maintenance FAQ
Update Statistics & datachange()
Automated Update Statistics
Datachange() Function
Update Statistics Frequency and ASE 15
Update Statistics on Partitions
Needs Based Maintenance: datachange() and derived_stats()
Update Statistics FAQ
Computed Columns/Function-Based Indices
Computed Column Evaluation
Non-Materialized Computed Columns & Invalid Values
Application and 3rd Party Tool Compatibility
#Temp Table Changes
3rd Party Tool Compatibility
Application/3rd Party Tool Compatibility FAQ
dbISQL & Sybase Central
dbISQL & Sybase Central FAQ
Appendix A - Common Performance Troubleshooting Tips


Introduction & Pre-Upgrade Planning

ASE 15.0 is a significant upgrade for Sybase customers. In addition to the new features that are available, the query processor has been completely re-written. Consequently, customers should expect to spend more time testing their applications for ASE 15.0. This is a living document that attempts to highlight known tips, tricks and problem areas to help customers prepare for the upgrade, and in particular to highlight how some of the new features may help speed the upgrade process (for example, using sysquerymetrics to isolate regression queries).

The purpose of this document is to provide a comprehensive look at the new features of ASE 15.0, focusing on either their application impact or how they affect migrations. It is not intended to be a complete migration guide providing step-by-step instructions for migrating existing systems and applications to ASE 15; that is the focus of the official ASE 15.0 Migration Guide. Instead, the goals of this document are to:

• Document the steps application developers and DBAs will need to take prior to migration.

• Document system changes that will impact maintenance procedures, monitoring scripts, or third-party tools.

• Document the new features of ASE 15.0 that customers are expected to adopt early in the migration cycle, along with any migration tips or gotchas that may be of assistance.

• Document the tools included with ASE 15.0 that can be used to facilitate migration.

The rationale for this document is that the official Migration Guide initially treated all of these items lightly and instead focused on documenting testing procedures. While an attempt was made to add some of this document's content, publication issues resulted in the content being altered in such a way that important points were lost or the context was changed, in some cases resulting in significant inaccuracies. As a result, where the content in this document overlaps the migration guide, this document should be considered to supersede the ASE 15.0 Migration Guide. Note that this guide covers only application impact and migration considerations; guidance on the upgrade procedures themselves can still be found in the ASE 15.0 Migration Guide.

This document is organized into the following sections:

• Implementing SySAM 2.0
• Preparing for the Upgrade
• Installation and Upgrade
• System Table & System Function Changes
• IO Subsystem Changes
• Updating Statistics & datachange()
• Semantic Partitions
• Computed Columns/Function-Based Indices
• Query Processing
• dbISQL & Sybase Central

In each section, we will attempt to provide as much information as possible to assist DBAs in determining which features to implement, the best strategies for implementing them, and the merits of making the changes.

An important point is that this document assumes you are upgrading from ASE 12.5.x; this assumption is based on the fact that ASE 12.0 and earlier releases have already reached the end of their support cycle. Additionally, some of the recommendations may be made obsolete by later Interim Releases (IRs) of 15.0 such as 15.0.2 (planned for a Q2 2007 release). This document is based on ASE 15.0.1 ESD #1, which was released in early November 2006.

High Level Upgrade Steps for ASE 15

While the ASE 15.0 Migration Guide contains much more detailed migration steps, this document would not be complete without pointing out some of the highlights and the specific differences between ASE 15.0 migrations and previous releases that really need to be discussed.

Plan Your SySAM 2.0 License Management Architecture

ASE 15.0 is the first product in the Sybase family to introduce version 2.0 of the SySAM technology. There are several options for controlling license management, including unserved file-based licenses. However, for the most flexibility in controlling licenses, ease of product installation, etc., most Sybase customers will likely want to use a license server. It is important that you determine how you are going to manage your licenses prior to installing the software for the first time, as you will be prompted for license host information during the software installation.

Determine Your Upgrade Path

Generally, most DBAs are very familiar with ASE migrations and have long-established migration procedures for their installations. This section, however, details two important aspects that may affect those procedures.

Alternative to Dump/Load

In the past, there have generally been only two upgrade paths: upgrade in place, or dump/load. Because of the mount/unmount feature added to ASE 12.5, there is now a third method that is likely considerably faster than dump/load for customers using SAN disk technology.

1. Quiesce the database using a manifest file
2. Copy/move the devices along with the manifest file to the 15.0 host
3. Mount the database using the "mount database" command
4. Bring the database online

It is during the online step that the database upgrade is performed. Note that this technique is only supported between machines from the same vendor and platform; cross-platform moves are not supported (you must use the XPDL feature for that).
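As a rough sketch of these steps (the database name, quiesce tag and file paths are placeholders; consult the mount database documentation for device-mapping options if the device paths differ on the new host):

-- On the 12.5.x source server: quiesce and generate the manifest
quiesce database upgrade_tag hold mydb for external dump
    to "/san/manifests/mydb.manifest"

-- Copy or re-present the devices plus the manifest file to the 15.0 host,
-- then release the source: quiesce database upgrade_tag release

-- On the ASE 15.0 server: mount the database, then bring it online
-- (the internal upgrade runs during the online step)
mount database all from "/san/manifests/mydb.manifest"
online database mydb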

Reduced 32-bit Platform Availability

In addition, ASE 15.0 is released as a 64-bit application on most platforms, with 32-bit versions available only for Windows, Linux and Solaris. If you are running a 32-bit version of Sybase ASE, particularly on HP or AIX, you will need to make sure you are running a recent 64-bit version of the operating system. The rationale for this change is that most of the Unix OS vendors have fully implemented 64-bit versions of their operating systems and have, in most cases, discontinued support for their 32-bit versions. This lack of support, as well as the sometimes degraded performance of 32-bit applications on 64-bit OSs, has made it impractical for Sybase to continue providing 32-bit binaries for these platforms. For a full list of which 64-bit versions and patches are required for your platform, consult the ASE certification page at http://certification.sybase.com/ucr/search.do.

If you are upgrading from a 32-bit to a 64-bit release as a result, and take advantage of the additional memory offered, you may want to observe the application carefully through the MDA table monOpenObjectActivity. With increased memory availability, a table scan that previously caused physical IO may now fit completely in data cache and be all logical IO. Because they no longer need to yield the CPU to wait for physical IO, processes performing in-memory table scans will use their full time slice, yield the CPU and then immediately jump back onto the runnable queue. The result of this behavior is an immediate and dramatic increase in CPU utilization (sometimes near-constant 100% utilization) and a resulting drop in application performance. Table scans on application tables can be identified by a query similar to:

select *
from master..monOpenObjectActivity
where DBID not in (1, 2, 3)   -- add additional tempdb dbid's if using multiple tempdbs
  and UsedCount > 0
  and IndexID = 0
order by LogicalReads desc, UsedCount desc

The reason for excluding the tempdb(s) is that table scans of temporary tables are common for no other reason than that such tables are typically small and usually lack indexes.

Prepare for the Upgrade

After determining how you are going to upgrade, you will then need to prepare for the upgrade itself. Given some of the changes present in ASE 15 and some of the new features, we have collected guidance on additional hardware resources and related items that you should review prior to attempting the upgrade. The details are provided in the section on this topic later in the paper.

One item that may help is to create a quick IO profile of the application on 12.5.x and on 15.0.x. This can be done by collecting MDA monOpenObjectActivity and monSysWaits data for a representative period of time. For monOpenObjectActivity data, you may wish to collect two sets of data, one for application databases and another for temp databases. When you compare the monOpenObjectActivity data with ASE 15.0, you may notice the following:

• A huge increase in table scans on one or more tables (see the earlier 32-bit to 64-bit discussion for the logic). This could be due to an optimization issue or simply a move to a merge join instead of a nested-loop join using an index.

• A large decrease in IOs in tempdb, particularly when joins between two or more temp tables are involved. This is likely due to merge joins.

• A significant drop in the UsedCount column for particular indexes (when IndexID > 1). This is likely the result of missing statistics if the index contains more than one column.

While this may not point out the exact queries affected (discussed later), it can help reduce the effort of finding the queries to just those involving specific tables. Keep in mind that the observation periods should be under the same relative load to ensure that the results are comparable.
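As a rough sketch of how such a baseline could be captured (the snapshot table names are placeholders, and the usual monitoring configuration options such as 'enable monitoring' and 'per object statistics active' are assumed to be enabled):

-- Snapshot object-level IO for application databases; run on 12.5.x and again on 15.0.x
select DBName, ObjectName, IndexID, LogicalReads, PhysicalReads, UsedCount
  into tempdb..io_profile_snapshot
  from master..monOpenObjectActivity
 where DBID not in (1, 2, 3)   -- exclude system databases; add extra tempdb dbid's as in the earlier query

-- Snapshot server-wide wait events for the same interval
select WaitEventID, Waits, WaitTime
  into tempdb..wait_profile_snapshot
  from master..monSysWaits

Comparing snapshots taken under similar load gives a quick per-table view of where logical and physical IO patterns changed after the upgrade.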

Stress Test Your Application with ASE 15 QP

Because of the new QP engine, it is critical that Sybase customers stress test their application before and after the upgrade to determine any changes in query processing behavior. Some advice, common issues and particularly useful tools are discussed later in this paper.

Post-upgrade Tasks

As usual, a successful Sybase ASE upgrade does not finish with bringing the system online for end-user access. After the upgrade itself is complete, there are a number of tasks that need to be considered, including (this list is not all-inclusive):

• Post-upgrade monitoring of query processing

• Implementation of new ASE features that do not require application changes

• Upgrade of 3rd party tools to versions that support ASE 15.0

General ASE 15 Migration FAQ

Are there customers running ASE 15.0 in production? How many?

Yes. There are customers running ASE 15.0 in production across a broad spectrum, from financial companies to state governments to electronics companies. We can't tell how many, as customers typically don't tell Sybase when they have moved systems into production on a new release. Consequently, we are only aware of customers running in production when interaction with field sales teams or technical support staff specifically identifies a system in production on ASE 15.0. At this point, the adoption of ASE 15.0 is proceeding much the same as with previous releases, such as ASE 12.5; however, more customers are looking to migrate sooner at this stage than with previous releases due to the new partitioning feature, the performance improvements in the QP engine and other improvements. Consequently, we expect the overall adoption of ASE 15.0 to be quicker than for previous releases.

Why should we upgrade to ASE 15.0?

The exact reason why a specific customer should upgrade to ASE 15.0 will be specific to their environment. For customers with large data volumes, the semantic partition feature in ASE 15.0 may be enough of a driving reason. For others with complex queries, the substantial performance enhancements in ASE's QP engine for complex queries (3 or more tables in a join), the performance improvements for GROUP BY, or other enhancements may be the reason.

If the benefits in ASE 15.0 are not a sufficient reason, then platform stability may be a consideration. Sybase compiles ASE on specific OS versions from each vendor. As new hardware is released by a vendor, it often requires newer versions of the OS to leverage the hardware advances. This in turn changes the behavior of the core OS kernel, making older versions of Sybase software unstable on the newer OS versions. Sybase works with OS vendors to ensure that Sybase ASE remains compatible with newer OS versions during the reasonable lifespan of the release on which it was compiled, typically 4-5 years. After that point, the advances in technology, both software and hardware, make it impossible to maintain. ASE 12.5 was first released in Q2 of 2002 and is expected to be EOL'd in 2008, a life span of 6 years, which stretches the service life of the software to its maximum. This is analogous to running Windows 98 on a desktop today: while fully possible, it can't take advantage of the dual-core processors that are common in desktop systems or the multi-tasking capabilities of today's software, and device drivers for newer disk drives (SATA RAID, etc.) are not available, which prevents the hardware from achieving its full potential.

But Isn’t 12.5.4 Also Planned<br />

<strong>ASE</strong> 12.5.4 was indeed planned and subsequently released, but it has a much more reduced implementation<br />

than originally planned. The primary goal of <strong>ASE</strong> 12.5.4 is to provide a GA release for the features<br />

introduced in the Early Adopter 12.5.3a IR (the “a” signified “adopter” release) and to officially merge the<br />

12.5.3a code changes into the main 12.5.x code path. This is in keeping with <strong>Sybase</strong>’s stated policy of<br />

having two supported releases in market simultaneously (for the full description of the <strong>Sybase</strong> <strong>ASE</strong> & RS<br />

lifecycle policy, please refer to:<br />

http://www.sybase.com/products/informationmanagement/adaptiveserverenterprise/lifecycle<br />

However, the EOL date for 12.5.x has already been established as September 2008, consequently upgrades<br />

to 12.5.4 should only be considered as a temporary measure until a 15.x migration can be affected.<br />

It is also important to note that many of the features introduced in 12.5.x, including 12.5.3a, have already been merged into the 15.0 codeline (as of ESD #2). Future enhancements to those features, including RTMS, Encrypted Columns, etc., are being planned in the 15.0 codeline and may not be back-ported to ASE 12.5.x. This is in line with Sybase's policy that while two releases may be supported simultaneously, the earlier release is effectively in maintenance mode, with no new functionality being added except in limited circumstances.


SySAM 2.0 Installation & Implementation

While this topic is dealt with in greater detail in other documents, given the frequency of questions, this paper would be remiss if the topic was not mentioned at all. For more detailed information, please go to http://www.sybase.com/sysam, which has recorded webcasts, presentations, whitepapers and other information describing SySAM 2.0. This paper will attempt to provide a background for SySAM 2.0, and provide a planning guide along with implementation steps and considerations for customers.

SySAM 2.0 Background

Even before Sarbanes-Oxley (SOX), many customers were requesting strong license management and reporting capabilities from Sybase. Sarbanes-Oxley made this even more important, as it holds the CEOs of corporations responsible for any financial wrongdoing, including misuse of software assets. As the banking industry was one of those most heavily watched as a result of the legislation, and given Sybase's position within the financial community, Sybase adopted a more stringent license management implementation.

While it is built using the same FlexLM technology used in SySAM 1.0, it was decided that a true network license management implementation should be provided. In SySAM 1.0, it was common for customers to simply deploy ASE servers using the same license keys on all the servers. This reuse of the same license string, of course, was not legitimate and many times resulted in over-deployment of software versus what was actually licensed. As SOX-driven audits were performed, it became apparent that this was an issue that required resolving, sometimes resulting in draconian measures taken by the company, such as restricting access to Sybase's website as a means of controlling software access. Since SOX applies to all publicly traded companies, Sybase opted to implement SySAM 2.0 to make licensing compliance easier to measure for corporate users. In fact, one of the critical components of SySAM 2.0 was a reporting feature that would allow IT managers to monitor and report on license compliance to corporate officers as necessary.

However, in order to make it as easy as possible to also support the types of flexibility the real world requires during hardware transitions, unexpected growth, testing/evaluation periods, etc., SySAM 2.0 provides features such as license overdrafts, license borrowing (for mobile tools use), grace periods and other techniques to allow temporary software use without restriction. At no time does the SySAM software report license usage to Sybase. So, much like before, any long-term use of Sybase software beyond the licensing agreements must be reconciled with the company's Sybase sales representative.

While ASE was the first product to use SySAM 2.0, subsequent product releases, including RS 15.0, PowerDesigner 12.0 and future products, will use SySAM 2.0 as well. Consequently, it is wise to consider the broader scope of SySAM 2.0 rather than thinking of it as limited to ASE.

It is important to understand one aspect very clearly. SySAM 2.0 will NEVER contact Sybase to report license compliance issues. The purpose of SySAM 2.0 was to aid corporations in assuring compliance to meet legislative requirements, not to report software compliance to Sybase, Inc. In fact, if you opt to accept the optional license overdrafts, any use of overdrafted licenses would not be reported to Sybase, and any purchase orders to correct the overdrafts would need to be initiated manually by the customer by reporting the overdraft to Sybase and requesting pricing information for the continued use.

SySAM 2.0 Implementation Steps

Before installing ASE 15.0 or any other SySAM 2.0 managed product, you will need to complete the following steps:

1. Determine the license management architecture (served, unserved, OEM, mixed)
2. Determine the number of license servers, their locations and infrastructure
3. Inventory product license quantities and types by host/location
4. Determine the host IDs of designated license servers and redundant host IDs
5. Generate license files from the Sybase Product Download Center (SPDC)
6. Install SySAM software on license server hosts
7. Deploy license key files to license servers
8. Generate license file stubs for deployed software
9. Implement reporting and monitoring requirements
10. Update licenses congruent with Sybase support contracts

Most of these steps will be discussed in more detail in the following sections. However, not all will be, and the intent is not to reiterate the product documentation so much as to provide best practices, insights and workarounds for the gotchas involved.

To facilitate this discussion, the following mythical customer scenario will be used: Bank ABC is a large Sybase customer with several different business units operating as separate profit and loss centers. Bank ABC is headquartered in New York City, with production systems there as well as in northern New Jersey, where IT staff and developers are located. The NJ office also operates as a primary backup facility to the production systems, with a full DR site in Houston, TX (with some IT staff located there as well). Bank ABC also operates foreign offices in London, Frankfurt, Australia and Tokyo for its trading business units, with a group of developers in the London and Australia locations. London and Frankfurt serve as DR sites for each other, as do Australia and Tokyo. Bank ABC also operates several web applications (for its retail as well as trading subsidiaries) outside the corporate firewall, with mirrored sites in NY, Houston, San Francisco, London, Tokyo and Australia. In other words, the usual multi-national corporate mess that drives IT license management crazy.

License Architecture

Generally, SySAM license management architectures fall into two categories: served and unserved. A third case, OEM licensing (used for embedded applications), will not be discussed in this document, as the best implementation is dependent on the application, deployment environment and licensing agreements between Sybase and the vendor.

It is expected that Sybase customers will most likely use 'served' licenses or a mix of 'served' and 'unserved' licenses. The rationale is that 'served' licenses take advantage of one or more centralized license servers that provide licensing to all the deployed software locations. This eliminates the hassle of maintaining license keys and individually tying Sybase licenses to specific hosts in the case of hardware upgrades, etc. However, a license management server could be impractical in small environments or when license server access is restricted, such as when the systems run outside the corporate firewall. To get a better idea of how this works and the impact it could have on your implementation, each of these cases (served/unserved) will be discussed in detail.

Unserved Licenses

It is easier to understand 'served' licenses (and their attractiveness) once unserved licenses are discussed. 'Unserved' licenses are exactly what you would expect, and everything that you likely wanted to avoid. When using an 'unserved' license, the Sybase product (ASE) looks for a license file locally on the same host and performs the license validation and checkout. While a more detailed list of considerations is contained in the SySAM documentation, consider the following short list of advantages and disadvantages:

Advantages:

• Local license file: because the license file is local, network issues or license server process failures will not affect product licensing or force it into grace-mode operation.

• Works best for remote locations, such as outside the firewall, where administration access is limited.

• Licenses are dedicated and always available.

• No extra hardware or software installation requirements.

Disadvantages:

• License generation: because the license is node-locked, each unserved license has to be generated individually at SPDC.

• Licenses will need to be regenerated for each node when support renews (typically yearly) or when the host hardware changes.

• Licenses can't be pooled and used as needed (pursuant to licensing agreement restrictions).

• License usage reporting is not available.

Like most DBAs, our staff at Bank ABC really can't be bothered to manually generate and maintain hundreds of licenses each year for all of their servers. However, because the web applications running outside the firewall pose a problem for administration, it is decided that the web application servers will use unserved licenses. Assuming Bank ABC has two servers in each location (one for retail applications and one for trading), we have a total of 12 servers in 6 locations. Fortunately, this is not likely to be one person's job; like most corporations, Bank ABC has divided its DBAs along business unit lines, so the retail DBAs will take care of their systems and the trading DBAs will take care of theirs, leaving 6 at the most. But even then it is likely that the London and Australian staff will be responsible for the servers in their local areas as well, so each geography is likely only concerned with 2 unserved licenses.

Served Licenses

As the name suggests, this implies that one or more license servers service multiple deployed software host platforms. For example, Bank ABC could have a single license server that provides all the licenses to every server that has network access to it: the ultimate in license management simplicity. One license server, one license file, one license generation from SPDC, and done. Of course, this is not exactly realistic, but that is a topic for the later discussion of how many license servers you should have. For now, consider the following diagram:

Figure 1 – License Server Provisioning Licenses for Multiple ASE Software Installations

There are a couple of key points about the above diagram:

1. The only license server process running is on the license management host (top). The deployed ASE machines are not running a license server.

2. Sybase SySAM 2.0 compatible products will check their local files for licenses. The license 'stub' file will point the product to the license server(s) to use for actually acquiring the license.

On the license server, license files are saved to the directory $SYBASE/SYSAM-2_0/licenses. Each license file can contain more than one type of license, and more than one file may be used. The typical license file will resemble the following:

SERVER tallman-d820 0013028991D3
VENDOR SYBASE
USE_SERVER
PACKAGE ASE_EE SYBASE COMPONENTS=ASE_CORE OPTIONS=SUITE SUPERSEDE \
    ISSUED=14-jun-2006 SIGN2="0F70 418E B42B 2CC9 D0E4 8AEC 1FD0 \
    B6C7 69CE 1A05 F6BF 45F5 BEE4 408C C415 1AA5 18B8 6AA1 3641 \
    6FDD 52E1 45B6 5561 05D4 9C62 AD6B 02AA 9171 5FAC 2434"
INCREMENT ASE_EE SYBASE 2007.08150 15-aug-2007 2 \
    VENDOR_STRING=SORT=100;PE=EE;LT=SR PLATFORMS=i86_n DUP_GROUP=H \
    ISSUER="CO=Sybase, Inc.;V=15.0;AS=A;MP=1567" \
    ISSUED=14-jun-2006 BORROW=720 NOTICE="Sybase - All Employees" \
    SN=500500300-52122 TS_OK SIGN2="0E12 0DC8 5D26 CA5B D378 EB1A \
    937B 93F9 CAF2 CDD8 0C3E 4593 CA29 E2F1 8F95 15D1 2E60 11C0 \
    10BE 26EC 8168 4735 8A52 DD9F C239 5E88 36D7 1530 A947 1A7C"

Note that this file is only located on the central license server; it is not distributed with the software to the deployed installations. These machines will have a much simpler license stub that resembles:

SERVER TALLMAN-D820 ANY
USE_SERVER

This tells the Sybase products to contact the license host (tallman-d820 in this case) in order to acquire a license. This has a huge benefit with respect to license administration: when licenses change, only the license file on the license server host has to be updated. By default, when a SySAM 2.0 product is installed, a default stub file SYBASE.lic is generated; it contains a sample license stub based on the answers to the license server questions from the software installer GUI.

There is a variation of the above file that may need to be used if you are running within a firewall environment in which common ports are blocked. That format is:

# Replace the port # with any unused port that your firewall supports
SERVER TALLMAN-D820 ANY
USE_SERVER
VENDOR SYBASE PORT=27101

It is really helpful to first understand the detailed architecture of FlexLM. The Macrovision FlexLM software has a license manager daemon (lmgrd) that runs on the host machine. This process is not responsible for actual license management, but rather for managing the vendor licensing daemons that need to be executed. On Windows platforms, the Sybase vendor daemon is the %SYBASE%\SYSAM-2_0\bin\SYBASE.exe executable, while on Unix the path is $SYBASE/SYSAM-2_0/bin/SYBASE. During startup, lmgrd starts the SYBASE licensing daemon and any other daemons it finds. When a program requests a license, the program first contacts the FlexLM lmgrd daemon to find the port number that the SYBASE daemon is running on. While the FlexLM lmgrd daemon normally listens in the range 27000-27009, the vendor daemons can be started on any available port. For example, consider the following snippet of the SYBASE.log file in the SySAM directory:

0:18:42 (lmgrd) pid 2380
0:18:42 (lmgrd) Detecting other license server manager (lmgrd) processes...
0:18:44 (lmgrd) Done rereading
0:18:44 (lmgrd) FLEXnet Licensing (v10.8.0 build 18869) started on tallman-d820 (IBM PC) (12/9/2006)
0:18:44 (lmgrd) Copyright (c) 1988-2005 Macrovision Europe Ltd. and/or Macrovision Corporation. All Rights Reserved.
0:18:44 (lmgrd) US Patents 5,390,297 and 5,671,412.
0:18:44 (lmgrd) World Wide Web: http://www.macrovision.com
0:18:44 (lmgrd) License file(s): C:\sybase\SYSAM-2_0\licenses\SYBASE.lic ...
0:18:44 (lmgrd) lmgrd tcp-port 27000
0:18:44 (lmgrd) Starting vendor daemons ...
0:18:44 (lmgrd) Started SYBASE (pid 1828)
0:18:45 (SYBASE) FLEXnet Licensing version v10.8.0 build 18869
0:18:45 (SYBASE) Using options file: "C:\sybase\SYSAM-2_0\licenses\SYBASE.opt"
...
0:18:45 (lmgrd) SYBASE using TCP-port 1060

In the above log, the lmgrd lines show that it starts up, starts the vendor daemons, and then listens on port 27000. The SYBASE licensing daemon starts up, reads the options file, then reads all the licenses, and finally starts listening on port 1060. Note that this port number could change. If you are running in a firewalled environment in which you need to use a consistent port, use the PORT= notation on the VENDOR line to specify which port the vendor daemon should use. Note that you may also have to add both the lmgrd and the SYBASE executables to any firewall software's permitted applications/exclusions list, particularly on MS Windows systems using either the supplied MS firewall or 3rd party firewall software.
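When checking connectivity between a deployed host and the license server, one simple test (a sketch only; it assumes the FlexLM lmutil utility from the SYSAM-2_0/bin directory is on the PATH, and it reuses the example host and lmgrd port from the log above) is to query the server status:

# Show license server status and current feature checkouts
lmutil lmstat -a -c 27000@tallman-d820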

Another interesting point is that while the license server may have multiple license files (for example, one for each product), the deployed software only needs to have that one license stub, regardless of how many SySAM 2.0 products are actually installed on that machine. This is illustrated in Figure 1 above, in that the license server has multiple files, but the deployed hosts have a single license stub file, most likely simply SYBASE.lic (the default). We will discuss this and other deployment techniques later.

License Server Quantity & Locations

The number of license servers you will need depends on a number of factors:

• The geographic distribution of your business locations and the IT infrastructure to support it
• The number and locations of Disaster Recovery sites
• SySAM availability logical clustering/redundancy
• The number and locations of development organizations
• Business unit cost allocations/IT support agreements

One consideration that is not likely to matter is the number of hosts accessing the license server for licenses. A single license server running on a workstation-class machine can easily support hundreds of server products. In fact, an internal Sybase license server providing well over 100 licenses to ASE hosts only uses about 5 minutes of CPU time per week for SySAM activity. The rest of these factors will be discussed in the following sections.

Geographic Distribution

While not absolutely necessary, Sybase suggests that additional license servers be used for each geographic region, possibly even on a site/campus basis. The rationale for this is that the products have built-in heartbeats to the license server, as well as the need to acquire licenses during the startup phase. Network issues between locations could prevent products from starting (hampering DR recovery procedures) or cause running products to enter grace mode and possibly shut down.

Given our example ABC Bank scenario above, we have the following locations and activities:

Site                   Production   DR   Development   Web System   Comments
Manhattan, NY              X                                 X
New Jersey                 X         X         X                      Primary DR for NYC
Houston, TX                          X                       X        DR for NYC & SF
London, UK                 X         X         X             X        DR for Frankfurt
Frankfurt, Germany         X         X                                DR for London
Melbourne, Australia       X         X         X             X        DR for Tokyo
Tokyo, Japan               X         X                       X        DR for Melbourne
San Francisco, CA                                            X

It is likely that we will want license servers in all 8 locations. Strictly speaking, given that Manhattan and NJ are in very close proximity, it is possible for the NYC systems to use the NJ license servers. However, in this case, since ABC Bank is leasing the lines between Manhattan and Newark, NJ that were installed by the local telephone company, the DBAs are considering a separate license server to avoid problems in the event of telephone company hardware issues.

Disaster Recovery (DR) Sites

While it is true that Sybase software has built-in grace periods, for ASE the installation grace period is 30 days from the time the software is installed. Consequently, if you only use a single license server at the production facility and you experience a site failure, you likely will not be able to access those licenses during the DR recovery procedures when starting up the mirrored system. On top of this, warm-standby and active clustering systems need to be up and running already, so they will need access to a license server at any point. Again, however, if the primary site fails (taking out the single license server), an unexpected shutdown at the DR facility (e.g., for a configuration change) could result in not being able to restart the DR facility's servers. For this reason, it is strongly recommended that a license server be installed at each DR facility. This server should minimally have the licenses used by the DR systems.

License Server Redundancy

There are several mechanisms used by the license server software to attempt to prevent software issues due to license server failure. The first mechanism, of course, is the grace periods. However, they should be viewed only as an immediate and temporary resolution and should not be depended upon for more than a few hours at most. While it is true that the runtime grace period is 30 days, more than long enough to fix a failed hardware component or to change license servers entirely, the risk becomes more pronounced in that if the ASE is shut down for any reason, it likely will not restart, as it will fall back on the installation grace period, which likely expired long ago.

Another approach that could work is to use multiple independent license servers. As the Sybase software starts up, or conducts its license heartbeat, it reads the first license file it comes to lexicographically in the license directory. If it fails to obtain a license, it then reads the next license file, and so on, until all the license files in the directory have been read. On the surface, this seems to be a simple solution: ABC Bank, for example, could simply put 6 license servers around the globe, and if one should be down, the other 5 would be available. The problem is the pesky license quantity issue. As software products check out licenses, the pool of available licenses drops. While there is an optional overdraft, this (by default) is 10% of the licenses being checked out for the given license server. To see the problem here, let's assume that both NYC and NJ have 10 ASE servers in production each (ignoring all other locations/servers). Consequently, when the licenses for each were generated, the DBA picked 10 for each license server and accepted the overdraft. Using the default of 10%, each site could now actually run 11 servers. You should start to see the problem. If NYC fails, 9 of the ASEs will not be able to restart, as all of the licenses for NJ would be used (10) and the first NYC server started at the NJ failover site will check out the single remaining license. This technique is an option for site license customers, however, as they simply need to generate the license pool for each license server to accommodate all the licenses they would expect to use in a worst-case scenario. While the overdraft does make sense when doing platform migrations, special benchmarks, etc., for license server redundancy it is not likely to help much.

A common thought then is to install the SySAM software on a clustered server. The general thinking is that if one box fails, the other will restart the license server immediately. The problem with this approach is that the license server uses node-locking based on unique information from the host machine, either the primary network MAC address or the OS hostid. When the license manager is restarted on the other half of the cluster, the MAC or hostid will not match what is in the SySAM license file, and the licenses will be useless. This brings us to the most likely implementation: SySAM clustering.

The FlexLM software used by SySAM 2.0 supports a three-node logical clustering of license servers. There are a couple of key points that should be mentioned about this:

• 2 of the 3 license servers must be available at all times
• The license servers should be located fairly close together (more on this in a minute)
• While license server high availability may not be required, thanks to the grace periods, it may be recommended for larger installations

Now, the SySAM documentation (specifically the FlexLM Licensing End User Guide) states that a three-node SySAM redundancy configuration should have all the license servers on the same subnet. This requirement is based upon having guaranteed good communications versus a WAN implementation. Even given this restriction, however, there are ways of deploying the license servers in a three-node configuration very effectively. For example, you could have one server be the primary for production systems, one be the primary for development systems and one serve as a failover.

When configuring a three-node license server logical cluster, there are some additional steps that need to be completed:

• The three host IDs have to be provided at the time of license generation
• The license files and license stubs will have three server lines instead of one

When the license files are generated, they can then be distributed to all three host machines being used in the redundant license server setup. In addition, the license stubs with the three server lines would be distributed or rolled into the software distribution package. The stub would resemble:

SERVER PRODLICSRVR ANY
SERVER BACKUPLICSRVR ANY
SERVER DEVELLICSRVR ANY
USE_SERVER

Note that the order of the hostnames listed in the license stub file on the installed hosts specifies the order in which the license servers are contacted. If the above file was the one distributed, each SySAM 2.0 product would attempt to acquire a license from the license server running on the PRODLICSRVR host first, and then attempt the others in order. This can be exploited by rearranging the server lines on some hosts so that they contact one of the other license servers first. This can be particularly useful for a license server managing certain development products that may have a much shorter license heartbeat.
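For example (reusing the hypothetical host names from the stub above), a development host's stub could simply list the development license server first:

SERVER DEVELLICSRVR ANY
SERVER PRODLICSRVR ANY
SERVER BACKUPLICSRVR ANY
USE_SERVER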

Development System License Servers

While Sybase does provide a 'free' development license for ASE, this license is often too restrictive for real development organizations. The biggest problems with it are:

• Limited resources (single CPU, small number of connections, etc.)
• Lack of maintenance (EBFs cannot be applied)

As a result, most organizations with normal-sized development teams will have fully licensed Development/Test/QA systems. While these systems could use the same license server as the production systems, having a separate license server for these systems achieves the following:

• Protects the production systems from development/testing system issues
• Provides an ideal environment for Sybase DBAs to test SySAM procedures
• Reduces the overhead on production license servers from tools interaction

The last bullet takes a bit of explaining. While ASE has a license heartbeat measured in hours (4 hours or more by default), some of the tools may have a license heartbeat measured in minutes (5 minutes or less). While this may seem a bit extreme, the rationale is that it is necessary for rapid reclamation of floating licenses from the pool of developers. By having a separate server providing development licenses, the production license management servers are not under as much stress and can respond to license heartbeats faster as well as service larger numbers of systems, reducing the actual number of license servers necessary.

Business Unit License Provisioning

In most medium to large corporations, the business units are divided into different cost centers. In these cases, IT infrastructure costs are often allocated among the business units on a project basis, with some centralized costs for common components such as network infrastructure. This can impact the number of license servers in several ways:

• If it is decided that the SySAM license servers will be placed on separate hardware from the software, the cost of this hardware acquisition (small as it may be) may need to be charged to the appropriate business units.

• DBA resources may be assigned at the business unit level, and may not even have access to machines belonging to the other business units, and therefore are not able to generate or maintain the license files for those machines.

One reason that some may think should be included in the above is that one business unit may unfairly grab<br />

all of the available licenses. However, SySAM 2.0 allows license provisioning via the OPTIONS file, in<br />

which licenses can be pre-allocated based on logical application groups by hostid. Note that this also could<br />

be used within an environment to provision production licenses away from development/test licenses or<br />

similar requirement to ensure that some quantity of licenses are preserved for specific use. How to do this<br />

is covered in the FlexLM Licensing End User Guide in the chapter on the OPTIONS file. Note that for<br />

<strong>Sybase</strong>, this file should have a default name of SYB<strong>ASE</strong>.opt and be located in the SySAM licenses<br />

subdirectory.<br />
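As a rough sketch only (group and host names are illustrative; consult the OPTIONS file chapter of the FlexLM Licensing End User Guide for the exact syntax supported by your SySAM version), a SYBASE.opt that reserves ASE_CORE licenses for the production hosts might look like:

    HOST_GROUP prod_hosts nyc_prod_01 nj_prod_01_ha
    RESERVE 2 ASE_CORE HOST_GROUP prod_hosts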

Inventory Current Deployment

Now that we have an idea of the considerations that go into determining how many license servers we may need, the next step in determining how many we will actually use, and where they might be located, is to do an inventory of the current software deployment and license usage. For production systems, this may be easy – finding all the development licenses may be more fun. To make your job easier, you may want to create a spreadsheet with the following columns (this will really speed up the SPDC generation, as it will help you reduce the number of interactions necessary – especially changes).


Location, Host machine, CPUs/Cores, Bus Unit / Application, Software/Option, License Types, Licenses Req'd, SySAM Server

Consider the following example using our mythical ABC Bank:

    Location  Host machine  CPUs/Cores  Bus Unit/Application  Software/Option   License Type  Licenses Req'd  SySAM Server
    NYC       Prod_01       12          Trading               ASE               SR            1               nyc_sysam_01
    NYC       Prod_01       12          Trading               ASE partitions    CPU           12              nyc_sysam_01
    NYC       Prod_01       12          Trading               ASE encrypt col   CPU           12              nyc_sysam_01
    NYC       Web_01        8           Trading               ASE               CPU           8               (file)
    NYC       Web_01        12          Trading               ASE encrypt col   CPU           12              (file)
    NJ        Prod_01_HA    12          Trading               ASE               SR            1               nj_sysam_01

At first, of course, you may not know the license server host (last column), but it can be filled in later, after you have completed your inventory and decided how many license servers you want to deal with – or need, based on availability requirements.

License Server HostIDs

Obtaining your hostid is fairly simple, although it does differ by platform. For example, on some platforms you simply issue the Unix ‘hostid’ command, while on others you will need the primary network interface's physical MAC address. Detailed instructions are contained in the FlexLM Licensing End User Guide shipped with the software.

However, this does come with a bit of a warning. When using the MAC address, the NIC used for node locking will be the first NIC bound in order when the system boots. On Microsoft Windows systems, this is more than just a bit of fun, as Microsoft and other hardware vendors love to create network devices out of FireWire ports, USB streaming cameras, etc. If you are running a Microsoft system, you can ensure the NIC binding order by opening the Control Panel and navigating to the Network Connections folder. From there, select ‘Advanced Settings’ from the ‘Advanced’ menu option at the top (same menu bar as File/Edit, etc.). This should open the Advanced Settings dialog. Select the ‘Adapters and Bindings’ tab and, in the top pane, re-order the bindings into the desired order.

If you are using an operating system other than Microsoft Windows that uses MAC-based hostids (Linux, MacOS), check with your system administrator to see if you can specifically control the network adapter binding order.
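As a quick sketch (the path assumes a default SySAM 2.0 installation under $SYBASE; on platforms that use the Unix hostid, the native command gives the same answer):

    # report the hostid that FLEXlm/SySAM will use for this machine
    cd $SYBASE/SYSAM-2_0/bin
    ./lmutil lmhostid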


Generate Licenses from SPDC

After you have determined your license servers and/or file-based node-locked host machines, you are ready to generate licenses from the Sybase Product Download Center. While the process is fairly straightforward and based on a web-form wizard that walks you through it, it can be a bit tedious – especially if you make a mistake (you will have to check the licenses back in and then re-check them out to fix a mistake). A couple of items to specifically watch out for when generating licenses:

• The license quantity will often default to all remaining (first time) or the same number as the last licensing action.
• After the first time you enter a license server's info, subsequent licensing actions for the same license server will be quicker, as the license server can simply be selected from a list.
• Each licensing action will result in a single license file containing a single license. Save the files to a safe place where they will be backed up.
• If desired, rather than having dozens of files with individual licenses, you could combine some of the licenses into one or more license files.

A sample license file may look like the following:

    SERVER tallman-d820 0013028991D3
    VENDOR SYBASE
    USE_SERVER
    PACKAGE ASE_EE SYBASE COMPONENTS=ASE_CORE OPTIONS=SUITE SUPERSEDE \
        ISSUED=14-jun-2006 SIGN2="0F70 418E B42B 2CC9 D0E4 8AEC 1FD0 \
        B6C7 69CE 1A05 F6BF 45F5 BEE4 408C C415 1AA5 18B8 6AA1 3641 \
        6FDD 52E1 45B6 5561 05D4 9C62 AD6B 02AA 9171 5FAC 2434"
    INCREMENT ASE_EE SYBASE 2007.08150 15-aug-2007 2 \
        VENDOR_STRING=SORT=100;PE=EE;LT=SR PLATFORMS=i86_n DUP_GROUP=H \
        ISSUER="CO=Sybase, Inc.;V=15.0;AS=A;MP=1567" \
        ISSUED=14-jun-2006 BORROW=720 NOTICE="Sybase - All Employees" \
        SN=500500300-52122 TS_OK SIGN2="0E12 0DC8 5D26 CA5B D378 EB1A \
        937B 93F9 CAF2 CDD8 0C3E 4593 CA29 E2F1 8F95 15D1 2E60 11C0 \
        10BE 26EC 8168 4735 8A52 DD9F C239 5E88 36D7 1530 A947 1A7C"

While the SySAM documentation describes the license components in greater detail, the key fields to note in the INCREMENT line are the date, the license quantity, and the license type (LT=) in the VENDOR_STRING. Taken in order, the first is a date. This is the expiration date for Sybase support for this group of servers. After this date, since support has expired, attempts to apply patches will succeed or fail depending on the support grace period of the product. Prior to this date, the customer will have to renew their support agreement for these systems and then do a license upgrade.

The second field is the number of licenses – in this case ‘2’. By itself, it is meaningless and has to be looked at in conjunction with the last field, ‘LT=SR’, which says that this is a ‘Server’ license vs. a ‘CPU’ license (or other form). The result is that this license allows 2 different host machines to check out ASE server licenses. The reason this is brought to your attention is that for CPU licenses, the above file might change slightly to resemble something like:

    SERVER tallman-d820 0013028991D3
    VENDOR SYBASE
    USE_SERVER
    INCREMENT ASE_PARTITIONS SYBASE 2007.08150 15-aug-2007 4 \
        VENDOR_STRING=PE=EE;LT=CP PLATFORMS=i86_n ISSUER="CO=Sybase, \
        Inc.;V=15.0;AS=A;MP=1567;CP=0" ISSUED=14-jun-2006 BORROW=720 \
        NOTICE="Sybase - All Employees" SN=500500048-52182 TS_OK \
        SIGN2="1BBA CEE0 89FD 7CDA 6729 E6AB D37C B48C A97F D3F7 4AC0 \
        E4C9 4310 DCA9 7FE7 1FD2 5C38 1345 931F 7D14 9A34 DB84 6157 \
        8B2A 3E90 9654 5177 A539 E362 9A73"

In this license, the license server can provide 4 CPUs' worth of the ASE partitions license to ASE servers on other hosts that may request it. Site license customers, or other licensing implementations that have an unlimited quantity of licenses, need to make sure that the number of CPUs they generate licenses for is the same as or greater than the number of CPU cores that will use them.


One difference for customers that had licensed only the core ASE product in previous versions: in ASE 15.0, the ASE_CORE license includes the Java, XML, and XFS (external file system, or content management) licenses, or these licenses may be available at no additional cost via the SPDC (as is the case for XML, etc.). Premium options such as HA, DTM and others still require separate licensing. For further information, contact your Sybase sales representative or Sybase Customer Service at 1-800-8SYBASE.

The Developer's Edition (DE) includes all non-royalty options from Sybase. Royalty options include the Enhanced Full Text Search, Real Time Data Services, and most of the tools such as DBXray or Sybase Database Expert. When installing the Developer's Edition, simply pick the unserved license model, as the Developer's Edition uses the file-based licensing mechanism. The Developer's Edition keys are included in the product CD image; consequently, after installing the software, no further SySAM actions are required.

SySAM Software Installation

The SySAM installation has several gotcha's for the uninitiated. The first hurdle is simply acquiring the software itself. While this may seem obvious, consider a Sun, HP or IBM hardware organization that decides to use a small Linux workstation for the license server. Without a valid ASE license for Linux, they will not be able to obtain the software. Fortunately, this can be resolved quite easily. One option is to download the "ASE Developer's Edition" or "ASE Express Edition" (Linux only) for the platform you wish to run the license server on. After getting the CD image, run the InstallShield program and select only the license server components to be installed.

The second hurdle is starting up the license server process. The gotcha here is that you must have at least one served license file installed in the licenses directory or else the license server will not run. This is especially obvious on Microsoft Windows systems, where the usual nonsensical reboot is required after installation and, on restart, you get a service start error for the license server process. In any case, simply copy or ftp the license files you downloaded earlier into the appropriate directory (making sure they have a .lic extension), navigate to the SySAM bin directory, and issue ‘sysam start’. You can verify everything is working via the ‘sysam status’ command as well as by looking at the log files.

The third hurdle is to make sure that the license manager starts with each system reboot. For most Unix systems, this means adding the SySAM startup to the rc.local startup scripts.
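For example (a sketch only; the directory names assume a default SySAM 2.0 installation under $SYBASE, and the source path for the license files is a placeholder):

    # install the served license files, then start and verify the license server
    cp /path/to/downloaded/licenses/*.lic $SYBASE/SYSAM-2_0/licenses/
    cd $SYBASE/SYSAM-2_0/bin
    ./sysam start
    ./sysam status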

Finally, for email alerts, you will need to have an SMTP email server running. For Unix systems, this is pretty simple, as most Unix variants include a basic email server with the operating system. For Microsoft Windows users, however, this can be a bit more challenging. The gotcha with this aspect is that it is the product – not the license server – that alerts the DBA to a license in grace mode or other similar problem. As an alternative, you could create a centralized polling process that uses the ASE ‘sp_lmconfig’ stored procedure. However, an easier approach that presents the licensing information in a more easily parsed format is to use the MDA table monLicense:

    1> use master
    2> go
    1> select Name, Quantity, Type, Status, GraceExpiry
    2> from monLicense
    3> go

    Name            Quantity Type           Status    GraceExpiry
    --------------- -------- -------------- --------- -----------
    ASE_JAVA               1 Server license expirable NULL
    ASE_EFTS               1 Server license expirable NULL
    ASE_ENCRYPTION         1 CPU license    expirable NULL
    ASE_RLAC               1 CPU license    expirable NULL
    ASE_PARTITIONS         1 CPU license    expirable NULL
    ASE_CORE               1 Server license expirable NULL

    (6 rows affected)

A simple license grace detection query would merely check:

    select Name, Quantity, Type, Status, GraceExpiry
    from master..monLicense
    where GraceExpiry is not null


While this technique works for ASE, it won't work for other products such as Replication Server. For other products (as well as for ASE), you may wish to use the Unified Agent (UA) process to check for occurrences of a ‘graced’ license error in the errorlog (error 131274), such as the following:

    2006/05/24 14:51:16.82 kernel  SySAM: Checked out graced license for 1 ASE_CORE (2005.1030) will expire Fri Jun 09 01:18:07 2006.
    2006/05/24 14:51:16.82 kernel  SySAM: Failed to obtain 1 license(s) for ASE_CORE feature with properties 'PE=EE;LT=SR'.
    2006/05/24 14:51:16.82 kernel  Error: 131274, Severity: 17, State: 1
    2006/05/24 14:51:16.97 kernel  SySAM: WARNING: ASE will shutdown on Fri Jun 09 01:18:07 2006, unless a suitable ASE_CORE license is obtained before that date.
    2006/05/24 14:51:16.97 kernel  This product is licensed to:
    2006/05/24 14:51:16.97 kernel  Checked out license ASE_CORE

Software Deployments & SySAM

Many of our customers prefer to build their own deployment packages for Sybase installation – using internally certified ASE versions and patch levels – often using tarballs or other electronic packaging. Previously, such images were built by simply tarring up the same license key and using it for all the servers. As mentioned earlier, this doesn't change much. When using a license server, the only requirement is to make sure that each host that will be running Sybase software has a license file stub. A sample one – SYBASE.lic – should have been created for you when you first installed the software from the CD image. If not, the format is extremely simple:

    SERVER TALLMAN-D820 ANY
    USE_SERVER

For redundant installations, the file will have three SERVER lines instead of one. From this point, the software installation builds can be deployed wherever necessary. The only post-extraction step that might be required is to change the license server names in the file, if you plan to use a different license server (or servers) than those listed when the installation package was created.

An alternative to listing the license servers in the license stub file is to set them using the LM_LICENSE_FILE environment variable for served licenses. For unserved licenses, the license files are interrogated in alphabetical order. Both of these are described in the SySAM documentation.
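For instance, a sketch using the host names from the earlier stub (on Unix the entries are colon-separated; 27000 is the default license server port):

    LM_LICENSE_FILE=27000@PRODLICSRVR:27000@BACKUPLICSRVR:27000@DEVELLICSRVR
    export LM_LICENSE_FILE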

ESD's, IR's and Updating SySAM Licenses

So far, every ASE 15.0 ESD and IR released – except ASE 15.0 ESD #1 – has had a license component change. Each time, the license change was due to a new feature being added, or to product bundling that resulted in the name of an existing license component changing – or in previously configured options failing because a new license component was now required. Additionally, as an IR, ASE 15.0.1 required updating the ASE_CORE and other option licenses to the 15.0.1 level from 15.0. As a result, ASE 15.0 ESD #2 and ASE 15.0.1 required DBAs to "upgrade" their licenses via the SPDC. This would be in addition to any upgrade performed as a result of annual support renewal efforts.

Consequently, it is strongly encouraged that DBAs check all licensable options by applying an ESD or IR first to a test system with all the customer-licensed options enabled. By doing this, unexpected last-minute licensing actions, or attempts to uninstall a patch due to licensing changes, can be avoided.

ASE 15 & SySAM FAQ

The questions below should have been answered in the discussions above, but are repeated here to ensure that the points are not missed in the details. Further information or assistance on SySAM is available at http://www.sybase.com/sysam. In addition, you can call Sybase Customer Service or Technical Support at 1-800-8SYBASE, or talk to your local Sybase Sales Team.


Is SySAM 2.0 required?

Yes. To meet SOX and other legislative requirements that publicly traded companies must now face – and given the penalties that corporate executives face for financial issues, including misuse of software licenses – ASE must now check out a valid license from the license server.

I have 100 ASE servers, do I have to register all of their nodes with Sybase?

No. Only the node where the license server runs needs to be provided to Sybase. This node id is part of the license key, which is why it must be provided. See the discussion above on served licenses (under License Architecture).

Is a license required for the ASE Developer's Edition?

No. Although the installer prompts you for license information, the Developer's Edition license enabling all the non-royalty options of Sybase ASE is included in the software. Simply pick "unserved" from the installer option and proceed with the install. Note that, by default, this also does not install the license server, which is not needed for the Developer's Edition.

My company paid for a license for a server that operates outside our WAN environment where it can't contact a license server – do we need a separate license server for it?

You have two options here. The first, of course, is to set up a separate license server outside the WAN for it. However, that is likely impractical. As a result, Sybase also provides for "unserved" licenses, in which the ASE software simply reads its licensing information from a local file. This latter method is probably the best choice for remotely deployed software.

My company has locations in Europe, Asia, Africa, Latin America and North America – is it reasonable to use only a single license server, or is more than one necessary?

Possible: yes. Reasonable: likely not. For geographically dispersed deployments, you may wish to set up a separate license server for each geography.

Do I need to install the license server on a cluster or otherwise provide redundant capability?

Due to the nodeid lock, the license server would not work if it failed over to another server in a hardware clustering configuration. If license server redundancy is desired, you can either use the ‘multi-license server’ implementation (which may be most applicable for site license customers), or use the three-node redundant server configuration.

I normally create "tarballs" for distributing certified releases within my company – will SySAM 2.0 prevent this or cause me problems?

No. Unlike previous releases, in which the license keys were distributed to every host, the license file for the deployed ASE software only needs to contain the location of the license host and port number (if other than 27000).

Where can we go for more information?

http://www.sybase.com/sysam is the official location for SySAM 2.0 materials. In addition, you can call Sybase Customer Service or Technical Support at 1-800-8SYBASE, or talk to your local Sybase Sales Team.


Preparing for the Upgrade

The Sybase ASE 15.0 Installation Guide contains the full steps for upgrading to ASE 15.0 – including the usual reserved word check utility, etc. This section of the paper concentrates on particular aspects that have resulted in upgrade issues to date.

Review Database Integrity

The following database integrity considerations should be reviewed prior to upgrading from ASE 12.x.

Check Security Tables (sysusers)

Sometimes, to provide customized control of user ids, login ids or roles, DBAs have manually modified the values of user or role ids to resemble system accounts. As with all new versions of Sybase software, increased functionality sometimes requires new system roles to control those features. As a result, during a database upgrade, the upgrade process may attempt to add a new role to the database – and if one already exists with that id, the attempt will fail, causing the entire upgrade to fail. Prior to upgrading, you may wish to review sysusers and sysroles to ensure any manually inserted user or role ids are within the normal Sybase user range.

If you are doing the upgrade via dump/load from another server, then after the ASE 15.0 server has been created you will need to copy the existing login ids and role information from the previous server to the new ASE 15.0 server prior to loading the first database to be upgraded.
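A minimal review sketch (run it in each database to be upgraded, and compare the ids returned against the system ranges documented for your release):

    -- list database users and their ids so manually inserted entries stand out
    select uid, suid, name from sysusers order by uid
    go
    -- list the server role id to local role id mappings
    select id, lrid from sysroles order by lrid
    go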

Run dbcc checkstorage

Although this is documented in the Installation Guide, it is often skipped. Unfortunately, existing corruption will cause the upgrade process to fail, as nearly every page in the database is often touched during the upgrade.
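As a sketch (the database name is illustrative; dbcc checkstorage requires that the dbccdb database has already been configured):

    dbcc checkstorage(testdb)
    go
    -- summarize any faults recorded for the database
    sp_dbcc_summaryreport testdb
    go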

Check database space

Make sure you have plenty of free space within the databases to be upgraded. The pre-upgrade utility will check this for you – especially calculating the space necessary for the catalog changes. However, it is always advisable when performing an upgrade to have plenty of free space available in the database. Try to have at least 25% of the space free for databases smaller than 10GB. Additionally, make sure the transaction log is as clear as possible. The upgrade instructions specifically remind you to dump the transaction log (to truncate it) after performing a full database dump. This is important, as all the catalog changes will require some log space.

This is also true of the system databases such as master or sybsystemprocs. Make sure master is using only about 50% of the available disk space (remember to make a dump of master and then dump the tran log to truncate it – often the cause of space consumption in master is the transaction log), and that sybsystemprocs has sufficient disk space for all the new stored procedures.
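A minimal sketch of that sequence (database name and dump paths are illustrative):

    -- full dump, then truncate the log, then review the remaining free space
    dump database testdb to '/dumps/testdb_pre15.dmp'
    go
    dump transaction testdb with truncate_only
    go
    sp_helpdb testdb
    go
    sp_helpdb master
    go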

Additionally, several of the system databases take up more space in 15.0. While the amount of additional space required is a function of the server page size (for a 2K page size, the increase is generally about 4MB), it is recommended that you increase the system database sizes prior to the upgrade itself.

If you are using utilities such as DBExpert to compare query plans, note that these utilities use abstract query plans – which are captured in the system segment. Consequently, you will need to make sure that you have enough free space in the system segment beyond the catalog expansion requirements. Additionally, ASE 15 includes a feature called sysquerymetrics that uses system segment space – you may wish to prepare for this by expanding the existing system segment onto other devices if it was restricted to a smaller disk or a disk with little free space.


Find all currently partitioned tables

As a result of system table changes, and the changes to support semantic partitions, tables currently partitioned using segment-based partitioning or slices will be unpartitioned during the upgrade process. Optionally, to save time and effort during the upgrade itself, you may wish to manually unpartition these tables prior to running the upgrade – particularly tables for which the partitioning scheme will be changed to one of the semantic partitioning schemes.
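A sketch for locating these tables on the 12.5.x server (in pre-15.0 releases, syspartitions contains one row per slice, so any table appearing there is partitioned; the table name in the alter statement is illustrative):

    -- run in each user database
    select object_name(id) as table_name, count(*) as slices
    from syspartitions
    group by id
    go
    -- to manually unpartition one of them ahead of the upgrade
    alter table my_big_table unpartition
    go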

Increase HW Resource Requirements

Because ASE 15 uses new in-memory sorting and grouping algorithms, ASE 15.0 will need additional memory compared to ASE 12.5. Since this involves sorting, you will need additional procedure cache space for the auxiliary scan buffers used to track sort buffers, and additional memory for the sorting itself. Since this in-memory sorting replaces the need for worktables, the additional memory requirement will be in whichever data cache tempdb is bound to. While it is not possible to suggest exactly how much memory, remember that this is replacing the former use of worktables, which used data cache as well as disk space – consequently the additional requirements are likely a small percentage above the current allocations (e.g. 5%). This is one of the areas to monitor carefully after upgrading. In ASE 15.0.1, some new MDA tables will help monitor this from a procedure cache standpoint: monProcedureCacheModuleUsage and monProcedureCacheMemoryUsage. Both tables contain a High Water Mark (HWM) column, so that the maximum memory usage for particular activities can be observed on a procedure cache module basis.

In addition, due to the increased number of first-class objects implemented within the server to support encrypted columns, partitioned tables and other new features, you may need to increase the number of open objects/open indexes. This is particularly likely to be true if the ASE server contains a number of databases. You can monitor the metadata cache requirements, including object and index descriptors, through sp_sysmon.
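A monitoring sketch (the two MDA tables require ASE 15.0.1 or later; the sp_sysmon interval is illustrative):

    -- metadata cache and descriptor usage over a one-minute sample
    sp_sysmon '00:01:00'
    go
    -- procedure cache usage by module, including the high water mark (HWM) columns
    select * from master..monProcedureCacheModuleUsage
    go
    select * from master..monProcedureCacheMemoryUsage
    go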

Preparing for Post-Upgrade Monitoring for QP Changes

Especially post-upgrade, the MDA tables along with sysquerymetrics will be heavily used to find affected queries, as well as changes in overall system resource consumption. This document includes several techniques useful in spotting queries that perform differently, using both sysquerymetrics and the MDA tables. The MDA setup instructions are in the product documentation, but essentially consist of creating a "loopback" server (not needed after ASE 15.0 ESD #2), granting mon_role privileges, and then installing the monitoring tables from the installmontables script. Make sure that you install the most current version of the script – if you previously installed the montables and then applied an EBF, you may need to reinstall them to pick up differences. The MDA table descriptions are in the ASE product documentation (ASE Performance & Tuning). You should also familiarize yourself with sysquerymetrics, the sp_metrics system procedure, and the associated techniques to exploit this new feature to rapidly identify underperforming queries.
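A setup sketch (server, login and path names are illustrative; the loopback server definition is only needed prior to ASE 15.0 ESD #2):

    -- define the loopback server used by the MDA proxy tables (pre-ESD #2 only)
    sp_addserver loopback, null, MYSERVER    -- MYSERVER = your @@servername
    go
    sp_configure 'enable monitoring', 1
    go
    grant role mon_role to sa
    go
    -- then install or refresh the MDA proxy tables from the OS shell, e.g.:
    -- isql -Usa -P<password> -SMYSERVER -i $SYBASE/$SYBASE_ASE/scripts/installmontables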

Review Trace Flags

Either to influence ASE behavior or to correct issues, some customers may be starting ASE with trace flags in the RUN_SERVER file (e.g. -T3605). As many of these trace flags are specific to 12.5.x, they are not necessary in 15.0. The table below lists some of these trace flags and their ASE 15.0 applicability. Customers using trace flags should also check with Sybase Technical Support before removing a trace flag, as it may be being used for a purpose other than the main one listed here.


    Trace Flag  Description                                                   ASE 15.0 Remarks
    291         when ON, for predicates of the form col1 <op> fn(col2),       This is a feature in 15.0
                where the datatype of col2 is higher than that of col1,
                the expression fn(col2) will be cast to the datatype of
                the lower type, i.e. col1
    333         disable min-max optimization for all cases                    No longer supported
    364         use range density instead of total density                    No longer supported
    370         use min-max index only as an alternative to the table scan    No longer supported
                for single table queries; do not perform aggregation
                optimization for joins
    396         use min-max optimization for single table queries             No longer supported
    526         print semi-graphical execution operator tree when showplan    New feature in 15.0
                is turned on

If you are running with a trace flag not listed here, contact Technical Support for assistance. Additionally, the 300-series diagnostic trace flags (302, 310, 311, etc.) are being discontinued and will be deprecated in a future release. They have been replaced with showplan options, which provide better and more readable diagnostic output.

Run Update Index Statistics

Prior to the upgrade, you will likely want to run update statistics using a higher than default step count and possibly a non-default histogram tuning factor. A good starting point might be to increase the default step count to 200 and the default histogram tuning factor to 5, allowing update statistics to create between 200 and 1,000 range cells as necessary. For extremely large tables, you may wish to use a step count of 1,000 or higher (depending on your configured ‘histogram tuning factor’ setting – note that in 15.0.1 the default has changed to 20). While not always possible, ideally you should have one histogram step for every 10,000 data pages, with a maximum of 2,000 histogram steps (beyond this is likely not practical and would likely only have intangible benefits). However, this is likely only necessary for indexes that are used to support range scans (bounded or unbounded) vs. direct lookups. The more histogram steps, the more procedure cache will be needed during optimization, so it is likely best to gradually increase the step count/histogram tuning factor combination until performance is acceptable, and then stop.

Additionally, it is strongly suggested that you run update index statistics instead of plain update statistics (there is a discussion of this later, in the section on missing statistics and showplan options). For existing tables with statistics, you may wish to first delete the statistics so that the update statistics command is not constrained to the existing step counts. While not required, it has been observed during previous upgrades (e.g. 11.9 to 12.0) that table statistics inherited the existing histogram cell values when running update statistics alone, resulting in skewed histograms in which the last cell contained all the new data. For large tables, you could run update statistics with sampling to reduce the execution time, as well as temporarily increasing the number of sort buffers (‘number of sort buffers’ is a dynamic configuration parameter in 12.5) to 5,000, 10,000 or even 20,000 – however, the larger the setting, the more procedure cache will be required – especially if running concurrent update statistics or create index commands.
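A sketch of the kind of sequence described above (the table name and numbers are illustrative; tune them to your own data volumes and procedure cache):

    -- both parameters are dynamic
    sp_configure 'histogram tuning factor', 5
    go
    sp_configure 'number of sort buffers', 5000
    go
    -- drop the old statistics so the existing step count is not inherited, then rebuild
    delete statistics big_table
    go
    update index statistics big_table using 200 values
    go
    -- for very large tables, sampling can shorten the run
    update index statistics big_table with sampling = 10 percent
    go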

The reason this is suggested as part of the pre-upgrade steps is that it can be factored in during the weeks of testing leading up to the upgrade itself. While a more thorough explanation of why this may be necessary will be discussed in the maintenance changes section, ASE 15.0 is more susceptible to statistics issues than previous releases because it now has multiple query processing algorithms to choose from instead of a single one. As a result, running update statistics after the upgrade would normally be recommended to avoid query performance issues – but since neither the statistics values nor their structures change as a result of the upgrade, this step can be accomplished before the actual upgrade. The notable exception would be tables that are planned to be partitioned, which will require dropping and recreating the indexes – which will recreate the statistics in any case.


Installation & Upgrade

The installation guide, available online along with the downloadable PDF version, contains the full details for installing or upgrading ASE. This section mentions common installation problems noted to date.

Installing the Software

Previous releases of Sybase software were typically distributed via CD-ROM, which was also the primary installation media for the first host installed (other hosts may be installed from an image at larger sites). However, with the widespread use of electronic software distribution, ASE 15 is likely going to be installed first from a software download from the Sybase Product Download Center (SPDC). The downloaded software is still in the format of a CD image, so you may burn it to a CD-ROM for installation if necessary. To avoid installation problems, consider the following points:

• Installation directly from the downloaded media without using a CD is supported. However, make sure the download location has a short path (e.g. /sybase/downloads or c:\sybase\downloads) and that the path does not contain any special characters such as the space character. This is especially true of Windows systems – avoid directories such as "C:\Program Files\Sybase\downloads" due to the space between "Program" and "Files". Failing to do this, or using the wrong JRE version, will likely result in installation failures citing "class not found" errors.

• Additionally, the software ships with the correct JRE for the InstallShield GUI. If you have other Java environments installed (JDK or JRE) and have environment variables referencing these locations (e.g. JAVA_HOME, or java in your path for frequent compilations), unset these environment variables in the shell you start the installation from.

• Use the GNU gunzip utility to uncompress the CD images on Unix systems. The downloaded CD images were compressed using GNU zip, and decompressing them with standard Unix compression utilities often results in corrupted images. Please use gunzip, which is freely available from the GNU organization (a short extraction sketch follows this list).

• Use the GNU tar utility to extract the archive files for the CD image. Similar to the above, the file archive was built using a form of the GNU tar utility; using standard Unix tar may not extract the files correctly.

• Many of the hardware vendors support a variety of mount options for their CD-ROM drives. Make sure you are using the correct mount options specified for your platform in the Sybase installation manual.
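A minimal extraction sketch on Unix (the image file name is a placeholder for whatever you downloaded from the SPDC; on some platforms the GNU tar binary is installed as gtar):

    # clear conflicting Java settings in this shell, then unpack with the GNU utilities
    unset JAVA_HOME
    gunzip ase150_image.tgz        # produces ase150_image.tar
    gtar -xvf ase150_image.tar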

Alternative to Dump/Load for ASE 12.5.1+

For customers upgrading from ASE 12.5.1 or higher who plan to build a new server rather than upgrade in place, there is an alternative to dump/load which may prove to be much faster. The crucial element in this alternative is that the new server must have access to the physical devices or to copies of them. Users with SAN technologies may find this process much faster – even if migrating to new hardware at the same time. The general sequence of steps is as follows:

1. The existing 12.5.x database is either quiesced to a manifest file (during testing) or unmounted from the server (final).

2. If the new 15.0 server is on another host, the disk devices need to be unmounted from the current host and mounted on the new host machine (or copied using SAN utilities).

3. The manifest file needs to be copied to the new ASE 15.0 location.


4. The 12.5.x database is then simply mounted by the new ASE 15.0 server using the manifest file, with corrections to the new device paths as necessary.

5. The online database command is issued – the database upgrade will begin as soon as the database is brought online.

The following example scenario demonstrates this technique. The assumption is that you wish to upgrade a database 'testdb' and will be using a manifest file named 'testdb_manifest.mfst'.

1. Quiesce the 12.5.x database to a manifest file using SQL similar to the following:

    quiesce database for_upgrd hold testdb
        for external dump
        to "/opt/sybase/testdb_manifest.mfst"
        with override
    go

2. Copy the devices using disk copy commands (e.g. dd), SAN utilities, or standard filesystem commands (if file system devices are used).

3. When the device copy is finished, release the quiesce with a SQL statement similar to:

    quiesce database for_upgrd release
    go

4. If this is the final upgrade (vs. testing), shut down ASE 12.5 to prevent further changes.

5. Move the device copies to the new host machine and mount them as appropriate.

6. In ASE 15.0, list the physical-to-logical device mappings using a SQL statement similar to:

    mount database all from "/opt/sybase/testdb_manifest.mfst" with listonly
    go

7. From the above, look at the physical-to-logical device mappings and determine the new physical devices that correspond to the logical devices. Then mount the database in ASE 15.0 using SQL similar to:

    mount database all from "/opt/sybase/testdb_manifest.mfst" using
        "/opt/sybase/syb15/data/GALAXY/test_data.dat" = "test_data",
        "/opt/sybase/syb15/data/GALAXY/test_log.dat" = "test_log"

8. Bring the database online and start the upgrade process by using the normal online database command, similar to:

    online database testdb

This technique has a restriction in that the databases need to be aligned with the devices. If a device contains fragments of more than one database, you may have to move/upgrade them all at the same time.

Cross Platform Dump/Load

In ASE 12.5.x, Sybase introduced a cross-platform dump/load capability (aka XPDL) that allows dumps taken from a quiesced server on one platform to be loaded on another platform – regardless of endian architecture and 32-bit/64-bit codelines. While there are a few limitations, most popular platforms are covered. If moving to a new platform while upgrading to ASE 15.0, there are a few things to keep in mind as part of the upgrade when using XPDL:

• While most data is converted between endian structures, the index statistics appear not to be. As a result, after a successful XPDL load sequence, in addition to running the sp_post_xpload procedure, you will need to (re)run update index statistics on all tables (see the earlier recommendations on using higher step values). A short sketch follows this list.

• As documented, large binary fields and image data may not be converted between endian platforms. For example, if a table contains a binary(255) column, it is not likely that the data will be converted between endian structures, while a binary(8) may be. If you know you changed endian architectures, you may have to write a routine to manually swap the bytes – after ensuring that the values are indeed wrong as a result of the platform change. Unfortunately, the exact byte swapping to perform will largely depend on how the original data was encoded.

• If you are storing parsed XML (either in a varbinary or image datatype), you will need to reparse the original XML (hopefully this is in a text column) vs. attempting to swap the byte ordering.
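A post-load sketch of that first bullet (database, table and dump names are illustrative):

    -- after loading a cross-platform dump on the new platform
    load database testdb from '/dumps/testdb_xpdl.dmp'
    go
    online database testdb
    go
    use testdb
    go
    sp_post_xpload
    go
    -- statistics are not converted, so rebuild them with a higher step count
    update index statistics my_table using 200 values
    go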

Obviously, this may impact the time it takes to perform the migration – resulting in a longer application outage for non-replicated systems. Additionally, you may want to time both the XPDL approach and a bcp implementation (or sybmigrate) to determine which might be faster given your volume of data. One consideration is that if bcp is nearly the same speed, then by using bcp or sybmigrate you can also change the page size of the server. For example, on Linux, a 4K page size has considerable performance gains over a 2K page size – and most SAN implementations are tuned to 4K frames.

Installation & Upgrade FAQ

Every time I start the installation GUI, it fails with a "class not found" error – why?

There could be several causes for this. The most likely cause is that the downloaded media path contains a space or other special character. A second common cause is that a different JRE is being used than the one shipped with the installation media.

Where do I get the product documentation – is SyBooks available?

Sybase is discontinuing use of the DynaText-based SyBooks in favor of PDF format and Eclipse-based documentation. Currently, you can download the PDF versions from the Sybase Product Download Center. The online documentation using the Eclipse viewer is also available at http://sybooks.sybase.com/nav/detail.dodocset=741, which can be found by traversing www.sybase.com: under the "Support & Services" tab, select "Adaptive Server Enterprise" and your language, then select "Adaptive Server Enterprise 15.0".

When I try to install, I get a file missing error with a path like md5 and a string of numbers

Most likely this happens when the installation media was created from a download image vs. a shipped product CD. There are several possible causes:

gunzip vs. unix compress – The downloaded CD images were compressed using GNU zip. Decompressing them with standard Unix compression utilities often results in corrupted images. Please use gunzip, which is freely available from the GNU organization.

GNU tar vs. unix tar – Similar to the above, the file archive was built using a form of the GNU tar utility. Using standard Unix tar may not extract the files correctly.

Incomplete download image – The download itself may not have completed, or in a subsequent ftp from the download machine, the archive file may not have been copied completely. You may want to download the image again.

If none of the above three work, contact Sybase Technical Support for assistance.

When I try to install from the CD, it won't run

Many of the hardware vendors support a variety of mount options for their CD-ROM drives. Make sure you are using the correct mount options specified for your platform in the Sybase installation manual.


What should I do if the upgrade fails?

Sybase provides a utility that upgrades an existing ASE by automatically performing the upgrade steps. If the upgrade should fail for any reason, customers have the option to manually complete the upgrade after resolving the problem. Should this be necessary, contact Sybase Technical Support for assistance in completing the upgrade. This is particularly true if upgrading via dump/load: there have been known issues in which ASE 15.0 configuration settings could prevent an upgrade from completing when loading from an older dump.


Partitioned Tables

One of the key new features in ASE 15.0 is semantic partitioning. As the name implies, this feature allows data to be distributed to different partitions according to the data values, vs. only the round-robin partitioning that earlier versions of Sybase ASE provided. A full description of this feature is available in Chapter 10 of the ASE 15.0 Transact-SQL User's Guide. The purpose of this section is to highlight some of the gotcha's or unexpected behaviors when using partitions that some customers have already noticed. The behaviors are documented – the purpose of this document is to help customers understand the rationale.

Partitions & Primary Keys/Unique Indices

As documented in the Transact-SQL User's Guide, unique indexes, including primary keys, may not be enforceable if the partition keys are not the same as the index/primary key columns – or a subset of them. As a result, up through ASE 15.0 ESD #2, attempts to create a primary key constraint or a unique local index on columns not used as partition keys will fail with an error similar to:

    use demo_db
    go
    create table customers
    (
        customer_key        int      not null,
        customer_first_name char(11) null,
        customer_last_name  char(15) null,
        customer_gender     char(1)  null,
        street_address      char(18) null,
        city                char(20) null,
        state               char(2)  null,
        postal_code         char(9)  null,
        phone_number        char(10) null,
        primary key (customer_key)
    )
    lock datarows
    with exp_row_size = 80
    partition by range (state)
        (ptn1 values <= ('G'),    -- the partition boundary values shown here are illustrative
         ptn2 values <= ('N'),
         ptn3 values <= ('T'),
         ptn4 values <= ('ZZ'))
    go

    Msg 1945, Level 16, State 1:
    Server 'CHINOOK', Line 2:
    Cannot create unique index 'customers_1308528664' on table 'customers'. The table partition
    condition and the specified index keys make it impossible to enforce index uniqueness across
    partitions.
    Msg 2761, Level 16, State 4:
    Server 'CHINOOK', Line 2:
    Failed to create declarative constraints on table 'customers' in database 'demo_db'.

Note that in the above case, since an error of level 16 is returned, the command fails; consequently, the table is NOT created (despite the fact that the error message seems to indicate only that the primary key constraint was not created, vs. the entire table). While a future ESD may eliminate this restriction, it is one to be very much aware of, as the most common partitioning schemes for range partitions typically do not include primary key attributes. To see how this can happen, consider the following scenario.

Most customers are considering partitioning their tables for one of two reasons: 1) to allow more efficient and practical use of the parallel query feature; 2) to allow easier DBA tasks. Note that the two are not always mutually exclusive. However, there is one aspect of partitioning for DBA tasks that can cause unexpected behavior. The most common method of partitioning for DBA tasks is to partition based on a date or day – possibly a modulo day number calculated as an offset. This is typically done to allow fast archiving of older data. Oftentimes, this date is either not in the primary key at all or is only one column of the primary key. Another example is when the table is partitioned according to a lower-cardinality division such as state/country vs. the unique key.


Now, let's assume we have a table containing customers (uniquely identified by cust_id) in the United States – and, to divide them up by sales region or for another reason, we partitioned the table by state. Consider the following diagram:

Figure 2 - Partitioned Table and Primary Key Enforcement Issue

Even though we are attempting to insert the exact same cust_id value (12345), because we are inserting into different state partitions, the insert succeeds. Why? Because the various index partitions act almost as if they are independent – consequently, when the values are being inserted, they can't tell whether the value already exists in another partition. This explains the warning you receive when you partition a table on a column list that does not include the primary keys, or attempt to create a unique local index on columns that are not used as partition keys.

So, why didn't Sybase enforce uniqueness? The answer is speed. Let's say, for example, that we have a 50 million row table. The primary key and any non-clustered index would likely need 7 levels of indexing to find a data leaf node from the root node of the index. When partitioned into 50 partitions (assuming even distribution of values), each partition would only have 1 million rows – which likely would need only 5 levels of indexing. In an unpartitioned table, the unique value check would only take 7 IOs to read to the point of insertion and determine whether a row with that value already exists. However, for a partitioned index, it would have to traverse all 5 levels for all 50 partitions – 250 IOs in all.

The workaround to this problem is simple – create a global unique index to enforce uniqueness instead of a local index or primary key constraint (all primary key constraints are partitioned according to the table schema). Since a global index is unpartitioned, uniqueness can still be enforced.
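A sketch of that workaround for the earlier customers example (assuming the table has been created without the primary key constraint; the index name is illustrative – an index created without the 'local index' clause is global, so uniqueness can be enforced across all partitions):

    create unique nonclustered index customers_ukey
        on customers (customer_key)
    go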

Global vs. Local Indexes

An aspect that is often overlooked is whether an index on a partitioned table should be created as a local index or a global index. Some initial guidelines are:

• If the primary key is not a superset of all the partition keys, use a unique global index.
• If the index does not contain ALL the partition keys, either:
    o Use a global index, or
    o Add the partition keys and use a local index (preferred)
• You may need both a global and a local index on the same columns.
• If partitioning for maintenance reasons, keep the number of global indexes to an absolute minimum (with zero being preferred).


The reason for these rules is that the optimizer can only consider partition elimination when it knows which partitions are affected – namely, by inferring this from partition key column(s) that appear in the where clause. Take, for example, the following table:

    create table trade_detail (
        trade_id    bigint   not null,
        trade_date  datetime not null,
        customer_id bigint   not null,
        symbol      char(5)  not null,
        shares      bigint   not null,
        price       money    not null
    )
    partition by range (trade_date)
    (
        Jan01 values



The index is now usable. Note that it is likely that, even in an unpartitioned table, the earlier index might not get used.

The reason it is suggested that global indexes be kept to a minimum for partitioned tables has more to do with DBA tasks than query performance. If you are partitioning for DBA maintenance reasons, you will likely be dropping/truncating partitions – especially if implementing a rolling partition scheme. If only local indexes exist, these tasks take only a few seconds. However, because a global index has only a single index tree, when a partition is dropped or truncated the global index needs to be rebuilt after deleting all the index rows corresponding to the partition's rows – a task that could take hours.
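A sketch of the syntax difference using the trade_detail example above (index names are illustrative; an index is global unless the 'local index' clause is specified):

    -- local index: partitioned like the table, so partition-level maintenance stays cheap
    create index trade_date_cust_lidx
        on trade_detail (trade_date, customer_id)
        local index
    go

    -- global index: a single index tree spanning all partitions (can enforce uniqueness)
    create unique index trade_id_gidx
        on trade_detail (trade_id)
    go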

In addition to primary key enforcement, other cases in which global indexes might be required include queries using aggregates (especially grouped aggregates) that do not reference the partition columns in the where clause – as well as other possibilities. It is highly recommended that you thoroughly test your application if you partition tables that previously were unpartitioned.

Multiple/Composite Partition Keys & Range Partitioning

As described in the Transact-SQL User's Guide, both the range and the hash partitioning schemes allow users to specify more than one column (up to 31) as partition keys - creating a composite partition key. For hash partitions, this behaves as would be expected. However, for range partitions, the behavior can be unexpected, especially when the partition keys are numeric. The reason is that most users assume that ASE uses all of the keys every time to determine where the row needs to be stored. In actuality, ASE uses the fewest partition keys, in sequence, needed to determine the appropriate partition. This rule is documented as:

if key1 < a, then the row is assigned to p1
if key1 = a, then
    if key2 < b or key2 = b, then the row is assigned to p1
if key1 > a or (key1 = a and key2 > b), then
    if key1 < c, then the row is assigned to p2
    if key1 = c, then
        if key2 < d or key2 = d, then the row is assigned to p2
    if key1 > c or (key1 = c and key2 > d), then
        if key1 < e, then the row is assigned to p3
        if key1 = e, then
            if key2 < f or key2 = f, then the row is assigned to p3
        if key2 > f, then the row is not assigned

This can be summarized by the following points:

• If value < key1, then current partition
• If value = key1, then compare to key2
• If value > key1, then check next partition range

To see how this works, let's assume we have a table of 1.2 million customers. Since we want to partition to aid both parallel query and maintenance, we are going to attempt to partition it by quarter and then by customer id. This way, we can archive the data every quarter as necessary. Note, however, that the table has a month column rather than a quarter column; but since we know that the quarters end every third month, we try the following partition scheme (Tip: this example is borrowed from the Sybase IQ sample database telco_facts table in case you wish to try this).

alter table telco_facts_ptn
partition by range (month_key, customer_key)
(p1 values


After the partitioning, we notice that instead of evenly distributing the 1.2 million rows (150,000 rows to each partition), the odd partitions contain 250,000 rows while the even partitions contain only 50,000 rows. What happened? The answer is that for months 1 and 2 (Jan, Feb), when ASE compared the data values to the first key in the first partition, the value was less than the key (1 < 3 and 2 < 3); therefore, it immediately put all the Jan and Feb data into the first partition regardless of the customer_key value. Only when March data was being entered did ASE note that the value was equal to key1 (3 = 3) and that it therefore needed to compare the customer_key value with key2. Consequently, the even partitions only have data in which the month is equal to the partition key and the customer_key value is greater than the customer_key value of the preceding partition key.
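One way to avoid the skew in this particular example (a sketch only - the quarter boundaries shown are illustrative, not from the original text) is to partition on month_key alone at quarter boundaries, so that every month of a quarter lands in the same partition regardless of customer_key:

alter table telco_facts_ptn
partition by range (month_key)
(q1 values <= (3),
 q2 values <= (6),
 q3 values <= (9),
 q4 values <= (12))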

Semantic Partitions & Data Skew

In earlier releases of ASE, the segment-based round-robin partitioning scheme supported parallel query capability. However, if the partition skew exceeded a specific ratio, the optimizer would consider the partitioning too unbalanced to provide effective parallel query support and would process the query in serial fashion. To prevent this, DBAs using parallel query often had to monitor their partition skew and attempt to rebalance it by dropping and recreating the clustered index or by other techniques.

With semantic partitioning, data skew is no longer a consideration. Rather than evaluating the depth of each partition for parallel query optimization, the ASE 15.0 optimizer considers the type of partition, the query search arguments, and so on. In fact, for semantic partitions, due to the very nature of unknown data distribution, there likely will be data skew; but since a hash, list or range partition signals where the data resides, the skew is unimportant.

Partitioned Tables & Parallel Query

Partitioned tables allow more effective parallel query than previous releases of ASE. However, in some cases this can lead to worse performance than expected. For example, in previous releases of ASE, the number of partitions created was typically fairly small - often within a small multiple (i.e. 2) of the number of ASE engines. Additionally, max parallel degree and max scan parallel degree were also tuned to the number of engines. While the latter may still be the case, many customers are now looking at creating hundreds of partitions - largely due to the desire to partition on a date column for archive/data retention and maintenance operations.

If the partitioning is strictly for maintenance operations, parallel query can be disabled by setting max parallel degree and max scan parallel degree to DEFAULT (1). However, at query time, the proper way to disable it is to use the session option 'set parallel_query 0'.
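A sketch of the two approaches just described (the values shown simply restore the parallel configuration defaults):

-- server wide: no query runs in parallel
sp_configure 'max parallel degree', 1
sp_configure 'max scan parallel degree', 1

-- per session: leave the server settings alone but disable parallel query
-- for this connection only
set parallel_query 0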

In early releases of ASE 15.0, in order to perform partition elimination for each thread of a parallel query, the partition columns were added to the query predicates. As a result, earlier releases of ASE 15 could see parallel query performance degradation in which queries using covered indexes were no longer covered, due to the automatic inclusion of the partition columns. For example, a query such as:

-- assume the table is partitioned on a column 'SaleDate'
-- and assume it has an index on StockTicker
select count(*)
from trades
where StockTicker = 'SY'

would get expanded to:

-- after expansion for parallel query...
select count(*)
from trades
where StockTicker = 'SY'
and SaleDate between <partition lower bound> and <partition upper bound>

This specific problem was fixed in ESD #2; however, it was likely the cause of the global vs. local index selection issue mentioned above. If you plan on using the parallel query features within ASE 15.0, you will need to carefully test each query and compare query plans. This can be done via a technique mentioned later in this document.

Partitioning Tips

If you are planning on implementing partitioning, the following tips are offered for your consideration.

Best Method for Partitioning

There are three basic methods for creating partitions:

1. Use the alter table command.
2. Bcp out the data; drop and recreate the table; bcp the data back in.
3. Create a new table as desired; use select/into existing; and rename the old/new tables.

The first two are likely familiar to most readers. The third is a new feature of ASE 15.0 that allows a query to perform a "select/into" an existing table, provided that the table doesn't have any indexes or constraints (and that the table being selected into is not the source table itself). The syntax is identical to a typical select/into for a #temp table - except that the table name must already exist (and its structure must match the query output columns).
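A minimal sketch of method 3 using hypothetical table names - the target table is assumed to have been created empty with the desired partitioning, no indexes or constraints, and a matching column list:

-- copy the data into the pre-created, partitioned table
select trade_id, trade_date, customer_id, symbol, shares, price
into existing table trades_ptn
from trades

-- swap the old and new tables once the copy completes
exec sp_rename 'trades', 'trades_old'
exec sp_rename 'trades_ptn', 'trades'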

Which one of these methods works best for your situation will depend on hardware capacity, system activity and the number of rows in the table - as well as whether either of the following is a consideration:

• You also wish to alter the table to use datarows locking.
• You wish to add a column - either derived or denormalized - that will be used as the partition key.

Considering the first point, it is likely that any existing history table is still using allpages locking. Extremely large tables with tens to hundreds of millions of rows are often a source of contention within ASE. Most often the reason is that these tables were created prior to upgrading to ASE 11.9.2 and the DOL locking implementation. Back then, altering the locking scheme on such large tables was often impractical - not only because of the table copy step, but also because of the associated index maintenance. However, if you are now faced with having to drop the indexes anyway in order to partition the table, you may want to take advantage of the opportunity and also alter the table to datarows or datapages locking. Unfortunately, you can't partition and change locking in a single alter table (and thereby copy the table only once), as the alter table command only allows one operation at a time. However, altering the locking will likely run much faster than partitioning, as it is a straightforward table copy versus the partition determination and row movement that partitioning requires.
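A sketch of the resulting two-step approach on a hypothetical trade_history table (each alter table rewrites the table, so the data is copied twice; the boundary values are illustrative):

-- step 1: change the locking scheme (straightforward table copy)
alter table trade_history lock datarows

-- step 2: partition the table (partition determination plus row movement)
alter table trade_history partition by range (trade_date)
    (q1 values <= ('Mar 31 2006 23:59:59'),
     q2 values <= ('Jun 30 2006 23:59:59'),
     q3 values <= ('Sep 30 2006 23:59:59'),
     q4 values <= ('Dec 31 2006 23:59:59'))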

To get an idea of how the three methods compare, the following test was run on a small 32-bit dual-processor (single core) Windows host machine using a table of over 168,000,000 rows (19 columns wide). The machine used a mix of SATA hard drives and U160 SCSI internal disks, with 4GB of RAM total (2GB for ASE).

Figure 3 - Comparison of Partitioning Methods

One interesting item to note is that the two alter tables were considerably faster than the bcp in/out strategy - which most DBAs would probably not have suspected. A good reason is that the alter table ... lock datarows command took only 13 minutes - considerably shorter than the partitioning phase.

The other advantage of the select/into existing approach is if you need to add a derived or denormalized column that will be used for partitioning - for example, adding a column to denormalize a column such as order_date from the parent "orders" table to the "order_items" table to facilitate partitioning and archiving old orders and order items. Normally this would require a third alter table command, followed by an update. Instead, it can be done in the same select/into existing statement via a standard join or a SQL computed column (using a case statement or other expression).

To further improve performance, partitions can be created using parallel worker threads. For this to work, parallelism must be enabled at the server configuration level (number of worker processes, max parallel degree, max scan parallel degree) and parallel sort must be enabled for the database (sp_dboption 'select into/bulkcopy/pllsort', true). Enabling the latter will also help with re-creating the indexes after the partition step is complete.
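A sketch of those settings (the numeric values are illustrative only and need to be sized for the host; mydb is a hypothetical database name):

-- server level: enable parallel worker threads
sp_configure 'number of worker processes', 20
sp_configure 'max parallel degree', 8
sp_configure 'max scan parallel degree', 4

-- database level: allow parallel sorts (also speeds up index re-creation)
use master
go
sp_dboption mydb, 'select into/bulkcopy/pllsort', true
go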

Re-Creating Indexes on Partitioned Tables

In order to partition a table, you will first need to drop all the indexes. This is necessary not only because the indexes themselves will need to be partitioned, but also from a performance standpoint, as dynamically maintaining the index leaf nodes during massive page changes will not be required. Since you have to drop the indexes anyway, you may wish to:

• Re-create them as local/partitioned indexes
• Re-create them with larger statistics steps

Re-creating indexes on large tables is a very time consuming task. However, now that the table is partitioned, worker threads can take advantage of the partitions and be more efficient than a single process. Consequently, you may find it much faster to re-create the indexes using parallel threads - however, you may wish to constrain the number of consumers to the lesser of the number of partitions or twice the number of engines (consumers = min(# of partitions, # of engines * 2)).
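For example, a sketch of re-creating one index in parallel on a hypothetical table with 8 partitions running on 3 engines, so consumers = min(8, 6) = 6 (a larger statistics step count can also be requested in the with clause via "statistics using N values"):

-- parallel sort uses up to 6 consumer processes to build the local index
create index trade_history_date_idx
    on trade_history (trade_date, customer_id)
    with consumers = 6
    local index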

Partitioning and Relationships

Some of the largest tables in any OLTP system are the "event" tables that track occurrences of business transactions - sales, shipments, orders, etc. Often there are one or two other large tables, such as customers or products, that are related to these at a higher level. At a detailed level, sales and similar events are often stored in one or more tables, with a main table representing the "header" and the details being stored in other tables. In addition, audit or history tables often contain orders of magnitude more data than their active counterparts.

As a result, it is best if a common partitioning key can be used across all of these tables. This common key should also collate all the data for a single event to the same partition. This may require altering the schema to "carry" the partitioning key to the related tables. For example, a common partitioning scheme for ease of maintenance is based on a range partition scheme in which each range represents a single day, week or month. While the "sales" table contains the date and therefore is easily partitioned, the "sales" detail tables may not. Unfortunately, these tables are often much larger than the parent, as the number of child rows is an order of magnitude higher. Consequently, the tables that would most benefit from partitioning are excluded from early attempts. However, if the sales date were "carried" to each of the child tables containing the sales details, these could be partitioned as well.
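A hedged sketch of "carrying" the key, with hypothetical sales tables: the detail table gains a denormalized sale_date column during the one-time copy, so that both tables can then be range partitioned on the same key.

-- sale_items_ptn was pre-created with a sale_date column and the desired partitioning
select si.item_id, si.sale_id, si.product_id, si.qty, si.price,
       s.sale_date              -- denormalized from the parent sales table
into existing table sale_items_ptn
from sale_items si
join sales s on s.sale_id = si.sale_id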

One of the biggest reasons for doing this (besides ease of maintenance) is that, if you are using parallel query, partitioning on the same partition keys makes it likely that a parallel query can use a vector join instead of the traditional M*N explosion of worker threads otherwise required.

As a result, before implementing partitioned columns, you may wish to carefully review the partition keys on related tables. If you need to "carry" a partition key from a parent to a child table, an application change may be required to ensure that this key's value is kept in sync. As with any application change, this may take a while to implement and test.

Which Partitioning Scheme to Use

There are several aspects to determining which partitioning scheme is the best choice. You should start by asking yourself the following questions:

• Is the primary goal of partitioning the data to:
    a) Reduce the maintenance time for update stats, reorgs, etc.?
    b) Improve parallel query optimization?
    c) Improve insert/point query performance?
• Is the desired partition data field monotonically increasing - such as a transaction id, transaction date, etc. - and is there a need for range scans within this increasing field?
• Are there natural data boundaries (i.e. weeks/months) that can be used to determine the boundaries for data partitions?
• What is the order of magnitude of the number of discrete values for the partitioning columns (i.e. 10's, 100's, etc.) - especially for the leading column of the list of partition columns?

The first point is extremely important. If partitioning for maintenance, it is more than likely that you will be using range partitioning on a date column. While other partitioning schemes can be used, they generally will require maintenance, such as update statistics, to be run periodically. With range partitioning, much of the maintenance on most of the data can be eliminated.

The second point is interesting from a different perspective. First, if attempting to evenly distribute the data, a hash partitioning scheme is the apparent choice. However, similar to the previous example, a monotonic sequence also becomes a natural fit for range partitioning when partitioning for maintenance. If you think about it, date datatypes are a special form of monotonic sequence in that they are internally represented as the time elapsed from a particular reference date. As the natural sequence of dates increases, the internal values increase in order as well. Order numbers, transaction ids, trade numbers, etc. are similar - consequently, both dates and sequential numbers may use hash or range partitioning depending on the goal.

The last two points are focused more on how evenly distributable the data will be across the partitions. Unlike indexes, where it was highly recommended to put the most distinct column first (a legacy recommendation that is no longer necessary if you use update index statistics instead of update statistics), the first column of a partition key sequence does not need to be the most unique. However, if it has very low cardinality (especially when lower than the number of partitions), you may want to order the partition key sequence differently. Keep in mind that the access method will still be via index order - the order of the partition columns is independent.

The following sections describe each of the partition schemes and which ones are best for particular situations.
situations.<br />


Range Partitions

Range partitions will most likely be the most common form of partitioning - particularly for those wishing to partition on date ranges for maintenance purposes. While range partitions are good choices for tens to low-to-mid hundreds of partitions, the algorithm for finding which partition to use is a binary search. From early computer science classes, we understand that a binary search takes at most about log2(n) splits - meaning that the most iterations needed to find the correct value will be log(n)/log(2), where n is the number of elements and log() is the base 10 logarithm. For 1,000 values this maximum would be 10 splits, while for 100 it would be 7. Put another way, retaining 30 days of data online using a range partition would result in 5 splits for the partition search, while retaining 3 years (1,095 days) would result in 11 splits - doubling the partition search time. Consequently, for really large numbers of partitions (1,000's or more), the search time to find the desired partition may adversely affect insert speeds.

From a query optimization perspective, range partitions are also best used if range scans (i.e. column between <value1> and <value2>) or unbounded ranges (i.e. column > value) are frequently used - particularly when they fall within a single range. However, this points out a situation in which the granularity of the partitioning vs. the query performance could need balancing. If extremely narrow partition values are used (i.e. one partition per day), queries with bounded or unbounded ranges that span more than one day could perform slower than when a wider partition granularity (week) is used. As an extension of this, the server sort order could have an impact as well. For example, a server using a nocase sort order and a range partition on a character attribute such as name will have a different data distribution than the same partitioning scheme on a case-sensitive sort order system. This does not imply that it might be better to change to a case-insensitive sort order if your range partitioning scheme involves character data, since case-insensitive comparisons take longer than case-sensitive comparisons. However, it may mean that the range specification needs careful specification - for example the expression "col


In summary, range partitioning should be considered when:

• The number of partitions will likely be less than 500.
• Range scans are common on the partition key attributes.
• The ability to add/drop partitions is important.
• Reduction of time for data maintenance activities such as update statistics, reorgs or other activities is desired.
• Partition elimination and parallel query performance are desirable.
• Composite partition columns are required (vs. list partitioning).


Hash Partition

Hash partitioning is not a substitute for range partitioning. If the number of partitions for a range partition starts to exceed several hundred, rather than attempting to use a hash partition instead, you should repartition with broader ranges to reduce the number of partitions. The primary reason for this suggestion is that the data distribution will be different with hash partitioning - and some queries (such as range scans) may perform worse when using hash partitions.

While it is true that hash partitions work best when the number of partitions is large (1,000's), they also work very well at lower numbers (i.e. 10's) of partitions. One reason is that rather than a binary search, the correct partition for a data row is simply determined from the hash function - a fairly fixed cost whether there are 10 partitions or 1,000.

Hash partitioning is best when an even distribution of values is desired to achieve the best insert performance and the best performance for point queries (in which only a single row is returned). However, partition elimination for hash partitioning can only be achieved for equality predicates, not range scans. Additionally, if the number of partitions is small and the domain of the datatype allows a wide range of values (such as an 'int' datatype - or a varchar(30)), an even distribution of values may not occur as desired, since the partition assignments are based on all the possible values of the domain rather than the actual number of discrete values.

From a maintenance perspective, unless the hashed key is a date column, maintenance commands may have to be run against the entire table, as new data rows are scattered throughout the various partitions. Additionally, a hash partition cannot be dropped - or else it would create a "hole" for future data rows containing those hash values. Addition or deletion of partitions requires re-distribution of the data.

Good candidates for hash partition keys are unbounded data values such as sequential transaction ids, social security numbers, customer ids, product ids - essentially any data column that is most often accessed using an equality predicate, and most often a primary key or alternate primary key.

In summary, hash partitioning should be considered when:

• The number of partitions exceeds the practicality of list partitioning.
• Write-balancing/even data distribution is important for IO loading.
• Partitioning attributes are fairly discrete, such as primary keys, pseudo keys or other distinct elements. In fact, the cardinality of the data values should be in the same order of magnitude as the domain of the datatypes (to ensure even data distribution).
• Partition elimination and parallel query performance are desirable.
• Range scans are rarely issued on the partition key attributes.
• Reduction of total time for data maintenance operations such as update statistics, reorgs, etc. is not a consideration (as these operations will still need to be performed on all partitions).
• Composite partition columns are desired.
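A sketch of a hash-partitioned table (hypothetical names; the assumption is that customer_id is nearly always accessed by an equality predicate):

create table customer_profile (
    customer_id bigint   not null,
    region      char(2)  not null,
    last_login  datetime null
)
partition by hash (customer_id)
(p1, p2, p3, p4)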


List Partition

List partitioning works best for small numbers of discrete values (10's), as list partition lookups are more expensive than other partition types due to the need to search sequentially through the list of elements, as well as the increased memory consumption needed to hold the partition key values. Consequently, if range partitioning is an option for larger quantities (above 30-40), then range partitioning should take precedence over list partitioning if performance is a consideration. This can't always be achieved, however.

Let's take the scenario that we wish to balance the I/O load of a customer enquiry system by partitioning by state. If we look at North America, we have roughly 50 states in the US, 13 provinces and territories for Canada, and slightly more than 30 states for Mexico - plus territories (Caribbean and Pacific islands). The total number of states and provinces is roughly 100. The natural tendency would be to use a list partition scheme to precisely locate each state/province within its own distinct partition. However, when rows are inserted, in order to find the correct partition, the server will need to scan each of the list partition key values sequentially. While this may be quick for states such as Alaska, Alabama, Alberta, etc., it may take a bit longer for states such as Wyoming. Even for states in the middle - Maryland, New York, etc. - the number of comparisons would be at least 20 or so. By comparison, a binary search on 100 items will locate the correct partition within 7 comparisons (remember from programming 101 - the maximum number of iterations for a binary search is roughly log2(n)). This means that for the more populous state of California, either method is likely to perform about the same - while other states with large metropolitan areas, such as New York, Illinois and Ontario, would benefit. To achieve this, the range partitioning scheme would need to use distinct values listed in alphabetical order, such as:

Partition by range (state_code) (
    P1 values


Partition by list (state_code) (
    NewEngland values ('ME', 'MA', 'NH', 'RI', 'VT', 'CT', 'NY'),
    MidAtlantic values ('PA', 'NJ', 'MD', 'DE', 'VA', 'WV'),
    SouthEast,
    GreatLakes,
    GulfCoast,
    GreatPlains,
    NorthWest,
    SouthWest,
    Maritimes,
    EasternCanada,
    WesternCanada,
    ...
)

This has some definite advantages for query performance - and it also is easier for DBA maintenance, as update statistics, reorgs, etc. can be started on the different partitions earlier. While the latter is also true of individual state partitions, the above grouped approach is likely less error prone.

Of special consideration for list and range partitions is the handling of NULL values in partition keys. While list partitions can specify a NULL value, range partitions cannot. For example:

Partition by list (
    NullPtn values (NULL),
    Ptn_1 values (...),
    ...
)

is legal, while:

Partition by range (
    NullPtn values


querying a table. By eliminating last page contention, high insert environments could have evenly distributed inserts across the partitions. While this may appear similar to hash partitioning, there are a number of differences, which we will discuss in a minute. However, the main driver for round-robin partitioning was eliminated when DOL locking was introduced in ASE 11.9.2, resolving much of the page contention issue - and doing a much better job, as partitioned tables in 11.0 and 11.5 still suffered index contention. Round-robin partitions still provided some insert speed improvements - especially for bulk operations such as bcp. However, compared to hash partitioning as an insert/write-balancing mechanism, round-robin partitioning does have some noted differences:

• The partition determination is based on a round-robin of the user's session - low concurrency of user sessions (fewer sessions than partitions) results in unbalanced partitions.
• Because the partition determination is based on the user session rather than data semantics, two users inserting identical values for the partition keys could save their data to different partitions. As a result, the optimizer cannot perform partition elimination as a query optimization technique - resulting in higher I/O and CPU costs.
• Local indexes are not possible with round-robin partitioning in ASE 15.0. As a result, any advantage in index tree height for insert activity as compared to hash partitioning is lost.
• One advantage of round-robin partitions is that an update of a data value does not cause the row to change partitions - an expensive update operation.
• Another advantage is that with hash partitioning, two different data values could result in the same hash key - and force the writes to the same partition.
• Altering the datatype of a hash partition key may result in significant re-distribution of data.

Round-robin partitions should be used when the following conditions are true:

• Parallel query performance is not a high consideration.
• Ease of data maintenance is not a high consideration.
• Maximum write-balancing is desired among low numbers of concurrent sessions.

Dropping Partitions

In the GA release of ASE 15.0, dropping a partition was not supported; however, it was added in ASE 15.0.1 (available Q3 '06) for range and list partitions only. Prior to 15.0.1, in order to remove a partition, the table had to be repartitioned using the same partitioning scheme, but without the undesired partition. As a workaround, you can truncate all of the data in a partition; with no realistic limit on the number of partitions, this has much the same effect.

However, this is not as simple a task as it would seem. The main reason you would be dropping a partition is that the partitioned data has been archived. However, this implies a possible need to later add a partition in between existing partitions (when data is un-archived) rather than strictly at the end - range and list partitions allow partitions to be added to the existing scheme, but for range partitions the key value must be higher than that of previous partitions, effectively adding the partition at the end. Alternatively, DBAs might try to use drop/add partition to fix minor partitioning scheme problems without re-partitioning the entire table. Understanding this, now consider the following scenario. Assume we have created a range partitioned table on partition keys of 10, 20, 30, 40, 50, etc. Now, let's assume that we archive the data in the first partition (


relocated the '5' rows when the new partition is added - effectively turning the 'add partition' into a repartitioning, as data is relocated. This problem does not occur in ASE 15.0.1, as partitions can currently only be added at the end of a range, where the assumption is that no data will need to be migrated.

Additionally, what would be the impact of dropping a hash partition? The intent of a drop partition generally is to remove all the data with it. Again, the data removal isn't where the problem lies - the issue is what happens when new data is inserted. In this case, consider a hash partition on an integer column using 10 partitions. Dropping one of them removes the hash bucket for 1/10th of the possible data values. Let's assume that the particular hash bucket removed held the hash keys for integers {5, 32, 41, etc.}. Now, a user inserts a value of 32. What should happen? Should the hashing algorithm change to reflect the full domain across the remaining 9 partitions? If so, then the purpose of the drop partition is merely to redistribute the data instead of removing it (again, a repartition). Or perhaps the value should be rejected because the hash bucket no longer exists (much like attempting to insert an unlisted value into a list partition)? This ambiguity is why dropping hash partitions is currently not supported in ASE 15.0.1.

Dropping a list partition is fairly straightforward - with one little "gotcha". If you drop a list partition and someone then attempts to insert a value that belonged to that partition, they will get an error. Consequently, care should be taken when dropping a list partition to make sure that the application will not break as a result.

However, a round-robin partition has an even worse problem than hash partitions; consequently, dropping round-robin partitions is also not supported. As you can see, dropping a partition is a feature that goes beyond simply removing the partition and all its data. As a workaround, simply truncating a partition (the truncate table command in ASE 15.0 now supports truncating just a single partition) may be a more usable approach. However, even this is not without its issues. If using parallel query, the optimizer considers all the partitions as part of the worker thread costing and scanning operations (the latter if partition elimination is not available). As a result, the following guidance is provided regarding drop partition (see the sketch after this list):

• Dropping a list partition is fairly low risk and doesn't need much consideration - other than ensuring that the application will not attempt to insert a value that would have gone into the old partition.
• Dropping a range partition is not advisable if you intend to unarchive data that used to be in that partition back into the existing schema. If the data would be unarchived into a different schema for analysis, dropping a partition may still be an option.
• Dropping (rather than merely truncating) partitions may be recommended when parallel query is enabled.
• If you anticipate dropping a partition, you must use either a range or a list partition. If you suspect that you may need to re-add the partition after dropping it, a list partition is best - assuming a small number of partition key values.
• Dropping a partition will cause global indexes to be rebuilt. Unlike a local/partitioned index, in which the index partition can simply be removed, it is likely that the partition's data is scattered throughout the index b-tree - including intermediate node values. As a result, you may find it faster to drop global indexes prior to dropping a partition and to recreate them manually later.
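A sketch of the two operations discussed above, using the earlier trade_detail table (the partition name is illustrative; drop partition assumes ASE 15.0.1 or later and a range or list partition):

-- remove just the data, keeping the partition definition in place
truncate table trade_detail partition Jan01

-- remove the partition and its data entirely (15.0.1+, range/list only)
alter table trade_detail drop partition Jan01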

Creating a Rolling Partition Scheme

One of the primary drivers for partitioning from a Sybase customer perspective is to ease the administrative burden of very large tables during maintenance operations. For example, running update statistics, reorgs, and data purge/archive operations on a 500 million row table can be extremely time consuming and in many cases prohibitive except on the weekend. As a result, many Sybase customers are looking at implementing a rolling partition scheme in which each day's data is partitioned separately. This has some immediate benefits:

• Since most of the older data is fairly static, there is no need to run update statistics or reorg on those partitions. In fact, these operations may only have to be run on the last week's partitions - greatly reducing table maintenance time.
• Data purge/archive operations become simplified - rather than a large delete that takes a long time to run (and locks the entire table), or iterative batches of 1,000 or so deletes to avoid a table lock, older partitions can simply be dropped.

The partition granularity will likely depend on the data retention policy. As discussed earlier in the partitioning tips, if the data retention policy is 5 years online, having one partition per day is not likely very efficient, as this would require 1,825 partitions. As a result, the first decision will be how granular to make the partitioning scheme. If keeping 5 years, one partition per week (260 partitions) would probably be advisable.

Regardless, the method for creating a rolling partition scheme as of ASE 15.0.1 is as follows:

• The table must contain a date field (although any sequentially increasing column is usable).
• The range partition is created using the full date boundary for the day, week or month granularity that was decided upon. Only enough partitions are created to support the retention policy, plus a few in advance.
• With each purge cycle, the next set of partitions is created while the older ones are dropped. For example, if using a rolling partition on day but purging weekly, every week 7 new partitions will need to be created and the 7 oldest partitions dropped.
• The number of partitions should run at least one purge cycle in advance. For example, if partitioning by day and purging by the week, you should have an extra week of partitions initially (e.g. 37 partitions for a 30-day retention) - to ensure that rollover conditions during processing don't cause application errors if a partition isn't available.
• The partition names should be able to be autogenerated to facilitate scripting.

The earlier trade_detail table is a good example of a rolling partition based on a single day:

create table trade_detail (
    trade_id     bigint not null,
    trade_date   datetime not null,
    customer_id  bigint not null,
    symbol       char(5) not null,
    shares       bigint not null,
    price        money not null
)
partition by range (trade_date)
(
    Jan01 values
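A sketch of the roll itself, assuming daily partitions named for their dates (the names and boundary values are illustrative; add partition and drop partition require ASE 15.0.1 or later):

-- create the next day's partition ahead of the rollover
alter table trade_detail add partition
    (Feb08 values <= ('Feb 8 2006 23:59:59'))

-- age out the oldest day once its data has been archived
alter table trade_detail drop partition Jan01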


How do I move data in/out from individual partitions?

Currently, Sybase does not yet provide a partition merge/split feature comparable to Oracle's partition exchange capability - although such a feature is high on the priority list for a future release. Additionally, you cannot use a select statement to retrieve data from a single partition by using the partition name - however, you can use bcp to extract data from the partition, and you can select data out of a partition if you specify a where clause that includes only that partition's key values. The bcp utility has also been enhanced to support directly loading data into a partition - both of these capabilities (bcp in/out for a partition) are expected to be used during data archival operations.

Another way to move data between partitions is to simply change the partition key values. If you update the column used in a partition key to a value that belongs in a different partition, the row is physically relocated to the other partition. This is achieved via a deferred update, in which the existing row is 'deleted' from the current partition and 'inserted' into the new partition. For example, if a table is partitioned by a "state" column, updating the column from NY to CA would likely cause the row to change partitions. What is recorded in the ASE transaction log is a deferred update consisting of the removal of the NY row and the insertion of the new row - the same as any other deferred update operation.

What happens if I change the value in the column used to partition?

If you update the column used in a partition key to a value that belongs in a different partition, the row is physically relocated to the other partition. This is achieved via a deferred update, in which the existing row is 'deleted' from the current partition and 'inserted' into the new partition.

Can I select the data from a single partition?

Not directly. Using the partition key values in a where clause is the only supported method of doing this today. However, you can bcp out from a specific partition - which gives you the same effect.

Which partition schemes support composite partition keys?

Both range and hash partitioning support multiple partition keys.

Query Processing Changes

The ASE 15.0 documentation includes a whole new book just on Query Processing. Consequently, this section is intended only to highlight features that might be useful during migration.

Query Processing Change Highlights

Adaptive Server introduced a new optimizer with release 15.0. The focus of the new optimizer was to improve the performance of DSS-style complex queries, and performance of these queries will likely improve dramatically. However, OLTP DML statements and simple queries may not see any change in performance - unless a new feature is used that allows an index to be used where it previously couldn't (i.e. a function-based index) or similar. You should still test all applications before you use the server in production, watching for the following issues:

• Because of changes in the parser, some queries may return a general syntax error (message 102) instead of "Syntax error at line #" (message 156).
• The maximum number of worktables per query increased from 14 to 46.
• In the past, some sites used trace flag 291 to improve performance of joins between different datatypes. Continuing to use it with ASE 15.0 could result in wrong answers. ASE 15.0 has improved algorithms to handle joins between compatible but different datatypes, and will use the appropriate indexes even if the SARGs are of different datatypes.

Most of these should be transparent from an application perspective. Some that may not be so transparent are mentioned below.

Group By without Order By

One of the changes resulting from the new sorting methods is that a 'group by' clause without an 'order by' clause may return the results in a different order than in previous releases.

Pre-ASE 15

Prior to ASE 15.0, queries that used a 'group by' clause without an 'order by' clause would return the results sorted in ascending order by the 'group by' columns. This was not deliberate, but rather an artifact of the grouping operation. Prior to ASE 15.0, group by's were processed by one of two methods:

• Creating a work table and sorting the work table via a clustered index to determine the vector aggregates
• Accessing in index order if the group by columns were covered by an index

Either way, the net effect was that the result set was returned sorted in grouped order even when an order by clause was not present. This is not in accordance with a strict ANSI interpretation of the SQL standard, as ANSI dictates that result set order can only be influenced by an 'order by' clause.

The problem, of course, is that some developers were unaware that this behavior was unintentional and opted not to use the 'order by' clause when the desired result set ordering matched the 'group by' clause.


ASE 15.0

In ASE 15.0, new in-memory sorting/grouping methods (e.g. hashing) do not physically sort the data during the grouping process, and therefore do not generate an implicitly sorted result set. This problem can be a bit difficult to spot, as optimizations that still choose index access order or a sorted
work table will still return the rows in sorted order - however, this can vary within the same query depending on the number of rows estimated to be in the result. Consequently, the problem may be apparent in one execution and not in others. In order to guarantee a sorted result set, queries will have to be changed to include an order by clause.
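For instance, using the pubs2 authors table (any grouped query follows the same pattern), the fix is simply to state the ordering explicitly:

-- pre-15.0 this happened to come back ordered by state; in 15.0 it may not
select state, count(*)
from authors
group by state

-- 15.0-safe: ask for the ordering explicitly
select state, count(*)
from authors
group by state
order by state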

Sybase is considering an enhancement in a future release that would revert to the old behavior (group by implicitly ordering the result). In ASE 15.0 ESD #2, trace flag 450 makes group by use the classic (non-hashed) sort method, thus making the result set order predictable again - but at the possible cost of slower performance. In a later release, a new 'optimization criteria' language command, 'set group_inserting {0|1}', is being considered, which would let you control this at the session level - and especially via login triggers - without requiring trace flags.

Literal Parameterization and Statement Cache

One of the enhancements in CT-Lib 10.x that DB-Lib does not support is the notion of dynamic SQL, or fully prepared SQL statements. When using fully prepared SQL statements, the statement - with any constant literals removed - is compiled, optimized and stored in ASE memory as a dynamic procedure. Subsequent executions simply execute the procedure by referencing the statement id and supplying the parameters for the various values. This implementation is also supported in ODBC as well as JDBC, although JDBC requires the connection property DYNAMIC_PREPARE to be set to true (the default is false). For high volume OLTP applications, this technique has the fastest execution - resulting in application performance improvements of 2-3x for C code and up to 10x for JDBC applications.

In ASE 12.5.2, Sybase implemented a statement cache designed to bring the same performance advantage of not re-optimizing repetitive language statements issued by the same user. However, one difference was that when the statement cache was introduced, the literal constants in the query were hashed into the MD5 hashkey along with the table names, columns, etc. For example, the following queries would result in two different hashkeys being created:

-- query #1
select *
from authors
where au_lname = 'Ringer'

-- query #2
select *
from authors
where au_lname = 'Greene'

The problem with this approach was that statements executed identically except for a change in the literal values would still incur the expense of optimization. For example, updates issued against different rows in the same table would be optimized over and over. As a result, middle tier systems that did not use the DYNAMIC_PREPARE connection property, and batch jobs that relied on SQL language vs. procedure execution, did not benefit from the statement cache nearly as much as they could have.

In ASE 15.0.1, a new configuration option 'enable literal autoparam' was introduced. When enabled, the constant literals in query texts are replaced with variables prior to hashing. While this may result in a performance gain for some applications, there are a few considerations to keep in mind:

• Just like stored procedure parameters, queries using a range of values may get a bad query plan - for example, a query with a where clause specifying where date_col between <date1> and <date2>.
• Techniques for finding/resolving bad queries may not be able to strictly join on the hashkey, as the hashkey may now represent multiple different query arguments.
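Despite those caveats, enabling the option is a single configuration change (shown as a sketch; the statement cache itself must also be sized via 'statement cache size'):

sp_configure 'enable literal autoparam', 1

-- once enabled, these two statements hash to the same parameterized text
-- and share one cached plan
select * from authors where au_lname = 'Ringer'
select * from authors where au_lname = 'Greene'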

Note the following considerations about the statement cache:

• Since a dynamic procedure is created for each cached statement, each one counts as an 'open object'. You will have to increase the 'number of open objects' configuration accordingly.
• The following must match exactly for a cached plan to be reused: login, user_id(), db_id() and session state (see the next bullet).
• A number of set options affect the MD5 hash, including: forceplan, jtc, parallel_degree, prefetch, quoted_identifier, sort_merge, table count, transaction isolation level, and chained (transaction mode). Using these options will cause a different hashkey, and a new statement will be cached instead of reusing an existing cached query plan.
• Only selects, updates, deletes and insert/selects are affected. The following are not affected:
    • insert ... values() - no where clause, just literals (planned to be added in 15.0.2)
    • Dynamic SQL - the query plan is already cached, so there is no need.
    • Stored procedures - as above, the query plan(s) are already cached, so there is no need.
    • Statements within procedural constructs (if...else, etc.)
• The statement cache is disabled if abstract plan dump/load is active.

The key point is that for applications with a high degree of repetitive statements, literal parameterization and the statement cache can provide a significant performance boost. This is also true for batch operations such as purge routines that delete data. Without literal parameterization, caching an atomic delete operation was futile, and an app server front-ending a lot of users with different parameters also didn't get the benefit - the statement cache was simply thrashed as new statements were added and old ones pushed out.

Even with literal parameterization, the number of distinct login names can affect statement cache sizing and turnover. As noted above, one login will not use the cached query plan of another - and multiple logins will incur a high number of cached queries for the same SQL text. This has its good and bad points - on the good side, the range problem (between date1 and date2) would only affect one login, unlike a procedure optimization, which affects all logins.

Set tablecount Deprecated / Increased Optimization Time

Adaptive Server release 15.0 increases the amount of time spent on query compilation because the query processing engine looks for more ways to optimize the query. However, there is a new timeout mechanism in the search engine that can reduce the optimization time if there is a cheap plan. In addition, more sophisticated cost-based pruning methods are implemented. As a result, the option "set tablecount" is obsolete in 15.0. While the statement still executes, it has no effect on query optimization.

It should be noted that 'set tablecount' was in effect a query optimization limit. At the default value of 4, the optimizer could consider a maximum of 4! (4 factorial), or 24, possible join permutations (less non-joined possibilities). When set higher, the optimizer could consider more possibilities - which would take longer to optimize. For example, a customer using ASE 11.9.3 with a 12-way join and the default tablecount would seemingly never get a plan returned. Set to 12, however, the query optimization would take 10 minutes and the execution would take 30 seconds. Analyzing trace 302 and other outputs revealed that the optimizer found the best query plan well within the first minute of optimization - the subsequent 9-plus minutes were spent exhausting all the other possible permutations (12! ≈ 479 million, less non-intersecting joins).

ASE 15.0, with more effective cost pruning, would likely avoid many of the exhaustive searches that ASE 11.9.3 conducted. Still, the number of permutations would result in lengthy optimization times. As a result, ASE 15.0 has a direct control that restricts query optimization time. This new timeout mechanism is set via the following server configuration parameters:

-- <value> is a number in the range of 0 to 4000 (as of 15.0.1)
sp_configure "optimization timeout limit", <value>

-- stored procedure timeout - default is 40 (as of 15.0 ESD #2)
sp_configure "sproc optimize timeout limit", <value>

as well as session and query level limits. Note that the first parameter affects SQL language statements that need to be optimized (vs. served from the statement cache), and the second one addresses query optimization when a stored procedure plan is compiled. The default values of 5 for queries and 40 for procedures are likely appropriate for most OLTP/mixed workload environments; however, reporting systems with complicated queries may benefit from increasing these values. The value itself is a percentage of time based on the estimated query execution time of the current cheapest plan. The way it works is that the optimizer costs each plan and estimates the execution time for each. As each plan is costed, if a lower cost plan is found, the timeout limit is re-adjusted. When the timeout limit is reached, query optimization stops and the best plan found by that point is used. It should be noted that the query timeout limit is not set until a fully costed plan is found.

The result of this implementation is that queries involving more than 4 tables are likely to take longer to optimize than in previous releases, since most developers did not use the set tablecount function. This lengthier optimization time should be offset by improved execution times as better plans are found. However, in some cases the optimizer was already picking the more optimal plan, and as a result the query may take marginally longer overall. Developers who used 'set tablecount' to increase the number of permutations considered may need to increase the timeout limit at the server level to arrive at the same optimal plan. In addition to the server level setting, a developer can increase the amount of time via the session level command:

set plan opttimeoutlimit <br />

Or at the individual query level by using the abstract plan notation of:<br />

select * from <table_name> plan "(use opttimeoutlimit <value>)"

Considering that the most common occurrence of ‘set tablecount’ was in stored procedure code, the logical<br />

migration strategy would be to simply replace the ‘set tablecount’ with ‘set plan opttimeoutlimit’. Some<br />

amount of benchmarking may be needed to determine the optimal timeout value - or you could simply<br />

select an arbitrary number such as 10 if you have mainly DSS queries and wish to give the optimizer more<br />

time to find a better query plan.<br />
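As a minimal sketch of that replacement (the procedure, table, and column names here are hypothetical, and the timeout value of 10 is only illustrative), the change inside a stored procedure might look like:

-- 12.5-era version: widen the join permutation search
--     create procedure rpt_order_summary @cust_id int as
--         set tablecount 8
--         select ...

-- ASE 15.0 version: give the optimizer more time instead of more permutations
create procedure rpt_order_summary
    @cust_id int
as begin
    set plan opttimeoutlimit 10
    select o.order_id, total_qty = sum(l.qty)
      from orders o, order_lines l
     where o.cust_id = @cust_id
       and l.order_id = o.order_id
     group by o.order_id
    return 0
end
go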

If you suspect that a query optimization was timed out and a better strategy might have been available, you<br />

can confirm this by using the showplan option 'set option show on' (discussed later). If a time out occurs,<br />

you will see the following in the diagnostics output:<br />

!! Optimizer has timed out in this opt block !!<br />

You can raise the value of the timeout parameter globally for the server, for the session, or for an individual query. Raising the timeout at the server or session level can hurt other queries due to increased compilation time and may consume more resources such as procedure cache, so use caution when doing so.

Enable Sort-Merge Join and JTC deprecated<br />

The configuration option "enable sort-merge join and JTC" has been deprecated. As a result, you may see merge joins being used unexpectedly. Before attempting to force the old behavior, consider that the new in-memory sort operations have greatly improved sort-merge join performance. The query optimizer will not choose a merge join where it is deemed inappropriate. If you do want to turn sort-merge join off, you have to do it at the session level using the command "set merge_join 0" or use an optimization goal that disables merge join, such as "allrows_oltp".
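For example, either of the following session-level settings accomplishes that (a sketch - use one or the other):

-- turn merge join off for this session only
set merge_join 0
go

-- or pick an optimization goal that excludes merge join
set plan optgoal allrows_oltp
go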

One benefit of this change is that Join Transitive Closure (JTC) may now be applied, helping ASE arrive at more efficient join orders than it previously could, since the old setting was typically disabled at most customer sites (due to the merge join cost). It may also mean that many queries of three or more tables in which JTC is a possibility use the N-ary Nested Loop Join strategy instead of a strict Nested Loop Join - which may help query response times in some situations.


Set Forceplan Changes<br />

In some legacy applications, developers used 'set forceplan on' instead of the more flexible and more controllable PLAN clause with an AQP. During migration to the ASE 15.0 optimizer, ASE attempts to convert 'set forceplan' usage in procedure/trigger statements to an AQP implementation - however, this is not always possible, which can result in errors or other issues when a 'set forceplan' clause is encountered - particularly during procedure compilation, where it could cause the procedure to fail to compile.

In many cases, forceplan was used to overcome the following situations:<br />

• Many (>4) tables were involved in the query and the optimization time when set tablecount<br />

was used became excessive - the query plan was pre-determined typically by using set<br />

tablecount one time, the query rewritten with the FROM clause ordered accordingly and then<br />

forceplan used. Typically this is likely the case when the query has more than 6 tables in the<br />

FROM clause. This problem has been mitigated in <strong>ASE</strong> 15 through the optimization timeout<br />

mechanism.<br />

• Join index selectivity resulted in an incorrect join order. Frequently, this was a problem when either a formula (such as convert()) was involved in the join, or a variable whose value was determined within the procedure rather than passed as a parameter - hence the value is unknown at optimization time. While index selectivity can still be an issue in ASE 15, the problem may be mitigated entirely by the different join strategies (such as merge and hash joins), the optimizer improvements inherent in ASE 15, creating a function-based index (see the sketch after this list), or by using update index statistics with a higher step count.
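As a purely hypothetical sketch of the function-based index option mentioned above (function-based indexes are new in ASE 15.0; the table, column, and expression names are illustrative, and the indexed expression must match the one used in the join):

-- index the same expression that appears in the join predicate
create index order_hist_key_fbi
    on order_history ( convert(int, order_key_str) )
go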

Developers should search their scripts (or syscomments system table) looking for occurrences of<br />

‘forceplan’ and test the logic using the new optimizer. If a plan force is still required, developers have two<br />

choices:<br />

• Use the PLAN clause as part of the SELECT statement to force the desired logic<br />

• Store the plan as an AQP and enable AQP load to not only facilitate that query but all similar<br />

occurrences of that query.<br />

The second option is especially useful if the query might be issued dynamically from an application vs.<br />

being embedded within a stored procedure/trigger. To get a starting plan for the query, use the set option<br />

show_abstract_plan command when executing the query to retrieve an AQP to start from and then modify it.

For example:<br />

dbcc traceon(3604)<br />

go<br />

set option show_abstract_plan on<br />

go<br />

select * from range_test where row_id=250<br />

go<br />

set option show_abstract_plan off<br />

go<br />

dbcc traceoff(3604)<br />

go<br />

The same query with a plan clause would look like:<br />

select * from range_test where row_id=250<br />

plan '( i_scan range_test_14010529962 range_test ) ( prop range_test ( parallel 1<br />

) ( prefetch 4 ) ( lru ) ) '<br />

go<br />

Note that if the table order is all that matters, a partial plan that only lists the table order is all that is necessary.
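A partial plan of that form might be sketched as follows; the table names are illustrative and the generic join operator of the abstract plan language is assumed (verify the operator names against the Query Processing and Abstract Plans guide for your release). Only the join order is dictated - the optimizer still chooses access methods and join types:

select t1.col1, t2.col2
  from table_a t1, table_b t2
 where t1.id = t2.id
 plan "( join ( scan t1 ) ( scan t2 ) )"
go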


Determining Queries Impacted During <strong>Migration</strong><br />

There are many ways to identify queries impacted by changes in the query processing engine during<br />

migration. Each of the sections below describes different common scenarios and how to resolve them. In<br />

each of the cases below, it is assumed that the MDA tables are installed and query monitoring enabled in<br />

both the <strong>ASE</strong> 12.5 and <strong>15.0</strong> systems.<br />

Things To Do Before You Start Testing<br />

Before you begin to diagnose problems, you will need to ensure the following:<br />

• Make sure that the monitoring (MDA) tables are installed and that you have access to them.

• Have permission to run "sp_configure" command, if needed.<br />

• Have permission to turn on various "set options" in the query processor to get the diagnostics<br />

output<br />

• Be able to turn on trace flag 3604/3605<br />

• Some of the outputs can be quite huge - so plan for file space<br />

• Practice capturing query plans using AQP and practice MDA queries in order to better<br />

understand how the tables work as well as how large to configure the pipes.<br />

• Create a test database to be used as scratch space. You may be bcp’ing output from one or<br />

both servers into this database to perform SQL analysis. Likely it will be best to create this on<br />

the <strong>15.0</strong> server to facilitate data copying.<br />

• Make sure you have plenty of free space (2GB+) for the system segment<br />

• Disable literal parameterization - possibly even statement caching as a whole for the testing<br />

sessions<br />

With regard to the last point, if literal parameterization is enabled for the statement cache, you may have to disable it. With literal parameterization, queries that differ only in their search parameters return the same hashkey. However, one might return 1 row and the other 1,000 rows; depending on the clustered index, the logical I/Os could differ between the two queries just because of the row count difference. Obviously, queries with larger differences in row counts will show even greater differences. As a result, when attempting to find query differences between versions, make sure that you disable literal parameterization. Not only will this allow joins on the hashkey to be accurate, it also allows a more accurate comparison between 12.5 and 15.0, as 12.5 did not have the advantage of literal parameterization. As mentioned, though, you may want to disable statement caching for your session entirely - either via 'set statement_cache off' or by zeroing the statement cache size for the entire server. As noted in the discussion about statement caching, using abstract plan capture disables the statement cache anyhow.
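A minimal sketch of those settings (this assumes the 'enable literal autoparam' configuration parameter available in the later 15.0.x releases; the session-level command affects only the current connection):

-- server-wide: turn off literal parameterization
sp_configure "enable literal autoparam", 0
go

-- session-only: bypass the statement cache for the testing session
set statement_cache off
go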

Abstract Query Plan Capture<br />

Starting in <strong>ASE</strong> 12.0, <strong>Sybase</strong> has provided a facility called Abstract Query Plans (AQP – or sometimes<br />

Query Plans on Disk – QPOD) to capture, load and reuse query plans from executing queries.<br />

Documentation on AQP’s is found in the <strong>ASE</strong> Performance & Tuning: Optimizer and Abstract Plans,<br />

beginning with Chapter 14 in the <strong>ASE</strong> 12.5 documentation. By default, captured query plans are stored in<br />

the sysqueryplans table using the ap_stdout group. The important consideration here is that both ASE 12.5.2+ and ASE 15.0 use a hashkey of the query; consequently, the query plans for identical queries can be matched. The high-level steps for this are as follows:

1. Enable query plan capture on the 12.5 server. This can be done at the session level with ‘set<br />

plan dump on’ or at the server level with ‘sp_configure “abstract plan dump”, 1’<br />


2. Execute one module of the application to be tested.<br />

3. Turn off AQP dump and bcp out sysqueryplans<br />

4. Enable query plan capture on the <strong>15.0</strong> server.<br />

5. Execute the same module as in #2<br />

6. Turn off AQP dump in <strong>ASE</strong> 15<br />

7. bcp in the 12.5 data into the scratch database (create a table called queryplans_125)<br />

8. copy the <strong>15.0</strong> data into the scratch database (use select/into to create a table called<br />

queryplans_150)<br />

9. Create an index on hashkey, type and sequence for both tables<br />

10. Run queries to identify plan differences<br />

Sample queries include ones such as the following:<br />

-- get a list of the queries that have changed - first<br />

-- by finding all the queries with changes in the query plan text<br />

select t.hashkey<br />

into #qpchgs<br />

from queryplans_125 t, queryplans_150 f<br />

where t.hashkey = f.hashkey<br />

and t.sequence = f.sequence<br />

and t.type = 100 -- aqp text vs. sql

and f.type = 100 -- aqp text vs. sql

and t.text != f.text<br />

-- supplemented by those with more text as detected by having<br />

-- more chunks than the 12.5 version.<br />

union all<br />

select f.hashkey<br />

from queryplans_150 f<br />

where f.sequence not in (select t.sequence<br />

from queryplans_125 t<br />

where f.hashkey = t.hashkey)<br />

-- and then supplemented by the opposite - find queryplans that<br />

-- are shorter in <strong>ASE</strong> <strong>15.0</strong> than in 12.5<br />

union all<br />

select t.hashkey<br />

from queryplans_125 t<br />

where t.sequence not in (select f.sequence<br />

from queryplans_150 f<br />

where f.hashkey = t.hashkey)<br />

go<br />

-- eliminate duplicates<br />

select distinct hashkey<br />

into #qpchanges<br />

from #qpchgs<br />

go<br />

drop table #qpchgs<br />

go<br />

-- get the sql text for the queries identified<br />

select t.hashkey, t.sequence, t.text<br />

from queryplans_125 t, #qpchanges q<br />

where q.hashkey=t.hashkey<br />

and t.type = 10 -- sql text vs. aqp<br />


-- optionally get the aqp text for comparison<br />

-- first find the ones in which the <strong>15.0</strong> QP may be longer<br />

-- note that we need to have done the earlier query as<br />

-- we use its results in a subquery (highlighted)<br />

select t.hashkey, t.sequence, t.text, f.sequence, f.text<br />

from queryplans_125 t, queryplans_150 f<br />

where t.hashkey = f.hashkey<br />

and t.sequence=*f.sequence<br />

and t.hashkey in (select hashkey from #qpchanges)<br />

and t.type = 100 -- aqp text vs. sql

and f.type = 100 -- aqp text vs. sql

union<br />

-- now find the ones in which the 12.5 QP is longer – this<br />

-- may cause duplicates with the above where exactly equal<br />

select f.hashkey, t.sequence, t.text, f.sequence, f.text<br />

from queryplans_125 t, queryplans_150 f<br />

where f.hashkey = t.hashkey<br />

and f.sequence*=t.sequence<br />

and f.hashkey in (select hashkey from #qpchanges)<br />

and t.type = 100 -- aqp text vs. sql

and f.type = 100 -- aqp text vs. sql

order by t.hashkey, t.sequence, f.sequence<br />

Several items of note about the above:<br />

• Often in applications, the same exact query may be executed more than once. If so, and the<br />

query plans differ between 12.5 vs. <strong>15.0</strong>, the query will result in multiple instances in the<br />

above table.<br />

• The above could be made fancier – and more reusable, by enclosing inside stored procedures<br />

that cursored through the differences building the SQL text and AQP text into sql variables<br />

(i.e. declared as varchar(16000) – even on 2K page servers this works) and then outputting to<br />

screen or inserting into a results table (possibly with the SQL/AQP text declared as text<br />

datatype).<br />

• The above queries are for demonstration purposes only – more refinement is possible to<br />

eliminate duplicates, etc. The goal was to show what is possible – this task is automated via<br />

DBExpert.<br />

• The downside to this method is that execution metrics are not captured – so you can’t<br />

necessarily tell by looking at the output whether the query plan changed the performance<br />

characteristics.<br />

• On the face of it, a large number of query plans will likely have changed due to the implementation of Merge Joins, N-ary Nested Loop Joins, Hash Sorts, etc.

Using MDA Tables<br />

The purpose of this section is not to serve as an introduction to the MDA tables - but as few customers have<br />

taken advantage of this monitoring capability in the 4 years since its introduction, some background is<br />

needed to explain how to use it to facilitate migration. This technique is a little harder than using sysqueryplans, as a query is not uniquely identified within the MDA tables via the hashkey. However, it is a lot more accurate in that it reports query performance metrics, has less impact on performance than other methods, and is not all that difficult once understood. The general concept relies on monitoring index usage, statement execution and stored procedure profiles during test runs and post-migration to isolate query differences.

One fairly important aspect is that in <strong>ASE</strong>, the default value for ‘enable xact coordination’ is 1 - and it is a<br />

static switch requiring a server reboot. The reason why this is important is that in <strong>ASE</strong> 12.5.x, the MDA<br />

tables use the ‘loopback’ interface which mimics a remote connection, which will use CIS. Because of the<br />

configuration option ‘enable xact coordination’, CIS will invoke a transactional RPC to the loopback<br />

interface. This could cause a lot of activity in the sybsystemdb database - possibly causing the transaction<br />


log to fill. As a result, the following best practices are suggested (any of the below fully resolves the<br />

problem individually):<br />

• Enable ‘truncate log on checkpoint’ option for sybsystemdb<br />

• Set ‘enable xact coordination’ to 0<br />

• Use the session setting ‘set transactional_rpc off’ in any procedure or script that selects from<br />

the MDA tables<br />

The first two may cause problems if distributed transactions (XA, JTA or ODBC 2PC) are used by the<br />

application (the configuration option ‘enable DTM’ would have to be 1) - consequently if using distributed<br />

transactions, the safest choice is the third. If distributed transactions are being used, the first option does<br />

expose a risk of losing transactions if the server should crash as heuristic transaction completion likely<br />

would no longer be possible on transactions whose log pages were truncated by a checkpoint, but the<br />

datapages not yet flushed to disk.<br />
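As a sketch, the three alternatives map to commands such as the following (pick whichever fits your environment; note that 'enable xact coordination' is a static option and requires a restart):

-- 1) keep the sybsystemdb log from filling
sp_dboption sybsystemdb, "trunc log on chkpt", true
go

-- 2) turn off transaction coordination (static - restart required)
sp_configure "enable xact coordination", 0
go

-- 3) per session, or inside any procedure/script that reads the MDA tables
set transactional_rpc off
go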


Monitoring Index Usage<br />

Query plan differences that impact query performance typically manifest themselves through increased I/O - usually by a significant amount. To that end, the first MDA monitoring technique uses

monOpenObjectActivity to look for table scans as well as index efficiency/usage. The<br />

monOpenObjectActivity table looks like the following:<br />

monOpenObjectActivity

    DBID                 int
    ObjectID             int
    IndexID              int
    DBName               varchar(30)
    ObjectName           varchar(30)
    LogicalReads         int
    PhysicalReads        int
    APFReads             int
    PagesRead            int
    PhysicalWrites       int
    PagesWritten         int
    RowsInserted         int
    RowsDeleted          int
    RowsUpdated          int
    Operations           int
    LockRequests         int
    LockWaits            int
    OptSelectCount       int
    LastOptSelectDate    datetime
    UsedCount            int
    LastUsedDate         datetime

Figure 4 - MDA table monOpenObjectActivity in <strong>ASE</strong> <strong>15.0</strong>.1<br />

One of the differences between <strong>ASE</strong> <strong>15.0</strong> and 12.5.3 is noticeable in the above. <strong>ASE</strong> 12.5.3 didn’t include<br />

the DBName and ObjectName fields. Regardless, this table has a wealth of information that can be useful<br />

during a migration. First, note that it is at the index level - consequently it is extremely useful to detect<br />

changes in query behavior with minimal impact on the system. To do this, however, you will need samples from a 12.5.x baseline system and the ASE 15.0 migration system under roughly the same query load.

Of the key fields in the table, the “UsedCount” column is perhaps the most important for index usage. This<br />

counter keeps track of each time an index is used as a result of the final query optimization and execution.<br />

As demonstrated earlier, this can be useful in finding table scans using the query:<br />

select *<br />

from master..monOpenObjectActivity<br />

where DBID not in (1, 2, 3) -- add additional tempdb dbid’s if using multiple tempdbs<br />

and UsedCount > 0<br />

and IndexID = 0<br />

order by LogicalReads desc, UsedCount desc<br />


Note that not all table scans can be avoided - the key is to look for significant increases in table scans. If you have a baseline from 12.5 and have loaded both the 12.5 and 15.0 statistics into a database for analysis, a useful query could be similar to:

-- in the query below, the two databases likely could have different<br />

-- database ID's. Unfortunately 12.5.x doesn't include DBName - so<br />

-- unless this was added by the user when collecting the data, we<br />

-- will assume it is not available - which means we can't join<br />

-- DBID nor DBName - so we need to hardcode the DBID values...<br />

-- these values replace the variables at the first line of the<br />

-- where clause below.<br />

select f.DBName, f.ObjectName, f.IndexID,<br />

LogicalReads_125=t.LogicalReads, LogicalReads_150=f.LogicalReads,<br />

PhysicalReads_125=t.PhysicalReads, PhysicalReads_150=f.PhysicalReads,<br />

Operations_125=t.Operations, Operations_150=f.Operations,<br />

OptSelectCount_125=t.OptSelectCount, OptSelectCount_150=f.OptSelectCount,<br />

UsedCount_125=t.UsedCount, UsedCount_150=f.UsedCount,

UsedDiff=t.UsedCount - f.UsedCount

from monOpenObjectActivity_125 t, monOpenObjectActivity_150 f<br />

where t.DBID = @DBID_125 and f.DBID = @DBID_150<br />

and t.ObjectID = f.ObjectID<br />

and t.IndexID = f.IndexID<br />

order by 14 desc -- order by UsedDiff descending<br />

One consideration is that even if the processing load is nearly the same, it is likely that there will be some difference. That is where the Operations field comes into play. Technically, it is the number of operations, such as DML statements or queries, executed against a table - however, it tends to run a bit high (by 3-5x) as it includes internal operations such as cursor openings, etc. For this query, though, it can be used to normalize the workloads by computing a ratio of operations between the two samples. As mentioned earlier, differences in query processing could result in some differences:

• Merge Joins may increase the number of table scans - particularly in tempdb. It may also<br />

result in less LogicalReads for some operations.<br />

• Hash (in-memory) sorting may reduce the number of index operations by using an index to<br />

find a starting point, vs. traversing the index iteratively when using an index to avoid sorting.<br />

• If the table (IndexID=0 or 1) or a specific index shows a considerable increase in LogicalReads while the indexes show nearly the same OptSelectCount but the UsedCount has dropped, the issue might be that statistics are missing or insufficient and the optimizer is picking a table scan or an inefficient index.

If updating statistics using update index statistics with a higher number of steps (see the example below) doesn't solve the problem, then you are likely looking at a query optimization issue.
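A sketch of refreshing statistics with a larger step count on a hypothetical table (200 steps is only an illustrative value - tune it to the column's data skew):

update index statistics big_table using 200 values
go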


Monitoring Statement Execution<br />

Monitoring statement execution focuses on the MDA monSysStatement table and related pipe tables.<br />

Consider the following diagram:<br />


Figure 5 - Diagram of MDA Tables Relating SQL Query Execution Statistics, Plan & Text<br />

Note that we are focusing on monSysStatement, monSysSQLText and monSysPlanText vs. the<br />

monProcess* equivalent tables. The rationale is that monSys* tables are stateful and keep track of<br />

historically executed statements whereas the monProcess* tables only record the currently executing<br />

statement.<br />

A second aspect to keep in mind is that the context of one connection’s pipe state is retained for that<br />

connection – but is independent of other connections (allowing multi-user access to MDA historical data).<br />

In other words, if one DBA connects and queries the monSysStatement table, they may see 100 rows.<br />

Assuming more queries are executed and the DBA remains connected, the next time the DBA queries the<br />

monSysStatement table, only the new rows added since they last queried will be returned. A different DBA<br />

who connects at this point would see all the rows. This is important for the following reason: when<br />

sampling the tables repeatedly using a polling process, it is tempting to disconnect between samples.<br />

However, if this happens, upon reconnecting, the session appears as if a “new” session and the state is lost<br />

– consequently, the result set may contain rows already provided in the previous session.<br />

A third consideration is the correct sizing of the pipes. If the pipes are too small, statements or queries may be dropped from the ring buffer. To accommodate this, you either need to increase the size of the pipes (the 'max messages' settings) or sample more frequently. For example, for a particular application, one module may submit 100,000 statements. Obviously, setting the pipes to 100,000 may be impossible due to memory constraints. However, if it is known that those 100,000 statements are issued over the period of an hour, that is an average of roughly 1,667 queries per minute. Guessing at a peak of double that would mean about 3,333 queries per minute, so it might be sensible to set the pipes to 5,000 and sample every minute to avoid losing statements.
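A sketch of the related configuration (the values are illustrative and should be sized from your own statement rates):

sp_configure "enable monitoring", 1
sp_configure "statement statistics active", 1
sp_configure "statement pipe active", 1
sp_configure "statement pipe max messages", 5000
sp_configure "sql text pipe active", 1
sp_configure "sql text pipe max messages", 5000
sp_configure "plan text pipe active", 1
sp_configure "plan text pipe max messages", 5000
go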

Perhaps the two most important points to remember about monSysStatement are:<br />


1. Statements selected from monSysStatement are returned in execution order if no ‘ORDER<br />

BY’ clause is used. One trick is to add an identity() if selecting into a temp table - or insert<br />

into a table containing an identity column to keep the execution order intact.<br />

2. The line numbers for procedures and triggers actually refer to the physical lines in the creation script - starting from the end of the previous batch and including blank lines, comments, etc. If a command spans more than one line, only the first line will show up during monitoring. A line number of 0 implies that ASE is searching for the next executable line (i.e. jumping out of a large if statement, or skipping a large number of declares at the top that are compiler instructions rather than execution instructions).

Overall, the high level steps are as follows:<br />

1. Configure the statement pipe, sql text pipe, and statement plan text pipe as necessary for the<br />

<strong>ASE</strong> 12.5 server.<br />

2. Create a temporary repository in tempdb by doing a query similar to the following:<br />

create table tempdb..mdaSysStatement (<br />

row_id numeric(10,0) identity not null,<br />

SPID smallint not null,<br />

KPID int not null,<br />

DBID int null,<br />

ProcedureID int null,<br />

ProcName varchar(30) null,<br />

PlanID int null,<br />

BatchID int not null,<br />

ContextID int not null,<br />

LineNumber int not null,<br />

CpuTime int null,<br />

WaitTime int null,<br />

MemUsageKB int null,<br />

PhysicalReads int null,<br />

LogicalReads int null,<br />

PagesModified int null,<br />

PacketsSent int null,<br />

PacketsReceived int null,<br />

NetworkPacketSize int null,<br />

PlansAltered int null,<br />

RowsAffected int null,<br />

ErrorStatus int null,<br />

StartTime datetime null,<br />

EndTime datetime null<br />

)<br />

3. Repeat for monSysSQLtext and monSysPlanText as desired.<br />

4. Begin a monitoring process that periodically (e.g. once per minute) inserts into the tables created above from the respective MDA tables. For example, the following query could be placed in a loop with a 'waitfor delay' of the desired interval or similar logic (a minimal polling loop is sketched after this list).

insert into mdaSysStatement (SPID, KPID, DBID, ProcedureID,

ProcName, PlanID, BatchID, ContextID, LineNumber,

CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,<br />

PagesModified, PacketsSent, PacketsReceived,<br />

NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,<br />

StartTime, EndTime)<br />

select SPID, KPID, DBID, ProcedureID,<br />

ProcName=object_name(ProcedureID, DBID),<br />

PlanID, BatchID, ContextID, LineNumber,<br />

CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,<br />

PagesModified, PacketsSent, PacketsReceived,<br />

NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,<br />

StartTime, EndTime<br />

from master..monSysStatement<br />

5. Execute one module of the application to be tested.<br />

6. Stop the application and halt the monitoring.<br />


7. bcp out the MDA collected data from tempdb.<br />

8. Repeat steps 1-5 for <strong>ASE</strong> <strong>15.0</strong><br />

9. Create a second set of tables in the scratch database – one each for <strong>ASE</strong> <strong>15.0</strong> and 12.5. For<br />

example: mdaSysStmt_125 and mdaSysStmt_150.<br />

10. Load the tables from the collected information – either by bcp-ing back in or via insert/select<br />
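A minimal polling loop for step 4 might look like the following sketch (it runs until the session is killed; adjust the waitfor interval to your sampling rate):

while 1 = 1
begin
    insert into tempdb..mdaSysStatement (SPID, KPID, DBID, ProcedureID,
        ProcName, PlanID, BatchID, ContextID, LineNumber,
        CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,
        PagesModified, PacketsSent, PacketsReceived,
        NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,
        StartTime, EndTime)
    select SPID, KPID, DBID, ProcedureID,
        ProcName = object_name(ProcedureID, DBID),
        PlanID, BatchID, ContextID, LineNumber,
        CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,
        PagesModified, PacketsSent, PacketsReceived,
        NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,
        StartTime, EndTime
      from master..monSysStatement

    waitfor delay "00:01:00"   -- e.g. sample once per minute
end
go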

Since the monitoring captures all statements from all users, the next step is to isolate out each of the<br />

specific user’s queries and re-normalize using a new identity column. For example, the following query<br />

could be used to build the new table to be used to compare query execution:<br />

select exec_row=identity(10), SPID, KPID, DBID, ProcedureID,<br />

ProcName, PlanID, BatchID, ContextID, LineNumber,<br />

CpuTime, WaitTime, MemUsageKB, PhysicalReads, LogicalReads,<br />

PagesModified, PacketsSent, PacketsReceived,<br />

NetworkPacketSize, PlansAltered, RowsAffected, ErrorStatus,<br />

StartTime, EndTime<br />

into monSysStmt_150<br />

from mdaSysStmt_150<br />

where SPID = 123<br />

and KPID = 123456789<br />

order by row_id<br />

go<br />

create unique index exec_row_idx<br />

on monSysStmt_150 (exec_row)<br />

go<br />

Consequently, if the same exact sequence of test statements is issued against both servers, the exec_row<br />

columns should match. Consider the following query:<br />

-- Get a list of SQL statements that executed slower in <strong>15.0</strong> compared to 12.5<br />

select f.exec_row, f.BatchID, f.ContextID, f.ProcName, f.LineNumber,<br />

CPU_15=f.CpuTime, CPU_125=t.CpuTime,

Wait_15=f.WaitTime, Wait_125=t.WaitTime,<br />

Mem_15=f.MemUsageKB, Mem_125=t.MemUsageKB,<br />

PhysIO_15=f.PhysicalReads, PhysIO_125=t.PhysicalReads,<br />

LogicalIO_15=f.LogicalReads, LogicalIO_125=t.LogicalReads,<br />

Writes_15=f.PagesModified, Writes_125=t.PagesModified,<br />

ExecTime_15=datediff(ms,f.StartTime,f.EndTime)/1000.00,<br />

ExecTime_125=datediff(ms,t.StartTime,t.EndTime)/1000.00,<br />

DiffInMS= datediff(ms,f.StartTime,f.EndTime)-<br />

datediff(ms,t.StartTime,t.EndTime)<br />

into #slow_qrys<br />

from monSysStmt_150 f, monSysStmt_125 t

where f.exec_row = t.exec_row<br />

and (datediff(ms,f.StartTime,f.EndTime) > datediff(ms,t.StartTime,t.EndTime))<br />

order by 20 desc, f.BatchID, f.ContextID, f.LineNumber<br />

Of course, it is always nice to tell the boss how many queries were faster - which is quite easy to accomplish with the above - simply swap the last condition to a 'less than' to get:

-- Get a list of SQL statements that executed faster in <strong>15.0</strong> compared to 12.5<br />

select f.exec_row, f.BatchID, f.ContextID, f.ProcName, f.LineNumber,<br />

CPU_15=f.CpuTime, CPU_125=t.CpuTime,

Wait_15=f.WaitTime, Wait_125=t.WaitTime,<br />

Mem_15=f.MemUsageKB, Mem_125=t.MemUsageKB,<br />

PhysIO_15=f.PhysicalReads, PhysIO_125=t.PhysicalReads,<br />

LogicalIO_15=f.LogicalReads, LogicalIO_125=t.LogicalReads,<br />

Writes_15=f.PagesModified, Writes_125=t.PagesModified,<br />

ExecTime_15=datediff(ms,f.StartTime,f.EndTime)/1000.00,<br />

ExecTime_125=datediff(ms,t.StartTime,t.EndTime)/1000.00,<br />

DiffInMS= datediff(ms,f.StartTime,f.EndTime)-<br />

datediff(ms,t.StartTime,t.EndTime)<br />

into #fast_qrys<br />

from monSysStmt_150 f, monSysStmt_125 t

where f.exec_row = t.exec_row<br />

and (datediff(ms,f.StartTime,f.EndTime) < datediff(ms,t.StartTime,t.EndTime))<br />

order by 20 desc, f.BatchID, f.ContextID, f.LineNumber<br />


The only gotcha with this technique is that it is only accurate to within 100ms - the reason is that it is based on the CPU tick length within ASE, which defaults to 100ms. Statements that execute in less than 100ms will show up as 0. This could be problematic where an insert that used to take 10ms now takes 20ms - especially if that insert is executed 1,000,000 times during the day.


Stored Procedure Profiling<br />

Tracking stored procedure profiles from within the MDA tables can be a bit tricky. Typically, DBAs like to track stored procedures by reporting those most often executed, those that take the longest to execute, or those that have deviated from an expected execution norm. Traditionally, this has been done using Monitor Server in conjunction with Historical Server and custom views. Unfortunately, up through ASE 15.0.1 there isn't a direct equivalent within the MDA tables - although a planned enhancement for 15.0.2 is to include two tables to track this (one aggregated by procedure and the other for each execution of a procedure). The problem, of course, is that this will only be in 15.0.2; consequently, comparing with 12.5.x will not be possible.

The monSysStatement table reports statement-level statistics on a line-by-line basis - which in a sense is handy, because when a procedure execution is slower, the exact line of the procedure where the problem occurred is easily spotted. Aggregating it to a procedure level for profiling is a bit trickier. Consider the following stored procedure code (and its comments), which does a top-10 analysis:

-- Because of the call to object_name(), this procedure must be executed in the same server as the<br />

-- stored procedures being profiled. If the repository database schema included the ProcName, this<br />

-- restriction could be lifted.<br />

create procedure sp_mda_getProcExecs
    @startDate datetime = null,
    @endDate   datetime = null
as begin

-- okay the first step is to realize that this only analyzes proc execs...<br />

-- something else should have been collecting the monSysStatement activity.<br />

-- In addition, the logic of the collector should have done something like a<br />

-- row numbering scheme - for example:<br />

--<br />

-- select row_id=identity(10), * into #monSysStatement from master..monSysStatement<br />

-- select @cur_max=max(row_id) from repository..mdaSysStatement<br />

-- insert into repository..mdaSysStatement<br />

-- (select row_id+@cur_max, * from #monSysStatement)<br />

--<br />

-- But now that we can assume everything was *safely* collected into this repository<br />

-- database and into a table called mdaSysStatement, we can run the analysis<br />

-- The result sets are:<br />

--<br />

-- 1 - The top 10 procs by execution count<br />

-- 2 - The top 10 procs by elapsed time<br />

-- 3 - The top 10 procs by Logical IOs<br />

-- 4 - The top 10 proc lines by execution count<br />

-- 5 - The top 10 proc lines by elapsed time<br />

-- 6 - The top 10 proc lines by Logical IOs<br />

-- The reason for the second set is that we are using these to key off of where a<br />

-- proc may be going wrong - i.e. if a showplan changes due to a dropped index,<br />

-- we will get the exact line #'s affected vs. just the proc.<br />

-- The first step is to put the result set in execution order by SPID/KPID vs.<br />

-- just execution order...we need to do this so our checks for the next line<br />

-- logic to see when a proc begins/exits works....<br />

select exec_row=identity(10), SPID, KPID, DBID, ProcedureID, PlanID,<br />

BatchID, ContextID, LineNumber, CpuTime, WaitTime,<br />

MemUsageKB, PhysicalReads, LogicalReads, PagesModified,<br />

PacketsSent, PacketsReceived, NetworkPacketSize,<br />

PlansAltered, RowsAffected, ErrorStatus, StartTime, EndTime<br />

into #t_exec_by_spid<br />

from mdaSysStatement<br />

where (((@startDate is null) and (@endDate is null))<br />

or ((StartTime >= @startDate) and (@endDate is null))<br />

or ((@startDate is null) and (EndTime = @startDate) and (EndTime


<strong>Planning</strong> <strong>Your</strong> <strong>ASE</strong> 15 <strong>Migration</strong>- v1.0<br />

-- then we kinda do it again to get rid of the identity - the reason why is that<br />

-- sometimes a union all will fail if the join involves row=row+1 and row is numeric...<br />

select exec_row=convert(int,exec_row), SPID, KPID, DBID, ProcedureID, PlanID,<br />

BatchID, ContextID, LineNumber, CpuTime, WaitTime,<br />

MemUsageKB, PhysicalReads, LogicalReads, PagesModified,<br />

PacketsSent, PacketsReceived, NetworkPacketSize,<br />

PlansAltered, RowsAffected, ErrorStatus, StartTime, EndTime<br />

into #exec_by_spid<br />

from #t_exec_by_spid<br />

order by exec_row<br />

drop table #t_exec_by_spid<br />

create unique index exec_row_idx on #exec_by_spid (SPID, KPID, BatchID, exec_row)<br />

-- then we need to find the proc exec statements - the way we will find this is by<br />

-- finding either where:<br />

--<br />

-- 1) The first line of a batch is a valid procedure<br />

-- 2) When the next line of a batch has a higher context id and the procedureID<br />

-- changes to a valid procedure<br />

-- Do part #1 - find all the procs that begin a batch

select SPID, KPID, BatchID, exec_row=min(exec_row)<br />

into #proc_begins<br />

from #exec_by_spid<br />

group by SPID, KPID, BatchID<br />

having exec_row=min(exec_row)<br />

and ProcedureID!=0<br />

and object_name(ProcedureID,DBID) is not null<br />

-- Union those with procs that occur after the first line...<br />

select e.SPID, e.KPID, e.ProcedureID, e.DBID,<br />

e.BatchID, e.ContextID, e.LineNumber, e.exec_row, e.StartTime<br />

into #proc_execs<br />

from #exec_by_spid e, #proc_begins b<br />

where e.SPID=b.SPID<br />

and e.KPID=b.KPID<br />

and e.BatchID=b.BatchID<br />

and e.exec_row=b.exec_row<br />

union all<br />

-- e1 is the proc entry - e2 is the previous context<br />

select e1.SPID, e1.KPID, e1.ProcedureID, e1.DBID,<br />

e1.BatchID, e1.ContextID, e1.LineNumber, e1.exec_row, e1.StartTime<br />

from #exec_by_spid e1, #exec_by_spid e2<br />

where e1.SPID=e2.SPID<br />

and e1.KPID=e2.KPID<br />

and e1.BatchID=e2.BatchID<br />

and e1.ContextID = e2.ContextID + 1 -- Context should go up by 1<br />

and e1.exec_row = e2.exec_row + 1<br />

-- on the next line<br />

and e1.ProcedureID != e2.ProcedureID -- and the proc differs<br />

and e1.ProcedureID !=0 -- and the proc is not 0<br />

and object_name(e1.ProcedureID,e1.DBID) is not null -- and proc is valid<br />

-- we are finished with this one....<br />

drop table #proc_begins<br />

-- Okay, now we have to find where the proc exits....This will be one of:<br />

--<br />

-- 1 - The SPID, KPID, BatchID, and ContextID are the same as the calling<br />

-- line, but the ProcedureID differs...<br />

-- 2 - The max(LineNumber) if the above is not seen (meaning proc was only<br />

-- statement in the Batch).<br />

--<br />

-- Do part #2 - find all the procs that end a batch

select SPID, KPID, BatchID, exec_row=max(exec_row)<br />

into #proc_ends<br />

from #exec_by_spid<br />

group by SPID, KPID, BatchID<br />

having exec_row=max(exec_row)<br />

and ProcedureID!=0<br />

and object_name(ProcedureID,DBID) is not null<br />

-- in the above we are reporting the row after the proc exits while in the below<br />

-- (after the union all) we are using the last line exec'd within the proc<br />


select x.SPID, x.KPID, x.ProcedureID, x.DBID,<br />

x.BatchID, begin_row=x.exec_row, end_row=b.exec_row,<br />

x.ContextID, x.StartTime, e.EndTime<br />

into #find_ends<br />

from #exec_by_spid e, #proc_ends b, #proc_execs x<br />

where e.SPID=b.SPID and e.KPID=b.KPID and e.BatchID=b.BatchID<br />

and e.SPID=x.SPID and e.KPID=x.KPID and e.BatchID=x.BatchID<br />

and b.SPID=x.SPID and b.KPID=x.KPID and b.BatchID=x.BatchID<br />

and e.exec_row=b.exec_row<br />

union all<br />

-- e1 is the next line, where e2 is proc return...we will use the e2 row_id<br />

-- vs. e1 though so that later we can get the metrics right.<br />

select x.SPID, x.KPID, x.ProcedureID, x.DBID,<br />

x.BatchID, begin_row=x.exec_row, end_row=e2.exec_row,<br />

x.ContextID, x.StartTime, e1.EndTime<br />

from #exec_by_spid e1, #exec_by_spid e2, #proc_execs x<br />

where e1.SPID=e2.SPID and e1.KPID=e2.KPID and e1.BatchID=e2.BatchID<br />

and e1.SPID=x.SPID and e1.KPID=x.KPID and e1.BatchID=x.BatchID<br />

and e2.SPID=x.SPID and e2.KPID=x.KPID and e2.BatchID=x.BatchID<br />

and e1.ContextID = x.ContextID<br />

-- Context is same as calling line<br />

and e1.exec_row = e2.exec_row + 1 -- on the next line<br />

and e1.ProcedureID != e2.ProcedureID -- and the proc differs<br />

and e2.ProcedureID !=0 -- and the exiting proc is not 0<br />

and object_name(e2.ProcedureID,e2.DBID) is not null -- and exiting proc is valid<br />

and e2.ProcedureID=x.ProcedureID -- and we are exiting the desired proc<br />

and e2.DBID=x.DBID<br />

and e2.exec_row > x.exec_row<br />

drop table #proc_execs<br />

drop table #proc_ends<br />

-- the above could result in a some overlaps...so let's eliminate them....<br />

select exec_id=identity(10), SPID, KPID, ProcedureID, DBID, BatchID,<br />

begin_row, end_row=min(end_row),<br />

StartTime=min(StartTime), EndTime=max(EndTime)<br />

into #final_execs<br />

from #find_ends<br />

group by SPID, KPID, ProcedureID, DBID, BatchID, begin_row<br />

having end_row=min(end_row)<br />

order by begin_row<br />

drop table #find_ends<br />

-- #final_execs contains a list of proc execs in order by SPID along<br />

-- with the beginning and ending lines...we now need to get the<br />

-- execution metrics for each. To do this, we rejoin our execs with<br />

-- the original data to get all the metrics for the procs...

select f.exec_id, f.SPID, f.KPID, f.ProcedureID, f.DBID, f.BatchID, f.begin_row,<br />

f.end_row, f.StartTime, f.EndTime, elapsedTotal=datediff(ms,f.StartTime,f.EndTime),<br />

subproc=e.ProcedureID, subdbid=e.DBID, e.LineNumber, e.CpuTime, e.WaitTime,<br />

e.MemUsageKB, e.PhysicalReads, e.LogicalReads, e.PagesModified,<br />

e.PacketsSent, e.RowsAffected, elapsedLine=datediff(ms,e.StartTime,e.EndTime)<br />

into #proc_details<br />

from #final_execs f, #exec_by_spid e<br />

where f.SPID=e.SPID and f.KPID=e.KPID and f.BatchID=e.BatchID<br />

and e.exec_row between f.begin_row and f.end_row<br />

drop table #final_execs<br />

drop table #exec_by_spid<br />

-- now we do the aggregation - first by each execution...so we can later<br />

-- aggregate across executions....if we wanted to track executions by<br />

-- a particular SPID, we would branch from here....<br />

select exec_id, SPID, KPID, ProcedureID, DBID, BatchID,<br />

elapsedTotal, CpuTime=sum(CpuTime),<br />

WaitTime=sum(WaitTime), PhysicalReads=sum(PhysicalReads),<br />

LogicalReads=sum(LogicalReads), PagesModified=sum(PagesModified),<br />

PacketsSent=sum(PacketsSent), RowsAffected=sum(RowsAffected)<br />

into #exec_details<br />

from #proc_details<br />

group by exec_id, SPID, KPID, ProcedureID, DBID, BatchID, elapsedTotal<br />

-- then we do the aggregation by Proc line....this is to spot the bad lines<br />

select DBID, ProcedureID, LineNumber, num_execs=count(*),<br />

elapsed_min=min(elapsedLine),<br />

elapsed_avg=avg(elapsedLine),<br />

elapsed_max=max(elapsedLine),<br />

elapsed_tot=sum(elapsedLine),<br />


CpuTime_min=min(CpuTime),<br />

CpuTime_avg=avg(CpuTime),<br />

CpuTime_max=max(CpuTime),<br />

CpuTime_tot=sum(CpuTime),<br />

WaitTime_min=min(WaitTime),<br />

WaitTime_avg=avg(WaitTime),<br />

WaitTime_max=max(WaitTime),<br />

WaitTime_tot=sum(WaitTime),<br />

PhysicalReads_min=min(PhysicalReads),<br />

PhysicalReads_avg=avg(PhysicalReads),<br />

PhysicalReads_max=max(PhysicalReads),<br />

PhysicalReads_tot=sum(PhysicalReads),<br />

LogicalReads_min=min(LogicalReads),<br />

LogicalReads_avg=avg(LogicalReads),<br />

LogicalReads_max=max(LogicalReads),<br />

LogicalReads_tot=sum(LogicalReads),<br />

PagesModified_min=min(PagesModified),<br />

PagesModified_avg=avg(PagesModified),<br />

PagesModified_max=max(PagesModified),<br />

PagesModified_tot=sum(PagesModified),<br />

PacketsSent_min=min(PacketsSent),<br />

PacketsSent_avg=avg(PacketsSent),<br />

PacketsSent_max=max(PacketsSent),<br />

PacketsSent_tot=sum(PacketsSent),<br />

RowsAffected_min=min(RowsAffected),<br />

RowsAffected_avg=avg(RowsAffected),<br />

RowsAffected_max=max(RowsAffected),<br />

RowsAffected_tot=sum(RowsAffected)<br />

into #line_sum<br />

from #proc_details<br />

where LineNumber > 0<br />

group by ProcedureID, DBID, LineNumber<br />

drop table #proc_details<br />

select ProcedureID, DBID, num_execs=count(*),<br />

elapsed_min=min(elapsedTotal),<br />

elapsed_avg=avg(elapsedTotal),<br />

elapsed_max=max(elapsedTotal),<br />

CpuTime_min=min(CpuTime),<br />

CpuTime_avg=avg(CpuTime),<br />

CpuTime_max=max(CpuTime),<br />

WaitTime_min=min(WaitTime),<br />

WaitTime_avg=avg(WaitTime),<br />

WaitTime_max=max(WaitTime),<br />

PhysicalReads_min=min(PhysicalReads),<br />

PhysicalReads_avg=avg(PhysicalReads),<br />

PhysicalReads_max=max(PhysicalReads),<br />

LogicalReads_min=min(LogicalReads),<br />

LogicalReads_avg=avg(LogicalReads),<br />

LogicalReads_max=max(LogicalReads),<br />

PagesModified_min=min(PagesModified),<br />

PagesModified_avg=avg(PagesModified),<br />

PagesModified_max=max(PagesModified),<br />

PacketsSent_min=min(PacketsSent),<br />

PacketsSent_avg=avg(PacketsSent),<br />

PacketsSent_max=max(PacketsSent),<br />

RowsAffected_min=min(RowsAffected),<br />

RowsAffected_avg=avg(RowsAffected),<br />

RowsAffected_max=max(RowsAffected)<br />

into #exec_sum<br />

from #exec_details<br />

group by ProcedureID, DBID<br />

drop table #exec_details<br />

-- now we need to get the top 10 by exec count<br />

set rowcount 10<br />

select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID),<br />

num_execs, elapsed_min, elapsed_avg, elapsed_max,<br />

CpuTime_min, CpuTime_avg, CpuTime_max,<br />

WaitTime_min, WaitTime_avg, WaitTime_max,<br />

PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,<br />

LogicalReads_min, LogicalReads_avg, LogicalReads_max,<br />

PagesModified_min, PagesModified_avg, PagesModified_max,<br />

PacketsSent_min, PacketsSent_avg, PacketsSent_max,<br />

RowsAffected_min, RowsAffected_avg, RowsAffected_max<br />

from #exec_sum<br />

order by num_execs desc<br />

-- now get the top 10 by average elapsed time<br />


select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID),<br />

num_execs, elapsed_min, elapsed_avg, elapsed_max,<br />

CpuTime_min, CpuTime_avg, CpuTime_max,<br />

WaitTime_min, WaitTime_avg, WaitTime_max,<br />

PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,<br />

LogicalReads_min, LogicalReads_avg, LogicalReads_max,<br />

PagesModified_min, PagesModified_avg, PagesModified_max,<br />

PacketsSent_min, PacketsSent_avg, PacketsSent_max,<br />

RowsAffected_min, RowsAffected_avg, RowsAffected_max<br />

from #exec_sum<br />

order by elapsed_avg desc<br />

-- now get the top 10 by average logical IOs<br />

select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID),<br />

num_execs, elapsed_min, elapsed_avg, elapsed_max,<br />

CpuTime_min, CpuTime_avg, CpuTime_max,<br />

WaitTime_min, WaitTime_avg, WaitTime_max,<br />

PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,<br />

LogicalReads_min, LogicalReads_avg, LogicalReads_max,<br />

PagesModified_min, PagesModified_avg, PagesModified_max,<br />

PacketsSent_min, PacketsSent_avg, PacketsSent_max,<br />

RowsAffected_min, RowsAffected_avg, RowsAffected_max<br />

from #exec_sum<br />

order by LogicalReads_avg desc<br />

-- now let's do the same - but by Proc LineNumber

select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID), LineNumber,<br />

num_execs, elapsed_min, elapsed_avg, elapsed_max,<br />

CpuTime_min, CpuTime_avg, CpuTime_max,<br />

WaitTime_min, WaitTime_avg, WaitTime_max,<br />

PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,<br />

LogicalReads_min, LogicalReads_avg, LogicalReads_max,<br />

PagesModified_min, PagesModified_avg, PagesModified_max,<br />

PacketsSent_min, PacketsSent_avg, PacketsSent_max,<br />

RowsAffected_min, RowsAffected_avg, RowsAffected_max<br />

from #line_sum<br />

order by num_execs desc<br />

-- now get the top 10 by average elapsed time<br />

select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID), LineNumber,<br />

num_execs, elapsed_min, elapsed_avg, elapsed_max,<br />

CpuTime_min, CpuTime_avg, CpuTime_max,<br />

WaitTime_min, WaitTime_avg, WaitTime_max,<br />

PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,<br />

LogicalReads_min, LogicalReads_avg, LogicalReads_max,<br />

PagesModified_min, PagesModified_avg, PagesModified_max,<br />

PacketsSent_min, PacketsSent_avg, PacketsSent_max,<br />

RowsAffected_min, RowsAffected_avg, RowsAffected_max<br />

from #line_sum<br />

order by elapsed_avg desc<br />

-- now get the top 10 by average logical IOs<br />

select DBName=db_name(DBID), ProcName=object_name(ProcedureID,DBID), LineNumber,<br />

num_execs, elapsed_min, elapsed_avg, elapsed_max,<br />

CpuTime_min, CpuTime_avg, CpuTime_max,<br />

WaitTime_min, WaitTime_avg, WaitTime_max,<br />

PhysicalReads_min, PhysicalReads_avg, PhysicalReads_max,<br />

LogicalReads_min, LogicalReads_avg, LogicalReads_max,<br />

PagesModified_min, PagesModified_avg, PagesModified_max,<br />

PacketsSent_min, PacketsSent_avg, PacketsSent_max,<br />

RowsAffected_min, RowsAffected_avg, RowsAffected_max<br />

from #line_sum<br />

order by LogicalReads_avg desc<br />

set rowcount 0<br />

drop table #exec_sum<br />

drop table #line_sum<br />

return 0
end
go

Although it appears complicated, the above procedure could easily be modified to produce a complete<br />

procedure profile of executions for the 12.5 and <strong>15.0</strong> systems and then compare the results reporting on<br />

both the procedures that differed and the offending lines. This proc was left intact for the simple reason<br />

that if you don’t have any <strong>ASE</strong> 12.5 statistics, the above proc could be used as is to track proc executions in<br />

12.5 or <strong>15.0</strong> regardless of migration status.<br />


You can obtain the SQL text for each procedure execution (including the parameter values) by joining on the SPID, KPID, and BatchID. The only difference is that the SQL text is collected as one large entity (later split into 255-byte rows) rather than as individual statements. Consequently, if a SQL batch executes a procedure more than once, you will have to use the ContextID to identify which set of parameters belongs to which procedure execution.
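Assuming the SQL text pipe was also collected into a repository table (here a hypothetical mdaSysSQLText with the same columns as monSysSQLText), the join might be sketched as:

select s.SPID, s.KPID, s.BatchID, s.ContextID, s.LineNumber,
       s.StartTime, s.EndTime, q.SequenceInBatch, q.SQLText
  from mdaSysStatement s, mdaSysSQLText q
 where s.SPID = q.SPID
   and s.KPID = q.KPID
   and s.BatchID = q.BatchID
 order by s.SPID, s.KPID, s.BatchID, q.SequenceInBatch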

Using Sysquerymetrics<br />

A variation on both approaches - one that uses exact query matching based on the hashkey but also yields execution statistics - is to use sysquerymetrics in ASE 15.0 instead of sysqueryplans. Details on how to use sysquerymetrics are provided in a later section ("Diagnosing Issues in ASE 15.0: Using sysquerymetrics & sp_metrics"). However, sysquerymetrics provides a number of performance metric columns including logical I/Os, CPU time, elapsed time, etc. The columns in sysquerymetrics include:

Field        Definition

uid          User ID
gid          Group ID
id           Unique ID
hashkey      The hashkey over the SQL query text
sequence     Sequence number for a row when multiple rows are required for the SQL text
exec_min     Minimum execution time
exec_max     Maximum execution time
exec_avg     Average execution time
elap_min     Minimum elapsed time
elap_max     Maximum elapsed time
elap_avg     Average elapsed time
lio_min      Minimum logical IO
lio_max      Maximum logical IO
lio_avg      Average logical IO
pio_min      Minimum physical IO
pio_max      Maximum physical IO
pio_avg      Average physical IO
cnt          Number of times the query has been executed
abort_cnt    Number of times a query was aborted by the Resource Governor because a resource limit was exceeded
qtext        Query text

Note however, that this displays the query text and not the query plan. Consequently you first identify the<br />

slow queries in <strong>15.0</strong> and then compare the plans. To do this, you begin much the same way as you do with<br />

the AQP capture method, but adding the sysquerymetrics capture detailed later:<br />

1. Enable query plan capture on the 12.5 server. This can be done at the session level with ‘set<br />

plan dump on’ or at the server level with ‘sp_configure “abstract plan dump”, 1’<br />

2. Execute one module of the application to be tested.<br />

3. Turn off AQP dump and bcp out sysqueryplans<br />


4. Enable query plan capture on the <strong>15.0</strong> server.<br />

5. Enable metrics capture on the <strong>15.0</strong> server either at the server level with ‘sp_configure "enable<br />

metrics capture", 1’ or at the session level with ‘set metrics_capture on’<br />

6. Execute the same module as in #2<br />

7. Turn off AQP dump in <strong>ASE</strong> 15<br />

8. bcp in the 12.5 data into the scratch database (create a table called queryplans_125)<br />

9. copy the AQP data for the <strong>ASE</strong> <strong>15.0</strong> data into the scratch database (use select/into to create a<br />

table called queryplans_150)<br />

10. Create an index on hashkey, type and sequence for both AQP tables.<br />

11. Either use sp_metrics to backup the sysquerymetrics data or copy it to table in the scratch<br />

database as well.<br />

12. Run queries to identify plan differences<br />

The SQL queries now change slightly from the AQP method. First we find the list of query plans that have changed and then compare that list to the queries that execute slower than a desired threshold.

-- get a list of the queries that have changed<br />

select t.hashkey<br />

into #qpchgs<br />

from queryplans_125 t, queryplans_150 f<br />

where t.hashkey = f.hashkey<br />

and t.sequence = f.sequence<br />

and t.type = 100 -- aqp text vs. sql

and f.type = 100 -- aqp text vs. sql

and t.text != f.text<br />

union all<br />

select f.hashkey<br />

from queryplans_150 f<br />

where f.sequence not in (select t.sequence<br />

from queryplans_125 t<br />

where f.hashkey = t.hashkey)<br />

union all<br />

select t.hashkey<br />

from queryplans_125 t<br />

where t.sequence not in (select f.sequence<br />

from queryplans_150 f<br />

where f.hashkey = t.hashkey)<br />

go<br />

-- eliminate duplicates<br />

select distinct hashkey<br />

into #qpchanges<br />

from #qpchgs<br />

go<br />

drop table #qpchgs<br />

go<br />

-- now get a list of the ‘slow’ queries in <strong>ASE</strong> <strong>15.0</strong> that are a member<br />

-- of the list of changed queries<br />

select hashkey, sequence, exec_min, exec_max, exec_avg,<br />

elap_min, elap_max, elap_avg, lio_min,<br />

lio_max, lio_avg, pio_min, pio_max, pio_avg,<br />

cnt, weight=cnt*exec_avg,

qtext

from <dbname>..sysquerymetrics -- database under test

where gid = <gid> -- group id the sysquerymetrics data was backed up to

and elap_avg > 2000 -- slow query is defined as avg elapsed time > 2000<br />

and hashkey in (select hashkey from #qpchanges)<br />


Note that this doesn't tell you whether it ran faster or slower in 15.0 - it merely identifies queries in 15.0 that
exceed some arbitrary execution statistic and that have a query plan change from 12.5.
Another very crucial point is that we are comparing AQPs rather than strictly comparing showplans. The
rationale behind this is that comparing the AQPs is less error prone than comparing showplan text.
Changes in formatting (i.e. the vertical bars used to line up the levels in ASE 15), new expanded explanations
(DRI checks), and other changes from enhancements in ASE showplan will result in a lot more 'false
positives' of query plan changes. There is still a good probability that some queries will be reported as
different just due to differences in AQP extensions, but far fewer than with comparing showplan
texts. If you are using a tool that compares showplan text - especially an in-house tool - you may want to
change the logic to use the AQP for the queries instead.

Diagnosing and Fixing Issues in ASE 15.0
This section discusses some tips and tricks for diagnosing problems within ASE 15, as well as some
gotchas.

Using sysquerymetrics & sp_metrics<br />

The sysquerymetrics data is managed via the stored procedure sp_metrics. The ASE 15.0 documentation
contains the syntax for this procedure; however, its use needs a bit of clarification. The key point to
realize is that the currently collecting query metrics are stored in sysquerymetrics with gid=1. If you
'save' previous metric data via sp_metrics 'backup', you must choose an integer higher than 1 that is not
already in use. Typically this is done by simply getting the max(gid) from sysquerymetrics and adding 1.
The full sequence for capturing query metrics is similar to the following:

-- set filter limits as desired (as of <strong>15.0</strong> ESD #2)<br />

exec sp_configure "metrics lio max", 10000<br />

exec sp_configure "metrics pio max", 1000<br />

exec sp_configure "metrics elap max", 5000<br />

exec sp_configure "metrics exec max", 2000<br />

go<br />

--Enable metrics capture<br />

set metrics_capture on<br />

go<br />

-- execute test script<br />

-- go<br />

--Flush/backup metrics & disable<br />

exec sp_metrics 'flush'<br />

go<br />

select max(gid)+1 from sysquerymetrics<br />

go<br />

-- if above result is null or 1, use 2 or higher<br />

exec sp_metrics 'backup', '#'<br />

go<br />

set metrics_capture off<br />

go<br />

--Analyze<br />

select * from sysquerymetrics where gid = #<br />

go<br />

--Drop<br />

exec sp_metrics 'drop', '2', '5'<br />

go<br />

Query metrics can be further filtered by deleting rows with metrics less than those of interest. Remember<br />

that as a system table, you need to turn on ‘allow updates’ first. Additionally, when dropping metrics, the<br />

begin range and end range must exist. For example, in the above, if the first metrics group was 3, the drop<br />

would fail with an error that the group 2 does not exist.<br />
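For example, a minimal sketch of that filtering (thresholds, group ids and the drop range here are
illustrative only):

-- per the note above, allow updates to system tables while filtering
exec sp_configure 'allow updates', 1
go
-- keep only the interesting rows in a backed-up group (gid 2 in this sketch)
delete sysquerymetrics
where gid = 2
and elap_avg < 1000   -- thresholds are illustrative
and lio_avg < 5000
go
exec sp_configure 'allow updates', 0
go
-- drop a contiguous range of backed-up groups (both end groups must exist)
exec sp_metrics 'drop', '2', '4'
go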


The biggest gotcha with sysquerymetrics is that similar to AQP capture, sysquerymetrics can consume a<br />

large amount of space in the system segment. For example, in an internal <strong>Sybase</strong> system, enabling metrics<br />

capture for an hour consumed 1.5GB of disk space. There is a work-around to this if you want to reduce<br />

the impact:<br />

1. Run consecutive 10- or 15-minute metrics captures
2. At the end of each capture period (see the sketch after this list):
a. Select max(gid)+1 from sysquerymetrics
b. sp_metrics 'backup', '<gid from step a>'
3. Start the next capture period
4. Filter previous results (assuming configuration parameters were set higher)
a. i.e. delete rows with metrics below the thresholds of interest
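A minimal sketch of step 2, run in the database under test at the end of each interval (the backup group id
here is illustrative):

exec sp_metrics 'flush'
go
select max(gid)+1 from sysquerymetrics
go
-- use the value returned above (or 2 if it was NULL or 1) as the backup group id
exec sp_metrics 'backup', '2'
go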



Using Showplan options<br />

As mentioned earlier, diagnostic trace flags 302, 310, etc. are being deprecated. These trace flags produced
somewhat cryptic output, offered little control over the amount of output, and generally very little of the
output was usable by the average user. In ASE 15.0, they are being replaced with Showplan options
using the following syntax (each option can be switched on or off, or given a level of detail):

set option <option name> {on | off | brief | long}

The list of options includes:

Option                Description
show                  Show optional details.
show_lop              Shows logical operators used.
show_managers         Shows data structure managers used.
show_log_props        Shows the logical properties used.
show_parallel         Shows parallel query optimization.
show_histograms       Shows the histograms processed.
show_abstract_plan    Shows the details of an abstract plan.
show_search_engine    Shows the details of a search engine.
show_counters         Shows the optimization counters.
show_best_plan        Shows the details of the best QP plan.
show_code_gen         Shows the details of code generation.
show_pio_costing      Shows estimates of physical I/O.
show_lio_costing      Shows estimates of logical I/O.
show_elimination      Shows partition elimination.
show_missing_stats    Shows columns with missing statistics.

A couple of notes about usage:

• Some of these require trace flag 3604 (or 3605 if log output is desired) for the output to be visible.
• Generally, you should execute 'set option show on' first, before other more restrictive options.
This allows you to specify brief or long to override the more general 'show' level of detail.
• These may not produce output if query metrics capturing is enabled.

Consider the following scenarios. Let's assume that we want to see the query plan, while making sure we<br />

haven't missed any obvious statistics, and we want to view the abstract plan (so we can do a 'create plan'<br />

and fix the app later if need be). For this scenario, the command sequence would be:<br />

set showplan on<br />

set option show_missing_stats on<br />

set option show_abstract_plan on<br />

go<br />

Let's see how this looks in action by just getting the missing statistics:<br />

1> set option show_missing_stats long<br />

2> go<br />

1> dbcc traceon(3604)<br />

2> go<br />

DBCC execution completed. If DBCC printed error messages, contact a user with<br />

System Administrator (SA) role.<br />


1> select * from part, partsupp<br />

2> where p_partkey = ps_partkey and p_itemtype = ps_itemtype<br />

3> go<br />

NO STATS on column part.p_partkey<br />

NO STATS on column part.p_itemtype<br />

NO STATS on column partsupp.ps_itemtype<br />

NO STATS on density set for E={p_partkey, p_itemtype}<br />

NO STATS on density set for F={ps_partkey, ps_itemtype}<br />

Now let's do something a bit more common. Let's attempt to debug index selection, io costing, etc. the way<br />

we used to with trace flags 302, 310, 315, etc.<br />

dbcc traceon(3604)<br />

set showplan on<br />

set option show on -- get index selectivity<br />

set option show_missing_stats on -- highlight missing statistics
set option show_lio_costing on -- report logical I/O cost estimates

go<br />

To see how these would work in real life, let’s take a look at a typical problem of whether to use update<br />

statistics or update index statistics and using show_missing_stats and show_lio_costing as the means to<br />

determine which one is more appropriate.<br />

A key point is that show_missing_stats only reports when there are no statistics at all for the column. If<br />

there are density stats as part of an index but not for the column itself, the column will be considered to not<br />

have statistics - which can be extremely useful for identifying why you should run update index<br />

statistics vs. just update statistics. For example, consider the following simplistic example from<br />

pubs2:<br />

use pubs2<br />

go<br />

delete statistics salesdetail<br />

go<br />

update statistics salesdetail<br />

go<br />

dbcc traceon(3604)<br />

set showplan on<br />

--set option show on<br />

set option show_lio_costing on<br />

set option show_missing_stats on<br />

go<br />

select *<br />

from salesdetail<br />

where stor_id='5023'<br />

and ord_num='NF-123-ADS-642-9G3'<br />

go<br />

set showplan off<br />

go<br />

set option show_missing_stats off<br />

set option show_lio_costing off

--set option show off<br />

dbcc traceoff(3604)<br />

go<br />

NO STATS on column salesdetail.ord_num<br />

Beginning selection of qualifying indexes for table 'salesdetail',<br />

Estimating selectivity of index 'salesdetailind', indid 3<br />

stor_id = '5023'<br />

ord_num = 'NF-123-ADS-642-9G3'<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.4310345,<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

scan selectivity 0.07287274, filter selectivity 0.07287274<br />

8.453237 rows, 1 pages<br />

Data Row Cluster Ratio 0.9122807<br />

Index Page Cluster Ratio 0<br />

Data Page Cluster Ratio 1<br />

using no index prefetch (size 4K I/O)<br />

in index cache 'default data cache' (cacheid 0) with LRU replacement<br />


using no table prefetch (size 4K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Data Page LIO for 'salesdetailind' on table 'salesdetail' = 1.741512<br />

Estimating selectivity for table 'salesdetail'<br />

Table scan cost is 116 rows, 2 pages,<br />

The table (Allpages) has 116 rows, 2 pages,<br />

Data Page Cluster Ratio 1.0000000<br />

stor_id = '5023'<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.4310345,<br />

ord_num = 'NF-123-ADS-642-9G3'<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

Search argument selectivity is 0.07287274.<br />

using no table prefetch (size 4K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

The Cost Summary for best global plan:<br />

PARALLEL:<br />

number of worker processes = 3<br />

max parallel degree = 3<br />

min(configured,set) parallel degree = 3<br />

min(configured,set) hash scan parallel degree = 3<br />

max repartition degree = 3<br />

resource granularity (percentage) = 10<br />

FINAL PLAN ( total cost = 61.55964)<br />

Path: 55.8207<br />

Work: 61.55964<br />

Est: 117.3803<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

1 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|SCAN Operator<br />

| FROM TABLE<br />

| salesdetail<br />

| Index : salesdetailind<br />

| Forward Scan.<br />

| Positioning by key.<br />

| Keys are:<br />

| stor_id ASC<br />

| ord_num ASC<br />

| Using I/O Size 4 Kbytes for index leaf pages.<br />

| With LRU Buffer Replacement Strategy for index leaf pages.<br />

| Using I/O Size 4 Kbytes for data pages.<br />

| With LRU Buffer Replacement Strategy for data pages.<br />

Total estimated I/O cost for statement 1 (at line 1): 61.<br />

Now, observe the difference in the following (using update index statistics).<br />

use pubs2<br />

go<br />

delete statistics salesdetail<br />

go<br />

update index statistics salesdetail<br />

go<br />

dbcc traceon(3604)<br />

set showplan on<br />

--set option show on<br />

set option show_lio_costing on<br />

set option show_missing_stats on<br />

go<br />

select *<br />

from salesdetail<br />

where stor_id='5023'<br />

and ord_num='NF-123-ADS-642-9G3'<br />


go<br />

set showplan off<br />

go<br />

set option show_missing_stats off<br />

set option show_lio_costing off

--set option show off<br />

dbcc traceoff(3604)<br />

go<br />

Beginning selection of qualifying indexes for table 'salesdetail',<br />

Estimating selectivity of index 'salesdetailind', indid 3<br />

stor_id = '5023'<br />

ord_num = 'NF-123-ADS-642-9G3'<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.4310345,<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.04101932,<br />

scan selectivity 0.04101932, filter selectivity 0.04101932<br />

4.758242 rows, 1 pages<br />

Data Row Cluster Ratio 0.9122807<br />

Index Page Cluster Ratio 0<br />

Data Page Cluster Ratio 1<br />

using no index prefetch (size 4K I/O)<br />

in index cache 'default data cache' (cacheid 0) with LRU replacement<br />

using no table prefetch (size 4K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Data Page LIO for 'salesdetailind' on table 'salesdetail' = 1.41739<br />

Estimating selectivity for table 'salesdetail'<br />

Table scan cost is 116 rows, 2 pages,<br />

The table (Allpages) has 116 rows, 2 pages,<br />

Data Page Cluster Ratio 1.0000000<br />

stor_id = '5023'<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.4310345,<br />

ord_num = 'NF-123-ADS-642-9G3'<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.04101932,<br />

Search argument selectivity is 0.04101932.<br />

using no table prefetch (size 4K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

The Cost Summary for best global plan:<br />

PARALLEL:<br />

number of worker processes = 3<br />

max parallel degree = 3<br />

min(configured,set) parallel degree = 3<br />

min(configured,set) hash scan parallel degree = 3<br />

max repartition degree = 3<br />

resource granularity (percentage) = 10<br />

FINAL PLAN ( total cost = 60.17239)<br />

Path: 55.54325<br />

Work: 60.17239<br />

Est: 115.7156<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

1 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|SCAN Operator<br />

| FROM TABLE<br />

| salesdetail<br />

| Index : salesdetailind<br />

| Forward Scan.<br />

| Positioning by key.<br />

| Keys are:<br />


| stor_id ASC<br />

| ord_num ASC<br />

| Using I/O Size 4 Kbytes for index leaf pages.<br />

| With LRU Buffer Replacement Strategy for index leaf pages.<br />

| Using I/O Size 4 Kbytes for data pages.<br />

| With LRU Buffer Replacement Strategy for data pages.<br />

Total estimated I/O cost for statement 1 (at line 1): 60.<br />

Note the difference in rows estimated (4 vs. 8), selectivity (0.04 vs. 0.07), and the number of estimated
I/Os (60 vs. 61). While not significant here, remember that this was pubs2, which had a
whopping 116 total rows. This should underscore the need to have updated statistics for all the columns in
an index vs. just the leading column. One aspect of this is that if you drop and recreate indexes, the
statistics collected during index creation are the same as for update statistics, so you may want to
immediately run update index statistics afterwards.
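As a minimal sketch (the table and index names here are hypothetical):

-- (re)creating an index gathers statistics only on the leading column (col1),
-- just as update statistics does
create index t_idx1 on t (col1, col2)
go
-- run update index statistics immediately afterwards so the minor
-- column (col2) gets statistics as well
update index statistics t
go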

Showplan & Merge Joins<br />

In addition to new operators as well as new levels of detail discussed above, showplan also now includes<br />

some additional information that can help diagnose index issues when a merge join is picked. Consider the<br />

following query:<br />

select count(*)<br />

from tableA, tableB<br />

where tableA.TranID = tableB.TranID<br />

and tableA.OriginCode = tableB.OriginCode<br />

and tableB.Status <> 'Y'

Since tableA doesn’t have any SARG conditions, the entire table is likely required - and a table scan would<br />

be expected. Assuming a normal pkey/fkey relationship, however, we would expect tableB to use the index<br />

on the fkey relationship {TranID, OriginCode}. Now, let’s consider the following showplan output for the<br />

query:<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|SCALAR AGGREGATE Operator<br />

| Evaluate Ungrouped COUNT AGGREGATE.<br />

|<br />

| |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | Using Worktable2 for internal storage.<br />

| | Key Count: 1<br />

| | Key Ordering: ASC<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | tableB<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

| | | Positioning at start of table.<br />

| | | Using I/O Size 16 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />

| | |SORT Operator<br />

| | | Using Worktable1 for internal storage.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | tableA<br />

| | | | Table Scan.<br />

| | | | Forward Scan.<br />

| | | | Positioning at start of table.<br />

| | | | Using I/O Size 16 Kbytes for data pages.<br />

| | | | With LRU Buffer Replacement Strategy for data pages.<br />

Note that the merge join is only reporting a single join key - despite the fact the query clearly has two!!!<br />

The likely cause of this is our favorite problem with update statistics vs. update index<br />

statistics - with no statistics on OriginCode (assuming it is the second column in the index or<br />

pkey/fkey constraint), the optimizer automatically estimates the number of rows that qualify for the second<br />

join column (OriginCode) by using the ‘magic’ numbers (10% in this case due to equality). Whether it was<br />


the distribution of data or other factors, the result was an unwanted table scan of tableB. The reason for<br />

this was that in processing the merge join, the outer table was built specifying only a single join key - then<br />

the inner table was sorted by that join key and then the SARG and the other join condition were evaluated<br />

as part of the scan. The optimizer estimated that tableB would have fewer rows, likely due to the SARG
condition (Status <> 'Y') - but it is hard to tell from the details that we have.

Now, let’s see what happens if we run update index statistics - forcing statistics to be collected on<br />

all columns in the index vs. just the first column.<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|SCALAR AGGREGATE Operator<br />

| Evaluate Ungrouped COUNT AGGREGATE.<br />

|<br />

| |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | Using Worktable2 for internal storage.<br />

| | Key Count: 2<br />

| | Key Ordering: ASC ASC<br />

| |<br />

| | |SORT Operator<br />

| | | Using Worktable1 for internal storage.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | tableA<br />

| | | | Table Scan.<br />

| | | | Forward Scan.<br />

| | | | Positioning at start of table.<br />

| | | | Using I/O Size 16 Kbytes for data pages.<br />

| | | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | tableB<br />

| | | Index : tableB_idx<br />

| | | Forward Scan.<br />

| | | Positioning at index start.<br />

| | | Using I/O Size 16 Kbytes for index leaf pages.<br />

| | | With LRU Buffer Replacement Strategy for index leaf pages.<br />

| | | Using I/O Size 16 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

Ah ha!!! Now we have 2 columns in the merge join scan! As a result, we see that the join order changed,<br />

and tableB now uses an index. Result: a much faster query.<br />

Since a merge join could be much faster than a nested loop join when no appropriate indexes are available<br />

(in this case - no SARG on tableA, but also see examples later in discussion on tempdb and merge joins),<br />

applications migrating to <strong>ASE</strong> <strong>15.0</strong> may see a number of ‘bad’ queries picking the merge join due to poor<br />

index statistics. The initial gut reaction for DBAs will likely be to attempt to turn off merge join. While

a nested loop join may be faster than a “bad” merge join, it is also likely that it will be worse than a “good”<br />

merge join. As a result, a key to diagnosing merge join performance issues is to check the number of keys<br />

used in the merge vs. the number of join clauses in the query.<br />
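As a sketch of that diagnostic (using the tables from the example above): refresh statistics on all columns
of the join index first, then re-check the Key Count before reaching for 'set merge_join 0'.

-- gather statistics on every column of tableB's indexes, not just the leading ones
update index statistics tableB
go
-- re-run with showplan and compare "Key Count" to the number of join clauses (2 here)
set showplan on
go
select count(*)
from tableA, tableB
where tableA.TranID = tableB.TranID
and tableA.OriginCode = tableB.OriginCode
and tableB.Status <> 'Y'
go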

Set Statistics Plancost Option<br />

In addition to the optional level of details, <strong>ASE</strong> <strong>15.0</strong> includes several new showplan options that can make<br />

analysis easier. The first is the 'set statistics plancost on/off' command. For example, given the following<br />

sequence:<br />

dbcc traceon(3604)<br />

go<br />

set statistics plancost on<br />

go<br />

select S.service_key, M.year, M.fiscal_period,count(*)<br />

from telco_facts T,month M, service S<br />

where T.month_key=M.month_key<br />

and T.service_key = S.service_key<br />


and S.call_waiting_flag='Y'<br />

and S.caller_id_flag='Y'<br />

and S.voice_mail_flag='Y'<br />

group by M.year, M.fiscal_period, S.service_key<br />

order by M.year, M.fiscal_period, S.service_key<br />

go<br />

The output is a lava query tree as follows:<br />

Emit<br />

(VA = 7)<br />

12 rows est: 1200<br />

cpu: 500<br />

/<br />

GroupSorted<br />

(VA = 6)<br />

12 rows est: 1200<br />

/<br />

NestLoopJoin<br />

Inner Join<br />

(VA = 5)<br />

242704 rows est: 244857<br />

/ \<br />

Sort<br />

IndexScan<br />

(VA = 3)<br />

month_svc_idx (T)<br />

72 rows est: 24 (VA = 4)<br />

lio: 6 est: 6 242704 rows est: 244857<br />

pio: 0 est: 0 lio: 1116 est: 0<br />

cpu: 0 bufct: 16 pio: 0 est: 0<br />

/<br />

NestLoopJoin<br />

Inner Join<br />

(VA = 2)<br />

72 rows est: 24<br />

/ \<br />

TableScan<br />

TableScan<br />

month (M)<br />

service (S)<br />

(VA = 0) (VA = 1)<br />

24 rows est: 24 72 rows est: 24<br />

lio: 1 est: 0 lio: 24 est: 0<br />

pio: 0 est: 0 pio: 0 est: 0<br />

Effectively, this is a much better replacement for 'set statistics io on'. In fact, in ASE 15.0.1
and earlier, 'set statistics io on' will cause errors for parallel queries, so 'set statistics plancost on' should be used instead.
Note in the highlighted section above the cpu and sort buffer cost. This is associated with the new
in-memory sorting algorithms. This option shows the estimated logical I/O, physical I/O and row counts next to
the actual ones evaluated at each operator. If you see that the estimated counts are totally off, then the
optimizer estimates are completely out of whack. Often this is caused by missing or stale
statistics. Let us take the following query and illustrate this fact. The query is also being run with the
show_missing_stats option.

1> dbcc traceon(3604)<br />

2> go<br />

1> set option show_missing_stats on<br />

2> go<br />

1> set statistics plancost on<br />

2> go<br />

1> select<br />

2> l_returnflag,<br />

3> l_linestatus,<br />

4> sum(l_quantity) as sum_qty,<br />

5> sum(l_extendedprice) as sum_base_price,<br />

6> sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,<br />

7> sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,<br />

8> avg(l_quantity) as avg_qty,<br />


9> avg(l_extendedprice) as avg_price,<br />

10> avg(l_discount) as avg_disc,<br />

11> count(*) as count_order<br />

12> from<br />

13> lineitem<br />

14> where<br />

15> l_shipdate <= ...
16> group by

17> l_returnflag,<br />

18> l_linestatus<br />

19> order by<br />

20> l_returnflag,<br />

21> l_linestatus<br />

22> go<br />

==================== Lava Operator Tree ====================<br />

/<br />

GroupSorted<br />

(VA = 2)<br />

4 rows est: 100<br />

/<br />

Sort<br />

(VA = 1)<br />

60175 rows est: 19858<br />

lio: 2470 est: 284<br />

pio: 2355 est: 558<br />

cpu: 1900 bufct: 21<br />

/<br />

TableScan<br />

lineitem<br />

(VA = 0)<br />

60175 rows est: 19858<br />

lio: 4157 est: 4157<br />

pio: 1205 est: 4157<br />

/<br />

Restrict<br />

(0)(13)(0)(0)<br />

(VA = 3)<br />

4 rows est: 100<br />

Emit<br />

(VA = 4)<br />

4 rows est: 100<br />

cpu: 800<br />

============================================================<br />

NO STATS on column lineitem.l_shipdate<br />

(4 rows affected)<br />

As you can see, the estimated number of rows is incorrect at the scan level. The query does have a
predicate on l_shipdate, but the NO STATS message above shows there are no statistics on that column. Let
us create statistics on l_shipdate and rerun the query:
1> update statistics lineitem(l_shipdate)

2> go<br />

1><br />

2> select<br />

3> l_returnflag,<br />

4> l_linestatus,<br />

5> sum(l_quantity) as sum_qty,<br />


6> sum(l_extendedprice) as sum_base_price,<br />

7> sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,<br />

8> sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,<br />

9> avg(l_quantity) as avg_qty,<br />

10> avg(l_extendedprice) as avg_price,<br />

11> avg(l_discount) as avg_disc,<br />

12> count(*) as count_order<br />

13> from<br />

14> lineitem<br />

15> where<br />

16> l_shipdate <= ...
17> group by

18> l_returnflag,<br />

19> l_linestatus<br />

20> order by<br />

21> l_returnflag,<br />

22> l_linestatus<br />

==================== Lava Operator Tree ====================<br />

Emit<br />

(VA = 4)<br />

4 rows est: 100<br />

cpu: 0<br />

/<br />

Restrict<br />

(0)(13)(0)(0)<br />

(VA = 3)<br />

4 rows est: 100<br />

/<br />

HashVectAgg<br />

Count<br />

(VA = 1)<br />

4 rows est: 100<br />

lio: 5 est: 5<br />

pio: 0 est: 0<br />

bufct: 16<br />

/<br />

TableScan<br />

lineitem<br />

(VA = 0)<br />

60175 rows est: 60175<br />

lio: 4157 est: 4157<br />

pio: 1039 est: 4157<br />

/<br />

Sort<br />

(VA = 2)<br />

4 rows est: 100<br />

lio: 6 est: 6<br />

pio: 0 est: 0<br />

cpu: 800 bufct: 16<br />

Well, we now see that the estimated row count for the TableScan operator is the same as the actual. This is
great news; also, our query plan has changed to use HashVectAgg (hash-based vector aggregation)
instead of the Sort and GroupSorted combination that was used earlier. This query plan is much faster than
what we got earlier. But we're not done. If you look at the output of the HashVectAgg operator, the
estimated row count is 100, whereas the actual row count is 4. We could further improve the statistics,
though this is probably our best plan. Since the grouping columns are l_returnflag and l_linestatus, we
decide to create a density statistic on the pair of columns.

1> use tpcd<br />

2> go<br />

1> update statistics lineitem(l_returnflag, l_linestatus)<br />

2> go<br />

1><br />

2> set showplan on<br />

1><br />

2> set statistics plancost on<br />

3> go<br />


1><br />

2> select<br />

3> l_returnflag,<br />

4> l_linestatus,<br />

5> sum(l_quantity) as sum_qty,<br />

6> sum(l_extendedprice) as sum_base_price,<br />

7> sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,<br />

8> sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,<br />

9> avg(l_quantity) as avg_qty,<br />

10> avg(l_extendedprice) as avg_price,<br />

11> avg(l_discount) as avg_disc,<br />

12> count(*) as count_order<br />

13> from<br />

14> lineitem<br />

15> where<br />

16> l_shipdate <= ...
17> group by

18> l_returnflag,<br />

19> l_linestatus<br />

20> order by<br />

21> l_returnflag,<br />

22> l_linestatus<br />

QUERY PLAN FOR STATEMENT 1 (at line 2).<br />

4 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|RESTRICT Operator<br />

|<br />

| |SORT Operator<br />

| | Using Worktable2 for internal storage.<br />

| |<br />

| | |HASH VECTOR AGGREGATE Operator<br />

| | | GROUP BY<br />

| | | Evaluate Grouped COUNT AGGREGATE.<br />

| | | Evaluate Grouped SUM OR AVERAGE AGGREGATE.<br />

| | | Evaluate Grouped COUNT AGGREGATE.<br />

| | | Evaluate Grouped SUM OR AVERAGE AGGREGATE.<br />

| | | Evaluate Grouped SUM OR AVERAGE AGGREGATE.<br />

| | | Evaluate Grouped SUM OR AVERAGE AGGREGATE.<br />

| | | Evaluate Grouped SUM OR AVERAGE AGGREGATE.<br />

| | | Using Worktable1 for internal storage.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | lineitem<br />

| | | | Table Scan.<br />

| | | | Forward Scan.<br />

| | | | Positioning at start of table.<br />

| | | | Using I/O Size 2 Kbytes for data pages.<br />

| | | | With MRU Buffer Replacement Strategy for data pages.<br />

==================== Lava Operator Tree ====================<br />

/<br />

Restrict<br />

(0)(13)(0)(0)<br />

(VA = 3)<br />

4 rows est: 4<br />

Emit<br />

(VA = 4)<br />

4 rows est: 4<br />

cpu: 0<br />

/<br />

/<br />

Sort<br />

(VA = 2)<br />

4 rows est: 4<br />

lio: 6 est: 6<br />

pio: 0 est: 0<br />

cpu: 700 bufct: 16<br />


HashVectAgg<br />

Count<br />

(VA = 1)<br />

4 rows est: 4<br />

lio: 5 est: 5<br />

pio: 0 est: 0<br />

bufct: 16<br />

/<br />

TableScan<br />

lineitem<br />

(VA = 0)<br />

60175 rows est: 60175<br />

lio: 4157 est: 4157<br />

pio: 1264 est: 4157<br />

Look at the estimated row count for the HashVectAgg. It is now the same as the actual row count.
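When you have finished the analysis, turn the diagnostics back off, for example:

set statistics plancost off
go
set showplan off
go
dbcc traceoff(3604)
go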

Query Plans As XML<br />

In ASE 15.0, you can also obtain query plans in XML. This is useful if building an automated tool to
display the query plan graphically, such as the dbisql Plan Viewer. However, another useful technique is to
use an XML query plan to find the last time statistics were updated for a particular table. In the past, the
only means to do this was by using the optdiag utility or by directly querying the systabstats/sysstatistics
system tables. Consider the following example:

$SYB<strong>ASE</strong>/<strong>ASE</strong>-15_0/bin/optdiag statistics le_01.dbo.part -Usa -P<br />

Server name: "tpcd"<br />

Specified database: "le_01"<br />

Specified table owner: "dbo"<br />

Specified table: "part"<br />

Specified column: not specified<br />

Table owner: "dbo"<br />

Table name: "part"<br />

...................................................<br />

Statistics for column: "p_partkey"<br />

Last update of column statistics: Sep 13 2005 7:51:39:440PM<br />

Range cell density: 0.0010010010010010<br />

Total density: 0.0010010010010010<br />

Range selectivity: default used (0.33)<br />

In between selectivity: default used (0.25)<br />

Histogram for column: "p_partkey"<br />

Column datatype: integer<br />

Requested step count: 20<br />

Actual step count: 20<br />

Sampling Percent: 0<br />

Step Weight Value<br />

1 0.00000000 go<br />

1> select count(*) from part where p_partkey > 20<br />


2> go<br />

-----------<br />

979<br />

1> select showplan_in_xml(-1)<br />

2> go<br />

-----------
[XML query plan output: among other elements it includes the statement text (select count(*) from part
where p_partkey > 20), the table name (part), the column used (p_partkey), and the last update of column
statistics for that column (Sep 13 2005 7:51:39:440PM).]

You can also get the above information using the show_final_plan_xml option. Note how the "set plan"<br />

uses the "client" option and traceflag 3604 to get the output on the client side. This is different from how<br />

you need to use the "message" option of "set plan".<br />

1> dbcc traceon(3604)<br />

2> go<br />

DBCC execution completed. If DBCC printed error messages, contact a user with<br />

System Administrator (SA) role.<br />

1> set plan for show_final_plan_xml to client on<br />

2> go<br />

1> select * from part, partsupp<br />

2> where p_partkey = ps_partkey and p_itemtype = ps_itemtype<br />

3> go<br />

[XML query plan output: for each table in the query (part and partsupp) the plan lists the columns
referenced (p_partkey, p_itemtype, ps_partkey, ps_itemtype) along with the statistics information for
those columns.]

The useful aspect of this is that it is a single-step operation. Normally, a textual showplan does not provide
this information; consequently, users need to perform the second step of either using optdiag or
querying the system tables to obtain the last time statistics were updated. By using the XML output, all the
information is available in a single location, which, combined with the ease of parsing XML vs. text,
allows tool developers to provide enhanced functionality with ASE 15 that was not easily available in
previous releases.


Fixing SQL Queries using AQP<br />

As mentioned earlier, Sybase introduced AQP in version 12.0. The goal was to allow customers to modify
a query to use a specific query plan without having to alter the code. The general technique, which is
transparent to applications, consists of the following steps:

1. Identify the problem queries in the new version<br />

2. Turn AQP capture on in the previous release and run the queries.<br />

3. Extract the AQP from the ap_stdout group<br />

4. Review and modify the AQP’s as necessary (i.e. adjust for partial plan use, etc.)<br />

5. Use create plan to load them into the default ap_stdin group in the newer release (see the sketch after this list)

6. Enable abstract plan loading on the new release (sp_configure ‘abstract plan load’, 1)<br />

7. Adjust the abstract plan cache to avoid io during optimization (sp_configure ‘abstract plan<br />

cache’)<br />

8. Retest the queries in the new version to see if the AQP is used<br />

9. Adjust the AQP as necessary to leverage new features that may help the query perform even<br />

better<br />

10. Work with <strong>Sybase</strong> Technical Support on resolving original optimization issue<br />
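A minimal sketch of steps 5 through 7 (the query and plan text below are illustrative only, borrowed from
the examples later in this section):

-- step 5: load the corrected plan into the default ap_stdin group
create plan
"select count(*) from orders, lineitem where o_orderkey = l_orderkey"
"( nl_join ( t_scan orders ) ( i_scan l_idx1 lineitem ) )"
into ap_stdin
go
-- step 6: enable abstract plan loading so the saved plans are used
exec sp_configure 'abstract plan load', 1
go
-- step 7: enable the abstract plan cache to avoid I/O during optimization
exec sp_configure 'abstract plan cache', 1
go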

Note that this technique is especially useful to get past a problem that might delay an upgrade beyond the<br />

window of opportunity, or to get past the “show-stoppers” that occasionally occur. As mentioned earlier,<br />

more documentation is available in the Performance and Tuning Guide on Optimization and Abstract Plans.<br />

This method also replaces the need to use ‘set forceplan on’, and in particular allows more control than just<br />

join-order processing enforcement that forceplan implements. By illustration, consider the following use<br />

cases.<br />


Forcing an Index Using AQP<br />

Generally, we are all aware that we can force an index by adding an index hint to the query itself. For
example:

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> go<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

3 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|NESTED LOOP JOIN Operator (Join Type: Inner Join)<br />

|<br />

| |SCAN Operator<br />

| | FROM TABLE<br />

| | orders<br />

| | Table Scan.<br />

| | Forward Scan.<br />

| | Positioning at start of table.<br />

| |SCAN Operator<br />

| | FROM TABLE<br />

| | lineitem<br />

| | Table Scan.<br />

| | Forward Scan.<br />

| | Positioning at start of table.<br />

This is an example where the lineitem table is being scanned without an index. This may not be the best<br />

available query plan. Maybe the query would run faster if one used the index on lineitem called l_idx1. This

can be done by rewriting the query as follows using the legacy force option.<br />


1> select count(*) from orders, lineitem (index l_idx1) where o_orderkey =<br />

l_orderkey<br />

2> go<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

3 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|NESTED LOOP JOIN Operator (Join Type: Inner Join)<br />

|<br />

| |SCAN Operator<br />

| | FROM TABLE<br />

| | orders<br />

| | Table Scan.<br />

| | Forward Scan.<br />

| | Positioning at start of table.<br />

|<br />

| |SCAN Operator<br />

| | FROM TABLE<br />

| | lineitems<br />

| | Index : l_idx1<br />

| | Forward Scan.<br />

| | Positioning by key.<br />

| | Keys are:<br />

| | l_orderkey ASC<br />

Though force options are simple, pretty soon you'll realize that you cannot do everything using this option.
In addition, as we mentioned, this requires changing the application code, which, even if you could do so,
would take much longer than simply using an AQP. Let us take the same problem, but this time use an
AQP. To make it easier, we will let ASE generate the AQP for us, edit it, and use it to force an index.

1> set option show_abstract_plan on<br />

2> go<br />

1> dbcc traceon(3604)<br />

2> go<br />

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

3> go<br />

The Abstract Plan (AP) of the final query execution plan:<br />

( nl_join ( t_scan orders ) ( t_scan lineitem ) ) ( prop orders ( parallel 1 ) (<br />

prefetch 2 ) (lru ) ) ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

To experiment with the optimizer behavior, this AP can be modified and then passed to the optimizer using<br />

the PLAN clause:<br />

SELECT/INSERT/DELETE/UPDATE ...PLAN '( ... )<br />

Now that we have a starting AP, we can replace the table scans (t_scan) with index accesses and try our<br />

modified plan (indent for readability). To start with, we will specify the tables using (prop tablename) as<br />

in:<br />

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> plan<br />

3> "( nl_join<br />

4> ( t_scan orders )<br />

5> ( t_scan lineitem )<br />

6> )<br />

7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )<br />

8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

What we need to do is force the index. The option to force an index scan is (i_scan <index name> <table name>). Let us then rewrite the query with the new AP.

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> plan<br />

3> "( nl_join<br />

4> ( t_scan orders )<br />

5> ( i_scan l_idx1 lineitem )<br />

6> )<br />

7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )<br />

8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />



Forcing a join order<br />

Join orders can be forced using the old legacy style by using the command set forceplan on. In this case, we<br />

may want to force the join order such that the lineitem table is outer to the orders table, which essentially<br />

boils down to switching the join order of these two tables. It can be achieved through the legacy option,<br />

where you specify a different join order by switching tables in the from clause of the query.<br />

1> set forceplan on<br />

2> go<br />

1> select count(*) from lineitem, orders where o_orderkey = l_orderkey<br />

2> go<br />

Since we are on the subject of using APs and we know how to get a starting AP, we shall see how we can
use an existing AP and then modify it. Our starting AP looks like the following:

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> plan<br />

3> "( nl_join<br />

4> ( t_scan orders )<br />

5> ( t_scan lineitem )<br />

6> )<br />

7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )<br />

8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

What we need to do is to be able to force the join order. This can be done by switching the join order in the<br />

AP.<br />


1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> plan<br />

3> "( nl_join<br />

4> ( t_scan lineitem)<br />

5> ( t_scan orders )<br />

6> )<br />

7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )<br />

8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

Forcing a different join strategy<br />

Things get much more difficult when you have to force a different join strategy. There are session level<br />

options whereby you can turn join strategies on or off. But the best way would be to use an AP. Let us start with

the session level options and say you want to try and see if merge join performs better than nested loop<br />

join:<br />

1> set nl_join 0<br />

2> go<br />

1> select * from orders, lineitem where o_orderkey = l_orderkey<br />

2> go<br />

The Abstract Plan (AP) of the final query execution plan:<br />

( m_join ( i_scan l_idx1 lineitem ) ( sort ( t_scan orders ) ) ) ( prop lineitem (<br />

parallel 1 ) (prefetch 2 ) ( lru ) ) ( prop orders ( parallel 1 ) ( prefetch 2 ) (<br />

lru ) )<br />

To experiment with the optimizer behavior, this AP can be modified and then passed to the optimizer using<br />

the PLAN clause: SELECT/INSERT/DELETE/UPDATE ...PLAN '( ... )<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

4 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|MERGE JOIN Operator (Join Type: Inner Join)<br />

| Using Worktable2 for internal storage.<br />

| Key Count: 1<br />

| Key Ordering: ASC<br />

|<br />

| |SCAN Operator<br />


| | FROM TABLE<br />

| | lineitem<br />

| | Index : X_NC1<br />

| | Forward Scan.<br />

| | Positioning at index start.<br />

| | Index contains all needed columns. Base table will not be read.<br />

| |SORT Operator<br />

| | Using Worktable1 for internal storage.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | orders<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

You can selectively turn join strategies on/off as needed at a session level. The ones that you can<br />

experiment with are:<br />

set nl_join [0 | 1]<br />

set merge_join [0 | 1]<br />

set hash_join [0 | 1]<br />

Using APs, you can start off with the one produced by the optimizer as shown before.<br />

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> plan<br />

3> "( nl_join<br />

4> ( t_scan orders )<br />

5> ( t_scan lineitem )<br />

6> )<br />

7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )<br />

8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

Then you can modify the AP to change the join algorithm from nested loop to merge. Note that merge join<br />

needs ordering on the joining column. To get the required ordering, use an AP construct called enforce that

will generate the right ordering on the right column.<br />

1> select count(*) from orders, lineitem where o_orderkey = l_orderkey<br />

2> plan<br />

3> "( m_join<br />

4> (enforce ( t_scan orders ))<br />

5> (enforce ( t_scan lineitem ))<br />

6> )<br />

7> ( prop orders ( parallel 1 ) ( prefetch 2 ) (lru ) )<br />

8> ( prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

4 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|MERGE JOIN Operator (Join Type: Inner Join)<br />

| Using Worktable2 for internal storage.<br />

| Key Count: 1<br />

| Key Ordering: ASC<br />

| |SORT Operator<br />

| | Using Worktable1 for internal storage.<br />

| | |SCAN Operator<br />

| | |FROM TABLE<br />

| | |orders<br />

| | |Table Scan.<br />

| | |Forward Scan.<br />

| | |Positioning at start of table.<br />

| | |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />


| | | lineitem<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

| | | Positioning at start of table.<br />

Note that the orders table is sorted, while the lineitem table is not. The enforce construct determines
the right ordering required to do the join.


Forcing different subquery attachment<br />

Changing subquery attachment is applicable only to those correlated subqueries that cannot be flattened.<br />

The main reason why you want to change the subquery attachment is probably to reduce the number of<br />

times a subquery gets evaluated. Let us take a three-table join as shown below and use the old trick of
getting the AP. Only the skeletal plan output is shown; it shows that the subquery is attached after
the join of the three outer tables is performed. This is highlighted in the showplan output.

1> select count(*)<br />

2> from lineitem, part PO, customer<br />

3> where l_partkey = p_partkey and l_custkey = c_custkey<br />

4> and p_cost = (select min(PI.p_cost) from part PI where PO.p_partkey =<br />

PI.p_partkey)<br />

5> go<br />

The Abstract Plan (AP) of the final query execution plan:<br />

( scalar_agg ( nested ( m_join ( sort ( m_join ( sort ( t_scan customer ) ) ( sort<br />

(t_scan lineitem ) ) ) ) ( i_scan part_indx (table (PO part ) ) ) ( subq (<br />

scalar_agg ( t_scan (table (PI part ) ) ) ) ) ) ) (prop customer ( parallel 1 ) (<br />

prefetch 2 ) ( lru ) ) ( prop (table (PO part)) ( parallel 1 ) (prefetch 2 ) ( lru<br />

) ) (prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) ) ( prop (table(PI part))<br />

( parallel 1 ) (prefetch 2 ) ( lru ) )<br />

To experiment with the optimizer behavior, this AP can be modified and then passed to the optimizer using<br />

the PLAN clause: SELECT/INSERT/DELETE/UPDATE ...PLAN '( ... )<br />

QUERY PLAN FOR STATEMENT 1 (at line 1).<br />

12 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|SCALAR AGGREGATE Operator<br />

| Evaluate Ungrouped COUNT AGGREGATE.<br />

|<br />

| |SQFILTER Operator has 2 children.<br />

| | |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | |<br />

| | | |SORT Operator<br />

| | | | Using Worktable4 for internal storage.<br />

| | | |<br />

| | | | |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | | | |<br />

| | | | | |SORT Operator<br />

| | | | | | Using Worktable1 for internal storage.<br />

| | | | | |<br />

| | | | | | |SCAN Operator<br />

| | | | | | | FROM TABLE<br />

| | | | | | | customer<br />

| | | | | | | Table Scan.<br />

| | | | | |SORT Operator<br />

| | | | | | Using Worktable2 for internal storage.<br />

| | | | | |<br />

| | | | | | |SCAN Operator<br />

| | | | | | | FROM TABLE<br />

| | | | | | | lineitem<br />

| | | | | | | Table Scan.<br />

| | | |SCAN Operator<br />


| | | | FROM TABLE<br />

| | | | part<br />

| |<br />

| | Run subquery 1 (at nesting level 1).<br />

| |<br />

| | QUERY PLAN FOR SUBQUERY 1 (at nesting level 1 and at line 4).<br />

| |<br />

| | Correlated Subquery.<br />

| | Subquery under an EXPRESSION predicate.<br />

| |<br />

| | |SCALAR AGGREGATE Operator<br />

| | | Evaluate Ungrouped MINIMUM AGGREGATE.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | part<br />

| |<br />

| | END OF QUERY PLAN FOR SUBQUERY 1.<br />

One other thing that may facilitate your understanding is to get the operator tree. You can get it by using<br />

trace flag 526 or set statistics plancost. Trace 526 outputs the lava operator tree without the cost (as<br />

illustrated below) – having the cost available is useful for determining whether the AP you are forcing is<br />

efficient, however. As a result, ‘set statistics plancost on’ is the recommended approach. Note the location<br />

of the aggregate on PI (highlighted).<br />

==================== Lava Operator Tree ====================<br />

Emit<br />

(VA = 12)<br />

/<br />

ScalarAgg<br />

Count<br />

(VA = 11)<br />

/<br />

SQFilter<br />

(VA = 10)<br />

/ \<br />

MergeJoin ScalarAgg<br />

Inner Join Min<br />

(VA = 7) (VA = 9)<br />

/ \ /<br />

Sort TableScan TableScan<br />

(VA = 5) part(PO) part(PI)<br />

(VA = 6) (VA = 8)<br />

/<br />

MergeJoin<br />

Inner Join<br />

(VA = 4)<br />

/ \<br />

Sort<br />

Sort<br />

(VA = 1) (VA = 3)<br />

/ /<br />

TableScan<br />

TableScan<br />

customer<br />

lineitem<br />

(VA = 0) (VA = 2)<br />

============================================================<br />

This query plan may not be optimal. The subquery is dependent on the outer table part (PO), which means<br />

that it can be attached anywhere after this table has been scanned. Let us assume that the correct join order<br />

needs to be part (PO) as the outermost table, followed by lineitem and then customer as the innermost

table. Let's also assume that we need to attach the subquery to the scan of table PO. This can be achieved<br />

by starting off with the AP produced in the previous example and then modifying it to our needs.

1> select count(*)<br />

2> from lineitem, part PO, customer<br />


3> where l_partkey = p_partkey and l_custkey = c_custkey<br />

4> and p_cost = (select min(PI.p_cost) from part PI where PO.p_partkey =<br />

PI.p_partkey)<br />

5> plan<br />

6> "(scalar_agg<br />

7> (m_join<br />

8> (sort<br />

9> (m_join<br />

10> (nested<br />

11> (scan (table (PO part)))<br />

12> (subq (scalar_agg (scan (table (PI part)))))<br />

13> )<br />

14> (sort<br />

15> (scan lineitem)<br />

16> )<br />

17> )<br />

18> )<br />

19> (sort<br />

20> (scan customer)<br />

21> )<br />

22> )<br />

23> )<br />

24>(prop customer ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

25>(prop (table (PO part)) ( parallel 1 ) (prefetch 2 ) ( lru ) )<br />

26>(prop lineitem ( parallel 1 ) ( prefetch 2 ) ( lru ) )<br />

27>(prop (table(PI part)) ( parallel 1 ) (prefetch 2 ) ( lru ) )<br />

28> go<br />

==================== Lava Operator Tree ====================<br />

Emit<br />

(VA = 12)<br />

/<br />

ScalarAgg<br />

Count<br />

(VA = 11)<br />

/<br />

MergeJoin<br />

Inner Join<br />

(VA = 10)<br />

/ \<br />

Sort Sort<br />

(VA = 7) (VA = 9)<br />

/ /<br />

MergeJoin<br />

TableScan<br />

Inner Join<br />

customer<br />

(VA = 6) (VA = 8)<br />

/ \<br />

SQFilter Sort<br />

(VA = 3) (VA = 5)<br />

/ \ /<br />

TableScan ScalarAgg TableScan<br />

part(PO) Min lineitem<br />

(VA = 0) (VA = 2) (VA = 4)<br />

/<br />

TableScan<br />

part (PI)<br />

(VA = 1)<br />

============================================================<br />

12 operator(s) under root<br />

The type of query is SELECT.<br />

ROOT:EMIT Operator<br />

|SCALAR AGGREGATE Operator<br />


| Evaluate Ungrouped COUNT AGGREGATE.<br />

|<br />

| |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | Using Worktable5 for internal storage.<br />

| | Key Count: 1<br />

| | Key Ordering: ASC<br />

| |<br />

| | |SORT Operator<br />

| | | Using Worktable3 for internal storage.<br />

| | |<br />

| | | |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | | | Using Worktable2 for internal storage.<br />

| | | | Key Count: 1<br />

| | | | Key Ordering: ASC<br />

| | | |<br />

| | | | |SQFILTER Operator has 2 children.<br />

| | | | |<br />

| | | | | |SCAN Operator<br />

| | | | | | FROM TABLE<br />

| | | | | | part<br />

| | | | | | PO<br />

| | | | | | Table Scan.<br />

| | | | | | Forward Scan.<br />

| | | | |<br />

| | | | | Run subquery 1 (at nesting level 1).<br />

| | | | |<br />

| | | | | QUERY PLAN FOR SUBQUERY 1 (at nesting level 1 and at line 4).<br />

| | | | |<br />

| | | | | Correlated Subquery.<br />

| | | | | Subquery under an EXPRESSION predicate.<br />

| | | | |<br />

| | | | | |SCALAR AGGREGATE Operator<br />

| | | | | | Evaluate Ungrouped MINIMUM AGGREGATE.<br />

| | | | | |<br />

| | | | | | |SCAN Operator<br />

| | | | | | | FROM TABLE<br />

| | | | | | | part<br />

| | | | | | | PI<br />

| | | | | | | Table Scan.<br />

| | | | | | | Forward Scan.<br />

| | | | | END OF QUERY PLAN FOR SUBQUERY 1.<br />

| | | |<br />

| | | | |SORT Operator<br />

| | | | | Using Worktable1 for internal storage.<br />

| | | | |<br />

| | | | | |SCAN Operator<br />

| | | | | | FROM TABLE<br />

| | | | | | lineitem<br />

| | | | | | Table Scan.<br />

| | | | | | Forward Scan.<br />

| | |SORT Operator<br />

| | | Using Worktable4 for internal storage.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | customer<br />

| | | | Table Scan.<br />

| | | | Forward Scan.<br />

Note the shift in the location of the subquery as highlighted above.<br />

Query Processing FAQ:<br />

Is there a 'switch' or trace flag that re-enables the old 12.5 optimizer?
This is a common myth that seems to surface with every new release; it was given further credence because
during the early beta stages of ASE 15 both optimizers were available for testing purposes. As with any GA
release, only a single codeline is released into the product. Certain new features or changes may be able to

be influenced by a traceflag – however, in many cases doing so impacts queries that were helped by the<br />

change. Consequently using the AQP feature is a more “targeted” approach.<br />

Is there a tool that can automatically find queries that were impacted?
Yes. DBExpert 15.0 includes the migration analyzer, which compares the query plans and execution
statistics between queries in 12.5 and 15.0, identifying which queries were changed and what the impact
was.

Should I still use trace flags 302, 310, etc. to diagnose optimizer issues?
No. These trace flags have been replaced by showplan options which, in addition to not requiring sa_role,
provide much clearer output as well as the ability to control the amount of detailed output.
In a future release, trace flags 302, 310, etc. will be fully deprecated and inoperable.
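For example, where you once used dbcc traceon(3604, 302, 310), you would now use something like the
following (the same options demonstrated earlier in this section):

dbcc traceon(3604)   -- still used to route diagnostic output to the client
go
set option show on
set option show_missing_stats on
set option show_lio_costing on
go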

Why doesn't 'set statistics io on' work in parallel mode?
In early ASE 15.0 releases this was not supported; support was added in ASE 15.0 ESD #2
(released in May 2006). However, a more accurate picture can be achieved by using 'set statistics plancost
on', which will show the parallel access nodes in the lava operator tree.

Query result not sorted when doing group by queries.<br />

In pre <strong>15.0</strong>, the grouping algorithm creates a work table with clustered index. Grouping works based on<br />

insertion into that work table. In <strong>15.0</strong>, the grouping algorithm has been changed to use a hash based<br />

strategy. This does not generate a sorted result set. In order to generate a sorted result set, queries will<br />

have to be changed to include an order by clause. Note that this change is in line with ANSI SQL<br />

standards.<br />
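For example (a minimal illustration using the titles table referenced elsewhere in this document), the explicit order by guarantees the ordering that 12.5 happened to produce as a side effect of the sorted work table:

select type, count(*)
from titles
group by type
order by type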

I have heard that joins between columns of different but compatible data types have been solved in 15.0. Do we still need trace flag 291 in 15.0?

No. You must not use trace flag 291; doing so could produce wrong answers. ASE 15.0 has improved algorithms to take care of joins between compatible but different data types. There is no problem using indices even if the SARGs are of different datatypes.

My customer uses "set tablecount". Is that still supported in ASE 15.0?

No. The option "set tablecount" is obsolete in 15.0, because the optimizer now uses a more sophisticated cost-based pruning and optimization timeout mechanism.

I see merge join being chosen despite "enable sort-merge join and JTC" turned off.

In general, "enable sort-merge join and JTC" is not supported in 15.0. If you do want to turn merge join off, you have to do it at the session level using the command "set merge_join 0", or use an optimization goal that disables merge join, such as "allrows_oltp". However, the reason that ASE is picking a merge join (when it appears it shouldn't) over a nested-loop join is likely that it cannot use an index due to poor or missing statistics, or because not all the necessary columns are covered. Before simply attempting to disable the merge join, you may want to look at the query diagnostics to determine the best course of action. Arbitrarily disabling merge joins may detrimentally affect queries that benefit from them, and resolving the underlying cause may achieve much better performance than strictly disabling merge joins.
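If you do decide to disable merge joins, a sketch of the two session-level approaches mentioned above (the optimization goal name is the one cited in the answer; treat the exact commands as a starting point to verify in your environment):

-- disable merge join for this session only
set merge_join 0
go
-- or select an optimization goal that excludes merge joins
set plan optgoal allrows_oltp
go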

When I add an order by clause to my query, it runs orders of magnitude slower.

This is possible if you have DSYNC turned on for tempdb. In one instance, we measured a time difference of 20 minutes versus 7 hours with the DSYNC option turned OFF versus ON. In general, you should try to turn DSYNC off. Here is a sample set of commands that will turn DSYNC off for your database; adapt them to your environment. In the first case, we show how the default tempdb is created on new devices with DSYNC turned off.

USE master
go
DISK INIT name = 'tempdbdev01', physname = '/tempdb_data', size = '4G', dsync = 'false'
go
DISK INIT name = 'tempdblogdev01', physname = '/tempdb_log', size = '4G', dsync = 'false'
go
ALTER DATABASE tempdb ON tempdbdev01 = '4G' LOG ON tempdblogdev01 = '4G'
go
USE tempdb
go
EXEC sp_dropsegment 'logsegment', 'tempdb', 'master'
go
EXEC sp_dropsegment 'system', 'tempdb', 'master'
go
EXEC sp_dropsegment 'default', 'tempdb', 'master'
go

If you already have devices established for tempdb, you merely have to turn the DSYNC property off. You will have to reboot ASE.

EXEC sp_deviceattr 'tempdbdev01', 'dsync', 'false'
go
EXEC sp_deviceattr 'tempdblogdev01', 'dsync', 'false'
go

I think we have discovered a bug in the optimizer; what information should we collect?

Query processing problems can be of several types. In this case, either you are seeing a stack trace in the errorlog (possibly coupled with a client disconnect) or degraded performance (either of unknown cause, or where a correct plan is obtainable via query forcing). It is imperative that you isolate the problem query. Once you have done that, collect the following output (remember, Sybase has non-disclosure agreements with all customers, which may alleviate concerns about providing business-sensitive data):

• Preferably, Sybase would like to have a full database dump. If that is not available, the full schema of the tables involved, the stored procedure source code, and a bcp extraction of the data is the next best option. While many issues may be resolvable without this information, such fixes are not always guaranteed as they are based on "guesses" about what is happening. By having a copy of the data, the exact data cardinality, data volumes, and data selectivity that feed the optimizer costing algorithms are available, not only for problem detection but also for testing the resolution, ensuring that when a fix is provided you can be certain that it actually fixes the problem.

• Get the output of ddlgen. If you can't provide the data, then the full schema for all tables, indices, triggers and procedures involved will be needed.

• Get the output of optdiag, including simulate mode if used. If the data cannot be provided, optimizer engineering will need some notion of the data volumes, cardinalities and selectivity involved. If you were able to influence the optimizer using simulated statistics, send those as well.

• Force the good plan and collect the following information for it as well as for the bad plan (a sample collection script follows this list):
  o set statistics plancost on
  o set statistics time on
  o set option show long (this output could be huge; you do not need the output of trace flags 302/310 anymore, but you may need trace flag 3604 enabled)
  o set showplan on (with trace flag 526 turned on)
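Putting the items above together, a minimal collection script might look like the following sketch (it mirrors the test script used later in this document; substitute your own problem query and run it once with the forced plan and once without):

dbcc traceon(3604)
go
set statistics plancost on
set statistics time on
set showplan on
go
set option show long
go
-- run the problem query here, for example:
select * from authors where au_lname = 'Ringer'
go
set option show off
set statistics plancost off
set statistics time off
set showplan off
go
dbcc traceoff(3604)
go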

We are getting wrong answers from ASE for certain queries.

There are two situations here: with or without parallelism involved. Without parallelism, Sybase engineering would ideally like to be able to reproduce the problem using your data, so once again, whatever you can provide in terms of database dumps or bcp'd extracts of the data is most helpful. In addition, collect the output after enabling the following options and running the query:

• set option show_code_gen on
• dbcc traceon(201)

If the problem only occurs in parallel queries and not when the query is run in serial fashion, add the following to those listed above:

• set option show long
• set option show_parallel long

If proper partition elimination is not happening, add the following option to those above:

• set option show_elimination long


Storage & Disk IO Changes

One of the other major changes in ASE 15.0 was in the area of storage and disk I/O. The two main features were the Very Large Storage Support (VLSS) and DIRECTIO implementations. As a result, there will likely be impacts on maintenance applications and tools, especially DBA scripts that monitor space usage. Those impacts are discussed later; this section focuses on the higher-level changes brought on by these two features.

Very Large Storage Support

In pre-15.0 releases of Adaptive Server, a virtual page was described internally as a 32-bit integer: the first byte holds the device number (the vdevno) and the succeeding three bytes describe the page offset within the device in units of 2K bytes (the virtual page number). This is illustrated below:

Figure 6 - ASE logical page & device bitmask prior to ASE 15.0

This architecture limited the number of devices to 256 and the size of each device to 32 gigabytes, which makes a maximum storage limit of 8 terabytes for the entire server. While the upper bound of the storage size typically was not an issue, several customers were hitting the limit on the number of devices, largely because the device size restriction forced more devices to be created than would have been necessary had ASE supported larger device sizes.

ASE 15.0 separated the device number and logical page ids into two separate fields, allowing the server to address 2 billion devices, each containing 2 billion logical pages. This means ASE now supports devices up to 4TB in size, and a single database could theoretically have 2 billion 16K pages for a total of 32TB. The server has a theoretical maximum of 1 exabyte, based on a limit of 32,767 databases of 32TB each.

On a more practical level, the question that may be asked is: how many devices should a database have at a minimum? This question often arises when DBAs are working with storage administrators who are trying to over-simplify administration by striping all the available devices into one single large device - a suboptimal configuration. The best advice for ASE on the number of devices to create is based on the following criteria:

• Identify all the tables you expect to have heavy IO on (likely about 10-20 per production system). These tables should be created on separate devices that map to separate LUNs, along with separate devices/LUNs for the indexes on these tables. This does not mean one device per index, but likely one device for all the indices of one table at a minimum.

• Other tables can be spread across those devices or other devices as desired. However, the transaction log for each database should have a separate device, as it will likely be the most heavily hit device within OLTP databases.

• As far as system databases, each tempdb should have 2 or more devices, with other devices as necessary for master, sybsystemprocs, etc.

The rationale behind these suggestions rests on several factors. First, there is a single pending IO queue for each device and a device semaphore within ASE. If all the tables are on a single device, the writes will be queued in order, possibly delaying unrelated transactions. If an IO is delayed, it still needs to grab the device semaphore, which could lead to higher internal contention within ASE than necessary.


The second reason for this suggestion is that by separating databases onto different devices/LUNs when multiple databases are within the same server, DBAs can work with storage administrators to exploit SAN utilities to quickly create copies of a database for development testing or for HA implementations using the quiesce database or mount/unmount database commands. Since the device copying/relocation is done at the physical drive level, having multiple databases spanning the same devices could result in wasted disk space when the device is copied to the development or DR site.

DIRECTIO Support & FileSystem Devices

A commonly asked question is whether Sybase recommends file systems or raw partition devices. The answer is that it depends on a number of factors, such as the OS file system implementation, the application IO profile, available OS resources such as memory and CPU, and OS tuning for file system cache limits and swapping preferences. Generally, in the past, Sybase has stated that file system devices behave well for read operations - particularly large reads, where the file system read-ahead can outpace even ASE asynchronous prefetch capabilities - whereas raw partitions did better for write activity, especially in high-concurrency environments.

In ASE 12.0, Sybase introduced the device 'dsync' attribute, which implemented DSYNC I/O for file system devices. A common misconception was that this bypassed the file system buffer to ensure recoverability. In actuality, it still used the filesystem buffer, but forced a flush after each file system write. This double buffering in both ASE and the file system cache, plus the flush request, caused slower response times for writes to file system devices than to raw partitions. In ASE 15.0, Sybase has added DIRECTIO support via the 'directio' device attribute to overcome this problem. Internal tests have shown a substantial performance improvement in write activity with devices using DIRECTIO vs. devices using DSYNC.

As of the ASE 15.0 GA release, DIRECTIO is only supported on the following platforms:

• Sun Solaris
• IBM AIX
• Microsoft Windows

Other operating systems may be added in later releases or updates to ASE 15.0. There are several important considerations regarding DIRECTIO:

• On operating systems that support it, you may have to tune the OS kernel, mount the file system with special options (i.e. the forcedirectio mount option on Solaris), and make sure the OS patch levels are sufficient for a high volume of DIRECTIO activity.

• DIRECTIO and DSYNC are mutually exclusive. If you are currently using DSYNC and wish to use DIRECTIO, you will first have to disable the 'dsync' attribute. Note that changing a device attribute requires rebooting ASE.

• Make sure that the memory requirements for filesystem caching, as well as the CPU requirements for changes in OS processing of IO requests, do not detract from ASE resources. In fact, when using file system devices, it is likely that you will need to leave at least 1 or 2 CPUs available to the OS to process the write requests (reverting back to the old N-1 advice for the maximum number of engines).

Before switching a device between raw, DSYNC, or DIRECTIO, it is extremely important that you test your application while scaling the user load to 25%, 50%, 75% and 100%. The reason for testing multiple scenarios is to see whether the scaling is linear or degrades as you get closer to 100% load. If performance starts flattening and you need to increase the load due to an increased user population, you may need to add more resources to ASE.


Tempdb & FileSystem devices

For sites running ASE with tempdb on a filesystem on SMP hardware, DIRECTIO support may be beneficial even compared with having dsync off. Again, please test your application at varying loads with different tempdb device configurations. It is likely that dsync off will have an advantage on smaller SMP systems with lower user concurrency in tempdb or with fewer/larger temp tables, while directio may have an advantage on larger SMP systems or on systems with high concurrency in tempdb and smaller, higher-write-activity temp tables. Note that as of ASE 12.5.0.3 you can have multiple tempdbs; consequently, it may be beneficial to have one tempdb running on file system devices with dsync off for the batch reports at night, whereas the tempdbs used for daytime OLTP processing may use raw partitions or file system devices with DIRECTIO.

In addition, having dsync on may still dramatically degrade query performance for sorting operations - such as adding an 'order by' clause to your query. In one instance, we measured a time difference of 20 minutes versus 7 hours with the DSYNC option turned OFF versus ON. In general, you should try to turn DSYNC off or use DIRECTIO. Here is a sample set of commands that will turn DSYNC off for your database; adapt them to your environment. In the first case, we show how the default tempdb is created on new devices with DSYNC turned off.

USE master
go
DISK INIT name = 'tempdbdev01', physname = '/tempdb_data', size = '4G', dsync = 'false'
go
DISK INIT name = 'tempdblogdev01', physname = '/tempdb_log', size = '4G', dsync = 'false'
go
ALTER DATABASE tempdb ON tempdbdev01 = '4G' LOG ON tempdblogdev01 = '4G'
go
USE tempdb
go
EXEC sp_dropsegment 'logsegment', 'tempdb', 'master'
go
EXEC sp_dropsegment 'system', 'tempdb', 'master'
go
EXEC sp_dropsegment 'default', 'tempdb', 'master'
go

If you already have devices established for tempdb, you merely have to turn the DSYNC property off. You will have to reboot ASE.

EXEC sp_deviceattr 'tempdbdev01', 'dsync', 'false'
go
EXEC sp_deviceattr 'tempdblogdev01', 'dsync', 'false'
go

DIRECTIO can be enabled in similar fashion, either with disk init or with the sp_deviceattr stored procedure. Any change to the device attribute will require an ASE reboot.
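For example, a sketch of switching the same devices to DIRECTIO via sp_deviceattr (the device names are the ones used above, and the 'directio' attribute name is the one described earlier in this section; verify against your release's documentation):

EXEC sp_deviceattr 'tempdbdev01', 'directio', 'true'
go
EXEC sp_deviceattr 'tempdblogdev01', 'directio', 'true'
go
-- reboot ASE for the attribute change to take effect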

Running tempdb on a Unix filesystem device without DIRECTIO may also cause considerable swapping and other activity due to UFS cache utilization. Without DIRECTIO, you will use the UFS cache whether DSYNC is on or off. Consequently, you should make sure that ASE plus the UFS cache size is less than 85% of physical memory. You should also make sure the UFS cache is constrained to the desired size, as the default is often unconstrained. Constraining the UFS cache size typically requires using the operating system's kernel tuning facilities, so you will need to work with your system administrator.

Additionally, if using SAN disk block replication, do NOT under any circumstances include tempdb devices in the disk replication volume group. Doing so will cause extreme performance degradation for queries involving sort operations as well as for normal temp table write activity. If the storage administrators insist on block replicating the entire SAN, purchase a couple of high-speed internal disks and place them directly in the server cabinet (or in a direct attached storage bay).


Changes to DBA Maintenance Procedures

This section highlights changes in ASE 15.0 that affect DBA utilities, and points out how some of the earlier changes discussed may impact DBA procedures. Perhaps the biggest impact is due to the VLSS implementation as well as table partitioning, and their respective effects on DBA scripts for space utilization.

Space Reporting System Functions

Perhaps the most noticeable change is that the system functions formerly used to report space usage have been deprecated and replaced with partition-aware variants. The following table lists these functions, along with the syntax changes.

ASE 12.5                                      ASE 15.0
data_pgs(object_id, {doampg | ioampg})        data_pages(dbid, object_id [, indid [, ptn_id]])
used_pgs(object_id, doampg, ioampg)           used_pages(dbid, object_id [, indid [, ptn_id]])
reserved_pgs(object_id, {doampg | ioampg})    reserved_pages(dbid, object_id [, indid [, ptn_id]])
rowcnt(sysindexes.doampg)                     row_count(dbid, object_id [, ptn_id])
ptn_data_pgs(object_id, partition_id)         (data_pages())

The obvious change in the above functions is the replacement of sysindexes.doampg or ioampg with indid and partition_id. The change that is not as apparent is that, whereas before the functions were used with scans of sysindexes, they are now used with scans of syspartitions - likely joined with sysindexes. For example:

-- ASE 12.5 logic to report the space used by nonclustered indices
select name, indid, used_pgs(id, doampg, ioampg)
from sysindexes
where id = object_id('authors')
and indid > 1

In ASE 15.0, this changes to the following:

-- ASE 15.0 logic to report the space used by nonclustered indices
select i.name, p.indid, used_pages(dbid(), p.id, p.indid)
from sysindexes i, syspartitions p
where i.id = object_id('authors')
and i.indid > 1
and p.indid > 1
and p.id = i.id
and p.id = object_id('authors')
and p.indid = i.indid
order by indid

This makes more sense once you realize that, with storage now linked to the syspartitions table rather than sysindexes (which is logical), gauging space utilization on a partition basis means you would likely run queries such as:

-- ASE 15.0 logic to report the space used by nonclustered indices on a partition basis
select p.name, i.name, p.indid, used_pages(dbid(), p.id, p.indid, p.partitionid)
from sysindexes i, syspartitions p
where i.id = object_id('authors')
and i.indid > 1
and p.indid > 1
and p.id = i.id
and p.id = object_id('authors')
and p.indid = i.indid
order by p.partitionid, p.indid


Note that the deprecated ASE 12.x functions still execute, but they always return a value of 0. The reason is partially that they rely on sysindexes.doampg and sysindexes.ioampg, which are no longer maintained. While syspartitions appears to have similar structures in the columns datoampage and indoampage, these values are on a per-partition basis; consequently, index space usage has to be aggregated for partitioned tables.
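For example, a hedged sketch of aggregating index space across all partitions of a table using the new used_pages() function and the syspartitions columns shown above:

-- total pages used by each index of 'authors', summed across its partitions
select indid, sum(used_pages(dbid(), id, indid, partitionid))
from syspartitions
where id = object_id('authors')
group by indid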

Sysindexes vs. syspartitions & storage

As mentioned earlier in the discussion of the deprecated and new space reporting functions (data_pgs() vs. data_pages()), disk space allocations are now associated with syspartitions in ASE 15.0 rather than with sysindexes as was previously the case. The following table identifies the ASE 12.5 space pointers in sysindexes and the equivalent locations in ASE 15.0 syspartitions.

Space association    ASE 12.5 sysindexes    ASE 15.0 syspartitions
Unique row           id + indid             id + indid + partitionid
First page           first                  firstpage
Root page            root                   rootpage
Data OAM page        doampg                 datoampage
Index OAM page       ioampg                 indoampage

Custom DBA scripts or previous dbcc commands that used these locations will need to be changed to reflect the current implementation. Published dbcc commands have already been modified. If you used previously undocumented dbcc commands (such as dbcc pglinkage()), these will likely fail, as they have been either deprecated or simply not maintained. Such commands were typically only used to perform detailed problem diagnosis or data recovery steps and should not have been used on a recurring basis. If a data corruption problem occurs, contact Sybase Technical Support for the correct procedures under ASE 15.0. Continuing to use undocumented dbcc commands from previous releases in ASE 15.0 is likely to cause corruptions simply due to the change in space association noted here.

VDEVNO column

Another major system change in 15.0 is the lifting of some of the storage limitations. Prior to ASE 15.0, ASE had a 256-device limit and a 32GB per-device size limit. This was driven by the fact that ASE's storage was organized by virtual pages using a 4-byte bit mask. The high-order byte was used to denote the device id (or vdevno), and consequently those 8 bits limited ASE to 256 devices. The remaining 3 bytes were used to track the actual virtual page numbers, which, considering a 2K page (the storage factor in sysdevices), imposes a limit of 32GB per device. Note that theoretically a 16K server could have had larger devices, but due to the 2K default and the implementation of sysdevices, the 32GB limit applied. This 32-bit bitmask logical page id resembled the following:

Figure 7 - Pre-ASE 15.x Logical PageID and Vdevno

As illustrated above, prior to ASE 15.0 the vdevno was the high-order byte of the 32-bit page implementation. Deriving the vdevno often meant that users were using calculations such as dividing by 2^24 to isolate


the high-order byte. This technique also had an implied limitation of 255 devices. As an additional problem, DBAs attempting to associate device fragments from master..sysusages with master..sysdevices had to join using a between clause based on the high and low virtual page numbers.

In ASE 15.0, the virtual page number is now two 32-bit integers - one for the device number (vdevno) and one for the page id itself. The vdevno is also present in sysusages and sysdevices. As a result, DBA scripts previously used to calculate space consumption need to be modified. Consider the following examples:

-- ASE 12.5 implementation
select d.name, u.size
from sysusages u, sysdevices d
where u.vstart >= d.low
and u.vstart <= d.high
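Under ASE 15.0, since vdevno is carried in both sysusages and sysdevices (as noted above), the equivalent report can join directly on vdevno. A minimal sketch, keeping the same columns as the 12.5 example:

-- ASE 15.0 implementation (sketch)
select d.name, u.size
from sysusages u, sysdevices d
where u.vdevno = d.vdevno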



Update Statistics & datachange()

This section highlights some of the differences and questions around update statistics in ASE 15.0.

Automated Update Statistics

ASE 15.0 includes the ability to automate the task of updating statistics only when necessary. This feature is mentioned in this document because many people are confused and think it implies that running update statistics is no longer necessary, believing that automated update statistics means the server automatically tracks statistics with each DML operation. This doesn't happen, for a variety of reasons - the main one being that it could slow down OLTP operations. Even if done outside the scope of the transaction, since multiple users would likely be attempting to modify the statistics for the same exact rows in systabstats, the contention would effectively single-thread the system. Another reason is that incrementally adding rows to a large table might not accurately update the statistics, as the newly computed density might have the same exact value due to precision loss. For example, take a 1,000,000 row table and add 100,000 rows to it. Let's assume that the table contains a date column and all of the 100,000 rows have today's date. If we add the rows one at a time, the range cell density wouldn't change, as each new row as it is added only amounts to 1/1,000,000th of the table - or 0.000001. This would be especially true if fewer than 6 digits of precision were used, as the 6th digit would always be truncated. However, 100,000 rows adds 10% to the overall table size.

Instead, the functionality is accomplished using the Job Scheduler template included with ASE 15.0. The Job Scheduler (JS) engine was a new component added in ASE 12.5.1 (12.5.2 for Windows) that allows DBAs to schedule recurring jobs from a central management server for multiple ASEs in the environment - including support for different versions. More documentation on setting up the JS engine and the Sybase Central interface is found in the ASE Job Scheduler User's Guide. The JS template for update statistics allows DBAs to very quickly set thresholds for particular tables so that update statistics is run only if necessary during the scheduled time.

Datachange() function

The automated update statistics feature is based on the new system function datachange(). This function, as documented, returns the percentage of data modified within the table. As a result, existing update statistics DBA scripts can be modified to take advantage of this with simple logic such as:

declare @datachange float
select @datachange = datachange("authors", null, null)
if @datachange > 50
begin
    update statistics authors
end

Several notes:

• The percentage returned is based on the number of DML operations and the table size. Each insert or delete counts as 1, while an update counts as 2 (as if it were a delete followed by an insert).

• The percentage returned is based on the number of rows remaining in the table. For example, deleting 500 rows from a 1,000 row table will result in datachange() returning 100%.

• The datachange() function parameters are tablename, partition, and column, in that order. This allows DBAs to detect just the change in particularly volatile columns and update the index statistics for specific indices (or a specific partition) rather than all of them.

The rationale for reporting a percentage instead of the number of rows is that the number of rows does not really provide useful information by itself; it is only useful when compared with the size of the table. If datachange() = 5,000, this could be very significant if the table contains 5,100 rows - or insignificant if it contains 500 million. Using a percentage makes it easier to establish relative thresholds in maintenance scripts, etc.

Update Statistics Frequency and ASE 15

Because ASE 15.0 now has several different algorithms for sorting, grouping, unions, joins and other operations, it is more important for ASE 15 to have up-to-date statistics and more column statistics than in previous releases. This does not necessarily imply that ASE 15 needs update statistics run when 5% of the table has changed where in 12.5 you used to wait for 10%. What is implied is that in many situations customers would run update statistics only infrequently on fairly static data, such as data in reporting systems. While ASE 12.5 might not have picked a bad plan - simply because with a single algorithm it was tough to do so - ASE 15, lacking current statistics, may pick one of the new algorithms, which may prove disastrously slow as the actual data volume far exceeds the projected volume based on the stale statistics. Similarly, as you saw earlier (in the discussion on showplan options in Diagnosing and Fixing Issues in ASE 15), row estimates based on the availability of column statistics for all the columns in an index, rather than just the density statistics, can have a considerable impact on row estimation - which plays a crucial role not only in index selectivity, but also in parallel query optimization decisions, join selectivity, etc. Because of this, when migrating to ASE 15, the following advice about statistics is provided:

• Use update index statistics instead of update statistics.

• Use a higher step count and histogram factor to provide more accurate statistics and skew coverage.

• For really large tables, consider partitioning the tables and running update index statistics on each partition (see the next section).

• For extremely large tables, or tables not being partitioned, consider running update index statistics with sampling.

• When running update index statistics, consider raising the 'number of sort buffers' value via sp_configure to orders of magnitude higher than for normal execution. This can be done dynamically - and while it does require more proc cache (or fewer parallel executions of update index statistics) - it can have a profound effect on the time it takes to run update statistics on really large tables (a configuration sketch follows this list).

• Use datachange() to determine when update statistics is necessary. On volatile indexes, consider running update index statistics on that index individually, or perhaps just on the volatile column, rather than on all the indexes in the table. As the need to run update index statistics is arrived at more realistically, the tables can be cycled through the different maintenance periods rather than all being done at once.
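As a sketch of the sort-buffer and sampling suggestions above (the buffer values and sampling percentage are illustrative, not recommendations, and the exact with-clause syntax should be verified against your release's documentation):

-- temporarily raise the sort buffers before a large update index statistics run
exec sp_configure 'number of sort buffers', 5000
go
-- sampled statistics run on a large table
update index statistics mytable with sampling = 10 percent
go
-- restore the normal value afterwards
exec sp_configure 'number of sort buffers', 500
go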

Update Statistics on Partitions

The ability to run update statistics on a specific partition should by itself reduce the time needed to run update statistics substantially. Most large tables that take a long time in update statistics contain a lot of historical data. Few, if any, of these historical rows change, yet update statistics must scan them all. Additionally, when viewed from the entire table's perspective, even 1 million rows added to a 500 million row table is only 0.2% - which would suggest that statistics do not need to be updated. However, these are likely the rows most often used, and their distribution heuristics are not included in the range cell densities, etc. As a result, query optimization likely suffers.

If you partition the data, particularly on a date range for the scenario above (i.e. by month), then older, static data can be skipped when running update statistics (after the first time). Thereafter, you can use the datachange() function to check the amount of change within the current partition and run update statistics as necessary. Note that all partitions have a name - if you did not supply one (i.e. you used the hash partition shortcut syntax), ASE provides a default name for you, similar to tempdb tables. Use sp_help to identify the system-supplied names. An example of datachange() based on a system-supplied name and focusing on a specific column is illustrated below:

1> select datachange("mytable", "part_1360004845", "p_partkey")
2> go

---------------------------
100.000000

Obviously, this would be a good candidate for running update statistics on the partition listed above.

update statistics mytable partition part_1360004845 (p_partkey)
go

Note the syntax change allowing update statistics to focus on a specific partition (and, in this case, column).

The reduction in time for update statistics may allow you to create statistics or heuristics on columns that were avoided previously due to time constraints. For example:

update statistics mytable (col1, col2)

creates heuristics on {col1} and densities on the combined {col1, col2} pair. If you think having heuristics on {col2} will help, then by updating statistics on a partition basis you might now have the time to do so. Consequently, the commands may now look like:

update statistics mytable partition part_1360004845 (col1, col2)
go
update statistics mytable partition part_1360004845 (col2)
go

Needs Based Maintenance: datachange() and derived_stats()

Prior to ASE 15.0, DBAs often did table-level maintenance blindly or in reactionary mode. For example, they arbitrarily ran update statistics every weekend - or whenever users started complaining about query performance. Obviously, datachange() will help reduce time spent updating statistics needlessly.

However, DBAs also often dropped and recreated indices on some tables on a regular basis due to table fragmentation. In a sense, datachange() is the perfect complement to the derived_stat() function that was added in ASE 12.5.1. The syntax for the derived_stat() function has been updated in ASE 15 to also work on partitioned tables:


derived_stat(object_name | object_id,
             index_name | index_id,
             [partition_name | partition_id,]
             "statistic")

The values for statistic are:

Value                               Returns
data page cluster ratio or dpcr     The data page cluster ratio for the object/index pair
index page cluster ratio or ipcr    The index page cluster ratio for the object/index pair
data row cluster ratio or drcr      The data row cluster ratio for the object/index pair
large io efficiency or lgio         The large I/O efficiency for the object/index pair
space utilization or sput           The space utilization for the object/index pair

The statistic values match those returned by optdiag and consequently provide useful information for determining when a reorg might be needed. For instance, some DBAs have found that a reorg should be done on an index when the index page cluster ratio (ipcr) changes by 0.1 from a known starting point.

One change in this function from 12.5.1 is the addition of the partition argument, which might throw existing queries off slightly. As a result, the following rules apply:

• If four arguments are provided, derived_stat() uses the third argument as the partition and returns derived statistics for the fourth argument.

• If three arguments are provided, derived_stat() assumes you did not specify a partition and returns derived statistics for the third argument.

This provides compatibility for existing scripts that used the derived_stat() function.

By combining datachange() and derived_stat(), and monitoring index selection via monOpenObjectActivity, DBAs can develop fairly accurate trigger points at which specific values for data rows modified, or cluster ratio changes resulting from data modifications, would trigger the need for running a reorg on a table or index. This can also be combined with queries against systabstats - such as the forwrowcnt, delrowcnt, and emptypgcnt columns - to fine-tune which of the reorgs are really required.
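A hedged sketch of such a trigger-point check follows; the 20% and 0.1 thresholds are illustrative, the index name mytable_idx1 is hypothetical, and the 0.95 value stands in for the cluster ratio you recorded after the last reorg:

declare @pct float, @ipcr float
select @pct  = datachange('mytable', null, null)
select @ipcr = derived_stat('mytable', 'mytable_idx1', 'index page cluster ratio')
-- flag the table when either the volume of change or the fragmentation
-- indicator crosses its threshold
if @pct > 20 or @ipcr < 0.95 - 0.1
begin
    print 'mytable: consider update index statistics and/or a reorg'
end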

Update Statistics FAQ

Why doesn't the server just automatically update the statistics as a result of my DML statement, eliminating the need to run update statistics entirely?

There are several reasons; the main one is that this could slow down OLTP operations. Even if done outside the scope of the transaction, since multiple users would likely be attempting to modify the statistics for the same exact rows in systabstats, the contention would effectively single-thread the system.

Why does datachange() report a percentage (%) vs. the number of rows modified?

Reporting the number of rows modified provides no context relative to the size of the table. If datachange() = 5,000, this could be very significant if the table contains 5,100 rows - or insignificant if it contains 500 million. Using a percentage makes it easier to establish relative thresholds in maintenance scripts, etc.


Computed Columns/Function-Based Indices

The following section highlights some of the behavioral nuances of using computed columns.

Computed Column Evaluation

The most important rule for understanding the behavior of computed columns is to understand when they are evaluated - especially if trying to pre-determine the likely output of non-deterministic columns. The evaluation rules are as follows:

• Non-materialized (virtual) computed columns have their expression evaluated during query processing. Consequently, the result reflects the state of the current user's session.

• Materialized (physical) computed columns have their expression evaluated only when a referenced column is modified.

For example, consider the following table:

create table test_table (
    rownum   int not null,
    status   char(1) not null,
    -- virtual columns
    sel_user as suser_name(),
    sel_date as getdate(),
    -- materialized columns
    cr_user  as suser_name() materialized,
    cr_date  as getdate() materialized,
    upd_user as (case when status is not null
                      then suser_name() else 'dbo' end) materialized,
    upd_date as (case when status is not null
                      then getdate() else 'jan 1 1970' end) materialized
)

This table has 3 pairs of computed columns that are evaluated differently.

sel_user/sel_date - These are virtual columns which have their expression evaluated when someone queries the table.

cr_user/cr_date - Although these are physical/materialized columns, since they do not reference any other columns, their expression will only be evaluated when rows are inserted. They will not be affected by updates.

upd_user/upd_date - These columns reference the status column, although the status column does not determine the value. As a result, these columns will only be changed if the status column is modified - effectively inserts, and updates that set the status column to any value.

As a result, the last two computed column pairs (cr_user/cr_date and upd_user/upd_date) are unaffected by queries. So although they are based on non-deterministic functions, the values are consistent for all queries.

Non-Materialized Computed Columns & Invalid Values

As mentioned above, non-materialized computed columns have their expressions evaluated only at query time - not during DML operations. This can lead to query problems if the formula used to create the expression is not validated prior to creating the computed column. Consider the following:

create table t (a int, b compute sqrt(a))
go
insert t values (2)
insert t values (-1)
insert t values (3)
go


select * from t
go

1> select * from t
2> go
a           b
----------- --------------------
          2             1.414214
Domain error occurred.

The computed column 'b' is not evaluated until queried - hence the domain error isn't noted until the select statement. This can be especially nefarious if the select statement is embedded within a trigger.
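One way to guard against this (a sketch, not the only approach) is to validate the expression inside the computed column definition itself, so that out-of-domain values yield NULL instead of a run-time error when queried:

create table t2 (
    a int,
    -- hypothetical guarded variant of the example above
    b compute case when a >= 0 then sqrt(a) else null end
)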


Application and 3rd Party Tool Compatibility

In addition to the storage changes and procedures for updating statistics, there are a number of other system changes related to new tables supporting encrypted columns and other features. However, since these are described in the What's New documentation, this section instead focuses on temp table changes and third-party tool compatibility.

#Temp Table Changes

There are two main changes affecting temporary tables to be considered. First we will discuss the temp table naming changes and then take a look at query optimization behaviors.

#Temp Table Naming

Prior to ASE 15.0, the 30 character limit meant that user temporary tables in tempdb were named with 12 distinct characters plus a 17 byte hash, separated by an underscore. Names shorter than 12 characters were padded with underscores to achieve a length of 12 characters. For example:

select *
into #temp_t1
from mytable
where …

would likely result in a temp table with a name of #temp_t1______0000021008240896.

In ASE 15, this padding with underscores is no longer implemented. Additionally, the limitation of 12 distinct characters has been lifted, along with the limitation of 30 characters for object names. This can cause some slight differences in temp table behavior that should not affect applications - other than that applications built on ASE 15 may not be backward compatible with 12.5. Consider the following scenarios, in each of which two temp tables are created in the same session:

-- This fails in ASE 12.5, but succeeds in 15.0
-- The reason is that in 12.5, the automatic padding with underscores
-- results in tables with the same name.
create table #mytemp (…)
create table #mytemp___ (…)
go

-- This also fails in ASE 12.5, but succeeds in 15.0
-- The reason is that in ASE 12.5, the names are truncated to 12 characters
create table #t12345678901 (…)
create table #t1234567890123 (…)

-- The following refer to the same table in ASE 12.5, but different
-- tables in 15.0 - the reason is identical to the above (name truncation)
select * from #t12345678901
select * from #t1234567890123456

#Temp Table Query Optimization

In ASE 12.x, particularly with 'enable sort-merge join and JTC' disabled, joins involving temporary tables - particularly between two or more temporary tables - would use the single remaining join strategy of a Nested Loop Join (NLJ). Generally queries such as this are contained within stored procedures - consider the following two examples (note the single line difference disabling merge join):

create procedure temp_test1
    @book_type  varchar(30),
    @start_date datetime,
    @end_date   datetime
as begin
    select * into #sales from sales

    select * into #salesdetail from salesdetail

    -- List the title, price, qty for business books in 1998
    select t.title, t.price, s.stor_id, s.ord_num, sd.qty, total_sale = t.price * sd.qty
    from #sales s, #salesdetail sd, titles t
    where t.type = @book_type
    and s.stor_id = sd.stor_id
    and s.ord_num = sd.ord_num
    and s.date between @start_date and @end_date

    return 0
end
go

create procedure temp_test2
    @book_type  varchar(30),
    @start_date datetime,
    @end_date   datetime
as begin
    select * into #sales from sales
    select * into #salesdetail from salesdetail

    set merge_join 0

    -- List the title, price, qty for business books in 1998
    select t.title, t.price, s.stor_id, s.ord_num, sd.qty, total_sale = t.price * sd.qty
    from #sales s, #salesdetail sd, titles t
    where t.type = @book_type
    and s.stor_id = sd.stor_id
    and s.ord_num = sd.ord_num
    and s.date between @start_date and @end_date

    return 0
end
go

Now then, we are all familiar with the fact that if ASE lacks any information on a #temp table, it assumes it is 10 rows per page and 10 pages total, or 100 rows in size. As a result, ASE would consider the join between the two #temp tables as a likely candidate for a merge join, as the sort expense is not that high and certainly cheaper than an n*m table scan. To see how this works, let's take a look at the showplan and logical I/O costings for each one. First we will look at the showplan and statistics io output for the query allowing merge joins (some of the output deleted for clarity):

1> exec temp_test1 'business', 'Jan 1 1988', 'Dec 31 1988 11:59pm'

-- some showplan output removed for clarity/space
QUERY PLAN FOR STATEMENT 4 (at line 12).
8 operator(s) under root
The type of query is SELECT.

ROOT:EMIT Operator

|RESTRICT Operator
|
| |NESTED LOOP JOIN Operator (Join Type: Inner Join)
| |
| | |MERGE JOIN Operator (Join Type: Inner Join)
| | | Using Worktable3 for internal storage.
| | | Key Count: 2
| | | Key Ordering: ASC ASC
| | |
| | | |SORT Operator
| | | | Using Worktable1 for internal storage.
| | | |
| | | | |SCAN Operator
| | | | | FROM TABLE
| | | | | #sales
| | | | | s
| | | | | Table Scan.
| | | | | Forward Scan.
| | | | | Positioning at start of table.
| | | | | Using I/O Size 32 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |SORT Operator
| | | | Using Worktable2 for internal storage.
| | | |
| | | | |SCAN Operator
| | | | | FROM TABLE
| | | | | #salesdetail
| | | | | sd
| | | | | Table Scan.
| | | | | Forward Scan.
| | | | | Positioning at start of table.
| | | | | Using I/O Size 32 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| |
| | |SCAN Operator
| | | FROM TABLE
| | | titles
| | | t
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 4 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.

Total estimated I/O cost for statement 4 (at line 12): 244.

-- actual I/Os for statement #4 as reported by set statistics io on.
Table: #sales01000310024860367 scan count 1, logical reads: (regular=1 apf=0 total=1), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: #salesdetail01000310024860367 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: titles scan count 11, logical reads: (regular=11 apf=0 total=11), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 28.
Total writes for this command: 0

(44 rows affected)

So as you can see, we scanned each of the two temp tables once to create Worktable1 and Worktable2 and performed a merge join - and then did an NLJ against the table scan of titles (no index on titles.type) for a total I/O cost of 28. Let's compare this to when the merge join is disabled:

1> exec temp_test2 'business', 'Jan 1 1988', 'Dec 31 1988 11:59pm' with recompile

-- some showplan output removed for clarity/space
QUERY PLAN FOR STATEMENT 5 (at line 14).
9 operator(s) under root
The type of query is SELECT.

ROOT:EMIT Operator

|RESTRICT Operator
|
| |SEQUENCER Operator has 2 children.
| |
| | |STORE Operator
| | | Worktable1 created, in allpages locking mode, for REFORMATTING.
| | | Creating clustered index.
| | |
| | | |INSERT Operator
| | | | The update mode is direct.
| | | |
| | | | |SCAN Operator
| | | | | FROM TABLE
| | | | | #salesdetail
| | | | | sd
| | | | | Table Scan.
| | | | | Forward Scan.
| | | | | Positioning at start of table.
| | | | | Using I/O Size 32 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | | |
| | | | TO TABLE
| | | | Worktable1.
| |
| | |N-ARY NESTED LOOP JOIN Operator has 3 children.
| | |
| | | |SCAN Operator
| | | | FROM TABLE
| | | | titles
| | | | t
| | | | Table Scan.
| | | | Forward Scan.
| | | | Positioning at start of table.
| | | | Using I/O Size 4 Kbytes for data pages.
| | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |SCAN Operator
| | | | FROM TABLE
| | | | #sales
| | | | s
| | | | Table Scan.
| | | | Forward Scan.
| | | | Positioning at start of table.
| | | | Using I/O Size 32 Kbytes for data pages.
| | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |SCAN Operator
| | | | FROM TABLE
| | | | Worktable1.
| | | | Using Clustered Index.
| | | | Forward Scan.
| | | | Positioning by key.
| | | | Using I/O Size 32 Kbytes for data pages.
| | | | With LRU Buffer Replacement Strategy for data pages.

Total estimated I/O cost for statement 5 (at line 14): 822.

-- actual I/Os for statement #5 as reported by set statistics io on.
Table: #salesdetail01000130025150907 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: Worktable1 scan count 12, logical reads: (regular=157 apf=0 total=157), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: titles scan count 1, logical reads: (regular=1 apf=0 total=1), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: #sales01000130025150907 scan count 4, logical reads: (regular=4 apf=0 total=4), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 328.
Total writes for this command: 9

(44 rows affected)

As you can see, the merge join reduced the I/O cost of this query by 300. If we had enabled set option<br />

show_lio_costing, we would have seen in each case the following output:<br />

Estimating selectivity for table '#sales'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

date >= Jan 1 1988 12:00:00:000AM<br />

date <= Dec 31 1988 11:59:00:000PM

This is fine for small temp tables - the problems develop when the temp tables are much larger. For<br />

instance, we will use the same queries, but this time in the pubstune database that is often used by <strong>Sybase</strong><br />

Education in performance and tuning classes. The differences between the two databases, in terms of row counts, are:

Table          pubs2    pubstune
titles            18       5,018
sales             30     132,357
salesdetail      116   1,350,257

To demonstrate the differences in optimization, we will run four tests (a condensed sketch of the session-level switches involved follows the list):

• Default optimization (likely will use merge join)

• Disabled merge join (forcing an NLJ)

• Default optimization with update statistics

• Default optimization with create index

In the sample data, the year 2000 recorded ~26,000 sales.
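The sketch below is only a summary of the toggles used by the scripts later in this section; the exec calls are the ones shown in those scripts, and re-enabling the merge join with set merge_join 1 is an assumption that mirrors the documented set merge_join 0.

-- test 1: default ASE 15.0 optimization (allrows_mix goal; merge join available)
exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go

-- test 2: disable merge join for the session, forcing the NLJ
set merge_join 0
go
exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go
set merge_join 1    -- assumed re-enable; mirrors the set merge_join 0 used above
go

-- tests 3 and 4 move the work inside the procedures themselves:
-- temp_test_stats adds "update statistics #table" after each select/into,
-- temp_test_idx adds "create ... index ... with statistics using 1000 values"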


Default Optimization (Merge Join)<br />

For this test, we will use the following stored procedure:<br />

create procedure temp_test_default
    @book_type   varchar(30),
    @begin_date  datetime,
    @end_date    datetime
as begin

    select * into #sales
    from sales
    where date between @begin_date and @end_date

    select sd.* into #salesdetail
    from salesdetail sd, #sales s
    where sd.stor_id=s.stor_id
    and sd.ord_num=s.ord_num

    select @@rowcount

    select t.title_id, t.title, s.stor_id, s.ord_num, s.date, sd.qty,
        total_sale=(sd.qty*t.price)-(sd.qty*t.price*sd.discount)
    into #results
    from titles t, #sales s, #salesdetail sd
    where s.stor_id=sd.stor_id
    and s.ord_num=sd.ord_num
    and sd.title_id=t.title_id

    select count(*) from #results

    drop table #sales
    drop table #salesdetail
    drop table #results

    return 0
end
go

The script to run the test is:<br />

dbcc traceon(3604)<br />

go<br />

set showplan on<br />

go<br />

set option show_lio_costing on<br />

go<br />

set statistics io on<br />

go<br />

exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile<br />


go<br />

dbcc traceoff(3604)<br />

set showplan off<br />

set option show_lio_costing off<br />

set statistics io off<br />

go<br />

As expected, the estimates for #sales and #salesdetail are the typical 10 rows/page and 10 pages (100 rows)<br />

Estimating selectivity for table '#sales'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

Search argument selectivity is 1.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Estimating selectivity for table '#salesdetail'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

Search argument selectivity is 1.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Not too surprisingly, it picked a merge join, as the showplan below shows (note the estimated I/Os at the bottom):

QUERY PLAN FOR STATEMENT 5 (at line 19).<br />

9 operator(s) under root<br />

The type of query is INSERT.<br />

ROOT:EMIT Operator<br />

|INSERT Operator<br />

| The update mode is direct.<br />

|<br />

| |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | Using Worktable5 for internal storage.<br />

| | Key Count: 1<br />

| | Key Ordering: ASC<br />

| |<br />

| | |SORT Operator<br />

| | | Using Worktable4 for internal storage.<br />

| | |<br />

| | | |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | | | Using Worktable3 for internal storage.<br />

| | | | Key Count: 2<br />

| | | | Key Ordering: ASC ASC<br />

| | | |<br />

| | | | |SORT Operator<br />

| | | | | Using Worktable1 for internal storage.<br />

| | | | |<br />

| | | | | |SCAN Operator<br />

| | | | | | FROM TABLE<br />

| | | | | | #sales<br />

| | | | | | s<br />

| | | | | | Table Scan.<br />

| | | | | | Forward Scan.<br />

| | | | | | Positioning at start of table.<br />

| | | | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | | | With LRU Buffer Replacement Strategy for data pages.<br />

| | | |<br />

| | | | |SORT Operator<br />

| | | | | Using Worktable2 for internal storage.<br />

| | | | |<br />

| | | | | |SCAN Operator<br />

| | | | | | FROM TABLE<br />

| | | | | | #salesdetail<br />

| | | | | | sd<br />

| | | | | | Table Scan.<br />

| | | | | | Forward Scan.<br />

| | | | | | Positioning at start of table.<br />

| | | | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | | | With LRU Buffer Replacement Strategy for data pages.<br />


| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | titles<br />

| | | t<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

| | | Positioning at start of table.<br />

| | | Using I/O Size 32 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

|<br />

| TO TABLE<br />

| #results<br />

| Using I/O Size 32 Kbytes for data pages.<br />

Total estimated I/O cost for statement 5 (at line 19): 46.<br />

The actual I/Os processed for the three-way join were as follows - the bottom number is the count of rows in the final result set (which also happens to be the number of rows in #salesdetail):

Table: #sales01000130014531845 scan count 1, logical reads: (regular=222 apf=0 total=222), physical<br />

reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: #salesdetail01000130014531845 scan count 1, logical reads: (regular=2784 apf=0 total=2784),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: titles scan count 1, logical reads: (regular=562 apf=0 total=562), physical reads: (regular=0<br />

apf=0 total=0), apf IOs used=0<br />

Table: #results01000130014531845 scan count 0, logical reads: (regular=262042 apf=0 total=262042),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Total actual I/O cost for this command: 531220.<br />

Total writes for this command: 0<br />

-----------<br />

261612<br />

Note that most of the 531,220 I/Os are related to the #results table - the others are incurred when performing the actual joins and sorting for the worktables.


Merge Join Disabled (NLJ)<br />

Now let's disable the merge join, forcing an NLJ (by default we are not using allrows_dss, which would allow a hash join - see the sketch at the end of this comparison). The script changes slightly to:

dbcc traceon(3604)<br />

go<br />

set showplan on<br />

go<br />

-- set option show on<br />

-- go<br />

set option show_lio_costing on<br />

go<br />

set statistics io on<br />

go<br />

set merge_join 0<br />

go<br />

exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile<br />

go<br />

dbcc traceoff(3604)<br />

set showplan off<br />

set option show_lio_costing off<br />

set statistics io off<br />

go<br />

Once again, the estimates for #sales and #salesdetail are the same.<br />

Estimating selectivity for table '#sales'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

Search argument selectivity is 1.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />


Estimating selectivity for table '#salesdetail'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

Search argument selectivity is 1.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

The showplan predictably picks up the N-ary NLJ - after first reformatting #sales as seen below (but note<br />

the difference in logical I/Os at the bottom):<br />

QUERY PLAN FOR STATEMENT 5 (at line 19).<br />

9 operator(s) under root<br />

The type of query is INSERT.<br />

ROOT:EMIT Operator<br />

|SEQUENCER Operator has 2 children.<br />

|<br />

| |STORE Operator<br />

| | Worktable1 created, in allpages locking mode, for REFORMATTING.<br />

| | Creating clustered index.<br />

| |<br />

| | |INSERT Operator<br />

| | | The update mode is direct.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | #sales<br />

| | | | s<br />

| | | | Table Scan.<br />

| | | | Forward Scan.<br />

| | | | Positioning at start of table.<br />

| | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | With LRU Buffer Replacement Strategy for data pages.<br />

| | |<br />

| | | TO TABLE<br />

| | | Worktable1.<br />

|<br />

| |INSERT Operator<br />

| | The update mode is direct.<br />

| |<br />

| | |N-ARY NESTED LOOP JOIN Operator has 3 children.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | #salesdetail<br />

| | | | sd<br />

| | | | Table Scan.<br />

| | | | Forward Scan.<br />

| | | | Positioning at start of table.<br />

| | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | With LRU Buffer Replacement Strategy for data pages.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | titles<br />

| | | | t<br />

| | | | Using Clustered Index.<br />

| | | | Index : titleidind<br />

| | | | Forward Scan.<br />

| | | | Positioning by key.<br />

| | | | Keys are:<br />

| | | | title_id ASC<br />

| | | | Using I/O Size 4 Kbytes for data pages.<br />

| | | | With LRU Buffer Replacement Strategy for data pages.<br />

| | |<br />

| | | |SCAN Operator<br />

| | | | FROM TABLE<br />

| | | | Worktable1.<br />

| | | | Using Clustered Index.<br />

| | | | Forward Scan.<br />

| | | | Positioning by key.<br />

| | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />


| | TO TABLE<br />

| | #results<br />

| | Using I/O Size 32 Kbytes for data pages.<br />

Total estimated I/O cost for statement 5 (at line 19): 4186.<br />

However, notice the difference in I/O’s!!<br />

Table: #sales01000010014826768 scan count 1, logical reads: (regular=222 apf=0 total=222), physical<br />

reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: Worktable1 scan count 261612, logical reads: (regular=815090 apf=0 total=815090), physical<br />

reads: (regular=8 apf=1 total=9), apf IOs used=1<br />

Table: #salesdetail01000010014826768 scan count 1, logical reads: (regular=2784 apf=0 total=2784),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: titles scan count 261612, logical reads: (regular=784836 apf=0 total=784836), physical reads:<br />

(regular=0 apf=0 total=0), apf IOs used=0<br />

Table: #results01000010014826768 scan count 0, logical reads: (regular=262042 apf=0 total=262042),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Total actual I/O cost for this command: 3730173.<br />

Total writes for this command: 266<br />

-----------<br />

261612<br />

The problem is that the optimizer picked #salesdetail as the outer table, then reformatted #sales, and finally joined with titles. Since there were 261,612 rows in #salesdetail, this caused the I/O count to jump from roughly 500K to 3.7M - about a seven-fold increase.

This suggests that even for larger #temp table joins (#sales had ~26,000 rows, #salesdetail had ~260,000 rows), a merge join in ASE 15.0 may still be significantly faster than the nested loop join that ASE 12.5 would have used.
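As an aside, the allrows_dss goal mentioned above would make a hash join available as a third strategy. Here is a minimal, non-authoritative sketch of switching the optimization goal for a single session only (not part of the tests in this section; allrows_mix is assumed to be the server default goal):

-- allow DSS-oriented strategies (including hash join) for this session only
set plan optgoal allrows_dss
go
exec temp_test_default 'business', 'Jan 1 2000', 'Dec 31 2000 11:59pm' with recompile
go
-- return to the ASE 15.0 default goal
set plan optgoal allrows_mix
go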


Default Optimization with Update Statistics<br />

Normally we associate update statistics with indexes - but what if the table doesn't have any indexes - does it still help the optimizer? To test this, we will alter the earlier proc as follows:

create procedure temp_test_stats
    @book_type   varchar(30),
    @begin_date  datetime,
    @end_date    datetime
as begin

    select * into #sales
    from sales
    where date between @begin_date and @end_date

    update statistics #sales

    select sd.* into #salesdetail
    from salesdetail sd, #sales s
    where sd.stor_id=s.stor_id
    and sd.ord_num=s.ord_num

    update statistics #salesdetail

    -- these two statements are fakes - just to force the update stats to take
    -- effect since it appears not to work immediately for the following query
    select rowcnt=@@rowcount
    into #rowcnt
    drop table #rowcnt

    select t.title_id, t.title, s.stor_id, s.ord_num, s.date, sd.qty,
        total_sale=(sd.qty*t.price)-(sd.qty*t.price*sd.discount)
    into #results
    from titles t, #sales s, #salesdetail sd
    where s.stor_id=sd.stor_id
    and s.ord_num=sd.ord_num
    and sd.title_id=t.title_id

    select count(*) from #results

    drop table #sales
    drop table #salesdetail
    drop table #results

    return 0
end
go

Now if we run it with the default optimization for <strong>ASE</strong> <strong>15.0</strong> (merge join enabled - same as earlier), let’s<br />

first take a look at the index selectivity outputs for the #sales and #salesdetail tables.<br />

Estimating selectivity for table '#sales'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

ord_num = ord_num<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

stor_id = stor_id<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.1,<br />

Search argument selectivity is 0.01.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Estimating selectivity for table '#salesdetail'<br />

Table scan cost is 100 rows, 10 pages,<br />

The table (Allpages) has 100 rows, 10 pages,<br />

Data Page Cluster Ratio 0.0000000<br />

title_id = title_id<br />

Estimated selectivity for title_id,<br />

selectivity = 0.0001992826,<br />

Search argument selectivity is 0.0001992826.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

So running update statistics by itself, in an attempt to tell the optimizer the number of pages/rows, doesn't appear to work. Or so we think - but then we notice that the optimizer re-optimizes, re-resolving the query because of the update statistics, and we see the following near the bottom:

Estimating selectivity for table '#sales'<br />

Table scan cost is 26364 rows, 222 pages,<br />

The table (Allpages) has 26364 rows, 222 pages,<br />

Data Page Cluster Ratio 1.0000000<br />

ord_num = ord_num<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

stor_id = stor_id<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.1,<br />

Search argument selectivity is 0.01.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Estimating selectivity for table '#salesdetail'<br />

Table scan cost is 261612 rows, 2784 pages,<br />

The table (Allpages) has 261612 rows, 2784 pages,<br />

Data Page Cluster Ratio 0.9995895<br />

ord_num = ord_num<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

stor_id = stor_id<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.1,<br />

Search argument selectivity is 0.01.<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

So it does pick up the statistics for cost optimization, but the estimated I/O count in the showplan does not change, as is evident below:

QUERY PLAN FOR STATEMENT 6 (at line 21).<br />

9 operator(s) under root<br />

The type of query is INSERT.<br />


ROOT:EMIT Operator<br />

|INSERT Operator<br />

| The update mode is direct.<br />

|<br />

| |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | Using Worktable5 for internal storage.<br />

| | Key Count: 1<br />

| | Key Ordering: ASC<br />

| |<br />

| | |SORT Operator<br />

| | | Using Worktable4 for internal storage.<br />

| | |<br />

| | | |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | | | Using Worktable3 for internal storage.<br />

| | | | Key Count: 2<br />

| | | | Key Ordering: ASC ASC<br />

| | | |<br />

| | | | |SORT Operator<br />

| | | | | Using Worktable1 for internal storage.<br />

| | | | |<br />

| | | | | |SCAN Operator<br />

| | | | | | FROM TABLE<br />

| | | | | | #sales<br />

| | | | | | s<br />

| | | | | | Table Scan.<br />

| | | | | | Forward Scan.<br />

| | | | | | Positioning at start of table.<br />

| | | | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | | | With LRU Buffer Replacement Strategy for data pages.<br />

| | | |<br />

| | | | |SORT Operator<br />

| | | | | Using Worktable2 for internal storage.<br />

| | | | |<br />

| | | | | |SCAN Operator<br />

| | | | | | FROM TABLE<br />

| | | | | | #salesdetail<br />

| | | | | | sd<br />

| | | | | | Table Scan.<br />

| | | | | | Forward Scan.<br />

| | | | | | Positioning at start of table.<br />

| | | | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | titles<br />

| | | t<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

| | | Positioning at start of table.<br />

| | | Using I/O Size 32 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

|<br />

| TO TABLE<br />

| #results<br />

| Using I/O Size 32 Kbytes for data pages.<br />

Total estimated I/O cost for statement 6 (at line 21): 46.<br />

This is identical to the default merge join processing from earlier - including the estimated I/Os. The actual I/Os for the main query are:

Table: #salesdetail01000260015312707 scan count 1, logical reads: (regular=2784 apf=0 total=2784),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: titles scan count 1, logical reads: (regular=562 apf=0 total=562), physical reads: (regular=0<br />

apf=0 total=0), apf IOs used=0<br />

Table: #sales01000260015312707 scan count 1, logical reads: (regular=222 apf=0 total=222), physical<br />

reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: #results01000260015312707 scan count 0, logical reads: (regular=262046 apf=0 total=262046),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Total actual I/O cost for this command: 531228.<br />

Total writes for this command: 0<br />

-----------<br />

261612<br />

Not too surprisingly, the I/Os didn't change - without any indexes, we still have to access the tables the same way.
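If the only goal is to give the optimizer accurate row and page counts (no column histograms), a lighter-weight alternative worth testing is update table statistics, which refreshes just the table-level counts in systabstats. A hedged sketch, run outside the procedure with hard-coded dates:

select * into #sales
from sales
where date between 'Jan 1 2000' and 'Dec 31 2000 11:59pm'

-- refresh only the table-level row/page counts; no column histograms are written
update table statistics #sales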



Default Optimization with Create Index<br />

Now, let's test the always controversial topic of whether or not ASE uses indexes created on #temp tables. To test this, we will alter the earlier proc as follows:

create procedure temp_test_idx
    @book_type   varchar(30),
    @begin_date  datetime,
    @end_date    datetime
as begin

    select * into #sales
    from sales
    where date between @begin_date and @end_date

    create unique index sales_idx on #sales (stor_id, ord_num)
        with statistics using 1000 values

    select sd.* into #salesdetail
    from salesdetail sd, #sales s
    where sd.stor_id=s.stor_id
    and sd.ord_num=s.ord_num

    create unique index salesdetail_idx on #salesdetail (stor_id, ord_num, title_id)
        with statistics using 1000 values

    create index salesdetailtitle_idx on #salesdetail (title_id)
        with statistics using 1000 values

    select t.title_id, t.title, s.stor_id, s.ord_num, s.date, sd.qty,
        total_sale=(sd.qty*t.price)-(sd.qty*t.price*sd.discount)
    into #results
    from titles t, #sales s, #salesdetail sd
    where s.stor_id=sd.stor_id
    and s.ord_num=sd.ord_num
    and sd.title_id=t.title_id

    select count(*) from #results

    drop table #sales
    drop table #salesdetail
    drop table #results

    return 0
end
go

Now if we run it with the default optimization for <strong>ASE</strong> <strong>15.0</strong> (merge join enabled), let’s first take a look at<br />

the index selectivity outputs for the #sales and #salesdetail tables.<br />

Estimating selectivity for table '#sales'<br />

Table scan cost is 26364 rows, 222 pages,<br />

…<br />

(some output removed)<br />

…<br />

Estimating selectivity of index 'sales_idx', indid 2<br />

ord_num = ord_num<br />

stor_id = stor_id<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.001972387,<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

scan selectivity 3.793051e-005, filter selectivity 3.793051e-005<br />

restricted selectivity 1<br />

unique index with all keys, one row scans<br />

…<br />

(some output removed)<br />

…<br />

Estimating selectivity of index 'salesdetail_idx', indid 2<br />

ord_num = ord_num<br />

stor_id = stor_id<br />

Estimated selectivity for stor_id,<br />

selectivity = 0.001972387,<br />

Estimated selectivity for ord_num,<br />

selectivity = 0.1,<br />

scan selectivity 4.946968e-005, filter selectivity 4.946968e-005<br />

restricted selectivity 1<br />

12.94186 rows, 1 pages<br />

Data Row Cluster Ratio 0.9574466<br />

Index Page Cluster Ratio 0.9985192<br />


Data Page Cluster Ratio 0.901487<br />

using no index prefetch (size 4K I/O)<br />

in index cache 'default data cache' (cacheid 0) with LRU replacement<br />

using no table prefetch (size 4K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Data Page LIO for 'salesdetail_idx' on table '#salesdetail' = 1.55072<br />

Estimating selectivity of index 'salesdetailtitle_idx', indid 3<br />

scan selectivity 1, filter selectivity 1<br />

261612 rows, 1156 pages<br />

Data Row Cluster Ratio 0.1226413<br />

Index Page Cluster Ratio 0.9990109<br />

Data Page Cluster Ratio 0.009184345<br />

using index prefetch (size 32K I/O)<br />

in index cache 'default data cache' (cacheid 0) with LRU replacement<br />

using table prefetch (size 32K I/O)<br />

in data cache 'default data cache' (cacheid 0) with LRU replacement<br />

Data Page LIO for 'salesdetailtitle_idx' on table '#salesdetail' = 229869<br />

FINAL PLAN ( total cost = 27070.91)<br />

Path: 338222.5<br />

Work: 441325.4<br />

Est: 779547.9<br />

So the optimizer does at least consider the indexes in the optimization. The showplan now is:

QUERY PLAN FOR STATEMENT 7 (at line 33).<br />

7 operator(s) under root<br />

The type of query is INSERT.<br />

ROOT:EMIT Operator<br />

|INSERT Operator<br />

| The update mode is direct.<br />

|<br />

| |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | Using Worktable3 for internal storage.<br />

| | Key Count: 1<br />

| | Key Ordering: ASC<br />

| |<br />

| | |SORT Operator<br />

| | | Using Worktable2 for internal storage.<br />

| | |<br />

| | | |MERGE JOIN Operator (Join Type: Inner Join)<br />

| | | | Using Worktable1 for internal storage.<br />

| | | | Key Count: 2<br />

| | | | Key Ordering: ASC ASC<br />

| | | |<br />

| | | | |SCAN Operator<br />

| | | | | FROM TABLE<br />

| | | | | #salesdetail<br />

| | | | | sd<br />

| | | | | Index : salesdetail_idx<br />

| | | | | Forward Scan.<br />

| | | | | Positioning at index start.<br />

| | | | | Using I/O Size 32 Kbytes for index leaf pages.<br />

| | | | | With LRU Buffer Replacement Strategy for index leaf pages.<br />

| | | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | | With LRU Buffer Replacement Strategy for data pages.<br />

| | | |<br />

| | | | |SCAN Operator<br />

| | | | | FROM TABLE<br />

| | | | | #sales<br />

| | | | | s<br />

| | | | | Index : sales_idx<br />

| | | | | Forward Scan.<br />

| | | | | Positioning at index start.<br />

| | | | | Using I/O Size 32 Kbytes for index leaf pages.<br />

| | | | | With LRU Buffer Replacement Strategy for index leaf pages.<br />

| | | | | Using I/O Size 32 Kbytes for data pages.<br />

| | | | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | titles<br />


| | | t<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

| | | Positioning at start of table.<br />

| | | Using I/O Size 32 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

|<br />

| TO TABLE<br />

| #results<br />

| Using I/O Size 32 Kbytes for data pages.<br />

Total estimated I/O cost for statement 7 (at line 33): 27070.<br />

So it is still going to use a merge join, but now it estimates the I/Os at 27,000 (vs. 46). The actual I/Os are still roughly the same as before (~560K vs. ~530K), as illustrated below:

Table: #salesdetail01000280015570890 scan count 1, logical reads: (regular=16114 apf=0 total=16114),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: #sales01000280015570890 scan count 1, logical reads: (regular=411 apf=0 total=411), physical<br />

reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: titles scan count 1, logical reads: (regular=562 apf=0 total=562), physical reads: (regular=8<br />

apf=11 total=19), apf IOs used=11<br />

Table: #results01000280015570890 scan count 0, logical reads: (regular=262052 apf=0 total=262052),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Total actual I/O cost for this command: 558753.<br />

Total writes for this command: 0<br />

-----------<br />

261612<br />

Let's see what happens when we disable the merge join and force the NLJ - NLJs typically will do index traversals. The showplan becomes:

QUERY PLAN FOR STATEMENT 7 (at line 33).<br />

5 operator(s) under root<br />

The type of query is INSERT.<br />

ROOT:EMIT Operator<br />

|INSERT Operator<br />

| The update mode is direct.<br />

|<br />

| |N-ARY NESTED LOOP JOIN Operator has 3 children.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | #sales<br />

| | | s<br />

| | | Table Scan.<br />

| | | Forward Scan.<br />

| | | Positioning at start of table.<br />

| | | Using I/O Size 32 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | #salesdetail<br />

| | | sd<br />

| | | Index : salesdetail_idx<br />

| | | Forward Scan.<br />

| | | Positioning by key.<br />

| | | Keys are:<br />

| | | stor_id ASC<br />

| | | ord_num ASC<br />

| | | Using I/O Size 4 Kbytes for index leaf pages.<br />

| | | With LRU Buffer Replacement Strategy for index leaf pages.<br />

| | | Using I/O Size 4 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

| |<br />

| | |SCAN Operator<br />

| | | FROM TABLE<br />

| | | titles<br />

| | | t<br />

| | | Using Clustered Index.<br />

| | | Index : titleidind<br />

| | | Forward Scan.<br />


| | | Positioning by key.<br />

| | | Keys are:<br />

| | | title_id ASC<br />

| | | Using I/O Size 4 Kbytes for data pages.<br />

| | | With LRU Buffer Replacement Strategy for data pages.<br />

|<br />

| TO TABLE<br />

| #results<br />

| Using I/O Size 32 Kbytes for data pages.<br />

Total estimated I/O cost for statement 7 (at line 33): 1406537.<br />

Since we have to process every row in the outer table (#sales) the table scan is not surprising. However, we<br />

do note that #salesdetail does indeed pick up the index for the join!!! Note, however, the estimated I/O cost<br />

of 1.4 million, with an actual I/O cost of:<br />

Table: #sales01000290015809592 scan count 1, logical reads: (regular=222 apf=0 total=222), physical<br />

reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: #salesdetail01000290015809592 scan count 26364, logical reads: (regular=120303 apf=0<br />

total=120303), physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Table: titles scan count 261612, logical reads: (regular=784836 apf=0 total=784836), physical reads:<br />

(regular=0 apf=0 total=0), apf IOs used=0<br />

Table: #results01000290015809592 scan count 0, logical reads: (regular=262052 apf=0 total=262052),<br />

physical reads: (regular=0 apf=0 total=0), apf IOs used=0<br />

Total actual I/O cost for this command: 2334826.<br />

Total writes for this command: 0<br />

-----------<br />

261612<br />

It is interesting to compare this to the 3.7 million I/Os for the NLJ without the indexes - the use of an index does indeed reduce the I/O cost for the NLJ by nearly 40%; however, it is still roughly 4 times as many I/Os as the merge join. The obvious question is: the I/Os are cheaper, but what about execution times? Running the same tests with diagnostics off, but with set statistics time on, reveals the following results (times are for the main query only - not the entire proc execution):

Method                       CPU time (ms)   Elapsed time (ms)   Diff %*
Default proc (merge-join)           21,900              21,873   (ref)
Default proc w/ NLJ                 35,500              35,643   162%
Proc w/ upd stats SMJ               19,900              20,000   (ref)
Proc w/ upd stats NLJ               32,200              33,906   162%
Proc w/ index SMJ                   16,300              16,953   (ref)
Proc w/ index NLJ                   23,300              23,250   143%

* The difference is measured compared to the SMJ run in the same proc

Still the merge join (SMJ) has better overall execution times. The last run is a bit misleading - while the<br />

query execution times appear to be quite a bit better with the indexes vs. those without, the create index<br />

times were about 10 seconds - resulting in the same overall time. The reason the query time is likely faster<br />

is simply due to the fact that the create index statements forced the tables to be cached for the subsequent<br />

query (whereas update stats with no indexes does not have the same effect).<br />

From all of this, we can conclude that most applications with joins involving more than one #temp table should see some improvement from the enablement of merge joins in ASE 15.0. While there may be exceptions, those can be dealt with on an individual basis, allowing the other queries to benefit from the improved execution times.
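For those exceptional queries, one hedged approach is to scope the change to the offending procedure, mirroring the session-level set merge_join used in the test scripts above. The procedure below is purely hypothetical and only illustrates the pattern:

create procedure slow_report_proc
as begin
    -- this particular join tested faster as a nested loop join, so turn the
    -- merge join off for this procedure only (set options issued inside a
    -- procedure normally revert when it returns)
    set merge_join 0

    select t.title_id, count(*)
    from titles t, salesdetail sd
    where t.title_id = sd.title_id
    group by t.title_id

    return 0
end
go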

3 rd Party Tool Compatibility<br />

Third party applications or custom developed applications that access the database for normal DML and<br />

query operations should largely remain unaffected by an upgrade to <strong>ASE</strong> <strong>15.0</strong>. However, third party tools<br />

or custom applications that perform DDL or interrogate the system catalogs may encounter problems due to<br />


the system table changes, the deprecated functions, etc. For example, previous releases of Embarcadero's<br />

DBArtisan had a widely liked feature in which not only did the object browser display the table names, but<br />

it also listed the number of rows in the table. This row count was derived by using the rowcnt() function,<br />

which has now been deprecated. As a result, users of DBArtisan may be surprised when running older<br />

versions of the tool to see that their <strong>ASE</strong> <strong>15.0</strong> server is "missing all the data" as DBArtisan will report 0<br />

rows for every table. Remember, in deprecating the functions, <strong>Sybase</strong> left the function intact, but also had<br />

it return a constant value of 0 – along with a warning that the function was deprecated.<br />
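As an illustration, here is a hedged sketch of retrieving row counts with the catalog functions introduced in ASE 15.0 (row_count() and related functions replace rowcnt(), data_pgs(), etc.); the query is an example only, not DBArtisan's actual implementation:

-- row counts per user table using the 15.0 catalog functions
select o.name, row_total = row_count(db_id(), o.id)
from sysobjects o
where o.type = 'U'
order by o.name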

Users of third party system administration/DBA utilities should contact the vendors to see when they will<br />

be releasing an <strong>ASE</strong> <strong>15.0</strong> compatible version.<br />

Application/3 rd Party Tool Compatibility FAQ<br />

What changes were made to the system tables in ASE 15.0?

Quite a few. Most notably, the system catalogs in <strong>ASE</strong> <strong>15.0</strong> were updated to reflect the changes to support<br />

semantic partitions as well as column encryption. As a result, new system tables were added, many were<br />

extended to include partition information and new objects were added to the object classes. A full<br />

description of the changes is contained in the <strong>ASE</strong> <strong>15.0</strong> What's New Guide.<br />
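For example, tools that formerly walked sysindexes for structural information may also need the new partition-aware catalogs. A hedged sketch follows; the syspartitions column names used here are our reading of the 15.0 catalog and should be verified against the What's New Guide:

-- one row per data partition of each user table (indid 0 = heap, 1 = clustered index)
select table_name = o.name, partition_name = p.name, p.indid
from sysobjects o, syspartitions p
where o.id = p.id
  and o.type = 'U'
  and p.indid <= 1

-- or, for a single table:
exec sp_helpartition sales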


dbISQL & <strong>Sybase</strong> Central<br />

In <strong>ASE</strong> <strong>15.0</strong>, a new client utility was introduced for <strong>ASE</strong> customers – dbIsql. The reason is that dbIsql<br />

provides a common client utility for all the <strong>Sybase</strong> servers (previously it was used for ASA and ASIQ) and<br />

can connect to OpenServer-based applications such as <strong>Sybase</strong> Replication Server. Additionally, as a java<br />

based client, it can run on Linux and other Unix platforms and is a more mature product than the jisql<br />

utility previously shipped with <strong>ASE</strong>. Given the growth of Linux desktops in corporate environments, it<br />

was important to provide a GUI client that ran on multiple platforms.<br />

<strong>Sybase</strong> Central 4.3 ships with <strong>ASE</strong> <strong>15.0</strong>. However, this is not the same build as the <strong>Sybase</strong> Central 4.3 that<br />

may already be installed on your PC from other installations. This current build supports plug-ins for<br />

SySAM, Unified Agent Framework (UAF) and includes new functionality not available in previous<br />

releases (starting, stopping, pinging remote servers, remote errorlog viewing, automatic server detection,<br />

server groups, etc.). It also has a new location – specifically in %SYB<strong>ASE</strong>%\Shared\<strong>Sybase</strong> Central 4.3<br />

for Windows users vs. the former location directly under %SYB<strong>ASE</strong>%. This can lead to a number of<br />

problems if the previous versions are left intact as program launch icons, path settings and CLASSPATH<br />

may point to the previous location. The best advice is to rename the old installation directory and then after<br />

a period of time (when you are sure all the other product plug-ins are compatible with the newer release)<br />

completely delete it. Product plug-ins are individually available from www.sybase.com.<br />

dbISQL & <strong>Sybase</strong> Central FAQ<br />

What happened to SQL Advantage? Does it still work?

SQLAdvantage is no longer being maintained by <strong>Sybase</strong>. While it still can connect to <strong>ASE</strong> <strong>15.0</strong>, some of<br />

the newer API features in OCS 15.0 are not available to it; consequently, it may be limited in functionality. If you have problems with SQL Advantage on a machine on which you recently installed the ASE 15.0 PC-Client software, the likely cause is that it is finding OCS 15.0 in the path ahead of OCS 12.5, and with the

renaming of some of the dll libraries, it fails. The work-around is to create a shell script that has all of the<br />

OCS 12.5 library directories in the PATH and none of the OCS <strong>15.0</strong> libraries and launch SQL Advantage<br />

from the shell script.<br />

I can't get dbIsql to launch – it always fails with class not found or other errors.<br />

Both the GA and ESD 1 batch scripts contained an error in the call to dbIsql's class file. In some cases the<br />

"-Dpath" option was garbled, although the %path% for the option to set was present. To fix, navigate to<br />

%SYB<strong>ASE</strong>%\dbisql\bin, edit the dbisql.bat file and make sure the execution line reads as follows (note the<br />

line breaks below are due to document formatting. Pay particular attention to the dashes ("-") at the ends of<br />

some lines which are the switch character when copy/pasting this into your script):<br />

REM optionally trim the path to only what is necessary<br />

set PATH="c:\sybase\<strong>ASE</strong>-15_0\bin;c:\sybase\OCS-15_0\bin;.;"<br />

REM this should all be on one line – line breaks are due to document formatting<br />

"%JRE_DIR%\bin\java" -Disql.helpFolder="%DBISQL_DIR%\help" -<br />

Dsybase.jsyblib.dll.location="%SYBROOT%\Shared\win32\\" -<br />

Djava.security.policy="%DBISQL_DIR%\lib\java.policy" -Dpath="%path%" -classpath<br />

"%DBISQL_DIR%\lib;%isql_jar%;%jlogon_jar%;%jodbc_jar%;%xml4j_jar%;%jconn_jar%;%dsparser_jar%;%helpmana<br />

ger_jar%;%jcomponents_jar%;%jh_jar%;%jsyblib_jar%;%planviewer_jar%;%sceditor_jar%;%uafclient_jar%;%jin<br />

icore_jar%;%jiniext_jar%;%jmxremote_jar%;%jmxri_jar%;%commonslogging_jar%;%log4j_jar%"<br />

sybase.isql.ISQLLoader -ase %*<br />

Optionally, you can remove the -Dpath argument entirely, as it is not necessary.

Another common problem affects customers on Windows machines running Sybase ASA tools (from either Sybase ASA or Sybase IQ): the ASA installer registers its own version of dbisql as a "startup" process for quicker launches. You will need to open Task Manager to kill that process and then remove it from the startup list before attempting to run the ASE version of dbISQL (aka Interactive SQL).


<strong>Sybase</strong> Central frequently crashes or won't start up<br />

Make sure that you are using the current build of Sybase Central (4.3.0.2427 as of ASE 15.0 ESD 2). The most likely cause is that you are launching an old version of Sybase Central (i.e. 4.1) or an older build of 4.3 that is not compatible with the ASE 15.0 plug-in. Make sure you are launching the version out of %SYBASE%\Shared\Sybase Central 4.3 (to confirm this, open a DOS window, navigate to that directory and execute scjview.bat).

Additionally some issues have been noted when using <strong>Sybase</strong> Central with the UAF plug-in – which uses<br />

java security policies – and VPN software that does TCP redirection, such as VSClient from InfoExpress.<br />

If this is the problem, exit the VPN client application entirely and try <strong>Sybase</strong> Central again.<br />

In any case, all Sybase Central crashes create a stack trace in a file named scj-errors.txt (if there is more than one crash, the files are numbered sequentially after the first - i.e. scj-errors-2.txt) located in the Sybase Central home directory. If reporting problems to Sybase Technical Support, include this file, as it identifies all the plug-in versions as well as all the jar versions used by Sybase Central and the plug-ins.


Appendix A - Common Performance Troubleshooting Tips<br />

1. Delete the statistics and run update index statistics on the table. If necessary, gradually increase the step count and/or decrease the histogram tuning factor from the default of 20 to a number that keeps the total number of steps manageable.
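A hedged sketch of this first step (the table name, step count, and tuning-factor value below are only examples):

delete statistics salesdetail
go
update index statistics salesdetail using 100 values
go
-- if the resulting histograms are still too large, the histogram tuning factor
-- (default 20, per the text above) can be lowered server-wide:
sp_configure 'histogram tuning factor', 10
go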



<strong>Sybase</strong> Incorporated<br />

Worldwide Headquarters<br />

One <strong>Sybase</strong> Drive<br />

Dublin, CA 94568, USA<br />

Tel: 1-800-8-<strong>Sybase</strong>, Inc.<br />

www.sybase.com<br />

Copyright © 2000 <strong>Sybase</strong>, Inc. All rights reserved. Unpublished rights reserved under U.S. copyright laws. <strong>Sybase</strong> and the <strong>Sybase</strong> logo are<br />

trademarks of <strong>Sybase</strong>, Inc. All other trademarks are property of their respective owners. ® indicates registration in the United States.<br />

Specifications are subject to change without notice. Printed in the U.S.A.
