Convert Your CLOBs to VARCHAR2 in Oracle 12c

Understanding Oracle 12c’s Extended VARCHAR2, NVARCHAR2 & RAW Data Types

One easily overlooked new feature of 12c is an increase in the size of the VARCHAR2, NVARCHAR2, and RAW data types. These data types previously allowed up to 4K of data to be stored in them, but can now accommodate 32K (8 times more).

VARCHAR2 vs. NVARCHAR2
You wouldn’t be alone if you were wondering; what exactly is the difference between the two VARCHAR2 data types?

Comparing VARCHAR2 and NVARCHAR2 Post-Upgrade

The VARCHAR2 data type was introduced around 1989 in Oracle 6, and can store character-based data in the database character set (NLS_CHARACTERSET).

The NVARCHAR2 data type allows us to use Unicode for some columns, regardless of the database character set. The NVARCHAR2 is a Unicode-only data type and typically contains data in the NLS_NCHAR_CHARACTERSET format.

One reason you may want to use NVARCHAR2 is that your DB uses a non-Unicode character set, and you still want to be able to store Unicode data for some columns without changing the primary character set. Another reason may be that you want to use two Unicode character sets (AL32UTF8 for data that comes mostly from western Europe, and AL16UTF16 for data that comes mostly from Asia for example) with each one being stored in the most efficient manner in columns of the appropriate data type.

Significance of 32K
Yeah – so what? 32,767 bytes (yawn). SQL Server can store 2GB in VARCHAR (Max)!

Ignoring the fact that many historical events were accomplished with computers having a total of 32K of memory or less, what does that number mean to us today in 2016?

According to analysis performed on numerous Internet sites, the average word length in English text is about five characters. Although many words are longer, shorter words are used much more frequently. Allowing an additional 1.5 characters for the space and punctuation between words, means that in 32K (with an 8-byte character set), we can store about 5,000 words of written English text. At 130 words spoken per minute, that is approximately a 38-minute speech.

In addition to natural languages, many computer messaging formats are text-based, and 32K represents an 8x increase compared to what we could store in previous versions of Oracle VARCHAR2 columns. That means that many computer applications can now be simplified, and potentially benefit from dramatically improved performance by storing character data in-line as opposed to in Large Objects as previously required for anything larger than 4,000 bytes.

Why Move from CLOBs to VARCHAR2: Performance & Simplicity Benefits

Prior to 12c, any data over 4K had to be stored in Large Objects (LOBs) with the character variant being a Character Large Object (CLOB). These are inherently less efficient than native data types, which are stored in-line with the data row. LOBs have special storage requirements, and may even reside outside of normal table spaces. Working with LOBs is also … clunky.

In addition to the performance benefits and simplification of your application code, there are more built-in functions, as well as operators, for working with VARCHAR2 than with CLOB data. So, if your existing CLOB data values are less than 32K, it probably makes sense to consider a data conversion as part of your Oracle 12c upgrade. This will be especially true for CLOB data that is inserted, updated, or manipulated frequently.

Enabling Extended VARCHAR2 Support in Oracle 12c

Configuration
In order to enable the increased size limit in 12c, we have to set the initialization parameter MAX_STRING_SIZE = “EXTENDED” (as opposed to “NORMAL”).

Once that change is made, you will be able to create columns and declare PL/SQL data types of VARCHAR2(32767).

Unfortunately, to upgrade an existing database is not quite as simple as it sounds. The change can only be made when the database is in upgrade mode, and a support script must be run to upgrade the data dictionary in conjunction with the parameter change. Be warned, there is no way back once you’ve made the change!

Traditional DB Example:

SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP UPGRADE
SQL> ALTER SYSTEM SET max_string_size=extended;
SQL> @?/rdbms/admin/utl32k
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP

 

Pluggable DB Example:

SQL> ALTER SESSION SET CONTAINER = {Normal DB};
SQL> ALTER PLUGGABLE DATABASE CLOSE IMMEDIATE;
SQL> ALTER PLUGGABLE DATABASE OPEN UPGRADE;
SQL> ALTER SYSTEM SET max_string_size=extended;
SQL> @?/rdbms/admin/utl32k
SQL> ALTER PLUGGABLE DATABASE CLOSE;
SQL> ALTER PLUGGABLE DATABASE OPEN;

 

Converting Existing CLOB Columns to VARCHAR2

All right, so I’m convinced. How painful is this going to be?

First, you’ll want to figure out if you have any CLOB values over 32K by running a simple query:

SQL> select count(1)
from {your_table}
where dbms_lob.getlength( {CLOB_Column} ) > 32767

 

If you get back a count of ”0”, then you’re in good shape. Otherwise, you’ll have to figure out how best to continue. Perhaps you can transform or truncate the longer ones, or maybe you have to scrap the idea altogether.

Assuming you continue (and you’ve upgraded your database as described above), then it’s a matter of the following steps:

Adding & Populating New VARCHAR2 Columns

SQL> alter table {your table} add {VARCHAR2_Column} varchar2(32767);
Table altered.


SQL> update {your table} set {VARCHAR2_Column} = {CLOB_Column};
17000 rows updated.


SQL> commit;
Commit complete.


SQL> alter table {your table} drop column {CLOB_Column};
Table altered.


-- Optionally rename the new column

SQL> alter table {your table}
rename column {VARCHAR2_Column} to {CLOB_Column};

Table altered.

Handling Oversized CLOB Data with Truncation

To truncate text in the CLOB to the maximum length of VARCHAR2, change the update query to:

SQL> update {your table}
set {VARCHAR2_Column} = dbms_lob.substr( {CLOB_Column},
amount=>32767, offset=>1);

 

Additional Considerations & Best Practices

  1. Once you upgrade a database, there’s no way back other than to restore from backup.
  2. If you want to index one of your new larger VARCHAR2 columns, you may run into restrictions on the maximum size of an indexed column, which is about 75% of the index tablespace block size. In such cases, you will have to move or create the index in a tablespace with a larger block size.
  3. You may need to review any application code that works with LOB columns
    1. Insert and updates may need to ensure values are not larger than 32K, but otherwise remain the same.
    2. Although many functions are agnostic, or overloaded to work with either CLOB or VARCHAR2, some code may need to be changed if using LOB-specific functionality.
  4. You should always perform a full regression test of any applications when making structural changes to your database.

Conclusion: Key Advantages of Converting CLOBs to VARCHAR2

In summary, the benefits of converting CLOB columns sized 32K or less to VARCHAR2, and the benefits include:

  1. Improved performance for DML involving character objects
  2. Ability to index the column values
  3. Simplification of application code
  4. Reduced reliance on PL/SQL procedures for manipulation of column values
  5. Increased functionality for VARCHAR2 vs. CLOB

Ready to boost your Oracle 12c performance?

Our House of Brick database experts will guide you through enabling extended VARCHAR2 support, converting CLOBs seamlessly, and tuning your environment for faster, more reliable operations.

Talk to an Expert

Frequently Asked Questions

What is the MAX_STRING_SIZE parameter, and how does setting it to EXTENDED impact VARCHAR2 columns?

When MAX_STRING_SIZE is set to EXTENDED, Oracle 12c allows VARCHAR2, NVARCHAR2, and RAW columns to grow from 4 KB to 32 KB. You enable it by putting the database in upgrade mode, altering the parameter, and running the utl32k script.

What is the difference between VARCHAR2 and NVARCHAR2 after enabling EXTENDED mode?

VARCHAR2 uses the database character set (NLS_CHARACTERSET), while NVARCHAR2 stores Unicode data in the NLS_NCHAR_CHARACTERSET. Both types support up to 32 KB in EXTENDED mode.

How do I convert existing CLOB columns to VARCHAR2 once extended support is enabled?

After enabling EXTENDED mode, add a new VARCHAR2(32767) column, update it with the CLOB values, commit, drop the old CLOB column, and optionally rename the new column to match the original name.

 

What if my CLOB data exceeds 32 KB during conversion?

Use the DBMS_LOB.SUBSTR function in your update statement to truncate values to 32,767 bytes, or review and transform longer data before performing the conversion.

 

Are there any prerequisites or considerations before upgrading to support extended VARCHAR2 columns?

You cannot revert after making this change—ensure you have a full backup, consider index length limits, verify application code compatibility, and perform comprehensive regression testing.

Related Posts