Oracle to Postgres – Part 1
Andy Kerber (@dbakerber), Senior Consultant
Recently Oracle has dramatically increased the price of its Oracle Standard Edition database. Because of this price increase, many observers in the database world feel that some companies may be migrating from Oracle to other database versions such as SQL Server, NoSQL or PostgreSQL.
In this article, we will discuss the process of copying an Oracle database from Oracle to PostgreSQL. The tools we will be using for this are Oracle XE, PostgreSQL, SymmetricDS (replication software), and ora2pg. PostgreSQL, SymmetricDS, and ora2pg are all open source software.
PostgreSQL is the destination database software, SymmetricDS is the replication software, and ora2pg is a tool for converting the Oracle table definitions to PostgreSQL, and also for creating the initial load if required.
In this article, we use the latest version of PostgreSQL and Oracle XE. For the purposes of this demonstration, I downloaded the Swingbench software from dominicgiles.com, created a 3.2G OE schema in the Oracle XE database, and initiated the replication from Oracle to PostgreSQL.
Because this post is designed for the Oracle DBA, I am going to omit the instructions for the Oracle installation, but I will give instructions for installing PostgreSQL.
The first step was to install the OE schema for Swingbench in the Oracle XE database. This is a fairly straightforward process, and is well documented on dominicgiles.com. Java is required for Swingbench, and I recommend that you install the latest version.
After installing Swingbench and Java, run the oewizard (oewizard.bat on windows) to create the order entry (OE) schema in your Oracle database. For the purposes of this demo, choose the smallest database size, 3.2G. Here is a link to instructions for installation of Swingbench: http://houseofbrick.com/swingbench-installation-and-setup/.
Next, install the PostgreSQL software and create your initial database. Because of the prevalence of UNIX derivatives for Oracle and PostgreSQL software, this article assumes that the Oracle and PostgreSQL database are using Linux.
PostgreSQL can be download from here: http://www.enterprisedb.com/downloads/postgres-postgresql-downloads. Note that the proprietary version, Postgres Plus Advanced server, is also available from this site.
The download will be a run file, which should be executed by root. This will be a GUI installation program. Normally, installation is done with the default options. Installation in /opt, owner postgres, port 5432.
After the program completes, go ahead and run the stack builder program, and choose any additional packages. Anything labelled EDB or EnterpriseDB will be proprietary packages. pgAgent and pgBouncer are not proprietary packages.
PgAgent is useful as a job scheduler, and pgBouncer is used for connection pooling. Additional free tools available are pgAdmin and SymmetricDS (for replication) or Slony (also for replication). PsqlODBC id needs to be installed, and if you plan to connect to an Oracle database you probably need to install the Oracle client as well.
After initial installation of PostgreSQL, there is a substantial amount of tuning that must be done, similar to Oracle. This includes tuning the vacuum process (space recovery) WAL (Write Ahead Logs, similar to Oracle redo logs) sizing, checkpoint tuning, etc. I will leave configuring these items as content for future blogs.
For the purposes of this blog, we are going to let our replication software handle the initial load. The initial load can also be done manually.
Next, in PostgreSQL, create the user that will own the SOE schema, and create the database that will contain it. The command to create a user in PostgreSQL (in psql) is: ‘create user soe password ‘soe’;. After creating the user, the command to create the database is (in psql) ‘create database SOE with owner soe’;. Note that in the screen print below, the command to create the soe user grants superuser to soe. This is for ease of use, it is not necessary each time.
At this point, the PostgreSQL database and user are created, the schema and data are created within the Oracle database, and we are ready to begin the replication process.
In order to access the PostgreSQL database, the pg_hba.conf file must be configured to allow access. Assuming a default installation, the pg_hba.conf file will be located in /var/lib/pgsql/9.4/data. Also Assuming that the database created is named soe, and your local network is 192.168.1.x, these lines should be added to the pg_hba.conf file to allow local and remote access:
local all soe md5 host all soe 192.168.1.0/24 password
The next step is to copy the schema from Oracle to PostgreSQL. It works best to pre-create the tables, as SymmetricDS may not convert the data types properly if it is allowed to create the tables. The program ora2pg, downloaded here: https://sourceforge.net/projects/ora2pg/, can be used to generate the DDL for the tables in PostgreSQL. For the purposes of this blog, we are not going to try and convert the PL/SQL triggers, functions, and packages.
Using ORA2PG to extract DDL
Ora2pg uses ora2pg.conf file for its configuration arguments. One of the lines has the data type conversions, and the interval data type was left off this list. So, uncomment this line and add this to the front of the data line: INTERVAL:interval, so the line should read like this:
DATA_TYPE INTERVAL:interval, DATE:timestamp, LONG:text, LONG RAW:bytea, CLOB:text, NCLOB:text, BLOB:bytea, BFILE:bytea, RAW:bytea, ROWID:oid,FLOAT:double precision, DEC:decimal,DECIMAL:decimal, DOUBLE PRECISION:double precision, INT:integer, INTEGER:integer, REAL:real, SMALLINT:smallint,BINARY_FLOAT:double precision, BINARY_DOUBLE:double precision, TIMESTAMP:timestamp, XMLTYPE:xml, BINARY_INTEGER:integer, PLS_INTEGER:integer, TIMESTAMP WITH TIME ZONE:timestamp with time zone, TIMESTAMP WITH LOCAL TIME ZONE:timestamp with time zone
If you do not make the change above, interval data types will not be created properly.
The following lines also need to be modified per your requirements (our example is extracting only DDL):
# Set the Oracle home directory ORACLE_HOME /u01/app/oracle/product/11.2.0/xe # Set Oracle database connection (datasource, user, password) ORACLE_DSN dbi:Oracle:host=pgoraclone.localdomain;sid=xe ORACLE_USER system ORACLE_PWD password # Oracle schema/owner to use SCHEMA SOE # Type of export. Values can be the following keyword: # Note: data is only exported if INSERT or COPY command is used, #copy is the fastest method normally. # TABLE Export tables, constraints, indexes, ... # PACKAGE Export packages # INSERT Export data from table as INSERT statement # COPY Export data from table as COPY statement # VIEW Export views # GRANT Export grants # SEQUENCE Export sequences # TRIGGER Export triggers # FUNCTION Export functions # PROCEDURE Export procedures # TABLESPACE Export tablespace (PostgreSQL >= 8 only) # TYPE Export user defined Oracle types # PARTITION Export range or list partition (PostgreSQL >= v8.4) # FDW Export table as foreign data wrapper tables # MVIEW Export materialized view as snapshot refresh view # QUERY Convert Oracle SQL queries from a file. # KETTLE Generate XML ktr template files to be used by Kettle. TYPE TABLE,VIEW,GRANT,SEQUENCE,MVIEW # By default all output is dump to STDOUT if not send directly to postgresql # database (see above). Give a filename to save export to it. If you want # a Gzip'd compressed file just add the extension .gz to the filename (you # need perl module Compress::Zlib from CPAN). Add extension .bz2 to use Bzip2 # compression. OUTPUT output.sql
After configuring the ora2pg.conf file, set your environment for the Oracle database you will be accessing by using oraenv. This will set up the appropriate libraries. If you see an error similar to this:
Can't load '/usr/local/lib64/perl5/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: libclntsh.so.11.2:
There is a problem with your Oracle path, and it can probably be fixed by running oraenv or setting LD_LIBRARY_PATH appropriately.
So to run ora2pg, after setting up the ora2pg.conf command, just enter ora2pg. It will generate a file called output.sql containing the commands to create the schema in PostgreSQL.
Next, copy the output.sql file to your PostgreSQL database server, and run it, using this command:
psql soe soe –f output.sql
where the first soe is the database name, the second soe is the username. It should then prompt you for the password for the soe user. The output should look similar to this:
[postgres@postgres3 data]$ psql soe soe -f output.sql Password for user soe: SET CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE INDEX CREATE INDEX CREATE TABLE ALTER TABLE CREATE INDEX CREATE TABLE ALTER TABLE CREATE INDEX CREATE TABLE ALTER TABLE CREATE TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE . . . ALTER TABLE
So, at this point the tables in the destination are created and we are ready to begin the replication.
Using ORA2PG for initial load
The method described later in this blog for using SymmetricDS for the initial load is slow. It appears to do a commit after each insert, which takes a substantial amount of time.
Instead, you can use ora2pg for the initial load. If you choose to do so, after creating the tables, modify the ‘TYPE’ line in the ora2pg.conf file for either ‘COPY’ or ‘INSERT’ (copy is recommended), copy the subsequent output file to the destination database server, and run the output file exactly as you did to create the schema. This will load the data very quickly.
There are other tools for copying data from Oracle to PostgreSQL, but we are not discussing them here.
In part one, I discussed how Oracle’s recent price increase for its Standard Edition database may drive some users to migrate from Oracle to other database versions such as PostgreSQL. I also described the process of copying an Oracle database from Oracle to PostgreSQL and the tools like Oracle XE, PostgreSQL, SymmetricDS, and ora2pg used to facilitate the migration. In part two of the blog, I will focus specifically on one way replication configuration using SymmetricDS configuration.