Netezza (currently branded as IBM PureData for Analytics; it was also the core of an earlier product stack, the IBM DB2 Analytics Accelerator) created the category-breaking data warehouse appliance in 2003, innovating with capabilities such as streaming scans, the appliance DW form factor, and zone maps, and leading the market in minimal-DBA SQL data warehouse databases. IBM acquired Netezza in 2010, but the product never integrated well into the IBM stack and is currently end of life.
EDW.CLOUD Strategy Recommendation
Exit. Get off of Netezza as soon as feasible – there is no sensible forward path here. See the migration strategy section for advice on planning the migration and deciding between lift-and-shift and reimplementation. At this point, a lift-and-shift is likely to cost less than continued support.
Migration Support Vendors
AWS offers schema and data migration tooling; see https://aws.amazon.com/blogs/database/introducing-data-extractors-in-aws-schema-conversion-tool-version-1-0-602/
IBM (and many vendors) offers services to migrate or reimplement on various forms of DB2: https://www.ibm.com/support/knowledgecenter/en/SS6NHC/com.ibm.swg.im.dashdb.apdv.porting.doc/doc/compat_process.html. IBM does not appear to provide tooling to assist migration, and seems willing to lose a substantial portion of Netezza warehouses to competing DBMS vendors.
Impetus has a product to migrate Netezza work to Hadoop, see https://www.impetus.com/data-warehouse-modernization/resources/solution-brief/automated-netezza-workload-migration
The usual suspect ETL vendors provide tooling to migrate data (generally not much different from using the NZUNLOAD command). Many data modeling tools support schema conversion (be careful with PK definitions, since Netezza does not enforce them). These are important options for reimplementation in the cloud, but note our advice on implementation strategy in reimplementing data warehouses in the cloud.
EDW.CLOUD, the product, will automate lift-and-shift to the cloud and will emulate Netezza commands, allowing a 1-2 week migration. See the EDW.CLOUD Netezza migration product.
Advice About Netezza Migration to the Cloud
Netezza stored procedures and user-defined functions are based on a modified version of the Postgres 7.1 PL/pgSQL language. Understanding how to handle these can be the bulk of a migration, especially if they are used for ETL/ELT data loading. Netezza SQL also has idiosyncrasies, such as Boolean values of ‘t’ and ‘f’.
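For example, Netezza emits Booleans as ‘t’ and ‘f’ in query output and unload files, so a migration pipeline typically has to map them to the target database’s literals. A hypothetical conversion helper (the function name and target labels are ours, not from any tool):

```python
def convert_netezza_bool(value, target="ansi"):
    """Map a Netezza exported Boolean literal ('t'/'f') to a target
    representation: 'ansi' -> TRUE/FALSE, 'int' -> 1/0 (for targets
    that model Booleans numerically). An empty field becomes NULL."""
    if value == "":
        return "NULL"
    flag = {"t": True, "f": False}[value.lower()]
    if target == "int":
        return "1" if flag else "0"
    return "TRUE" if flag else "FALSE"
```

A transformation like this would be applied per Boolean column during unload-file rewriting or inside the ETL tool’s mapping layer.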
Our general experience is that migrating from Netezza to a traditional row store (Oracle, DB2, SQL Server) can run into performance issues, due to the high scan I/O demands and the lack of zone maps. Expect a long tuning cycle for partitioning, indexing, etc. Column stores generally work out better.
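Zone maps matter here because they let Netezza skip any block whose stored min/max range cannot contain a predicate’s value, while a row store without them must either scan everything or be carefully indexed and partitioned. A toy sketch of the pruning idea (our own illustration, not Netezza’s implementation):

```python
def build_zone_map(blocks):
    """For each block of column values, record its (min, max).
    Netezza maintains comparable per-extent metadata automatically."""
    return [(min(block), max(block)) for block in blocks]

def blocks_to_scan(zone_map, value):
    """Return the indexes of blocks whose [min, max] range could
    contain the value; every other block is skipped without I/O."""
    return [i for i, (lo, hi) in enumerate(zone_map) if lo <= value <= hi]
```

On data loaded in rough sort order (common for date columns), most blocks have narrow ranges and a selective predicate touches only a few of them, which is why the same query can regress badly on a target that lacks the equivalent mechanism.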
Netezza migrations we have seen to Hadoop (including SQL-on-Hadoop) and to Spark SQL seem poorly received. This appears to stem from the conflict between Netezza’s “just do it” model and the tune-and-tweak cycles expected in the Hadoop world, as well as the business-versus-tech priorities of the product’s user base. Consider the cloud-native columnar databases first for frequently queried data – Snowflake, Redshift (for low-concurrency use cases), BigQuery (if the ecosystem is acceptable). In most of these cases, there is no compelling cost reason to tier archival data into Hadoop or an object store.