The Database Knowledgebase on the Web

Oracle Streams, High Speed Replication and Data Sharing


I originally wrote this review in Jan 2006. At the time I had done very little production Streams work. I have since done quite a bit of work in the arena and my opinion of this book changed because of that. At the time, it seemed like it was loaded with in-depth information. After using Streams, I realized how much was missing.

If you have never used Streams at all, this book is a decent, easy-to-read overview. You would get just as much out of reading the Oracle Streams documentation, but this is a bit easier to follow.

Also, since the book was written in 2005, it is now out of date. If you can pick this book up at a used book store or on a site like buy.com for less than $10, it's probably worth adding to your library. Otherwise, I wouldn't bother.

Title: Oracle Streams, High Speed Replication and Data Sharing
Author: Madhu Tumma
Publisher: Rampant Press
Publish Date: Feb 2005
ISBN: 0-9745993-5-2
Price: US $16.95
Pages: 289

This book covers all aspects of Oracle Streams including how to configure, monitor and use it.

The book includes a preface, 10 chapters, references and an index.

The chapters are:

* Preface -- Data & What is Streams Replication?

* Chapter 1: What is Streams? -- Introduction to Replication

* Chapter 2: Streams Components and Processes -- The Architecture of Streams

* Chapter 3: Streams Replication -- The OUT When, What & How

* Chapter 4: Capture and Propagate Configuration -- Database Nitty Gritty

* Chapter 5: Apply Process Configuration -- The IN When, What and How

* Chapter 6: Apply Handlers -- Code Time

* Chapter 7: Monitoring and Troubleshooting Streams -- SQL, Views and Errors

* Chapter 8: Down Streams Capture -- Remote Source Replication

* Chapter 9: Streams and Real Application Clusters -- Streams & RAC Overview

* Chapter 10: Streams for Heterogeneous Replication -- Oracle and Non-Oracle Data Sharing

Preface

5 pages

I wouldn't normally include the preface in a review. In most cases it's just a description of the book and the way it's laid out, with the occasional discussion of philosophy by the author. In Oracle Streams, Madhu Tumma opens with a really decent definition of Streams and how it's different from Data Guard and RAC.

Chapter 1: What is Streams?

23 pages

Chapter 1 is the introduction chapter. The author covers data sharing and synchronization concepts and how Streams fits into those concepts. He covers why data sharing is needed and how data sharing is impacted by very large databases (VLDB).

The need for data transformation is discussed briefly and an example scenario is presented.

He also discusses just what data replication is and why it's needed, including: to support global operations, site autonomy, enhanced performance, and data availability and protection (failover).

This chapter explains the difference between synchronous and asynchronous replication. This section also describes two-phase commit (2PC), issues with 2PC and how Streams is a simpler approach.

The next section in this chapter explains what Oracle Streams is, including a brief intro to the Streams architecture and where to use Streams. This is a really good discussion that gets a bit more into the differences between Streams, Data Guard and RAC. The author points out that while Data Guard, RAC and Streams are different, Streams has incorporated some of the strengths of RAC and Data Guard. Streams also allows PL/SQL user exits (Apply Handlers), which are not available in Data Guard (and don't really make sense for RAC).

The author explains that Streams technology is used in Message Queuing (via AQ), Event Messaging and Notification, Oracle Replication, and Data Warehouse loading (via Change Data Capture).

Chapter 1 also provides a history of Streams evolution, Streams 10g new features and a little bit more information on Streams and AQ.

The chapter ends with a discussion of two other replication products: GoldenGate Data Synchronization Platform by GoldenGate Software and Shareplex Data Replication by Quest Software. Each of these technologies gets about a page of detail.

By itself, this would be a great overview for anyone in your organization interested in Streams, data sharing or replication. If Rampant Press provided this chapter for free as a PDF, I bet they would sell many more copies of this book and others in the series.

Chapter 2: Streams Components and Processes

25 pages

Chapter 2 is an introduction to the architecture of Oracle Streams. The first part covers the Producer/Consumer model. That is, there is a producer database providing data and one or more consumer databases consuming that data. The Producer/Consumer model was introduced into Oracle with AQ.

The author provides good detail on the flow of data in Streams and the Streams Clients. These clients are the entities that capture data, move data around, and store/manipulate the data.

My favorite part of this chapter is the discussion of queues. Queues are a key component of Streams. This chapter explains HOW those queues are used. It also covers secure queues, the typed queue and the AnyData queue. User applications would use transactional queues.

This section also defines enqueuing and dequeuing and how and when enqueues and dequeues are called.
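To make the producer/consumer idea concrete, here is a toy sketch in Python. It is not from the book and it is not Oracle code; the class, the subscriber names and the message layout are all made up for illustration. It only shows one producer enqueuing a message once and each consumer dequeuing its own copy independently.

```python
from collections import deque


class MultiConsumerQueue:
    """Toy staging queue: every subscriber has its own pending list."""

    def __init__(self, subscribers):
        # One deque of pending messages per subscriber (hypothetical names).
        self._pending = {name: deque() for name in subscribers}

    def enqueue(self, message):
        # The producer publishes once; every subscriber receives the message.
        for pending in self._pending.values():
            pending.append(message)

    def dequeue(self, subscriber):
        # Each consumer dequeues independently; None means nothing is waiting.
        pending = self._pending[subscriber]
        return pending.popleft() if pending else None


queue = MultiConsumerQueue(["reporting_db", "backup_db"])
queue.enqueue({"op": "INSERT", "table": "orders", "id": 1})
queue.enqueue({"op": "UPDATE", "table": "orders", "id": 1})

first_for_reporting = queue.dequeue("reporting_db")
first_for_backup = queue.dequeue("backup_db")
```

Both consumers see the INSERT first, and one consumer falling behind does not hold up the other.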

The capture process is covered in some detail, including almost a full page about buffered queues and how they help performance. Logical Change Records (LCRs) derived from the redo logs are also discussed.

The author also explains how LogMiner is used in the capture process and what the differences are between Hot Mining and Cold Mining.

Since Streams uses the redo logs for capture, additional information called "supplemental logging" is required. This chapter provides very detailed information about this additional logging as well as configuration of this logging.

The latter half of this chapter is an introduction to propagation, propagation rules and the apply process. The main features of the apply process are discussed, as are the four custom apply handlers: DML Handler, DDL Handler, Message Handler and Pre-Commit Handler.

The chapter ends with an overview of the Rules Engine and Rule Based Transformations.

Chapter 3: Streams Replication

22 pages

This chapter covers the specifics of data replication, rather than Streams in general.

The author explains what database replication is and how DDL and DML differ. Streams can handle both kinds of replication. This chapter explains how background processes in the source database capture DML and DDL from the redo log. These changes are propagated to, and applied in, destination databases.

A fairly detailed explanation of "Downstream Capture" is provided. Downstream capture is the process of copying redo logs to a non-critical database so that the capture process will not impact performance. The author provides an excellent explanation of the requirements and configuration of downstream capture.

The author lists the types of DML that are replicated (Insert, Update, Delete, Merge and Updates of LOBs) and covers the types of DDL activity that are not replicated. He makes an important note here:

A Capture process can capture DDL statements, but not the result of DDL statements, unless the DDL statement is a CREATE TABLE AS SELECT statement.

He goes on to use ANALYZE as an example. The analyze itself can be captured but the statistics generated would not be.

He also makes the point that by using nologging (for SQL) and unrecoverable (for SQL*Loader), the capture process will not see those changes. DBAs use these keywords to improve performance.

Additional configuration of supplemental logging and object instantiation are covered.

Streams has a feature called tags. The author explains what tags are and how they can help identify the session that generated a change.
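One place tags matter, as I understand it, is two-way replication: changes made by an apply process carry a tag, so the capture process on that site can skip them instead of sending them back where they came from. The Python below is a toy model of that idea only. The tag value, the function names and the redo layout are assumptions for illustration, not anything from the book or from Oracle.

```python
LOCAL_TAG = None        # changes made by ordinary sessions carry no tag
APPLY_TAG = "applied"   # hypothetical tag value set by the apply side


def capture(redo_entries):
    """Return only the changes that originated locally (untagged)."""
    return [e["change"] for e in redo_entries if e["tag"] is LOCAL_TAG]


def apply_changes(site_redo, changes):
    """Apply incoming changes, marking each redo entry with the apply tag."""
    for change in changes:
        site_redo.append({"change": change, "tag": APPLY_TAG})


site_a_redo = [{"change": "UPDATE emp SET sal = 10 WHERE id = 1", "tag": LOCAL_TAG}]
site_b_redo = []

# A -> B: the user's change at A is captured and applied at B.
apply_changes(site_b_redo, capture(site_a_redo))

# B -> A: the applied change is tagged, so B's capture ignores it
# and nothing is sent back to A.
sent_back_to_a = capture(site_b_redo)
```

Without the tag check, the same update would bounce between the two sites indefinitely.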

Chapter 3 ends with a discussion of multi-way replication and conflict resolution. He details the four types of conflict: Update, Uniqueness, Delete and Foreign Key. The author notes that each of these conflicts is automatically handled by placing the errors in an error queue unless a custom error handler has been written. He also makes the point that good design can alleviate some conflict issues, and he points out the pre-built conflict handlers.
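A rough Python sketch of an update conflict may help here. It is a simplified model, not the book's code and not the Streams API: I am assuming a change carries the old value the source saw, that a mismatch with the destination's current value counts as a conflict, and that the `overwrite` handler is a stand-in for whatever handler you would actually register.

```python
def apply_update(table, change, error_queue, handler=None):
    """Apply one row update, detecting an update conflict via old values."""
    current = table[change["key"]]
    if current == change["old"]:
        table[change["key"]] = change["new"]             # no conflict
    elif handler is not None:
        table[change["key"]] = handler(current, change)  # handler decides
    else:
        error_queue.append(change)                       # left for the DBA


def overwrite(current, change):
    """Hypothetical handler: the incoming value always wins."""
    return change["new"]


destination = {1: "Smith"}
errors = []
# The source saw 'Jones' before its update, but the destination holds 'Smith'.
change = {"key": 1, "old": "Jones", "new": "Brown"}

apply_update(destination, change, errors)    # no handler: goes to the error queue
unresolved = destination[1]
apply_update(destination, change, errors, handler=overwrite)
resolved = destination[1]
```

With no handler the row is left alone and the change is parked in the error queue. With a handler the conflict is resolved on the spot.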

Chapter 4: Capture and Propagate Configuration

64 pages

This is the chapter for DBAs. This chapter goes into great detail about the configuration of both the Capture and Propagate processes. This chapter alone makes the book worthwhile.

The chapter starts with capture environment setup: init.ora, streams pool, LogMiner, streams administrator, supplemental logging, database links and setting up the queues for staging and propagation. I won't go into detail on each of these except to say that the author does a very good job covering the important details.

The beginning of this chapter also includes the steps needed to be taken for capture configuration, the capture architecture and capture rules. This section is rich with code (SQL and PL/SQL) and actual examples.

The author also covers adding new objects to an existing configuration, something that I know from personal experience can be a pain. He provides a step-by-step guide.

The final part of this chapter is dedicated to the propagation process, including enabling, disabling, dropping and altering propagation.

Chapter 5: Apply Process Configuration

33 pages

The apply process is the Streams automated way of getting data INTO the destination database. This is an optional process (you can write your own dequeue process if you want) and is implemented as a background process.

The author provides a nice flow diagram of the apply process. Apply rules are covered.

The apply process components are covered: Reader Server, Coordinator Process, and the Apply Servers.

The bulk of this chapter covers creating and managing an apply process and, like the previous chapter, is code intensive (SQL and PL/SQL). The Streams API is described in detail via syntax examples.

Chapter 6: Apply Handlers

22 pages

This chapter covers the four custom Apply Handlers: DDL Handler, DML Handler, Error Handler and Pre-Commit Handler. Each handler gets its own section and includes descriptive text as well as sample code.

Chapter 7: Monitoring and Troubleshooting Streams

52 pages

This chapter is another very important one. It primarily covers the errors that can be raised by Streams and the views that can be used to monitor and troubleshoot. In addition to describing the views available and the data contained in them, the author also provides scripts to extract useful information. Some Capture examples are: Event Enqueuing Latency, Rule Evaluations, Elapsed Times, etc.

The views described and SQL provided cover Capture, Apply and Propagation.

On the troubleshooting side, the author covers errors that can occur and possible solutions.

Chapter 8: Down Streams Capture

12 pages

As described above, downstream capture is the process of copying redo logs to a non-critical database to lower the impact of Capture on the source database. This chapter covers that option in detail. The author discusses pros and cons of doing downstream capture and provides a step-by-step guide to configuration.

Chapter 9: Streams and Real Application Clusters

9 pages

This chapter does not go into great detail but does provide a good overview of topics of importance in a RAC configuration. The author also provides code examples for this scenario.

Chapter 10: Streams for Heterogeneous Replication

10 pages

This chapter provides an overview of using Streams with a non-Oracle destination using an Oracle Transparent Gateway. Configuration is covered and code is provided, but heterogeneous processing really requires a book to itself. If this topic is important to you, this chapter will provide a basic overview. Sybase is the example used.

Conclusion

I have written about this topic in the past, in Advanced Queues and Streams: A Definition in Plain English and Oracle Advanced Replication: A Definition in Plain English. The main difference between those two blog entries and this book is the level of detail. I spent about 5 or 6 pages on these topics and this book has almost 300 pages. That is the main differentiator between this book and others covering Oracle Streams.

After using Streams in real world applications, I realized this book was less than useful once you have gotten past the basics. If you are new to Streams, or if you are a developer writing distributed or messaging applications, this book would be a good place to start.

This book is worth picking up at US $16.95, but try to find it used for less. The one thing I can think of that would make this book even more valuable would be the addition of a sample application from beginning to end. The code examples that are provided are good, but a complete application would be better.


Contact: Lewis Cunningham
lewisc@databasewisdom.com
