Oracle Streams, High Speed Replication and Data Sharing
I originally wrote this review in Jan 2006. At the time I had done very little production Streams work. I have since done quite a bit of
work in the arena and my opinion of this book changed because of that. At the time, it seemed like it was loaded with in-depth
information. After using Streams, I realized how much was missing.
If you have never used Streams at all, this book is a decent, easy-to-read overview. You would get just as much out of reading the
Oracle Streams documentation, but this is a bit easier to follow.
Also, since the book was written in 2005, it is now out of date. If you can pick this book up at a used book store or on a site like
buy.com for less than $10, it's probably worth adding to your library. Otherwise, I wouldn't bother.
Title: Oracle Streams, High Speed Replication and Data Sharing
Author: Madhu Tumma
Publisher: Rampant Press
Publish Date: Feb 2005
ISBN: 0-9745993-5-2
Price: US $16.95
Pages: 289
This book covers all aspects of Oracle Streams including how to configure, monitor and use it.
The book includes a preface, 10 chapters, references and an index.
The chapters are:
* Preface -- Data & What is Streams Replication?
* Chapter 1: What is Streams? -- Introduction to Replication
* Chapter 2: Streams Components and Processes -- The Architecture of Streams
* Chapter 3: Streams Replication -- The OUT When, What & How
* Chapter 4: Capture and Propagate Configuration -- Database Nitty Gritty
* Chapter 5: Apply Process Configuration -- The IN When, What and How
* Chapter 6: Apply Handlers -- Code Time
* Chapter 7: Monitoring and Troubleshooting Streams -- SQL, Views and Errors
* Chapter 8: Down Streams Capture -- Remote Source Replication
* Chapter 9: Streams and Real Application Clusters -- Streams & RAC Overview
* Chapter 10: Streams for Heterogeneous Replication -- Oracle and Non-Oracle Data Sharing
Preface
5 pages
I wouldn't normally include the preface in a review. In most cases it's just a description of the book and the way it's laid
out with the occasional discussion of philosophy by the author. In Oracle Streams, Madhu Tumma opens with a really decent definition of
Streams and how it's different from Data Guard and RAC.
Chapter 1: What is Streams?
23 pages
Chapter 1 is the introductory chapter. The author covers data sharing and synchronization concepts and how Streams fits into those
concepts. He covers why data sharing is needed and how data sharing is impacted by very large databases (VLDB).
The need for data transformation is discussed briefly and an example scenario is presented.
He also discusses just what data replication is and why it's needed, including: to support global operations, site autonomy,
enhanced performance, and data availability and protection (failover).
This chapter explains the difference between synchronous and asynchronous replication. This section also describes two-phase commit
(2PC), issues with 2PC and how Streams is a simpler approach.
The next section in this chapter explains what Oracle Streams is, including a brief intro to the Streams architecture, and where to use
Streams. This is a really good discussion that gets a bit more into the differences between Streams, Data Guard and RAC. The author
points out that while Data Guard, RAC and Streams are different, Streams has incorporated some of the strengths of RAC and Data Guard.
Streams also allows PL/SQL user exits (Apply Handlers), which are not available in Data Guard (and don't really make sense for RAC).
The author explains that Streams technology is used in Message Queuing (via AQ), Event Messaging and Notification, Oracle Replication,
and Data Warehouse loading (via Change Data Capture).
Chapter 1 also provides a history of Streams evolution, Streams 10g new features and a little bit more information on Streams and AQ.
The chapter ends with a discussion of two other replication products: GoldenGate Data Synchronization Platform by GoldenGate Software
and Shareplex Data Replication by Quest Software. Each of these technologies gets about a page of detail.
By itself, this would be a great overview for anyone in your organization interested in Streams, data sharing or replication. If Rampant
Press provided this chapter for free as a PDF, I bet they would sell many more copies of this book and others in the series.
Chapter 2: Streams Components and Processes
25 pages
Chapter 2 is an introduction to the architecture of Oracle Streams. The first part covers the Producer/Consumer model. That is, there is
a producer database providing data and one or more consumer databases consuming that data. The Producer/Consumer model was introduced
into Oracle with AQ.
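As a rough illustration of the Producer/Consumer idea, here is a minimal sketch in plain Python. It is not Oracle AQ or Streams code, and every name in it (StagingQueue, reporting_db, backup_db) is hypothetical. It only shows one producer staging change messages and each consumer reading all of them at its own pace.

```python
from collections import deque


class StagingQueue:
    """Toy staging area: one producer, many independent consumers."""

    def __init__(self):
        # Hypothetical structure: consumer name -> messages it has not read yet.
        self._subscribers = {}

    def subscribe(self, name):
        self._subscribers[name] = deque()

    def enqueue(self, message):
        # Every subscribed consumer gets its own reference to the message.
        for pending in self._subscribers.values():
            pending.append(message)

    def dequeue(self, name):
        # Return the next message for this consumer, or None if it is caught up.
        pending = self._subscribers[name]
        return pending.popleft() if pending else None


queue = StagingQueue()
queue.subscribe("reporting_db")
queue.subscribe("backup_db")

# The producer stages two changes.
queue.enqueue({"op": "INSERT", "table": "emp", "row": {"id": 1}})
queue.enqueue({"op": "DELETE", "table": "emp", "row": {"id": 1}})

# Each consumer reads from the start, independently of the other.
first_for_reporting = queue.dequeue("reporting_db")
first_for_backup = queue.dequeue("backup_db")
```

A slow consumer does not hold up a fast one here, which is the property that makes the model attractive for replication to more than one destination.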
The author provides good detail on the flow of data in Streams and the Streams Clients. These clients are the entities that capture
data, move data around, and store/manipulate the data.
My favorite part of this chapter is the discussion of queues. Queues are a key component of Streams. This chapter explains HOW those
queues are used. It also covers secure queues, the typed queue and the AnyData queue. User applications would use transactional queues.
This section also defines enqueuing and dequeuing and how and when enqueues and dequeues are called.
The capture process is covered in some detail, including almost a full page about buffered queues and how they help performance. Logical
Change Records (LCRs), which are built from the redo logs, are also discussed.
The author also explains how LogMiner is used in the capture process and what the differences are between Hot Mining and Cold Mining.
Since Streams uses the redo logs for capture, additional information called "supplemental logging" is required. This chapter
provides very detailed information about this additional logging as well as configuration of this logging.
The latter half of this chapter is an introduction to propagation, propagation rules and the apply process. The main features of the
apply process are discussed, as are the four custom apply handlers: DML Handler, DDL Handler, Message Handler and Pre-Commit Handler.
The chapter ends with an overview of the Rules Engine and Rule Based Transformations.
Chapter 3: Streams Replication
22 pages
This chapter covers the specifics of data replication, rather than Streams in general.
The author explains what database replication is and how DDL and DML differ. Streams can handle both kinds of replication. This chapter
explains how background processes in the source database capture DML and DDL from the redo log. These changes are propagated to, and
applied in, destination databases.
A fairly detailed explanation of "Downstream Capture" is provided. Downstream capture is the process of copying redo logs to a
non-critical database so that the capture process will not impact performance. The author provides an excellent explanation of the
requirements and configuration of downstream capture.
The author lists the types of DML that are replicated (Insert, Update, Delete, Merge and Updates of LOBs) and covers the types of
DDL activity that are not replicated. He makes an important note here:
A Capture process can capture DDL statements, but not the result of DDL statements, unless the DDL statement is a CREATE TABLE AS SELECT
statement.
He goes on to use ANALYZE as an example. The analyze itself can be captured but the statistics generated would not be.
He also makes the point that by using nologging (for SQL) and unrecoverable (for SQL*Loader), the capture process will not see those
changes. DBAs use these keywords to improve performance.
Additional configuration of supplemental logging and object instantiation are covered.
Streams has a feature called tags. The author explains what tags are and how they can help identify the session that made a change.
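To make the idea concrete, here is a minimal Python sketch of how a tag attached to each change could let a capture rule tell ordinary local changes apart from changes made by a replication session. This is not Oracle code: the tag values, the rule and all function names are assumptions made up for illustration.

```python
LOCAL_TAG = None        # assumption: ordinary user sessions leave the tag unset
APPLIED_TAG = b"\x00"   # assumption: the replication session marks its changes


def make_change(row, session_tag=LOCAL_TAG):
    # Each change carries the tag of the session that made it.
    return {"row": row, "tag": session_tag}


def should_capture(change):
    # Hypothetical rule: only capture changes that originated locally.
    return change["tag"] is None


change_log = [
    make_change({"id": 1}),                           # made by a local user
    make_change({"id": 2}, session_tag=APPLIED_TAG),  # applied from another site
]

captured = [c["row"]["id"] for c in change_log if should_capture(c)]
```

With a rule like this, a change that arrived from another site is not captured again and sent back to where it came from.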
Chapter 3 ends with a discussion of multi-way replication and conflict resolution. He details the four types of conflict: Update,
Uniqueness, Delete and Foreign Key. The author notes that each of these conflicts is automatically handled by placing the errors in an
error queue unless a custom error handler has been written. He also makes the point that good design can alleviate some conflict issues
and he points out the pre-built conflict handlers.
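The four conflict types are easier to picture with a toy example. The Python sketch below is a simplified model, not Oracle's implementation: the table is a dict keyed by primary key, the dept_id column and the apply_change function are hypothetical, and any change that conflicts is appended to an error list standing in for the error queue.

```python
def apply_change(table, parent_keys, change, error_queue):
    """Apply one change to `table` (dict: key -> row) or queue it as an error."""
    op, key = change["op"], change["key"]
    conflict = None

    if op == "INSERT" and key in table:
        # The key already exists at the destination.
        conflict = "uniqueness"
    elif op in ("UPDATE", "DELETE") and key not in table:
        # The row to change is already gone at the destination.
        conflict = "delete"
    elif op == "UPDATE" and table[key] != change["old"]:
        # The row no longer looks the way the source saw it.
        conflict = "update"
    elif op in ("INSERT", "UPDATE") and change["new"]["dept_id"] not in parent_keys:
        # The new row points at a parent that does not exist here.
        conflict = "foreign key"

    if conflict:
        error_queue.append((conflict, change))
        return False

    if op == "DELETE":
        del table[key]
    else:
        table[key] = change["new"]
    return True


departments = {10}
employees = {1: {"name": "Ann", "dept_id": 10}}
errors = []

changes = [
    {"op": "INSERT", "key": 1, "new": {"name": "Bob", "dept_id": 10}},
    {"op": "UPDATE", "key": 1,
     "old": {"name": "Anne", "dept_id": 10},
     "new": {"name": "Ann B.", "dept_id": 10}},
    {"op": "DELETE", "key": 2},
    {"op": "INSERT", "key": 3, "new": {"name": "Cy", "dept_id": 99}},
    {"op": "INSERT", "key": 4, "new": {"name": "Di", "dept_id": 10}},
]

results = [apply_change(employees, departments, c, errors) for c in changes]
conflict_types = [kind for kind, _ in errors]
```

In this run only the last change applies cleanly. The first four land in the error list, one for each conflict type, which mirrors the default behavior the author describes when no custom handler is in place.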
Chapter 4: Capture and Propagate Configuration
64 pages
This is the chapter for DBAs. This chapter goes into great detail about the configuration of both the Capture and Propagate processes.
This chapter alone makes the book worthwhile.
The chapter starts with capture environment setup: init.ora, streams pool, LogMiner, streams administrator, supplemental logging,
database links and setting up the queues for staging and propagation. I won't go into detail on each of these except to say that the
author does a very good job covering the important details.
The beginning of this chapter also includes the steps required for capture configuration, the capture architecture and capture
rules. This section is rich with code (SQL and PL/SQL) and actual examples.
The author also covers adding new objects to an existing configuration, something that I know from personal experience can be a pain. He
provides a step by step guide.
The final part of this chapter is dedicated to the propagation process, including enabling, disabling, dropping and altering propagation.
Chapter 5: Apply Process Configuration
33 pages
The apply process is the Streams automated way of getting data INTO the destination database. This is an optional process (you can write
your own dequeue process if you want) and is implemented as a background process.
The author provides a nice flow diagram of the apply process. Apply rules are covered.
The apply process components are covered: Reader Server, Coordinator Process, and the Apply Servers.
The bulk of this chapter covers creating and managing an apply process and, like the previous chapter, is code intensive (SQL and PL/SQL).
The Streams API is described, via syntax examples, in detail.
Chapter 6: Apply Handlers
22 pages
This chapter covers the four custom Apply Handlers: DDL Handler, DML Handler, Error Handler and Pre-Commit Handler. Each handler gets
its own section and includes descriptive text as well as sample code.
Chapter 7: Monitoring and Troubleshooting Streams
52 pages
This chapter is another very important one. It primarily covers the errors that can be raised by Streams and the views that can be used to
monitor and troubleshoot. In addition to describing the views available and the data contained in them, the author also provides scripts
to extract useful information. Some Capture examples are: Event Enqueuing Latency, Rule Evaluations, Elapsed Times, etc.
The views described and SQL provided cover Capture, Apply and Propagation.
On the troubleshooting side, the author covers errors that can occur and possible solutions.
Chapter 8: Down Streams Capture
12 pages
As described above, downstream capture is the process of copying redo logs to a non-critical database to lower the impact of Capture on
the source database. This chapter covers that option in detail. The author discusses pros and cons of doing downstream capture and provides a
step by step guide to configuration.
Chapter 9: Streams and Real Application Clusters
9 pages
This chapter does not go into great detail but does provide a good overview of topics of importance in a RAC configuration. The author
also provides code examples for this scenario.
Chapter 10: Streams for Heterogeneous Replication
10 pages
This chapter provides an overview of using Streams with a non-Oracle destination using an Oracle Transparent Gateway. Configuration is
covered and code is provided, but heterogeneous processing really requires a book to itself. If this topic is important to you, this
chapter will provide a basic overview. Sybase is the example used.
Conclusion
I have written about this topic in the past, in Advanced Queues and Streams: A Definition in Plain English and Oracle Advanced Replication:
A Definition in Plain English. The main difference between those two blog entries and this book is the level of detail. I spent about 5
or 6 pages on these topics and this book has almost 300 pages. That is the main differentiator between this book and others covering
Oracle Streams.
After using Streams in real world applications, I realized this book was less than useful once you have gotten past the basics. If you
are new to Streams, or if you are a developer writing distributed or messaging applications, this book would be a good place to start.
This book is worth picking up at USD16.95 but try to find it used for less. The one thing I can think of that would make this book even
more valuable would be the addition of a sample application from beginning to end. The code examples that are provided are good, but a
complete application would be better.