Tuesday, April 21, 2020

Oracle Big Data SQL aka Cloud SQL

The post is intended to outline the Oracle Big Data SQL offering, more of an introduction to what it is and where can we find further information.

Introduction and terminology


Oracle Big Data SQL leverages Oracle cell smart-scan technology to fetch a large amount of data defined as an Oracle external table using Oracle Big Data SQL drivers (oracle_hdfs, oracle_hive, and for Big Data SQL (aka. BDSQL) 4.0, oracle_bigdata)

An Oracle query that involves such an external table triggers a describe call from an Oracle database foreground process to access metadata (from hadoop Hive Metastore / namenode) which includes granules (i.e., data blocks ) and location information. The granules are then batched and assigned to eligible multiple Parallel Queries Slaves, PQs (or a foreground process if a serialized query) which interface with Big Data SQL (or BDSQL) cells running in data nodes. There can be hundreds or thousands of granules involved in a single query. A query involving granules in all data nodes will be able to leverage all Big Data SQL cells running in data nodes, thus scale (i.e., parallelize) the query the best, and get the most benefits. After the data is fetched to the cell, the data can be filtered before sending back to the database, thus shortening the overall query elapsed time significantly.

Retries/Fallback


There are three replicas of each granule by default, each in different datanode. Each granule has three datanodes assigned, thus there will be three attempts to fetch each granule. If all of the three attempts (aka. retries) fail, the granule will have to fetched directly from the database (aka fallback mode), in which case, data will not be able to be filtered (impacting performance).

A small number of 'retries count' might not represent a problem. In any event, if the number keeps increasing, it indicates either the Big Data SQL cell memory is not large enough to handle the workload, or there are code issues / configuration issues.

Quarantines


With any internal error (i.e., ORA-600) occurs on any offload server thread, a quarantine is created in the Big Data SQL cell. The quarantine has the SQLID stored along with other attributes. Once a quarantine is created in a Big Data SQL cell, the same SQL (with the same SQLID) will not be allowed to be processed in that cell. The basic idea is that the same SQL will very likely cause the same internal error. Once there are three quarantines (within a certain time range), the Big Data SQL cell will not accept any further SQL from the database.

Each quarantine will cause an incident trace created in:/opt/oracle/bd_cell/log/diag/bdsql/cell/SYS*/incident/

Below is the working Architecture of Big Data SQL , which I will try to explain in later post.



Documentation 


          User Guide 

          Installation Guide



No comments:

Post a Comment