To create a partitioned external table, the database must be at 12.2.0 compatibility level or higher.

分区的外部表和存储在数据库中的分区表有点像,但不同的是外部表分区可以存储在文件系统上,如    Apache Hive storage   ,或   Hadoop Distributed File System (HDFS)   。   所有外部表上的限制在分区的外部表上都有。   分区表上的所有限制,在分区的外部表上也都有。   Oracle   数据库不能保证分区的外部文件包含满足分区定义的数据。



For large amounts of data, partitioning for external tables provides fast query performance and enhanced data maintenance.

  • About Partitioning External Tables  
    Partitioning data in external tables is similar to partitioning tables stored in the database, but there are some differences. The files for the partitioned external table can be stored on a file system, in Apache Hive storage, or in a Hadoop Distributed File System (HDFS).

  • Restrictions for Partitioned External Tables  
    Some restrictions apply to partitioned external tables.

  • Creating a Partitioned External Table  
    You create a non-composite partitioned external table by issuing a   CREATE TABLE   statement with the   ORGANIZATION EXTERNAL   clause and the   PARTITION BY   clause. To create a composite partitioned external table, the   SUBPARTITION BY   clause must also be included.

  • Altering a Partitioned External Table  
    You can use the   ALTER TABLE   statement to modify table-level external parameters, but not the partition-level and subpartition-level parameters, of a partitioned external table.

20.14.6.1   About Partitioning External Tables

Partitioning data in external tables is similar to partitioning tables stored in the database, but there are some differences. The files for the partitioned external table can be stored on a file system, in Apache Hive storage, or in a Hadoop Distributed File System (HDFS).

Before attempting to partition external tables, you should understand the concepts related to partitioning in Oracle Database VLDB and Partitioning Guide.

The main reason to partition external tables is to take advantage of the same performance improvements provided by partitioning tables stored in the database. Specifically, partition pruning and partition-wise joins can improve query performance. Partition pruning means that queries can focus on a subset of the data in an external table instead of all of the data because the query can apply to only one partition. Partition-wise joins can be applied when two tables are being joined and both tables are partitioned on the join key, or when a reference partitioned table is joined with its parent table. Partition-wise joins break a large join into smaller joins that occur between each of the partitions, completing the overall join in less time.

Most of the partitioning strategies that are supported for tables in the database are supported for external tables. External tables can be partitioned by range or list, and composite partitioning is supported. However, hash partitioning is not supported for external tables.

For a partitioned table that is stored in the database, storage for each partition is specified with a tablespace. For a partitioned external table, storage for each partition is specified by indicating the directory and files for each partition.

Clauses for Creating Partitioned External Tables

The clauses for creating a non-partitioned external table are the following:

  • TYPE   - Specifies the access driver for the type of external table (  ORACLE_LOADER  ,   ORACLE_DATAPUMP  ,   ORACLE_HIVE  , and   ORACLE_HDFS  ).

  • DEFAULT DIRECTORY   - Specifies with a directory object the default directory to use for all input and output files that do not explicitly name a directory object. 

  • ACCESS PARAMETERS   - Describe the external data source.

  • LOCATION   - Specifies the files for the external table.

  • REJECT LIMIT   - Specifies the number of errors that can occur during a query of the external data.

When you create a partitioned external table, you must include a   PARTITION   clause that defines each partition. The following table describes the clauses allowed at each level during external table creation.

Table 20-6 External Table Clauses and Partitioning

ClauseTable LevelPartition LevelSubpartition Level

TYPE

Allowed

Not Allowed

Not Allowed

DEFAULT DIRECTORY

Allowed

Allowed

Allowed

ACCESS PARAMETERS

Allowed

Not Allowed

Not Allowed

LOCATION

Not allowed

Allowed

Allowed

REJECT LIMIT

Allowed

Not allowed

Not allowed

For a non-composite partitioned table, files for a partition must be specified in the   LOCATION   clause for the partition. For a composite partitioned table, files for a subpartition must be specified in the   LOCATION   clause for the subpartition. When a partition has subpartitions, the   LOCATION   clause can be specified for subpartitions but not for the partition. If the   LOCATION   clause is omitted for a partition or subpartition, then an empty partition or subpartition is created.

In the   LOCATION   clause, the files are named in the form    directory:file   , and one clause can specify multiple files. The    directory    portion is optional. The following rules apply for the directory used by a partition or subpartition:

  • When a directory is specified in the   LOCATION   clause for a partition or subpartition, then it applies to that location only.

  • In the   LOCATION   clause for a specific partition, for each file that does not have a directory specification, use the directory specified in the   DEFAULT DIRECTORY   clause for the partition or table level, in order.

    For example, when the   ORGANIZATION EXTERNAL   clause of a   CREATE TABLE   statement includes a   DEFAULT DIRECTORY   clause, and a   PARTITION   clause in the statement does not specify a directory for a file in its   LOCATION   clause, the file uses the directory specified in the   DEFAULT DIRECTORY   clause for the table.

  • In the   LOCATION   clause for a specific subpartition, for each file that does not have a directory specification, use the directory specified in the   DEFAULT DIRECTORY   clause for the subpartition, partition, or table level, in order.

    For example, when a   PARTITION   clause includes a   DEFAULT DIRECTORY   clause, and a   SUBPARITION   clause in the partition does not specify a directory for a file in its   LOCATION   clause, the file uses the directory specified in the   DEFAULT DIRECTORY   clause for the partition.

  • The default directory for a partition or subpartition cannot be specified in a   LOCATION   clause. It can only be specified in a   DEFAULT DIRECTORY   clause.

See Also:

Example 20-23   illustrates the directory rules

Using the ORACLE_HIVE Access Driver

Apache Hive has its own partitioning. To create partitioned external tables, use the   CREATE_EXTDDL_FOR_HIVE   procedure in the   DBMS_HADOOP   package. This procedure generates data definition language (DDL) statements that you can use to create a partitioned external table that corresponds with the partitioning in the Apache Hive storage.

The   DBMS_HADOOP   package also includes the   SYNC_PARTITIONS_FOR_HIVE   procedure. This procedure automatically synchronizes the partitioning of the partitioned external table in the Apache Hive storage with the partitioning metadata of the same table stored in the Oracle Database.

Related Topics

  • Altering External Tables

  • Oracle Database Utilities

  • Oracle Database PL/SQL Packages and Types Reference

20.14.6.2   Restrictions for Partitioned External Tables

Some restrictions apply to partitioned external tables.

The following are restrictions for partitioned external tables:

  • All restrictions that apply to non-partitioned external tables also apply to partitioned external tables.

  • Partitioning restrictions that apply to tables stored in the database also apply to partitioned external tables, such as the maximum number of partitions.

  • Oracle Database cannot guarantee that the external files for partitions contain data that satisfies partitioning definitions.

  • Only the   DEFAULT DIRECTORY   and   LOCATION   clauses can be specified in a   PARTITION   or   SUBPARTITION   clause.

  • When altering a partitioned external table with the   ALTER TABLE   statement, the following clauses are not supported:   MODIFY PARTITION  ,   EXCHANGE PARTITION  ,   MOVE PARTITION  ,   MERGE PARTITIONS  ,   SPLIT PARTITION  ,   COALESCE PARTITION  , and   TRUNCATE PARTITION  .

  • Reference partitioning, automatic list partitioning, and interval partitioning are not supported.

  • Subpartition templates are not supported.

  • The   ORACLE_DATAPUMP   access driver cannot populate external files for partitions using a   CREATE TABLE AS SELECT  statement.

  • Incremental statistics are not gathered for partitioned external tables.

  • In addition to restrictions on partitioning methods that can be used for the other drivers, range and composite partitioning are not supported for the   ORACLE_HIVE   access driver.

  • A   SELECT   statement with the   EXTERNAL MODIFY   clause cannot override partition-level or subpartition-level clauses. Only external clauses supported at the table level can be overridden with the   EXTERNAL MODIFY   clause. Because the   LOCATION  clause is not allowed at the table level for a partitioned external table, it cannot be overridden with the   EXTERNAL MODIFY  clause.

See Also:

  • "  About External Tables  "

  • Oracle Database SQL Language Reference    provides details of the syntax of the   CREATE TABLE   statement for creating external tables and specifies restrictions on the use of clauses

20.14.6.3   Creating a Partitioned External Table

You create a non-composite partitioned external table by issuing a   CREATE TABLE   statement with the   ORGANIZATION EXTERNAL   clause and the   PARTITION BY   clause. To create a composite partitioned external table, the   SUBPARTITION BY  clause must also be included.

The   PARTITION BY   clause and the   SUBPARTITION BY   clause specify the locations of the external files for each partition and subpartition.

To create a partitioned external table, the database must be at 12.2.0 compatibility level or higher.

  1. Connect to the database as a user with the privileges required to create the external table.

    See Oracle Database SQL Language Reference for information about the required privileges.

  2. Issue a   CREATE TABLE   statement with the   ORGANIZATION EXTERNAL   clause and the   PARTITION BY   clause. For a composite partitioned table, include the   SUBPARTITION BY   clause also.

Example 20-22 Creating a Partitioned External Table with Access Parameters Common to All Partitions

This example creates an external table named   orders_external_range   that is partitioned by the date data in the   order_date   column. The   ACCESS PARAMETERS   clause is specified at the table level for the   ORACLE_LOADER   access driver. The   data_dir1   directory object is the default directory object used for the partitions   month1  ,   month2  , and   month3  . The   pmax  partition specifies the   data_dir2   directory object in the   DEFAULT DIRECTORY   clause, so the   data_dir2   directory object is used for the   pmax   partition.

-- Set up directories and grant access to oe
CREATE OR REPLACE DIRECTORY data_dir1
    AS '/flatfiles/data1'; 
CREATE OR REPLACE DIRECTORY data_dir2 
    AS '/flatfiles/data2'; 
CREATE OR REPLACE DIRECTORY bad_dir 
    AS '/flatfiles/bad'; 
CREATE OR REPLACE DIRECTORY log_dir 
    AS '/flatfiles/log'; 
GRANT READ ON DIRECTORY data_dir1 TO oe; 
GRANT READ ON DIRECTORY data_dir2 TO oe; 
GRANT WRITE ON DIRECTORY bad_dir TO oe; 
GRANT WRITE ON DIRECTORY log_dir TO oe; 
-- oe connects. Provide the user password (oe) when prompted.
CONNECT oe
-- create the partitioned external table
CREATE TABLE orders_external_range(
    order_id           NUMBER(12),
    order_date         DATE NOT NULL,
    customer_id        NUMBER(6) NOT NULL,
    order_status       NUMBER(2),
    order_total        NUMBER(8,2),
    sales_rep_id       NUMBER(6))
ORGANIZATION EXTERNAL(
   TYPE ORACLE_LOADER
   DEFAULT DIRECTORY data_dir1
   ACCESS PARAMETERS(
      RECORDS DELIMITED BY NEWLINE
      BADFILE bad_dir: 'sh%a_%p.bad'
      LOGFILE log_dir: 'sh%a_%p.log'
      FIELDS TERMINATED BY '|'
      MISSING FIELD VALUES ARE NULL))
PARALLEL
REJECT LIMIT UNLIMITED
PARTITION BY RANGE (order_date)
  (PARTITION month1 VALUES LESS THAN (TO_DATE('31-12-2014', 'DD-MM-YYYY'))
        LOCATION ('sales_1.csv'),
    PARTITION month2 VALUES LESS THAN (TO_DATE('31-01-2015', 'DD-MM-YYYY'))
        LOCATION ('sales_2.csv'),
    PARTITION month3 VALUES LESS THAN (TO_DATE('28-02-2015', 'DD-MM-YYYY'))
        LOCATION ('sales_3.csv'),
    PARTITION pmax VALUES LESS THAN (MAXVALUE)
        DEFAULT DIRECTORY data_dir2 LOCATION('sales_4.csv'));

 

In the previous example, the default directory   data_dir2   is specified for the   pmax   partition. You can also specify the directory for a specific location in this partition in the   LOCATION   clause in the following way:

PARTITION pmax VALUES LESS THAN (MAXVALUE)
   LOCATION ('data_dir2:sales_4.csv')

 

Note that, in this case, the directory   data_dir2   is specified for the location   sales_4.csv  , but the   data_dir2   directory is not the default directory for the partition. Therefore, the default directory for the   pmax   partition is the same as the default directory for the table, which is   data_dir1  .

Example 20-23 Creating a Composite List-Range Partitioned External Table

This example creates an external table named   accounts   that is partitioned by the data in the   region   column. This partition is subpartitioned using range on the data in the   balance   column. The   ACCESS PARAMETERS   clause is specified at the table level for the   ORACLE_LOADER   access driver. A   LOCATION   clause is specified for each subpartition.

There is a table-level   DEFAULT DIRECTORY   clause set to the   data_dir1   directory object, and this directory object is used for all of the subpartitions, except for the following:

  • There is a partition-level   DEFAULT DIRECTORY   clause set to the   data_dir2   directory object for partition   p_southcentral  . In that partition, the following subpartitions use this default directory:   p_sc_low  ,   p_sc_high  , and   p_sc_extraordinary  .

  • In partition   p_southcentral  , the subpartition   p_sc_average   has a subpartition-level   DEFAULT DIRECTORY   clause set to the   data_dir3   directory object, and this subpartition uses the   data_dir3   directory object.

  • As previously stated, the default directory for the   p_sc_high   subpartition is   data_dir2  . The   p_sc_high   subpartition does not have a   DEFAULT DIRECTORY   clause, and the default directory   data_dir2   is inherited from the   DEFAULT DIRECTORY   specified in the   PARTITION BY   clause for the partition   p_southcentral  . The files in the   p_sc_high  subpartition use the following directories:

    • The   psch1.csv   file uses   data_dir2  , the default directory for the subpartition.

    • The   psch2.csv   file uses the   data_dir4   directory because the   data_dir4   directory is specified for that location.

-- Set up the directories and grant access to oe
CREATE OR REPLACE DIRECTORY data_dir1
    AS '/stage/data1_dir';
CREATE OR REPLACE DIRECTORY data_dir2
    AS '/stage/data2_dir';
CREATE OR REPLACE DIRECTORY data_dir3
    AS '/stage/data3_dir';
CREATE OR REPLACE DIRECTORY data_dir4
    AS '/stage/data4_dir';
CREATE OR REPLACE DIRECTORY bad_dir 
    AS '/stage/bad_dir'; 
CREATE OR REPLACE DIRECTORY log_dir 
    AS '/stage/log_dir';
GRANT READ ON DIRECTORY data_dir1 TO oe;
GRANT READ ON DIRECTORY data_dir2 TO oe;
GRANT READ ON DIRECTORY data_dir3 TO oe;
GRANT READ ON DIRECTORY data_dir4 TO oe;
GRANT WRITE ON DIRECTORY bad_dir TO oe; 
GRANT WRITE ON DIRECTORY log_dir TO oe; 
-- oe connects. Provide the user password (oe) when prompted.
CONNECT oe
-- create the partitioned external table
CREATE TABLE accounts
( id             NUMBER,
  account_number NUMBER,
  customer_id    NUMBER,
  balance        NUMBER,
  branch_id      NUMBER,
  region         VARCHAR(2),
  status         VARCHAR2(1)
)
ORGANIZATION EXTERNAL(
   TYPE ORACLE_LOADER
   DEFAULT DIRECTORY data_dir1
   ACCESS PARAMETERS(
      RECORDS DELIMITED BY NEWLINE
      BADFILE bad_dir: 'sh%a_%p.bad'
      LOGFILE log_dir: 'sh%a_%p.log'
      FIELDS TERMINATED BY '|'
      MISSING FIELD VALUES ARE NULL))
PARALLEL
REJECT LIMIT UNLIMITED
PARTITION BY LIST (region)
SUBPARTITION BY RANGE (balance)
( PARTITION p_northwest VALUES ('OR', 'WA') 
  ( SUBPARTITION p_nw_low VALUES LESS THAN (1000) LOCATION ('pnwl.csv'),
    SUBPARTITION p_nw_average VALUES LESS THAN (10000) LOCATION ('pnwa.csv'),
    SUBPARTITION p_nw_high VALUES LESS THAN (100000) LOCATION ('pnwh.csv'),
    SUBPARTITION p_nw_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pnwe.csv')
  ),
  PARTITION p_southwest VALUES ('AZ', 'UT', 'NM')
  ( SUBPARTITION p_sw_low VALUES LESS THAN (1000) LOCATION ('pswl.csv'),
    SUBPARTITION p_sw_average VALUES LESS THAN (10000) LOCATION ('pswa.csv'),
    SUBPARTITION p_sw_high VALUES LESS THAN (100000) LOCATION ('pswh.csv'),
    SUBPARTITION p_sw_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pswe.csv')
  ),
  PARTITION p_northeast VALUES ('NY', 'VM', 'NJ')
  ( SUBPARTITION p_ne_low VALUES LESS THAN (1000) LOCATION ('pnel.csv'),
    SUBPARTITION p_ne_average VALUES LESS THAN (10000) LOCATION ('pnea.csv'),
    SUBPARTITION p_ne_high VALUES LESS THAN (100000) LOCATION ('pneh.csv'),
    SUBPARTITION p_ne_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pnee.csv')
  ),
  PARTITION p_southeast VALUES ('FL', 'GA')
  ( SUBPARTITION p_se_low VALUES LESS THAN (1000) LOCATION ('psel.csv'),
    SUBPARTITION p_se_average VALUES LESS THAN (10000) LOCATION ('psea.csv'),
    SUBPARTITION p_se_high VALUES LESS THAN (100000) LOCATION ('pseh.csv'),
    SUBPARTITION p_se_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('psee.csv')
  ),
  PARTITION p_northcentral VALUES ('SD', 'WI')
  ( SUBPARTITION p_nc_low VALUES LESS THAN (1000) LOCATION ('pncl.csv'),
    SUBPARTITION p_nc_average VALUES LESS THAN (10000) LOCATION ('pnca.csv'),
    SUBPARTITION p_nc_high VALUES LESS THAN (100000) LOCATION ('pnch.csv'),
    SUBPARTITION p_nc_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pnce.csv')
  ),
  PARTITION p_southcentral VALUES ('OK', 'TX') DEFAULT DIRECTORY data_dir2
  ( SUBPARTITION p_sc_low VALUES LESS THAN (1000) LOCATION ('pscl.csv'),
    SUBPARTITION p_sc_average VALUES LESS THAN (10000) 
       DEFAULT DIRECTORY data_dir3 LOCATION ('psca.csv'),
    SUBPARTITION p_sc_high VALUES LESS THAN (100000) 
       LOCATION ('psch1.csv','data_dir4:psch2.csv'),
    SUBPARTITION p_sc_extraordinary VALUES LESS THAN (MAXVALUE) 
       LOCATION ('psce.csv')
  )
);

 

See Also:

Oracle Database VLDB and Partitioning Guide

20.14.6.4   Altering a Partitioned External Table

You can use the   ALTER TABLE   statement to modify table-level external parameters, but not the partition-level and subpartition-level parameters, of a partitioned external table.

The locations of external files are specified in the   PARTITION BY   and   SUBPARTITION BY   clauses. External files for a partition are specified in the partition’s   PARTITION BY   clause. External files for a subpartition are specified in the subpartition's   SUBPARTITION BY   clause.

The only exception is that the   LOCATION   clause cannot be specified at the table level during the creation of a partitioned external table. Therefore, the   LOCATION   clause cannot be added at the table level in an   ALTER TABLE   statement that modifies a partitioned external table.

At the partition level, only   ADD  ,   DROP  , and   RENAME   operations are supported. An   ALTER TABLE   statement cannot modify the attributes of existing partitions and subpartitions. However, you can include the   DEFAULT DIRECTORY   and   LOCATION   clauses in a   PARTITION   clause or   SUBPARTITION   clause when you add a new partition or subpartition.

  1. Connect to the database as a user with the privileges required to alter the external table.

  2. Issue an   ALTER TABLE   statement.

Example 20-24 Renaming a Partition of a Partitioned External Table

This example renames a partition of the partitioned external table named   orders_external_range  .

ALTER TABLE orders_external_range RENAME PARTITION pmax TO other_months;