To create a partitioned external table, the database must be at 12.2.0 compatibility level or higher.
分区的外部表和存储在数据库中的分区表有点像,但不同的是外部表分区可以存储在文件系统上,如 Apache Hive storage ,或 Hadoop Distributed File System (HDFS) 。 所有外部表上的限制在分区的外部表上都有。 分区表上的所有限制,在分区的外部表上也都有。 Oracle 数据库不能保证分区的外部文件包含满足分区定义的数据。
For large amounts of data, partitioning for external tables provides fast query performance and enhanced data maintenance.
About Partitioning External Tables
Partitioning data in external tables is similar to partitioning tables stored in the database, but there are some differences. The files for the partitioned external table can be stored on a file system, in Apache Hive storage, or in a Hadoop Distributed File System (HDFS).Restrictions for Partitioned External Tables
Some restrictions apply to partitioned external tables.Creating a Partitioned External Table
You create a non-composite partitioned external table by issuing aCREATE TABLE
statement with theORGANIZATION EXTERNAL
clause and thePARTITION BY
clause. To create a composite partitioned external table, theSUBPARTITION BY
clause must also be included.Altering a Partitioned External Table
You can use theALTER TABLE
statement to modify table-level external parameters, but not the partition-level and subpartition-level parameters, of a partitioned external table.
20.14.6.1 About Partitioning External Tables
Partitioning data in external tables is similar to partitioning tables stored in the database, but there are some differences. The files for the partitioned external table can be stored on a file system, in Apache Hive storage, or in a Hadoop Distributed File System (HDFS).
Before attempting to partition external tables, you should understand the concepts related to partitioning in Oracle Database VLDB and Partitioning Guide .
The main reason to partition external tables is to take advantage of the same performance improvements provided by partitioning tables stored in the database. Specifically, partition pruning and partition-wise joins can improve query performance. Partition pruning means that queries can focus on a subset of the data in an external table instead of all of the data because the query can apply to only one partition. Partition-wise joins can be applied when two tables are being joined and both tables are partitioned on the join key, or when a reference partitioned table is joined with its parent table. Partition-wise joins break a large join into smaller joins that occur between each of the partitions, completing the overall join in less time.
Most of the partitioning strategies that are supported for tables in the database are supported for external tables. External tables can be partitioned by range or list, and composite partitioning is supported. However, hash partitioning is not supported for external tables.
For a partitioned table that is stored in the database, storage for each partition is specified with a tablespace. For a partitioned external table, storage for each partition is specified by indicating the directory and files for each partition.
Clauses for Creating Partitioned External Tables
The clauses for creating a non-partitioned external table are the following:
TYPE
- Specifies the access driver for the type of external table (ORACLE_LOADER
,ORACLE_DATAPUMP
,ORACLE_HIVE
, andORACLE_HDFS
).DEFAULT DIRECTORY
- Specifies with a directory object the default directory to use for all input and output files that do not explicitly name a directory object.ACCESS PARAMETERS
- Describe the external data source.LOCATION
- Specifies the files for the external table.REJECT LIMIT
- Specifies the number of errors that can occur during a query of the external data.
When you create a partitioned external table, you must include a PARTITION
clause that defines each partition. The following table describes the clauses allowed at each level during external table creation.
Table 20-6 External Table Clauses and Partitioning
Clause | Table Level | Partition Level | Subpartition Level |
---|---|---|---|
| Allowed | Not Allowed | Not Allowed |
| Allowed | Allowed | Allowed |
| Allowed | Not Allowed | Not Allowed |
| Not allowed | Allowed | Allowed |
| Allowed | Not allowed | Not allowed |
For a non-composite partitioned table, files for a partition must be specified in the LOCATION
clause for the partition. For a composite partitioned table, files for a subpartition must be specified in the LOCATION
clause for the subpartition. When a partition has subpartitions, the LOCATION
clause can be specified for subpartitions but not for the partition. If the LOCATION
clause is omitted for a partition or subpartition, then an empty partition or subpartition is created.
In the LOCATION
clause, the files are named in the form directory:file
, and one clause can specify multiple files. The directory
portion is optional. The following rules apply for the directory used by a partition or subpartition:
When a directory is specified in the
LOCATION
clause for a partition or subpartition, then it applies to that location only.In the
LOCATION
clause for a specific partition, for each file that does not have a directory specification, use the directory specified in theDEFAULT DIRECTORY
clause for the partition or table level, in order.For example, when the
ORGANIZATION EXTERNAL
clause of aCREATE TABLE
statement includes aDEFAULT DIRECTORY
clause, and aPARTITION
clause in the statement does not specify a directory for a file in itsLOCATION
clause, the file uses the directory specified in theDEFAULT DIRECTORY
clause for the table.In the
LOCATION
clause for a specific subpartition, for each file that does not have a directory specification, use the directory specified in theDEFAULT DIRECTORY
clause for the subpartition, partition, or table level, in order.For example, when a
PARTITION
clause includes aDEFAULT DIRECTORY
clause, and aSUBPARITION
clause in the partition does not specify a directory for a file in itsLOCATION
clause, the file uses the directory specified in theDEFAULT DIRECTORY
clause for the partition.The default directory for a partition or subpartition cannot be specified in a
LOCATION
clause. It can only be specified in aDEFAULT DIRECTORY
clause.
See Also:
Example 20-23 illustrates the directory rules
Using the ORACLE_HIVE Access Driver
Apache Hive has its own partitioning. To create partitioned external tables, use the CREATE_EXTDDL_FOR_HIVE
procedure in the DBMS_HADOOP
package. This procedure generates data definition language (DDL) statements that you can use to create a partitioned external table that corresponds with the partitioning in the Apache Hive storage.
The DBMS_HADOOP
package also includes the SYNC_PARTITIONS_FOR_HIVE
procedure. This procedure automatically synchronizes the partitioning of the partitioned external table in the Apache Hive storage with the partitioning metadata of the same table stored in the Oracle Database.
Related Topics
20.14.6.2 Restrictions for Partitioned External Tables
Some restrictions apply to partitioned external tables.
The following are restrictions for partitioned external tables:
All restrictions that apply to non-partitioned external tables also apply to partitioned external tables.
Partitioning restrictions that apply to tables stored in the database also apply to partitioned external tables, such as the maximum number of partitions.
Oracle Database cannot guarantee that the external files for partitions contain data that satisfies partitioning definitions.
Only the
DEFAULT DIRECTORY
andLOCATION
clauses can be specified in aPARTITION
orSUBPARTITION
clause.When altering a partitioned external table with the
ALTER TABLE
statement, the following clauses are not supported:MODIFY PARTITION
,EXCHANGE PARTITION
,MOVE PARTITION
,MERGE PARTITIONS
,SPLIT PARTITION
,COALESCE PARTITION
, andTRUNCATE PARTITION
.Reference partitioning, automatic list partitioning, and interval partitioning are not supported.
Subpartition templates are not supported.
The
ORACLE_DATAPUMP
access driver cannot populate external files for partitions using aCREATE TABLE AS SELECT
statement.Incremental statistics are not gathered for partitioned external tables.
In addition to restrictions on partitioning methods that can be used for the other drivers, range and composite partitioning are not supported for the
ORACLE_HIVE
access driver.A
SELECT
statement with theEXTERNAL MODIFY
clause cannot override partition-level or subpartition-level clauses. Only external clauses supported at the table level can be overridden with theEXTERNAL MODIFY
clause. Because theLOCATION
clause is not allowed at the table level for a partitioned external table, it cannot be overridden with theEXTERNAL MODIFY
clause.
See Also:
Oracle Database SQL Language Reference provides details of the syntax of the
CREATE TABLE
statement for creating external tables and specifies restrictions on the use of clauses
20.14.6.3 Creating a Partitioned External Table
You create a non-composite partitioned external table by issuing a CREATE TABLE
statement with the ORGANIZATION EXTERNAL
clause and the PARTITION BY
clause. To create a composite partitioned external table, the SUBPARTITION BY
clause must also be included.
The PARTITION BY
clause and the SUBPARTITION BY
clause specify the locations of the external files for each partition and subpartition.
To create a partitioned external table, the database must be at 12.2.0 compatibility level or higher.
Connect to the database as a user with the privileges required to create the external table.
See Oracle Database SQL Language Reference for information about the required privileges.
Issue a
CREATE TABLE
statement with theORGANIZATION EXTERNAL
clause and thePARTITION BY
clause. For a composite partitioned table, include theSUBPARTITION BY
clause also.
Example 20-22 Creating a Partitioned External Table with Access Parameters Common to All Partitions
This example creates an external table named orders_external_range
that is partitioned by the date data in the order_date
column. The ACCESS PARAMETERS
clause is specified at the table level for the ORACLE_LOADER
access driver. The data_dir1
directory object is the default directory object used for the partitions month1
, month2
, and month3
. The pmax
partition specifies the data_dir2
directory object in the DEFAULT DIRECTORY
clause, so the data_dir2
directory object is used for the pmax
partition.
-- Set up directories and grant access to oe CREATE OR REPLACE DIRECTORY data_dir1 AS '/flatfiles/data1'; CREATE OR REPLACE DIRECTORY data_dir2 AS '/flatfiles/data2'; CREATE OR REPLACE DIRECTORY bad_dir AS '/flatfiles/bad'; CREATE OR REPLACE DIRECTORY log_dir AS '/flatfiles/log'; GRANT READ ON DIRECTORY data_dir1 TO oe; GRANT READ ON DIRECTORY data_dir2 TO oe; GRANT WRITE ON DIRECTORY bad_dir TO oe; GRANT WRITE ON DIRECTORY log_dir TO oe; -- oe connects. Provide the user password (oe) when prompted. CONNECT oe -- create the partitioned external table CREATE TABLE orders_external_range( order_id NUMBER(12), order_date DATE NOT NULL, customer_id NUMBER(6) NOT NULL, order_status NUMBER(2), order_total NUMBER(8,2), sales_rep_id NUMBER(6)) ORGANIZATION EXTERNAL( TYPE ORACLE_LOADER DEFAULT DIRECTORY data_dir1 ACCESS PARAMETERS( RECORDS DELIMITED BY NEWLINE BADFILE bad_dir: 'sh%a_%p.bad' LOGFILE log_dir: 'sh%a_%p.log' FIELDS TERMINATED BY '|' MISSING FIELD VALUES ARE NULL)) PARALLEL REJECT LIMIT UNLIMITED PARTITION BY RANGE (order_date) (PARTITION month1 VALUES LESS THAN (TO_DATE('31-12-2014', 'DD-MM-YYYY')) LOCATION ('sales_1.csv'), PARTITION month2 VALUES LESS THAN (TO_DATE('31-01-2015', 'DD-MM-YYYY')) LOCATION ('sales_2.csv'), PARTITION month3 VALUES LESS THAN (TO_DATE('28-02-2015', 'DD-MM-YYYY')) LOCATION ('sales_3.csv'), PARTITION pmax VALUES LESS THAN (MAXVALUE) DEFAULT DIRECTORY data_dir2 LOCATION('sales_4.csv'));
In the previous example, the default directory data_dir2
is specified for the pmax
partition. You can also specify the directory for a specific location in this partition in the LOCATION
clause in the following way:
PARTITION pmax VALUES LESS THAN (MAXVALUE) LOCATION ('data_dir2:sales_4.csv')
Note that, in this case, the directory data_dir2
is specified for the location sales_4.csv
, but the data_dir2
directory is not the default directory for the partition. Therefore, the default directory for the pmax
partition is the same as the default directory for the table, which is data_dir1
.
Example 20-23 Creating a Composite List-Range Partitioned External Table
This example creates an external table named accounts
that is partitioned by the data in the region
column. This partition is subpartitioned using range on the data in the balance
column. The ACCESS PARAMETERS
clause is specified at the table level for the ORACLE_LOADER
access driver. A LOCATION
clause is specified for each subpartition.
There is a table-level DEFAULT DIRECTORY
clause set to the data_dir1
directory object, and this directory object is used for all of the subpartitions, except for the following:
There is a partition-level
DEFAULT DIRECTORY
clause set to thedata_dir2
directory object for partitionp_southcentral
. In that partition, the following subpartitions use this default directory:p_sc_low
,p_sc_high
, andp_sc_extraordinary
.In partition
p_southcentral
, the subpartitionp_sc_average
has a subpartition-levelDEFAULT DIRECTORY
clause set to thedata_dir3
directory object, and this subpartition uses thedata_dir3
directory object.As previously stated, the default directory for the
p_sc_high
subpartition isdata_dir2
. Thep_sc_high
subpartition does not have aDEFAULT DIRECTORY
clause, and the default directorydata_dir2
is inherited from theDEFAULT DIRECTORY
specified in thePARTITION BY
clause for the partitionp_southcentral
. The files in thep_sc_high
subpartition use the following directories:The
psch1.csv
file usesdata_dir2
, the default directory for the subpartition.The
psch2.csv
file uses thedata_dir4
directory because thedata_dir4
directory is specified for that location.
-- Set up the directories and grant access to oe CREATE OR REPLACE DIRECTORY data_dir1 AS '/stage/data1_dir'; CREATE OR REPLACE DIRECTORY data_dir2 AS '/stage/data2_dir'; CREATE OR REPLACE DIRECTORY data_dir3 AS '/stage/data3_dir'; CREATE OR REPLACE DIRECTORY data_dir4 AS '/stage/data4_dir'; CREATE OR REPLACE DIRECTORY bad_dir AS '/stage/bad_dir'; CREATE OR REPLACE DIRECTORY log_dir AS '/stage/log_dir'; GRANT READ ON DIRECTORY data_dir1 TO oe; GRANT READ ON DIRECTORY data_dir2 TO oe; GRANT READ ON DIRECTORY data_dir3 TO oe; GRANT READ ON DIRECTORY data_dir4 TO oe; GRANT WRITE ON DIRECTORY bad_dir TO oe; GRANT WRITE ON DIRECTORY log_dir TO oe; -- oe connects. Provide the user password (oe) when prompted. CONNECT oe -- create the partitioned external table CREATE TABLE accounts ( id NUMBER, account_number NUMBER, customer_id NUMBER, balance NUMBER, branch_id NUMBER, region VARCHAR(2), status VARCHAR2(1) ) ORGANIZATION EXTERNAL( TYPE ORACLE_LOADER DEFAULT DIRECTORY data_dir1 ACCESS PARAMETERS( RECORDS DELIMITED BY NEWLINE BADFILE bad_dir: 'sh%a_%p.bad' LOGFILE log_dir: 'sh%a_%p.log' FIELDS TERMINATED BY '|' MISSING FIELD VALUES ARE NULL)) PARALLEL REJECT LIMIT UNLIMITED PARTITION BY LIST (region) SUBPARTITION BY RANGE (balance) ( PARTITION p_northwest VALUES ('OR', 'WA') ( SUBPARTITION p_nw_low VALUES LESS THAN (1000) LOCATION ('pnwl.csv'), SUBPARTITION p_nw_average VALUES LESS THAN (10000) LOCATION ('pnwa.csv'), SUBPARTITION p_nw_high VALUES LESS THAN (100000) LOCATION ('pnwh.csv'), SUBPARTITION p_nw_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pnwe.csv') ), PARTITION p_southwest VALUES ('AZ', 'UT', 'NM') ( SUBPARTITION p_sw_low VALUES LESS THAN (1000) LOCATION ('pswl.csv'), SUBPARTITION p_sw_average VALUES LESS THAN (10000) LOCATION ('pswa.csv'), SUBPARTITION p_sw_high VALUES LESS THAN (100000) LOCATION ('pswh.csv'), SUBPARTITION p_sw_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pswe.csv') ), PARTITION p_northeast VALUES ('NY', 'VM', 'NJ') ( SUBPARTITION p_ne_low VALUES LESS THAN (1000) LOCATION ('pnel.csv'), SUBPARTITION p_ne_average VALUES LESS THAN (10000) LOCATION ('pnea.csv'), SUBPARTITION p_ne_high VALUES LESS THAN (100000) LOCATION ('pneh.csv'), SUBPARTITION p_ne_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pnee.csv') ), PARTITION p_southeast VALUES ('FL', 'GA') ( SUBPARTITION p_se_low VALUES LESS THAN (1000) LOCATION ('psel.csv'), SUBPARTITION p_se_average VALUES LESS THAN (10000) LOCATION ('psea.csv'), SUBPARTITION p_se_high VALUES LESS THAN (100000) LOCATION ('pseh.csv'), SUBPARTITION p_se_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('psee.csv') ), PARTITION p_northcentral VALUES ('SD', 'WI') ( SUBPARTITION p_nc_low VALUES LESS THAN (1000) LOCATION ('pncl.csv'), SUBPARTITION p_nc_average VALUES LESS THAN (10000) LOCATION ('pnca.csv'), SUBPARTITION p_nc_high VALUES LESS THAN (100000) LOCATION ('pnch.csv'), SUBPARTITION p_nc_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('pnce.csv') ), PARTITION p_southcentral VALUES ('OK', 'TX') DEFAULT DIRECTORY data_dir2 ( SUBPARTITION p_sc_low VALUES LESS THAN (1000) LOCATION ('pscl.csv'), SUBPARTITION p_sc_average VALUES LESS THAN (10000) DEFAULT DIRECTORY data_dir3 LOCATION ('psca.csv'), SUBPARTITION p_sc_high VALUES LESS THAN (100000) LOCATION ('psch1.csv','data_dir4:psch2.csv'), SUBPARTITION p_sc_extraordinary VALUES LESS THAN (MAXVALUE) LOCATION ('psce.csv') ) );
See Also:
Oracle Database VLDB and Partitioning Guide
20.14.6.4 Altering a Partitioned External Table
You can use the ALTER TABLE
statement to modify table-level external parameters, but not the partition-level and subpartition-level parameters, of a partitioned external table.
The locations of external files are specified in the PARTITION BY
and SUBPARTITION BY
clauses. External files for a partition are specified in the partition’s PARTITION BY
clause. External files for a subpartition are specified in the subpartition's SUBPARTITION BY
clause.
The only exception is that the LOCATION
clause cannot be specified at the table level during the creation of a partitioned external table. Therefore, the LOCATION
clause cannot be added at the table level in an ALTER TABLE
statement that modifies a partitioned external table.
At the partition level, only ADD
, DROP
, and RENAME
operations are supported. An ALTER TABLE
statement cannot modify the attributes of existing partitions and subpartitions. However, you can include the DEFAULT DIRECTORY
and LOCATION
clauses in a PARTITION
clause or SUBPARTITION
clause when you add a new partition or subpartition.
Connect to the database as a user with the privileges required to alter the external table.
Issue an
ALTER TABLE
statement.
Example 20-24 Renaming a Partition of a Partitioned External Table
This example renames a partition of the partitioned external table named orders_external_range
.
ALTER TABLE orders_external_range RENAME PARTITION pmax TO other_months;