Hive CREATE AS
Hive is a data warehouse infrastructure built on top of Hadoop that provides tools for data summarization, querying, and analysis. One of its useful features is the ability to create a table directly from a query with the CREATE TABLE ... AS SELECT statement (often abbreviated CTAS, and referred to as CREATE AS in this article). The new table takes its columns and rows from the results of the query.
Syntax
The syntax for the CREATE AS statement is as follows:
CREATE TABLE new_table_name AS
SELECT column1, column2, ...
FROM existing_table_name
WHERE condition;
- new_table_name is the name of the new table that will be created.
- column1, column2, ... are the columns that will be selected from the existing table. To include all columns, use *.
- existing_table_name is the name of the existing table from which the data will be selected.
- condition is the predicate of the WHERE clause. The WHERE clause is optional; when present, only the rows that satisfy condition are copied into the new table.
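As a minimal sketch of the two variations described above, the statement below selects every column with * and omits the optional WHERE clause, so all rows are copied. The names new_table_name and existing_table_name are the same placeholders used in the syntax, not real tables.

```sql
-- Copy all columns and all rows of an existing table into a new table.
-- new_table_name and existing_table_name are placeholders.
CREATE TABLE new_table_name AS
SELECT *
FROM existing_table_name;
```

The statement fails if a table named new_table_name already exists, so pick a name that is not yet in use.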
Example
Let's say we have a table called employees with the following data:
id | name | age | salary |
---|---|---|---|
1 | John | 30 | 50000 |
2 | Alice | 25 | 40000 |
3 | Bob | 35 | 60000 |
Now, we want to create a new table called top_earning_employees to store the employees with a salary greater than 45000. We can use the CREATE AS statement to achieve this:
CREATE TABLE top_earning_employees AS
SELECT id, name, age, salary
FROM employees
WHERE salary > 45000;
The above query will create a new table top_earning_employees with the same four columns and the following rows:
id | name | age | salary |
---|---|---|---|
1 | John | 30 | 50000 |
3 | Bob | 35 | 60000 |
The new table will only contain the rows where the salary is greater than 45000.
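To check the result, one possible follow-up is to inspect the new table's columns and then read its rows back. This is only a sketch that assumes the employees data shown above. The column types reported by DESCRIBE depend on how employees was defined, which this article does not specify.

```sql
-- List the column names and types that Hive derived from the SELECT.
DESCRIBE top_earning_employees;

-- Read the copied rows back; with the sample data these are the rows for John and Bob.
SELECT id, name, age, salary
FROM top_earning_employees
ORDER BY id;
```

Keep in mind that the new table is a copy made when the statement runs. Rows added to employees afterwards do not appear in top_earning_employees.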
Benefits of Using CREATE AS
The CREATE AS statement in Hive offers several benefits:
- Efficiency: By storing the results of a query in a new table, you avoid re-running the same calculations or transformations every time those results are needed.
- Data Manipulation: The CREATE AS statement allows you to manipulate and transform the data during the creation of the new table. You can use functions, aggregations, and filters to extract only the necessary data.
- Data Security: Creating a new table with a subset of the data from an existing table can help limit exposure of sensitive data. You can give users access to the new table only, so they see just the information they need.
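The data manipulation and data security points can be illustrated with two short sketches against the employees table from the example. The table names salary_summary and employee_directory and the role name analyst_role are hypothetical. The GRANT statement assumes that SQL standards based authorization is enabled in your Hive deployment.

```sql
-- Data manipulation: store aggregated values instead of raw rows.
-- The aliases give the result columns readable names.
CREATE TABLE salary_summary AS
SELECT COUNT(*)    AS employee_count,
       AVG(salary) AS average_salary,
       MAX(salary) AS highest_salary
FROM employees;

-- Data security: copy only the non-sensitive columns (no salary).
CREATE TABLE employee_directory AS
SELECT id, name
FROM employees;

-- Assumption: SQL standards based authorization is enabled and
-- the hypothetical role analyst_role already exists.
GRANT SELECT ON TABLE employee_directory TO ROLE analyst_role;
```

With a setup along these lines, users who hold only analyst_role can query employee_directory, and they cannot read the salary column as long as they have no privileges on employees itself.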
Conclusion
In this article, we have learned about the CREATE AS statement in Hive, which allows users to create a new table based on the results of a query. We have seen the syntax of the statement and how to use it with an example. The CREATE AS statement in Hive provides efficiency, data manipulation, and data security benefits, making it a powerful tool for data analysis and management.
erDiagram
employees ||--o| top_earning_employees : "CREATE AS"