Hive Regular Expression Cheat Sheet


Apache Hive is a data warehouse tool for storing, querying, and analyzing data. This cheat sheet covers the basic concepts and commands needed to get started with it.

The Hive regular expression functions identify precise patterns of characters in a string. They are useful for extracting substrings from data and for validating existing data: for example, validating dates, performing range checks, checking for particular characters, and extracting specific characters. A column value may also contain an embedded percent sign (%); in that case, an escape character lets you match it literally rather than as a wildcard. The RLIKE operator matches a string against a regular expression.
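As a rough illustration of the regular expression support described above, the statements below apply RLIKE and the built-in regexp_extract and regexp_replace functions to string literals. The literals are chosen for illustration only, and omitting FROM assumes Hive 0.13.0 or later.

```sql
-- Validate a date-shaped string: TRUE because the whole value matches the pattern.
SELECT '2021-04-13' RLIKE '^[0-9]{4}-[0-9]{2}-[0-9]{2}$';

-- regexp_extract(subject, pattern, index): returns capture group 2, here 'bar'.
SELECT regexp_extract('foothebar', 'foo(.*?)(bar)', 2);

-- regexp_replace(subject, pattern, replacement): removes 'oo' and 'ar', giving 'fb'.
SELECT regexp_replace('foobar', 'oo|ar', '');
```

The character classes are written as [0-9] rather than \d to avoid backslash escaping inside Hive string literals.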

Select Syntax

  • A SELECT statement can be part of a union query or a subquery of another query.
  • table_reference indicates the input to the query. It can be a regular table, a view, a join construct or a subquery.
  • Table names and column names are case insensitive.
    • In Hive 0.12 and earlier, only alphanumeric and underscore characters are allowed in table and column names.
    • In Hive 0.13 and later, column names can contain any Unicode character (see HIVE-6013). Any column name that is specified within backticks (`) is treated literally. Within a backtick string, use double backticks (``) to represent a backtick character.
    • To revert to pre-0.13.0 behavior and restrict column names to alphanumeric and underscore characters, set the configuration property hive.support.quoted.identifiers to none. In this configuration, backticked names are interpreted as regular expressions. For details, see Supporting Quoted Identifiers in Column Names (attached to HIVE-6013). Also see REGEX Column Specification below.
  • Simple query. For example, SELECT * FROM t1 retrieves all columns and all rows from table t1.

    Note

    As of Hive 0.13.0, FROM is optional (for example, SELECT 1+1).

  • To get the current database (as of Hive 0.13.0), use the current_database() function, as in SELECT current_database().

  • To specify a database, either qualify the table names with database names ('db_name.table_name' starting in Hive 0.7) or issue the USE statement before the query statement (starting in Hive 0.6).

    'db_name.table_name' allows a query to access tables in different databases.

    USE sets the database for all subsequent HiveQL statements. Reissue it with the keyword 'default' to reset to the default database.
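The points in the list above can be sketched as the following statements. The names t1, db_name, and table_name are placeholders; the SELECT 1 + 1 and current_database() statements assume Hive 0.13.0 or later.

```sql
-- Retrieve all columns and all rows from table t1.
SELECT * FROM t1;

-- FROM is optional as of Hive 0.13.0.
SELECT 1 + 1;

-- Show the current database (Hive 0.13.0 and later).
SELECT current_database();

-- Qualify a table with its database name to reach another database.
SELECT * FROM db_name.table_name;

-- Or switch databases first; all subsequent statements then use db_name.
USE db_name;
SELECT * FROM table_name;

-- Return to the default database.
USE default;
```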

WHERE Clause

The WHERE condition is a boolean expression. For example, a condition such as amount > 10 AND region = "US" returns only those sales records with an amount greater than 10 from the US region. Hive supports a number of operators and UDFs in the WHERE clause.

As of Hive 0.13 some types of subqueries are supported in the WHERE clause.
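A minimal sketch of both cases follows. The sales table with amount and region columns is taken from the description above; us_customers and customer_id are hypothetical names added only to show the shape of an IN subquery.

```sql
-- Sales records with an amount greater than 10 from the US region.
SELECT * FROM sales WHERE amount > 10 AND region = 'US';

-- Hive 0.13 and later: an IN subquery in the WHERE clause.
-- us_customers and customer_id are hypothetical.
SELECT *
FROM sales
WHERE customer_id IN (SELECT customer_id FROM us_customers);
```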

ALL and DISTINCT Clauses

The ALL and DISTINCT options specify whether duplicate rows should be returned. If neither option is given, the default is ALL (all matching rows are returned). DISTINCT removes duplicate rows from the result set. Note that Hive supports SELECT DISTINCT * starting in release 1.1.0 (HIVE-9194).

ALL and DISTINCT can also be used in a UNION clause – see Union Syntax for more information.
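The difference can be sketched with a table t1 that has columns col1 and col2 (assumed names, used only for illustration):

```sql
-- ALL is the default: duplicate (col1, col2) pairs are returned as-is.
SELECT col1, col2 FROM t1;
SELECT ALL col1, col2 FROM t1;   -- equivalent to the query above

-- DISTINCT returns each (col1, col2) combination once.
SELECT DISTINCT col1, col2 FROM t1;

-- DISTINCT over a single column returns each col1 value once.
SELECT DISTINCT col1 FROM t1;
```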

Partition Based Queries

In general, a SELECT query scans the entire table (other than for sampling). If a table is created using the PARTITIONED BY clause, a query can do partition pruning and scan only the fraction of the table relevant to the partitions specified by the query. Hive currently does partition pruning if the partition predicates are specified in the WHERE clause or the ON clause in a JOIN. For example, if table page_views is partitioned on column date, the following query retrieves rows for only the days between 2008-03-01 and 2008-03-31.
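A query of that shape might look like the sketch below. It assumes the partition column is literally named date and holds yyyy-MM-dd strings; on newer Hive releases, where DATE is a reserved keyword, the column name may need to be quoted with backticks.

```sql
-- Only partitions whose date falls in March 2008 are scanned.
SELECT page_views.*
FROM page_views
WHERE page_views.date >= '2008-03-01'
  AND page_views.date <= '2008-03-31';
```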

If a table page_views is joined with another table dim_users, you can specify a range of partitions in the ON clause as follows:
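A sketch of such a join, with the partition range placed in the ON clause. The join columns user_id and id are assumptions; the text above names only the tables.

```sql
-- Partition pruning driven by predicates in the ON clause.
-- page_views.user_id and dim_users.id are assumed join keys.
SELECT page_views.*
FROM page_views
JOIN dim_users
  ON (page_views.user_id = dim_users.id
      AND page_views.date >= '2008-03-01'
      AND page_views.date <= '2008-03-31');
```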

  • See also Partition Filter Syntax.
  • See also Group By.
  • See also Sort By / Cluster By / Distribute By / Order By.

HAVING Clause

Hive added support for the HAVING clause in version 0.7.0. In older versions of Hive, the same effect can be achieved with a subquery: compute the aggregate in an inner query, then filter on it with a WHERE clause in the outer query.
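The equivalence between HAVING and a filtered subquery can be sketched as follows, using a table t1 with columns col1 and col2 (assumed names):

```sql
-- With HAVING (Hive 0.7.0 and later).
SELECT col1
FROM t1
GROUP BY col1
HAVING SUM(col2) > 10;

-- Same result without HAVING: aggregate in a subquery, filter outside it.
SELECT col1
FROM (
  SELECT col1, SUM(col2) AS col2sum
  FROM t1
  GROUP BY col1
) t2
WHERE t2.col2sum > 10;
```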

LIMIT Clause

The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement.

LIMIT takes one or two numeric arguments, which must both be non-negative integer constants.

The first argument specifies the offset of the first row to return (as of Hive 2.0.0) and the second specifies the maximum number of rows to return.

When a single argument is given, it stands for the maximum number of rows and the offset defaults to 0.

The following query returns 5 arbitrary customers


The following query returns the first 5 customers to be created

The following query returns the 3rd to the 7th customers to be created
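The three LIMIT queries described above might look like the sketch below. The customers table comes from the text; the create_date column used for ordering is an assumption.

```sql
-- 5 arbitrary customers (no ordering, so any 5 rows may come back).
SELECT * FROM customers LIMIT 5;

-- The first 5 customers to be created.
SELECT * FROM customers ORDER BY create_date LIMIT 5;

-- The 3rd to the 7th customers to be created:
-- offset 2 skips the first two rows, then 5 rows are returned (Hive 2.0.0 and later).
SELECT * FROM customers ORDER BY create_date LIMIT 2, 5;
```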


REGEX Column Specification

A SELECT statement can take regex-based column specification in Hive releases prior to 0.13.0, or in 0.13.0 and later releases if the configuration property hive.support.quoted.identifiers is set to none.

  • Hive uses Java regex syntax. Try http://www.fileformat.info/tool/regex.htm for testing purposes.
  • The following query selects all columns except ds and hr.
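A sketch of the regex column specification mentioned in the list above. The table name sales is an assumption, and the SET statement is needed only on Hive 0.13.0 and later.

```sql
-- On Hive 0.13.0+, make backticked names behave as regular expressions.
SET hive.support.quoted.identifiers=none;

-- Select every column except ds and hr.
-- (ds|hr)?+ possessively consumes a leading "ds" or "hr"; .+ then needs at least
-- one more character, so the exact names "ds" and "hr" cannot match.
SELECT `(ds|hr)?+.+` FROM sales;
```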

More Select Syntax

See the following documents for additional syntax and features of SELECT statements:

  • JOIN

There are four types of operators in Hive:

  • Relational Operators
  • Arithmetic Operators
  • Logical Operators
  • Complex Operators

Relational Operators

These operators are used to compare two operands. The following table describes the relational operators available in Hive:

Operator        Operand               Description
A = B           all primitive types   TRUE if expression A is equivalent to expression B, otherwise FALSE.
A != B          all primitive types   TRUE if expression A is not equivalent to expression B, otherwise FALSE.
A < B           all primitive types   TRUE if expression A is less than expression B, otherwise FALSE.
A <= B          all primitive types   TRUE if expression A is less than or equal to expression B, otherwise FALSE.
A > B           all primitive types   TRUE if expression A is greater than expression B, otherwise FALSE.
A >= B          all primitive types   TRUE if expression A is greater than or equal to expression B, otherwise FALSE.
A IS NULL       all types             TRUE if expression A evaluates to NULL, otherwise FALSE.
A IS NOT NULL   all types             FALSE if expression A evaluates to NULL, otherwise TRUE.
A LIKE B        strings               TRUE if string A matches the SQL simple pattern B, otherwise FALSE.
A RLIKE B       strings               NULL if A or B is NULL, TRUE if any substring of A matches the Java regular expression B, otherwise FALSE.
A REGEXP B      strings               Same as RLIKE.
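The difference between LIKE and RLIKE/REGEXP in the table above is easy to get wrong, so here is a small sketch on string literals. The literals are illustrative, and omitting FROM assumes Hive 0.13.0 or later.

```sql
-- LIKE uses SQL wildcards (_ for one character, % for any run of characters)
-- and must match the whole string.
SELECT 'foobar' LIKE 'foo%';      -- TRUE
SELECT 'foobar' LIKE 'foo';       -- FALSE: the pattern does not cover 'bar'

-- RLIKE and REGEXP succeed if any substring matches the Java regular expression.
SELECT 'foobar' RLIKE 'foo';      -- TRUE
SELECT 'foobar' REGEXP '^bar';    -- FALSE: 'bar' is not at the start
SELECT 'foobar' RLIKE '^f.*r$';   -- TRUE: anchored match over the whole string
```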

Example

Let us assume the employee table is composed of fields named Id, Name, Salary, Designation, and Dept.

The first example retrieves the details of the employee whose Id is 1205, using the condition Id = 1205 in the WHERE clause. Only the row for that employee is returned.

The second example retrieves the details of the employees whose salary is greater than or equal to Rs 40000, using the condition Salary >= 40000.
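The two relational-operator examples above can be written as the following queries against the employee table described in the text:

```sql
-- Employee whose Id is 1205.
SELECT * FROM employee WHERE Id = 1205;

-- Employees whose salary is greater than or equal to Rs 40000.
SELECT * FROM employee WHERE Salary >= 40000;
```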

Arithmetic Operators

These operators support various common arithmetic operations on the operands. All of them return number types. The following table describes the arithmetic operators available in Hive:

Operator   Operand            Description
A + B      all number types   Gives the result of adding A and B.
A - B      all number types   Gives the result of subtracting B from A.
A * B      all number types   Gives the result of multiplying A and B.
A / B      all number types   Gives the result of dividing A by B.
A % B      all number types   Gives the remainder resulting from dividing A by B.
A & B      all number types   Gives the result of bitwise AND of A and B.
A | B      all number types   Gives the result of bitwise OR of A and B.
A ^ B      all number types   Gives the result of bitwise XOR of A and B.
~A         all number types   Gives the result of bitwise NOT of A.

Example

For example, the expression 20 + 30 in a SELECT list adds the two numbers and returns 50.
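A minimal sketch of the addition example. The alias total is arbitrary, and omitting FROM assumes Hive 0.13.0 or later.

```sql
-- Adds the two numbers; the single result row contains 50.
SELECT 20 + 30 AS total;
```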

Logical Operators

These operators combine boolean operands into logical expressions. All of them return either TRUE or FALSE.

Operator   Operand   Description
A AND B    boolean   TRUE if both A and B are TRUE, otherwise FALSE.
A && B     boolean   Same as A AND B.
A OR B     boolean   TRUE if either A or B or both are TRUE, otherwise FALSE.
A || B     boolean   Same as A OR B.
NOT A      boolean   TRUE if A is FALSE, otherwise FALSE.
!A         boolean   Same as NOT A.

Example

For example, to retrieve the details of employees whose Department is TP and whose Salary is more than Rs 40000, combine the two conditions with AND in the WHERE clause.
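The logical-operator example above can be sketched as follows, assuming the department is stored as the string 'TP' in the Dept column of the employee table:

```sql
-- Employees in department TP earning more than Rs 40000.
SELECT *
FROM employee
WHERE Salary > 40000 AND Dept = 'TP';
```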

Complex Operators

These operators provide an expression to access the elements of Complex Types.

Operator   Operand                             Description
A[n]       A is an Array and n is an int       Returns the nth element in the array A. The first element has index 0.
M[key]     M is a Map and key has type K       Returns the value corresponding to the key in the map.
S.x        S is a struct                       Returns the x field of S.
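The three complex-type operators can be sketched against a hypothetical table. The table employee_profile and all of its columns are invented for illustration and do not appear elsewhere in this document.

```sql
-- Hypothetical table with one column of each complex type.
CREATE TABLE employee_profile (
  id       INT,
  skills   ARRAY<STRING>,
  contacts MAP<STRING, STRING>,
  address  STRUCT<city: STRING, zip: STRING>
);

SELECT
  skills[0]         AS first_skill,   -- A[n]: array element, index starts at 0
  contacts['email'] AS email,         -- M[key]: value stored under the key 'email'
  address.city      AS city           -- S.x: field city of the struct
FROM employee_profile;
```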



