diff --git a/build.txt b/build.txt
index ca3304dc..e60fe846 100644
--- a/build.txt
+++ b/build.txt
@@ -62,6 +62,5 @@ Findbugs report is generated in /target/site
 ## Generate Apache Web Site
 ===========================
- $ mvn -pl phoenix-core site -Ddependency.locations.enabled=false
-
-Note: site is generated in phoenix-core/target/site
+checkout https://svn.apache.org/repos/asf/incubator/phoenix
+ $ build.sh
diff --git a/phoenix-core/src/site/bin/merge.jar b/phoenix-core/src/site/bin/merge.jar
deleted file mode 100644
index 2b329d69..00000000
Binary files a/phoenix-core/src/site/bin/merge.jar and /dev/null differ
diff --git a/phoenix-core/src/site/bin/merge.sh b/phoenix-core/src/site/bin/merge.sh
deleted file mode 100755
index 5ac0641a..00000000
--- a/phoenix-core/src/site/bin/merge.sh
+++ /dev/null
@@ -1,10 +0,0 @@
-current_dir=$(cd $(dirname $0);pwd)
-cd $current_dir
-SITE_TARGET="../../../target/site"
-java -jar merge.jar ../language_reference_source/index.html $SITE_TARGET/language/index.html
-java -jar merge.jar ../language_reference_source/functions.html $SITE_TARGET/language/functions.html
-java -jar merge.jar ../language_reference_source/datatypes.html $SITE_TARGET/language/datatypes.html
-cd $SITE_TARGET
-
-grep -rl class=\"nav-collapse\" . | xargs sed -i 's/class=\"nav-collapse\"/class=\"nav-collapse collapse\"/g';grep -rl class=\"active\" . | xargs sed -i 's/class=\"active\"/class=\"divider\"/g'
-grep -rl "dropdown active" . | xargs sed -i 's/dropdown active/dropdown/g'
diff --git a/phoenix-core/src/site/language_reference_source/datatypes.html b/phoenix-core/src/site/language_reference_source/datatypes.html
deleted file mode 100644
index 9efca102..00000000
--- a/phoenix-core/src/site/language_reference_source/datatypes.html
+++ /dev/null
@@ -1,493 +0,0 @@
- - - -Data Types - - - - - - -

Index

- - - - - - -
- - INTEGER Type
- - UNSIGNED_INT Type
- - BIGINT Type
- - UNSIGNED_LONG Type
- - TINYINT Type
- - UNSIGNED_TINYINT Type
- - SMALLINT Type
-
- - UNSIGNED_SMALLINT Type
- - FLOAT Type
- - UNSIGNED_FLOAT Type
- - DOUBLE Type
- - UNSIGNED_DOUBLE Type
- - DECIMAL Type
- - BOOLEAN Type
-
- - TIME Type
- - DATE Type
- - TIMESTAMP Type
- - VARCHAR Type
- - CHAR Type
- - BINARY Type
- - VARBINARY Type
-
- - - -

INTEGER Type

- -
-INTEGER
-
-
-INTEGER -
- - -

Possible values: -2147483648 to 2147483647.

Mapped to java.lang.Integer. The binary representation is a 4 byte integer with the sign bit flipped (so that negative values sort before positive values).

-

Example:

-

INTEGER
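
The sort-order claim for the sign-bit flip can be checked with a small sketch. This is an illustration in Python, not Phoenix's serializer: it assumes big-endian byte order (the order HBase's Bytes.toBytes(int) uses), and the helper name encode_integer is hypothetical.

```python
import struct

def encode_integer(value: int) -> bytes:
    """Hypothetical helper: 4-byte big-endian int with the sign bit flipped."""
    raw = struct.pack(">i", value)            # two's complement, big-endian
    return bytes([raw[0] ^ 0x80]) + raw[1:]   # flip only the top (sign) bit

values = [5, -1, 0, -2147483648, 2147483647, -300]

# Sorting by the encoded bytes (plain lexicographic byte comparison)
# yields the same order as sorting the numbers themselves.
by_bytes = sorted(values, key=encode_integer)
```

Without the flip, the raw two's-complement bytes of negative numbers start with a set high bit and would compare greater than those of positive numbers.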

- -

UNSIGNED_INT Type

- -
-UNSIGNED_INT
-
-
-UNSIGNED_INT -
- - -

Possible values: 0 to 2147483647. Mapped to java.lang.Integer. The binary representation is a 4 byte integer, matching the HBase Bytes.toBytes(int) method. The purpose of this type is to map to existing HBase data that was serialized using this HBase utility method. If that is not the case, use the regular signed type instead.

-

Example:

-

UNSIGNED_INT

- -

BIGINT Type

- -
-BIGINT
-
-
-BIGINT -
- - -

Possible values: -9223372036854775808 to 9223372036854775807. Mapped to java.lang.Long. The binary representation is an 8 byte long with the sign bit flipped (so that negative values sort before positive values).

-

Example:

-

BIGINT

- -

UNSIGNED_LONG Type

- -
-UNSIGNED_LONG
-
-
-UNSIGNED_LONG -
- - -

Possible values: 0 to 9223372036854775807. Mapped to java.lang.Long. The binary representation is an 8 byte integer, matching the HBase Bytes.toBytes(long) method. The purpose of this type is to map to existing HBase data that was serialized using this HBase utility method. If that is not the case, use the regular signed type instead.

-

Example:

-

UNSIGNED_LONG

- -

TINYINT Type

- -
-TINYINT
-
-
-TINYINT -
- - -

Possible values: -128 to 127. Mapped to java.lang.Byte. The binary representation is a single byte, with the sign bit flipped (so that negative values sort before positive values).

-

Example:

-

TINYINT

- -

UNSIGNED_TINYINT Type

- -
-UNSIGNED_TINYINT
-
-
-UNSIGNED_TINYINT -
- - -

Possible values: 0 to 127. Mapped to java.lang.Byte. The binary representation is a single byte, matching the HBase Bytes.toBytes(byte) method. The purpose of this type is to map to existing HBase data that was serialized using this HBase utility method. If that is not the case, use the regular signed type instead.

-

Example:

-

UNSIGNED_TINYINT

- -

SMALLINT Type

- -
-SMALLINT
-
-
-SMALLINT -
- - -

Possible values: -32768 to 32767. Mapped to java.lang.Short. The binary representation is a 2 byte short with the sign bit flipped (so that negative values sort before positive values).

-

Example:

-

SMALLINT

- -

UNSIGNED_SMALLINT Type

- -
-UNSIGNED_SMALLINT
-
-
-UNSIGNED_SMALLINT -
- - -

Possible values: 0 to 32767. Mapped to java.lang.Short. The binary representation is a 2 byte integer, matching the HBase Bytes.toBytes(short) method. The purpose of this type is to map to existing HBase data that was serialized using this HBase utility method. If that is not the case, use the regular signed type instead.

-

Example:

-

UNSIGNED_SMALLINT

- -

FLOAT Type

- -
-FLOAT
-
-
-FLOAT -
- - -

Possible values: -3.402823466 E + 38 to 3.402823466 E + 38. Mapped to java.lang.Float. The binary representation is a 4 byte float with the sign bit flipped (so that negative values sort before positive values).

-

Example:

-

FLOAT

- -

UNSIGNED_FLOAT Type

- -
-UNSIGNED_FLOAT
-
-
-UNSIGNED_FLOAT -
- - -

Possible values: 0 to 3.402823466 E + 38. Mapped to java.lang.Float. The binary representation is a 4 byte float matching the HBase Bytes.toBytes(float) method. The purpose of this type is to map to existing HBase data that was serialized using this HBase utility method. If that is not the case, use the regular signed type instead.

-

Example:

-

UNSIGNED_FLOAT

- -

DOUBLE Type

- -
-DOUBLE
-
-
-DOUBLE -
- - -

Possible values: -1.7976931348623158 E + 308 to 1.7976931348623158 E + 308. Mapped to java.lang.Double. The binary representation is an 8 byte double with the sign bit flipped (so that negative values sort before positive values).

-

Example:

-

DOUBLE

- -

UNSIGNED_DOUBLE Type

- -
-UNSIGNED_DOUBLE
-
-
-UNSIGNED_DOUBLE -
- - -

Possible values: 0 to 1.7976931348623158 E + 308. Mapped to java.lang.Double. The binary representation is an 8 byte double matching the HBase Bytes.toBytes(double) method. The purpose of this type is to map to existing HBase data that was serialized using this HBase utility method. If that is not the case, use the regular signed type instead.

-

Example:

-

UNSIGNED_DOUBLE

- -

DECIMAL Type

- -
-DECIMAL
-
-
-DECIMAL -
- - -

Data type with fixed precision and scale. The maximum precision is 18 digits. Mapped to java.math.BigDecimal. The binary representation is binary comparable, variable length format. When used in a row key, it is terminated with a null byte unless it is the last column.

-

Example:

-

DECIMAL

- -

BOOLEAN Type

- -
-BOOLEAN
-
-
-BOOLEAN -
- - -

Possible values: TRUE and FALSE.

Mapped to java.lang.Boolean. The binary representation is a single byte with 0 for false and 1 for true.

-

Example:

-

BOOLEAN

- -

TIME Type

- -
-TIME
-
-
-TIME -
- - -

The time data type. The format is yyyy-MM-dd hh:mm:ss, with both the date and time parts maintained. Mapped to java.sql.Time. The binary representation is an 8 byte long (the number of milliseconds from the epoch).

-

Example:

-

TIME

- -

DATE Type

- -
-DATE
-
-
-DATE -
- - -

The date data type. The format is yyyy-MM-dd hh:mm:ss, with both the date and time parts maintained to a millisecond accuracy. Mapped to java.sql.Date. The binary representation is an 8 byte long (the number of milliseconds from the epoch).

-

Example:

-

DATE

- -

TIMESTAMP Type

- -
-TIMESTAMP
-
-
-TIMESTAMP -
- - -

The timestamp data type. The format is yyyy-MM-dd hh:mm:ss[.nnnnnnnnn]. Mapped to java.sql.Timestamp with an internal representation of the number of nanos from the epoch. The binary representation is 12 bytes: an 8 byte long for the epoch time plus a 4 byte integer for the nanos.

-

Example:

-

TIMESTAMP

- -

VARCHAR Type

- -
-VARCHAR  [ ( precisionInt ) ]
-
-
-
VARCHAR
 
( precisionInt )
-
- - -

A variable length String with an optional max byte length. The binary representation is UTF8 matching the HBase Bytes.toBytes(String) method. When used in a row key, it is terminated with a null byte unless it is the last column.

Mapped to java.lang.String.

-

Example:

-

VARCHAR
VARCHAR(255)

- -

CHAR Type

- -
-CHAR ( precisionInt )
-
-
-
CHAR ( precisionInt )
-
- - -

A fixed length String with single-byte characters. The binary representation is UTF8 matching the HBase Bytes.toBytes(String) method.

Mapped to java.lang.String.

-

Example:

-

CHAR(10)

- -

BINARY Type

- -
-BINARY ( precisionInt )
-
-
-
BINARY ( precisionInt )
-
- - -

Raw fixed length byte array.

Mapped to byte[].

-

Example:

-

BINARY

- -

VARBINARY Type

- -
-VARBINARY
-
-
-VARBINARY -
- - -

Raw variable length byte array.

Mapped to byte[].

-

Example:

-

VARBINARY

- - - - -
diff --git a/phoenix-core/src/site/language_reference_source/functions.html b/phoenix-core/src/site/language_reference_source/functions.html
deleted file mode 100644
index 53056290..00000000
--- a/phoenix-core/src/site/language_reference_source/functions.html
+++ /dev/null
@@ -1,740 +0,0 @@
- - - -Functions - - - - - - -

Aggregate Functions

- - - - - - -
- - AVG
- - COUNT
- - MAX
- - MIN
-
- - SUM
- - PERCENTILE_CONT
- - PERCENTILE_DISC
- - PERCENT_RANK
-
- - STDDEV_POP
- - STDDEV_SAMP
-
- - -

String Functions

- - - - - - -
- - SUBSTR
- - TRIM
- - LTRIM
- - RTRIM
-
- - LENGTH
- - REGEXP_SUBSTR
- - REGEXP_REPLACE
- - UPPER
-
- - LOWER
- - REVERSE
- - TO_CHAR
-
- - -

Time and Date Functions

- - - - - - -
- - ROUND
- - TRUNCATE
-
- - TO_DATE
- - CURRENT_DATE
-
- - CURRENT_TIME
-
- - -

Other Functions

- - - - - - -
- - MD5
- - INVERT
-
- - TO_NUMBER
- - COALESCE
-
-
- - - - -

AVG

- -
-AVG ( { numericTerm } )
-
-
-
AVG ( numericTerm )
-
- - -

The average (mean) value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

-

Example:

-

AVG(X)

- -

COUNT

- -
-COUNT( [ DISTINCT ] { * | { term } } )
-
-
-
COUNT (
 
DISTINCT
*
term
)
-
- - -

The count of all rows, or of the non-null values. This method returns a long. When DISTINCT is used, it counts only distinct values. If no rows are selected, the result is 0. Aggregates are only allowed in select statements.

-

Example:

-

COUNT(*)

- -

MAX

- -
-MAX(term)
-
-
-
MAX ( term )
-
- - -

The highest value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

-

Example:

-

MAX(NAME)

- -

MIN

- -
-MIN(term)
-
-
-
MIN ( term )
-
- - -

The lowest value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

-

Example:

-

MIN(NAME)

- -

SUM

- -
-SUM( { numericTerm } )
-
-
-
SUM ( numericTerm )
-
- - -

The sum of all values. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

-

Example:

-

SUM(X)

- -

PERCENTILE_CONT

- -
-PERCENTILE_CONT( { numeric } ) WITHIN GROUP (ORDER BY { numericTerm } { ASC | DESC } )
-
-
-
PERCENTILE_CONT ( numeric ) WITHIN GROUP ( ORDER BY numericTerm
ASC
DESC
)
-
- - -

The nth percentile of values in the column. The percentile value can be between 0 and 1 inclusive. Aggregates are only allowed in select statements. The returned value is of decimal data type.

-

Example:

-

PERCENTILE_CONT( 0.9 ) WITHIN GROUP (ORDER BY X ASC)
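
As a rough illustration of what a continuous percentile computes, here is a minimal Python sketch. It assumes the usual SQL definition (linear interpolation between the two nearest values in ascending order); the function name percentile_cont and its handling of empty input are assumptions for the example, not Phoenix's implementation.

```python
def percentile_cont(values, fraction):
    """Sketch of a continuous percentile over an ascending sort (assumed definition)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1 inclusive")
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None
    # Fractional position in the sorted list, then interpolate linearly
    # between the neighbours on either side of that position.
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])

median = percentile_cont([40, 10, 30, 20], 0.5)
```

Because of the interpolation, the result need not be one of the input values, which is consistent with the decimal return type described above.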

- -

PERCENTILE_DISC

- -
-PERCENTILE_DISC( { numeric } ) WITHIN GROUP (ORDER BY { numericTerm } { ASC | DESC } )
-
-
-
PERCENTILE_DISC ( numeric ) WITHIN GROUP ( ORDER BY numericTerm
ASC
DESC
)
-
- - -

PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation.

-

Example:

-

PERCENTILE_DISC( 0.9 ) WITHIN GROUP (ORDER BY X DESC)
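
The discrete model can be sketched as follows. This Python example assumes the common SQL definition (the first value, in ascending order, whose cumulative share of the rows reaches the requested percentile); the name percentile_disc is hypothetical and the code is not taken from Phoenix.

```python
import math

def percentile_disc(values, fraction):
    """Sketch of a discrete percentile: always returns an element of the input."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1 inclusive")
    ordered = sorted(v for v in values if v is not None)  # nulls are ignored
    if not ordered:
        return None
    # Smallest index whose cumulative share (index + 1) / n reaches the fraction.
    index = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[index]

picked = percentile_disc([40, 10, None, 30, 20], 0.5)
```

Unlike the continuous variant, no interpolation takes place, so the result is always a value that actually occurs in the set.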

- -

PERCENT_RANK

- -
-PERCENT_RANK( { numeric } ) WITHIN GROUP (ORDER BY { numericTerm } { ASC | DESC } )
-
-
-
PERCENT_RANK ( numeric ) WITHIN GROUP ( ORDER BY numericTerm
ASC
DESC
)
-
- - -

The percentile rank for a hypothetical value, if inserted into the column. Aggregates are only allowed in select statements. The returned value is of decimal data type.

-

Example:

-

PERCENT_RANK( 100 ) WITHIN GROUP (ORDER BY X ASC)

- -

STDDEV_POP

- -
-STDDEV_POP( { numericTerm } )
-
-
-
STDDEV_POP ( numericTerm )
-
- - -

The population standard deviation of all values. Aggregates are only allowed in select statements. The returned value is of decimal data type.

-

Example:

-

STDDEV_POP( X )

- -

STDDEV_SAMP

- -
-STDDEV_SAMP( { numericTerm } )
-
-
-
STDDEV_SAMP ( numericTerm )
-
- - -

The sample standard deviation of all values. Aggregates are only allowed in select statements. The returned value is of decimal data type.

-

Example:

-

STDDEV_SAMP( X )
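
The difference between the population and sample variants is only the divisor (n versus n - 1). The following Python sketch uses the standard library's statistics module to show the two side by side; it illustrates the statistics, not Phoenix's code, and the sample data is made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample; mean is 5

# Population standard deviation: divide the squared deviations by n.
stddev_pop = statistics.pstdev(data)

# Sample standard deviation: divide by n - 1 instead.
stddev_samp = statistics.stdev(data)
```

For this data the squared deviations sum to 32, so the population value is sqrt(32 / 8) and the sample value is sqrt(32 / 7).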

- -

MD5

- -
-MD5( term )
-
-
-
MD5 ( term )
-
- - -

Computes the MD5 hash of the argument, returning the result as a BINARY(16).

-

Example:

-

MD5(my_column)
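
An MD5 digest is always 16 bytes, which is why the result fits a BINARY(16). A quick Python check using the standard hashlib module (the input bytes are an arbitrary example, and how Phoenix serializes the argument before hashing is not covered here):

```python
import hashlib

digest = hashlib.md5(b"hello").digest()   # raw bytes, not the hex string
digest_hex = digest.hex()
```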

- -

INVERT

- -
-INVERT( term )
-
-
-
INVERT ( term )
-
- - -

Inverts the bits of the argument. The return type will be the same as the argument.

-

Example:

-

INVERT(my_column)
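
Bit inversion can be pictured on raw bytes. The Python sketch below flips every bit of a byte string; the helper name invert is hypothetical, and treating the argument as its serialized bytes is an assumption made for illustration.

```python
def invert(data: bytes) -> bytes:
    """Hypothetical helper: flip every bit of every byte."""
    return bytes(b ^ 0xFF for b in data)

keys = [b"\x01", b"\x7f", b"\x10"]

# Inverting the bytes reverses their lexicographic order.
descending = sorted(keys, key=invert)
```

One consequence is that sorting by the inverted bytes gives the reverse of the original byte order.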

- -

TO_NUMBER

- -
-TO_NUMBER( stringTerm | timeStampTerm [, formatString] )
-
-
-
TO_NUMBER ( stringTerm
timeStampTerm
 
, formatString
)
-
- - -

Formats a string or date/time/timeStamp as a number, optionally accepting a format string. For details on the format, see java.text.DecimalFormat. For date, time, and timeStamp terms, the result is the time in milliseconds since the epoch. This method returns a decimal number.

-

Example:

-

TO_NUMBER('$123.33', '\u00A4###.##')

- -

COALESCE

- -
-COALESCE( firstTerm, secondTerm )
-
-
-
COALESCE ( firstTerm , secondTerm )
-
- - -

Returns the value of the first argument if not null and the second argument otherwise. Useful to guarantee that a column in an UPSERT SELECT command will evaluate to a non null value.

-

Example:

-

COALESCE(last_update_date, CURRENT_DATE())

- -

SUBSTR

- -
-SUBSTR( stringTerm, startInt [, lengthInt ] )
-
-
-
SUBSTR ( stringTerm , startInt
 
, lengthInt
)
-
- - -

Returns a substring of a string starting at the one-based position. If zero is used, the position is zero-based. If the start index is negative, then the start index is relative to the end of the string. The length is optional and if not supplied, the rest of the string will be returned.

-

Example:

-

SUBSTR('[Hello]', 2, 5)
SUBSTR('Hello World', -5)
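
The indexing rules above can be emulated in a few lines of Python. This is a sketch of the described semantics only; the function name substr is hypothetical, and the behavior for a negative start that reaches past the beginning of the string (clamped to the start here) is an assumption.

```python
def substr(s, start, length=None):
    """Emulates the one-based SUBSTR indexing described above (sketch)."""
    if start > 0:
        begin = start - 1                 # one-based position
    elif start == 0:
        begin = 0                         # zero is treated as zero-based
    else:
        begin = max(len(s) + start, 0)    # negative: relative to the end
    end = len(s) if length is None else begin + max(length, 0)
    return s[begin:end]

first = substr('[Hello]', 2, 5)
second = substr('Hello World', -5)
```

The two calls mirror the examples above and evaluate to 'Hello' and 'World' respectively.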

- -

TRIM

- -
-TRIM( stringTerm )
-
-
-
TRIM ( stringTerm )
-
- - -

Removes leading and trailing spaces from the input string.

-

Example:

-

TRIM('  Hello  ')

- -

LTRIM

- -
-LTRIM( stringTerm )
-
-
-
LTRIM ( stringTerm )
-
- - -

Removes leading spaces from the input string.

-

Example:

-

LTRIM('  Hello')

- -

RTRIM

- -
-RTRIM( stringTerm )
-
-
-
RTRIM ( stringTerm )
-
- - -

Removes trailing spaces from the input string.

-

Example:

-

RTRIM('Hello   ')

- -

LENGTH

- -
-LENGTH( stringTerm )
-
-
-
LENGTH ( stringTerm )
-
- - -

Returns the length of the string in characters.

-

Example:

-

LENGTH('Hello')

- -

REGEXP_SUBSTR

- -
-REGEXP_SUBSTR( stringTerm, patternString [, startInt ] )
-
-
-
REGEXP_SUBSTR ( stringTerm , patternString
 
, startInt
)
-
- - -

Returns a substring of a string by applying a regular expression, starting from the offset given as a one-based position. Just like with SUBSTR, if the start index is negative, then it is relative to the end of the string. If not specified, the start index defaults to 1.

-

Example:

-

REGEXP_SUBSTR('na1-appsrv35-sj35', '[^-]+') evaluates to 'na1'
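
A rough Python equivalent shows how the start offset interacts with the match. It uses Python's re module, whose syntax agrees with Java's for simple patterns like the one above but is not identical in general; the function name regexp_substr and the treatment of a start of 0 are assumptions.

```python
import re

def regexp_substr(s, pattern, start=1):
    """Sketch: first regex match at or after a one-based start position."""
    if start > 0:
        offset = start - 1
    elif start < 0:
        offset = max(len(s) + start, 0)   # relative to the end of the string
    else:
        offset = 0                        # assumption: 0 behaves like 1
    match = re.compile(pattern).search(s, offset)
    return match.group(0) if match else None

host = 'na1-appsrv35-sj35'
first_part = regexp_substr(host, '[^-]+')
second_part = regexp_substr(host, '[^-]+', 5)
last_part = regexp_substr(host, '[^-]+', -4)
```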

- -

REGEXP_REPLACE

- -
-REGEXP_REPLACE( stringTerm, patternString [, replacementString ] )
-
-
-
REGEXP_REPLACE ( stringTerm , patternString
 
, replacementString
)
-
- - -

Returns a string by applying a regular expression and replacing the matches with the replacement string. If the replacement string is not specified, it defaults to an empty string.

-

Example:

-

REGEXP_REPLACE('abc123ABC', '[0-9]+', '#') evaluates to 'abc#ABC'
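
The same behavior, including the empty-string default, can be sketched with Python's re.sub. The function name regexp_replace is hypothetical; note also that Python and Java use different syntax for group references inside the replacement string, so only literal replacements are shown.

```python
import re

def regexp_replace(s, pattern, replacement=""):
    """Sketch: replace every match; the replacement defaults to an empty string."""
    return re.sub(pattern, replacement, s)

masked = regexp_replace('abc123ABC', '[0-9]+', '#')
stripped = regexp_replace('abc123ABC', '[0-9]+')
```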

- -

UPPER

- -
-UPPER( stringTerm )
-
-
-
UPPER ( stringTerm )
-
- - -

Returns the string argument converted to upper case.

-

Example:

-

UPPER('Hello')

- -

LOWER

- -
-LOWER( stringTerm )
-
-
-
LOWER ( stringTerm )
-
- - -

Returns the string argument converted to lower case.

-

Example:

-

LOWER('HELLO')

- -

REVERSE

- -
-REVERSE( stringTerm )
-
-
-
REVERSE ( stringTerm )
-
- - -

Returns the string argument with its characters in reverse order.

-

Example:

-

REVERSE('Hello')

- -

TO_CHAR

- -
-TO_CHAR( timestampTerm | numberTerm [, formatString] )
-
-
-
TO_CHAR ( timestampTerm
numberTerm
 
, formatString
)
-
- - -

Formats a date, time, timestamp, or number as a string. The default date format is yyyy-MM-dd HH:mm:ss and the default number format is #,##0.###. For details, see java.text.SimpleDateFormat for date/time values and java.text.DecimalFormat for numbers. This method returns a string.

-

Example:

-

TO_CHAR(myDate, '2001-02-03 04:05:06')
TO_CHAR(myDecimal, '#,##0.###')

- -

ROUND

- -
-ROUND(timestampTerm, {'DAY' | 'HOUR' | 'MINUTE' | 'SECOND' | 'MILLISECOND'} [, multiplierNumber])
-
-
-
ROUND ( timestampTerm ,
' DAY '
' HOUR '
' MINUTE '
' SECOND '
' MILLISECOND '
 
, multiplierNumber
)
-
- - -

Rounds the timestamp to the nearest time unit specified. The multiplier is used to round to a multiple of a time unit (e.g. 10 minutes) and defaults to 1 if not specified. This method returns a date.

-

Example:

-

ROUND(date, 'MINUTE', 30)
ROUND(time, 'HOUR')
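
Rounding to a multiple of a time unit is simple arithmetic on the epoch value. The Python sketch below works on milliseconds since the epoch; the names UNIT_MS and round_ts are hypothetical, and both the tie-breaking rule (halves round up) and the alignment of multiples to the epoch are assumptions rather than documented Phoenix behavior.

```python
UNIT_MS = {
    "MILLISECOND": 1,
    "SECOND": 1_000,
    "MINUTE": 60_000,
    "HOUR": 3_600_000,
    "DAY": 86_400_000,
}

def round_ts(epoch_ms, unit, multiplier=1):
    """Sketch: round a non-negative epoch-millisecond value to the nearest step."""
    step = UNIT_MS[unit] * multiplier
    return ((epoch_ms + step // 2) // step) * step

# 44 minutes past the epoch is nearer to 30 minutes than to 60.
rounded = round_ts(44 * 60_000, "MINUTE", 30)
```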

- -

TRUNCATE

- -
-TRUNC(timestampTerm, {'DAY' | 'HOUR' | 'MINUTE' | 'SECOND' | 'MILLISECOND'} [, multiplierInt])
-
-
-
TRUNC ( timestampTerm ,
' DAY '
' HOUR '
' MINUTE '
' SECOND '
' MILLISECOND '
 
, multiplierInt
)
-
- - -

Truncates the timestamp to the next time unit closer to 0. The multiplier is used to truncate to a multiple of a time unit (e.g. 10 minutes) and defaults to 1 if not specified. This method returns a date.

-

Example:

-

TRUNCATE(timestamp, 'SECOND', 30)
TRUNCATE(date, 'DAY', 7)
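
Truncation differs from rounding only in always moving toward zero. A minimal Python sketch on epoch milliseconds follows; the names UNIT_MS and trunc_ts are hypothetical, the sketch is limited to non-negative epoch values, and aligning multiples to the epoch is an assumption.

```python
UNIT_MS = {
    "MILLISECOND": 1,
    "SECOND": 1_000,
    "MINUTE": 60_000,
    "HOUR": 3_600_000,
    "DAY": 86_400_000,
}

def trunc_ts(epoch_ms, unit, multiplier=1):
    """Sketch: drop the remainder below the step (non-negative input only)."""
    if epoch_ms < 0:
        raise ValueError("sketch covers non-negative epoch values only")
    step = UNIT_MS[unit] * multiplier
    return (epoch_ms // step) * step

half_minute = trunc_ts(59_999, "SECOND", 30)
week_start = trunc_ts(10 * 86_400_000 + 5 * 3_600_000, "DAY", 7)
```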

- -

TO_DATE

- -
-TO_DATE( stringTerm [, formatString] )
-
-
-
TO_DATE ( stringTerm
 
, formatString
)
-
- - -

Parses a string and returns a date. The most important format characters are: y year, M month, d day, H hour, m minute, s second. The default format string is yyyy-MM-dd HH:mm:ss. For details of the format, see java.text.SimpleDateFormat.

-

Example:

-

TO_DATE('Sat, 3 Feb 2001 03:05:06 GMT', 'EEE, d MMM yyyy HH:mm:ss z')

- -

CURRENT_DATE

- -
-CURRENT_DATE()
-
-
-
CURRENT_DATE ( )
-
- - -

Returns the current server-side date, bound at the start of the execution of a query based on the current time on the region server owning the metadata of the table being queried.

-

Example:

-

CURRENT_DATE()

- -

CURRENT_TIME

- -
-CURRENT_TIME()
-
-
-
CURRENT_TIME ( )
-
- - -

Same as CURRENT_DATE(), except returns a value of type TIME. In either case, the underlying representation is the epoch time as a long value.

-

Example:

-

CURRENT_TIME()

- - - - -
diff --git a/phoenix-core/src/site/language_reference_source/index.html b/phoenix-core/src/site/language_reference_source/index.html
deleted file mode 100644
index 04bcec67..00000000
--- a/phoenix-core/src/site/language_reference_source/index.html
+++ /dev/null
@@ -1,947 +0,0 @@
- - - - -SQL Grammar - - - - - - -

Commands

- - - - - - -
- - SELECT
- - UPSERT VALUES
- - UPSERT SELECT
- - DELETE
-
- - CREATE
- - DROP
- - ALTER TABLE
- - CREATE INDEX
-
- - DROP INDEX
- - ALTER INDEX
- - EXPLAIN
-
- - -

Other Grammar

- - - - - - -
- - Constraint
- - Options
- - Hint
- - Column Def
- - Table Ref
- - Column Ref
- - Select Expression
- - Split Point
- - Table Expression
- - Order
- - Expression
- - And Condition
-
- - Condition
- - Compare
- - Operand
- - Summand
- - Factor
- - Term
- - Row Value Constructor
- - Bind Parameter
- - Value
- - Case
- - Case When
- - Name
-
- - Quoted Name
- - Alias
- - Null
- - Data Type
- - String
- - Boolean
- - Numeric
- - Int
- - Long
- - Decimal
- - Number
- - Comments
-
- - - -

SELECT

- -
-SELECT [/*+ hint */] [DISTINCT | ALL] selectExpression [,...]
-FROM tableExpression [( columnDef [,...] )] [ WHERE expression ]
-[ GROUP BY expression [,...] ] [ HAVING expression ]
-[ ORDER BY order [,...] ] [ LIMIT {bindParameter | number} ]
-
-
-
SELECT
 
/ * + hint * /
 
DISTINCT
ALL
selectExpression
 
, ...

FROM tableExpression
 
( columnDef
 
, ...
)
 
WHERE expression

 
GROUP BY expression
 
, ...
 
HAVING expression

 
ORDER BY order
 
, ...
 
LIMIT
bindParameter
number
-
- - -

Selects data from a table. DISTINCT filters out duplicate results while ALL, the default, includes all results. FROM identifies the table being queried (single table only currently - no joins or derived tables yet). Dynamic columns not declared at create time may be defined in parentheses after the table name and then used in the query. GROUP BY groups the result by the given expression(s). HAVING filters rows after grouping. ORDER BY sorts the result by the given column(s) or expression(s) and is only allowed for aggregate queries or queries with a LIMIT clause. LIMIT limits the number of rows returned by the query with no limit applied if specified as null or less than zero. The LIMIT clause is executed after the ORDER BY clause to support TopN type queries. An optional hint overrides the default query plan.

-

Example:

-

-SELECT * FROM TEST;
SELECT a.* FROM TEST;
SELECT DISTINCT NAME FROM TEST;
SELECT ID, COUNT(1) FROM TEST GROUP BY ID;
SELECT NAME, SUM(VAL) FROM TEST GROUP BY NAME HAVING COUNT(1) > 2;
SELECT 'ID' COL, MAX(ID) AS MAX FROM TEST;
SELECT * FROM TEST LIMIT 1000;

- -

UPSERT VALUES

- -
-UPSERT INTO tableName [( { columnRef | columnDef } [,...] )] VALUES ( constantTerm [,...] )
-
-
-
UPSERT INTO tableName
 
(
columnRef
columnDef
 
, ...
)
VALUES ( constantTerm
 
, ...
)
-
- - -

Inserts the row if it is not present and otherwise updates its values in the table. The list of columns is optional; if it is not present, the values map to the columns in the order they are declared in the schema. The values must evaluate to constants.

-

Example:

-

-UPSERT INTO TEST VALUES('foo','bar',3);
UPSERT INTO TEST(NAME,ID) VALUES('foo',123);

- -

UPSERT SELECT

- -
-UPSERT [/*+ hint */] INTO tableName [( { columnRef | columnDef } [,...] )] select
-
-
-
UPSERT
 
/ * + hint * /
INTO tableName
 
(
columnRef
columnDef
 
, ...
)
select
-
- - -

Inserts rows if they are not present and otherwise updates them, based on the results of running another query. The values are set based on their matching position between the source and target tables. The list of columns is optional; if it is not present, the values map to the columns in the order they are declared in the schema. If auto commit is on, and both a) the target table matches the source table, and b) the select performs no aggregation, then the population of the target table will be done completely on the server-side (with constraint violations logged, but otherwise ignored). Otherwise, data is buffered on the client and, if auto commit is on, committed in row batches as specified by the UpsertBatchSize connection property (or the phoenix.mutate.upsertBatchSize HBase config property which defaults to 10000 rows).

-

Example:

-

-UPSERT INTO test.targetTable(col1, col2) SELECT col3, col4 FROM test.sourceTable WHERE col5 < 100
UPSERT INTO foo SELECT * FROM bar;

- -

DELETE

- -
-DELETE [/*+ hint */] FROM tableName [ WHERE expression ]
-[ ORDER BY order [,...] ] [ LIMIT {bindParameter | number} ]
-
-
-
DELETE
 
/ * + hint * /
FROM tableName
 
WHERE expression

 
ORDER BY order
 
, ...
 
LIMIT
bindParameter
number
-
- - -

Deletes the rows selected by the where clause. If auto commit is on, the deletion is performed completely server-side.

-

Example:

-

-DELETE FROM TEST;
DELETE FROM TEST WHERE ID=123;
DELETE FROM TEST WHERE NAME LIKE 'foo%';

- -

CREATE

- -
-CREATE { TABLE | VIEW } [IF NOT EXISTS] tableRef
-( columnDef [,...] [constraint] )
-[tableOptions] [ SPLIT ON ( splitPoint [,...] ) ]
-
-
-
CREATE
TABLE
VIEW
 
IF NOT EXISTS
tableRef

( columnDef
 
, ...
 
constraint
)

 
tableOptions
 
SPLIT ON ( splitPoint
 
, ...
)
-
- - -

Creates a new table or view. For the creation of a table, the HBase table and any column families referenced are created if they don't already exist (using uppercase names unless they are double quoted in which case they are case sensitive). Column families outside of the ones listed are not affected. At create time, an empty key value is added to the first column family of any existing rows. Upserts will also add this empty key value. This is done to improve query performance by having a key value column we can guarantee always being there (minimizing the amount of data that must be projected). Alternatively, if a view is created, the HBase table and column families must already exist. No empty key value is added to existing rows and no data mutations are allowed - the view is read-only. Query performance for a view will not be as good as performance for a table. For a table only, HBase table and column configuration options may be passed through as key/value pairs to set up the HBase table as needed.

-

Example:

-

-CREATE TABLE my_schema.my_table ( id BIGINT not null primary key, date DATE not null)
CREATE TABLE my_table ( id INTEGER not null primary key desc, date DATE not null,
    m.db_utilization DECIMAL, i.db_utilization)
    m.DATA_BLOCK_ENCODING='DIFF'
CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date not null,
    txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) )
CREATE TABLE IF NOT EXISTS my_table ( id char(10) not null primary key, value integer)
    DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)

- -

DROP

- -
-DROP {TABLE | VIEW} [IF EXISTS] tableRef
-
-
-
DROP
TABLE
VIEW
 
IF EXISTS
tableRef
-
- - -

Drops a table or view. When dropping a table, the data in the table is deleted. For a view, on the other hand, the data is not affected. Note that the schema is versioned, such that snapshot queries connecting at an earlier time stamp may still query against the dropped table, as the HBase table itself is not deleted.

-

Example:

-

-DROP TABLE my_schema.my_table
DROP VIEW my_view

- -

ALTER TABLE

- -
-ALTER TABLE tableRef { { ADD [IF NOT EXISTS] columnDef [options] } | { DROP COLUMN [IF EXISTS] columnRef } | { SET options } }
-
-
-
ALTER TABLE tableRef
ADD
 
IF NOT EXISTS
columnDef
 
options
DROP COLUMN
 
IF EXISTS
columnRef
SET options
-
- - -

Alters an existing table by adding or removing a column or updating table options. When a column is dropped from a table, the data in that column is deleted as well. PK columns may not be dropped, and only nullable PK columns may be added. For a view, the data is not affected when a column is dropped. Note that creating or dropping columns only affects subsequent queries and data modifications. Snapshot queries that are connected at an earlier timestamp will still use the prior schema that was in place when the data was written.

-

Example:

-

-ALTER TABLE my_schema.my_table ADD d.dept_id char(10) VERSIONS=10
ALTER TABLE my_table ADD dept_name char(50)
ALTER TABLE my_table ADD parent_id char(15) null primary key
ALTER TABLE my_table DROP COLUMN d.dept_id
ALTER TABLE my_table DROP COLUMN dept_name
ALTER TABLE my_table DROP COLUMN parent_id
ALTER TABLE my_table SET IMMUTABLE_ROWS=true

- -

CREATE INDEX

- -
-CREATE INDEX [IF NOT EXISTS] indexName
-ON tableRef ( columnRef [ASC | DESC] [,...] )
-[ INCLUDE ( columnRef [,...] ) ]
-[indexOptions] [ SPLIT ON ( splitPoint [,...] ) ]
-
-
-
CREATE INDEX
 
IF NOT EXISTS
indexName

ON tableRef ( columnRef
 
ASC
DESC
 
, ...
)

 
INCLUDE ( columnRef
 
, ...
)

 
indexOptions
 
SPLIT ON ( splitPoint
 
, ...
)
-
- - -

Creates a new secondary index on a table or view. The index will be automatically kept in sync with the table as the data changes. At query time, the optimizer will use the index if it contains all columns referenced in the query and produces the most efficient execution plan. If a table has rows that are write-once and append-only, then the table may set the IMMUTABLE_ROWS property to true (either up-front in the CREATE TABLE statement or afterwards in an ALTER TABLE statement). This reduces the overhead at write time to maintain the index. Otherwise, if this property is not set on the table, then incremental index maintenance will be performed on the server side when the data changes.

-

Example:

-

-CREATE INDEX my_idx ON sales.opportunity(last_updated_date DESC)
CREATE INDEX my_idx ON log.event(created_date DESC) INCLUDE (name, payload) SALT_BUCKETS=10
CREATE INDEX IF NOT EXISTS my_comp_idx ON server_metrics ( gc_time DESC, created_date DESC )
    DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)

- -

DROP INDEX

- -
-DROP INDEX [IF EXISTS] indexName ON tableRef
-
-
-
DROP INDEX
 
IF EXISTS
indexName ON tableRef
-
- - -

Drops an index from a table. When dropping an index, the data in the index is deleted. Note that since metadata is versioned, snapshot queries connecting at an earlier time stamp may still use the index, as the HBase table backing the index is not deleted.

-

Example:

-

-DROP INDEX my_idx ON sales.opportunity
DROP INDEX IF EXISTS my_idx ON server_metrics

- -

ALTER INDEX

- -
-ALTER INDEX [IF EXISTS] indexName ON tableRef { DISABLE | REBUILD | UNUSABLE | USABLE }
-
-
-
ALTER INDEX
 
IF EXISTS
indexName ON tableRef
DISABLE
REBUILD
UNUSABLE
USABLE
-
- - -

Alters the state of an existing index. DISABLE will cause no further index maintenance to be performed on the index and it will no longer be considered for use in queries. REBUILD will completely rebuild the index and upon completion will enable the index to be used in queries again. UNUSABLE will cause the index to no longer be considered for use in queries; however, index maintenance will continue to be performed. USABLE will cause the index to again be considered for use in queries. Note that a disabled index must be rebuilt and cannot be set as USABLE.

-

Example:

-

-ALTER INDEX my_idx ON sales.opportunity DISABLE
ALTER INDEX IF EXISTS my_idx ON server_metrics REBUILD

- -

EXPLAIN

- -
-EXPLAIN {select|upsertSelect|delete}
-
-
-
EXPLAIN
select
upsertSelect
delete
-
- - -

Computes the logical steps necessary to execute the given command. Each step is represented as a string in a single column result set row.

-

Example:

-

-EXPLAIN SELECT NAME, COUNT(*) FROM TEST GROUP BY NAME HAVING COUNT(*) > 2;
EXPLAIN SELECT entity_id FROM CORE.CUSTOM_ENTITY_DATA WHERE organization_id='00D300000000XHP' AND SUBSTR(entity_id,1,3) = '002' AND created_date < CURRENT_DATE()-1;

- - -

Constraint

- -
CONSTRAINT constraintName PRIMARY KEY ( columnName
 
ASC
DESC
 
, ...
)
- - -

Defines a multi-part primary key constraint. Each column may be declared to be sorted in ascending or descending ordering. The default is ascending.

-

Example:

-

CONSTRAINT my_pk PRIMARY KEY (host,created_date)
CONSTRAINT my_pk PRIMARY KEY (host ASC,created_date DESC)

- -

Options

- -
 
familyName .
name =
value
bindParameter
 
, ...
- - -

Sets an option on an HBase table or column by modifying the respective HBase metadata. The option applies to the named family or if omitted to all families if the name references an HColumnDescriptor property. Otherwise, the option applies to the HTableDescriptor.

One built-in option is SALT_BUCKETS. This option causes an extra byte to be transparently prepended to every row key to ensure an even distribution of write load across all your region servers. This is useful when your row key is always monotonically increasing, causing hot spotting on a single region server. The byte is determined by hashing the row key and taking the result modulo the SALT_BUCKETS value. The value may be from 1 to 256. If no split points are defined for the table, it will automatically be pre-split at each possible salt bucket value. For an excellent write-up of this technique, see http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/

Another built-in option is IMMUTABLE_ROWS. Only tables with immutable rows are allowed to have indexes. Immutable rows are expected to be inserted once in their entirety and then never updated. This limitation will be removed once incremental index maintenance has been implemented. The current implementation inserts the index rows when the data row is inserted.

-

Example:

-

IMMUTABLE_ROWS=true
SALT_BUCKETS=10
DATA_BLOCK_ENCODING='NONE',a.VERSIONS=10
MAX_FILESIZE=2000000000,MEMSTORE_FLUSHSIZE=80000000
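
The salting described for SALT_BUCKETS can be sketched in Java. This is a minimal illustration under stated assumptions, not Phoenix's implementation: the class `SaltSketch`, its method names, and the use of `Arrays.hashCode` as the hash function are all hypothetical and chosen only for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SaltSketch {
    // Hypothetical helper: hash the row key and reduce it modulo the bucket count.
    // Arrays.hashCode is a stand-in hash (assumption), not the hash Phoenix uses.
    static byte saltByte(byte[] rowKey, int saltBuckets) {
        if (saltBuckets < 1 || saltBuckets > 256) {
            throw new IllegalArgumentException("SALT_BUCKETS must be from 1 to 256");
        }
        int hash = Arrays.hashCode(rowKey);
        // floorMod keeps the result in [0, saltBuckets) even for negative hashes.
        // Buckets above 127 wrap to a negative Java byte; read it back with & 0xff.
        return (byte) Math.floorMod(hash, saltBuckets);
    }

    // Prepends the salt byte to the original row key.
    static byte[] saltedKey(byte[] rowKey, int saltBuckets) {
        byte[] salted = new byte[rowKey.length + 1];
        salted[0] = saltByte(rowKey, saltBuckets);
        System.arraycopy(rowKey, 0, salted, 1, rowKey.length);
        return salted;
    }

    public static void main(String[] args) {
        byte[] key = "row-0001".getBytes(StandardCharsets.UTF_8);
        byte[] salted = saltedKey(key, 10);
        System.out.println("salt bucket = " + (salted[0] & 0xff));
    }
}
```

Because the salt is derived from the row key itself, the same key always lands in the same bucket, while sequential keys are spread across buckets.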

- -

Hint

- -
name
 
, ...
- - -

An advanced feature that overrides default query processing behavior. The supported hints include 1) SKIP_SCAN to force a skip scan to be performed on the query when it otherwise would not be. This option may improve performance if a query does not include the leading primary key column, but does include other, very selective primary key columns. 2) RANGE_SCAN to force a range scan to be performed on the query. This option may improve performance if a query filters on a range for a non-selective leading primary key column along with other primary key columns. 3) NO_INTRA_REGION_PARALLELIZATION to prevent the spawning of multiple threads to process data within a single region. This option is useful when the overall data set being queried is known to be small. 4) NO_INDEX to force the data table to be used for a query, and 5) INDEX(<table_name> <index_name>...) to suggest which index to use for a given query. Double quotes may be used to surround a table_name and/or index_name that is case sensitive.

-

Example:

-

/*+ SKIP_SCAN */
/*+ RANGE_SCAN */
/*+ NO_INTRA_REGION_PARALLELIZATION */
/*+ NO_INDEX */
/*+ INDEX(employee emp_name_idx emp_start_date_idx) */

- -

Column Def

- -
columnRef dataType
 
 
NOT
NULL
 
PRIMARY KEY
 
ASC
DESC
- - -

Defines a new column, optionally as a primary key column. The column name is case insensitive by default and case sensitive if double quoted. The sort order of a primary key may be ascending (ASC) or descending (DESC). The default is ascending.

-

Example:

-

id char(15) not null primary key
key integer null
m.response_time bigint

- -

Table Ref

- -
 
schemaName .
tableName
- - -

References a table with an optional schema name qualifier

-

Example:

-

Sales.Contact
HR.Employee
Department

- -

Column Ref

- -
 
familyName .
columnName
- - -

References a column with an optional family name qualifier

-

Example:

-

e.salary
dept_name

- -

Select Expression

- -
*
( familyName . * )
term
 
 
AS
columnAlias
- - -

An expression in a SELECT statement. All columns in a table may be selected using *, and all columns in a column family may be selected using <familyName>.*.

-

Example:

-

*
cf.*
ID AS VALUE
VALUE + 1 VALUE_PLUS_ONE

- -

Split Point

- -
value
bindParameter
- - -

Defines a split point for a table. Use a bind parameter with preparedStatement.setBytes(int,byte[]) to supply arbitrary bytes.

-

Example:

-

'A'

- -

Table Expression

- -
 
schemaName .
tableName
 
 
AS
tableAlias
- - -

A reference to a table. Joins and sub queries are not currently supported.

-

Example:

-

PRODUCT_METRICS AS PM

- -

Order

- -
expression
 
ASC
DESC
 
NULLS
FIRST
LAST
- - -

Sorts the result by an expression.

-

Example:

-

NAME DESC NULLS LAST

- -

Expression

- -
andCondition
 
OR andCondition
 
...
- - -

Value or condition.

-

Example:

-

ID=1 OR NAME='Hi'

- -

And Condition

- -
condition
 
AND condition
 
...
- - -

Value or condition.

-

Example:

-

ID=1 AND NAME='Hi'

- -

Condition

- -
operand
 
compare operand
 
NOT
IN ( constantOperand
 
, ...
)
 
NOT
LIKE operand
 
NOT
BETWEEN operand AND operand
IS
 
NOT
NULL
NOT expression
( expression )
- - -

Boolean value or condition. When comparing with LIKE, the wildcard characters are _ (any one character) and % (any characters). To search for the characters % and _, the characters need to be escaped. The escape character is \ (backslash). Patterns that end with an escape character are invalid and the expression returns NULL. BETWEEN does an inclusive comparison for both operands.

-

Example:

-

NAME LIKE 'Jo%'
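
The LIKE rules described above can be sketched in Java by translating a pattern into a regular expression. This is only an illustration of the stated semantics, not how Phoenix evaluates LIKE: the class `LikeSketch` and its `like` method are hypothetical names, and a Java `null` stands in for SQL NULL.

```java
import java.util.regex.Pattern;

final class LikeSketch {
    // Hypothetical helper: returns TRUE/FALSE for a match, or null (SQL NULL)
    // when the pattern ends with the escape character.
    static Boolean like(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\') {
                if (i + 1 == pattern.length()) {
                    return null; // trailing escape character: invalid pattern
                }
                // The escaped character is matched literally.
                regex.append(Pattern.quote(String.valueOf(pattern.charAt(++i))));
            } else if (c == '_') {
                regex.append('.');   // any one character
            } else if (c == '%') {
                regex.append(".*");  // any characters
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("John", "Jo%"));
        System.out.println(like("100%", "100\\%"));
    }
}
```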

- -

Compare

- -
< >
< =
> =
=
<
>
! =
- - -

Comparison operator. The operator != is the same as <>.

-

Example:

-

<>

- -

Operand

- -
summand
 
|| summand
 
...
- - -

A string concatenation.

-

Example:

-

'foo'|| s

- -

Summand

- -
factor
 
+
-
factor
 
...
- - -

An addition or subtraction of numeric or date type values

-

Example:

-

a + b
a - b

- -

Factor

- -
term
 
*
/
term
 
...
- - -

A multiplication or division.

-

Example:

-

c * d
e / 5

- -

Term

- -
value
bindParameter
Function
case
caseWhen
( operand )
 
tableAlias .
columnRef
rowValueConstructor
- - -

A value.

-

Example:

-

'Hello'

- -

Row Value Constructor

- -
( term , term
 
...
)
- - -

A row value constructor is a list of other terms which are treated together as a kind of composite structure. They may be compared to each other or to other terms. The main use cases are 1) to enable efficiently stepping through a set of rows in support of query-more type functionality, or 2) to allow an IN clause to perform point gets on composite row keys.

-

Example:

-

(col1, col2, 5)

- -

Bind Parameter

- -
?
: number
- - -

A parameter can be indexed, for example :1 meaning the first parameter.

-

Example:

-

:1
?

- -

Value

- -
string
numeric
boolean
null
- - -

A literal value of any data type, or null.

-

Example:

-

10

- -

Case

- -
CASE term WHEN expression THEN term
 
...

 
ELSE expression
END
- - -

Returns the first expression where the value is equal to the test expression. If no else part is specified, return NULL.

-

Example:

-

CASE CNT WHEN 0 THEN 'No' WHEN 1 THEN 'One' ELSE 'Some' END

- -

Case When

- -
CASE WHEN expression THEN term
 
...

 
ELSE term
END
- - -

Returns the first expression where the condition is true. If no else part is specified, return NULL.

-

Example:

-

CASE WHEN CNT<10 THEN 'Low' ELSE 'High' END

- -

Name

- -
A-Z | _
 
A-Z | _
0-9
 
...
quotedName
- - -

Unquoted names are not case sensitive. There is no maximum name length.

-

Example:

-

my_column

- -

Quoted Name

- -
" anything "
- - -

Quoted names are case sensitive, and can contain spaces. There is no maximum name length. Two double quotes can be used to create a single double quote inside an identifier.

-

Example:

-

"first-name"

- -
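
The naming rules for unquoted and quoted names can be sketched in Java. This is a hedged illustration, not Phoenix's parser: the class `NameSketch` and its `normalize` method are hypothetical, and the upper-casing of unquoted names follows the normalization behavior described in the FAQ.

```java
import java.util.Locale;

final class NameSketch {
    // Hypothetical helper: quoted names keep their case and have doubled
    // double quotes collapsed; unquoted names are upper-cased.
    static String normalize(String name) {
        if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
            String inner = name.substring(1, name.length() - 1);
            return inner.replace("\"\"", "\"");
        }
        return name.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("my_column"));
        System.out.println(normalize("\"first-name\""));
    }
}
```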

Alias

- -name - - -

An alias is a name that is only valid in the context of the statement.

-

Example:

-

A

- -

Null

- -NULL - - -

NULL is a value without data type and means 'unknown value'.

-

Example:

-

NULL

- -

Data Type

- -
charType
varcharType
decimalType
tinyintType
smallintType
integerType
bigintType
floatType
doubleType
timestampType
dateType
timeType
unsignedTinyintType
unsignedSmallintType
unsignedIntType
unsignedLongType
unsignedFloatType
unsignedDoubleType
binaryType
varbinaryType
- - -

A type name.

-

Example:

-

CHAR(15)
VARCHAR
VARCHAR(1000)
INTEGER
BINARY(200)

- -

String

- -
' anything '
- - -

A string starts and ends with a single quote. Two single quotes can be used to create a single quote inside a string.

-

Example:

-

'John''s car'

- -

Boolean

- -
TRUE
FALSE
- - -

A boolean value.

-

Example:

-

TRUE

- -

Numeric

- -
int
long
decimal
- - -

The data type of a numeric value is always the lowest possible for the given value. If the number contains a dot this is decimal; otherwise it is int, long, or decimal (depending on the value).

-

Example:

-

SELECT -10.05
SELECT 5
SELECT 12345678912345
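
The "lowest possible type" rule for numeric literals can be sketched in Java. This is an illustration of the rule as stated, not Phoenix's parser: the class `NumericLiteralSketch` and its `typeOf` method are hypothetical names.

```java
import java.math.BigInteger;

final class NumericLiteralSketch {
    // Hypothetical helper: picks the smallest type that can hold the literal.
    static String typeOf(String literal) {
        if (literal.contains(".")) {
            return "decimal"; // a dot always means decimal
        }
        BigInteger value = new BigInteger(literal);
        // bitLength excludes the sign bit, so 31 bits fit a 32-bit int
        // and 63 bits fit a 64-bit long.
        if (value.bitLength() <= 31) {
            return "int";
        }
        if (value.bitLength() <= 63) {
            return "long";
        }
        return "decimal";
    }

    public static void main(String[] args) {
        System.out.println(typeOf("-10.05"));
        System.out.println(typeOf("5"));
        System.out.println(typeOf("12345678912345"));
    }
}
```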

- -

Int

- -
 
-
number
- - -

The maximum integer number is 2147483647, the minimum is -2147483648.

-

Example:

-

10

- -

Long

- -
 
-
number
- - -

Long numbers are between -9223372036854775808 and 9223372036854775807.

-

Example:

-

100000

- -

Decimal

- -
 
-
number
 
. number
- - -

A decimal number with fixed precision and scale. Internally, java.math.BigDecimal is used.

-

Example:

-

SELECT -10.5

- -

Number

- -
0-9
 
...
- - -

The maximum length of the number depends on the data type used.

-

Example:

-

100

- -

Comments

- -
- - anything
/ / anything
/ * anything * /
- - -

Comments can be used anywhere in a command and are ignored by the database. Line comments end with a newline. Block comments cannot be nested, but can be multiple lines long.

-

Example:

-

// This is a comment

- - - - diff --git a/phoenix-core/src/site/markdown/Phoenix-in-15-minutes-or-less.md b/phoenix-core/src/site/markdown/Phoenix-in-15-minutes-or-less.md deleted file mode 100644 index bc01b185..00000000 --- a/phoenix-core/src/site/markdown/Phoenix-in-15-minutes-or-less.md +++ /dev/null @@ -1,80 +0,0 @@ -# Phoenix in 15 minutes or less - -*What is this new [Phoenix](index.html) thing I've been hearing about?*
-Phoenix is an open source SQL skin for HBase. You use the standard JDBC APIs instead of the regular HBase client APIs to create tables, insert data, and query your HBase data. - -*Doesn't putting an extra layer between my application and HBase just slow things down?*
-Actually, no. Phoenix achieves as good or likely better [performance](performance.html) than if you hand-coded it yourself (not to mention with a heck of a lot less code) by: -* compiling your SQL queries to native HBase scans -* determining the optimal start and stop for your scan key -* orchestrating the parallel execution of your scans -* bringing the computation to the data by - * pushing the predicates in your where clause to a server-side filter - * executing aggregate queries through server-side hooks (called co-processors) - -In addition to these items, we've got some interesting enhancements in the works to further optimize performance: -* secondary indexes to improve performance for queries on non row key columns -* stats gathering to improve parallelization and guide choices between optimizations -* skip scan filter to optimize IN, LIKE, and OR queries -* optional salting of row keys to evenly distribute write load - -*Ok, so it's fast. But why SQL? It's so 1970s*
-Well, that's kind of the point: give folks something with which they're already familiar. What better way to spur the adoption of HBase? On top of that, using JDBC and SQL: -* Reduces the amount of code users need to write -* Allows for performance optimizations transparent to the user -* Opens the door for leveraging and integrating lots of existing tooling - -*But how can SQL support my favorite HBase technique of x,y,z*
-Didn't make it to the last HBase Meetup did you? SQL is just a way of expressing *what you want to get* not *how you want to get it*. Check out my [presentation](http://files.meetup.com/1350427/IntelPhoenixHBaseMeetup.ppt) for various existing and to-be-done Phoenix features to support your favorite HBase trick. Have ideas of your own? We'd love to hear about them: file an [issue](issues.html) for us and/or join our [mailing list](mailing_list.html). - -*Blah, blah, blah - I just want to get started!*
-Ok, great! Just follow our [install instructions](download.html#Installation): -* [download](download.html) and expand our installation tar -* copy the phoenix jar into the HBase lib directory of every region server -* restart the region servers -* add the phoenix client jar to the classpath of your HBase client -* download and [setup SQuirrel](download.html#SQL-Client) as your SQL client so you can issue adhoc SQL against your HBase cluster - -*I don't want to download and setup anything else!*
-Ok, fair enough - you can create your own SQL scripts and execute them using our command line tool instead. Let's walk through an example now. In the bin directory of your install location: -* Create us_population.sql file -
CREATE TABLE IF NOT EXISTS us_population (
-      state CHAR(2) NOT NULL,
-      city VARCHAR NOT NULL,
-      population BIGINT
-      CONSTRAINT my_pk PRIMARY KEY (state, city));
-* Create us_population.csv file -
NY,New York,8143197
-CA,Los Angeles,3844829
-IL,Chicago,2842518
-TX,Houston,2016582
-PA,Philadelphia,1463281
-AZ,Phoenix,1461575
-TX,San Antonio,1256509
-CA,San Diego,1255540
-TX,Dallas,1213825
-CA,San Jose,912332
-
-* Create us_population_queries.sql file -
SELECT state as "State",count(city) as "City Count",sum(population) as "Population Sum"
-FROM us_population
-GROUP BY state
-ORDER BY sum(population) DESC;
-
-* Execute the following command from a command terminal -
./psql.sh <your_zookeeper_quorum> us_population.sql us_population.csv us_population_queries.sql
-
- -Congratulations! You've just created your first Phoenix table, inserted data into it, and executed an aggregate query with just a few lines of code in 15 minutes or less! - -*Big deal - 10 rows! What else you got?*
-Ok, ok - tough crowd. Check out our bin/performance.sh script to create as many rows as you want, for any schema you come up with, and run timed queries against it. - -*Why is it called Phoenix anyway? Did some other project crash and burn and this is the next generation?*
-I'm sorry, but we're out of time and space, so we'll have to answer that next time! - -Thanks for your time,
-James Taylor
-http://phoenix-hbase.blogspot.com/ -
-@JamesPlusPlus
diff --git a/phoenix-core/src/site/markdown/building.md b/phoenix-core/src/site/markdown/building.md deleted file mode 100644 index 1b6a3f46..00000000 --- a/phoenix-core/src/site/markdown/building.md +++ /dev/null @@ -1,25 +0,0 @@ -# Building Phoenix Project - -Phoenix is a fully mavenized project. That means you can build simply by doing: - -``` -$ mvn package -``` - -This builds, tests, and packages Phoenix and puts the resulting jars (phoenix-[version].jar and phoenix-[version]-client.jar) in the generated phoenix-core/target/ and phoenix-assembly/target/ directories respectively. - -To build, but skip running the tests, you can do: - -``` - $ mvn package -DskipTests -``` - -To only build the generated parser (i.e. PhoenixSQLLexer and PhoenixSQLParser), you can do: - -``` - $ mvn install -DskipTests - $ mvn process-sources -``` - -To build an Eclipse project, install the m2e plugin and do a File->Import...->Import Existing Maven Projects, selecting the root directory of Phoenix. - 
- -### Installation ### -To install a pre-built phoenix, use these directions: - -* Download and expand the latest phoenix-[version]-install.tar -* Add the phoenix-[version].jar to the classpath of every HBase region server. An easy way to do this is to copy it into the HBase lib directory. -* Restart all region servers. -* Add the phoenix-[version]-client.jar to the classpath of any Phoenix client. - -### Getting Started ### -Wanted to get started quickly? Take a look at our [FAQs](faq.html) and take our quick start guide [here](Phoenix-in-15-minutes-or-less.html). - -

Command Line

- -A terminal interface to execute SQL from the command line is now bundled with Phoenix. To start it, execute the following from the bin directory: - - $ sqlline.sh localhost - -To execute SQL scripts from the command line, you can include a SQL file argument like this: - - $ sqlline.sh localhost ../examples/stock_symbol.sql - -![sqlline](images/sqlline.png) - -For more information, see the [manual](http://www.hydromatic.net/sqlline/manual.html). - -
Loading Data
- -In addition, you can use the bin/psql.sh to load CSV data or execute SQL scripts. For example: - - $ psql.sh localhost ../examples/web_stat.sql ../examples/web_stat.csv ../examples/web_stat_queries.sql - -Other alternatives include: -* Using our [map-reduce based CSV loader](mr_dataload.html) for bigger data sets -* [Mapping an existing HBase table to a Phoenix table](index.html#Mapping-to-an-Existing-HBase-Table) and using the [UPSERT SELECT](language/index.html#upsert_select) command to populate a new table. -* Populating the table through our [UPSERT VALUES](language/index.html#upsert_values) command. - -

SQL Client

- -If you'd rather use a client GUI to interact with Phoenix, download and install [SQuirrel](http://squirrel-sql.sourceforge.net/). Since Phoenix is a JDBC driver, integration with tools such as this are seamless. Here are the setup steps necessary: - -1. Remove prior phoenix-[version]-client.jar from the lib directory of SQuirrel -2. Copy the phoenix-[version]-client.jar into the lib directory of SQuirrel (Note that on a Mac, this is the *internal* lib directory). -3. Start SQuirrel and add new driver to SQuirrel (Drivers -> New Driver) -4. In Add Driver dialog box, set Name to Phoenix -5. Press List Drivers button and org.apache.phoenix.jdbc.PhoenixDriver should be automatically populated in the Class Name textbox. Press OK to close this dialog. -6. Switch to Alias tab and create the new Alias (Aliases -> New Aliases) -7. In the dialog box, Name: _any name_, Driver: Phoenix, User Name: _anything_, Password: _anything_ -8. Construct URL as follows: jdbc:phoenix: _zookeeper quorum server_. For example, to connect to a local HBase use: jdbc:phoenix:localhost -9. Press Test (which should succeed if everything is setup correctly) and press OK to close. -10. Now double click on your newly created Phoenix alias and click Connect. Now you are ready to run SQL queries against Phoenix. - -Through SQuirrel, you can issue SQL statements in the SQL tab (create tables, insert data, run queries), and inspect table metadata in the Object tab (i.e. list tables, their columns, primary keys, and types). - -![squirrel](images/squirrel.png) - -### Samples ### -The best place to see samples are in our unit tests under src/test/java. The ones in the endToEnd package are tests demonstrating how to use all aspects of the Phoenix JDBC driver. We also have some examples in the examples directory. - -### Phoenix Client - Server Compatibility - -Major and minor version should match between client and server (patch version can mismatch). 
The following is the list of compatible client and server version(s). It is recommended that the same client and server version be used. - -Phoenix Client Version | Compatible Server Versions ------------------------|--- -1.0.0 | 1.0.0 -1.1.0 | 1.1.0 -1.2.0 | 1.2.0, 1.2.1 -1.2.1 | 1.2.0, 1.2.1 -2.0.0 | 2.0.0, 2.0.1, 2.0.2 -2.0.1 | 2.0.0, 2.0.1, 2.0.2 -2.0.2 | 2.0.0, 2.0.1, 2.0.2 -2.1.0 | 2.1.0, 2.1.1, 2.1.2 -2.1.1 | 2.1.0, 2.1.1, 2.1.2 -2.1.2 | 2.1.0, 2.1.1, 2.1.2 -2.2.0 | 2.2.0, 2.2.1 -2.2.1 | 2.2.0, 2.2.1 - -[![githalytics.com alpha](https://cruel-carlota.pagodabox.com/33878dc7c0522eed32d2d54db9c59f78 "githalytics.com")](http://githalytics.com/forcedotcom/phoenix.git) diff --git a/phoenix-core/src/site/markdown/dynamic_columns.md b/phoenix-core/src/site/markdown/dynamic_columns.md deleted file mode 100644 index 0c7d9ce2..00000000 --- a/phoenix-core/src/site/markdown/dynamic_columns.md +++ /dev/null @@ -1,17 +0,0 @@ -# Dynamic Columns - -Sometimes defining a static schema up front is not feasible. Instead, a subset of columns may be specified at table [create](language/index.html#create) time while the rest would be specified at [query](language/index.html#select) time. As of Phoenix 1.2, specifying columns dynamically is supported by allowing column definitions to be included in parentheses after the table in the FROM clause on a SELECT statement. Although this is not standard SQL, it is useful to surface this type of functionality to leverage the late binding ability of HBase. 
- -For example: - - SELECT eventTime, lastGCTime, usedMemory, maxMemory - FROM EventLog(lastGCTime TIME, usedMemory BIGINT, maxMemory BIGINT) - WHERE eventType = 'OOM' AND lastGCTime < eventTime - 1 - -Where you may have defined only a subset of your event columns at create time, since each event type may have different properties: - - CREATE TABLE EventLog ( - eventId BIGINT NOT NULL, - eventTime TIME NOT NULL, - eventType CHAR(3) NOT NULL - CONSTRAINT pk PRIMARY KEY (eventId, eventTime)) diff --git a/phoenix-core/src/site/markdown/faq.md b/phoenix-core/src/site/markdown/faq.md deleted file mode 100644 index cbcfc0a1..00000000 --- a/phoenix-core/src/site/markdown/faq.md +++ /dev/null @@ -1,279 +0,0 @@ -# F.A.Q. - -* [I want to get started. Is there a Phoenix Hello World?](#I_want_to_get_started_Is_there_a_Phoenix_Hello_World) -* [Is there a way to bulk load in Phoenix?](#Is_there_a_way_to_bulk_load_in_Phoenix) -* [How do I create a VIEW in Phoenix? What's the difference between a VIEW and a TABLE?](#How_I_create_Views_in_Phoenix_Whatnulls_the_difference_between_ViewsTables) -* [Are there any tips for optimizing Phoenix?](#Are_there_any_tips_for_optimizing_Phoenix) -* [How do I create Secondary Index on a table?](#How_do_I_create_Secondary_Index_on_a_table) -* [Why isn't my secondary index being used?](#Why_isnnullt_my_secondary_index_being_used) -* [How fast is Phoenix? Why is it so fast?](#How_fast_is_Phoenix_Why_is_it_so_fast) -* [How do I connect to secure HBase cluster?](#How_do_I_connect_to_secure_HBase_cluster) -* [How do I connect with HBase running on Hadoop-2?](#How_do_I_connect_with_HBase_running_on_Hadoop-2) -* [Can phoenix work on tables with arbitrary timestamp as flexible as HBase API?](#Can_phoenix_work_on_tables_with_arbitrary_timestamp_as_flexible_as_HBase_API) -* [Why isn't my query doing a RANGE SCAN?](#Why_isnnullt_my_query_doing_a_RANGE_SCAN) - - -### I want to get started. Is there a Phoenix _Hello World_? 
- -*Pre-requisite:* Download latest Phoenix from [here](download.html) -and copy phoenix-*.jar to HBase lib folder and restart HBase. - -**1. Using console** - -1. Start Sqlline: `$ sqlline.sh [zookeeper]` -2. Execute the following statements when Sqlline connects: - -``` -create table test (mykey integer not null primary key, mycolumn varchar); -upsert into test values (1,'Hello'); -upsert into test values (2,'World!'); -select * from test; -``` - -3. You should get the following output - -``` -+-------+------------+ -| MYKEY | MYCOLUMN | -+-------+------------+ -| 1 | Hello | -| 2 | World! | -+-------+------------+ -``` - - -**2. Using java** - -Create test.java file with the following content: - -``` -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.ResultSet; -import java.sql.SQLException; -import java.sql.PreparedStatement; -import java.sql.Statement; - -public class test { - - public static void main(String[] args) throws SQLException { - Statement stmt = null; - ResultSet rset = null; - - Connection con = DriverManager.getConnection("jdbc:phoenix:[zookeeper]"); - stmt = con.createStatement(); - - stmt.executeUpdate("create table test (mykey integer not null primary key, mycolumn varchar)"); - stmt.executeUpdate("upsert into test values (1,'Hello')"); - stmt.executeUpdate("upsert into test values (2,'World!')"); - con.commit(); - - PreparedStatement statement = con.prepareStatement("select * from test"); - rset = statement.executeQuery(); - while (rset.next()) { - System.out.println(rset.getString("mycolumn")); - } - statement.close(); - con.close(); - } -} -``` -Compile and execute on command line - -`$ javac test.java` - -`$ java -cp "../phoenix-[version]-client.jar:." test` - - -You should get the following output - -`Hello` -`World!` - - - -### Is there a way to bulk load in Phoenix? 
- -**Map Reduce** - -See the example [here](mr_dataload.html) Credit: Arun Singh - -**CSV** - -CSV data can be bulk loaded with built in utility named psql. Typical upsert rates are 20K - 50K rows per second (depends on how wide are the rows). - -Usage example: -Create table using psql -`$ psql.sh [zookeeper] ../examples/web_stat.sql` - -Upsert CSV bulk data -`$ psql.sh [zookeeper] ../examples/web_stat.csv` - - - -### How I create Views in Phoenix? What's the difference between Views/Tables? - -You can create both a Phoenix table or view through the CREATE TABLE/CREATE VIEW DDL statement on a pre-existing HBase table. In both cases, we'll leave the HBase metadata as-is, except for with a TABLE we turn KEEP_DELETED_CELLS on. For CREATE TABLE, we'll create any metadata (table, column families) that doesn't already exist. We'll also add an empty key value for each row so that queries behave as expected (without requiring all columns to be projected during scans). - -The other caveat is that the way the bytes were serialized must match the way the bytes are serialized by Phoenix. For VARCHAR,CHAR, and UNSIGNED_* types, we use the HBase Bytes methods. The CHAR type expects only single-byte characters and the UNSIGNED types expect values greater than or equal to zero. - -Our composite row keys are formed by simply concatenating the values together, with a zero byte character used as a separator after a variable length type. - -If you create an HBase table like this: - -`create 't1', {NAME => 'f1', VERSIONS => 5}` - -then you have an HBase table with a name of 't1' and a column family with a name of 'f1'. Remember, in HBase, you don't model the possible KeyValues or the structure of the row key. This is the information you specify in Phoenix above and beyond the table and column family. - -So in Phoenix, you'd create a view like this: - -`CREATE VIEW "t1" ( pk VARCHAR PRIMARY KEY, "f1".val VARCHAR )` - -The "pk" column declares that your row key is a VARCHAR (i.e. 
a string) while the "f1".val column declares that your HBase table will contain KeyValues with a column family and column qualifier of "f1":VAL and that their value will be a VARCHAR. - -Note that you don't need the double quotes if you create your HBase table with all caps names (since this is how Phoenix normalizes strings, by upper casing them). For example, with: - -`create 'T1', {NAME => 'F1', VERSIONS => 5}` - -you could create this Phoenix view: - -`CREATE VIEW t1 ( pk VARCHAR PRIMARY KEY, f1.val VARCHAR )` - -Or if you're creating new HBase tables, just let Phoenix do everything for you like this (no need to use the HBase shell at all): - -`CREATE TABLE t1 ( pk VARCHAR PRIMARY KEY, val VARCHAR )` - - - -### Are there any tips for optimizing Phoenix? - -* Use **Salting** to increase read/write performance -Salting can significantly increase read/write performance by pre-splitting the data into multiple regions. Salting will yield better performance in most scenarios. - -Example: - -` CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SALT_BUCKETS=16` - -Note: Ideally for a 16 region server cluster with quad-core CPUs, choose between 32 and 64 salt buckets for optimal performance. - -* **Pre-split** table -Salting splits the table automatically, but if you want to control exactly where the table splits occur without adding an extra byte or changing the row key order, you can pre-split the table. - -Example: - -` CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SPLIT ON ('CS','EU','NA')` - -* Use **multiple column families** - -Each column family stores its data in separate files. If your queries use only selected columns, it makes sense to group those columns together in a column family to improve read performance. - -Example: - -The following CREATE TABLE DDL will create two column families, A and B. 
- -` CREATE TABLE TEST (MYKEY VARCHAR NOT NULL PRIMARY KEY, A.COL1 VARCHAR, A.COL2 VARCHAR, B.COL3 VARCHAR)` - -* Use **compression** -On-disk compression improves performance on large tables. - -Example: - -` CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) COMPRESSION='GZ'` - -* Create **indexes** -See [faq.html#/How_do_I_create_Secondary_Index_on_a_table](faq.html#/How_do_I_create_Secondary_Index_on_a_table) - -* **Optimize cluster** parameters -See http://hbase.apache.org/book/performance.html - -* **Optimize Phoenix** parameters -See [tuning.html](tuning.html) - - - -### How do I create Secondary Index on a table? - -Starting with Phoenix version 2.1, Phoenix supports indexes over mutable and immutable data. Note that Phoenix 2.0.x only supports indexes over immutable data. Index write performance on an immutable table is slightly faster than on a mutable table; however, data in an immutable table cannot be updated. - -Example - -* Create table - -Immutable table: `create table test (mykey varchar primary key, col1 varchar, col2 varchar) IMMUTABLE_ROWS=true;` - -Mutable table: `create table test (mykey varchar primary key, col1 varchar, col2 varchar);` - -* Creating index on col2 - -`create index idx on test (col2)` - -* Creating index on col1 and a covered index on col2 - -`create index idx on test (col1) include (col2)` - -Upsert rows in this test table and the Phoenix query optimizer will choose the correct index to use. You can see in the [explain plan](language/index.html#explain) whether Phoenix is using the index table. You can also give a [hint](language/index.html#hint) in a Phoenix query to use a specific index. - - - -### Why isn't my secondary index being used? - -The secondary index won't be used unless all columns used in the query are in it (as indexed or covered columns). All columns making up the primary key of the data table will automatically be included in the index. 
- -Example: DDL `create table usertable (id varchar primary key, firstname varchar, lastname varchar); create index idx_name on usertable (firstname);` - -Query: `select id, firstname, lastname from usertable where firstname = 'foo';` - -The index would not be used in this case because lastname is neither an indexed nor a covered column. This can be verified by looking at the explain plan. To fix this, create an index that has lastname either as an indexed column or as a covered column. Example: `create index idx_name on usertable (firstname) include (lastname);` - - -### How fast is Phoenix? Why is it so fast? - -Phoenix is fast. A full table scan of 100M rows usually completes in 20 seconds (narrow table on a medium sized cluster). This time comes down to a few milliseconds if the query contains a filter on key columns. For filters on non-key columns or non-leading key columns, you can add an index on these columns, which gives performance equivalent to filtering on a key column by making a copy of the table with the indexed column(s) as part of the key. - -Why is Phoenix fast even when doing a full scan: - -1. Phoenix chunks up your query using the region boundaries and runs the chunks in parallel on the client using a configurable number of threads -2. The aggregation will be done in a coprocessor on the server-side, collapsing the amount of data that gets returned back to the client rather than returning it all. - - - -### How do I connect to secure HBase cluster? -Check out the excellent post by Anil Gupta -http://bigdatanoob.blogspot.com/2013/09/connect-phoenix-to-secure-hbase-cluster.html - - - -### How do I connect with HBase running on Hadoop-2? -A Hadoop-2 profile exists in the Phoenix pom.xml. - - -### Can phoenix work on tables with arbitrary timestamp as flexible as HBase API? -By default, Phoenix lets HBase manage the timestamps and just shows you the latest values for everything. However, Phoenix also allows arbitrary timestamps to be supplied by the user. 
To do that you'd specify a "CurrentSCN" (or PhoenixRuntime.CURRENT_SCN_ATTRIB if you want to use our constant) at connection time, like this: - - Properties props = new Properties(); - props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts)); - Connection conn = DriverManager.getConnection(myUrl, props); - - conn.createStatement().execute("UPSERT INTO myTable VALUES ('a')"); - conn.commit(); -The above is equivalent to doing this with the HBase API: - - myTable.put(Bytes.toBytes('a'),ts); -By specifying a CurrentSCN, you're telling Phoenix that you want everything for that connection to be done at that timestamp. Note that this applies to queries done on the connection as well - for example, a query over myTable above would not see the data it just upserted, since it only sees data that was created before its CurrentSCN property. This provides a way of doing snapshot, flashback, or point-in-time queries. - -Keep in mind that creating a new connection is *not* an expensive operation. The same underlying HConnection is used for all connections to the same cluster, so it's more or less like instantiating a few objects. - - -### Why isn't my query doing a RANGE SCAN? - -`DDL: CREATE TABLE TEST (pk1 char(1) not null, pk2 char(1) not null, pk3 char(1) not null, non_pk varchar CONSTRAINT PK PRIMARY KEY(pk1, pk2, pk3));` - -RANGE SCAN means that only a subset of the rows in your table will be scanned over. This occurs if you use one or more leading columns from your primary key constraint. A query that does not filter on the leading PK columns, e.g. `select * from test where pk2='x' and pk3='y';`, will result in a full scan, whereas the following query will result in a range scan: `select * from test where pk1='x' and pk2='y';`. Note that you can add a secondary index on your "pk2" and "pk3" columns and that would cause a range scan to be done for the first query (over the index table). - -DEGENERATE SCAN means that a query can't possibly return any rows. 
If we can determine that at compile time, then we don't bother to even run the scan. - -FULL SCAN means that all rows of the table will be scanned over (potentially with a filter applied if you have a WHERE clause) - -SKIP SCAN means that either a subset or all rows in your table will be scanned over, however it will skip large groups of rows depending on the conditions in your filter. See this blog for more detail. We don't do a SKIP SCAN if you have no filter on the leading primary key columns, but you can force a SKIP SCAN by using the /*+ SKIP_SCAN */ hint. Under some conditions, namely when the cardinality of your leading primary key columns is low, it will be more efficient than a FULL SCAN. - - diff --git a/phoenix-core/src/site/markdown/flume.md b/phoenix-core/src/site/markdown/flume.md deleted file mode 100644 index 6cc9251a..00000000 --- a/phoenix-core/src/site/markdown/flume.md +++ /dev/null @@ -1,42 +0,0 @@ -# Apache Flume Plugin - -The plugin enables us to reliably and efficiently stream large amounts of data/logs onto HBase using the Phoenix API. The necessary configuration of the custom Phoenix sink and the Event Serializer has to be configured in the Flume configuration file for the Agent. Currently, the only supported Event serializer is a RegexEventSerializer which primarily breaks the Flume Event body based on the regex specified in the configuration file. - -#### Prerequisites: - -* Phoenix v 3.0.0 SNAPSHOT + -* Flume 1.4.0 + - -#### Installation & Setup: - -1. Download and build Phoenix v 0.3.0 SNAPSHOT -2. Follow the instructions as specified [here](building.html) to build the project as the Flume plugin is still under beta -3. Create a directory plugins.d within $FLUME_HOME directory. Within that, create a sub-directories phoenix-sink/lib -4. 
Copy the generated phoenix-3.0.0-SNAPSHOT-client.jar onto $FLUME_HOME/plugins.d/phoenix-sink/lib - -#### Configuration: - -Property Name |Default| Description ---------------------------|-------|--- -type | |org.apache.phoenix.flume.sink.PhoenixSink -batchSize |100 |Default number of events per transaction -zookeeperQuorum | |Zookeeper quorum of the HBase cluster -table | |The name of the table in HBase to write to. -ddl | |The CREATE TABLE query for the HBase table where the events will be upserted to. If specified, the query will be executed. Recommended to include the IF NOT EXISTS clause in the ddl. -serializer |regex |Event serializers for processing the Flume Event . Currently , only regex is supported. -serializer.regex |(.*) |The regular expression for parsing the event. -serializer.columns | |The columns that will be extracted from the Flume event for inserting into HBase. -serializer.headers | |Headers of the Flume Events that go as part of the UPSERT query. The data type for these columns are VARCHAR by default. -serializer.rowkeyType | |A custom row key generator . Can be one of timestamp,date,uuid,random and nanotimestamp. This should be configured in cases where we need a custom row key value to be auto generated and set for the primary key column. - - -For an example configuration for ingesting Apache access logs onto Phoenix, see [this](https://github.com/forcedotcom/phoenix/blob/master/src/main/config/apache-access-logs.properties) property file. Here we are using UUID as a row key generator for the primary key. - -#### Starting the agent: - $ bin/flume-ng agent -f conf/flume-conf.properties -c ./conf -n agent - -#### Monitoring: - For monitoring the agent and the sink process , enable JMX via flume-env.sh($FLUME_HOME/conf/flume-env.sh) script. Ensure you have the following line uncommented. 
- - JAVA_OPTS="-Xms1g -Xmx1g -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3141 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" - diff --git a/phoenix-core/src/site/markdown/index.md b/phoenix-core/src/site/markdown/index.md deleted file mode 100644 index 8b9f0b04..00000000 --- a/phoenix-core/src/site/markdown/index.md +++ /dev/null @@ -1,69 +0,0 @@ -# Overview - -Apache Phoenix is a SQL skin over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in [performance](performance.html) on the order of milliseconds for small queries, or seconds for tens of millions of rows. - -## Mission -Become the standard means of accessing HBase data through a well-defined, industry standard API. - -## Quick Start -Tired of reading already and just want to get started? Take a look at our [FAQs](faq.html), listen to the Apache Phoenix talks from [Hadoop Summit 2013](http://www.youtube.com/watch?v=YHsHdQ08trg) and [HBaseConn 2013](http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/hbasecon-2013--how-and-why-phoenix-puts-the-sql-back-into-nosql-video.html), and jump over to our quick start guide [here](Phoenix-in-15-minutes-or-less.html). - -##SQL Support## -To see what's supported, go to our [language reference](language/index.html). It includes all typical SQL query statement clauses, including `SELECT`, `FROM`, `WHERE`, `GROUP BY`, `HAVING`, `ORDER BY`, etc. 
It also supports a full set of DML commands as well as table creation and versioned incremental alterations through our DDL commands. We try to follow the SQL standards wherever possible. - -Use JDBC to get a connection to an HBase cluster like this: - - Connection conn = DriverManager.getConnection("jdbc:phoenix:server1,server2:3333"); -where the connection string is composed of: -jdbc:phoenix [ :<zookeeper quorum> [ :<port number> ] [ :<root node> ] ] - -For any omitted part, the relevant property value, hbase.zookeeper.quorum, hbase.zookeeper.property.clientPort, and zookeeper.znode.parent will be used from hbase-site.xml configuration file. - -Here's a list of what is currently **not** supported: - -* **Full Transaction Support**. Although we allow client-side batching and rollback as described [here](#transactions), we do not provide transaction semantics above and beyond what HBase gives you out-of-the-box. -* **Derived tables**. Nested queries are coming soon. -* **Relational operators**. Union, Intersect, Minus. -* **Miscellaneous built-in functions**. These are easy to add - read this [blog](http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html) for step by step instructions. - -##Schema## - -Apache Phoenix supports table creation and versioned incremental alterations through DDL commands. The table metadata is stored in an HBase table. - -A Phoenix table is created through the [CREATE TABLE](language/index.html#create) DDL command and can either be: - -1. **built from scratch**, in which case the HBase table and column families will be created automatically. -2. **mapped to an existing HBase table**, by creating either a read-write TABLE or a read-only VIEW, with the caveat that the binary representation of the row key and key values must match that of the Phoenix data types (see [Data Types reference](datatypes.html) for the detail on the binary representation). 
- * For a read-write TABLE, column families will be created automatically if they don't already exist. An empty key value will be added to the first column family of each existing row to minimize the size of the projection for queries. - * For a read-only VIEW, all column families must already exist. The only change made to the HBase table will be the addition of the Phoenix coprocessors used for query processing. The primary use case for a VIEW is to transfer existing data into a Phoenix table, since data modification are not allowed on a VIEW and query performance will likely be less than as with a TABLE. - -All schema is versioned, and prior versions are stored forever. Thus, snapshot queries over older data will pick up and use the correct schema for each row. - -####Salting -A table could also be declared as salted to prevent HBase region hot spotting. You just need to declare how many salt buckets your table has, and Phoenix will transparently manage the salting for you. You'll find more detail on this feature [here](salted.html), along with a nice comparison on write throughput between salted and unsalted tables [here](performance.htm#salting). - -####Schema at Read-time -Another schema-related feature allows columns to be defined dynamically at query time. This is useful in situations where you don't know in advance all of the columns at create time. You'll find more details on this feature [here](dynamic_columns.html). - -####Mapping to an Existing HBase Table -Apache Phoenix supports mapping to an existing HBase table through the [CREATE TABLE](language/index.html#create) and [CREATE VIEW](language/index.html#create) DDL statements. In both cases, the HBase metadata is left as-is, except for with CREATE TABLE the [KEEP_DELETED_CELLS](http://hbase.apache.org/book/cf.keep.deleted.html) option is enabled to allow for flashback queries to work correctly. For CREATE TABLE, any HBase metadata (table, column families) that doesn't already exist will be created. 
Note that the table and column family names are case sensitive, with Phoenix upper-casing all names. To make a name case sensitive in the DDL statement, surround it with double quotes as shown below: -
CREATE VIEW "MyTable" ("a".ID VARCHAR PRIMARY KEY)
- -For CREATE TABLE, an empty key value will also be added for each row so that queries behave as expected (without requiring all columns to be projected during scans). For CREATE VIEW, this will not be done, nor will any HBase metadata be created. Instead the existing HBase metadata must match the metadata specified in the DDL statement or a ERROR 505 (42000): Table is read only will be thrown. - -The other caveat is that the way the bytes were serialized in HBase must match the way the bytes are expected to be serialized by Phoenix. For VARCHAR,CHAR, and UNSIGNED_* types, Phoenix uses the HBase Bytes utility methods to perform serialization. The CHAR type expects only single-byte characters and the UNSIGNED types expect values greater than or equal to zero. - -Our composite row keys are formed by simply concatenating the values together, with a zero byte character used as a separator after a variable length type. For more information on our type system, see the [Data Type](datatypes.html). - -##Transactions## -The DML commands of Apache Phoenix, [UPSERT VALUES](language/index.html#upsert_values), [UPSERT SELECT](language/index.html#upsert_select) and [DELETE](language/index.html#delete), batch pending changes to HBase tables on the client side. The changes are sent to the server when the transaction is committed and discarded when the transaction is rolled back. The only transaction isolation level we support is TRANSACTION_READ_COMMITTED. This includes not being able to see your own uncommitted data as well. Phoenix does not providing any additional transactional semantics beyond what HBase supports when a batch of mutations is submitted to the server. If auto commit is turned on for a connection, then Phoenix will, whenever possible, execute the entire DML command through a coprocessor on the server-side, so performance will improve. - -Most commonly, an application will let HBase manage timestamps. 
However, under some circumstances, an application needs to control the timestamps itself. In this case, a long-valued "CurrentSCN" property may be specified at connection time to control timestamps for any DDL, DML, or query. This capability may be used to run snapshot queries against prior row values, since Phoenix uses the value of this connection property as the max timestamp of scans. - -## Metadata ## -The catalog of tables, their columns, primary keys, and types may be retrieved via the java.sql metadata interfaces: `DatabaseMetaData`, `ParameterMetaData`, and `ResultSetMetaData`. For retrieving schemas, tables, and columns through the DatabaseMetaData interface, the schema pattern, table pattern, and column pattern are specified as in a LIKE expression (i.e. % and _ are wildcards escaped through the \ character). The table catalog argument to the metadata APIs deviates from a more standard relational database model, and instead is used to specify a column family name (in particular to see all columns in a given column family). - -
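The Metadata paragraph above can be illustrated with a minimal sketch. The class name `MetadataPatterns` and both method names are hypothetical, chosen only for this example. Only standard `java.sql` calls are used, and passing the column family as the catalog argument follows the description above rather than anything verified against a running cluster.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Phoenix.
public class MetadataPatterns {

    // Escapes the LIKE wildcards % and _ with a backslash so that a literal
    // name can be passed where the metadata API expects a pattern.
    static String escapePattern(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '%' || c == '_') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Lists "COLUMN_NAME TYPE_NAME" for every column of one table.
    // Assumption (from the text above): the catalog argument carries the
    // column family name; null schema pattern means "any schema".
    static List<String> columnsInFamily(Connection conn, String columnFamily,
                                        String tableName) throws SQLException {
        DatabaseMetaData md = conn.getMetaData();
        List<String> columns = new ArrayList<>();
        try (ResultSet rs = md.getColumns(columnFamily, null,
                                          escapePattern(tableName), "%")) {
            while (rs.next()) {
                columns.add(rs.getString("COLUMN_NAME") + " "
                        + rs.getString("TYPE_NAME"));
            }
        }
        return columns;
    }
}
```

Escaping matters for names such as `MY_TABLE`: without it the underscore would match any single character, so the call could also return columns of a table named `MYXTABLE`.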
-## Disclaimer ## -Apache Phoenix is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the [Apache Incubator PMC](http://incubator.apache.org/). Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. -

diff --git a/phoenix-core/src/site/markdown/issues.md b/phoenix-core/src/site/markdown/issues.md deleted file mode 100644 index 64ea3ca4..00000000 --- a/phoenix-core/src/site/markdown/issues.md +++ /dev/null @@ -1,9 +0,0 @@ -# Issue Tracking - -This project uses JIRA issue tracking and project management application. Issues, bugs, and feature requests should be submitted to the following: - -
- -https://issues.apache.org/jira/browse/PHOENIX - -
diff --git a/phoenix-core/src/site/markdown/mailing_list.md b/phoenix-core/src/site/markdown/mailing_list.md deleted file mode 100644 index fbe9e43c..00000000 --- a/phoenix-core/src/site/markdown/mailing_list.md +++ /dev/null @@ -1,14 +0,0 @@ -# Mailing Lists - -These are the mailing lists that have been established for this project. For each list, there is a subscribe, unsubscribe and post link. - -
- -Name| Subscribe| Unsubscribe| Post ---------------------------|----|----|---- -User List | [Subscribe](mailto:user-subscribe@phoenix.incubator.apache.org) | [Unsubscribe](mailto:user-unsubscribe@phoenix.incubator.apache.org) | [Post](mailto:user@phoenix.incubator.apache.org) -Developer List | [Subscribe](mailto:dev-subscribe@phoenix.incubator.apache.org) | [Unsubscribe](mailto:dev-unsubscribe@phoenix.incubator.apache.org) | [Post](mailto:dev@phoenix.incubator.apache.org) -Private List | [Subscribe](mailto:private-subscribe@phoenix.incubator.apache.org) | [Unsubscribe](mailto:private-unsubscribe@phoenix.incubator.apache.org) | [Post](mailto:private@phoenix.incubator.apache.org) -Commits List | [Subscribe](mailto:commits-subscribe@phoenix.incubator.apache.org) | [Unsubscribe](mailto:commits-unsubscribe@phoenix.incubator.apache.org) | [Post](mailto:commits@phoenix.incubator.apache.org) - -
diff --git a/phoenix-core/src/site/markdown/mr_dataload.md b/phoenix-core/src/site/markdown/mr_dataload.md deleted file mode 100644 index b0053ac1..00000000 --- a/phoenix-core/src/site/markdown/mr_dataload.md +++ /dev/null @@ -1,63 +0,0 @@ -# Bulk CSV Data Load using Map-Reduce - -Phoenix v 2.1 provides support for loading CSV data into a new/existing Phoenix table using Hadoop Map-Reduce. This provides a means of bulk loading CSV data in parallel through map-reduce, yielding better performance in comparison with the existing [psql csv loader](download.html#Loading-Data). - -####Sample input CSV data: - -``` -12345, John, Doe -67890, Mary, Poppins -``` - -####Compatible Phoenix schema to hold the above CSV data: - - CREATE TABLE ns.example ( - my_pk bigint not null, - m.first_name varchar(50), - m.last_name varchar(50) - CONSTRAINT pk PRIMARY KEY (my_pk)) - - - - - - -
Row Key | Column Family (m) | Column Family (m)
--------|-------------------|------------------
my_pk BIGINT | first_name VARCHAR(50) | last_name VARCHAR(50)
12345 | John | Doe
67890 | Mary | Poppins
- - -####How to run? - -1- Please make sure that Hadoop cluster is working correctly and you are able to run any job like [this](http://wiki.apache.org/hadoop/WordCount). - -2- Copy latest phoenix-[version].jar to hadoop/lib folder on each node or add it to Hadoop classpath. - -3- Run the bulk loader job using the script /bin/csv-bulk-loader.sh as below: - -``` -./csv-bulk-loader.sh