[SPARK-19667][SQL]create table with hiveenabled in default database use warehouse path instead of the location of default database #17001

Closed. Wants to merge 31 commits; the changes shown below are from the first 5 commits.

Commits:
aebdfc6 [SPARK-19667][SQL]create table with hiveenabled in default database u… (windpiger, Feb 20, 2017)
825c0ad rename a conf name (windpiger, Feb 20, 2017)
a2c9168 fix test faile (windpiger, Feb 21, 2017)
bacd528 process default database location when create/get database from metas… (windpiger, Feb 22, 2017)
3f6e061 remove an redundant line (windpiger, Feb 22, 2017)
96dcc7d fix empty string location of database (windpiger, Feb 22, 2017)
f329387 modify the test case (windpiger, Feb 22, 2017)
83dba73 Merge branch 'master' into defaultDBPathInHive (windpiger, Feb 22, 2017)
58a0020 fix test failed (windpiger, Feb 22, 2017)
1dce2d7 add log to find out why jenkins failed (windpiger, Feb 22, 2017)
12f81d3 add scalastyle:off for println (windpiger, Feb 22, 2017)
56e83d5 fix test faile (windpiger, Feb 22, 2017)
901bb1c make warehouse path qualified for default database (windpiger, Feb 23, 2017)
99d9746 remove a string s (windpiger, Feb 23, 2017)
db555e3 modify a comment (windpiger, Feb 23, 2017)
d327994 fix test failed (windpiger, Feb 23, 2017)
73c8802 move to sessioncatalog (windpiger, Feb 23, 2017)
747b31a remove import (windpiger, Feb 23, 2017)
8f8063f remove an import (windpiger, Feb 23, 2017)
4dc11c1 modify some codestyle and some comment (windpiger, Feb 24, 2017)
9c0773b Merge branch 'defaultDBPathInHive' of github.com:windpiger/spark into… (windpiger, Feb 24, 2017)
80b8133 mv defaultdb path logic to ExternalCatalog (windpiger, Feb 27, 2017)
41ea115 modify a comment (windpiger, Feb 27, 2017)
13245e4 modify a comment (windpiger, Feb 27, 2017)
096ae63 add final def (windpiger, Mar 1, 2017)
badd61b modify some code (windpiger, Mar 2, 2017)
35d2b59 add lazy flag (windpiger, Mar 2, 2017)
e3a467e modify test case (windpiger, Mar 3, 2017)
ae9938a modify test case (windpiger, Mar 3, 2017)
7739ccd mv getdatabase (windpiger, Mar 3, 2017)
f93f5d3 merge with master (windpiger, Mar 8, 2017)
@@ -672,6 +672,14 @@ object SQLConf {
.stringConf
.createWithDefault(TimeZone.getDefault().getID())

// Test-only flag.
val TEST_HIVE_CREATETABLE_DEFAULTDB_USEWAREHOUSE_PATH =
buildConf("spark.hive.test.createTable.defaultDB.location.useWarehousePath")
.doc("Test-only: use the warehouse path instead of the database location when " +
"creating a table in the default database.")
.booleanConf
.createWithDefault(false)

object Deprecated {
val MAPRED_REDUCE_TASKS = "mapred.reduce.tasks"
}
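The new entry follows Spark's `buildConf(...).booleanConf.createWithDefault(...)` pattern. As a rough, hypothetical stand-in (this is not Spark's ConfigEntry API; the names below are illustrative only), the behavior of such a boolean flag with a default can be sketched as:

```scala
// Hypothetical stand-in for a boolean config entry with a default value.
// Not Spark's ConfigEntry API; BoolConf and TestFlags are illustrative names.
final case class BoolConf(key: String, doc: String, default: Boolean) {
  // Read the flag from a plain key/value settings map, falling back to the default.
  def get(settings: Map[String, String]): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(default)
}

object TestFlags {
  val useWarehousePath = BoolConf(
    "spark.hive.test.createTable.defaultDB.location.useWarehousePath",
    "Test-only: use the warehouse path instead of the stored database location.",
    default = false)
}
```

Unset keys fall back to the default, which is why the PR can leave the flag off everywhere except the new test.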
@@ -90,8 +90,10 @@ private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {

// Create the default database if it doesn't exist.
{
// The default database's location is set to an empty string here; when it is
// reloaded from the metastore, the warehouse path replaces it.
val defaultDbDefinition = CatalogDatabase(
SessionCatalog.DEFAULT_DATABASE, "default database", warehousePath, Map())
SessionCatalog.DEFAULT_DATABASE, "default database", "", Map())
// Initialize default database if it doesn't exist
if (!externalCatalog.databaseExists(SessionCatalog.DEFAULT_DATABASE)) {
// There may be another Spark application creating default database at the same time, here we
@@ -48,6 +48,8 @@ import org.apache.spark.sql.catalyst.parser.{CatalystSqlParser, ParseException}
import org.apache.spark.sql.execution.QueryExecutionException
import org.apache.spark.sql.execution.command.DDLUtils
import org.apache.spark.sql.hive.client.HiveClientImpl._
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.internal.StaticSQLConf._
import org.apache.spark.sql.types._
import org.apache.spark.util.{CircularBuffer, Utils}

@@ -311,11 +313,12 @@ private[hive] class HiveClientImpl(
override def createDatabase(
database: CatalogDatabase,
ignoreIfExists: Boolean): Unit = withHiveState {
// The default database's location always uses the warehouse path, so it is set to an empty string here.
client.createDatabase(
new HiveDatabase(
database.name,
database.description,
database.locationUri,
if (database.name == SessionCatalog.DEFAULT_DATABASE) "" else database.locationUri,
Member: If it is empty, metastore will set it for us, right?

Contributor (author): Sorry, actually it will throw an exception; my local default database had already been created, so I did not hit the exception. I will instead replace the default database's location when reloading it from the metastore, and drop the logic that sets the location to an empty string when the database is created.

Option(database.properties).map(_.asJava).orNull),
ignoreIfExists)
}
@@ -339,10 +342,17 @@

override def getDatabase(dbName: String): CatalogDatabase = withHiveState {
Option(client.getDatabase(dbName)).map { d =>
// The default database's location always uses the warehouse path.
// TEST_HIVE_CREATETABLE_DEFAULTDB_USEWAREHOUSE_PATH is a test-only flag.
val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE
|| sparkConf.get(SQLConf.TEST_HIVE_CREATETABLE_DEFAULTDB_USEWAREHOUSE_PATH)) {
sparkConf.get(WAREHOUSE_PATH)
} else d.getLocationUri

CatalogDatabase(
name = d.getName,
description = d.getDescription,
locationUri = d.getLocationUri,
locationUri = dbLocation,
properties = Option(d.getParameters).map(_.asScala.toMap).orNull)
}.getOrElse(throw new NoSuchDatabaseException(dbName))
}
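The core idea of the change above, replacing whatever location the metastore stored with the current warehouse path whenever the default database is read back, can be sketched in isolation (simplified, hypothetical types; this is not the real HiveClientImpl code):

```scala
// Simplified, hypothetical sketch of the replace-on-read logic in getDatabase.
// CatalogDatabase here is a stripped-down stand-in for Spark's real class.
case class CatalogDatabase(name: String, description: String, locationUri: String)

object DefaultDbLocation {
  val DefaultDatabase = "default"

  // The stored location of the default database is ignored; the current
  // warehouse path is always used instead. Other databases keep the
  // location they were created with.
  def resolve(db: CatalogDatabase, warehousePath: String): CatalogDatabase =
    if (db.name == DefaultDatabase) db.copy(locationUri = warehousePath) else db
}
```

Resolving on read rather than on create sidesteps the exception the author describes: the metastore never has to accept an empty location, and a changed warehouse path takes effect without rewriting metastore state.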
@@ -344,6 +344,8 @@ class ShowCreateTableSuite extends QueryTest with SQLTestUtils with TestHiveSing
)

table.copy(
storage = table.storage.copy(
locationUri = table.storage.locationUri.map(_.stripPrefix("file:"))),
createTime = 0L,
lastAccessTime = 0L,
properties = table.properties.filterKeys(!nondeterministicProps.contains(_))
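The `locationUri.map(_.stripPrefix("file:"))` normalization above exists because the same path can come back with or without a `file:` scheme. In isolation the idea looks like this (illustrative helper, not part of the suite):

```scala
object UriNormalize {
  // Drop a leading "file:" scheme and a trailing slash so that
  // "file:/tmp/wh/t/" and "/tmp/wh/t" compare equal in assertions.
  def normalize(location: String): String =
    location.stripPrefix("file:").stripSuffix("/")
}
```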
@@ -1587,4 +1587,51 @@ class HiveDDLSuite
}
}
}

test("create table with default database use warehouse path instead of database location") {
withTable("t") {
// The default database uses the warehouse path as its location.
withTempDir { dir =>
spark.sparkContext.conf
.set(SQLConf.TEST_HIVE_CREATETABLE_DEFAULTDB_USEWAREHOUSE_PATH.key, "true")

val sparkWarehousePath = spark.sharedState.warehousePath.stripSuffix("/")
spark.sql(s"CREATE DATABASE default_test LOCATION '$dir'" )
val db = spark.sessionState.catalog.getDatabaseMetadata("default_test")
assert(db.locationUri.stripSuffix("/") == sparkWarehousePath)
spark.sql("USE default_test")

spark.sql("CREATE TABLE t(a string)")
val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
assert(table.location.stripSuffix("/").stripPrefix("file:") ==
new File(sparkWarehousePath, "t").getAbsolutePath.stripSuffix("/"))

// clean up
spark.sparkContext.conf
.remove(SQLConf.TEST_HIVE_CREATETABLE_DEFAULTDB_USEWAREHOUSE_PATH.key)

spark.sql("DROP TABLE t")
spark.sql("DROP DATABASE default_test")
spark.sql("USE DEFAULT")
}

// A non-default database uses the location specified in the CREATE DATABASE command.
withTempDir { dir =>
val dirPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
spark.sql(s"CREATE DATABASE test_not_default LOCATION '$dir'" )
val db = spark.sessionState.catalog.getDatabaseMetadata("test_not_default")
assert(db.locationUri.stripSuffix("/") == dirPath)
spark.sql("USE test_not_default")

spark.sql("CREATE TABLE t(a string)")
val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
assert(table.location.stripSuffix("/") == s"$dirPath/t" )

// clean up
spark.sql("DROP TABLE t")
spark.sql("DROP DATABASE test_not_default")
spark.sql("USE DEFAULT")
}
}
}
}