add config.
flaming-archer committed Apr 12, 2024
1 parent 0dde4c2 commit 070d5b2
Showing 3 changed files with 46 additions and 54 deletions.
98 changes: 44 additions & 54 deletions HowToKerberize.md
@@ -24,75 +24,65 @@ In addition, because Kerberos authentication requires a delegation-token to prox
* Zookeeper to store delegation-token (Recommended)

### Configuration

Waggle Dance does not read Hadoop's `core-site.xml` so a general property providing Kerberos auth should be added to
the Hive configuration file `hive-site.xml`:
Waggle Dance `waggle-dance-server.yml` example:

```
<property>
<name>hadoop.security.authentication</name>
<value>KERBEROS</value>
</property>
port: 9083
verbose: true
#database-resolution: MANUAL
database-resolution: PREFIXED
yaml-storage:
  overwrite-config-on-shutdown: false
logging:
  config: file:/path/to/log4j2.xml
configuration-properties:
  hadoop.security.authentication: KERBEROS
  hive.metastore.sasl.enabled: true
  hive.metastore.kerberos.principal: hive/[email protected]
  hive.metastore.kerberos.keytab.file: /path/to/hive.keytab
  hive.cluster.delegation.token.store.class: org.apache.hadoop.hive.thrift.ZooKeeperTokenStore
  hive.cluster.delegation.token.store.zookeeper.connectString: zz1:2181,zz2:2181,zz3:2181
  hive.cluster.delegation.token.store.zookeeper.znode: /hive/cluster/wd_delegation
  hive.server2.authentication: KERBEROS
  hive.server2.authentication.kerberos.principal: hive/[email protected]
  hive.server2.authentication.kerberos.keytab: /path/to/hive.keytab
  hive.server2.authentication.client.kerberos.principal: hive/[email protected]
  hadoop.kerberos.keytab.login.autorenewal.enabled: true
  hadoop.proxyuser.hive.users: '*'
  hadoop.proxyuser.hive.hosts: '*'
```


Waggle Dance also needs a keytab file to communicate with the Metastore so the following properties should be present:
Waggle Dance `waggle-dance-federation.yml` example:
```
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/hive.keytab</value>
</property>
primary-meta-store:
  database-prefix: ''
  name: local
  remote-meta-store-uris: thrift://ms1:9083
  access-control-type: READ_AND_WRITE_AND_CREATE
  impersonation-enabled: true
federated-meta-stores:
- remote-meta-store-uris: thrift://ms2:9083
  database-prefix: dw_
  name: remote
  impersonation-enabled: true
  access-control-type: READ_AND_WRITE_ON_DATABASE_WHITELIST
  writable-database-white-list:
  - .*
```

In addition, all metastores need to use the Zookeeper shared token:
In the start script, adding the following JVM property may be useful:
```
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.ZooKeeperTokenStore</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.connectString</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.znode</name>
<value>/hive/token</value>
</property>
-Djavax.security.auth.useSubjectCredsOnly=false
```
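
A minimal sketch of how the JVM property above might be passed at startup, assuming Waggle Dance is launched directly with `java -jar`. The jar path and configuration file locations are placeholders, the `--server-config` and `--federation-config` options are assumed from the usual Waggle Dance launch command, and `WD_JAVA_OPTS` and `WD_CMD` are hypothetical variable names that are not part of Waggle Dance. The fragment only prints the command so it can be checked before use:

```shell
#!/bin/sh
# Hypothetical start-script fragment; all paths below are placeholders.

# JVM property from the section above.
WD_JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false"

# Assemble the launch command (option names are assumptions, adjust to your installation).
WD_CMD="java ${WD_JAVA_OPTS} -jar /path/to/waggle-dance-core-exec.jar --server-config=file:/path/to/waggle-dance-server.yml --federation-config=file:/path/to/waggle-dance-federation.yml"

# Dry run: print the command instead of executing it.
echo "${WD_CMD}"
```

Once the printed command looks right for your environment, replace the final `echo` with `exec ${WD_CMD}` or copy the property into your existing start script.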

If you are intending to use a Beeline client, the following properties may be valuable:
To connect to Waggle Dance via Beeline, change `hive.metastore.uris` in the Hive configuration file `hive-site.xml`:
```
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/hive.keytab</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<name>hive.metastore.uris</name>
<value>thrift://wd:9083</value>
</property>
```


### Running

Waggle Dance should be started by a privileged user with a fresh keytab.
2 changes: 2 additions & 0 deletions README.md
@@ -158,6 +158,7 @@ The table below describes all the available configuration values for Waggle Danc
| `primary-meta-store.name` | Yes | Database name that uniquely identifies this metastore. Used internally. Cannot be empty. |
| `primary-meta-store.database-prefix` | No | Prefix used to access the primary metastore and differentiate databases in it from databases in another metastore. The default prefix (i.e. if this value isn't explicitly set) is empty string.|
| `primary-meta-store.access-control-type` | No | Sets how the client access controls should be handled. Default is `READ_ONLY` Other options `READ_AND_WRITE_AND_CREATE`, `READ_AND_WRITE_ON_DATABASE_WHITELIST` and `READ_AND_WRITE_AND_CREATE_ON_DATABASE_WHITELIST` see Access Control section below. |
| `primary-meta-store.impersonation-enabled` | No | Enable metastore end-user impersonation.|
| `primary-meta-store.writable-database-white-list` | No | White-list of databases used to verify write access used in conjunction with `primary-meta-store.access-control-type`. The list of databases should be listed without any `primary-meta-store.database-prefix`. This property supports both full database names and (case-insensitive) [Java RegEx patterns](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html).|
| `primary-meta-store.metastore-tunnel` | No | See metastore tunnel configuration values below. |
| `primary-meta-store.latency` | No | Indicates the acceptable slowness of the metastore in **milliseconds** for increasing the default connection timeout. Default latency is `0` and should be changed if the metastore is particularly slow. If you get an error saying that results were omitted because the metastore was slow, consider changing the latency to a higher number.|
@@ -168,6 +169,7 @@ The table below describes all the available configuration values for Waggle Danc
| `federated-meta-stores` | No | Possible empty list of read only federated metastores. |
| `federated-meta-stores[n].remote-meta-store-uris` | Yes | Thrift URIs of the federated read-only metastore. |
| `federated-meta-stores[n].name` | Yes | Name that uniquely identifies this metastore. Used internally. Cannot be empty. |
| `federated-meta-stores[n].impersonation-enabled` | No | Enable metastore end-user impersonation.|
| `federated-meta-stores[n].database-prefix` | No | Prefix used to access this particular metastore and differentiate databases in it from databases in another metastore. Typically used if databases have the same name across metastores but federated access to them is still needed. The default prefix (i.e. if this value isn't explicitly set) is {federated-meta-stores[n].name} lowercased and postfixed with an underscore. For example if the metastore name was configured as "waggle" and no database prefix was provided but `PREFIXED` database resolution was used then the value of `database-prefix` would be "waggle_". |
| `federated-meta-stores[n].metastore-tunnel` | No | See metastore tunnel configuration values below. |
| `federated-meta-stores[n].latency` | No | Indicates the acceptable slowness of the metastore in **milliseconds** for increasing the default connection timeout. Default latency is `0` and should be changed if the metastore is particularly slow. If you get an error saying that results were omitted because the metastore was slow, consider changing the latency to a higher number.|
Binary file modified kerberos-process.png