This repository was archived by the owner on Jan 9, 2020. It is now read-only.
forked from apache/spark
Implements a server for refreshing HDFS tokens, a part of secure HDFS support. #453
Open: kimoonkim wants to merge 43 commits into apache-spark-on-k8s:branch-2.2-kubernetes from kimoonkim:token-renew-server
Changes from 27 commits
7abb5ee Add skeleton (kimoonkim)
88c0c03 Renew part (kimoonkim)
ca0b583 Compile succeeds (kimoonkim)
025e2ba Login to kerberos (kimoonkim)
0a7a15d Clean up constants (kimoonkim)
b3534f1 Refresh server works (kimoonkim)
cbe2777 Deployment config file (kimoonkim)
4f36793 Fix Dockerfile to match names (kimoonkim)
874b8e9 Add as independent project with own pom.xml (kimoonkim)
5dc49ca Add working Dockerfile and deployment yaml file (kimoonkim)
388063a Fix a bug by including hadoop conf dir in the classpath (kimoonkim)
1426523 Add token-refresh-server as extra build-only module (kimoonkim)
50c3a66 Use akka scheduler for renew tasks (kimoonkim)
c2ccaa9 Relogin to Kerberos periodically (kimoonkim)
a2aec2b Renew at 90% mark of deadline (kimoonkim)
ec70b47 Get renew time from data item key (kimoonkim)
0b049fd Fix compile error (kimoonkim)
d42c568 Obtain new tokens (kimoonkim)
5d96879 Fix bugs (kimoonkim)
c0e28d4 Write back tokens to K8s secret (kimoonkim)
57c847e Handle recently added secrets (kimoonkim)
ce1bb7f Use k8s client editable to update secret data (kimoonkim)
5162339 Add a comment (kimoonkim)
196cd8a Keep only secret metadata in memory (kimoonkim)
56ef8e6 Fix a regex match bug (kimoonkim)
93b2acf Tested (kimoonkim)
ba2e79a Updated parent version (kimoonkim)
1d74579 Address review comments (kimoonkim)
95e68d3 Add TODO for token status rest endpoint (kimoonkim)
a006233 Address review comments (kimoonkim)
9dc8345 Address review comments (kimoonkim)
f4d5ee9 Support configuration (kimoonkim)
1462d2c Fix a typo (kimoonkim)
193d0f9 Add some unit tests (kimoonkim)
eb10c4a Add more tests (kimoonkim)
078ac2a Clean up (kimoonkim)
aeb269a Add more tests (kimoonkim)
87dedbc Minor clean-up (kimoonkim)
4d1cb74 Add unit tests for renew tasks (kimoonkim)
8997f04 Verify test results more (kimoonkim)
2ed55af Rename the new profile to kubernetes-hdfs-extra (kimoonkim)
01be03e Fix style issues (kimoonkim)
0baaf0b Fix Hadoop 2.7 dependency issue (kimoonkim)
61 changes: 61 additions & 0 deletions
resource-managers/kubernetes/token-refresh-server/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
--- | ||
layout: global | ||
title: Hadoop Token Refresh Server on Kubernetes | ||
--- | ||
|
||
Spark on Kubernetes may use Kerberized Hadoop data sources such as secure HDFS or Kafka. If a job | ||
runs for days or weeks, its Hadoop delegation tokens, which expire every 24 hours by default, must | ||
be renewed or replaced before they lapse. The Hadoop Token Refresh Server is a Kubernetes | ||
microservice that renews these tokens and puts replacement tokens in place. | ||
|
||
# Building the Refresh Server | ||
|
||
To build the refresh server jar, simply run Maven. For example: | ||
|
||
mvn clean package | ||
|
||
The target directory will contain a tarball that includes the project jar file as well as | ||
3rd party dependency jars. The tarball name ends with `-assembly.tar.gz`. For example: | ||
|
||
target/token-refresh-server-kubernetes_2.11-2.2.0-k8s-0.3.0-SNAPSHOT-assembly.tar.gz | ||
|
||
# Running the Refresh Server | ||
|
||
To run the server, follow the steps below. | ||
|
||
1. Build and push the docker image: | ||
|
||
docker build -t hadoop-token-refresh-server:latest \ | ||
-f src/main/docker/Dockerfile . | ||
docker tag hadoop-token-refresh-server:latest <YOUR-REPO>:<YOUR-TAG> | ||
docker push <YOUR-REPO>:<YOUR-TAG> | ||
|
||
2. Create a k8s `configmap` containing Hadoop config files. These files should enable Kerberos and secure Hadoop. | ||
They should also specify the addresses of the Hadoop servers that issue delegation tokens, such as the HDFS | ||
namenode: | ||
|
||
kubectl create configmap hadoop-token-refresh-server-hadoop-config \ | ||
--from-file=/usr/local/hadoop/conf/core-site.xml | ||
|
||
3. Create another k8s `configmap` containing Kerberos config files. These should include | ||
the Kerberos server address and the correct realm name for Kerberos principals: | ||
|
||
kubectl create configmap hadoop-token-refresh-server-kerberos-config \ | ||
--from-file=/etc/krb5.conf | ||
|
||
4. Create a k8s `secret` containing the Kerberos keytab file. The keytab file should include | ||
the key for the Kerberos principal of the system user that the refresh server uses to | ||
renew Hadoop delegation tokens: | ||
|
||
kubectl create secret generic hadoop-token-refresh-server-kerberos-keytab \ | ||
--from-file /mnt/secrets/krb5.keytab | ||
|
||
5. Create a k8s `service account` and `clusterrolebinding` that the service pod will use. | ||
Review comment: Note as optional |
||
The service account should have `edit` capability for the job `secret`s that contain | ||
the Hadoop delegation tokens. | ||
|
||
6. Finally, edit the config file for k8s `deployment` and launch the service pod | ||
using the deployment. The config file should include the right docker image tag | ||
and the correct k8s `service account` name. | ||
|
||
kubectl create -f src/main/conf/kubernetes-hadoop-token-refresh-server.yaml |
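Step 5 of the README above describes the service account and `clusterrolebinding` but gives no command. The following is a minimal sketch of what that step might look like, not something taken from the PR. It assumes the pod runs in the `default` namespace unless told otherwise and that binding the built-in `edit` ClusterRole is acceptable; the function name and the `-edit` binding suffix are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the names and the default namespace are assumptions,
# not values defined by the PR.

# Creates a service account and binds the built-in "edit" ClusterRole to it.
#   $1 = service account name
#   $2 = namespace of the service account (defaults to "default")
create_refresh_server_account() {
  local account="$1"
  local namespace="${2:-default}"
  kubectl create serviceaccount "$account" --namespace "$namespace" &&
    kubectl create clusterrolebinding "${account}-edit" \
      --clusterrole=edit \
      --serviceaccount="${namespace}:${account}"
}

# Usage (hypothetical account name):
#   create_refresh_server_account hadoop-token-refresh-server
```

A cluster-wide `edit` binding is broad. If the job secrets live in a known namespace, a namespaced role limited to secrets would likely be a tighter choice.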
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
~ Licensed to the Apache Software Foundation (ASF) under one or more | ||
~ contributor license agreements. See the NOTICE file distributed with | ||
~ this work for additional information regarding copyright ownership. | ||
~ The ASF licenses this file to You under the Apache License, Version 2.0 | ||
~ (the "License"); you may not use this file except in compliance with | ||
~ the License. You may obtain a copy of the License at | ||
~ | ||
~ http://www.apache.org/licenses/LICENSE-2.0 | ||
~ | ||
~ Unless required by applicable law or agreed to in writing, software | ||
~ distributed under the License is distributed on an "AS IS" BASIS, | ||
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
~ See the License for the specific language governing permissions and | ||
~ limitations under the License. | ||
--> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>org.apache.spark</groupId> | ||
<artifactId>spark-parent_2.11</artifactId> | ||
<version>2.2.0-k8s-0.5.0-SNAPSHOT</version> | ||
<relativePath>../../../pom.xml</relativePath> | ||
</parent> | ||
|
||
<artifactId>token-refresh-server-kubernetes_2.11</artifactId> | ||
<packaging>jar</packaging> | ||
<name>Hadoop Token Refresh Server on Kubernetes</name> | ||
<properties> | ||
<akka.actor.version>2.5.4</akka.actor.version> | ||
<commons-logging.version>1.2</commons-logging.version> | ||
<kubernetes.client.version>2.2.13</kubernetes.client.version> | ||
</properties> | ||
<dependencies> | ||
<dependency> | ||
<groupId>com.typesafe.akka</groupId> | ||
<artifactId>akka-actor_${scala.binary.version}</artifactId> | ||
<version>${akka.actor.version}</version> | ||
</dependency> | ||
<dependency> | ||
<groupId>io.fabric8</groupId> | ||
<artifactId>kubernetes-client</artifactId> | ||
<version>${kubernetes.client.version}</version> | ||
</dependency> | ||
<dependency> | ||
<groupId>log4j</groupId> | ||
<artifactId>log4j</artifactId> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.apache.hadoop</groupId> | ||
<artifactId>hadoop-client</artifactId> | ||
</dependency> | ||
<dependency> | ||
<groupId>commons-logging</groupId> | ||
<artifactId>commons-logging</artifactId> | ||
<version>${commons-logging.version}</version> | ||
</dependency> | ||
</dependencies> | ||
<build> | ||
<plugins> | ||
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-assembly-plugin</artifactId> | ||
<configuration> | ||
<descriptors> | ||
<descriptor>src/main/assembly/assembly.xml</descriptor> | ||
</descriptors> | ||
</configuration> | ||
<executions> | ||
<execution> | ||
<id>make-assembly</id> | ||
<phase>package</phase> | ||
<goals> | ||
<goal>single</goal> | ||
</goals> | ||
</execution> | ||
</executions> | ||
</plugin> | ||
</plugins> | ||
</build> | ||
</project> |
33 changes: 33 additions & 0 deletions
resource-managers/kubernetes/token-refresh-server/src/main/assembly/assembly.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
<!-- | ||
~ Licensed to the Apache Software Foundation (ASF) under one or more | ||
~ contributor license agreements. See the NOTICE file distributed with | ||
~ this work for additional information regarding copyright ownership. | ||
~ The ASF licenses this file to You under the Apache License, Version 2.0 | ||
~ (the "License"); you may not use this file except in compliance with | ||
~ the License. You may obtain a copy of the License at | ||
~ | ||
~ http://www.apache.org/licenses/LICENSE-2.0 | ||
~ | ||
~ Unless required by applicable law or agreed to in writing, software | ||
~ distributed under the License is distributed on an "AS IS" BASIS, | ||
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
~ See the License for the specific language governing permissions and | ||
~ limitations under the License. | ||
--> | ||
<assembly> | ||
<id>assembly</id> | ||
<formats> | ||
<format>tar.gz</format> | ||
</formats> | ||
<includeBaseDirectory>false</includeBaseDirectory> | ||
<dependencySets> | ||
<dependencySet> | ||
<unpack>false</unpack> | ||
<scope>compile</scope> | ||
</dependencySet> | ||
<dependencySet> | ||
<unpack>false</unpack> | ||
<scope>provided</scope> | ||
</dependencySet> | ||
</dependencySets> | ||
</assembly> |
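Because the descriptor above sets `includeBaseDirectory` to `false` and leaves dependencies packed (`unpack` is `false`), the jars should sit at the root of the tarball. A small sketch for checking that after a build follows; the function name is hypothetical and the tarball path is whatever `mvn clean package` produced.

```shell
#!/usr/bin/env bash
# Sketch only: a helper for inspecting the assembly output, not part of the PR.

# Lists the jar files packed in an assembly tarball, one entry per line.
#   $1 = path to the *-assembly.tar.gz file
list_assembly_jars() {
  tar -tzf "$1" | grep '\.jar$'
}

# Usage:
#   list_assembly_jars target/<name>-assembly.tar.gz
```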
66 changes: 66 additions & 0 deletions
...kubernetes/token-refresh-server/src/main/conf/kubernetes-hadoop-token-refresh-server.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
--- | ||
apiVersion: extensions/v1beta1 | ||
Review comment: Use |
||
kind: Deployment | ||
metadata: | ||
  name: hadoop-token-refresh-server | ||
spec: | ||
  replicas: 1 | ||
  template: | ||
    metadata: | ||
      labels: | ||
        hadoop-token-refresh-server-instance: default | ||
    spec: | ||
      serviceAccountName: YOUR-SERVICE-ACCOUNT | ||
      volumes: | ||
      - name: kerberos-config | ||
        configMap: | ||
          name: hadoop-token-refresh-server-kerberos-config | ||
      - name: hadoop-config | ||
        configMap: | ||
          name: hadoop-token-refresh-server-hadoop-config | ||
      - name: kerberos-keytab | ||
        secret: | ||
          secretName: hadoop-token-refresh-server-kerberos-keytab | ||
      containers: | ||
      - name: hadoop-token-refresh-server | ||
        image: YOUR-REPO:YOUR-TAG | ||
        env: | ||
        - name: HADOOP_CONF_DIR | ||
          value: /etc/hadoop/conf | ||
        - name: TOKEN_REFRESH_SERVER_ARGS | ||
          value: --verbose | ||
        resources: | ||
          requests: | ||
            cpu: 100m | ||
            memory: 512Mi | ||
          limits: | ||
            cpu: 100m | ||
            memory: 512Mi | ||
        volumeMounts: | ||
        - name: kerberos-config | ||
          mountPath: '/etc/krb5.conf' | ||
          subPath: krb5.conf | ||
          readOnly: true | ||
        - name: hadoop-config | ||
          mountPath: '/etc/hadoop/conf' | ||
          readOnly: true | ||
        - name: kerberos-keytab | ||
          mountPath: '/mnt/secrets/krb5.keytab' | ||
          subPath: krb5.keytab | ||
          readOnly: true |
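Once the deployment above has been created, one way to confirm that the pod came up might look like the sketch below. The deployment name and the label selector are taken from the YAML; the function name and the choice of 20 log lines are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: a post-deploy check, not part of the PR.

# Waits for the deployment to roll out, then prints recent log lines
# from the pods carrying the label set in the deployment template.
check_refresh_server() {
  kubectl rollout status deployment/hadoop-token-refresh-server &&
    kubectl logs --tail=20 -l hadoop-token-refresh-server-instance=default
}

# Usage:
#   check_refresh_server
```

With `TOKEN_REFRESH_SERVER_ARGS` set to `--verbose` as in the YAML, the log output is presumably where Kerberos login or renewal problems would show up first.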
43 changes: 43 additions & 0 deletions
resource-managers/kubernetes/token-refresh-server/src/main/docker/Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
FROM openjdk:8-alpine | ||
|
||
|
||
RUN apk upgrade --no-cache && \ | ||
apk add --no-cache bash tini && \ | ||
rm /bin/sh && \ | ||
ln -sv /bin/bash /bin/sh && \ | ||
chgrp root /etc/passwd && \ | ||
chmod ug+rw /etc/passwd && \ | ||
mkdir -p /opt/token-refresh-server && \ | ||
mkdir -p /opt/token-refresh-server/jars && \ | ||
mkdir -p /opt/token-refresh-server/work-dir | ||
|
||
ADD target/token-refresh-server-kubernetes_2.11-2.2.0-k8s-0.3.0-SNAPSHOT-assembly.tar.gz \ | ||
Review comment: deprecated version, update to newest k8s assembly jar |
||
/opt/token-refresh-server/jars | ||
WORKDIR /opt/token-refresh-server/work-dir | ||
|
||
# The docker build command should be invoked from the top level directory of | ||
# the token-refresh-server project. E.g.: | ||
# docker build -t hadoop-token-refresh-server:latest \ | ||
# -f src/main/docker/Dockerfile . | ||
|
||
CMD /sbin/tini -s -- /usr/bin/java \ | ||
-cp $HADOOP_CONF_DIR:'/opt/token-refresh-server/jars/*' \ | ||
org.apache.spark.security.kubernetes.TokenRefreshServer \ | ||
$TOKEN_REFRESH_SERVER_ARGS |
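The `ADD` line in this Dockerfile hardcodes the tarball version (`0.3.0-SNAPSHOT`), while the parent version in the pom.xml diff is `0.5.0-SNAPSHOT`. A build script could catch that kind of drift before calling `docker build` by locating the tarball by its `-assembly.tar.gz` suffix. The helper below is a hypothetical sketch of that check, not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: a pre-build sanity check, not part of the PR.

# Prints the path of the single *-assembly.tar.gz file in a directory.
# Fails if the directory holds none or more than one.
#   $1 = directory to search (defaults to "target")
find_assembly_tarball() {
  local dir="${1:-target}"
  local count=0
  local found=""
  local f
  for f in "$dir"/*-assembly.tar.gz; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    found="$f"
    count=$((count + 1))
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one assembly tarball in $dir, found $count" >&2
    return 1
  fi
  echo "$found"
}

# Usage: compare the printed path with the one named in the Dockerfile.
#   find_assembly_tarball target
```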
19 changes: 19 additions & 0 deletions
...es/token-refresh-server/src/main/scala/org/apache/spark/security/kubernetes/Logging.scala
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
package org.apache.spark.security.kubernetes | ||
|
||
import org.apache.log4j.{Level, LogManager, Logger} | ||
|
||
// Minimal log4j-backed logging helpers. Messages are passed by name, so they | ||
// are only built when the corresponding log level is enabled. | ||
trait Logging { | ||
|
||
private val log: Logger = LogManager.getLogger(this.getClass) | ||
|
||
protected def logDebug(msg: => String): Unit = if (log.isDebugEnabled) log.debug(msg) | ||
|
||
protected def logInfo(msg: => String): Unit = if (log.isInfoEnabled) log.info(msg) | ||
|
||
protected def logWarning(msg: => String): Unit = if (log.isEnabledFor(Level.WARN)) log.warn(msg) | ||
|
||
protected def logWarning(msg: => String, throwable: Throwable): Unit = | ||
if (log.isEnabledFor(Level.WARN)) log.warn(msg, throwable) | ||
|
||
protected def logError(msg: => String): Unit = if (log.isEnabledFor(Level.ERROR)) log.error(msg) | ||
} |
like Kerberized HDFS.