Fix incorrect service name reference in generated ingress for Spark History Server #45

Closed
tianliang0038 opened this issue May 7, 2024 · 1 comment · Fixed by #77

Contributor

tianliang0038 commented May 7, 2024

PysparkPipeline.json

Context:
Our Spark infrastructure chart, which powers the Spark History Server, creates both an Ingress and a Service. In the default configuration, the Ingress targets a Service named "spark-history". However, the service.yaml template hard-codes the Service name as "spark-infrastructure", so the Ingress backend points at a Service that does not exist.

If we test the ingress by running curl spark-history.p.uip.sh, the command fails with:
Could not resolve host: spark-history.p.uip.sh
When we inspect the ingress by running kubectl describe ingress spark-infrastructure,
the backend is reported as spark-history:18080 (<error: endpoints "spark-history" not found>).
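
To spot this failure mode without reading the describe output by eye, the text can be scanned for backends whose endpoints are missing. The following is a minimal sketch, not part of the chart: the function name `missing_backends` is hypothetical, and the regular expression assumes the error text always has the same shape as the single line quoted above.

```python
import re
from typing import List, Tuple

# Matches backend entries such as:
#   spark-history:18080 (<error: endpoints "spark-history" not found>)
# Assumption: the pattern is derived only from the error line quoted in this issue.
_MISSING = re.compile(
    r'(?P<service>[\w.-]+):(?P<port>\d+) '
    r'\(<error: endpoints "(?P=service)" not found>\)'
)


def missing_backends(describe_output: str) -> List[Tuple[str, int]]:
    """Return (service, port) pairs whose endpoints were reported as not found."""
    return [
        (match.group("service"), int(match.group("port")))
        for match in _MISSING.finditer(describe_output)
    ]
```

The input would be the captured text of kubectl describe ingress spark-infrastructure; an empty list suggests every backend resolved to endpoints.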

DOD

  1. Reproduce the error:
  • Set up a pipeline containing a Spark application
  • Run tilt up and wait till spark-infrastructure successfully runs
  • Execute the command kubectl describe ingress spark-infrastructure
  2. Rectify the problem by changing ingress.hosts.host.paths.backend.service.name from "spark-history" to "spark-infrastructure".
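
The underlying rule is that every backend service name in the ingress must match a Service that actually exists. A rough sketch of that check follows, assuming the ingress values are already loaded into a plain dict that follows the hosts, paths, backend.service.name layout named above. The function name `unresolved_backends` and the sample dicts are hypothetical.

```python
from typing import Iterable, List


def unresolved_backends(ingress: dict, service_names: Iterable[str]) -> List[str]:
    """List backend service names used by the ingress that are not known Services."""
    known = set(service_names)
    missing: List[str] = []
    for host in ingress.get("hosts", []):
        for path in host.get("paths", []):
            name = path["backend"]["service"]["name"]
            if name not in known and name not in missing:
                missing.append(name)
    return missing


def sample_ingress(service_name: str) -> dict:
    """Build a hypothetical single-path ingress section for illustration only."""
    backend = {"service": {"name": service_name, "port": {"number": 18080}}}
    return {"hosts": [{"paths": [{"path": "/", "backend": backend}]}]}


before_fix = sample_ingress("spark-history")
after_fix = sample_ingress("spark-infrastructure")
```

With "spark-infrastructure" as the only known Service, `before_fix` reports "spark-history" as unresolved and `after_fix` reports nothing.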

Test

  1. Pull the latest baseline project
  2. Run
    mvn clean install -f foundation/foundation-mda
  3. Build a Pyspark pipeline
mvn archetype:generate -B -DarchetypeGroupId=com.boozallen.aissemble \
                          -DarchetypeArtifactId=foundation-archetype \
                          -DarchetypeVersion=1.7.0-SNAPSHOT \
                          -DartifactId=test-ingress-svc \
                          -DgroupId=org.test \
                          -DprojectName='Test Ingress Service Functioning' \
                          -DprojectGitUrl=test.org/test-ingress-svc \
&& cd test-ingress-svc
  4. Add the attached PysparkPipeline.json to -pipeline-models/src/main/resources/pipelines
  5. Run the following:
    mvn clean install
  6. Follow any manual actions the build reports, then rerun the following until no new manual actions appear:
    mvn clean generate-sources
  7. In values.yaml, remove deployment.ingress.metadata.annotations, add deployment.ingress.metadata.name with the value ingress, and change deployment.ingress.hosts.host to spark-history.rancher.localhost.
    The deployment.ingress section should look like this:
ingress:
    enabled: true
    metadata: 
      name: ingress
    hosts:
      - host: spark-history.rancher.localhost
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: spark-infrastructure
                port:
                  number: 18080
  8. Run tilt up and wait until the spark-infrastructure pod starts successfully

  9. Open http://spark-history.rancher.localhost in a browser (try Chrome first; if that does not work, try Firefox or Safari)

  10. Verify that the page loads without errors (no 404)
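
The final browser check could also be scripted. The sketch below uses only the Python standard library; the function names are hypothetical, and treating any 2xx or 3xx status as success is an assumption, since the steps above only require that the page is not a 404.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, including error codes such as 404."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the status code is on the exception.
        return err.code


def ingress_ok(status: int) -> bool:
    """Assumption: any 2xx or 3xx status counts as a working route."""
    return 200 <= status < 400


# Example (requires the deployment from the steps above to be running):
# ingress_ok(fetch_status("http://spark-history.rancher.localhost"))
```

A DNS or connection failure raises urllib.error.URLError rather than returning a status, which keeps an unreachable host distinct from a 404.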

tianliang0038 added a commit that referenced this issue May 15, 2024
…structure

#45 Incorrect service name reference in generated ingress for Spark H…
@carter-cundiff
Contributor

Testing passed: [screenshot]
