Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rebrand to com.nvidia instead of ai.rapids #188

Merged
merged 4 commits into from
Jun 17, 2020
Merged

rebrand to com.nvidia instead of ai.rapids #188

merged 4 commits into from
Jun 17, 2020

Conversation

revans2
Copy link
Collaborator

@revans2 revans2 commented Jun 16, 2020

I think I have everything moved to com.nvidia.spark.rapids, but I want to check on a few things, and would love feedback on what people think.

  1. Do we want the plugin to be called com.nvidia.spark.rapids.SQLPlugin should we try to shorten the name of it? If so to what?
  2. What about com.nvidia.spark.rapids.ExclusiveModeGpuDiscoveryPlugin?
  3. I renamed the TPCH/Mortgage/TPCXbb code to com.nvidia.sparkexamples. is that okay? Do you have any alternative suggestion?
  4. What about shuffle? It is all under com.nvidia.spark.rapids.shuffle Should I make a change now?
  5. Should I remove the old ai.rapids.spark.Plugin now com.nvidia.spark.rapids.Plugin ?

@revans2
Copy link
Collaborator Author

revans2 commented Jun 16, 2020

build

@revans2 revans2 self-assigned this Jun 16, 2020
@revans2 revans2 added build Related to CI / CD or cleanly building documentation Improvements or additions to documentation help wanted Extra attention is needed labels Jun 16, 2020
@razajafri
Copy link
Collaborator

How about com.nvidia.spark.SQLPlugin. To me having nvidia covers the rapids part.

@revans2
Copy link
Collaborator Author

revans2 commented Jun 16, 2020

How about com.nvidia.spark.SQLPlugin. To me having nvidia covers the rapids part.

I don't know nvidia and rapids are not one in the same. That said I don't think nvidia will be putting out a separate SQL plugin that is not based on rapids so it should work.

@abellina
Copy link
Collaborator

Should this be com.nvidia.rapids.spark instead of com.nvidia.spark.rapids?

@abellina
Copy link
Collaborator

abellina commented Jun 16, 2020

What about shuffle? It is all under com.nvidia.spark.rapids.shuffle Should I make a change now?

So the plugin the user refers to is currently under ai.rapids.spark, so I think it becomes com.nvidia.spark.rapids.RapidsShuffleManager. That some of the implementation is under a shuffle sub-package, seems fine to me. I could be overridden.

Should I remove the old ai.rapids.spark.Plugin now com.nvidia.spark.rapids.Plugin ?

+1 This would remove the @deprecated annotation.

@andygrove
Copy link
Contributor

Here are my opinions.

  1. I think this is fine. SQLPlugin is intuitive for me at least since it relates to Spark SQL.
  2. Is it possible that we might support discovery by other means than exclusive mode in the future? Maybe this could be shortened to GpuDiscoveryPlugin leaving it less explicit?
  3. Fine with me.
  4. These are the only classes under a subpackage so it would be more uniform to move them into the same package as the other classes. I don't think it would be a big deal to leave them though.
  5. Yes. Given that this is the first public release, I think this would be reasonable.

@jlowe
Copy link
Member

jlowe commented Jun 16, 2020

Do we want the plugin to be called com.nvidia.spark.rapids.SQLPlugin should we try to shorten the name of it? If so to what?

I'm OK with either the com.nvidia.spark.rapids.SQLPlugin or com.nvidia.spark.SQLPlugin name. I have a slight preference for the former if all the other classes are under com.nvidia.spark.rapids for consistency, but I can see the desire to abbreviate for the end-user experience.

What about com.nvidia.spark.rapids.ExclusiveModeGpuDiscoveryPlugin?

IMO this should be in the same package as the SQL plugin for consistency. Again not a strong preference either way. I could see the argument that this discovery plugin isn't really specific to RAPIDS.

I renamed the TPCH/Mortgage/TPCXbb code to com.nvidia.sparkexamples. is that okay? Do you have any alternative suggestion?

Personally I've not been a fan of sparkexamples, it feels like it should be com.nvidia.spark.rapids.examples. If these are examples designed for the RAPIDS Accelerator plugin which has a package prefix of com.nvidia.spark.rapids then an examples package under there makes sense to me rather than something not even under the com.nvidia.spark umbrella. Again not a showstopper for me either way, just a slight preference. Curious to hear the reasoning behind the existing ai.rapids.sparkexamples scheme, as it seems arbitrarily different to me.

What about shuffle? It is all under com.nvidia.spark.rapids.shuffle Should I make a change now?

IMO this is a shuffle specific to the RAPIDS Accelerator plugin, so it should be under com.nvidia.spark.rapids somewhere. If we push the SQLPlugin up one level for user abbreviation then I could see the main shuffle plugin class being directly under com.nvidia.spark for similar reasons, but I would expect the rest of the code to be under com.nvidia.spark.rapids since it's specific to that project.

Should I remove the old ai.rapids.spark.Plugin now com.nvidia.spark.rapids.Plugin ?

Yes, if we're making no attempts at backwards compatibility with ai.rapids.spark.SQLPlugin then it doesn't make sense to keep the legacy Plugin class around.

@revans2
Copy link
Collaborator Author

revans2 commented Jun 16, 2020

Should this be com.nvidia.rapids.spark instead of com.nvidia.spark.rapids?

I prefer spark first, because of shuffle. I can see us having a shuffle at some point that is not tied to rapids, but is to spark.

@revans2
Copy link
Collaborator Author

revans2 commented Jun 16, 2020

ai.rapids.sparkexamples came from the XGBoost/Rapids repo for the mortgage query and the others followed because hey they are similar. I am fine with changing them. They have no dependency on the plugins so keeping them in a separate package without rapids in it kind of indicated that, at least to me, but like you said there are here for the use of the plugin so...

@revans2
Copy link
Collaborator Author

revans2 commented Jun 16, 2020

One more question. We still have org.apache.spark.sql.rapids to be able to work around package private APIs. Do we want to change this?

@jlowe
Copy link
Member

jlowe commented Jun 16, 2020

They have no dependency on the plugins so keeping them in a separate package without rapids in it kind of indicated that

Yeah com.nvidia.spark.examples works too if that makes more sense. Since we're dropping the ai.rapids prefix, I'd prefer to not follow the sparkexamples precedent since we're making up a new package here anyway.

We still have org.apache.spark.sql.rapids to be able to work around package private APIs. Do we want to change this?

That's a great question. I personally think we can leave that as-is, although we could get extreme and call it org.apache.spark.sql.com.nvidia.spark.rapids if we really want to carve out a specific place there. ;-)

@revans2
Copy link
Collaborator Author

revans2 commented Jun 17, 2020

build

@revans2
Copy link
Collaborator Author

revans2 commented Jun 17, 2020

Thanks for all of the feedback. I did a second round of changes to the names. I moved com.nvidia.sparkexamples to com.nvidia.spark.rapids.tests. They were not really examples of how to use the plugin. They were tests so I thought that was more appropriate.

I also opted to make the names of the plugins as small as possible com.nvidia.SQLPlugin and com.nvidia.RapidsShuffleManager. I thought about renaming it to just com.nvidia.ShuffleManager but I thought it was better to leave it this way.

Please take another look. I know that there are some issues with code coverage that I need to look into, but it is probably just a few settings in a script or a pom.xml that I need to look into.

@revans2
Copy link
Collaborator Author

revans2 commented Jun 17, 2020

build

@kuhushukla
Copy link
Collaborator

lgtm. com.nvidia.spark.SQLPlugin and com.nvidia.spark.RapidsShuffleManager change looks good too.

@abellina
Copy link
Collaborator

abellina commented Jun 17, 2020

I see references to com.nvidia.spark.SQLPlugin, that need to be updated according to:

I also opted to make the names of the plugins as small as possible com.nvidia.SQLPlugin and com.nvidia.RapidsShuffleManager. I thought about renaming it to just com.nvidia.ShuffleManager but I thought it was better to leave it this way.

And also the shuffle manager: com.nvidia.spark.RapidsShuffleManager. Maybe I am missing something.

@revans2
Copy link
Collaborator Author

revans2 commented Jun 17, 2020

I see references to com.nvidia.spark.SQLPlugin that need to be updated according to:

I also opted to make the names of the plugins as small as possible com.nvidia.SQLPlugin and com.nvidia.RapidsShuffleManager. I thought about renaming it to just com.nvidia.ShuffleManager but I thought it was better to leave it this way.

Sorry I mistyped them. should be com.nvidia.spark.SQLPlugin and com.nvidia.spark.RapidsShuffleManager

@revans2
Copy link
Collaborator Author

revans2 commented Jun 17, 2020

build

@revans2 revans2 merged commit 765344e into NVIDIA:branch-0.1 Jun 17, 2020
@revans2 revans2 deleted the package-name-change branch June 17, 2020 14:45
@jlowe jlowe added this to the Jun 8 - Jun 19 milestone Jun 17, 2020
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
Signed-off-by: spark-rapids automation <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Related to CI / CD or cleanly building documentation Improvements or additions to documentation help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants