Skip to content

[SPARK-51318][TESTS] Remove jar files from Apache Spark repository and disable affected tests #50231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

vrozov
Copy link
Member

@vrozov vrozov commented Mar 10, 2025

What changes were proposed in this pull request?

remove jar files from the Apache Spark repo and disable affected tests.

Why are the changes needed?

Apache source releases must not contained jar files. The issue is discussed on https://lists.apache.org/thread/0ro5yn6lbbpmvmqp2px3s2pf7cwljlc4 e-mail thread.

Does this PR introduce any user-facing change?

Yes, there is a minor change in the Spark Connect example, otherwise PR only affects tests

How was this patch tested?

Patch disables tests affected by the removal of jars

Was this patch authored or co-authored using generative AI tooling?

No

@HyukjinKwon
Copy link
Member

tests files are not in the releasee. Why do we need to remove them?

@vrozov
Copy link
Member Author

vrozov commented Mar 10, 2025

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Mar 11, 2025

Do you mind explicitly explaining the background, the discussion links, etc in PR description?

@@ -243,7 +243,8 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
}
}

test("add and list jar files") {
ignore("SPARK-51318: Remove `jar` files from Apache Spark repository and disable affected " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create the jar during test runtime instead of simply disabling it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The agreement in the e-mail thread on [email protected] I referred to is to disable affected tests. As discussed in the same e-mail thread, there are several ways to provide jars at runtime and building them is one of the option. It is outside of the scope for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we reach the agreement there, and we did not also go for a vote. I disagree with disabling the tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was voted on the e-mail thread I posted and there were no -1 votes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you please mind sharing the link of vote? I cannot see in the dev mailing list in Spark.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refer to the e-mail thread I already posted here where removal of jars proposal by @dongjoon-hyun was discussed and votes casted. Voting on the actual code changes is done on the PR and follows Apache voting rules

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really sorry but do you mind sharing the link for the vote instead of discussion thread?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no any other thread except the one I mentioned. I don't see why you expect a separate voting thread. In many cases votes are casted on the discussion thread. For example you casted vote on the discussion thread.

@@ -101,7 +102,7 @@ class StubClassLoaderSuite extends SparkFunSuite {

// Creating a new class loader will unpack the udf correctly.
val cl2 = new ChildFirstURLClassLoader(
Array(udfNoAJar),
Array(new File(udfNoAJar).toURI.toURL),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove unrelated changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no unrelated changes. This change is necessary otherwise test suite fails to initialize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind explaining why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jar is removed, so the entire suite fails to initialize (see change on line 29).

@vrozov
Copy link
Member Author

vrozov commented Mar 11, 2025

Do you mind explicitly explaining the background, the discussion links, etc in PR description?

I can provide link to the discussion. Replicating everything that were discussed on the e-mail thread IMO is an overkill.

@HyukjinKwon
Copy link
Member

Please put the link there, and fill the PR description with summary as guided in the template - that's the purpose of having a PR description. Otherwise people have to read the whole related threads and codes.

@vrozov
Copy link
Member Author

vrozov commented Mar 11, 2025

Please put the link there, and fill the PR description with summary as guided in the template

Please see:

Why are the changes needed?
Apache source releases must not contained jar files.

@HyukjinKwon
Copy link
Member

-1 if we disable the tests. We are introducing a set of technical debt to remove the other.

@vrozov
Copy link
Member Author

vrozov commented Mar 11, 2025

@HyukjinKwon removing jars from Apache source release is a must otherwise ASF may pull out release. By voting -1 on the PR I assume that you will provide another PR shortly with removed jars and tests being enabled. cc: @dongjoon-hyun

@HyukjinKwon
Copy link
Member

No, I agree with removing jars but disagree with removing tests.

I assume that you will provide another PR shortly with removed jars and tests being enabled

No, I casted my veto with a technical justification. It does not mean that I will start working on it.

@vrozov
Copy link
Member Author

vrozov commented Mar 11, 2025

No, I agree with removing jars but disagree with removing tests.

I assume that you will provide another PR shortly with removed jars and tests being enabled

No, I casted my veto with a technical justification. It does not mean that I will start working on it.

If you agree that jars must be removed and they needs to be preferably removed before the next release goes out, somebody needs to provide code changes. As you are blocking this change even though others voted for it, the assumption is that you will provide a better solution as keeping jars is not actually even a technical debt, it is ASF legal issue.

Note that the PR does not remove tests from the source code, it keeps the code, so test can be enabled back whenever there is a solution that avoids having jars in the source code.

@HyukjinKwon
Copy link
Member

Right, I will revoke my veto, and defer to other committers as discussed in #50378

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants