To ensure smooth operation, it is critical to use consistent versions of the required tools. Mismatched versions may lead to compatibility issues.
- SDKMAN is recommended for managing Scala and Spark environments.
- Java Version: 17.0
- Scala Version: 2.13.8
- Spark Version: 3.5.3 (built with Scala 2.13)
- SBT Version: 1.11.1
If you are using SDKMAN, you can quickly install Java, Scala, and SBT as follows:
sdk install java 17.0.14.crac-zulu
sdk install scala 2.13.8
sdk install sbt 1.11.1
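After installing, it can be worth confirming that the active tools report the versions listed above. The following is a minimal sketch, not part of SDKMAN: the helper names `version_contains` and `report_version` are hypothetical, and the check is a plain substring match on whatever text each tool prints.

```shell
# Hypothetical helper: succeeds if the expected version string appears
# anywhere in the reported version text (literal substring match).
version_contains() {
  case "$2" in
    *"$1"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints OK or MISMATCH for one tool, given its name, the expected
# version, and the text captured from its version command.
report_version() {
  if version_contains "$2" "$3"; then
    echo "OK: $1 $2"
  else
    echo "MISMATCH: $1 (expected $2)"
    return 1
  fi
}

# Example usage (2>&1 captures version text printed to stderr):
#   report_version java  17.0   "$(java -version 2>&1)"
#   report_version scala 2.13.8 "$(scala -version 2>&1)"
#   report_version sbt   1.11.1 "$(sbt --version 2>&1)"
```

Because this is a substring match, it only catches obvious mismatches; read the full version output if a result looks suspicious.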
The Spark version provided by SDKMAN only supports Scala 2.12. Therefore, you need to manually install the Spark version compatible with Scala 2.13. You can download it from the following link: Spark Download.
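The manual installation could be scripted roughly as follows. This sketch assumes the Scala 2.13 build is published on the Apache archive as `spark-<version>-bin-hadoop3-scala2.13.tgz` (the same name as the directory used for SPARK_HOME in this guide) and that `curl` and `tar` are available. The function names and the target directory are hypothetical; verify the URL against the Spark download page before relying on it.

```shell
# Builds the tarball name for a Spark release and Scala binary version.
spark_tarball() {
  echo "spark-$1-bin-hadoop3-scala$2.tgz"
}

# Builds the download URL (assumed layout: /dist/spark/spark-<version>/).
spark_url() {
  echo "https://archive.apache.org/dist/spark/spark-$1/$(spark_tarball "$1" "$2")"
}

# Hypothetical helper: downloads the tarball and unpacks it into a directory.
# Arguments: Spark version, Scala binary version, destination directory.
install_spark() {
  mkdir -p "$3" || return 1
  curl -fL -o "$3/$(spark_tarball "$1" "$2")" "$(spark_url "$1" "$2")" || return 1
  tar -xzf "$3/$(spark_tarball "$1" "$2")" -C "$3"
}

# Example usage:
#   install_spark 3.5.3 2.13 "$HOME/opt"
```

With the example call, the unpacked directory would be `$HOME/opt/spark-3.5.3-bin-hadoop3-scala2.13`, which is the value to use for SPARK_HOME.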
To simplify the spark-submit process, create a wrapper script named spark-submit-wrapper.sh. Replace the SPARK_HOME path with the actual installation directory on your machine.
#!/bin/bash
# Path to the manually installed Spark distribution (built with Scala 2.13).
export SPARK_HOME=/xxx/spark-3.5.3-bin-hadoop3-scala2.13
if [ ! -d "$SPARK_HOME" ]; then
  echo "Error: SPARK_HOME directory does not exist: $SPARK_HOME" >&2
  exit 1
fi
SPARK_SUBMIT="$SPARK_HOME/bin/spark-submit"
if [ ! -f "$SPARK_SUBMIT" ]; then
  echo "Error: spark-submit not found at: $SPARK_SUBMIT" >&2
  exit 1
fi
# Replace this process with spark-submit, forwarding all arguments unchanged.
exec "$SPARK_SUBMIT" "$@"
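After saving the script, make it executable and define an alias for convenience. Replace the script path with the actual location on your machine. The alias lasts only for the current shell session; to keep it, add the alias line to your shell startup file (for example ~/.bashrc).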
chmod +x /xxx/spark-submit-wrapper.sh
alias spark-submit-wrapper="/xxx/spark-submit-wrapper.sh"
You can use the pre-built Milvus Spark Connector package directly, or compile it yourself by following the instructions in the next section.
Note: The official release package is currently used primarily for testing the release process. Active development and updates are concentrated in the SNAPSHOT versions.
- Official Release: Available at Maven Repository
- Latest SNAPSHOT: Version 0.1.2-SNAPSHOT
To use the SNAPSHOT version, you need to add the snapshot repository to your build configuration:
For SBT (build.sbt):
ThisBuild / resolvers += "Sonatype Snapshots" at "https://central.sonatype.com/repository/maven-snapshots/"
Use the following SBT commands to compile, package, and publish the connector locally:
sbt clean compile package publishLocal
- clean: Clears previous build artifacts.
- compile: Compiles the source code.
- package: Packages the compiled code into a JAR file.
- publishLocal: Publishes the package to the local repository (primarily for use in connector examples).
To create a fat JAR containing all dependencies, run:
sbt assembly
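By default, sbt-assembly writes the fat JAR under `target/scala-<binary version>/` with `assembly` in the file name. A small sketch for locating it, assuming that default layout applies to this project; the helper name `find_assembly_jar` is hypothetical.

```shell
# Prints the path of the first assembly JAR found under <project>/target.
# Argument: project directory (defaults to the current directory).
find_assembly_jar() {
  project_dir="${1:-.}"
  jar_path="$(find "$project_dir/target" -type f -name '*assembly*.jar' 2>/dev/null | head -n 1)"
  if [ -z "$jar_path" ]; then
    echo "No assembly JAR found under $project_dir/target" >&2
    return 1
  fi
  echo "$jar_path"
}

# Example usage, run from the connector project root:
#   find_assembly_jar .
```

The printed path is the one to pass to `--jars` when submitting the example job.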
Example Project Repository: Milvus Spark Connector Example
To compile and package the example project, use the following SBT command:
sbt clean compile package
To execute the test demo, specify the paths to the JAR files generated in the previous steps. Replace /xxx/ with the actual paths on your machine.
Note: If you prefer not to build the connector locally, you can download the pre-built assembly JAR from the GitHub Releases page and pass its path to --jars instead.
spark-submit-wrapper --jars /xxx/spark-connector-assembly-0.1.0-SNAPSHOT.jar --class "example.FloatInsertDemo" /xxx/milvus-spark-connector-example_2.13-0.1.0-SNAPSHOT.jar
This command runs the FloatInsertDemo class, which demonstrates how to insert data into Milvus using the Spark connector. Before running the command, ensure that the JAR paths are correct and that the version numbers in the file names match the JARs you built or downloaded.
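A quick pre-flight check can catch a wrong JAR path before Spark starts. The sketch below uses a hypothetical helper, `check_jars`, which is not part of the connector or of Spark; it only confirms that each path is an existing, readable file.

```shell
# Succeeds only if every argument is an existing, readable regular file.
# Reports each missing path on stderr instead of stopping at the first one.
check_jars() {
  rc=0
  for jar_file in "$@"; do
    if [ ! -f "$jar_file" ] || [ ! -r "$jar_file" ]; then
      echo "Error: JAR not found or not readable: $jar_file" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example usage (paths are placeholders, as in the command above):
#   check_jars /xxx/spark-connector-assembly-0.1.0-SNAPSHOT.jar \
#              /xxx/milvus-spark-connector-example_2.13-0.1.0-SNAPSHOT.jar \
#     && echo "JAR paths look valid"
```

If the check passes, run the spark-submit-wrapper command with the same two paths.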