DOCSP-48557 Update Spark streaming write configuration to include all batch options #261
✅ Deploy Preview for docs-spark-connector ready!
Great job on this! I added some comments on wording, plus a question about the default value of one of the parameters.
w Option </reference/write-concern/#w-option>` in the {+mdb-server+}
manual.
|
| **Default:** ``1``
Q: How did you determine that this default is 1? The server manual states only that "If the write concern is missing the w field, MongoDB sets the w option to the default write concern," and later in the table it says "{ w: "majority" } is the default write concern for most MongoDB deployments". The implicit default write concern for most MongoDB deployments seems to be majority: https://www.mongodb.com/docs/manual/reference/write-concern/#std-label-wc-default-behavior
This question applies to the other writeConcern.w option as well.
I just moved what was on the batch page to the stream page, so I didn't write any of the info. I assumed most of it was correct but I'll double check everything now.
I see that it can be 1 or majority, so I'm going to include both, with majority taking precedence since it applies in most cases, as you said.
I think it's a bit odd if this option has two potential defaults, so it would be good to double check with the technical reviewer in this case
Defaults to Acknowledged, which is the default set by the server.
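Since the effective default depends on the server, a job that needs a specific guarantee can set the write concern explicitly rather than rely on it. The following is a minimal sketch, assuming the option names shown in this diff (`writeConcern.w`, `writeConcern.journal`, `writeConcern.wTimeoutMS`, `checkpointLocation`) are passed as plain string options. `streaming_write_options` is a hypothetical helper for illustration, not part of the connector.

```python
import os


def streaming_write_options(checkpoint_dir, w="majority", journal=None, w_timeout_ms=None):
    """Build an options dict for a streaming write with an explicit write concern.

    Hypothetical helper: the option keys are taken from the configuration
    table in this PR; everything else is an assumption for illustration.
    """
    # The docs describe checkpointLocation as an absolute directory path.
    if not os.path.isabs(checkpoint_dir):
        raise ValueError("checkpointLocation must be an absolute path")

    options = {
        "checkpointLocation": checkpoint_dir,
        # Set w explicitly so the job does not depend on the server default.
        "writeConcern.w": str(w),
    }
    # Only include the optional settings when the caller supplies them.
    if journal is not None:
        options["writeConcern.journal"] = "true" if journal else "false"
    if w_timeout_ms is not None:
        options["writeConcern.wTimeoutMS"] = str(int(w_timeout_ms))
    return options


opts = streaming_write_options("/tmp/checkpoints", w="majority", journal=True, w_timeout_ms=5000)
```

In PySpark, a dict like this could presumably be applied with `DataStreamWriter.options(**opts)`; that wiring is not shown here and would need checking against the connector version in use.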
@@ -56,13 +56,126 @@ You can configure the following properties when writing data to MongoDB in strea
interface.
|
| **Default:** ``com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory``

* - ``convertJson``
- | Specifies whether the connector parses the string and converts extended JSON
S: "whether" seems a bit imprecise, as there are more than two options. Perhaps change the wording to "specifies if and how", or something similar. Applies to all instances of convertJson.
Suggested change, from:
- | Specifies whether the connector parses the string and converts extended JSON
to:
- | Specifies whether the connector parses string values and converts extended JSON
"The string" is a bit unclear to me, as I'm unsure which specific string we're referring to. From my understanding, this option applies to the whole value of the write operation, so "string values" or "strings" would be clearer.
applies to all
| **Default:** ``false``

* - ``idFieldList``
- | Field or list of fields by which to split the collection data. To
S: To keep the descriptions parallel, start with "Specifies a field or list of fields..."
applies to all
| **Default:** ``1``
| For more information on ``j`` values, see :manual:`WriteConcern j
Option </reference/write-concern/#j-option>` in the {+mdb-server+}
manual.

* - ``writeConcern.wTimeoutMS``
- | Specifies ``wTimeoutMS``, a write-concern option to return an error
when a write operation exceeds the number of milliseconds. If you
Suggested change, from:
when a write operation exceeds the number of milliseconds. If you
to:
when a write operation exceeds the specified number of milliseconds. If you
applies to all
manual.
|
| **Default:** ``1``

* - ``writeConcern.journal``
- | Specifies ``j``, a write-concern option to enable request for
Suggested change, from:
- | Specifies ``j``, a write-concern option to enable request for
to:
- | Specifies ``j``, a write-concern option requesting acknowledgment from MongoDB that the data has been written to the on-disk journal...
applies to all
| **Default:** ``true``

* - ``upsertDocument``
- | When ``true``, replace and update operations will insert the data
Suggested change, from:
- | When ``true``, replace and update operations will insert the data
to:
- | When ``true``, replace and update operations insert the data
applies to all
|
| For more information on ``wTimeoutMS`` values, see
:manual:`WriteConcern wtimeout </reference/write-concern/#wtimeout>` in
the {+mdb-server+} manual.

* - ``checkpointLocation``
- | The absolute file path of the directory to which the connector writes checkpoint
Suggested change, from:
- | The absolute file path of the directory to which the connector writes checkpoint
to:
- | The absolute file path of the directory where the connector writes checkpoint
@@ -139,33 +139,35 @@ You can configure the following properties when writing data to MongoDB in batch
|
| **Default:** ``true``

* - ``writeConcern.w``
- | Specifies ``w``, a write-concern option to request acknowledgment that
Suggested change, from:
- | Specifies ``w``, a write-concern option to request acknowledgment that
to:
- | Specifies ``w``, a write-concern option requesting acknowledgment that
LGTM, with a note to ask the tech reviewer for confirmation on something!
Pull Request Info
PR Reviewing Guidelines
JIRA - https://jira.mongodb.org/browse/DOCSP-48557