Clarify etc files, synchronize reports revisions; expand CI comments
#2327
This is much less than I'd expected it to be, even as years have passed without such synchronization being done (as far as I know). Most of the changed files here only have end-of-line fixups, which consist of ending every line with no trailing whitespace followed by a newline character, and which are unrelated to the intended revisions made in discussion posts but not yet synchronized to the committed reports. A number of the revisions instead go the other way: copyedited in the committed files but not in the discussion posts. I went through and made sure I was not synchronizing errors that were copyedited away in the discussions. (It may make sense to edit the discussions to fix the errors up there, but that can be done later.)

This also removes one monthly report file from 2022 that was just a shell prompt rather than any actual text. If that report is available, it may make sense to restore its actual text, but I think it's better to remove the file so as not to give the impression that the report is available when it's not. (I didn't find a discussion post for it.)

It's possible there are other fixes that should be synchronized from the discussion post versions of reports to the ones in `etc`, but I don't think so. The methodology was that I grabbed all discussions' first comments (i.e., the opening posts themselves) in the gitoxide repository, then for each file in `etc/reports` used `diff-match-patch-fast` to find its Levenshtein distance to each discussion, identifying the discussion post of shortest distance. Then I used this to generate a patch and reviewed its `git add -p` hunks, picking the clearly intended changes. This consisted just of an addition in one post and a rewritten section in another post: basically all the copyediting goes in the other direction. So the main benefit of doing this has been to verify that we don't have as many inconsistencies as expected, rather than finding and fixing any large number of them.
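The matching step described above can be sketched roughly as follows. This is a minimal illustration and not the script that was actually used: it swaps in a plain dynamic-programming Levenshtein distance instead of `diff-match-patch-fast`, and the names `levenshtein` and `closest_post` as well as the post identifiers are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closest_post(report_text, posts):
    """Return (post_id, distance) for the post nearest to report_text.

    `posts` maps a hypothetical post identifier to the post's text.
    """
    return min(
        ((post_id, levenshtein(report_text, body)) for post_id, body in posts.items()),
        key=lambda pair: pair[1],
    )
```

A quadratic pure-Python distance like this would be slow on full reports, which is presumably one reason to reach for a dedicated library in practice.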
After doing that, I fixed up line endings in an editor, which was separate from the above-described procedure. (This includes removing unusual line separators from a few files.)
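For illustration, the kind of end-of-line fixup described here could be automated along these lines. This is only a sketch (the actual fixups were done in an editor), and treating U+2028 and U+2029 as ordinary line breaks rather than deleting them outright is an assumption.

```python
import re

# CRLF, lone CR, U+2028 LINE SEPARATOR, and U+2029 PARAGRAPH SEPARATOR.
# Assumption: each of these should become a plain "\n".
_LINE_BREAKS = re.compile("\r\n|\r|\u2028|\u2029")


def normalize_line_endings(text):
    """Strip trailing spaces/tabs from every line and end the text with a newline."""
    text = _LINE_BREAKS.sub("\n", text)
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    result = "\n".join(lines)
    if result and not result.endswith("\n"):
        result += "\n"
    return result
```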
In the main CI workflow:

- Add context on `--minimal-versions`.
- Explain the weird lockfile name used across two steps of a job.

In the release workflow:

- Link to the main tracking issue for more feature builds.
This doesn't change the threat model document itself, and the change to the notes does not affect their meaning.
@@ -1 +0,0 @@
-gitoxide ( main) [$?]
\ No newline at end of file
I wasn't sure what I should do here, so I removed the file for now, since it contains no report material. Was there a monthly report for May 2022? If so, and you have it--or if it's in the discussions but neither my manual nor my automated searching found it--then we could add back the actual report text. @Byron
A good catch!
Actually I still have it, but I can't retrieve the original markdown. All I have is a rendered version, which copy-pasted looks like this. And I took a screenshot as well.
Let me see if I can reproduce it with genAI.
As much as the last month felt like there was incredible and diverse progress, this month felt much more tame in comparison. Maybe we learn why that is as well, let's check.
The worktree checkout block
In the quest to resolve attributes for paths being checked out, I decided it should be easier to first learn how it's done by implementing exclude file matching, which allows answering questions like is_excluded(path) - simple enough, right? Little did I know…
Finalization of exclude file matching in git-attributes
Exclude file handling is complex as it requires you to deal with

- immutable global excludes consisting of one or more files
- a mutable .gitignore file stack which changes depending on the path that is matched
- immutable global command-line overrides with additional patterns
To add even more complexity, I realized quite late in the game that matching patterns like target/ is actually special as it's implemented by matching against directories while the .gitignore stack is built. Then when matching a file, one actually shortcuts the search by seeing that the containing folder is actually ignored.
The above mechanism as implemented by git comes at a price though: one seems to be unable to use negative excludes on files contained in these folders, which never get a chance to match due to the matching algorithm. It's a shortcoming of the git implementation that can be fixed, and it was interesting for me to rationalize why gitoxide should actually not fix it: gitoxide prefers to also mirror shortcomings if doing so avoids surprises. Half-jokingly I was thinking that maybe one day there is a feature toggle to enable certain fixes per user request.
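To make the directory shortcut and its consequence for negative patterns concrete, here is a deliberately simplified model. It is neither git's nor gitoxide's actual algorithm: it uses a single flat pattern list instead of a .gitignore stack, relies on `fnmatch` (whose `*` also matches `/`), and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase


def _last_match(patterns, path, is_dir):
    """Return True (excluded), False (re-included) or None (no pattern matched).

    Later patterns override earlier ones, so the last match wins.
    """
    verdict = None
    for pattern in patterns:
        negated = pattern.startswith("!")
        body = pattern[1:] if negated else pattern
        dir_only = body.endswith("/")
        body = body.rstrip("/")
        if dir_only and not is_dir:
            continue  # patterns like `target/` only ever match directories
        # Simplification: patterns without a slash match the basename only.
        candidate = path if "/" in body else path.rsplit("/", 1)[-1]
        if fnmatchcase(candidate, body):
            verdict = not negated
    return verdict


def is_excluded(patterns, path):
    """Check each parent directory first; an excluded directory ends the search."""
    parts = path.split("/")
    for depth in range(1, len(parts)):
        directory = "/".join(parts[:depth])
        if _last_match(patterns, directory, is_dir=True):
            return True  # shortcut: patterns for the file itself are never consulted
    return bool(_last_match(patterns, path, is_dir=False))
```

With `["target/", "!target/keep.txt"]`, the file `target/keep.txt` still comes out as excluded in this model, because the search stops at the `target` directory before the negative pattern gets a chance to match.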
Cleanup path handling which moved from git-features to git-path
Path handling was always a bit tricky as gitoxide tries hard to not bind itself to any encoding as git seems to do. Over time it turned out that git does in fact assume UTF-8 compatible encodings, on Windows at least, which can fail due to the infamous ill-formed UTF-8 problem.
From that realization all code that handles paths and needed conversions was directed to use git-features::path::*, which was like a catch-all for this kind of functionality in a crate that could cheaply be shared.
With the rise of the git-discover crate (detailed later), it became necessary (or at least seemed cleaner) to move all path related functionality into the new git-path crate with the added benefit of the up and coming implementation for path-specs (as sketched in the git_path::Spec type).
gix repo exclude query sub-command
As always, every relevant feature has to go through all the layers to become useful and to be able to test it outside of its test-sandboxes. This led to the creation of a subcommand very similar to git check-ignore, but with a few improvements.
It's probably best to show the output differences in a typical Rust project:
➜ target git:(main) git check-ignore foo/bar -nv
.gitignore:5:target/ foo/bar
➜ target git:(main) gix repo exclude query foo/bar
../.gitignore:5:target/ target/foo/bar
gix naturally shows paths to .gitignore files relative to the working directory as one would expect, something git doesn't do as it uses chdir to set itself to the root of the repository. The latter leads to somewhat confusing output of the path that it matches which looks like foo/bar, even though it effectively is target/foo/bar, which is exactly what gix will state here. The latter makes so much more sense when one looks at the pattern that matched it.
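The difference in reporting can be illustrated with plain path arithmetic. This sketch only mimics the two relative paths shown in the output above; the function name and the absolute `/repo` paths are made up for the example, and it says nothing about how gix computes them internally.

```python
import posixpath


def describe_match(repo_root, cwd, ignore_file, user_path):
    """Return (ignore file relative to cwd, matched path relative to repo root)."""
    # Where the pattern came from, as seen from the directory the user is in.
    source = posixpath.relpath(posixpath.join(repo_root, ignore_file), start=cwd)
    # The path that was effectively matched, as seen from the repository root.
    matched = posixpath.relpath(posixpath.join(cwd, user_path), start=repo_root)
    return source, matched


# Running the query for foo/bar from within the target directory:
source, matched = describe_match("/repo", "/repo/target", ".gitignore", "foo/bar")
```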
As a side-effect to this work, git-config now has a much more convenient API and its usage was revamped within git-repository which now has the notion of 'config-cache' with often-accessed configuration values that are looked up only once and validated during Repository initialization.
Community
Unblocking onefetch release with worktree support
After casually strolling through onefetch I came to the realization that previously it could open and analyze linked worktrees (those created with git worktree add …) whereas now it could not. gitoxide wasn't yet able to handle .git files which link to a worktree private git directory.
I really thought I could add this in a couple of days… at most, and ended up spending quite exactly 14 days on the matter.
The task is essentially two-fold. First, one has to understand gitdir files, which come in two forms, to find the locations related to a worktree git directory and detect it as such, and secondly one has to handle references a little differently.
The first matter was resolved relatively quickly, and it helped (but also was a necessity) to move git_repository::discover into its very own git-discover crate. That way, tests could run faster and could generally be more focussed than otherwise. Having repository discovery in a separate crate also opens up new opportunities for tests in plumbing crates, which often refrain from using git-repository in favor of being forced to more directly dog-food various plumbing crate APIs.
The second matter, ref handling, was more involved though and required various refactoring to one seemingly simple change: now there are optionally two roots for references. Those for the work tree itself, i.e. worktree-private references, and the shared ones in the common git repository. A major requirement to make this happen was the need to categorize references, that is looking at their name and deriving some properties for them. The most important one, you guessed it, is whether or not it's worktree-private.
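As a rough illustration of such a categorization, the following sketch decides whether a reference name is worktree-private. It follows the rules as I understand them from the git-worktree documentation (pseudo-refs outside `refs/` are per-worktree, and so are `refs/bisect/`, `refs/worktree/` and `refs/rewritten/`); it is not the git-ref implementation, and the function name is hypothetical.

```python
# Prefixes under refs/ that are per-worktree, per the git-worktree documentation.
PRIVATE_REF_PREFIXES = ("refs/bisect/", "refs/worktree/", "refs/rewritten/")


def is_worktree_private(name):
    """Derive from the name alone whether a reference is worktree-private."""
    # Strip an explicit worktree qualifier first, if there is one.
    if name.startswith("main-worktree/"):
        name = name[len("main-worktree/"):]
    elif name.startswith("worktrees/"):
        # Shape: worktrees/<id>/<ref>
        _, _, rest = name.partition("/")
        _, _, name = rest.partition("/")
    if not name.startswith("refs/"):
        return True  # pseudo-refs such as HEAD live next to, not inside, refs/
    return name.startswith(PRIVATE_REF_PREFIXES)
```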
Another big change was reference iteration, which now had to not only iterate packed refs and loose refs in lock-step, but also had to deal with two iteration bases for loose references.
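The lock-step idea can be sketched as a merge of sorted streams. This is an illustration only: references are modelled as `(name, target)` pairs already sorted by name, loose refs shadowing packed refs of the same name is standard git behaviour, and letting the worktree-private base win over the common one is an assumption made for the example.

```python
import heapq


def iter_refs(private_loose, common_loose, packed):
    """Yield (name, target) pairs in name order, each name at most once.

    Every input must be sorted by name. On a name clash the stream with
    the lowest priority number wins.
    """
    merged = heapq.merge(
        ((name, 0, target) for name, target in private_loose),
        ((name, 1, target) for name, target in common_loose),
        ((name, 2, target) for name, target in packed),
    )
    previous = None
    for name, _priority, target in merged:
        if name != previous:  # the first entry per name has the best priority
            yield name, target
            previous = name
```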
Overall, the implementation didn't give me any trouble, but it took time to understand what needed to be done and write the right tests for all of the interactions I identified.
It might also be interesting that gitoxide implements ref-handling in work-trees as documented and not as implemented by git. It's also made so that all operations support handling those special main-worktree and worktrees/ ref prefixes, which git doesn't seem to do. Overall this should make git-ref the go-to implementation for handling references.
As a side-note, this work also clearly showed that implementing ref-tables is a feature that is far in the future and will certainly only ever be motivated by servers. It's entirely unclear how ref-tables would deal with layering worktree private refs on top of common refs, for example, and I have the feeling that it's not meant to do that at all. After all, the server doesn't deal with work-trees so it should be fine.
git-config with preliminary includeIf support
Even though we keep failing to create small PRs that can be merged more often, the same is not true for delivering features, now allowing includeIf to be evaluated. As a side-effect, git-path now allows flexibly canonicalizing paths similar to how git does it, a capability that will certainly replace many occurrences of canonicalize() in the codebase.
git-discover grows up
Besides being able to handle work-trees during discovery, git-discover will now, thanks to contributions alone, respect ceiling directories and avoid crossing the file system boundary when searching upwards for git repositories.
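A minimal sketch of an upward search with both safeguards might look like this. It is not git-discover's code: the function name and parameters are hypothetical, a repository is recognized merely by the presence of a `.git` entry, and stopping before entering a ceiling directory is my reading of how git treats GIT_CEILING_DIRECTORIES.

```python
from pathlib import Path


def discover(start, ceiling_dirs=(), cross_fs=False):
    """Walk upwards from `start` and return the first directory containing `.git`.

    Returns None when the search hits a ceiling directory, a file system
    boundary (unless cross_fs is True), or the file system root.
    """
    ceilings = {Path(c).resolve() for c in ceiling_dirs}
    current = Path(start).resolve()
    start_device = current.stat().st_dev
    while True:
        # `.git` is a directory in a regular checkout, a file in a linked worktree.
        if (current / ".git").exists():
            return current
        parent = current.parent
        if parent == current:
            return None  # reached the file system root
        if parent in ceilings:
            return None  # never step up into a ceiling directory
        if not cross_fs and parent.stat().st_dev != start_device:
            return None  # the parent lives on a different file system
        current = parent
```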
Launch of new YouTube show named Gifting Gitoxide
Sidney and I decided that it's time to change the format to something that allows him to go more hands-on with gitoxide. Thus the new format is all about reviewing contributions. This I will do with him once a week, but will also record review sessions on other PRs as I see fit. These are probably helpful to those who contributed as well due to the additional insights and train of thought they will provide.
Vergen and additional MSRV learnings
Vergen was the reason for a quest to set up an MSRV for gitoxide, and it's also the reason I chose to follow the MSRV of the windows crate which uses much more recent compiler versions.
vergen helped to get to that conclusion by not providing any insights (due to a lack of reply) into why they need a lower MSRV.
Now gitoxide will assure there is at least one recent release before the MSRV is increased which will also cause git-repository to receive a 'breaking-change' indicating version bump.
Outlook
We are still looking to read attributes during checkout, and now that excludes can be matched and showed the way, implementing attribute matching and access will pose no difficulties (and clearly assuming that the attribute logic itself isn't unexpectedly tricky). From there filters can finally be run and I will feel we are finally making progress on properly checking out a repository. From that point of view, this month was full of digressions, but sometimes priorities change temporarily.
Furthermore I expect to have less time for gitoxide development due to two paid jobs coming up and taking priority. Of those one is more permanent so I would expect gitoxide to reduce its velocity by 25% at least, and probably more like 50% in June.
In retrospect, I am very happy to finally have tackled one of the big unknowns that were worktrees, allowing gitoxide to handle more repositories than ever (submodules notwithstanding).
Cheers,
Sebastian
PS: The latest timesheets can be found here.
AI generated from the screenshots above. Not proofread, even though I fixed the timesheets link.
[Gitoxide in May]: gix repo exclude query and full access to work trees
Byron published an update on May 22, 2022 for all sponsors
As much as the last month felt like there was incredible and diverse progress, this month felt much more tame in comparison. Maybe we learn why that is as well, let's check.
The worktree checkout block
In the quest to resolve attributes for paths being checked out, I decided it should be easier first to learn how it's done by implementing exclude file matching to answer questions like is_excluded(path) – simple enough, right? Little did I know...
Finalization of exclude file matching in git-attributes
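Exclude file handling is complex as it requires you to deal with

- immutable global excludes consisting of one or more files
- a mutable .gitignore file stack which changes depending on the path that is matched
- immutable global command-line overrides with additional patterns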
To add even more complexity, I realized quite late in the game that matching patterns like target/ is actually special as it's implemented by matching against directories while the .gitignore stack is built. Then when matching a file, one actually shortcuts the search by seeing that the containing folder is actually ignored.
The above mechanism as implemented by git comes at a price: one seems to be unable to use negative excludes contained in these folders, which never get a chance to match due to the matching algorithm. It's a shortcoming of the git implementation that can be fixed, and it was interesting for me to rationalize why gitoxide should actually not fix it: gitoxide prefers to also mirror shortcomings if doing so avoids surprises. Half-jokingly I was thinking that maybe one day there is a feature toggle to enable certain fixes per user request.
Cleanup path handling which moved from git-features to git-path
Path handling was always a bit tricky as gitoxide tries hard to not bind itself to any encoding as git seems to do. Over time it turned out that git does in fact assume UTF-8 compatible encodings on Windows at least which can fail due to the infamous illformed UTF-8 problem.
From that realization all code that handles paths and needed conversions was directed to use git-features::path::*, which was like a catch-all for this kind of functionality in a crate that could cheaply be shared.
With the rise of the git-discover crate (detailed later), it became necessary (or at least seemed cleaner) to move all path related functionality into the new git-path crate with the added benefit of the upcoming implementation for path-specs (as sketched in the git_path::Spec type).
gix repo exclude query sub-command
As always, every relevant feature has to go through all the layers to become useful and to be able to test it outside of its test-sandboxes. This led to the creation of a subcommand very similar to git check-ignore, but with a few improvements.
→ target: git(main) git check-ignore foo/bar -nv
.gitignore:5:target/ foo/bar
→ target: git(main) gix repo exclude query foo/bar
../.gitignore:5:target/ target/foo/bar
gix naturally shows paths to .gitignore files relative to the working directory as one would expect, something git doesn't do: it uses chdir to set itself to the root of the repository. The latter leads to somewhat confusing output of the path that it matches which looks like foo/bar, even though it effectively is target/foo/bar which is exactly what gix will state here. The latter makes so much more sense when one looks at the pattern that matched it.
As a side-effect to this work, git-config now has a much more convenient API and its usage was revamped within git-repository which now has the notion of config-cache with often-accessed configuration values that are looked up only once and validated during Repository initialization.
Community
Unblocking onefetch release with worktree support
After casually scrolling through onefetch I came to the realization that previously it could open and analyze linked worktrees (those created with git worktree add ...) whereas now it could not. gitoxide wasn't yet able to handle .git files which link to a worktree private git directory.
I really thought I could add this in a couple of days... at most, and ended up spending quite exactly 14 days on the matter.
The task is essentially two-fold. First one has to understand gitdir files which come in two forms to find the locations related to a worktree git directory and detect it as such, and secondly one has to handle references a little differently.
The first matter was resolved relatively quickly, and it helped (but also was a necessity) to move git_repository::discover into its very own git-discover crate. That way, tests could run faster and could generally be more focussed than otherwise. Having repository discovery in a separate crate also opens up new opportunities for tests in plumbing crates, which often refrain from using git-repository in favor of being forced to more directly dog-food various plumbing crate APIs.
The second matter, ref handling, was more involved though and required various refactoring to one seemingly simple change: now there are optionally two roots for references. Those for the worktree itself, i.e. worktree-private references, and the shared ones in the common git repository. A major requirement to make this happen was the need to categorize references, that is looking at their name and deriving some properties for them.
Another big change was reference iteration, which now had to not only iterate packed refs and loose refs in lock-step, but also had to deal with two iteration bases. It took time to understand what needed to be done and write the right tests for all of the interactions I identified.
It might also be interesting that gitoxide implements ref-handling as documented and not as implemented by git. It's also made so that all operations support handling those special main-worktree and worktrees/<name> ref prefixes, which git doesn't seem to do. Overall this should make git-ref the go-to implementation for handling references.
As a side-note, this work also clearly showed that implementation ref-tables is a feature that is far in the future and will certainly only ever be motivated by servers. It's entirely unclear how ref-tables would deal with layering worktree private refs on top of common refs, for example, and I have the feeling that it's not meant to do that at all. After all, the server doesn't deal with work-trees so it should be fine.
git-config with preliminary includeIf support
Even though we keep failing creating small PRs that can be merged more often, the same is not true for delivering features, now allowing includeIf to be evaluated. As a side-effect, git-path now allows to flexibly canonicalize paths similar to how git does it, a capability that will certainly replace many occurrences of canonicalize() in the codebase.
git-discover grows up
Despite being able to handle work-trees during discovery, thanks to contributions alone it will now respect ceiling directories and avoid crossing the file system boundary when searching upwards for git repositories.
Launch of new YouTube show named Gifting Gitoxide
Sidney and I decided that it's time to change the format to something that allows him to go more hands-on with gitoxide. Thus the new format is all about reviewing contributions. This I will do with him once a week, but will also record review sessions on other PRs as I see fit. These are probably helpful to those who contributed as well due to the additional insights and train of thought they will provide.
Vergen and additional MSRV learnings
Vergen was the reason for a quest to setting up an MSRV for gitoxide, and it's also the reason I chose to follow the MSRV of the windows crate which uses much more recent compiler versions.
vergen helped to get to that conclusion by not providing any insights (due to a lack of reply) into why they need a lower MSRV.
Now gitoxide will assure there is at least one recent release before the MSRV is increased which will also cause git-repository to receive a 'breaking-change' indicating version bump.
Outlook
We are still looking to read attributes during checkout, and now that excludes can be matched and showed the way, implementing attribute matching and access will pose no difficulties (and clearly assuming that the attribute logic itself isn't unexpectedly tricky). From there filters can finally be run and I will feel we are finally making progress on properly checking out a repository. From that point of view, this month was full of digressions, but sometimes priorities change temporarily.
Furthermore I expect to have less time for gitoxide development due to two paid jobs coming up and taking priority. Of those one is more permanent so I would expect gitoxide to reduce its velocity by 25% at least, and probably more like 50% in June.
In retrospect, I am very happy to finally have tackled one of the big unknowns that were worktrees, allowing gitoxide to handle more repositories than ever (submodules notwithstanding).
Cheers,
Sebastian
PS: The latest timesheets can be found here.
This bikeshed PR has some small new-year fixups to documentation and comments that make more sense to do together than with any of the other branches I've been working on.