@EliahKagan
Member

This bikeshed PR has some small new-year fixups to documentation and comments that make more sense to do together than with any of the other branches I've been working on.

  • e5786e1: Monthly and annual reports occasionally lacked revisions that were already made in the discussion posts--though to my surprise it was more often the other way around. They also contained a bunch of end-of-line whitespace and some strange line separator characters. I updated them to reflect intended revisions present in the discussion post versions, fixed up line-ending whitespace, made sure not to make "fixes" that would actually undo fixups that were made before, and made sure only to change the text itself for changes we already did in the discussion posts. See the commit message for the detailed methodology.
  • b79970a: In CI workflows, some TODOs and one non-TODO comment were missing some useful information, which I added.
  • 8602dbc: Clarified the relationship between the threat model notes document and the threat model document, making clearer that the notes are less significant and don't cover everything in the main document, and also fixed a small typo in the notes.

This is much less than I'd expected it to be, even as years have
passed without such synchronization being done (as far as I know).
Most of the changed files here only have end-of-line fixups, which
consist of ending every line with no whitespace followed by a
newline character, which is unrelated to intended revisions made in
discussion posts but not yet synchronized to the committed reports.

A number of the revisions instead go the other way: copyedited in
the committed files but not in the discussion posts. I went through
and made sure I was not synchronizing errors that were copyedited
away in the discussions. (It may make sense to edit the discussions
to fix the errors up there, but that can be done later.)

This also removes one monthly report file from 2022 that was just a
shell prompt rather than any actual text. If that report is
available, it may make sense to restore its actual text, but I
think it's better to remove the file so as not to give the
impression that the report is available when it's not. (I didn't
find a discussion post for it.)

It's possible there are other fixes that should be synchronized
from the discussion post versions of reports to the ones in `etc`,
but I don't think so. The methodology was that I grabbed all
discussions' first comments (i.e., the opening posts themselves) in
the gitoxide repository, then for each file in `etc/reports` used
`diff-match-patch-fast` to find its Levenshtein distance to each
discussion, identifying the file of shortest distance. Then I used
this to generate a patch and reviewed its `git add -p` hunks,
picking the clearly intended changes.
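
The matching step described above can be sketched roughly as follows. This is a std-only Rust illustration, not the script that was actually used: `levenshtein` and `nearest` are hypothetical helper names, and the real run relied on `diff-match-patch-fast` rather than this hand-written distance function.

```rust
/// Two-row dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Index and distance of the candidate closest to `report`,
/// or `None` if there are no candidates. Ties go to the earliest candidate.
fn nearest(report: &str, candidates: &[&str]) -> Option<(usize, usize)> {
    candidates
        .iter()
        .map(|candidate| levenshtein(report, candidate))
        .enumerate()
        .min_by_key(|&(_, distance)| distance)
}
```

Calling `nearest(report_text, &discussion_bodies)` would give the index of the closest opening post together with its distance; a large distance hints that no discussion post corresponds to the file.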

This basically consisted just of an addition in one post and a
rewritten section in another post: basically all the copyediting
goes in the other direction. So the main benefit of doing this has
been to verify that we don't have as many inconsistencies as
expected, rather than finding and fixing any large number of them.

After doing that, I fixed up line endings in an editor, which was
separate from the above-described procedure. (This includes removing
unusual line separators from a few files.)
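
The end-of-line fixups were done in an editor, but the same normalization could be scripted. A minimal Rust sketch, assuming the unusual separators are U+2028 and U+2029 (the commit message does not name them) and using a hypothetical function name:

```rust
/// Strip trailing whitespace from every line and terminate each line with `\n`.
/// U+2028 and U+2029 are treated as line breaks; that choice is an assumption
/// about which "unusual line separators" were present in the files.
fn normalize_line_endings(text: &str) -> String {
    let unified = text.replace(|c: char| c == '\u{2028}' || c == '\u{2029}', "\n");
    let mut out = String::with_capacity(unified.len());
    // `lines` splits on `\n` and also drops a `\r` that directly precedes it.
    for line in unified.lines() {
        out.push_str(line.trim_end());
        out.push('\n');
    }
    out
}
```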

In the main CI workflow:

- Add context on `--minimal-versions`.
- Explain the weird lockfile name used across two steps of a job.

In the release workflow:

- Link to the main tracking issue for more feature builds.

This doesn't change the threat model document itself, and the
change to the notes does not affect their meaning.
@@ -1 +0,0 @@
-gitoxide ( main) [$?]
\ No newline at end of file
Member Author

I wasn't sure what I should do here, so I removed the file for now, since it contains no report material. Was there a monthly report for May 2022? If so, and you have it--or if it's in the discussions but neither my manual nor my automated searching found it--then we could add back the actual report text. @Byron

Member

A good catch!

Actually I still have it, but I can't retrieve the original markdown. All I have is a rendered version, which copy-pasted looks like this. And I took a screenshot as well.

[Two screenshots of the rendered report, taken 2026-01-02]

Let me see if I can reproduce it with genAI.


As much as the last month felt like there was incredible and diverse progress, this month felt much more tame in comparison. Maybe we learn why that is as well, let's check.

The worktree checkout block

In the quest to resolve attributes for paths being checked out, I decided it should be easier to first learn how it's done by implementing exclude file matching, which allows answering questions like is_excluded(path) - simple enough, right? Little did I know…

Finalization of exclude file matching in git-attributes

Exclude file handling is complex as it requires you to deal with:

  • immutable global excludes consisting of one or more files
  • a mutable .gitignore file stack which changes depending on the path that is matched
  • immutable global command-line overrides with additional patterns

To add even more complexity, I realized quite late in the game that matching patterns like target/ is actually special as it's implemented by matching against directories while the .gitignore stack is built. Then when matching a file, one actually shortcuts the search by seeing that the containing folder is actually ignored.

The above mechanism as implemented by git comes at a price though: one seems to be unable to use negative excludes on files contained in these folders, which never get a chance to match due to the matching algorithm. It's a shortcoming of the git implementation that can be fixed, and it was interesting for me to rationalize why gitoxide should actually not fix it: gitoxide prefers to also mirror shortcomings if doing so avoids surprises. Half-jokingly I was thinking that maybe one day there is a feature toggle to enable certain fixes per user request.
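
To illustrate why a negative pattern cannot rescue a file inside an excluded directory, here is a deliberately tiny Rust model. It is not gitoxide's or git's implementation: patterns only match a single path component literally (no globs, no nested .gitignore stack), and `Pattern`, `parse`, `verdict` and `is_excluded` are made-up names for this sketch.

```rust
/// A drastically simplified ignore pattern that matches one path component literally.
struct Pattern {
    name: String,
    negated: bool,  // written as `!name`
    dir_only: bool, // written as `name/`
}

fn parse(line: &str) -> Pattern {
    let (negated, rest) = match line.strip_prefix('!') {
        Some(rest) => (true, rest),
        None => (false, line),
    };
    let (dir_only, name) = match rest.strip_suffix('/') {
        Some(name) => (true, name),
        None => (false, rest),
    };
    Pattern { name: name.to_string(), negated, dir_only }
}

/// The last matching pattern wins; `None` means that no pattern matched.
fn verdict(patterns: &[Pattern], component: &str, is_dir: bool) -> Option<bool> {
    patterns
        .iter()
        .rev()
        .find(|p| p.name == component && (is_dir || !p.dir_only))
        .map(|p| !p.negated)
}

/// `path` is a `/`-separated path to a file, relative to the repository root.
fn is_excluded(patterns: &[Pattern], path: &str) -> bool {
    let components: Vec<&str> = path.split('/').collect();
    let (file, dirs) = components
        .split_last()
        .expect("split always yields at least one item");
    for dir in dirs {
        if verdict(patterns, dir, true) == Some(true) {
            // Shortcut: once a containing directory is excluded,
            // patterns are never tested against anything below it.
            return true;
        }
    }
    verdict(patterns, file, false) == Some(true)
}
```

With the patterns `target/` and `!keep.txt`, `target/keep.txt` stays excluded because the walk stops at `target`, while `src/keep.txt` is not excluded.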

Cleanup path handling which moved from git-features to git-path

Path handling was always a bit tricky as gitoxide tries hard to not bind itself to any encoding as git seems to do. Over time it turned out that git does in fact assume UTF-8 compatible encodings, on Windows at least, which can fail due to the infamous ill-formed UTF-8 problem.

From that realization all code that handles paths and needed conversions was directed to use git-features::path::*, which was like a catch-all for this kind of functionality in a crate that could cheaply be shared.

With the rise of the git-discover crate (detailed later), it became necessary (or at least seemed cleaner) to move all path related functionality into the new git-path crate with the added benefit of the up and coming implementation for path-specs (as sketched in the git_path::Spec type).

gix repo exclude query sub-command

As always, every relevant feature has to go through all the layers to become useful and to be able to test it outside of its test-sandboxes. This led to the creation of a subcommand very similar to git check-ignore, but with a few improvements.

It's probably best to show the output differences in a typical Rust project:

➜ target git:(main) git check-ignore foo/bar -nv
.gitignore:5:target/ foo/bar
➜ target git:(main) gix repo exclude query foo/bar
../.gitignore:5:target/ target/foo/bar
gix naturally shows paths to .gitignore files relative to the working directory as one would expect, something git doesn't do as it uses chdir to set itself to the root of the repository. The latter leads to somewhat confusing output of the path that it matches which looks like foo/bar, even though it effectively is target/foo/bar, which is exactly what gix will state here. The latter makes so much more sense when one looks at the pattern that matched it.

As a side-effect to this work, git-config now has a much more convenient API and its usage was revamped within git-repository which now has the notion of 'config-cache' with often-accessed configuration values that are looked up only once and validated during Repository initialization.

Community

Unblocking onefetch release with worktree support

After casually strolling through onefetch I came to the realization that previously it could open and analyze linked worktrees (those created with git worktree add …) whereas now it could not. gitoxide wasn't yet able to handle .git files which link to a worktree private git directory.

I really thought I could add this in a couple of days… at most, and ended up spending quite exactly 14 days on the matter.

The task is essentially two-fold. First, one has to understand gitdir files, which come in two forms, to find the locations related to a worktree git directory and detect it as such; secondly, one has to handle references a little differently.
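
As an illustration of the first part, a linked worktree's `.git` is a file whose content points at its private git directory with a `gitdir: <path>` line. The sketch below only parses that one form; `parse_gitdir_file` is a hypothetical helper, not the git-discover API, and relative paths are returned unresolved.

```rust
use std::path::PathBuf;

/// Parse the content of a `.git` *file* of the form `gitdir: <path>`.
/// Returns `None` if the prefix is missing or the path is empty.
fn parse_gitdir_file(content: &str) -> Option<PathBuf> {
    let path = content
        .strip_prefix("gitdir: ")?
        .trim_end_matches(|c: char| c == '\n' || c == '\r');
    if path.is_empty() {
        None
    } else {
        Some(PathBuf::from(path))
    }
}
```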

The first matter was resolved relatively quickly, and it helped (but also was a necessity) to move git_repository::discover into its very own git-discover crate. That way, tests could run faster and could generally be more focussed than otherwise. Having repository discovery in a separate crate also opens up new opportunities for tests in plumbing crates, which often refrain from using git-repository in favor of being forced to more directly dog-food various plumbing crate APIs.

The second matter, ref handling, was more involved though and required various refactoring to one seemingly simple change: now there are optionally two roots for references. Those for the work tree itself, i.e. worktree-private references, and the shared ones in the common git repository. A major requirement to make this happen was the need to categorize references, that is looking at their name and deriving some properties for them. The most important one, you guessed it, is whether or not it's worktree-private.
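
Such a categorization could look roughly like this. The rules are taken from the git-worktree documentation as understood here (refs under `refs/` are shared except `refs/bisect`, `refs/worktree` and `refs/rewritten`; pseudo-refs such as `HEAD` are per-worktree); the `Category` enum and `categorize` function are invented for this sketch and are not the git-ref API.

```rust
#[derive(Debug, PartialEq)]
enum Category {
    /// Lives in the common git directory and is visible from every worktree.
    Shared,
    /// Lives in the git directory of the current worktree only.
    WorktreePrivate,
    /// Explicitly addresses the main worktree: `main-worktree/<ref>`.
    MainWorktree,
    /// Explicitly addresses a linked worktree: `worktrees/<name>/<ref>`.
    LinkedWorktree,
}

/// Derive the category purely from the full reference name.
fn categorize(full_name: &str) -> Category {
    const PRIVATE_PREFIXES: [&str; 3] = ["refs/bisect/", "refs/worktree/", "refs/rewritten/"];
    if full_name.starts_with("main-worktree/") {
        Category::MainWorktree
    } else if full_name.starts_with("worktrees/") {
        Category::LinkedWorktree
    } else if PRIVATE_PREFIXES.iter().any(|prefix| full_name.starts_with(*prefix)) {
        Category::WorktreePrivate
    } else if full_name.starts_with("refs/") {
        Category::Shared
    } else {
        // Pseudo-refs such as HEAD or MERGE_HEAD belong to the worktree.
        Category::WorktreePrivate
    }
}
```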

Another big change was reference iteration, which now had to not only iterate packed refs and loose refs in lock-step, but also had to deal with two iteration bases for loose references.
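
Lock-step iteration amounts to merging name-sorted sequences. A minimal sketch, assuming both inputs are already sorted by full ref name and that a loose ref shadows a packed ref of the same name; `Ref` and `merge_shadowing` are hypothetical names, and a real implementation would iterate lazily rather than collect into vectors.

```rust
use std::cmp::Ordering;

/// (full reference name, target) pair, kept as plain strings for the sketch.
type Ref = (String, String);

/// Merge two name-sorted sequences in lock-step.
/// On equal names the entry from `primary` shadows the one from `secondary`.
fn merge_shadowing(primary: Vec<Ref>, secondary: Vec<Ref>) -> Vec<Ref> {
    let mut a = primary.into_iter().peekable();
    let mut b = secondary.into_iter().peekable();
    let mut out = Vec::new();
    loop {
        let order = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x.0.cmp(&y.0),
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (None, None) => break,
        };
        match order {
            Ordering::Less => out.extend(a.next()),
            Ordering::Greater => out.extend(b.next()),
            Ordering::Equal => {
                out.extend(a.next());
                b.next(); // drop the shadowed entry
            }
        }
    }
    out
}
```

The two loose bases can be handled by merging them first, for example `merge_shadowing(merge_shadowing(private_loose, common_loose), packed)`.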

Overall, the implementation didn't give me any trouble, but it took time to understand what needed to be done and write the right tests for all of the interactions I identified.

It might also be interesting that gitoxide implements ref-handling in work-trees as documented and not as implemented by git. It's also made so that all operations support handling those special main-worktree and worktrees/ ref prefixes, which git doesn't seem to do. Overall this should make git-ref the go-to implementation for handling references.

As a side-note, this work also clearly showed that implementing ref-tables is a feature that is far in the future and will certainly only ever be motivated by servers. It's entirely unclear how ref-tables would deal with layering worktree private refs on top of common refs, for example, and I have the feeling that it's not meant to do that at all. After all, the server doesn't deal with work-trees so it should be fine.

git-config with preliminary includeIf support

Even though we keep failing to create small PRs that can be merged more often, the same is not true for delivering features, now allowing includeIf to be evaluated. As a side-effect, git-path now allows flexibly canonicalizing paths similar to how git does it, a capability that will certainly replace many occurrences of canonicalize() in the codebase.

git-discover grows up

Despite being able to handle work-trees during discovery, thanks to contributions alone it will now respect ceiling directories and avoid crossing the file system boundary when searching upwards for git repositories.
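
A rough sketch of an upward search that honors ceiling directories follows. It assumes semantics similar to git's `GIT_CEILING_DIRECTORIES` (the starting directory is always examined, but the search never moves up into a ceiling directory), takes the repository check as a closure so it stays independent of the file system, and does not model the file system boundary check. `discover_upwards` is a hypothetical name, not the git-discover API.

```rust
use std::path::{Path, PathBuf};

/// Walk upwards from `start` until `is_repo` accepts a directory.
/// The walk stops before entering any directory listed in `ceiling_dirs`.
fn discover_upwards(
    start: &Path,
    ceiling_dirs: &[PathBuf],
    is_repo: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let mut dir = start;
    loop {
        if is_repo(dir) {
            return Some(dir.to_path_buf());
        }
        // Reaching the root without a match ends the search.
        let parent = dir.parent()?;
        if ceiling_dirs.iter().any(|ceiling| ceiling.as_path() == parent) {
            return None;
        }
        dir = parent;
    }
}
```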

Launch of new YouTube show named Gifting Gitoxide

Sidney and I decided that it's time to change the format to something that allows him to go more hands-on with gitoxide. Thus the new format is all about reviewing contributions. This I will do with him once a week, but will also record review sessions on other PRs as I see fit. These are probably helpful to those who contributed as well due to the additional insights and train of thought they will provide.

Vergen and additional MSRV learnings

Vergen was the reason for a quest to set up an MSRV for gitoxide, and it's also the reason I chose to follow the MSRV of the windows crate which uses much more recent compiler versions.

vergen helped to get to that conclusion by not providing any insights (due to a lack of reply) into why they need a lower MSRV.

Now gitoxide will ensure there is at least one recent release before the MSRV is increased, which will also cause git-repository to receive a 'breaking-change' indicating version bump.

Outlook

We are still looking to read attributes during checkout, and now that excludes can be matched and showed the way, implementing attribute matching and access will pose no difficulties (and clearly assuming that the attribute logic itself isn't unexpectedly tricky). From there filters can finally be run and I will feel we are finally making progress on properly checking out a repository. From that point of view, this month was full of digressions, but sometimes priorities change temporarily.

Furthermore I expect to have less time for gitoxide development due to two paid jobs coming up and taking priority. Of those one is more permanent so I would expect gitoxide to reduce its velocity by 25% at least, and probably more like 50% in June.

In retrospect, I am very happy to finally have tackled one of the big unknowns that were worktrees, allowing gitoxide to handle more repositories than ever (submodules notwithstanding).

Cheers,
Sebastian

PS: The latest timesheets can be found here.


Member

AI generated from the screenshots above. Not proofread, even though I fixed the timesheets link.


[Gitoxide in May]: gix repo exclude query and full access to work trees

Byron published an update on May 22, 2022 for all sponsors

As much as the last month felt like there was incredible and diverse progress, this month felt much more tame in comparison. Maybe we learn why that is as well, let's check.

The worktree checkout block

In the quest to resolve attributes for paths being checked out, I decided it should be easier first to learn how it's done by implementing exclude file matching to answer questions like is_excluded(path) – simple enough, right? Little did I know...

Finalization of exclude file matching in git-attributes

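Exclude file handling is complex as it requires you to deal with:

  • immutable global excludes consisting of one or more files
  • a mutable .gitignore file stack which changes depending on the path that is matched
  • immutable global command-line overrides with additional patterns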

To add even more complexity, I realized quite late in the game that matching patterns like target/ is actually special as it's implemented by matching against directories while the .gitignore stack is built. Then when matching a file, one actually shortcuts the search by seeing that the containing folder is actually ignored.

The above mechanism as implemented by git comes at a price: one seems to be unable to use negative excludes on files contained in these folders, which never get a chance to match due to the matching algorithm. It's a shortcoming of the git implementation that can be fixed, and it was interesting for me to rationalize why gitoxide should actually not fix it: gitoxide prefers to also mirror shortcomings if doing so avoids surprises. Half-jokingly I was thinking that maybe one day there is a feature toggle to enable certain fixes per user request.

Cleanup path handling which moved from git-features to git-path

Path handling was always a bit tricky as gitoxide tries hard to not bind itself to any encoding as git seems to do. Over time it turned out that git does in fact assume UTF-8 compatible encodings on Windows at least which can fail due to the infamous illformed UTF-8 problem.

From that realization all code that handles paths and needed conversions was directed to use git-features::path::*, which was like a catch-all for this kind of functionality in a crate that could cheaply be shared.

With the rise of the git-discover crate (detailed later), it became necessary (or at least seemed cleaner) to move all path related functionality into the new git-path crate with the added benefit of the upcoming implementation for path-specs (as sketched in the git_path::Spec type).

gix repo exclude query sub-command

As always, every relevant feature has to go through all the layers to become useful and to be able to test it outside of its test-sandboxes. This led to the creation of a subcommand very similar to git check-ignore, but with a few improvements.

➜ target git:(main) git check-ignore foo/bar -nv
.gitignore:5:target/ foo/bar
➜ target git:(main) gix repo exclude query foo/bar
../.gitignore:5:target/ target/foo/bar

gix naturally shows paths to .gitignore files relative to the working directory as one would expect, something git doesn't do: it uses chdir to set itself to the root of the repository. The latter leads to somewhat confusing output of the path that it matches which looks like foo/bar, even though it effectively is target/foo/bar which is exactly what gix will state here. The latter makes so much more sense when one looks at the pattern that matched it.

As a side-effect to this work, git-config now has a much more convenient API and its usage was revamped within git-repository which now has the notion of config-cache with often-accessed configuration values that are looked up only once and validated during Repository initialization.

Community

Unblocking onefetch release with worktree support

After casually strolling through onefetch I came to the realization that previously it could open and analyze linked worktrees (those created with git worktree add ...) whereas now it could not. gitoxide wasn't yet able to handle .git files which link to a worktree private git directory.

I really thought I could add this in a couple of days... at most, and ended up spending quite exactly 14 days on the matter.

The task is essentially two-fold. First one has to understand gitdir files which come in two forms to find the locations related to a worktree git directory and detect it as such, and secondly one has to handle references a little differently.

The first matter was resolved relatively quickly, and it helped (but also was a necessity) to move git_repository::discover into its very own git-discover crate. That way, tests could run faster and could generally be more focussed than otherwise. Having repository discovery in a separate crate also opens up new opportunities for tests in plumbing crates, which often refrain from using git-repository in favor of being forced to more directly dog-food various plumbing crate APIs.

The second matter, ref handling, was more involved though and required various refactoring to one seemingly simple change: now there are optionally two roots for references. Those for the worktree itself, i.e. worktree-private references, and the shared ones in the common git repository. A major requirement to make this happen was the need to categorize references, that is looking at their name and deriving some properties for them. The most important one, you guessed it, is whether or not it's worktree-private.

Another big change was reference iteration, which now had to not only iterate packed refs and loose refs in lock-step, but also had to deal with two iteration bases. It took time to understand what needed to be done and write the right tests for all of the interactions I identified.

It might also be interesting that gitoxide implements ref-handling as documented and not as implemented by git. It's also made so that all operations support handling those special main-worktree and worktrees/<name> ref prefixes, which git doesn't seem to do. Overall this should make git-ref the go-to implementation for handling references.

As a side-note, this work also clearly showed that implementing ref-tables is a feature that is far in the future and will certainly only ever be motivated by servers. It's entirely unclear how ref-tables would deal with layering worktree private refs on top of common refs, for example, and I have the feeling that it's not meant to do that at all. After all, the server doesn't deal with work-trees so it should be fine.

git-config with preliminary includeIf support

Even though we keep failing to create small PRs that can be merged more often, the same is not true for delivering features, now allowing includeIf to be evaluated. As a side-effect, git-path now allows flexibly canonicalizing paths similar to how git does it, a capability that will certainly replace many occurrences of canonicalize() in the codebase.

git-discover grows up

Despite being able to handle work-trees during discovery, thanks to contributions alone it will now respect ceiling directories and avoid crossing the file system boundary when searching upwards for git repositories.

Launch of new YouTube show named Gifting Gitoxide

Sidney and I decided that it's time to change the format to something that allows him to go more hands-on with gitoxide. Thus the new format is all about reviewing contributions. This I will do with him once a week, but will also record review sessions on other PRs as I see fit. These are probably helpful to those who contributed as well due to the additional insights and train of thought they will provide.

Vergen and additional MSRV learnings

Vergen was the reason for a quest to set up an MSRV for gitoxide, and it's also the reason I chose to follow the MSRV of the windows crate which uses much more recent compiler versions.

vergen helped to get to that conclusion by not providing any insights (due to a lack of reply) into why they need a lower MSRV.

Now gitoxide will ensure there is at least one recent release before the MSRV is increased, which will also cause git-repository to receive a 'breaking-change' indicating version bump.

Outlook

We are still looking to read attributes during checkout, and now that excludes can be matched and showed the way, implementing attribute matching and access will pose no difficulties (and clearly assuming that the attribute logic itself isn't unexpectedly tricky). From there filters can finally be run and I will feel we are finally making progress on properly checking out a repository. From that point of view, this month was full of digressions, but sometimes priorities change temporarily.

Furthermore I expect to have less time for gitoxide development due to two paid jobs coming up and taking priority. Of those one is more permanent so I would expect gitoxide to reduce its velocity by 25% at least, and probably more like 50% in June.

In retrospect, I am very happy to finally have tackled one of the big unknowns that were worktrees, allowing gitoxide to handle more repositories than ever (submodules notwithstanding).

Cheers,

Sebastian

PS: The latest timesheets can be found here.

@EliahKagan changed the title from "Clarify etc files, synchronize reports revisions, and expand some CI comments" to "Clarify etc files, synchronize reports revisions; expand CI comments" on Jan 1, 2026
@EliahKagan enabled auto-merge on January 1, 2026 22:45
@EliahKagan merged commit 17bdf47 into GitoxideLabs:main on Jan 1, 2026
28 checks passed
@EliahKagan deleted the etc branch on January 1, 2026 22:58