I'm doing something with git-filter-repo that is like the convert-svnexternals script here:

https://github.com/newren/git-filter-repo/blob/main/contrib/filter-repo-demos/convert-svnexternals

In that script, the Git hash for the submodule is associated like this:

  dirname = parsed_config[section]['path'].encode()

  # Add gitlink to tree
  commit.file_changes.append(fr.FileChange(b'M', dirname, git_hash, b'160000'))

However, in that script the 'path' value and the name of the .gitmodules section happen to be the same. In my script they are not, for two reasons: the developers using SVN put spaces in path names (and I wasn't sure whether those are valid in Git submodule names), and multiple submodules will reference the same target (so I prepend svn-externals- or extracted-folder- to say why the submodule exists). As a result, the key (the name) of the Git submodule is no longer the same as the path. For example:

[submodule "svn-extern-a_path_with_spaces/SomeLib"]
    path = "a path with spaces/SomeLib"
    url = ssh://[email protected]:7999/FOO/somelib.git

# Same url, different key. There was a lot of gratuitous
# use of svn:externals in the SVN repository I'm converting,
# and some of them were self-links. So the prior one was a link
# from there to here, within the same repo, that I also had to
# extract out into its own repository to make it a submodule.
[submodule "extracted-folder-AnotherPath/SomeLib"]
    path = "AnotherPath/SomeLib"
    url = ssh://[email protected]:7999/FOO/somelib.git

Hence my question: should the line with commit.file_changes.append associate the hash with the name of the submodule or with the path?

Or is a submodule name that differs from the path likely to cause more problems than a submodule name with spaces in it?

1 Answer

The name of the [submodule] section is irrelevant to the gitlink. It exists only to distinguish multiple [submodule] config sections, which the configuration format makes necessary: it does not allow "arrays of sections" at the section level, i.e. there is no equivalent to TOML's "[[submodule]]" syntax.

The actual path of the gitlink is stored in the path option. Section names may have limitations (e.g. allowed characters) that tree paths do not have, so the path is necessarily stored as an option value.
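To make the name/path split concrete, here is a minimal sketch that reads a .gitmodules-style text with Python's configparser (the question's `parsed_config[section]['path']` suggests a similar parser) and maps each submodule name to the path that the gitlink would use. The helper name `gitlink_paths` and the `git.example.invalid` URL are made up for illustration. One assumption worth flagging: configparser keeps the surrounding double quotes of a value, whereas Git strips them, so the sketch strips them by hand and does not handle escape sequences inside quoted values.

```python
import configparser

# Hypothetical .gitmodules content modelled on the question; the URL is a placeholder.
GITMODULES_TEXT = '''\
[submodule "svn-extern-a_path_with_spaces/SomeLib"]
    path = "a path with spaces/SomeLib"
    url = ssh://git.example.invalid/FOO/somelib.git
[submodule "extracted-folder-AnotherPath/SomeLib"]
    path = "AnotherPath/SomeLib"
    url = ssh://git.example.invalid/FOO/somelib.git
'''


def gitlink_paths(gitmodules_text):
    """Map each submodule name to the tree path (as bytes) of its gitlink."""
    # interpolation=None so a literal '%' in a value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(gitmodules_text)

    prefix = 'submodule "'
    paths = {}
    for section in parser.sections():
        if not section.startswith(prefix):
            continue
        # Section header looks like: submodule "<name>"
        name = section[len(prefix):-1]
        # configparser keeps the quotes around the value; drop them.
        path = parser[section]['path'].strip('"')
        paths[name] = path.encode()
    return paths


if __name__ == '__main__':
    for name, path in gitlink_paths(GITMODULES_TEXT).items():
        print(name, '->', path)
```

The bytes returned for each entry are what would be passed as `dirname` in the question's `fr.FileChange(b'M', dirname, git_hash, b'160000')` call; the section name never appears in the tree.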


Changes to submodules do not live in any kind of separate namespace: submodule links exist within the same 'tree' as files and directories, and are therefore associated with a path. That path may correspond to a .gitmodules entry with a matching path = parameter (though it isn't guaranteed to).

The only difference that submodule links have from file (blob) or subdirectory (tree) entries is that the referenced hash is external.

The hash references a 'commit' object found within the submodule's Git repository. Within the parent repository, you would need to find the .gitmodules section with the matching path; cloning the respective URL will hopefully result in a repository that contains the commit (though that isn't guaranteed either).

So the easy way to answer this question is to find a repository where the .gitmodules section names differ from the paths, and to look at the resulting tree objects (with git ls-tree or similar) to see whether it's the section name or the 'path' parameter that reflects the real path.
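That check can be scripted. The following is a small sketch, not a definitive tool: it parses the output of `git ls-tree -r -z <commit>` (each record is `<mode> <type> <object>`, a tab, then the path, terminated by NUL) and keeps only the entries with mode 160000, which are the gitlinks. The function name `gitlinks_from_ls_tree` and the hashes in the sample are invented for illustration.

```python
def gitlinks_from_ls_tree(output):
    """Return {path: commit_hash} for gitlink entries in `git ls-tree -r -z` output."""
    links = {}
    for record in output.split(b'\0'):
        if not record:
            continue  # trailing NUL leaves an empty final record
        meta, path = record.split(b'\t', 1)
        mode, _objtype, object_hash = meta.split(b' ')
        if mode == b'160000':  # gitlink (submodule) entry
            links[path] = object_hash
    return links


# Hypothetical sample output with made-up hashes.
SAMPLE = (
    b'100644 blob ' + b'1' * 40 + b'\t.gitmodules\0'
    b'160000 commit ' + b'2' * 40 + b'\ta path with spaces/SomeLib\0'
    b'160000 commit ' + b'3' * 40 + b'\tAnotherPath/SomeLib\0'
)

if __name__ == '__main__':
    for path, commit in gitlinks_from_ls_tree(SAMPLE).items():
        print(path, commit)
```

Against a real repository, the input could come from something like `subprocess.run(['git', 'ls-tree', '-r', '-z', 'HEAD'], capture_output=True, check=True).stdout`. Comparing the resulting paths with the .gitmodules section names and path values shows which of the two the tree actually uses.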
