| Age | Commit message (Collapse) | Author | Files | Lines |
|
* maint-2.22: (43 commits)
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
...
|
|
* maint-2.21: (42 commits)
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
...
|
|
* maint-2.20: (36 commits)
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
...
|
|
* maint-2.19: (34 commits)
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
...
|
|
* maint-2.18: (33 commits)
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
...
|
|
* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
|
|
* maint-2.16: (31 commits)
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
...
|
|
* maint-2.15: (29 commits)
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
...
|
|
* maint-2.14: (28 commits)
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
test-path-utils: offer to run a protectNTFS/protectHFS benchmark
...
|
|
As with export-marks in the previous commit, import-marks can access the
filesystem. This is significantly less dangerous than export-marks
because it only involves reading from arbitrary paths, rather than
writing them. However, it could still be surprising and have security
implications (e.g., exfiltrating data from a service that accepts
fast-import streams).
Let's lump it (and its "if-exists" counterpart) in with export-marks,
and enable the in-stream version only if --allow-unsafe-features is set.
Signed-off-by: Jeff King <peff@peff.net>
|
|
The fast-import stream command "feature export-marks=<path>" lets the
stream write marks to an arbitrary path. This may be surprising if you
are running fast-import against an untrusted input (which otherwise
cannot do anything except update Git objects and refs).
Let's disallow the use of this feature by default, and provide a
command-line option to re-enable it (you can always just use the
command-line --export-marks as well, but the in-stream version provides
an easy way for exporters to control the process).
This is a backwards-incompatible change, since the default is flipping
to the new, safer behavior. However, since the main users of the
in-stream versions would be import/export-based remote helpers, and
since we trust remote helpers already (which are already running
arbitrary code), we'll pass the new option by default when reading a
remote helper's stream. This should minimize the impact.
Note that the implementation isn't totally simple, as we have to work
around the fact that fast-import doesn't parse its command-line options
until after it has read any "feature" lines from the stream. This is how
it lets command-line options override in-stream. But in our case, it's
important to parse the new --allow-unsafe-features first.
There are three options for resolving this:
1. Do a separate "early" pass over the options. This is easy for us to
do because there are no command-line options that allow the
"unstuck" form (so there's no chance of us mistaking an argument
for an option), though it does introduce a risk of incorrect
parsing later (e.g,. if we convert to parse-options).
2. Move the option parsing phase back to the start of the program, but
teach the stream-reading code never to override an existing value.
This is tricky, because stream "feature" lines override each other
(meaning we'd have to start tracking the source for every option).
3. Accept that we might parse a "feature export-marks" line that is
forbidden, as long we don't _act_ on it until after we've parsed
the command line options.
This would, in fact, work with the current code, but only because
the previous patch fixed the export-marks parser to avoid touching
the filesystem.
So while it works, it does carry risk of somebody getting it wrong
in the future in a rather subtle and unsafe way.
I've gone with option (1) here as simple, safe, and unlikely to cause
regressions.
This fixes CVE-2019-1348.
Signed-off-by: Jeff King <peff@peff.net>
|
|
When we parse the --export-marks option, we don't immediately open the
file, but we do create any leading directories. This can be especially
confusing when a command-line option overrides an in-stream one, in
which case we'd create the leading directory for the in-stream file,
even though we never actually write the file.
Let's instead create the directories just before opening the file, which
means we'll create only useful directories. Note that this could change
the handling of relative paths if we chdir() in between, but we don't
actually do so; the only permanent chdir is from setup_git_directory()
which runs before either code path (potentially we should take the
pre-setup dir into account to avoid surprising the user, but that's an
orthogonal change).
The test just adapts the existing "override" test to use paths with
leading directories. This checks both that the correct directory is
created (which worked before but was not tested), and that the
overridden one is not (our new fix here).
While we're here, let's also check the error result of
safe_create_leading_directories(). We'd presumably notice any failure
immediately after when we try to open the file itself, but we can give a
more specific error message in this case.
Signed-off-by: Jeff King <peff@peff.net>
|
|
When asked to import marks from "subdir/file.marks", we create the
leading directory "subdir" if it doesn't exist. This makes no sense for
importing marks, where we only ever open the path for reading.
Most of the time this would be a noop, since if the marks file exists,
then the leading directories exist, too. But if it doesn't (e.g.,
because --import-marks-if-exists was used), then we'd create the useless
directory.
This dates back to 580d5f83e7 (fast-import: always create marks_file
directories, 2010-03-29). Even then it was useless, so it seems to have
been added in error alongside the --export-marks case (which _is_
helpful).
Signed-off-by: Jeff King <peff@peff.net>
|
|
We parse options like "--max-pack-size=" using skip_prefix(), which
makes sense to get at the bytes after the "=". However, we also parse
"--quiet" and "--stats" with skip_prefix(), which allows things like
"--quiet-nonsense" to behave like "--quiet".
This was a mistaken conversion in 0f6927c229 (fast-import: put option
parsing code in separate functions, 2009-12-04). Let's tighten this to
an exact match, which was the original intent.
Signed-off-by: Jeff King <peff@peff.net>
|
|
The tree-walk API learned to pass an in-core repository
instance throughout more codepaths.
* nd/tree-walk-with-repo:
t7814: do not generate same commits in different repos
Use the right 'struct repository' instead of the_repository
match-trees.c: remove the_repo from shift_tree*()
tree-walk.c: remove the_repo from get_tree_entry_follow_symlinks()
tree-walk.c: remove the_repo from get_tree_entry()
tree-walk.c: remove the_repo from fill_tree_descriptor()
sha1-file.c: remove the_repo from read_object_with_reference()
|
|
Code clean-up.
* rs/copy-array:
use COPY_ARRAY for copying arrays
coccinelle: use COPY_ARRAY for copying arrays
|
|
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert calls of memcpy(3) to use COPY_ARRAY, which shortens and
simplifies the code a bit.
Patch generated by Coccinelle and contrib/coccinelle/array.cocci.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since git supports commit messages with an encoding other than UTF-8,
allow fast-import to import such commits. This may be useful for folks
who do not want to reencode commit messages from an external system, and
may also be useful to achieve reversible history rewrites (e.g. sha1sum
<-> sha256sum transitions or subtree work) with git repositories that
have used specialized encodings in their commit history.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Conversion from unsigned char[20] to struct object_id continues.
* bc/hash-transition-16: (35 commits)
gitweb: make hash size independent
Git.pm: make hash size independent
read-cache: read data in a hash-independent way
dir: make untracked cache extension hash size independent
builtin/difftool: use parse_oid_hex
refspec: make hash size independent
archive: convert struct archiver_args to object_id
builtin/get-tar-commit-id: make hash size independent
get-tar-commit-id: parse comment record
hash: add a function to lookup hash algorithm by length
remote-curl: make hash size independent
http: replace sha1_to_hex
http: compute hash of downloaded objects using the_hash_algo
http: replace hard-coded constant with the_hash_algo
http-walker: replace sha1_to_hex
http-push: remove remaining uses of sha1_to_hex
http-backend: allow 64-character hex names
http-push: convert to use the_hash_algo
builtin/pull: make hash-size independent
builtin/am: make hash size independent
...
|
|
When get-mark was introduced in commit 28c7b1f7b7b7 ("fast-import: add a
get-mark command", 2015-07-01), it followed the precedent of the
cat-blob command to be allowed on any line other than in the middle of a
data directive; see commit 777f80d7429b ("fast-import: Allow cat-blob
requests at arbitrary points in stream", 2010-11-28). It was useful to
allow cat-blob directives in the middle of a commit to get more data
that would be used in writing the current commit object. get-mark is
not similarly useful since fast-import can already use either object id
or mark. Further, trying to allow this command anywhere caused parsing
bugs. Fix the parsing problems by only allowing get-mark commands to
appear when other commands have completed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In commit 777f80d7429b ("fast-import: Allow cat-blob requests at
arbitrary points in stream", 2010-11-28), fast-import started allowing
cat-blob commands to appear on the start of any line except in the
middle of a "data" command. It could be in the middle of various
directives that were part of a tag command, or in the middle of
checkpoints or progresses (each of which allow an optional second empty
newline), or even immediately after the mark command of a blob before
the data directive appeared (raising the question of what if it used the
mark for the blob that just barely appeared in the stream that we do not
yet have the data for). None of these locations make any sense as
places to put cat-blob requests.
The purpose of this change as stated in that commit message was to
[save] frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.
However, that can be achieved by simply allowing cat-blob requests to
appear whenever a filemodify directive is allowed. Further, it avoids
setting a bad precedent for other commands to follow (e.g. get-mark); a
precedent which caused parsing problems in corner cases.
Technically, inline filemodify directives add a slight wrinkle in that
frontends might want to have cat-blob directives appear after the start
of the filemodify and before the data directive contained within it. I
think it would have been better to disallow such a case (it would be
trivial to use cat-blob before the filemodify instead), but since there
is evidence this was used, for backwards compatibility let's support
that case too.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is not a very important change, and one that I expect to have no
performance impact whatsoever, but reading the code bothered me. The
parsing of command types in cmd_main() mostly runs in order of most
common to least common commands; sure, it's hard to say for sure what
the most common are without some type of study, but it seems fairly
clear to mark the original four ("blob", "commit", "tag", "reset") as
the most prominent. Indeed, the parsing for most other commands were
added to later in the list. However, when "ls" was added, it was stuck
near the top of the list, with no rationale for that particular
location. Move it down to later to appease my Tourette's-like internal
twitching that its former location was causing.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace the uses of sha1_to_hex in this function with hash_to_hex to
allow the use of SHA-256 as well. Rename a variable since it is no
longer limited to SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace several uses of GIT_SHA1_HEXSZ and 40-based constants with
references to the_hash_algo. Update the note handling code here to
compute path sizes based on GIT_MAX_RAWSZ as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This member is used to represent the pack checksum of the pack in
question. Expand this member to be GIT_MAX_RAWSZ bytes in length so it
works with longer hashes and rename it to be "hash" instead of "sha1".
This transformation was made with a change to the definition and the
following semantic patch:
@@
struct packed_git *E1;
@@
- E1->sha1
+ E1->hash
@@
struct packed_git E1;
@@
- E1.sha1
+ E1.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Small fixes and features for fast-export and fast-import, mostly on
the fast-export side.
* en/fast-export-import:
fast-export: add a --show-original-ids option to show original names
fast-import: remove unmaintained duplicate documentation
fast-export: add --reference-excluded-parents option
fast-export: ensure we export requested refs
fast-export: when using paths, avoid corrupt stream with non-existent mark
fast-export: move commit rewriting logic into a function for reuse
fast-export: avoid dying when filtering by paths and old tags exist
fast-export: use value from correct enum
git-fast-export.txt: clarify misleading documentation about rev-list args
git-fast-import.txt: fix documentation for --quiet option
fast-export: convert sha1 to oid
|
|
Knowing the original names (hashes) of commits can sometimes enable
post-filtering that would otherwise be difficult or impossible. In
particular, the desire to rewrite commit messages which refer to other
prior commits (on top of whatever other filtering is being done) is
very difficult without knowing the original names of each commit.
In addition, knowing the original names (hashes) of blobs can allow
filtering by blob-id without requiring re-hashing the content of the
blob, and is thus useful as a small optimization.
Once we add original ids for both commits and blobs, we may as well
add them for tags too for completeness. Perhaps someone will have a
use for them.
This commit teaches a new --show-original-ids option to fast-export
which will make it add a 'original-oid <hash>' line to blob, commits,
and tags. It also teaches fast-import to parse (and ignore) such
lines.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
fast-import.c has started with a comment for nine and a half years
re-directing the reader to Documentation/git-fast-import.txt for
maintained documentation. Instead of leaving the unmaintained
documentation in place, just excise it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When printing variables which contain a size, today "unsigned long"
is used at many places.
In order to be able to change the type from "unsigned long" into size_t
some day in the future, we need to have a way to print 64 bit variables
on a system that has "unsigned long" defined to be 32 bit, like Win64.
Upcast all those variables into uintmax_t before they are printed.
This is to prepare for a bigger change, when "unsigned long"
will be converted into size_t for variables which may be > 4Gib.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.
* jk/cocci:
show_dirstat: simplify same-content check
read-cache: use oideq() in ce_compare functions
convert hashmap comparison functions to oideq()
convert "hashcmp() != 0" to "!hasheq()"
convert "oidcmp() != 0" to "!oideq()"
convert "hashcmp() == 0" to hasheq()
convert "oidcmp() == 0" to oideq()
introduce hasheq() and oideq()
coccinelle: use <...> for function exclusion
|
|
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.
* ds/reachable:
commit-reach: correct accidental #include of C file
commit-reach: use can_all_from_reach
commit-reach: make can_all_from_reach... linear
commit-reach: replace ref_newer logic
test-reach: test commit_contains
test-reach: test can_all_from_reach_with_flags
test-reach: test reduce_heads
test-reach: test get_merge_bases_many
test-reach: test is_descendant_of
test-reach: test in_merge_bases
test-reach: create new test tool for ref_newer
commit-reach: move can_all_from_reach_with_flags
upload-pack: generalize commit date cutoff
upload-pack: refactor ok_to_give_up()
upload-pack: make reachable() more generic
commit-reach: move commit_contains from ref-filter
commit-reach: move ref_newer from remote.c
commit.h: remove method declarations
commit-reach: move walk methods from commit.c
|
|
This is the flip side of the previous two patches: checking
for a non-zero oidcmp() can be more strictly expressed as
inequality. Like those patches, we write "!= 0" in the
coccinelle transformation, which covers by isomorphism the
more common:
if (oidcmp(E1, E2))
As with the previous two patches, this patch can be achieved
almost entirely by running "make coccicheck"; the only
differences are manual line-wrap fixes to match the original
code.
There is one thing to note for anybody replicating this,
though: coccinelle 1.0.4 seems to miss the case in
builtin/tag.c, even though it's basically the same as all
the others. Running with 1.0.7 does catch this, so
presumably it's just a coccinelle bug that was fixed in the
interim.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.
The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).
This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.
I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are many places in the codebase that want to iterate over
all packfiles known to Git. The purposes are wide-ranging, and
those that can take advantage of the multi-pack-index already
do. So, use get_all_packs() instead of get_packed_git() to be
sure we are iterating over all packfiles.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.
* sb/object-store-lookup: (32 commits)
commit.c: allow lookup_commit_reference to handle arbitrary repositories
commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
tag.c: allow deref_tag to handle arbitrary repositories
object.c: allow parse_object to handle arbitrary repositories
object.c: allow parse_object_buffer to handle arbitrary repositories
commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
commit.c: allow set_commit_buffer to handle arbitrary repositories
commit.c: migrate the commit buffer to the parsed object store
commit-slabs: remove realloc counter outside of slab struct
commit.c: allow parse_commit_buffer to handle arbitrary repositories
tag: allow parse_tag_buffer to handle arbitrary repositories
tag: allow lookup_tag to handle arbitrary repositories
commit: allow lookup_commit to handle arbitrary repositories
tree: allow lookup_tree to handle arbitrary repositories
blob: allow lookup_blob to handle arbitrary repositories
object: allow lookup_object to handle arbitrary repositories
object: allow object_as_type to handle arbitrary repositories
tag: add repository argument to deref_tag
tag: add repository argument to parse_tag_buffer
tag: add repository argument to lookup_tag
...
|
|
These methods are now declared in commit-reach.h. Remove them from
commit.h and add new include statements in all files that require these
declarations.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We know diff_delta() returns NULL, saying "no good delta exists for
it", when fed an empty data. Check the length of the data in the
caller to avoid such a call.
This incidentally reduces the number of attempted deltification we
see in the final statistics.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up to adjust to a more recent lockfile API convention that
allows lockfile instances kept on the stack.
* ma/lockfile-cleanup:
lock_file: move static locks into functions
lock_file: make function-local locks non-static
refs.c: do not die if locking fails in `delete_pseudoref()`
refs.c: do not die if locking fails in `write_pseudoref()`
t/helper/test-write-cache: clean up lock-handling
|
|
The codepath around object-info API has been taught to take the
repository object (which in turn tells the API which object store
the objects are to be located).
* sb/oid-object-info:
cache.h: allow oid_object_info to handle arbitrary repositories
packfile: add repository argument to cache_or_unpack_entry
packfile: add repository argument to unpack_entry
packfile: add repository argument to read_object
packfile: add repository argument to packed_object_info
packfile: add repository argument to packed_to_object_type
packfile: add repository argument to retry_bad_packed_offset
cache.h: add repository argument to oid_object_info
cache.h: add repository argument to oid_object_info_extended
|
|
Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)
These `struct lock_file`s are local to their respective functions and we
can drop their staticness.
For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.
As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.
* ds/commit-graph:
commit-graph: implement "--append" option
commit-graph: build graph from starting commits
commit-graph: read only from specific pack-indexes
commit: integrate commit graph with commit parsing
commit-graph: close under reachability
commit-graph: add core.commitGraph setting
commit-graph: implement git commit-graph read
commit-graph: implement git-commit-graph write
commit-graph: implement write_commit_graph()
commit-graph: create git-commit-graph builtin
graph: add commit graph design document
commit-graph: add format document
csum-file: refactor finalize_hashfile() method
csum-file: rename hashclose() to finalize_hashfile()
|
|
Add a repository argument to allow the callers of unpack_entry
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a repository argument to allow the callers of oid_object_info
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
An reusable "memory pool" implementation has been extracted from
fast-import.c, which in turn has become the first user of the
mem-pool API.
* jm/mem-pool:
mem-pool: move reusable parts of memory pool into its own file
fast-import: introduce mem_pool type
fast-import: rename mem_pool type to mp_block
|
|
This moves the reusable parts of the memory pool logic used by
fast-import.c into its own file for use by other components.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Introduce the mem_pool type which encapsulates all the information necessary to
manage a pool of memory. This change moves the existing variables in
fast-import used to support the global memory pool to use this structure. It
also renames variables that are no longer used by memory pools to reflect their
more scoped usage.
These changes allow for the multiple instances of a memory pool to
exist and be reused outside of fast-import. In a future commit the
mem_pool type will be moved to its own file.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is part of a patch series to extract the memory pool logic in
fast-import into a more generalized version. The existing mem_pool type
maps more closely to a "block of memory" (mp_block) in the more
generalized memory pool. This commit renames the mem_pool to mp_block to
reduce churn in future patches.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Refactoring of the internal global data structure continues.
* sb/packfiles-in-repository:
packfile: keep prepare_packed_git() private
packfile: allow find_pack_entry to handle arbitrary repositories
packfile: add repository argument to find_pack_entry
packfile: allow reprepare_packed_git to handle arbitrary repositories
packfile: allow prepare_packed_git to handle arbitrary repositories
packfile: allow prepare_packed_git_one to handle arbitrary repositories
packfile: add repository argument to reprepare_packed_git
packfile: add repository argument to prepare_packed_git
packfile: add repository argument to prepare_packed_git_one
packfile: allow install_packed_git to handle arbitrary repositories
packfile: allow rearrange_packed_git to handle arbitrary repositories
packfile: allow prepare_packed_git_mru to handle arbitrary repositories
|
|
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.
Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.
* sb/object-store: (27 commits)
sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
sha1_file: allow map_sha1_file to handle arbitrary repositories
sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
sha1_file: allow open_sha1_file to handle arbitrary repositories
sha1_file: allow stat_sha1_file to handle arbitrary repositories
sha1_file: allow sha1_file_name to handle arbitrary repositories
sha1_file: add repository argument to sha1_loose_object_info
sha1_file: add repository argument to map_sha1_file
sha1_file: add repository argument to map_sha1_file_1
sha1_file: add repository argument to open_sha1_file
sha1_file: add repository argument to stat_sha1_file
sha1_file: add repository argument to sha1_file_name
sha1_file: allow prepare_alt_odb to handle arbitrary repositories
sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
sha1_file: add repository argument to prepare_alt_odb
sha1_file: add repository argument to link_alt_odb_entries
sha1_file: add repository argument to read_info_alternates
sha1_file: add repository argument to link_alt_odb_entry
sha1_file: add raw_object_store argument to alt_odb_usable
pack: move approximate object count to object store
...
|
|
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (36 commits)
convert: convert to struct object_id
sha1_file: introduce a constant for max header length
Convert lookup_replace_object to struct object_id
sha1_file: convert read_sha1_file to struct object_id
sha1_file: convert read_object_with_reference to object_id
tree-walk: convert tree entry functions to object_id
streaming: convert istream internals to struct object_id
tree-walk: convert get_tree_entry_follow_symlinks internals to object_id
builtin/notes: convert static functions to object_id
builtin/fmt-merge-msg: convert remaining code to object_id
sha1_file: convert sha1_object_info* to object_id
Convert remaining callers of sha1_object_info_extended to object_id
packfile: convert unpack_entry to struct object_id
sha1_file: convert retry_bad_packed_offset to struct object_id
sha1_file: convert assert_sha1_type to object_id
builtin/mktree: convert to struct object_id
streaming: convert open_istream to use struct object_id
sha1_file: convert check_sha1_signature to struct object_id
sha1_file: convert read_loose_object to use struct object_id
builtin/index-pack: convert struct ref_delta_entry to object_id
...
|
|
The hashclose() method behaves very differently depending on the flags
parameter. In particular, the file descriptor is not always closed.
Perform a simple rename of "hashclose()" to "finalize_hashfile()" in
preparation for functional changes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The reason callers have to call this is to make sure either packed_git
or packed_git_mru pointers are initialized since we don't do that by
default. Sometimes it's hard to see this connection between where the
function is called and where packed_git pointer is used (sometimes in
separate functions).
Keep this dependency internal because now all access to packed_git and
packed_git_mru must go through get_xxx() wrappers.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a repository argument to allow prepare_packed_git callers to be
more specific about which repository to handle. See commit "sha1_file:
add repository argument to link_alt_odb_entry" for an explanation of
the #define trick.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This conversion was done without the #define trick used in the earlier
series refactoring to have better repository access, because this function
is easy to review, as it only has one caller and all lines but the first
two are converted.
We must not convert 'pack_open_fds' to be a repository specific variable,
as it is used to monitor resource usage of the machine that Git executes
on.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a process with multiple repositories open, packfile accessors
should be associated to a single repository and not shared globally.
Move packed_git and packed_git_mru into the_repository and adjust
callers to reflect this.
[nd: while at there, wrap access to these two fields in get_packed_git()
and get_packed_git_mru(). This allows us to lazily initialize these
fields without caller doing that explicitly]
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Compilation fix.
* rj/warning-uninitialized-fix:
read-cache: fix an -Wmaybe-uninitialized warning
-Wuninitialized: remove some 'init-self' workarounds
|
|
The 'self-initialised' variables construct (ie <type> var = var;) has
been used to silence gcc '-W[maybe-]uninitialized' warnings. This has,
unfortunately, caused MSVC to issue 'uninitialized variable' warnings.
Also, using clang static analysis causes complaints about an 'Assigned
value is garbage or undefined'.
There are six such constructs in the current codebase. Only one of the
six causes gcc to issue a '-Wmaybe-uninitialized' warning (which will
be addressed elsewhere). The remaining five 'init-self' gcc workarounds
are noted below, along with the commit which introduced them:
1. builtin/rev-list.c: 'reaches' and 'all', see commit 457f08a030
("git-rev-list: add --bisect-vars option.", 2007-03-21).
2. merge-recursive.c:2064 'mrtree', see commit f120ae2a8e ("merge-
recursive.c: mrtree in merge() is not used before set", 2007-10-29).
3. fast-import.c:3023 'oe', see commit 85c62395b1 ("fast-import: let
importers retrieve blobs", 2010-11-28).
4. fast-import.c:3006 'oe', see commit 28c7b1f7b7 ("fast-import: add a
get-mark command", 2015-07-01).
Remove the 'self-initialised' variable constructs noted above.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert read_object_with_reference to take pointers to struct object_id.
Update the internals of the function accordingly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names. Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:
@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(&E1, E2)
@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid using identifiers that clash with C++ keywords. Even though
it is not a goal to compile Git with C++ compilers, changes like
this help use of code analysis tools that targets C++ on our
codebase.
* bw/c-plus-plus: (37 commits)
replace: rename 'new' variables
trailer: rename 'template' variables
tempfile: rename 'template' variables
wrapper: rename 'template' variables
environment: rename 'namespace' variables
diff: rename 'template' variables
environment: rename 'template' variables
init-db: rename 'template' variables
unpack-trees: rename 'new' variables
trailer: rename 'new' variables
submodule: rename 'new' variables
split-index: rename 'new' variables
remote: rename 'new' variables
ref-filter: rename 'new' variables
read-cache: rename 'new' variables
line-log: rename 'new' variables
imap-send: rename 'new' variables
http: rename 'new' variables
entry: rename 'new' variables
diffcore-delta: rename 'new' variables
...
|
|
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename struct sha1file to struct hashfile, along with all of its related
functions.
The transformation in this commit was made by global search-and-replace.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Switch various uses of explicit calls to SHA-1 to use the_hash_algo.
Convert various uses of 20 and the GIT_SHA1 constants as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
|
|
All but two of the call sites already have parameters using the hash
parameter of struct object_id, so convert them to take a pointer to the
struct directly. Also convert refs_read_refs_full, the underlying
implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update the ref transaction code to use struct object_id. Remove one
NULL pointer check which was previously inserted around a dereference;
since we now pass a pointer to struct object_id directly through, the
code we're calling handles this for us.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The checkpoint command "git fast-import" did not flush updates to
refs and marks unless at least one object was created since the
last checkpoint, which has been corrected, as these things can
happen without any new object getting created.
* er/fast-import-dump-refs-on-checkpoint:
fast-import: checkpoint: dump branches/tags/marks even if object_count==0
|
|
The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.
For example, reset a branch (and keep fast-import running):
$ git fast-import
reset refs/heads/master
from refs/heads/master^
checkpoint
but refs/heads/master remains unchanged.
Other example: a commit command that re-creates an object that already
exists in the object database.
The man page also states that checkpoint "updates the refs" and that
"placing a progress command immediately after a checkpoint will inform
the reader when the checkpoint has been completed and it can safely
access the refs that fast-import updated". This wasn't always true
without this patch.
This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.
Add tests to t9300 that observe the (non-packfiles) effects of
checkpoint.
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
|
|
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).
So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:
1. It can do a funny signed/unsigned comparison. If your
"len" is signed (e.g., a size_t) then the compiler will
promote the "-1" to its unsigned variant.
This works out for "!= len" (unless you really were
trying to write the maximum size_t bytes), but is a
bug if you check "< len" (an example of which was fixed
recently in config.c).
We should avoid promoting the mental model that you
need to check the length at all, so that new sites are
not tempted to copy us.
2. Checking for a negative value is shorter to type,
especially when the length is an expression.
3. Linus says so. In d34cf19b89 (Clean up write_in_full()
users, 2007-01-11), right after the write_in_full()
semantics were changed, he wrote:
I really wish every "write_in_full()" user would just
check against "<0" now, but this fixes the nasty and
stupid ones.
Appeals to authority aside, this makes it clear that
writing it this way does not have an intentional
benefit. It's a historical curiosity that we never
bothered to clean up (and which was undoubtedly
cargo-culted into new sites).
So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).
[1] A careful reader may notice there is one way that
write_in_full() can return a different value. If we ask
write() to write N bytes and get a return value that is
_larger_ than N, we could return a larger total. But
besides the fact that this would imply a totally broken
version of write(), it would already invoke undefined
behavior. Our internal remaining counter is an unsigned
size_t, which means that subtracting too many byte will
wrap it around to a very large number. So we'll instantly
begin reading off the end of the buffer, trying to write
gigabytes (or petabytes) of data.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.
Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c. It has a corresponding header packfile.h.
In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
|
|
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
|
|
"fast-import" uses a default pack chain depth that is consistent
with other parts of the system.
* mh/fast-import-raise-default-depth:
fast-import: increase the default pack depth to 50
|
|
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We often try to open a file for reading whose existence is
optional, and silently ignore errors from open/fopen; report such
errors if they are not due to missing files.
* nd/fopen-errors:
mingw_fopen: report ENOENT for invalid file names
mingw: verify that paths are not mistaken for remote nicknames
log: fix memory leak in open_next_file()
rerere.c: move error_errno() closer to the source system call
print errno when reporting a system call error
wrapper.c: make warn_on_inaccessible() static
wrapper.c: add and use fopen_or_warn()
wrapper.c: add and use warn_on_fopen_errors()
config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too
config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
clone: use xfopen() instead of fopen()
use xfopen() in more places
git_fopen: fix a sparse 'not declared' warning
|
|
In 618e613a70, 10 years ago, the default for pack depth used for
git-pack-objects and git-repack was changed from 10 to 50, while
leaving fast-import's default to 10.
There doesn't seem to be a reason besides oversight for the change not
having happened in fast-import as well.
Interestingly, fast-import uses pack.depth when it's set, and the
git-config manual says the default for pack.depth is 50. While the
git-fast-import manual does say the default depth is 10, the
inconsistency is also confusing.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
xfopen()
- provides error details
- explains error on reading, or writing, or whatever operation
- has l10n support
- prints file name in the error
Some of these are missing in the places that are replaced with xfopen(),
which is a clear win. In some other places, it's just less code (not as
clearly a win as the previous case but still is).
The only slight regresssion is in remote-testsvn, where we don't report
the file class (marks files) in the error messages anymore. But since
this is a _test_ svn remote transport, I'm not too concerned.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert struct pack_idx_entry to use struct object_id by changing the
definition and applying the following semantic patch, plus the standard
object_id transforms:
@@
struct pack_idx_entry E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct pack_idx_entry *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.
Introduce a temporary in parse_object buffer in order to convert this
function. This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted. Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.
parse_object_buffer will lose this temporary in a later patch.
This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)
@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)
@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)
@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)
@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert the remaining parts of fast-import.c to use struct object_id.
Convert several instances of get_sha1_hex to parse_oid_hex to avoid
needing to specify constants. Convert other hardcoded values to named
constants. Finally, use the is_empty_tree_oid function instead of a
direct comparison against a fixed string.
Note that the odd computation with GIT_MAX_HEXSZ is due to the insertion
of a slash between every two hex digits in the path, plus one for the
terminating NUL.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert struct tree_entry_ms, struct branch, struct tag, and struct
hash_list to use struct object_id by changing the definition and
applying the following semantic patch, plus the standard object_id
transforms:
@@
struct tree_entry_ms E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct tree_entry_ms *E1;
@@
- E1->sha1
+ E1->oid.hash
@@
struct branch E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct branch *E1;
@@
- E1->sha1
+ E1->oid.hash
@@
struct tag E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct tag *E1;
@@
- E1->sha1
+ E1->oid.hash
@@
struct hash_list E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct hash_list *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While handy, "git_path()" is a dangerous function to use as a
callsite that uses it safely one day can be broken by changes
to other code that calls it. Reduction of its use continues.
* jk/war-on-git-path:
am: drop "dir" parameter from am_state_init
replace strbuf_addstr(git_path()) with git_path_buf()
replace xstrdup(git_path(...)) with git_pathdup(...)
use git_path_* helper functions
branch: add edit_description() helper
bisect: add git_path_bisect_terms helper
|
|
It's more efficient to use git_pathdup(), as it skips an
extra copy of the path. And by removing some calls to
git_path(), it makes it easier to audit for dangerous uses.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The odb_mkstemp() function expects the caller to provide a
fixed buffer to write the resulting tempfile name into. But
it creates the template using snprintf without checking the
return value. This means we could silently truncate the
filename.
In practice, it's unlikely that the truncation would end in
the template-pattern that mkstemp needs to open the file. So
we'd probably end up failing either way, unless the path was
specially crafted.
The simplest fix would be to notice the truncation and die.
However, we can observe that most callers immediately
xstrdup() the result anyway. So instead, let's switch to
using a strbuf, which is easier for them (and isn't a big
deal for the other 2 callers, who can just strbuf_release
when they're done with it).
Note that many of the callers used static buffers, but this
was purely to avoid putting a large buffer on the stack. We
never passed the static buffers out of the function, so
there's no complicated memory handling we need to change.
Signed-off-by: Jeff King <peff@peff.net>
|
|
Code clean-up.
* jk/fast-import-cleanup:
pack.h: define largest possible encoded object size
encode_in_pack_object_header: respect output buffer length
fast-import: use xsnprintf for formatting headers
fast-import: use xsnprintf for writing sha1s
|
|
The encode_in_pack_object_header() writes a variable-length
header to an output buffer, but it doesn't actually know
long the buffer is. At first glance, this looks like it
might be possible to overflow.
In practice, this is probably impossible. The smallest
buffer we use is 10 bytes, which would hold the header for
an object up to 2^67 bytes. Obviously we're not likely to
see such an object, but we might worry that an object could
lie about its size (causing us to overflow before we realize
it does not actually have that many bytes). But the argument
is passed as a uintmax_t. Even on systems that have __int128
available, uintmax_t is typically restricted to 64-bit by
the ABI.
So it's unlikely that a system exists where this could be
exploited. Still, it's easy enough to use a normal out/len
pair and make sure we don't write too far. That protects the
hypothetical 128-bit system, makes it harder for callers to
accidentally specify a too-small buffer, and makes the
resulting code easier to audit.
Note that the one caller in fast-import tried to catch such
a case, but did so _after_ the call (at which point we'd
have already overflowed!). This check can now go away.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The stream_blob() function checks the return value of
snprintf and dies. This is more simply done with
xsnprintf (and matches the similar call in store_object).
The message the user would get is less specific, but since
the point is that this _shouldn't_ ever happen, that's OK.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we have to write a sha1 with a newline, we do so by
copying both into a single buffer, so that we can issue a
single write() call.
We use snprintf but don't bother to check the output, since
we know it will fit. However, we should use xsnprintf() in
such a case so that we're notified if our assumption turns
out to be wrong (and to make it easier to audit for
unchecked snprintf calls).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* jk/pack-name-cleanups:
index-pack: make pointer-alias fallbacks safer
replace snprintf with odb_pack_name()
odb_pack_keep(): stop generating keepfile name
sha1_file.c: make pack-name helper globally accessible
move odb_* declarations out of git-compat-util.h
|
|
In several places we write the name of the pack filename
into a fixed-size buffer using snprintf(), but do not check
the return value. As a result, a very long object directory
could cause us to quietly truncate the pack filename
(potentially leading to a corrupted repository, as a newly
written packfile could be missing its .pack extension).
We can use odb_pack_name() to do this with a strbuf (and
shorten the code, as well).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The odb_pack_keep() function generates the name of a .keep
file and opens it. This has two problems:
1. It requires a fixed-size buffer to create the filename
and doesn't notice when the result is truncated.
2. Of the two callers, one sometimes wants to open a
filename it already has, which makes things awkward (it
has to do so manually, and skips the leading-directory
creation).
Instead, let's have odb_pack_keep() just open the file.
Generating the name isn't hard, and a future patch will
switch callers over to odb_pack_name() anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When the current branch is renamed with 'git branch -m/-M' or deleted
with 'git update-ref -m<msg> -d', the event is recorded in HEAD's log
with an empty message. In preparation for adding a more meaningful
message to HEAD's log in these cases, update delete_ref() to take a
message argument and pass it along to ref_transaction_delete().
Modify all callers to pass NULL for the new message argument; no
change in behavior is intended.
Note that this is relevant for HEAD's log but not for the deleted
ref's log, which is currently deleted along with the ref. Even if it
were not, an entry for the deletion wouldn't be present in the deleted
ref's log. files_transaction_commit() writes to the log if
REF_NEEDS_COMMIT or REF_LOG_ONLY are set, but lock_ref_for_update()
doesn't set REF_NEEDS_COMMIT for the deleted ref because REF_DELETING
is set. In contrast, the update for HEAD has REF_LOG_ONLY set by
split_head_update(), resulting in the deletion being logged.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git fast-import" sometimes mishandled while rebalancing notes
tree, which has been fixed.
* mh/fast-import-notes-fix-new:
fast-import: properly fanout notes when tree is imported
|
|
In typical uses of fast-import, trees are inherited from a parent
commit. In that case, the tree_entry for the branch looks like:
.versions[1].sha1 = $some_sha1
.tree = <tree structure loaded from $some_sha1>
However, when trees are imported, rather than inherited, that is not the
case. One can import a tree with a filemodify command, replacing the
root tree object.
e.g.
"M 040000 $some_sha1 \n"
In this case, the tree_entry for the branch looks like:
.versions[1].sha1 = $some_sha1
.tree = NULL
When adding new notes with the notemodify command, do_change_note_fanout
is called to get a notes count, and to do so, it loops over the
tree_entry->tree, but doesn't do anything when the tree is NULL.
In the latter case above, it means do_change_note_fanout thinks the tree
contains no notes, and new notes are added with no fanout.
Interestingly, do_change_note_fanout does check whether subdirectories
have a NULL .tree, in which case it uses load_tree(). Which means the
right behaviour happens when using the filemodify command to import
subdirectories.
This change makes do_change_note_fanount call load_tree() whenever the
tree_entry it is given has no tree loaded, making all cases handled
equally.
Signed-off-by: Mike Hommey <mh@glandium.org>
Reviewed-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are three codepaths that use a variable whose name is
pack_compression_level to affect how objects and deltas sent to a
packfile is compressed. Unlike zlib_compression_level that controls
the loose object compression, however, this variable was static to
each of these codepaths. Two of them read the pack.compression
configuration variable, using core.compression as the default, and
one of them also allowed overriding it from the command line.
The other codepath in bulk-checkin did not pay any attention to the
configuration.
Unify the configuration parsing to git_default_config(), where we
implement the parsing of core.loosecompression and core.compression
and make the former override the latter, by moving code to parse
pack.compression and also allow core.compression to give default to
this variable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT. The resulting code is
shorter and supports empty arrays with NULL pointers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are certain house-keeping tasks that need to be performed at
the very beginning of any Git program, and programs that are not
built-in commands had to do them exactly the same way as "git"
potty does. It was easy to make mistakes in one-off standalone
programs (like test helpers). A common "main()" function that
calls cmd_main() of individual program has been introduced to
make it harder to make mistakes.
* jk/common-main:
mingw: declare main()'s argv as const
common-main: call git_setup_gettext()
common-main: call restore_sigpipe_to_default()
common-main: call sanitize_stdfds()
common-main: call git_extract_argv0_path()
add an extra level of indirection to main()
|
|
* jk/common-main-2.8:
mingw: declare main()'s argv as const
common-main: call git_setup_gettext()
common-main: call restore_sigpipe_to_default()
common-main: call sanitize_stdfds()
common-main: call git_extract_argv0_path()
add an extra level of indirection to main()
|
|
This should be part of every program, as otherwise users do
not get translated error messages. However, some external
commands forgot to do so (e.g., git-credential-store). This
fixes them, and eliminates the repeated code in programs
that did remember to use it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Every program which links against libgit.a must call this
function, or risk hitting an assert() in system_path() that
checks whether we have configured argv0_path (though only
when RUNTIME_PREFIX is defined, so essentially only on
Windows).
Looking at the diff, you can see that putting it into the
common main() saves us having to do it individually in each
of the external commands. But what you can't see are the
cases where we _should_ have been doing so, but weren't
(e.g., git-credential-store, and all of the t/helper test
programs).
This has been an accident-waiting-to-happen for a long time,
but wasn't triggered until recently because it involves one
of those programs actually calling system_path(). That
happened with git-credential-store in v2.8.0 with ae5f677
(lazily load core.sharedrepository, 2016-03-11). The
program:
- takes a lock file, which...
- opens a tempfile, which...
- calls adjust_shared_perm to fix permissions, which...
- lazy-loads the config (as of ae5f677), which...
- calls system_path() to find the location of
/etc/gitconfig
On systems with RUNTIME_PREFIX, this means credential-store
reliably hits that assert() and cannot be used.
We never noticed in the test suite, because we set
GIT_CONFIG_NOSYSTEM there, which skips the system_path()
lookup entirely. But if we were to tweak git_config() to
find /etc/gitconfig even when we aren't going to open it,
then the test suite shows multiple failures (for
credential-store, and for some other test helpers). I didn't
include that tweak here because it's way too specific to
this particular call to be worth carrying around what is
essentially dead code.
The implementation is fairly straightforward, with one
exception: there is exactly one caller (git.c) that actually
cares about the result of the function, and not the
side-effect of setting up argv0_path. We can accommodate
that by simply replacing the value of argv[0] in the array
we hand down to cmd_main().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).
Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.
Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.
We basically have two options to do this:
- the compat/mingw.h file already does something like this by
adding a #define that replaces the definition of main with a
wrapper that calls mingw_startup().
The upside is that the code in each program doesn't need
to be changed at all; it's rewritten on the fly by the
preprocessor.
The downside is that it may make debugging of the startup
sequence a bit more confusing, as the preprocessor is
quietly inserting new code.
- the builtin functions are all of the form cmd_foo(),
and git.c's main() calls them.
This is much more explicit, which may make things more
obvious to somebody reading the code. It's also more
flexible (because of course we have to figure out _which_
cmd_foo() to call).
The downside is that each of the builtins must define
cmd_foo(), instead of just main().
This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.
We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).
The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.
This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git fast-import" learned the same performance trick to avoid
creating too small a packfile as "git fetch" and "git push" have,
using *.unpackLimit configuration.
* ew/fast-import-unpack-limit:
fast-import: invalidate pack_id references after loosening
fast-import: implement unpack limit
|
|
"git fast-import --export-marks" would overwrite the existing marks
file even when it makes a dump from its custom die routine.
Prevent it from doing so when we have an import-marks file but
haven't finished reading it.
* fc/fast-import-broken-marks-file:
fast-import: do not truncate exported marks file
|
|
When loosening a pack, the current pack_id gets reused when
checkpointing and the import does not terminate. This causes
problems after checkpointing as the object table, branch, and
tag lists still contains pre-checkpoint references to the
recycled pack_id.
Merely clearing the object_table as suggested by Jeff King in
http://mid.gmane.org/20160517121330.GA7346@sigill.intra.peff.net
is insufficient as the marks set still contains references
to object entries.
Wrong pack_id references branch and tags lists do not cause
errors, but can lead to misleading crash reports and core dumps,
so they are also invalidated.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The experimental "multiple worktree" feature gains more safety to
forbid operations on a branch that is checked out or being actively
worked on elsewhere, by noticing that e.g. it is being rebased.
* nd/worktree-various-heads:
branch: do not rename a branch under bisect or rebase
worktree.c: check whether branch is bisected in another worktree
wt-status.c: split bisect detection out of wt_status_get_state()
worktree.c: check whether branch is rebased in another worktree
worktree.c: avoid referencing to worktrees[i] multiple times
wt-status.c: make wt_status_check_rebase() work on any worktree
wt-status.c: split rebase detection out of wt_status_get_state()
path.c: refactor and add worktree_git_path()
worktree.c: mark current worktree
worktree.c: make find_shared_symref() return struct worktree *
worktree.c: store "id" instead of "git_dir"
path.c: add git_common_path() and strbuf_git_common_path()
dir.c: rename str(n)cmp_icase to fspath(n)cmp
|
|
Certain lines of the marks file might be corrupted (or the objects
missing due to a garbage collection), but that's no reason to truncate
the file and essentially destroy the rest of it.
Ideally missing objects should not cause a crash, we could just skip
them, but that's another patch.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With many incremental imports, small packs become highly
inefficient due to the need to readdir scan and load many
indices to locate even a single object. Frequent repacking and
consolidation may be prohibitively expensive in terms of disk
I/O, especially in large repositories where the initial packs
were aggressively optimized and marked with .keep files.
In those cases, users may be better served with loose objects
and relying on "git gc --auto".
This changes the default behavior of fast-import for small
imports found in test cases, so adjustments to t9300 were
necessary.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
These functions compare two paths that are taken from file system.
Depending on the running file system, paths may need to be compared
case-sensitively or not, and maybe even something else in future. The
current names do not convey that well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
|
|
This function allocate a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needed to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).
This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:
1. It automatically checks the array-size multiplication
for overflow.
2. It always uses sizeof(*array) for the element-size,
so that it can never go out of sync with the declared
type of the array.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.
By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.
This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:
* introduction of these two functions in strbuf.[ch]
* mechanical conversion of all callers to strbuf_getline() with
either '\n' or '\0' as the third parameter to instead call the
respective thin wrapper.
After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Cosmetic improvement to lock-file error messages.
* sg/lock-file-commit-error:
Make error message after failing commit_lock_file() less confusing
|
|
The error message after a failing commit_lock_file() call sometimes
looks like this, causing confusion:
$ git remote add remote git@server.com/repo.git
error: could not commit config file .git/config
# Huh?!
# I didn't want to commit anything, especially not my config file!
While in the narrow context of the lockfile module using the verb
'commit' in the error message makes perfect sense, in the broader
context of git the word 'commit' already has a very specific meaning,
hence the confusion.
Reword these error messages to say "could not write" instead of "could
not commit".
While at it, include strerror in the error messages after writing the
config file or the credential store fails to provide some information
about the cause of the failure, and update the style of the error
message after writing the reflog fails to match surrounding error
messages (i.e. no '' around the pathname and no () around the error
description).
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Jeff King <peff@peff.net>
|
|
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.
Macintosh-specific breakage was noticed and corrected in this
reroll.
* jk/war-on-sprintf: (70 commits)
name-rev: use strip_suffix to avoid magic numbers
use strbuf_complete to conditionally append slash
fsck: use for_each_loose_file_in_objdir
Makefile: drop D_INO_IN_DIRENT build knob
fsck: drop inode-sorting code
convert strncpy to memcpy
notes: document length of fanout path with a constant
color: add color_set helper for copying raw colors
prefer memcpy to strcpy
help: clean up kfmclient munging
receive-pack: simplify keep_arg computation
avoid sprintf and strcpy with flex arrays
use alloc_ref rather than hand-allocating "struct ref"
color: add overflow checks for parsing colors
drop strcpy in favor of raw sha1_to_hex
use sha1_to_hex_r() instead of strcpy
daemon: use cld->env_array when re-spawning
stat_tracking_info: convert to argv_array
http-push: use an argv_array for setup_revisions
fetch-pack: use argv_array for index-pack / unpack-objects
...
|
|
"git log --date=local" used to only show the normal (default)
format in the local timezone. The command learned to take 'local'
as an instruction to use the local timezone with other formats,
e.g. "git show --date=rfc-local".
* jk/date-local:
t6300: add tests for "-local" date formats
t6300: make UTC and local dates different
date: make "local" orthogonal to date format
date: check for "local" before anything else
t6300: add test for "raw" date format
t6300: introduce test_date() helper
fast-import: switch crash-report date to iso8601
Documentation/rev-list: don't list date formats
Documentation/git-for-each-ref: don't list date formats
Documentation/config: don't list date formats
Documentation/blame-options: don't list date formats
|
|
strncpy is known to be a confusing function because of its
termination semantics. These calls are all correct, but it
takes some examination to see why. In particular, every one
of them expects to copy up to the length limit, and then
makes some arrangement for terminating the result.
We can just use memcpy, along with noting explicitly how the
result is terminated (if it is not already obvious). That
should make it more clear to a reader that we are doing the
right thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we already know the length of a string (e.g., because
we just malloc'd to fit it), it's nicer to use memcpy than
strcpy, as it makes it more obvious that we are not going to
overflow the buffer (because the size we pass matches the
size in the allocation).
This also eliminates calls to strcpy, which make auditing
the code base harder.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.
But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.
It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:
- the call signature is a little bit unwieldy:
d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);
You need offsetof here instead of just writing to the
end of the base size, because we don't know how the
struct is packed (partially this is because FLEX_ARRAY
might not be zero, though we can account for that; but
the size of the struct may actually be rounded up for
alignment, and we can't know that).
- some sites do clever things, like over-allocating because
they know they will write larger things into the buffer
later (e.g., struct packed_git here).
So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We generally use 32-byte buffers to format git's "type size"
header fields. These should not generally overflow unless
you can produce some truly gigantic objects (and our types
come from our internal array of constant strings). But it is
a good idea to use xsnprintf to make sure this is the case.
Note that we slightly modify the interface to
write_sha1_file_prepare, which nows uses "hdrlen" as an "in"
parameter as well as an "out" (on the way in it stores the
allocated size of the header, and on the way out it returns
the ultimate size of the header).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When fast-import emits a crash report, it does so in the
user's local timezone. But because we omit the timezone
completely for DATE_LOCAL, a reader of the report does not
immediately know which time zone was used. Let's switch this
to ISO8601 instead, which includes the time zone.
This does mean we will show the time in UTC, but that's not
a big deal. A crash report like this will either be looked
at immediately (in which case nobody even looks at the
timestamp), or it will be passed along to a developer to
debug, in which case the original timezone is less likely to
be of interest.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
git_path() and mkpath() are handy helper functions but it is easy
to misuse, as the callers need to be careful to keep the number of
active results below 4. Their uses have been reduced.
* jk/git-path:
memoize common git-path "constant" files
get_repo_path: refactor path-allocation
find_hook: keep our own static buffer
refs.c: remove_empty_directories can take a strbuf
refs.c: avoid git_path assignment in lock_ref_sha1_basic
refs.c: avoid repeated git_path calls in rename_tmp_log
refs.c: simplify strbufs in reflog setup and writing
path.c: drop git_path_submodule
refs.c: remove extra git_path calls from read_loose_refs
remote.c: drop extraneous local variable from migrate_file
prefer mkpathdup to mkpath in assignments
prefer git_pathdup to git_path in some possibly-dangerous cases
add_to_alternates_file: don't add duplicate entries
t5700: modernize style
cache.h: complete set of git_path_submodule helpers
cache.h: clarify documentation for git_path, et al
|
|
Long overdue micro clean-up.
* jc/finalize-temp-file:
sha1_file.c: rename move_temp_to_file() to finalize_object_file()
|
|
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.
None of the cases changed here has a known bug, but they're
worth converting away from git_path because:
1. It's easy to use git_pathdup in these cases.
2. They use constructs (like assignment) that make it
hard to tell whether they're safe or not.
The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
loosen, 2009-03-25), we kept reminding ourselves:
NEEDSWORK: this should be renamed to finalize_temp_file() as
"moving" is only a part of what it does, when no patch between
master to pu changes the call sites of this function.
without doing anything about it. Let's do so.
The purpose of this function was not to move but to finalize. The
detail of the primarily implementation of finalizing was to link the
temporary file to its final name and then to unlink, which wasn't
even "moving". The alternative implementation did "move" by calling
rename(2), which is a fun tangent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using system's strftime(3).
* jk/date-mode-format:
strbuf: make strbuf_addftime more robust
introduce "format" date-mode
convert "enum date_mode" into a struct
show-branch: use DATE_RELATIVE instead of magic number
|
|
Often a fast-import stream builds a new commit on top of the
previous commit it built, and it often unconditionally emits a
"from" command to specify the first parent, which can be omitted in
such a case. This caused fast-import to forget the tree of the
previous commit and then re-read it from scratch, which was
inefficient. Optimize for this common case.
* mh/fast-import-optimize-current-from:
fast-import: do less work when given "from" matches current branch head
|
|
"git fast-import" learned to respond to the get-mark command via
its cat-blob-fd interface.
* mh/fast-import-get-mark:
fast-import: add a get-mark command
|
|
When building a fast-import stream, it's easy to forget the fact
that for non-merge commits happening on top of the current branch
head, there is no need for a "from" command. That is corroborated by
the fact that at least git-p4, hg-fast-export and felipec's
git-remote-hg all unconditionally use a "from" command.
Unfortunately, giving a "from" command always resets the branch
tree, forcing it to be re-read, and in many cases, the pack is also
closed and reopened through gfi_unpack_entry. Both are unnecessary
overhead, and the latter is particularly slow at least on OSX.
Avoid resetting the tree when it's unmodified, and avoid calling
gfi_unpack_entry when the given mark points to the same commit as
the current branch head.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It is sometimes useful for importers to be able to read the SHA-1
corresponding to a mark that they have created via fast-import. For
example, they might want to embed the SHA-1 into the commit message of
a later commit. Or it might be useful for internal bookkeeping uses,
or for logging.
Add a "get-mark" command to "git fast-import" that allows the importer
to ask for the value of a mark that has been created earlier.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If we are deleting the reference, then we don't need to read the
reference's old value. It doesn't provide any race safety, because the
value read just before the delete is no "better" than the value that
would be read under lock during the delete. And even if the reference
previously didn't exist, we can call delete_ref() on it if we don't
provide an old_sha1 value.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.
* nd/multiple-work-trees: (41 commits)
prune --worktrees: fix expire vs worktree existence condition
t1501: fix test with split index
t2026: fix broken &&-chain
t2026 needs procondition SANITY
git-checkout.txt: a note about multiple checkout support for submodules
checkout: add --ignore-other-wortrees
checkout: pass whole struct to parse_branchname_arg instead of individual flags
git-common-dir: make "modules/" per-working-directory directory
checkout: do not fail if target is an empty directory
t2025: add a test to make sure grafts is working from a linked checkout
checkout: don't require a work tree when checking out into a new one
git_path(): keep "info/sparse-checkout" per work-tree
count-objects: report unused files in $GIT_DIR/worktrees/...
gc: support prune --worktrees
gc: factor out gc.pruneexpire parsing code
gc: style change -- no SP before closing parenthesis
checkout: clean up half-prepared directories in --to mode
checkout: reject if the branch is already checked out elsewhere
prune: strategies for linked checkouts
checkout: support checking out into a new working directory
...
|
|
Code simplification.
* rs/deflate-init-cleanup:
zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}
|
|
Clear the git_zstream variable at the start of git_deflate_init() etc.
so that callers don't have to do that.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Simplify the ref transaction API around how "the ref should be
pointing at this object" is specified.
* mh/refs-have-new:
refs.h: remove duplication in function docstrings
update_ref(): improve documentation
ref_transaction_verify(): new function to check a reference's value
ref_transaction_delete(): check that old_sha1 is not null_sha1
ref_transaction_create(): check that new_sha1 is valid
commit: avoid race when creating orphan commits
commit: add tests of commit races
ref_transaction_delete(): remove "have_old" parameter
ref_transaction_update(): remove "have_old" parameter
struct ref_update: move "have_old" into "flags"
refs.c: change some "flags" to "unsigned int"
refs: remove the gap in the REF_* constant values
refs: move REF_DELETING to refs.c
|
|
"git fast-import" used to crash when it could not close and
conclude the resulting packfile cleanly.
* jk/fast-import-die-nicely-fix:
fast-import: avoid running end_packfile recursively
|
|
Instead, verify the reference's old value if and only if old_sha1 is
non-NULL.
ref_transaction_delete() will get the same treatment in a moment.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When an import has finished, we run end_packfile() to
finalize the data and move the packfile into place. If this
process fails, we call die() and end up in our die_nicely()
handler. Which unfortunately includes running end_packfile
to save any progress we made. We enter the function again,
and start operating on the pack_data struct while it is in
an inconsistent state, leading to a segfault.
One way to trigger this is to simply start two identical
fast-imports at the same time. They will both create the
same packfiles, which will then try to create identically
named ".keep" files. One will win the race, and the other
will die(), and end up with the segfault.
Since 3c078b9, we already reset the pack_data pointer to
NULL at the end of end_packfile. That covers the case of us
calling die() right after end_packfile, before we have
reinitialized the pack_data pointer. This new problem is
quite similar, except that we are worried about calling
die() _during_ end_packfile, not right after. Ideally we
would simply set pack_data to NULL as soon as we enter the
function, and operate on a copy of the pointer.
Unfortunately, it is not so easy. pack_data is a global, and
end_packfile calls into other functions which operate on the
global directly. We would have to teach each of these to
take an argument, and there is no guarantee that we would
catch all of the spots.
Instead, we can simply use a static flag to avoid
recursively entering the function. This is a little less
elegant, but it's short and fool-proof.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This allows git_path() to redirect info/fast-import to another place
if needed
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Before the previous commit, get_pathname returns an array of PATH_MAX
length. Even if git_path() and similar functions does not use the
whole array, git_path() caller can, in theory.
After the commit, get_pathname() may return a buffer that has just
enough room for the returned string and git_path() caller should never
write beyond that.
Make git_path(), mkpath() and git_path_submodule() return a const
buffer to make sure callers do not write in it at all.
This could have been part of the previous commit, but the "const"
conversion is too much distraction from the core changes in path.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change the ref transaction API so that we pass the reflog message to the
create/delete/update functions instead of to ref_transaction_commit.
This allows different reflog messages for each ref update in a multi-ref
transaction.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).
Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When commit_lock_file() fails, it now always calls
rollback_lock_file() internally, so there is no need to call that
function here.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
An attempt to remove the entire tree in the "git fast-import" input
stream caused it to misbehave.
* mb/fast-import-delete-root:
fast-import: fix segfault in store_tree()
t9300: test filedelete command
|
|
Code cleanup.
* rs/realloc-array:
use REALLOC_ARRAY for changing the allocation size of arrays
add macro REALLOC_ARRAY
|
|
* jk/fast-import-fixes:
fast-import: fix buffer overflow in dump_tags
fast-import: clean up pack_data pointer in end_packfile
|
|
An attempt to remove the entire tree in the "git fast-import" input
stream caused it to misbehave.
* mb/fast-import-delete-root:
fast-import: fix segfault in store_tree()
t9300: test filedelete command
|
|
Code clean-up.
* jk/commit-author-parsing:
determine_author_info(): copy getenv output
determine_author_info(): reuse parsing functions
date: use strbufs in date-formatting functions
record_author_date(): use find_commit_header()
record_author_date(): fix memory leak on malformed commit
commit: provide a function to find a header in a buffer
|
|
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With sufficiently long refnames, fast-import could have overflown
an on-stack buffer.
* jk/fast-import-fixes:
fast-import: fix buffer overflow in dump_tags
fast-import: clean up pack_data pointer in end_packfile
|
|
The second batch of the transactional ref update series.
* rs/ref-transaction-1: (22 commits)
update-ref --stdin: pass transaction around explicitly
update-ref --stdin: narrow scope of err strbuf
refs.c: make delete_ref use a transaction
refs.c: make prune_ref use a transaction to delete the ref
refs.c: remove lock_ref_sha1
refs.c: remove the update_ref_write function
refs.c: remove the update_ref_lock function
refs.c: make lock_ref_sha1 static
walker.c: use ref transaction for ref updates
fast-import.c: use a ref transaction when dumping tags
receive-pack.c: use a reference transaction for updating the refs
refs.c: change update_ref to use a transaction
branch.c: use ref transaction for all ref updates
fast-import.c: change update_branch to use ref transactions
sequencer.c: use ref transactions for all ref updates
commit.c: use ref transactions for updates
replace.c: use the ref transaction functions for updates
tag.c: use ref transactions when doing updates
refs.c: add transaction.status and track OPEN/CLOSED
refs.c: make ref_transaction_begin take an err argument
...
|
|
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change update_branch() to use ref transactions for updates.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Branch tree is NULLified by filedelete command if we are trying
to delete root tree. Add sanity check and use load_tree() in that case.
Signed-off-by: Maxim Bublis <satori@yandex-team.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Many of the date functions write into fixed-size buffers.
This is a minor pain, as we have to take special
precautions, and frequently end up copying the result into a
strbuf or heap-allocated buffer anyway (for which we
sometimes use strcpy!).
Let's instead teach parse_date, datestamp, etc to write to a
strbuf. The obvious downside is that we might need to
perform a heap allocation where we otherwise would not need
to. However, it turns out that the only two new allocations
required are:
1. In test-date.c, where we don't care about efficiency.
2. In determine_author_info, which is not performance
critical (and where the use of a strbuf will help later
refactoring).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When creating a new annotated tag, we sprintf the refname
into a static-sized buffer. If we have an absurdly long
tagname, like:
git init repo &&
cd repo &&
git commit --allow-empty -m foo &&
git tag -m message mytag &&
git fast-export mytag |
perl -lpe '/^tag/ and s/mytag/"a" x 8192/e' |
git fast-import <input
we'll overflow the buffer. We can fix it by using a strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have a global pointer pack_data pointing to the current
pack we have open. Inside end_packfile we have two new
pointers, old_p and new_p. The latter points to pack_data,
and the former points to the new "installed" version of the
packfile we get when we hand the file off to the regular
sha1_file machinery. When then free old_p.
Presumably the extra old_p pointer was there so that we
could overwrite pack_data with new_p and still free old_p,
but we don't do that. We just leave pack_data pointing to
bogus memory, and don't overwrite it until we call
start_packfile again (if ever).
This can cause problems for our die routine, which calls
end_packfile to clean things up. If we die at the wrong
moment, we can end up looking at invalid memory in
pack_data left after the last end_packfile().
Instead, let's make sure we set pack_data to NULL after we
free it, and make calling endfile() again with a NULL
pack_data a noop (there is nothing to end).
We can further make things less confusing by dropping old_p
entirely, and moving new_p closer to its point of use.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use `git_config_get_*()` family instead of `git_config()` to take
advantage of the config-set API which provides a cleaner control flow.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we see a file change in a commit, we expect one of:
1. A mark.
2. An "inline" keyword.
3. An object sha1.
The handling of spaces is inconsistent between the three
options. Option 1 calls a sub-function which checks for the
space, but doesn't parse past it. Option 2 parses the space,
then deliberately avoids moving the pointer past it. Option
3 detects the space locally but doesn't move past it.
This is confusing, because it looks like option 1 forgets to
check for the space (it's just buried). And option 2 checks
for "inline ", but only moves strlen("inline") characters
forward, which looks like a bug but isn't.
We can make this more clear by just having each branch move
past the space as it is checked (and we can replace the
doubled use of "inline" with a call to skip_prefix).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fast-import does a lot of parsing of commands and
dispatching to sub-functions. For example, given "option
foo", we might recognize "option " using starts_with, and
then hand it off to parse_option() to do the rest.
However, we do not let parse_option know that we have parsed
the first part already. It gets the full buffer, and has to
skip past the uninteresting bits. Some functions simply add
a magic constant:
char *option = command_buf.buf + 7;
Others use strlen:
char *option = command_buf.buf + strlen("option ");
And others use strchr:
char *option = strchr(command_buf.buf, ' ') + 1;
All of these are brittle and easy to get wrong (especially
given that the starts_with call and the code that assumes
the presence of the prefix are far apart). Instead, we can
use skip_prefix, and just pass each handler a pointer to its
arguments.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fast-import shares code between its command-line parser and
the "option" command. To do so, it strips the "--" from any
command-line options and passes them to the option parser.
However, it does not confirm that the option even begins
with "--" before blindly passing "arg + 2".
It does confirm that the option starts with "-", so the only
affected case was:
git fast-import -
which would read uninitialized memory after the argument. We
can fix it by using skip_prefix and checking the result. As
a bonus, this gets rid of some magic numbers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid scanning strings twice, once with strchr() and then with
strlen(), by using strchrnul().
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Mani <rohit.mani@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Give "update-refs" a "--stdin" option to read multiple update
requests and perform them in an all-or-none fashion.
* bk/refs-multi-update:
update-ref: add test cases covering --stdin signature
update-ref: support multiple simultaneous updates
refs: add update_refs for multiple simultaneous updates
refs: add function to repack without multiple refs
refs: factor delete_ref loose ref step into a helper
refs: factor update_ref steps into helpers
refs: report ref type from lock_any_ref_for_update
reset: rename update_refs to reset_refs
|
|
We liberally use "committish" and "commit-ish" (and "treeish" and
"tree-ish"); as these are non-words, let's unify these terms to
their dashed form. More importantly, clarify the documentation on
object peeling using these terms.
* rh/ishes-doc:
glossary: fix and clarify the definition of 'ref'
revisions.txt: fix and clarify <rev>^{<type>}
glossary: more precise definition of tree-ish (a.k.a. treeish)
use 'commit-ish' instead of 'committish'
use 'tree-ish' instead of 'treeish'
glossary: define commit-ish (a.k.a. committish)
glossary: mention 'treeish' as an alternative to 'tree-ish'
|
|
Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.
The only remaining instances of 'committish' are:
* variable, function, and macro names
* "(also committish)" in the definition of commit-ish in
gitglossary[7]
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace 'treeish' in documentation and comments with 'tree-ish' to
match gitglossary(7).
The only remaining instances of 'treeish' are:
* variable, function, and macro names
* "(also treeish)" in the definition of tree-ish in gitglossary(7)
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* jk/fast-import-empty-ls:
fast-import: allow moving the root tree
fast-import: allow ls or filecopy of the root tree
fast-import: set valid mode on root tree in "ls"
t9300: document fast-import empty path issues
|
|
Expose lock_ref_sha1_basic's type_p argument to callers of
lock_any_ref_for_update. Update all call sites to ignore it by passing
NULL for now.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Because fast-import.c::tree_content_remove does not check for the empty
path, it is not possible to move the root tree to a subdirectory.
Instead the error "Path not in branch" is produced (note the double
space where the empty path has been inserted).
Fix this by explicitly checking for the empty path and handling it.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 178e1de (fast-import: don't allow 'ls' of path with empty
components, 2012-03-09) restricted paths which:
. contain an empty directory component (e.g. foo//bar is invalid),
. end with a directory separator (e.g. foo/ is invalid),
. start with a directory separator (e.g. /foo is invalid).
However, the implementation also caught the empty path, which should
represent the root tree. Relax this restriction so that the empty path
is explicitly allowed and refers to the root tree.
Reported-by: Dave Abrahams <dave@boostpro.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This prevents a failure later when we lift the restriction on ls with
the empty path.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Optimization for fast-export by avoiding unnecessarily resolving
arbitrary object name and parsing object when only presence and
type information is necessary, etc.
* fc/fast-export-persistent-marks:
fast-{import,export}: use get_sha1_hex() to read from marks file
fast-export: don't parse commits while reading marks file
fast-export: do not parse non-commit objects while reading marks file
|
|
It's wrong to call get_sha1() if they should be SHA-1s, plus
inefficient.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Sparse issues 68 errors (two errors for each main() function) such
as the following:
SP git.c
git.c:510:5: error: too many arguments for function mingw_main
git.c:510:5: error: symbol 'mingw_main' redeclared with different type \
(originally declared at git.c:510) - different argument counts
The errors are caused by the 'main' macro used by the MinGW build
to provide a replacement main() function. The original main function
is effectively renamed to 'mingw_main' and is called from the new
main function. The replacement main is used to execute certain actions
common to all git programs on MinGW (e.g. ensure the standard I/O
streams are in binary mode).
In order to suppress the errors, we change the macro to include the
parameters in the declaration of the mingw_main function.
Unfortunately, this change provokes both sparse and gcc to complain
about 9 calls to mingw_main(), such as the following:
CC git.o
git.c: In function 'main':
git.c:510: warning: passing argument 2 of 'mingw_main' from \
incompatible pointer type
git.c:510: note: expected 'const char **' but argument is of \
type 'char **'
In order to suppress these warnings, since both of the main
functions need to be declared with the same prototype, we
change the declaration of the 9 main functions, thus:
int main(int argc, char **argv)
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit cbfd5e1c ("drop some obsolete "x = x" compiler warning hacks",
21-03-2013) removed a gcc hack that suppressed an "might be used
uninitialized" warning issued by older versions of gcc.
However, commit 3aa99df8 ('fast-import: clarify "inline" logic in
file_change_m', 21-03-2013) addresses an (almost) identical issue
(with very similar code), but includes additional code in it's
resolution. The solution used by this commit, unlike that used by
commit cbfd5e1c, also suppresses the -Wuninitialized warning on
older versions of gcc.
In order to suppress the warning (against the 'oe' symbol) in the
note_change_n() function, we adopt the same solution used by commit
3aa99df8.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we read a fast-import line like:
M 100644 :1 foo.c
we point the local object_entry variable "oe" to the object
named by the mark ":1". When the input uses the "inline"
construct, however, we do not have such an object_entry.
The current code is careful not to access "oe" in the inline
case, but we can make the assumption even more obvious (and
catch violations of it) by setting oe to NULL and adding a
comment. As a bonus, this also squelches an over-zealous gcc
-Wuninitialized warning, which means we can drop the "oe =
oe" initialization hack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In cases where the setting and access of a variable are
protected by the same conditional flag, older versions of
gcc would generate a "might be used unitialized" warning. We
silence the warning by initializing the variable to itself,
a hack that gcc recognizes.
Modern versions of gcc are smart enough to get this right,
going back to at least version 4.3.5. gcc 4.1 does get it
wrong in both cases, but is sufficiently old that we
probably don't need to care about it anymore.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is shorter, idiomatic, and it means the compiler does
not get confused about whether our "e" pointer is valid,
letting us drop the "e = e" hack.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Optimise the "merge-base" computation a bit, and also update its
users that do not need the full merge-base information to call a
cheaper subset.
* jc/merge-bases:
reduce_heads(): reimplement on top of remove_redundant()
merge-base: "--is-ancestor A B"
get_merge_bases_many(): walk from many tips in parallel
in_merge_bases(): use paint_down_to_common()
merge_bases_many(): split out the logic to paint history
in_merge_bases(): omit unnecessary redundant common ancestor reduction
http-push: use in_merge_bases() for fast-forward check
receive-pack: use in_merge_bases() for fast-forward check
in_merge_bases(): support only one "other" commit
|
|
In early days of its life, I planned to make it possible to compute
"is a commit contained in all of these other commits?" with this
function, but it turned out that no caller needed it.
Just make it take two commit objects and add a comment to say what
these two functions do.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The syntax for the use of mark references in fast-import
demands either a SP (space) or LF (end-of-line) after
a mark reference. Fast-import does not complain when garbage
appears after a mark reference in some cases.
Factor out parsing of mark references and complain if
errant characters are found. Also be a little more careful
when parsing "inline" and SHA1s, complaining if extra
characters appear or if the form of the dataref is unrecognized.
Buggy input can cause fast-import to produce the wrong output,
silently, without error. This makes it difficult to track
down buggy generators of fast-import streams. An example is
seen in the last line of this commit command:
commit refs/heads/S2
committer Name <name@example.com> 1112912893 -0400
data <<COMMIT
commit message
COMMIT
from :1M 100644 :103 hello.c
It is missing a newline and should be:
[...]
from :1
M 100644 :103 hello.c
What fast-import does is to produce a commit with the same
contents for hello.c as in refs/heads/S2^. What the buggy
program was expecting was the contents of blob :103. While
the resulting commit graph looked correct, the contents in
some commits were wrong.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* jn/maint-fast-import-empty-ls:
fast-import: don't allow 'ls' of path with empty components
fast-import: leakfix for 'ls' of dirty trees
|
|
As the fast-import manual explains:
The value of <path> must be in canonical form. That is it must
not:
. contain an empty directory component (e.g. foo//bar is invalid),
. end with a directory separator (e.g. foo/ is invalid),
. start with a directory separator (e.g. /foo is invalid),
Unfortunately the "ls" command accepts these invalid syntaxes and
responds by declaring that the indicated path is missing. This is too
subtle and causes importers to silently misbehave; better to error out
so the operator knows what's happening.
The C, R, and M commands already error out for such paths.
Reported-by: Andrew Sayers <andrew-git@pileofstuff.org>
Analysis-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
When the chosen directory has changed since it was last written to
pack, "tree_content_get" makes a deep copy of its content to scribble
on while computing the tree name, which we forgot to free.
This leak has been present since the 'ls' command was introduced in
v1.7.5-rc0~3^2~33 (fast-import: add 'ls' command, 2010-12-02).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
When running t9300, valgrind (correctly) complains about an
uninitialized value in write_crash_report:
==2971== Use of uninitialised value of size 8
==2971== at 0x4164F4: sha1_to_hex (hex.c:70)
==2971== by 0x4073E4: die_nicely (fast-import.c:468)
==2971== by 0x43284C: die (usage.c:86)
==2971== by 0x40420D: main (fast-import.c:2731)
==2971== Uninitialised value was created by a heap allocation
==2971== at 0x4C29B3D: malloc (vg_replace_malloc.c:263)
==2971== by 0x433645: xmalloc (wrapper.c:35)
==2971== by 0x405DF5: pool_alloc (fast-import.c:619)
==2971== by 0x407755: pool_calloc.constprop.14 (fast-import.c:634)
==2971== by 0x403F33: main (fast-import.c:3324)
Fix this by zeroing all of the 'struct tag'. We would only need to
zero out the 'sha1' field, but this way seems more future-proof.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On Solaris the system headers define the "tmpfile" name, which'll
cause Git compiled with Sun Studio 12 Update 1 to whine about us
redefining the name:
"pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
Just renaming the "tmpfile" variable to "tmp_file" in the relevant
places is the easiest way to fix this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ab/enable-i18n:
i18n: add infrastructure for translating Git with gettext
Conflicts:
Makefile
|