| Age | Commit message (Collapse) | Author | Files | Lines |
|
fc47391e24 (drop vcs-svn experiment, 2020-08-13) removed most vcs-svn
files. Drop the remaining header files as well, as they are no longer
used.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The code in vcs-svn was started in 2010 as an attempt to build a
remote-helper for interacting with svn repositories (as opposed to
git-svn). However, we never got as far as shipping a mature remote
helper, and the last substantive commit was e99d012a6bc in 2012.
We do have a git-remote-testsvn, and it is even installed as part of
"make install". But given the name, it seems unlikely to be used by
anybody (you'd have to explicitly "git clone testsvn::$url", and there
have been zero mentions of that on the mailing list since 2013, and even
that includes the phrase "you might need to hack a bit to get it working
properly"[1]).
We also ship contrib/svn-fe, which builds on the vcs-svn work. However,
it does not seem to build out of the box for me, as the link step misses
some required libraries for using libgit.a. Curiously, the original
build breakage bisects for me to eff80a9fd9 (Allow custom "comment
char", 2013-01-16), which seems unrelated. There was an attempt to fix
it in da011cb0e7 (contrib/svn-fe: fix Makefile, 2014-08-28), but on my
system that only switches the error message.
So it seems like the result is not really usable by anybody in practice.
It would be wonderful if somebody wanted to pick up the topic again, and
potentially it's worth carrying around for that reason. But the flip
side is that people doing tree-wide operations have to deal with this
code. And you can see the list with (replace "HEAD" with this commit as
appropriate):
{
echo "--"
git diff-tree --diff-filter=D -r --name-only HEAD^ HEAD
} |
git log --no-merges --oneline e99d012a6bc.. --stdin
which shows 58 times somebody had to deal with the code, generally due
to a compile or test failure, or a tree-wide style fix or API change.
Let's drop it and let anybody who wants to pick it up do so by
resurrecting it from the git history.
As a bonus, this also reduces the size of a stripped installation of Git
from 21MB to 19MB.
[1] https://lore.kernel.org/git/CALkWK0mPHzKfzFKKpZkfAus3YVC9NFYDbFnt+5JQYVKipk3bQQ@mail.gmail.com/
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In previous patches, extern was mechanically removed from function
declarations without care to formatting, causing parameter lists to be
misaligned. Manually format changed sections such that the parameter
lists should be realigned.
Viewing this patch with 'git diff -w' should produce no output.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.
This was the Coccinelle patch used:
@@
type T;
identifier f;
@@
- extern
T f(...);
and it was run with:
$ git ls-files \*.{c,h} |
grep -v ^compat/ |
xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
These were not caught by the previous commit, as they did not match the
regular expression.
While at it, remove the localization from one instance: we never want
BUG() messages to be translated, as they target Git developers, not the
end user (hence it would be quite unhelpful to not only burden the
translators, but then even end up with a bug report in a language that
no core Git contributor understands).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* jn/vcs-svn-cleanup:
vcs-svn: move remaining repo_tree functions to fast_export.h
vcs-svn: remove repo_delete wrapper function
vcs-svn: remove custom mode constants
vcs-svn: remove more unused prototypes and declarations
|
|
Code clean-up.
* bc/vcs-svn-cleanup:
vcs-svn: rename repo functions to "svn_repo"
vcs-svn: remove unused prototypes
|
|
These used to be for manipulating the in-memory repo_tree structure,
but nowadays they are convenience wrappers to handle a few git-vs-svn
mismatches:
1. Git does not track empty directories but Subversion does. When
looking up a path in git that Subversion thinks exists and finding
nothing, we can safely assume that the path represents a
directory. This is needed when a later Subversion revision
modifies that directory.
2. Subversion allows deleting a file by copying. In Git fast-import
we have to handle that more explicitly as a deletion.
These are details of the tool's interaction with git fast-import.
Move them to fast_export.c, where other such details are handled.
This way the function names do not start with a repo_ prefix that
would clash with the repository object introduced in
v2.14.0-rc0~38^2~16 (repository: introduce the repository object,
2017-06-22) or an svn_ prefix that would clash with libsvn (in case
someone wants to link this code with libsvn some day).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since v1.7.10-rc0~118^2~4^2~4^2~3 (vcs-svn: pass paths through to
fast-import, 2010-12-13) this is an alias for fast_export_delete.
Remove the unnecessary layer of indirection.
No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the rest of Git, these modes are spelled as S_IFDIR,
S_IFREG | 0644, S_IFREG | 0755, and S_IFLNK. Use the same constants
in svn-fe for simplicity and consistency.
No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
I forgot to remove these in v1.7.10-rc0~118^2~4^2~5^2~4 (vcs-svn:
eliminate repo_tree structure, 2010-12-10).
This finishes what was started in commit 36f63b50 (vcs-svn: remove
unused prototypes, 2017-08-21).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There were several functions in the Subversion code that started with
"repo_". This namespace is also used by the Git struct repository code.
Rename these functions to start with "svn_repo" to avoid any future
conflicts.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The Subversion code had prototypes for several functions which were not
ever defined or used. These functions all had names starting with
"repo_", some of which conflict with those in repository.h. To avoid
the conflict, remove those unused prototypes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.
There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.
So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code cleanup.
* mr/vcs-svn-printf-ulong:
vcs-svn/fast_export: fix timestamp fmt specifiers
|
|
Code cleanup.
* mr/vcs-svn-printf-ulong:
vcs-svn/fast_export: fix timestamp fmt specifiers
|
|
Two instances of %ld being used for unsigned longs
Signed-off-by: Mike Ralphson <mike.ralphson@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
prefixcmp() and suffixcmp() share the common "cmp" suffix that
typically are used to name functions that can be used for ordering,
but they can't, because they are not antisymmetric:
prefixcmp("foo", "foobar") < 0
prefixcmp("foobar", "foo") == 0
We in fact do not use these functions for ordering. Replace them
with functions that just check for equality.
Add starts_with() and end_with() that will be used to replace
prefixcmp() and suffixcmp(), respectively, as the first step. These
are named after corresponding functions/methods in programming
languages, like Java, Python and Ruby.
In vcs-svn/fast_export.c, there was already an ends_with() function
that did the same thing. Let's use the new one instead while at it.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Search for a note attached to the ref to update and read it's
'Revision-number:'-line. Start import from the next svn revision.
If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr an non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.
On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that stores
additional information. The notes are currently hard-coded in
refs/notes/svn/revs. Currently the following lines from the svn dump
are directly accumulated in the note. This can be refined as needed.
- "Revision-number"
- "Node-path"
- "Node-kind"
- "Node-action"
- "Node-copyfrom-path"
- "Node-copyfrom-rev"
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
fast_export lacked a method to writes notes to fast-import stream.
Add two new functions fast_export_note which is similar to
fast_export_modify. And also add fast_export_buf_to_data to be able to
write inline blobs that don't come from a line_buffer or from delta
application.
To be used like this:
fast_export_begin_commit("refs/notes/somenotes", ...)
fast_export_note("refs/heads/master", "inline")
fast_export_buf_to_data(&data)
or maybe
fast_export_note("refs/heads/master", sha1)
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The reference to update by the fast-import stream is hard-coded. When
fetching from a remote the remote-helper shall update refs in a
private namespace, i.e. a private subdir of refs/. This namespace is
defined by the 'refspec' capability, that the remote-helper advertises
as a reply to the 'capabilities' command.
Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The existing function only allows reading from a filename or from
stdin. Allow passing of a FD and an additional FD for the back report
pipe. This allows us to retrieve the name of the pipe in the caller.
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
vcs-svn updates to clean-up compilation, lift 32-bit limitations, etc.
* jn/vcs-svn:
vcs-svn: allow 64-bit Prop-Content-Length
vcs-svn: suppress a signed/unsigned comparison warning
vcs-svn: suppress a signed/unsigned comparison warning
vcs-svn: suppress signed/unsigned comparison warnings
vcs-svn: use strstr instead of memmem
vcs-svn: use constcmp instead of prefixcmp
vcs-svn: simplify cleanup in apply_one_window
vcs-svn: avoid self-assignment in dummy initialization of pre_off
vcs-svn: drop no-op reset methods
vcs-svn: suppress -Wtype-limits warning
vcs-svn: allow import of > 4GiB files
vcs-svn: rename check_overflow and its arguments for clarity
|
|
Currently the vcs-svn/ library only pays attention to the presence of
the Prop-Content-Length field and doesn't care about its value, but
some day we might care about the value. Parse it as an off_t instead
of arbitrarily limiting to 32 bits for intuitiveness.
So now you can import from a dump with more than 2 GiB of properties
for a node. In practice that isn't likely to happen often, and this
is mostly meant as a cleanup.
Based-on-patch-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
All callers pass a nonnegative delta_len, so the code is already safe.
Add an assertion to ensure that remains so and add a cast to keep
clang and gcc -Wsign-compare from worrying.
Reported-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
The preceding code checks that view->max_off is nonnegative and
(off + width) fits in an off_t, so this code is already safe.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
These are already safe because both sides of the comparison are
nonnegative.
This would normally not be important because Git is not -Wsign-compare
clean anyway, but we like to keep the vcs-svn/ lib to a higher
standard for convenience using it in other projects.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
memmem is a GNU extension.
Avoiding it makes the code clearer and makes it easier for projects
that don't share git's compat/ code, such as the standalone
svn-dump-fast-export project, to reuse the vcs-svn/ library.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Since the length of t is already known, we can simplify a little by
using memcmp() instead of strncmp() to carry out a prefix comparison.
All nearby code already does this.
Noticed in the standalone svn-dump-fast-export project which has not
needed to implement prefixcmp() yet.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Currently the cleanup code looks like this:
free resources
return 0;
error_out:
free resources
return -1;
Avoid duplicating the "free resources" part by keeping the return
value in a variable and sharing code between the success and
exceptional case:
ret = 0;
out:
free resources
return ret;
Noticed in the svn-dump-fast-export project, where using the error()
macro in void context produces a warning.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Without this change, clang complains:
vcs-svn/svndiff.c:298:3: warning: Assigned value is garbage or undefined
off_t pre_off = pre_off; /* stupid GCC... */
^ ~~~~~~~
This code uses an old and common idiom for suppressing an
"uninitialized variable" warning, and clang is wrong to warn about it.
The idiom tells the compiler to leave the variable uninitialized,
which saves a few bytes of code size, and, more importantly, allows
valgrind to check at runtime that the variable is properly initialized
by the time it is used.
But MSVC and clang do not know that idiom, so let's avoid it in
vcs-svn/ code.
Initialize pre_off to -1, a recognizably meaningless value, to allow
future code changes that cause pre_off to be used before it is
initialized to be caught early.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Since v1.7.5~42^2~6 (vcs-svn: remove buffer_read_string)
buffer_reset() does nothing thus fast_export_reset() also.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
The error handling routines add a newline. Remove
the duplicate ones in error messages.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On 32-bit architectures with 64-bit file offsets, gcc 4.3 and earlier
produce the following warning:
CC vcs-svn/sliding_window.o
vcs-svn/sliding_window.c: In function `check_overflow':
vcs-svn/sliding_window.c:36: warning: comparison is always false \
due to limited range of data type
The warning appears even when gcc is run without any warning flags
(this is gcc bug 12963). In later versions the same warning can be
reproduced with -Wtype-limits, which is implied by -Wextra.
On 64-bit architectures it really is possible for a size_t not to be
representable as an off_t so the check this is warning about is not
actually redundant. But even false positives are distracting. Avoid
the warning by making the "len" argument to check_overflow a
uintmax_t; no functional change intended.
Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There is no reason in principle that an svn-format dump would not be
able to represent a file whose length does not fit in a 32-bit
integer. Use off_t consistently to represent file lengths (in place
of using uint32_t in some contexts) so we can handle that.
Most svn-fe code is already ready to do that without this patch and
passes values of type off_t around. The type mismatch from stragglers
was noticed with gcc -Wtype-limits.
While at it, tighten the parsing of the Text-content-length field to
make sure it is a number and does not overflow, and tighten other
overflow checks as that value is passed around and manipulated.
Inspired-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code using the argument names a and b just doesn't look right (not
sure why!). Use more explicit names "offset" and "len" to make their
type and meaning clearer.
Also rename check_overflow() to check_offset_overflow() to clarify
that we are making sure that "len" bytes beyond "offset" still fits
the type to represent an offset.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On 32-bit architectures with 64-bit file offsets, gcc 4.3 and earlier
produce the following warning:
CC vcs-svn/sliding_window.o
vcs-svn/sliding_window.c: In function `check_overflow':
vcs-svn/sliding_window.c:36: warning: comparison is always false \
due to limited range of data type
The warning appears even when gcc is run without any warning flags
(PR12963). In later versions it can be reproduced with -Wtype-limits,
which is implied by -Wextra.
On 64-bit architectures it really is possible for a size_t not to be
representable as an off_t so the check being warned about is not
actually redundant. But even false positives are distracting. Avoid
the warning by making the "len" argument to check_overflow a
uintmax_t; no functional change intended.
Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
There is no reason in principle that an svn-format dump would not
be able to represent a file whose length does not fit in a 32-bit
integer. Use off_t consistently (instead of uint32_t) to represent
file lengths so we can handle that.
Most of our code is already ready to do that without this patch and
already passes values of type off_t around. The type mismatch due to
stragglers was noticed with gcc -Wtype-limits.
Inspired-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
The canonical interpretation of a range a,b is as an interval [a,b),
not [a,a+b), so this function taking argument names a and b feels
unnatural. Use more explicit names "offset" and "len" to make the
arguments' type and function clearer.
While at it, rename the function to convey that we are making sure
the sum of this offset and length do not overflow an off_t, not a
size_t.
[jn: split out from a patch from Ramsay Jones, then improved with
advice from Thomas Rast, Dmitry Ivankov, and David Barr]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Dmitry Ivankov <divanorama@gmail.com>
|
|
Curiously, pre_len given to read_length() does not trigger the same warning
even though the code structure is the same. Most likely this is because
read_offset() is used only once and inlining it will make gcc realize that
it has a chance to do more flow analysis. Alas, the analysis is flawed, so
it does not help X-<.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This simplifies svn-fe a great deal and fulfills a longstanding wish:
support for dumps with deltas in them, and incremental imports.
The cost is that commandline usage of the svn-fe tool becomes a little
more complicated since it no longer keeps state itself but instead reads
blobs back from fast-import in order to copy them between revisions and
apply deltas to them.
Also removes a couple of custom data structures and replaces them with
strbufs like other parts of Git.
* 'svn-fe' of git://repo.or.cz/git/jrn: (32 commits)
vcs-svn: reset first_commit_done in fast_export_init
vcs-svn: do not initialize report_buffer twice
vcs-svn: avoid hangs from corrupt deltas
vcs-svn: guard against overflow when computing preimage length
vcs-svn: cap number of bytes read from sliding view
test-svn-fe: split off "test-svn-fe -d" into a separate function
vcs-svn: implement text-delta handling
vcs-svn: let deltas use data from preimage
vcs-svn: let deltas use data from postimage
vcs-svn: verify that deltas consume all inline data
vcs-svn: implement copyfrom_data delta instruction
vcs-svn: read instructions from deltas
vcs-svn: read inline data from deltas
vcs-svn: read the preimage when applying deltas
vcs-svn: parse svndiff0 window header
vcs-svn: skeleton of an svn delta parser
vcs-svn: make buffer_read_binary API more convenient
vcs-svn: learn to maintain a sliding view of a file
Makefile: list one vcs-svn/xdiff object or header per line
vcs-svn: avoid using ls command twice
...
Conflicts:
Makefile
contrib/svn-fe/svn-fe.txt
|
|
Change direct and indirect assignments of the bitwise negation of 0 to
uint32_t variables to have a "U" suffix. I.e. ~0U instead of ~0. This
eliminates warnings under Sun Studio 12 Update 1:
"vcs-svn/string_pool.c", line 11: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND)
"vcs-svn/string_pool.c", line 81: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND)
"vcs-svn/repo_tree.c", line 112: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND)
"vcs-svn/repo_tree.c", line 112: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND)
"test-treap.c", line 34: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND)
The semantics are still the same as demonstrated by this program:
$ cat test.c && make test && ./test
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint32_t foo = ~0;
uint32_t bar = ~0U;
printf("foo = <%u> bar = <%u>\n", foo, bar);
return 0;
}
cc test.c -o test
"test.c", line 5: warning: initializer will be sign-extended: -1
foo = <4294967295> bar = <4294967295>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
first_commit_done has zero as a default value, but it
is not reset back to zero in fast_export_init.
Reset it back to zero so that each export will have
proper initial state.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
When importing from a dump with deltas, first fast_export_init calls
buffer_fdinit, and then init_report_buffer calls fdopen once again
when processing the first delta. The second initialization is
redundant and leaks a FILE *.
Remove the redundant on-demand initialization to fix this.
Initializing directly in fast_export_init is simpler and lets the
caller pass an int specifying which fd to use instead of hard-coding
REPORT_FILENO.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
A corrupt Subversion-format delta can request reads past the end of
the preimage. Set sliding_view::max_off so such corruption is caught
when it appears rather than blocking in an impossible-to-fulfill
read() when input is coming from a socket or pipe.
Inspired-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Signed integer overflow produces undefined behavior in C and off_t is
a signed type. For predictable behavior, add some checks to protect
in advance against overflow.
On 32-bit systems ftell as called by buffer_tmpfile_prepare_to_read
is likely to fail with EOVERFLOW when reading the corresponding
postimage, and this patch does not fix that. So it's more of a
futureproofing measure than a complete fix.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
* db/delta-applier:
vcs-svn: cap number of bytes read from sliding view
test-svn-fe: split off "test-svn-fe -d" into a separate function
|
|
Introduce a "max_off" field in struct sliding_view, roughly
representing a maximum number of bytes that can be read from "file".
If it is set to a nonnegative integer, a call to move_window()
attempting to put the right endpoint beyond that offset will return
an error instead.
The idea is to use this when applying Subversion-format deltas to
prevent reads past the end of the preimage (which has known length).
Without such a check, corrupt deltas would cause svn-fe to block
indefinitely when data in the input pipe is exhausted.
Inspired-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Handle input in Subversion's dumpfile format, version 3. This is the
format produced by "svnrdump dump" and "svnadmin dump --deltas", and
the main difference between v3 dumpfiles and the dumpfiles already
handled is that these can include nodes whose properties and text are
expressed relative to some other node.
To handle such nodes, we find which node the text and properties are
based on, handle its property changes, use the cat-blob command to
request the basis blob from the fast-import backend, use the
svndiff0_apply() helper to apply the text delta on the fly, writing
output to a temporary file, and then measure that postimage file's
length and write its content to the fast-import stream.
The temporary postimage file is shared between delta-using nodes to
avoid some file system overhead.
The svn-fe interface needs to be more complicated to accomodate the
backward flow of information from the fast-import backend to svn-fe.
The backflow fd is not needed when parsing streams without deltas,
though, so existing scripts using svn-fe on v2 dumps should
continue to work.
NEEDSWORK: generalize interface so caller sets the backflow fd, close
temporary file before exiting
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
* db/delta-applier:
vcs-svn: let deltas use data from preimage
vcs-svn: let deltas use data from postimage
vcs-svn: verify that deltas consume all inline data
vcs-svn: implement copyfrom_data delta instruction
vcs-svn: read instructions from deltas
vcs-svn: read inline data from deltas
vcs-svn: read the preimage when applying deltas
vcs-svn: parse svndiff0 window header
vcs-svn: skeleton of an svn delta parser
vcs-svn: make buffer_read_binary API more convenient
vcs-svn: learn to maintain a sliding view of a file
Makefile: list one vcs-svn/xdiff object or header per line
Conflicts:
Makefile
vcs-svn/LICENSE
|
|
* db/svn-fe-code-purge:
vcs-svn: drop obj_pool
vcs-svn: drop treap
vcs-svn: drop string_pool
vcs-svn: pass paths through to fast-import
Conflicts:
vcs-svn/fast_export.c
vcs-svn/fast_export.h
vcs-svn/repo_tree.c
vcs-svn/repo_tree.h
vcs-svn/string_pool.c
vcs-svn/svndump.c
vcs-svn/trp.txt
|
|
This teaches svn-fe to incrementally import into an existing
repository (at last!) at the expense of less convenient UI. Think of
it as growing pains. This opens the door to many excellent things,
and it would be a bad idea to discourage people from building on it
for much longer.
* db/vcs-svn-incremental:
vcs-svn: avoid using ls command twice
vcs-svn: use mark from previous import for parent commit
vcs-svn: handle filenames with dq correctly
vcs-svn: quote paths correctly for ls command
vcs-svn: eliminate repo_tree structure
vcs-svn: add a comment before each commit
vcs-svn: save marks for imported commits
vcs-svn: use higher mark numbers for blobs
vcs-svn: set up channel to read fast-import cat-blob response
Conflicts:
t/t9010-svn-fe.sh
vcs-svn/fast_export.c
vcs-svn/fast_export.h
vcs-svn/repo_tree.c
vcs-svn/svndump.c
|
|
* rj/sparse:
sparse: Fix some "symbol not declared" warnings
sparse: Fix errors due to missing target-specific variables
sparse: Fix an "symbol 'merge_file' not decared" warning
sparse: Fix an "symbol 'format_subject' not declared" warning
sparse: Fix some "Using plain integer as NULL pointer" warnings
sparse: Fix an "symbol 'cmd_index_pack' not declared" warning
Makefile: Use cgcc rather than sparse in the check target
|
|
In particular, sparse issues the "symbol 'a_symbol' was not declared.
Should it be static?" warnings for the following symbols:
attr.c:468:12: 'git_etc_gitattributes'
attr.c:476:5: 'git_attr_system'
vcs-svn/svndump.c:282:6: 'svndump_read'
vcs-svn/svndump.c:417:5: 'svndump_init'
vcs-svn/svndump.c:432:6: 'svndump_deinit'
vcs-svn/svndump.c:445:6: 'svndump_reset'
The symbols in attr.c only require file scope, so we add the static
modifier to their declaration.
The symbols in vcs-svn/svndump.c are external symbols, and they
already have extern declarations in the "svndump.h" header file,
so we simply include the header in svndump.c.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
I found that some doubled words had snuck back into projects from which
I'd already removed them, so now there's a "syntax-check" makefile rule in
gnulib to help prevent recurrence.
Running the command below spotted a few in git, too:
git ls-files | xargs perl -0777 -n \
-e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt])\s+\1\b/gims)' \
-e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g;' \
-e 'print "$ARGV:$n:$v\n"}'
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* 'svn-fe' of git://repo.or.cz/git/jrn:
tests: kill backgrounded processes more robustly
vcs-svn: a void function shouldn't try to return something
tests: make sure input to sed is newline terminated
vcs-svn: add missing cast to printf argument
|
|
As v1.7.4-rc0~184 (2010-10-04) and C99 §6.8.6.4.1 remind us, standard
C does not permit returning an expression of type void, even for a
tail call.
Noticed with gcc -pedantic:
vcs-svn/svndump.c: In function 'handle_node':
vcs-svn/svndump.c:213:3: warning: ISO C forbids 'return' with expression,
in function returning void [-pedantic]
[jn: with simplified log message]
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
gcc -m32 correctly warns:
vcs-svn/fast_export.c: In function 'fast_export_commit':
vcs-svn/fast_export.c:54:2: warning: format '%llu' expects
argument of type 'long long unsigned int', but argument 2
has type 'unsigned int' [-Wformat]
Fix it.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The copyfrom_source instruction appends data from the preimage buffer
to the end of output. Its arguments are a length and an offset
relative to the beginning of the source view.
With this change, the delta applier is able to reproduce all 5,636,613
blobs in the early history of the ASF repository. Tested with
mkfifo backflow
svn-fe <svn-asf-public-r0:940166 3<backflow |
git fast-import --cat-blob-fd=3 3>backflow
with svn-asf-public-r0:940166 produced by whatever version of
Subversion the dumps in /dump/ on svn.apache.org use (presumably
1.6.something).
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
The copyfrom_target instruction copies appends data that is already
present in the current output view to the end of output. (The offset
argument is relative to the beginning of output produced in the
current window.)
The region copied is allowed to run past the end of the existing
output. To support that case, copy one character at a time rather
than calling memcpy or memmove. This allows copyfrom_target to be
used once to repeat a string many times. For example:
COPYFROM_DATA 2
COPYFROM_OUTPUT 10, 0
DATA "ab"
would produce the output "ababababababababababab".
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
By constraining the format of deltas, we can more easily detect
corruption and other breakage.
Requiring deltas not to provide unconsumed data also opens the
possibility of ignoring the declared amount of novel data and simply
streaming the data as needed to fulfill copyfrom_data requests.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
The copyfrom_data instruction copies a few bytes verbatim from the
novel text section of a window to the postimage.
[jn: with memory leak fix from David]
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
Buffer the instruction section upon encountering it for later
interpretation.
An alternative design would involve parsing the instructions
at this point and buffering them in some processed form. Using
the unprocessed form is simpler.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
Each window of an svndiff0-format delta includes a section for novel
text to be copied to the postimage (in the order it appears in the
window, possibly interspersed with other data).
Slurp in this data when encountering it. It is not actually necessary
to do so --- it would be just as easy to copy from delta to output
as part of interpreting the relevant instructions --- but this way,
the code that interprets svndiff0 instructions can proceed very
quickly because it does not require I/O.
Subversion's svndiff0 parser rejects deltas that do not consume all
the novel text that was provided. Omit that check for now so we can
test the new functionality right away, rather than waiting to learn
instructions that consume data.
Do check for truncated data sections. Subversion's parser rejects
deltas that end in the middle of a declared novel-text section, so it
should be safe for us to reject them, too.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
The source view offset heading each svndiff0 window represents a
number of bytes past the beginning of the preimage. Together with the
source view length, it dictates to the delta applier what portion of
the preimage instructions will refer to. Read that portion right away
using the sliding window code.
Maybe some day we will use mmap to read data more lazily.
Subversion's implementation tolerates source view offsets pointing
past the end of the preimage file but we do not, for simplicity.
This does not teach the delta applier to read instructions or copy
data from the source view. Deltas that could produce nonempty output
will still be rejected.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
Each window in a subversion delta (svndiff0-format file) starts with a
window header, consisting of five integers with variable-length
representation:
source view offset
source view length
output length
instructions length
auxiliary data length
Parse it. The result is not usable for deltas with nonempty postimage
yet; in fact, this only adds support for deltas without any
instructions or auxiliary data. This is a good place to stop, though,
since that little support lets us add some simple passing tests
concerning error handling to the test suite.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
A delta in the subversion delta (svndiff0) format consists of the
magic bytes SVN\0 followed by a sequence of windows of a certain well
specified format (starting with five integers).
Add an svndiff0_apply function and test-svn-fe -d commandline tool to
parse such a delta in the special case of not including any windows.
Later patches will add features to turn this into a fully functional
delta applier for svn-fe to use to parse the streams produced by
"svnrdump dump" and "svnadmin dump --deltas".
The content of symlinks starts with the word "link " in Subversion's
worldview, so we need to be able to prepend that text to input for the
sake of delta application. So initialization of the input state of
the delta preimage is left to the calling program, giving callers a
chance to seed the buffer with text of their choice.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
buffer_read_binary is a thin wrapper around fread, but its signature
is wrong:
- fread can fill an arbitrary in-memory buffer. buffer_read_binary
is limited to buffers whose size is representable by a 32-bit
integer.
- The result from fread is the number of bytes actually read.
buffer_read_binary only reports the number of bytes read by
incrementing sb->len by that amount and returns void.
Fix both: let buffer_read_binary accept a size_t instead of uint32_t
for the number of bytes to read and as a convenience return the number
of bytes actually read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Each section of a Subversion-format delta only requires examining (and
keeping in random-access memory) a small portion of the preimage. At
any moment, this portion starts at a certain file offset and has a
well-defined length, and as the delta is applied, the portion advances
from the beginning to the end of the preimage. Add a move_window
function to keep track of this view into the preimage.
You can use it like this:
buffer_init(f, NULL);
struct sliding_view window = SLIDING_VIEW_INIT(f);
move_window(&window, 3, 7); /* (1) */
move_window(&window, 5, 5); /* (2) */
move_window(&window, 12, 2); /* (3) */
strbuf_release(&window.buf);
buffer_deinit(f);
The data structure is called sliding_view instead of _window to
prevent confusion with svndiff0 Windows.
In this example, (1) reads 10 bytes and discards the first 3;
(2) discards the first 2, which are not needed any more; and (3) skips
2 bytes and reads 2 new bytes to work with.
When move_window returns, the file position indicator is at position
window->off + window->width and the data from positions window->off to
the current file position are stored in window->buf.
This function performs only sequential access from the input file and
never seeks, so it can be safely used on pipes and sockets.
On end-of-file, move_window silently reads less than the caller
requested. On other errors, it prints a message and returns -1.
Helped-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
gcc -m32 correctly warns:
vcs-svn/fast_export.c: In function 'fast_export_commit':
vcs-svn/fast_export.c:54:2: warning: format '%llu' expects
argument of type 'long long unsigned int', but argument 2
has type 'unsigned int' [-Wformat]
Fix it.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
* 'svn-fe' of git://repo.or.cz/git/jrn:
vcs-svn: handle log message with embedded NUL
vcs-svn: avoid unnecessary copying of log message and author
vcs-svn: remove buffer_read_string
vcs-svn: make reading of properties binary-safe
|
|
Currently there are two functions to retrieve the mode and content
at a path:
const char *repo_read_path(const uint32_t *path);
uint32_t repo_read_mode(const uint32_t *path)
Replace them with a single function with two return values. This
means we can use one round-trip to get the same information from
fast-import that previously took two.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Pass the log message by strbuf instead of as a C-style string and use
fwrite instead of printf to write it to fast-import so embedded '\0'
bytes can be preserved.
Currently "git log" doesn't show the embedded NULs but "git cat-file
commit" can.
While at it, stop including system headers from repo_tree.h. git
source files need to include git-compat-util.h (or cache.h or
builtin.h) sooner to ensure the appropriate feature test macros are
defined.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Use strbuf_swap when storing the svn:log and svn:author properties, so
pointers to rather than the contents of buffers get copied. The main
effect should be to make the code a little easier to read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
All previous users of buffer_read_string have already been converted
to use the more intuitive buffer_read_binary, so remove the old API to
avoid some confusion.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
svn-fe errors out on revision 59151 of the ASF repository:
fatal: invalid dump: unexpected end of file
The proximate cause is a property with an embedded NUL character.
Previously such anomalies were ignored but commit c9d1c8ba
(2010-12-28) introduced a check strlen(val) == len to avoid reading
uninitialized data when a property list ends early and unfortunately
this test does not distinguish between "foo" followed by EOF and the
string "foo\0bar\0baz".
Fix it by using buffer_read_binary to read to a strbuf and checking
the actual length read. Most consumers of properties still use
C-style strings, so in practice an author or log message with embedded
NULs will be truncated, but a least this way svn-fe won't error out
(fixing the regression).
Reported-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
* 'svn-fe' of git://repo.or.cz/git/jrn:
vcs-svn: use strchr to find RFC822 delimiter
vcs-svn: implement perfect hash for top-level keys
vcs-svn: implement perfect hash for node-prop keys
vcs-svn: use strbuf for author, UUID, and URL
vcs-svn: use strbuf for revision log
vcs-svn: improve reporting of input errors
vcs-svn: make buffer_copy_bytes return length read
vcs-svn: make buffer_skip_bytes return length read
vcs-svn: improve support for reading large files
vcs-svn: allow input errors to be detected promptly
vcs-svn: simplify repo_modify_path and repo_copy
vcs-svn: handle_node: use repo_read_path
vcs-svn: introduce repo_read_path to check the content at a path
|
|
* db/length-as-hash:
vcs-svn: use strchr to find RFC822 delimiter
vcs-svn: implement perfect hash for top-level keys
vcs-svn: implement perfect hash for node-prop keys
Conflicts:
vcs-svn/svndump.c
|
|
This reverts commit 4709455db3891f6cad9a96a574296b4926f70cbe (Add
memory pool library, 2010-08-09). svn-fe uses strbufs to avoid memory
allocation overhead nowadays.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
This reverts commit 951f316470acc7c785c460a4e40735b22822349f
(Add treap implementation, 2010-08-09). The string_pool was
trp.h's last user.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
This reverts commit 1d73b52f5ba4184de6acf474f14668001304a10c
(Add string-specific memory pool, 2010-08-09). Now that svn-fe
does not need to maintain a growing collection of strings (paths)
over a long period of time, the string_pool is not needed.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Now that there is no internal representation of the repo, it is not
necessary to tokenise paths. Use strbuf instead and bypass
string_pool.
This means svn-fe can handle arbitrarily long paths (as long as a
strbuf can fit them), with arbitrarily many path components.
While at it, since we now treat paths in their entirety, only quote
when necessary.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
* db/strbufs-for-metadata:
vcs-svn: use strbuf for author, UUID, and URL
vcs-svn: use strbuf for revision log
Conflicts:
vcs-svn/fast_export.c
vcs-svn/fast_export.h
vcs-svn/repo_tree.c
vcs-svn/svndump.c
|
|
* 'db/length-as-hash' (early part):
vcs-svn: implement perfect hash for top-level keys
vcs-svn: implement perfect hash for node-prop keys
vcs-svn: improve reporting of input errors
vcs-svn: make buffer_copy_bytes return length read
vcs-svn: make buffer_skip_bytes return length read
vcs-svn: improve support for reading large files
Conflicts:
vcs-svn/fast_export.c
vcs-svn/svndump.c
|
|
This is a small optimisation (4% reduction in user time) but is the
largest artifact within the parsing portion of svndump.c
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Instead of interning property names and comparing their string_pool
keys, look them up in a table by string length, which should be about
as fast.
Another small step towards removing dependence on string_pool
altogether.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Instead of interning property names and comparing their string_pool
keys, look them up in a table by string length, which should be about
as fast.
This is a small step towards removing dependence on string_pool.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Use strbufs and strings instead of interned strings for values of rev,
dump, and node fields that happen to be strings. After this change,
the only remaining string_pool use is for paths in the repo_tree API
and internals.
Functional change: treat an empty author, UUID, or URL as none at all.
So for example, in repos where the first revision has an empty
svn:author property, the first rev will be treated as by "nobody"
rather than by a person with empty name and email address created by
prepending an @ sign to the repository UUID.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
obj_pool is overkill for this application: all that is needed is a
buffer that can resize from rev to rev to accomodate differently-sized
strings. In the spirit of commit deadcef4 (2010-11-06), use a strbuf
instead.
This is a small step towards removing dependence on obj_pool.h.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Catch input errors and exit early enough to print a reasonable
diagnosis based on errno.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Currently buffer_copy_bytes does not report to its caller whether
it encountered an early end of file.
Add a return value representing the number of bytes read (but not
the number of bytes copied). This way all three unusual conditions
can be distinguished: input error with buffer_ferror, output error
with ferror(outfile), early end of input by checking the return
value.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Currently there is no way to detect when input ended if it ended
early during buffer_skip_bytes. Tell the calling program how many
bytes were actually skipped for easier debugging.
Existing callers will still ignore early EOF.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Move from uint32_t to off_t as the fundamental unit of length used by
the line_buffer library. Performance would get worse if anything but
I think it's worth it for support of deltas that need to skip large
pieces (> 4 GiB).
Exception: buffer_read_string still takes a uint32_t, since it keeps
its result in an in-core obj_pool.
Callers still have to be updated to take advantage of this.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
trp_gen is not a statement or function call, so it should not be
followed with a semicolon. Noticed by gcc -pedantic.
vcs-svn/repo_tree.c:41:81: warning: ISO C does not allow extra ';'
outside of a function [-pedantic]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With this patch, overlapping incremental imports work.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Quote paths passed to fast-import so filenames with double quotes are
not misinterpreted.
One might imagine this could help with filenames with newlines, too,
but svn does not allow those.
Helped-by: David Barr <daivd.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
This bug was found while importing rev 601865 of ASF.
[jn: with test]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Rely on fast-import for information about previous revs.
This requires always setting up backward flow of information, even for
v2 dumps. On the plus side, it simplifies the code by quite a bit and
opens the door to further simplifications.
[db: adjusted to support final version of the cat-blob patch]
[jn: avoiding hard-coding git's name for the empty tree for
portability to other backends]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Current svn-fe produces output like this:
blob
mark :7382321
data 5
hello
blob
mark :7382322
data 5
Hello
commit
mark :3
[...]
M 100644 :7382321 hello.c
M 100644 :7382322 hello2.c
This means svn-fe has to keep track of the paths modified in each
commit and the corresponding marks, instead of dealing with each file
as it arrives in input and then forgetting about it. A better
strategy would be to use inline blobs:
commit
mark :3
[...]
M 100644 inline hello.c
data 5
hello
[...]
As a first step towards that, teach svn-fe to notice when the
collection of blobs for each commit starts and write a comment
("# commit 3.") there.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
This way, a person can use
svnadmin dump $path |
svn-fe |
git fast-import --relative-marks --export-marks=svn-revs
to get a list of what commit corresponds to each svn revision (plus
some irrelevant blob names) in .git/info/fast-import/svn-revs.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Prepare to use mark :5 for the commit corresponding to r5 (and so on).
1 billion seems sufficiently high for blob marks to avoid conflicting
with rev marks, while still leaving room for 3 billion blobs. Such
high mark numbers cause trouble with ancient fast-import versions, but
this topic cannot support git fast-import versions before 1.7.4 (which
introduces the cat-blob command) anyway.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Set up some plumbing: teach the svndump lib to pass a file descriptor
number to the fast_export lib, representing where cat-blob/ls
responses can be read from, and add a get_response_line helper
function to the fast_export lib to read a line from that file.
Unfortunately this means that svn-fe needs file descriptor 3 to be
redirected from somewhere (preferrably the cat-blob stream of a
fast-import backend); otherwise it will fail:
$ svndump <path> | svn-fe
fatal: cannot read from file descriptor 3: Bad file descriptor
For the moment, "svn-fe 3</dev/null" works as a workaround but it
will not work for very long. A fast-import backend that can retrieve
old commits is needed in order to be able to fulfill svn
"Node-copyfrom-rev" requests that refer to revs from a previous run.
[jn: with new change description]
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
The line_buffer library silently flags input errors until
buffer_deinit time; unfortunately, by that point usually errno is
invalid. Expose the error flag so callers can check for and
report errors early for easy debugging.
some_error_prone_operation(...);
if (buffer_ferror(buf))
return error("input error: %s", strerror(errno));
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Restrict the repo_tree API to functions that are actually needed.
- decouple reading the mode and content of dirents from other
operations.
- remove repo_modify_path. It is only used to read the mode from
dirents.
- remove the ability to use repo_read_mode on a missing path. The
existing code only errors out in that case, anyway.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
svn-fe processes each commit in two stages: first decide on the
correct content for all paths and export the relevant blobs, then
export a commit with the result.
But we can keep less state and simplify svn-fe a great deal by
exporting the commit in one step: use 'inline' blobs for each path and
remember nothing. This way, the repo_tree structure could be
eliminated, and we would get support for incremental imports 'for
free'.
Reorganize handle_node along these lines. This is just a code
cleanup; the changes in repo_tree and handle_revision will come later.
[db: backported to apply without text delta support]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
The repo_tree structure remembers, for each path in each revision, a
mode (regular file, executable, symlink, or directory) and content
(blob mark or directory structure). Maintaining a second copy of all
this information when it's already in the target repository is
wasteful, it does not persist between svn-fe invocations, and most
importantly, there is no convenient way to transfer it from one
machine to another. So it would be nice to get rid of it.
As a first step, let's change the repo_tree API to match fast-import's
read commands more closely. Currently to read the mode for a path,
one uses
repo_modify_path(path, new_mode, new_content);
which changes the mode and content as a side effect. There is no
function to read the content at a path; add one.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
* git://github.com/gitster/git:
vcs-svn: Allow change nodes for root of tree (/)
vcs-svn: Implement Prop-delta handling
vcs-svn: Sharpen parsing of property lines
vcs-svn: Split off function for handling of individual properties
vcs-svn: Make source easier to read on small screens
vcs-svn: More dump format sanity checks
vcs-svn: Reject path nodes without Node-action
vcs-svn: Delay read of per-path properties
vcs-svn: Combine repo_replace and repo_modify functions
vcs-svn: Replace = Delete + Add
vcs-svn: handle_node: Handle deletion case early
vcs-svn: Use mark to indicate nodes with included text
vcs-svn: Unclutter handle_node by introducing have_props var
vcs-svn: Eliminate node_ctx.mark global
vcs-svn: Eliminate node_ctx.srcRev global
vcs-svn: Check for errors from open()
vcs-svn: Allow simple v3 dumps (no deltas yet)
Conflicts:
t/t9010-svn-fe.sh
vcs-svn/svndump.c
|
|
It can sometimes be useful to write information temporarily to file,
to read back later. These functions allow a program to use the
line_buffer facilities when doing so.
It works like this:
1. find a unique filename with buffer_tmpfile_init.
2. rewind with buffer_tmpfile_rewind. This returns a stdio
handle for writing.
3. when finished writing, declare so with
buffer_tmpfile_prepare_to_read. The return value indicates
how many bytes were written.
4. read whatever portion of the file is needed.
5. if finished, remove the temporary file with buffer_deinit.
otherwise, go back to step 2,
The svn support would use this to buffer the postimage from delta
application until the length is known and fast-import can receive
the resulting blob.
Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
buffer_read_char can be used in place of buffer_read_string(1) to
avoid consuming valuable static buffer space. The delta applier will
use this to read variable-length integers one byte at a time.
Underneath, it is fgetc, wrapped so the line_buffer library can
maintain its role as gatekeeper of input.
Later it might be worth checking if fgetc_unlocked is faster ---
most line_buffer functions are not thread-safe anyway.
Helpd-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
buffer_read_string works well for non line-oriented input except for
one problem: it does not tell the caller how many bytes were actually
written. This means that unless one is very careful about checking
for errors (and eof) the calling program cannot tell the difference
between the string "foo" followed by an early end of file and the
string "foo\0bar\0baz".
So introduce a variant that reports the length, too, a thinner wrapper
around strbuf_fread. Its result is written to a strbuf so the caller
does not need to keep track of the number of bytes read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Collect the line_buffer state in a newly public line_buffer struct.
Callers can use multiple line_buffers to manage input from multiple
files at a time.
svn-fe's delta applier will use this to stream a delta from svnrdump
and the preimage it applies to from fast-import at the same time.
The tests don't take advantage of the new features, but I think that's
okay. It is easier to find lingering examples of nonreentrant code by
searching for "static" in line_buffer.c.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
Prepare for the line_buffer lib to support input from multiple files,
by collecting global state in a struct that can be easily passed
around.
No API change yet.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
obj_pool is inherently global and does not use the standard growing
factor alloc_nr, which makes it feel out of place in the git codebase.
Plus it is overkill for this application: all that is needed is a
buffer that can grow between requests to accomodate larger strings.
Use a strbuf instead.
As a side effect, this improves the error handling: allocation
failures will result in a clean exit instead of segfaults. It would
be nice to add a test case (using ulimit or failmalloc) but that can
wait for another day.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
The data stored in byte_buffer[] is always either discarded or
written to stdout immediately. No need for it to persist between
function calls.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
In particular, on systems that define uint32_t as an unsigned long,
gcc complains as follows:
CC vcs-svn/svndump.o
vcs-svn/svndump.c: In function `svndump_read':
vcs-svn/svndump.c:215: warning: int format, uint32_t arg (arg 2)
In order to suppress the warning we use the C99 format specifier
macro PRIu32 from <inttypes.h>.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* 'jn/svn-fe' (early part):
vcs-svn: Error out for v3 dumps
Conflicts:
t/t9010-svn-fe.sh
|
|
* jn/maint-svn-fe:
t9010 fails when no svn is available
vcs-svn: fix intermittent repo_tree corruption
treap: make treap_insert return inserted node
t9010 (svn-fe): Eliminate dependency on svn perl bindings
|
|
It is not uncommon for a svn repository to include change records for
properties at the top level of the tracked tree:
Node-path:
Node-kind: dir
Node-action: change
Prop-delta: true
Prop-content-length: 43
Content-length: 43
K 10
svn:ignore
V 11
build-area
PROPS-END
Unfortunately a recent svn-fe change (vcs-svn: More dump format sanity
checks, 2010-11-19) causes such nodes to be rejected with the error
message
fatal: invalid dump: path to be modified is missing
The repo_tree module does not keep a dirent for the root of the tree.
Add a block to the dump parser to take care of this case.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Pointers to directory entries do not remain valid after a call to
dent_insert.
Noticed in the course of importing a small Subversion repository
(~1000 revs); after setting up a dirent for a certain path as a
placeholder, by luck dent_insert would trigger a realloc that
shifted around addresses, resulting in an import with that file
replaced by a directory.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Suppose I try the following:
struct int_node *node = node_pointer(node_alloc(1));
node->n = 5;
treap_insert(&root, node);
printf("%d\n", node->n);
Usually the result will be 5. But since treap_insert draws memory
from the node pool, if the caller is unlucky then (1) the node pool
will be full and (2) realloc will be forced to move the node pool to
find room, so the node address becomes invalid and the result of
dereferencing it is undefined.
So we ought to use offsets in preference to pointers for references
that would remain valid after a call to treap_insert. Tweak the
signature to hint at a certain special case: since the inserted node
can change address (though not offset), as a convenience teach
treap_insert to return its new address.
So the motivational example could be fixed by adding "node =".
struct int_node *node = node_pointer(node_alloc(1));
node->n = 5;
node = treap_insert(&root, node);
printf("%d\n", node->n);
Based on a true story.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The rules for what file is used as delta source for each file are not
documented in dump-load-format.txt. Luckily, the Apache Software
Foundation repository has rich enough examples to figure out most of
the rules:
Node-action: replace implies the empty property set and empty text as
preimage for deltas. Otherwise, if a copyfrom source is given, that
node is the preimage for deltas. Lastly, if none of the above applies
and the node path exists in the current revision, then that version
forms the basis.
[jn: refactored, with tests]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Prepare to add a new type of property line (the 'D' line) to
handle property deltas.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The handle_property function is the part of read_props that would be
interesting for most people: semantics of properties rather than the
algorithm for parsing them.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove some newlines from handle_node() that are not needed for
clarity.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Node-action: change is not appropriate when switching between file and
directory or adding a new file. Current svn-fe silently accepts such
nodes and the resulting tree has missing files in the "changed when
meant to add" case.
Node-action: add requires some content (text or directory); there is
no such thing as an "intent to add" node in svn dumps. Current svn-fe
accepts such contentless adds but produces an invalid fast-import
stream that refers to nonexistent mark :0 in response.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It would be better to flag such errors and let the import proceed
anyway, but for now it is simpler not to worry about recovery
from such weird cases.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The mode for each file in an svn-format dump is kept in the properties
section. The properties section is read as soon as possible to allow
the correct mode to be filled in when registering the file with the
repo_tree lib.
To support nodes with a missing properties section, svn-fe determines
the mode in three stages:
- The kind (directory or file) of the node is read from the dump and
used to make an initial estimate (040000 or 100644).
- Properties are read in and allowed to override this for symlinks
and executables.
- If there is no properties section, the mode from the previous
content of the path is left alone, overriding the above
considerations.
This is a bit of a mess, and worse, it would get even more complicated
once we start to support property deltas. If we could only register
the file with a provisional value for mode and then change it later
when properties say so, the procedure would be much simpler.
... oh, right, we can.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are two functions to change the staged content for a path in the
svn importer's active commit: repo_replace, which changes the text and
returns the mode, and repo_modify, which changes the text and mode and
returns nothing.
Worse, there are more subtle differences:
- A mark of 0 passed to repo_modify means "use the existing content".
repo_replace uses it as mark :0 and produces a corrupt stream.
- When passed a path that is not part of the active commit,
repo_replace returns without doing anything. repo_modify
transparently adds a new directory entry.
Get rid of both and introduce a new function with the best features of
both: repo_modify_path modifies the mode, content, or both for a path,
depending on which arguments are zero. If no such dirent already
exists, it does nothing and reports the error by returning 0.
Otherwise, the return value is the resulting mode.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Simplify by reducing the "Node-action: replace" case to "Node-action:
add". This way, the main part of handle_node() only has to deal with
"add" and "change" nodes.
Functional change: replacing a symlink or executable without setting
properties will reset the mode.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Take care of "Node-action: delete" as soon as possible, so we can stop
worrying about that case in the rest of the function.
Functional change: catch deletion nodes with features that would not
apply to them (text, properties, or origin data) and error out for
those cases.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Allocate a mark if needed as soon as possible so later code can use
"if (mark)" to check if this node has text attached rather than
explicitly checking for Text-content-length.
While at it, reject directory nodes with text attached; the presence
of such a node would indicate a bug in the dump generator or svn-fe's
understanding. In the long term, it would be nice to be able to
continue parsing and save the error for later, but for now it is
simpler to error out right away.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It is possible for a path node in an SVN-format dump file to leave out
the properties section. svn-fe handles this by carrying over the
properties (in particular, file type) from the old version of that
node.
To support this, handle_node tests several times whether a
Prop-content-length field is present. Ancient Subversion actually
leaves out the Prop-content-length field even for nodes with
properties, so that's not quite the right check. Besides, this detail
of mechanism is distracting when the question at hand is instead what
content the new node should have.
So introduce a local have_props variable. The semantics are the same
as before; the adaptations to support ancient streams that leave out
the prop-content-length can wait until someone needs them.
Signed-off-by: Jonathan Nieder <jrnieer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The mark variable is only used in handle_node(). Its life is
very short and simple: first, a new mark number is allocated if
this node has text attached, then that mark is recorded in the
in-core tree being built up, and lastly the mark is communicated
to fast-import in the stream along with the associated text.
A new reader may worry about interaction with other code, especially
since mark is not initialized to zero in handle_node() itself.
Disperse such worries by making it local. No functional change
intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The srcRev variable is only used in handle_node(); its purpose
is to hold the old mode for a path, to only be used if properties
are not being changed. Narrow its scope to make its meaningful
lifetime more obvious.
No functional change intended. Add some tests as a sanity-check
for the simplest case (no renames).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
test-svn-fe segfaults when passed a bogus path. Simplify debugging by
exiting with a meaningful error message instead.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since the dumpfile version 1 days, the Subversion dump format
gained some new fields:
- a unique identifier for the repository (version 2 format)
- whether the text and properties for a node should be
interpreted as deltas
- checksums for a delta's preimage
- SHA-1 sums as alternatives to the existing MD5 checksums for
copy source and the payload (delta).
For now what is relevant to us is the Text-delta and Prop-delta
fields, since not noticing these causes a dump file to be
misinterpreted (see the previous commit).
[jn: with tests]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
By ignoring the Text-Delta and Prop-Delta node fields, current svn-fe
happily mistakes deltas for full text and instead of cleanly erroring
out, it produces a valid but semantically bogus fast-import stream
when fed a dump file in the modern "svnadmin dump --deltas" format.
Dump file parsers are supposed to ignore header fields they don't
understand (to allow for backward-compatible extensions), but they are
also supposed to check the SVN-fs-dump-format-version header to
prevent misinterpretation of non backward-compatible extensions.
Do so.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In particular, on systems that define uint32_t as an unsigned long,
gcc complains as follows:
CC vcs-svn/fast_export.o
vcs-svn/fast_export.c: In function `fast_export_modify':
vcs-svn/fast_export.c:28: warning: unsigned int format, uint32_t arg (arg 2)
vcs-svn/fast_export.c:28: warning: int format, uint32_t arg (arg 3)
vcs-svn/fast_export.c: In function `fast_export_commit':
vcs-svn/fast_export.c:42: warning: int format, uint32_t arg (arg 5)
vcs-svn/fast_export.c:62: warning: int format, uint32_t arg (arg 2)
vcs-svn/fast_export.c: In function `fast_export_blob':
vcs-svn/fast_export.c:72: warning: int format, uint32_t arg (arg 2)
vcs-svn/fast_export.c:72: warning: int format, uint32_t arg (arg 3)
CC vcs-svn/svndump.o
vcs-svn/svndump.c: In function `svndump_read':
vcs-svn/svndump.c:260: warning: int format, uint32_t arg (arg 3)
In order to suppress the warnings we use the C99 format specifier
macros PRIo32 and PRIu32 from <inttypes.h>.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the spirit of v1.6.4-rc0~124 (MinGW: Fix compiler warning in
merge-recursive, 2009-05-23), use a 32-bit integer instead; the
dump file parser does not support any better, anyway.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
dirent is #define’d to mingw_dirent in compat/mingw.h, with the
result that
obj_pool_gen(dirent, struct repo_dirent, 4096)
creates functions with names like mingw_dirent_alloc and
references to dirent_alloc go unresolved. Rename the functions
to dent_* to avoid this problem.
Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Missing spaces in while (0) and trpn_pointer(a, b).
Remove parentheses around return value.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
svndump parses data that is in SVN dumpfile format produced by
`svnadmin dump` with the help of line_buffer and uses repo_tree and
fast_export to emit a git fast-import stream.
Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase
project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and
others.
[rr: allow input from files other than stdin]
[jn: with test, more error reporting]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
repo_tree maintains the exporter's state and provides a facility to to
call fast_export, which writes objects to stdout suitable for
consumption by fast-import.
The exported functions roughly correspond to Subversion FS operations.
. repo_add, repo_modify, repo_copy, repo_replace, and repo_delete
update the current commit, based roughly on the corresponding
Subversion FS operation.
. repo_commit calls out to fast_export to write the current commit to
the fast-import stream in stdout.
. repo_diff is used by the fast_export module to write the changes
for a commit.
. repo_reset erases the exporter's state, so valgrind can be happy.
[rr: squelched compiler warnings]
[jn: removed support for maintaining state on-disk, though we may
want to add it back later]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This library provides thread-unsafe fgets()- and fread()-like
functions where the caller does not have to supply a buffer. It
maintains a couple of static buffers and provides an API to use
them.
[rr: allow input from files other than stdin]
[jn: with tests, documentation, and error handling improvements]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Intern strings so they can be compared by address and stored without
wasting space.
This library uses the macros in the obj_pool.h and trp.h to create a
memory pool for strings and expose an API for handling them.
[rr: added API docs]
[jn: with some API simplifications, new documentation and tests]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Provide macros to generate a type-specific treap implementation and
various functions to operate on it. It uses obj_pool.h to store memory
nodes in a treap. Previously committed nodes are never removed from
the pool; after any *_commit operation, it is assumed (correctly, in
the case of svn-fast-export) that someone else must care about them.
Treaps provide a memory-efficient binary search tree structure.
Insertion/deletion/search are about as about as fast in the average
case as red-black trees and the chances of worst-case behavior are
vanishingly small, thanks to (pseudo-)randomness. The bad worst-case
behavior is a small price to pay, given that treaps are much simpler
to implement.
>From http://www.canonware.com/download/trp/trp_hash/trp.h
[db: Altered to reference nodes by offset from a common base pointer]
[db: Bob Jenkins' hashing implementation dropped for Knuth's]
[db: Methods unnecessary for search and insert dropped]
[rr: Squelched compiler warnings]
[db: Added support for immutable treap nodes]
[jn: Reintroduced treap_nsearch(); with tests]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a memory pool library implemented using C macros. The
obj_pool_gen() macro creates a type-specific memory pool.
The memory pool library is distinguished from the existing specialized
allocators in alloc.c by using a contiguous block for all allocations.
This means that on one hand, long-lived pointers have to be written as
offsets, since the base address changes as the pool grows, but on the
other hand, the entire pool can be easily written to the file system.
This could allow the memory pool to persist between runs of an
application.
For the svn importer, such a facility is useful because each svn
revision can copy trees and files from any previous revision. The
relevant information for all revisions has to persist somehow to
support incremental runs.
[rr: minor cleanups]
[jn: added tests; removed file system backing for now]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Teach the build system to build a separate library for the
upcoming subversion interop support.
The resulting vcs-svn/lib.a does not contain any code, nor is
it built during a normal build. This is just scaffolding for
later changes.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|