| Age | Commit message (Collapse) | Author | Files | Lines |
|
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_backend`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `object_directory` structure is used as an access point for a single
object directory like ".git/objects". While the structure isn't yet
fully self-contained, the intent is for it to eventually contain all
information required to access objects in one specific location.
While the name "object directory" is a good fit for now, this will
change over time as we continue with the agenda to make pluggable object
databases a thing. Eventually, objects may not be accessed via any kind
of directory at all anymore, but they could instead be backed by any
kind of durable storage mechanism. While it seems quite far-fetched for
now, it is thinkable that eventually this might even be some form of a
database, for example.
As such, the current name of this structure will become worse over time
as we evolve into the direction of pluggable ODBs. Immediate next steps
will start to carve out proper self-contained object directories, which
requires us to pass in these object directories as parameters. Based on
our modern naming schema this means that those functions should then be
named after their subsystem, which means that we would start to bake the
current name into the codebase more and more.
Let's preempt this by renaming the structure. There have been a couple
alternatives that were discussed:
- `odb_backend` was discarded because it led to the association that
one object database has a single backend, but the model is that one
alternate has one backend. Furthermore, "backend" is more about the
actual backing implementation and less about the high-level concept.
- `odb_alternate` was discarded because it is a bit of a stretch to
also call the main object directory an "alternate".
Instead, pick `odb_source` as the new name. It makes it sufficiently
clear that there can be multiple sources and does not cause confusion
when mixed with the already-existing "alternate" terminology.
In the future, this change allows us to easily introduce for example a
`odb_files_source` and other format-specific implementations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `raw_object_store` structure is the central entry point for reading
and writing objects in a repository. The main purpose of this structure
is to manage object directories and provide an interface to access and
write objects in those object directories.
Right now, many of the functions associated with the raw object store
implicitly rely on `the_repository` to get access to its `objects`
pointer, which is the `raw_object_store`. As we want to generally get
rid of using `the_repository` across our codebase we will have to
convert this implicit dependency on this global variable into an
explicit parameter.
This conversion can be done by simply passing in an explicit pointer to
a repository and then using its `->objects` pointer. But there is a
second effort underway, which is to make the object subsystem more
selfcontained so that we can eventually have pluggable object backends.
As such, passing in a repository wouldn't make a ton of sense, and the
goal is to convert the object store interfaces such that we always pass
in a reference to the `raw_object_store` instead.
This will expose the `raw_object_store` type to a lot more callers
though, which surfaces that this type is named somewhat awkwardly. The
"raw_" prefix makes readers wonder whether there is a non-raw variant of
the object store, but there isn't. Furthermore, we nowadays want to name
functions in a way that they can be clearly attributed to a specific
subsystem, but calling them e.g. `raw_object_store_has_object()` is just
too unwieldy, even when dropping the "raw_" prefix.
Instead, rename the structure to `object_database`. This term is already
used a lot throughout our codebase, and it cannot easily be mistaken for
"object directories", either. Furthermore, its acronym ODB is already
well-known and works well as part of a function's name, like for example
`odb_has_object()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Support to create a loose object file with unknown object type has
been dropped.
* jk/no-funny-object-types:
object-file: drop support for writing objects with unknown types
hash-object: handle --literally with OPT_NEGBIT
hash-object: merge HASH_* and INDEX_* flags
hash-object: stop allowing unknown types
t: add lib-loose.sh
t/helper: add zlib test-tool
oid_object_info(): drop type_name strbuf
fsck: stop using object_info->type_name strbuf
oid_object_info_convert(): stop using string for object type
cat-file: use type enum instead of buffer for -t option
object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag
cat-file: make --allow-unknown-type a noop
object-file.h: fix typo in variable declaration
|
|
Code cleanup.
* jk/oidmap-cleanup:
raw_object_store: drop extra pointer to replace_map
oidmap: add size function
oidmap: rename oidmap_free() to oidmap_clear()
|
|
We provide a mechanism for callers to get the object type as a raw
string, rather than an object_type enum. This was in theory useful for
returning types that are not representable in the enum, but we consider
any such type to be an error, and there are no callers that use the
strbuf anymore.
Let's drop support to simplify the code a bit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In oid_object_info_convert(), we convert objects between their sha1 and
sha256 variants. To do this, we naturally need to know the type, which
we get from oid_object_info_extended() using its type_name strbuf
option.
But getting the value as a string (versus an object_type enum) is not
helpful. Since we do not allow unknown types, the regular enum is
sufficient. And the resulting code is a bit simpler, as we no longer
have to manage the extra allocation nor convert the string to an enum
ourselves.
Note that at first glance, it might seem like we should retain the error
check for "type == -1" to catch bogus types found by the underlying
parser. But we don't need it, as an unknown type would have yielded an
error from the call to oid_object_info_extended(), which would already
have caused us to return an error.
In fact, I suspect this was always impossible to trigger. Even when we
were converting the string to a type enum ourselves, an invalid type
would never have escaped oid_object_info_extended(), since we never
passed the (now removed) OBJECT_INFO_ALLOW_UNKNOWN_TYPE option.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We store the replacement data in an oidmap, which is itself a pointer in
the raw_object_store struct. But there's no need for an extra pointer
indirection here. It is always allocated and initialized along with the
containing struct, and we never check it for NULL-ness.
Let's embed the map directly in the struct, which is simpler and avoids
extra pointer chasing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This function does not free the oidmap struct itself; it just drops all
items from the map (using hashmap_clear_() internally). It should be
called oidmap_clear(), per CodingGuidelines.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the preceding commits we have converted all users of
`repo_has_object_file()` and its `_with_flags()` variant to instead use
`has_object()`. Drop these functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
favor of `has_object()`. There are a couple of slight benefits in favor
of the replacement:
- The new function has a short-and-sweet name.
- More explicit defaults: `has_object()` doesn't fetch missing objects
via promisor remotes, and neither does it reload packfiles if an
object wasn't found by default. This ensures that it becomes
immediately obvious when a simple object existence check may result
in expensive actions.
Most importantly though, it is confusing that we have two sets of
functions that ultimately do the same thing, but with different
defaults.
Start sunsetting `repo_has_object_file()` and its `_with_flags()`
sibling by replacing all callsites with `has_object()`:
- `repo_has_object_file(...)` is equivalent to
`has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`
is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.
The replacements should be functionally equivalent.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're about to fully remove `repo_has_object_file()` in favor of
`has_object()`. The latter function does not yet have a way to fetch
missing objects via a promisor remote though, which means that it cannot
fully replace all usecases of `repo_has_object_file()`.
Introduce a new flag `HAS_OBJECT_FETCH_PROMISOR` that causes the
function to optionally fetch missing objects which are part of a
promisor pack. This flag will be used in the subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `odb_pack_keep()` creates a file at the passed-in path. If
this fails, then the function re-tries by first creating any potentially
missing leading directories and then trying to create the file once
again. As such, this function doesn't host any kind of logic that is
specific to the object store, but is rather a generic helper function.
Rename the function to `safe_create_file_with_leading_directories()` and
move it into "path.c". While at it, refactor it so that it loses its
dependency on `the_repository`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `loose_object_path()` is a trivial wrapper around
`odb_loose_path()`, with the only exception that it always uses the
primary object database of the given repository. This doesn't really add
a ton of value though, so let's drop the function and inline it at every
callsite.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "object-store-ll.h" header has been introduced to keep transitive
header dependendcies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.
Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Cached objects are virtual objects that can be set up without writing
anything into the object store directly, which is used by git-blame(1)
to create fake commits for the working tree.
These cached objects are stored in a global variable, which is another
roadblock for libification of the object subsystem. Refactor the code so
that we instead store the array as part of the raw object store.
This refactoring raises the question whether virtual objects should
really be specific to a single repository (or rather a single object
store). Hypothetical usecases might for example span across submodules,
and here it may or may not be the right thing to provide virtual objects
across submodule boundaries.
The only existing usecase is git-blame(1) though, which does not know to
blame across submodule boundaries in the first place. As such, storing
these objects both globally and per-repository would achieve the same
result right now. But arguably, if we learned to blame across submodule
boundaries, we would likely want to create separate fare working tree
commits for each of the submodules so that the user can learn which
worktree a specific uncommitted change belongs to. And even if we would
want to create the same fake commit for each of the submodules we could
do that when storing separate virtual objects per object store.
While this is all rather hypothetical, the takeaway is that handling
virtual objects per-object store gives us more flexibility compared to
storing them globally. In a hypothetical future where we have achieved
full libification one might be able to handle unrelated repositories in
a single process, where the state of one repository should not have an
impact on the state of another repository. As such, storing these cached
objects per object store will enable more usecases and should lead to
less surprising outcomes overall.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Split out functions relating to the object store subsystem from
"object.c". This helps us to separate concerns.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we have the "object-store.h" header, most of the functionality for
object stores is actually hosted in "object-file.c". This makes it hard
to find relevant functions and causes us to mix up concerns.
Split out functions relating to the object store subsystem into a new
"object-store.c" file.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|