aboutsummaryrefslogtreecommitdiffstats
path: root/reftable/basics.c
AgeCommit message (Collapse)AuthorFilesLines
2025-10-07reftable: check for trailing newline in 'tables.list'Karthik Nayak1-13/+24
In the reftable format, the 'tables.list' file contains a newline separated list of tables. While we parse this file, we do not check or care about the last newline. Tighten the parser in `parse_names()` to return an appropriate error if the last newline is missing. This requires modification to `parse_names()` to now return the error while accepting the output as a third argument. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07reftable: fix formatting of the license headerPatrick Steinhardt1-6/+6
The license headers used across the reftable library doesn't follow our typical coding style for multi-line comments. Fix it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18reftable/basics: provide wrappers for big endian conversionPatrick Steinhardt1-19/+0
We're using a mixture of big endian conversion functions provided by both the reftable library, but also by the Git codebase. Refactor the code so that we exclusively use reftable-provided wrappers in order to untangle us from the Git codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/basics: adjust `hash_size()` to return `uint32_t`Patrick Steinhardt1-1/+1
The `hash_size()` function returns the number of bytes used by the hash function. Weirdly enough though, it returns a signed integer for its size even though the size obviously cannot ever be negative. The only case where it could be negative is if the function returned an error when asked for an unknown hash, but we assert(3p) instead. Adjust the type of `hash_size()` to be `uint32_t` and adapt all places that use signed integers for the hash size to follow suit. This also allows us to get rid of a couple asserts that we had which verified that the size was indeed positive, which further stresses the point that this refactoring makes sense. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/basics: adjust `common_prefix_size()` to return `size_t`Patrick Steinhardt1-5/+3
The `common_prefix_size()` function computes the length of the common prefix between two buffers. As such its return value will always be an unsigned integer, as the length cannot be negative. Regardless of that, the function returns a signed integer, which is nonsensical and causes a couple of -Wsign-compare warnings all over the place. Adjust the function to return a `size_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28reftable: handle realloc error in parse_names()René Scharfe1-1/+2
Check the final reallocation for adding the terminating NULL and handle it just like those in the loop. Simply use REFTABLE_ALLOC_GROW instead of keeping the REFTABLE_REALLOC_ARRAY call and adding code to preserve the original pointer value around it. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28reftable: fix allocation count on realloc errorRené Scharfe1-8/+3
When realloc(3) fails, it returns NULL and keeps the original allocation intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and the allocation count variable in that case, simultaneously leaking the original allocation and misrepresenting the number of storable items. parse_names() avoids the leak by keeping the original pointer if reallocation fails, but still increase the allocation count in such a case as if it succeeded. That's OK, because the error handling code just frees everything and doesn't look at names_cap anymore. reftable_buf_add() does the same, but here it is a problem as it leaves the reftable_buf in a broken state, with ->alloc being roughly twice as big as the actually allocated memory, allowing out-of-bounds writes in subsequent calls. Reimplement REFTABLE_ALLOC_GROW to avoid leaks, keep allocation counts in sync and still signal failures to callers while avoiding code duplication in callers. Make it an expression that evaluates to 0 if no reallocation is needed or it succeeded and 1 on failure while keeping the original pointer and allocation counter values. Adjust REFTABLE_ALLOC_GROW_OR_NULL to the new calling convention for REFTABLE_ALLOC_GROW, but keep its support for non-size_t alloc variables for now. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'ps/reftable-alloc-failures-zalloc-fix'Junio C Hamano1-0/+7
Recent reftable updates mistook a NULL return from a request for 0-byte allocation as OOM and died unnecessarily, which has been corrected. * ps/reftable-alloc-failures-zalloc-fix: reftable/basics: return NULL on zero-sized allocations reftable/stack: fix zero-sized allocation when there are no readers reftable/merged: fix zero-sized allocation when there are no readers reftable/stack: don't perform auto-compaction with less than two tables
2024-12-22reftable/basics: return NULL on zero-sized allocationsPatrick Steinhardt1-0/+7
In the preceding commits we have fixed a couple of issues when allocating zero-sized objects. These issues were masked by implementation-defined behaviour. Quoting malloc(3p): If size is 0, either: * A null pointer shall be returned and errno may be set to an implementation-defined value, or * A pointer to the allocated space shall be returned. The application shall ensure that the pointer is not used to access an object. So it is perfectly valid that implementations of this function may or may not return a NULL pointer in such a case. Adapt both `reftable_malloc()` and `reftable_realloc()` so that they return NULL pointers on zero-sized allocations. This should remove any implementation-defined behaviour in our allocators and thus allows us to detect such platform-specific issues more easily going forward. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19reftable/system: stop depending on "hash.h"Patrick Steinhardt1-6/+7
We include "hash.h" in "reftable/system.h" such that we can use hash format IDs as well as the raw size of SHA1 and SHA256. As we are in the process of converting the reftable library to become standalone we of course cannot rely on those constants anymore. Introduce a new `enum reftable_hash` to replace internal uses of the hash format IDs and new constants that replace internal uses of the hash size. Adapt the reftable backend to set up the correct hash function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-17reftable: convert from `strbuf` to `reftable_buf`Patrick Steinhardt1-1/+1
Convert the reftable library to use the `reftable_buf` interface instead of the `strbuf` interface. This is mostly a mechanical change via sed(1) with some manual fixes where functions for `strbuf` and `reftable_buf` differ. The converted code does not yet handle allocation failures. This will be handled in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17reftable/basics: provide new `reftable_buf` interfacePatrick Steinhardt1-0/+74
Implement a new `reftable_buf` interface that will replace Git's own `strbuf` interface. This is done due to three reasons: - The `strbuf` interfaces do not handle memory allocation failures and instead causes us to die. This is okay in the context of Git, but is not in the context of the reftable library, which is supposed to be usable by third-party applications. - The `strbuf` interface is quite deeply tied into Git, which makes it hard to use the reftable library as a standalone library. Any dependent would have to carefully extract the relevant parts of it to make things work, which is not all that sensible. - The `strbuf` interface does not use the pluggable allocators that can be set up via `reftable_set_alloc()`. So we have good reasons to use our own type, and the implementation is rather trivial. Implement our own type. Conversion of the reftable library will be handled in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-04reftable/basics: fix segfault when growing `names` array failsPatrick Steinhardt1-2/+4
When growing the `names` array fails we would end up with a `NULL` pointer. This causes two problems: - We would run into a segfault because we try to free names that we have assigned to the array already. - We lose track of the old array and cannot free its contents. Fix this issue by using a temporary variable. Like this we do not clobber the old array that we tried to reallocate, which will remain valid when a call to realloc(3P) fails. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02reftable/basics: ban standard allocator functionsPatrick Steinhardt1-0/+1
The reftable library uses pluggable allocators, which means that we shouldn't ever use the standard allocator functions. But it is an easy mistake to make to accidentally use e.g. free(3P) instead of the reftable-specific `reftable_free()` function, and we do not have any mechanism to detect this misuse right now. Introduce a couple of macros that ban the standard allocators, similar to how we do it in "banned.h". Note that we do not ban the following two classes of functions: - Macros like `FREE_AND_NULL()` or `REALLOC_ARRAY()`. As those expand to code that contains already-banned functions we'd get a compiler error even without banning those macros explicitly. - Git-specific allocators like `xmalloc()` and friends. The primary reason is that there are simply too many of them, so we're rather aiming for best effort here. Furthermore, the eventual goal is to make them unavailable in the reftable library place by not pulling them in via "git-compat-utils.h" anymore. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02reftable/basics: handle allocation failures in `parse_names()`Patrick Steinhardt1-4/+16
Handle allocation failures in `parse_names()` by returning `NULL` in case any allocation fails. While at it, refactor the function to return the array directly instead of assigning it to an out-pointer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02reftable/basics: handle allocation failures in `reftable_calloc()`Patrick Steinhardt1-3/+10
Handle allocation failures in `reftable_calloc()`. While at it, remove our use of `st_mult()` that would cause us to die on an overflow. From the caller's point of view there is not much of a difference between arguments that are too large to be multiplied and a request that is too big to handle by the allocator: in both cases the allocation cannot be fulfilled. And in neither of these cases do we want the reftable library to die. While we could use `unsigned_mult_overflows()` to handle the overflow gracefully, we instead open-code it to further our goal of converting the reftable codebase to become a standalone library that can be reused by external projects. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02reftable: introduce `reftable_strdup()`Patrick Steinhardt1-0/+10
The reftable library provides the ability to swap out allocators. There is a gap here though, because we continue to use `xstrdup()` even in the case where all the other allocators have been swapped out. Introduce `reftable_strdup()` that uses `reftable_malloc()` to do the allocation. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02reftable/basics: merge "publicbasics" into "basics"Patrick Steinhardt1-0/+55
The split between "basics" and "publicbasics" is somewhat arbitrary and not in line with how we typically structure code in the reftable library. While we do indeed split up headers into a public and internal part, we don't do that for the compilation unit itself. Furthermore, the declarations for "publicbasics.c" are in "reftable-malloc.h", which isn't in line with our naming schema, either. Fix these inconsistencies by: - Merging "publicbasics.c" into "basics.c". - Renaming "reftable-malloc.h" to "reftable-basics.h" as the public header. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07global: improve const correctness when assigning string constantsPatrick Steinhardt1-9/+6
We're about to enable `-Wwrite-strings`, which changes the type of string constants to `const char[]`. Fix various sites where we assign such constants to non-const variables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03reftable/block: fix error handling when searching restart pointsPatrick Steinhardt1-1/+4
When doing the binary search over restart points in a block we need to decode the record keys. This decoding step can result in an error when the block is corrupted, which we indicate to the caller of the binary search by setting `args.error = 1`. But the only caller that exists mishandles this because it in fact performs the error check before calling `binsearch()`. Fix this bug by checking for errors at the right point in time. Furthermore, refactor `binsearch()` so that it aborts the search in case the callback function returns a negative value so that we don't needlessly continue to search the block. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03reftable/basics: fix return type of `binsearch()` to be `size_t`Patrick Steinhardt1-1/+1
The `binsearch()` function can be used to find the first element for which a callback functions returns a truish value. But while the array size is of type `size_t`, the function in fact returns an `int` that is supposed to index into that array. Fix the function signature to return a `size_t`. This conversion does not change any semantics given that the function would only ever return a value in the range `[0, sz]` anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06reftable/stack: use `size_t` to track stack lengthPatrick Steinhardt1-4/+3
While the stack length is already stored as `size_t`, we frequently use `int`s to refer to those stacks throughout the reftable library. Convert those cases to use `size_t` instead to make things consistent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06reftable: introduce macros to grow arraysPatrick Steinhardt1-6/+2
Throughout the reftable library we have many cases where we need to grow arrays. In order to avoid too many reallocations, we roughly double the capacity of the array on each iteration. The resulting code pattern is duplicated across many sites. We have similar patterns in our main codebase, which is why we have eventually introduced an `ALLOC_GROW()` macro to abstract it away and avoid some code duplication. We cannot easily reuse this macro here though because `ALLOC_GROW()` uses `REALLOC_ARRAY()`, which in turn will call realloc(3P) to grow the array. The reftable code is structured as a library though (even if the boundaries are fuzzy), and one property this brings with it is that it is possible to plug in your own allocators. So instead of using realloc(3P), we need to use `reftable_realloc()` that knows to use the user-provided implementation. So let's introduce two new macros `REFTABLE_REALLOC_ARRAY()` and `REFTABLE_ALLOC_GROW()` that mirror what we do in our main codebase, with two modifications: - They use `reftable_realloc()`, as explained above. - They use a different growth factor of `2 * cap + 1` instead of `(cap + 16) * 3 / 2`. The second change is because we know a bit more about the allocation patterns in the reftable library. In most cases, we end up only having a handful of items in the array and don't end up growing them. The initial capacity that our normal growth factor uses (which is 24) would thus end up over-allocating in a lot of code paths. This effect is measurable: - Before change: HEAP SUMMARY: in use at exit: 671,983 bytes in 152 blocks total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,402 bytes allocated - After change with a growth factor of `(2 * alloc + 1)`: HEAP SUMMARY: in use at exit: 671,983 bytes in 152 blocks total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,410 bytes allocated - After change with a growth factor of `(alloc + 16)* 2 / 3`: HEAP SUMMARY: in use at exit: 671,983 bytes in 152 blocks total heap usage: 3,833,673 allocs, 3,833,521 frees, 4,728,251,742 bytes allocated While the total heap usage is roughly the same, we do end up allocating significantly more bytes with our usual growth factor (in fact, roughly 21 times as many). Convert the reftable library to use these new macros. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: utility functionsHan-Wen Nienhuys1-0/+128
This commit provides basic utility classes for the reftable library. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>