I am very new to Zig (working through Advent Of Code in it), and I am very confused by its handling of strings (or, I should say, []u8s) as function arguments and return types.

TL;DR what is the correct implementation of the following function?

fn doIt(string: []u8) []u8 {
    return "prefix" ++ string;
}

First attempts

I would expect the following test to pass:

fn doIt(string: []u8) []u8 {
    return "prefix" ++ string;
}

const std = @import("std");
const expect = std.testing.expect;

test {
    try expect(std.mem.eql(u8, doIt("foo"), "prefixfoo"));
}

but instead zig test gives:

scratch.zig:27:33: error: expected type '[]u8', found '*const [3:0]u8'
    try expect(std.mem.eql(u8, doIt("foo"), "prefixfoo"));
                                    ^~~~~
scratch.zig:27:33: note: cast discards const qualifier
scratch.zig:20:17: note: parameter type declared here
fn doIt(string: []u8) []u8 {

OK, so the error message seems clear - I need to change the signature of the function to accept a pointer. I don't know why I'm not allowed to pass a ~~string~~ []u8-literal directly, but let's trust the compiler and try it:

// Let's not worry, for the moment, about the fact that we're writing a function which
// can only accept strings of length 3...
fn doIt(string: *const [3:0]u8) []u8 {
    return "prefix" ++ string;
}
...

giving

scratch.zig:23:21: error: expected type '[]u8', found '*const [9:0]u8'
    return "prefix" ++ string;
           ~~~~~~~~~^~~~~~~~~
scratch.zig:23:21: note: cast discards const qualifier
scratch.zig:20:33: note: function return type declared here
fn doIt(string: *const [3:0]u8) []u8 {
                                ^~~~

OK, the addition of two pointers-to-arrays results in a pointer to the result. That makes sense. I didn't want to be dealing with pointers in the first place - but since I was forced into "pointer-land", I can understand that the output of an operation there would also be a pointer. So, presumably, we just use .*, a.k.a. pointer dereferencing, to return the actual value (a []u8), then?

fn doIt(string: *const [3:0]u8) []u8 {
    return ("prefix" ++ string).*;
}

giving

scratch.zig:21:32: error: array literal requires address-of operator (&) to coerce to slice type '[]u8'
    return ("prefix" ++ string).*;

...how can the address-of operator coerce a pointer into an object? Isn't that the inverse of what that operator does? But ok, let's try it...

fn doIt(string: *const [3:0]u8) []u8 {
    return &("prefix" ++ string).*;
}

giving

scratch.zig:21:12: error: expected type '[]u8', found '*const [9:0]u8'
    return &("prefix" ++ string).*;
           ^~~~~~~~~~~~~~~~~~~~~~~
scratch.zig:21:12: note: cast discards const qualifier
scratch.zig:20:33: note: function return type declared here
fn doIt(string: *const [3:0]u8) []u8 {

...I give up, I must be misunderstanding something. Can anyone point (a-ha) me in the right direction?

Embrace the pointers

Taking a different tack, if we change the function's return type to be a pointer, there are still problems ahead:

fn doIt(string: *const [3:0]u8) *[]u8 {
    return "prefix" ++ string;
}

const std = @import("std");
const expect = std.testing.expect;

test {
    try expect(std.mem.eql(u8, doIt("foo"), "prefixfoo"));
}

giving

scratch.zig:21:21: error: expected type '*[]u8', found '*const [9:0]u8'
    return "prefix" ++ string;
           ~~~~~~~~~^~~~~~~~~
scratch.zig:21:21: note: cast discards const qualifier
scratch.zig:20:33: note: function return type declared here
fn doIt(string: *const [3:0]u8) *[]u8 {
                                ^~~~~
scratch.zig:27:32: error: expected type '[]const u8', found '*[]u8'
    try expect(std.mem.eql(u8, doIt("foo"), "prefixfoo"));
                               ~~~~^~~~~~~
/Users/scubbo/zig/zig-macos-x86_64-0.14.0-dev.2362+a47aa9dd9/lib/std/mem.zig:658:33: note: parameter type declared here
pub fn eql(comptime T: type, a: []const T, b: []const T) bool {

The compiler suggests that I should make the return type of my function *const [9:0]u8. Which, with a little tweaking...still fails, in an even more surprising way:

fn doIt(string: *const [3:0]u8) *const [9:0]u8 {
    return "prefix" ++ string;
}

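const std = @import("std");
const expect = std.testing.expect;
const print = std.debug.print; // assumed: the original snippet does not show where print comes from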

test {
    for (doIt("foo")) |char| {print("{c}", .{char});}
    print("\n", .{});
    for ("prefixfoo") |char| {print("{c}", .{char});}
    print("\n", .{});
    try expect(std.mem.eql(u8, doIt("foo"), "prefixfoo"));
}

giving

pefixfoo
prefixfoo
1/1 scratch.test_0...FAIL (TestUnexpectedResult)
/Users/scubbo/zig/zig-macos-x86_64-0.14.0-dev.2362+a47aa9dd9/lib/std/testing.zig:546:14: 0x10846a78f in expect (test)
    if (!ok) return error.TestUnexpectedResult;
             ^
/Users/scubbo/Code/advent-of-code-2024/scratch.zig:31:5: 0x10846a936 in test_0 (test)
    try expect(std.mem.eql(u8, doIt("foo"), "prefixfoo"));
    ^
0 passed; 0 skipped; 1 failed.
error: the following test command failed with exit code 1:
/Users/scubbo/.cache/zig/o/1bb299b096246ee4dc2c6057c3d21f46/test --seed=0xc38b771a

That is not a typo or copy-paste mistake. The character-by-character printing of the output of return "prefix" ++ string; is pefixfoo. I could maybe understand the final character getting dropped somehow if I'd sized the array wrongly (though, see the next section), or the first character getting dropped for...some reason...but what could make the second character get dropped?

Flexibility of function inputs

And that's leaving aside the fact that a function signature of (string: *const [3:0]u8) *const[9:0]u8 would not, presumably, be able to accept a string of length 4. Hardly a multipurpose function!

  • This question is a mess. Can you reduce it to what's necessary? The answer you accepted only refers to the very first example; anything after that seems extraneous. Commented Dec 24, 2024 at 23:46
  • In my experience, understanding the misunderstandings and wrong turns that a querent has made is helpful in resolving the issue. The accepted answer correctly and helpfully answered my TL;DR (which is, in fact, all that's "necessary"), but if the answer hadn't been so simple, then knowing what I'd tried would have helped answerers know which areas would be fruitful or what kinds of explanation are relevant. Simply asking "how to concatenate two strings in Zig" wouldn't have conveyed any information about what I didn't know. Commented Dec 27, 2024 at 0:50

2 Answers

The ++ operator only works on arrays with comptime-known sizes. But you clearly want the function to be fully usable at runtime.
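For contrast, here is a minimal sketch of the one case where ++ does work: when the argument itself is comptime-known. The name doItComptime is hypothetical and not from the question; the sketch assumes every caller passes a compile-time-known string, which is exactly the restriction the runtime version has to avoid.

```zig
const std = @import("std");

// Hypothetical comptime-only variant: `string` must be known at compile time,
// so the concatenated result lives in static memory and needs no allocator.
fn doItComptime(comptime string: []const u8) []const u8 {
    return "prefix" ++ string;
}

test "comptime concatenation with ++" {
    try std.testing.expectEqualStrings("prefixfoo", doItComptime("foo"));
}
```

Note that the parameter and return types are []const u8 rather than []u8: string literals are constant, which is what the "cast discards const qualifier" notes in the question refer to.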

This means that you need to be able to answer the question: where does your function get the memory for the new string? Idiomatically, the function would take an allocator, for example:

fn doIt(allocator: std.mem.Allocator, string: []const u8) ![]u8 {
    const prefix = "prefix";
    const new_string = try allocator.alloc(u8, prefix.len + string.len);
    @memcpy(new_string[0..prefix.len], prefix);
    @memcpy(new_string[prefix.len..], string);
    return new_string;
}

And you need to free the new string after you're done using it.
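As a sketch of what the calling side could look like, the test below repeats the function above and pairs the call with a defer that frees the result. Using std.testing.allocator is an assumption made for illustration; it reports leaked allocations when a test finishes, so a forgotten free shows up as a test failure.

```zig
const std = @import("std");

fn doIt(allocator: std.mem.Allocator, string: []const u8) ![]u8 {
    const prefix = "prefix";
    const new_string = try allocator.alloc(u8, prefix.len + string.len);
    @memcpy(new_string[0..prefix.len], prefix);
    @memcpy(new_string[prefix.len..], string);
    return new_string;
}

test "caller owns and frees the result" {
    const allocator = std.testing.allocator;
    const result = try doIt(allocator, "foo");
    // The caller owns the returned slice and releases it when done.
    defer allocator.free(result);
    try std.testing.expectEqualStrings("prefixfoo", result);
}
```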

Building on the answer by sigod, you can use the concat function to reduce code and do efficient memcpy under the hood:

fn doIt(allocator: std.mem.Allocator, string: []const u8) ![]u8 {
    return std.mem.concat(allocator, u8, &[_][]const u8{ "prefix", string });
}

The code has Zig infer the length of the array at compile time using [_]. We take the address of the array with & because std.mem.concat expects a slice of slices ([]const []const u8): the literal is an array, not a slice, and a pointer to an array coerces to a slice.
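The following sketch exercises the concat version with inputs of two different lengths, which addresses the question's concern about a signature fixed to three-character strings. The local name parts and the use of std.testing.allocator are illustrative assumptions, not part of the answer above.

```zig
const std = @import("std");

fn doIt(allocator: std.mem.Allocator, string: []const u8) ![]u8 {
    // `parts` is an array of two slices; `&parts` coerces to `[]const []const u8`.
    const parts = [_][]const u8{ "prefix", string };
    return std.mem.concat(allocator, u8, &parts);
}

test "concat accepts inputs of any length" {
    const allocator = std.testing.allocator;

    const three = try doIt(allocator, "foo");
    defer allocator.free(three);
    try std.testing.expectEqualStrings("prefixfoo", three);

    const four = try doIt(allocator, "food");
    defer allocator.free(four);
    try std.testing.expectEqualStrings("prefixfood", four);
}
```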
