Last Time
In our previous post, we explored the problem of building Rust packages in a contained environment and made some first steps toward a solution. In this post we’ll take those ideas a little further and spend some time looking at Cargo’s dependency handling.
Rather than the slightly insipid example we used in the previous post, this time we’re going to struggle with the openssl crate. Not only is this crate significantly more complicated, which lets us test the robustness of our approach, it also lets us show off one of the primary advantages of integrating Cargo with Nix in the first place: we can entrust the provisioning of native libraries to Nix and have our Rust programs run on any system without worrying about which native dependencies are or aren’t pre-installed.
Removing Dependencies
If we take our attempt from the last post and apply it to the openssl crate, we’ll immediately run into some trouble:
openssl = mkRustCrate rec {
  name = "openssl";
  version = "0.10.15";
  src = fetchFromCratesIo {
    inherit name version;
    sha256 = "0fj5g66ibkyb6vfdfjgaypfn45vpj2cdv7d7qpq653sv57glcqri";
  };
};
error: the lock file /tmp/nix-build-openssl.drv-0/openssl-0.10.15.crate/Cargo.lock needs to be updated but --frozen was passed to prevent this
To avoid Cargo trying to update its lockfile, we’ll modify its inputs to remove any mention of dependencies.
The first part of that is just to replace Cargo.lock with a more innocuous version of our own, which is simple enough: let’s create a new file called Cargo.lock next to our Nix expression:
[root]
name = "@crateName@"
version = "@crateVersion@"
and then we can just use substituteAll to substitute in our crate-specific values at a later date.
lockFile = substituteAll {
  src = ./Cargo.lock;
  crateName = name;
  crateVersion = version;
};
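To make the templating concrete, the sketch below imitates the effect of substituteAll on our lock file using sed. It is an approximation for illustration only: the render_lock_file function is hypothetical, and the real substituteAll replaces every @variable@ it is given rather than just these two.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for what substituteAll does to ./Cargo.lock:
# replace the @crateName@ and @crateVersion@ placeholders with real values.
render_lock_file() {
    local crate_name="$1" crate_version="$2"
    sed -e "s/@crateName@/$crate_name/g" \
        -e "s/@crateVersion@/$crate_version/g" <<'EOF'
[root]
name = "@crateName@"
version = "@crateVersion@"
EOF
}

# Print the lock file that the openssl derivation would receive.
render_lock_file openssl 0.10.15
```

The rendered file names only the crate itself, so Cargo has nothing to resolve and no reason to touch the lock file.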
The next thing that makes Cargo go and try to fetch dependencies is Cargo.toml, and removing the dependencies from that is a little more delicate, since we need the rest of the file in order for Cargo to do its job properly. Unfortunately there aren’t many tools for manipulating TOML, but there does exist remarshal, which can convert TOML to JSON quite happily, and jq, which can do just about anything you might need to do to JSON. So let’s add a little conversion step to our build.sh from last time:
cp $lockFile Cargo.lock
$remarshal -if toml -of json -o Cargo.json Cargo.toml
$jq -f $cargoFilter < Cargo.json \
  | $remarshal -if json -of toml -o Cargo.toml
While we’re here we’ll also extract the value of the crate’s links property from the Cargo.toml, if one exists.
export CARGO_LINKS=$($jq -r '.package.links // empty' < Cargo.json)
It’s a testament to the power of jq that our filter is very simple:
def all_dependencies:
  .["dependencies", "dev-dependencies", "build-dependencies"];
def optional_dependencies:
  [ all_dependencies | select(.) ]
  | add // {}
  | map_values(select(type == "object" and .optional));
def augment_features:
  .features = (.features + optional_dependencies);
def remove_dependencies:
  del(all_dependencies)
  | .target |= ((. // {}) | map_values(del(all_dependencies)));
def remove_feature_dependencies:
  .features |= map_values([]);
augment_features
| remove_feature_dependencies
| remove_dependencies
This filter does only two things: it removes any dependencies from the dependencies or target.*.dependencies sections, and it takes any optional dependencies and converts them into features. An optional dependency in Cargo is essentially a feature flag for which Cargo promises to provide the appropriate dependency; if we simply remove the offending dependency then Cargo will complain that the package’s caller has specified an invalid feature, and the crate itself will be unable to tell whether it should use that dependency or not. Instead, by replacing it with a no-op feature, we allow the crate to continue to test for the feature. The optional dependency itself, of course, will be provided by us.
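To see the effect on a concrete manifest, here is a small sketch that runs the same filter over an invented JSON manifest (the form Cargo.toml takes after remarshal). The strip_dependencies and sample_manifest functions, and the manifest contents, are made up for illustration; the sketch assumes jq is on the PATH.

```shell
#!/usr/bin/env bash
# Requires jq. The filter is the one from this post, inlined so that the
# sketch is self-contained.
strip_dependencies() {
    jq -c '
        def all_dependencies:
          .["dependencies", "dev-dependencies", "build-dependencies"];
        def optional_dependencies:
          [ all_dependencies | select(.) ]
          | add // {}
          | map_values(select(type == "object" and .optional));
        def augment_features:
          .features = (.features + optional_dependencies);
        def remove_dependencies:
          del(all_dependencies)
          | .target |= ((. // {}) | map_values(del(all_dependencies)));
        def remove_feature_dependencies:
          .features |= map_values([]);
        augment_features
        | remove_feature_dependencies
        | remove_dependencies
    '
}

# An invented manifest: one plain dependency, one optional dependency,
# one feature that refers to it, and one target-specific dependency.
sample_manifest() {
    cat <<'EOF'
{
  "package": { "name": "demo", "version": "0.1.0" },
  "dependencies": {
    "libc": "0.2",
    "serde": { "version": "1.0", "optional": true }
  },
  "features": { "default": ["serde"] },
  "target": { "cfg(unix)": { "dependencies": { "nix": "0.11" } } }
}
EOF
}

sample_manifest | strip_dependencies
```

In the result, the top-level and target-specific dependency tables are gone, while the features table contains both default and serde, each mapped to an empty list. The crate can therefore still be built with the serde feature enabled even though Cargo no longer knows about the serde dependency.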
Dependencies in Cargo
First, let’s take a break for a quick recap of how Cargo handles dependencies. Note that all details in this section are informative about how Cargo handles dependencies at the time of writing on Linux x86_64, and shouldn’t be taken as any kind of prescriptive specification of what Cargo will always do. For the party line, read the Rust and Cargo books, e.g. the chapter on linkage.
Linkage
Primarily, Cargo library crates are provided as rlib files. rlib files are essentially just static libraries, with additional metadata about types, traits, and generic functions; they fulfil the rôle of header plus static library from a C program. Just like static C libraries, it is not conventional to include code from upstream dependencies in an rlib. The final linkage and symbol resolution happens only when the time comes to link an rlib into a binary, and until that time we must aggregate a list of all the dependencies we’ve seen so far.
rustc supports (as far as I can tell) two flags that tell it where to find a dependency in crate form. rustc -L /path/to/library (or, more specifically, rustc -L crate=/path/to/library) will cause rustc, if it needs a crate named foo-bar, to search for /path/to/library/libfoo_bar.rlib or /path/to/library/libfoo_bar-*.rlib. This form is used to provide indirect dependencies. Meanwhile, for direct dependencies of the current compilation (accessed by extern crate foo_bar;) rustc supports the rustc --extern foo_bar=/path/to/libfoo_bar.rlib form, which not only links against /path/to/libfoo_bar.rlib, but also uses the metadata contained within the rlib to decide what symbols are made available from the extern crate.
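As a small illustration of the naming rule above, the following sketch builds the --extern flag for a direct dependency from its crate name and library directory. The extern_flag helper is hypothetical and exists only to show the hyphen-to-underscore mapping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the --extern flag for a direct dependency.
# Crate names may contain hyphens, but the library file name and the name
# used in `extern crate` use underscores instead.
extern_flag() {
    local crate="$1" lib_dir="$2"
    local lib_name
    lib_name=$(printf '%s' "$crate" | tr '-' '_')
    printf '%s %s=%s/lib%s.rlib\n' --extern "$lib_name" "$lib_dir" "$lib_name"
}

# Prints the flag for a crate named foo-bar whose rlib lives in /path/to.
extern_flag foo-bar /path/to
```

For foo-bar this yields --extern foo_bar=/path/to/libfoo_bar.rlib, matching the form described above.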
It’s interesting (and later important) to note that part of the metadata contained in the rlib is a string that is used to mangle the symbols in that library. When invoked from Cargo, this string is a short hash calculated from the crate metadata. This enables Cargo to link in different versions of crates willy-nilly with no concern for symbol clashes. When compiling directly against a crate, the rlib metadata is used to reconstruct the specific symbols against which the crate is to be built.
Dependency Types
Cargo supports three classes of dependency, plus target-specific dependencies: a dependency may be a runtime dependency (or simply ‘dependency’), a build dependency, or a development dependency. Runtime dependencies are available for use in the program and are propagated to any dependants of the crate until a binary crate is reached and the whole dependency tree is linked in. Build dependencies are available only in build scripts; they are neither available in the source itself nor propagated to dependants. Development dependencies are also not propagated, but are available in the source itself, essentially feature-gated by the test configuration option. They, as well as the option, are provided when running cargo test, cargo bench, or cargo run --example. In this post we will be looking only at dependencies and build dependencies, which is sufficient to build openssl.
Build Scripts
In addition to receiving special build dependencies, build scripts can print a variety of commands to standard output that instruct Cargo to modify its compilation or linking behaviour. Most of these we can ignore, but a couple need to propagate through the dependency tree. These we will translate into a bash script that can be loaded by our build script in preparation for building. Remember earlier when I said we needed to keep the value of the package.links key around from Cargo.toml? This is it: that key is used to determine the name of the environment variable used by Cargo to pass info to dependants, a behaviour we replicate here.
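function parse_depinfo {
  # Re-emit the link flags accumulated so far, shell-quoted.
  printf 'NIX_RUST_LINK_FLAGS=%q\n' "${NIX_RUST_LINK_FLAGS-}"
  cat "$@" | while read -r line
  do
    # Only lines of the form cargo:KEY=VALUE are of interest.
    [[ "x$line" =~ xcargo:([^=]+)=(.*) ]] || continue
    local key="${BASH_REMATCH[1]}"
    local val="${BASH_REMATCH[2]}"
    case $key in
      # Directives that only affect the crate being built are dropped.
      rustc-link-lib) ;&
      rustc-flags) ;&
      rustc-cfg) ;&
      rustc-env) ;&
      rerun-if-changed) ;&
      rerun-if-env-changed) ;&
      warning)
        ;;
      # Library search paths must reach every dependant.
      rustc-link-search)
        printf 'NIX_RUST_LINK_FLAGS+=" "-L%q\n' "$val"
        ;;
      # Anything else becomes DEP_<LINKS>_<KEY> for dependants.
      *)
        printf 'export DEP_%s_%s=%q\n' \
          "$(upper $CARGO_LINKS)" \
          "$(upper $key)" \
          "$val"
        ;;
    esac
  done
}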
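As a worked example of the catch-all branch, the sketch below turns a single build-script output line into the variable a dependant would see. The dep_var_for function and the path are invented for illustration; unlike parse_depinfo it prints the value verbatim rather than shell-quoting it with %q, and it does not filter out the directives that parse_depinfo ignores.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the catch-all branch of parse_depinfo:
# turn one `cargo:KEY=VALUE` line into an exported DEP_<LINKS>_<KEY> variable.
dep_var_for() {
    local links="$1" line="$2"
    local rest key val
    case $line in
        cargo:*=*) ;;
        *) return 1 ;;          # not a Cargo directive
    esac
    rest=${line#cargo:}         # KEY=VALUE
    key=${rest%%=*}             # text before the first '='
    val=${rest#*=}              # text after the first '='
    # Upper-case the names and map hyphens to underscores.
    printf 'export DEP_%s_%s=%s\n' \
        "$(printf '%s' "$links" | tr 'a-z-' 'A-Z_')" \
        "$(printf '%s' "$key" | tr 'a-z-' 'A-Z_')" \
        "$val"
}

# A crate with links = "openssl" whose build script prints an include path.
dep_var_for openssl 'cargo:include=/opt/openssl/include'
```

With links set to openssl, the include key ends up as DEP_OPENSSL_INCLUDE, which is the name a dependant’s build script would look for.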
Providing Dependencies
Dependency provision is done in three parts. In build.sh, we go through all the Nix store paths provided in dependencies and buildDependencies, which are available to us as environment variables, and collect any Rust libraries we might find into two directories called deps and build_deps, respectively, reading their dependency info as we go into RUSTFLAGS, BUILD_RUSTFLAGS, depFlags, and buildDepFlags:
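function add_deps {
  local dep_type=$1; shift
  local dep_dir=$1; shift
  local dep_flags=$1; shift
  for dep in ${!dep_type}
  do
    # Direct dependency: hand its rlib to rustc via --extern.
    printf -v$dep_flags "%s --extern %s=%s" \
      "${!dep_flags}" \
      "$(crate_name $dep)" \
      "$dep/lib/lib$(crate_name $dep).rlib"
    # Carry along the dependency's own (indirect) dependencies, if any.
    stat $dep/lib/deps/* &>/dev/null \
      && cp -dn $dep/lib/deps/* deps \
      || :
    # Load the link flags and DEP_* variables recorded for this crate.
    for depinfo in $dep/lib/*.depinfo
    do
      source $depinfo
    done
    # Make the rlib discoverable under a name that includes its hash.
    for lib in $dep/lib/*.rlib
    do
      local dest=$(basename $lib .rlib)-$(crate_hash $dep).rlib
      copy_or_link "$lib" "$dep_dir/$dest"
    done
  done
}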
mkdir deps
add_deps dependencies deps depFlags
RUSTFLAGS="-Ldependency=deps $NIX_RUST_LINK_FLAGS"
link_flags="$NIX_RUST_LINK_FLAGS"
NIX_RUST_LINK_FLAGS=
mkdir build_deps
add_deps buildDependencies build_deps buildDepFlags
BUILD_RUSTFLAGS="-Ldependency=build_deps $NIX_RUST_LINK_FLAGS"
NIX_RUST_LINK_FLAGS="$link_flags"
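add_deps leans on helpers such as crate_name and crate_hash that live in our utilities file and aren’t shown in this post. Purely as an assumption about their shape, and not the project’s actual implementation, they could be derived from the Nix store path, supposing each crate’s output is named <name>-<version>. The store hash in the usage line is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical versions of two helpers used by add_deps. They assume store
# paths of the form /nix/store/<hash>-<name>-<version>; the real helpers may
# well differ.

# Library name of the crate: strip the hash and the version, then map
# hyphens to underscores.
crate_name() {
    local base="${1##*/}"       # <hash>-<name>-<version>
    base="${base#*-}"           # <name>-<version>
    base="${base%%-[0-9]*}"     # <name>
    printf '%s\n' "$base" | tr '-' '_'
}

# Hash component of the store path.
crate_hash() {
    local base="${1##*/}"
    printf '%s\n' "${base%%-*}"
}

# Placeholder store path, for illustration only.
dep=/nix/store/0123456789abcdfghijklmnpqrsvwxyz-openssl-sys-0.9.39
echo "--extern $(crate_name "$dep")=$dep/lib/lib$(crate_name "$dep").rlib"
```

Under these assumptions the openssl-sys output would be passed to rustc as the crate openssl_sys, and the same store hash would be used to disambiguate the copied rlib files.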
If there were a way to separate the flags passed to rustc for program vs build script builds, we would be essentially done. However, there isn’t, and both are invoked from the same cargo build call, which gives us no chance to change the variables in between. Our trick is therefore to wrap RUSTC and RUSTDOC in a small script. This script does two things. Firstly, it scans the arguments passed in for mentions of the mangling nonce and replaces it with the Nix hash of the derivation; since we have removed all the dependency info from the crate, the Cargo hash is significantly less useful for preventing name clashes than it used to be. Secondly, it checks whether the file being compiled is a build script, and if so adds in the build dependencies.
#!@bash@/bin/bash
source $utils
isBuildScript=
args=("$@")
for i in "${!args[@]}"
do
  # Swap Cargo's mangling nonce for the Nix hash of this derivation.
  if [ "x${args[$i]::9}" = "xmetadata=" ]
  then
    args[$i]=metadata=$(crate_hash $out)
  # Cargo gives build scripts a crate name starting with build_script_.
  elif [ "x${args[$i]}" = "x--crate-name" ] \
    && [ "x${args[$i+1]::13}" = "xbuild_script_" ]
  then
    isBuildScript=1
  fi
done
if [ "$isBuildScript" ]
then
  depFlags+=" $buildDepFlags $BUILD_RUSTFLAGS"
fi
# Log the final command line to stderr, then run it.
>&2 echo @cmd@ $depFlags "${args[@]}"
env @cmd@ $depFlags "${args[@]}"
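The build-script check is the part of the wrapper most worth testing in isolation. Here is a minimal standalone sketch of it; the is_build_script function and the sample argument lists are invented for illustration and are much shorter than a real rustc invocation from Cargo.

```shell
#!/usr/bin/env bash
# Hypothetical, standalone version of the wrapper's detection logic.
# The argument that follows --crate-name starts with "build_script_" when
# Cargo is compiling a build script, so that is what we look for.
is_build_script() {
    local prev=
    local arg
    for arg in "$@"
    do
        if [ "$prev" = "--crate-name" ]
        then
            case $arg in
                build_script_*) return 0 ;;
            esac
        fi
        prev=$arg
    done
    return 1
}

is_build_script --crate-name build_script_build build.rs && echo "build script"
is_build_script --crate-name openssl src/lib.rs || echo "library"
```

Only the first invocation would have the build dependency flags appended; the second is compiled with the ordinary dependency flags alone.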
All together, this is sufficient for us to compile the openssl crate with all its dependencies!
openssl = mkRustCrate rec {
  name = "openssl";
  version = "0.10.15";
  src = fetchFromCratesIo {
    inherit name version;
    sha256 = "0fj5g66ibkyb6vfdfjgaypfn45vpj2cdv7d7qpq653sv57glcqri";
  };
  dependencies = [ bitflags cfg-if foreign-types lazy_static libc openssl-sys ];
};
openssl-sys = mkRustCrate rec {
  name = "openssl-sys";
  version = "0.9.39";
  src = fetchFromCratesIo {
    inherit name version;
    sha256 = "1lraqg3xz4jxrc99na17kn6srfhsgnj1yjk29mgsh803w40s2056";
  };
  buildInputs = [ pkgs.openssl pkgs.pkgconfig ];
  dependencies = [ libc ];
  buildDependencies = [ cc pkg-config ];
};
Note particularly that there was no need to specify pkgs.openssl as a buildInput of openssl. Even though it hasn’t been linked into the rlib for openssl-sys, the link flags produced as part of openssl-sys’s dependency info serve to locate the library for the whole dependency subtree. As far as any dependant package is concerned, openssl is a completely pure Rust package, and since the native dependency is handled by Nix, the user should never have to worry about it.
Get Involved
All the code from this blog post series is available on GitHub, and the project will hopefully continue to grow.
Next time: testing, development dependencies, and auto-generating Nix expressions from crates.io metadata.