Vetting the cargo

By Jonathan Corbet
June 10, 2022

Modern language environments make it easy to discover and incorporate externally written libraries into a program. These same mechanisms can also make it easy to inadvertently incorporate security vulnerabilities or overtly malicious code, which is rather less gratifying. The stream of resulting vulnerabilities seems like it will never end, and it afflicts relatively safe languages like Rust just as much as any other language. In an effort to avoid the embarrassment that comes with shipping vulnerabilities (or worse) by way of its dependencies, the Mozilla project has come up with a new supply-chain management tool known as "cargo vet".

The problem

The appeal of modern environments is easy enough to understand. A developer working on a function may suddenly discover the need to, say, left-pad a string with blanks. Rather than go though the pain of implementing this challenging functionality, our developer can simply find an appropriate module in the language-specific repository, add it to the project manifest, and use it with no further thought. This allows our developer to take advantage of the work done by others and focus on their core task, which is probably something vital like getting popup windows past ad blockers.

There is an obvious problem with this approach: our developer knows nothing about what is inside this newly added module — or in any of the other modules that this one might quietly pull in as dependencies. This is true when the module is first imported, and becomes even more so as those dependencies are updated, perhaps by somebody other than the original author. It is a recipe for security problems.

The Mozilla project has been trying to increase the safety of the Firefox browser for years in numerous ways; one of those is rewriting much of the browser in the Rust language — which, itself, has its origins at Mozilla. At this point, though, much of the code shipped with Firefox originates outside the project; from the announcement:

Firefox’s Rust integration makes it very easy for our engineers to pull in off-the-shelf code from crates.io rather than writing it from scratch. This is a great thing for productivity, but also increases our attack surface. Our dependency tree has steadily grown to almost four hundred third-party crates, and we have thus far lacked a mechanism to efficiently audit this code and ensure that we do so systematically.

Nearly four-hundred third-party crates does indeed seem like a significant attack surface. A bug in any one of them could lead to the shipping of a vulnerable browser, and the consequence of a crate containing malware could be quite a bit worse. It is, indeed, good that the project is thinking about how to address this threat.

Tracking code audits

There are many ways to improve confidence in the security of a chunk of code. Writing that code in a memory-safe language is one such way; in a Rust program without unsafe blocks, there are whole classes of problems that simply cannot exist. But more than that is required and, in the end, there is no substitute for simply looking at the code and understanding what it does. If a program like Firefox is built only from code that has been diligently audited, the confidence in its security will be higher.

The cargo vet mechanism, built into Rust's Cargo dependency manager and build system, is meant to help with the task. It can't do the tedious and demanding work of actually auditing code, but it can help to keep track of which code has been audited and ensure that unaudited code does not find its way into a production build.

The initial cargo vet setup creates a new directory, called supply-chain, in the source directory; this new directory contains a couple of files called audits.toml and config.toml. The setup also looks at all of the project's dependencies (which are already tracked by cargo in the Cargo.lock file) and marks them all as being unaudited.

A developer can mark a module as being audited (after, presumably, having actually audited it) by adding a block to audits.toml like the following:

    [[audits.left-pad]]
    version = "1.0"
    who = "Alice TheAuditor <NothingGetsPastMe@example.com>"
    criteria = "safe-to-deploy"

This entry says that version 1.0 (and only that version) of the left-pad crate was audited and deemed to be "safe to deploy" in a production build. There are two "audit criteria" defined by cargo vet, the other being "safe-to-run"; others can be added as need be. There are ways of indicating that a range of versions has been audited, or that the patch from one version to the next has been. It is also possible to put in a violation line with a version range; that indicates that those versions have failed the audit and should not be used. Other examples of audits.toml entries can be found on this page.

Once these audits are in place, cargo vet can be run to ensure that all code in the build has been audited. If some dependencies have been updated, the tool will indicate that they require auditing and cause the build to fail. It can also fetch the source for the dependencies in question from crates.io (rather than, say, the project page on a public forge site) to ensure that the code being audited is the same as the code being deployed.

The cargo vet tool, in other words, can help a project keep track of the vetting of its dependencies, and it can help prevent the shipping of unaudited code to users. But it doesn't change the fact that auditing all of that code is a lot of work in the first place. A lot of that work could perhaps be saved, though, if projects could collaborate and share the audits that they have done.

Bringing in the community

One other key objective driving cargo vet is to spread the work of auditing around the community. Since a project's audits.toml file will be a part of its source repository, it will be available to anybody else who can see that repository; that is the whole world for most open-source code. In other words, the results of a project's auditing work will normally be available for the rest of the world to see — and make use of. After all, if one project has audited a dependency and found nothing amiss, and if that project's judgment is to be trusted, then there is little reason for any other project to repeat that work.

To take advantage of another project's auditing work, cargo vet can be told to import its audits.toml file and accept the audit results found therein. Needless to say, a certain degree of trust should exist before delegating one's auditing tasks to others on the Internet. There is currently no mechanism for discovery of available audits, and no way (in cargo vet at least) to verify that the person listed in the audits.toml file actually claims to have done an audit — anybody who can write the file can add any text they want. If the use of this mechanism takes off, though, such features can be added in the future.

The overall goal of this work is to take away excuses for not properly auditing dependencies:

Each new participant automatically contributes its audits back to the commons, making it progressively less work for everyone to secure their dependencies. We’ve learned many times that the best way to move an ecosystem towards more-secure practices is to take something that was hard and make it easy, and that’s what we’re doing here.

The hope is that, as the amount of audited code increases, the use of cargo vet will grow as well. The infrastructure may be a good start but, as the announcement notes, there is a remaining problem that could be hard to overcome: "there is no way to independently verify that an audit was performed faithfully and adequately". Creating a system of sharing audits across the community looks like a difficult task in the absence of some sort of reputation system that lets users decide which audits they should actually trust.

This project is quite new, though, so it is not surprising that some gaps remain. There can be no doubt that cargo vet is trying to address a pressing and urgent problem, so it is good to see this work being done. If this approach pans out, the use of random modules by unknown authors from a central software repository might just become a slightly more rational thing to do.

Vetting the cargo

Posted Jun 10, 2022 15:46 UTC (Fri) by MrWim (subscriber, #47432) [Link] (1 responses)

> Creating a system of sharing audits across the community looks like a difficult task in the absence of some sort of reputation system that lets users decide which audits they should actually trust.

See also cargo-crev: https://github.com/crev-dev/cargo-crev . I notice it's mentioned in cargo-vet's FAQ: https://mozilla.github.io/cargo-vet/design-choice-faq.htm...

Disclaimer: I've not used either tool

Vetting the cargo

Posted Jun 11, 2022 2:07 UTC (Sat) by pabs (subscriber, #43278) [Link]

Hmm, the crev distributed auditing model seems like a more logical way to do things.

Vetting the cargo

Posted Jun 10, 2022 17:33 UTC (Fri) by edeloget (subscriber, #88392) [Link] (13 responses)

This looks suspiciously similar to RFC3514. Do we expect bad actors to be truthful here?

Vetting the cargo

Posted Jun 10, 2022 19:21 UTC (Fri) by josh (subscriber, #17465) [Link] (12 responses)

No, we expect the people submitting vetting audits to be good actors, who are reviewing the work of others, not their own. Audits would be submitted by developers of the project adding a dependency, not by the developers of the dependency; this just records whatever review work took place, by whatever process the project has in place.

Vetting the cargo

Posted Jun 11, 2022 9:27 UTC (Sat) by edeloget (subscriber, #88392) [Link] (11 responses)

> No, we expect the people submitting vetting audits to be good actors, who are reviewing the work of others, not their own. Audits would be submitted by developers of the project adding a dependency, not by the developers of the dependency; this just records whatever review work took place, by whatever process the project has in place.

I understand that, but it's still not a good solution. Let me introduce all-padding, a crate that depends on both right-padding (developped by an associate of mine, yet you don't know that) and left-padding. Both dependencies are innocuous at the time you need them, so I audited them and told the world they are safe to deploy. Since padding function are "difficult" and your time is better spent implementing the REST API of the coffee machine, you add all-padding as a dependency. You audit my code, and it's ok, so all-padding is audited as safe to deploy as well.

A few update cycles down the road, eveybody forget about all-padding - it's a safe library. Good time for me and my associate to maliciously update right-padding. Since the dependency changes quite a bit, I "re-audit" it and mark is as safe to deploy again. And since you've worked with all-padding for ages, you trust me, so you don't reevaluate right-padding by yourself : I just told you it was ok.

I also understand that I'm kind of a pessimist :)

The main problem here is not even that some packages might be malicious. It's the fact that you might have hundred of dependencies in your project. This is not a rust-related problem (nodejs is a worse offender here, but python, php and a lot of other languages are also at risk) but such a large number of dependencies will always mean that you cannot develop reasonably secure programs with them - no amount of public audit will ever help. The only thing that will come from these audits is a very weak proof, something along the line of "I think that maybe there is a chance that this lib is not actively malicious", which is not very interesting in the long term (not to mention I'm pretty sure that most of them will be of the "I need this lib, I'd like my project to be accepted/pass our CI tests, so let's mark all my dependencies as fully audited").

To this main problem, "cargo vet" adds another layer: a false sense of security. You may end up thinking your project is safe, because all the dependencies you added have been "audited" by people that, you may assume, are better than you at finding security issues. Yet security is hard. Even obvious issues are not that easy to spot (think of the goto fail problem in the Apple SSL library: it tooks the whole world two years to find this very obvious issue in a widely used, public SSL library) so having a random individual tell you that the dependencies he uses are ok is kind of... well, it's kind of alarming.

Vetting the cargo

Posted Jun 11, 2022 15:35 UTC (Sat) by josh (subscriber, #17465) [Link] (5 responses)

> I understand that, but it's still not a good solution. Let me introduce all-padding, a crate that depends on both right-padding (developped by an associate of mine, yet you don't know that) and left-padding. Both dependencies are innocuous at the time you need them, so I audited them and told the world they are safe to deploy. Since padding function are "difficult" and your time is better spent implementing the REST API of the coffee machine, you add all-padding as a dependency. You audit my code, and it's ok, so all-padding is audited as safe to deploy as well.

And then cargo-vet tells you that you also need to audit left-padding and right-padding, because it intentionally does not have a transitive trust model.

> And since you've worked with all-padding for ages, you trust me, so you don't reevaluate right-padding by yourself : I just told you it was ok.

And that was a process mistake that no tool can prevent you from making.

While cargo-vet allows you to import audits from other projects, I have the impression that's intended for use cases like "I'm a small developer and don't want to vet all my dependencies, so I'm going to import the audits of a large project that has security reviewers I trust, such as Mozilla". Which will leave you better off than the audits you might otherwise have done yourself. If, instead of delegating your trust to developers of a large well-known project, you delegate your trust to the author of all-padding, you will get a corresponding level of non-diligence.

Vetting the cargo

Posted Jun 11, 2022 16:48 UTC (Sat) by edeloget (subscriber, #88392) [Link] (4 responses)

> And then cargo-vet tells you that you also need to audit left-padding and right-padding, because it intentionally does not have a transitive trust model.

In real work terms, it means that you'll just import the audits done by someone else. This is why the import function exists in the first place. It's not a transitive trust model, it's a delegate trust model. In the end, there is no difference as the author of all-padding will tell you that his project and its dependencies are trusted by project foo, bar and baz and will points you to the audit tile in order to ease the integration of his own module in your project. And most people will do it, because it's easier this way. Or the author will find a way to add himself in a global, trusted repository. Or he will tell you to trust another registry.

Developpers tend to pick easy solutions (otherwise the npm development model would not even exist), and as a module author, giving them clear, easy instructions to follow in order to avoid being turned down by this pesky CI tests is going to be valued.

You might argue that this is dumb and it should not exist, but, hey, large security appliances are often presenting stupid security holes (like this one: https://twitter.com/AnnaViolet20/status/15235646321405091... ; take some time to think about how is it even possible for a global leader in security to spawn a REST API which runs as root and does not even check for any credential apart from the presence of a specific header which is widely documented. The answer is along the line of "well... Oh! look at this wonderful sky!").

> And that was a process mistake that no tool can prevent you from making.

I would argue that it's definitevely worse, as I now have a tool that tells me it's definitely ok to do that.

> While cargo-vet allows you to import audits from other projects, I have the impression that's intended for use cases like "I'm a small developer and don't want to vet all my dependencies, so I'm going to import the audits of a large project that has security reviewers I trust, such as Mozilla". Which will leave you better off than the audits you might otherwise have done yourself.

I would not assume that this feature will only be used by small developpers. Large developpers shops are also likely to use it in order to make sure they hit their delivery time - because now that the tool exist, support for the tool will be added in CI systems, and suddenly, as a developper, you'll have to accept audits results.

> If, instead of delegating your trust to developers of a large well-known project, you delegate your trust to the author of all-padding, you will get a corresponding level of non-diligence.

Well, this is one of the reason for the development of cargo vet: you cannot assume the users (even large companies) will audit the code because this is not practical. So the level of non-diligence is already here. I would even say that refusing the audit would lead you directly to square zero.

I would also add that there is no actual reason to not trust the author of all-padding. On what ground should I do that? (and I'm OK that in real life, I will have a harder time to trust him than to trust Mozilla, so that's more of a devil's attorney question).

Don't misunderstand my position: I understand that having a tool to help code audit is a good thing. But given the choosen development model (as the developpers of cargo-vet calls it: "low-friction reuse of third-party components") this is just bandage on a wooden leg. The problem is that the development model in itself is dangerous. This has been proven times and times again. While it might reduce risks (and I'm not even sure of that) no amount of tooling might change that.

Vetting the cargo

Posted Jun 11, 2022 19:32 UTC (Sat) by josh (subscriber, #17465) [Link] (1 responses)

> In the end, there is no difference as the author of all-padding will tell you that his project and its dependencies are trusted by project foo, bar and baz and will points you to the audit tile in order to ease the integration of his own module in your project.

And that audit will cover only the versions that were audited, so if you trust that organization to do good audits, you'll be safer than you otherwise would be. If you make the mistake of trusting that because that organization has audited an *old* version the *new* version must be safe, that's your mistake, not that organization's mistake.

This is an improvement over the current state, where it's much easier to do malicious things and not get caught.

> I would argue that it's definitevely worse, as I now have a tool that tells me it's definitely ok to do that.

No, you have a tool that's making it possible to be selective about who you trust and tracking exactly what versions you trust, as opposed to implicitly trusting everyone. If you then choose to trust the wrong people, a tool can't help you with that.

> I would also add that there is no actual reason to not trust the author of all-padding. On what ground should I do that?

There's no reason *to* trust the author of all-padding to vet other software. You are repeatedly introducing the assumption that because a previous version appears safe the developer of that version should be trusted to audit other things, and then saying that if you do that, someone who earns trust with their previous code can break that trust. cargo-vet is designed to *not* make that assumption.

Potentially good criteria for delegating your audits to someone else would include "they're a part of the same company as you and this is their job", or "they're part of a trusted industry organization". Bad criteria for delegating your audits include "they wrote an apparently decent library once".

> But given the choosen development model (as the developpers of cargo-vet calls it: "low-friction reuse of third-party components") this is just bandage on a wooden leg. The problem is that the development model in itself is dangerous.

There are other solutions in the works as well, such as WebAssembly components that allow using a dependency without giving full trust to that dependency.

But in the meantime, this is an improvement over the status quo. And using third-party dependencies rather than reimplementing them is a net improvement.

Vetting the cargo

Posted Jun 12, 2022 3:01 UTC (Sun) by ms-tg (subscriber, #89231) [Link]

Thank you for your patience in expressing the reasoning here.

Vetting the cargo

Posted Jun 12, 2022 9:50 UTC (Sun) by ilammy (subscriber, #145312) [Link]

> I would also add that there is no actual reason to not trust the author of all-padding. On what ground should I do that?

This is the crux of the issue, I believe.

There is no reason to distrust a library when its maintainers are well-behaved: don’t introduce vulnerabilities themselves, don’t pull in libraries with known vulnerabilities, and depend on the libraries you trust -- note: “you”, *not* other people.

But that’s the thing: trust is good until it’s not, then it gets lost. You want to be notified of something worthy of your distrust early. When you see a CVE reported for some library, you immediately distrust that particular version. If that was a malicious change, you might want to distrust whoever did that change and whoever let it get in.

Now, how do you get CVEs reported in the first place? That’s where audits come into the picture. Someone supposedly reviews the code then either vouches that it’s good or reports these CVEs. You want to have all code from your dependencies reviews by someone -- either yourself, or someone you trust.

Tools like this formalize this “thousand eyeballs” expectations that come with depending on libraries. When you pull in dependency, you trust that someone has looked at the code and did not see any obvious malicious parts in it. Without these tools your trust in that is placed in absence of reported CVEs -- which is a bad metric: that’s akin to saying that bugs don’t exist until there is a ticket. With tools like this you’d know exactly who vetted which code and when, then you are free to establish whatever policy you want on top of that.

Naturally, the risk of getting malicious code in your build tree is still there. But there is only one way to *remove* it: never depend on any code that you did not write yourself (and even then, you can introduce vulnerabilities). All other ways are merely reducing the risk to an acceptable level.

Vetting the cargo

Posted Jun 20, 2022 7:12 UTC (Mon) by marcH (subscriber, #57642) [Link]

> Don't misunderstand my position: I understand that having a tool to help code audit is a good thing. But given the choosen development model (as the developpers of cargo-vet calls it: "low-friction reuse of third-party components") this is just bandage on a wooden leg.

In which other development model(s) do you understand that having a tool to help code audit is a good thing?

How incompatible with these other development models is cargo-vet?

Vetting the cargo

Posted Jun 13, 2022 8:45 UTC (Mon) by taladar (subscriber, #68407) [Link] (4 responses)

> but such a large number of dependencies will always mean that you cannot develop reasonably secure programs with them - no amount of public audit will ever help.

As if a small number of extremely large dependencies (think Qt or Boost) would be better here. If anything the large dependencies make it much easier to overlook a suspicious change.

Vetting the cargo

Posted Jun 13, 2022 11:51 UTC (Mon) by hkario (subscriber, #94864) [Link] (3 responses)

large dependencies also usually have large number of developers, and those developers are concentrated on that one project, causing much more eyes looking at the code every day

Vetting the cargo

Posted Jun 13, 2022 12:33 UTC (Mon) by farnz (subscriber, #17727) [Link]

Only if most developers range over the entire project; in my employer's codebase, there is a foundational library that is part of a big dependency where I know for a fact that there is only one person in the entire company who looks at the code regularly, and gets their code reviewed by a selection of people who trust them to get it right. If they were malicious, they have a very good chance of slipping bad code in without being caught despite code review.

At least with small dependencies, the effort to rewrite if upstream is malicious is low.

Vetting the cargo

Posted Jun 13, 2022 13:09 UTC (Mon) by colejohnson66 (subscriber, #134046) [Link] (1 responses)

The "many eyes" theory is complete bogus. OpenSSL had Heartbleed in it for literal years and no one noticed.

Vetting the cargo

Posted Jun 13, 2022 13:22 UTC (Mon) by Wol (subscriber, #4433) [Link]

"Many eyes makes bugs shallow".

Which is true. It's extremely easy to look straight through something you're not expecting to see, many eyes on their own are useless. Note also the "makes bugs shallow". It doesn't say it makes them easy to notice. It *does* mean that once you know there is a bug, it's not going to be able to hide for long.

Which has been borne out many times. What is the expected life-time of a bug once it's been spotted? Hours? Certainly not much more.

(But then, of course, in addition to the time at the start where no-one knew there was a bug, you also have the long tail where the bug has been fixed, but the fix has not been deployed.)

Cheers,
Wol

Vetting the cargo

Posted Jun 10, 2022 17:58 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (46 responses)

> A developer working on a function may suddenly discover the need to, say, left-pad a string with blanks. Rather than go though the pain of implementing this challenging functionality,

The world is getting more and more concerning: "developer" ... "left-padding" ... "challenging functionality".

Hmmm. I mean, it's significantly below the level of the questions that would be asked to hire a developer. If a developer finds left-padding a "challenging functionality", he's not a developer, he's a "google-search frenetic user" or "coding monkey" or "stackoverflow surfer" or whatever, but not a developer. Seriously. The time it would take to implement safely and suitably is probably lower in any language than the time it takes to read the downloaded lib's doc and make sure it's used correctly.

Hint: snprintf(buf, sizeof buf, "%*s", width, string) will do the job; a one-line for() loop following a call to strlen() would do as well, it's just a matter of taste. It even works in shell, and bash has special constructs that make it easy without printf if needed.

Vetting the cargo

Posted Jun 10, 2022 18:30 UTC (Fri) by HenrikH (subscriber, #31152) [Link]

Sorry if I'm missing some otherwise obvious sarcasm here but that text of the article is just a tongue in cheek reference to this real world thing that happened: https://lwn.net/Articles/681410/

Vetting the cargo

Posted Jun 10, 2022 18:30 UTC (Fri) by gspr (guest, #91542) [Link]

I belive you are encountering some sarcasm :-)

Vetting the cargo

Posted Jun 10, 2022 20:11 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (4 responses)

Ignoring the rest of the context (which like, it was linked in the article man, if we're doing grumpy old man let's start with JWZ's "In this house we read links")

snprintf(buf, sizeof buf, "%*s", width, string) doesn't "do the job" on its own, we need considerable scaffolding to get to a place where this is the missing ingredient.

The most cheerful option is that we know width is some particular value, and we can construct buf[] as an array nearby of suitable size and then we can just assume this always works.

But if we don't know width, or we're unable to construct buf nearby, perhaps because it's a buffer somebody else owns, we're in a world of pain.

sizeof buf doesn't actually magically tell us how big our buffer is when it's not some nearby array, because this is C (or C++) and so we don't deserve nice things. Instead buf here is a raw pointer, (if we used an array it decays to a pointer for this purpose) and sizeof buf tells us only how big the variable named buf is, which if buf is a pointer will be either 4 or 8 on modern hardware.

When this fails, snprintf can give us a negative answer (unlikely depending on what buf is exactly) or a positive answer. But if it gives us a positive answer we're not done yet. If the positive answer was sizeof buf or greater, it means snprintf calculated all those bytes but the buffer wasn't big enough to actually write them, try again.

So you can end up writing error handling and retry loops here, as well as needing significantly more logic in place of sizeof. Almost makes you wish somebody else did this correctly already so you could just re-use their work, doesn't it?

Vetting the cargo

Posted Jun 13, 2022 12:42 UTC (Mon) by int19h (guest, #159020) [Link] (3 responses)

Lest the spirit of joke-ruining pedantry is extinguished, I have to add that in C++, you'd use std::format() for this, and not worry about allocating buffers etc.

Pedantry

Posted Jun 14, 2022 1:28 UTC (Tue) by tialaramex (subscriber, #21167) [Link] (2 responses)

std::format() was added by C++ 20, so, if you're using C++ 98, C++ 03, C++ 11, C++ 14 or C++ 17 this isn't available.

Also, while many modern languages will just cut to the chase, UTF-8 or GTFO, in C++ 20 strings are just containers for arbitrary sequences of values in some implementation defined encoding, so the standard can't say exactly how wide anything is, and although real world implementations probably do what you expect you're assured of nothing for this "width" specification, that's entirely implementation defined. Good luck.

Pedantry

Posted Jun 14, 2022 2:44 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

> Also, while many modern languages will just cut to the chase, UTF-8 or GTFO, in C++ 20 strings are just containers for arbitrary sequences of values

I remember that there was a bug filed against the C++ spec because strings were not working when parametrized by doubles due to alignment issues on 32-bit CPUs.

Pedantry

Posted Jun 14, 2022 17:12 UTC (Tue) by madscientist (subscriber, #16861) [Link]

std::format is in C++20 but folks using older C++ compilers can use fmtlib: https://github.com/fmtlib/fmt

Vetting the cargo

Posted Jun 11, 2022 9:21 UTC (Sat) by dottedmag (subscriber, #18590) [Link] (1 responses)

As others pointed, it was a joke.

However, there is another layer to this joke: how wide is UTF-8 "\x4f\xcc\x8b\xcc\a4"?

Vetting the cargo

Posted Jun 11, 2022 19:22 UTC (Sat) by samlh (subscriber, #56788) [Link]

To make things even more fun, the answer for some characters depends on which terminal you're using.

When I realized I needed to solve the problem recently I pulled in a library for grapheme clustering and extracted the data table I needed out of the source code of the terminal I cared about and called it a day.

Vetting the cargo

Posted Jun 12, 2022 15:17 UTC (Sun) by khim (subscriber, #9252) [Link] (31 responses)

> The world is getting more and more concerning: "developer" ... "left-padding" ... "challenging functionality".

If something like strlen could be challenging functionality (and it can, just look on version from glibc), then why couldn't leftpad be problematic?

This being said I sincerely hope Rust doesn't have anything similar to leftpad crate: while I can agree that leftpad may be nontrivial enough to live in some public crate I sincerely hope it would be in a crate with it's friends rightpad, centerpad and other such functions.

Complexity has to live somewhere: if, instead of having one, single, monster libc library you reduce all your crates to single-function ones then you get thousands and tens of thousands of crates as your dependencies which now makes management of crates it's own problem.

Vetting the cargo

Posted Jun 12, 2022 15:52 UTC (Sun) by mpr22 (subscriber, #60784) [Link] (3 responses)

That architecture-specific assembly language implementation which is aggressively optimized for performance on large strings rather than readability or ease of implementation was, I am sure, quite challenging to implement, but this falls into the realm of "accidental challenge arising from the chosen method of implementation" rather than "essential challenge arising from the specification".

A naïve compliant implementation of strlen() in native C, on the other hand, is in the realms of "too trivial to copyright":

size_t strlen(const char *s) {
size_t len;
for (len = 0u; s[len]; ++len) {
/* do nothing */
}
return len;
}

Vetting the cargo

Posted Jun 12, 2022 16:41 UTC (Sun) by adobriyan (subscriber, #30858) [Link] (2 responses)

Funnily enough gcc 12 will optimise this exact code back to strlen():

f:
cmp BYTE PTR [rdi], 0
je .L3
sub rsp, 8
add rdi, 1
call strlen
add rsp, 8
add rax, 1
ret
.L3:
xor eax, eax
ret

Vetting the cargo

Posted Jun 12, 2022 20:54 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

Idiom recognition, there's a lot of this in modern compilers, but it isn't actually new, it was just less necessary in early C compilers than it had been in languages like APL.

Turns out we should let the machine do the boring tedious work, who knew?

Vetting the cargo

Posted Jun 13, 2022 14:28 UTC (Mon) by mathstuf (subscriber, #69389) [Link]

> Funnily enough gcc 12 will optimise this exact code back to strlen():

FWIW, this can only be done because it assumes `s[len]` means `s != NULL` since such things would be UB (and can therefore be assumed to not happen). Otherwise, `strlen` adds the "must not be `NULL`" precondition and is not eligible for such optimization.

Vetting the cargo

Posted Jun 12, 2022 16:32 UTC (Sun) by fratti (guest, #105722) [Link] (26 responses)

> This being said I sincerely hope Rust doesn't have anything similar to leftpad crate: while I can agree that leftpad may be nontrivial enough to live in some public crate I sincerely hope it would be in a crate with it's friends rightpad, centerpad and other such functions.

Haha, yeah, about that...

I seem to recall someone having made a "leftpad index" for Rust crates, which counted number of lines in relation to how often the crate is downloaded. There are a few very leftpad-y crates. In general, Rust unfortunately follows the npm microdependencies approach, which does not only have security implications (trusting 30 random personal GitHub accounts to not get compromised/be malicious for what could be one regular-sized dependency) but also means nothing ever has a consistent API or release schedule.

An example for a classic leftpad-ish crate that should really be part of the stdlib or a generic terminal library but isn't is "atty", which appears to literally just be isatty() called from Rust, with some Windows and webassembly compatibility code. Its implementation, including unit tests and comments, is 210 lines. However, naturally, it has a sponsor button, 17 releases, and lives in someone's personal GitHub repository. 570 other crates depend on it.

Stuff like this not just being part of the standard library is what has stopped me from using Rust so far.

Vetting the cargo

Posted Jun 13, 2022 1:45 UTC (Mon) by khim (subscriber, #9252) [Link] (25 responses)

> An example for a classic leftpad-ish crate that should really be part of the stdlib or a generic terminal library but isn't is "atty", which appears to literally just be isatty() called from Rust, with some Windows and webassembly compatibility code.

Why do you think it should be part of stdlib or a generic terminal library? This crate (with crazy hacks like attempt to see if filename includes words “msys” and “pty”) is definitely not what I would want in the “standard library” yet it's perfectly justified if the only thing you want to do is to show color output on the console.

It's actually not that hard to support ANSI colors on most popular OSes (you just need to call SetConsoleMode on Windows and ignore the result: msys/cygwin would handle these in their own way and Windows console would work), but actual detection of whether you are dealing with TTY or not is complicated and unreliable.

Yes, our world is crazy. Deal with it. If crate is 210 lines long but have history of 17 versions and few dozens of commits I would assume it's complex enough to live in it's own crate.

> Stuff like this not just being part of the standard library is what has stopped me from using Rust so far.

Feel free to pretend that Linux is the only OS that matters. Certain languages (Google's Go, cough) do that and are popular enough, I guess. They tend to be confined to certain niches (command-line only utilities, no GUI, etc), don't support more exotic and/or rare platforms, but as long as you only need to work with these they are more than adequate.

For the rest of us, who have to deal with more complicated parts of the world Rust is better choice. Witch crazy “atty” crates and cargo vetting.

Vetting the cargo

Posted Jun 13, 2022 8:26 UTC (Mon) by farnz (subscriber, #17727) [Link] (21 responses)

The issue with Rust right now is that there's no easy entry route to finding the set of crates that make up an expanded "standard" library; community knowledge says that (for example) serde is the right choice for general serialization and deserialization, and clap is the right choice for CLI argument parsing, but even lib.rs doesn't make it easy to solve the problem.

What's needed is someone putting a lot of work in to construct a list of things that would be in a "batteries included" standard library, such that there's no choices left for you, but linking to something like lib.rs when there's no "obvious" choice.

Vetting the cargo

Posted Jun 13, 2022 9:10 UTC (Mon) by taladar (subscriber, #68407) [Link] (20 responses)

What is so difficult about searching for "serialization" or "argument parser" on lib.rs and picking the one with literally orders of magnitude more downloads than any other?

Just look at e.g. Python and see where this "put everything in the standard library" approach leads. You have the very same problem, only now you have half a dozen "solutions" to any given problem in the standard library but the community wisdom says not to use any of them.

Vetting the cargo

Posted Jun 13, 2022 9:48 UTC (Mon) by roc (subscriber, #30627) [Link] (4 responses)

I agree that Rust's approach is better than a "batteries-included standard library", but I also agree with farnz that it would be really useful to have a curated list of "preferred crate for task X".

Vetting the cargo

Posted Jun 13, 2022 11:44 UTC (Mon) by excors (subscriber, #95769) [Link] (3 responses)

There is a curated list of crates at https://github.com/rust-unofficial/awesome-rust which I've found a few useful pointers from, though I'm not sure how good it is in general. (It doesn't even list serde explicitly, just some serde-related crates. It does list clap, but alongside another CLI library that only has 50K downloads in 2 years and was added to the list by its author and probably doesn't deserve to be there.)

I guess you can get better recommendations by asking on various forums or chat rooms, but it would be nice if that community knowledge was distilled into a more easily accessible location.

Vetting the cargo

Posted Jun 14, 2022 22:18 UTC (Tue) by Wol (subscriber, #4433) [Link] (2 responses)

It would be nice if the language writers had a "library of batteries" that is regularly maintained.

So you would have a "recommended battery for strings", "recommended battery for arbitrary precision maths", etc etc. And batteries could be maintained, deprecated, whatever. Especially as Rust has this version stuff, you could easily continue using old batteries, but - importantly - you would know they were old batteries and not actively maintained.

And if you want to deprecate and get rid of an old battery, provided the functionality has been updated, you could create your new battery and update the old battery to be - as far as possible - just a shim for the new one.

Cheers,
Wol

Vetting the cargo

Posted Jun 15, 2022 8:49 UTC (Wed) by khim (subscriber, #9252) [Link] (1 responses)

> And if you want to deprecate and get rid of an old battery, provided the functionality has been updated, you could create your new battery and update the old battery to be - as far as possible - just a shim for the new one.

That's, actually, where you idea falls apart.

Maintaining compatibility is hard.

The deprecation of old crates and replacement with a new ones usually happen when the desired functionality is impossible to add in the backward-compatible way. Otherwise old crates are just extended.

Which more-or-less automatically makes these shims impractical: they would either be impossible or just simple larger and more error-prone than the whole old library.

That's why Linux kernel have “no stable internal API” policy and why Rust keeps it's standard library as small as possible.

As for “list of currently recommended crates”… this idea is discussed so often, that yes, I hope eventually something like that would be made. Maybe with the use of hyped cargo vet.

Vetting the cargo

Posted Jun 23, 2022 9:33 UTC (Thu) by Vipketsh (guest, #134480) [Link]

> Maintaining compatibility is hard.

Don't disagree here.

> The deprecation of old crates and replacement with a new ones usually happen when the desired functionality is impossible to add in the backward-compatible way.

In my experience statements to this effect are, for the most part, little more than mumbo-jumbo that people pushing for breakage use. The reality, again in my experience, is that backwards incompatible changes occur when some developer or group thereof decides that some new way of doing things looks more "pretty" than the current way so they make the change and then substantiate it with the above. I don't rule out that the average Rust developer is honest and hard working such that whenever faced with an issue they do their utmost to attempt to not break backward compatibility and my experience does not hold, but that is counter to everything I've seen thus I'm staying sceptical.

I would also like to point out the measure for "backwards-compatibility" has to be something like:
1, The overwhelming majority of your users don't notice
2, Those who do notice, notice small things and very rarely

Otherwise we end up with the absolutist argument of "every bug may be depended upon by someone, thus *any* change (no matter how straight-forward) may break something thus we wouldn't be able to change anything thus there is no backwards compatibility thus we can change everything willy-nilly".

Vetting the cargo

Posted Jun 13, 2022 12:26 UTC (Mon) by farnz (subscriber, #17727) [Link] (14 responses)

Because a new user has no way to distinguish "the state of the art is clap, and it should be your default assumption" from "everyone uses clap for legacy reasons, but you should be moving on to new-thing". For example, there was a time period in 2020 during which failure would have been the answer you'd find by your heuristic, but everyone was recommending that you move away from failure (no longer maintained) to anyhow and thiserror. Going back earlier on the graph, in 2018, you'd probably have picked up error-chain, even though it had been abandoned in favour of failure; now, there are special cases in which miette or eyre is what you want, and a human needs to tell you to default to anyhow and thiserror even though miette and eyre exist and are growing.

Hence the need for something with human curation - you want something that says "for error handling, your go-to should be thiserror in a library or anyhow in an application", but also says "for an async web server, there are multiple good choices - use the Are we web yet? Web Frameworks page to make your decision". This gets you the benefits of a big standard library when there's a good default choice to make (like anyhow or thiserror), allows the ecosystem to evolve and (importantly!) guides people to move on when something new replaces the old "good default choice".

The trouble with a standard library is that it's ossified - you can't change it easily. The benefit is that it's a human-curated set of "good default choices" for ways to do things. Keeping the standard library small is absolutely the right decision for Rust with its backwards compatibility goals, but means that you need to look elsewhere for the human-curated set of "good default choices" - and it needs the human curation to avoid the problem of getting stuck on a bad default that a standard library has.

Vetting the cargo

Posted Jun 13, 2022 13:27 UTC (Mon) by Wol (subscriber, #4433) [Link] (3 responses)

And your example of "eire", "failure", "error-chain" etc points out a failure even of a curated list - HOW MANY PEOPLE PUT A DATE ON THEIR RECOMMENDATIONS.

I don't claim to be a paragon of virtue, but whenever I update the raid wiki I try to remember to put a date against what I'm doing - not in the logs but slap-bang in the middle of the article or whatever, so anybody reading it is given a big clue as to whether it's up-to-date or further research is warranted.

Cheers,
Wol

Vetting the cargo

Posted Jun 17, 2022 12:20 UTC (Fri) by nix (subscriber, #2304) [Link] (2 responses)

> I don't claim to be a paragon of virtue, but whenever I update the raid wiki I try to remember to put a date against what I'm doing - not in the logs but slap-bang in the middle of the article or whatever, so anybody reading it is given a big clue as to whether it's up-to-date or further research is warranted.

And this has downsides too: if it's not metadata, eventually the article turns into a morass of dates and modified-by-this-person notes, because nobody making a change later knows whether their edits have obsoleted *everything* some other date note related to, so they tend to stay around... (I've seen this in its extreme form in 30-year-old codebases that were religiously maintained like this. It's *awful*. /* DCH 1990-04-14 */ /* LMH 1991-02-22 */ *everywhere*, many lines having multiple notes and hardly any having none. This is what a version control system is for, guys.)

Vetting the cargo

Posted Jun 18, 2022 18:55 UTC (Sat) by Wol (subscriber, #4433) [Link]

I get that.

But when updating the wiki, I generally completely revamped each article (and move the out-of-date version to an archive page).

It takes discipline, but I try and treat it like a "revised and updated edition" - leave the old version for people running out-of-date LTS systems, and the new up-to-date version for those on the bleeding edge :-)

I've dealt with a few too many horror stories of people finding out-of-date pages and almost trashing their systems - fortunately in my experience people tend to ask for help before doing something stupid, but not always ...

Cheers,
Wol

Vetting the cargo

Posted Jun 27, 2022 17:48 UTC (Mon) by sammythesnake (guest, #17693) [Link]

I remember seeing an extension (for Firefox?) that would shade the background of the text on Wikipedia to indicate how recently a particular bit had been updated.

I thimk the authors were assuming new edits might be wiki vandalism that hasn't been spotted yet, but with known authors, you could invert the logic.

Perhaps a more sophisticated iteration could use colours that somehow represent both age and the authors' demonstrated reliability (based on how many edits they've made / over how long a period / what proportion of them get reverted etc.)

A naïve idea might have the strength of the shading represent age and the hue represent some trust score (amber for newbie, green for OG editor, red for somebody with questionable performance, shades in between accordingly) Obviously, these actual colours would be awful for somebody red/green colour blind, but somebody better informed on such things could suggest a kinder palette:-P

Vetting the cargo

Posted Jun 23, 2022 9:42 UTC (Thu) by Vipketsh (guest, #134480) [Link] (9 responses)

> The trouble with a standard library is that it's ossified - you can't change it easily. The benefit is that it's a human-curated set of "good default choices" for ways to do things. Keeping the standard library small is absolutely the right decision for Rust with its backwards compatibility goals, but means that you need to look elsewhere for the human-curated set of "good default choices" - and it needs the human curation to avoid the problem of getting stuck on a bad default that a standard library has.

Way to go passing the buck. Nothing magical happens when the adjective "standard" is attached to "library": if you have to depend on a library which then goes away or changes in an incompatible way you are the same screwed irrespective of whether it was "standard" or not.

Translated the Rust tactic seems to be*:
1, Divide your users amongst different libraries (implementations)
2, All those libraries will inevitably be replaced or change in an incompatible way (by design is the implication I'm getting from the parent comment)
3, Since it is not "Rust" that is changing Rust bears no responsibility (thus Rust always has good backwards compatibility)
4, It is actually the user's fault for "choosing the wrong library" (though no information is given on what should have been used)
5, Since the users are divided there are never enough voices for the problem to be heard

The end result is that everyone ends up just as screwed as if all this functionality was in a (changing) standard library.

*: I doubt that it is intentional by the Rust team, but that doesn't change the end result the way I see it.

Vetting the cargo

Posted Jun 23, 2022 10:57 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (1 responses)

The lines up pretty well with Linux's "stable userspace ABI" and the mess of the userland ABI story above the absolute basics (glibc, X11, and a handful of others).

Such reimpls have happened before (cf. libjpeg-turbo). I think the difference is that the ecosystem has a lot more room to move than the guarantees of the stdlib because of the "if it works today, it'll work tomorrow" mindset. And yes, your crate will continue to work because the old crates are still accessible but you might not have updates.

The Rust internals Discourse definitely catches features which lay out semver hazards (or at least I try to point them out) so that library authors are not caught unaware when they make some otherwise innocuous change that their major number needs bumped. However, it isn't really possible to wrap everything in bubblewrap.

Vetting the cargo

Posted Jun 23, 2022 13:14 UTC (Thu) by Wol (subscriber, #4433) [Link]

> The lines up pretty well with Linux's "stable userspace ABI" and the mess of the userland ABI story above the absolute basics (glibc, X11, and a handful of others).

The obvious (and quite likely broken) way for linux to deal with this is to refactor the obsolete ABI into a module. Said module then gets (mostly) abandoned, and dropped when nobody notices.

Of course, the problem with this is when it's core functionality that can't (practicably) be modularised.

Cheers,
Wol

Vetting the cargo

Posted Jun 23, 2022 13:27 UTC (Thu) by farnz (subscriber, #17727) [Link] (6 responses)

Your conclusion doesn't follow from your argument.

The standard library cannot change in a backwards-incompatible fashion; this is simply not permitted under any circumstances. You can add it it, and you can fix bugs, but you can't break existing users. Further, as you update your language environment, you will get newer implementations of the standard library; one of the defining special cases about the standard library is that the compiler is allowed to assume that the standard library is present (which is why Rust currently has three standard libraries - core, which is things the compiler cannot function without, alloc, which is all of core, plus the things that the compiler needs when the heap is available, and std, which is the full-fat standard library when running in userspace on a normal OS).

This means we cannot remove anything from the standard library, ever. No matter how big a mistake it is, it has to remain present in the standard library forever, acting as a foot-gun for new users (see gets in C, for example), because removing it will break backwards compatibility. And because Rust documents all of its standard library, that means that it's visible to new users whether we want them to use it or not.

It's worth noting, too, that just because it's in an external crate doesn't mean it stops working; you can still use error-chain or failure, and the backwards compatibility commitment of the compiler and standard library means that they will still work into the future. They're just not being improved over time, and the ecosystem as a whole is no longer using them in new projects.

Vetting the cargo

Posted Jun 23, 2022 14:16 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Note that Rust does have one handle to pull back an API: unsoundness. If an API cannot be used soundly, it is eligible for deprecation and eventual removal (this may involve an edition change) because an API just cannot be used properly. For example `std::mem::uninitialized()` is off limits because `let b: bool = std::mem::uninitialized();` is instant UB because a `bool` may only have a binary value of 1 or 0 (that is, it may appear to be neither `true` nor `false` with some other bit pattern). Now there is the `std::mem::MaybeUninit` type to handle the use case.

Vetting the cargo

Posted Jun 23, 2022 15:40 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

> This means we cannot remove anything from the standard library, ever. No matter how big a mistake it is, it has to remain present in the standard library forever, acting as a foot-gun for new users (see gets in C, for example)

Then create a fourth standard library - TheFjords. All your footguns go in that library, and the linker complains loudly when it has to pull it in.

Users won't notice (unless they look), but developers will see it, wonder what the hell is going on (especially if cluesers complain it's there), and will hopefully eliminate it from their code.

So we have an effective method for removing dead/dangerous calls from actively maintained code. If the code isn't maintained then there's not a lot anyone can do, and that's true of all ecosystems ...

Cheers,
Wol

Vetting the cargo

Posted Jun 23, 2022 18:13 UTC (Thu) by farnz (subscriber, #17727) [Link] (3 responses)

The library name is part of the import path used in code. If something moves from (say) core to alloc, code that is written for just core (and does not have alloc or std) is broken and stops compiling. Because Rust does not want to cause working code to break just because you update to a new compiler, code can only move one way - from std to alloc to core - and even then, only because std re-exports everything from alloc and core in the places it "ought" to be in if there was only std, and alloc re-exports everything from core in the places it "ought" to be in if there was only core.

So a "TheFjords" standard library isn't helpful - it just means that it's an obvious fix for things that have been broken by the change, but there's not supposed to be breakage to begin with. Putting things in external crates that get deprecated is helpful - nothing breaks if you still use error-chain, but you're making more work for yourself than if you use thiserror and anyhow, and you can do that refactor at a time to suit you - rather than being forced to do it because you're trying to compile for a new architecture and need a new compiler that supports the new architecture.

Vetting the cargo

Posted Jun 23, 2022 23:18 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

But that's a "break on build", no? So it'll break FOR A DEVELOPER?

What we don't want is for a system upgrade to break a working program. But if a system upgrade means you can no longer re-build a program without being forced to add TheFjords to the build process, then sorry. That's exactly what I want to achieve!

It's a (reasonably) simple fix, but it places developers on notice that there are problems. So my takeaway from what you say is that "a working binary will continue working, but if it has obsolete components it will no longer build". So if it's unmaintained the user isn't affected, and if is maintained the maintainer has to fix it ...

Cheers,
Wol

Vetting the cargo

Posted Jun 24, 2022 0:47 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

> So it'll break FOR A DEVELOPER?

No, it will break for any person building the program.

You don't need to be a Rust developer to build a Rust program.

Vetting the cargo

Posted Jun 24, 2022 10:04 UTC (Fri) by farnz (subscriber, #17727) [Link]

I prefer the experience offered by RustSec - if you don't care about it, don't run the tool. If you do care, run the tool (cargo audit), and things like this deprecation warning will appear and tell you that you've got a risky dependency and why. If the crate comes back into maintenance, then the advisory gets updated to not trigger on new versions; similarly if recommended replacements get added, it'll be updated to tell you about them (when I first looked at RUSTSEC-2020-0036, it did not mention fehler, which has been added to the list).

This is something that's easier to do with external crates than with the standard library, because all the tooling already exists to do it, and you just don't run it if you don't care about it - a user who just wants to build the version of a program they had 5 years ago to re-generate some results can do that - while a developer who cares about security can run the tools and find out if they need to do anything (and if so, what).

Vetting the cargo

Posted Jun 13, 2022 19:38 UTC (Mon) by fratti (guest, #105722) [Link] (2 responses)

Okay, so you don't want isatty() in a standard library because its current non-standard-library implementation is bad, but you happily will rely on it for everything that parses command line arguments even on platforms where POSIX isatty() would be enough because clap is the recommended crate for this and pulls it in. Gotcha.

Vetting the cargo

Posted Jun 14, 2022 14:53 UTC (Tue) by khim (subscriber, #9252) [Link]

Yes. If I use library for the firmware development then I don't want to ever deal with then fact that isatty is not implementable there in principle, but then I wouldn't need command-line parsing or clap there either.

And if I develop “normal” app and only care about “usual suspects” (GNU/Linux, macOS, Windows) then I would pull clap and that mess with “msys” and “pty”.

May not be an ideal solution, but from practical POV it's the best I can expect.

Maybe, just maybe, having that mess as part of clap would be better… since I couldn't imagine an app which needs to know if it needs to enable colors or not and yet doesn't parse the command line… but probably not by much.

Vetting the cargo

Posted Jun 14, 2022 16:18 UTC (Tue) by excors (subscriber, #95769) [Link]

atty is an optional dependency of clap, so it's only pulled in if you enable the "color" feature (which is enabled by default, but if you care about trusting dependencies then I think you should always be disabling default features and only choosing the ones you really need).

With default-features=false, clap has 7 transitive dependencies (and one of those is owned by the clap project), which doesn't sound like a crazy amount to be vetted. The default features add another 4 transitive dependencies. But if you turn on all the stable optional features, it goes up to about 28, which does sound more like a crazy amount.

Vetting the cargo

Posted Jun 13, 2022 9:09 UTC (Mon) by larkey (guest, #104463) [Link] (2 responses)

The linked reference aside...:


$ cat test.c
#include <stdio.h>

int main(void)
{
	printf("<%*s>\n", 3, "");
	printf("<%*s>\n", 3, "̀a");
	printf("<%*s>\n", 3, "á");
	printf("<%*s>\n", 3, "a");
}
$ make test
cc -o test test.c
$ ./test
<   >
<̀a>
< á>
<  a>

Uh, oh, damn. Seems like your code doesn't work. Too bad. So much bad behavior in just *one* line of code! You really should question whether *you* should call yourself developer, to use your own words! Seriously: *Please* be a bit more humble and less snarky before questioning other people's creds. Thanks.

Vetting the cargo

Posted Jun 13, 2022 11:59 UTC (Mon) by ballombe (subscriber, #9523) [Link] (1 responses)

You are making the mistake to assume that leftpad works better.
What you are trying to do is not even well defined.

Vetting the cargo

Posted Jun 13, 2022 12:59 UTC (Mon) by larkey (guest, #104463) [Link]

No, I did not assume it, in fact I assumed the opposite and checked this assumption beforehand. They're not better.

However, just as the original comment disregarded how well or badly npm-leftpad works, I chose to do so as well.

Which I demonstrated by plugging in example strings into the code given. So if there is something not-well-defined, then it's the behavior of the original comments code in the quite common case of feeding something outside of the C locale.

Regardless, although I'm not an expert in this area, I'd say even this is well-defined albeit not what's wanted. The result depends on the compilation environment, that is, how the Compiler translates the source from the respective character set into byte arrays. From there on, everything works "as intended", the first string is 3 bytes "wide, the second 2, the last just 1. Hence a padding of 0, 1 and 2 bytes conversely.

I'm obviously aware of how this is not how things work, I'm merely demonstrating how lacking the above code is and that there is a merit in using a library for this task, and that this doesn't discredit the engineer. Don't use npm-leftpad though.

Vetting the cargo

Posted Jun 14, 2022 0:41 UTC (Tue) by Matt_G (subscriber, #112824) [Link] (1 responses)

One thing that tends to happen is that someone other than the original developer will decide to "refactor" the code... 'Hey this dev has written some code to left pad, this other dev has written some slightly different code to do the same thing, everybody is writing their own version of this simple function. Wouldn't it be a smart idea to standardize and only use one version of the function? Hey why don't we use a library...'

Personally I'm of the opinion that "trivial" code like this that is commonly implemented should belong in a languages standard library. But that has it's own tradeoffs and problems. For instance what counts as trivial code? The Getpass() example from C is a good example it has been a while since I looked at it. In the old Unix textbook I have from the 90's this is the recommended way to read in passwords from the command line in a "secure" way. At some point someone determined that it might have security problems and was potentially not threadsafe. So it was depreciated from the POSIX standard. The last time I looked into it there was some message that said people could trivially write their own version if they needed the functionality. How trivial is writing something that requires a reasonably deep understanding of things like terminal attributes? Obviously POSIX has a different definition of trivial then I do...

Vetting the cargo

Posted Jun 14, 2022 13:39 UTC (Tue) by rgmoore (✭ supporter ✭, #75) [Link]

At some point someone determined that it might have security problems and was potentially not threadsafe. So it was depreciated from the POSIX standard. The last time I looked into it there was some message that said people could trivially write their own version if they needed the functionality. How trivial is writing something that requires a reasonably deep understanding of things like terminal attributes?

Yeah, this seems exactly backward to me. If a reasonably common function has potential security and thread safety problems, that's a sign that it probably should be handled by a library everyone has access to. If the standard implementation has problems, it's extremely likely all those hand-rolled solutions that replace it will have the same kinds of problems, but with many fewer people around to notice and fix them. Things that are common but fiddly are exactly what you're supposed to put into libraries, so it gets written carefully exactly once.

Vetting the cargo

Posted Jun 16, 2022 13:39 UTC (Thu) by jcpunk (subscriber, #95796) [Link] (2 responses)

I find it a bit distressing that no cryptography is used to sign the audits.toml file.

GPG has a serious usability problem, but that doesn't mean an unsigned audit is fine.

Pushing it further, if we want to protect the supply chain, shouldn't the audit also record some sort of sha###sum of the crate to avoid having that file poisoned by a bad guy with filesystem access and republished under the same name?

If I can't prove "Alice TheAuditor" actually audited exactly this specific item, I'm not sure what this gains...

Vetting the cargo

Posted Jun 16, 2022 15:38 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (1 responses)

For anything with this kind of auditing, I suspect `Cargo.lock` is versioned alongside it. This records the name, version, and hash of its `.crate` used.

Vetting the cargo

Posted Jun 16, 2022 16:17 UTC (Thu) by excors (subscriber, #95769) [Link]

As far as I can see, you're not expected to publish Cargo.lock alongside your audits. Importing is done by just giving the URL of audits.toml and there's no general way to find a Cargo.lock from that.

From the discussion in https://github.com/mozilla/cargo-vet/issues/79 , it sounds like they considered storing the crate's checksum alongside the version in the audit file, and then decided not to because it didn't seem to protect against any plausible attacks. Instead you have to trust that crates.io has not been compromised and is not serving different code for the same crate version to different users, or that if it was then someone would very quickly notice checksum failures with all the other cargo commands that use Cargo.lock.