
CVEs/LOC not a great metric

Posted Dec 3, 2024 0:24 UTC (Tue) by khim (subscriber, #9252)
In reply to: CVEs/LOC not a great metric by Cyberax
Parent article: NonStop discussion around adding Rust to Git

> It's actually easy to explain: the array reserves space exponentially on appends.

And how could I predict the effect of that behavior? When would my “outer” variable be corrupted and when would it survive intact?

> There is nothing surprising here.

Seriously? This variable may or may not be mutable, and may or may not come back modified… and that's called “nothing surprising here”?

> You can't use any of this behavior to violate memory safety in Go.

And that's precisely what I meant when I mentioned that xkcd 1200 strip.

Can such “easy to explain” behavior lead to corruption of user data? Yes, sure. Lots. Can it be used to bypass security checks? Of course. Lots. Can it be used to cause “memory unsafety”? No, no, that's sacred, of course not!

What good does “memory safety” buy me if my security locks are broken, passwords and databases are stolen, and data is replaced with garbage? What's the point of it?

> FWIW, Java actually does guarantee it.

Yeah, C# and Java are probably the sanest of the popular “memory safe” languages. They even try to actually provide safety in the sense of getting the code to do what it is intended to do, honestly! Not much, and they mostly steal features from Rust, Haskell or ML and adapt them in half-baked form… but they at least try to care about safety in the sense of getting the code to do what it is intended to do.

Go, JavaScript, PHP, Python, Ruby? Nope, or at least they are not really serious about it. They try to add some optional, opt-in features in recent versions… but security doesn't work if it's opt-in. It only works if it's opt-out. Only in that case can you actually guarantee something.

That's the biggest issue with TypeScript: it contains decent safety guards… but they can all be easily broken because of compatibility features, which have to be there or else it wouldn't be useful in a world where the majority of web developers use JavaScript and not TypeScript.

P.S. Nonetheless, the push to adopt memory-safe languages is a good first step. It doesn't buy us too much, and a lot of people who are using horribly unsafe (yet genuinely “memory safe”) languages will feel smug for no good reason… but that's definitely only a first step.

Heck, think about it: Microsoft BASIC from almost half a century ago, with no subroutines, only global variables, and only two letters for global names… technically it's “memory safe” if you don't use PEEK/POKE. Would you want to rewrite something like the Linux kernel in it? Would it become more robust after such a rewrite? If not, then why not? It's a “memory safe” language, it should be better than C!

But at least that push tells us that “you are holding it wrong” is not the sole reason that our programs are, essentially, Swiss cheese from the security POV. And castigating developers for the mistakes they sometimes make is not a constructive way to achieve safety and security. Perhaps 10 or 20 years down the road people will warm up to the idea that not all “memory safe” languages are born equal… but first we have to step up from C/C++… like we stepped up from assembler 30 (or 40?) years ago.



CVEs/LOC not a great metric

Posted Dec 3, 2024 1:50 UTC (Tue) by mirabilos (subscriber, #84359) [Link]

Great points! Especially thanks for the example in issue9… that would totally trip me up if I had to peek at some code written in it without having learnt the language inside out (which I haven’t).

I also considered BASIC and scripting languages, funnily enough.

CVEs/LOC not a great metric

Posted Dec 3, 2024 20:35 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (10 responses)

> And how could I predict the effect of that behavior? When would my “outer” variable be corrupted and when would it survive intact?

How do you remember not to use index "8" if you delete 10 elements from an array with the size of 11?

It's literally a logical bug. Don't depend on the underlying storage length, unless you know what you're doing.

And no, it's not a memory safety issue. At worst, you'll get a bounds check panic.

CVEs/LOC not a great metric

Posted Dec 3, 2024 21:08 UTC (Tue) by khim (subscriber, #9252) [Link] (8 responses)

> It's literally a logical bug. Don't depend on the underlying storage length, unless you know what you're doing.

Isn't that the exact same mantra that C and, later, C++ programmers have repeated for the last half century: don't write programs with bugs, and everything will be all right?

Note that the bug is easy to spot here only because I explicitly named the function appendAndChange. If it had been named in a way that doesn't draw attention to the fact that it may change the contents of the slice passed into it, it could be used for a long time before someone noticed that it corrupts data it shouldn't be corrupting.

Because it usually doesn't change the array passed into it, as long as that array was created statically, so even a simple, naïve test wouldn't reveal any issues.

You have to prepare your slice in a way that is both unnatural for the testing environment and common for actual code that works with dynamic arrays.

A perfect combination for long hours of debugging if you don't know the precise, undocumented rules of how Go slices work. It's almost as dangerous as Python's desire for “simplicity” and arrays used as default arguments (although there the behavior is documented, and thus a tiny bit less dangerous).

> At worst, you'll get a bounds check panic.

Nope. That's the best outcome. And very much not a guaranteed outcome. The normal, typical, most common outcome would be silent data corruption somewhere.

Because, again, it's only called appendAndChange here to make it easier to understand the issue.

In real-world code that would be some function that is not supposed to change its argument and wouldn't change it in tests… but would change it in production code deployed in some faraway place.

Yes, that's not a memory safety issue, it's just almost indistinguishable from the “memory unsafety”.

And it wouldn't exist in any “complicated” language like C# or Java (with String and StringBuilder), Rust or even C++!

Yet the “modern” Go language, in its quest for “simplicity”, makes the exact same mistake that C made half a century ago.

But C has an excuse: it was designed on a computer with 16KiB of RAM, so it couldn't add many complicated niceties… what's Go's excuse?

CVEs/LOC not a great metric

Posted Dec 4, 2024 0:33 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (7 responses)

> Isn't that the exact same mantra that C and, later, C++ programmers have repeated for the last half century: don't write programs with bugs, and everything will be all right?

Pretty much. Go's behavior is not something unusual, you're (possibly) mutating an object, and then depending on the side effects of this mutation. It's literally from the "don't do this" category.

Go only guarantees that your program won't cause memory safety issues.

CVEs/LOC not a great metric

Posted Dec 4, 2024 1:18 UTC (Wed) by khim (subscriber, #9252) [Link]

> Go only guarantees that your program won't cause memory safety issues.

And then, only if there are no race conditions. Which is precisely my point: the goal of Go was never

> It's literally from the "don't do this" category.

Sure, but how would I even know if I'm supposed to do that or not?

Note that:

  1. Go doesn't offer any way to pass a read-only slice around (something that not just C++ but even plain C can do)
  2. Go doesn't offer separate types for the “view slice” and the “owned array” (C also conflates them, but C++ handles them separately)
  3. “Idiomatic” Go conflates many other different things (e.g. a set is supposed to be handled via a hashmap) for the sake of “simplicity”

> Go's behavior is not something unusual

Maybe, but the behavior of Go's creators is unusual: it's as if they deliberately combined all the worst sides of programming language design from the last century and created something that could fight with PHP and JavaScript for the title of “the most dangerous popular memory-safe language”.

Essentially the only thing they added that keeps them from taking the prize in that contest is static typing (and a pretty weak form of it at that). That did push them back somewhat from the winning position… but other than that, all the tried-and-failed ideas were picked up and implemented.

If it were a language like BASIC, designed only to support newbies who couldn't grasp all the concepts that “serious languages” use, that would have been justified… or if it were supposed to be used only for small scripts, like Python… maybe.

But Go is not positioned like this! It pretends to be usable for large-scale projects!

And there are even some people who are using it like that…

I guess it's justified by that “strange phenomenon” that even its creators noticed: “Although we expected C++ programmers to see Go as an alternative, instead most Go programmers come from languages like Python and Ruby. Very few come from C++.” Also note how “it took us over a year to figure out arrays and slices”, which means that the abomination we are discussing here is not some sort of omission, but something that its creators are proud of! That's just simply… unbelievable.

But still… now we are lumping all languages except C/C++ into one huge pile of “memory safe” languages, and that's simply wrong.

Not all “memory safe” languages are equally safe… but given the fact that around 70% of bugs in C/C++ programs are memory safety bugs… we should consider ourselves lucky if people switch from C/C++ to Go, JavaScript and Python… although I hope languages like Ada and Rust will get some love, too.

CVEs/LOC not a great metric

Posted Dec 4, 2024 13:07 UTC (Wed) by paulj (subscriber, #341) [Link] (5 responses)

This particular "side effect" looks very confusing. Not a Go expert, but this kind of side-effect looks like a huge land-mine, that could be easy to trigger unwittingly.

CVEs/LOC not a great metric

Posted Dec 4, 2024 20:01 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

In actual Go code this type of code is _very_ unusual. The typical pattern is `arr = append(arr, items...)`, so it's clear that you're mutating the array.

CVEs/LOC not a great metric

Posted Dec 4, 2024 20:32 UTC (Wed) by khim (subscriber, #9252) [Link] (3 responses)

And does it help? If I replace newNumbers with numbers, the problem remains, of course.

The issue in that code is not that append is used incorrectly, but that the append-using function is used incorrectly.

Which is precisely and exactly the same class of bugs that make C/C++ programs fail, most of the time!

You may as well say that malloc should be paired with free and anyone who doesn't do that is “holding it wrong”.

Whether it's a hashmap that someone incorrectly modifies, or changing a struct which is not supposed to be changed… it's the exact same pattern: something is changed when it shouldn't be changed, or, alternatively, something is not changed when it should be changed.

The latter case can't be handled by the language efficiently (that's firmly #1 in farnz's classification), but the former can be, in many cases. And most serious languages before Go (C/C++, Java, C#, etc.) had a way to handle it to some degree. The approaches differ: const in C doesn't work like final in Java, and C#'s in/ref readonly has a different scope from both of them. Only the “improved C” language developed in the 21st century decided that “you are holding it wrong” is the best answer to these concerns.

Sure, the languages that Go actually replaces (JavaScript, Python or Ruby) had no such mechanisms either… which partially explains why C++ developers don't flock to Go and most developers switch to Go from dynamically typed languages. But even those are slowly growing them (JavaScript has const these days)!

But still, it's kind of funny that Go developers were telling themselves that they were building a better alternative to C++, not a better alternative to Python…

CVEs/LOC not a great metric

Posted Dec 4, 2024 21:18 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> And does it help? If I replace newNumbers with numbers, the problem remains, of course.

Yes, it helps. Because it's clear that you're mutating a passed-by-value object.

CVEs/LOC not a great metric

Posted Dec 5, 2024 11:58 UTC (Thu) by khim (subscriber, #9252) [Link]

I guess it helps a tiny bit (it's similar to the level of protection that JavaScript's const or Java's final provides), but if you are to use it like that 99% of the time, then why, in 15 years of development, has nothing been provided that doesn't force me to type the same thing twice?

Note: I'm not saying that Go is worse than C (although it is, most likely, worse than C++). I'm just saying that for the majority of “memory safe” languages, said “memory safety” doesn't make programs safer or more robust to the degree that people expect.

They add memory safety not to make programs safer or more reliable but, in 90% of cases, to make the language “simpler”, or “more flexible”, etc.

And if you eliminate 70% of bugs but make the remaining 30% of bugs 2-3 times easier to make… the net result is still positive, but it's not as large as many expect.
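As a rough worked version of that arithmetic, assuming (an interpretation, not something stated in the comment) that “k times easier to make” translates into k times as many of the remaining bugs, with B the original bug count:

```latex
B_{\text{after}} = 0.3\,B \cdot k, \qquad k \in [2, 3]
\quad\Longrightarrow\quad
0.6\,B \;\le\; B_{\text{after}} \;\le\; 0.9\,B
```

Under that assumption the net reduction would be somewhere between 10% and 40%, rather than the 70% headline figure.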

CVEs/LOC not a great metric

Posted Dec 5, 2024 0:07 UTC (Thu) by intgr (subscriber, #39733) [Link]

> JavaScript has const these days

TBF JavaScript's const is almost useless. It only prevents you from re-assigning the variable in the scope.

It does nothing to prevent mutation or side-effects: when the *value* referenced by variables is inadvertently mutated.

You can have a const Array variable and still mutate it many times in a function, pass it to another function that mutates the array etc.

Classes of logic bug

Posted Dec 4, 2024 11:37 UTC (Wed) by farnz (subscriber, #17727) [Link]

Part of the fun is that there are (at least) two classes of logic bug in play here:

  1. Failure to correctly transpose constraints from the problem domain into the code. For example, a route planner that has you arrive at the airport after the gates for your flight have closed, but before the flight's scheduled departure.
  2. Not respecting rules of the language that aren't compiler-enforced. For example, iterator invalidation when you delete items.

It's unreasonable to expect a language to prevent the first sort of logic bug completely (although it can give you the tools to write libraries that prevent them). But it's not unreasonable to expect that languages aim to reduce the second category of logic bugs to as close to zero as possible, and to at least require you to do something a bit strange in order to run into them (e.g. unusual syntax, or special marker keywords).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds