CVEs/LOC not a great metric

Posted Nov 29, 2024 12:47 UTC (Fri) by khim (subscriber, #9252)
In reply to: CVEs/LOC not a great metric by gmatht
Parent article: NonStop discussion around adding Rust to Git

> Java got lots of CVEs because people kept finding ways malicious code could break the memory safety model, whereas writing `free(x);free(x)` in C got nary a peep.

Note that I haven't said anything about Java. Java tried (and utterly failed) to do something else: use in-language restrictions as part of security boundaries.

This just doesn't work and, e.g., Android (which takes security seriously) doesn't even try to use Java for that.

But all that excitement about “memory safe languages” lumps together languages like Python and Ruby, on one hand, and Ada and Rust, on the other hand. Just take a look at the list in the government report: Python®, Java®, C#, Go, Delphi/Object Pascal, Swift®, Ruby™, Rust®, and Ada. Thank god no one told them that the most popular language of the 1980s is also “memory-safe” (if you forget about PEEK/POKE, but then most other “memory safe” languages also have such “escape hatches”)!

> Safe languages tend to also have less severe CVEs

Have you forgotten about xkcd 1200? In what way is Log4Shell “less severe” than a buffer overflow in the kernel? Sure, such bugs don't give you the ability to install an unauthorized driver, but only give you a way to steal passwords and your money…

> Also I presume Rust is more concise than C

Rust is almost as strict about safety and correctness as Ada. C# also comes close. Haskell is probably even safer than Rust. And, lo and behold: all these four languages have affine types today! Rust got them first, then C# added a simplified (and restricted) version, then Ada picked them up… Haskell even has linear types!

But because the push, today, is not to move to “safe” languages, but to “memory safe” languages… people move to Python, Ruby and PHP (also, technically, a memory safe language) – and if you look at the CVE numbers for these… there the phenomenon where, thinking the language safe, people make more of the other kinds of mistakes is very well observed.

> This is a little bit meaningless without knowing their definition of Concurrency and Vulnerability, but since these are the types of things Rust tries to prevent in safe code, it is a hint that Rust isn't encouraging bugs elsewhere.

Most CVEs in the Rust ecosystem are for violations of the soundness pledge. IOW: they are similar to Java CVEs and cover cases where incorrect use of library code may allow one to break through the unsafe boundary and make unsafe code misbehave.

Only Java ever tried to treat these as a security vulnerability – and only for Sun (later Oracle) produced code for Java Runtime (and privileged Java classes). Most Rust CVE bugs wouldn't even be considered bugs in most other languages (including Java, outside of runtime).

Rust is extremely serious about safety… which is, ironically, how it lost the tracing GC. Because having tracing GC in the language makes it easier to achieve “memory safety”, but harder to deal with other kinds of safety!

And there is research into how dependent typing can make things even safer in the future, in some kind of Rust successor… but these efforts are at the experimental stage; only Wuffs is used in production, and it's not a general-purpose language.

P.S. And I absolutely agree that going to “memory safe” languages would be a good first step. If someone had tried to make all these JavaScript and Python coders do a 100-foot high jump and switch to Haskell… there would have been riots and no one would have done anything. Making sure C and C++ wouldn't be used, though… that people can buy.

CVEs/LOC not a great metric

Posted Dec 2, 2024 19:43 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (33 responses)

Not too far beyond "memory safety" one gets into things like "formalized logic" and such. We (as humans) typically don't spec software out enough to "prove" them correct in any "logically safe" sense. There's also the question of what "safe" means for the software on a missile or in some weapon system. It's certainly not safe for anyone within the collateral damage radius of the weapon (even assuming the intended target "deserves" it in some way…which is certainly debatable most of the time). Getting memory safety, to me, is mostly about getting the code to do what is intended while also guarding against unintended behaviors due to assumptions made within the computing environment at use (i.e., the abstract machine the language uses to represent its semantics that are then translated into actual hardware instructions in some equivalent way). Basically, it's about the code doing what it says on the tin rather than having to also guard against own-goals like "oh, your stack got smashed because someone wrote outside of an array and now any localized reasoning you had about the code is useless".

CVEs/LOC not a great metric

Posted Dec 2, 2024 20:37 UTC (Mon) by khim (subscriber, #9252) [Link] (32 responses)

> Not too far beyond "memory safety" one gets into things like "formalized logic" and such.

Yes, and no. It's true that originally, the first “memory safe” language, LISP was created to reason about things like "formalized logic" and such.

But the vast majority of popular “memory safe” languages today were born to go in precisely the opposite direction! They are not using a garbage collector as part of the “formalized logic”, but for the exact opposite reason: to enable “flexibility” and “velocity”. To make sure one may write code without writing a formal spec and without thinking too much about what the code would be doing.

IOW: “memory safety” is not used in these languages as a step on the path to the correct program, nope. It's the opposite: “memory safety”, achieved via tracing GC, is used as “padded walls” for the crazy programs!

Hey, we made it impossible to touch arbitrary memory and that means your program couldn't touch our runtime and that means it couldn't touch unallocated memory and now it no longer matters what crazy stuff you are doing on top of that – runtime system would keep you safe!

That's the idea behind the majority of “memory safe” languages in today's world. Not to make software more robust, but to make wildly unsafe code somewhat more predictable and easier to debug.

> Getting memory safety, to me, is mostly about getting the code to do what is intended

IOW: for you memory safety is one step on the path toward program safety and correctness… and that's what some other languages do with it, too: Haskell, Ada, Rust…

But in the majority of “memory safe” languages that exist these days that's not the primary (and often not even secondary) goal. “Flexibility” and “velocity” matter, not safety.

It just so happens that to achieve “flexibility” and “velocity” (where developers combine random pieces of code without thinking and then test the result till they are satisfied) “memory safety” is a useful tool.

It's even true in Java, to a great extent: the majority of these “industrial” programmers simply couldn't write anything in a language with UB, where tests may pass and then your “well-tested” code may blow up in your face without any good reason.

And you can only achieve so much “safety” in a language when the user of the language is not concerned about safety at all, but is just conducting genetic experiments and looks for the specimen that would look kinda-sorta-working.

> Basically, it's about the code doing what it says on the tin rather than having to also guard against own-goals like "oh, your stack got smashed because someone wrote outside of an array and now any localized reasoning you had about the code is useless".

Yeah, but “memory safety” is not guarding your program in most memory safe languages, today. It guards solely and exclusively runtime, it doesn't even try to protect your program.

Consider the latest and greatest one, Go:

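package main

import "fmt"

// appendAndChange appends to its argument, then overwrites element 0 of the result.
func appendAndChange(numbers []int) {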
	newNumbers := append(numbers, 42)
	newNumbers[0] = 666
	fmt.Println("inside", newNumbers)
}

func main() {
	slice := []int{1, 2, 3}
	fmt.Println("before", slice)
	appendAndChange(slice)
	fmt.Println("after ", slice)

	fmt.Println("original slice is intact")
	fmt.Println("------")

	slice = append(slice, 5)
	fmt.Println("before", slice)
	appendAndChange(slice)
	fmt.Println("after ", slice)
	fmt.Println("original slice is modified")
}

The output of the program above would be:

before [1 2 3]
inside [666 2 3 42]
after  [1 2 3]
original slice is intact
------
before [1 2 3 5]
inside [666 2 3 5 42]
after  [666 2 3 5]
original slice is modified

Can you explain why the first slice wasn't modified, but the second slice was? How the whole thing works? Is it even guaranteed to work?

The answer is “no”, and “you are using it wrong”. And that's a “memory safe” language! The latest one!

The only difference between C/C++ and these “memory safe” languages is in the fact that an error in your code may generate “nasal daemons” in C/C++ but in “memory safe” languages behavior is restricted, predictable and thus debuggable.

If you care about safety and correctness then sooner or later your language would get affine types, maybe linear types, and, at some point, dependent typing is not out of the question. You may even drop tracing GC at some point, because affine and linear types work without it!

Like Rust did.

But if you are using “memory safety” as a gateway to “flexibility” and “velocity” then sure, you can protect yourself against some common C/C++ bugs, but that doesn't mean that your code wouldn't include other, even easier to exploit, bugs!

After all, if you investigate stories about how people break into various web servers… 90% of the time (if not 99% of the time) they don't bother to use a buffer overflow in these huge and complicated parts (like the Linux kernel, PostgreSQL or NGINX), but simply find an exploitable vulnerability in the relatively simple and small scripts that people are creating in these “memory safe” languages like PHP or JavaScript!

CVEs/LOC not a great metric

Posted Dec 2, 2024 23:53 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (26 responses)

> The answer is “no”, and “you are using it wrong”. And that's “memory safe” language! The latest one!

It's actually easy to explain: the array reserves space exponentially on appends. There is nothing surprising here. You can't use any of this behavior to violate memory safety in Go.
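A minimal sketch of that explanation, using only the built-in `len`, `cap` and `append`. The helper name `appendInPlace` is made up for illustration, and the capacity chosen on reallocation is up to the runtime, so no particular value is claimed for it:

```go
package main

import "fmt"

// appendInPlace (hypothetical helper) reports whether append can reuse the
// existing backing array of s, i.e. whether s has spare capacity.
func appendInPlace(s []int) bool {
	return cap(s) > len(s)
}

func main() {
	slice := []int{1, 2, 3}
	// A slice literal has exactly as much capacity as it has elements.
	fmt.Println(len(slice), cap(slice), appendInPlace(slice)) // 3 3 false

	slice = append(slice, 5)
	// This append had to reallocate. The runtime picks the new capacity and
	// usually leaves room for more appends, which is when aliasing starts.
	fmt.Println(len(slice), cap(slice), appendInPlace(slice))
}
```

Whenever `appendInPlace` would return true, a later `append` writes into memory that the caller's slice can still see.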

Although you can do that if you have multiple threads, Go doesn't guarantee the memory safety in the presence of multithreading. FWIW, Java actually does guarantee it.

CVEs/LOC not a great metric

Posted Dec 3, 2024 0:24 UTC (Tue) by khim (subscriber, #9252) [Link] (12 responses)

> It's actually easy to explain: the array reserves space exponentially on appends.

And how could I predict the effect of that behavior? When would my “outer” variable be corrupted and when would it survive intact?

> There is nothing surprising here.

Seriously? This variable is maybe mutable or maybe not mutable and maybe would return modified result or maybe not return it… now that's called “nothing surprising here”?

> You can't use any of this behavior to violate memory safety in Go.

And that's precisely what I meant when I mentioned that xkcd 1200 strip.

Can such “easy to explain” behavior lead to corruption of user data? Yes, sure. Lots. Can it be used to bypass security checks? Of course. Lots. Can it be used to cause “memory unsafety”? No, no, that's sacred, of course not!

What good does “memory safety” buy me if my security locks are broken, passwords and databases are stolen and data is replaced with some garbage? What's the point of it?

> FWIW, Java actually does guarantee it.

Yeah, C# and Java are probably the most sane of the popular “memory safe” languages. They even try to actually provide safety in the sense of getting the code to do what it is intended to do, honestly! Not much, and they mostly steal features from Rust, Haskell or ML and adapt them in half-baked form… but they at least try to care about safety in the sense of getting the code to do what it is intended to do.

Go, JavaScript, PHP, Python, Ruby? Nope, or at least they are not really serious about it. They try to add some optional, opt-in features in recent versions… but security doesn't work if it's opt-in. It only works if it's opt-out. Only in that case you can actually guarantee something.

That's the biggest issue with TypeScript: it contains decent safety guards… but they all can be easily broken because of compatibility features – that have to be there or else it wouldn't be useful in a world where the majority of web developers use JavaScript and not TypeScript.

P.S. Nonetheless the push to adopt memory safe languages is a good first step. It doesn't buy us too much and a lot of people who are using horribly unsafe (yet genuinely “memory safe”) languages would be feeling smug for no good reason… but that's definitely only a first step. Heck, think about it: Microsoft BASIC from almost half a century ago with no subroutines, only global variables and only two letters for globals… technically it's “memory safe” if you don't use PEEK/POKE. Would you want to rewrite something like the Linux kernel in it? Would it become more robust after such a rewrite? If not – then why not? It's a “memory safe” language, it should be better than C! But at least this tells us that “you are holding it wrong” is not the sole reason for the fact that our programs are, essentially, Swiss cheese from the security POV. And castigating developers for the mistakes they sometimes make is not a constructive way to achieve safety and security. Perhaps 10 or 20 years down the road people would warm up to the idea that not all “memory safe” languages are born equal… but first we have to step up from C/C++… like we stepped up from assembler 30 (or 40?) years ago.

CVEs/LOC not a great metric

Posted Dec 3, 2024 1:50 UTC (Tue) by mirabilos (subscriber, #84359) [Link]

Great points! Especially thanks for the example in issue9… that would totally trip me up if I had to peek at some code written in it without having learnt the language inside out (which I haven’t).

I also considered BASIC and scripting languages, funnily enough.

CVEs/LOC not a great metric

Posted Dec 3, 2024 20:35 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (10 responses)

> And how could I predict the effect of that behavior? When would my “outer” variable be corrupted and when would it survive intact?

How do you remember not to use index "8" if you delete 10 elements from an array with the size of 11?

It's literally a logical bug. Don't depend on the underlying storage length, unless you know what you're doing.

And no, it's not a memory safety issue. At worst, you'll get a bounds check panic.

CVEs/LOC not a great metric

Posted Dec 3, 2024 21:08 UTC (Tue) by khim (subscriber, #9252) [Link] (8 responses)

> It's literally a logical bug. Don't depend on the underlying storage length, unless you know what you're doing.

Isn't that the exact same mantra that C and, later, C++ programmers repeated for last half century: don't write programs with bugs – and everything would be all right?

Note that the bug is easy to spot here only because I have explicitly named the function appendAndChange. If it had been named in a fashion that doesn't attract attention to the fact that it may change the content of the slice that is passed into it – then it could be used for a long time before someone would notice that it corrupts some data it shouldn't be corrupting.

Because it usually doesn't change the array passed into it, if that array is created in static fashion, even a simple and naïve test wouldn't reveal any issues.

You have to prepare your slice in a way that is both unnatural for the testing environment and common for the actual code that works with dynamic arrays.

A perfect combo to lead to long hours of debugging if you don't know precisely the undocumented rules about how Go slices work. Almost as dangerous as Python's desire for “simplicity” and arrays used as default arguments (although there the behavior is documented, thus a tiny bit less dangerous).

> At worst, you'll get a bounds check panic.

Nope. That's the best outcome. And very much not a guaranteed outcome. The normal, typical, most common outcome would be silent data corruption somewhere.

Because, again, it's only called appendAndChange here to make it easier to understand the issue.

In real-world code that would be some function that is not supposed to change its argument and wouldn't change it in tests… but would change it in production code deployed in some faraway place.

Yes, that's not a memory safety issue, it's just almost indistinguishable from the “memory unsafety”.

And wouldn't exist in any “complicated” language like C# or Java (with String and StringBuilder), Rust or even C++!

Yet the “modern” Go language in its quest “for simplicity” makes the exact same mistake that C made half a century ago.

But C has an excuse: it was designed on a computer with 16KiB of RAM, it couldn't add many complicated niceties… what's Go's excuse?

CVEs/LOC not a great metric

Posted Dec 4, 2024 0:33 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (7 responses)

> Isn't that the exact same mantra that C and, later, C++ programmers repeated for last half century: don't write programs with bugs – and everything would be all right?

Pretty much. Go's behavior is not something unusual, you're (possibly) mutating an object, and then depending on the side effects of this mutation. It's literally from the "don't do this" category.

Go only guarantees that your program won't cause memory safety issues.

CVEs/LOC not a great metric

Posted Dec 4, 2024 1:18 UTC (Wed) by khim (subscriber, #9252) [Link]

> Go only guarantees that your program won't cause memory safety issues.

And then, only if there are no race conditions. Which is precisely my point: the goal of Go was never

> It's literally from the "don't do this" category.

Sure, but how would I even know if I'm supposed to do that or not?

Note, that:

Go doesn't offer any way to pass a read-only slice around (like not just C++, but even plain C could do)
Go doesn't offer separate types for the “view slice” and “owned array” (C also conflates them, but C++ handles them separately)
“Idiomatic” Go conflates many other different things (e.g. set is supposed to be handled via hashmap) for the “simplicity”
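Since there is no read-only slice type, the protection has to be a caller-side convention. A sketch of two such conventions, assuming the same `appendAndChange` as in the earlier example (changed here to return its result): the three-index slice expression clips the capacity so the callee's `append` must reallocate, and a plain copy hands over separate storage. Neither is enforced by the compiler, and clipping does nothing against a callee that writes to existing elements directly.

```go
package main

import "fmt"

// appendAndChange is the function from the example above, returning its result.
func appendAndChange(numbers []int) []int {
	newNumbers := append(numbers, 42)
	newNumbers[0] = 666
	return newNumbers
}

func main() {
	slice := make([]int, 4, 8) // spare capacity, like a slice grown by append
	copy(slice, []int{1, 2, 3, 5})

	// Convention 1: clip the capacity, so append in the callee reallocates.
	appendAndChange(slice[:len(slice):len(slice)])
	fmt.Println(slice) // [1 2 3 5]

	// Convention 2: pass a copy with its own backing array.
	appendAndChange(append([]int(nil), slice...))
	fmt.Println(slice) // [1 2 3 5]

	// No convention: the callee's write lands in the caller's data.
	appendAndChange(slice)
	fmt.Println(slice) // [666 2 3 5]
}
```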

> Go's behavior is not something unusual

Maybe, but the behavior of Go's creators is unusual: it's as if they deliberately combined all the worst sides of programming language design from the last century and created something that could fight with PHP and JavaScript for the title of “the most dangerous popular memory-safe language”.

Essentially the only thing they added that keeps them from taking the prize in that contest is static typing (and even then a pretty weak one). This, indeed, pushed them somewhat from the winning position… but other than that… all the tried and failed ideas are picked up and implemented.

If that were a language like BASIC, only designed to support newbies who couldn't grasp all the concepts that “serious languages” are using – that would have been justified… or if that was supposed to be used only for small scripts, like Python… maybe.

But Go is not positioned like this! It pretends to be usable for large-scale projects!

And there are even some people who are using it like that…

I guess it's justified by that “strange phenomenon” that even its creators noticed: “Although we expected C++ programmers to see Go as an alternative, instead most Go programmers come from languages like Python and Ruby. Very few come from C++.” Also note how it took us over a year to figure out arrays and slices – which means that the abomination that we are discussing here is not some sort of omission, but something that its creators are proud of! That's just simply… unbelievable.

But still… now we are conflating all languages except for C/C++ in one huge lump of “memory safe” languages – and that's simply wrong.

Not all “memory safe” languages are equally safe… but given the fact that around 70% of bugs in C/C++ programs are memory safety bugs… we should consider ourselves lucky if people would switch from C/C++ to Go, JavaScript and Python… although I hope languages like Ada and Rust would get some love, too.

CVEs/LOC not a great metric

Posted Dec 4, 2024 13:07 UTC (Wed) by paulj (subscriber, #341) [Link] (5 responses)

This particular "side effect" looks very confusing. Not a Go expert, but this kind of side-effect looks like a huge land-mine, that could be easy to trigger unwittingly.

CVEs/LOC not a great metric

Posted Dec 4, 2024 20:01 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

In the actual Go code this type of code is _very_ unusual. The typical pattern is `arr = append(arr, items...)`, so it's clear that you're mutating the array.

CVEs/LOC not a great metric

Posted Dec 4, 2024 20:32 UTC (Wed) by khim (subscriber, #9252) [Link] (3 responses)

And does it help? If I replace newNumbers with numbers the problem remains, of course.
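A sketch of that variant, with the parameter reassigned in the `arr = append(arr, ...)` style. The helper `firstAfterCall` is made up for illustration; `make([]int, 3, 8)` stands in for any slice that happens to have spare capacity:

```go
package main

import "fmt"

// appendAndChange reassigns its own parameter instead of using a new name.
func appendAndChange(numbers []int) {
	numbers = append(numbers, 42)
	numbers[0] = 666
}

// firstAfterCall (hypothetical helper) shows what the caller sees afterwards.
func firstAfterCall(s []int) int {
	appendAndChange(s)
	return s[0]
}

func main() {
	// No spare capacity: append reallocates, the caller's data is untouched.
	fmt.Println(firstAfterCall([]int{1, 2, 3})) // 1

	// Spare capacity: append reuses the array, the write reaches the caller.
	fmt.Println(firstAfterCall(make([]int, 3, 8))) // 666
}
```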

The issue in that code is not that append is used incorrectly, but in the fact that append-using function is used incorrectly.

Which is precisely and exactly the same class of bugs that make C/C++ programs fail, most of the time!

You may as well say that malloc should be paired with free and anyone who doesn't do that is “holding it wrong”.

Whether it's a hashmap that someone incorrectly modifies, or changing a struct which is not supposed to be changed… it's the exact same pattern: something is changed when it shouldn't be changed, or, alternatively, something is not changed when it should be changed.

The latter case couldn't be handled by the language efficiently (that's firmly #1 in farnz classification), but the former could be handled, in many cases. And most serious languages before Go (like C/C++, Java, C#, etc) had a way to handle that to some degree (approaches are different, const in C doesn't work like final in Java and C# in/ref readonly have different scope from both of them), but only the “improved C” language developed in the XXI century decided that “you are holding it wrong” is the best answer to these concerns.

Sure, the languages that Go actually replaces (JavaScript, Python or Ruby) had no such mechanisms either… which partially explains why C++ developers don't flock to Go and most developers switch to Go from dynamically-typed languages. But even they are slowly growing them (JavaScript has const these days)!

But still it's kind of funny that Go developers were telling themselves that they were building a better alternative to C++, not a better alternative to Python…

CVEs/LOC not a great metric

Posted Dec 4, 2024 21:18 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> And does it help? If I replace newNumbers with numbers problem remains, of course.

Yes, it helps. Because it's clear that you're mutating a passed-by-value object.

CVEs/LOC not a great metric

Posted Dec 5, 2024 11:58 UTC (Thu) by khim (subscriber, #9252) [Link]

I guess it helps a tiny bit (it's similar to the level of protection JavaScript's const or Java's final provides), but if you are to use it like that 99% of the time, then why, in 15 years of development, was nothing provided that doesn't force me to type the same thing twice?

Note: I'm not saying that Go is worse than C (although it's, most likely, worse than C++), I'm just saying that for majority of the “memory safe” languages said “memory safety” doesn't make programs safer or more robust to a degree that people expect.

They add memory safety not to make them safer or more reliable, but, in 90% of cases, to make them “simpler”, or “more flexible”, etc.

And if you eliminate 70% of bugs but make remaining 30% of bugs 2-3 times easier to make… the net result is still positive, but it's not as large as many expect.

CVEs/LOC not a great metric

Posted Dec 5, 2024 0:07 UTC (Thu) by intgr (subscriber, #39733) [Link]

> JavaScript have const these days

TBF JavaScript's const is almost useless. It only prevents you from re-assigning the variable in the scope.

It does nothing to prevent mutation or side-effects: when the *value* referenced by variables is inadvertently mutated.

You can have a const Array variable and still mutate it many times in a function, pass it to another function that mutates the array etc.

Classes of logic bug

Posted Dec 4, 2024 11:37 UTC (Wed) by farnz (subscriber, #17727) [Link]

Part of the fun here is that there's (at least) two classes of logic bug in play here:

Failure to correctly transpose constraints from the problem domain into the code. For example, if a route planner has you arrive at the airport after the gates for your flight have closed, but before the flight's scheduled departure.
Not respecting rules of the language that aren't compiler-enforced. For example, iterator invalidation when you delete items.

It's unreasonable to expect a language to prevent the first sort of logic bug completely (although it can give you the tools to write libraries that prevent them). But it's not unreasonable to expect that languages aim to reduce the second category of logic bugs to as close to zero as possible, and to at least require you to do something a bit strange in order to run into them (e.g. unusual syntax, or special marker keywords).
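A Go-flavoured sketch of the second category, under the assumption that "delete while iterating over a slice" is a fair stand-in for iterator invalidation. Both function names are made up for illustration. The buggy version compiles without complaint:

```go
package main

import "fmt"

// removeEvensBuggy deletes elements from s while ranging over it.
// The range loop keeps using the slice it saw at the start, so after a
// deletion the loop index no longer lines up with the shifted data.
func removeEvensBuggy(s []int) []int {
	for i, v := range s {
		if v%2 == 0 {
			s = append(s[:i], s[i+1:]...)
		}
	}
	return s
}

// removeEvens filters into a separate write position instead of deleting
// elements under the iteration.
func removeEvens(s []int) []int {
	out := s[:0]
	for _, v := range s {
		if v%2 != 0 {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(removeEvensBuggy([]int{2, 4, 1})) // [4 1]: the 4 was skipped
	fmt.Println(removeEvens([]int{2, 4, 1}))      // [1]
}
```

With other inputs the buggy version can also panic with an out-of-range slice bound, which is the lucky outcome, since at least it is loud.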

CVEs/LOC not a great metric

Posted Dec 3, 2024 10:53 UTC (Tue) by paulj (subscriber, #341) [Link] (12 responses)

> It's actually easy to explain: the array reserves space exponentially on appends.

Can you expand on that for non-Go people? The slice that append returns is a different type of slice to the original []int slice?

CVEs/LOC not a great metric

Posted Dec 3, 2024 11:05 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (11 responses)

It looks like bad copy-on-write semantics to me. The first call's `append` has to allocate new memory, so `newNumbers[0]` is completely separate from the input array's backing memory. The second call uses the existing allocation (the extra space the first call to `append` reserved) and writes "see through" to the passed array and modifications affect it. Feels kind of like the Python behavior where `def foo(arr=[]): arr.append(1)` ends up modifying the default argument's backing store if `arr` is not passed.
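The "see through" part can be checked directly by comparing element addresses. A small sketch, where `sameBackingArray` is a made-up helper and `make([]int, 3, 8)` stands in for a slice that already has spare room:

```go
package main

import "fmt"

// sameBackingArray (hypothetical helper) reports whether two non-empty
// slices begin at the same element of the same underlying array.
func sameBackingArray(a, b []int) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	full := []int{1, 2, 3}    // len 3, cap 3: no spare room
	grown := append(full, 42) // has to allocate a new array
	fmt.Println(sameBackingArray(full, grown)) // false

	roomy := make([]int, 3, 8)    // len 3, cap 8: spare room
	extended := append(roomy, 42) // reuses roomy's array
	fmt.Println(sameBackingArray(roomy, extended)) // true
}
```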

CVEs/LOC not a great metric

Posted Dec 3, 2024 12:40 UTC (Tue) by khim (subscriber, #9252) [Link] (8 responses)

Correct. And the most amusing issue there is the reaction to that problem from language developers: just don't do that.

Come on, guys! If just don't do that (and write your code without bugs, dummy) would have been a good, actionable, answer then that push for the “memory safe” languages wouldn't have happened in the first place!

If you write code without mistakes then you can write a correct, working program in C, C++ or even assembler: D. J. Bernstein did that (or almost did that, depending on whom you are asking), this approach works!

But it doesn't scale – and that raises the question: why should we accept such an answer in “memory safe” languages… we have just abandoned “memory unsafe” ones precisely because that's a bad answer!

P.S. This doesn't mean that Ada or Rust never say just don't do that. Sometimes it's just simply unavoidable. E.g. if your hash function doesn't always return the same value for the same object (which may happen if you use the address of some object in the hash and then the object is moved) then Rust's HashSet is not guaranteed to find the element in that set. Kind of unfortunate, but it's hard to offer a better alternative. But to design a language in the XXI century and conflate dynamic array and slice “for simplicity”? Really? Sure, you “simplified” the language… by shoving all that complexity into my head… how is that a good trade-off? I like “simple” languages as much as the next guy – as long as that “simplicity” is not bought by shoving the intrinsic complexity into my head! And most “memory safe” languages that are popular today are like that: they are “simple” in the sense that their reference guide is short – but then all that complexity has to live in the head of the language user… because where else could it go?

CVEs/LOC not a great metric

Posted Dec 3, 2024 13:05 UTC (Tue) by Wol (subscriber, #4433) [Link] (7 responses)

> Really? Sure, you “simplified” the language… by shoving all that complexity into my head… how is that a good trade-off?

Einstein: "Make things as simple as possible - But No Simpler!"

If you make *PART* of the problem space simpler, by pushing the complexity elsewhere (where it doesn't belong), you make the overall solution more complex.

(Don't get me started on Relational and SQL :-)

Cheers,
Wol

CVEs/LOC not a great metric

Posted Dec 3, 2024 13:20 UTC (Tue) by khim (subscriber, #9252) [Link] (6 responses)

> (Don't get me started on Relational and SQL :-)

Lol. That's actually a surprisingly relevant (if sad) example. Sure, SQL is a horrible paradigm, but would anything else have a chance in a world where so many people know SQL and so few know anything else?

That push for safer, more robust programs (and the push for “memory safe” languages) started not because what Rust offers is better than what C++ offers, no.

It only started when losses started being measured in trillions.

Does the [ab]use of SQL have a chance to lead to losses of that magnitude? Does it cause anything but 10x (or maybe 100x?) excess resource consumption in some rare situations?

I don't know, in all the applications where I use SQL 100x speedup of database access would affect the work of the whole system only very marginally.

CVEs/LOC not a great metric

Posted Dec 3, 2024 14:14 UTC (Tue) by Wol (subscriber, #4433) [Link] (5 responses)

> I don't know, in all the applications where I use SQL 100x speedup of database access would affect the work of the whole system only very marginally.

And if it sped up DEVELOPMENT time by the same one or two orders of magnitude?

One of my war stories (that I'm not sure if it rings true ... the database side rings true, it's the rest of it that doesn't) is the Australian breweries when Australia introduced GST.

Of the six breweries, five ran a relational-based accounts system. One of these realised that by leasing barrels to the pubs rather than selling and buying back, they could save a lot of tax. They updated their systems and caught the other five breweries on the hop.

The other four relational breweries took six months to catch up - and lost a lot of market share. Made worse because the sixth brewery modified their Pick-based system in two months - so for the last four months they had TWO breweries taking market share because they were paying less tax...

(Oh - and where seconds count, like 999-response systems (that's 112 or 911 to non-brits), would a "results almost before I hit "return" " response time make much difference? :-)

Cheers,
Wol

CVEs/LOC not a great metric

Posted Dec 3, 2024 16:02 UTC (Tue) by khim (subscriber, #9252) [Link] (4 responses)

> that I'm not sure if it rings true ... the database side rings true, it's the rest of it that doesn't

But it's the rest of it that matters.

> The other four relational breweries took six months to catch up - and lost a lot of market share. Made worse because the sixth brewery modified their Pick-based system in two months - so for the last four months they had TWO breweries taking market share because they were paying less tax...

And the whole thing just falls to pieces at this point. My sister is an accountant and she has faced cases like these a few times. Usually it just takes two or three days to cobble together something working using some Excel sheets and maybe VBA scripts. And accounting people just do the papers by hand. Then later, when the need is not as pressing, one makes it automatic.

But sure, that only works for small companies that are not under extra-strict pressure… but these companies that do have the need to do everything in a “certified” fashion… who the hell allowed them to use Pick in place of SAP?

In my experience if something takes not days, but months or years that's not the result of deficiency of SQL or something like that, but the need to jump through lots of paper barriers – and I fail to see how Pick can improve anything there.

> Oh - and where seconds count, like 999-response systems (that's 112 or 911 to non-brits), would a "results almost before I hit "return" " response time make much difference? :-)

Most likely no. Simply because no one would even allow you to use anything without a pile of certification papers… which that Pick-based system wouldn't have (and most SQL-based systems wouldn't have either).

In the end some horribly inefficient, but certified solution would be picked, no matter what you do.

CVEs/LOC not a great metric

Posted Dec 3, 2024 17:03 UTC (Tue) by Wol (subscriber, #4433)

> In my experience if something takes not days, but months or years that's not the result of deficiency of SQL or something like that, but the need to jump through lots of paper barriers – and I fail to see how Pick can improve anything there.

In my experience (and I'm currently living the hell ...) writing one single query in SQL can take days. The same query in Pick can be knocked up, tested, and verified! in five minutes! (The query I'm currently trying to debug, and falling over corner cases, and discovering that I'm querying data that doesn't exist or is stored somewhere else, and and and - has taken me months and I've just had a message saying it's screwed up AGAIN! I think I know why, some attribute I don't give a monkeys about has an unexpected value ... :-(

> In the end some horribly inefficient, but certified solution would be picked, no matter what you do.

True :-(

So basically, you're saying that 90% of the work is not programming, but administration ... but there's a lot of places where that isn't true. I work for an (ex)FTSE100, and I'm trying to introduce Pick to get rid of our Excel database. But yes, I know about paperwork - it's supposedly dead easy to introduce "new" technology, there's just a whole bunch of hoops I don't have a clue how to navigate :-( Once it's in, it won't be a problem any more ...

Cheers,
Wol

CVEs/LOC not a great metric

Posted Dec 3, 2024 17:24 UTC (Tue) by daroc (editor, #160859)

I think we've wandered pretty far off topic, at this point. Let's leave this discussion here.

CVEs/LOC not a great metric

Posted Dec 4, 2024 13:04 UTC (Wed) by paulj (subscriber, #341)

Wol was doing so well the last while... He's actually managed to go quite a while recently without bringing up Pick. This is his first relapse of his pickaholicism in a while, TTBOM. Maybe he can last a bit longer again from this point forward. ;)

CVEs/LOC not a great metric

Posted Dec 3, 2024 17:08 UTC (Tue) by Wol (subscriber, #4433)

> But sure, that only works for small companies that are not under extra-strict pressure… but these companies that do have the need to do everything in a “certified” fashion… who the hell allowed them to use Pick in place of SAP?

The accountants? Because chances are, the accounting system was written by accountants who knew what an accounting system was supposed to do, and not by programmers who filled it with bugs because they didn't have a clue?

(Okay, this IS a seriously double-edged sword!)

Cheers,
Wol

CVEs/LOC not a great metric

Posted Dec 3, 2024 18:09 UTC (Tue) by mirabilos (subscriber, #84359)

Except that the Python behaviour is consistent (and documented, not sure about the issue9 behaviour wrt. that).

CVEs/LOC not a great metric

Posted Dec 3, 2024 21:16 UTC (Tue) by mathstuf (subscriber, #69389)

Yes, there are definitely lints that warn about such things in Python. My caveman club-level of finesse with Golang is usually just happy to get things to work (though I've not dealt with its slices enough to know how any linters behave with them).

CVEs/LOC not a great metric

Posted Dec 3, 2024 13:06 UTC (Tue) by kleptog (subscriber, #1183)

> And you can only achieve so much “safety” in a language when the user of the language is not concerned about safety at all, but is just conducting genetic experiments and looks for the specimen that would look kinda-sorta-working.

The result of genetic programming is basically determined by the quality of your fitness function. A strict compiler, test cases and code review are all things that make genetic programming more effective. Only the first of those can't be skipped: you can't skip the compiler.

The test is: how close can your code come to *looking* like it's supposed to work while not actually being correct? Writing code in Rust that simultaneously looks correct, compiles but still does the wrong thing is actually tricky. This I believe is the saving grace of Python: the syntax is simple enough that it almost always does what it looks like it's doing, which, together with PEP 8 and a particular coding culture, makes it mostly work. The rampant use of monkey patching in Ruby drove me nuts (not sure if it's still like that) and makes things much less obvious.

So far there's only been one Underhanded Rust contest with only 4 entries. It would be great to see if people can improve on those submissions.

CVEs/LOC not a great metric

Posted Dec 3, 2024 14:05 UTC (Tue) by khim (subscriber, #9252)

> Writing code in Rust that simultaneously looks correct, compiles but still does the wrong thing is actually tricky.

I also believed that… till I saw what people who simply connect the output of ChatGPT to the compiler are producing.

Believe me, if you concentrate on keeping your head entirely clean, without any understanding of what you are doing… just feeding compiler advice into ChatGPT often leads you to something totally crazy that still compiles and runs… just doesn't work.

The trick is to not know anything about programming at all and to not even think about the task that you are trying to solve.

Then you can create a really powerful headache for the reviewers.

CVEs/LOC not a great metric

Posted Dec 3, 2024 14:19 UTC (Tue) by Wol (subscriber, #4433)

> The test is: how close can your code come to *looking* like it's supposed to work while not actually being correct?

This is actually a pretty neat description of the sexual arms race :-)

Male profligacy attracts females. How does a male APPEAR profligate, while not actually expending much in resources? How can a female be sure he is actually being profligate (because he can) and so would make a good partner?

As the above shows, too many compilers are bad sexual partners, because they're faking it ... :-)

Cheers,
Wol

Digression

Posted Dec 3, 2024 14:51 UTC (Tue) by corbet (editor, #1)

This seems like an entirely unnecessary, and perhaps misogynistic, digression. It would really be nicer if you could resist the temptation to do this sort of thing here. Please?

CVEs/LOC not a great metric

Posted Dec 3, 2024 15:26 UTC (Tue) by farnz (subscriber, #17727)

The test is a little more complex than that - you need the code to "look" like it's supposed to work, to compile, and to work in the cases the programmer tests by hand. Code that looks like it should work, that compiles, but doesn't work when tested by the programmer is code that gets reworked further until it does work for the programmer's test cases.

No language is completely resistant to this - anyone can make the code more complicated until it basically has N cases for the things the programmer tests by hand, and then a general case that doesn't work - but doing so does make it more likely that your fitness function is going to determine that this code is "too complicated" and reject it on that basis.
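
To make the "N cases plus a broken general case" shape concrete, here is a minimal Python sketch. The function names and the choice of primality testing as the task are made up purely for illustration; the point is only that the overfitted version passes everything tried by hand and still fails elsewhere.

```python
import math


def is_prime_overfitted(n: int) -> bool:
    """Hypothetical result of 'rework it until my hand tests pass'."""
    # Special cases: exactly the inputs the programmer tried by hand.
    if n == 2:
        return True
    if n == 3:
        return True
    if n == 4:
        return False
    if n == 7:
        return True
    if n == 9:
        return False
    # "General" case that is simply wrong: every other odd number is
    # declared prime, every other even number composite.
    return n % 2 == 1


def is_prime(n: int) -> bool:
    """Straightforward reference implementation (trial division)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))


# The inputs the programmer checked by hand all agree with the reference.
hand_tested = [2, 3, 4, 7, 9]
hand_tests_pass = all(is_prime_overfitted(n) == is_prime(n) for n in hand_tested)

# A slightly wider sweep exposes the broken general case.
mismatches = [n for n in range(2, 30) if is_prime_overfitted(n) != is_prime(n)]

print(hand_tests_pass)  # True
print(mismatches)       # [15, 21, 25, 27]
```

A reviewer (or any other fitness function) looking at the pile of special cases has a decent chance of rejecting this as "too complicated" before the wrong answers are ever noticed.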

CVEs/LOC not a great metric

Posted Dec 15, 2024 8:00 UTC (Sun) by sammythesnake (guest, #17693)

> people move to [...] PHP (also, technically, a memory safe language) [...] thinking the language safe [...]

I don't think I've ever heard anyone describe PHP as safe, most people who even have a concept of "safe" wouldn't touch PHP with a 20ft pole!

CVEs/LOC not a great metric

Posted Dec 15, 2024 13:55 UTC (Sun) by pizza (subscriber, #46)

> I don't think I've ever heard anyone describe PHP as safe, most people who even have a concept of "safe" wouldn't touch PHP with a 20ft pole!

PHP has *always* been "safe".

Its poor reputation is due to numerous lackadaisical practices by folks developing with PHP -- primarily not sanitizing inputs and manually assembling database queries (and shell invocations) with those unsanitized inputs.

...The largest data breaches in history are due to this sort of application-level logic flaw... implemented in "safe" languages.

(Any language capable of concatenating strings together is "vulnerable" to this sort of thing)
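
As a rough illustration of the string-concatenation problem, here is a small sketch in Python rather than PHP, using the standard-library `sqlite3` module. The table, column names and the hostile input are invented for the example; the only claim is that the concatenated query changes shape while the parameterized one does not.

```python
import sqlite3

# Throwaway in-memory database with made-up contents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Attacker-controlled "user name" (hypothetical input).
hostile = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so its quote characters
# become part of the query and the WHERE clause matches every row.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + hostile + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the input is passed as a bound parameter and is only ever
# compared as data, so no user is called "nobody' OR '1'='1".
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(leaked))  # 2 (both secrets returned)
print(safe)         # []
```

Nothing here is a memory-safety violation; the interpreter does exactly what it was asked to do in both cases.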

CVEs/LOC not a great metric

Posted Dec 15, 2024 14:42 UTC (Sun) by mathstuf (subscriber, #69389)

Memory safety is different than functional safety. PHP has the former but its APIs were woefully lacking to make the latter easy. Now there are better APIs more readily available, but its history definitely casts a long shadow in many minds.

> ...The largest data breaches in history are due to this sort of application-level logic flaw... implemented in "safe" languages.

No one here (AFAIK) is claiming *functional* safety (life, limb, correctness)[1]. Proof systems are needed for that and Rust isn't there (nor do I know of anyone claiming such). However, when one has memory safety, the foundations for building up things like "encode requirements into types and let the compiler make sure it is fine" are far easier. Of course, some languages don't have sufficient mechanisms to teach their compiler (or interpreter) about such things, memory safe (Python, PHP, Bash) or not (C).

> numerous lackadaisical practices by folks developing with PHP

The same can be said for just about any software dysfunction. No one is perfect, but the level of vigilance and attention required to prevent such issues differs greatly between languages (and is not monotonic across the "memory safe" spectrum either). I know I'd much rather the team I'm working with have a steadfast gatekeeper for problems in the compiler before CI before review before customers. The quality of each filter is different between languages, teams, and cultures but (IMNSHO), finer filters earlier in the sequence are generally worth quite a lot more.

[1] Philosophical question: what constitutes "functionally safe" for a missile?