In C there's a concept called `FILE' for this. In Unix (C, shell) it's called `pipe'. `Threads' are processes where all `inter-thread' communication must be done through one of those devices. Shared-memory parallel programming with manual synchronization was popularized by Windows and C++ coders. Why? Because `processes' were too heavyweight, too slow, and too cumbersome... kinda like the excuses people use when they don't want to use a scripting language.
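To make the process-plus-pipe model concrete, here is a minimal sketch assuming a POSIX system. The helper name `sum_in_child` is hypothetical and exists only for illustration. The child gets its own copy of the input at `fork()`, and the only thing that crosses back to the parent is the result written to the pipe.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sum `n` ints in a child process and hand the
 * result back to the parent over a pipe.
 * Returns 0 on success, -1 on any error. */
int sum_in_child(const int *vals, size_t n, long *result)
{
    assert(vals != NULL && result != NULL);

    int fds[2];                 /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: works on its own copy of `vals`; no shared memory. */
        close(fds[0]);
        long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += vals[i];
        ssize_t w = write(fds[1], &sum, sizeof sum);
        _exit(w == (ssize_t)sizeof sum ? 0 : 1);
    }

    /* Parent: the pipe is the only channel back from the child. */
    close(fds[1]);
    long sum = 0;
    ssize_t r = read(fds[0], &sum, sizeof sum);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (r != (ssize_t)sizeof sum ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    *result = sum;
    return 0;
}
```

There is no lock anywhere in this sketch because nothing is shared; a real program would also want to retry `read` on `EINTR` and handle short reads.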
C doesn't have a sophisticated typing system, but that has little to do with the issue of not being able to write a "sanitized string" type. Just the idea of sanitized string sounds wrong to me. If it's not an ASCII, NUL-terminated array of characters, then it's not properly called a string in C terminology.
Type abstraction is often the root cause of security bugs. For example, you could treat a password as a sub-type of string. But strings as commonly understood almost universally support the concept of truncation. But if you truncate a password horrible security repercussions result. So why would you want to treat it like a string at all?
Strings also usually support the notion of concatenation. So take HTML. You could keep an HTML document as a string, but HTML has a hierarchical structure, and prepending or appending data to a complete document results in garbage.
Just exclude those methods, one says. Well (1) the fact that you must exclude already exposes you to mishaps and bugs the same as forgetting to bounds check an operation in C, and (2) what are you left with when you exclude all the unnecessary and possibly dangerous operations? Not much.
Buffer overflows in C usually are the result of people trying to treat everything like a string; they want to slurp data into a string or array, and then manipulate it in that form. That's a horrible way to write software, whether in C or any other language. It just so happens that if you do it in C you're susceptible to more attacks than if you do it in, say, Java; but the solution isn't to write the bad code in Java; it's to stop writing that kind of code at all.
Using the best language for the particular job also helps. Writing parsers has always been easier for me to do in C because of pointers, and the ability to write very concise state machines. A parser is really a way to consume a string, character by character, and transform it into some other structure. So when people tell me that handling strings in C is more difficult, I don't know how to respond. It's certainly more difficult to juggle and manipulate strings in C. But if that's how you're processing string input in any language, you're probably doing it wrong. If I'm parsing an e-mail message, I'll construct a tree of objects by consuming a stream of characters. I may store the message, or parts of it, as a character "string", but only as an immutable object that I never need to manipulate; outputting it later, if necessary. So I rarely care about the difficulty of manipulating strings in C, because I rarely need to do that.
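As a rough illustration of the kind of parser described above, here is a minimal sketch that consumes a single "Name: value" header line one character at a time and fills a structure. The `struct field` layout, the buffer sizes, and the function name `parse_field` are assumptions made for this example, and the grammar is far simpler than real e-mail header syntax (no folding, no comments).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure for one parsed "Name: value" header field. */
struct field {
    char name[32];
    char value[128];
};

enum state { S_NAME, S_SPACE, S_VALUE };

/* Consume `src` character by character and fill `out`.
 * Returns 0 on success, -1 if the input is malformed or does not fit. */
int parse_field(const char *src, struct field *out)
{
    assert(src != NULL && out != NULL);

    enum state st = S_NAME;
    size_t n = 0, v = 0;

    for (const char *p = src; *p != '\0'; p++) {
        switch (st) {
        case S_NAME:
            if (*p == ':') {
                if (n == 0)
                    return -1;          /* empty field name */
                st = S_SPACE;
            } else if (*p == ' ' || *p == '\r' || *p == '\n') {
                return -1;              /* not allowed in a name here */
            } else {
                if (n + 1 >= sizeof out->name)
                    return -1;          /* name too long for the buffer */
                out->name[n++] = *p;
            }
            break;
        case S_SPACE:
            if (*p == ' ' || *p == '\t')
                break;                  /* skip whitespace after ':' */
            st = S_VALUE;
            /* fall through: this character starts the value */
        case S_VALUE:
            if (*p == '\r' || *p == '\n')
                goto done;              /* end of line ends the field */
            if (v + 1 >= sizeof out->value)
                return -1;              /* value too long for the buffer */
            out->value[v++] = *p;
            break;
        }
    }
done:
    if (st == S_NAME)
        return -1;                      /* never saw a ':' */
    out->name[n] = '\0';
    out->value[v] = '\0';
    return 0;
}
```

The input is only ever read, never modified or copied wholesale, and every write into the output is bounds-checked at the single place where it happens.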
I tend to use general-purpose scripting languages for things _other_ than string processing, like executing complex rules or transformations of structures of objects. For still other things, domain-specific languages are preferable.
Of course, if all you want to do is hack out a script to process some data (as Perl is popular for), then have at it. But don't fool yourself that your script is any more secure than if written in C. There are probably several times more remote execution bugs in scripting language built applications than C applications, just because of improper use of strings.
Posted Feb 3, 2011 10:00 UTC (Thu) by Thomas (subscriber, #39963)
"For example, you could treat a password as a sub-type of string. But strings as commonly understood almost universally support the concept of truncation. But if you truncate a password horrible security repercussions result. So why would you want to treat it like a string at all?"
You got it the wrong way round. Strings [a concatenation of bytes] can support truncation but don't have to.
Cheers,
T.
String manipulation bugs
Posted Feb 6, 2011 19:08 UTC (Sun) by man_ls (subscriber, #15091)
"(2) what are you left with when you exclude all the unnecessary and possibly dangerous operations? Not much."
Immutable strings. As seen in Java, Python or Lua. Safe, flexible, and only occasionally slow enough to warrant other options. If you remove the main cause of the most common security bug, and nobody complains, then in my book that is a good decision.
"There are probably several times more remote execution bugs in scripting language built applications than C applications, just because of improper use of strings."
String manipulation bugs I can live with. Security holes are unacceptable. A language where every bug must be considered a security bug is too hard for me.
LCA: Lessons from 30 years of Sendmail
Posted Feb 12, 2011 23:07 UTC (Sat) by ofranja (subscriber, #11084)
"Type abstraction is often the root cause of security bugs. For example, you could treat a password as a sub-type of string. But strings as commonly understood almost universally support the concept of truncation. But if you truncate a password [...]"
I think you wanted to say LACK of abstraction.
If password is not exactly a string, you should have created a "password" type with proper operations and associated semantics.
Do not ever consider "C" an example of a "complete type system", unless you also consider a Ford Model T a modern vehicle.
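For what it's worth, a dedicated "password" type can be approximated even in C with an opaque struct. The following is only a sketch: the names `password_new`, `password_equals`, and `password_free` are hypothetical, and a real system would store a salted hash rather than the secret itself. The point is that the type offers whole-value construction, comparison, and destruction, and nothing that truncates or concatenates.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type. In a real program this definition would live in
 * the .c file and the header would only declare `struct password;`. */
struct password {
    size_t len;
    unsigned char *bytes;
};

/* Copy the whole secret. Returns NULL on empty input or failed allocation. */
struct password *password_new(const char *secret)
{
    assert(secret != NULL);

    size_t len = strlen(secret);
    if (len == 0)
        return NULL;

    struct password *pw = malloc(sizeof *pw);
    if (pw == NULL)
        return NULL;

    pw->bytes = malloc(len);
    if (pw->bytes == NULL) {
        free(pw);
        return NULL;
    }
    memcpy(pw->bytes, secret, len);
    pw->len = len;
    return pw;
}

/* The only query offered: whole-value equality (1 if equal, 0 if not).
 * The loop visits every byte of `a` instead of stopping at the first
 * mismatch. */
int password_equals(const struct password *a, const struct password *b)
{
    assert(a != NULL && b != NULL);

    unsigned diff = (a->len != b->len);
    for (size_t i = 0; i < a->len; i++)
        diff |= (unsigned)(a->bytes[i] ^ b->bytes[i % b->len]);
    return diff == 0;
}

/* Wipe and release. The memset is best effort only; a compiler is
 * allowed to optimize it away. */
void password_free(struct password *pw)
{
    if (pw == NULL)
        return;
    memset(pw->bytes, 0, pw->len);
    free(pw->bytes);
    free(pw);
}
```

With no truncate or append function in the interface, the mistakes discussed in this thread cannot be written against this type without reaching into the struct directly.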