
Many good points

Posted Nov 3, 2007 21:43 UTC (Sat) by pynm0001 (guest, #18379)
In reply to: Many good points by epa
Parent article: Daniel Bernstein: ten years of qmail security

In most modern computer languages, a[55] does have deterministic 
behavior, even when the array has less than 56 (ha!) elements.
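
As a minimal illustration (a Python 3 sketch; the 55-element list is only a stand-in for the array `a` being discussed), an out-of-range index in such a language produces a well-defined error rather than reading whatever happens to sit in memory:

```python
# A list with 55 elements has valid indices 0..54, so a[55] is out of range.
a = list(range(55))

try:
    a[55]
    outcome = "read something"
except IndexError:
    # Python defines this case: the access is refused with IndexError.
    outcome = "IndexError"

print(outcome)
```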

However most UNIX code is in C, which does not (and without restricting 
the language, cannot) guarantee deterministic behavior in this case.  C++ 
is the same as C in this regard, if you continue to use C-style arrays 
rather than any of the gazillions of good container libraries (including 
the built-in STL).

Most languages I would imagine have the integer overflow problem.



Many good points

Posted Nov 3, 2007 22:59 UTC (Sat) by njs (subscriber, #40338) [Link] (12 responses)

> Most languages I would imagine have the integer overflow problem.

FWIW, Lisp dialects rarely do, and Python is many years into its transition to having
arbitrary-size integers by default (to be finished in Py3k).  There are probably others as
well.

Many good points

Posted Nov 4, 2007 0:51 UTC (Sun) by aquasync (guest, #26654) [Link] (11 responses)

Ruby will automatically transition from ints to BigNums as needed, e.g. `ruby -e 'p 10 ** 100'`
will just work.

Many good points

Posted Nov 4, 2007 3:24 UTC (Sun) by drag (guest, #31333) [Link] (10 responses)

Well ya. 

But isn't that an example of 'dynamically typed'?
I mean Python can do that, no problem, and I suppose pretty much all dynamically typed languages
do that also (e.g. Visual Basic and Perl)

(but also Python is strongly typed.. meaning that you can just use a string as an int and vice
versa (unlike VB, for example))

$ python 
Python 2.4.4 (#2, Aug 16 2007, 02:03:40) 
[GCC 4.1.3 20070812 (prerelease) (Debian 4.1.2-15)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> type(10)
<type 'int'>
>>> type(10 * 100)
<type 'int'>
>>> type(10 ** 100)
<type 'long'>
>>> type(10.0 ** 100)
<type 'float'>

so on and so forth. 

But if you go...

>>> type(10.0 ** 1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: (34, 'Numerical result out of range')

Many good points

Posted Nov 4, 2007 3:30 UTC (Sun) by drag (guest, #31333) [Link]

Errr.

>  meaning that you can just use a string as an int and vice

I meant:

meaning that you _can't_ just use a string as an int and vice versa

Sorry.

Many good points

Posted Nov 4, 2007 6:32 UTC (Sun) by njs (subscriber, #40338) [Link] (8 responses)

> But isn't that an example of 'dynamically typed'?

Not really.  The particular implementation in python and ruby (but not, AFAIK, in py3k)
requires dynamic typing because you have the same operation returning two different types
depending on the values involved.  But one could just as well have a dynamically typed
language whose integer operations overflowed instead of passing to a bignum type, and a
statically typed language whose basic integer type didn't overflow because it actually *was* a
bignum type.

> I suppose pretty much all dynamically typed languages do that also (e.g. Visual Basic and Perl)

I know nothing about VB, but perl does something different and weird -- its integers seem to
overflow into floats:

$ python -c 'print (10 ** 100 == (10 ** 100 - 1))'
False
$ perl -e 'if (10 ** 100 == (10 ** 100 - 1)) { print "True\n" } else { print "False\n" }'
True

I guess most code that expects programming integers to act like mathematical integers will be
more surprised by an overflow to -2**31 than by loss of integer precision, but either would
make me nervous.
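
A small Python 3 sketch of the difference (an illustration only; converting to `float` here stands in for what Perl appears to do implicitly, which is an assumption about Perl's internals rather than something verified):

```python
# Python integers are arbitrary precision, so subtracting 1 is still visible.
big = 10 ** 100
exact_differs = (big != big - 1)

# A double carries only a 53-bit significand, so both values round to the
# same float and the difference of 1 is silently lost.
float_collides = (float(big) == float(big - 1))

print(exact_differs, float_collides)
```

The integer comparison distinguishes the two values; the floating-point comparison cannot.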

> (but also Python is strongly typed.. meaning that you can't just use a string as an int and
vice versa (unlike VB, for example))

"Strongly typed" is an annoying and vague term, but AFAICT it is usually used to mean "you
can't poke around at the raw representation of objects in machine memory using only ordinary
variable access operations".

Many good points

Posted Nov 4, 2007 7:28 UTC (Sun) by drag (guest, #31333) [Link] (7 responses)

Hrm. My experience is severely limited:

"Dynamically typed" always meant to me that the type is determined when the variable is created,
based on various rules. That is, variable types do not have to be declared before you can use
them. (although they can be if you prefer it)

The opposite is "statically typed", where you have to declare variable types before using them.


And "Strongly Typed" has always meant, to me, that once the variable is created then it can't
be changed. That is, if you create a variable as an int you can only use it as an int. If you
want to use the int's value as a string you have to create a second string variable. The type
is enforced by the language.

The opposite of that is "weakly typed", where an int can be a string can be a float based on the
context in which it's used. That is, if you make an 'int' you can use it as a string if you feel
like it. The type is not enforced by the language.


So...

C == statically, weakly typed
Perl == dynamically, weakly typed
Java == statically, strongly typed
Python == dynamically, strongly typed


For example this is not legal in python:

a = "1" + 1

The "" is your way of declaring that value as a string, otherwise numbers by themselves are
always interpreted as int, hex, float, and other numeric types.

But, of course, none of this is hard and fast. Between numeric operators it'll do type
coercion.

a = 0x45 + 1.0 + 12
(so that is a hex + a float + an int)
And the result, 'a', would be a float.

Or this could be a bit of a illusion as all these things support the 'add' function. I don't
know.
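
Both cases can be checked directly. The following is a minimal Python 3 sketch; the variable names are only for illustration. One detail worth noting: `0x45` is just another spelling of the int 69, not a separate "hex" type.

```python
# "1" + 1 mixes str and int; Python refuses rather than guessing a conversion.
try:
    a = "1" + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Between numeric types the int operands are widened to float.
# 0x45 is an int literal written in hexadecimal (value 69).
b = 0x45 + 1.0 + 12

print(mixed_ok, type(0x45).__name__, type(b).__name__, b)
```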


Maybe this is because my knowledge of all this stuff is purely from a python perspective and
the language that is used is to try to help programmers understand the differences between
other common languages.


But otherwise thanks for the clarification on the "But isn't that an example of 'dynamically
typed'?" That makes a lot of sense.

Many good points

Posted Nov 4, 2007 9:30 UTC (Sun) by elanthis (guest, #6227) [Link] (2 responses)

To be blunt: nobody really cares what you think those terms mean.  They already have
well-defined meanings.  Look them up.

Many good points

Posted Nov 4, 2007 12:40 UTC (Sun) by drag (guest, #31333) [Link] (1 responses)

I did. That was my understanding. 

Many good points

Posted Nov 5, 2007 0:09 UTC (Mon) by k8to (guest, #15413) [Link]

Dynamically typed does not have anything to do with "when the variable is created"; it has to
do with the type of the variable being known at runtime, rather than at compile time.


Some corrections

Posted Nov 5, 2007 9:03 UTC (Mon) by flewellyn (subscriber, #5047) [Link] (3 responses)

> "Dynamically typed" always meant to me that the type is determined when the variable is created, based on various rules. That is, variable types do not have to be declared before you can use them. (although they can be if you prefer it)

> The opposite is "statically typed", where you have to declare variable types before using them.

No. The "dynamic" versus "static" in the typing terms means solely this: at what time is the type of this variable known? If it's known at compile-time, the variable is statically typed. If it can't necessarily be known until runtime, that variable is dynamically typed. Whether or not you have to declare types ahead of time is mostly irrelevant. I say "mostly" because some languages that are statically typed have facilities for dynamic typing if you want it, and some dynamically typed languages can do static typing if you ask for it.

A number of languages, like Haskell and Boo, have static typing, but by default use "type inference" to determine the type of a variable. So you can declare (using Boo here):

x = 1

And the variable x is determined to be numeric, and an integer. You can't thereafter assign a string, a float, or a value or object of some other type to x, since it's been determined that the type of x is integer.

You can, in Boo at least, declare a type, in case you need to do something special, so if I had said:

x as float = 1

Then x would be a float, and the 1 would be interpreted as 1.0. Also, Boo has optional "duck typing" that you can use to defer type resolution for a specific variable until runtime. This is a good idea if you are assigning user input to a value, and don't necessarily know what type that input will be. (If you DO know, it's a good idea to declare the type, so that the compiler knows what to do with it.)

On the other hand, some languages that are dynamically typed by default, such as Common Lisp, have optional type declarations; when you declare a variable's type, that variable becomes statically typed, and the compiler is free to leave out the usual type checks, which can improve performance. Some CL implementations will also treat type declarations as assertions if you set the compiler's optimization settings a certain way, so that you can get the benefit of static type checking if you want and need it. (Strictly speaking, a CL implementation is free to ignore type declarations altogether according to the spec, so this behavior is entirely implementation-dependent.)

But the crucial point here is that "static" versus "dynamic" typing has everything to do with WHEN a type is known, and nothing really to do with HOW it's known.

> And "Strongly Typed" has always meant, to me, that once the variable is created then it can't be changed. That is, if you create a variable as an int you can only use it as an int. If you want to use the int's value as a string you have to create a second string variable. The type is enforced by the language.

> The opposite of that is "weakly typed", where an int can be a string can be a float based on the context in which it's used. That is, if you make an 'int' you can use it as a string if you feel like it. The type is not enforced by the language.

This is closer to correct, but still off. "Strongly typed" means that the VALUE'S type is strongly enforced: you can't add a string to an integer, or an integer to a character, without explicit casts, which may not work in any case (how do you coerce "Ich bin ein Berliner" to a numeric type?). You can have a strongly-typed dynamic language (Common Lisp), or a weakly typed static language (C).

The business of whether or not a variable can be rebound to a different type is a matter of static versus dynamic typing, not strong versus weak type safety. You can, in Common Lisp, bind a variable to a string value, then rebind it to a number, or a structure or class object, for that matter; just don't try to use string functions on the number. THAT'S strongly typed. (On the other hand, C will not let you assign a string value to an integer variable, but you could treat the int as a char.)
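
The same distinction can be sketched in Python 3, used here as a stand-in for Common Lisp since it is also dynamically and strongly typed (the variable names are only for illustration):

```python
x = "Ich bin ein Berliner"   # x is bound to a str
upper = x.upper()            # string operations work on the string value

x = 42                       # rebinding the same name to an int is allowed

try:
    x.upper()                # but the int value has no string methods
    rejected = False
except AttributeError:
    rejected = True

# An explicit conversion is permitted, yet it can still fail at run time.
try:
    int("Ich bin ein Berliner")
    converted = True
except ValueError:
    converted = False

print(upper, rejected, converted)
```

Rebinding succeeds because the name carries no type; the misuse fails because the value does.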

Some corrections

Posted Nov 6, 2007 10:09 UTC (Tue) by ekj (guest, #1524) [Link] (2 responses)

> "Strongly typed" means that the VALUE'S type is strongly enforced: you can't add a string to an integer, or an integer to a character, without explicit casts, which may not work in any case

Well, that depends, now doesn't it? If your strongly typed language comes with method overloading there's nothing stopping you from defining several add-functions, like say a "string add(int,string)" method. What, exactly, that'd do would be up to you. In some contexts it could make sense.

In Python you can do: mystring = 10 * "-" + "Hello World" + 10 * "-"; the very same thing would be perfectly possible in, say, C++, or basically any language with operator overloading, regardless of whether the language is statically or dynamically typed.
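
A rough Python 3 sketch of that idea follows. The `Label` class is hypothetical and invented purely for this example; it defines what "int + Label" and "Label + int" mean, while still rejecting operand types it was not written for.

```python
class Label:
    """Hypothetical text wrapper that defines its own meaning for '+' with ints."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        # Label + int: we choose to append the number's decimal digits.
        if isinstance(other, int):
            return Label(self.text + str(other))
        return NotImplemented

    def __radd__(self, other):
        # int + Label: Python falls back to this after int.__add__ declines.
        if isinstance(other, int):
            return Label(str(other) + self.text)
        return NotImplemented


banner = 10 * "-" + "Hello World" + 10 * "-"
combined = (1 + Label(" apple")).text
suffixed = (Label("n=") + 5).text

print(banner, combined, suffixed)
```

Which function runs is still selected by the operand types, so a combination nobody defined (for example `Label("x") + 1.5`) raises `TypeError`.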

Some corrections

Posted Nov 7, 2007 0:16 UTC (Wed) by flewellyn (subscriber, #5047) [Link] (1 responses)

That doesn't really change what I said, actually. While in some languages you can use "+" to mean string concatenation as well as addition, if the language is strongly typed, it will choose which operation to do based on the types of the arguments. And you may need to cast things anyway, such as if you want to concatenate a number's string representation with a string. I've had to do such casts in Python.

Some corrections

Posted Nov 8, 2007 9:24 UTC (Thu) by ekj (guest, #1524) [Link]

Sure. You're running completely different functions for int+int and int+string, it just so
happens that the two functions have the same name. They don't need to have anything in common
other than the name.


Many good points

Posted Nov 4, 2007 12:25 UTC (Sun) by epa (subscriber, #39769) [Link] (2 responses)

Hmm, you caught me out with 55 versus 56, but isn't it the case that in C it is legal to point
to one element past the end of an array (as long as you don't try to read or write the value
held there)?  So a[55] in an array of 55 elements is defined in so far as you can compare a
pointer to &(a[55]).

Many good points

Posted Nov 4, 2007 18:04 UTC (Sun) by pynm0001 (guest, #18379) [Link] (1 responses)

Well sure, you can construct a pointer to point pretty much anywhere you 
want as long as you don't dereference it (i.e. reading or writing).  
Making the element immediately following the end of the array special 
would mesh well with C++ iterators, where the end element is an iterator 
that cannot be dereferenced, always past the end of the data.

Pointers in C

Posted Nov 4, 2007 19:07 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

You /can/ point anywhere but that isn't defined in the language and so your compiler might not
do what you expected. It so happens that the pointers are typically just hardware memory
addresses (virtual addresses on modern hardware) but they could be anything, and any false
assumptions you make in portable software could be expensive mistakes.

K&R says that pointers are only defined when they point /to/ something like an array element
or a variable. ANSI C improved on this by asserting that there is also a pointer value beyond
the end of an array which is larger than the pointer values for the elements of the array,
this means that...

while (pointer <= last_element) {
  /* do something */
  pointer++;
}

is well defined in ANSI C and does what you expect whereas it would have been legitimate for a
K&R C compiler to do something most unexpected, like set the pointer variable to zero once you
get beyond the end of the array.

Many good points

Posted Nov 8, 2007 20:32 UTC (Thu) by dvdeug (subscriber, #10998) [Link] (2 responses)

Why would most languages have the integer overflow problem? You can detect an integer overflow
at runtime, and do something intelligent, like throw an exception. Even C as standardized
doesn't let you overflow a signed integer; it's undefined behavior, but wrap-around semantics
are assumed so often that optimizing on that basis breaks many programs.

Many good points

Posted Nov 8, 2007 21:53 UTC (Thu) by pynm0001 (guest, #18379) [Link] (1 responses)

"can detect" is not the same as "will detect".  If the language does not 
throw an exception (or otherwise intelligently handle the problem) for an 
overflow then it has an integer overflow problem.

C is even worse simply because it is undefined.  Undefined behavior is 
not a good thing in a program which is supposed to be secure and bug 
free.  The wrap-around behavior is not retained because of historical 
baggage, it's retained because that is the "optimized" form.  i.e. the 
underlying hardware performs the addition and the result is wrapped 
around without checking beforehand if the answer will fit.

Most processors have an "overflow" flag which can be set but checking 
that after every addition is pretty much not done.
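
To make the "can detect" versus "will detect" distinction concrete, here is a Python 3 sketch. Python's own integers never overflow, so the 32-bit width is simulated; `wrapping_add` and `checked_add` are hypothetical helper names, not functions from any library.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrapping_add(a, b):
    """Add the way 32-bit two's-complement hardware typically does: wrap silently."""
    return (a + b - INT32_MIN) % 2**32 + INT32_MIN


def checked_add(a, b):
    """Add, but raise instead of wrapping when the result needs more than 32 bits."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError("32-bit signed overflow")
    return result


wrapped = wrapping_add(INT32_MAX, 1)   # silently becomes INT32_MIN

try:
    checked_add(INT32_MAX, 1)
    detected = False
except OverflowError:
    detected = True

print(wrapped, detected)
```

The check costs one comparison per addition, which is the overhead being traded away when the flag is ignored.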

Many good points

Posted Nov 9, 2007 4:14 UTC (Fri) by dvdeug (subscriber, #10998) [Link]

And there's no reason for any language that doesn't play fast and loose close to the bare
metal not to detect it, which is why I questioned your assumption that most languages would
have an integer overflow problem.

No, it's not the optimized form. GCC added an optimization that, in loops, took advantage of
the fact that overflow is undefined and hence not done in legal programs, and got a great deal
of flak for it.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds