I would also point out that there -are- stretchy-buffer mallocs, where if you exceed the allocated space, the malloc library automatically traps, reallocs and completes.
I'm not going to critique the choice of Python. On that score, I will only say that there is a difference between the developers not knowing how to do X, Y or Z in a given environment, and X, Y and Z being genuinely impossible in that environment.
I -am- curious, though, as to why there are so many buffer overflows. Buffer overflows can happen for many reasons, not least trying to put data into the wrong part of an array (miscalculating the offset). If the offset is incorrect, avoiding crashes merely allows you to produce garbled output rather than a crash. There is a whole school of thought that says that errors are better off fatal than non-fatal, as non-fatal bugs are a real pain to track down. Debugging them is a nightmare in any language because the errors will slowly build and the distance between the actual bug and where you detect the bug can be huge.
I am also curious as to why there are ever any core-dumps. C++ has exception handling, signals (such as segmentation fault) can be trapped, and you can always isolate errors further by running unrelated code in isolated containers. If a container crashes, it takes out that container until it gets restarted, but doesn't kill everything. In short, if you don't want uncontrolled termination in a program, you don't need to have uncontrolled termination in that program. It's up to you.
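As a rough illustration of the exception-handling part of that claim, here is a minimal sketch (assuming C++11; run_guarded and broken_ai_turn are made-up names, not anything from Wesnoth) of a boundary that turns an escaping exception into a reportable error instead of an abort:

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: runs one unit of work and converts any exception that
// escapes it into an error message, so the caller can log it and carry on.
// Returns an empty string on success.
std::string run_guarded(const std::function<void()>& task) {
    assert(task && "run_guarded needs a callable");
    try {
        task();
        return "";
    } catch (const std::exception& e) {
        return std::string("task failed: ") + e.what();
    } catch (...) {
        return "task failed: unknown exception";
    }
}

// Hypothetical stand-in for a buggy subsystem.
void broken_ai_turn() {
    throw std::runtime_error("pathfinding table is empty");
}
```

Calling run_guarded(broken_ai_turn) returns the message rather than terminating. This only covers errors that are reported as exceptions; memory corruption that has already happened is a different matter.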
All in all, their choice to use Python is ultimately their choice and I'm sure they know what they are doing. The problem is that they do not seem to know what they are not doing.
Posted Jan 15, 2009 5:50 UTC (Thu) by drag (subscriber, #31333)
I suppose they just want to be lazy programmers and enjoy hacking on what they care about... such as game logic and/or better enemy AI, rather than worrying about creating in-code sandboxes to set up and then catch exceptions, as well as tracking down memory leaks and buffer overflows.
They probably could do everything that you guys have been talking about, or they could just get rid of C++ and use Python. Lazy is usually not too shabby an approach to getting something done... People only have so much time to devote to a project like this. It's not like they use it at work or are working on something that is highly important. It's just a video game and is designed to be a happy pastime.
----------------------
Of course re-doing working code for the sake of just using Python is probably a waste of time also.
It's all a question of balance, I guess. Many popular programs use Python as a secondary programming language, like Blender or Gimp or whatever. But it's not like they are going to replace existing or performance-critical things for the sake of programming it in Python any time soon.
Python slithers into Wesnoth
Posted Jan 15, 2009 14:23 UTC (Thu) by NAR (subscriber, #1313)
I think avoiding pointers in C++ is really not that big a deal. I've worked on a C++ project with about 100k LOC and we had about 5 new statements in the whole code (one of them actually led to a memory leak). Of course, it depends on the problem field; maybe in a game it's unavoidable.
On the other hand if the developers are not familiar with e.g. STL, but familiar with Python, by all means they should start using Python more and more.
Python slithers into Wesnoth
Posted Jan 15, 2009 22:02 UTC (Thu) by boudewijn (subscriber, #14185)
Is that open source? I would like to study it. In Krita, we use shared pointers a lot -- but that's also just papering over new statements.
Python slithers into Wesnoth
Posted Jan 15, 2009 22:57 UTC (Thu) by njs (subscriber, #40338)
shared_ptr<> is handy -- much nicer than the traditional way of tossing delete statements about and praying -- and sometimes the only option, if you really have objects with arbitrary and unpredictable extent. (Somewhat uncommon but very handy jargon: "extent" = "the chunk of your program's runtime during which the object exists".)
But depending on the program, often most of your variables do not have arbitrary extent at all; the example I'm familiar with is "monotone", which is OSS, and basically doesn't call new or indeed use pointers at all. (This also means that you never segfault -- because segfaults are caused by dereferencing pointers, so if you never do that...) I know of two main tricks. The obvious one is to use containers (from STL or whatever), that manage their own memory. Duh. The more subtle one is about function return values: one of the main places you end up using unnecessary pointers is when a function needs to return a complex object, you don't want to copy it around everywhere, so you make the function heap-allocate it and then return a pointer. Then you end up using pointers for all your local variables, object fields, etc., because they get filled in by calling one of those functions. The better convention is to define such functions to take their return value as a reference argument -- the caller allocates it however it wants (on the stack, nested inside another object, whatever), and then passes it to the callee to be filled in.
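A minimal sketch of that reference-argument convention, assuming C++11 or later (Manifest, Workspace and read_manifest are illustrative names, not monotone's actual API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical "complex object" that you would not want to copy around.
struct Manifest {
    std::vector<std::string> paths;  // the container manages its own memory
};

// Reference-argument convention: the caller owns the storage and the callee
// only fills it in. Here: one path per line, empty lines skipped.
void read_manifest(const std::string& text, Manifest& out) {
    out.paths.clear();
    std::string line;
    for (char c : text) {
        if (c == '\n') {
            if (!line.empty()) out.paths.push_back(line);
            line.clear();
        } else {
            line += c;
        }
    }
    if (!line.empty()) out.paths.push_back(line);
}

// The caller decides where the object lives, e.g. nested inside another
// object, so no pointer field is needed.
struct Workspace {
    Manifest manifest;
};

void manifest_demo() {
    Manifest on_stack;                       // lives on the stack
    read_manifest("a.cc\nb.cc\n", on_stack);
    assert(on_stack.paths.size() == 2);

    Workspace ws;                            // lives inside another object
    read_manifest("README", ws.manifest);
    assert(ws.manifest.paths[0] == "README");
}
```

Nothing here calls new or delete, and nothing can be leaked if the caller returns early.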
There are probably some other ones I'm forgetting too... but anyway, good luck! C++ is *so* much nicer if you don't use pointers.
Python slithers into Wesnoth
Posted Jan 15, 2009 23:42 UTC (Thu) by tialaramex (subscriber, #21167)
> the caller allocates it however it wants
That's all very well if you don't care about abstractions or compatibility.
The advantage of the pointer is that when you change your rectangles from being two 16-bit co-ordinate pairs to one 32-bit co-ordinate pair and a 16-bit size pair, you know the rest of the code doesn't care about the difference, because it was dealing with an abstract pointer, not a transparent structure with temptingly accessible named contents. And you know code compiled against the earlier version of the library won't now crash because the allocations are too small, because you're doing the allocations of the structure whose size changed in your library, which is also the only place accessing the structure contents.
Of course you don't /have/ to solve this with pointers. There's an equally valid tradition elsewhere of always having an array or a list, and returning only an opaque offset into the array or list. The calling code needn't care which convention you use.
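The "opaque offset" tradition could look something like the following sketch. All names (RectHandle, rect_create, rect_area) are invented for illustration, and the private part would normally sit in the library's own source file rather than next to the public part:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Public side: callers only ever hold this handle, which is just an index.
typedef std::size_t RectHandle;

// Private side: the layout can change (say, from 16-bit to 32-bit
// co-ordinates) without callers noticing, because they never see it.
namespace detail {
struct Rect {
    std::int32_t x, y;   // 32-bit co-ordinate pair
    std::uint16_t w, h;  // 16-bit size pair
};
inline std::vector<Rect>& storage() {
    static std::vector<Rect> rects;
    return rects;
}
}  // namespace detail

RectHandle rect_create(std::int32_t x, std::int32_t y,
                       std::uint16_t w, std::uint16_t h) {
    detail::Rect r = {x, y, w, h};
    detail::storage().push_back(r);
    return detail::storage().size() - 1;
}

// vector::at() throws std::out_of_range for a bogus handle.
std::uint32_t rect_area(RectHandle handle) {
    const detail::Rect& r = detail::storage().at(handle);
    return static_cast<std::uint32_t>(r.w) * r.h;
}

void rect_demo() {
    RectHandle h = rect_create(10, 20, 3, 4);
    assert(rect_area(h) == 12);
}
```

All allocation and all access to the structure contents happen on the library side, which is the property being argued for above.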
"You never segfault" is one of those fun promises which assumes on the one hand, that if you'd done things some other way you'd have made mistakes (a correct program needn't take a segfault, at least not a fatal one) yet if you do things the supposedly superior way you won't make the equivalent mistakes. A fatal error is just as fatal if it's the runtime saying you accessed an uninitialised array member as if it's the OS saying you accessed an unmapped address.
-- And yes, maybe you could trap and ignore the array access error, but then your program will probably do something unpredictable and difficult to debug, just as it would if you signal(SIGSEGV, SIG_IGN);
Python slithers into Wesnoth
Posted Jan 16, 2009 0:02 UTC (Fri) by stijn (subscriber, #570)
> A fatal error is just as fatal if it's the runtime saying you accessed an uninitialised array member as if it's the OS saying you accessed an unmapped address.
Yep. In my early days of C I was told: a segfault is what you get if you are lucky. Silent data corruption is worse.
Python slithers into Wesnoth
Posted Mar 15, 2009 16:14 UTC (Sun) by anomalizer (guest, #53112)
Anyone who has done serious C coding will agree with that.
Python slithers into Wesnoth
Posted Jan 16, 2009 4:44 UTC (Fri) by njs (subscriber, #40338)
>That's all very well if you don't care about abstractions or compatibility.
I had to stare at this for a while to figure out what you were saying...
I think you mean, "that's all very well if you aren't claiming to return a Foo but in fact returning a FooBar (a subclass of Foo chosen at runtime)"? That's a good point. In practice I find that the only time dynamic dispatch (virtual methods) pulls its weight in C++ is when using the "strategy" pattern, and then it's usually the caller passing the subclass instance down and references work fine, but that's for my code in my projects; YMMV. I'm happy to add that to the list of exceptional cases where (smart) pointers are called for.
But in response to the rest of your comment, giving the calling code the ability to allocate the object does not mean that you have violated any abstractions. If my code says "give me a rectangle", and someone changes the size of a rectangle in the header file, then my code will allocate a differently sized object from then on. Binary (as opposed to source) compatibility is a little more work, but it just requires the usual pimpl technique that you usually need anyway if you want binary compatibility in a C++ library.
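For anyone who hasn't met the pimpl technique, a minimal single-file sketch follows, assuming C++11 (std::unique_ptr). The Rectangle class and its members are made up for illustration; the comments mark what would go in the header and what would stay in the library's source file:

```cpp
#include <cassert>
#include <memory>

// --- what would go in the public header ---
class Rectangle {
public:
    Rectangle(int width, int height);
    ~Rectangle();  // defined below, where Impl is a complete type
    Rectangle(const Rectangle&) = delete;
    Rectangle& operator=(const Rectangle&) = delete;
    int area() const;

private:
    struct Impl;                  // layout hidden from callers
    std::unique_ptr<Impl> impl_;  // the only data member callers compile against
};

// --- what would stay in the library's source file ---
struct Rectangle::Impl {
    int width;
    int height;
};

Rectangle::Rectangle(int width, int height)
    : impl_(new Impl{width, height}) {}
Rectangle::~Rectangle() = default;
int Rectangle::area() const { return impl_->width * impl_->height; }

void pimpl_demo() {
    Rectangle r(3, 4);  // the caller still allocates the object itself
    assert(r.area() == 12);
}
```

Changing the fields of Impl changes nothing that calling code was compiled against, so old binaries keep working; the price is one heap allocation hidden inside the class.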
And of course there are always trade-offs, but that doesn't mean this is a zero-sum game. Pointer arithmetic/memory management is one of the most error-prone aspects of classic C/C++ programming (probably only threads are worse); eliminating it will, all else being equal, reduce your bug rate -- or at least, it worked for me.
It's also true that the other way you can segfault is by walking off the end of an array (well, arguably this is a special case of pointer arithmetic, but whatever). The answer to that is even easier: along with not using pointers, don't use arrays :-). Avoiding pointers in C++ takes some tricks, but avoiding arrays (and unchecked wrappers around them) is trivial and worthwhile. It won't stop you making indexing mistakes, but it can turn unreliable crashes/random memory corruptions into reliable clean exits that print the offending file and line number.
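One way to get the "file and line number" behaviour is sketched below. std::vector::at() already throws std::out_of_range on a bad index; the checked_at helper and CHECKED_AT macro are hypothetical additions that just attach the call site to the message:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: bounds-checked element access that records where the
// bad index came from.
template <typename T>
const T& checked_at(const std::vector<T>& v, std::size_t i,
                    const char* file, int line) {
    if (i >= v.size()) {
        std::ostringstream msg;
        msg << file << ":" << line << ": index " << i
            << " out of range (size " << v.size() << ")";
        throw std::out_of_range(msg.str());
    }
    return v[i];
}

// Captures the caller's file and line automatically.
#define CHECKED_AT(v, i) checked_at((v), (i), __FILE__, __LINE__)

void checked_demo() {
    std::vector<int> hit_points(3, 10);  // three units, 10 hit points each
    assert(CHECKED_AT(hit_points, 2) == 10);

    bool caught = false;
    try {
        CHECKED_AT(hit_points, 3);  // off by one: valid indices are 0..2
    } catch (const std::out_of_range&) {
        caught = true;
    }
    assert(caught);
}
```

A top-level handler can then print e.what() and exit cleanly. The indexing mistake is still there, but it now fails the same way every time.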
Python slithers into Wesnoth
Posted Jan 18, 2009 12:25 UTC (Sun) by rwmj (guest, #5474)
shared_ptr<> is just reference counting, i.e. the worst, slowest, most invasive form of garbage collection. Please don't confuse this with advanced garbage collectors.
Python slithers into Wesnoth
Posted Jan 18, 2009 20:44 UTC (Sun) by zlynx (subscriber, #2285)
Reference counting is "worst" or "best" depending on your requirements. The kind of garbage collection all the "modern" JVMs like to do requires about FIVE TIMES as much RAM as reference counting.
Speaking of garbage collection, it is itself a form of reference counting. The difference is that a collection cycle counts up the references on the spot instead of tracking them continuously.
The Linux kernel uses reference counting quite a lot. It isn't the only thing used in the kernel, of course. RCU is another RAM-hungry technique used there. But the counting seems to work rather well in a kernel, much better than garbage collection.
The Boost collection has some other reference counting options that can work better than shared_ptr, like the intrusive pointers which store the reference count inside the object, giving much better cache locality.
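To show what "the reference count inside the object" means, here is a hand-rolled sketch. It is not Boost's implementation (boost::intrusive_ptr works through the free functions intrusive_ptr_add_ref and intrusive_ptr_release); Unit and UnitRef are invented names, and the code is not thread-safe:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical object that carries its own reference count, so the count
// sits in the same allocation as the data it guards.
struct Unit {
    std::size_t refs;
    int hit_points;
    explicit Unit(int hp) : refs(0), hit_points(hp) {}
};

// Minimal intrusive handle: copying bumps the count stored in the object,
// and destroying the last handle deletes the object.
class UnitRef {
public:
    explicit UnitRef(Unit* u) : u_(u) { if (u_) ++u_->refs; }
    UnitRef(const UnitRef& other) : u_(other.u_) { if (u_) ++u_->refs; }
    UnitRef& operator=(UnitRef other) { swap(other); return *this; }
    ~UnitRef() { if (u_ && --u_->refs == 0) delete u_; }

    Unit* operator->() const { return u_; }
    std::size_t use_count() const { return u_ ? u_->refs : 0; }

private:
    void swap(UnitRef& o) { Unit* t = u_; u_ = o.u_; o.u_ = t; }
    Unit* u_;
};

void intrusive_demo() {
    UnitRef a(new Unit(30));
    {
        UnitRef b = a;  // second handle to the same Unit
        assert(a.use_count() == 2);
    }
    assert(a.use_count() == 1);
    assert(a->hit_points == 30);
}
```

A shared_ptr constructed from a raw pointer keeps its count in a separately allocated block, which is where the cache-locality difference comes from.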
Python slithers into Wesnoth
Posted Jan 18, 2009 21:29 UTC (Sun) by rwmj (guest, #5474)
I'm assuming by "FIVE TIMES" you're referring to this paper. Unfortunately the paper is flawed with respect to reference counting, because it doesn't include the reference counts - it just assumes that the manually managed version "knows" somehow exactly when to free stuff. With ref counts you bloat every object by 4-8 bytes, and of course that has an effect on cache and memory, which they don't measure.
You can find a more detailed response here.
Python slithers into Wesnoth
Posted Jan 19, 2009 3:29 UTC (Mon) by njs (subscriber, #40338)
> shared_ptr<> is just reference counting, ie. the worst, slowest, most invasive form of garbage collection. Please don't confuse this with advanced garbage collectors.
Good points -- but I'm not sure who you're arguing with?
(Though of course everything old is new again -- I have heard that some cutting-edge true garbage collectors use reference counting because it allows them to optimize reference graph walking -- there cannot exist an unreachable cycle rooted at any object that has not recently had its reference count decremented to a non-zero value. I just like sharing that because while almost certainly useless to know, it's a *super* neat trick.)
Python slithers into Wesnoth
Posted Jan 16, 2009 10:40 UTC (Fri) by NAR (subscriber, #1313)