GNU ed 1.6 released
From: Antonio Diaz Diaz <ant_diaz-AT-teleline.es>
To: info-gnu-AT-gnu.org
Subject: Version 1.6 of GNU ed released
Date: Mon, 02 Jan 2012 17:18:39 +0100
Message-ID: <4F01D8DF.5040109@teleline.es>
Cc: bug-directory-AT-fsf.org, bug-ed-AT-gnu.org
I am pleased to announce the release of GNU ed 1.6.

GNU ed is an 8-bit clean, more or less POSIX-compliant implementation of the standard Unix line editor.

The homepage is at http://www.gnu.org/software/ed/ed.html

The sources can be downloaded from
http://ftpmirror.gnu.org/ed/
http://download.savannah.gnu.org/releases/ed/
or from your favorite GNU mirror.

This version is also available in lzip format. If your distro doesn't yet distribute the lzip program, you can download it from http://www.nongnu.org/lzip/lzip.html

The md5sums are:
9a78593decccaa889523aa4bb555ed4b  ed-1.6.tar.gz
e3d4dcfd260b2ebb2855c86ffca1947f  ed-1.6.tar.lz

This release is also GPG signed. You can download the signature by appending ".sig" to the URL.

Changes in version 1.6:

* Displaying of null characters by the "l" command has been fixed.
* The condition deciding when to show the message "Newline appended" has been corrected.
* The "modified" flag is now set when reading a non-empty file into an empty buffer.
* An error that prevented using NUL characters in regular expressions has been fixed.
* Ed now signals an error if it can't create a shell process when executing a shell command.
* Ed now flushes stdout/stderr before reading a new command.
* Man page is now generated with "help2man". All command-line options are now documented in the man page.
* The copyright notices of Andrew L. Moore have been restored. It seems Andrew granted some permissions but never assigned copyright to the FSF.

Please send bug reports and suggestions to bug-ed@gnu.org

If you are packaging ed for a distribution, please try to use the lzipped source tarball, as this can improve the support for the lzip format in packaging systems. Thanks.

Regards,
Antonio Diaz, GNU ed maintainer.

_______________________________________________
GNU Announcement mailing list <info-gnu@gnu.org>
https://lists.gnu.org/mailman/listinfo/info-gnu
Posted Jan 2, 2012 22:27 UTC (Mon)
by johnny (guest, #10110)
[Link] (16 responses)
Posted Jan 2, 2012 22:39 UTC (Mon)
by andrel (guest, #5166)
[Link] (15 responses)
Posted Jan 3, 2012 0:35 UTC (Tue)
by geuder (subscriber, #62854)
[Link] (14 responses)
UTF-8 is not a character set at all, but a variable length encoding of a 16-bit or 32-bit character set.
8 bit cleanliness was a big issue for most Europeans wanting to write anything correctly on a computer in their mother tongue in the early 90s. Most editors could only handle 7 bit character sets without quirks.
8 bit character sets were an intermediate step in the 90s. 8 bits per character are enough for most bigger European languages (but no single 8 bit character set covers all of them).
But 8 bits don't help the Asians. They need 16 bits per character at least, and because different sets were needed in Europe, it wasn't ideal even there.
I think the majority of software in use has been basically 16 bit Unicode for many years now. Windows, Symbian, and Java use true 16 bit wide characters, while all Linux distributions I have used use UTF-8 encoding by default. The nice thing with UTF-8 is that you can't even tell the difference from the old ASCII as long as you stick to 7 bit ASCII characters, because their encoding is identical: 8 bits with the most significant bit being 0.
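The ASCII compatibility described above can be checked directly. The following is a minimal Python sketch (the sample strings are made up for illustration): pure ASCII text produces the same bytes under both encodings, and every byte of a non-ASCII character has its high bit set.

```python
# Sample text is arbitrary; any pure-ASCII string behaves the same way.
text = "ed is the standard text editor"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For 7-bit characters the two encodings are byte-for-byte identical.
same_encoding = ascii_bytes == utf8_bytes

# Every byte has the most significant bit clear.
high_bit_clear = all(b < 0x80 for b in utf8_bytes)

# A non-ASCII character becomes a multi-byte sequence in which
# every byte has the most significant bit set.
accented = "\u00e9".encode("utf-8")  # LATIN SMALL LETTER E WITH ACUTE
multi_byte_high = all(b >= 0x80 for b in accented)

print(same_encoding, high_bit_clear, len(accented), multi_byte_high)
```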
Whether ed supports UTF-8 or not is not said in the announcement. IMHO 8 bit cleanliness defines support of 8 bit character sets, not stripping away or clearing the most significant bit. UTF-8 is more than this, the editor must be able to handle the variable length encoding.
But whether it can or cannot be used for writing texts in the European languages I use regularly, I don't see a reason why I personally would do it using ed. As long as all American programmers remember every day that the world is not 7 bit, and that not even all ASCII characters are reachable on the keyboard without using a modifier key, I'm happy. The original question shows that there is work to do, so excuse my long comment.
Posted Jan 3, 2012 0:56 UTC (Tue)
by Karellen (subscriber, #67644)
[Link] (2 responses)
Anyway, whatever that Wikipedia article means, it's kind of a red herring, as "ed" being 8-bit clean means that it can handle both 8-bit character sets (e.g. ISO-8859-*) and 8-bit character encodings (e.g. UTF-8).
Posted Jan 3, 2012 19:39 UTC (Tue)
by blitzkrieg3 (guest, #57873)
[Link] (1 responses)
Posted Jan 3, 2012 20:01 UTC (Tue)
by khim (subscriber, #9252)
[Link]
UTF-8 is not just a run-of-the-mill variable-length encoding. Ken Thompson modified IBM's original proposal to make sure most algorithms which treat strings as a sequence of 8-bit characters were still usable with UTF-8. This means that yes, you can easily use UTF-8 with programs like GNU ed or GNU M4 which know absolutely nothing about UTF-8 but correctly support 8-bit characters in strings.
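To illustrate why byte-oriented tools keep working, here is a small Python sketch (the sample line is invented) that searches and splits UTF-8 data purely as bytes, the way an 8-bit clean program that knows nothing about UTF-8 would. It relies on two properties of the encoding: ASCII byte values never occur inside a multi-byte sequence, and a match for a valid encoded substring always falls on character boundaries.

```python
# A line of text as an 8-bit clean tool would see it: just bytes.
line = "na\u00efve caf\u00e9, \u65e5\u672c\u8a9e".encode("utf-8")
needle = "caf\u00e9".encode("utf-8")

# Plain byte search, with no knowledge of UTF-8.
idx = line.find(needle)

# The match starts on a character boundary, so the prefix is still valid UTF-8.
prefix = line[:idx].decode("utf-8")

# Splitting on an ASCII delimiter cannot cut a multi-byte character in half.
fields = [f.decode("utf-8") for f in line.split(b", ")]


def non_ascii_bytes_all_high(s: str) -> bool:
    """True if every byte of every non-ASCII character is >= 0x80."""
    return all(
        b >= 0x80
        for ch in s
        if ord(ch) > 0x7F
        for b in ch.encode("utf-8")
    )


print(idx, repr(prefix), fields)
```

What such a tool cannot do is count or address characters: a regular expression `.` or a column number still refers to a byte.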
Posted Jan 3, 2012 3:28 UTC (Tue)
by wahern (subscriber, #37304)
[Link] (9 responses)
For the time being people have low expectations. But political and technical movements like Simplified Chinese will eventually hit substantial cultural barriers, and the pushback will require that software handle locales which didn't adapt to western syntax. That will mean following the Unicode rules to a T. To follow the Unicode rules you have to use an API for even simple things like "character" iteration, etc., unless the programming language supports the proper semantic text operations, like Perl6 can over graphemes using its neat NFG hack. Scripts like Thai have no mandatory punctuation, so again you need to use accessors with a complex built-in rule base to detect, e.g., end-of-sentence. There's no hacking in this kind of support after the fact; it has to be baked into the code.
APIs like ICU are huge, but in many cases can make the code more clear. Unfortunately ICU doesn't get used much because the rule tables are so gargantuan that virtual memory explodes (though most of that is mmap'd straight from disk), and programmers are still beholden to their notion of low-level C-like character strings.
In 10-20 years we are going to see a surge in demand for I18N and L10N programmers to refactor all the crap hacks that came out of the 1990s, heralded by Microsoft's and Sun's half-hearted adoption of UTF-16.
Posted Jan 3, 2012 3:34 UTC (Tue)
by mjg59 (subscriber, #23239)
[Link] (3 responses)
"UTF-32 encoding form: The Unicode encoding form that assigns each Unicode scalar value to a single unsigned 32-bit code unit with the same numeric value as the Unicode scalar value"
So UTF-32 isn't variable length. The sudden rise in the use of emoji and other non-BMP characters means that ignoring the variable length of UTF-16 is already broken in real-world cases in non-CJK markets, too.
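The difference in code-unit counts is easy to see in practice. This is a short Python sketch, with helper names invented here, that counts code units for a BMP character and for a non-BMP emoji.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to encode s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


def utf32_units(s: str) -> int:
    """Number of 32-bit code units needed to encode s as UTF-32."""
    return len(s.encode("utf-32-le")) // 4


bmp_char = "\u00e9"         # inside the Basic Multilingual Plane
emoji = "\U0001F600"        # outside the BMP

print(utf16_units(bmp_char), utf32_units(bmp_char))
print(utf16_units(emoji), utf32_units(emoji))
```

Code that assumes one UTF-16 code unit per character will see the emoji as two "characters" (a surrogate pair) and can split it in the middle.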
Posted Jan 3, 2012 4:02 UTC (Tue)
by wahern (subscriber, #37304)
[Link] (2 responses)
Question: do all combining sequences have precomposed equivalents? I think all the Latin ones do, but what about other scripts?
Posted Jan 3, 2012 4:04 UTC (Tue)
by wahern (subscriber, #37304)
[Link]
Q: Doesn’t it cause a problem to have only UTF-16 string APIs, instead of UTF-32 char APIs?
A: Almost all international functions (upper-, lower-, titlecasing, case folding, drawing, measuring, collation, transliteration, grapheme-, word-, linebreaks, etc.) should take string parameters in the API, not single code-points (UTF-32). Single code-point APIs almost always produce the wrong results except for very simple languages, either because you need more context to get the right answer, or because you need to generate a sequence of characters to return the right answer, or both.
(Source: http://unicode.org/faq/utf_bom.html)
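Both failure modes named in the FAQ answer can be demonstrated with Python's built-in string case mapping, used here only as a convenient illustration. Uppercasing can return more characters than it was given, and lowercasing can depend on the surrounding characters.

```python
# One input character, two output characters: a single code-point API
# has nowhere to put the second one.
sharp_s_upper = "\u00df".upper()            # LATIN SMALL LETTER SHARP S
word_upper = "stra\u00dfe".upper()

# Context dependence: Greek capital sigma lowercases differently
# at the end of a word than it does on its own.
lone_sigma = "\u03a3".lower()
word_final = "\u039f\u0394\u039f\u03a3".lower()

print(sharp_s_upper, word_upper, lone_sigma, word_final)
```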
Posted Jan 3, 2012 12:02 UTC (Tue)
by mpr22 (subscriber, #60784)
[Link]
Posted Jan 3, 2012 5:56 UTC (Tue)
by ssmith32 (subscriber, #72404)
[Link] (1 responses)
Yet somehow, this is what always happens :D
Posted Jan 3, 2012 12:55 UTC (Tue)
by sorpigal (guest, #36106)
[Link]
There, fixed it for ya.
Posted Jan 3, 2012 23:52 UTC (Tue)
by cmccabe (guest, #60281)
[Link] (2 responses)
UTF-8 works great for what I need. My only wish is that it had been invented sooner, so that people didn't come up with N+1 different subtly defective, backwards incompatible "wide character" solutions.
If I were performing fancy operations on text, I would probably do it in a higher level language with built-in unicode support. At that point the encoding should be a non-issue (right?) because the high level language abstracts that away.
Posted Jan 4, 2012 2:38 UTC (Wed)
by tialaramex (subscriber, #21167)
[Link] (1 responses)
In practice, I can't think of any languages like that. Many of them are built by people who at best assumed other writing systems are just like Latin except with differently shaped squiggles. They often mandate that "text" means "UTF-16 strings" and then blunder into all sorts of problems with filenames, URLs, streams of bytes some idiot stashed in a "text" field on a database, and other things that definitely aren't UTF-16 strings. There may be built-in assumptions about writing direction, the meaning of "character" (a very, very tricky issue) and so on.
As a rule of thumb if the language claims to be "high level" and yet it has a "character" data type that's distinct from a string, or can be treated meaningfully as an integer, or it has the same data type for binary data and text, then either they're yanking your chain or they had no idea about Unicode. C has the excuse that Unicode literally didn't exist back then. Languages like Python will have to provide their own excuses.
Some more bad signs:
• Mentions of the "length" of a string that don't either include or point at a multi-paragraph discussion of what "length" means in this context.
• Discussion of collation or "sorting" strings that doesn't mention locale.
• A string equality operator or comparison method that doesn't come with a multi-paragraph discussion of Unicode equivalence.
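Two of these warning signs, "length" and equality, are easy to demonstrate. The sketch below uses Python's standard `unicodedata` module, and the example strings are chosen here for illustration. The same visible text has two different code-point lengths and compares unequal until both sides are normalized.

```python
import unicodedata

composed = "\u00e9"      # single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Both render as the same character, yet a naive comparison fails
# and the naive lengths differ.
naive_equal = composed == decomposed
lengths = (len(composed), len(decomposed))

# Canonical equivalence only shows up after normalization.
nfc_equal = unicodedata.normalize("NFC", decomposed) == composed
nfd_equal = unicodedata.normalize("NFD", composed) == decomposed

print(naive_equal, lengths, nfc_equal, nfd_equal)
```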
Of course a lot of this stuff can be /fixed/ in theory. But fixes after the fact are often messy. They can involve things like deprecated methods on core objects, parallel APIs replacing every mention of character with "string", or even inventing another type "Unicode string" and then going around replacing all the other APIs in the system with Unicode-friendly ones, leaving maintenance programmers to handle the debris.
Posted Jan 4, 2012 16:33 UTC (Wed)
by cmccabe (guest, #60281)
[Link]
Posted Jan 3, 2012 13:15 UTC (Tue)
by bjartur (guest, #67801)
[Link]
Posted Jan 3, 2012 1:06 UTC (Tue)
by halfline (guest, #31920)
[Link] (6 responses)
Posted Jan 3, 2012 1:48 UTC (Tue)
by nescafe (subscriber, #45063)
[Link] (5 responses)
Posted Jan 3, 2012 3:38 UTC (Tue)
by tnoo (subscriber, #20427)
[Link] (4 responses)
golem$ ed
?
---
Note the consistent user interface and error reportage. Ed is generous enough to flag errors, yet prudent enough not to overwhelm the novice with verbosity.
“Ed is the standard text editor.”
Ed, the greatest WYGIWYG editor of all.
Posted Jan 3, 2012 8:09 UTC (Tue)
by rsidd (subscriber, #2582)
[Link] (1 responses)
(from here)
Posted Jan 4, 2012 11:15 UTC (Wed)
by tnoo (subscriber, #20427)
[Link]
trap "" SIGINT;while :;do read x;echo \?;done
Posted Jan 3, 2012 15:28 UTC (Tue)
by NAR (subscriber, #1313)
[Link] (1 responses)
Posted Jan 6, 2012 1:25 UTC (Fri)
by k8to (guest, #15413)
[Link]
Sometimes I did legitimately fix bugs in source on the server system using ed commands. It was painful.
Posted Jan 3, 2012 19:38 UTC (Tue)
by nicku (subscriber, #777)
[Link] (2 responses)
Posted Jan 3, 2012 21:00 UTC (Tue)
by JoeBuck (subscriber, #2330)
[Link] (1 responses)
Posted Jan 9, 2012 15:43 UTC (Mon)
by ndk (subscriber, #43509)
[Link]
Posted Jan 5, 2012 2:11 UTC (Thu)
by neilbrown (subscriber, #359)
[Link] (1 responses)
If 're' is a 'regular expression', then
Posted Feb 12, 2012 4:54 UTC (Sun)
by dirtyepic (guest, #30178)
[Link]
Posted Jan 5, 2012 17:18 UTC (Thu)
by jhhaller (guest, #56103)
[Link]
GNU ed 1.6 released
It means that this version of ed works with 8-bit character sets.
This is not true...
Even if you ignore IPA, not all Latin-alphabet combining sequences used in the orthography of natural languages have precomposed code points. For example, as far as I know there is still no precomposed code point for n̈ - and yes, this does have a use other than correctly representing the name of a certain fictional heavy metal band.
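This can be checked with Python's standard `unicodedata` module. In this short sketch, NFC normalization composes a sequence into a single code point wherever a precomposed form exists, so a sequence that survives NFC unchanged has none.

```python
import unicodedata

# "u" + COMBINING DIAERESIS has a precomposed form, so NFC collapses it.
u_diaeresis = unicodedata.normalize("NFC", "u\u0308")

# "n" + COMBINING DIAERESIS has no precomposed form, so NFC leaves
# the two code points as they are.
n_diaeresis = unicodedata.normalize("NFC", "n\u0308")

print(len(u_diaeresis), len(n_diaeresis))
```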
note the consistent user interface
help
?
?
?
quit
?
exit
?
bye
?
hello?
?
eat flaming death
?
^C
?
^C
?
^D
?
Source code for ed:
note the consistent user interface
while :;do read x;echo \?;done
It's strange to relate how happy I was writing all my (mostly Pascal) computing assignments at UNSW in ed on the locally compiled Unix through 2400 bps green terminals in 1986-1989. About twenty of us simultaneously wrote 6809 assembly language programs on a time-share OS/9 system running on one 68000 CPU in an ed-like editor.
/re/
will search for it.
/re/p
will search and then print.
g/re/p
will apply this globally - for every line that matches 're', print the line.
So if you wanted to write a program that just printed the lines that match a regular expression - what do you call it?
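The answer, of course, is grep. As a rough illustration of what `g/re/p` does, here is a minimal Python sketch. The function name and sample buffer are made up, and Python's `re` syntax stands in for ed's own regular expressions, which differ in details.

```python
import re


def global_print(lines, pattern):
    """Return every line that matches pattern, like ed's g/pattern/p."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


buffer = [
    "ed is the standard text editor",
    "vi is a visual editor",
    "emacs is something else",
]

for line in global_print(buffer, r"editor$"):
    print(line)
```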
GNU ed 1.6 released
My first editor on Unix was ed; vi and emacs hadn't been written yet. em (ed for mortals) was next. But then, moving to emacs instead of vi, I never really learned much of vi other than a, i, d, r, and x; most of my use of vi consists of a colon followed by an ed command. Go to the end of the file,
:$
make a copy of a line,
:.t.
move 3 lines,
:.,.+2t52
:.,.+2d
(assuming moving lines forward), and replace apple with banana,
:g/apple/s//banana/g
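For readers who have never used these commands, the following Python sketch models the buffer as a list of lines and mimics copy (`t`), delete (`d`) and a global substitute. All function names are invented for illustration, line numbers are explicit rather than relative to the current line, and this is not how ed is implemented. Note that ed also has a dedicated move command, `m`, which does the copy-and-delete in one step.

```python
import re


def copy_lines(buf, first, last, dest):
    """Like `first,last t dest`: copy lines first..last (1-based) after line dest."""
    return buf[:dest] + buf[first - 1:last] + buf[dest:]


def delete_lines(buf, first, last):
    """Like `first,last d`: delete lines first..last (1-based, inclusive)."""
    return buf[:first - 1] + buf[last:]


def substitute_all(buf, pattern, replacement):
    """Like `g/pattern/s//replacement/g`: replace every match on every line."""
    rx = re.compile(pattern)
    return [rx.sub(replacement, line) for line in buf]


buf = ["apple pie", "two", "three", "four", "five"]

# Move lines 1-2 to after line 4 by copying forward, then deleting the originals.
moved = delete_lines(copy_lines(buf, 1, 2, 4), 1, 2)
result = substitute_all(moved, "apple", "banana")
print(result)
```

Copying forward first means the original line numbers are still valid when the delete runs, which is the point of the "moving lines forward" remark.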
