Surprisingly relevant?

Posted May 20, 2020 13:33 UTC (Wed) by Paf (guest, #91811)
In reply to: Surprisingly relevant? by Wol
Parent article: The state of the AWK

A good chunk of the time this doesn’t matter, since it’s just processing small amounts of data. When working with large log files, I’ve occasionally needed to figure out efficiencies like this... but I don’t do serious “permanent data pipeline” stuff in awk anyway.

I do think about efficiency - for ad-hoc data processing, I start with “how fast can I do this without compromising the actual performance I need”, then work in from there if something’s slow.


Surprisingly relevant?

Posted May 20, 2020 18:49 UTC (Wed) by geert (subscriber, #98403)

For small amounts of data, the tool usually doesn't matter at all.

A long time ago, a colleague came to me for help doing search and replace in a very large file. His editor of choice was "xedit", and the search and replace operation seemed to hang, or at least took ages. I opened his file in "vi", which performed the same operation in the blink of an eye. Didn't even have to resort to sed.

Lesson learned: "xedit" was written as a sample program to show how to use the X11 Athena Widgets; it was never meant to be a production-level editor.

Surprisingly relevant?

Posted May 20, 2020 20:19 UTC (Wed) by NYKevin (subscriber, #129325)

In this context, we're talking about the fixed costs of setting up and tearing down O(1) extra processes (vs. setting up and tearing down exactly one awk process). A reasonable pipeline will scale to millions of lines of text very easily, because the per-process overhead just isn't that big compared to the actual work being done.

On the other hand, if you're doing a while read; do ...; done style thingy, then yes, it will be awful and slow. But I try to avoid that most of the time.
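
To make the contrast concrete, here is a minimal sketch, not anything from the thread itself. The function names sum_with_awk and sum_with_loop are made up for illustration, and the input is assumed to be lines with two fields separated by a single space. Both functions add up the second field; the first pays the process setup cost once, the second forks a cut process for every input line.

```shell
# Hypothetical helpers for illustration only.
# Assumed input: lines of the form "name number", single-space separated.

# One awk process reads the whole stream: fixed setup cost, paid once.
sum_with_awk() {
    awk '{ total += $2 } END { print total + 0 }'
}

# A while-read loop that spawns a pipeline (and a cut process) per line,
# so the setup/teardown cost grows with the number of lines.
sum_with_loop() {
    total=0
    while read -r line; do
        field=$(printf '%s\n' "$line" | cut -d' ' -f2)
        total=$((total + field))
    done
    echo "$total"
}
```

Feeding the same large input to each under time(1) should show the gap growing with the line count, though the exact numbers depend on the system and shell.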


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds