This may come off as a bit abusive and is probably full of fail, but what I'd like to see in a log format is null-delimited strings.
And it would look something like this:
log_version\0 machine_ident\0 machine_fqdn\0 timestamp\0 service_ident\0 service_string\0 process_id\0 severity\0 data\0 checksum\0\0\n
Something simple like that. The *_ident fields are UUIDs and are completely arbitrary.
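A minimal sketch of writing and reading one such record, in Python. Several things here are my assumptions and not part of the proposal: the spaces after each \0 in the example line are treated as being there for readability only, the checksum is a CRC32 (8 hex digits) over everything before it, and the names encode_record and decode_record are made up.

```python
import zlib

FIELDS = (
    "log_version", "machine_ident", "machine_fqdn", "timestamp",
    "service_ident", "service_string", "process_id", "severity", "data",
)
# Every field ends in NUL; the record then closes with one more NUL and "\n".
RECORD_END = b"\0\0\n"


def _checksum(body):
    # Assumption: CRC32 of all preceding bytes, written as 8 hex digits.
    return f"{zlib.crc32(body):08x}".encode("ascii")


def encode_record(values):
    """Serialize a dict of field name -> str into one record."""
    parts = []
    for name in FIELDS:
        raw = values[name].encode("utf-8")
        if b"\0" in raw:
            raise ValueError(f"{name} contains a NUL byte")
        parts.append(raw + b"\0")
    body = b"".join(parts)
    return body + _checksum(body) + RECORD_END


def decode_record(record):
    """Parse one record back into a dict, verifying the checksum."""
    if not record.endswith(RECORD_END):
        raise ValueError("missing record terminator")
    pieces = record[: -len(RECORD_END)].split(b"\0")
    if len(pieces) != len(FIELDS) + 1:
        raise ValueError("wrong number of fields")
    body = b"".join(piece + b"\0" for piece in pieces[:-1])
    if _checksum(body) != pieces[-1]:
        raise ValueError("checksum mismatch")
    return dict(zip(FIELDS, (p.decode("utf-8") for p in pieces[:-1])))
```

Splitting on \0 is all the parsing there is, which is the whole appeal.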
The 'machine_ident' would be generated when the syslog-like daemon first starts up, the way ssh keys are. When the logging daemon connects to a service or starts a new log file, it just pukes out a log entry with various useful system identification strings that any log parsing software can easily pick up, much like browsers identify themselves when they connect to a web server. That makes it easy to identify the machine by UUID: as long as you can read the first log entry in any file, or the first entry sent whenever it connects to a network logging daemon, you can figure out what it is pretty easily.
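Roughly what I have in mind for the first-start behaviour, as a Python sketch. The file location is up to the caller, and the function names and the particular identification strings in hello_data are hypothetical; the idea only says "various useful system identification strings".

```python
import json
import os
import platform
import uuid


def load_or_create_machine_ident(path):
    """Return the UUID stored at `path`, generating it on first start."""
    try:
        with open(path, "r", encoding="ascii") as fh:
            return str(uuid.UUID(fh.read().strip()))
    except FileNotFoundError:
        ident = str(uuid.uuid4())
        # O_EXCL makes a second daemon racing us fail loudly
        # instead of silently overwriting the first ident.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        with os.fdopen(fd, "w", encoding="ascii") as fh:
            fh.write(ident + "\n")
        return ident


def hello_data(machine_ident):
    """Identification payload for the first entry of a file or connection."""
    return json.dumps(
        {
            "machine_ident": machine_ident,
            "hostname": platform.node(),
            "os": platform.system(),
            "release": platform.release(),
        },
        sort_keys=True,
    )
```

The daemon would call load_or_create_machine_ident once at startup and put hello_data(...) in the data field of the first record it writes.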
Timestamps are just x.xxxx seconds since the Unix epoch, GMT. The timestamp can be as fine-grained as the application warrants and the system can deliver.
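A sketch of producing such a timestamp in Python. The name format_timestamp and the digits parameter are made up, and truncating instead of rounding is my choice. It works from integer nanoseconds so no precision is lost to floating point.

```python
import time


def format_timestamp(ns=None, digits=4):
    """Seconds since the Unix epoch as a decimal string with `digits` places."""
    if ns is None:
        ns = time.time_ns()  # integer nanoseconds since the epoch
    if not 0 <= digits <= 9:
        raise ValueError("digits must be between 0 and 9")
    seconds, frac_ns = divmod(ns, 1_000_000_000)
    if digits == 0:
        return str(seconds)
    # Truncate (do not round) to the requested number of decimal places.
    frac = frac_ns // 10 ** (9 - digits)
    return f"{seconds}.{frac:0{digits}d}"
```

An application that only has second resolution passes digits=0; one that cares about ordering within a microsecond passes digits=9.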
Severity level is similar to how Debian does their apt-pinning. Just a number, like 0-1000. And that number maps to different severity levels:
0-249 - debug
250-499 - info
500-749 - warning
750-1000 - error
That way application developers have a way of saying "well, this error is more of an error than that error", which seems important.
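The number-to-level mapping is tiny in code. In this Python sketch I'm assuming a boundary value such as 250 belongs to the higher band; severity_name is a made-up name.

```python
def severity_name(level):
    """Map a 0-1000 severity number to its coarse level name."""
    if not 0 <= level <= 1000:
        raise ValueError("severity must be between 0 and 1000")
    if level < 250:
        return "debug"
    if level < 500:
        return "info"
    if level < 750:
        return "warning"
    return "error"
```

A parser that only cares about the four classic levels calls this; one that wants the finer ranking just sorts on the raw number.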
The actual data field can be whatever you want. Any data, as long as it contains no nulls. More structuring can probably be layered on later, but this makes it easy to incorporate legacy logging data into this format. Just take the string as delivered by the application/server, stuff the entire thing into <data>, and wrap it in those other fields as well as can be done. <data> being JSON would be fine by me, and the fact that it's JSON or whatever would be recorded as part of the version string.
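Two ways of filling <data>, sketched in Python with made-up function names. Stripping the trailing newline from a legacy line is my assumption. The JSON variant is convenient because json.dumps escapes control characters, so a NUL inside a value comes out as \u0000 text and never as a raw null byte.

```python
import json


def legacy_data(line):
    """Pass a legacy log line through unchanged, minus its line ending."""
    text = line.rstrip("\r\n")
    if "\0" in text:
        raise ValueError("legacy line contains a NUL and cannot be stored as-is")
    return text


def json_data(payload):
    """Encode structured data as compact JSON for the data field."""
    return json.dumps(payload, separators=(",", ":"), sort_keys=True)
```

Which of the two a record uses would be signalled by the log_version field, as described above.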
I know something like that would make my job a lot easier. :)