The May 21 2.6 "must fix" IRC discussion

The following is a raw transcript from the second discussion of the 2.6.0 "must-fix" list; this one looked (mostly) at new features and wishlist items.

<hanna> Welcome everybody. Thanks for attending again. Same rules as last time.
<hanna> Dont talk unless you have something to add. Keep in depth technical
<hanna> discussions to the level appropriate for keeping this meeting around
<hanna> an hour. Lets get started.
aebr (~aeb@a213-84-53-62.adsl.xs4all.nl) has joined channel #lse
<hanna> akpm, ready to go. thanks for doing this again!
<akpm> np, thanks.
dm (~dm@12.98.126.212) has joined channel #lse
<akpm> latest paperwork is at ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/must-fix/must-fix-4a.txt
orjan (~orjan@c213-89-50-36.cm-upc.chello.se) has joined channel #lse
nevdull (~rick@32.97.110.142) has joined channel #lse
<akpm> it's not substantively different from v4 so I didn't put out a new one formally
<akpm> we need to do "power management" from the bugs section - that was missed last week
* zaxl (Auto-Away after 10 mins) [BX-MsgLog On]
msa (~maalanen@instlogin.cs.abo.fi) has joined channel #lse
mingo (~mingo@66.187.233.200) has joined channel #lse
<akpm> So mochel has a bunch of pending work there.
<akpm> o New device power management core code, both for individual devices,
<akpm> and for global state transitions.
<akpm> o A generic user interface for triggering system power state transitions.
<akpm> o Arch-independent code for performing state transitions, that calls
<akpm> platform-specific methods along the way.
<akpm> o A better suspend-to-disk mechanism than swsusp.
<akpm> mochel: what's the status of all that?
<mochel> i'm testing/debugging it right now
<akpm> mochel: ok. Anything to add there?
<mochel> it's not critical, and is fairly self-contained
ChanServ (services@services.oftc.net) has changed mode for #lse to +o riel
<mochel> i.e. it doesn't need to hold up 2.6.0, and could safely be added later.
<Alan> mochel: do we have the ability to get a device suspend call back after we finally turn off IRQ's now ?
<mochel> Alan: yes.
putch (~planchett@81.49.240.199) has joined channel #lse
vana (vana@147.32.240.58) has joined channel #lse
<akpm> Now there's a bunch of things from Alan
<akpm> o PCI locking
<putch> slt tout le monde
<akpm> gregkh is working on that
<akpm> (its a dupe)
dipankar (~dipankar@129.42.208.139) has quit: Remote host closed the connection
<akpm> o Frame buffer restore codepaths (that requires some deep PCI magic)
putch (~planchett@81.49.240.199) has left channel #lse
dipankar (~dipankar@32.97.110.66) has joined channel #lse
<akpm> I don't know what the story is there. Is it critical?
<Alan> akpm: I dont recognize that as one of mine
<Alan> there are pending issues about how we get frame buffer back ok from swsuspend and also a pci locking related X + hotplug one that I think is post 2.6
<akpm> Alan: OK. You did send that actually ;)
<akpm> o XFree86 hooks
<akpm> o AGP restoration
<akpm> o DRI restoration
aeb (~aeb@24.132.5.101) has quit: Ping timeout: 488 seconds
<Alan> akpm: XFree86 hooks I think has to be post 2.6 now
<akpm> they all sound related.
<akpm> Alan: OK, what are the implications of not having them?
<davej> AGP has the hooks in place, just needs someone to do the chipset relevant bits
<Alan> akpm: you suspend the laptop and resume and your first 3d app crashes
<Alan> the box
<Alan> davej: for most of them the resume == init
<davej> *nod*
<mochel> it's gonna be hell and will take time.
<hch> davej: actually the agp hooks are gone now
<davej> I'd rather that was tested and patched instead of just blindly changed though
<mochel> but, they're all feature-related, and not bug-related..
<hch> instead the agp drivers use the pci hooks
<davej> hch: the generic ones yes
<akpm> Alan: that sounds fairly fatal. The workaround is to exit X?
<Alan> akpm: if AGP/DRI is modular otherwise reboot
<Alan> and if your desktop is 3d based its really messy
<Alan> DRI stuff X folks are working ok
<Alan> s/ok/on
<akpm> Alan: does 2.4 crash in this manner as well?
<Alan> AGP stuff in general means calling the chip init routine again to reload the AGP memory data
<Alan> akpm: vendor 2.4 has patches
<Alan> thats how I know calling the init routine normally does all you need
dmo (~dmo@208.186.192.194) has joined channel #lse
<Alan> Also for APM suspend (as opposed to ACPI) most APM bioses will save/restore them
<akpm> Alan: so what is the conclusion here? Rely on vendor semi-fixes for 2.6?
hbaum_ (~hbaum@32.97.110.142) has joined channel #lse
hbaum_ (~hbaum@32.97.110.142) has quit: Client Quit
<Alan> akpm: if the hooks work its just another "this driver wants a few fixes"
<davej> its important, but not 'lets hold back 2.6.0' important IMO
<akpm> davej: OK. And you have a handle on what's needed?
<Alan> davej: right - 2.6.0 with working hooks doesnt stop 2.6.1 with fixes for AGP resume in drivers
<davej> alan, akpm: sure
lvmguy (~mauelshag@208.210.149.34) has joined channel #lse
<mochel> i can make sure the hooks are there.
<mochel> at least from the driver core..
<akpm> I'm surprised I've never seen a bug report of this. I assume the APM restore is working well?
<Alan> akpm: depends on the laptop vendor
<akpm> hm, OK. next.
<akpm> o IDE suspend/resume without races (Ben is looking at this a little)
oxymoron (~oxymoron@waste.org) has joined channel #lse
dipankar (~dipankar@32.97.110.66) has quit: Remote host closed the connection
<akpm> o How to deal with devices that babble (some stuff we have to global IRQ
<akpm> off to save, and global IRQ on -after- we recover with APM)
<akpm> Alan: which devices do you mean here?
<Alan> akpm: Ben's IDE approach is elegant and seems to solve the problem (he's queuing the suspend/resume on the request queue)
<akpm> that's neat
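
The approach Alan describes, queuing the suspend/resume on the request queue, can be modeled in a few lines of C. This is a minimal sketch under stated assumptions, not Ben's actual IDE code: the names pm_queue, pm_submit and pm_run are hypothetical, and the choice to insert the resume marker at the head of the queue is an assumption made so that it is not stuck behind I/O that arrived while the device was asleep. Because the suspend marker travels through the same FIFO as ordinary I/O, everything submitted before it completes first, which is what removes the race.

```c
#include <stddef.h>

/* Hypothetical request types: ordinary I/O plus two power-management markers. */
enum pm_req_type { PM_REQ_IO, PM_REQ_SUSPEND, PM_REQ_RESUME };

struct pm_req {
    enum pm_req_type type;
    int id;                  /* caller-chosen tag, for illustration only */
};

#define PM_QUEUE_MAX 16

struct pm_queue {
    struct pm_req slots[PM_QUEUE_MAX];
    size_t head, count;      /* ring buffer holding requests in FIFO order */
    int suspended;           /* set once a suspend marker has been processed */
};

/* Queue a request; returns 0 on success, -1 if the queue is full. */
int pm_submit(struct pm_queue *q, enum pm_req_type type, int id)
{
    size_t slot;

    if (q->count == PM_QUEUE_MAX)
        return -1;
    if (type == PM_REQ_RESUME) {
        /* Assumption: resume goes to the front so it can wake the device. */
        q->head = (q->head + PM_QUEUE_MAX - 1) % PM_QUEUE_MAX;
        slot = q->head;
    } else {
        slot = (q->head + q->count) % PM_QUEUE_MAX;
    }
    q->slots[slot].type = type;
    q->slots[slot].id = id;
    q->count++;
    return 0;
}

/*
 * Process requests in order. Completed I/O ids are stored in done[];
 * the return value is how many were completed. While suspended, only a
 * resume marker at the head is accepted and I/O stays queued.
 */
size_t pm_run(struct pm_queue *q, int *done, size_t max_done)
{
    size_t n = 0;

    while (q->count > 0) {
        struct pm_req *r = &q->slots[q->head];

        if (q->suspended && r->type != PM_REQ_RESUME)
            break;
        if (r->type == PM_REQ_IO) {
            if (n == max_done)
                break;
            done[n++] = r->id;
        } else {
            q->suspended = (r->type == PM_REQ_SUSPEND);
        }
        q->head = (q->head + 1) % PM_QUEUE_MAX;
        q->count--;
    }
    return n;
}
```

Submitting two I/O requests, a suspend marker and a third I/O request, then calling pm_run, completes the first two and leaves the third queued until a resume marker is submitted.
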
dardhal (~dardhal@213.0.201.144) has joined channel #lse
<Alan> akpm: devices that babble is resolved- Mochel confirmed there is at least one pm callback point after irqs are disabled
* akpm deletes
dipankar (~dipankar@202.88.171.30) has joined channel #lse
<akpm> o Pat's swsusp rework?
<mochel> Alan: and another on resume before irqs are enabled
<Alan> mochel: perfect
<mochel> akpm: part of same pm work. doing suspend to disk to verify generic model works.
<akpm> mochel: yup. And you're not actually looking at suspend-to-swap as such are you?
<mochel> akpm: no. it's configurable, but currently doing suspend-to-dedicated partition.
<akpm> mochel: ok, that's a dupe anyway
<akpm> last up in PM:
<akpm> o Pat: There are already CPU device structures; MTRRs should be a
<akpm> dynamically registered interface of CPUs, which implies there needs
<akpm> to be some other glue to know that there are MTRRs that need to be
<akpm> saved/restored.
<akpm> is that a "bug"?
<mochel> akpm: yes, but not serious.
<davej> wasn't there already a patch for the MTRR sysfs stuff ?
<mochel> akpm: some work needs to be done in that area, and shouldn't be much of a problem
<mochel> davej: yes, but they wanted to register mtrrs as devices, which they clearly aren't
ddaM (~dominant@210.186.172.65) has joined channel #lse
<akpm> so what's the conclusion here? Nice to have, nobody's actively working it?
<mochel> akpm: i'll take care of it in the next couple of weeks.
<akpm> mochel: ok, thanks.
cryogen (~cryogen@cryogen.noc.oftc.net) has joined channel #lse
<akpm> Not-ready features and speedups
<akpm> drivers/block/
<akpm> o Framework for selecting IO schedulers
<akpm> this is basically the last missing piece before we can merge the anticipatory scheduler and CFQ
<akpm> Jens will be doing this.
<hch> what's the state of CFQ?
<akpm> o CFQ scheduler. Seems to work but Jens planning significant rework.
<rdunlap> runtime-selectable per device or what?
<akpm> hch: experimental but stable, basically.
<akpm> rdunlap: yes
<akpm> per queue
<wli> I'd like to have runtime-selectable by queue
<akpm> o Anticipatory scheduler. Working OK now, still has problems with seeky
<akpm> OLTP-style loads.
<akpm> AS is still under development a bit. So the selectable queue feature is really the only way in which we can acceptably merge it.
<akpm> But I think we need to push ahead with this regardless because it makes such a difference with most things.
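
What "runtime-selectable per queue" amounts to can be sketched as an ops table that each queue points at and that can be swapped while requests are pending. This is an illustrative model only: struct sched_ops, ioq_add, ioq_dispatch and the "nearest" policy are hypothetical names, and "nearest" is a toy shortest-seek stand-in, not the anticipatory scheduler, CFQ or deadline.

```c
#include <stddef.h>

#define IOQ_MAX_PENDING 8

struct io_queue;

/* Hypothetical scheduler interface: one ops table per policy. */
struct sched_ops {
    const char *name;
    /* Return the index of the pending request to dispatch next. */
    size_t (*pick)(const struct io_queue *q);
};

struct io_queue {
    unsigned long sectors[IOQ_MAX_PENDING]; /* pending requests, arrival order */
    size_t nr;
    unsigned long head_pos;                 /* sector of the last dispatch */
    const struct sched_ops *sched;          /* per queue, swappable at runtime */
};

/* noop: plain FIFO, leaves all the cleverness to the controller. */
static size_t noop_pick(const struct io_queue *q)
{
    (void)q;
    return 0;
}

static unsigned long seek_distance(unsigned long a, unsigned long b)
{
    return a > b ? a - b : b - a;
}

/* Toy policy: choose the pending request closest to the last dispatch. */
static size_t nearest_pick(const struct io_queue *q)
{
    size_t best = 0;

    for (size_t i = 1; i < q->nr; i++)
        if (seek_distance(q->sectors[i], q->head_pos) <
            seek_distance(q->sectors[best], q->head_pos))
            best = i;
    return best;
}

const struct sched_ops sched_noop = { "noop", noop_pick };
const struct sched_ops sched_nearest = { "nearest", nearest_pick };

/* Queue a request; returns 0 on success, -1 if the queue is full. */
int ioq_add(struct io_queue *q, unsigned long sector)
{
    if (q->nr == IOQ_MAX_PENDING)
        return -1;
    q->sectors[q->nr++] = sector;
    return 0;
}

/* Dispatch one request using the queue's current policy (requires nr > 0). */
unsigned long ioq_dispatch(struct io_queue *q)
{
    size_t i = q->sched->pick(q);
    unsigned long sector = q->sectors[i];

    /* Close the gap while preserving arrival order of the rest. */
    for (; i + 1 < q->nr; i++)
        q->sectors[i] = q->sectors[i + 1];
    q->nr--;
    q->head_pos = sector;
    return sector;
}
```

Keeping the policy behind a pointer in the queue is what lets a driver, or an administrator at runtime, pick noop for one device while another device keeps a smarter policy.
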
<oxymoron> Has there been any thought towards automatically shrinking the 'anticipation window' when it's not working?
<akpm> We'll never be better than deadline for databases though.
<wli> akpm: is the TCQ issue resolved?
<akpm> oxymoron: we do lots and lots of things like that.
<wli> (performance issue)
<akpm> wli: it's getting better, but no, not fully.
<Alan> akpm: the selectable stuff is important also because we have controllers that want to turn off smarts in the OS, or which interact badly because they do their own prediction
<hch> yupp s390 dasd tries to select a noop scheduler
<akpm> Alan: yup. I think the driver makes that decision at present, but being able to set the noop elevator at runtime will be possible
<hch> in 2.4 they still need some patches for that
<hch> the noop elevator is totally broken in 2.5 IIRC
<akpm> hch: hm, what driver uses it?
<hch> akpm: S/390 dasd
<hch> in fact it doesn't use it anymore in 2.5 because of that brokenness
<akpm> hch: OK. I'll note that
<davej> does 2.5 s390 even work right now?
<Alan> aacraid will want to use it too from timings I've done with 2.4
<Alan> and I suspect other raid
<hanna> davej, I believe so
<hch> davej: they posted a big patchkit some time ago
<davej> apart from the s390x -> s390 merge, I don't recall seeing much recently
<hch> davej: I think Linus dropped it as usual
<gerrit> davej: s390 is staying up to date, but right, not all patches are in tree.
<akpm> hch: no, they got consolidated
<hch> akpm: that was the patchkit before the ignored one :)
<akpm> hch ;)
<gerrit> akpm: s390 shouldn't hold 2.6.0. They are aware of the closedown but focused on new products with 2.4 at the moment
<akpm> next up: the qlogic disas^Wsaga
tytso (~tytso@dsl092-109-027.nyc2.dsl.speakeasy.net) has joined channel #lse
<akpm> we seem to have about six drivers, none of which work.
<akpm> hch: what's the plan there?
<Jes> akpm, scsi or fc?
<akpm> Jes: both
<hch> akpm: my suggestion would be to merge and fixup the feral driver
<akpm> hch: OK. What's this new driver which jejb is playing with?
<hch> akpm: yes
<Jes> jejb and I are working on qla1280, but jejb is also playing with the feral
<hch> he's also playing with the fc-only qlogic driver, too
<hch> and qla1280 together with Jes
<hch> the feral driver has the advantage that it covers all qlogic hbas
pryzbyj (~pryzbyj@24.59.52.42) has joined channel #lse
<hch> scsi/fc and pci/sbus
<akpm> hch: mjacob has plans to do some significant restructuring of the driver
<hch> akpm: yes
<hch> akpm: that's what I meant
<akpm> sounds like stuff is happening anyway
<wli> hch: should we be concerned about the size of the BSD emulation layer?
arnd (arnd@80.138.129.137) has joined channel #lse
* hch kicks wli
<wli> I'll take that as a "no".
<hch> wli: read the code
<akpm> next up?
<akpm> o cryptoloop: jmorris: There's no cryptoloop in the 2.4 mainline kernel,
<akpm> but I think every distro ships some version. It would probably be useful
<akpm> to have crypto natively supported in 2.6, with backward compatibility for
<akpm> the majority of 2.4 users.
patman (~patman@bi01p1.co.us.ibm.com) has joined channel #lse
* akpm looks for a loop maintainer
<viro> akpm: -ENOENT or axboe
keith (~keith@32.97.110.142) has joined channel #lse
<davej> akpm: ISTR hugh was looking at loop stuff a while back with views to merging some new variant of the loop code
<hch> Adam Richter also has a loop rewrite
<akpm> davej: several people play with loop.
<viro> hch: er
<hch> and one of those cryptoloop guys
<hch> neither of them looks sane
<viro> hch: I'd seen his loop rewrites - quite a few of them
<viro> hch: _not_ a happy sight
<davej> any sane bits in any of the 15 rewrites going on ?
<oxymoron> Might make sense to merge distro version post 2.6.0?
<akpm> davej: they're all too damn big
<davej> ick
<akpm> loop needs a little restructuring to use the crypto api
<akpm> basically move it away from virtual addresses, use page/offset instead.
<hch> s/ to use the crypto api// :)
<oxymoron> What's become of the deadlock issues?
<akpm> oxymoron: there are a few memory-pressure related nasties. That's something I keep meaning to look at. But it'll be another bandaid I expect
<oxymoron> akpm: Mostly fixable with mempool stuff, I presume..
<akpm> If we had someone to work it seriously then forking off a loop-ng for a while may be appropriate
<riel> yes, any crypto memory pressure deadlocks should be fixable with mempool
<riel> memory usage is predictable
<akpm> but until someone signs up to this, it ain't going anywhere.
<riel> point
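
riel's point about mempool rests on keeping a small, fixed reserve so that a code path with predictable memory needs can always make forward progress. The sketch below models that idea in userspace C under stated assumptions: the toy_pool names are hypothetical and are not the kernel's mempool interface, and the under_pressure argument exists only to simulate the normal allocator failing.

```c
#include <stdlib.h>

#define TOY_POOL_MIN 4   /* hypothetical reserve size */

struct toy_pool {
    void *reserve[TOY_POOL_MIN];  /* elements set aside up front */
    size_t nr_reserved;
    size_t elem_size;
};

/* Free whatever is still held in the reserve. */
void toy_pool_destroy(struct toy_pool *p)
{
    while (p->nr_reserved > 0)
        free(p->reserve[--p->nr_reserved]);
}

/* Fill the reserve before it is needed; returns 0 on success, -1 on failure. */
int toy_pool_init(struct toy_pool *p, size_t elem_size)
{
    p->elem_size = elem_size;
    p->nr_reserved = 0;
    while (p->nr_reserved < TOY_POOL_MIN) {
        void *e = malloc(elem_size);

        if (!e) {
            toy_pool_destroy(p);
            return -1;
        }
        p->reserve[p->nr_reserved++] = e;
    }
    return 0;
}

/*
 * Try the ordinary allocator first and dip into the reserve only when
 * it fails. under_pressure simulates that failure for illustration.
 */
void *toy_pool_alloc(struct toy_pool *p, int under_pressure)
{
    void *e = under_pressure ? NULL : malloc(p->elem_size);

    if (e)
        return e;
    if (p->nr_reserved > 0)
        return p->reserve[--p->nr_reserved];
    return NULL;  /* a real pool would wait here for an element to be freed */
}

/* Returned elements refill the reserve before going back to the allocator. */
void toy_pool_free(struct toy_pool *p, void *e)
{
    if (p->nr_reserved < TOY_POOL_MIN)
        p->reserve[p->nr_reserved++] = e;
    else
        free(e);
}
```

The guarantee only holds when the number of elements in flight is bounded, which is the sense of "memory usage is predictable" above.
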
<oxymoron> I personally think a better way to do crypto here is to be able to shim it in between arbitrary block devices - thinner, less complicated.
obiwan (~hussein@61.6.137.45) has left channel #lse: Client Exiting
<hch> oxymoron: *nod*
<Alan> oxym: probably -> 2.7
<akpm> next up?
<oxymoron> Next..
<hch> mixup of loop on bdev and loop on file is one of the reasons for really messy cruft in loop.c
<akpm> drivers/md/
<akpm> o ioctl interface cleanup patch is ready (redo the structure layouts)
<akpm> o A port of the 2.4 snapshot target is in progress
<viro> hch: there's an elegant way to deal with that, but that's for #kernel
<thornber> patchset will be on its way for RFC
<akpm> this is all happening - not much to say really. Unless someone has something to add?
<thornber> 1 outstanding problem...
<hch> akpm: dm is doing some really nasty stuff in the ioctl code
<thornber> table reload needs to be split into load/commit
<hch> e.g. tried to give the user a bdev->bd_count value
<thornber> so no allocs occur while the dev is suspended
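
The load/commit split thornber asks for is a two-phase update: do everything that can allocate or fail while the device is still running, then make the switch itself a pointer swap. A minimal sketch, assuming hypothetical toy_dev and toy_table types that are not the device-mapper structures:

```c
#include <stdlib.h>
#include <string.h>

struct toy_table {
    size_t nr;
    int *targets;            /* stand-in for the real mapping entries */
};

struct toy_dev {
    struct toy_table *live;    /* table currently serving I/O */
    struct toy_table *staged;  /* loaded but not yet committed */
    int suspended;
};

static void toy_table_free(struct toy_table *t)
{
    if (t) {
        free(t->targets);
        free(t);
    }
}

/* Phase 1: build the new table. May allocate and may fail; device keeps running. */
int toy_load(struct toy_dev *d, const int *targets, size_t nr)
{
    struct toy_table *t;

    if (nr == 0)
        return -1;
    t = malloc(sizeof(*t));
    if (!t)
        return -1;
    t->targets = malloc(nr * sizeof(*targets));
    if (!t->targets) {
        free(t);
        return -1;
    }
    memcpy(t->targets, targets, nr * sizeof(*targets));
    t->nr = nr;
    toy_table_free(d->staged);   /* a newer load replaces an older staged one */
    d->staged = t;
    return 0;
}

/* Phase 2: swap tables while suspended. No allocation happens here. */
int toy_commit(struct toy_dev *d)
{
    struct toy_table *old;

    if (!d->suspended || !d->staged)
        return -1;
    old = d->live;
    d->live = d->staged;
    d->staged = NULL;
    toy_table_free(old);
    return 0;
}
```

The intended sequence is load, suspend, commit, resume, so that nothing inside the suspended window can block waiting for memory.
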
<viro> hch: seconded
<hch> which is of course completely meaningless once it returned
<hch> and other stuff like that
<thornber> hch: agreed
<hch> I think there needs to be a new ioctl protocol revision to fix that stuff before 2.6.0
<thornber> y, there is one coming
<viro> hch: moreover, they do rather messy lookup by textual representation of major:minor (sic)
<akpm> hch, viro: the md developer are sitting on a ton of patches.
<hch> thornber: what happened to the filesystem interface, btw?
<hch> thornber: iirc that was a must deliver for merging dm
<thornber> I wrote one, noone liked it, gregkh volunteered to do another
<thornber> still waiting ..
<viro> thornber, gregkh: Cc patches to me, I can help with that
<gregkh> thornber: mine is nowhere near completed, I haven't touched it in a long time.
<gregkh> thornber: -ENOTIME
<hch> viro: yupp, that open underlying stuff by dev_t stuff has to go
<thornber> the interface is a separate module to core dm
<viro> thornber: -> #kernel
<hch> I already pointed that out to the sistina folks when doing open_bdev_excl()
<gregkh> thornber: exactly, I didn't touch the core dm stuff almost at all.
<viro> thornber: and private email
bongani (~bongani@196.30.125.234) has joined channel #lse
andmike (~andmike@32.97.110.142) has joined channel #lse
<akpm> thornber: when do you expect to publish the current patchset?
<thornber> I can push something tomorrow
<akpm> thornber: ok thanks, that should be fun
<akpm> anyone have anything else on devicemapper?
<akpm> fs/
<Alan> akpm: other than being sure we get the block splitting to redo ideraid/dm with it no
<akpm> Alan: yup, we covered that last week. Jens and neilb are doing things, and Arjan seems to be OK with the plans
<akpm> o ext3 lock_kernel() removal: that part works OK and is mergeable. But
<akpm> we'll also need to make lock_journal() a spinlock, and that's deep surgery.
<akpm> this is a sore point
<akpm> One of our main journalling filesystems rather sucks on SMP.
<akpm> Alex Tomas has a humongous patch. I need to review it.
<wli> I've not seen many that do well, largely due to PAE interactions.
<akpm> wli: ?
madd (~dominant@210.186.172.12) has joined channel #lse
<viro> akpm: humongous patch -> return to sender, ask to split, review results of split...
<wli> akpm: heavy usage of vmalloc() OOM's xfs quickly, reiserfs tends to create a lot of bh's for some reason, jfs has an analogous global semaphore.
<hch> wli: where do you see 'heavy use of vmalloc' ?
<wli> hch: er, sorry, vmap()
<hch> wli: again, where do you see heavy use?
<hch> wli: biggest user is log recovery..
<wli> hch: I got OOM's; I assumed they were from that since I didn't see bh's.
W0rf (~worf@193.171.247.56) has joined channel #lse
<wli> hch: log recovery is something else. Wasn't that then.
<gerrit> akpm: you've seen dave hansen's testing results from that patch?
<akpm> gerrit: not that I recall
<gerrit> akpm: he and mingming can do more stress testing as an adjunct to additional patch review
<gerrit> akpm: lots of testing, no failures. I'll have him resend the details
<akpm> gerrit: there are races which the eyeball detects - it needs careful review.
<akpm> next up
<akpm> o ext3 and ext2 block allocators have serious failure modes - interleaved
<akpm> allocations.
<akpm> in ext3 this is related.
<akpm> sct has some plans wrt prealloc for ext3
<akpm> we'll get there if we have time.
<akpm> o Integrate Chris Mason's 2.4 reiserfs ordered data and data journaling
<akpm> patches. They make reiserfs a lot safer.
<oxymoron> Actual failure or pessimal performance?
<akpm> oxymoron: awful layout
<viro> akpm: unrelated to 2.5/2.6 boundary
<viro> akpm: such stuff is fs-local and can go at any point once it gets sufficient review/beating
<akpm> viro: yup, a lot of the wishlist features are like that
<akpm> Chris made noises about getting the ordered-data code going by OLS-time.
<akpm> next, nfs client
<plars> akpm: any plans to separate the ones critical for 2.6 from the ones that aren't?
ddaM (~dominant@210.186.172.65) has quit: Ping timeout: 492 seconds
<akpm> plars: OK, I'll go through and take a shot at prioritising them
<viro> akpm: add sysctl handling cleanup/fixes to the list, while you are at it.
janetinc (~janetinc@bi01p1.co.us.ibm.com) has quit: Remote host closed the connection
<akpm> viro: which ones?
<viro> akpm: the thing's racy, but it's common for 2.2/2.4/2.5
<akpm> viro: you mean the core sysctl code?
<viro> akpm: yes
<akpm> ok, noted
<viro> akpm: and its interaction with procfs
<akpm> I don't see a lot to say about nfs client. It has active maintainership and stuff is happening.
<akpm> Anyone have anything to add?
<hch> cpumask_t
<hch> irq.c consolidation
<akpm> hch: wrt NFS client
<akpm> I'd very much like for something like Peter Braam's 'lookup with
<akpm> intent' or (better yet) for a proper dentry->open() to be integrated with
<akpm> path_walk()/open_namei(). I'm still working on the latter (Peter has
<akpm> already completed the lookup with intent stuff).
<davej> akpm: there was the mmap + truncate problem trond mentioned. still no fix for that afaik
<wli> hch: those should probably go toward the end since I think they aren't on the original agenda
<hch> akpm: sorry, misparsed
<dmc> davej: Paul McKenney is working on a fix.
<akpm> are we done with fs/ ?
<davej> dmc: thats trivial to repeat (at least here), so I'll be happy to test that when it shows up
<dmc> davej: it's also a race for local fs, though apparently less fatal in most cases.
_viro (~al@user-2ivf6b6.dialup.mindspring.com) has joined channel #lse
<akpm> davej: can you refresh our minds on what the problem was?
<davej> akpm: 'fsx dies very quickly'
<oxymoron> Wasn't this the one with the 'morton pages'?
jbrocklin (~joe@dynamic-204-011.natpool.uc.edu) has joined channel #lse
<davej> I suspect its the same reason my kernel compiles over nfs fail
<hch> akpm: please repost the lookup thing, viro is the person who probably cares for it most
<akpm> I'd very much like for something like Peter Braam's 'lookup with
<akpm> intent' or (better yet) for a proper dentry->open() to be integrated with
<akpm> path_walk()/open_namei(). I'm still working on the latter (Peter has
<akpm> already completed the lookup with intent stuff).
Numbex_L (~knoppix@128.195.31.157) has joined channel #lse
<dmc> oxy: I think it's the same, yeah.
<akpm> viro: that's from trond. Are you familiar with the lookup-with-intent stuff? Sound sane?
dmo (~dmo@208.186.192.194) has quit: Quit: Client Exiting
<akpm> hmm
dmo (~dmo@208.186.192.194) has joined channel #lse
<akpm> let's move on
<akpm> kernel/
<akpm> o rusty: Zippel's Reference count simplification. Tricky code, but cuts
<akpm> about 120 lines from module.c. Patch exists, needs stressing.
<akpm> o rusty: /proc/kallsyms. What most people really wanted from /proc/ksyms.
<akpm> Patch exists.
pryzbyj (~pryzbyj@24.59.52.42) has quit: Ping timeout: 488 seconds
<akpm> o rusty: Fix module-failed-init races by starting module "disabled". Patch
<akpm> exists, requires some subsystems (ie. add_partition) to explicitly say
<akpm> "make module live now". Without patch we are no worse off than 2.4 etc.
<akpm> that's all happening
<akpm> o Integrate userspace irq balancing daemon.
viro (~al@165.247.149.112) has quit: Ping timeout: 485 seconds
<wli> Does that actually require integration? I thought the API for userspace to set things was already in place.
<mbligh> Does it need any integration? it works?
__viro (~al@user-2ivf6os.dialup.mindspring.com) has joined channel #lse
habanero (~habanero@192.35.232.241) has quit: Quit: Client Exiting
<oxymoron> wli: There was some question about racy proc interface..
<davej> there is the ongoing 'rip out inkernel balancer' argument I suppose
<dhansen> some people want to put it in the kernel tree
<akpm> wli: apparently. That seems to be a bit stalled.
<hch> dhansen: sounds stupid
<akpm> dhansen: yes, I'd like to see it in the main tree as a delivery mechanism
<wli> I'm not terribly happy with the organization of the surrounding APIC manipulation code but it's 100% organizational.
<hch> dhansen: it's not really tied more to the kernel than say runon
<wli> AFAIK all the functional issues are there.
<akpm> dhansen: so that people can bang on it, add support for new architectures, so it stays in-sync with the kernel etc
<wli> s/there/taken care of/
<mbligh> The only problem I can see is that there's some extra code size there, the config option can deal with that easily
<oxymoron> akpm: It's the whole klibc/modtools argument over again..
<hch> akpm: there's lots of userspace for which this argument makes much more sense..
<akpm> hch: view it as an experiment. I think one of the reasons why people tend to put too much stuff in-kernel is that we have no convenient distribution channel for userspace.
<mingo> akpm: true. Performance-wise both solutions are equivalent.
* Alan agrees - solve the real problem
<hch> akpm: that's more an argument for a kernel-tools BK repo on kernel.org
<oxymoron> akpm: If it were something that more than 1% of users used, this might be a good test case..
_viro (~al@user-2ivf6b6.dialup.mindspring.com) has quit: Ping timeout: 492 seconds
<wli> akpm: I would say something very similar wrt. dhcp clients for nfsroot; it's critical to getting the system running (hell, in that case it can't boot at all).
<__viro> speaking of which, what's the situation with klibc merge?
<wli> (dhcp clients meant to be used from initramfs)
<Alan> wli: all the mainstream nfs root stuff uses initrd dhcp clients already - that works out
<akpm> wli: not really. dhcp client isn't the sort of thing which kernel hackers need to bang on regularly. IRQ balancing _is_.
<__viro> gregkh, IIRC, you were the last one to touch it
<oxymoron> viro: Conspicuously absent from must-fix?
<wli> akpm: hmm, I at least have enough diskless machines for it to matter but I'm not everyone I guess.
<gregkh> gregkh: I touched klibc last, yes.
<gregkh> __viro: I touched klibc last, yes.
__viro (~al@user-2ivf6os.dialup.mindspring.com) is now known as viro
<gregkh> __viro: but haven't looked at the merge in a while, sorry, Linus wanted some kernel code ported to userspace to use it before he would take it.
<oxymoron> akpm: But do they need to bang on it and the kernel code simultaneously?
<gregkh> viro: and I didn't have any readily available, so I moved on to other stuff.
<hch> do we need klibc for 2.6?
<viro> gregkh: -> #kernel, then
<viro> hch: IWBN
<akpm> oxymoron: probably
<gregkh> hch: I'd say yes, but no users of it have stepped forward yet.
<gregkh> hch: so until then, I can't justify it.
<viro> gregkh: I have a bunch of such things
<gregkh> viro: great, let's talk later.
<akpm> wrt irqbalance, I guess we're waiting for someone who cares to actually do something with it.
<akpm> I don't see this as a showstopper really. People can boot with noirqbalance and download arjan's stuff.
<akpm> next?
<akpm> o kexec. Seems to work, is in -mm
<akpm> this seems fairly intrusive and late. I'm not sure how much pull it has.
<mbligh> is incredibly useful for those of us with 5 minute reboot times
<viro> akpm: IIRC, there were API issues with it. Had that been resolved?
Steph (~sglass@pixpat.austin.ibm.com) has joined channel #lse
Numbex_L (~knoppix@128.195.31.157) is now known as barryn
dardhal (~dardhal@213.0.201.144) has quit: Remote host closed the connection
<akpm> viro: Linus had opinions. I need to ping him, see if he's happy with it as-is.
<akpm> so no, not yet.
<viro> akpm: we have enough sys_too_fscking_ugly() already...
<phillips> akpm, I'll vote for it
<wli> Alan: some issues with boot PROM interactions I brought up elsewhere
<akpm> viro: do you have specific probs w/ kexec?
<akpm> mbligh: does it work on numaq?
<viro> akpm: I'll need to go through my old notes and current patch
<mbligh> akpm, I'll make it work when I get back from vacation
<akpm> next...
<akpm> o rmk: modules / /proc/kcore / vmalloc This needs sorting and testing to
<akpm> ensure that stuff like gdb vmlinux /proc/kcore works as expected. I
<akpm> believe this is the only show stopper preventing any ARM platform being
<akpm> built in Linus' kernel.
<akpm> all rmk's problems are small ones ;)
<akpm> o rmk: lib/inflate.c must not use static variables (causes these to be
<akpm> referenced via GOTOFF relocations in PIC decompressor. We have a PIC
<akpm> decompressor to avoid having to hard code a per platform zImage link
<akpm> address into the makefiles.)
<akpm> mm/
<akpm> o objrmap: concerns over page reclaim performance at high sharing levels,
<akpm> and interoperation with nonlinear mappings is hairy.
* akpm hides
<willy> the kcore stuff is a problem for ia64 too ... someone posted a proposal yesterday, iirc
<rdunlap> tony luck
<Alan> willy: has the question of some of the nonlinear strange mappings and non coherent cache stuff been resolved ?
<wli> akpm: I think that one is largely teaching it how to handle Morton pages in one way or another.
<phillips> objrmap should be _EXPERIMENTAL_
<willy> Alan: I don't think anyone's thought about it much. Nobody runs Oracle on PA/Linux.
<dmc> wli: mckenney's fix should solve that one.
<hch> phillips: you don't tell me you want it ifdefed, do you?
<phillips> hch, yup
Alan (~alan@213.105.254.86) has quit: Quit: gotta go
<wli> dmc: what's the current take on remap_file_pages()? I know there are ways to deal with it but am not sure what the preferred method is these days
<willy> thinking about it, there are definitely limitations on how you can map stuff in a coherent way on PA-RISC. The best way is probably just to require that things are only remapped as if we had 8MB pages
<dmc> wli: that one is solved for objrmap. It wasn't all that hairy, really.
<mbligh> akpm, I had some preliminary code to fix the high sharing stuff. I gave it to dmc, it's not finished though
<dmc> mbligh: I'm messing with it now.
<mbligh> cool
<akpm> but we're not at 2.6.10 yet ;)
jfk (~jfk@213.76.228.208) has joined channel #lse
<mbligh> ;-)
<mbligh> it's really pretty small, actually
andmike (~andmike@32.97.110.142) has quit: Quit: Client exiting
<wli> I wasn't under the impression it was known whether range coalescing actually resolved the performance issues.
<mbligh> we could change scanning to work on a per-process RSS in 2.6.10
<oxymoron> Speaking of 2.6.10, have we heard from folks like Google on mm issues lately?
<riel> mbligh: LOL
<akpm> oxymoron: they're on 2.4.18
<riel> mbligh: besides, it doesn't combine with memory zones
<mbligh> riel, but that's a corner-case, right? ;-)
<akpm> o Reintroduce and make /proc/sys/vm/freepages writable again so that boxes can be tuned for heavy interrupt load.
<akpm> that's in progress
<akpm> we done with mm/?
<akpm> net/
<oxymoron> akpm: Got an appropriate place for my lost async write errors?
<akpm> oxymoron: noted
<riel> mbligh: it's a very sharp corner though, you don't want to bump into it
<akpm> also atomic i_size patches
<akpm> in net/ davem is cooking up MPLS support, which apparently IPSEC wants.
<oxymoron> Isn't that patented?
<akpm> o Sometimes we generate IP fragments when it truly isn't necessary.
<akpm> net/ is boring. It just works all the time.
<wli> heh
<akpm> net/*/netfilter/
<akpm> o Lots of misc. cleanups, which are happening slowly.
<mbligh> NAPI still sucks
<dhansen> mbligh: you think that from one test we did a year ago. Not exactly conclusive
<akpm> mbligh: it's mainly for routers I suspect.
<hch> yupp, SGI folks seem to have problems on big boxens with NAPI
alan (~alan@213.105.254.86) has joined channel #lse
<akpm> o davem: Netfilter needs to stop linearizing packets as much as possible.
<mbligh> akpm, would be nice if it didn't slow down machines dramatically that aren't flat out on interfaces
<akpm> rusty is working through this apparently
<hch> unfortunately I couldn't convince them to report it to the lists instead of cooking up half-baked workarounds
<mbligh> dhansen: I've seen similar stuff from others
<mbligh> hch, what did they do? up the interrupt latency?
<hch> mbligh: back out napi support from tg3.c :)
<akpm> next??
madd (~dominant@210.186.172.12) has quit: Ping timeout: 485 seconds
<akpm> arch/i386/
<akpm> o Also PC9800 merge needs finishing to the point we want for 2.6 (not all)
<wli> Alan: Are the parts we don't want the Kanji input support?
<hch> I haven't heard anything from the pc9800 folks for ages
<alan> akpm: not critical
<hch> wli: their console changes aren't mergeable
<wli> alan: it'd be nice if it built so API changes down there could get propagated
<alan> console is 2.7 stuff
<viro> alan: maybe
<wli> Are those the only bits that have to be left out or is there more that has to get cut?
<hch> wli: no idea
<alan> there are a few possible more merges
<hch> wli: without some feedback from the pc98 folks it's impossible to tell
<wli> Is Osamu Tomita the only contact point?
<davej> their floppy.c clone probably wants some of the recent fixes that went into floppy.c too come to think of it
<hch> wli: I think so
<alan> wli: i'll talk to osamu
<akpm> o ES7000 wants merging (now we are all happy with it). That shouldn't be a
<akpm> big problem.
<mbligh> seems to be in reasonable shape now
<viro> BTW, paride/p{g,t}.c desperately need cleanup similar to block paride drivers
<viro> I can do that - it won't take much
<wli> I think that's pretty much bouncing it in the direction of the emperor penguin.
<alan> done for -ac e7000 seems fine
<akpm> ok. Anything else for arch/i386/?
<davej> PAT stuff maybe
<akpm> for agp?
<akpm> mtrr exhaustion?
<davej> thats my interest, but other drivers could use it too
<davej> the framebuffer ones spring to mind
<mbligh> akpm, can we merge the early printk stuff?
<wli> freitag's stuff suffices.
<akpm> mbligh: sounds like a good idea. All the patches I've seen have been crufty tho
<wli> even less suffices actually.
<wli> Shoving a console registration super-early appears to DTRT.
<wli> (an explicit one that is)
<akpm> wli: we'd need a vga driver for it
<akpm> next..
<wli> akpm: Possible. VGA is of low utility for that though.
<mbligh> wli, for most people, VGA is what's needed.
<akpm> wli: is it? Diagnosing early lockup on PCs?
<mbligh> for me, I want serial.
<oxymoron> earlyconsole=foo?
<dhansen> serial probably needs early command-line parsing, which is the ugliest part of the early printk stuff
<wli> akpm: hmm, could do something for end-user lockups
<akpm> next is "global" things
<akpm> o 64-bit dev_t
<viro> akpm: BTW, one more item:
<akpm> viro: yup
<viro> akpm: cleaning up options-parsers in filesystems
<viro> akpm: patch exists, needs porting
<mbligh> dhansen: re command line, yeah ... I think just merging VGA would be a good first step.
<viro> akpm: infrastructure in there can help with the aforementioned command-line parsing stuff
<akpm> viro: table-driven?
<viro> akpm: more or less
<akpm> viro: yes, sorely needed
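[Editor's sketch] The "table-driven" option parsing viro and akpm refer to can be illustrated roughly as follows. This is a minimal userspace sketch, not viro's patch: the option names, `struct opt_entry`, `match_opt()` and `parse_options()` are all hypothetical names chosen for illustration. The idea is that each filesystem supplies a table of keywords, and one shared routine walks the option string instead of every filesystem hand-rolling its own string comparisons.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option identifiers; not taken from any real filesystem. */
enum opt_id { OPT_RO, OPT_UID, OPT_ERR };

struct opt_entry {
    const char *name;   /* option keyword */
    int has_arg;        /* nonzero if "name=value" is expected */
    enum opt_id id;     /* identifier handed back to the caller */
};

/* The table: adding an option means adding a row, not more parsing code. */
static const struct opt_entry opt_table[] = {
    { "ro",  0, OPT_RO  },
    { "uid", 1, OPT_UID },
};

struct mount_opts {
    int read_only;
    long uid;
};

/* Look up one token ("ro" or "uid=42") in the table. */
static enum opt_id match_opt(const char *tok, const char **arg)
{
    size_t i;

    for (i = 0; i < sizeof(opt_table) / sizeof(opt_table[0]); i++) {
        const struct opt_entry *e = &opt_table[i];
        size_t n = strlen(e->name);

        if (strncmp(tok, e->name, n) != 0)
            continue;
        if (!e->has_arg && tok[n] == '\0') {
            *arg = NULL;
            return e->id;
        }
        if (e->has_arg && tok[n] == '=' && tok[n + 1] != '\0') {
            *arg = tok + n + 1;
            return e->id;
        }
    }
    return OPT_ERR;
}

/*
 * Parse a comma-separated option string. The buffer is modified in place
 * by strtok(). Returns 0 on success, -1 on an unknown or malformed option.
 */
static int parse_options(char *buf, struct mount_opts *out)
{
    char *tok;

    assert(buf != NULL && out != NULL);
    for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        const char *arg;
        char *end;

        switch (match_opt(tok, &arg)) {
        case OPT_RO:
            out->read_only = 1;
            break;
        case OPT_UID:
            out->uid = strtol(arg, &end, 10);
            if (*end != '\0')
                return -1;      /* trailing junk after the number */
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

The same table shape could in principle serve early command-line parsing, which is the connection viro draws above; that reuse is not shown here.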
<akpm> 64-bit dev_t we kinda covered last week. It is a showstopper, although not a 2.6.0 showstopper I guess.
<akpm> o We need a kernel side API for reporting error events to userspace (could
<akpm> be async to 2.6 itself)
<akpm> davem proposed that this be based off netlink
<akpm> he had a protopatch which was simple.
<akpm> I'm not sure what the implications are for drivers etc.
<oxymoron> akpm: lots of printk -> netlink work to make it useful.
<akpm> this appears to be the whole subsystem logging problem.
<viro> akpm: there will be problems with initialization order
<akpm> it seems too late and with too little momentum for anything radical to be happening
<rdunlap> oxymoron: and userspace binary interface to kernel (a la ioctl) -- or am i mistaken?
<davej> akpm: sounds like carrion grade stuff
<rdunlap> and enterprise
<wli> +
<oxymoron> rdunlap: No, I think it would channel through existing netlink if..
<akpm> davej: yup. But it's also an opportunity to clean things up and rationalise cruft, etc. But most of that is a 2.7 janitorial activity
<davej> enterprise logging.. to boldly go...
<davej> akpm: agreed.
<oxymoron> Well past 10/31..
<davej> akpm: though if the infrastructure is trivial and non-intrusive...
<akpm> davej: yes. Someone who cares needs to ping davem and try to move that forward
patman (~patman@bi01p1.co.us.ibm.com) has left channel #lse: Client Exiting
bongani (~bongani@196.30.125.234) has quit: Max SendQ exceeded
<akpm> next, a couple of kbuild things
<akpm> o Kai: Introduce a sane, easy and standard way to build external modules
<akpm> o Kai: Allow separate src/objdir
<akpm> these both sound important.
<akpm> kai does good work. It'll happen.
<davej> 'nice to have', can go in after 2.6 IMO.
<akpm> yup
bongani (~bongani@196.30.125.234) has joined channel #lse
<akpm> o general confusion over firmware policy:
<akpm> o do we mandate that it be uploaded from userspace?
<akpm> o Is binary-blob-in-kernel-image OK?
<akpm> o Each driver (wireless, scsi, etc) seems to do it in a different,
<akpm> private manner.
<davej> I'll be surprised if the firmware stuff happens for 2.6
<gregkh> akpm: there's a patch on lkml that is looking good for this.
<akpm> I don't get 2.6 feelings when I see this discussed
<gregkh> akpm: as a way for drivers to standardise on how to do this.
<akpm> gregkh: ok. How much effort is it to port drivers?
<gregkh> akpm: drivers can be converted/moved to it later in 2.6.
<gregkh> akpm: doesn't look like much at all.
<oxymoron> davej: I don't think we need to fix all the broken instances now, but setting a firm policy for everything future would be nice.
<willy> binary blob in kernelspace is getting drivers taken out by debian, FWIW
<davej> oxymoron: true.
<gregkh> akpm: any new drivers should use it, existing ones can convert later.
<davej> willy: ouch.
* willy disapproves, but i don't maintain the kernel package
ueimor (~romieu@213.41.134.224) is now known as ueimor-away
<akpm> gregkh: nice. How does it work btw?
<gregkh> willy: I've discussed this with them in the past. That's their loss.
<gregkh> akpm: uses sysfs and /sbin/hotplug to notify userspace that firmware should be sent to the device.
<oxymoron> gregkh: Their hands are tied..
ndabney (~smurf@208.186.192.194) has quit: Quit: ircII EPIC4-1.1.11 -- Are we there yet?
<gregkh> oxymoron: hey, I've said for 3 years for someone to send me a patch, no one has. That shows how serious they are about this.
<akpm> drivers/acpi/
<gregkh> akpm: anyway, should go into the 2.5 tree soon.
<davej> willy: how about an extra kernel-image in non-free? 8-)
<akpm> Any general comments on the ACPI situation?
<gregkh> akpm: getting better, but still needs work.
<davej> borked on quite a few boxes here
<willy> akpm: As long as Linus is taking patches, seems under control
<hch> akpm: ia64 tree has some changes
<willy> hch: err.. in 2.5?
<hch> I wonder why david didn't feed them to grover yet
<hch> willy: yes
<davej> I have ~3 boxes that have the 'NIC stops getting packets' bug, and another which doesnt even boot with acpi
<hch> willy: not as massive as in 2.4 8)
<davej> theyre in bugzilla, hopefully grover et al will get to the bottom of them eventually
<oxymoron> Not sure if it's supposed to work on my T30..
<davej> oxymoron: it should at least make it bootable.
<willy> hch: well, what's in 2.4 is a different version than marcelo's tree. In 2.4, I'm just replacing the ACPI bits with fresh bits from Andy and everything's working fine
<mochel> it seems the acpi irq routing code could use a serious rewrite
<plars> bug #586 was the ACPI one we were seeing on some 8-ways but Andy fixed it recently
<mochel> it's completely flaky across different systems.
<oxymoron> davej: Boots, ACPI sticks directories in /proc rather than /proc/acpi..
<mochel> but, i don't know how plausible it is for 2.6
<wli> plars: Are the fixes for the SMI stuff and tables overflowing the boot-time virtually remapped area so ACPI can function properly on x440 merged yet?
<davej> its the only thing that bothers me for 2.6-test tbh
* akpm has bad premonitions about ACPI
<davej> its the only thing that stops boxes booting I've seen (apart from user error upgrading .configs from 2.4)
<mbligh> wli, ACPI works on 8-way but not 16, IIRC
<akpm> wli: an SMI fix was merged
<wli> SMI is okay
<oxymoron> I've spent several days trying to get APM or ACPI suspend working with X on my new laptop..
<wli> mbligh: don't be coy; the issue is a table overflowing the virtually remapped area, no?
<mochel> oxymoron: acpi suspend will not work
<mbligh> wli, check with jstultz
<akpm> mochel: why not?
<mbligh> not sure he knew, last time I spoke to him.
<mochel> akpm: that's the first thing we talked about :)
<oxymoron> mochel: Oh, the irq chatter?
<wli> okay worse comes to worse I'll debug it myself
<plars> wli: I don't even have a 440 to test one nowadays
<mochel> oxymoron: that, few drivers have suspend/resume methods implemented, the code is fragile, and in many places FITH
jstultz (~moog@pixpat.austin.ibm.com) has joined channel #lse
davej_ (~davej@80.194.74.10) has joined channel #lse
<wli> speaking of the devil
<jstultz> wli: i was summoned?
<wli> jstultz: what's the 16x x440 vs. ACPI issue? I thought it was a table overflowing a virtually mapped area
<akpm> mochel: memory fails me. Is ACPI suspend important, and can it be got going?
bongani (~bongani@196.30.125.234) has quit: Max SendQ exceeded
<jstultz> wli: 2.5 has the patch that fixes 16x x440 w/ ACPI. it was the clear-smi-fix patch
<mochel> akpm: important, but not critical.
<wli> jstultz: so I'm thinking of an earlier issue no longer present. Thanks.
<oxymoron> akpm: Laptops are much less useful without it..
<mochel> akpm: can be got going; working on it now..
<jstultz> wli: i can only hope. let me know if you still see problems.
<akpm> mochel: great, thanks.
bongani (~bongani@196.30.125.234) has joined channel #lse
<akpm> enough acpi?
<akpm> drivers/block/
<akpm> o Floppy is almost unusably buggy still
<barryn> floppy seems fine for me as a user
<rdunlap> me too, but alan referred to some corner case data corruption IIRC
<akpm> ho hum. Can a few more people please test their floppies?
<rdunlap> i will ..more
davej__ (~davej@81.86.107.140) has joined channel #lse
alan (~alan@213.105.254.86) has quit: Quit: EPIC - EOF from stdin
<akpm> drivers/char/
<akpm> o Alan: Multiple serious bugs in the DRI drivers (most now with patches
<akpm> thankfully). "The badness I know about is almost entirely IRQ mishandling.
<akpm> DRI failing to mask PCI irqs on exit paths."
<akpm> o Various suspect things in AGP.
<davej__> I've got a bunch of DRI patches queued up. They'll go this week
<davej__> AGP is 'getting there'
<davej__> backend is pretty much ok, frontend is (in linus' words - shit)
<akpm> ok, "in progress" is good, thanks.
<akpm> drivers/isdn/
<akpm> a bunch of things from kai there.
<akpm> drivers/net/
<hch> davej: backend still needs lots of work, too
<akpm> o davej: Either Wireless network drivers or PCMCIA broke somewhen. A
<akpm> configuration that worked fine under 2.4 doesn't receive any packets. Need
<akpm> to look into this more to make sure I don't have any misconfiguration that
<akpm> just 'happened to work' under 2.4
<davej__> hch: sure, but its lots better than it was
<davej__> hch: nothing worth holding up 2.6 for
<davej__> akpm: going to retry wireless stuff this week some time to be sure I didnt goof that up
<davej__> works fine under 2.4, breaks under 2.5 last I tried
<akpm> davej__: probably the drivers broke.
<davej__> could well be
<akpm> davej__: Jeremy Fitzhardinge here is looking at airo. Not pretty
<davej__> I've been diffing net drivers from 2.4 and auditing the changes, theres nothing amazing so far, but I've not got to wireless/ yet
* akpm discovers another arch/i386/ section
<akpm> o 2.5.x won't boot on some 440GX
<akpm> alan: Problem understood now, feasible fix in 2.4/2.4-ac. (440GX has two
<akpm> IRQ routers, we use the $PIR table with the PIIX, but the 440GX doesnt use
<akpm> the PIIX for its IRQ routing). Fall back to BIOS for 440GX works and Intel
<akpm> concurs.
<akpm> o 2.5.x doesn't handle VIA APIC right yet.
<akpm> 1. We must write the PCI_INTERRUPT_LINE
<akpm> 2. We have quirk handlers that seem to trash it.
<davej__> will fall out in the 2.4 fixes I'm accumulating
<davej__> unless _A_ beats me to it
<akpm> o ACPI needs the relax patches merging to work on lots of laptops
<akpm> o ECC driver questions are not yet sorted (DaveJ is working on this)
<davej__> ECC stuff is Dan Hollis
barryn (~knoppix@128.195.31.157) has left channel #lse
<davej__> I looked at it, and ran away
<riel> yeah, scary stuff ;)
<riel> unfortunately
<akpm> davej__: heh. I thought it was purely userspace?
ndabney (~ndabney@208.186.192.194) has joined channel #lse
<davej__> akpm: no, it does various pci jiggery pokery
<davej__> at the least it needs splitting into pieces
<davej__> but there are a lot of other bits in there that need cleaning/fixing too
<akpm> I had it going on my 480nx box, don't recall altering the kernel.
<davej__> post 2.6 driver addition perhaps.
<akpm> 450nx
<akpm> yup
<akpm> arch/x86_64/
<davej__> acpi relax stuff - talk to Andi.
davej_ (~davej@80.194.74.10) has quit: Ping timeout: 485 seconds
<davej__> he did the ones for SuSE, so has a handle on that
<akpm> o time handling is broken. Need to move up 2.4 time.c code
<akpm> that reminds me.
<akpm> the gettimeofday-goes-backwards bug
<riel> is that TSCs getting out of sync ?
<davej__> vojtech solved a bunch of those recently for amd64, might be worth bugging him
<wli> akpm: how many minutes backward do you want it to go? I think I'm going about 3 minutes backward atm.
<davej__> theres still a bunch of places we fuck with the RTC without locking too
<davej__> which could explain it
<akpm> riel: apparently it happens when something blocks interrupts for a lot of ticks.
<akpm> riel: with HZ=1000 it got 10x worse
<davej__> ftape is the only one that springs to mind, but I think there are other drivers too
<riel> that might be a different thing, then
<oxymoron> I think SMM was mentioned as a culprit.
<akpm> davej__: SMM and ACPI and APM will do it I think.
<davej__> k
<akpm> davej__: basically out of our control
<davej__> well, SMM is. but we can trap ACPI/APM at least
<akpm> So david m-t did some work on that and I think people were OK with it.
<jstultz> akpm: are we getting reports outside of laptops seeing gtod going backwards on i386?
<akpm> It needs someone to pick it up and do the ia32 hooks, generally finish it off
<akpm> jstultz: I don't recall any reports at all actually
<jstultz> akpm: I believe the lost ticks code already there will compensate for a number of lost ticks (although once we get past a second or so, the low bits of the TSC can wrap)
<akpm> jstultz: hm, I thought it was decided that lost-ticks didn't solve the problem.
<jstultz> akpm: well, the lost ticks patch was i386 specific..
<akpm> jstultz: have you been following david m-t's work?
<jstultz> akpm: i've been swamped this weekend. i apologize i've been unable to.
bongani (~bongani@196.30.125.234) has quit: Max SendQ exceeded
davej__ (~davej@81.86.107.140) is now known as davej_
viro (~al@user-2ivf6os.dialup.mindspring.com) has quit: Quit: Leaving
<jstultz> akpm: i'm not opposed to it, but I'm not sure how the i386 implementation would be done.
<akpm> jstultz: ah, OK. Please have a think about it anyway.
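[Editor's sketch] The lost-tick compensation jstultz mentions, and its wrap caveat, can be illustrated with a little arithmetic. This is a simplified sketch and not the kernel's actual i386 code: `ticks_elapsed()` and `lost_ticks()` are hypothetical helpers, and the assumption is that only the low 32 bits of the cycle counter are sampled at each timer interrupt.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Whole ticks elapsed between two samples of the low 32 bits of a cycle
 * counter. Unsigned subtraction is modulo 2^32, so one wrap between the
 * samples is handled correctly; two or more wraps are indistinguishable
 * from fewer and silently under-count.
 */
static uint32_t ticks_elapsed(uint32_t last_low, uint32_t now_low,
                              uint32_t cycles_per_tick)
{
    assert(cycles_per_tick != 0);
    return (uint32_t)(now_low - last_low) / cycles_per_tick;
}

/*
 * Ticks that went missing: one tick per interrupt is the normal case,
 * anything beyond that was lost while interrupts were blocked.
 */
static uint32_t lost_ticks(uint32_t last_low, uint32_t now_low,
                           uint32_t cycles_per_tick)
{
    uint32_t n = ticks_elapsed(last_low, now_low, cycles_per_tick);

    return n > 1 ? n - 1 : 0;
}
```

With a 32-bit sample the usable window is 2^32 cycles, which is about 4.3 seconds at 1 GHz and proportionally less on faster clocks. That is the limit behind the "once we get past a second or so" remark: an SMM or ACPI stall longer than the window cannot be measured this way.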
<akpm> next..
<akpm> x86-64: o need to coredump 64bit vsyscall code with dwarf2
<akpm> o move 64bit signal trampolines into vsyscall code and add dwarf2 for it.
<akpm> o describe kernel assembly with dwarf2 annotations for kgdb (currently
<akpm> waiting on some binutils changes for this)
<akpm> that's all happening
bongani (~bongani@196.30.125.234) has joined channel #lse
<akpm> arch/alpha/
<akpm> o rth: Ptrace writes are broken. This means we can't (reliably) set
<akpm> breakpoints or modify variables from gdb.
<akpm> arch/arm/
<akpm> o rmk: missing raw keyboard translation tables for all ARM machines.
<akpm> arch/others/
<akpm> o SH/SH-64 need resyncing, as do some other ports. No impact on
<akpm> mainstream platforms hopefully.
<akpm> sounds like all that is under control.
<akpm> that's the end of the list.
<hch> o IA64 needs merging, has impact on core code
<akpm> cpumask_t. Where's it at?
<davej_> what areas?
<hch> davej: all over the place
<davej_> ugh
<wli> akpm: ftp://ftp.kernel.org/pub/linux/kernel/people/wli/cpu/
<akpm> wli: bah. I meant what's its status?
<wli> akpm: been talking to i386 subarch maintainers about the stuff, jejb seems to be the only one currently responding, but updates for everything but pc9800 are all set up.
<akpm> wli: other architectures?
<hch> wli: pc98 doesn't compile in mainline, you don't have to care
<hch> akpm: I've done ppc32
<akpm> wli: what is the impact on small smp?
<wli> akpm: ia64 is being handled 100% by SGI, ppc64 is being handled by antonb
<wli> akpm: zero. It collapses back to identical to the original.
<hch> UP only because SMP is far from compiling
<hch> akpm: none
<akpm> wli: how important is it?
<hch> akpm: the boxens that needs this exist now
<hch> akpm: SGI has this patched into a 2.4 product tree
<wli> akpm: without it we knock 2-3 vendors who have been shipping such boxen for over 5 years out of 2.6 mainline
<akpm> wli: if the patch is applied and there's code which still does it the "old way", will that code fail to compile?
<phillips> <jstultz> akpm: are we getting reports outside of laptops seeing gtod going backwards on i386? <- I hereby submit one (sorry for the lag)
<wli> akpm: no, the arithmetic is allowed to happen on the "narrow" cpumasks as normal.
<akpm> phillips: please send jstultz angry emails ;)
<phillips> roger
<akpm> wli: can it be arranged so that it breaks?
<wli> akpm: yes
<wli> akpm: either (1) by artificially increasing NR_CPUS or (2) by introducing extra wrapping around smaller cpumasks
<akpm> wli: it would be best I think.
<akpm> wli: catching bugs with big NR_CPUS would be acceptable
<wli> akpm: the codegen would change but only slightly
<akpm> wli: so how much remains to be done?
<wli> akpm: I checked and it's slightly different with the structure wrapper around it
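[Editor's sketch] The "structure wrapper" wli describes can be sketched as below. This is an illustration only, not wli's patch: `demo_cpumask_t`, `DEMO_NR_CPUS` and the `demo_*` helpers are hypothetical names. The point is that once the mask is a struct rather than a bare `unsigned long`, old-style code such as `mask & (1UL << cpu)` no longer compiles, so unconverted callers are caught at build time, which is what akpm asks for.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_NR_CPUS  64    /* hypothetical CPU count for the sketch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS    ((DEMO_NR_CPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/*
 * Wrapping the words in a struct means plain integer operators cannot be
 * applied to the mask by accident; callers must go through the helpers.
 */
typedef struct {
    unsigned long bits[MASK_WORDS];
} demo_cpumask_t;

/* Mark one CPU as present in the mask. */
static void demo_cpu_set(unsigned int cpu, demo_cpumask_t *m)
{
    assert(cpu < DEMO_NR_CPUS);
    m->bits[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
}

/* Return nonzero if the given CPU is present in the mask. */
static int demo_cpu_isset(unsigned int cpu, const demo_cpumask_t *m)
{
    assert(cpu < DEMO_NR_CPUS);
    return (int)((m->bits[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1UL);
}

/* Return nonzero if no CPU is set. */
static int demo_cpus_empty(const demo_cpumask_t *m)
{
    unsigned int i;

    for (i = 0; i < MASK_WORDS; i++)
        if (m->bits[i] != 0)
            return 0;
    return 1;
}
```

When the CPU count fits in one word the struct holds a single `unsigned long`, which is consistent with wli's remark that the generated code changes only slightly on small SMP.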
<hch> akpm: the other arches need fixing
<wli> akpm: converting the rest of non-i386 after ppc32 and ia64.
<wli> akpm: with a big bag of cross-compilers I could do it myself in < 24 hours.
<wli> akpm: assembling that array of cross-compilers will take longer than that of course
<akpm> wli: minimal approach is just to rudely break things. But sending the maintainers a best-effort uncompiled patch would suit
<wli> akpm: that could be done in < 24 hours also.
<akpm> wli: OK, can you send me a patch when you think the time is right?
<wli> I anticipate a couple cycles of post and resend.
<wli> akpm: will do. diff vs. -mm in a few days?
<akpm> wli: sure. -mm diff is too damn big at present. I'm thinking of dropping keec.
<akpm> kexec
bongani (~bongani@196.30.125.234) has quit: Max SendQ exceeded
zaxl (alex@212.39.68.18) has quit: Quit: .
<akpm> hch: irq.c consolidation. Nice to have, but late.
<wli> akpm: minor nit: SGI wants some switchover to call-by-reference for certain things to automatically happen for large enough stuff. I'll have it out with the rest and it can bounce separately.
<hch> akpm: it's just code reshuffing
<hch> akpm: reshuffling
<hch> akpm: no actual code changes
<hch> akpm: e.g. it would have saved me from doing all the cpumask_t changes for PPC32
<akpm> wli: ouch
htj (~htj@213.237.17.105) has quit: Quit: Client exiting
bongani (~bongani@196.30.125.234) has joined channel #lse
<hch> Andrey Panin has patches
<wli> akpm: smaller than it sounds; I'll let ppl look
<hch> maybe you can ask him to resend against -mm
<wli> hch: IIRC ppc64 needs to do some arch-private stuff
<akpm> hch: I know. I almost sucked them up, but wimped out.
<oxymoron> wli: Several arches do.
<wli> hch: basically it has 2**24 interrupts and needs to be a bit sparse...
<hch> wli: for what?
<hanna> coming up on 2.5 hours...
<hch> wli: consolidated irq.c is a config option
<wli> hch: ah; no problem then.
<hch> wli: if the arch doesn't want it it doesn't have to
<wli> hch: they'll be fine.
<hch> wli: it's just to collect the 7 or 8 dupes
bongani (~bongani@196.30.125.234) has quit: Client Quit
<akpm> OK, I have one more thing here
<akpm> o aio: fs IO isn't async at present. suparna has restart patches, they're
<akpm> in -mm. Need to get Ben to review/comment.
<akpm> late, a bit intrusive, a bit messy, but AIO seems fairly pointless without them.
<akpm> or an equiv
<gerrit> akpm: agreed that we need them. we'll do whatever we need to clean, polish, measure, test...
<wli> is massive testing enough or do we need something deeper?
<gerrit> akpm: still needs a lot of perf work on it...
<akpm> wli: I'd like to see numbers from real workloads.
<oxymoron> There was talk about exploitable aio/O_DIRECT vs truncate race?
<akpm> wli: a: cvs co db2. b: hack
<wli> akpm: okay I know a couple of ppl are on that
<wli> akpm: the equivalent of that is already done IIRC, I've already seen numbers flying around internal to IBM
<akpm> oxymoron: yes, we need to fix that up. adding an AIO/DIO-vs-truncate rwsem will plug it simply enough
<hch> akpm: oracle supports aio
<hch> akpm: maybe get the oracle folks to do something useful and put some benchmarks up?
<akpm> hch: yup, that's happening
<akpm> there's also the aiopoll patch. Ben had concerns wrt its scalability and I have concerns wrt its testability.
<akpm> Does anyone have anything else?
<hch> akpm: I don't think it matters whether aio poll goes into 2.6.0 or 2.6.<n>
<akpm> recursive spinlocks?
<hch> if it goes in at all
<akpm> hch: sure.
<gerrit> recursive spinlock: die die die
<akpm> I think we're done here folks. Thanks again.
<hanna> Looks like we are done! Thanks akpm and everyone else. Looks like we should be focusing on checking things off this list primarily.
rdunlap (~rddunlap@208.186.192.194) has quit: Quit: cooked
<gerrit> akpm: any chance you could post a +items added -items closed to the top of your list?
<akpm> hanna: I'll go through the late-features list and assign priorities to them
<willy> akpm: I sent a couple more to you ..
<hanna> akpm, I dont see a need to do this again do you?
cliffman (~cliffw@208.186.192.194) has quit: Quit: Client exiting
<akpm> willy: I have those, thanks
<akpm> hanna: no I don't think so


The May 21 2.6 "must fix" IRC discussion

Posted May 22, 2003 11:51 UTC (Thu) by the_JinX (guest, #3953) [Link]

I really like those !!

and they really seem to be getting things done . . .

thx for the update :D

Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds