|
<hanna> |
OK everyone, let's get started now that akpm is here. First a few rules:
|
|
<hanna> |
1) Only talk if you have something relevant to add.
|
|
phillips (~daniel@213.23.142.240) has joined channel #lse
|
|
aeb (~aeb@213.84.53.62) has joined channel #lse
|
|
<hanna> |
2) Keep in-depth technical discussions to a minimum so we can try to get through the whole list in an hour
|
|
akr_ (~akr@dis212095024083.vie.telering.at) has joined channel #lse
|
|
<hanna> |
3) We are going to start with the top of the list...
|
|
Almighty (~morten@nextframe.net) has joined channel #lse
|
|
roland (~roland@12.162.17.3) has joined channel #lse
|
|
<hanna> |
akpm, Welcome and thanks for doing this. Would you like to get started?
|
|
fbl (~fbl@4-249.ctame701-1.telepar.net.br) has joined channel #lse
|
|
<akpm> |
hanna: sure.
|
|
<akpm> |
top of list is - TTY locking is broken
|
|
jejb (~jejb@64.109.89.110) has joined channel #lse
|
|
<viro> |
akpm: it's more than just locking
|
|
<akpm> |
viro's working on that one, so I'd expect that to be wrapped up
|
|
Kwuck (~jordi@213-96-38-177.uc.nombres.ttd.es) has quit: Quit: rehash
|
|
<akpm> |
viro: but I think it's "in hand", yes?
|
|
Nemesis (~ryan@194.153.169.196) has joined channel #lse
|
|
<viro> |
akpm: it promises to be as bad as super.c stuff in 2.4.early
|
|
Kwuck (~jordi@3ffe:b80:3:4481::2) has joined channel #lse
|
|
mnc (~null@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<viro> |
akpm: IOW, a long sequence of patches
|
|
jk-jfk (~jfk@pa208.myslowice.sdi.tpnet.pl) has joined channel #lse
|
|
<akpm> |
viro: do you have a decent set of test cases?
|
|
mrmike (~klaus@p3EE2290E.dip.t-dialin.net) has quit: Remote host closed the connection
|
|
<viro> |
akpm: not right now
|
|
<viro> |
akpm: I have a very incomplete list of provably oopsable races
|
|
<willy> |
could the LTP people help by developing those?
|
|
<ak> |
LSB has some tests, but it's not stress tests
|
|
badspirit (~badspirit@80.49.51.30) has joined channel #lse
|
|
akr (~akr@dis212095022199.vie.telering.at) has quit: Ping timeout: 492 seconds
|
|
<badspirit> |
yo all
|
|
<viro> |
akpm: it's not a matter of closing isolated holes
|
|
<badspirit> |
1 hour left ?
|
|
<viro> |
akpm: entire thing is rotten
|
|
<viro> |
akpm: locking rules there go back to 1.3/2.0
|
|
<viro> |
akpm: and that code hadn't been updated since then
|
|
janetinc (~janetinc@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<akpm> |
viro: yeah. Anyway, I think we can say that it's in progress, and contribution or identification of stress tests is needed?
|
|
<viro> |
akpm: which means that almost everything is based on "we can't block in that piece of code"
|
|
davem (~davem@216.101.162.243) has joined channel #lse
|
|
mdomsch (~mdomsch@24.28.80.108) has joined channel #lse
|
|
rmk (~rmk@3ffe:8260:2002:1:201:2ff:fe14:8fad) has joined channel #lse
|
|
<viro> |
akpm: it is in progress, but it will be long
|
|
mikeg (~mgaughen@65.209.250.242) has quit: Quit: Client Exiting
|
|
dwmw2 (~dwmw2@193.237.130.41) has joined channel #lse
|
|
<viro> |
akpm: keep in mind that almost all holes apply to 2.2 and practically all - to 2.4
|
|
<viro> |
akpm: so we will need backports
|
|
voogi (~voogi@213.65.185.159) has quit:
|
|
<akpm> |
viro: aye, that's why everyone else has run away screaming
|
|
kaeptnb (~kaeptnb@lifebookwl.ASK.FH-Furtwangen.DE) has quit: Ping timeout: 492 seconds
|
|
<akpm> |
anyway, moving on..
|
|
mckenney (~mckenney@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<akpm> |
rmk: I think we need a generic RTC driver (which is backed by real RTCs).
|
|
mikeg (~mgaughen@65.209.250.242) has joined channel #lse
|
|
<akpm> |
so this is basically a backend/frontend thing to abstract out the RTC hardware a little better (I think)
|
|
<randy> |
while agreeing with it, why must-fix?
|
|
<hollis> |
akpm: what's wrong with drivers/char/genrtc.c ?
|
|
<rmk> |
Tom Rini pointed out that genrtc.c should be able to satisfy most needs, so it needs investigating.
|
|
kaeptnb (~kaeptnb@141.28.225.122) has joined channel #lse
|
|
<akpm> |
randy: OK, I'll move that to wishlist, and if Russell gets onto it then cool.
|
|
<akpm> |
drivers/block/
|
|
<akpm> |
RAID0 dies on strangely aligned BIOs
|
|
<akpm> |
other RAIDs do too. This is in progress
|
|
<akpm> |
some stuff was merged today
|
|
<arjan> |
this is the 'need bio split' stuff?
|
|
<akpm> |
arjan: yup
|
|
ming (~ming@32.97.110.142) has joined channel #lse
|
|
<oxymoron> |
Are there new exhaustion issues here?
|
|
<akpm> |
arjan: neil and jens are moving ahead on that now
|
|
<akpm> |
oxymoron: ?
|
|
<arjan> |
akpm: if we add that function, we must be sure that it can split on not-a-page boundaries too
|
|
<oxymoron> |
Needing to allocate new bios in IO layer under memory pressure..
|
|
<arjan> |
akpm: otherwise it's useless for a bunch of things
|
|
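arjan's requirement (a split point that falls in the middle of a page, not only between pages) can be illustrated outside the kernel. The following is a minimal Python sketch under stated assumptions, not the bio code itself: a request is modelled as a list of `(page, offset, length)` segments, and `split_segments` is a hypothetical helper name.

```python
def split_segments(segs, nbytes):
    """Split a list of (page, offset, length) segments after nbytes bytes.

    Returns (head, tail). The split point may fall inside a segment, in
    which case that segment is divided between head and tail.
    """
    head, tail = [], []
    remaining = nbytes
    for page, off, length in segs:
        if remaining >= length:
            # Whole segment belongs to the head.
            head.append((page, off, length))
            remaining -= length
        elif remaining > 0:
            # Split point lands inside this segment (not on a page boundary).
            head.append((page, off, remaining))
            tail.append((page, off + remaining, length - remaining))
            remaining = 0
        else:
            # Everything past the split point belongs to the tail.
            tail.append((page, off, length))
    return head, tail
```

Splitting two 4096-byte pages at byte 6144 leaves the second page shared between both halves, which is the case a page-granular split could not express.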
mbligh (~mbligh@216.99.192.236) has joined channel #lse
|
|
<akpm> |
arjan: OK, noted. Which brings us to the next point ;)
|
|
<akpm> |
ideraid hasn't been ported to 2.5 at all yet.
|
|
<riel> |
oxymoron: that's ok with limited numbers and mempool, please see mempool.c
|
|
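riel's point is that a small preallocated reserve lets the I/O path keep making progress when the normal allocator fails. A toy Python model of that idea follows; it is a sketch, not the contents of mempool.c. `MemPool`, `alloc_fn` and `min_nr` are illustrative names, and the model returns `None` when the reserve is empty instead of waiting for an element to be freed.

```python
class MemPool:
    """Toy reserve pool: fall back to preallocated objects under pressure."""

    def __init__(self, min_nr, alloc_fn):
        self.min_nr = min_nr
        self.alloc_fn = alloc_fn
        # Fill the reserve up front, while allocation is still expected to work.
        self.reserve = [alloc_fn() for _ in range(min_nr)]

    def alloc(self):
        obj = self.alloc_fn()          # try the normal allocator first
        if obj is not None:
            return obj
        if self.reserve:               # allocator failed: dip into the reserve
            return self.reserve.pop()
        return None                    # toy model only: reserve exhausted

    def free(self, obj):
        # Refill the reserve before giving objects back to the system.
        if len(self.reserve) < self.min_nr:
            self.reserve.append(obj)
```

As long as the number of in-flight objects taken from the reserve stays bounded ("limited numbers"), every `free` makes a later `alloc` succeed again.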
<arjan> |
ideraid should be able to use the DM layer
|
|
<arjan> |
once we can split on not-a-page boundary
|
|
<arjan> |
until we can split, it's stillborn
|
|
tcw (tcw@209.198.128.6) has joined channel #lse
|
|
gregg_it (~greg@adsl-178-71.37-151.net24.it) has joined channel #lse
|
|
<arjan> |
(eg not possible at all in 2.6)
|
|
<akpm> |
arjan: OK, well let's ping Jens on that one. I'll mail him, cc yourself and neilb
|
|
<gregkh> |
arjan: I thought the evms people had a patch for dm to do that.
|
|
<arjan> |
gregkh: if there's a generic split that one should do it imo
|
|
robbiew (~robbiew@pixpat.austin.ibm.com) has left channel #lse
|
|
<akpm> |
arjan: I lied.
|
|
<akpm> |
[PATCH] bio walking code
|
|
<akpm> |
|
|
<akpm> |
Add bio traversal functionality. This is a prereq for doing ide
|
|
<akpm> |
multiwrites safely and sanely.
|
|
ming (~ming@32.97.110.142) has quit: Read error: Connection reset by peer
|
|
<akpm> |
was merged today
|
|
<arjan> |
different thing
|
|
<fryelectronic> |
what about support for software raid based ide cards like promise and highpoint ?
|
|
<arjan> |
fryelectronic: that's ideraid
|
|
<akpm> |
fryelectronic: that's what we are discussing
|
|
Intuxicated (~chatzilla@modemcable029.129-200-24.mtl.mc.videotron.ca) has quit:
|
|
ming (~ming@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<arjan> |
fryelectronic: we're talking about that right now
|
|
ming (~ming@bi01p1.co.us.ibm.com) has quit: Client Quit
|
|
<fryelectronic> |
:)
|
|
<akpm> |
arjan: anyway. Let's coordinate with axboe&neilb on that one.
|
|
<akpm> |
next up:
|
|
<akpm> |
CD burning. There are still a few quirks to solve wrt SG_IO and ide-cd
|
|
<akpm> |
that just looks like bugzilla fodder to me
|
|
<akpm> |
drivers/input/
|
|
<viro> |
akpm: locking
|
|
<akpm> |
- rmk: unconverted keyboard/mouse drivers (there's a deadline of 2.6.0
|
|
<akpm> |
currently on these remaining in my/Linus' tree.)
|
|
<akpm> |
viro: of what?
|
|
<dwmw2> |
akpm: did the vmalloc races ever get fixed? You never did work out why my patch killed your machine, did you?
|
|
<viro> |
akpm: of damn next to everything
|
|
<viro> |
akpm: it's... optimistic code
|
|
<akpm> |
viro: which code?
|
|
<viro> |
akpm: input layer
|
|
<akpm> |
dwmw2: nope. I'll note that
|
|
<rmk> |
akpm: those drivers are an issue for ARM people to sort out - only people with the hardware can test them.
|
|
Intuxicated (~Intuxicat@modemcable029.129-200-24.mtl.mc.videotron.ca) has joined channel #lse
|
|
<akpm> |
viro: vojtech has been pretty quiet lately. Can you prepare a description of the problem, or were you planning on getting in there?
|
|
<wli> |
dwmw2: flag me down about those, I looked into a different set of vmalloc races recently
|
|
jfv (~jfv@32.97.110.142) has joined channel #lse
|
|
<viro> |
akpm: the last time I'd checked almost nothing had been protected - basically, it assumed that nothing overlaps in time
|
|
<viro> |
akpm: it might've become better, but I hadn't seen much activity there
|
|
<rmk> |
akpm: I've had mail from vojtech recently, apparently been rather busy.
|
|
ming (~ming@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<viro> |
akpm: I'll look into that code again
|
|
<akpm> |
viro: thanks
|
|
<rmk> |
akpm: I prodded him about a bug in atkbd.c, recognising mice connected to KVM switches as a keyboard.
|
|
<ak> |
rmk: KVMs are problematic in general with the new code, mine needs a special option now.
|
|
<dwmw2> |
wli: note there's no locking around the {un,}map_vm_area() calls. You can allocate a VM range after someone else frees it, set up your ptes and then they tear theirs down...
|
|
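The interleaving dwmw2 describes can be replayed deterministically in a toy model. This is a Python sketch under assumptions, not the vmalloc code: `VmModel` and its methods are hypothetical names, one address range stands in for the whole VM area, and the "ordered" variant only shows one ordering that avoids this particular interleaving.

```python
class VmModel:
    """Toy model: a set of free ranges plus a page table keyed by range."""

    def __init__(self, ranges):
        self.free = set(ranges)
        self.ptes = {}                 # range -> owner whose pages are mapped

    def alloc_range(self):
        return self.free.pop()

    def map_area(self, rng, owner):
        self.ptes[rng] = owner

    def unmap_area(self, rng):
        self.ptes.pop(rng, None)

    def release_range(self, rng):
        self.free.add(rng)


def racy_interleaving():
    vm = VmModel(["r0"])
    r = vm.alloc_range()
    vm.map_area(r, "A")
    vm.release_range(r)      # A: range becomes reusable first ...
    r2 = vm.alloc_range()    # B: allocates the same range
    vm.map_area(r2, "B")     # B: sets up its ptes
    vm.unmap_area(r)         # A: ... and only now tears down "its" ptes
    return vm.ptes.get(r2)   # what is left of B's mapping


def ordered_teardown():
    vm = VmModel(["r0"])
    r = vm.alloc_range()
    vm.map_area(r, "A")
    vm.unmap_area(r)         # A: tear down the mapping first
    vm.release_range(r)      # A: then make the range reusable
    r2 = vm.alloc_range()
    vm.map_area(r2, "B")
    return vm.ptes.get(r2)
```

In the racy ordering B ends up with no mapping at all; with teardown completed before the range is released, B's mapping survives.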
<rmk> |
ak: this is more a timing issue, and there's a bug in there where a timeout never expires.
|
|
<wli> |
dwmw2: that's what I "fixed", we can doublecheck things later
|
|
<rmk> |
we end up refreshing a timeout each time we go around the loop, so we never timeout.
|
|
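The bug pattern rmk describes (a deadline recomputed on every pass through a polling loop) is easy to reproduce in isolation. The sketch below is plain Python, not the atkbd.c code: `wait_buggy`, `wait_fixed` and `FakeClock` are hypothetical names, and the iteration cap exists only so the buggy variant terminates.

```python
class FakeClock:
    """Deterministic clock: every reading advances time by one tick."""

    def __init__(self):
        self.t = 0

    def __call__(self):
        self.t += 1
        return self.t


def wait_buggy(poll, now, timeout, max_iters):
    for _ in range(max_iters):
        deadline = now() + timeout   # BUG: deadline refreshed on every pass
        if poll():
            return "ok"
        if now() >= deadline:        # never true for timeout > 1 tick
            return "timeout"
    return "gave up"                 # only the iteration cap stops the loop


def wait_fixed(poll, now, timeout, max_iters):
    deadline = now() + timeout       # deadline computed once, before the loop
    for _ in range(max_iters):
        if poll():
            return "ok"
        if now() >= deadline:
            return "timeout"
    return "gave up"
```

With a device that never responds (`poll` always false) and a 5-tick timeout, the buggy loop runs until the cap while the fixed loop reports a timeout.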
Foske (~josh@e162077.upc-e.chello.nl) has joined channel #lse
|
|
<viro> |
akpm: BTW, parport is nearly as bad as that and there the code is more hairy
|
|
badspirit (~badspirit@80.49.51.30) has quit: Quit: Client exiting
|
|
ElectricElf (david@elf.noc.oftc.net) has joined channel #lse
|
|
<ak> |
also the input keyboard stuff still has unusably obscure config options for standard PC hardware
|
|
<akpm> |
viro: parport is hairy, and Tim seems to have vanished. Is he still with RH?
|
|
<oxymoron> |
ak: Agreed.
|
|
<ak> |
if they're not already set you have no chance to get them correct
|
|
<dwmw2> |
akpm: on honeymoon I think :)
|
|
<Nemesis> |
parport has *always* been hairy
|
|
<Foske> |
ak: imho that is well covered by make def, but I must agree it is a hell of a job to config the pc keyboard by hand
|
|
<rmk> |
ak: I think the config stuff is all setup so that the default should allow it to just work.
|
|
<akpm> |
dwmw2: is he still active kernelwise?
|
|
<arjan> |
akpm: he still does parport stuff
|
|
<rmk> |
ak: the problem comes when you take an old config with CONFIG_INPUT=m
|
|
<ak> |
rmk: only from the oldconfig
|
|
<ak> |
rmk: if you do make oldconfig with a 2.4 config all hell breaks loose
|
|
<oxymoron> |
Nemesis: actually the original parport code was one of the simplest drivers in the tree..
|
|
<akpm> |
next?
|
|
<akpm> |
synaptic touchpad support. There's a patch in -mm and someone emailed me today wrt forward-porting a separate driver, so things are advancing there
|
|
<viro> |
akpm: IMO parport is more of "figure out what API changes are needed for its users, get them done ASAP, then fix generic layer at leisure"
|
|
<arjan> |
akpm: how's that different from the userspace tpconfig?
|
|
<Foske> |
akpm: did I miss ACPI already ?
|
|
<akpm> |
arjan: never heard of it
|
|
<hanna> |
Foske, nope.
|
|
<arjan> |
akpm: check it out some day then ;)
|
|
<akpm> |
arjan: I don't know what a synaptics is ;)
|
|
<Foske> |
hanna: oki...
|
|
<oxymoron> |
akpm: Just about any laptop touchpad..
|
|
<akpm> |
so does tpconfig mean that we don't need any kernel support for them?
|
|
<Foske> |
one of the major issues IMHO: If your kernel doesn't work: disable ACPI and try again...
|
|
<davej> |
how did extra driver support become a 'must-fix' item anyway ?
|
|
<ak> |
foske: same as with 2.4. nothing has changed.
|
|
<arjan> |
akpm: I think so; but I guess it's an "Ask Vojtech"
|
|
<ak> |
foske: nothing to be concerned about for the release.
|
|
<dwmw2> |
davej: we've taken to breaking the touchpad and clitmouse setups with the mouse probe
|
|
<oxymoron> |
akpm: No, tpconfig may let you fiddle with the tap=click support and the like..
|
|
jasonmc (~jasonmc@64.231.16.203) has joined channel #lse
|
|
<akpm> |
davej: works in 2.4, lots of people want it I think
|
|
<davej> |
dwmw2: ok
|
|
<davej> |
any feedback on whether the driver in -mm fixes things?
|
|
<oxymoron> |
davej: Not as of -3.
|
|
<akpm> |
davej: not really.
|
|
<Foske> |
later about that...
|
|
<hanna> |
next?
|
|
PuckCh (~Bornet@adsl-62-167-187-45.adslplus.ch) has quit: Ping timeout: 492 seconds
|
|
<akpm> |
"I have created an extra driver for the touchpad (as opposed
|
|
<akpm> |
to modifying psmouse.c). I will work on the driver as soon as I find the
|
|
<akpm> |
time and send you a copy."
|
|
<akpm> |
Jens Taprogge <jens.taprogge@rwth-aachen.de> |
|
|
<akpm> |
we'll see. Yup, next.
|
|
<akpm> |
drivers/misc/
|
|
<davej> |
if the mouse probe buggers things up, surely psmouse needs fixing too ?
|
|
bunk (~bunk@129.187.202.58) has joined channel #lse
|
|
<akpm> |
rmk: UCB1[23]00 drivers, currently sitting in drivers/misc in the ARM
|
|
<akpm> |
tree. (touchscreen, audio, gpio, type device.)
|
|
<oxymoron> |
davej: (yes)
|
|
<rmk> |
drivers/misc has traditionally been kept empty - are people happy for such stuff to end up in there, or does it need to find other places in the kernel tree?
|
|
<willy> |
drivers/char == drivers/misc ...
|
|
<oxymoron> |
rmk: Are you pushing to have ARM working out of the box for 2.6?
|
|
drepper (~chatzilla@cpe-24-221-190-179.ca.sprintbbd.net) has joined channel #lse
|
|
<viro> |
akpm: actually, misc.c has a good chance to die
|
|
<rmk> |
oxymoron: I'm trying to get as much of ARM merged. I'm currently working with a 1.8MB uncompressed patch which ain't good.
|
|
<viro> |
akpm: with cdev-cidr that's trivial
|
|
<viro> |
akpm: /proc/misc support is the only thing that remains after that
|
|
<akpm> |
rmk: so it looks like those drivers need to find appropriate, separate homes in the tree?
|
|
<rmk> |
my fear is that as soon as its seen ok to put stuff in drivers/misc, it'll fill up like mad.
|
|
<ak> |
sounds hardly like a release issue, more something for 2.7
|
|
<willy> |
agreed
|
|
<randy> |
how hard is it to just move them?
|
|
<rmk> |
ak: I'd rather not have to carry them across.
|
|
<rmk> |
basically, I'm getting rather sick of carrying such a huge patch.
|
|
<akpm> |
rmk: what would you prefer to do?
|
|
<rmk> |
been doing it since 2.0 ;(
|
|
<davej> |
rmk: why can't they live in sound/ drivers/input etc.. ?
|
|
shawk (~shawk@80.138.160.250) has quit: Quit: Client Exiting
|
|
<rmk> |
they're fairly closely related to each other, being all on one device.
|
|
<riel> |
how about drivers/arm/ if they're arm-only devices anyway ?
|
|
<akpm> |
rmk: drivers/UCB ?
|
|
<davem> |
riel: nobody will grep in there when changing global APIs
|
|
<davem> |
:-)
|
|
<viro> |
akpm: acronym clash
|
|
<akpm> |
drivers/rmk?
|
|
<randy> |
or arch/arm/drivers ?
|
|
<rmk> |
akpm: heh.
|
|
<rmk> |
randy: been there, people didn't like it.
|
|
<randy> |
ok
|
|
<oxymoron> |
randy: Sets an even worse precedent than drivers/misc
|
|
<arjan> |
akpm: likewise we need a drivers/obsolete
|
|
<willy> |
drivers/arm seems fine, it's what we did for parisc
|
|
<jejb> |
drivers/parisc is what parisc uses
|
|
<randy> |
oxymoron: but we still/already have those.
|
|
<willy> |
s390 ditto (though i hate to use s390 as a good example ;-)
|
|
<davem> |
I'd personally prefer drivers/net/arm drivers/scsi/arm etc.
|
|
<akpm> |
rmk: are they truly arm-only?
|
|
<rmk> |
not really - they could be found elsewhere, but I'm not aware of other uses.
|
|
<arjan> |
if it's not a gazillion of them, why even net/arm and not just net/
|
|
<randy> |
drivers/whatever_ucb_is
|
|
<jejb> |
Lay the tree out per maintainers, so drivers/arm/scsi for ARM only SCSI (only drivers/scsi/arm if they share lots of code)
|
|
<rmk> |
there's a related issue - drivers/acorn/net could become drivers/net/arm
|
|
<oxymoron> |
akpm: There are probably multi-device chipsets that will appear on multiple platforms soon..
|
|
<rmk> |
likewise for drivers/acorn/scsi
|
|
* davem doubts we're making progress on the todo list now
|
|
<akpm> |
this is getting ratholey.
|
|
<randy> |
right
|
|
<davej> |
1hr was optimistic for getting through this lot 8)
|
|
<akpm> |
rmk: your call. But there ain't no point in leaving them in your tree.
|
|
<oxymoron> |
rmk: I think the consensus is anything but misc..
|
|
<akpm> |
drivers/net/irda/
|
|
<rmk> |
ok, I'll try to find a new home for them.
|
|
<viro> |
rmk, akpm: let's skip that - it's isolated and can be discussed later
|
|
<akpm> |
this is a bunch of jt things which are under control. And a bug from rmk which I'll forward on.
|
|
<rmk> |
akpm: I think we've sorted that one between myself and Jean.
|
|
<akpm> |
ok
|
|
<akpm> |
drivers/pci/
|
|
zaitcev (~zaitcev@adsl-66-124-38-163.dsl.sktn01.pacbell.net) has joined channel #lse
|
|
<akpm> |
alan: Some cardbus crashes the system
|
|
<arjan> |
akpm: afaik the pci device list has no locking
|
|
<arjan> |
which is the biggest thing
|
|
<akpm> |
alan says: We have multiple drivers walking the pci device lists and also using
|
|
<akpm> |
things like pci_find_device in unsafe ways with no refcounting. I think
|
|
<akpm> |
we have to make pci_find_device etc refcount somewhere and add
|
|
<akpm> |
pci_device_put as was done with networking.
|
|
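Alan's proposal (lookups take a reference, callers drop it with a put, and the device only goes away when the last reference is dropped) can be modelled in userspace. This is a minimal Python sketch, not kernel code: `Registry`, `Device`, `find_get` and `put` are hypothetical names chosen for illustration; only `pci_find_device` and `pci_device_put` come from the discussion.

```python
import threading


class Device:
    def __init__(self, name):
        self.name = name
        self.refs = 1          # the reference held by the registry's list
        self.freed = False


class Registry:
    """Toy device list: lookups are locked and hand out counted references."""

    def __init__(self):
        self._lock = threading.Lock()
        self._devices = {}

    def add(self, name):
        with self._lock:
            self._devices[name] = Device(name)

    def find_get(self, name):
        # Lookup and refcount increment happen under the same lock, so a
        # concurrent remove() cannot free the device in between.
        with self._lock:
            dev = self._devices.get(name)
            if dev is not None:
                dev.refs += 1
            return dev

    def put(self, dev):
        with self._lock:
            dev.refs -= 1
            if dev.refs == 0:
                dev.freed = True   # stands in for actually freeing the device

    def remove(self, name):
        with self._lock:
            dev = self._devices.pop(name)
        self.put(dev)              # drop the list's own reference
```

A driver that looked a device up before it was removed (for example on card eject) still holds a valid object until it calls `put`, which is the property the unrefcounted lookups lack.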
<akpm> |
that's been there for ever
|
|
<ak> |
akpm: in tree cardbus has been traditionally unusable (in 2.4). 2.6 will just carry that on
|
|
PugMajere (~ryan@68.60.187.197) has joined channel #lse
|
|
<arjan> |
would be nice to get at least the infrastructure to do it right in
|
|
<ak> |
everybody serious uses the external code
|
|
<rmk> |
ak: report known problems in my direction please.
|
|
<arjan> |
drivers can be fixed during 2.6.X itself
|
|
<akpm> |
ak: where does the working code come from?
|
|
<willy> |
i use the internal code
|
|
<ak> |
akpm: dmills
|
|
<akpm> |
dhinds?
|
|
<oxymoron> |
ak: Disagree.
|
|
<ak> |
erm hinds sorry
|
|
<gregkh> |
ak: I'm going to be working on the pci locking stuff.
|
|
<ak> |
all the distributions ship external
|
|
<gerrit_> |
seems like things that are broken in 2.4 and still broken in 2.6 should be on a different list from regressions in 2.6 and incomplete features in 2.6.
|
|
<dwmw2> |
pci locking is completely screwed. There's locking of sorts, but it's half-arsed
|
|
<ak> |
and external supports many more chipsets and drivers than internal
|
|
<arjan> |
ak: RH doesn't, Mandrake doesn't
|
|
<rmk> |
weren't we going to deprecate find_device and friends ?
|
|
<akpm> |
gerrit_: well if people aren't seriously hurting from them then they're not critical
|
|
<willy> |
gerrit: not entirely -- 2.6 may make these problems more likely to hit, and 2.6 is more capable so it'll be run on more boxes
|
|
<gregkh> |
gerrit_: no, pci locking still really needs to get fixed anyway. it's much easier to do it on 2.5 than 2.4 because of the driver core now.
|
|
<oxymoron> |
arjan: Debian ships both, of course..
|
|
jfv_ (~jfv@bi01p1.nc.us.ibm.com) has joined channel #lse
|
|
zaxl (alex@212.39.68.18) has quit: Read error: Connection reset by peer
|
|
PuckCh (~Bornet@62.167.94.214) has joined channel #lse
|
|
ricklind (~rick@129.33.49.251) has joined channel #lse
|
|
<ak> |
arjan: are you sure? a lot of bridge drivers are missing in internal
|
|
patman_ (~patman@bi01p1.nc.us.ibm.com) has joined channel #lse
|
|
gaughen_ (~gaughen@129.33.49.251) has joined channel #lse
|
|
<arjan> |
ak: for RH I'm sure
|
|
<gregkh> |
rmk: I wanted to, but couldn't find a way to do it, as there are legal users of that api.
|
|
<ak> |
arjan: lots of wireless stuff also only works with external
|
|
<gerrit_> |
if there's a good case to say that 2.6 is different than 2.4, then yeah, higher priority. I'm also thinking of the tty stuff...
|
|
hlinder (~hlinder@129.33.49.251) has joined channel #lse
|
|
<arjan> |
ak: drivers are easily ported and added; different issue
|
|
rick (~rick@32.97.110.142) has joined channel #lse
|
|
mckenney_ (~mckenney@32.97.110.142) has joined channel #lse
|
|
ming (~ming@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
mckenney (~mckenney@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
janetinc (~janetinc@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
mnc (~null@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
badari (~badari@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
hbaum (~hbaum@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
keith (~keith@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
gaughen__ (~gaughen@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
gaughen__ (~gaughen@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
nevdull (~rick@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
colpatch (~mcd@bi01p1.co.us.ibm.com) has quit: autokilled: Session limit exceeded
|
|
mnc (~null@32.97.110.142) has joined channel #lse
|
|
<oxymoron> |
gerrit: Some things like tty were made worse by stuff like preempt even though they were pre-existing.
|
|
<akpm> |
gerrit_: well things like tty are something which we do need to fix. That's what a devel kernel is for, and if it holds up stabilisation of that devel kernel well bad luck.
|
|
rick (~rick@32.97.110.142) is now known as nevdull
|
|
keith (~keith@32.97.110.142) has joined channel #lse
|
|
dkwho (~davidho@199.71.138.178) has joined channel #lse
|
|
<akpm> |
gerrit_: although it may not hold up 2.6.0
|
|
gaughen (~gaughen@32.97.110.142) has quit: Read error: Connection reset by peer
|
|
jfv (~jfv@32.97.110.142) has quit: Write error: connection closed
|
|
jstultz (~jstultz@32.97.110.142) has quit: Write error: connection closed
|
|
patman (~patman@32.97.110.142) has quit: Write error: connection closed
|
|
hanna (~hlinder@32.97.110.142) has quit: Write error: connection closed
|
|
<viro> |
oxymoron: ... not to mention BKL-hunters from certain company...
|
|
<gerrit_> |
akpm: agreed
|
|
<akpm> |
movin' on
|
|
<akpm> |
drivers/pcmcia/
|
|
<akpm> |
alan: Most drivers crash the system on eject randomly with timer bugs. I
|
|
<akpm> |
think after RMK's stuff is in most of the pcmcia/cardbus ones go except the
|
|
<akpm> |
locking disaster.
|
|
hbaum (~hbaum@32.97.110.142) has joined channel #lse
|
|
badari (~badari@32.97.110.142) has joined channel #lse
|
|
<rmk> |
Linus finally took my patch today without complaint. There's still a lot of work to do between Dominik and myself in there though.
|
|
colpatch (~mcd@32.97.110.142) has joined channel #lse
|
|
patman__ (~patman@32.97.110.142) has joined channel #lse
|
|
tytso_ (~tytso@18.187.1.124) has joined channel #lse
|
|
<akpm> |
rmk: but it's happening. Is good.
|
|
<rmk> |
yep.
|
|
<akpm> |
drivers/pld/
|
|
<akpm> |
rmk: EPXA (ARM platform) PLD hotswap drivers (drivers/pld)
|
|
Bacchus (~ralf@80.139.86.183) has joined channel #lse
|
|
<rmk> |
maybe drivers/arm/ stuff again?
|
|
<akpm> |
rmk: ok, you can decide what to do there?
|
|
<rmk> |
ok.
|
|
patman__ (~patman@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
colpatch (~mcd@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
badari (~badari@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
hbaum (~hbaum@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
keith (~keith@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
mnc (~null@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
mckenney_ (~mckenney@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
nevdull (~rick@32.97.110.142) has quit: autokilled: Session limit exceeded
|
|
<akpm> |
drivers/video/
|
|
<davej> |
fb stuff changes according to position of moon. one day it works, next merge something else seems to break.
|
|
<akpm> |
Lots of drivers don't compile, others do but don't work.
|
|
<davem> |
jsimmons slowly making progress on this
|
|
<davem> |
I had to teach him how to maintain
|
|
<davem> |
:-)
|
|
<arjan> |
rare driver not compiling is not stopship
|
|
<viro> |
akpm: lots of serial drivers in drivers/char and arch/* do not compile either
|
|
<arjan> |
need a CONFIG_KNOWN_BROKEN
|
|
<cpufreak> |
ok
|
|
<cpufreak> |
that should be fixed
|
|
<zaitcev> |
Wasn't jsimmons unemployed?
|
|
<cpufreak> |
sorry about that
|
|
<ak> |
fb really needs a better in kernel pageattr api
|
|
<gerrit_> |
CONFIG_KNOWN_BROKEN or at least an actively maintained list of working vs. broken
|
|
<ak> |
usually they're unusably slow on non x86 without write combining for the video buffer
|
|
<dwmw2> |
zaitcev: yes. Not sure if that's still the case.
|
|
<akpm> |
davem: slowly. Does James have help? There's a Spanish-sounding guy who is very good, but I think he'd taken a month or two vacation.
|
|
<akpm> |
viro: ok, noted
|
|
<davej> |
there are also early-init problems with some of the framebuffers (like i810fb) too. Initialising fb from drivers/char/mem.c may not be the best thing.
|
|
<gerrit_> |
I keep seeing a lot of "lots of things are broken" but a list of what works AND what is broken would be good
|
|
<wli> |
part of that was my fault with the slabification
|
|
<davem> |
akpm: geert supposedly helps him
|
|
<gregkh> |
gerrit_: the osdl list is a good indication of that.
|
|
<Almighty> |
akpm: Tony Daplas. the i810fb-guy.
|
|
<akpm> |
Almighty: that's the one. He does ice work.
|
|
<akpm> |
nice
|
|
<randy> |
yes
|
|
<Almighty> |
akpm: yeah.
|
|
<davem> |
moving on...
|
|
<akpm> |
not sure what else to say there.
|
|
<akpm> |
drivers/scsi/
|
|
<arjan> |
CONFIG_KNOWN_BROKEN fodder again for not compiling stuff
|
|
<akpm> |
lots of stuff there, and lots of developers and lots of interest in getting it fixed. I don't think we need to say much?
|
|
<davem> |
akpm: a large merge went in from hch recently, maybe it helped significantly here
|
|
<davem> |
right
|
|
<jejb> |
What *needs* to be fixed
|
|
<oxymoron> |
If viro won't mention the locking nightmare, I will..
|
|
<arjan> |
not old isa drivers
|
|
<jejb> |
arjan: so wd33c99
|
|
<willy> |
Please don't pull out sym53c8xx yet
|
|
<rmk> |
I have a pending todo: I need to put the scsi error handling through a workout on my scsi bus from hell to make sure it does the right thing and doesn't get wedged.
|
|
<willy> |
Think _2 still doesn't work on PA-RISC
|
|
<akpm> |
where are we with qlogic?
|
|
<jejb> |
willy: works for me on C360
|
|
ricklind (~rick@129.33.49.251) has quit: Ping timeout: 497 seconds
|
|
<arjan> |
qlogic FC needs quite some work still wrt vendor drivers
|
|
<andmike> |
rmk: Let me know if does not work.
|
|
jfv_ (~jfv@bi01p1.nc.us.ibm.com) has quit: Write error: connection closed
|
|
<jejb> |
akpm: What do people think about the qlogic driver going in?
|
|
patman_ (~patman@bi01p1.nc.us.ibm.com) has quit: Read error: Connection timed out
|
|
gaughen_ (~gaughen@129.33.49.251) has quit: Ping timeout: 485 seconds
|
|
<arjan> |
but they're working on it; not stopship
|
|
<akpm> |
jejb: which one?
|
|
<arjan> |
just driveradd
|
|
<willy> |
jejb: ok. i need to make sure there's nothing left to fix that went into the original and not _2
|
|
<akpm> |
feral?
|
|
<davem> |
And then there's feral.com's qlogic drivers
|
|
janetinc (~janetinc@32.97.110.142) has joined channel #lse
|
|
<jejb> |
akpm: actually both qlogic and feral
|
|
<oxymoron> |
There are race issues with hot adding and removing devices last I checked..
|
|
hlinder (~hlinder@129.33.49.251) has quit: Ping timeout: 495 seconds
|
|
<davem> |
which actually work on SBUS kit
|
|
ming (~ming@32.97.110.142) has joined channel #lse
|
|
<wli> |
jejb: what's your take on qlogicisp.c? marginal case or vaguely important?
|
|
badari (~badari@32.97.110.142) has joined channel #lse
|
|
mnc (~null@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<jejb> |
wli: unfixable without docs
|
|
<davem> |
wli: we can kill that by putting in the feral stuff
|
|
guest_3 (~Bunny@216.236.98.130) has joined channel #lse
|
|
<wli> |
jejb/davem: Sounds like a plan.
|
|
<jejb> |
hotplug is an issue but not a show stopper for SCSI.
|
|
<davem> |
agreed
|
|
mckenney (~mckenney@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<akpm> |
jejb: my perception is that mjacob needs a bit of help. More testers, closer review/feedback from yourself and hch
|
|
<davem> |
akpm: his standards are too high :-)
|
|
<davem> |
akpm: really, his driver is quite stable
|
|
<jejb> |
akpm: no USB storage, but I'll try
|
|
<zwane> |
please do, it's randomly useable (stopped working for me for a release then suddenly worked with not many people able to have a look)
|
|
hanna (~hlinder@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<akpm> |
davem: interesting to hear...
|
|
<mbligh> |
davem / akpm, feral falls over fairly easily for us
|
|
nevdull (~rick@32.97.110.142) has joined channel #lse
|
|
<davem> |
mbligh: and qlogicisp.c does better? :-)
|
|
<mbligh> |
davem, not the point ;-)
|
|
<cpufreak> |
Those of you from IBM - sorry about that, the session limit has now been raised to try and prevent a recurrence of that.
|
|
<wli> |
different cards
|
|
jstultz (~jstultz@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<davem> |
I say we put the feral stuff in there so it starts getting tested
|
|
<dhansen> |
mbligh: patman disagrees. He isn't having problems much any more
|
|
colpatch (~mcd@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<mbligh> |
wli, nope, feral will drive ISP
|
|
jbaron (~jbaron@66.187.230.200) has quit: Quit: Client Exiting
|
|
<riel> |
cpufreak: thanks!
|
|
hbaum (~hbaum@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<jejb> |
wli: not entirely. Feral covers all qlogic chips
|
|
* mbligh pokes andmike ... what's up with feral nowadays?
|
|
riel (~riel@riel.netop.oftc.net) has changed mode for #lse to +o hanna
|
|
<jejb> |
feral also does endianness and sbus unlike the other non FC ones
|
|
<wli> |
qlogicisp.c falls over when you breathe on it for isp1020; IIRC other qlogic stuff is used mostly for feral wrt. IBM testing (you don't do perf runs on 32-bit cards on PAE boxen -- it's pointless)
|
|
<andmike> |
jejb: The feral does need some queue work on FC cards
|
|
<mbligh> |
wli, you obviously have dangerously bad breath ;-) Is pretty reliable for me
|
|
<jejb> |
OK, what about putting *both* feral and qlogic in and hoping evolutionary pressure makes them better?
|
|
fryelectronic (~tom@213.119.141.75) has quit: Quit: BitchX: its wax ecstatic
|
|
<mbligh> |
jejb, sounds good to me ;-)
|
|
<arjan> |
either way it's all "just" driver adds
|
|
<davem> |
moving on...
|
|
<jejb> |
and thus dumping qlogicfc and qlogicisp
|
|
<arjan> |
without core effect
|
|
<zaitcev> |
jejb: Tried that before. Look at uhci and usb-uhci. It's not too convenient for distributors. Arjan may chime in, perhaps.
|
|
<akpm> |
qlogic drivers: merge qlogicisp, feral with a view to dropping qlogicfc and qlogicisp
|
|
<andmike> |
jejb: Sounds good to me
|
|
<jejb> |
zaitcev: I'm unwilling to pick winners at this time
|
|
<akpm> |
we done with scsi?
|
|
<davem> |
I think so
|
|
<wli> |
Yes
|
|
<jejb> |
akpm: and merge the qla2xxx too
|
|
<akpm> |
drivers/usb/gadget/
|
|
patman (~patman@32.97.110.142) has joined channel #lse
|
|
<jejb> |
Yes
|
|
zaxl (~alex@212.50.18.217) has joined channel #lse
|
|
akr_ (~akr@dis212095024083.vie.telering.at) has quit: Quit: Client exiting
|
|
<akpm> |
rmk: SA11xx USB client/gadget code (David B has been doing some work on
|
|
<akpm> |
this, and keeps trying to prod me, but unfortunately I haven't had the time
|
|
<akpm> |
to look at his work, sorry David.)
|
|
<willy> |
wasn't that merged yesterday?
|
|
<dwmw2> |
willy: not sa11x0 I think
|
|
<davem> |
ok let david-b and rmk work it out
|
|
<gregkh> |
davem: agreed.
|
|
<jejb> |
There's a dma_api type issue in USB that's gone unaddressed. Probably ignore it for 2.6?
|
|
<randy> |
yes, not must-fix
|
|
<akpm> |
looks like that'll happen in due course
|
|
<davem> |
jejb: in the gadget stuff?
|
|
<jejb> |
davem: yes, I think so. They have a parallel API for doing transports
|
|
<jejb> |
they wanted to hook into the DMA API.
|
|
<davem> |
jejb: I'd need to take a closer look, but it doesn't even build with gcc-3.x currently
|
|
<jejb> |
As long as it all works OK, I think that's 2.7. But does it all work OK
|
|
<jejb> |
?
|
|
<davem> |
jejb: it uses "#if define(XXX)"; note the missing final 'd' in "defined".
|
|
<gregkh> |
jejb: for the device stuff? I didn't think so, but I'll look into it.
|
|
<akpm> |
davem: I got it all compiling on ppc64, couple of fixlets needed
|
|
<akpm> |
anyway usb gadget won't block 2.6 ;)
|
|
<jejb> |
gregkh: thanks.
|
|
<davem> |
akpm: k
|
|
<akpm> |
fs/
|
|
<davem> |
next
|
|
<akpm> |
ext3 data=journal mode is bust.
|
|
<akpm> |
that's me. Still thinking about it. It'll happen
|
|
<akpm> |
ext3/htree doesn't play right with NFS server. 90% fixed in -mm.
|
|
<viro> |
akpm: fs/char_dev.c needs removal of aeb stuff and merge of cdev-cidr. In progress.
|
|
<gerrit_> |
akpm: if you need help with data=journal, testing, etc. ask ming
|
|
<oxymoron> |
How's htree stability otherwise?
|
|
<tytso_> |
akpm: you forwarded the nfs htree patches to linus, right?
|
|
<ming> |
akpm: I would be happy to help with testing
|
|
<akpm> |
There's an outstanding NFS/htree problem to do with converting flat dirs to htree dirs. A fix is known, but not implemented.
|
|
<akpm> |
I'll bug Stephen/tytso. If they can flesh out the fix then chris or bzzz could code it
|
|
<akpm> |
I hope
|
|
<steved> |
its a cookie issue..
|
|
<tytso_> |
akpm: That should have been fixed with the patches you sent to Linus.
|
|
<steved> |
I believe..
|
|
gaughen (~gaughen@32.97.110.142) has joined channel #lse
|
|
gregkh (~gregkh@12.231.249.244) is now known as gregkh_gone
|
|
<akpm> |
tytso_: so there are no outstanding NFS probs?
|
|
<bzzz> |
akpm: any time
|
|
<akpm> |
oxymoron: pretty good I think.
|
|
Set (~set@207.158.181.2) has quit: Read error: Connection reset by peer
|
|
<tytso_> |
tytso: The only problem I know of is that the NFS patches you sent to Linus fix the flat dir->nfs problems, but
|
|
<tytso_> |
return '.' and '..'
|
|
lochii (lochii@singularity.convergence.cx) has joined channel #lse
|
|
<akpm> |
ah, right.
|
|
<tytso_> |
out of order. Nothing seems to break, and I haven't had time to work up the patch to fix that, if it needs fixing.
|
|
<tytso_> |
Once we get the OLS paper submitted, I'll get to work on that patch....
|
|
* corbet thinks "finish OLS paper" belongs on the must-fix list
|
|
<akpm> |
viro: fs/char_dev.c stuff noted
|
|
<akpm> |
- AIO/direct-IO writes can race with truncate and wreck filesystems.
|
|
Steph (~sglass@192.35.232.241) has quit: Read error: Connection reset by peer
|
|
<akpm> |
that's possibly unfixable in the 2.6 context
|
|
<viro> |
akpm: yes, they can.
|
|
<wli> |
akpm: I just got sprayed with a list of like 10 aio bugs from internal stuff
|
|
<viro> |
akpm: it's a Don't Do It, Then issue
|
|
<arjan> |
akpm: sct fixed that for 2.4
|
|
<akpm> |
wimpy option is to disable AIO/DIO for S_ISREG unless special mount option is given
|
|
<oxymoron> |
viro: Seems like it.
|
|
<ak> |
viro: it's not limited to root
|
|
<akpm> |
arjan: no, this one is much more blatant than the 2.4 issues
|
|
<gerrit_> |
janetinc: are you aware of the aio/dio/s_ISREG issues? are you planning to work them at all perhaps with suparna?
|
|
<viro> |
ak: removal of O_DIRECT handling solves it neatly for me
|
|
tcw (tcw@209.198.128.6) has left channel #lse
|
|
<akpm> |
we drop i_sem while IO is in progress. It's a bit of a joke.
|
|
<ak> |
viro: NAO
|
|
ndabney (~smurf@65.172.181.6) has joined channel #lse
|
|
<arjan> |
ak: why not
|
|
<quintela> |
NAO?
|
|
Steph (~sglass@pixpat.austin.ibm.com) has joined channel #lse
|
|
<willy> |
Not An Option
|
|
<janetinc> |
suparna and i can surely have a look at the aio/dio/s_ISREG issues
|
|
sct (~sct@80-195-6-107.cable.ubr02.ed.blueyonder.co.uk) has joined channel #lse
|
|
<bzzz> |
akpm: I believe there is 'use-after-free' bug in current jbd
|
|
Kwuck (~jordi@3ffe:b80:3:4481::2) has quit: Quit: *ping*
|
|
<willy> |
(presumably it's not an option because people have already published TPC numbers ;-)
|
|
<ak> |
arjan: because O_DIRECT is useful for a lot of things. you could make it root only, but that would be rather inconvenient
|
|
<gerrit_> |
akpm: lets avoid the wimpy option if janet can do something here.
|
|
<akpm> |
janetinc: I've discussed with suparna. It's ugly.
|
|
<oxymoron> |
AIO should be fixable, O_DIRECT seems more troublesome.
|
|
<akpm> |
we also need to forward-port sct's fixes for regular O_DIRECT
|
|
<gerrit_> |
akpm: is there a pointer to sct's patches somewhere? on your list perhaps?
|
|
<sct> |
akpm: I think that should be fairly straightforward, I've been doing a bit more testing and they look OK.
|
|
<wli> |
out of curiosity, is there any intention to merge the rest of aio? AIUI there's some large fraction of the actual async semantics that never hit the tree.
|
|
<sct> |
gerrit_: Not yet. There are some security implications.
|
|
<jejb> |
Doesn't full AIO depend on error stacks which we don't have?
|
|
<wli> |
(it sounds doubtful given the current state of affairs)
|
|
<akpm> |
wli: I need to review suparna's stuff
|
|
<oxymoron> |
wli: Linus was talking about a fairly major AIO-centric revamp when this last came up..
|
|
<akpm> |
wli: and run it by bcrl
|
|
<wli> |
jejb: worktodos and/or equivalents? yes
|
|
<willy> |
akpm: suparna has major doubts about the last set of patches she sent
|
|
<gerrit_> |
btw, anyone who has test cases that break aio/dio/etc. should submit them to LTP
|
|
<akpm> |
willy: of what nature?
|
|
<jejb> |
wli: ability to direct the driver error handlers is what I mean (i.e. don't retry 15 times etc.)
|
|
<willy> |
akpm: mostly "better ways to do stuff", I think.
|
|
<akpm> |
willy: ah. I probably scared her ;)
|
|
<sct> |
I just got back from a debate this evening and am off to bed shortly --- arjan said there was O_DIRECT stuff going on.
|
|
<wli> |
jejb: AFAIK we have 0 of that and the 2.4.x stuff didn't either
|
|
<akpm> |
willy: her architecture leverages existing error propagation. It's workable.
|
|
<sct> |
Anything I need to mention before I disappear?
|
|
<jejb> |
wli: that's why I said *full*
|
|
<jejb> |
It also applies to layered drivers (like md)
|
|
<gerrit_> |
sct: if you need help testing with that stuff, let janetmor@us.ibm.com or suparna know - we'll help
|
|
<wli> |
sct: basically aio/dio writes race w/truncate and someone said you had a fix somewhere
|
|
<akpm> |
sct: lock_journal() stinks ;)
|
|
<sct> |
akpm: We know that. :)
|
|
<sct> |
wli: aio isn't the problem, it's buffered/direct races which hurt.
|
|
<akpm> |
bzzz's stuff is in my intray. big.
|
|
<wli> |
sct: okay so you know what's going on then?
|
|
<akpm> |
wli: the stuff sct is working on is subtle. AIO/DIO is a great big clanger.
|
|
<willy> |
would lock_journal() suck less if we had spin-then-sleep semaphores?
|
|
<akpm> |
willy: it might, yes. Could be fun to play with
|
|
<riel> |
would it be possible to simply have truncate() fail while any process has the file open with O_DIRECT ?
|
|
<wli> |
This vaguely sounds like the stuff has owners & forward progress.
|
|
<akpm> |
wli: yup
|
|
<akpm> |
next....
|
|
<akpm> |
devfs!
|
|
<viro> |
riel: that's too ugly
|
|
<tytso_> |
move to terminate...
|
|
<viro> |
akpm: er. above comment applies
|
|
<Almighty> |
hehe
|
|
<viro> |
tytso: let hch do it
|
|
<arjan> |
isn't it nearly gone already?
|
|
<viro> |
tytso: I'll help
|
|
mrah (~Hicham@80.11.176.174) has joined channel #lse
|
|
<akpm> |
the plan there (I hope) is to rip the core and put in smalldevfs
|
|
<akpm> |
so people are working on that...
|
|
<riel> |
I'm sure devfs will turn out nicely with hch and viro looking after it
|
|
<akpm> |
anything else under fs/?
|
|
<green> |
reiserfs? ;)
|
|
<wli> |
akpm: hugetlbfs sucking on non-x86
|
|
<wli> |
(moi)
|
|
Foske (~josh@e162077.upc-e.chello.nl) has quit: Quit: using sirc version 2.211+KSIRC/1.2.4
|
|
<viro> |
green: tail-related stuff
|
|
<green> |
what's up with that reiserfs_read_pages patch that I believe sits in -mm?
|
|
<akpm> |
wli: bugzilla stuff I think.
|
|
<green> |
viro: Hm, what's wrong with tail related stuff?
|
|
<viro> |
green: IIRC, there was some mess going on
|
|
<viro> |
green: I'll need to look through archives - might've been fixed
|
|
<green> |
viro: any info/testcases? this is first time I hear about it
|
|
<wli> |
akpm: sparc64 is gross negligence + zero hardware =(
|
|
<davem> |
wli: Be nice if there was a testsuite somewhere, then I might work on it
|
|
<davem> |
wli: otherwise I'm -ENORACLE
|
|
<ak> |
davem: it's easy to test.
|
|
<davem> |
ak: I don't want to have to code up test programs
|
|
<green> |
also what's the general opinion on the st_blksize increase, I wonder? (besides some broken applications breaking)
|
|
<davej> |
wli: btw, shouldn't hugetlbfs fail to mount if the cpu doesn't support large pages?
|
|
<wli> |
davem: I'll bounce some stuff your way, then.
|
|
<davem> |
wli: cool
|
|
<ak> |
davem: just hack up the ltp shm testcases a bit
|
|
<davem> |
davej: sparc64 supports it, but I haven't maintained the code at all
|
|
<wli> |
davej: oh boy, sounds like a bogon
|
|
<davej> |
davem: yah, different issue..
|
|
<akpm> |
green: that patch has been in -mm for ages, but I'm wobbly about the st_blocksize thing.
|
|
<davej> |
wli: not sure if that got fixed, but it was the case a while back
|
|
<davem> |
anything else?
|
|
<ak> |
green: "some" is good. everybody using old bsd db breaks.
|
|
<wli> |
I'm done with fs/
|
|
<riel> |
how is ACL/EA stuff looking in 2.5 ?
|
|
<green> |
akpm: Hans seems to strongly want it. I spoke with Sleepycat and they said that they limit their IO window to 16k anyway. kmail issue is fixed, what's left?
|
|
<bzzz> |
akpm: probably, fast EA for ext3?
|
|
<akpm> |
green: but you guys own it. If you want to push ahead, and we have a mount option to turn it off, then let's do it.
|
|
gregg_it (~greg@adsl-178-71.37-151.net24.it) has quit: Remote host closed the connection
|
|
<green> |
akpm: ok, let's hope Linus will accept it then, I'll try to push it again tomorrow
|
|
<akpm> |
bzzz: I'm an EA ignoramus. I don't know if anyone is using it much, but the code has had decent testing in 2.4 I believe
|
|
<akpm> |
green: is -mm uptodate?
|
|
<oxymoron> |
I don't think EA/ACL qualifies as must-fix at this point.
|
|
<viro> |
akpm: there is some generic stuff for namei/namespace/super, but that's a slow-merge and can go in 2.6 just fine
|
|
<davej> |
green: is Hans still hoping for inclusion of reiser4 for 2.6 ?
|
|
<sct> |
akpm: It has, there's a fair following of AG's patches on 2.4
|
|
<green> |
akpm: Should be, I made exactly zero changes on top of what you have in -mm
|
|
pzb (~pzb@peabody.ximian.com) has joined channel #lse
|
|
<viro> |
davej: *what*?
|
|
<green> |
davej: Yes, I think he does
|
|
<davej> |
that seems to have gone quiet over the last months
|
|
<oxymoron> |
2.6.1, perhaps..
|
|
<akpm> |
green: OK, please check it over, let me know and I'll send it on
|
|
<green> |
davej: He thinks he has some secret agreement with Linus so that even if reiser4 is late for 2.6.0, it may get in later, as happened with reiserfs
|
|
<green> |
akpm: Ok
|
|
<akpm> |
kernel/
|
|
<akpm> |
O(1) scheduler starvation, poor behaviour seems unresolved.
|
|
<oxymoron> |
green: Hopefully Linus will hand over 2.6 at .0 this time around..
|
|
<davej> |
green: difference being lots of people were using old reiser before the merge already due to distros merging it
|
|
<nevdull> |
akpm: is Jens on?
|
|
<arjan> |
akpm: is that sched_yield() stuff?
|
|
<willy> |
openmp seems to be a good trigger
|
|
<akpm> |
nevdull: nope
|
|
<arjan> |
willy: openmp is sched_yield() crap
|
|
<arjan> |
willy: intel bogon code
|
|
<mbligh> |
akpm: wasn't that just Linus' patch?
|
|
<arjan> |
willy: they ought to use cpu binding and suddenly all is well
|
|
<nevdull> |
if this is the sched_yield stuff again, it shouldn't hold up. But I think we need to know how it "still doesn't feel as good".
|
|
<akpm> |
what's openmp?
|
|
<green> |
davej: well, I don't know what to say on this then ;)
|
|
<oxymoron> |
mbligh: Linus' patch was addressing a problem that's still there without the patch..
|
|
<willy> |
akpm: like PVM, aiui
|
|
<arjan> |
akpm: intel fortran lib
|
|
<akpm> |
it has a scheduler problem?
|
|
<drepper> |
akpm: there are at least two big problems with the interaction between futex and O(1). Ingo has already patches. But we need much more testing on big boxes. Only 4p+ machines have problems
|
|
<arjan> |
akpm: they try to parallel compute
|
|
<arjan> |
akpm: it calls sched_yield() in a tight loop in a spinlock like thing
|
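The pattern arjan describes, a lock-like loop that calls sched_yield() instead of sleeping, can be sketched roughly as follows. This is a minimal userspace illustration using C11 atomics and POSIX sched_yield(); the names yield_lock_acquire and yield_lock_release are hypothetical, and this is not the actual OpenMP runtime code.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical yield-based lock; it illustrates the pattern only. */
static void yield_lock_acquire(atomic_flag *lock)
{
    /* test_and_set returns the previous value: true means it is held. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        sched_yield();  /* give up the CPU, then retry immediately */
}

static void yield_lock_release(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

A waiter in this loop never blocks, so its progress depends entirely on how the scheduler treats sched_yield() and on where the contending threads are placed.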
|
<wli> |
drepper: give me something to test
|
|
<akpm> |
david m-t has said he's seen significant problems with compute loads too. He'll be preparing a report next week
|
|
<arjan> |
akpm: O(1) scheduler refuses to move 1 of them to the second cpu if you have 2 threads in contention
|
|
<arjan> |
akpm: and then Intel gets upset about that
|
|
<mbligh> |
drepper: IBM has plenty of big boxes we can help you with ;-)
|
|
<gerrit_> |
akpm: I have a lot of data that suggests that HT scheduling stinks - e.g. loss of performance with HT on...
|
|
<arjan> |
akpm: davidmt bug is the same openmp issue
|
|
<gerrit_> |
akpm: but I don't think it is stuff to hold the freeze. not a functional issue
|
|
<willy> |
gerrit: well that could be a hardware problem too
|
|
<davej> |
gerrit_: horses for courses... some loads its a win, some a lose.
|
|
Cyb0org (~Cyb.org@217.96.197.18) has joined channel #lse
|
|
jlas9 (~chatzilla@217-125-37-60.uc.nombres.ttd.es) has joined channel #lse
|
|
<willy> |
arjan: no, he's got two issues
|
|
<nevdull> |
find_busiest_queue has rounding errors due to the integer arithmetic. There are discontinuities in its decisions about where to balance.
|
|
<wli> |
major performance target
|
|
<willy> |
one's openmp, the other's something else
|
|
<gerrit_> |
willy: HT is a hardware problem.
|
|
<randy> |
gerrit_: are those just degenerate cases (known) or not?
|
|
<arjan> |
willy: ok
|
|
<drepper> |
wli: mbligh: just get rhl9 and compile glibc with nptl. there are several test programs available. I'd suggest joining the nptl mailing list
|
|
<mbligh> |
nevdull: I had a fix for that somewhere
|
|
<gerrit_> |
davej: we have MS benchmarks and linux benchmarks on identical hardware. MS shows a 20-30% gain, linux shows a degradation.
|
|
<oxymoron> |
akpm: Do we still have the 'wiggling a window makes X go non-interactive' bug?
|
|
<nevdull> |
mbligh: it's not hard to fix. But it's still in the standard code.
|
|
<davej> |
gerrit_: which benchmark ?
|
|
<akpm> |
gerrit_: I have the impression that you have a lot of half-timers on the scheduler. Is there a particular owner there who can help to push things along?
|
|
<gerrit_> |
davej: tpc-h
|
|
<gerrit_> |
davej: and I think specjbb (java)
|
|
<dhansen> |
davej: Specjbb too
|
|
<akpm> |
oxymoron: it's much improved
|
|
<arjan> |
gerrit_: oracle is a cache hog; mssequelsewer isn't
|
|
<mbligh> |
nevdull, just shift the divisors onto the other side of the eqn IIRC
|
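The rounding problem nevdull raises and mbligh's "shift the divisors onto the other side" remedy can be illustrated with a generic comparison of two load ratios. This is only a sketch of the technique: the helper names are hypothetical, it is not the kernel's find_busiest_queue() code, and it assumes non-zero capacities and products that do not overflow.

```c
#include <stdbool.h>

/* Hypothetical helpers; not the kernel's find_busiest_queue(). */

/* Truncating form: is load_a/cap_a greater than load_b/cap_b? */
static bool busier_truncating(unsigned long load_a, unsigned long cap_a,
                              unsigned long load_b, unsigned long cap_b)
{
    /* Integer division discards the fractional part of each ratio. */
    return load_a / cap_a > load_b / cap_b;
}

/* Same comparison with the divisors moved to the other side. */
static bool busier_exact(unsigned long load_a, unsigned long cap_a,
                         unsigned long load_b, unsigned long cap_b)
{
    /* Exact as long as the two products fit in unsigned long. */
    return load_a * cap_b > load_b * cap_a;
}
```

With loads 7 and 5 on equal capacity 4, the truncating form compares 1 > 1 and reports no difference, while the cross-multiplied form compares 28 > 20 and sees the imbalance.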
|
<arjan> |
gerrit_: HT sucks for cache hogs, you trash the other cpu's cache too
|
|
<gerrit_> |
akpm: nevdull is our focal point, andrew theurer hacks a bit as well
|
|
<nevdull> |
akpm: I've heard one significant thing so far: Ingo has patches. Seems like we should try those before spending a lot of cycles on this.
|
|
mnc (~null@bi01p1.co.us.ibm.com) has quit: Remote host closed the connection
|
|
<gerrit_> |
arjan: yep - well aware, and DB2 was on MS and Linux...
|
|
<wli> |
drepper: we've got dedicated scheduler-interested ppl (e.g. nevdull, habanero, hubertus) we're probably better sharing our machines with to get you runtime, I'll see what I can do to get them plugged in
|
|
<gerrit_> |
same app, radically different performance
|
|
<arjan> |
gerrit_: same threading model ?
|
|
<gerrit_> |
arjan: hard to know - in theory yes, but in practice, theres a lot of arch specific code in the database
|
|
<gerrit_> |
arjan: hard to weed out the app differences - which is part of why it is taking so long.
|
|
<gerrit_> |
arjan: ditto on java related benchmarks.
|
|
<ak> |
the NUMA scheduling will also need a lot of work, but that's a different issue (and no showstopper for release)
|
|
<nevdull> |
ak: agreed. NUMA needs work but need not hold up 2.6.
|
|
<wli> |
nevdull: I'm concerned there are issues beyond load balancing going on. The queueing metrics I've collected appear to show some large discrepancies in scheduling latencies and waiting time on UP
|
|
<gerrit_> |
and again, I don't think the HT stuff holds 2.6 - but it does need some better understanding and we are looking for that.
|
|
<nevdull> |
Ok, those would be useful to go over.
|
|
<akpm> |
hanna: admin: we're 33% of the way through. How about we go through to the end of the "bugs" list, do features next time?
|
|
riel (~riel@riel.netop.oftc.net) is now known as unriel
|
|
<gerrit_> |
and NUMA is likely to be a distro value add item across the board...
|
|
<hanna> |
akpm sounds good. next week same time and day?
|
|
<gerrit_> |
if it makes 2.6.XX, it will have to be mostly arch independent.
|
|
<akpm> |
hanna: we can work that out later. There's still 30-60 minutes worth today.
|
|
<akpm> |
next?
|
|
<ak> |
gerrit: one issue is that there is quite different NUMA. k8 style NUMA vs 2/4 cpus per node big iron numa
|
|
<ak> |
gerrit: they need different strategies
|
|
<mbligh> |
nevdull / ak: as long as we can turn it off (it's now a separate config option), I don't see it's a regression from 2.4 ;-)
|
|
<davem> |
akpm: I also have to leave soon, need to hit net soonish :)
|
|
<mbligh> |
ak, you mean page striping?
|
|
<akpm> |
net/
|
|
<akpm> |
davem: that seems to be under control?
|
|
<davem> |
Alexey is on top of the first two entries.
|
|
davej_ (~davej@81-86-107-140.dsl.pipex.com) has joined channel #lse
|
|
<davem> |
The TCP hang issue is troublesome
|
|
<davem> |
Need more traces
|
|
<akpm> |
davem: yeah, sorry. I need to pull finger out.
|
|
Steph (~sglass@pixpat.austin.ibm.com) has quit:
|
|
<akpm> |
I'll do that asap
|
|
<davem> |
Thanks
|
|
<davem> |
Also, related to modules/
|
|
<davem> |
Rusty is working with us on two-stage unload
|
|
<davem> |
You see that URL you have there?
|
|
<davem> |
Ask shemminger for his mirror on osdl.org
|
|
<akpm> |
yup
|
|
<davem> |
Don't want my workstation getting spammed
|
|
<randy> |
davem which is the TCP hang issue?
|
|
<davem> |
Last one in net/
|
|
<ak> |
davem: but it won't need driver changes I hope?
|
|
<davem> |
ak: nope
|
|
Arador (diego@213.99.229.71) has quit: Quit: Arador
|
|
<ak> |
davem: tcp hang - i thought it was TCP_CORK only?
|
|
<davem> |
ak: only if you specify module_shutdown()
|
|
<ak> |
TCP_CORK is not a showstopper ;)
|
|
<davem> |
ak: This is not known yet
|
|
<davem> |
ak: Everyone uses it, yes it is
|
|
mrah (~Hicham@80.11.176.174) has quit: Quit: Client Exiting
|
|
sct (~sct@80-195-6-107.cable.ubr02.ed.blueyonder.co.uk) has quit: Quit: ZZZzzz
|
|
<willy> |
oh -- not net/ as such, davej was reporting problems where a client can overstress a server and Bad Things happen -- I don't see that on the list
|
|
<akpm> |
willy: nfs?
|
|
<davej_> |
thats under nfs
|
|
<davem> |
willy: That's pretty content free description
|
|
<willy> |
yes
|
|
<davej_> |
trond is on that
|
|
<davem> |
sunrpc needs to be fixed module wise
|
|
<davem> |
and de-lock_kernel()'d
|
|
<randy> |
davem, this one: http://www.osdl.org/archive/shemminger/modules.html
|
|
<davem> |
randy: yep
|
|
<akpm> |
I suspect my MAP_SHARED test will nuke nfs again. Need to retest. I pretty much know what to do there. That's just a bug tho
|
|
<ak> |
also soft needs to be fixed - there are quite a lot of uninterruptible waits in sunrpc/nfs
|
|
<viro> |
davem: per-interface sysctls are FUBAR, but that's not new
|
|
<ak> |
(not a showstopper of course)
|
|
<davem> |
viro: which ones?
|
|
<viro> |
davem: entire bunch - if they disappear on ifdown, you have trouble
|
|
<davem> |
viro: we only unregister them on unregister_netdev()
|
|
ricklind (~rick@129.33.49.251) has joined channel #lse
|
|
<davem> |
viro: or are you talking about ipv4 ones?
|
|
<viro> |
davem: yes
|
|
dwmw2 (~dwmw2@193.237.130.41) is now known as gone_dwmw2
|
|
rick (~rick@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
jstultz_ (~jstultz@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
<davem> |
viro: that explains it, I'll look into it
|
|
hbaum_ (~hbaum@bi01p1.co.us.ibm.com) has joined channel #lse
|
|
bunk (~bunk@129.187.202.58) has quit: Quit: good night
|
|
<davem> |
akpm: My only panic is on the TCP hang thing
|
|
<viro> |
davem: OK, later - I'll go through that code again and send you details
|
|
<davem> |
akpm: also what about those unmapping SLAB hacks?
|
|
<akpm> |
davem: hmm, I haven't played with that for a while
|
|
<davem> |
akpm: there was no resolution
|
|
<davem> |
akpm: was it determined to be buggy
|
|
<davem> |
akpm: re: vmalloc races?
|
|
<akpm> |
davem: there's uncertainty over the tlb invalidate.
|
|
moose (~cherry@65.172.181.6) has quit: Quit: Client Exiting
|
|
nevdull (~rick@32.97.110.142) has quit: Read error: Connection reset by peer
|
|
<davem> |
akpm: ok, because that thing could help find real bugs quite well
|
|
<akpm> |
davem: I'll ping manfred.
|
|
<green> |
akpm: but is there a patch one can look at?
|
|
rick (~rick@bi01p1.co.us.ibm.com) is now known as nevdull
|
|
hbaum (~hbaum@bi01p1.co.us.ibm.com) has quit: Read error: Connection reset by peer
|
|
jstultz (~jstultz@bi01p1.co.us.ibm.com) has quit: Read error: Connection reset by peer
|
|
<davem> |
akpm: also his magazine hacks, needed for routing perf
|
|
<akpm> |
green: it's in -mm
|
|
<ak> |
akpm: it kills e100 with apply_alternatives
|
|
<akpm> |
davem: that's 80% complete. Is stable, speeds things up, needs some autotuning or a tuning API
|
|
<davem> |
akpm: great
|
|
<ak> |
and I didn't find a bug in my code
|
|
<wli> |
magazining is a good general optimization
|
|
Burner23 (~burner@port-212-202-203-32.reverse.qdsl-home.de) has quit: Quit: leaving
|
|
<akpm> |
davem: has been in -mm for a couple of weeks
|
|
<davem> |
akpm: other than tweaks and bits, that's all wrt. networking
|
|
<akpm> |
davem: ok, thanks.
|
|
<davem> |
np, later guys
|
|
<akpm> |
back to kernel/
|
|
JK (~jan@vexed2.alioth.net) has quit: Remote host closed the connection
|
|
davem (~davem@216.101.162.243) has quit: Quit: Client Exiting
|
|
<akpm> |
Alan: 32bit uid support is *still* broken for process accounting.
|
|
<akpm> |
Create a 32bit uid, turn accounting on. Shock horror it doesn't work
|
|
<akpm> |
because the field is 16bit. We need an acct structure flag day for 2.6
|
|
<akpm> |
IMHO
|
|
<ak> |
akpm: does anybody still use process accounting ?
|
|
<akpm> |
anyone here understand this?
|
|
<viro> |
akpm: sure
|
|
<wli> |
yes
|
|
<gerrit_> |
sure, but who really uses process accounting?
|
|
<ak> |
the people who do supercomputer accounting have more complex patches anyways
|
|
<viro> |
akpm: the record is fixed-format.
|
|
<viro> |
akpm: binary
|
|
<oxymoron> |
Paranoid folks too.
|
|
<viro> |
akpm: with narrow field for UID
|
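The failure akpm and viro describe, a 32-bit uid written into a 16-bit field of a fixed-format binary record, can be illustrated with a simplified sketch. The two record types below are hypothetical and much smaller than the real accounting structure; the versioned wide record is just one assumed way a format change could look.

```c
#include <stdint.h>

/* Hypothetical, simplified records; not the real struct acct layout. */
struct acct_rec_narrow {
    uint16_t uid;       /* narrow field: only the low 16 bits survive */
};

struct acct_rec_wide {
    uint8_t  version;   /* lets a reader such as sa(8) tell formats apart */
    uint32_t uid;       /* full 32-bit uid */
};

static struct acct_rec_narrow record_narrow(uint32_t uid)
{
    struct acct_rec_narrow r = { .uid = (uint16_t)uid };  /* high bits lost */
    return r;
}

static struct acct_rec_wide record_wide(uint32_t uid)
{
    struct acct_rec_wide r = { .version = 1, .uid = uid };
    return r;
}
```

A uid of 70000 stored through record_narrow() comes back as 4464 (70000 modulo 65536), so the record is silently attributed to the wrong user.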
|
<tytso_> |
Well, if no one uses it, then a flag day switch of the structure before 2.6.0 ships should be easy.
|
|
<viro> |
akpm: fix: change the format
|
|
<arjan> |
akpm: yes
|
|
<arjan> |
akpm: afaik there's 16 bits of spare change in the records
|
|
<viro> |
akpm: will require change of sa(8)
|
|
<arjan> |
akpm: so not unfixable
|
|
<arjan> |
viro: are you sure? I thought there was some spare space in that struct
|
|
<viro> |
arjan: let me check
|
|
alb (~Libertad@200.43.231.239) has joined channel #lse
|
|
vizard (~jeff@24.43.126.4) has joined channel #lse
|
|
<wli> |
akpm: random goop: the task refcounting bug is taking forever to track down, may or may not be worth promoting
|
|
<viro> |
arjan: umm... there's some padding in the end, but...
|
|
<oxymoron> |
akpm: this accounting stuff is pretty narrow-interest, you can probably move on..
|
|
<arjan> |
viro: ok so fixable, not pretty but fixable
|
|
<akpm> |
oxymoron: well if we need to change API we'd better do it soon and loudly.
|
|
<viro> |
OK, next?
|
|
<arjan> |
akpm: padding ->no break. next. ;)
|
|
rico (peak@217.81.133.15) has left channel #lse
|
|
<akpm> |
ok, who's going to do it?
|
|
<arjan> |
akpm: Alan has a patch ready
|
|
<oxymoron> |
Let's nominate Alan..
|
|
<akpm> |
ah, OK.
|
|
ricklind (~rick@129.33.49.251) has quit: Ping timeout: 485 seconds
|
|
<akpm> |
mm/
|
|
<akpm> |
Overcommit accounting gets wrong answers
|
|
<akpm> |
this is fairly nasty. I'll take another look at that.
|
|
<akpm> |
- Proper user level no overcommit also requires a root margin adding
|
|
hollis (~hollis@192.35.232.241) has quit: Quit: night
|
|
<akpm> |
alan said this was surprisingly simple.
|
|
<akpm> |
anything else in mm?
|
|
<wli> |
akpm: slab parts aren't _that_ bad; morton pages are another matter entirely
|
|
<jejb> |
GFP_DMA32?
|
|
<wli> |
jejb: love to have it, wishlist?
|
|
acme (~acme@2-211.ctame701-2.telepar.net.br) has joined channel #lse
|
|
Cyb0org (~Cyb.org@217.96.197.18) has quit: Quit: .
|
|
<hanna> |
closing in on 2 hours...
|
|
* willy moves it be called GFP_4G
|
|
<jejb> |
wli: can be done with minor header tweaking. Needed for IA64 noiommu
|
|
<akpm> |
jejb: that's a separate zone. If we really want it, then yeah, who has patch?
|
|
<zaitcev> |
s390x needs 2GB limits
|
|
<jejb> |
akpm: can be implemented in current zone
|
|
<arjan> |
jejb: only if you don't merge the real ia64 noiommu fix
|
|
<zaitcev> |
Not 4, mind
|
|
<arjan> |
well
|
|
<arjan> |
we need an allocator that takes a bitmask
|
|
<arjan> |
also for 31 bit PCI devices
|
|
<arjan> |
as low level primitive
|
|
<wli> |
okay then let's take this one to another forum?
|
|
<zaitcev> |
Currently they just defined GFP_DMA to be 31 bit long
|
|
<jejb> |
arjan: so kmalloc_mask() instead. That's invasive
|
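The core test behind a mask-taking allocator of the kind arjan and jejb discuss is whether a buffer lies entirely below a device's addressing limit. The sketch below shows only that check; fits_dma_mask and DMA_MASK_BITS are hypothetical names, kmalloc_mask() exists here only as a name proposed in the discussion, and the helper assumes a non-zero size, no address overflow, and a mask of the form 2^n - 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: mask with the low n address bits set (1 <= n <= 64). */
#define DMA_MASK_BITS(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: does [addr, addr + size) fit under the mask? */
static bool fits_dma_mask(uint64_t addr, uint64_t size, uint64_t mask)
{
    uint64_t last = addr + size - 1;   /* last byte of the buffer */

    /* Any bit set above the mask means the device cannot reach it. */
    return (last & ~mask) == 0;
}
```

The same check covers the cases raised in the discussion: DMA_MASK_BITS(31) for the 2GB limit and 31-bit PCI devices, DMA_MASK_BITS(32) for a 4GB limit.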
|
<zaitcev> |
Sorry
|
|
arjan (~arjan@node-d-1ea6.a2000.nl) has quit: Quit: sleeptime for real
|
|
<willy> |
zaitcev: right, and ia64 defines GFP_DMA to be anything under 4GB
|
|
<wli> |
well we don't even agree on strategy here so let's make up our minds some other time
|
|
<akpm> |
I'll poke the stakeholders, see if we can work out what we want to do
|
|
<akpm> |
modules
|
|
<akpm> |
everything I have here seems to be under control.
|
|
<jejb> |
OK. GFP_DMA32 is the quick fix. kmalloc_mask() is better but more difficult
|
|
<willy> |
without rusty here, no point in discussing modules
|
|
<acme> |
akpm: have you seen DaveM's take on modules for complex subsystems like IPv6 and IPv4?
|
|
davidvh (~David@dhcp14096.mccollum.fsu.edu) has joined channel #lse
|
|
<viro> |
willy: yup
|
|
<hanna> |
maybe next call will be earlier so europeans/aussies can make it..
|
|
<akpm> |
but there are issues wrt alleged overall loss of functionality compared to 2.4..
|
|
<acme> |
IPv4 is not an issue right now, its not modular anyway
|
|
<akpm> |
acme: yes, dave mentioned two-stage unload
|
|
<akpm> |
next?
|
|
<acme> |
but IPv6 has just too many dynamically allocated structs to do tons of __module_get/put
|
|
<acme> |
I'll get his unpublished doc on modules and post here
|
|
<ak> |
not unloading ipv6 would also be no regression - it is not unloadable in 2.4
|
|
<akpm> |
what ak said.
|
|
<acme> |
ak: agreed
|
|
<akpm> |
net/*/netfilter/
|
|
<ak> |
just bridge etc.but these can be made ununloadable too
|
|
<willy> |
akpm: also rusty
|
|
<akpm> |
that's rusty stuff, under control.
|
|
<akpm> |
sound/
|
|
<phillips> |
the original time is ok for europe, except 9 to 5ers
|
|
<akpm> |
rmk: several OSS drivers for SA11xx-based hardware in need of
|
|
<akpm> |
ALSA-ification and L3 bus support code for these.
|
|
<phillips> |
aussies can make it if they get up early
|
|
<davej_> |
lots of OSS drivers still lacking fixes from 2.4
|
|
<akpm> |
davej_: does that have an owner?
|
|
<willy> |
one plan i heard was to kill OSS for 2.6
|
|
<acme> |
I've heard that as well
|
|
<davej_> |
Adam Belay was supposed to be scooping them up, but I've not seen anything from him
|
|
<rmk> |
more drivers stuff from me... the oss drivers can't go in as is because it seems that the OSS bits seem to use virt_to_bus
|
|
<wli> |
willy: is there functionality loss implied?
|
|
<davej_> |
willy: last I heard was 'some of it.. maybe'
|
|
<acme> |
not sure if alsa replaces everything in OSS tho
|
|
Astro (~astro@2001:618:400:1cec::1) has quit: Quit: The Well Of Wishes awaits in Crypt Of Decay...
|
|
<oxymoron> |
There's some OSS->Alsa breakage, yes.
|
|
<davej_> |
killing it off completely is probably 2.7 material
|
|
<acme> |
davej_: agreed
|
|
<rmk> |
so basically for arm everything has to be alsa.
|
|
<wli> |
kill duplicates?
|
|
mdomsch (~mdomsch@24.28.80.108) has quit: Quit: [BX] The Invisible Man uses BitchX, you just can't see it!
|
|
<oxymoron> |
wli: Mark deprecated, move on?
|
|
<akpm> |
yup
|
|
<akpm> |
global stuff.
|
|
<akpm> |
Lots of 2.4 fixes including some security are not in 2.5
|
|
<akpm> |
main problem: how do we find them again?
|
|
<davej_> |
I've got a bunch of those pending cleaning up/rediffing
|
|
<davej_> |
Alan also has a bunch
|
|
<ak> |
akpm: HZ=1000 :- it made the timer handler bugs with skipping jiffies worse
|
|
<akpm> |
ok
|
|
<badari> |
akpm: I asked Dipankar to follow up
|
|
<davej_> |
going through the old 2.4 commit logs could be worthwhile for someone with too much time on their hands
|
|
<akpm> |
ak: what's the fix for that
|
|
<akpm> |
?
|
|
randy (~rddunlap@65.172.181.6) is now known as randy_gone
|
|
<ak> |
akpm: implement a PLL for the timer interrupt (arjan was talking about that)
|
|
<akpm> |
ak: does the lost_tick stuff help?
|
|
<wli> |
PLL?
|
|
<willy> |
Phase Locked Loop
|
|
<oxymoron> |
wli: phase locked loop
|
|
<ak> |
or lower HZ again
|
|
<ak> |
akpm: only for a single lost tick
|
|
<oxymoron> |
ak: Quick summary of problem again?
|
|
<akpm> |
ak: what is the cause?
|
|
<ak> |
akpm: ACPI or SMM code
|
|
<ak> |
they sometimes add quite long delays with interrupts off
|
|
<oxymoron> |
ak: Missing expiry of timers?
|
|
<willy> |
ACPI is fixable ... SMM less so
|
|
<ak> |
oxymoron: no gettimeofday becomes non monotonous, which breaks applications
|
|
<oxymoron> |
ak: Ahh..
|
|
<akpm> |
hmm, I don't see this being fixed.
|
|
<willy> |
ia64 has a mechanism for ensuring that gettimeofday is monotonous
|
|
<ak> |
akpm: or add hack: single lock for gettimeofday that stops it going backwards
|
|
Lucifair (~luism@212.202.185.200) has left channel #lse: Client exiting
|
|
<ak> |
akpm: but it will cost you because gettimeofday is time critical
|
|
<willy> |
i'm not sure people would like it, it involves a cmpxchg
|
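A cmpxchg-based way of keeping a clock from going backwards, of the general kind willy mentions, could look roughly like the sketch below. It is a userspace illustration with C11 atomics, not the ia64 implementation; monotonic_clamp and last_returned_ns are hypothetical names, and raw_ns stands for whatever possibly-backwards reading the underlying time source produced.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Largest timestamp handed out so far (shared global state). */
static _Atomic uint64_t last_returned_ns;

/* Hypothetical: return raw_ns, or the last value if raw_ns is older. */
static uint64_t monotonic_clamp(uint64_t raw_ns)
{
    uint64_t prev = atomic_load(&last_returned_ns);

    while (raw_ns > prev) {
        /* On failure, prev is reloaded with the current stored value. */
        if (atomic_compare_exchange_weak(&last_returned_ns, &prev, raw_ns))
            return raw_ns;   /* we advanced the shared maximum */
    }
    return prev;             /* reading went backwards: reuse the maximum */
}
```

Every caller writes to one shared word, which is the cost ak is worried about on a hot path, and it requires the caller to be able to update global state.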
|
<wli> |
stanford checks look like they can be easily catalogued
|
|
Intuxicated (~Intuxicat@modemcable029.129-200-24.mtl.mc.videotron.ca) has quit: Read error: Connection reset by peer
|
|
<ak> |
akpm: e.g. it's called for each network packet, so that would need a fastpath (network stack can deal with it going backwards)
|
|
<davej_> |
wli: that stuff is logged on their website (and bugzilla)
|
|
<akpm> |
wli: badari was looking into the Stanford bugs
|
|
rz (roman@scrub.xs4all.nl) has quit: Quit: Client exiting
|
|
<ak> |
but x86-64 has it as vsyscall which cannot do locks
|
|
<badari> |
akpm: hollisb is working on these
|
|
<oxymoron> |
ak: Can we reduce the externally visible time resolution?
|
|
<phillips> |
I'd like to see 2.6.0 guarantee monotonic gettimeofday
|
|
<willy> |
ak: but you can loop till successful like ia64 does
|
|
<ak> |
willy: no, vsyscalls cannot update global state
|
|
<akpm> |
ak: do you think you can come up with a plan for fixing this? John Stultz may be able to help out.
|
|
* jstultz_ wakes up
|
|
<ak> |
akpm: i don't have bandwidth to work on it, sorry.
|
|
<ak> |
just wanted to mention the problem
|
|
<akpm> |
ak: as in: suggest a fix, that's all.
|
|
<gerrit_> |
hbaum, jstultz: does jstultz have bandwidth to dedicate to this?
|
|
<ak> |
akpm: talk to arjan
|
|
<ak> |
he had some ideas
|
|
* akpm nominates jstultz_ ;)
|
|
<acme> |
ok, found: netdev TODO list made by DaveM and with help from a bunch of people
|
|
<akpm> |
moving on, last item..
|
|
<gerrit_> |
akpm: jstultz it is - we'll have to work out details
|
|
<jstultz_> |
noted.
|
|
<akpm> |
64-bit dev_t
|
|
<hbaum_> |
gerrit_: yep jstultz can find the time
|
|
<viro> |
akpm: we need to sort the mess in cdev out before that
|
|
<viro> |
akpm: cdev-cidr will do that
|
|
<aeb> |
Things are in good shape, I think
|
|
<viro> |
aeb: like hell they are
|
|
<aeb> |
Independent of what happens in cdev
|
|
<viro> |
akpm: we _really_ need to get sane API for device number allocation and get it right
|
|
<viro> |
akpm: and that's more serious than cdev
|
|
<viro> |
akpm: what we have now is a joke
|
|
<aeb> |
We have nothing at all
|
|
<akpm> |
viro: can that work be decoupled from simply making dev_t wider to userspace?
|
|
<akpm> |
viro: because it'll take time to weed out remaining broken userspace apps
|
|
<akpm> |
nice to do that in parallel with cdev rework
|
|
<viro> |
akpm: cdev is not a problem
|
|
<gerrit_> |
viro: getting work aka badari's for 5000 devices on IA32 is a major thing for IBM across the board and also for oracle.
|
|
<viro> |
akpm: however, /proc/devices changes will be needed
|
|
<viro> |
akpm: and we'd better get that stuff right.
|
|
<gerrit_> |
viro: we are highly motivated to "do the right thing" for 2.6
|
|
<wli> |
gerrit: IMHO the fix for that is probably not 2.6 mergeable (if mergeable at all)
|
|
<viro> |
wli: ?
|
|
<gerrit_> |
viro: if there are things we need to do to get it right, give us pointers
|
|
<wli> |
gerrit: I've outlined what it is, though.
|
|
<aeb> |
But there is no need to use /proc/devices for anything with major > 255
|
|
colpatch (~mcd@bi01p1.co.us.ibm.com) has quit: Quit: Client Exiting
|
|
<wli> |
viro: stashing info to re-instantiate what are currently pinned dentries/inodes somewhere besides lowmem
|
|
<oxymoron> |
/proc/devices must continue to exist for everything that used it in 2.4.
|
|
<viro> |
oxymoron: I'm not sure
|
|
<wli> |
viro: I'd need a serious sit-down just to work out the fs mechanics to get it plausible, much less working, and it's guaranteed the world will hate it.
|
|
<aeb> |
I gave a list of users of /proc/devices earlier today or yesterday
|
|
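For readers unfamiliar with what the users of /proc/devices do with it: the file lists registered major numbers next to driver names under "Character devices:" and "Block devices:" headings, and tools typically look a major up by name. A minimal parsing sketch, assuming that layout; `SAMPLE` is made-up illustrative content and `parse_proc_devices` is a hypothetical helper, not something from the log.

```python
SAMPLE = """\
Character devices:
  1 mem
  4 tty
 10 misc

Block devices:
  3 ide0
  8 sd
"""


def parse_proc_devices(text):
    """Return {'char': {major: [names]}, 'block': {major: [names]}}."""
    tables = {"char": {}, "block": {}}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "Character devices:":
            current = "char"
        elif line == "Block devices:":
            current = "block"
        elif current is not None:
            major, name = line.split(None, 1)
            # One major can be listed under several names, so keep a list.
            tables[current].setdefault(int(major), []).append(name)
    return tables


def majors_for(tables, kind, name):
    """Find every major registered under a given driver name."""
    return sorted(m for m, names in tables[kind].items() if name in names)
```

With the sample above, `majors_for(parse_proc_devices(SAMPLE), "block", "sd")` yields `[8]`. Any consumer written this way depends on the file keeping its current shape, which is the compatibility burden being argued about.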
tridge (~tridge@fjall.tridgell.net) has joined channel #lse
|
|
<oxymoron> |
aeb: In response to hch? Saw that..
|
|
<viro> |
oxymoron: it's becoming a PITA and it's not about to get prettier
|
|
<viro> |
oxymoron: we can leave it as-is, but that means a massive kludge existing only to keep /proc/devices alive
|
|
<viro> |
oxymoron: really massive kludge
|
|
<akpm> |
I'm confused
|
|
<viro> |
akpm: OK
|
|
<viro> |
akpm: let me put it that way: getting syscall boundary into sane shape wrt 64bit dev_t is easy
|
|
<PugMajere> |
can /proc/devices, or an equivalent file be created with some fancy hooks using /sbin/hotplug and converting /proc/devices into a symlink to somewhere else?
|
|
<wli> |
viro: er, sorry if I wasn't clear on the issue, the dentry cache / inode cache lowmem hit from sysfs et al with insane numbers of devices is crippling to the PAE boxen (space behavior of various things was pretty well analyzed and the 2 or 3 above this were already knocked out of the running). i.e. something nobody (except PAE system vendors) even wants fixed.
|
|
<viro> |
akpm: the stuff that bites is
|
|
<viro> |
a) filesystems that keep dev_t of external log - with not enough width
|
|
<oxymoron> |
viro: Do you have a plan re legacy device numbering?
|
|
<viro> |
b) ioctls
|
|
<viro> |
oxymoron: ?
|
|
<akpm> |
wli: we need to revisit that. Linus was totally against dopey icache/dcache hacks, and when you do the numbers, the VFS cache mem usage wasn't _that_ bad. It was mainly pinned requests and stuff.
|
|
<viro> |
oxymoron: leave it as-is, obviously
|
|
<akpm> |
viro: I thought aeb had basically fixed the ioctls?
|
|
<oxymoron> |
viro: I think we only care about /proc/devices for stuff that -had- an 8:8 number.
|
|
<viro> |
akpm: IOW, when foo_bar_we_are_special_ioctl_wank_wank_wank() passes int and casts it to dev_t - we got a problem
|
|
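The width problem viro points at (a device number squeezed through a field that is too narrow) can be modelled in a few lines. This is only an illustration: the 8:8 split is the legacy layout mentioned above, while the 32:32 split for the wide dev_t is an assumption made for the example, since the log does not state the layout. All function names are hypothetical.

```python
LEGACY_MINOR_BITS = 8   # legacy 8:8 split: 8-bit major, 8-bit minor
WIDE_MINOR_BITS = 32    # assumed 32:32 split for a 64-bit dev_t (illustrative only)


def make_dev(major, minor, minor_bits):
    """Pack major/minor into one integer with the given minor width."""
    if minor >= (1 << minor_bits):
        raise ValueError("minor does not fit in %d bits" % minor_bits)
    return (major << minor_bits) | minor


def split_dev(dev, minor_bits):
    """Unpack an integer device number into (major, minor)."""
    return dev >> minor_bits, dev & ((1 << minor_bits) - 1)


def through_narrow_field(dev, bits):
    """Model storing a device number in a field only `bits` wide."""
    return dev & ((1 << bits) - 1)
```

A legacy number survives the round trip: `split_dev(make_dev(8, 3, LEGACY_MINOR_BITS), LEGACY_MINOR_BITS)` gives `(8, 3)`. A wide number pushed through a 32-bit field does not: `split_dev(through_narrow_field(make_dev(300, 70000, WIDE_MINOR_BITS), 32), WIDE_MINOR_BITS)` gives `(0, 70000)`, silently losing the major. The same silent loss applies to an on-disk field, such as one recording an external log device, that is too narrow.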
<viro> |
akpm: there are remaining instances
|
|
<wli> |
akpm: If it turns out to be necessary I'll probably be on the hook for it, but it's probably just not for mainline.
|
|
<akpm> |
viro: ext3 external journal we can just break. it's a plaything, not serious feature.
|
|
<tytso_> |
akpm: yes, no one should be using it in production.
|
|
<viro> |
akpm: aeb had fixed some, but it's nowhere near completeness
|
|
<akpm> |
viro: if we merge up the minimal 64-bit patch then those things will be noticed sooner, won't they?
|
|
<viro> |
akpm: not sure
|
|
niemeyer (~niemeyer@200.103.138.210) has joined channel #lse
|
|
<viro> |
akpm: sorry - back in a couple of minutes (RL)
|
|
<aeb> |
"noticed" is not my style
|
|
<akpm> |
viro: aeb was reviewing hundreds of ioctls
|
|
<aeb> |
things must be provably correct
|
|
<tytso_> |
akpm: Especially if we alias the legacy devices at 0xFFFFabcd :-)
|
|
yg_home (~yusufg@cm61-10-95-8.hkcable.com.hk) has joined channel #lse
|
|
<akpm> |
aeb: how far along is that review?
|
|
<oxymoron> |
tytso: Interesting. Breaks everything, but interesting.
|
|
<aeb> |
I think I did two-thirds or so, but was unsure how things
|
|
<aeb> |
would go now that viro is back
|
|
<tytso_> |
oxymoron: might as well find them now
|
|
* viro returns
|
|
ak (nadie@217.82.103.67) has quit: Quit: using sirc version 2.151+ssfe
|
|
<aeb> |
However, I am happy to see that everybody seems to agree what to do at the syscall boundary
|
|
<akpm> |
so suppose we were to just merge up the 64-bit patches from -mm. What's bad about that idea?
|
|
<viro> |
akpm: it's a matter of grep, actually - most of uses of dev_t and kdev_t are gone, so it's not too hard to find the rest
|
|
zaitcev (~zaitcev@adsl-66-124-38-163.dsl.sktn01.pacbell.net) has quit: Quit: bye bye
|
|
<viro> |
akpm: mail them to me, OK?
|
|
<akpm> |
viro: sure will.
|
|
<akpm> |
viro: and it sounds like the cdev work, device number allocation, etc is ongoing non-showstopper?
|
|
<viro> |
akpm: cdev work is ongoing and it must be done before 2.6
|
|
<viro> |
akpm: at least on the upper levels
|
|
<akpm> |
viro: what are you up to there?
|
|
<gerrit_> |
viro: is this a multi-month task, though? or is there any chance it'll be done in a month or so?
|
|
<oxymoron> |
gerrit: Seems unlikely with the tty stuff in addition..
|
|
markh (~markh@65.172.181.6) has quit: Quit: Client Exiting
|
|
<gerrit_> |
oxymoron: yeah, I'm adding in my head, hoping the close down is sooner rather than later...
|
|
ajax (~jakorty@208.248.32.211) has joined channel #lse
|
|
<viro> |
akpm: main group is pretty stable and I hope to feed it to Linus RSN
|
|
<viro> |
akpm: that's cdev-cidr and ->i_cdev/->i_cindex stuff
|
|
<akpm> |
ok, thanks.
|
|
<viro> |
akpm: after that we need to fix refcounting for tty_driver (oopsable race, must fix anyway, hopefully about a week until it's merged)
|
|
<akpm> |
anyone else have anything we need to talk about today?
|
|
<viro> |
akpm: then we can do tty/misc/upper levels of sound and hopefully upper level of USB
|
|
<wli> |
viro: is there room for helpers in the equation? You mentioned backports/etc. would be needed?
|
|
ajax (~jakorty@208.248.32.211) has quit: Client Quit
|
|
<viro> |
wli: see above - after the first two parts are in, it branches
|
|
<viro> |
wli: IOW, propagation into subsystems is independent
|
|
<oxymoron> |
wli: Backports for tty fixes, yes.
|
|
tridge (~tridge@fjall.tridgell.net) has quit: Quit: Client exiting
|
|
<viro> |
akpm: USB is a place where we _really_ need to deal with dynamic allocation of device numbers
|
|
<wli> |
viro: That sounds ugly. Is anyone around who can handle working on that level for tty/ldisc/etc. fixes?
|
|
<viro> |
akpm: and that will bite
|
|
<wli> |
(besides you)
|
|
<viro> |
folks, that's #kernel stuff, IMO
|
|
<wli> |
okay
|
|
janetinc (~janetinc@32.97.110.142) has quit: Quit: Client Exiting
|
|
<akpm> |
I think we're done here for today.
|
|
<hanna> |
Great thanks everyone.
|
|
<rmk> |
akpm: power management.
|
|
<viro> |
akpm: ACK
|
|
dmc (~dmc@192.35.232.241) has quit: Quit: using sirc version 2.211+KSIRC/1.2.4
|
|
<akpm> |
thanks folks.
|
|
<hanna> |
Thanks to cryogen and other oftc.net admins!
|
|
<rmk> |
mochel was working on a replacement for the existing stuff with benh.
|
|
jk-jfk (~jfk@pa208.myslowice.sdi.tpnet.pl) has quit: Quit: [BX] Size DOES matter
|
|
jejb (~jejb@64.109.89.110) has quit: Quit: bye
|
|
viro (~al@165.247.161.165) has quit: Quit: Leaving
|
|
<rmk> |
I suspect it's too late for it to go into 2.6 tho.
|
|
<hanna> |
guys we can do this again next week or the week after if need be. We should do the features too at some point.
|
|
<hanna> |
Next time I'm joining from home.. the network is less flaky there!
|
|
<akpm> |
hanna: next week would suit. No point hanging around
|
|
mckenney (~mckenney@bi01p1.co.us.ibm.com) has left channel #lse: Client Exiting
|
|
aeb (~aeb@213.84.53.62) has quit: Remote host closed the connection
|
|
badari (~badari@32.97.110.142) has left channel #lse: Client Exiting
|
|
<hanna> |
akpm, agreed. What is the earliest you could do it to encourage other people to attend?
|
|
<akpm> |
11AM?
|
|
<oxymoron> |
Is that PDT?
|
|
<akpm> |
yup
|
|
<rmk> |
can we have it specified in UTC please? 8)
|
|
<bzzz> |
yes ;)
|
|
<akpm> |
three hours earlier than this one
|
|
<rmk> |
I think most people know the offset of their local timezone from UTC, but not from PDT or some other zone.
|
|
<oxymoron> |
PDT is -7.
|
|
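The conversion rmk asks for is simple arithmetic once the offset is known. A small sketch using a fixed UTC-7 offset as stated above; the calendar date inside `to_utc` is a placeholder, since the log gives no date for the next meeting.

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")  # fixed offset, as stated in the log


def to_utc(hour, minute=0, tz=PDT):
    """Convert a local time of day in `tz` to UTC."""
    # The date is a placeholder; only the time of day matters here.
    local = datetime(2000, 1, 1, hour, minute, tzinfo=tz)
    return local.astimezone(timezone.utc)
```

`to_utc(11).strftime("%H:%M")` gives `18:00`, so the proposed 11AM PDT slot is 18:00 UTC.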
<hanna> |
rmk: ok, sorry. I'll do it then.
|
|
<rmk> |
thanks.
|