
Optional mandatory locking

Posted Dec 10, 2015 10:24 UTC (Thu) by cuboci (subscriber, #9641)
Parent article: Optional mandatory locking

As it happens, two days ago I read up on file locking on Linux. And the conclusion I came away with was that there is absolutely no reliable way to make sure you're the only one accessing a file. The use case was something like this:

Someone uploads a file to a server. Upon completion some service does something with that file.

On the server side you have to have a way to know when the transfer is complete. In an ideal world the user would either upload the file with a temporary name and then rename it on the server or upload a second 'flag' file indicating completion. Unsurprisingly, (some) users are dumb and incapable of doing either in practice. So you have to fall back on some other mechanism.

I thought I could use locks that I would only be able to acquire once all other processes had finished accessing the file. As it turns out, there's no such mechanism on Linux. Advisory locks are of no use here, and mandatory locks are buggy and hard to use anyway. The only way is to wait for a certain amount of time (say, five minutes or so) and check whether the file has changed in that time. But that can break down due to bad network connections, too.

So, I'm stuck with an impractical way of doing what I want that is also prone to errors just because Linux lacks proper file locking mechanisms. Sad.



Optional mandatory locking

Posted Dec 10, 2015 13:13 UTC (Thu) by philipstorry (subscriber, #45926) [Link] (8 responses)

I'm sure I'll be accused of over-engineering, but...

Use a database?

A database should have all the correct locking you'll need. Granted, you shouldn't put the file being uploaded into the database - but you could use it for the operation status flag.

It's overkill. But after having looked at file locking mechanisms I began to understand why (some) developers use databases so often, and sometimes for what are apparently trivial things.

Optional mandatory locking

Posted Dec 10, 2015 13:27 UTC (Thu) by cuboci (subscriber, #9641) [Link] (7 responses)

This is not about files I generate myself. The files I'm talking about are uploaded by customers. I have no control over that other than to notice a new file is there.

Optional mandatory locking

Posted Dec 10, 2015 15:59 UTC (Thu) by alankila (guest, #47141) [Link] (4 responses)

I'd say that your sftp/whatever server implementation should be able to generate an event when the client has finished uploading a file. This would generally be the case if you used a library that implements the protocol rather than e.g. a separate Unix process that just dumps stuff to the filesystem.

Optional mandatory locking

Posted Dec 10, 2015 19:55 UTC (Thu) by cuboci (subscriber, #9641) [Link] (3 responses)

This is standard OpenSSH SFTP. What event is it able to generate once the upload is complete?

Optional mandatory locking

Posted Dec 10, 2015 20:19 UTC (Thu) by iabervon (subscriber, #722) [Link]

sftp-server logs transactions it performs on behalf of the client. I'm not sure if successful completion is what's at the INFO level or if that would be at a DEBUG level, but this would be a better trigger than any sort of locking, since sftp transfers can fail in the middle, and a locking-based method would either think it was done (and act on partial data) or think it was still going (and wait forever).

Optional mandatory locking

Posted Dec 10, 2015 22:42 UTC (Thu) by rotty (guest, #14630) [Link]

You could also use inotify, for example via incron, to generate an event when a file that was open for writing is closed. There might be gotchas, but in principle, it should work (I've used it for auto-converting files uploaded via SMB).
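A minimal sketch of the close-after-write idea, calling inotify directly through ctypes rather than via incron. The function names `open_watch` and `read_closed_files` are made up for illustration; the constant and the event layout come from `<sys/inotify.h>`. It is Linux-only, and it shares the gotcha raised elsewhere in this thread: an uploader may close and reopen the file several times, and a close also happens when a transfer aborts.

```python
import ctypes
import ctypes.util
import os
import struct

IN_CLOSE_WRITE = 0x00000008  # value from <sys/inotify.h>

# struct inotify_event header: int wd; uint32_t mask, cookie, len;
_EVENT_HEADER = struct.Struct("iIII")

_libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6", use_errno=True)
_libc.inotify_add_watch.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_uint32]


def open_watch(directory):
    """Return an inotify fd that reports files in directory closed after writing."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    wd = _libc.inotify_add_watch(fd, os.fsencode(directory), IN_CLOSE_WRITE)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def read_closed_files(fd):
    """Block until at least one event arrives; return the file names reported."""
    buf = os.read(fd, 4096)
    names = []
    offset = 0
    while offset < len(buf):
        _wd, mask, _cookie, length = _EVENT_HEADER.unpack_from(buf, offset)
        offset += _EVENT_HEADER.size
        raw_name = buf[offset:offset + length]
        offset += length
        if mask & IN_CLOSE_WRITE:
            # The name is NUL-padded to an alignment boundary.
            names.append(os.fsdecode(raw_name.split(b"\0", 1)[0]))
    return names
```

A service would call `open_watch()` once on the upload directory and then loop on `read_closed_files()`, treating each name as "possibly complete" rather than "certainly complete".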

Optional mandatory locking

Posted Dec 12, 2015 11:24 UTC (Sat) by alankila (guest, #47141) [Link]

Probably none, because you are using a system that just dumps stuff to the Unix filesystem, so you are stuck with something like inotify/dnotify or whatever it is called today. Ideally, you'd assemble your own SFTP daemon out of reusable components, rather than using processes that each solve part of the problem and then being stuck trying to discover mechanisms by which they can interoperate.

Optional mandatory locking

Posted Dec 13, 2015 2:55 UTC (Sun) by giraffedata (guest, #1954) [Link] (1 responses)

It's worth noting that even if there were some file locking function that could let you block until the file isn't open for write, relying on that is still a hack in your situation, since there's no reason the program that generates the file, over which you have no control, couldn't open and close the file multiple times in the process.

Optional mandatory locking

Posted Dec 14, 2015 11:55 UTC (Mon) by cuboci (subscriber, #9641) [Link]

I'm aware of that. Problem is, right now I have no choice but to use standard components. Talk about being stuck between a rock and a hard place.

lsof

Posted Dec 10, 2015 20:58 UTC (Thu) by abatters (✭ supporter ✭, #6932) [Link] (2 responses)

Consider scanning /proc/<PID>/fd/* or using lsof to see if the server process still has the file open.
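For what it's worth, a rough sketch of the /proc scan, assuming Linux and enough privilege to read other processes' fd directories (typically root, or the same user as the server process). The function name `pids_holding_open` is invented for this example. Note that it is inherently racy: a process can open the file again right after the scan.

```python
import os


def pids_holding_open(path):
    """Return the set of PIDs that currently have path open, as far as we can see."""
    target = os.path.realpath(path)
    holders = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                # Each entry is a symlink to the file the descriptor refers to.
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.add(int(entry))
            except OSError:
                continue  # descriptor was closed while we were looking
    return holders
```

An empty result means nobody visible to the caller has the file open at that instant, which is a hint that the upload has stopped, not proof that it succeeded.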

lsof

Posted Dec 15, 2015 16:19 UTC (Tue) by k8to (guest, #15413) [Link]

Or, more simply, if you can trust that the filesystem will be local, make use of mtime. Yeah, it's not perfect but you can wait for the file to be 15 seconds stale and it will work as well as the other hacks.
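A small sketch of that mtime check, using the 15-second figure from above as the default. The names `is_stale` and `wait_until_stale` are only illustrative, and the check inherits the caveats already mentioned: it assumes a local filesystem with sane timestamps, and a stalled network connection can look exactly like a finished upload.

```python
import os
import time


def is_stale(path, quiet_seconds=15.0, now=None):
    """Return True if path was last modified at least quiet_seconds ago."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime >= quiet_seconds


def wait_until_stale(path, quiet_seconds=15.0, poll_interval=1.0):
    """Poll until path has gone unmodified for quiet_seconds."""
    while not is_stale(path, quiet_seconds):
        time.sleep(poll_interval)
```

Picking `quiet_seconds` is the whole trade-off: too short and slow uploads get processed half-written, too long and every file waits needlessly.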

lsof

Posted Dec 15, 2015 16:53 UTC (Tue) by k8to (guest, #15413) [Link]

Independently, it may be simpler to use fuser for a single file inquiry than lsof. Personally I generally struggle with the lsof flags, but I may be the odd duck.

Optional mandatory locking

Posted Dec 21, 2015 12:02 UTC (Mon) by oldtomas (guest, #72579) [Link] (2 responses)

One of my favourite options for when the transport is SSH is the "command" feature of authorized_keys.

This way you can hook yourself into the action (and even do different processing depending on your customer's credentials).

This is the way gitolite and friends work. For an example on how to do it with rsync, see [1].

Or set up a gitolite, add a few users, go into ~gitolite/.ssh/authorized_keys and follow the breadcrumbs from there.

Missing piece: convince ssh's sftp module to be called from your wrapper script. But I'd expect it to be sufficiently unixy and well-behaved as to just accept some command line parameters and then take the bulk of communication over stdio.

[1] <http://www.sakana.fr/blog/2008/05/07/securing-automated-r...>
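A hedged sketch of what such a wrapper could look like, whether it is installed through "command" in authorized_keys or as the sftp subsystem. The `SFTP_SERVER` path is an assumption (it varies by distribution), and `run_then_notify` and the `on_session_end` callback are hypothetical names. It only tells you when the whole session has ended, not which files arrived or whether they are complete.

```python
import subprocess

# Assumption: location of the real SFTP server binary; check your distribution.
SFTP_SERVER = "/usr/lib/openssh/sftp-server"


def run_then_notify(server_argv, on_session_end):
    """Run the real server with inherited stdin/stdout/stderr, then fire a hook.

    The child inherits our stdio, so the SFTP protocol traffic passes straight
    through to it. When the client disconnects, the child exits and we call
    on_session_end with its exit status.
    """
    status = subprocess.call(server_argv)
    on_session_end(status)
    return status
```

The wrapper script itself would end with something like `sys.exit(run_then_notify([SFTP_SERVER], my_hook))`, where `my_hook` is whatever kicks off processing of the upload directory.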

Optional mandatory locking

Posted Dec 21, 2015 23:12 UTC (Mon) by nix (subscriber, #2304) [Link] (1 responses)

Your mention of sftp reminds me that insufficient attention is paid to ssh subsystems, probably because they're relatively undocumented. They're not tied into ssh at all -- they're *really* easy to write. A subsystem is just a process whose stdin/stdout/stderr get transparently connected to an SSH stream: all the client end has to do is run ssh -s subsystem_name user@host.

It can be more appealing than authorized_keys commands in some situations (particularly when you want to be able to do this for more than one user on the server without frotzing with all their authorized_keys files).

Optional mandatory locking

Posted Dec 22, 2015 12:55 UTC (Tue) by oldtomas (guest, #72579) [Link]

> A subsystem is just a process whose stdin/stdout/stderr get transparently connected to an SSH stream

So my hunch was right, thanks for clarifying that (gotta love the Unix Way :-)

> It can be more appealing than authorized_keys commands in some situations ([...] without frotzing with all their authorized_keys files)

The authorized_keys part serves a different and highly complementary purpose: it lets different clients do different things depending on their identity (authentication + authorization). The possibility of "hooking in" is just a side-effect.

If you just want to hook in, perhaps replacing the sftp module with an "enhanced" one (which appropriately triggers things on transfer success/failure) would be the most suitable approach, yes.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds