User: Password:
|
|
Subscribe / Log in / New account

UPSERT can be simulated client-side

UPSERT can be simulated client-side

Posted Jun 5, 2014 11:46 UTC (Thu) by andresfreund (subscriber, #69562)
In reply to: UPSERT can be simulated client-side by dskoll
Parent article: PGCon 2014: Clustering and VODKA

Unfortunately you're wrong. That statement isn't safe under concurrency. Consider what happens if:
a) Another transaction inserts a row with a conflicting key inbetween your UPDATE and INSERT
b) If there already is a conflicting row in the table, but you can't see it yet because either the other transaction hasn't committed yet, or you're using REPEATABLE READ and the other transaction has committed after you.


(Log in to post comments)

UPSERT can be simulated client-side

Posted Jun 5, 2014 13:13 UTC (Thu) by dskoll (subscriber, #1630) [Link]

That statement isn't safe under concurrency

Yes, but neither can UPSERT be, can it? An UPSERT will either succeed completely or fail completely. Same with my transaction. The failure modes might be a little different, but you can always contrive situations in which there is a failure.

I guess what I'm asking is: Does the presumably-safer operation of UPSERT under concurrency actually matter in real life? I cannot think of a situation where it does, but maybe I just lack imagination. :)

UPSERT can be simulated client-side

Posted Jun 5, 2014 13:20 UTC (Thu) by andresfreund (subscriber, #69562) [Link]

>> That statement isn't safe under concurrency

> Yes, but neither can UPSERT be, can it? An UPSERT will either succeed completely or fail completely. Same with my transaction. The failure modes might be a little different, but you can always contrive situations in which there is a failure.

No. The upsert we're talking about *does* handle that correctly. We're still arguing about some implementation details but it's definitely possible to do better. The canonical example for that is:
http://www.postgresql.org/docs/current/static/plpgsql-con...

That solution has a relatively high overhead and is not really nice to use, but otherwise it works.

You can do the same from a client connection using savepoints (which internally are used by BEGIN .. EXCEPTION blocks).

> I guess what I'm asking is: Does the presumably-safer operation of UPSERT under concurrency actually matter in real life? I cannot think of a situation where it does, but maybe I just lack imagination. :)

Yes. All the time. If you use something like your code even under moderade concurrency you'll get failures.

UPSERT can be simulated client-side

Posted Jun 6, 2014 18:45 UTC (Fri) by zlynx (subscriber, #2285) [Link]

> Yes. All the time. If you use something like your code even under moderade concurrency you'll get failures.

Is that really a problem? I've written a lot of Perl scripts that use PostgreSQL with very similar transactions, and if it fails I have just done a retry. It has always seemed to work well enough.

Yes, I had to add the retry later after realizing it was a potential problem, but since I check all statements for success and fail or log on errors as all programmers should do, I've never had a problem with thinking it succeeded when it didn't.

UPSERT can be simulated client-side

Posted Jun 7, 2014 9:48 UTC (Sat) by andresfreund (subscriber, #69562) [Link]

> > Yes. All the time. If you use something like your code even under moderade concurrency you'll get failures.

> Is that really a problem? I've written a lot of Perl scripts that use PostgreSQL with very similar transactions, and if it fails I have just done a retry. It has always seemed to work well enough.

There's cases where that's enough. There are others where it's not. Consider the case where the work done in the transaction isn't cheap - retrying the whole thing because of not having an UPSERTish operation is annoying; another case is operations which should have relatively low latency; web frameworks where it's not always easy to retry the whole transaction; ...

You clearly can survive without it, but it does make life easier.

UPSERT can be simulated client-side

Posted Jun 12, 2014 2:41 UTC (Thu) by sbishop (guest, #33061) [Link]

Would you mind helping me understand why dskoll's code wouldn't work? I know why it would be a problem on SQL Server, which is where all my experience is at, but my understanding of PostgreSQL's approach to MVCC tells me that the transaction ought to see a consistent view of the database within one transaction. (In other words, I don't know why the linked-to code loops.)

UPSERT can be simulated client-side

Posted Jun 12, 2014 6:10 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Just imagine what happens if two of these transactions are executing at the same time. You'll end up with two copies of data, since each transaction will see the same view initially.

In practice, you'll likely get a constraint violation error in one of the transactions. That's actually the way UPSERT is usually implemented in Postgres right now.

Ironically, MSSQL does not have this problem because it locks tables, so a second transaction will be waiting until the first transaction is finished.

UPSERT can be simulated client-side

Posted Jun 13, 2014 2:36 UTC (Fri) by sbishop (guest, #33061) [Link]

Thank you, Cyberax. That makes sense.

UPSERT can be simulated client-side

Posted Jun 13, 2014 8:09 UTC (Fri) by intgr (subscriber, #39733) [Link]

> Ironically, MSSQL does not have this problem because it locks tables, so a second transaction will be waiting until the first transaction is finished

Citation needed. This article contradicts your claim: http://www.mssqltips.com/sqlservertip/3074/use-caution-wi...

> MERGE *looks* like it will take care of concurrency for you, because implicitly it seems to a single, atomic statement. However, under the covers, SQL Server really does perform the different operations independently. This means that you could end up with race conditions or primary key conflicts when multiple users are attempting to fire the same MERGE statement concurrently

MSSQL and MVCC

Posted Jun 14, 2014 13:59 UTC (Sat) by oldtomas (guest, #72579) [Link]

According to the wikipedia page on MVCC [1] MS SQL supports MVCC since SQL Server 2005. So they were late at the party (as often) but they seem to be there.

OTOH, given the installations I've seen in the wild (not the real showcases, mind you), most DBAs and application writers seem overly conservative and prefer to not know the feature exists.

UPSERT can be simulated client-side

Posted Jun 5, 2014 13:20 UTC (Thu) by dskoll (subscriber, #1630) [Link]

Yeah, ok, never mind. :) I figured it out.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds