J
J
Jolt2022-03-31 12:59:35
Python
Jolt, 2022-03-31 12:59:35

How to do an upsert in SQLAlchemy?

It seems to be a common task, but I can’t find a normal solution in the English-speaking Internet.
Suppose we have a standard ETL task run through Apache Airflow.
She writes to the "Statistic" table. The table has uniqueness by (user, date), but we do not want data duplication, why do we need this?

The process is simple:
1) create an orm object

Statistic(user='Олег', date='2022-03-31', value='145')

2) you add it to the session through session.add()
3) you do it in session.commit()

Everything is simple, everything even works.
But if suddenly we want to restart the task, you find yourself in a situation where this does not work, because for Oleg the data has already been written to the database. And the code won't do anything at all.

This is usually called Upsert and in postgresql this is done via
insert into...on conflict do update set value = EXTENDED.value

This can be implemented via But wait, how then: 1) write tests with a mock via inmemory sqlite? there these constructions I suppose will not work? 2) then all the code will have to be rewritten from an elegant addition to the session (where all the magic is done under the hood), to constructions with insert, which in the end looks like "it was easier to immediately write on ordinary text queries" 3) disappears immediately plus alchemy in easy migration code to other databases, say mysql (the problem is similar to point 1) In general, please explain what I'm doing wrong? How to add new statistics for a fictitious Oleg to the session so that value is simply updated there with a new one? from sqlalchemy.dialects.postgresql import insert

Answer the question

In order to leave comments, you need to log in

2 answer(s)
D
Dr. Bacon, 2022-03-31
@bacon

disappears immediately plus alchemy in easy code migration to other databases
Well, here's an easy ORM migration from different DBMS, it's kind of a myth. Each base has its own characteristics that help a lot, as a result they become attached to one. And the bases are not changed so often, and rather this is a more global process than just switching. All the same, ORM is more about working with data at the application code level, and not at the lower level, SQL. And the tests are also better on the base used, otherwise you can get different results. Point 2 is hard to say.

Z
Zolg, 2022-03-31
@Zolg

memory sqlite? there these constructions I suppose will not work?
the construct itself is available in SQLite , and sqlalchemy.dialects.sqlite is supported . To what extent its behavior is identical to postgresql, I can’t tell you, but for simple cases, most likely, yes.
3) disappears immediately plus alchemy in easy code migration to other databases, say mysql (the problem is similar to point 1)
In general, please explain what I'm doing wrong?
You want to use both non-standard SQL syntax at the same time and maintain compatibility between different implementations.
Here, either expand the alchemy for your tasks, addressing it for the general case and individual dialects, or think about whether these upserts are really so massive that optimization for ON CONFLICT will really bring a significant gain compared to the banal preliminary SELECT and INSERT choice or UPDATE on the python side.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question