256,338 rows affected.

  • bacon_pdp@lemmy.world
    4 days ago

    Another company that never had a real DBA tell them about _A tables.

    This stuff is literally in the first Database class in any real college.

    This is trivial: before any update or delete, you copy the affected rows of the main table into its _A table. (Let us use table foo with columns row_id, a, b, create_date, create_user_id, update_date and update_user_id in this example.)

    For vc in (select * from foo where a=3) Loop
      Insert into foo_A (row_id, a, b, create_date, create_user_id, update_date, update_user_id, audit_date, audit_user_id)
      values (vc.row_id, vc.a, vc.b, vc.create_date, vc.create_user_id, vc.update_date, vc.update_user_id, ln_sysdate, ln_audit_user_id);
      Delete from foo where row_id = vc.row_id;
    End loop;

    Now you have a driver query that lets you examine exactly the records you are going to update or delete, a guarantee that you can get the old values back, a record of who updated or deleted them, and an audit log for all changes (as you only give accounts insert access to the _A tables and only allow access to the main tables through stored procedures).

    • whats_a_lemmy@midwest.social
      4 days ago

      You can just insert directly into the helper/audit table and delete using it; there is no need for the cursor loop. If you need a handle to go through the records one by one, something else has already gone wrong.
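
A minimal sketch of that set-based approach, written with Python's built-in sqlite3 module instead of the PL/SQL used earlier in the thread so that it runs anywhere. The table layout follows the foo / foo_A example above; the function name `archive_and_delete`, the sample rows and the audit user id 42 are assumptions made up for illustration.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (
        row_id INTEGER PRIMARY KEY, a INTEGER, b TEXT,
        create_date TEXT, create_user_id INTEGER,
        update_date TEXT, update_user_id INTEGER
    );
    CREATE TABLE foo_A (
        row_id INTEGER, a INTEGER, b TEXT,
        create_date TEXT, create_user_id INTEGER,
        update_date TEXT, update_user_id INTEGER,
        audit_date TEXT, audit_user_id INTEGER
    );
""")
# Hypothetical sample data: two rows with a=3, one row with a=5.
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, '2024-01-01', 1, '2024-01-01', 1)",
    [(1, 3, "x"), (2, 3, "y"), (3, 5, "z")],
)
conn.commit()


def archive_and_delete(conn, a_value, audit_user_id):
    """Copy matching rows into foo_A, then delete them, with no row-by-row loop."""
    audit_date = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            """INSERT INTO foo_A
               SELECT row_id, a, b, create_date, create_user_id,
                      update_date, update_user_id, ?, ?
               FROM foo WHERE a = ?""",
            (audit_date, audit_user_id, a_value),
        )
        # Delete "using" the audit table: only rows archived in this batch.
        cur = conn.execute(
            """DELETE FROM foo WHERE row_id IN (
                   SELECT row_id FROM foo_A
                   WHERE audit_date = ? AND audit_user_id = ?)""",
            (audit_date, audit_user_id),
        )
        return cur.rowcount


deleted = archive_and_delete(conn, a_value=3, audit_user_id=42)
```

Because the delete is driven by the rows that actually landed in foo_A, nothing can be removed from foo without an archived copy existing first, which is the same guarantee the cursor loop was after.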

      • bacon_pdp@lemmy.world
        4 days ago

        If you need to speed up your deletes, might I suggest not storing data that you don’t need. It is much faster, cheaper and better protects user privacy.

        Modern SQL engines can parallelize the loop and the code is about enabling humans to be able to reason about what exactly is being done and to know that it is being done correctly.

        • whats_a_lemmy@midwest.social
          3 days ago

          At least in PG, that is explicitly not the case, unless I’m misunderstanding:

          "Similarly, a PL/pgSQL loop of the form FOR x IN query LOOP … END LOOP will never use a parallel plan, because the parallel query system is unable to verify that the code in the loop is safe to execute while parallel query is active."

          https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

          At any rate, I feel like it’s questionable design to have a user making row-by-row decisions on hard deletes.

          • bacon_pdp@lemmy.world
            3 days ago

            The key part is "while parallel query is active".

            Also, you are not really doing hard deletes on the main table, only on the _A table: you can always retrieve the main table values from the _A table, which only drops records based on audit_date once they have aged out, and that is not something the user or even any of the service accounts will have access to do. (Only a specialized cleanup job on a restricted account would have delete permissions on the _A tables, and access to nothing else.)
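
The restore and age-out steps described here could look roughly like the following sketch, again using Python's sqlite3 module for runnability. The tables are trimmed to a few columns for brevity, and the function names, the fixed "now" timestamp and the 365-day retention period are all assumptions for illustration; the thread does not specify any of them.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (row_id INTEGER PRIMARY KEY, a INTEGER, b TEXT);
    CREATE TABLE foo_A (row_id INTEGER, a INTEGER, b TEXT,
                        audit_date TEXT, audit_user_id INTEGER);
""")

# Fixed reference time so the example is deterministic (an assumption).
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO foo_A VALUES (?, ?, ?, ?, ?)",
    [
        (1, 3, "x", (now - timedelta(days=400)).isoformat(), 42),  # aged out
        (2, 3, "y", (now - timedelta(days=10)).isoformat(), 42),   # recent
    ],
)
conn.commit()


def restore_row(conn, row_id):
    """Put the most recently archived version of a row back into foo."""
    with conn:
        conn.execute(
            """INSERT INTO foo (row_id, a, b)
               SELECT row_id, a, b FROM foo_A
               WHERE row_id = ?
               ORDER BY audit_date DESC LIMIT 1""",
            (row_id,),
        )


def purge_aged_out(conn, now, retention_days):
    """The only hard delete: drop audit rows older than the retention window."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:
        cur = conn.execute("DELETE FROM foo_A WHERE audit_date < ?", (cutoff,))
        return cur.rowcount


restore_row(conn, 2)
purged = purge_aged_out(conn, now, retention_days=365)
```

In a real deployment only the restricted cleanup account would be allowed to run something like `purge_aged_out`; SQLite has no account permissions, so that part is not modelled here.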