Skip to content
Fix Code Error

SQL find sum of entries by date including previous date

June 28, 2021 by Code Error
Posted By: Anonymous

This might not be possible, but here this is the request I’m trying to translate into SQL:
For every date where at least one entry exists in the ACCOUNT_ENTRY table, I want to find the sum of all entries for all accounts (in the ACCOUNT table) for that date. If an account doesn’t have an entry for that particular date, I want to use the next latest entry up to that point.

Note that I’m currently doing this on an H2 DB, but that could change in the future so I’m trying to stay away from vendor specific Stored Procedures.

Example

ACCOUNT table

ID ACCOUNT_LABEL
1 account 123
2 account 456
3 account 789

ACCOUNT_ENTRY table

ID ACCOUNT_ID ENTRY_DATE BOOK_VALUE MARKET_VALUE
7 1 2021-05-31 100 110
5 1 2021-05-28 90 100
6 2 2021-05-28 70 80
4 3 2021-05-28 50 60
3 1 2021-05-27 80 90
2 2 2021-05-27 60 70
1 3 2021-05-27 40 50

Desired result

ENTRY_DATE SUM_OF_BOOK_VAL SUM_OF_MARKET_VAL
2021-05-27 180 210
2021-05-28 210 240
2021-05-31 220 250
  • 2021-05-27 Book value = 80 + 60 + 40 (Row ID 1 + 2 + 3)
  • 2021-05-27 Market value = 90 + 70 + 50 (Row ID 1 + 2 + 3)
  • 2021-05-28 Book value = 90 + 70 + 50 (Row ID 4 + 5 + 6)
  • 2021-05-28 Market value = 100 + 80 + 60 (Row ID 4 + 5 + 6)
  • 2021-05-31 is equal to the results from 2021-05-28 + the new single new entry from 2021-05-31 (Row ID = 7)
    • Book Value = 70 + 50 + 100 = 220
    • Market Value = 80 + 60 + 110 = 250

Result I can get so far

ENTRY_DATE SUM_OF_BOOK_VAL SUM_OF_MARKET_VAL
2021-05-27 180 210
2021-05-28 210 240
2021-05-31 100 110

This would be easy to do in my application code, but I was hoping to leave the compute to the DB and not transport all the data over to the application. I could also roll data forward, but then I could end up with significant chunks of data which aren’t necessary.

Here’s the query I’m using so far, but it doesn’t handle situations where it has to look back at previous entries.

Query

SELECT
    TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd') AS ENTRYDATE,
    SUM(
        CASE
            WHEN A.JOINT_ACCOUNT = 'TRUE' THEN MARKET_VALUE / 2
            ELSE MARKET_VALUE
        END
    ) AS MARKETVALUE,
    SUM(
        CASE
            WHEN A.JOINT_ACCOUNT = 'TRUE' THEN BOOK_VALUE / 2
            ELSE BOOK_VALUE
        END
    ) AS BOOKVALUE
FROM
    ACCOUNT_ENTRY AE
    INNER JOIN ACCOUNT A ON AE.ACCOUNT_ID = A.ID
    INNER JOIN (
        SELECT
            MAX(ID) AS MAX_ID,
            TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd'),
            AE.ACCOUNT_ID AS AID
        FROM
            ACCOUNT_ENTRY AE
        GROUP BY
            AE.ACCOUNT_ID,
            TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd')
    ) MAX_ENTRIES ON AE.ID = MAX_ENTRIES.MAX_ID
GROUP BY
    TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd')
ORDER BY
    TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd') DESC

This might not be possible, but here this is the request I’m trying to translate into SQL:
For every date where at least one entry exists in the ACCOUNT_ENTRY table, I want to find the sum of all entries for all accounts (in the ACCOUNT table) for that date. If an account doesn’t have an entry for that particular date, I want to use the next latest entry up to that point.

Note that I’m currently doing this on an H2 DB, but that could change in the future so I’m trying to stay away from vendor specific Stored Procedures.

Example

ACCOUNT table

ID ACCOUNT_LABEL
1 account 123
2 account 456
3 account 789

ACCOUNT_ENTRY table

ID ACCOUNT_ID ENTRY_DATE BOOK_VALUE MARKET_VALUE
7 1 2021-05-31 100 110
5 1 2021-05-28 90 100
6 2 2021-05-28 70 80
4 3 2021-05-28 50 60
3 1 2021-05-27 80 90
2 2 2021-05-27 60 70
1 3 2021-05-27 40 50

Desired result

ENTRY_DATE SUM_OF_BOOK_VAL SUM_OF_MARKET_VAL
2021-05-27 180 210
2021-05-28 210 240
2021-05-31 220 250
  • 2021-05-27 Book value = 80 + 60 + 40 (Row ID 1 + 2 + 3)
  • 2021-05-27 Market value = 90 + 70 + 50 (Row ID 1 + 2 + 3)
  • 2021-05-28 Book value = 90 + 70 + 50 (Row ID 4 + 5 + 6)
  • 2021-05-28 Market value = 100 + 80 + 60 (Row ID 4 + 5 + 6)
  • 2021-05-31 is equal to the results from 2021-05-28 + the new single new entry from 2021-05-31 (Row ID = 7)
    • Book Value = 70 + 50 + 100 = 220
    • Market Value = 80 + 60 + 110 = 250

Result I can get so far

ENTRY_DATE SUM_OF_BOOK_VAL SUM_OF_MARKET_VAL
2021-05-27 180 210
2021-05-28 210 240
2021-05-31 100 110

This would be easy to do in my application code, but I was hoping to leave the compute to the DB and not transport all the data over to the application. I could also roll data forward, but then I could end up with significant chunks of data which aren’t necessary.

Here’s the query I’m using so far, but it doesn’t handle situations where it has to look back at previous entries.

Query

SELECT
    TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd') AS ENTRYDATE,
    SUM(
        CASE
            WHEN A.JOINT_ACCOUNT = 'TRUE' THEN MARKET_VALUE / 2
            ELSE MARKET_VALUE
        END
    ) AS MARKETVALUE,
    SUM(
        CASE
            WHEN A.JOINT_ACCOUNT = 'TRUE' THEN BOOK_VALUE / 2
            ELSE BOOK_VALUE
        END
    ) AS BOOKVALUE
FROM
    ACCOUNT_ENTRY AE
    INNER JOIN ACCOUNT A ON AE.ACCOUNT_ID = A.ID
    INNER JOIN (
        SELECT
            MAX(ID) AS MAX_ID,
            TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd'),
            AE.ACCOUNT_ID AS AID
        FROM
            ACCOUNT_ENTRY AE
        GROUP BY
            AE.ACCOUNT_ID,
            TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd')
    ) MAX_ENTRIES ON AE.ID = MAX_ENTRIES.MAX_ID
GROUP BY
    TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd')
ORDER BY
    TO_CHAR(ENTRY_DATE, 'yyyy-mm-dd') DESC

SOLUTION Found

Thanks to @Gordon Linoff for sharing the COALESCE and LAG functions.

First, just to simplify my query, I created a view. This isn’t relevant to the challenge I was facing, but important so that I could eliminate multiple entries for the same account and same day

create view LATEST_ACCOUNT_ENTRY as
select
   max(id) as entry_id,
   account_id,
   to_char(entry_date, 'yyyy-mm-dd') as e_date
from
   account_entry
group by
   account_id,
   e_date

Final Query (going to be simplified)

select
   e_date,
   sum(coalesce(BOOK_VALUE, PREVIOUS_BOOK_VALUE)),
   sum(coalesce(MARKET_VALUE, PREVIOUS_MARKET_VALUE))
from
   (
      select
         dates.e_date,
         acc_id,
         entry_id,
         book_value,
         LAG(book_value) OVER (
            partition by dates.acc_id
            order by
               dates.e_date
         ) as previous_book_value,
         market_value,
         LAG(market_value) OVER (
            partition by dates.acc_id
            order by
               dates.e_date
         ) as previous_market_value
      from
         (
            select
               distinct to_char(entry_date, 'yyyy-mm-dd') as e_date,
               accts.id as acc_id
            from
               account_entry
               cross join (
                  select
                     id
                  from
                     account
               ) accts
         ) dates
         left join (
            select
               lae.ACCOUNT_ID,
               lae.E_DATE,
               lae.ENTRY_ID,
               CASE
                        WHEN JOINT_ACCOUNT = 'TRUE' THEN BOOK_VALUE / 2 
                        ELSE BOOK_VALUE
                    END as BOOK_VALUE,
                    CASE
                        WHEN JOINT_ACCOUNT = 'TRUE' THEN MARKET_VALUE / 2 
                        ELSE MARKET_VALUE
                    END as MARKET_VALUE
            from
               LATEST_ACCOUNT_ENTRY lae
               inner join account_entry ae on lae.ENTRY_ID = ae.id
               inner join account acc on ae.account_id = acc.id
         ) entries on dates.e_date = entries.e_date
         and dates.acc_id = entries.account_id
   )
group by
   e_date

Solution

You can do this by generating rows for all dates and accounts and then using lag(ignore nulls) to get the most recent value. So, for the rows:

select a.account_id, d.entry_date, ae.book_val, ae.market_val,
       coalesce(ae.book_val,
                lag(ae.book_val) ignore nulls over (partition by a.account_id order by d.entry_date)
               ) as imputed_book_val,
       coalesce(ae.market_val,
                lag(ae.market_val) ignore nulls over (partition by a.account_id order by d.entry_date)
               ) as imputed_market_val
from (select distinct account_id from account_entry) a cross join
     (select distinct entry_date from account_entry) d left join
     account_entry ae
     on ae.account_id = a.account_id and
        ae.entry_date = d.entry_date;

Then you can aggregate this:

select entry_date, sum(imputed_book_val), sum(imputed_market_val)
from (select a.account_id, d.entry_date, ae.book_val, ae.market_val,
             coalesce(ae.book_val,
                      lag(ae.book_val) ignore nulls over (partition by a.account_id order by d.entry_date)
                     ) as imputed_book_val,
             coalesce(ae.market_val,
                      lag(ae.market_val) ignore nulls over (partition by a.account_id order by d.entry_date)
                     ) as imputed_market_val
      from (select distinct account_id from account_entry) a cross join
           (select distinct entry_date from account_entry) d left join
           account_entry ae
           on ae.account_id = a.account_id and
              ae.entry_date = d.entry_date
     ) ad
group by entry_date;
Answered By: Anonymous

Related Articles

  • How to represent arrays within ember-data models?
  • Integrating CKEditor with Aurelia
  • Attempt to use Aurelia plugin causing 404 error in browser
  • Using aurelia-cli unable to get bootstrap and a couple of…
  • Avoid creating new session on each axios request laravel
  • How not to get a repeated attribute of an object?
  • Failed to authenticate on SMTP server error using gmail
  • Why is Future[Set[Unit]] not accepted as Future[Unit]?
  • How to track untracked content?
  • Ember-Model: Help Debug - TypeError: cannot read property…

Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.

Post navigation

Previous Post:

R Shiny Line Plot with Multiple Lines

Next Post:

Is there any function to put some value in a pandas df column based on the presense of word in text of another column?

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Get code errors & solutions at akashmittal.com
© 2022 Fix Code Error