Answer the question
In order to leave comments, you need to log in
How to collect and aggregate data from different time zones?
Good day.
I am engaged in advertising on the Internet, and a few months ago I decided to write an analytics system for my expenses-incomes, everything seemed simple and easy, we parsed data from the advertising network A B and C into the spend table, then collected everything that was earned in various places into the table revenue. They reduced the debit to the credit and voila, profit.
But in reality, everything turned out to be not so easy and obvious. I will try to make a complete statement of the problem, and perhaps one of you can help me, for which I will be eternally grateful :)
Advertising networks in which we pay for placing our ads are located in different countries and consider all statistics in their time zone (TZ) , without the possibility of changing this very time zone. Thus, at the input we have the following parameters:
Network A - tz -04:00 (New York)
Network B - tz +00:00 (Barcelona)
Network C - tz +03:00 (Moscow)
Network D - tz +07:00 (Thailand)
Absolutely identical situation with services in which we receive money as an affiliate. For clarity, let these be Affiliate Programs (PP):
PP1 -02:00
PP2 +00:00
PP3 +03:00
PP4 +07:00
The task is to collect expenses and income statistics every 5 minutes, calculate based on it profit and other various parameters and coefficients (conversion from purchase to sale, conversion of landing pages, coefficients for AB testing, geographic slices, etc.) and display in one dashboard with beautiful graphs and charts
:): Since all networks have statistics in their own time zone, when collecting for "yesterday" we will get numbers that are not really numbers for "yesterday", because we are in the Moscow time zone, and look at everything from it. In Thai grids, "today" will end when it is 20:00 in Moscow, and all subsequent numbers will be written in "tomorrow", but in MSC it is still "yesterday". When a new day comes in MSC, then "tomorrow", which was for Thailand, will become "today" MSC and Tai, and now in "yesterday" record all income and expenses for those in Bars and New York. And so in a circle.
Clarifications : We do not have the ability in the network interface or PP to get income / expense by the hour for each advertising unit.
Earlier, I came up with something like this: since we collect stats for each promo every 5 minutes, then if at 23:55 promo id=12345 has an expense of $10 and an income of $12, we can record this fact at the database level, having for example, 2 columns with dates, one (localDateTime) is DATETIME in the network \ PP zone, the second column is MSK (mskDateTime). Accordingly, when it is 00:05 in Moscow, the promo expense = 12345 will be $11, the income is still $12, since a new day has begun in Moscow time, we will transfer the amount of the amendment $10 (which ended "yesterday Moscow time") each collection , until the day in the target network\PP changes. But in the end, this scheme is not suitable, if suddenly we did not collect data at the time of the transition of the day in different TZ, and a large error is formed. As well as if we now want to collect and retroactively process the existing statistics.
For example: If at 23:55 Moscow time we had an expense of $10, then in the grid (-04:00 New York) - at 06:55 am Moscow time we will have, for example, $30, and at 7 am Moscow time it will already be $0. Selecting data from the table for "today", something like SELECT * FROM spend WHERE `date` BETWEEN '2021-08-15 00:00:00' AND '2021-08-15 23:59:59' we get bullshit. .. :( because before 7 am there will be a high expense that does not correspond to the reality of the current day and small income, and after 7 am, the expense will disappear entirely, and the income will remain from those BCPs that are in other TZ. If the amendment is $ 10 which I spoke about above will be, then $ 30 on which “yesterday” ended minus $ 10 corrections = $ 20 + $ 0 (already today) will turn out just “today” according to Moscow time, that is, today we spent $ 20 according to Moscow time (knowing that we have a discrepancy of 5 minutes between 23:55 and 23:59:59)
Perhaps I formulated the problem terribly chaotically and nothing is clear. If so, please explain how to compose it :)
PS: I'm waiting for your smart comments, thank you for your time.
Answer the question
In order to leave comments, you need to log in
in such cases, the concept of a trading day tied to one time zone is introduced.
Convert all local time to some single standard. UTC is the best. Operate UTC everywhere in queries, only convert back to your local time when displaying.
UTC is easier to use than say MSK, because conversion from local time to UTC is already built into the language.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question