Bitemporal fields


Bitemporal fields

Chris
Looking at your implementation of bitemporality, I see that you use three fields: changedAt, recordedAt and erasedAt. However, if you compare with http://www.cs.arizona.edu/~rts/tdbbook.pdf , you will find four fields (begin and end of valid time, and begin and end of transaction time).
As a newbie to (bi)temporal databases, I'm getting a strong case of MEGO. Could you answer:
  What are the costs and benefits of having 3 fields vs 4?
  Does it simplify things at the loss of some capabilities?
  Does it turn out the 4th field is implied by the state of the database?

Thanks

Re: Bitemporal fields

roenbaeck
Administrator
We are actually moving towards a two-timestamp implementation for bitemporal modeling in our new concurrent-temporal technique: one timestamp for valid time and one for transaction time, which we refer to as changing time and positing time respectively.

The first advantage is that you never have to execute UPDATE statements against your data when you use single timestamps. Another advantage is that you get a gapless timeline over valid time, which you would otherwise have to ensure using additional logic. Furthermore, you never have to reorganize your data, even if you have late-arriving facts. They can simply be inserted between any two consecutive rows, and the timeline rearranges itself. Finally, it avoids bad modeling practices that would otherwise be an option, such as null values or having to introduce an until-the-end-of-time concept.

The cost is that you get a slightly more complex query at SELECT time. However, since that logic is taken care of in our automatically generated perspectives, it is not a burden on the people writing queries, although it may affect performance.
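To give a feel for the extra logic at SELECT time, here is a minimal sketch using Python's built-in sqlite3 module. It is not the generated perspective code; the table `customer_name`, its columns `changed_at` and `posited_at`, and the helper `name_as_of` are all hypothetical names chosen for illustration. Timestamps are stored as ISO date strings so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_name (
        customer_id INTEGER NOT NULL,
        name        TEXT    NOT NULL,
        changed_at  TEXT    NOT NULL,  -- valid ("changing") time
        posited_at  TEXT    NOT NULL   -- transaction ("positing") time
    )
""")
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?, ?, ?)",
    [
        (1, "Smith", "2010-01-01", "2010-01-05"),
        (1, "Jones", "2012-06-01", "2012-06-02"),
        # A correction recorded later: the 2010 name was really "Smyth".
        (1, "Smyth", "2010-01-01", "2013-03-01"),
    ],
)


def name_as_of(conn, customer_id, valid_at, known_at):
    """Return the name in effect at valid_at, as the database knew it at known_at."""
    row = conn.execute(
        """
        SELECT name
        FROM customer_name
        WHERE customer_id = ?
          AND changed_at <= ?   -- already in effect at valid_at
          AND posited_at <= ?   -- already recorded at known_at
        ORDER BY changed_at DESC, posited_at DESC
        LIMIT 1
        """,
        (customer_id, valid_at, known_at),
    ).fetchone()
    return row[0] if row else None
```

With this data, asking about 2011 as known in 2011 gives "Smith", while asking about 2011 as known in 2014 gives the corrected "Smyth". The sketch ignores retractions and ties, which a real perspective would have to handle.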

You do not lose any capabilities, since, as you suspected, the closing of an interval is implicitly determined by a "later" row for the same identifier. You also save the storage space that two additional columns would require. All in all, single timestamps give less maintenance and fewer headaches, which is why we have chosen this path, as opposed to how it was done in early research on bitemporality.
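As a rough illustration of how the end of an interval can be derived rather than stored, the sketch below (Python with the built-in sqlite3 module) keeps only a start timestamp per row and computes the end from the next row for the same identifier. The table `price` and its columns are hypothetical, and only valid time is shown to keep it short. Note that the late-arriving April row is a plain INSERT; no existing row is touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price (
        product_id INTEGER NOT NULL,
        amount     REAL    NOT NULL,
        changed_at TEXT    NOT NULL   -- start of validity; no end column is stored
    )
""")
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?)",
    [(7, 10.0, "2020-01-01"), (7, 12.0, "2020-07-01")],
)

# A late-arriving fact that falls between the two existing rows.
conn.execute("INSERT INTO price VALUES (?, ?, ?)", (7, 11.0, "2020-04-01"))

# The end of each interval is the earliest later changed_at for the same product.
INTERVALS_SQL = """
    SELECT p.amount,
           p.changed_at,
           (SELECT MIN(n.changed_at)
              FROM price n
             WHERE n.product_id = p.product_id
               AND n.changed_at > p.changed_at) AS valid_until
    FROM price p
    WHERE p.product_id = ?
    ORDER BY p.changed_at
"""
intervals = conn.execute(INTERVALS_SQL, (7,)).fetchall()
for amount, valid_from, valid_until in intervals:
    print(amount, valid_from, valid_until)
```

The first interval now ends on 2020-04-01 instead of 2020-07-01, purely as a consequence of the new row. The open end of the latest interval shows up as None in the query result only; nothing null is stored in the table.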

Re: Bitemporal fields

roenbaeck
Administrator
An unexpected added benefit of the insert-only strategy shows up when your data is stored on Solid State Disks. As it is right now, an SSD can only handle so many erase cycles before it is worn out. An Anchor Modeled database using the new dual-timestamp approach (concurrent-reliance-temporal modeling) will not produce any erase cycles and will therefore be very easy on the somewhat brittle, but very performant, SSD technology.

Another advantage that I failed to mention is of course data integrity. You can switch off UPDATE and DELETE rights at the database level, ensuring that users cannot tamper with the information. It is also possible to keep checksums of tables taken at different times, which you can recalculate at any later time and compare in order to detect integrity breaches. This works because data remains at rest in an Anchor Modeled database: once a row is inserted, it is never altered.
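A small sketch of both ideas, again in Python with sqlite3 and hashlib. Since SQLite has no GRANT/REVOKE, triggers that abort UPDATE and DELETE stand in here for revoking those rights; on a server database you would revoke the rights instead. The table `fact`, the trigger names and the helper `checksum_as_of` are hypothetical. The checksum covers all rows posited up to a cutoff, so later inserts do not disturb it.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact (
        id         INTEGER NOT NULL,
        value      TEXT    NOT NULL,
        changed_at TEXT    NOT NULL,
        posited_at TEXT    NOT NULL
    );
    -- Stand-in for revoked UPDATE/DELETE rights (an assumption for SQLite).
    CREATE TRIGGER fact_no_update BEFORE UPDATE ON fact
    BEGIN SELECT RAISE(ABORT, 'fact is insert-only'); END;
    CREATE TRIGGER fact_no_delete BEFORE DELETE ON fact
    BEGIN SELECT RAISE(ABORT, 'fact is insert-only'); END;
""")


def checksum_as_of(conn, posited_cutoff):
    """SHA-256 over all rows posited on or before the cutoff, in a fixed order."""
    digest = hashlib.sha256()
    rows = conn.execute(
        """
        SELECT id, value, changed_at, posited_at
        FROM fact
        WHERE posited_at <= ?
        ORDER BY id, changed_at, posited_at, value
        """,
        (posited_cutoff,),
    )
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


conn.executemany(
    "INSERT INTO fact VALUES (?, ?, ?, ?)",
    [(1, "a", "2021-01-01", "2021-01-02"), (2, "b", "2021-01-10", "2021-01-11")],
)
snapshot = checksum_as_of(conn, "2021-01-31")  # stored somewhere safe

# Later inserts do not affect the checksum of what was known in January.
conn.execute("INSERT INTO fact VALUES (?, ?, ?, ?)", (1, "c", "2021-02-15", "2021-02-16"))
unchanged = checksum_as_of(conn, "2021-01-31") == snapshot

# An attempt to alter an existing row is rejected by the trigger.
try:
    conn.execute("UPDATE fact SET value = 'x' WHERE id = 1")
    tamper_blocked = False
except sqlite3.DatabaseError:
    tamper_blocked = True
```

If someone did manage to alter or remove an old row, recalculating the January checksum would no longer match the stored snapshot.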