Temporal Dimensional Modeling

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Temporal Dimensional Modeling

roenbaeck
Administrator
In discussions on the concept of immutability in modeling I started to think that the same ideas we have in Anchor Modeling could be applied in dimensional modeling. Here's an excerpt from those discussions:

What I am trying to say is that the introduction of immutable identities may also help other modeling techniques tackle all the challenges in bitemporal modeling.

Take for example dimensional modeling, and I have not tested this, but it should be possible to introduce immutability in it as well. Of course, you would no longer only talk about facts and dimensions, but also immutable identities and perhaps that breaks protocol. Similar to an anchor, there would be a third type of table containing the immutable identities sitting in between fact tables and dimension tables, referenced by both of them. Facts and dimensions can then change bitemporally and independently of each other. Views can be created to travel to a certain point in bitemporal time, ppFact and ppDim, and a join is resolved simply by doing it naturally:

select ... from ppFact(2001, 9999) f
join ImmutableIdentities i on i.ID = f.ID
join ppDim(9999, 9999) d on d.ID = i.ID

Now comes the good part.

Last year I made a tutorial called "The Ghost in the Database" with respect to anchors in Anchor Modeling. They are an integral part of almost all retrieval queries, but they are hardly ever touched during their execution. This is thanks to a feature of modern query optimizers called table or join elimination, depending on the vendor.

In the dimensional example and query above, if ID is the primary key of ImmutableIdentities, and declared as foreign key in the Fact and Dim tables, then joining over that table has no effect on the result. The optimizer will recognize this and remove that table and join from the execution plan, leaving only Fact and Dim, which will be joined on ID directly.

In other words, your dimensional model should behave just as it did before, even though immutability has been introduced. Its introduction is necessary (in case someone is doubting it) because in Dim, ID alone cannot be the primary key. It will only be a part of the primary key together with the bitemporal timelines. If Fact were to reference that composed primary key directly, then you lose the ability to have Fact and Dim change independently over time.
Reply | Threaded
Open this post in threaded view
|

Re: Temporal Dimensional Modeling

roenbaeck
Administrator
I have put together a script that shows the behavior I described above. It can be found here:

http://pastebin.com/3rNc2yvC

Borrowing concepts from Bitemporal Anchor Modeling I have created "Bitemporal Dimensional Modeling". It makes use of the distinction between immutable and mutable information to introduce a dimensional anchor for every dimension. It uses the rewind and point-in-time parametrized views. It used the proper primary keys together with foreign keys to achieve table elimination (immutable data is not touched during execution of a query). It uses constraints to ensure temporal entity integrity.

Note: This does in no way imply that I favor dimensional modeling over anchor modeling :)