Using Anchor Modelling in OLTP environment

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Using Anchor Modelling in OLTP environment

tsaixingwei
Hi,
I'm have been reading about Anchor Modelling and I wonder if it would be advisable to use this modelling technique for designing OLTP databases instead of data warehouses? What would be the pros and cons of doing so?

If AM is suitable for OLTP database design (where it is assumed that records will be inserted, updated and deleted with regularity by the end users), a further question would be how to model a logical 'delete' of an entire record from a 3NF point of view in AM? I'm saying "3NF point of view", because the developers would still be querying the AM designed database using the various views provided by AM that "denormalizes" 6NF to 3NF. But since this is an OLTP database (and not just a data warehouse), there would be a need to map 'delete's and 'updates' from 3NF back into the 6NF as well.

Would the anchor need to have a date column as well in order to track the interval in which it had an existence i.e. when it was created until it was deleted (in all the readings of AM I've come across, the anchor table itself has no date fields unlike the tables for attributes and ties)?

If not, and we just rely on the changedAt/erasedAt columns in the attribute and tie tables, then how do we distinguish between the case of a record being deleted (from a 3NF point of view) versus the setting of all the attributes (and ties) to NULL values (again from a 3NF point of view) at a point of time after creation?

I've read the posts regarding bi-temporal AM modelling and I can't say I have a complete understanding of what was posted regarding that, but it would seem to me that the use case for bi-temporal AM modelling is not targeted at logical deletes of a record from a 3NF point of view, but more about handling the corrections of individual attribute values over time just by using inserts only. Please correct me if I'm wrong about this assertion.

It is also of interest to me how AM would handle the setting of an attribute value to NULL from a 3NF point of view without AM ever allowing NULLs. From a 3NF point of view, the application might want to set a value to NULL to assert that they do not know anything about that attribute yet. Wouldn't allowing NULLs to be inserted into the AM attributes tables as the value be all right provided that the only semantics that the NULL represents would be that it is a "Don't Know" and it would never be missing data (bearing in mind that we are using AM to design a clean fresh OLTP database and not a data warehouse where you might get dirty or missing data ETL'ed from various other OLTP databases?)





Reply | Threaded
Open this post in threaded view
|

Re: Using Anchor Modelling in OLTP environment

tsaixingwei
Some more clarification regarding the last point in my post about allowing the use of NULLs to represent "Don't Know" values:

When the record was created, an attribute of we know the entity (e.g. a person) had a number of cats = 1. Then later it became 3. I now know that some of the cats have had more kittens but also that some of the existing cats have died. But I don't necessarily know how many exactly at the moment. So I might want to insert a NULL value into the "No. of Cats" attribute for the moment, and later update it (with another insert in AM model) to the actual number of cats when I do know the exact number.

I don't know if that made sense, but I'm trying to visualize a situation where an application sets a value to NULL in the 3NF point of view when it wasn't NULL before, and how we would model this accurately in 6NF.
Reply | Threaded
Open this post in threaded view
|

Re: Using Anchor Modelling in OLTP environment

roenbaeck
Administrator
In Anchor Modeling anchors hold identities. A single such identity can be seen as the technical way the database identifies an entity. Whatever values you find in there have no meaning outside the database.

In order to make sense of such an identity outside of the database, you must look at the properties of the corresponding entity. More philosophically, an entity is viewed as the sum of its properties. Usually you will have some mapping between the surrogate identity in the anchor and a natural identifier, a subset of properties through which you know you can identify the entity.  

Deleting an entity in its entirety corresponds to wiping out its natural identifier, not its surrogate. The surrogate is eternal. This is a crucial difference beween surrogate and natural keys that is apparent in 6NF, but jumbled in 3NF.

I hope this answers your question of why there never are any timestamps in anchors.
Reply | Threaded
Open this post in threaded view
|

Re: Using Anchor Modelling in OLTP environment

roenbaeck
Administrator
We represent "Don't know" by the absence of a row in 6NF. In the views (which is what you use to query the anchor model) this will turn into a NULL value, indistinguishable from how it would appear in 3NF.

If you need to distinguish between "Don't know anything" and "Don't know right now, but I ought to know" then we see the latter as information that is part of the domain you are modeling. In other words, it should be modeled explicitly, either as a special value, for example the string "Unknown" or using a separate attribute.

The reason we see it as part of the domain is because the information is valuable to the end user. I would say it's equally likely though that the end user does not put any special interest in unknown values, and that they in such cases only would confuse them.

I know this is stretch for some, and judging by some posts people are even implementing end dating of changing time. Both approaches will work.
Reply | Threaded
Open this post in threaded view
|

Re: Using Anchor Modelling in OLTP environment

tsaixingwei
But how can we (in an OLTP environment) "nullify" an attribute that was previously known?

Let's say that at point in time T1, for an entity E1, we store two attributes A1 and A2 into 6NF tables.

Table for A1 attribute:
====================
#id         A1            validFrom
-------------------------------------
4711       "black"        T1


Table for A2 attribute:
====================
#id         A2            validFrom
-------------------------------------
4711       42             T1




Then at point in time T2, we only store changed attribute values and do not store unchanged attribute values: only A1 has changed but not A2, and the tables becomes like this:


6NF Table for A1 attribute:
====================
#id         A1            validFrom
-------------------------------------
4711       "black"        T1
4711       "red"           T2


6NF Table for A2 attribute:
====================
#id         A2            validFrom
-------------------------------------
4711       42             T1


At point in time T3, we then query using the function that returns the latest values for the E1, then we would see the results (in 3NF):

View
====================
#id         A1            A2
-------------------------------------
4711       "red"         42


Now suppose at T4, I want to "nullify" the value of A2, so that if I were to query using the function that returns latest values for E1, it would return (in 3NF):

View
====================
#id         A1            A2
-------------------------------------
4711       "red"         null


How could we achieve this if null values are not allowed in 6NF? If the attribute type was a String, then perhaps I can store NULL as the string "NULL". What would be the canonical integer value for representing a NULL be for A2? And how can the view make this transparent to the developer without leaking the abstraction through from 6NF to 3NF?