There are two reasons for breaking down an n-way tie into its binary components:
1. When the parts needed to build the relationship that the tie represents do not arrive synchronously. Let's assume there is a relationship (customer, order, invoice). To populate this 3-way tie, all information must be known, but it is not unreasonable to believe that the invoice is produced some time after the customer has placed the order. It may also be important to track customer orders regardless of whether they have been invoiced yet. In that case it is better to decompose the relationship into its binary counterparts (customer, order), (order, invoice), (customer, invoice). The binary tie (customer, order) can then be populated immediately after a customer has placed an order.
2. When you have complex cardinality constraints. Again, assume (customer, order, invoice), with the constraints that a customer may place many orders and an order can belong to only one customer, but an invoice involves exactly one customer and one order. Order is then "many" with respect to the first constraint, but "one" with respect to the second. Theoretically this is not a problem for the 3-way tie, but such constraints are not practically supported in most database engines. Therefore, if you want the database to enforce the constraints, again decompose into (customer, order), (order, invoice), (customer, invoice). For these binary ties, the cardinalities can be implemented in the database as primary keys.
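As a concrete illustration of both points above, here is a minimal sketch using SQLite through Python (the tie and column names are hypothetical, loosely following Anchor naming conventions). The (customer, order) tie can be populated as soon as an order is placed, before any invoice exists, and the primary key on the order side lets the database itself enforce "an order belongs to only one customer":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    -- A customer may place many orders, but an order belongs to
    -- exactly one customer: primary key on the order side.
    CREATE TABLE CU_placed_OR (
        CU_ID INTEGER NOT NULL,
        OR_ID INTEGER NOT NULL PRIMARY KEY
    );
    -- An invoice involves exactly one order.
    CREATE TABLE OR_billed_IN (
        OR_ID INTEGER NOT NULL,
        IN_ID INTEGER NOT NULL PRIMARY KEY
    );
    -- An invoice involves exactly one customer.
    CREATE TABLE CU_billed_IN (
        CU_ID INTEGER NOT NULL,
        IN_ID INTEGER NOT NULL PRIMARY KEY
    );
""")

# The (customer, order) tie is populated immediately when orders are
# placed; the invoice ties can be filled in later, asynchronously.
cur.execute("INSERT INTO CU_placed_OR VALUES (4711, 1)")
cur.execute("INSERT INTO CU_placed_OR VALUES (4711, 2)")  # many orders per customer: allowed

# Tying order 1 to a second customer violates the primary key, so the
# cardinality constraint is enforced by the database engine itself.
rejected = False
try:
    cur.execute("INSERT INTO CU_placed_OR VALUES (4712, 1)")
except sqlite3.IntegrityError:
    rejected = True
```

With a 3-way tie, the "one customer per order" rule would instead have to be enforced in application code or with a more exotic constraint.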
Still, I would argue that if none of the above apply, an n-way tie makes the model more readable and understandable than its binary counterparts.
Thanks Lars. At some point, the time I spend thinking through n-way ties to make sure I haven't omitted or overlooked anything (and that's despite the fact that I am a domain expert who knows, through 25+ years of experience, the content of what I'm modeling) begins to outweigh the benefits of readability. So binary ties it is! Plus, I'm a big fan of normalization.
But that leads me to another question and with my limited DBMS technical knowledge, I need some expert perspective. Is there such a thing as too many ties in which an anchor can participate? I'm not even a third of the way through my model and already I've got one anchor participating in 24 ties. Let's say that number triples by the time I finish! And in an OLTP production environment, I've got thousands of users accessing that particular table through the web application on a daily basis! Does that present any problems?
When you say that they are accessing that table, do you mean the anchor and its attributes, or do you mean the anchor, its attributes, and all adjoined anchors? It's only in the latter case that the number of ties makes any difference. Assuming that is the case, here are a couple of rules of thumb:
OLTP queries are often very selective, in that you want to fetch data belonging to a particular instance, say the customer with customer number 4711. With very selective queries, returning a single row or very few rows, the effect of joining very many tables is almost negligible, so in this case I believe that you would be fine even if joining large parts of the model.
OLAP queries, on the other hand, need to look at a large number of rows and possibly also aggregate them. However, such queries, while very broad, tend to involve small parts of the model. For example, counting the number of orders made per time period involves only a single attribute. A large number of rows is fine for broad queries as long as they involve small parts of the model.
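The contrast between the two query shapes can be sketched on a toy anchor-style schema (again with hypothetical names): the OLTP query joins across tables but touches only a couple of rows, while the OLAP query scans many rows but touches only one attribute table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE OR_DAT_Order_Date (
        OR_ID INTEGER PRIMARY KEY,
        OR_DAT DATE NOT NULL
    );
    CREATE TABLE CU_placed_OR (
        CU_ID INTEGER NOT NULL,
        OR_ID INTEGER NOT NULL PRIMARY KEY
    );
""")
cur.executemany("INSERT INTO OR_DAT_Order_Date VALUES (?, ?)", [
    (1, "2014-01-10"), (2, "2014-01-22"), (3, "2014-02-03"),
    (4, "2014-02-15"), (5, "2014-02-28"), (6, "2014-03-01")])
cur.executemany("INSERT INTO CU_placed_OR VALUES (?, ?)",
                [(4711, 1), (4711, 3), (4712, 2), (4712, 4), (4713, 5), (4713, 6)])

# Selective OLTP query: orders for customer 4711. The join is cheap
# because only a couple of rows qualify.
cur.execute("""
    SELECT t.OR_ID, d.OR_DAT
    FROM CU_placed_OR t
    JOIN OR_DAT_Order_Date d ON d.OR_ID = t.OR_ID
    WHERE t.CU_ID = 4711
    ORDER BY t.OR_ID
""")
oltp_rows = cur.fetchall()

# Broad OLAP query: orders per month. Many rows are scanned, but only
# one attribute table of the model is involved.
cur.execute("""
    SELECT substr(OR_DAT, 1, 7) AS month, COUNT(*)
    FROM OR_DAT_Order_Date
    GROUP BY month
    ORDER BY month
""")
olap_rows = cur.fetchall()
```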
So, where Anchor may suffer is with OLAP-like queries that need to involve large parts of the model. Such queries tend to be quite rare, though, and in a situation where they run too slowly, they can often be precalculated and maintained separately.
There could be several options, like a materialized view, an incrementally loaded data mart, a ROLAP/HOLAP/MOLAP cube, data transferred to a graph database, a data mining model, or a statistical tool like R or Stata, and so on, depending on what fits the purpose best :) All of those are tuned to answer a specific question, though.
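The simplest form of such a precalculation is a hand-maintained summary table kept current by a trigger, so the heavy aggregate never runs at query time. A minimal sketch, with hypothetical names (a production setup would more likely use a materialized view or an incrementally loaded mart, as listed above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE OR_DAT_Order_Date (
        OR_ID INTEGER PRIMARY KEY,
        OR_DAT DATE NOT NULL
    );
    -- Precalculated "orders per month", maintained separately from
    -- the model it summarizes.
    CREATE TABLE Orders_Per_Month (
        month TEXT PRIMARY KEY,
        order_count INTEGER NOT NULL
    );
    -- The trigger keeps the summary current on every insert.
    CREATE TRIGGER maintain_opm AFTER INSERT ON OR_DAT_Order_Date
    BEGIN
        INSERT OR IGNORE INTO Orders_Per_Month
        VALUES (substr(NEW.OR_DAT, 1, 7), 0);
        UPDATE Orders_Per_Month
        SET order_count = order_count + 1
        WHERE month = substr(NEW.OR_DAT, 1, 7);
    END;
""")
cur.executemany("INSERT INTO OR_DAT_Order_Date VALUES (?, ?)",
                [(1, "2014-01-10"), (2, "2014-01-22"), (3, "2014-02-03")])

# The broad aggregate is now a trivial lookup.
cur.execute("SELECT month, order_count FROM Orders_Per_Month ORDER BY month")
summary = cur.fetchall()
```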
While an Anchor model is excellent for storing normalized data and for ad-hoc querying, there are of course situations where you need different technology to answer a particular question. However, as I always say, the most interesting questions are the ones you don't yet know that you will need to ask. No way to be better prepared than to have an Anchor model at hand when that day comes!