At a recent public session on Anchor Modeling we got into a discussion on how an Anchor modeled database could act as a blockchain. Now I am no blockchain expert, but the insert only nature of Anchor, together with the posit/assertion decoupling and reliability should make it a good candidate for replacing a blockchain.
From what I gather there are a few other features of a blockchain that Anchor does not have. To name a few:
blockchain is a distributed database
blockchain encrypts the information in the database
blockchain links blocks by a hash
First, I would like to address the linking of blocks by a hash. This reinforces previous blocks, making it impossible to break the chain without anyone noticing that it has been tampered with. Since Anchor models are insert only, it should be possible to calculate a hash of an entire table after doing an insert. If someone tampers with the data, that hash will no longer be correct, provided that no one can tamper with the hash as well.
However, calculating the hash of an entire table is an expensive operation. Perhaps we could try to imitate the linking instead? Say we create a hash of an anchor and its attributes on the instance level instead. Let's call the first instance A and hash
#A = #(anchor A + attributes A)
being inserted into the database. Now, the next instance could then hash its anchor and attributes along with the hash #A. In other words instance B has
#B = #(anchor B + attributes B + #A)
Then along comes instance C with
#C = #(anchor C + attributes C + #B) = #(anchor C + attributes C + #(anchor B + attributes B + #A))
Well you get the idea. By using the previous hash we would not have to hash every instance again, since they are in some sense "contained" in the progressively built hashes. Now, I am not sure if this holds mathematically, but it seems to work on a conceptual level.
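To make the idea concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hashing function # and a plain string standing in for "anchor + attributes"; the names chain_hash and build_chain are made up for the illustration and are not part of any Anchor tooling.

```python
import hashlib


def chain_hash(payload: str, previous_hash: str = "") -> str:
    """Hash one instance together with the hash of the previous instance."""
    data = f"{payload}|{previous_hash}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads):
    """Return the progressively built hashes, one per instance."""
    hashes = []
    previous = ""  # assumption: the first instance has no predecessor
    for payload in payloads:
        previous = chain_hash(payload, previous)
        hashes.append(previous)
    return hashes


instances = [
    "anchor A + attributes A",
    "anchor B + attributes B",
    "anchor C + attributes C",
]
hashes = build_chain(instances)

# Changing the first instance changes every hash that follows it.
tampered = build_chain(["anchor A + attributes X"] + instances[1:])
```

Each insert only hashes the new instance plus one earlier hash, yet a change to instance A shows up in #B and #C as well, which is the "contained" property described above.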
I am thinking anchors and ties need to have an extra column for the hashes, and the trigger logic could take care of populating the column.
Giving this some more thought from a theoretical perspective, it seems likely that each posit type should have its own hashing chain (rather than whole anchors). Since each posit represents a fact in real life that a positor can have an opinion about, posits become kind of atomic in the theory: indivisible things, atoms of information so to speak. Assertions then become opinions about these atoms.
Today we usually use some integer sequence from which we pick numbers as identities for posits. Since every part of a posit defines the posit itself, the chained hash could be used as a key instead. Let p1 be the first posit for a certain posit type, and p2 the second:
p1 = (<42>, <price>, €2500, '2016-05-30')
#p1 = #(42|price|€2500|2016-05-30|#lag(p1, 1))
where # is a hashing function, | a concatenation operator, and lag(p1, 1) is null, since no posit has yet been memorised.
p2 = (<42>, <price>, €2000, '2016-05-23')
#p2 = #(42|price|€2000|2016-05-23|#lag(p2, 1))
where lag(p2, 1) = p1.
Some assertions can then be made as:
[Lars, #p1, '2016-05-20', 0.9]
[Lars, #p2, '2016-05-20', 0.9]
using the hashes as references to the posits. Would there be any reason to hash the assertions? I don't see any right now.
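As a rough sketch of the notation above in Python, assuming SHA-256 for #, the string "|" for concatenation, and an empty string where lag(p, 1) is null (all three are my assumptions for the illustration, as is the helper name posit_hash):

```python
import hashlib
from typing import Optional


def posit_hash(identity: int, attribute: str, value: str, time: str,
               previous_hash: Optional[str]) -> str:
    """#(identity|attribute|value|time|#lag), with null treated as ''."""
    parts = [str(identity), attribute, value, time, previous_hash or ""]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# p1 is the first posit of its type, so it has no predecessor.
h1 = posit_hash(42, "price", "€2500", "2016-05-30", None)
# p2 arrives second, so its hash includes the hash of p1.
h2 = posit_hash(42, "price", "€2000", "2016-05-23", h1)

# Posits keyed by their chained hash instead of an integer sequence.
posits = {
    h1: (42, "price", "€2500", "2016-05-30"),
    h2: (42, "price", "€2000", "2016-05-23"),
}

# Assertions reference posits by hash: [positor, posit, time, reliability].
assertions = [
    ("Lars", h1, "2016-05-20", 0.9),
    ("Lars", h2, "2016-05-20", 0.9),
]
```

Note that the same posit would get a different hash if it arrived at a different position in the chain, so the key depends on arrival order as well as content.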
This would still be implementable in the database. However, if we are to address the other missing functionalities, I believe we have to think about writing our own software based on posits and assertions, rather than an implementation of these in a traditional relational database. Still, this additional feature alone would surely be useful in the database as well, as long as it is optional when you generate the code.
Commenting on my own post. The numbering of the posits p1, p2, is just the order in which they arrive. They may arrive in any order, and the chain will still fulfil its purpose. However, if an end user is going to verify that no tampering has occurred, this user needs to be able to follow the chain backwards from any given posit. This is not possible unless there is some way to deduce from a given posit what its predecessor is. Writing this I realised that verifying that no tampering has occurred will become costly instead, since you need to verify as many hashes as you have posits. Assuming this is done more rarely than inserts into the chain, it should still be better than calculating a hash of the entire chain at every insert.
Getting back to the point, the integer sequence I wanted to replace with the hash probably still needs to be there. The hash is then better off as metadata of the posit. The integer that is the identity of the posit need not be part of the hash, though.
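A small Python sketch of what such a verification could look like, assuming SHA-256, an in-memory list in place of a table, and an integer identity that gives the arrival order but is left out of the hash. The names link_hash, append_posit and verify_back_from are hypothetical.

```python
import hashlib


def link_hash(fields, previous_hash: str) -> str:
    """Hash the posit's fields together with its predecessor's hash."""
    return hashlib.sha256("|".join([*fields, previous_hash]).encode("utf-8")).hexdigest()


def append_posit(chain, fields):
    """Insert a posit; the integer id is its position, the hash is metadata."""
    previous_hash = chain[-1]["hash"] if chain else ""
    chain.append({
        "id": len(chain) + 1,
        "fields": fields,
        "hash": link_hash(fields, previous_hash),
    })


def verify_back_from(chain, posit_id):
    """Walk backwards from posit_id; return the id of the first posit whose
    stored hash does not match its contents, or None if all of them match."""
    for index in range(posit_id - 1, -1, -1):
        previous_hash = chain[index - 1]["hash"] if index > 0 else ""
        if link_hash(chain[index]["fields"], previous_hash) != chain[index]["hash"]:
            return chain[index]["id"]
    return None


chain = []
append_posit(chain, ("42", "price", "€2500", "2016-05-30"))
append_posit(chain, ("42", "price", "€2000", "2016-05-23"))
append_posit(chain, ("42", "price", "€1800", "2016-06-02"))
```

The walk costs one hash per posit, which matches the cost noted above. It also only catches tampering with the data: someone who can rewrite the stored hashes as well could recompute the chain from the altered posit onwards, so the most recent hash would still have to be kept somewhere out of reach.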
I've been invited to reply to these thoughts.
What are you trying to accomplish by setting up an (Anchor modeled) blockchain?
Blockchain technology is typically used to replace a trusted third party by setting up a peer-to-peer network in which every node is equal. Each node can add transactions to the blockchain database. Many people tend to forget that this only works if there is an incentive for those nodes to maintain a blockchain database. In the Bitcoin blockchain, the nodes' incentive is that they receive 'new' bitcoins.
Basically, blockchain technology is far less efficient than any normal database, and in practice it is not suitable for a commercial business case because it lacks a central authority.
I suppose I am not trying to replace a blockchain, but rather to add certain (desirable?) features of a blockchain to Anchor modeling.
I was made aware of attempts to coerce a blockchain to act as a database, such as being a ledger for other types of transactions than those found within the bitcoin implementation. My thought was that coercing a blockchain is the wrong way to address the underlying issue. If so, Anchor with the addition of some necessary features could instead prove to be the right way. Regardless, being able to have the features of Anchor together with, for example, tampering protection should provide added value.
Tampering protection is indeed a very clear feature of blockchain technology. Since no one is the owner, there is only a (static) rule set with which new transactions must comply in order to be valid. If one peer's transactions do not match these rules, they are ignored by the rest.
Could you give an example of a use case in which you would like to have tampering protection? In my opinion, simply publishing all the transactions to the public is usually enough (for a controlled environment).
I can think of a number of scenarios. You maintain a database of sensitive information, which under normal circumstances users cannot alter. With tampering protection you can rest assured that the integrity is pristine. I still need to think about how to recover from actual tampering, though.
Another scenario would be if you want to distribute some data, say research data, and you want the end users to be able to verify that nothing happened to the data while being sent around.
Perhaps it is even possible for data to experience bit rot, which could then be detected. Again, I am not familiar with how file systems do recovery, so I will read up on that and see if it is something that could be incorporated as well.