Towards global Consensus on Trust #3357
Comments
Consensus ranking: http://www.research.ibm.com/cr_aerobatics/cr.html |
https://www.aaai.org/Papers/AAAI/2004/AAAI04-110.pdf Studies the ranking of m alternatives by n voters, for example m athletes in a sporting event who are ranked by n judges. The paper discusses problems such as cycles in the majority graph. As far as I understand the problem that I will be trying to solve, there are some major differences between that problem and mine. First of all, in the global trust problem not all nodes in the network are ranked by all nodes; each node keeps a ranking of only a subset of the alternatives. Also, the judges in our problem are not external but are themselves part of the alternatives, so their own trust values should be accounted for. |
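To make the voting analogy concrete, here is a minimal sketch of aggregating partial rankings with a Borda-style count. The function name and the positional scoring rule are my own illustrative choices; the consensus-ranking methods in the paper (e.g. Kemeny aggregation) are more involved and NP-hard in general, and handling partial rankings fairly needs more care than this.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Each ranking is a list of alternatives, best first.
    Rankings may cover only a subset of the alternatives."""
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for position, alt in enumerate(ranking):
            # Higher placement earns more points; a partial ranking
            # only awards points to alternatives it actually ranks.
            scores[alt] += n - 1 - position
    # Return alternatives sorted by descending total score.
    return sorted(scores, key=lambda a: -scores[a])

votes = [["a", "b", "c"], ["b", "a"], ["a", "c"]]
print(borda_aggregate(votes))  # -> ['a', 'b', 'c']
```

Note how the second and third voters rank only two alternatives each, which mirrors the situation above where each node only ranks a subset of the network.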
I have created a Trello board to organize my work, track my progress and build a knowledge base. I keep track of all learnings, questions coming up and todos. In the first sprint (until the next meeting we have on February 1) I will try to understand the problem and come up with a proper problem description. |
A Computational Model of Trust and Reputation. What is strategy-proofness? |
I created a GitHub repository for my thesis which I will use instead of the Trello board. I have created some projects (sprints) for the next few weeks and you can follow my progress on the problem description there. |
Please read the pioneering work from Lik Mui at MIT from 2001 onwards
|
Found an interesting TED talk by Rachel Botsman about the sharing economy. |
Progress meeting 1: Towards a problem definition GOAL: Develop a direction for an exact problem definition for my Master thesis. Agenda:
|
Reputation Systems: Facilitating Trust in Internet Interactions Fun read on Reputation Systems, not much hard science in here but just an introduction, by the same guys as the article "Manipulation-resistant reputation systems" which I am currently studying. They define the minimum features of a reputation system as:
I think what we are trying to achieve is to build a reputation system that is somehow application-agnostic. With temporal PageRank we can get the personal view of the reputation of the peers we have interacted with and their peers, but what we are missing is a way to acquire a trustworthy reputation value for unrelated and unknown nodes. Also, we haven't found a way to distribute the reputation yet, which is one of the basic requirements of a reputation system according to the above definition, and similar to what is discussed by Mui. Reputation is quite worthless if it is not spread, because the chance that a node interacts with the exact same node again in the future is very small in a network with many nodes. |
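The "personal view" idea can be sketched as personalized PageRank over a directed interaction graph, where the restart probability is concentrated on the computing node itself. This plain power-iteration version is only illustrative; it is not Tribler's actual temporal PageRank, and all names and parameters are assumptions.

```python
def personalized_pagerank(edges, seed, alpha=0.85, iters=100):
    """Rank nodes from the perspective of `seed` over directed `edges`."""
    nodes = {n for e in edges for n in e}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iters):
        # Teleport mass (1 - alpha) always returns to the seed node,
        # which is what makes the ranking "personal".
        new = {n: (1 - alpha if n == seed else 0.0) for n in nodes}
        for n, r in rank.items():
            targets = out[n] or [seed]  # dangling mass returns to the seed
            share = alpha * r / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank

edges = [("me", "alice"), ("alice", "bob"), ("me", "bob"), ("bob", "me")]
ranks = personalized_pagerank(edges, seed="me")
```

Here "bob" ends up ranked above "alice" because he receives trust both directly from "me" and indirectly via "alice", which is exactly the transitive-trust intuition discussed above.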
REGRET: Reputation in gregarious societies (2001 paper) |
Single Internet-of-Trust: |
"Consensus ranking" is a solid step forward for global trust. Spreading individual non-verifiable trust rankings is realistic to implement and deploy. A superior approach would be to have objective and verifiable rankings which are spread. We might need local aggregation and then global aggregation, plus a mechanism to add in order to increase trust. Each "signed trustranking record" is not a subjective opinion: upon request the underlying data can be shared and the calculation repeated. Others who repeat this calculation guarantee correctness with their signature. This moves it from subjective to objective observations with transparency. Each signed trustranking record is also a confession that you possess transaction data, and by the rules of the game you are required to produce it, or suffer the consequences. All part of "mechanism design" to enforce cooperative behavior. Please create a fresh repo with all the above systems in a clickable table, plus fresh papers like the Google reputation work. Thesis existing-systems chapter. Rebooting this older Tribler work. Research question: ToDo: practical side; do Python PageRank on a real public TrustChain data crawl. |
Some considerations on reputation systems and aggregation. According to the above-mentioned taxonomy of reputation systems, we can define the reputation system we are building for Tribler as follows (the six main dimensions for the definition of reputation systems):
Questions derived from the above definition
|
I created a repository with the above mentioned survey of research and commercial systems. I will update them further with more systems and rank aggregation algorithms. |
Impressive progress |
[Outcome of discussion] The block-withholding attack and obtaining full coverage of all transactions are open problems for creating reproducible TrustChain rankings |
Proposal for thesis goal : Verification and aggregation of (partial?) trust rankings in the TrustChain fabric
|
Research idea to protect against the block-withholding attack in general (for our TrustChain context). All directly connected peers need to sign that they copied, verified, and actively disseminate your entire chain up to and including block X: a "block-verification statement". Nodes for which no peers provide a signed block-verification statement are given a low reputation, converting them to honesty. Conclusion: we need to derive a strategy-proof mechanism for dissemination of interaction outcomes (e.g. build a tamper-proof collective memory) and design policies for local interactions from which favorable conditions are created at the global level with evolutionary stability. Now turning to scalability and trust-rank. For planetary-scale cooperation each actor is provided with bounded rationality, limited storage capacity, and a limited population it can observe. We define an unobserved set V of all n nodes. Pairwise auditing:
We assume an audited top-N trust-rank for each actor in the system. This then becomes a homework assignment, as there are several known solutions :-) Leveraging Libtorrent. You are hereby released from any need to integrate your work into Tribler, if and only if you can produce some publishable theoretical proof as described above and do a Pim-like or Kelong-like stand-alone validation experiment with real crawled TrustChain data... |
Evolution of cooperation in the context of Tribler
Now the other forms are somewhat harder to map to the file-sharing example: spatial selection refers to the mechanism of neighborhood help; we rather help people in our vicinity than far away. Group selection is a similar phenomenon, except it is related not to space but to affinity to a group. Finally, kin selection is a similar concept related to family connections. What happens to nodes that have a low reputation? How does reputation help us gain benefit in the application? What is the incentive to gossip the reputation of other nodes? What happens if we do not tell the reputation of a node we know? |
[Continued] Evolution of cooperation in the context of Tribler. Sub-questions to answer next: Is a restriction of interactions to nodes with higher reputation a valid option for an incentive to increase reputation? Is a weighted sum a valid option for combining the global and local reputation? (And can we find a better name than global and local reputation?) Is there an incentive not to share trust rankings? Do we need to reward nodes for this? What happens when a node withholds blocks/double spends? Do we constantly need to check the validity of their chains after we had an interaction? If we constantly check the validity, we should be able to detect a double spend and then inform other auditors such that they can remove their rankings, practically removing the reputation of the misbehaving node. |
the audit-your-partners rule: a key mechanism is to require audits, ensuring we create an invisible hand of honesty. The rule: repeated interaction with an agent requires you to be able to produce your cryptographically signed, valid, correct, and honest audit certificate for this agent. After an initial interaction you need to start checking people, and either put a positive or a negative endorsement online. Negative endorsements can simply be caused by failure to send information or non-responsiveness. Failure to produce an audit certificate for any agent you interacted with repeatedly will severely impact your trust level. |
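A toy sketch of such an audit certificate: a signed statement that an auditor has verified a partner's chain up to some block and reached a verdict. HMAC here is only a stand-in for the real public-key signatures TrustChain uses, and all field names are illustrative assumptions, not an actual protocol.

```python
import hashlib
import hmac

def make_certificate(secret_key, partner_id, block_height, verdict):
    """Create a signed audit statement about a partner's chain."""
    message = f"{partner_id}:{block_height}:{verdict}".encode()
    sig = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return {"partner": partner_id, "height": block_height,
            "verdict": verdict, "signature": sig}

def check_certificate(secret_key, cert):
    """Recompute the signature and compare in constant time."""
    message = f"{cert['partner']}:{cert['height']}:{cert['verdict']}".encode()
    expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cert["signature"])

key = b"auditor-secret"
cert = make_certificate(key, "agent-42", 100, "positive")
print(check_certificate(key, cert))  # -> True
```

The point of the signature is that a certificate cannot later be denied or altered: tampering with the verdict invalidates it, which is what makes "failure to produce a certificate" a detectable offense.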
Obtaining honest ratings after transactions has proven to be a hard problem. A solution to marketplace information asymmetries is to have trading partners |
Thanks for the suggestions! I think I was misled when thinking of the monotonically increasing value in order to make sharing strategy-proof. Instead, if pairwise audits are the only way of exchanging data and each endorsement block records the blocks exchanged, then we can see exactly which data an agent has at any point in time. So, when we require the full chain, we see not only the agent's interaction records but also the record-exchange records, so we know which data the agent possesses. So when we request data we can check whether the agent shared all of it. Experiments
If all these experiments work I have to think about how to achieve Sybil resistance. As for the audit-your-partners rule: at some point in the future we will need to make sure that interaction partners can comprehend why the other is not sharing data or is sharing only some specific amount. Basically, not sharing data can either be classic "defection" or it can be "cooperating" by not giving resources to people that don't deserve them. This comprehension is only possible if the requester can make the same calculation as the responder and check that the responder, calculating with the given data, actually concludes that the requester deserves some amount of resources. Therefore I would suggest performing the audit before a transaction. This gives the requester an opportunity to prove her trustworthiness and makes sure both parties agree on the bandwidth and amount of resources to share. |
Trust research spans the sciences: economics, the iterated prisoner's dilemma (computer science), and the evolution of cooperation (biology). Social scientists from the Stanford Innovation Lab are discussing mechanism design. Talk: "Harnessing Gaming Technology for Peace: A Conversation with Margarita Quihuis", discussing a social operating system based on religion, legal code, and mechanism design in general. The lab founder's 1999 paper uses credibility instead of trust. His 2002 paper with 4000+ citations, "Persuasive technology: using computers to change what we think and do":
Thesis framing:
@jangerritharms took the idea to the next level: using it as the exclusive dissemination mechanism. By forcing peers to publicly disclose what they know about all others, it is trivial to detect forks, fakes, and fraud. Endorsements are on-chain. This is the key strategy of our lab goal of "Distributed Trust Design": change the rules of the game. Take this to the extreme and make it asymmetric, almost unfair, and virtually impossible to attack. Contributing is not sufficient; only count contributions endorsed by the community-at-large. Reciprocity as endorsement incentive. ToDo:
|
Beyond game theory: cryptographically enforced two-way endorsements. |
reformulating... Endorsements enable audits of auditors! Endorsements are on-chain. The topic of double-spend detection time distracts from the advantage of this algorithm. Reading: reputation mechanisms. All honest nodes have an incentive to detect fraudulent behavior, as they can become a victim. Transactions are disseminated (gossip or crawling) and each node will report a double spend to others. Assumption 1: a double-spend transaction without a check is indistinguishable from other transactions. Assumption 2: a malicious agent cannot influence the dissemination of his transactions by honest agents (sabotage-spreading assumption). Only by detecting, for some chain, two different transactions with the same sequence number do you detect malicious behavior. Thus, intuitively formulated, all honest nodes collaborate and scan all disseminated transactions in parallel for double spends, resulting in fast detection. Collusion scenario: nodes can collude and agree to conduct double-spending attacks and never disseminate these transactions (not trivial how to exploit this). Is this a variant of the coupon collector's problem? With the introduction of locality, a bias towards dissemination and endorsement means disconnecting from the network size: it scales! It becomes constant; instead of depending on n, it depends on the average number of interaction partners. (No?) progress with random interactions, but for fast-mixing graphs it's progress? Experiment brainstorm:
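The detection rule above reduces to a simple scan: a double spend surfaces as two different blocks signed by the same public key carrying the same sequence number. A minimal sketch, assuming blocks are plain tuples (field names are illustrative, not TrustChain's actual block format):

```python
def find_double_spends(blocks):
    """blocks: iterable of (public_key, sequence_number, payload) tuples.
    Returns the (public_key, sequence_number) positions where two
    different blocks claim the same slot in one agent's chain."""
    seen = {}
    conflicts = []
    for pk, seq, payload in blocks:
        key = (pk, seq)
        if key in seen and seen[key] != payload:
            # Same chain position, different content: a double spend.
            conflicts.append(key)
        else:
            seen[key] = payload
    return conflicts

blocks = [("alice", 1, "tx-a"), ("alice", 2, "tx-b"),
          ("alice", 2, "tx-b'"),  # conflicting block at sequence 2
          ("bob", 1, "tx-c")]
print(find_double_spends(blocks))  # -> [('alice', 2)]
```

Re-receiving an identical copy of a block is harmless under this rule; only a *different* payload at an occupied position counts, which matches Assumption 1 above (a double spend is indistinguishable until its twin arrives).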
|
Lately I felt like I was bothering too much with details and lost the overview of what we were trying to achieve. That's why I went back to the very first entry and traced back the steps by which I got to the latest considerations. I wrote it down and it ended up as a kind of summary of the theoretical considerations of the last months. I hope it's clear and we can find any logical faults in the argumentation, so feedback is much appreciated. How did we get here? We started out with the goal to work towards a consensus on trust for the TrustChain architecture. We considered different concepts of trust, analyzed different reputation systems and researched ways of building a Sybil-resistant network. The goal was to combine multiple local rankings of trustworthiness of agents to better approximate a global ranking of trustworthiness. But in order to combine rankings we should first check that the other agent is not lying about the ranking, so we need a verification step. The current architecture of TrustChain does not allow a simple verification because rankings are created based on the complete database instead of the agent's personal chain. In other words, if we see the trust ranking as a pure function of the agent's state (i.e. the same state of the agent returns the same ranking), TrustChain does not record the complete agent's state, because the state consists of the agent's transactions and the agent's knowledge of the network. Agents need to agree on a single view of the network in order to agree on each other's or a combined ranking. But without any "proof-of-knowledge" agents can decide not to share records of other agents, because any reputation those other agents obtain in the eyes of the calculating agent through those records can put those agents above the agent that is sending the information. Hence it is not advantageous to share all information. The solution is to make truthful sharing of records incentive-compatible.
We can achieve this by recording not only transactions on the personal chain but any change to the record database. We need to create a new block type (exchange block) which includes all records sent to and received from any other agent. If this structure is in place the following will be possible: given the complete chain of an agent, we have access to the same knowledge as that agent. However, storing all transactions of one agent in an exchange block would make that block huge. Instead, we can continue to store the actual blocks in an undisclosed database and only record a block exchange index in the exchange block. What we require is: given the complete chain of an agent, we should be able to create a block index which is an exact representation of the blocks in that agent's database. The block index of an agent lists all public keys known to that agent and, for each of those keys, the interval of all known sequence numbers. With the above extension to the TrustChain architecture we are able to enforce truthful sharing of an agent's complete knowledge (the agent's view of the network, i.e. the agent's database). Now the original goal of combining trust rankings becomes theoretically possible. Because the agent's complete knowledge, or state, is on the chain, we are able to verify calculated trust rankings at any time. Now we can envision different mechanisms. For example, an agent periodically stores a ranking on-chain. Another agent can at any point in time, for example after finding the ranking during a data crawl, obtain the full chain of that agent up to the point of the ranking, then request all blocks that are not already in the verifying agent's database and, once received, check the ranking. Once checked, the ranking can be combined into a composite ranking.
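The block index described above can be sketched in a few lines. The (min, max) interval representation is an assumption for illustration; a real implementation would need a richer structure to represent gaps in a partially known chain.

```python
def build_block_index(database):
    """database: iterable of (public_key, sequence_number) pairs.
    Returns, per known public key, the interval of known sequence
    numbers, as described for the exchange block."""
    index = {}
    for pk, seq in database:
        lo, hi = index.get(pk, (seq, seq))
        index[pk] = (min(lo, seq), max(hi, seq))
    return index

db = [("alice", 1), ("alice", 2), ("alice", 3), ("bob", 5), ("bob", 6)]
print(build_block_index(db))  # -> {'alice': (1, 3), 'bob': (5, 6)}
```

Because the index is an exact function of the database, two agents can compare indices cheaply to decide which blocks to exchange, instead of shipping whole databases.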
Pairwise Record Exchange and Trustworthiness Endorsement. Instead of the above mechanism we can also consider a stronger mechanism: Pairwise Record Exchange and Trustworthiness Endorsement (in the following referred to as PRETEND; maybe we could call it PRoTEcT - Pairwise Record Tender and Endorsement of Trustworthiness; the c in proteCt is missing but who cares, sounds cool and fits the context 😛). Before any transaction, two agents need to agree on the state of the network. For that, both exchange their full chains and, from the difference in block index, calculate and exchange any missing blocks (from their own chains, or chains of other agents they know about), such that both obtain the exact same block database. Also, both verify that the new state is still valid (no double spending, no missing blocks in chains) and that both indeed shared all data. If both agree on the new database, they create a new exchange block and both sign it. After this the transaction can be conducted as normal. Instead of explicitly calculating and combining trust rankings we only combine the data which is the input to the ranking function. Which reputation function is applied, or whether a function is applied at all (applications other than reputation systems are possible), is not the concern of this mechanism. This way, all transactions appear only once in each calculated ranking. When combining multiple rankings otherwise, it could become difficult to combine them properly without weighting transactions multiple times. Verifying application-specific behavior. Depending on the application, the endorsement can entail more application-specific verifications as well. For example, in the context of file-sharing, we might want to enforce that an agent does not upload to agents that already have a negative balance of more than 1 GB. Verifying endorsements.
A first obvious objection against this mechanism is that agents can decide to falsely verify each other's chains, even though they are actually corrupt. However, this mechanism and the addition to the TrustChain fabric of storing the database index not only allow checking the correctness of the chain but also allow verifying the correctness of all endorsements themselves. Overhead. The PRETEND mechanism introduces a lot of overhead. First, with the mechanisms described above, we can only conduct transactions if we have the complete data of another agent. This can create high requirements for the storage capacity of agents if no mechanism is added for removing data from the database. Also, the verification of all data of another agent can constrain the possible throughput if agents run the verification on devices with little computing power. Locality. This consideration leads to the possible introduction of locality. Agents would be more willing to interact with agents that have a similar state because they need to perform less work during the record exchange. This might help us in the prevention of Sybil attacks because locality cannot be faked and agents in the vicinity are finite. Bootstrapping. Another implication is that new agents need to obtain a lot of data when joining the network in order to interact with incumbent agents. Mathematical model. In order to give this work some theoretical depth we need to define a model showing how our approach differs from existing solutions. While at first I mostly looked at reputation systems, the mechanism described above is not necessarily concerned with reputation, only if an actual reputation function is applied. Also, the model from [Seuken](https://dash.harvard.edu/bitstream/handle/1/11856150/Seuken_AccountingMechanisms.pdf).
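The core of the PRETEND record exchange can be sketched with sets of block identifiers: both agents compute what the other is missing, exchange exactly that, and only sign the exchange block once both hold identical databases. All structures here are simplified stand-ins for the mechanism described above, not its actual wire format.

```python
def pairwise_exchange(db_a, db_b):
    """db_a, db_b: sets of (public_key, sequence_number) block identifiers.
    Returns the merged database both agents hold after the exchange."""
    missing_at_a = db_b - db_a  # blocks b must send to a
    missing_at_b = db_a - db_b  # blocks a must send to b
    db_a = db_a | missing_at_a
    db_b = db_b | missing_at_b
    # Both must agree on the new database before signing the exchange
    # block; in the real mechanism this also includes validity checks
    # (no double spends, no gaps in chains).
    assert db_a == db_b
    return db_a

a = {("alice", 1), ("carol", 4)}
b = {("alice", 1), ("bob", 2)}
merged = pairwise_exchange(a, b)
```

After the exchange, both parties know exactly which blocks the other possesses, which is what makes later withholding detectable.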
which was used in the thesis of Pim Otte is not one-to-one applicable to this situation, because we don't necessarily need the function describing the value of the transaction; many values or objects could be connected to each transaction. Rather, this work is concerned with decentralized storage, dissemination of data, and the methods to verify the correctness and completeness of that data. As such, our system is in the same ballpark as other cryptocurrencies or decentralized databases. I could not find a proper description of those types of systems, so I tried to come up with my own definition, of which I am not sure whether it is proper. Decentralized transaction record management mechanism (DTRMM). A DTRMM orders, validates and distributes records of transactions between agents in a transaction network without any central controlling entity. Its tasks can be defined as:
Somehow we should be able to fit TrustChain, Bitcoin and other cryptocurrencies into this definition and compare them, and possibly other systems which are not blockchain-based. Maybe next to the mechanism we should define a decentralized transaction record management system, which is the actual implementation of such a mechanism for a certain network. The system should also handle the actual storage of the data on each agent. Agent state. Similar to the subjective work graph, the agent state is simply the agent's subjective view of the network given any knowledge the agent has obtained so far. So for an agent p, the agent's state is simply the agent-specific subset S_p = <I_p>, where I_p is a subset of I, the set of all transactions in the network. Now we can define the observability property, which says whether the agent's state can be observed or not. In TrustChain only a small subset of the state can be observed, namely the personal transactions of the agent. With our new addition defined above, the state of the agent can be observed completely. To compare: blockchains with global consensus like Bitcoin don't record each single agent's state, but instead assume that agents have almost exactly the same state, so up to some certainty the state is inferable. To be continued ... Experiments. Above we have claimed some desirable properties of our mechanism, which we need to show in experiments. The mechanism should offer good tools for verification of data, which prevents agents from manipulating or withholding data. If any manipulation or withholding of data is found, or an agent did not behave properly given application-specific rules, the agent should be ignored by other agents. A basic setup for multiple experiments could be: create some truthful agents (usually a majority) and some malicious agents and let them perform transactions.
The experiment has a positive result if, after a certain time, the malicious agents have significant problems finding new partners for interactions and as a result have significantly fewer transactions than the truthful agents. Specific experiments could include the following types of malicious agents:
Also interesting is the Sybil attack in this context. Generally, the mechanism is not concerned with the Sybil attack itself, so in a first experiment we can show that the Sybil attack is successful as long as all agents produce correct, signed blocks. However, in a second experiment we can show a certain application in which we keep track of the balance of agents and restrict the balance to a certain negative threshold value. Further, we change the genesis block to start with that negative balance, such that new agents first need to perform real work before consuming work. Because we track what agents know about other agents, agents cannot just perform work for agents with a negative balance, because that would make their behavior invalid and would lead to banishment. I feel like this might lead to some restriction of the success of the Sybil attack. Finally, we need to show that even with the added verification overhead and storage requirements, the horizontal scalability of our system is not diminished. So another experiment should aim at testing the number of transactions in relation to the number of agents. With increasing chain length of all agents, the transaction throughput might be reduced. So not only the transaction throughput with respect to active agents but also with respect to the running time of the experiment (to which the chain lengths are proportional) should be examined. |
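The basic honest-vs-malicious setup sketched above can be prototyped in a few dozen lines. This is a toy model, not the actual experiment code: agent counts, the completeness check, and the withholding behavior are all simplifying assumptions (the withholder hides its latest block, and partners reject any chain that fails the length check).

```python
import random

class Agent:
    def __init__(self, name, withhold=False):
        self.name = name
        self.withhold = withhold
        self.chain = []       # one entry per completed interaction
        self.completed = 0

    def present_chain(self):
        # A withholding agent hides its most recent block, if it has one.
        if self.withhold and self.chain:
            return self.chain[:-1]
        return self.chain

def chain_complete(presented, expected_length):
    return len(presented) == expected_length

def interact(a, b):
    # Each side verifies the other's chain for completeness before
    # transacting; on failure the interaction is rejected.
    if chain_complete(a.present_chain(), len(a.chain)) and \
       chain_complete(b.present_chain(), len(b.chain)):
        a.chain.append(b.name)
        b.chain.append(a.name)
        a.completed += 1
        b.completed += 1

random.seed(1)
agents = [Agent(f"honest-{i}") for i in range(20)]
agents.append(Agent("withholder", withhold=True))
for _ in range(500):
    a, b = random.sample(agents, 2)
    interact(a, b)
```

After the run, the withholder completes at most its very first interaction (while its chain is still empty) and is rejected from then on, while honest agents keep transacting freely, which is the qualitative outcome the experiment is meant to show.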
Protect - solid algorithm work for the core of a thesis. |
Vital dataset we need with real attacks on trust, reputation systems. In online markets, a store’s reputation is closely tied to its profitability. Sellers’ desire to quickly achieve |
Again a short status update: I ran a first simple experiment, similar to the other experiments I plan to do. We have 20 honest agents that perform the PROTECT mechanism as described above (however, still with a simplified verification). There is also one dishonest agent who withholds one block from his chain when engaging in an interaction (when the other agent starts the interaction the agent behaves normally; that is why he still has some interactions). Other agents verify the chain and find it incomplete, therefore they reject the interaction with the agent. Also, I have started writing my thesis.tex; here is the current pdf: |
@jangerritharms's novel idea: devise a mechanism which forces agents to disclose the full historical state of their TrustChain database at each point in time. This enables the detection of historical dishonest behavior by allowing a replay of all historical states and decisions, for instance, helping other dishonest agents in the past. Thesis storyline, especially the problem description:
For instance: the Demers 1987 classic, "Epidemic algorithms for replicated database maintenance". First mention of anti-entropy syncing between agents. Simply sync differences and see how everybody syncs quickly. Great math. Real attack datasets are now available; see these 8000 fake Twitter accounts: https://news.ycombinator.com/item?id=9170433 Is spam a form of Sybil attack? #2547 (btw dishonest agents == dramatic red in thesis figures, green == good) |
Updated report.pdf Quick update: Possible next steps:
|
|
Mental note: huge Trustchain dataset and picture; not (yet) in thesis |
Good luck today Jan! |
Status update This is the current thesis.tex. Next week I will be on vacation, so this is pretty much the work that we can look at at the next meeting on Monday, 16 July. I have made a start on each of the chapters of the final report. I was not yet able to implement all feedback from the previous round, but that will be the next step. |
remarks:
|
Worked mostly on the experiments, updated parts of the introduction and created an example case for my mechanism in chapter 5. |
|
Creating trust through verification of interaction records. Final thesis on official repository. Thesis Abstract |
Related work: 2004 background Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey |
Placeholder issue for master thesis work. End date firm: 16:00, Friday 31 August 2018, Cum Laude potential. Concrete idea: