
Towards global Consensus on Trust #3357

Open · synctext opened this issue Jan 15, 2018 · 64 comments

synctext commented Jan 15, 2018

Placeholder issue for master thesis work. End date firm: 16:00, Friday 31 August 2018, Cum Laude potential. Concrete idea:

  • We aim to build a global consensus system for trust.
  • A required intermediate step is obtaining lists of trust rankings.
  • Each participant in the network periodically puts the Top-1000 nodes it trusts most on its Trustchain.
  • A signed trustranking record contains a list of known public keys of nodes, ordered by level of trust; nodes with the highest level of successful interaction and trust are listed first (see the sketch after this list).
  • Calculate trustrankings using incremental personalised temporal PageRank (#2805: Incremental update of trust levels in a dynamic blockchain graph).
  • Extend Trustchain on top of IPv8 (#3272: IPv8 testnet with 1000 nodes) to parse and interpret these records.
  • From numerous individual signed trustranking records, calculate an estimate of the global trust consensus (consensus ranking).
  • Who is the most trusted of us all? In future work we can then run a Byzantine Fault Tolerant consensus algorithm for true global consensus on trustranking. NP-hard problem:
We are given a set of N rankings, or permutations, on n objects. These rankings might represent individual preferences of a panel of N judges, each presented with the same set of n candidates. Alternatively, they may represent the ranking votes of a population of N voters. The problem of rank aggregation, or of finding a consensus ranking, is to find a single ranking π0 that best “agrees” with all the N rankings. This process can also be seen as a voting rule, where the N voters’ preferences are aggregated in an election to produce a consensus order over the candidates, the top ranked being the winner.
  • This enables random appointment of a node for some governance role, weighted by your global trust ranking. A fascinating mix of a meritocracy and democratic lottery. All based on the transaction graph and emergent trust network:
    [screenshot: Tribler transaction graph and emergent trust network]
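
A minimal sketch of what such a signed trustranking record could look like. The field names, serialization and the keyed-hash "signature" are my own illustrative assumptions, not the actual TrustChain block format:

```python
# Hypothetical trustranking record; fields and serialization are
# illustrative assumptions, not the TrustChain wire format.
import json
import time
import hashlib
from dataclasses import dataclass, field
from typing import List

@dataclass
class TrustRankingRecord:
    creator: str            # hex public key of the ranking node
    rankings: List[str]     # public keys, most trusted first (Top-N)
    timestamp: float = field(default_factory=time.time)
    signature: str = ""     # filled in by sign()

    def payload(self) -> bytes:
        # Deterministic serialization of everything except the signature.
        return json.dumps(
            {"creator": self.creator,
             "rankings": self.rankings,
             "timestamp": self.timestamp},
            sort_keys=True).encode()

    def sign(self, sign_func) -> None:
        # sign_func stands in for a real signature scheme (e.g. Ed25519);
        # any callable from bytes to a hex string will do here.
        self.signature = sign_func(self.payload())

# Example: a node publishes its Top-3 with a keyed-hash stand-in signature.
record = TrustRankingRecord("aa01", ["bb02", "cc03", "dd04"])
record.sign(lambda data: hashlib.sha256(b"secret-key" + data).hexdigest())
print(record.signature[:16])
```
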
@devos50 devos50 added this to the Backlog milestone Jan 15, 2018
xoriole (Contributor) commented Jan 19, 2018

devos50 (Contributor) commented Jan 19, 2018

@synctext synctext changed the title Solving something with consensus, incremental PageRank, or ... Towards global Consensus on Trust Jan 19, 2018

jghms commented Jan 19, 2018

https://www.aaai.org/Papers/AAAI/2004/AAAI04-110.pdf
Consensus ranking (vote aggregation) like the IBM page (this is the paper the example references).

Studies the ranking of m alternatives by n voters, for example m athletes in a sporting event who are ranked by n judges. The paper discusses different problems, like cycles in the majority graph. As far as I understand the problem that I will be trying to solve, there are some major differences between this problem and mine. First of all, in the global trust problem not all nodes in the network are ranked by all nodes; each node keeps a ranking of only a subset of the alternatives. Also, the judges in our problem are not external but rather part of the alternatives themselves, so their own trust values should be accounted for.
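
To make the rank-aggregation problem concrete, here is a tiny brute-force Kemeny-style aggregation: pick the permutation that minimizes total pairwise disagreement (Kendall tau distance) with all voters. This is only feasible for a handful of alternatives, which is exactly the NP-hardness point; the example data is made up:

```python
from itertools import combinations, permutations

def kendall_tau(r1, r2):
    # Number of pairs that the two rankings order differently.
    pos1 = {x: i for i, x in enumerate(r1)}
    pos2 = {x: i for i, x in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)

def kemeny_consensus(rankings):
    # Try every permutation of the alternatives (feasible only for tiny n).
    candidates = rankings[0]
    return min(permutations(candidates),
               key=lambda pi: sum(kendall_tau(pi, r) for r in rankings))

votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(kemeny_consensus(votes))   # ('a', 'b', 'c')
```

Note that this assumes every voter ranks the same full set of alternatives, which, as observed above, does not hold in the global trust problem.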


jghms commented Jan 19, 2018

I have created a Trello board to organize my work, track my progress and build a knowledge base. I keep track of all learnings, open questions and todos. In the first sprint (until our next meeting on February 1) I will try to understand the problem and come up with a proper problem description.


jghms commented Jan 23, 2018

A Computational Model of Trust and Reputation
Pioneering paper from 2001.

Question to look into: what does strategy-proof mean?


jghms commented Jan 24, 2018

I created a GitHub repository for my thesis which I will use instead of the Trello board. I have created some projects (sprints) for the next few weeks; you can follow my progress on the problem description there.

Repo

synctext (Member Author) commented:

Please read the pioneering work from Lik Mui at MIT from 2001 onwards

L. Mui, M. Mohtashemi, S. Verma (2004) "A group and reputation model for the emergence of voluntarism in open source development," Working Paper. (For citation, please contact lmui.) [pdf]
M. Mohtashemi & L. Mui (2003) "Evolution of indirect reciprocity by social information: the role of trust and reputation in evolution of altruism," Journal of Theoretical Biology, Vol. 223/4, pp. 523-531. [pdf] [word]
L. Mui (2003) Computational Models of Trust and Reputation: Agents, Evolutionary Games, and Social Networks, Ph.D. Dissertation, Massachusetts Institute of Technology. [pdf]
L. Mui, A. Halberstadt & M. Mohtashemi (2003) "Evaluating Reputation in Multi-agent Systems," in Trust, Reputation, and Security: Theories and Practices, R. Falcone, S. Barber, L. Korba, M. Singh (Eds), Springer-Verlag, Berlin, 2003, pp. 123-137. [pdf] [word]
L. Mui (2002) "Toward Realistic Models for Evolution of Cooperation," MIT LCS Memorandum. [PowerPoint] [postscript] [word]
L. Mui & M. Mohtashemi (2002) "Notions of Reputation in Multi-agent Systems: A Review," AAMAS'2002 and MIT LCS Memorandum.
L. Mui, M. Mohtashemi, A. Halberstadt (2002) "A Computational Model of Trust and Reputation," 35th Hawaii International Conference on System Sciences (HICSS). (Best Paper Nominee) [postscript] [word]
L. Mui, M. Mohtashemi, C. Ang, P. Szolovits, A. Halberstadt (2001) "Ratings in Distributed Systems: A Bayesian Approach," Workshop on Information Technologies and Systems (WITS'2001). (Best Paper Nominee) [postscript] [word]
L. Mui, M. Mohtashemi, C. Ang (2001) "A Probabilistic Rating Framework for Pervasive Computing," MIT Student Oxygen Workshop (SOW'2001). [postscript] [word]
L. Mui, C. Ang (2001) "Collaborative Sanctioning: Applications in Restaurant Recommendation based on Reputation," [poster] Autonomous Agents Conference (Agents'2001).
A. Halberstadt, L. Mui (2001) "Group and Reputation Modeling in Multi-Agent Systems," [poster] NASA Goddard/JPL Workshop on Radical Agent Concepts. [postscript] [word]


jghms commented Jan 29, 2018

Found an interesting TED talk by Rachel Botsman about the sharing economy.


jghms commented Jan 31, 2018

Progress meeting 1: Towards a problem definition

GOAL: Develop a direction for an exact problem definition for my Master thesis.
Subgoal: Get feedback on my knowledge of the field, the goals and the current status of research.

Agenda:

  • Discuss my view of the goals, problems and directions (5-slide presentation)
  • Get some feedback
  • Discuss fruitful directions for further research in order to develop a problem definition
  • List of actions until the next meeting
  • Plan the next meeting

Presentation


jghms commented Feb 6, 2018

Reputation Systems: Facilitating Trust in Internet Interactions

A fun read on reputation systems; not much hard science, just an introduction, by the same authors as the article "Manipulation-resistant reputation systems" which I am currently studying. They define the minimum features of a reputation system as:

  • long-lived entities
  • feedback about current interactions is captured and distributed
  • past feedback guides decisions

I think what we are trying to achieve is to build a reputation system that is application-agnostic. With temporal PageRank we can get a personal view of the reputation of the peers we have interacted with, and of their peers, but what we are missing is a way to acquire a trustworthy reputation value for unrelated and unknown nodes. Also, we haven't found a way to distribute the reputation yet, which is one of the basic requirements of a reputation system according to the above definition, and similar to what is discussed by Mui. Reputation is quite worthless if it is not spread, because the chance that one node interacts with the exact same node again in the future is very small in a network with many nodes.


synctext commented Feb 8, 2018

REGRET: Reputation in gregarious societies (2001 paper)
The multi-agent systems community has worked on trust for 20+ years. However, these models are purely mathematical; they rarely get confronted with real-world matters that have existed for decades, like eBay feedback-extortion fraud. Commercial systems are centralized and do not share any of their trust rankings. The open challenge is to connect the commercial and the academic world. In the Tribler team we consistently aim to bridge the gap between theory and reality; for trust models this gap has been wide and deep for 20 years.

Prior work by TU Delft faculty (2014), ETAF: An Extended Trust Antecedents Framework for Trust Prediction. This work nicely decomposes the abstract trust construct into components which can be engineered.
[figure: ETAF, Extended Trust Antecedents Framework]


synctext commented Feb 8, 2018

Single Internet-of-Trust:
A single long-lived identity with a certain reputation in one domain means you exist, have transacted honestly in the past, and have invested effort. We now envision a mechanism which re-uses this prior effort in another context. You start from zero, yet grow trust faster because you have something to lose. If you behave dishonestly in this second domain, it will have consequences. Transparency and consequences breed honesty, and honesty breeds trust. This can be expressed with game theory.


jghms commented Feb 8, 2018

Commercial reputation systems:
[table: overview of commercial reputation systems]

Research reputation systems:
[table: overview of research reputation systems]

Reputation systems: A survey and taxonomy


synctext commented Feb 8, 2018

"Consensus ranking" is a solid step forward for global trust. Spreading individual non-verifiable trust rankings is realistic to implement and deploy. A superior approach would be to have objective and verifiable rankings which are spread. We might need local aggregation and then global aggregation.

Mechanism to add to increase trust. Each "signed trustranking record" is not a subjective opinion, but upon request the underlying data can be shared, and the calculation can be repeated. Others who repeated this calculation will guarantee correctness with the signature. This moves it from subjective to objective observations with transparency. Each signed trustranking record is also a confession that you have transaction data and by the rules of the game you are required to produce it, or suffer the consequences. All part of "mechanism design" to enforce cooperative behavior.

Please create a fresh repo with all above systems in a clickable table+fresh papers like Google reputation stuff. Thesis existing systems chapter. Rebooting this older Tribler work.
| Year | ShortSystemName | Full Paper Title |

Research Question:
with partial knowledge of the individual trustranking records. We can only estimate the global consensus rank. Can we mathematically prove a bound on the accuracy of our global estimate, given certain degree of partial knowledge.

ToDo: practical side and do Python Pagerank on real public trustchain data crawl.
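
A sketch of that ToDo, assuming the crawl has been reduced to (requester, responder, bytes_uploaded) tuples; networkx's plain PageRank is a stand-in for the incremental temporal variant discussed in #2805:

```python
import networkx as nx

# Toy stand-in for real crawled TrustChain blocks.
crawled_blocks = [
    ("aa01", "bb02", 1024),
    ("bb02", "cc03", 4096),
    ("cc03", "aa01", 2048),
]

g = nx.DiGraph()
for requester, responder, up in crawled_blocks:
    # Edge weight accumulates the total bytes uploaded between the pair.
    w = g.get_edge_data(requester, responder, {"weight": 0})["weight"]
    g.add_edge(requester, responder, weight=w + up)

ranks = nx.pagerank(g, alpha=0.85, weight="weight")
for pk, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(pk, round(score, 3))
```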


jghms commented Feb 20, 2018

First results of the crawl data analysis.

This plot shows the chain length for each of the crawled public keys.
[plot: chain length by largest sequence number, per public key]

The shown length is the largest sequence number for each public key. If we instead use the number of blocks that have been recorded for each public key, we get a much lower number, so not all blocks are in the crawl data. In fact, by summing all recorded blocks and summing all largest sequence numbers, we see that only about 45% of all transactions are in the crawl data.
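
The coverage figure can be computed along these lines; the input format of (public_key, sequence_number) pairs is an assumption about the crawl dump:

```python
from collections import defaultdict

crawled = [("aa01", 1), ("aa01", 2), ("aa01", 5), ("bb02", 1)]  # (pk, seq)

max_seq = defaultdict(int)      # implied chain length per public key
block_count = defaultdict(int)  # blocks actually present in the crawl
for pk, seq in crawled:
    max_seq[pk] = max(max_seq[pk], seq)
    block_count[pk] += 1

coverage = sum(block_count.values()) / sum(max_seq.values())
print(f"coverage: {coverage:.0%}")  # 4 of 6 implied blocks -> 67%
```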


jghms commented Feb 22, 2018

Some considerations on reputation systems and aggregation

According to the above-mentioned taxonomy of reputation systems, we can define the reputation system we are building for Tribler along the taxonomy's main dimensions as follows:

  • History: Personal - We have a personal history; each node has crawled a set of blocks, but we cannot be sure that every node has all blocks.
  • Context: Single - We envision a future reputation system for multiple contexts, but in Tribler we have one reputation for data throughput (upload/download), so a single context.
  • Collection: Indirect - We crawl blocks from other nodes; we have not observed all those transactions but have to trust that nodes send proper data. With implicit consensus we can validate the truthfulness of blocks.
  • Representation: Continuous - We have a floating-point value as the output of the PageRank algorithm and combine multiple such values into a locally aggregated continuous reputation value.
  • Aggregation: Flow, ? - I think we will actually have two aggregation steps: first we use the temporal PageRank algorithm to calculate our personal view of the trustworthiness of the surrounding nodes; then, and this is what my thesis will mostly be about, we locally aggregate the votes of a subset of nodes from the network to estimate the global ranking.

Questions derived from the above definition

  • Given an unobserved set V of all n nodes, a set T of all t transactions and m reputation rankings based on unrelated subsets of T, can we put a bound on the distance between the aggregation of o << m rankings and the aggregation of all m rankings, for some aggregation algorithm A? (See the sketch below for an empirical first stab.)
  • In order to estimate an error we need a reference global ranking; how can we calculate such a global ranking?
  • What algorithm should we use to aggregate rankings in a Sybil-resilient way?
  • What measure should we use for the error/distance between locally aggregated and globally aggregated rankings?
  • We don't know how many transactions there have been; how can we know how many blocks give a good estimate?
  • We need a protocol to save, share and validate trust rankings, making them not opinions but provable evidence.
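
As a first empirical stab at the first question (not a proof), one can aggregate only o of m synthetic rankings with a simple Borda count and measure the Kendall tau correlation with the full aggregation. Everything here (Borda as algorithm A, the noise model, the parameters) is a made-up illustration:

```python
import random
from scipy.stats import kendalltau

def borda(rankings, n):
    scores = [0] * n
    for r in rankings:
        for place, node in enumerate(r):
            scores[node] += n - place          # higher score = more trusted
    return scores

def noisy_ranking(truth, swaps):
    # A voter's ranking: the ground truth with some random swaps.
    r = list(truth)
    for _ in range(swaps):
        i, j = random.randrange(len(r)), random.randrange(len(r))
        r[i], r[j] = r[j], r[i]
    return r

random.seed(42)
n, m, o = 20, 100, 10
truth = list(range(n))
rankings = [noisy_ranking(truth, swaps=8) for _ in range(m)]

full = borda(rankings, n)                       # aggregate all m rankings
partial = borda(random.sample(rankings, o), n)  # aggregate only o of them
tau, _ = kendalltau(full, partial)
print(f"Kendall tau between o={o} and m={m} aggregations: {tau:.2f}")
```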


jghms commented Feb 22, 2018

I created a repository with the above-mentioned survey of research and commercial systems. I will update it further with more systems and rank aggregation algorithms.

synctext (Member Author) commented:

Impressive progress


jghms commented Feb 23, 2018

[Outcome of discussion]

  • The block-withholding attack and obtaining full coverage of all transactions are open problems for creating reproducible TrustChain rankings
  • Bounded random walk
  • Take the crawl rate (known-universe fraction) as a parameter for the error on the aggregated rankings
  • GPU-accelerated trust calculations on mobile devices
  • Use the total byte count of the crawled dataset to detect bugs and liars
  • Find suspiciously large transactions -> build filter -> graphics


jghms commented Feb 26, 2018

Proposal for thesis goal: Verification and aggregation of (partial?) trust rankings in the TrustChain fabric
Specific steps:

  1. In order to verify the trustranking of another node, one needs the evidence which was used for the calculation. Sharing one's state of all global chains is probably too much data?! We could use only the sequence number of all nodes, which in our dataset would be 7138 address-integer pairs. Alternatively, we could restrict the number of hops used for the random walk which creates the rankings. Or, if exact verification is too hard, we could do here what we would like to do in the next step for the rankings, namely come up with an error function depending on the collected evidence.
  2. Once we are able to verify the rankings of other nodes, we can aggregate them towards an estimate of the global ranking: the more work we do to aggregate rankings, the more accurate the estimate will be. We need to choose an aggregation mechanism and estimate the error for a certain number of aggregated rankings. The complexity of the aggregation result might depend on the application: if all we want to know is whether it is safe to interact with a node, we could simply use a binary value as the outcome, or a discrete value (like a star rating). If we actually require a ranking as the aggregation outcome, it will be slightly more complicated.
  3. Finally, we should design some experiments to check the real-world potential of this approach to restrict spamming in the Tribler application.


jghms commented Feb 26, 2018

Some more results from the data analysis.
Basic stats:
7,138 nodes (unique public keys, requesters + responders)
1,328,229 blocks
227,724 edges in the undirected transaction graph (unique public key pairs)

Unfiltered mean transaction size: there are two nodes with a huge transaction size, which is obviously not correct.
[plot: unfiltered mean transaction size per node]
The filtered version; the maximum value is 999.
[plot: filtered mean transaction size per node]
Direct neighbors per node:
[plot: number of edges per vertex]

synctext (Member Author) commented:

We crawl blocks from other nodes, we have not observed all those transactions but have to trust that nodes send proper data.

Research idea to protect against the block-withholding attack in general (for our TrustChain context): all directly connected peers need to sign that they have copied, verified, and actively disseminate your entire chain up to and including block X - a "block-verification statement". Nodes for which no peers provide a signed block-verification statement are given a low reputation.
{Obviously Strategy-Proof Mechanisms for IPv8, @qstokkink }

Convert to honesty
At the micro-economic level the majority of (rational) players may be dishonest; a minority of always-honest players will follow the given rules. The dishonest players do not have many opportunities to interact, as honest players do not trust them. Honest players have numerous opportunities to interact and are thriving, as shown by their TrustChain interaction history. The next step is to move from local to global and achieve the desired emergent macro-economic effect: rational players convert to honesty. There is some forgiveness in the strategy of honest agents, and if dishonest players convert, they will receive a large payout in the reward function. They might elect to change their ways for a single round or longer. As shown by recent research, these ideas might be applied to our real economy, dubbed "Industrial Symbiotic Relations". The holy grail is to prove the stability of cooperation against attacks such as an invasion of defectors.

Conclusion: we need to derive a strategy-proof mechanism for the dissemination of interaction outcomes (e.g. build a tamper-proof collective memory) and design policies for local interactions from which favorable, evolutionarily stable conditions are created at the global level.

Now turning to scalability and trust-rank: for planetary-scale cooperation each actor is provided with bounded rationality, limited storage capacity, and a limited population it can observe. We define an unobserved set V of all n nodes.

Pairwise auditing
Alice requests a pairwise audit from Bob. They exchange their observed interaction records. It is determined whether their stored interaction graphs are equal, or which of the two holds a strictly larger subset of all global interactions. Assume Bob is the party with the smaller subset; he calculates the trust rankings, cryptographically signs them and forwards them to Alice. The aim is to prove the error bound for Alice to check such audits. Likewise, Bob will also sign Alice's audit certificate and trust rankings.
Combine this with the strategy-proof dissemination mechanism: after a signed audit you MUST make all interactions which underlie each trust rank available to others.
A secondary effect of pairwise auditing is that Alice and Bob synchronise their records (e.g. the 1987 anti-entropy work). After the merger they will both store and disseminate the longest known chain of everybody.
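
A rough sketch of the merge-and-rank step of such an audit. The block representation, the stand-in ranking function and the keyed-hash "signature" are all illustrative assumptions on top of the prose above:

```python
import hashlib
from collections import Counter

def trust_ranking(blocks):
    # Stand-in ranking: order peers by the number of interactions seen.
    counts = Counter(pk for pk, _seq, _peer in blocks)
    return [pk for pk, _ in counts.most_common()]

def pairwise_audit(alice, bob, sign_key=b"bob-key"):
    # Both parties sync to the union of their records; the (initially)
    # smaller party computes and "signs" the trust ranking.
    union = alice | bob
    ranking = trust_ranking(union)
    signature = hashlib.sha256(sign_key + repr(ranking).encode()).hexdigest()
    return union, ranking, signature

alice = {("aa", 1, "bb"), ("aa", 2, "cc")}   # blocks as (pk, seq, peer)
bob = {("aa", 1, "bb"), ("bb", 1, "aa")}
merged, ranking, sig = pairwise_audit(alice, bob)
print(ranking, sig[:12])
```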

Once we are able to verify the rankings of other nodes we can aggregate them towards an estimate of the global ranking

We assume an audited top-N trust rank for each actor in the system. This then becomes a homework assignment, as there are several known solutions :-)

Leveraging Libtorrent
Both Alice and Bob use Libtorrent to seed their collection of TrustChain records; try to use this trick to seed and exchange records. Maximize the overlap by sorting the single binary blob on public key, and other efficiency measures you can think of.

You are hereby released from any need to integrate your work into Tribler, if and only if you can produce a publishable theoretical proof as described above and do a Pim-like or Kelong-like stand-alone validation experiment with real crawled TrustChain data...


jghms commented Feb 28, 2018

Interaction diagrams for pairwise auditing and how it can be used for transactions.
[diagram: pairwise auditing]
[diagram: transaction protocol]


jghms commented Feb 28, 2018

Evolution of cooperation in the context of Tribler
In Tribler we face the prisoner's dilemma when uploading and downloading data. If I upload (cooperate) and you upload (cooperate), we both gain data at some uploading cost. If I only download (defect), I have no cost and still benefit if you upload (cooperate). One round would be one interaction between two nodes with some amount of data. Usually, though, we have multiple interactions, either between the same nodes or between different nodes; thus we are playing the iterated prisoner's dilemma. According to Nowak there are five mechanisms for the evolution of cooperation: direct reciprocity, indirect reciprocity, spatial selection, group selection and kin selection. Direct reciprocity is when we both upload data to each other; however, due to the asymmetry of data in file sharing this is not always straightforward: I can only upload data that you would like to download. This is where indirect reciprocity comes in: I upload data to you, you upload data to someone else, and I hope that someone will also upload data to me.

Indirect reciprocity: “I help you, somebody helps me.”

The other forms are somewhat harder to map to the file-sharing example: spatial selection refers to the mechanism of neighborhood help; we would rather help people in our vicinity than far away. Group selection is a similar phenomenon, except it is related not to space but to affinity to a group. Finally, kin selection is a similar concept related to family connections.
The goal is to create a Tribler society in which cooperation is an evolutionarily stable strategy. Indirect reciprocity is then the mechanism best suited to foster cooperation. In contrast to direct reciprocity, a node is not rewarded with cooperation in return but rather with an increase in reputation. Given some form of dissemination of the reputation, other nodes will see the reputation of the node and be more willing to cooperate if the reputation is sufficiently high.

What happens to nodes that have a low reputation? How does reputation help us gain benefit in the application? What is the incentive to gossip the reputation of other nodes? What happens if we do not tell the reputation of a node we know?


jghms commented Mar 5, 2018

[Continued] Evolution of cooperation in the context of Tribler
There should be some incentive for nodes to gain more reputation. In a marketplace a seller might want to attain higher reputation in order to have more customers buy from her rather than from competitors. In Tribler this does not work the same way; more people downloading from you is not an advantage. However, we could enforce that nodes only upload to nodes which have a higher reputation. This means that a node with a high reputation can obtain data from many nodes in the network, while nodes with little reputation can interact with only a few. This might be difficult in practice, because nodes start with little reputation and might not have access to all files before they upload some data, while uploading is only possible if other nodes are interested in the data a node can provide. But let's assume the incentive for gaining reputation is that more data (privileges) become available.
Now, there seem to be multiple notions of reputation: on the one hand the (global) interaction reputation, which in the case of Tribler is the ratio between uploaded and downloaded data, and on the other hand the (local) aggregated rankings of multiple neighboring nodes. In order to properly utilize reputation and have good incentives for obtaining it, we need to find a way to combine the two notions into one trust value that we can assign to a node.
Given an incentive to increase reputation and a way to combine the two different reputations, a node will try to improve its reputation. The global reputation is improved by uploading more data than downloading. For the local reputation this depends on the aggregation mechanism, but in a simple example the real-number scores a node receives in all signed trust rankings of its neighbors can be summed into one number. Given that the scores are positive values between 0 and 1, the reputation increases with each ranking a node is listed in. Rankings are obtained through pairwise auditing, and a node has a strong incentive to perform audits in order to increase its local reputation; the other nodes have the same incentive to respond to the pairwise auditing request. Instead of just summing the scores, a different aggregation might use the rank of a node in the rankings provided by neighbors. This requires more analysis, but a problem with this approach might be that a node can increase its reputation by lowering that of nodes ranked above itself, creating an incentive to be untrustworthy.

Sub-questions to answer next: Is restricting interactions to nodes with higher reputation a valid incentive to increase reputation? Is a weighted sum a valid option for combining the global and local reputation? (And can we find better names than global and local reputation?) Is there an incentive not to share trust rankings? Do we need to reward nodes for this? What happens when a node withholds blocks or double spends? Do we constantly need to check the validity of their chains after we have had an interaction? If we constantly check validity, we should be able to detect a double spend and then inform other auditors so that they can remove their rankings, practically removing the reputation of the misbehaving node.

synctext (Member Author) commented:

The audit-your-partners rule: a key mechanism is to require audits, ensuring we create an invisible hand of honesty. The rule: repeated interactions with an agent require you to be able to produce your cryptographically signed, valid, correct, and honest audit certificate for this agent.

After an initial interaction you need to start checking people, and put either a positive or a negative endorsement online. Negative endorsements can simply be caused by failure to send information or by non-responsiveness. Failure to produce an audit certificate for any agent you interacted with repeatedly will severely impact your trust level.


synctext commented Apr 11, 2018

Obtaining honest ratings after transactions has proven to be a hard problem.

the paper "Rating inflation"
reputation_inflation

A solution to marketplace information asymmetries is to have trading partners publicly rate each other post-transaction. Many have shown these ratings are effective; we show that their effectiveness deteriorates over time. The problem is that ratings are prone to inflation, with raters feeling pressure to leave “above average” ratings, which in turn pushes the average higher. This pressure stems from raters’ desire to not harm the rated seller. As the potential to harm is what makes ratings effective, reputation systems, as currently designed, sow the seeds of their own irrelevance.


jghms commented Apr 13, 2018

Thanks for the suggestions!

I think I was misled when thinking about the monotonically increasing value as a way to make sharing strategy-proof. Instead, if pairwise audits are the only way of exchanging data and each endorsement block records the blocks exchanged, then we can see exactly which data an agent has at any point in time. So when we require the full chain, we not only see the agent's interaction records but also the record-exchange records, so we know which data the agent possesses. When we then request data, we can check whether the agent shared all of it.

Experiments
The question of a key metric of success is a hard one, and I have been struggling to think of experiments that show that our dissemination works. The dissemination strategy itself does not protect against the Sybil attack or double spending. However, good dissemination of data increases the probability of finding such attacks with existing methods. In order to facilitate good dissemination we give agents an incentive to obtain endorsements and thereby exchange their data. So our accounting policy becomes a function of contributions and endorsements: only contributing is not enough, agents need to get endorsements as well. We still have to make sure that simply creating endorsements between Sybils does not increase overall trust beyond a certain threshold. This Sybil resistance can be achieved through algorithms analyzing the graph of contributions (e.g. netflow, temporal PageRank), but also the graph of endorsements (the graph of contributions should be a subset of it if we apply the audit-your-partners policy). What we need is another element in the accounting policy, namely the probability of an agent being honest. Then, in order to obtain a higher score, an agent needs to contribute and share its data; for honest nodes this will also increase the probability of being honest. But for agents that are part of a Sybil region, sharing more data will at some point decrease the probability of being honest, because the agent reveals more and more of the Sybil region. What remains is to find a good function for the probability of being honest, as temporal PageRank does not give us such a value (maybe claimed contribution divided by temporal PageRank could work?).
Anyway, for now I will focus on writing some framework code and run some preparatory experiments:

  • create the network from the dataset and choose a node, perform random pairwise audits with the closest neighbors, and plot the trust in the node as calculated by different neighbors according to some accounting policy
  • next, repeat the experiment but first perform audits across the network to increase overall data dissemination
  • run the experiment with an attached Sybil region and choose the node from the Sybil region (without any Sybil detection this will probably still increase the value)

If all these experiments work, I will have to think about how to achieve Sybil resistance.

As for the audit-your-partners rule: at some point in the future we will need to make sure that interaction partners can comprehend why the other is not sharing data, or is sharing only some specific amount. Basically, not sharing data can either be classic "defection" or it can be "cooperation" by not giving resources to people that don't deserve them. This comprehension is only possible if the requester can make the same calculation as the responder and check that, given the data, the responder indeed calculates that the requester deserves some amount of resources. Therefore I would suggest performing the audit before a transaction. This gives the requester an opportunity to prove her trustworthiness and makes sure both parties agree on the bandwidth and amount of resources to share.


synctext commented Apr 24, 2018

Trust research spans the sciences: economics, the iterated prisoner's dilemma (computer science), the evolution of cooperation (biology). Social scientists from the Stanford Innovation Lab are discussing mechanism design. Talk: "Harnessing Gaming Technology for Peace: A Conversation with Margarita Quihuis", discussing a social operating system based on religion, legal code, and mechanism design in general. The lab founder's 1999 paper uses credibility instead of trust. His 2002 paper with 4000+ citations, "Persuasive technology: using computers to change what we think and do":

Given the importance of credibility in computing products, the research on computer credibility is relatively small. To enhance knowledge about computers and credibility, we define key terms relating to computer credibility, synthesize the literature in this domain, and propose three new conceptual frameworks for better understanding the elements of computer credibility. To promote further research, we then offer two perspectives on what computer users evaluate when assessing credibility. We conclude by presenting a set of credibility-related terms that can serve in future research and evaluation endeavors.

Thesis framing:
This thesis provides an elegant, simple, fast and secure algorithm for data certainty.

Now, only contributing is not enough but agents also need to get endorsements as well.

@jangerritharms took the idea to the next level: using it as the exclusive dissemination mechanism. By forcing peers to publicly disclose what they know about all others, it becomes trivial to detect forks, fakes, and fraud. Endorsements are on-chain. This is the key strategy of our lab goal of "Distributed Trust Design": change the rules of the game. Take this to the extreme and make it asymmetric, almost unfair, and virtually impossible to attack. Contributing is not sufficient; only contributions endorsed by the community-at-large count. Reciprocity as endorsement incentive.

ToDo:

  • coding + more detailed documentation in thesis.tex format
  • "endorsement decision" algorithm: what to do upon receiving a request, and when to send a request
  • audit algorithm: check for forks, fakes, and fraud (see the sketch below)
    • information consistency (forks and double spending, missing blocks, block withholding, ...)
    • information correctness (community contributions since the genesis block, net balance, missing 2nd signatures)
  • asynchronous global consistency: on-chain endorsements by 51% of the network. Really?
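
The "information consistency" check could start like this sketch: a fork (double spend) shows up as two distinct blocks claiming the same (public_key, sequence_number) slot. The block fields are assumptions:

```python
def find_forks(blocks):
    """blocks: iterable of (public_key, seq, block_hash) tuples."""
    seen = {}
    forks = []
    for pk, seq, block_hash in blocks:
        key = (pk, seq)
        if key in seen and seen[key] != block_hash:
            # Two different blocks at the same chain position: a fork.
            forks.append((pk, seq, seen[key], block_hash))
        seen.setdefault(key, block_hash)
    return forks

blocks = [("aa", 7, "h1"), ("bb", 3, "h2"), ("aa", 7, "h3")]
print(find_forks(blocks))   # [('aa', 7, 'h1', 'h3')]
```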


qstokkink commented Apr 24, 2018

Beyond game theory: cryptographically enforced two-way endorsements.


jghms commented Apr 25, 2018

https://github.com/jangerritharms/aupair


jghms commented May 15, 2018

Status update:

I have created a setup for experiments, with two different exchange mechanisms: crawling and pairwise endorsement.

In order to show that the code works in general, I have run two tests with 50 nodes, in which each node requests one transaction per second with a random partner, for 100 seconds.

**Crawling**
[plot: block distribution with crawling, 50 nodes]

**Endorsement (pairwise record synchronization)**
[plot: block distribution with pairwise endorsements, 50 nodes]

Foreign blocks are half-blocks from other agents, received through dissemination. Exchange blocks are the blocks created in my version of the pairwise endorsements. Unfortunately my computer is not powerful enough to run this experiment quickly for the pairwise endorsements (probably inefficient code). Actually, the information is quite well distributed if the crawl rate and transaction rate are equal and each crawl gets the full chain.

The first actual experiment I want to run is detecting double spenders: have one node perform a double spend and see how long it takes until the double spend is detected on the network. In theory pairwise endorsements should be superior to crawling, because with crawling only the node's own chain is shared, while with pairwise endorsements all information about other nodes is shared as well, leading to faster spreading of information (but also more information to store for each node).

Can we calculate how much less safe TrustChain is than Bitcoin?
Professor Epema asked this interesting question. Maybe we can reformulate it a little: in Bitcoin double spending is basically prevented, but how long does it take in TrustChain to detect a double spend? If nodes crawl each other, I believe we can calculate this under some simplifying assumptions; however, I am unsure whether the assumptions are reasonable, so maybe this consideration is useless.

Assumptions:

  • With each crawl an agent obtains the full chain of another agent
  • Full network knowledge
  • All nodes online
  • unlimited storage capability

Let's say a double spend has been performed and each node crawls at a certain rate. Ignoring the attacking node for the moment, there are two nodes n_0 and n_1 which hold the conflicting transactions, and the other n-2 nodes hold neither.

One node is crawling:
This node will detect the double spend once it has crawled both n_0 and n_1. This is similar to throwing both a 1 and a 6 with a normal die (n=6). The probability of the first success (either a 1 or a 6) is 2/6, and the probability of the second is then 1/6 (a 1 if the first success was a 6, and a 6 otherwise). If T is the number of throws it takes to get both a 1 and a 6, the expected value is E(T) = 1/(2/6) + 1/(1/6) = 3 + 6 = 9 = (3/2)·6. So in the case of crawling it takes approximately (3/2)·n crawls for one node to detect the fraud.

All nodes are crawling:
Once any node finds the fraudulent transaction, it will inform the whole network. So the actual question is how long it takes for any node to detect the double spend. ...

A different approach is that nodes do not crawl the same node twice.
Then the probability that a given node has found the double spend after m rounds of crawling, with n nodes on the network, is approximately (m/n)^2.
The probability that any node finds the double spend within m crawls is 1-(1-(m/n)^2)^n.
For example, with 1000 nodes on the network and each node making 10 requests we get 1-(1-(10/1000)^2)^1000 = 0.0952.
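
A quick Monte Carlo check of that closed-form estimate, under the same assumptions (each node crawls m distinct peers; detection happens once any node has crawled both conflicting peers):

```python
import random

def detection_probability(n=1000, m=10, trials=1000):
    hits = 0
    for _ in range(trials):
        for _node in range(n):
            crawled = random.sample(range(n), m)  # m distinct crawl targets
            if 0 in crawled and 1 in crawled:     # n_0 = 0, n_1 = 1
                hits += 1
                break
    return hits / trials

print(detection_probability())                # simulated, ~0.09
print(1 - (1 - (10 / 1000) ** 2) ** 1000)     # closed form above: ~0.0952
```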


synctext commented May 15, 2018

reformulating...

Endorsements enable audits of auditors! Endorsements are on-chain. The topic of double-spend detection time distracts from the advantage of this algorithm. Reading: reputation mechanisms

All honest nodes have an incentive to detect fraudulent behavior, as they can become victims. Transactions are disseminated (gossip or crawling) and each node will report a double spend to others. Assumption 1: without checking, a double-spend transaction is indistinguishable from other transactions. Assumption 2: a malicious agent cannot influence the dissemination of his transactions by honest agents (the sabotage-free-spreading assumption). Only by detecting, for some chain, two different transactions with the same sequence number do you detect malicious behavior. Intuitively formulated: all honest nodes collaborate and scan all disseminated transactions in parallel for double spends, resulting in fast detection.

Collusion scenario: nodes can collude and agree to conduct double-spending attacks and never disseminate these transactions (not trivial to see how to exploit this).

Is this a variant of the coupon collector's problem?

With the introduction of locality, a bias towards nearby dissemination and endorsement means disconnecting from the network size: it scales! Detection becomes constant; instead of depending on n, it depends on the average number of interaction partners. (No?) progress with random interactions, but for fast-mixing graphs it's progress?

experiment brainstorm:

  • Our first experiment is by design as simple as possible. It illustrates the correct behavior of our setup. We have 1 malicious agent and 2 honest agents.
    • Result: no endorsement of the bad node.
  • More extended experiment: 1 malicious attacker creates 4 identities; there are 3 honest nodes. An evil-majority scenario.
    • attacker lies inconsistently
    • attacker lies consistently
  • Several honest nodes plus malicious nodes. Test the endorsement mechanism; for instance, malicious nodes only get endorsements from other malicious nodes. What malicious behavior? A mix of various agents:
    • block withholding, but honest endorsements
    • block withholding, malicious endorsements
  • Scalability experiment. How to do this is a key question.


jghms commented May 30, 2018

Lately I felt like I was getting bogged down in details and losing the overview of what we were trying to achieve. That's why I went back to the very first entry and retraced the steps of how I got to the latest considerations. I wrote it down and it ended up as a kind of summary of the theoretical considerations of the last months. I hope it's clear and that we can find any logical faults in the argumentation; feedback is much appreciated.

How did we get here?

We started out with the goal to work towards a consensus on trust for the TrustChain architecture. We considered different concepts of trust, analyzed different reputation systems and researched ways of building a Sybil-resistant network. The goal was to combine multiple local rankings of the trustworthiness of agents to better approximate a global ranking of trustworthiness. But in order to combine rankings we should first check that the other agent is not lying about the ranking, so we need a verification step. The current architecture of TrustChain does not allow simple verification, because rankings are created based on the complete database instead of the agent's personal chain. In other words, if we see the trust ranking as a pure function of the agent's state (i.e. the same state of the agent returns the same ranking), TrustChain does not record the complete agent's state, because the state consists of the agent's transactions and the agent's knowledge of the network. Agents need to agree on a single view of the network in order to agree on each other's, or on a combined, ranking. But without any "proof-of-knowledge", agents can decide not to share records of other agents, because any reputation those other agents obtain in the eyes of the calculating agent through those records can put them above the agent that is sending the information. Hence it is not advantageous to share all information.

The solution is to make truthful sharing of records incentive-compatible. We can achieve this by recording on the personal chain not only transactions but any change to the record database. We need to create a new block type (the exchange block) which includes all records sent to and received from any other agent. If this structure is in place, the following becomes possible: given the complete chain of an agent, we have access to the same knowledge as that agent. However, storing all transactions of one agent in an exchange block would make that block huge. Instead, we can continue to store the actual blocks in an undisclosed database and only record a block-exchange index in the exchange block. What we require is: given the complete chain of an agent, we should be able to create a block index which is an exact representation of the blocks in that agent's database. The block index of an agent lists all known public keys and, for each of those keys, the interval of all known sequence numbers.
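
The block index described above might be represented as a mapping from public key to the span of known sequence numbers; this sketch (simplistically) assumes the known blocks of each chain form one contiguous range:

```python
def block_index(blocks):
    """blocks: iterable of (public_key, seq) pairs in the agent's database."""
    index = {}
    for pk, seq in blocks:
        lo, hi = index.get(pk, (seq, seq))
        index[pk] = (min(lo, seq), max(hi, seq))
    return index

db = [("aa", 1), ("aa", 2), ("aa", 3), ("bb", 4), ("bb", 5)]
print(block_index(db))   # {'aa': (1, 3), 'bb': (4, 5)}
```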

With the above extension to the TrustChain architecture we are able to enforce truthful sharing of an agent's complete knowledge (the agent's view of the network / the agent's database). Now the original goal of combining trust rankings becomes theoretically possible. Because the agent's complete knowledge, or state, is on the chain, we are able to verify calculated trust rankings at any time. We can now envision different mechanisms. For example, an agent periodically stores a ranking on-chain. Another agent can at any point in time, for example after finding the ranking during a data crawl, obtain the full chain of that agent up to the point of the ranking, then request all blocks that are not already in the verifying agent's database, and once received, check the ranking. Once checked, the ranking can be combined into a composite ranking.

Pairwise Record Exchange and Trustworthiness Endorsement

Instead of the above mechanism we can also consider a stronger one: Pairwise Record Exchange and Trustworthiness Endorsement (in the following referred to as PRETEND; maybe we could call it PRoTEcT - Pairwise Record Tender and Endorsement of Trustworthiness; the c in proteCt is missing but who cares, it sounds cool and fits the context 😛). Before any transaction, two agents need to agree on the state of the network. For that, both exchange their full chains and, from the difference in block index, calculate and exchange any missing blocks (from their own chains, or chains of other agents they know about), such that both obtain the exact same block database. Both also verify that the new state is still valid (no double spending, no missing blocks in chains) and that both indeed shared all data. If both agree on the new database, they create a new exchange block and both sign it. After this the transaction can be conducted as normal. Instead of explicitly calculating and combining trust rankings, we only combine the data which is the input to the ranking function. Which reputation function is applied, or whether a function is applied at all (other applications than reputation systems are possible), is not the concern of this mechanism. This way, all transactions appear only once in each calculated ranking; when combining multiple rankings otherwise, it could become difficult to combine them properly without weighting transactions multiple times.

Verifying application-specific behavior. Depending on the application, the endorsement can entail more application-specific verifications as well. For example, in the context of file sharing, we might want to enforce that an agent does not upload to agents that already have a negative balance of more than 1 GB.

Verifying endorsements. A first obvious objection against this mechanism is that agents can decide to falsely verify each other's chains, even though they are actually corrupt. However, this mechanism and the addition to the TrustChain fabric of storing the database index not only allow checking the correctness of the chain but also allow verifying the correctness of all endorsements themselves.

Overhead. The PRETEND mechanism introduces a lot of overhead. First, with the above-described mechanisms, we can only conduct transactions if we have the complete data of another agent. This can create high storage requirements for agents if no mechanism for removing data from the database is added. Also, the verification of all data of another agent can constrain the possible throughput if agents run the verification on devices with little computing power.

Locality. This consideration leads to the possible introduction of locality. Agents would be more willing to interact with agents that have a similar state, because they need to perform less work during the record exchange. This might help us in the prevention of Sybil attacks, because locality cannot be faked and agents in the vicinity are finite.

Bootstrapping. Another implication is that new agents need to obtain a lot of data when joining the network in order to interact with incumbent agents.

Mathematical model

In order to give this work some theoretical depth we need to define a model showing how our approach differs from existing solutions. While at first I mostly looked at reputation systems, the mechanism described above is not necessarily concerned with reputation, only with whether an actual reputation function is applied. Also, the model from [Seuken](https://dash.harvard.edu/bitstream/handle/1/11856150/Seuken_AccountingMechanisms.pdf), which was used in the thesis of Pim Otte, is not one-to-one applicable to this situation, because we don't necessarily need the function describing the value of the transaction; many values or objects could be connected to each transaction. Rather, this work is concerned with decentralized storage, dissemination of data, and methods to verify the correctness and completeness of that data. As such, our system is in the same ballpark as other cryptocurrencies or decentralized databases. I could not find a proper description of this type of system, so I tried to come up with my own definition, of which I am not sure whether it is proper.

Decentralized transaction record management mechanism (DTRMM). A DTRMM orders, validates and distributes records of transactions between agents in a transaction network without any central controlling entity. Its tasks can be defined as:

  • to define an order, globally or locally but at least properly ordered for each entity, of the transactions happening in the network
  • to define a validity function for the state of the network
  • to define mechanisms for the dissemination of records

Somehow we should be able to fit TrustChain, Bitcoin and other cryptocurrencies into this definition and compare them, and possibly other systems which are not blockchain-based. Maybe next to the mechanism we should define the decentralized transaction record management system, which is the actual implementation of such a mechanism for a certain network. The system should also handle the actual storage of the data on each agent.
Now, within such a system we can define the state of the agent. In the model by Seuken et al. the subjective work graph is the input to the accounting mechanism. Most of their model is based on the graph representation. However, when not considering graph-based methods, it is perhaps more applicable to talk about a state.

Agent state. Similar to the subjective work graph, the agent state is simply the agent's subjective view of the network given any knowledge the agent has obtained so far. So for an agent p, the agent's state is S_p = <I_p>, where I_p is a subset of I, the set of all transactions in the network.

Now we can define the observability property, which says whether the agent's state can be observed or not. In TrustChain only a small subset of the state can be observed, namely the personal transactions of the agent. With our new addition defined above, the state of the agent can be observed completely. To compare: blockchains with global consensus, like Bitcoin, don't record each single agent's state, but instead assume that agents have almost exactly the same state, so that up to some certainty the state is inferable.

To be continued …

Experiments

Above we have claimed some desirable properties of our mechanism, which we need to show in experiments. The mechanism should offer good tools for verification of data, which prevents agents from manipulating or withholding data. If any manipulation or withholding of data is found, or an agent did not behave properly given application-specific rules, the agent should be ignored by other agents. A basic setup for multiple experiments could be: create some truthful agents (usually a majority) and some malicious agents and let them perform transactions. The experiment has a positive result if after a certain time the malicious agents have significant problems finding new partners for interactions and as a result have significantly fewer transactions than the truthful agents. Specific experiments could include the following types of malicious agents:
  • does not sign the exchange/endorsement block
  • withholds blocks from the chain
  • withholds blocks from the database (foreign blocks = blocks from other agents)
  • forks his own chain (double-spend)
  • multiple malicious agents endorsing their manipulated chains

Also interesting is the Sybil attack in this context. Generally, the mechanism is not concerned with the Sybil attack itself, so in a first experiment we can show that the Sybil attack is successful as long as all agents produce correct, signed blocks. However, in a second experiment we can show a certain application in which we keep track of the balance of agents and restrict the balance to a certain negative threshold value. Further, we change the genesis block to start with that negative balance, such that new agents first need to perform real work before consuming work. Because we track what agents know about other agents, agents cannot just perform work for agents with a negative balance, because that would make their behavior invalid and would lead to banishment. I feel like this might restrict the success of the Sybil attack.

Finally, we need to show that even with the added verification overhead and storage requirements, the horizontal scalability of our system is not diminished. So another experiment should test the number of transactions in relation to the number of agents. With increasing chain length of all agents, the transaction throughput might be reduced. So not only the transaction throughput with respect to the number of active agents, but also with respect to the running time of the experiment (to which the chain lengths are proportional), should be examined.

synctext (Member Author) commented:

Protect - solid algorithm work for the core of a thesis.

synctext (Member Author) commented:

A vital dataset we need: real attacks on trust and reputation systems.
http://users.eecs.northwestern.edu/~hxb0652/HaitaoXu_files/TWEB2017.pdf Abstract:

In online markets, a store's reputation is closely tied to its profitability. Sellers' desire to quickly achieve high reputation has fueled a profitable underground business, which operates as a specialized crowdsourcing marketplace and accumulates wealth by allowing online sellers to harness human laborers to conduct fake transactions for improving their stores' reputations. We term such an underground market a seller-reputation-escalation (SRE) market. In this article, we investigate the impact of the SRE service on reputation escalation by performing in-depth measurements of the prevalence of the SRE service, the business model and market size of SRE markets, and the characteristics of sellers and offered laborers. To this end, we have infiltrated five SRE markets and studied their operations using daily data collection over a continuous period of two months. We identified more than 11 thousand online sellers posting at least 219,165 fake-purchase tasks on the five SRE markets. These transactions earned at least $46,438 in revenue for the five SRE markets, and the total value of merchandise involved exceeded $3,452,530. Our study demonstrates that online sellers using the SRE service can increase their stores' reputations at least 10 times faster than legitimate ones, while about 25% of them were visibly penalized. Even worse, we found a much stealthier and more hazardous service that can, within a single day, boost a seller's reputation by such a degree that it would require a legitimate seller at least a year to accomplish. Armed with our analysis of the operational characteristics of the underground economy, we offer some insights into potential mitigation strategies. Finally, we revisit the SRE ecosystem one year later to evaluate the latest dynamism of the SRE markets, especially the statuses of the online stores once identified to launch fake-transaction campaigns on the SRE markets. We observe that the SRE markets are not as active as they were one year ago and that about 17% of the involved online stores have become inaccessible, likely because they were forcibly shut down by the corresponding e-commerce marketplace for conducting fake transactions.


jghms commented Jun 5, 2018

Again a short status update:

I ran a first simple experiment, similar to the other experiments I plan to do. We have 20 honest agents that perform the PROTECT mechanism as described above (though still with a simplified verification). There is also one dishonest agent who withholds one block from his chain when engaging in an interaction (when the other agent initiates the interaction the dishonest agent behaves normally, which is why he still has some interactions). Other agents verify the chain and find it incomplete, so they reject the interaction with the agent.
[plot: transactions per agent; the dishonest agent has far fewer]
We let the agents interact for 100 seconds with approximately 1 transaction per agent per second. We see that the dishonest agent has significantly fewer transactions than the honest agents. I will create more experiments like this, but they only prove the correctness of the mechanism. Next to this there should also be a scalability experiment, which I still need to design. For that I will probably need to make the software work with gumby to run it on the DAS-5.

Also I have started writing on my thesis.tex, here is the current pdf:
report.pdf

@synctext
Member Author

synctext commented Jun 6, 2018

@jangerritharms 's novel idea: devise a mechanism which forces agents to disclose the full historical state of their trustchain database at each point in time. This enables the detection of historical dishonest behavior by allowing a replay of all historical states and decisions, for instance, helping other dishonest agents in the past.
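As a toy illustration of this replay idea (the data shapes here are assumed, not the proposed mechanism itself): reconstruct the agent's knowledge state step by step from its disclosed exchanges and flag any interaction with a party the agent provably already knew to be fraudulent.

```python
def replay_detects_fraud(exchanges, fraud_proofs):
    """exchanges: the agent's disclosed history, an ordered list of
    (partner_id, block_ids_received) tuples; fraud_proofs: a mapping
    block_id -> cheater_id for blocks that prove fraud."""
    knowledge = set()   # block ids the agent had seen at this point
    for partner, received in exchanges:
        known_cheaters = {fraud_proofs[b] for b in knowledge
                          if b in fraud_proofs}
        if partner in known_cheaters:
            return True          # knowingly helped a dishonest agent
        knowledge.update(received)
    return False
```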

thesis storyline, especially the problem description:

  • big and lab vision, your component in the 50-year Lab goal: "Distributed Trust Design" #3571 of building-trust research. Multiple layers, fully distributed.
    • trustchain, force multi-party agreement
    • distributed incremental pagerank for detecting sybil regions
    • incentive compatible info sharing (THIS THESIS)
    • use latency to enforce honest behavior. Introduce a new rule that agents physically within a few hundred km are your neighbors. You can't replace your neighbors and you can't bend the laws of physics. High-latency peers are never trusted; they need to be guaranteed by low-latency peers.
    • strong identities. Finally use expensive identities, such as government-issued passports, to create costly identities. They can't be re-used or renewed.
  • precise and narrow scope of the thesis work

For instance: this work has successfully created a specialized component which has a proven ability to make trust systems better.

Demers, 1987 classic, "Epidemic algorithms for replicated database maintenance". First mention of anti-entropy syncing between agents. Simply sync differences and see how everybody syncs quickly. Great math.
1988 classic: "A survey of gossiping and broadcasting in communication networks"
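For intuition, a toy push-pull anti-entropy round in the spirit of Demers et al.: two replicas exchange exactly the records the other lacks, so repeated random pairings drive all databases to the union. (Pure illustration; the paper's protocols add rumor mongering, peer sampling, and termination analysis.)

```python
import random

def anti_entropy_round(db_a: set, db_b: set) -> None:
    """One push-pull exchange: afterwards both replicas hold the union."""
    missing_in_a = db_b - db_a
    missing_in_b = db_a - db_b
    db_a |= missing_in_a   # pull what we lack
    db_b |= missing_in_b   # push what the peer lacks

def gossip(replicas: list, rounds: int) -> None:
    """Random pairwise anti-entropy; replicas converge quickly."""
    for _ in range(rounds):
        a, b = random.sample(range(len(replicas)), 2)
        anti_entropy_round(replicas[a], replicas[b])
```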

Real attack datasets are now available; see these 8000 fake Twitter accounts: https://news.ycombinator.com/item?id=9170433

Spam a form of sybil attack #2547?

(btw dishonest agents == dramatic red in thesis figures, green == good)

@jghms

jghms commented Jun 20, 2018

Updated report.pdf

Quick update:
Not so much has happened since the last update. I mostly worked on the story for my thesis: how cooperation, reputation systems, Tribler, TrustChain and my work are related. I have also reworked my code a bit: before, I stored on the chain all public keys and sequence numbers that were exchanged; now only two hashes of the exchange are stored, which has the same effect, since partners can sign the hash of the data they sent, and holding that hash later on means agents still cannot lie about which data they have. Also, I see now that anti-entropy is definitely not the only way to go. We can simply define how many endorsements are required per interaction (some ratio which can be enforced by the honest nodes).
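A toy sketch of such a digest (hypothetical helper, not the actual block format): both partners sign one hash over the set of transferred block pointers instead of listing every pair on-chain.

```python
import hashlib

def exchange_hash(sent_blocks) -> str:
    """sent_blocks: iterable of (public_key_hex, sequence_number) pairs
    identifying the blocks transferred in one direction."""
    digest = hashlib.sha256()
    for public_key, sequence_number in sorted(sent_blocks):  # canonical order
        digest.update(f"{public_key}:{sequence_number}".encode())
    return digest.hexdigest()
```

An exchange block then only needs the two digests (one per direction) plus both signatures; anyone later holding the actual block list can recompute the hash and catch a lie.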

Possible next steps:

  • next steps will be to make my emulation work with gumby and run a scalability experiment
  • run experiments that show there is no way to withhold data and still be acknowledged by honest agents
  • run experiments on Sybil attacks

@synctext
Member Author

  • title? "Key building block for creating online trust"
  • "Trust is the bedrock of society"; opening line?
  • application-agnostic
  • Our audacious ambition is to create a global trust system
  • model with both trust and reputation
  • paper which explains difference between trust system and reputation system
  • Chapter 1, more in-depth. Reader knows trust, reputation, and incentive work by then.
  • Chapter 2
    • describe The Magic Thing
      • connects all digital humanity
      • uses all past transactions for estimating trustworthiness
      • disputed outcomes
      • Aims to identify the small minority of dishonest behaving individuals
      • Works in preventive manner, 100% fraud detection deters
      • transparency breeds honesty and honesty leads to trust
      • non-profit public utility
    • define trust vector
    • story
      • global trust
      • trust data dissemination
      • no solution yet, prior work
  • 4.1. Applications of decentralized accounting systems
    • table with requirements
    • nothing does it all !
    • except this thesis
    • ProTeCT naming
  • Turn around: we now have money without banks. Banks can default. 2007 global financial crisis resulted in numerous collapsing banks (465?)
  • Bitcoin is Chapter 1 material
  • 4.1.3 Partial view and scalability is Problem Description material
  • Chapter "Concept of internal agent state transparency" or "transparency of trust vectors to prevent attacks"
  • Chapter "the ProTeCT algorithm for trust vector integrity"
  • Experiments and performance analysis
    • 3 dishonest and 4 honest
    • 6 dishonest and 4 honest (+no honesty among thieves)

@synctext
Member Author

synctext commented Jun 21, 2018

Mental note: huge Trustchain dataset and picture; not (yet) in thesis

@lylamaria

Good luck today Jan!

@jghms

jghms commented Jul 5, 2018

Status update
report.pdf

This is the current thesis.tex. Next week I will be on vacation, so this is pretty much the work that we can look at in the next meeting on Monday 16th of July. I have made a start on each of the chapters of the final report. I was not yet able to implement all feedback from the previous round, but that will be the next step.

@synctext
Member Author

synctext commented Jul 16, 2018

remarks:

  • You engineer! You are not writing in social sciences style
  • experiments:
    • forking is the key manipulative behavior (double spend detection)
    • transaction hiding
    • whitewashing
    • colluding / Sybil (6.4.1. Sybil attack)
  • thesis has a "fast" storyline in chapter 1.
    • expand beyond 3 pages
    • recommend mentioning 1) Uber and Airbnb and 2) Bitcoin in the intro text of Chapter 1
    • then you explain the connection of trust to these inside the chapter. (Taxi without a taxi license, money without banks)
  • "1.4. Solution to abuse of Trust: decentralization": explain in 1 sentence to the poor reader what this subsection is about. Like: We claim in this thesis that a global system to create trust on the Internet and in our economy can't have a single point of failure. It needs to be fully decentralized.
  • "namyly": no spelling checker?
  • 2. "In the introduction we have made a case for our audacious ambition to design and create a layer of reputation on top of the core infrastructure of the internet that enables application agnostic trustful relationships between relative strangers." Or, shorter and more powerful: our audacious ambition is to create trust on the internet.
  • suggestion: open Chapter 2 with the Mui 2001 MIT picture. Provides context and intro, then do the boring requirement stuff.
  • Definition 7 (Transaction block): missing signatures, and it switches away from the formal model of Definitions 1-6.
  • Chapter 4: Honesty is rewarded and cheating is reduced when we introduce the novel concept of 'state transparency'. In this chapter we introduce the main idea of the thesis: solve the trust problem by exposing your own state. Our proposed mechanism publicly records the action of sharing knowledge.
  • Chapter 6: we aim to prove the properties of the mechanism and architecture by experimental analysis. We implemented our mechanism and now establish how it handles strategic manipulation.
  • " All agents run a main decision loop at high frequency and have a small chance of starting an interaction each time step.", informal, imprecise. How exactly? interaction probability?
  • Your mechanism seems to have a recursion explosion (a toy illustration follows after this list). Talking to a peer triggers exchange block creation; the next peer you talk to will always lead to the creation of a new record, because you created at least a single exchange block on your own chain, thus leading to another exchange block creation. Each encounter leads to a new record being created on TrustChain. (positive wording: gossip transparency)
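A toy model of that feedback loop, under the assumed rule that any chain state the partner has not yet seen forces a new exchange block at the next encounter (illustration only, not the thesis code):

```python
def exchange_blocks_created(encounters: int) -> int:
    unseen_state = True   # genesis: peers start out of sync
    created = 0
    for _ in range(encounters):
        if unseen_state:
            created += 1       # this encounter mints an exchange block...
        unseen_state = True    # ...which is again state no peer has seen
    return created

# One new record per encounter, forever: the ledger never quiesces.
assert exchange_blocks_created(1000) == 1000
```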

@jghms

jghms commented Aug 2, 2018

report.pdf

Worked mostly on the experiments, updated parts of the introduction and created an example case for my mechanism in chapter 5.

@synctext
Member Author

synctext commented Aug 2, 2018

Comments

  • more citations needed: none in Section 1.1.1 and no source for Figure 1.1
  • Section 1.4, citation of Lab goal: "Distributed Trust Design" #3571 as the lab goal
  • Figure 2.1, larger.
  • Table 3.1; use survey to expand number of entries.
  • Figure 5.1, the depiction of subjects A and B is a bit silly
  • Style violation
    • Mixing theory, formal notation with pictures of pens to indicate a digital signature
    • confuses the reader
    • the 19 definitions do not lead to an impossibility result or proof.
    • Mixing Tribler details and dirty implementation with clean formal model (e.g. Nowak level)
    • you spent much more time on coding, less on theory; how much theory is in the thesis?
    • can you prove something cool? Like: you can't strategically manipulate my algorithm! You are forced to share; if you withhold (free-ride) or lie, you will be caught -guaranteed-.
  • genesis block == "0"
  • The Sybil problem
    • Cardinal example: "Sybilproof Reputation Mechanisms" - a very good short read
    • "problem formulation" with a formal model
    • their hard theoretical science: "Theorem 1. There is no symmetric sybilproof nontrivial reputation function"
    • "We have presented a possible framework for assessing a reputation mechanism’s robustness to sybils. We have shown that no nonconstant, symmetric reputation function exists. Further, we have given a collection of flow-based asymmetric reputation functions which are sybilproof, under some conditions." (a toy flow-based sketch follows after this list)
  • block tampering, forking and double spending, block withholding
  • Section 6.x system architecture picture and text
  • "The same happens when the selected partner is a known malicious agent." Be clearer and more explicit about this essential behavior.
  • "The first experiment to run is concerning the dissemination free-riders" reads better as: "Our first experiment analyses the performance of our proposed algorithm against free-riding." Explain the why.
  • algorithm has no forgiveness, DFR flatlines (shorthand distracts)
  • Figure 6.1 very boring similar outcomes. compact same message into 1 figure.
  • "multiple dissemination free-riders join forces and collaborate in creating a subnetwork" - this is a collusion experiment! We now present an experiment to determine the outcome if free-riders detect each other and decide to collude against the honest agents. Again, explain why.
  • Collusion leads to network partitioning... thus Figure 6.3 needs to be red for the zeros. Figure title: "quantification of the network partitioning".
  • Figure 6.5 versus 6.6: honest agents without "deep check" and with.
  • Jan-Gerrit idea: creating identities becomes expensive; you are born with a negative reputation and first need to perform work for the community. You can't collude with other negative agents anymore. Attack edges become essential.
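For intuition on the flow-based escape hatch quoted in the Sybil bullet above, a sketch (networkx usage is illustrative, not part of the thesis code): judge reputation asymmetrically from a fixed vantage point as the maximum flow to the target over edges whose capacities are pairwise trust values. Sybils hiding behind a single attack edge can never receive more reputation than that edge's capacity.

```python
import networkx as nx

def flow_reputation(trust_graph: nx.DiGraph, source, target) -> float:
    """Max-flow reputation of `target` as seen from `source`."""
    value, _ = nx.maximum_flow(trust_graph, source, target, capacity="trust")
    return value

G = nx.DiGraph()
G.add_edge("me", "alice", trust=3.0)
G.add_edge("alice", "bob", trust=1.0)     # the lone attack edge
G.add_edge("bob", "sybil", trust=9.0)     # bob's sybil region
print(flow_reputation(G, "me", "sybil"))  # 1.0, capped by the attack edge
```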

@synctext
Member Author

Creating trust through verification of interaction records

Final thesis on official repository

Thesis Abstract
Trust on the internet is largely facilitated by reputation systems on centralized online platforms. However, reports of data breaches and privacy issues on such platforms are getting more frequent. We argue that only a decentralized trust system can enable a privacy-driven and fair future of the online economy. This requires a scalable system to record interactions and ensure the dissemination and consistency of records. We propose a mechanism that incentivizes agents to broadcast and verify each other's interaction records. The underlying architecture is TrustChain, a pairwise ledger designed for scalable recording of transactions. In TrustChain each node records their transactions on a personal ledger. We extend this ledger with the recording of block exchanges. By making past information exchanges transparent to other agents, the knowledge state of each agent is public. This allows discrimination based on the exchange behavior of agents. Also, it leads agents to verify potential partners, as transactions with knowingly malicious users lead to proof-of-fraud. We formally analyze the recording of exchanges and show that free-riding nodes that do not exchange or verify can be detected. The results are confirmed with experiments on an open-source implementation that we provide.

@synctext
Member Author

synctext commented Nov 3, 2022

Related work:
Mustafa Al-Bassam seems to have re-invented our approach in 2019. His work and Celestia combine our 2018 work and prior 2017 Chico implementation with our 2016 bottom-up consensus model. Brilliant wording: "virtual side-chains"; we just called it a scalable ledger 😄 Superior marketing and acquisition of funds compared to us!

2004 background Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey

@qstokkink qstokkink removed this from the Backlog milestone Aug 23, 2024