Prototype-Based Virality
When I talked about initializing the link set, I said I could solve this practical problem later. That was stupid: you can’t test an algorithm if you can’t initialize it, and I want to test this one in the real world as soon as possible.
To initialize a user with a relevant link set, we need to know which existing users will provide relevant fragments. The first idea could be to ask the user lots of questions, but we wouldn’t know what to do with the answers. We have no way to compare our new user with existing users and fragments: the basic idea of the whole system is precisely that it does not need to understand users’ tastes or information fragments.
Since our real-world implementation will be targeted at human users, we will use a simple social trick to solve this problem. Instead of allowing anyone to subscribe to the service, we will require every new user to be invited by an existing user. Relying on this constraint, when initializing the new user’s link set, we will simply clone the inviting user’s link set, add the inviting user to the new set, and remove the weakest link (to compensate for the addition and maintain a constant link set size).
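The initialization step described above can be sketched in a few lines. This is only an illustration under some assumptions of mine: a link set is represented as a dictionary mapping a source user's id to a link strength, the name `init_link_set` is hypothetical, and the starting strength given to the inviter (`inviter_weight`) is an arbitrary placeholder. The weakest link is dropped from the cloned links before the inviter is added, so the fresh link to the inviter can never be the one that gets removed.

```python
from typing import Dict

# Hypothetical representation: source user id -> link strength.
LinkSet = Dict[str, float]


def init_link_set(inviter_id: str,
                  inviter_links: LinkSet,
                  inviter_weight: float = 1.0) -> LinkSet:
    """Build a new user's link set from the inviter's link set."""
    # Clone the inviter's set; the inviter's own set is left untouched.
    new_links = dict(inviter_links)

    # Drop the weakest cloned link to make room for the inviter,
    # so the set keeps the same size as the original.
    if new_links:
        weakest = min(new_links, key=lambda source: new_links[source])
        del new_links[weakest]

    # Add the inviter as a source (the initial strength is an assumption).
    new_links[inviter_id] = inviter_weight
    return new_links


alice_links = {"bob": 0.9, "carol": 0.2, "dave": 0.5}
newcomer_links = init_link_set("alice", alice_links)
print(newcomer_links)
```

Here the newcomer keeps Alice's links to bob and dave, loses the weak link to carol, and gains a link to Alice herself, so the set size stays at three.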
This new link set is strongly determined by the inviting user’s tastes. It contains all of that user’s sources except the weakest one, plus the inviting user. If we suppose related people have similar interests (and if users don’t invite others at random), then this new link set should be a good initial link set. Later, as the new user uses the service, the system will optimise the set and let it diverge from the first user’s.
If you have ever used a prototype-based language, you now know why I call this prototype-based virality. We’re cloning users instead of objects, but the idea is very similar. Instead of building every user from scratch using some fixed rules (in class-based OOP, these rules would roughly be a class), we do most of the work by copying, then differentiate only as much as needed.
This invite system is also a very powerful tool for shaping our user graph. We can force a delay between subscription and invites to control the network growth rate. We can also limit the number of invites each user can send, which will control the connectivity of our graph and enforce decentralisation.
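The two controls mentioned above (a delay before a user may invite, and a cap on invites per user) could be checked with something like the following. It is a rough sketch, not a specification: the `User` record, the `can_invite` function, and the values of `INVITE_DELAY` and `MAX_INVITES` are all hypothetical and would need tuning against the real network.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical tuning knobs; the actual values are an open question.
INVITE_DELAY = timedelta(days=7)  # wait between subscription and first invite
MAX_INVITES = 3                   # invites each user may send in total


@dataclass
class User:
    joined_at: datetime
    invites_sent: int = 0


def can_invite(user: User, now: datetime) -> bool:
    """Return True if the user is currently allowed to send an invite."""
    waited_long_enough = now - user.joined_at >= INVITE_DELAY
    has_invites_left = user.invites_sent < MAX_INVITES
    return waited_long_enough and has_invites_left
```

A longer delay slows the growth of the network, while a smaller cap limits how many newcomers can start from clones of the same user's link set.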