GARGAMEL: improved performance of replicated databases
We present Gargamel, an approach that serializes, upstream of execution, the transactions that may conflict, and runs independent transactions in parallel. The system offers strong transactional guarantees under the Parallel Snapshot Isolation (PSI) model. Each database replica executes its transactions sequentially, and synchronization between replicas remains minimal. Our simulations and experiments on Amazon EC2 show that Gargamel improves response time by a factor of 10 when the system is heavily loaded, and that in other cases its additional cost is negligible.
The purpose of database replication is to improve both performance and availability by allowing several transactions to run at the same time on separate copies. This works well for read-only transactions but remains a challenge in the presence of writes. Concurrency control is an expensive mechanism. Moreover, it is inefficient to execute conflicting transactions simultaneously, because one of the two will have to be aborted and restarted anyway. These well-known problems limit the use of modern architectures such as multi-core machines, clusters, grids, and clouds.
Our approach is to classify transactions based on a prediction of their conflicts made upstream, i.e. at the scheduler level, before any execution. Two transactions that do not conflict with each other are executed in parallel on separate copies of the database, which maximizes throughput. Transactions that might conflict with each other are submitted sequentially to the same replica, which ensures that they will not be aborted. This optimizes the use of resources. Gargamel does not impose global synchronization, which is necessarily expensive. All this improves throughput, response time, and resource usage.
The goal of Gargamel is to dynamically partition the workload into separate queues such that no transaction in one queue conflicts with a transaction in another. Each queue can then be executed independently. Transactions within the same queue, on the other hand, are executed in sequence in order to avoid aborts. Gargamel sits between the clients and the copies of the database. When it receives a new transaction, Gargamel checks for conflicts and places the transaction in a queue. We call the component that predicts possible conflicts the transaction classifier. For each new transaction T, the classifier checks whether T conflicts with a recently committed transaction or with a transaction already scheduled.
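The queue-based scheduling described above can be illustrated with a minimal sketch. It is not Gargamel's actual implementation: the names `Transaction`, `conflicts`, and `Scheduler` are hypothetical, the classifier is assumed to predict a conflict whenever two predicted write sets overlap, and queues are simply merged when a new transaction conflicts with more than one of them. Recently committed transactions and the removal of finished transactions are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    name: str
    writes: frozenset  # assumption: items the transaction is predicted to write


def conflicts(a, b):
    """Toy classifier (assumption): overlapping write sets mean a conflict."""
    return bool(a.writes & b.writes)


class Scheduler:
    def __init__(self):
        # Each queue is a list of transactions executed in order on one replica.
        self.queues = []

    def submit(self, txn):
        """Place txn in a queue and return the index of that queue."""
        # Queues holding at least one transaction predicted to conflict with txn.
        hit = [q for q in self.queues if any(conflicts(txn, t) for t in q)]
        if not hit:
            # Independent of everything scheduled: start a new queue.
            target = []
            self.queues.append(target)
        else:
            # Simplification: merge every conflicting queue into the first one,
            # so that no two transactions in different queues ever conflict.
            target = hit[0]
            for other in hit[1:]:
                target.extend(other)
            self.queues = [
                q for q in self.queues if all(q is not o for o in hit[1:])
            ]
        target.append(txn)
        return next(i for i, q in enumerate(self.queues) if q is target)


scheduler = Scheduler()
scheduler.submit(Transaction("t1", frozenset({"x"})))  # new queue 0
scheduler.submit(Transaction("t2", frozenset({"y"})))  # independent: new queue 1
scheduler.submit(Transaction("t3", frozenset({"x"})))  # conflicts with t1: queue 0
```

In this sketch `t1` and `t3` are serialized in the same queue, while `t2` can run in parallel on another replica. A more precise classifier would produce fewer false conflicts and therefore more, shorter queues.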
Conclusion:
Gargamel allows transactions to be executed in parallel on a replicated database. In the single-site configuration, all transactions are scheduled centrally, and two transactions that would conflict are always executed sequentially, so no work is wasted. Transactions detected as independent can be executed in parallel on different replicas. The more precisely the classifier anticipates conflicts, the more independent transactions can be parallelized. In the multi-site configuration, the database and the metadata are geo-replicated. In this configuration, Gargamel offers good fault-tolerance guarantees while maintaining good performance, thanks to its optimistic synchronization mechanism.