Problem

  General

    It is really difficult to synchronise different Zope database systems. (There is a product by Zope Corp which does that, but it is very expensive and not very public.)

    There are two different problems:

    - synchronising the same ODB (object database) on multiple servers / sites

    - synchronising some data stored in different ODBs on multiple servers / sites

    The first case is that of one company with a global ODB. The second case is that of two companies who want to share their mutual orders but not their other orders.

    If case 2 is solved, then case 1 is solved too (case 1 is the special case in which the shared data is the whole database). We shall solve case 2 by using an intermediate XML format specific to each application.

  Specific problems

    Most synchronisation systems need a database which acts as a master, with all slaves being copies of the main database (case 1). Those systems do not allow copying only part of the data: they copy all objects from the master database to all slave databases.

Proposed Solution

  This project proposes to create a Zope Product that synchronises databases. This synchronisation must update only the part of the database that is needed. Specifically, we want to achieve the following goals:

  o manage a list of subscribers who can take out a subscription to a query.

  o manage queries applied to a database of objects.

  o generate XML in order to communicate with several databases.

  o use filters, so that the generated XML only contains part of the attributes of objects in the ODB (one XML file per Zope Object / Document, e.g. one XML file per order). For example, private comments should not be exported in the generated XML file.

  o synchronise databases with the content of the XML.

  There is already a generic synchronisation system for Zope: ZSyncer. But this system is too limited for our needs.
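The filtering goal above (export only part of an object's attributes, one XML file per object) can be sketched as follows. This is only an illustration under stated assumptions: objects are modelled as plain dictionaries, the filter is a simple allow-list of attribute names, and `export_object_xml` and the attribute names are hypothetical. In the proposed Product this role would be played by a PageTemplate/DTML.

```python
import xml.etree.ElementTree as ET


def export_object_xml(obj, allowed_attributes, tag="object"):
    """Serialise only the allowed attributes of obj (a dict) to an XML string.

    Hypothetical helper: attributes not listed in allowed_attributes
    (for example private comments) never reach the generated XML.
    """
    root = ET.Element(tag)
    for name in allowed_attributes:
        if name in obj:
            child = ET.SubElement(root, name)
            child.text = str(obj[name])
    return ET.tostring(root, encoding="unicode")


# Example object: an order with a private comment that must not be exported.
order = {"id": "order_1", "quantity": 3, "private_comment": "call back later"}
xml_text = export_object_xml(order, ["id", "quantity"], tag="order")
print(xml_text)
```

An allow-list is used here rather than a deny-list so that an attribute added later stays private until it is explicitly published.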
Details

  Objects

    All objects will have the following attributes:

    - id : the standard object id

    - uid : a global object id which is unique to each ZODB

    - rid : the standard object id in the master ODB the object was subscribed from (remote id)

    - sid : the id of the subscription/synchronisation object which this object was generated from

    A mapping from id to rid and from rid to id is available through the subscription sid.

  First step : initialisation

    1 We create a subscription inside portal_synchronisations in the slave. We define for that subscription:

      - id, title, description

      - the URL of the master (which contains an index of all files we can sync); this URL defines a public synchronisation handle in the master server

      - a local query (SQL Method) to define the local realm of synchronisation

      - an import Conduit in order to validate data, calculate differences and import differences. This is where we define the format of the data we receive from the master

      - a local PageTemplate/DTML to convert local objects to XML. This is where we define both the format we send to the master and the realm of attributes we synchronise (i.e. the mapping)

    2 The slave sends the "initialisation package" (SyncML specification). Several elements appear in this package:

      - the SyncHdr element, with information about the protocol used. We will put the query in this element, together with authentication information.

      - the Alert element, in order to specify what kind of synchronisation we want; for us it will be Alert 200, which specifies a client-initiated, two-way synchronisation.

    3 We create a publication in the master database.
      We define:

      - id, title, description

      - a local query (SQL Method) to define the local realm of publication (which objects we publish)

      - a local PageTemplate/DTML to convert local objects to XML (what we publish in each object and in what format)

      - an import Conduit in order to be able to commit changes sent by subscribed databases into the master database (OPTION)

      At this step, we produce all XML files from the master ODB.

    4 The master sends the "initialisation package" to the client with:

      - the SyncHdr element, with information about the protocol used, together with authentication information.

      - the Status element, in order to respond to the Alert command sent by the client. A status code is returned; it will typically be 212 if the client's authentication is accepted.

      - the differences between the two databases.

      - the Alert element, in order to specify what kind of synchronisation we want; for us it will be 201, which specifies a client-initiated, two-way slow synchronisation. This just means that both the server and the client send all their data. Each data item must specify the DTD used.

      At this step, we may optionally record the subscriber in the publication of the master database for... (OPTION). This is like the subscription becoming a member of a mailing list in order to be informed of changes to a publication.

      //- if the slave already has some data (for example an address book) corresponding
      //to the query (for example all people from the company Nexedi), then those data
      //must be sent at the same time. Each data item must specify the DTD used.

      //6 Finally the slave imports into its database all objects from the master.
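The initialisation packages exchanged in steps 2 and 4 can be illustrated with a minimal sketch of the slave's first message. It is deliberately simplified and is not a conforming SyncML message: the element names follow our reading of the SyncML representation protocol, the credentials and the query that the proposal places in the header are omitted, and the function names and URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

ALERT_TWO_WAY = 200         # client-initiated, two-way synchronisation
ALERT_SLOW_SYNC = 201       # two-way slow synchronisation (both sides send all data)
STATUS_AUTH_ACCEPTED = 212  # status expected when authentication is accepted


def build_init_package(source_url, target_url, alert_code,
                       session_id="1", msg_id="1"):
    """Build a simplified initialisation package as an XML string."""
    root = ET.Element("SyncML")

    # Header: protocol information plus the two end points.
    header = ET.SubElement(root, "SyncHdr")
    ET.SubElement(header, "VerDTD").text = "1.0"
    ET.SubElement(header, "VerProto").text = "SyncML/1.0"
    ET.SubElement(header, "SessionID").text = session_id
    ET.SubElement(header, "MsgID").text = msg_id
    target = ET.SubElement(header, "Target")
    ET.SubElement(target, "LocURI").text = target_url
    source = ET.SubElement(header, "Source")
    ET.SubElement(source, "LocURI").text = source_url

    # Body: a single Alert command carrying the requested synchronisation type.
    body = ET.SubElement(root, "SyncBody")
    alert = ET.SubElement(body, "Alert")
    ET.SubElement(alert, "CmdID").text = "1"
    ET.SubElement(alert, "Data").text = str(alert_code)
    return ET.tostring(root, encoding="unicode")


def read_alert_code(package_xml):
    """Return the alert code found in a package built by build_init_package."""
    root = ET.fromstring(package_xml)
    return int(root.find("SyncBody/Alert/Data").text)


# The slave asks the master for a two-way synchronisation (hypothetical URLs).
package = build_init_package("http://slave.example.org/subscription",
                             "http://master.example.org/publication",
                             ALERT_TWO_WAY)
print(read_alert_code(package))
```

The master's answer in step 4 would have the same shape, with a Status command carrying 212 and an Alert carrying 201 in its body.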
  Second step : synchronisation

    General idea

      M    master

      S    slave

      C    conflicts

      t    continuous time

      tn   discrete time

      1 calculate the difference on the slave: DS(tn) = S(tn) - S(tn-1)

      2 upload the differences to the master (or send them by email, or by any other means)

      3 calculate the difference on the master: DM(tn) = M(tn) - M(tn-1)

      4 update the master with the differences from the slave: M(tn) <- M(tn) + DS(tn)

      5 manage conflicts and, if possible, solve them on the master: C(tn)

      6 upload the differences and the conflict resolutions to the slave

      7 update the slave: S(tn) <- S(tn) + DM(tn) + C(tn)

    Detailed implementation

      To be written.

References

  XMLDiff - http://www.garshol.priv.no/download/xmltools/prod/xmldiff.html

  SyncML - http://www.syncml.org

  ZSyncer - http://www.zope.org/Members/andym/ZSyncer
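As an illustration of the "General idea" of the second step, the seven operations can be sketched on toy data. This is a sketch under several assumptions that the proposal does not fix: each database is modelled as a dictionary from uid to value, deletions are ignored, a conflict is an object changed on both sides to different values, and conflicts are resolved by letting the master win. The names `diff` and `synchronise` are hypothetical.

```python
def diff(previous, current):
    """D(tn) = X(tn) - X(tn-1): objects added or modified since previous.

    Deletions are ignored in this sketch.
    """
    return {uid: value for uid, value in current.items()
            if uid not in previous or previous[uid] != value}


def synchronise(master_prev, master_now, slave_prev, slave_now):
    """Run one synchronisation round and return (master, slave, conflicts)."""
    ds = diff(slave_prev, slave_now)    # DS(tn), computed on the slave
    dm = diff(master_prev, master_now)  # DM(tn), computed on the master

    # C(tn): objects changed on both sides to different values.
    # Assumed policy: the master's value wins.
    conflicts = {uid: master_now[uid] for uid in ds
                 if uid in dm and ds[uid] != dm[uid]}

    # M(tn) <- M(tn) + DS(tn), skipping conflicting objects.
    new_master = dict(master_now)
    for uid, value in ds.items():
        if uid not in conflicts:
            new_master[uid] = value

    # S(tn) <- S(tn) + DM(tn) + C(tn).
    # With the master-wins policy C(tn) is already contained in DM(tn);
    # it is applied separately only to mirror the formula.
    new_slave = dict(slave_now)
    new_slave.update(dm)
    new_slave.update(conflicts)
    return new_master, new_slave, conflicts


# Both sides start from the same state, then diverge.
master_prev = {"a": 1, "b": 2}
slave_prev = {"a": 1, "b": 2}
master_now = {"a": 10, "b": 2, "c": 3}  # master changed a, added c
slave_now = {"a": 11, "b": 20}          # slave changed a (conflict) and b

new_master, new_slave, conflicts = synchronise(
    master_prev, master_now, slave_prev, slave_now)
print(new_master, new_slave, conflicts)
```

After the round both sides hold the same objects, and the only conflict ("a") carries the master's value.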