root/trunk/docs/whitepaper/synchronization.tex

Revision 3200, 8.9 kB (checked in by dgollub, 9 months ago)

Added initial draft of whitepaper rewrite
Work in progress - patches are welcome!

Hopefully this improves the bus factor
http://en.wikipedia.org/wiki/Bus_factor

\chapter{Synchronization}
This chapter gives a brief introduction to synchronization basics and explains
how OpenSync works and handles real-life synchronization issues.\\
\\
Different synchronization techniques are in use nowadays, and they have some of
the following tasks in common:
\begin{itemize}
 \item Connect
 \item Get changes
 \item Conflict resolution
 \item Multiply changes
 \item Commit changes
 \item Disconnect
\end{itemize}
These tasks are common to every synchronization technique or protocol, but they
differ in detail so that each one can work efficiently under its particular
circumstances. Such circumstances could be:
\begin{itemize}
 \item Number of synchronization parties. If there are only two parties,
 multiplying a change is a simple duplication of that change.
 \item Unidirectional/bidirectional synchronization. A unidirectional
 synchronization between two parties requires no conflict resolution.
 \item Resource. Depending on the type of data resource, further work may be
 required to get changes. Can the resource tell on its own which data changed
 since the last synchronization, or is a further facility required to detect
 those changes? Examples: file systems, databases, stacked data in a single
 file, ...
 \item Type of data. Is the data in a consistent format that all parties
 support, as in file synchronization? Or is the data inconsistent, containing
 unique information that rules out a binary compare, so that a weak compare is
 needed? Is conversion to a common format required for the different parties?
 \item Protocol. Does the protocol require reading only the latest changes, or
 all entries at once? Does the protocol support single commits, or only
 committing everything at once (batch commit)?
 \item Transport. Are various transport layers involved? Does the transport
 require connecting and disconnecting in a specified way? Is bandwidth limited?
 Examples: Bluetooth, USB, ...
\end{itemize}
As you can see, there are many different circumstances, which makes it quite
complicated to meet the needs of all the different ways to synchronize and of
all synchronization protocols.\\
This is also only the tip of the iceberg, since it describes only the
synchronization role of the ">Server"<.
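
The six tasks listed at the start of this chapter can be sketched as a small
program. This is only an illustration, not OpenSync code: the class
\verb|Party| and the function \verb|synchronize| are hypothetical names, all
parties are assumed to share the same ids (so no mappings are needed), and the
entry data is assumed to be plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """Hypothetical in-memory party; not an OpenSync type."""
    name: str
    entries: dict = field(default_factory=dict)  # committed state: uid -> data
    pending: dict = field(default_factory=dict)  # local changes since last sync
    connected: bool = False

    def connect(self):
        self.connected = True

    def disconnect(self):
        self.connected = False

    def get_changes(self):
        # Hand out the pending changes and forget them locally.
        changes, self.pending = self.pending, {}
        return changes

    def commit(self, changes):
        self.entries.update(changes)


def synchronize(parties, resolve):
    """Run the six tasks in a fixed order; resolve() picks one of several
    conflicting values (assumption: the caller supplies the policy)."""
    for party in parties:                                    # 1. connect
        party.connect()
    reported = {p.name: p.get_changes() for p in parties}    # 2. get changes
    merged = {}
    for uid in {u for c in reported.values() for u in c}:    # 3. conflict resolution
        candidates = [c[uid] for c in reported.values() if uid in c]
        if len(set(candidates)) == 1:
            merged[uid] = candidates[0]       # everyone agrees: no conflict
        else:
            merged[uid] = resolve(candidates)
    for party in parties:                     # 4. multiply and 5. commit
        party.commit(merged)
    for party in parties:                                    # 6. disconnect
        party.disconnect()
    return merged
```

With two parties, step 4 is a plain duplication of each change, as noted in
the list of circumstances above.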
\section{Synchronization Role}
The term ">Server"< is quite confusing, especially in combination with a
synchronization protocol whose transport protocol is based on the
">Client"<-">Server"< model. The best-known example is ">SyncML"<, which
supports, among others, HTTP and OBEX as transports. You might know an ">HTTP
Server"< such as the Apache web server and an ">HTTP Client"< such as the
Firefox web browser. In SyncML you can have, for example (the same applies to
the other transports supported by SyncML):
\begin{itemize}
\item HTTP Server transport, acting as Synchronization Server
\item HTTP Client transport, acting as Synchronization Server
\item HTTP Server transport, acting as Synchronization Client
\item HTTP Client transport, acting as Synchronization Client
\end{itemize}
OpenSync doesn't care much about the transport Server/Client role; this is up
to the plugins. There is only one small detail OpenSync has to take care of
when a plugin acts in the transport Server role: the plugin has to stay
initialized all the time so that the client can connect. More about this in
the Plugin chapter.\\
\\
Unfortunately, OpenSync 0.40 supports only the Synchronization role Server. The
passive role as Synchronization Client isn't implemented yet, but it is at the
top of the project agenda. The reason is that the order of the synchronization
tasks/steps mentioned above is currently fixed in the implementation. For a
Synchronization Client, the order and number of these steps/tasks would differ
from those of the Server role. More about this issue can be found in the
Framework chapter, in the section Synchronization Role.
\section{Slow Sync}
Various synchronization protocols use the so-called ">Slow Sync"<
technique. Such protocols distinguish two types of synchronization: the
already mentioned ">Slow Sync"< and a regular synchronization (sometimes called
">Fast Sync"<). The difference is that the regular (Fast) Sync only transfers
the changes made since the last synchronization. This means that on a regular
synchronization not all entries have to be transferred and converted, which
makes the synchronization quite efficient and very fast. The ">Slow Sync"<
intentionally requests all entries, which makes the synchronization slower.
Additionally, the Synchronization Framework has to interpret every single
entry/change as newly added, since the Framework has discarded all mappings in
advance and has to compare every single reported entry from each party to find
the fitting counterpart. This, combined with the transfer of all entries,
makes a ">Slow Sync"< very slow compared to the regular (Fast) synchronization.
The ">Slow Sync"< is only used in certain cases, to avoid data inconsistency
and to keep all the data in sync. A ">Slow Sync"< is triggered in the following
circumstances:
\begin{itemize}
\item First/initial synchronization
\item A party was reset (same as a first sync)
\item A party was synchronized in the meantime within another environment
\item After an aborted/failed synchronization
\end{itemize}
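
The decision between a Slow Sync and a regular sync could be sketched as
follows. This is a minimal illustration under stated assumptions, not the
OpenSync implementation: the class \verb|PartyState|, its fields and both
functions are hypothetical, and detecting a synchronization in another
environment by comparing a stored token (an ">anchor"<) is an assumption
borrowed from anchor-based protocols.

```python
from dataclasses import dataclass


@dataclass
class PartyState:
    """Hypothetical per-party bookkeeping; field names are illustrative."""
    has_synced_before: bool
    was_reset: bool
    last_anchor: str       # token the party remembers from its last sync
    expected_anchor: str   # token this environment stored after its last sync
    last_sync_failed: bool


def needs_slow_sync(state):
    """Return True if any of the four circumstances listed above applies."""
    if not state.has_synced_before:      # first/initial synchronization
        return True
    if state.was_reset:                  # party was reset
        return True
    if state.last_anchor != state.expected_anchor:
        return True                      # synchronized elsewhere in the meantime
    if state.last_sync_failed:           # aborted/failed synchronization
        return True
    return False


def entries_to_transfer(all_entries, changed_uids, slow_sync):
    """A Slow Sync requests everything; a regular sync only the changes."""
    if slow_sync:
        return dict(all_entries)
    return {uid: all_entries[uid] for uid in changed_uids}
```

The second function shows why the regular sync is cheaper: the amount of
transferred data depends on the number of changes, not on the number of
entries.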
\section{Object Types}
In OpenSync the term ">Object Types"< describes the type/category of
data. Examples of ">Object Types"< are Contacts, Events, Todos, Notes, plain
Data (like the content of a file) and others. (It's not limited to PIM data!)
The Object Types are processed separately, which makes it configurable which
Object Types get synchronized. Example: synchronize only the contacts of a
mobile, but no events, todos or notes.
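
The example of synchronizing only the contacts can be sketched like this. The
function name, the tuple layout and the object type names are assumptions made
for this illustration; they do not reflect the OpenSync configuration format.

```python
def select_changes(changes, enabled_objtypes):
    """Keep only the changes whose object type is enabled.

    changes: list of (objtype, uid) tuples (hypothetical layout).
    enabled_objtypes: set of object type names chosen by the user.
    """
    return [change for change in changes if change[0] in enabled_objtypes]


# Changes reported by a mobile; only contacts should be synchronized.
reported = [("contact", "c1"), ("event", "e1"), ("contact", "c2"), ("note", "n1")]
contacts_only = select_changes(reported, {"contact"})
```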
\section{Formats}
The ability to synchronize different parties that use different formats makes
OpenSync a very powerful Synchronization Framework. In OpenSync each Format is
associated with one Object Type (see the previous section). Using the Object
Type as the common denominator of different formats makes it possible to
determine a conversion path between them. The conversion path consists of
various format converters, which are provided by Format Plugins.
Example: Two parties should synchronize their contacts (the Object Type). Party
A stores the contacts as VCard 3.0 and Party B stores the contacts in some
binary format. To synchronize VCard 3.0 with this Binary Contact Format,
format plugins have to register those formats and provide converters. The
plugins don't have to provide converters for every known format; often a
certain number of converters to common formats, or to a common denominator
format, is enough to create a conversion path from VCard 3.0 to the Binary
Contact Format.
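
Finding a conversion path can be seen as a shortest-path search in a graph
whose nodes are formats and whose edges are registered converters. The sketch
below uses a breadth-first search for this. It is an illustration only: how
OpenSync itself searches for the path is not described here, and the function
name and the format names are hypothetical.

```python
from collections import deque


def find_conversion_path(converters, source, target):
    """Return the shortest list of formats leading from source to target,
    or None if the registered converters cannot connect them.

    converters: iterable of (from_format, to_format) pairs.
    """
    graph = {}
    for src, dst in converters:
        graph.setdefault(src, []).append(dst)

    queue = deque([[source]])   # each queue item is a path so far
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Both plugins only convert to and from a common denominator format.
converters = [
    ("vcard30", "common-contact"),
    ("common-contact", "vcard30"),
    ("binary-contact", "common-contact"),
    ("common-contact", "binary-contact"),
]
path = find_conversion_path(converters, "vcard30", "binary-contact")
```

No plugin provides a direct converter between the two formats here; the path
goes through the common denominator format.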
\section{Mappings}
If an entry is changed on one party, the logically same entry has to be updated
on the other parties during synchronization. Different parties often use
different ids to identify their entries, so the logically same entries have to
be mapped to each other by their native ids. The combination of those
entries on the different parties is called a ">Mapping"<. ">Mappings"< make it
possible to detect a conflict when mapped entries were changed on different
parties at the same time in different ways.
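
A mapping can be pictured as a small table from party to native id. The sketch
below is an assumption about one possible data structure, not the one OpenSync
uses; \verb|Mapping|, \verb|find_mapping| and \verb|is_conflict| are
hypothetical names, and entry data is assumed to be plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """One logical entry; native_ids maps party name -> that party's own id."""
    native_ids: dict = field(default_factory=dict)


def find_mapping(mappings, party, native_id):
    """Look up the mapping that contains the given party's native id."""
    for mapping in mappings:
        if mapping.native_ids.get(party) == native_id:
            return mapping
    return None


def is_conflict(mapping, changes):
    """changes: {(party, native_id): new_data} as reported in one sync.

    A conflict exists if at least two entries of the mapping were changed
    and the new data differs."""
    reported = [
        data
        for (party, native_id), data in changes.items()
        if mapping.native_ids.get(party) == native_id
    ]
    return len(set(reported)) > 1


# The same contact is known as "17" on the phone and as "abc" on the desktop.
contact = Mapping({"phone": "17", "desktop": "abc"})
```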
\section{Conflicts}
So-called ">Conflicts"< appear if at least two entries of the same mapping were
changed in different ways. No conflict appears if all entries of the mapping
were changed in the same way. Such conflicts have to be handled by the
Synchronization Framework to avoid data loss. There are several ways to solve
such conflicts. OpenSync provides several conflict resolutions that avoid
unintended loss of data. The following conflict resolutions are supported by
the OpenSync Framework:
\begin{itemize}
\item Solve: intentionally choose one of the conflicting entries to solve the
conflict. The chosen one will be multiplied to all parties and will overwrite
the other conflicting changes. This also allows configuring in advance the
">Winning"< party, whose changes will always be used as the solving change
(">master change"<) for the conflict.
\item Duplicate: (intentionally) duplicate all changed entries.
\item Latest: use the most recently changed entry of the conflicting entries.
This is only a conflict resolution option if the formats of all changes provide
enough information to determine which one was changed most recently.
\item Ignore: (temporarily) ignore the conflict until the next synchronization.
To avoid inconsistency and data loss, the conflicting entries will be read and
compared again by the OpenSync Framework on the next synchronization. If the
entries/changes are equal on the next synchronization, the conflict is solved
as well. (This conflict resolution requires that the protocols of all parties
are able to request single entries without triggering a ">Slow Sync"<.)
\end{itemize}
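
The four conflict resolutions above can be sketched as one function. This is
a simplified illustration, not the OpenSync API: \verb|Change|,
\verb|resolve| and the strategy names are hypothetical, and the modification
time is modelled as an optional number that a format may or may not provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical conflicting change reported by one party."""
    party: str
    data: str
    modified: Optional[float] = None  # None: format has no usable timestamp


def resolve(changes, strategy, winner=None):
    """Return the changes to multiply to all parties.

    An empty list means the conflict is postponed to the next sync."""
    if strategy == "solve":
        # Pick the change of the (pre-configured or chosen) winning party.
        return [next(c for c in changes if c.party == winner)]
    if strategy == "duplicate":
        # Keep every conflicting change as a separate entry.
        return list(changes)
    if strategy == "latest":
        if any(c.modified is None for c in changes):
            raise ValueError("not all changes provide a modification time")
        return [max(changes, key=lambda c: c.modified)]
    if strategy == "ignore":
        return []
    raise ValueError("unknown strategy: " + strategy)
```

The check in the ">latest"< branch mirrors the restriction stated above: this
resolution is only available if every change carries enough information to
tell which one is the most recent.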
\section{Capabilities}
\section{Filter}
OpenSync provides initial code for filtering, but it's not yet usable in
OpenSync 0.40. Look forward to OpenSync 0.41!