Excerpt from the document "Guide to Designing Communications Protocols for Exchanging Data and Synchronizing Data Between Handheld Devices and PCs" by IMSL Software, (c) 2000

16 Common Protocol Design Mistakes
----------------------------------

1. Having the hardware or firmware engineers implement something quick and dirty for the data-exchange protocol at the end of a handheld unit development project. Usually they don't have the experience to do a good job on this - they implement something that works for a lab test, but has serious shortcomings for a practical, robust application.

2. Implementing the data exchange protocol without a clear and detailed application-derived specification for what it has to do. In the absence of a detailed spec, the designer will usually take any shortcuts possible to implement something workable, using whatever functions happen to be there already. Only later is it discovered that the shortcuts don't work for the intended application.

3. Leaving critical timing dependencies in the communications code. This is a cardinal sin which should be avoided if at all possible. It makes implementation of the data exchange software difficult, and it makes the entire application unreliable when deployed to the field to run on a wide range of PCs with very different speeds of operation. If it is impossible to avoid, then at least spell it out clearly in the specification, and make it in the form of a minimum delay at specific points, not a tight timing window with both upper and lower limits.

4. Failing to specify the expected total amount of data to be exchanged at the start of the exchange. This prevents the receiver from displaying progress in the form of a progress bar or percent complete, does not allow a storage capacity check in advance, and does not allow the receiver to determine when the exchange is over or has been cut short.

5.
Failing to reliably distinguish one message type from another in the protocol design, for example when the start of a variable-content message could be mistaken for a special code such as Abort.

6. Failing to delimit messages clearly, requiring more complex processing to prevent data content from being mistaken for the end of a message.

7. Not breaking a large data exchange into smaller messages with error checking and handshaking. Sending a large amount of data as one huge message means that any error will abort the whole exchange - not appropriate for any but the most reliable links.

8. Failing to make appropriate trade-offs between efficiency and other factors. Many existing protocols are very slow because they have a ridiculous amount of overhead, for reasons which were not clearly thought through or weighed appropriately against the more significant loss of efficiency they would cause. Conversely, some protocols are made overly complex and non-robust in order to gain a very small amount of efficiency which is not worthwhile to the end user.

9. Not considering all the types of problems which could result if a random communications error occurs in any part of the protocol. There doesn't have to be special handling for every possibility, but the generic error handling should be robust enough to handle any error without drastic consequences.

10. Designing complex error-recovery schemes which are inappropriate for the likelihood or severity of the error. Just Abort and try again!

11. Sending elaborate data structures as part of the data exchange which have to be re-created or manipulated by complex code on the other side in order to modify existing data or add new data. Send and receive the minimum necessary, and re-create or revise the internal data structures to accommodate new data on the side which receives the data, rather than making it part of the data exchange.

12.
Using record identification schemes which are not fully thought out to prevent duplication if a database is reset, or multiple handheld units or databases are used, or the user continues to use the same database for a long time.

13. Designing a data synch scheme which can't handle unexpected interruptions, users taking unexpected actions (such as a Reset clearing the existing database, or restoring an old database file from a backup), or unexpected changes (such as switching PCs, or replacing a broken handheld unit).

14. Failing to consider the order in which data will be received. In some poorly-designed schemes the receiver must buffer data for a long time instead of processing it immediately, because it is missing some vital information which hasn't been sent yet (for example, if individual records don't specify the type of contents, and the receiver has to wait for an index table which comes last).

15. Forgetting that low-probability loopholes in the protocol are bound to cause a problem eventually, when hundreds of thousands of units are deployed to end users who use them on a daily basis to exchange megabytes of data.

16. Forgetting that different computers use different numeric data formats, different countries use different date formats, and different languages use different character sets.
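Mistakes 4, 6, and 7 all come down to framing: announce how much data is coming, mark where each message ends, and check each message for errors so one bad block doesn't sink the whole transfer. A minimal sketch in C of one way to do this - the frame layout, marker value, checksum, and function names here are illustrative assumptions, not taken from this guide:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed frame layout (illustrative only):
 *   byte 0      start marker (0x02)
 *   byte 1      message type
 *   bytes 2-3   payload length, little-endian
 *   bytes 4..   payload
 *   last 2      16-bit additive checksum of type + length + payload
 */
#define FRAME_STX 0x02u

static uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + data[i]);
    return sum;
}

/* Build a frame into out (must hold len + 6 bytes); returns frame size. */
size_t frame_build(uint8_t type, const uint8_t *payload, uint16_t len,
                   uint8_t *out)
{
    out[0] = FRAME_STX;
    out[1] = type;
    out[2] = (uint8_t)(len & 0xFF);
    out[3] = (uint8_t)(len >> 8);
    memcpy(&out[4], payload, len);
    uint16_t sum = checksum16(&out[1], (size_t)len + 3);
    out[4 + len] = (uint8_t)(sum & 0xFF);
    out[5 + len] = (uint8_t)(sum >> 8);
    return (size_t)len + 6;
}

/* Parse one frame; returns 0 on success, -1 on bad marker, length, or
 * checksum, so the receiver can NAK just this block and have it resent. */
int frame_parse(const uint8_t *buf, size_t buflen,
                uint8_t *type, const uint8_t **payload, uint16_t *len)
{
    if (buflen < 6 || buf[0] != FRAME_STX)
        return -1;
    uint16_t plen = (uint16_t)(buf[2] | ((uint16_t)buf[3] << 8));
    if (buflen < (size_t)plen + 6)
        return -1;
    uint16_t sum = (uint16_t)(buf[4 + plen] | ((uint16_t)buf[5 + plen] << 8));
    if (checksum16(&buf[1], (size_t)plen + 3) != sum)
        return -1;
    *type = buf[1];
    *payload = &buf[4];
    *len = plen;
    return 0;
}
```

With frames like these, a large exchange becomes a sequence of small ACK/NAK'd blocks, and a "total size" message sent first lets the receiver show progress and check storage capacity in advance (mistake 4).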
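For mistake 12, one fully-thought-out approach is to make record IDs globally unique by construction. A sketch under assumed details (the field widths and names are hypothetical): combine a factory-assigned unit serial number with a monotonic counter that is persisted separately from the record database, so a database Reset or restore never reissues an old ID, and two handheld units can never mint the same one.

```c
#include <stdint.h>

/* Hypothetical ID scheme: high 16 bits identify the handheld unit,
 * low 48 bits are a sequence counter stored outside the record
 * database so it survives a database reset or restore. */
typedef struct {
    uint16_t device_id;  /* unique per handheld unit */
    uint64_t next_seq;   /* monotonic; persist before handing out the ID */
} id_allocator;

uint64_t alloc_record_id(id_allocator *a)
{
    uint64_t id = ((uint64_t)a->device_id << 48)
                | (a->next_seq & 0x0000FFFFFFFFFFFFULL);
    a->next_seq++;
    return id;
}
```

The key design choice is that the counter's persistence is tied to the unit, not to any one database, which covers all three duplication cases the guide lists: resets, multiple units or databases, and long-lived databases.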
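The numeric-format half of mistake 16 has a standard cure: define one wire byte order in the protocol spec and encode and decode byte by byte, never by copying a struct or integer out of memory. A minimal sketch (little-endian chosen arbitrarily for illustration):

```c
#include <stdint.h>

/* Encode a 32-bit value in a fixed little-endian wire order, byte by
 * byte, so the encoded bytes are identical whatever the host CPU's
 * native byte order happens to be. */
void put_u32le(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Decode the same fixed wire order back into a host integer. */
uint32_t get_u32le(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

The same principle - pick one wire representation and convert at the edges - applies equally to the date formats and character sets the guide mentions.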