Timestamp formats - The Good, The Bad and the Plain Ugly
In many security and network performance applications, raw packets must be captured and written to a storage device. Some applications, particularly in the finance sector, require an accurate timestamp to be inserted into each packet before it is forwarded to the analysis tools. These timestamps typically have nanosecond resolution, and they are usually inserted by a network packet broker. Unfortunately, there is a wide range of packet broker manufacturers, each with a different timestamp format, and each format has its own advantages and disadvantages. In this blog I’ll cover four main timestamp structures. Each manufacturer has its own distinct flavor, but most fall into one of these four formats.
First, let’s look at the generic structure of an IP packet:
This shows the L2 header (the destination and source MAC addresses, an optional VLAN tag and an Ethertype field), followed by the IP packet itself, whose headers carry fields such as the IP addresses, protocol and ports. Finally, there is a 4-octet frame check sequence (FCS) at the end, which is used for error detection.
There are several locations and formats that can be used for timestamping, but let’s look at the four most common.
1. Insert the timestamp at the end of the payload. A typical packet timestamped in this way looks like the following:
A total of 14 octets has been added to the rear of the packet: the timestamp itself (8 octets), 2 octets of optional information and a newly calculated FCS (4 octets). The old FCS has been retained so that any problems with the original FCS can be analyzed. It’s usual to use 8 octets for the timestamp, as this allows unambiguous resolution of the date and time and also gives nanosecond resolution. The standard approach is to use the first 4 octets for the seconds since the Unix epoch and the second 4 octets for the nanoseconds. It’s also not unusual to have a couple of octets of optional information. These may be used to indicate the source of the timestamp reference, e.g. whether it’s NTP or PTP.
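As a rough illustration of how an analysis tool might read such a trailer, here is a minimal Python sketch. It assumes the 14 added octets appear in the order timestamp seconds, timestamp nanoseconds, optional information, new FCS, all in network byte order. The field order, the byte order and the function name `parse_trailer` are assumptions made for this example; real packet brokers may differ.

```python
import struct
from datetime import datetime, timezone

# 8 octets of timestamp + 2 octets of optional info + 4 octets of new FCS
TRAILER_LEN = 14


def parse_trailer(frame: bytes):
    """Split a frame into the original frame (including its old FCS)
    and the fields of the assumed 14-octet timestamp trailer."""
    if len(frame) < TRAILER_LEN:
        raise ValueError("frame too short to carry a timestamp trailer")
    original = frame[:-TRAILER_LEN]
    # "!IIHI" = big-endian: seconds, nanoseconds, optional info, new FCS
    seconds, nanos, info, new_fcs = struct.unpack("!IIHI", frame[-TRAILER_LEN:])
    return original, seconds, nanos, info, new_fcs


# Build a dummy 64-octet frame and append a made-up trailer to it.
dummy = b"\x00" * 64
frame = dummy + struct.pack("!IIHI", 1_700_000_000, 123_456_789, 1, 0xDEADBEEF)

original, seconds, nanos, info, new_fcs = parse_trailer(frame)
when = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(f"{when.isoformat()} + {nanos} ns, info={info}, new FCS={new_fcs:#010x}")
```

Because the trailer is stripped from the end, everything in `original` sits at the same offsets as in the untimestamped packet.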
The big advantage of this format is that it does not ‘destroy’ the structure of the original packet. If someone has written a parser for the data payload, it will continue to work.
The disadvantage, though, is that the size of the packet has increased. This affects storage capacity, and if a congested network link is used between the packet broker and the analytics/storage tool, there is a danger that packets could be dropped. Typical packet sizes in financial trading applications are around 200 bytes. Adding a 14-byte timestamp at the end of the packet increases the size of such packets by 7%, and allowance for this needs to be made in both storage capacity and tool bandwidth.
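The overhead arithmetic is simple enough to sketch in a few lines of Python. The helper names `timestamp_overhead` and `scaled_rate` are made up for this example, and the rate calculation ignores the Ethernet preamble and inter-frame gap, so treat the numbers as a first approximation only.

```python
def timestamp_overhead(packet_bytes: int, added_bytes: int) -> float:
    """Fractional size increase caused by the added timestamp octets."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    return added_bytes / packet_bytes


def scaled_rate(rate_bps: float, packet_bytes: int, added_bytes: int) -> float:
    """Approximate bit rate after timestamping, assuming every packet has
    the same size and ignoring preamble and inter-frame gap."""
    return rate_bps * (1 + timestamp_overhead(packet_bytes, added_bytes))


overhead = timestamp_overhead(200, 14)    # 0.07, i.e. 7%
tool_rate = scaled_rate(1e9, 200, 14)     # about 1.07 Gbit/s for 1 Gbit/s of input
print(f"overhead: {overhead:.1%}, tool rate: {tool_rate / 1e9:.2f} Gbit/s")
```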
2. Insert the timestamp at the end of the payload, but use a time beacon to carry the UTC seconds. In this approach the packet looks like the following:
This time the timestamp within the packet is just 4 octets long and holds only the nanoseconds portion of the time. The seconds portion is transmitted in a separate time beacon packet, which is sent out every second. Any analysis tool can then calculate the complete time by adding the nanosecond portion from each packet to the beacon time. The benefit of this approach is that it reduces the overhead of the timestamp. Instead of a 14-byte overhead we have only a 10-byte overhead, so the impact on a 200-byte packet is reduced to 5%, a saving of 2 percentage points. All good, but this has come at the price of a significant increase in complexity. Using tools such as Wireshark becomes harder, as the seconds portion of the timestamp is not carried in the packet itself. More complex decoders have to be used that can track the seconds portion of the timestamp. Allowance also has to be made for what happens if a time beacon is missed.
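To give a feel for the extra decoder state this requires, here is a minimal Python sketch of a tracker that combines beacon seconds with per-packet nanoseconds. The class name `BeaconTracker` and its methods are hypothetical. The missed-beacon handling is one possible heuristic, not a vendor algorithm: it assumes packets arrive in order and at least once per second, and treats a backwards jump in the nanosecond value as a sign that a second boundary passed without a beacon.

```python
from typing import Optional

NANOS_PER_SECOND = 1_000_000_000


class BeaconTracker:
    """Rebuilds full timestamps from beacon seconds plus packet nanoseconds."""

    def __init__(self) -> None:
        self.seconds: Optional[int] = None  # seconds from the latest beacon
        self.last_nanos = 0                 # nanoseconds of the latest packet

    def on_beacon(self, utc_seconds: int) -> None:
        """Record the seconds value carried by a time beacon packet."""
        self.seconds = utc_seconds
        self.last_nanos = 0

    def on_packet(self, nanos: int) -> Optional[int]:
        """Return the full time in nanoseconds since the Unix epoch,
        or None if no beacon has been seen yet."""
        if self.seconds is None:
            return None
        if nanos < self.last_nanos:
            # Nanoseconds wrapped without a beacon: assume one was missed.
            self.seconds += 1
        self.last_nanos = nanos
        return self.seconds * NANOS_PER_SECOND + nanos


tracker = BeaconTracker()
before = tracker.on_packet(42)          # no beacon yet, so no full time
tracker.on_beacon(100)
t1 = tracker.on_packet(500)
t2 = tracker.on_packet(900_000_000)
t3 = tracker.on_packet(1_000)           # the beacon for second 101 was lost
tracker.on_beacon(102)
t4 = tracker.on_packet(10)
```

Even this small sketch has to keep state between packets, which is exactly the complexity that the self-contained 8-octet timestamp avoids.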
3. Use MAC source substitution: overwrite the source MAC address with the timestamp. In this case the packet looks like the following:
The original source MAC address has now been overwritten by a 6-octet timestamp, and the old FCS has been replaced by a newly calculated one. The big advantage of this approach is that the packet length is the same as that of the original packet, so there is no need for additional tool bandwidth or storage capacity. There are, however, a number of significant disadvantages. With only 6 octets of timestamp data, it’s not possible to give an unambiguous UTC time; some other time reference is needed. If 4 octets are used for nanoseconds, this leaves just 2 octets for seconds, which covers only 65,536 seconds, or about 18 hours. So every 18 hours the timestamp rolls over and a possibly duplicate timestamp is generated. This can be compensated for by introducing a time beacon as described above, but this has its own issues. Another problem is that there is no space for optional fields that indicate things like the source of the time reference. One final issue is that the old FCS has been overwritten, so there is no way to trace problems with CRC errors.
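The rollover problem can be illustrated with a short Python sketch. It assumes the 6 octets in the source MAC position (frame octets 6 to 11) hold 2 octets of seconds followed by 4 octets of nanoseconds in network byte order, and that the decoder has some external reference time, such as the capture file's creation time, that is within about 9 hours of the true time. The function names and the layout are assumptions for this example only.

```python
import struct

ROLLOVER_SECONDS = 1 << 16  # 65,536 s, a little over 18 hours


def decode_mac_timestamp(frame: bytes):
    """Read (16-bit seconds, nanoseconds) from the source MAC position."""
    return struct.unpack("!HI", frame[6:12])


def resolve_seconds(sec16: int, reference_seconds: int) -> int:
    """Pick the full UTC second nearest to the reference time whose
    low 16 bits match the truncated seconds value from the packet."""
    base = reference_seconds - (reference_seconds % ROLLOVER_SECONDS) + sec16
    candidates = (base - ROLLOVER_SECONDS, base, base + ROLLOVER_SECONDS)
    return min(candidates, key=lambda s: abs(s - reference_seconds))


# Made-up frame: destination MAC, timestamp in place of source MAC, rest.
true_seconds = 1_700_000_000
frame = (
    b"\xaa" * 6
    + struct.pack("!HI", true_seconds % ROLLOVER_SECONDS, 250_000_000)
    + b"\x08\x00"
    + b"\x00" * 46
)

sec16, nanos = decode_mac_timestamp(frame)
# The reference is an hour off, which is still enough to pick the right period.
full_seconds = resolve_seconds(sec16, true_seconds + 3600)
```

If the reference time is off by more than half the rollover period, this approach silently picks the wrong 18-hour window, which is the ambiguity described above.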
4. Insert a full timestamp in the L2 header. In this case the packet looks like the following:
In this case the Ethertype has been replaced by a new, special Ethertype, which indicates that a timestamp follows the standard Ethertype field. A full 8 octets allows the UTC time to be fully decoded with nanosecond resolution. The FCS is also recalculated. The advantage of this approach is that only 8 octets have been added to the length, though the old FCS is not retained and there is no ability to indicate the source of the timestamps. The additional octets for the old FCS and the time source could be added, but then the overhead of the timestamp would be the same as in case 1. The big problem with this approach, though, is the trouble it causes for downstream packet brokers and analytics tools. Many of these downstream devices have parsers that work on offsets from the beginning of the packet. Allowance is typically made for VLAN tags (often multiple layers), but they do not expect to see non-standard L2 headers. As the offsets are no longer standard, these packet decoders will not work and will have to be rewritten. In many cases downstream packet brokers cannot correctly filter the traffic, as the offset filters will not work. Another disadvantage is that the MAC addresses of the source devices have been ‘lost’. Finding faults caused by faulty hardware therefore becomes more difficult. Any kind of load balancing that includes source MAC addresses is also more difficult.
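The offset problem can be seen in a small Python sketch of an L2 parser. The layout here is assumed purely for illustration: a special Ethertype at the usual position, then an 8-octet timestamp (4 octets of seconds, 4 of nanoseconds), then the original Ethertype. The value `TS_ETHERTYPE = 0x88B5` is the IEEE local experimental Ethertype used as a placeholder, not any vendor's real value, and the exact overhead of a given vendor's format may differ from this sketch.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100
TS_ETHERTYPE = 0x88B5  # placeholder (IEEE local experimental), not a vendor value


def find_l3_offset(frame: bytes):
    """Return (ethertype, l3_offset, timestamp) for a frame, where
    timestamp is (seconds, nanoseconds) or None if no timestamp header."""
    offset = 12  # skip destination and source MAC addresses
    timestamp = None
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    while ethertype in (ETHERTYPE_VLAN, TS_ETHERTYPE):
        if ethertype == TS_ETHERTYPE:
            timestamp = struct.unpack_from("!II", frame, offset)
            offset += 8  # skip the assumed 8-octet timestamp
        else:
            offset += 2  # skip the 2-octet VLAN tag control information
        (ethertype,) = struct.unpack_from("!H", frame, offset)
        offset += 2
    return ethertype, offset, timestamp


macs = b"\xaa" * 6 + b"\xbb" * 6
payload = b"\x45" + b"\x00" * 45

plain = macs + struct.pack("!H", ETHERTYPE_IPV4) + payload
tagged = macs + struct.pack("!HH", ETHERTYPE_VLAN, 100) + struct.pack("!H", ETHERTYPE_IPV4) + payload
stamped = (
    macs
    + struct.pack("!H", TS_ETHERTYPE)
    + struct.pack("!II", 1_700_000_000, 5)
    + struct.pack("!H", ETHERTYPE_IPV4)
    + payload
)

plain_result = find_l3_offset(plain)
tagged_result = find_l3_offset(tagged)
stamped_result = find_l3_offset(stamped)
```

A parser that only knows about VLAN tags would read the placeholder Ethertype at offset 12 in the stamped frame, fail to recognize it, and never reach the IP header, which now starts at a different offset than in the plain or tagged frames.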
In summary, there are a number of different ways that timestamps can be formatted and inserted into packets, and each has its advantages and disadvantages. A summary of the four main types and their advantages/disadvantages is shown below.