In this section we'll have a look at how to reassemble TCP streams and process their data.
As of version 3.4, libtins provides a set of classes which allow reassembling TCP streams in a very simple, yet powerful, way. Before these classes were introduced, there was a TCPStreamFollower class which sort of did this, but it was neither extensible nor particularly usable.
These new classes aim to provide a really simple way to follow streams, process their data, get their attributes and more, using a simple callback-based interface. The streams handle out-of-order data, reassemble it and let the user process it without having to deal with packets, payloads, sequence numbers, etc.
All of these classes require C++11, as they use std::function as a way to specify
callbacks. Therefore, you need a fairly recent compiler to use them. If you're using
GCC, 4.6 is probably enough, and older versions may work as well.
The main class you should know about is StreamFollower. This class will process
TCP packets, looking at the IP addresses and ports used in them. Whenever a new 4-tuple
(client address, client port, server address, server port) is seen, it will create some
context for that TCP stream and execute a user-provided callback to notify the creation
of it. After that, all packets that belong to that stream will be forwarded to the
right object, letting it process data and update its internal state.
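To make the 4-tuple idea more concrete, here is a minimal, standard-library-only sketch of that kind of bookkeeping. It is not libtins code: FlowKey, FlowState and FlowTable are hypothetical names made up for illustration, and the sketch assumes the caller has already worked out which endpoint is the client and which is the server.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical types for illustration only; this is not the libtins API.
struct FlowKey {
    std::string client_addr;
    std::uint16_t client_port;
    std::string server_addr;
    std::uint16_t server_port;

    // Lexicographic ordering so the key can be used in a std::map.
    bool operator<(const FlowKey& other) const {
        return std::tie(client_addr, client_port, server_addr, server_port) <
               std::tie(other.client_addr, other.client_port,
                        other.server_addr, other.server_port);
    }
};

struct FlowState {
    std::size_t packets_seen = 0;
};

class FlowTable {
public:
    using NewFlowCallback = std::function<void(const FlowKey&)>;

    explicit FlowTable(NewFlowCallback on_new_flow)
    : on_new_flow_(std::move(on_new_flow)) {}

    // Finds the state for this 4-tuple. The first time a tuple is seen,
    // the state is created and the user's callback is executed.
    FlowState& process(const FlowKey& key) {
        auto iter = flows_.find(key);
        if (iter == flows_.end()) {
            auto result = flows_.emplace(key, FlowState{});
            assert(result.second); // the key was absent, so this must insert
            iter = result.first;
            if (on_new_flow_) {
                on_new_flow_(key);
            }
        }
        ++iter->second.packets_seen;
        return iter->second;
    }

    std::size_t size() const { return flows_.size(); }

private:
    std::map<FlowKey, FlowState> flows_;
    NewFlowCallback on_new_flow_;
};
```

Calling process twice with the same key runs the callback only once, while the second call just updates the existing state. A real follower also has to match packets travelling in the opposite direction to the same entry, which this sketch leaves out.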
The other responsibility of the StreamFollower is detecting that something's going
wrong in a stream. Let's say you have high packet loss (e.g. your program fails to
process the packets fast enough). In that case you don't want to keep buffering data for
streams that will never be reassembled, or keep state and data for a stream that was
actually closed but whose FIN/RST packets weren't captured. For this reason, this class
will detect these events (too many buffered packets, stream timeouts, etc.) and delete
the affected stream's state when they happen.
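The kind of cleanup described above could be sketched roughly like this, using only the standard library. Everything here is an assumption made for illustration: TrackedStream, Eviction, evict_stale and the limits passed to it are hypothetical, and libtins' actual thresholds and internals may well differ.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical types for illustration only; this is not the libtins API.
enum class EvictionReason { Timeout, TooMuchBufferedData };

struct TrackedStream {
    std::chrono::seconds last_seen{0}; // timestamp of the last packet seen
    std::size_t buffered_bytes = 0;    // out-of-order data still waiting
};

struct Eviction {
    int stream_id;
    EvictionReason reason;
};

// Erases every stream that buffers more than max_buffered_bytes or that has
// been idle for longer than timeout, and reports what was removed and why.
std::vector<Eviction> evict_stale(std::map<int, TrackedStream>& streams,
                                  std::chrono::seconds now,
                                  std::chrono::seconds timeout,
                                  std::size_t max_buffered_bytes) {
    assert(timeout >= std::chrono::seconds(0));
    std::vector<Eviction> evicted;
    for (auto iter = streams.begin(); iter != streams.end(); ) {
        const TrackedStream& stream = iter->second;
        if (stream.buffered_bytes > max_buffered_bytes) {
            evicted.push_back({iter->first, EvictionReason::TooMuchBufferedData});
            iter = streams.erase(iter);
        } else if (now - stream.last_seen > timeout) {
            evicted.push_back({iter->first, EvictionReason::Timeout});
            iter = streams.erase(iter);
        } else {
            ++iter;
        }
    }
    return evicted;
}
```

The current time is passed in explicitly rather than read from a clock inside the function, so the caller decides which notion of time drives the timeouts.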
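A typical setup creates a StreamFollower, registers a callback for new streams using
StreamFollower::new_stream_callback and, optionally, one for streams that get terminated
because of the events above using StreamFollower::stream_termination_callback, and then
forwards every sniffed packet to StreamFollower::process_packet.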
Note that StreamFollower::process_packet also has an overload that takes a Packet.
You should try to use this overload, as it will make streams time out based on the
actual packet timestamps rather than the system clock.
So that is the first step towards reassembling TCP streams. In the next sections, we'll see how to do something useful with them.
Once you've configured the callback for new streams on the StreamFollower,
you probably want to do something with that new stream. Streams allow you
to configure callbacks for different events that occur on the stream.
Data events are generated whenever there's new data that is ready to be processed. This means that a packet with the next expected sequence number has arrived, so its payload is available, along with any out-of-order payload that was received earlier but couldn't be processed because that packet's data was still missing.
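The logic behind data events can be sketched with a small, standard-library-only class. This is only a simplified model and not how libtins implements it: Reassembler is a hypothetical name, and the sketch assumes sequence numbers never wrap around and segments never partially overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical class for illustration only; this is not the libtins API.
class Reassembler {
public:
    using DataCallback = std::function<void(const std::string&)>;

    Reassembler(std::uint32_t initial_seq, DataCallback on_data)
    : next_seq_(initial_seq), on_data_(std::move(on_data)) {}

    void process(std::uint32_t seq, const std::string& payload) {
        if (payload.empty() || seq < next_seq_) {
            return; // empty or already delivered (e.g. a retransmission)
        }
        if (seq != next_seq_) {
            pending_.emplace(seq, payload); // out of order: keep it for later
            return;
        }
        // The expected segment arrived: deliver it together with every
        // buffered segment that is now contiguous with it.
        std::string ready = payload;
        next_seq_ += static_cast<std::uint32_t>(payload.size());
        auto iter = pending_.find(next_seq_);
        while (iter != pending_.end()) {
            ready += iter->second;
            next_seq_ += static_cast<std::uint32_t>(iter->second.size());
            pending_.erase(iter);
            iter = pending_.find(next_seq_);
        }
        assert(pending_.find(next_seq_) == pending_.end());
        if (on_data_) {
            on_data_(ready);
        }
    }

    std::size_t pending_segments() const { return pending_.size(); }

private:
    std::uint32_t next_seq_;
    DataCallback on_data_;
    std::map<std::uint32_t, std::string> pending_;
};
```

If a segment starting at sequence number 105 arrives while 100 is expected, nothing is delivered. Once the 5-byte segment at 100 shows up, the callback runs a single time with both payloads joined in order.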
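You can optionally subscribe to client and server data events separately. This means that you can be notified on a different callback whenever there's new data from the client or from the server on each stream. You do this by calling Stream::client_data_callback and Stream::server_data_callback, typically from within the new stream callback. Inside those callbacks, the newly available data can be read through Stream::client_payload and Stream::server_payload.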
So that's it: the StreamFollower we constructed before will keep processing
packets and forwarding them to the right Stream objects, which will execute
those callbacks when appropriate.
You can also subscribe to other events on each Stream. One of them is the
close event, which is triggered whenever the stream is properly closed.
You can subscribe to it by calling Stream::stream_closed_callback.
So now that we've seen the basics of how to use streams, let's look at some other features.
By default, whenever new data is available for a stream, that data is moved to the stream's payload, the data callback is executed, and then the data is erased. This is done so that data doesn't keep accumulating, making memory usage go up until the stream is closed (or you run out of memory). In case you want to buffer the data and handle it yourself, you should disable this automatic cleanup by calling Stream::auto_cleanup_payloads(false), or Stream::auto_cleanup_client_data(false) and Stream::auto_cleanup_server_data(false) if you only want to do so for one direction.
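As a rough illustration of the default behaviour described above, here is a tiny standard-library-only model of a payload that is erased after each data callback unless cleanup is switched off. PayloadBuffer and its members are hypothetical names chosen for this sketch and do not come from libtins.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical class for illustration only; this is not the libtins API.
class PayloadBuffer {
public:
    using DataCallback = std::function<void(PayloadBuffer&)>;

    explicit PayloadBuffer(DataCallback on_data)
    : on_data_(std::move(on_data)) {}

    // Enables or disables erasing the payload after each callback.
    void auto_cleanup(bool value) { auto_cleanup_ = value; }

    const std::string& payload() const { return payload_; }
    std::string& payload() { return payload_; }

    // Called whenever reassembled data becomes available.
    void deliver(const std::string& data) {
        assert(!data.empty()); // only called when there is actual data
        payload_ += data;      // move the new data into the payload
        if (on_data_) {
            on_data_(*this);   // let the user look at it
        }
        if (auto_cleanup_) {
            payload_.clear();  // default: do not keep it around
        }
    }

private:
    std::string payload_;
    DataCallback on_data_;
    bool auto_cleanup_ = true;
};
```

With cleanup enabled, each callback only sees the data delivered since the previous one. With cleanup disabled, the payload keeps growing, so the user is responsible for erasing whatever has already been handled.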
If you're only planning on processing, say, the client's data and not the server's,
then you should call Stream::ignore_server_data (or Stream::ignore_client_data for
the opposite case). Otherwise, even if you don't set a callback, the data will still
be buffered and reordered as needed.
This should have given you a fairly good introduction to using the StreamFollower
and Stream classes. You can check out the HTTP requests example to see a
pretty simple example of this being put into practice.