Timmy's Blog

Creating a custom protocol dissector for Wireshark

 

24/02/2010 21:02:32

Wireshark is an open source network protocol analyzer and quite probably the best
of its kind. If you are a developer working with a lot of networking code, it’s a must have!

It can recognize many standard protocols such as POP3, NTP, Jabber, etc…
Most of the development I do is related to communicating with hardware devices
using TCP/IP. These devices implement their own communication protocol unknown to
Wireshark. While I can still use Wireshark to monitor the raw data, I thought it would
be much better if I could actually tell it what my data is so it can be displayed properly.


Protocol dissectors allow you to do just that: they let you define your protocol in
Wireshark. When I first started looking for information on the subject, all I found was
some vague guides on how to write a dissector using C++. C++ being bad enough as it is,
the effort to set up a working build environment was even worse, and given the limited
time I had I decided to put the idea on hold for a while… until I learned you can
also create protocol dissectors using the Lua language.

Out of the box Wireshark does not enable Lua support, so the first step is to enable it.
I do most of my development on a Windows machine so if you’re doing this on Linux or MacOS
there might be some small differences, but the basic idea behind it remains the same.

On Windows you’ll find a file called init.lua in the Wireshark installation directory.
I’m not sure where this file is located on Linux (perhaps in the /etc/ directory?).

Open this file with your favorite text editor. I’m using LuaEdit, but any editor will do.
Somewhere at the top of the file you’ll find the following lines:

-- If set and we are running with special privileges this setting
-- tells whether scripts other than this one are to be run.
run_user_scripts_when_superuser = false

Since you’ll most likely be running Wireshark with administrator privileges on Windows,
you have to set this value to true, otherwise Lua support will be disabled.

By default only the init.lua file is parsed, so you have to manually include the files you want to load. We’ll be making a single Lua file to represent our protocol; let’s call it protosample.lua.

At the very bottom of the init.lua file, you’ll find the following:

dofile("console.lua")

You need to update this to include our own Lua script:

dofile("console.lua")
dofile("protosample.lua")

With the basic configuration set up, we’re good to go.

In order to keep things simple, we’ll work with a fictional protocol with only a few fields.
Usually you’ll send a series of bytes across the network which represent what we like to call protocol messages. In order to identify messages in the stream of bytes we need a start and stop byte. In between we have the data that makes up the various fields of the message.
(The start and stop bytes are not really required when using TCP, but we also support RS232.)

Consider the following message structure:

|START|SRC|DST|CMD|LENMSB|LENLSB|DATA 0…N|CRCMSB|CRCLSB|END|

Each section in the line above (except for data) represents a single byte.
So every message will have a length of 9 + N bytes. (Keep in mind it’s purely fictional).

As you can see the LEN and CRC fields are represented by 2 bytes each, this is important
when we define our fields in the protocol dissector.
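
To make the layout a bit more concrete, here is a rough plain-Lua sketch (no Wireshark required) that walks through a message the same way a dissector would, one offset at a time. The start and end byte values (0x02 and 0x03), the sample bytes and the parse_message helper are my own assumptions for illustration; the fictional protocol does not define them.

```lua
-- Hypothetical framing bytes; the protocol description does not specify them.
local START_BYTE, END_BYTE = 0x02, 0x03

-- Split one raw message (a Lua string of bytes) into its fields.
local function parse_message(raw)
	local offset = 1 -- Lua strings are indexed from 1
	local msg = {}

	msg.start = raw:byte(offset)
	offset = offset + 1
	msg.source = raw:byte(offset)
	offset = offset + 1
	msg.destination = raw:byte(offset)
	offset = offset + 1
	msg.command = raw:byte(offset)
	offset = offset + 1

	-- LEN is two bytes, most significant byte first
	local msb, lsb = raw:byte(offset, offset + 1)
	msg.length = msb * 256 + lsb
	offset = offset + 2

	-- DATA is as long as the LEN field says
	msg.data = raw:sub(offset, offset + msg.length - 1)
	offset = offset + msg.length

	-- CRC is two bytes as well (not verified here)
	msb, lsb = raw:byte(offset, offset + 1)
	msg.crc = msb * 256 + lsb
	offset = offset + 2

	-- "end" is a reserved word in Lua, hence the name "stop"
	msg.stop = raw:byte(offset)
	msg.size = offset -- total number of bytes consumed
	return msg
end

-- SRC=1, DST=2, CMD=0x10, two data bytes and a dummy CRC of 0xBEEF
local sample = string.char(START_BYTE, 0x01, 0x02, 0x10, 0x00, 0x02,
	0xAA, 0xBB, 0xBE, 0xEF, END_BYTE)
local msg = parse_message(sample)
print(msg.command, msg.length, msg.size)
```

The dissector further down follows the same pattern, except that Wireshark buffers are indexed from 0 and you add each field to the tree instead of storing it in a table.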

Create a new file (protosample.lua) in the Wireshark directory.
You start off by creating an instance of the Proto class:

sampleProtocol = Proto("sampleprotocol","Sample Protocol")

The first parameter of the initializer is the name of the protocol, the second a description.
Next you have to define what the fields are in a protocol message (basically the payload of the TCP packet).

sampleProtocol = Proto("sampleprotocol","Sample Protocol")

local fields = sampleProtocol.fields

fields.start = ProtoField.uint8("sampleProtocol.Start","Start Byte")
fields.source = ProtoField.uint8("sampleProtocol.Source","Source")
fields.destination = ProtoField.uint8("sampleProtocol.Destination","Destination")
fields.command = ProtoField.uint8("sampleProtocol.Command","Command")
fields.length = ProtoField.uint16("sampleProtocol.Length","Data Length")
fields.data = ProtoField.bytes("sampleProtocol.Data","Data")
fields.crc = ProtoField.uint16("sampleProtocol.Crc","CRC")

Every field you define needs to have a type. Single byte fields can be represented by the uint8 type, two byte fields (such as Length and Crc) can be represented by the uint16 type and finally the data bytes can be represented by the bytes type. You can find a full list of supported types in the init.lua file (Search for “Field Types”).

The hard work is done in the dissector function. It takes 3 parameters: buffer, pinfo and tree.
The buffer parameter contains the data from the captured packet. The pinfo parameter can be used to provide more information about the protocol (I have not played with this much yet).
Finally the tree parameter gives you access to the visual treeview in Wireshark.

function sampleProtocol.dissector(buffer, pinfo, tree)

	-- Set the text in the protocol column
	pinfo.cols.protocol = sampleProtocol.name

	-- Create a new subtree in the treeview and get a reference
	local subtree = tree:add(sampleProtocol,buffer())

	-- Create a variable to track the current offset in the buffer
	local offset = 0

	-- Next add all the fields to the subtree
	-- (Lua has no += operator, so the offset is advanced explicitly)

	subtree:add(fields.start,buffer(offset,1))
	offset = offset + 1

	subtree:add(fields.source,buffer(offset,1))
	offset = offset + 1

	subtree:add(fields.destination,buffer(offset,1))
	offset = offset + 1

	subtree:add(fields.command,buffer(offset,1))
	offset = offset + 1

	-- Read the two length bytes (MSB first) so we know how much data follows
	local dataLength = buffer(offset,2):uint()

	subtree:add(fields.length,buffer(offset,2))
	offset = offset + 2

	subtree:add(fields.data,buffer(offset,dataLength))
	offset = offset + dataLength

	subtree:add(fields.crc,buffer(offset,2))

	-- The END byte is not added to the tree in this basic sample

end

I kept the sample very basic so it would be clear what’s happening here.
Of course in reality you’d probably want to add additional logic to check the validity of the fields, display to the user which command the message is calling, etc…
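
As a rough idea of what such a validity check could look like, here is a small plain-Lua sketch that only looks at the framing and the length field. The start and end byte values and the check_message helper are assumptions for illustration, and it assumes a two-byte CRC directly followed by the end byte. The CRC itself is not verified, since the algorithm is not specified here.

```lua
-- Hypothetical framing bytes; adjust to whatever your protocol uses.
local START_BYTE, END_BYTE = 0x02, 0x03

local HEADER_SIZE = 6  -- START, SRC, DST, CMD, LEN MSB, LEN LSB
local TRAILER_SIZE = 3 -- CRC MSB, CRC LSB, END (assumed layout)

-- Returns true when the framing looks sane, or false plus a reason.
local function check_message(raw)
	if #raw < HEADER_SIZE + TRAILER_SIZE then
		return false, "message too short"
	end
	if raw:byte(1) ~= START_BYTE then
		return false, "bad start byte"
	end

	-- The length field sits in bytes 5 and 6 (1-based), MSB first
	local msb, lsb = raw:byte(5, 6)
	local length = msb * 256 + lsb
	if #raw ~= HEADER_SIZE + length + TRAILER_SIZE then
		return false, "length field does not match message size"
	end

	if raw:byte(#raw) ~= END_BYTE then
		return false, "bad end byte"
	end
	return true
end

local good = string.char(0x02, 0x01, 0x02, 0x10, 0x00, 0x02,
	0xAA, 0xBB, 0xBE, 0xEF, 0x03)
print(check_message(good))            -- valid message
print(check_message(good:sub(1, 10))) -- last byte cut off
```

In a real dissector you would run checks like these on the buffer before adding the fields, and show the reason in the tree when something is off.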

There are two more steps involved in order to get this dissector working.

First you will need to add an init function (so you can set the packet_counter value):

local packet_counter

function sampleProtocol.init()
	
	packet_counter = 0
	
end

Second you need to register the dissector in the Wireshark TCP table (at the end of the file):

tcp_table = DissectorTable.get("tcp.port")
tcp_table:add(33000,sampleProtocol)

Basically it means that if data is sent or received using/on TCP port 33000,
decode it using the sampleProtocol dissector.

Once you’ve saved the file in the Wireshark directory, restart the application and start
transmitting some protocol data on the port you specified.
You can even filter on “sampleprotocol” to hide all other packets that are captured.

Since this is not an example of a real protocol, I’ve taken a screenshot of a dissector for
one of our older communication protocols (which was also made with RS232 in mind):

As you can see, the packet gets decoded and all the fields are displayed properly.
The root item (marked yellow) not only displays the name of the protocol, but it also displays
the type of protocol message this packet represents.
This is done by adding text to the ‘subtree’ directly:

subtree:append_text(", " .. messageTypeName)

In this case the messageTypeName is a string which was created after evaluating the
Message Type field. The ‘..’ operator simply concatenates two strings in Lua.
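
For illustration, this is roughly how such a name could be looked up and glued together in plain Lua. The message type codes, their names and the root_suffix helper are made up for this sketch; the real protocol uses its own values.

```lua
-- Hypothetical message types, purely for illustration
local MESSAGE_TYPES = {
	[0x01] = "Status Request",
	[0x02] = "Status Response",
}

-- Build the text that gets appended to the root item of the subtree
local function root_suffix(messageType)
	local messageTypeName = MESSAGE_TYPES[messageType]
		or ("Unknown type " .. messageType)
	return ", " .. messageTypeName
end

print("Sample Protocol" .. root_suffix(0x02))
```

Inside the dissector the result of a lookup like this is what you would pass to subtree:append_text.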

You can create much more powerful dissectors with Lua, but I’ve kept this example as simple
as possible. It should be enough to get you started, the rest is up to you.


Comments:

 

On 24/02/2010 22:13:05, Philip Paeps wrote:

So are lua dissectors actually "fast enough" these days to be useful? I tried writing a dissector in lua a couple of years ago when I needed one quickly and soon found it really didn't perform.

For what it's worth, I've written a couple of dissectors in C and it's really not that hard to do. The only silly bit is the fact that you need an entire wireshark source tree around to build your dissector, but that's really not too much of an obstacle. It's just "annoying".

The advantage of writing in C is also that it happens to be the language I implement my protocols in (and the only language which really exists, as far as I'm concerned) so the data structures and most of the bit-fiddling can just be copied and pasted or put in a shared object file and linked in.
 

On 24/02/2010 22:45:45, TimothyP wrote:

Well... so far I have not yet had any problems with performance,
but I must say I am running on an 8-core machine with 12GB RAM and I'm not sending too many packets...
I'll have to run some tests...

Don't get me wrong, I'm not telling you that you should abandon C or anything...
but "the only language that really exists"... that's a bit narrow minded don't you think?

Even if you were the best C programmer in the world, if you came to a job interview with
this remark, I would instantly have to write you off.

Even if you don't use other languages in your projects,
every developer should at least try to learn a new language every year.
Not in order to use it for production, but just to look at things from a different perspective.


I'm not saying Java developers should learn C# and the other way around,
I'm talking about learning a totally different language.

As a C# developer I started using F#, IronPython and a few other languages...
Playing around with these other languages has improved my C# skills in a way...
It allowed me to think about problems differently....

As far as I'm concerned no language is the best language...
It's all about using the right tool for the job.
But I'm very interested to hear from you why you consider C to be the only language.
 

On 25/02/2010 0:52:38, Philip Paeps wrote:

As far as the protocols I implement are concerned, C really is the only language. I don't write protocols on top of TCP, I write protocols like TCP. Data plane, not config plane. Performance matters, and so does the price of the hardware. I need to know what it's doing and I need full control. Multi-GHz clocks and ridiculous amounts of memory are not an option.

No fear - I have no plans for interviewing for jobs. :-)

I did not say other languages are inherently wrong or bad. They just don't exist in the protocol world from my perspective. It also feels really weird to add code in a different language to an existing body of code. This is like writing C++ in the Linux kernel. Of course you can do it (as long as you disable silly things like exceptions and you're willing to deal with background magic like constructors) but why would you want to? You can even write Java in the kernel. In fact, you can do whatever you like -- code is code. If you can convince the processor to run it, go wild!

In fact -- if your protocol is implemented in C#, what stopped you from just writing your dissector in C# too? Why did you choose Lua? The only reason I'd choose Lua over C in this particular case is because it's much faster to write. One of these days, I may actually bother to learn the language, but so far, just writing code in it seems to be fine for its interpreter. No thinking required. It just doesn't run fast enough. Or it didn't when I tried it about four or five years ago when I needed a dissector quickly.

For what it's worth: there is documented open source evidence that I don't only write C. I've written loads of tcl, perl and more recently lua. I just don't implement protocols in them. Or anything else that needs to run in a tightly controlled environment.
 

On 25/02/2010 1:36:31, Philip Paeps wrote:

Note that despite the 65535th iteration of the "use the right language for the job" discussion, I am still very interested in performance numbers for wireshark dissectors written in lua. Since lua has always been fairly snappy, I think the key thing is its integration with wireshark, and a couple of versions have happened since I last played with it.

Then again, given the kind of hardware one is likely to run wireshark on these days, I wonder if it really matters much. When I played with it, wireshark would take ages (read: many many minutes) to dissect a reasonable dataset (tens to low-hundreds of megabytes) with lua plugins.

I don't remember if it was doing something silly like forking a lua for every packet. I just remember thinking "no, this is way too slow" and spending a couple of hours getting a buildable wireshark and plugging my code underneath. The same dataset took measurably less time (though I didn't measure it) with the new plugin.
 

On 25/02/2010 5:58:09, TimothyP wrote:

I'm simply trying to pick at your brain here to see if I can learn something :-)

Perhaps we are not on the same wavelength. The people who develop the hardware use C
of course. Their reasons however are different from yours. While you have the ability to compare
C to other languages in terms of performance, the people we have to work with only know C.
They learn it once and use the same knowledge for the rest of their lives.

What I was really talking about is the development that we do internally or that
our partners do. We (and our partners) create desktop applications to interact with the hardware.
These applications are usually targeting .NET or Java. Are you suggesting that all of us should
implement a C library to deal with protocol data only to write a wrapper around it in order to use
it from .NET or Java? The .NET Framework and Java are both quite capable of dealing with TCP
communication, should we abandon that?

The reason I didn't implement a dissector in C# is because I don't have the time to
take the source of Wireshark, make some modifications and recompile it using managed C++,
which is probably what it would take to have Wireshark load C# dissectors (correct me if I'm wrong).

There are a few reasons I used Lua.
First of all, having been a C# developer for many years, I would make a terrible C developer,
the two languages are too similar and yet very different. Second, a lot of the people I work
with are engineers. While they are quite capable of editing a file and creating a new one,
I'm not sure I'd be able to convince them to get the source of Wireshark, install a C/C++ compiler, etc...
Finally I used Lua because it's available in Wireshark, without the need to recompile the whole thing.

The protocols we work with are intended to be used on slow, unreliable connections.
(Not just TCP/IP) so they are always very small. Therefore we never run into situations like the one
you mentioned where we have to dissect a dataset of 100MB. Perhaps if that were the case Lua
would indeed fail us, I can't disagree with you there.

While .NET and Java might not provide real-time execution, I'm sure you can agree that for
the line of business apps we write the performance is more than adequate?

I have been playing with the .NET Micro Framework on hardware and yes it's a lot slower
than native C implementations but in some cases it really works well and it does save a lot
of development efforts.

So C on the hardware, yes by all means, on the software side of things... I don't see the point.
(But of course I'm only talking about our line of business)

I hope you don't take any offense, I love hearing opinions from other people,
It's how we learn :-)
 

On 19/05/2010 13:43:37, blueowl wrote:

Thanks for this tutorial. It's a nice introduction to dissectors in Lua.
I've tried the example and it works. Nevertheless, I had to fix a few syntactical errors first:
1)
local fields = ledProtocol.fields
should be
local fields = sampleProtocol.fields

2)
missing right parenthesis in
subtree:add(fields.data,buffer(offset,dataLength)
and
subtree:add(fields.crc,buffer(offset,2)

3)
wireshark (1.2.6) with Lua 5.1 complained about '+=' operator,
so I need to change it to offset = offset + 1

Other than that it looks good.

 

On 1/06/2010 10:19:45, Timothy wrote:

@blueowl
Indeed I made a few typos there :p
Was trying to obscure some of our implementation details hehe :)

Thnx for the feedback :)

 