Make application logging great again

Making your enterprise application log is easy. Making your enterprise application log in a good way is a really complex task.

In this article I will discuss some of the topics of application logging, and how to make this an important part of your application.

Why should we log?

Logging fills many roles in an application's life cycle: from the development stage, where log output is an important tool for making your application do what you expect, all the way to operations, where you use log output to detect errors or to take actions that improve performance and efficiency.

Logging can also be an important part of monitoring your application, but I will leave that out of the rest of this article. My general opinion is that there is a distinct difference between logging (past) and monitoring (present), and that there are better ways of doing monitoring than through logging. Maybe monitoring will be a topic for another article some day.

What should we log?

This is really difficult, and it will always be a question for each application you work on, but there are some key elements that will always be expected at some point in an application's life cycle. I will focus on these general topics for logging:

  • Error Logging
  • Communication Logging
  • Trace Logging
  • Performance Logging

Maybe you miss business-related functional logging in this list? It is left out for now, since business-related functional logging is generally done differently for each application and for each company the applications are created for.

Error Logging

Let’s start with errors. All errors (system errors and application errors) should be logged. Always! When an error message is given to a user (a human or a system), there should always be a log entry showing what actually happened. Of course this is not always easy to achieve, but it should always be your goal. In some applications it is enough to log the error once, when it first occurs, but in others it is useful to log it on more than one layer of the application. Your approach must be clearly defined in your logging strategy. Error logging should always be active, so that you are always able to track errors.

In some cases, errors that turn into a functional error message telling the user to do something else are not considered errors and are treated only as information. As a general rule, I recommend that these situations (functional errors) are also covered by your logging, but at another log level so that you can turn it on and off. This makes it possible to take corrective action when functional errors happen often, since that indicates that your application is easily misused or misunderstood.

Remember to log the cause of your error, along with enough information to figure out what actually caused it. This means that both the error itself and some metadata should be logged.
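As an illustration, here is a minimal sketch using Python's standard logging module. The logger name, the order_id field and the helper functions are hypothetical and only serve the example; the point is that the log entry carries both the cause (the traceback) and a piece of metadata identifying what was being processed.

```python
import io
import logging


def build_error_logger(stream):
    """Create a logger writing to `stream` (hypothetical helper for this sketch)."""
    logger = logging.getLogger("orders")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    # order_id is metadata that every call must supply through `extra`.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s order_id=%(order_id)s %(message)s"))
    logger.addHandler(handler)
    return logger


def load_order(logger, order_id, storage):
    """Look up an order; on failure, log the error together with its cause."""
    try:
        return storage[order_id]
    except KeyError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Could not load order", extra={"order_id": order_id})
        return None
```

Calling load_order(logger, 42, {}) returns None and writes one ERROR entry that contains order_id=42 followed by the KeyError traceback.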

Communication Logging

By communication logging, I mean logging all communication between your application and other applications, both where your application consumes other applications' interfaces and where your application exposes interfaces to the rest of the world.

This kind of logging is important for all protocols and technologies used, and the way of doing it depends heavily on them, but the goal is the same: collect your requests and responses in logs, to be able to track down why some communication fails, or to prove that the communication actually happened. At some point in your application's life cycle this will be needed, so you had better prepare for it. It comes with quite a warning, though, because two aspects affect it: security and storage.

Security, because applications may exchange information that should not be written to log files, databases or other storage, and that is only meant to be passed through. This kind of information needs to be taken care of (by scrambling or removing it) before it is logged.
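One possible way to scramble such information is a filter that rewrites the message before it reaches the log. The sketch below uses Python's standard logging module; the field names password and card_number, and the key=value message layout, are assumptions made for the example and not something every application will have.

```python
import io
import logging
import re

# Hypothetical field names; which fields are sensitive depends on your application.
SENSITIVE = re.compile(r"(password|card_number)=(\S+)")


class MaskingFilter(logging.Filter):
    """Replace the values of sensitive key=value pairs before the entry is written."""

    def filter(self, record):
        message = record.getMessage()
        record.msg = SENSITIVE.sub(r"\1=***", message)
        record.args = ()  # the message is already fully formatted
        return True


def build_comm_logger(stream):
    """Create a communication logger whose handler masks sensitive values."""
    logger = logging.getLogger("communication")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(MaskingFilter())
    logger.addHandler(handler)
    return logger
```

With this in place, logger.info("request user=%s password=%s", "anna", "s3cret") is written as "request user=anna password=***".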

Storage, because some interfaces produce massive amounts of information that is rarely or never used and still takes up a lot of storage space. In these cases I recommend that you still implement the logging as part of your solution, but keep it turned off during regular use to avoid the massive storage needs. You can then turn it on when you suspect problems, or in special situations, such as directly after deploying a new version, to see that everything works as intended.

Trace Logging

Trace information is the third topic. This is logging that in most cases is turned off, partly or entirely, during regular operation of the application. It is typically placed on important functionality in your application, such as transaction boundaries, batch functionality and integration points, but also on other interesting points in the application. It may be the most important debugging tool for your application management team when problems occur in your production environment. To do this well, it is important to choose your log levels with great care, and to use the whole scale, so that you can turn on exactly the level of logging you need to solve your problem.
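A small sketch of the idea, again with Python's standard logging module. The batch function and the logger name are hypothetical. The batch boundaries are logged at INFO and the per-item trace at DEBUG, so the detailed trace stays silent until the level is lowered.

```python
import io
import logging


def build_trace_logger(stream, level):
    """Create a logger writing to `stream` at the given level (hypothetical helper)."""
    logger = logging.getLogger("batch.import")
    logger.setLevel(level)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def run_batch(logger, items):
    """Log the batch boundaries at INFO and each item at DEBUG."""
    logger.info("batch started size=%d", len(items))
    for item in items:
        logger.debug("processing item=%s", item)
    logger.info("batch finished")
```

Running the batch at INFO level produces only the two boundary entries. After logger.setLevel(logging.DEBUG), the same call also produces one entry per item.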

Performance Logging

Performance logging is the last of the log types that I will discuss in this article. This is logging of information used to track the performance of different parts of your application. In some cases it is handled by separate logs, but it can just as well be combined with the communication logging and/or trace logging described above, by making it part of the metadata of the log statements.
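One simple way to collect such timings is to wrap the interesting part of the code and log the elapsed time when it finishes. The following is a minimal sketch with hypothetical names (timed, build_perf_logger, the operation label); a real application might prefer a dedicated metrics library.

```python
import io
import logging
import time
from contextlib import contextmanager


def build_perf_logger(stream):
    """Create a logger writing to `stream` (hypothetical helper for this sketch)."""
    logger = logging.getLogger("performance")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    return logger


@contextmanager
def timed(logger, operation):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("operation=%s elapsed_ms=%.1f", operation, elapsed_ms)
```

It is used as "with timed(logger, "lookup"): ..." around the code to measure, and writes one entry with the operation name and the elapsed time in milliseconds.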

Metadata

To make full use of your logging, metadata is a major component. What your metadata consists of for each log statement depends on your application's behaviour, but typically you will need information about:

  • The server it occurs on (most important for systems with more than one node, but also if you centralise logging for several environments)
  • Information about the user (logged in or system user)
  • Date/time for when the entry occurred
  • Where it is logged from in the application
  • The log level the entry is logged with
  • Possibly some trace information to connect different log entries together.

Most logging frameworks have bundled functionality for handling metadata. Your job as a developer is to make sure the metadata are available when the logging occurs, and the frameworks will take care of the rest.
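To make the list above concrete, here is a sketch of how it could map onto Python's standard logging module. The timestamp, level, logger name and code location come from the framework, while the host and user fields are supplied by the application. The field names host and user and the helper names are assumptions for this example.

```python
import io
import logging
import socket


class HostFilter(logging.Filter):
    """Attach the name of the server the entry is logged on."""

    def __init__(self):
        super().__init__()
        self.hostname = socket.gethostname()

    def filter(self, record):
        record.host = self.hostname
        return True


# asctime, levelname, name, module and lineno are filled in by the framework.
FORMAT = ("%(asctime)s host=%(host)s level=%(levelname)s logger=%(name)s "
          "where=%(module)s:%(lineno)d user=%(user)s %(message)s")


def build_metadata_logger(stream, user):
    """Return a logger adapter that adds the given user to every entry."""
    base = logging.getLogger("app.metadata")
    base.setLevel(logging.INFO)
    base.handlers.clear()
    base.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(HostFilter())
    handler.setFormatter(logging.Formatter(FORMAT))
    base.addHandler(handler)
    return logging.LoggerAdapter(base, {"user": user})
```

The calling code only writes logger.warning("disk almost full"); the metadata is added around the message without the caller having to repeat it.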

Assembling related data

The last bullet point above is about assembling data from different logs. By creating a common id for each request to the application and adding this id to all log entries, it becomes possible to trace related log entries from different logs and with different content, and to see the whole picture of what happened for a given request. This is very valuable when you have difficult error situations to investigate, and also when you analyse how the application behaves.
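A minimal sketch of such a common id in Python, assuming the id is generated when the request enters the application. The names request_id, RequestIdFilter and handle_request are hypothetical. A context variable holds the id for the duration of the request, and a filter copies it onto every log entry.

```python
import contextvars
import io
import logging
import uuid

# Holds the id of the request currently being handled; "-" outside any request.
request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request id onto every log record."""

    def filter(self, record):
        record.request_id = request_id.get()
        return True


def build_request_logger(stream):
    """Create a logger whose entries all start with the request id."""
    logger = logging.getLogger("app.requests")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(RequestIdFilter())
    handler.setFormatter(logging.Formatter("request_id=%(request_id)s %(message)s"))
    logger.addHandler(handler)
    return logger


def handle_request(logger, payload):
    """Handle one request; all entries logged inside share the same id."""
    token = request_id.set(uuid.uuid4().hex)
    try:
        logger.info("request received payload=%s", payload)
        logger.info("request handled")
        return request_id.get()
    finally:
        request_id.reset(token)
```

Every entry written while a request is being handled carries the same request_id value, so entries from different parts of the application can be joined on it afterwards.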

Configuration

Many logging tools come with a default configuration bundled with the application they are used in. This is nice during development, but it turns into a mess when you move to test or production environments. My recommendation is to externalise the configuration right from the start, and to make it possible to alter the log configuration at runtime without a rebuild or redeployment. As soon as the log configuration is separated from the application, you will have the operations and application management teams on your side, and you will have enabled a more efficient environment.
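As a sketch of what externalised configuration can look like, the example below keeps the configuration in a JSON file and applies it with logging.config.dictConfig from Python's standard library. The file content, the logger name app.payments and the helper names are hypothetical, and how a reload is triggered (a signal, an admin endpoint, a file watcher) is left open.

```python
import json
import logging
import logging.config
import os
import tempfile


def apply_logging_config(path):
    """Read a JSON file and apply it as the logging configuration."""
    with open(path, encoding="utf-8") as fh:
        logging.config.dictConfig(json.load(fh))


def write_config(path, level):
    """Write an example configuration file (stands in for an operator editing it)."""
    config = {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {"plain": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"}},
        "handlers": {"console": {"class": "logging.StreamHandler", "formatter": "plain"}},
        "loggers": {"app.payments": {"level": level, "handlers": ["console"],
                                     "propagate": False}},
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh)


# Demo: the file changes and is re-applied without touching the application code.
config_path = os.path.join(tempfile.mkdtemp(), "logging.json")
write_config(config_path, "WARNING")
apply_logging_config(config_path)
write_config(config_path, "DEBUG")
apply_logging_config(config_path)
logging.getLogger("app.payments").debug("visible after the reload")
```

The application code only calls apply_logging_config; levels, handlers and formats live in the file and can be changed by whoever operates the application.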

Now that your configuration is separated from the application, you need to look at how things are logged: the log format, the metadata logged with each statement and the order of the information. There are a lot of tools on the market for analysing log data, and they add many ways of using your log data to improve your application, but they also require your log statements to follow specific formats to be readable by the tool. Most of them depend on key/value sets of metadata, but how keys and values are used depends on the specific tool. Since the configuration is now separate from the application, it is easy to adapt to different tools like this, and to other uses of the log information.
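As one example of a key/value format, the sketch below writes each log entry as a JSON object on its own line. JSON is only one possible choice, and the key names used here are assumptions; the tool you use decides which keys and which format it expects.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_json_logger(stream):
    """Create a logger writing JSON lines to `stream` (hypothetical helper)."""
    logger = logging.getLogger("app.json")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

Because only the formatter changes, the log statements in the application stay the same when you switch to a tool that expects a different format.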

Log configuration requires love and care, and should be tuned on a regular basis to get the information that you need out of your logging.

The more specific your configuration is, the more you can tune the output.

How much logging should I do?

In my opinion: a lot! Add a lot of logging points to your application, more than you think you need, but be very careful with the log level you choose for each one. Then use a well-defined configuration to tune the output: which entries are written, in what format, and where they finally end up (log files, databases, mails etc).