The Elastic Stack today consists of four components: Elasticsearch, Logstash, Kibana, and Beats. The last is a family of log shippers for different use cases, of which Filebeat is the most popular. If you want to get started with Filebeat, read this short article to learn the basics of installing, configuring, and running it so you can get the full potential out of your data!

What is Filebeat and where is it used?

Generally, the Beats family are open-source, lightweight data shippers that you install as agents on your servers to send operational data to Elasticsearch. Beats can send data directly to Elasticsearch or via Logstash, where you can further process and enrich the data. The family consists of Filebeat, Metricbeat, Packetbeat, Winlogbeat, Auditbeat, Journalbeat, Heartbeat, and Functionbeat. Each Beat is dedicated to shipping a different type of information: Winlogbeat, for example, ships Windows event logs, Metricbeat ships host metrics, and so forth. Filebeat is designed to ship log files.

Filebeat keeps things simple by offering a lightweight way (with a low memory footprint) to forward and centralize logs and files, making SSH unnecessary when you have many servers, virtual machines, and containers that generate logs. Other benefits of Filebeat are its ability to handle large volumes of data, its support for encryption, and its efficient handling of backpressure. Essentially, Filebeat is a logging agent installed on the machine generating the log files; it tails them and forwards the data either to Logstash for more advanced processing or directly to Elasticsearch for indexing.

At this point, we want to emphasize that Filebeat is not a replacement for Logstash; rather, the two should be used together to take advantage of a particularly useful feature. Filebeat uses a backpressure-sensitive protocol when sending data to Logstash or Elasticsearch to account for higher volumes of data. If Logstash is busy processing data, it lets Filebeat know to slow down its reads. Once the congestion is resolved, Filebeat builds back up to its original pace and keeps on shipping.

Yes, but what is a Beats module?

Usefully, Filebeat comes with internal modules for Apache, Nginx, MySQL, and more that simplify the collection, parsing, and visualization of common log formats down to a single command. These modules bundle built-in configurations and Kibana objects for specific platforms and systems, and they are easy to use because they ship with pre-configured settings that can later be adjusted to the organization's needs. Additionally, a few Filebeat modules ship with pre-configured machine learning jobs. The configuration files for the individual modules can be found in the /etc/filebeat/modules.d folder (on Linux). Modules are disabled by default and need to be enabled. There are various ways of enabling modules, one of which is from your Filebeat configuration file:
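For example, a minimal sketch of enabling the Apache module directly in filebeat.yml (the fileset names shown follow the standard module layout):

```yaml
# filebeat.yml — enable the Apache module with its access and error filesets
filebeat.modules:
  - module: apache
    access:
      enabled: true
    error:
      enabled: true
```

Alternatively, running `filebeat modules enable apache` activates the module's file under modules.d without editing filebeat.yml.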

The downside of Filebeat modules is that they are a bit more involved to use, since they require Elasticsearch Ingest Node, and some specific modules have additional dependencies. Also, keep in mind that if you ship to a managed platform that parses these common formats for you, you may not need to enable and configure the Filebeat modules at all (very important!). Today, Filebeat supports 36 different modules and Metricbeat 48.

How to install Filebeat

Filebeat can be downloaded and installed using various methods and on a variety of platforms. The only requirement is a running ELK Stack that can receive the data Filebeat collects.

Install Filebeat using Apt

The most convenient way to install Filebeat is with Apt or Yum from Elastic's repositories. First, you need to add Elastic's signing key so that the downloaded package can be verified (skip this step if you've already installed packages from Elastic):
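On Debian/Ubuntu, the key can be imported like this (the URL is Elastic's published GPG key):

```shell
# Download and add Elastic's GPG signing key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
```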

The next step is to add the repository definition to your system:
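One way to do this, assuming the 7.x package line (adjust the version to the one you need):

```shell
# Add the Elastic 7.x Apt repository to a dedicated sources file
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list
```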

Lastly, update your repositories and install Filebeat:
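For example:

```shell
# Refresh the package index and install Filebeat
sudo apt-get update && sudo apt-get install filebeat
```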

How to Configure Filebeat

Filebeat is relatively easy to configure, and all Beats follow the same configuration layout. Filebeat is configured using a YAML configuration file; on Linux, it is located at /etc/filebeat/filebeat.yml.

YAML is syntax-sensitive: you cannot use tabs for indentation, only spaces. Filebeat has many configuration options, but in most cases you will only need the very basics. For your convenience, you can refer to the example filebeat.reference.yml file, located in the same directory as filebeat.yml, which lists all the available options. The main configuration units are inputs, processors, and output. Let's take a closer look at each.
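As a rough sketch, the three units sit side by side in filebeat.yml (the path and host below are placeholders):

```yaml
filebeat.inputs:             # where to read data from
  - type: log
    paths:
      - /var/log/*.log       # placeholder path
processors:                  # optional per-event transformations
  - add_host_metadata: ~
output.elasticsearch:        # where to send the data
  hosts: ["localhost:9200"]  # placeholder host
```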

1) Filebeat inputs

Filebeat inputs are responsible for locating specific files and applying basic processing to them. Here you configure the path (or paths) to the files you want to track. You can also use additional configuration options such as the input type, the encoding to use for reading the file, excluding and including specific lines, adding custom fields, and more.
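A minimal sketch of an inputs section tracking Apache and Nginx logs (the paths and the custom field are illustrative):

```yaml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/apache2/*.log   # Apache logs
    fields:
      log_type: apache           # custom field to distinguish sources
  - type: log
    paths:
      - /var/log/nginx/*.log     # Nginx logs
    fields:
      log_type: nginx
```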

Paths can be extended with additional inputs (each defined by a leading -) for crawling and fetching: one input for Apache logs under the apache2 directory, the next for Nginx, then MySQL, and so on.

2) Filebeat processors

Filebeat can process and enrich the data before forwarding it to Logstash or Elasticsearch. Its processing capabilities are not as extensive as Logstash's, but they are useful: you can decode JSON strings, drop specific fields, add various metadata (e.g. Docker, Kubernetes), and more. Processors are defined in the Filebeat configuration file per input, and you can apply them conditionally using conditional statements. Below is an example using the drop_fields processor for dropping some fields from Apache access logs:
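A sketch of such a processor, assuming Apache access logs (the field names are illustrative):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/apache2/access.log
    processors:
      - drop_fields:
          # illustrative field names — drop fields you don't need indexed
          fields: ["apache2.access.referrer", "apache2.access.agent"]
          ignore_missing: true
```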

3) Filebeat output

Here you define where the data will be shipped. Most often you will use the Logstash or Elasticsearch output types, but other options such as Redis and Kafka exist as well. Pointing at a Logstash instance gives you advanced processing and data enrichment; otherwise, if your data is well structured, you can ship directly to the Elasticsearch cluster. You can also define multiple outputs and use a load-balancing option to balance the forwarding of data. For forwarding logs to Elasticsearch:
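A minimal sketch (the host is a placeholder for your cluster's address):

```yaml
output.elasticsearch:
  hosts: ["localhost:9200"]  # placeholder Elasticsearch host
```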

For forwarding logs to Logstash:
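A minimal sketch of the Logstash output (hosts are placeholders; the commented lines show the two-instance, load-balanced variant):

```yaml
output.logstash:
  hosts: ["localhost:5044"]  # placeholder Logstash host
  # Two instances with load balancing:
  # hosts: ["logstash1:5044", "logstash2:5044"]
  # loadbalance: true
```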

For forwarding logs to two Logstash instances, list both hosts and enable the loadbalance option.

Log Shippers Wizard

To make things even easier for you, the platform provides a predefined Filebeat configuration for every technology it integrates with. It can be accessed via the Log Shippers tab in your account. Choose the log shipper you are interested in and the instructions will walk you through the correct settings.

When all steps are complete, you’re ready to explore your data!


Running Filebeat

The next step after installing Filebeat is to start it. On Ubuntu/Debian you can start the Filebeat service with:
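For example, on a systemd-based Ubuntu/Debian system:

```shell
# Start the Filebeat service
sudo systemctl start filebeat
# or, on systems without systemd:
sudo service filebeat start
```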

Are there any drawbacks when using Filebeat?

Like any software, we have to be aware of potential issues when using it. In a nutshell, you should be aware of the following:

1) The complexity of multiple pipelines. While Filebeat allows you to define multiple file paths, you are required to add specific settings for each log file. The most basic example is adding a log type field to each file so you can easily distinguish between the log messages. This requires configuring an input for each log type, adding additional points of failure when configuring Filebeat.

2) Changed log files. File handlers for removed or renamed log files can exhaust disk space: if a file is removed or renamed, Filebeat continues to read it and the handler continues to consume resources. If you have multiple harvesters working, this situation can become problematic.

3) The Filebeat registry file. The registry file is saved locally and helps Filebeat ensure that logs are not lost if Elasticsearch or Logstash goes offline unexpectedly. This file can become quite large and begin to consume a lot of memory.

4) YAML syntax. When editing the Filebeat .yml file, remember that the formatting is very sensitive. Be careful with the indentation and structure, because one mistake can break the entire configuration and pipeline.


Filebeat is an efficient, reliable, and relatively easy-to-use log shipper. By following the guidelines in this tutorial, you can make the most of it to enhance the productivity of your ELK Stack.

