Filebeat 401 Unauthorized Error with AWS Elasticsearch

Overview

Filebeat is a lightweight shipper for forwarding and centralizing log data. Installed as an agent on your servers, Filebeat monitors the log files or locations that you specify, collects log events, and forwards them either to Elasticsearch or Logstash for indexing.

Here’s how Filebeat works: When you start Filebeat, it starts one or more inputs that look in the locations you’ve specified for log data. For each log that Filebeat locates, Filebeat starts a harvester. Each harvester reads a single log for new content and sends the new log data to libbeat, which aggregates the events and sends the aggregated data to the output that you’ve configured for Filebeat.

If you are getting the error below while importing data into AWS Elasticsearch directly from Filebeat, then this post is for you!

Exiting: 1 error: error loading index pattern: returned 401 to import file: . Response: {"statusCode":401,"error":"Unauthorized","message":"Authentication required"}

This issue occurs if you are connecting to AWS Elasticsearch with username/password security, as in:

setup.kibana:
  host: "https://arun-learningsubway-abxybalglzl3zmkmiq4.ap-south-1.es.amazonaws.com:443/_plugin/kibana/"

output.elasticsearch:
  protocol: https
  hosts: ["arun-learningsubway-workapps-abxybalglzl3zmkmiq4.ap-south-1.es.amazonaws.com:9200"]
  username: "myUsername"
  password: "myPassword"
  index: "nginx_index_by_arun"

Solution

In AWS, while configuring your Elasticsearch service, configure it for IP whitelisting instead of a master user.

or

Configure Filebeat → Logstash → Elasticsearch with the master username/password; this will also work.

Filebeat and Logstash to insert Data into AWS Elasticsearch

Filebeat inserts data into Logstash, and Logstash inserts data into Elasticsearch.

* An important point here: the latest Elasticsearch version supported on AWS is 7.10, so Logstash and Filebeat must also be on the same version.

If not, there is a possibility of version incompatibility.

* If the latest version of ES available is x and you are not on the cloud, then also keep at least version (x-1) in production. It will keep you safe in production and away from product bugs to a large extent.

Click and download Filebeat 7.10 and Logstash 7.10.

Configuration of Filebeat to insert nginx logs into Logstash

Open filebeat.yml in any editor of your choice from

/etc/filebeat/ on Linux or

C:\Program Files\filebeat-7.10.0 on Windows

filebeat.inputs:
  - type: log
    paths:
      - E:/nginx-1.20.1/logs/*.log

filebeat.config.modules:
  enabled: true
  path: ${path.config}/modules.d/*.yml

output:
  logstash:
    hosts: ["localhost:5044"]

Logstash Configuration

input {
  beats {
    port => 5044
    ssl => false
  }
}

filter {
  grok {
    match => [ "message", "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" ]
    overwrite => [ "message" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
  }
  geoip {
    source => "clientip"
    target => "geoip"
    add_tag => [ "nginx-geoip" ]
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    remove_field => [ "timestamp" ]
  }
  useragent {
    source => "agent"
  }
}

output {
  elasticsearch {
    hosts => ["https://arun-learningsubway-ybalglooophuhyjmik3zmkmiq4.ap-south-1.es.amazonaws.com:443"]
    index => "arun_nginx"
    document_type => "%{[@metadata][type]}"
    user => "myusername"
    password => "mypassword"
    manage_template => false
    template_overwrite => false
    ilm_enabled => false
  }
}

Commands to Run on Windows

To run nginx:
cd D:\nginx
start nginx

To kill the nginx process:
taskkill /IM "nginx.exe" /F

To run Filebeat

To enable the nginx module:
.\filebeat.exe modules enable nginx

To start Filebeat:
C:\Program Files\filebeat-7.10.0> .\filebeat.exe -e

To run Logstash

C:\logstash> .\bin\logstash.bat -f .\config\logstash.conf

Know Your Elasticsearch!

Q) What is Elasticsearch?

Ans) Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial (geo-location), structured, and unstructured.

Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now known as Elastic).

Besides the REST APIs, there are ample tools for data ingestion, enrichment, storage, analysis, and visualization. Thanks to the REST-based CRUD features, it is easy to integrate with languages/platforms like Java, Python, or Spring Boot.
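For example, here is a minimal sketch of indexing a document from Java with the Elasticsearch 7.x high-level REST client; the index name "blog" and the field values are hypothetical, chosen just for illustration:

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public class BlogIndexer {
    // assumes an already-configured RestHighLevelClient (see the Java config at the end of this post)
    public static String indexBlogPost(RestHighLevelClient client) throws Exception {
        IndexRequest request = new IndexRequest("blog")          // hypothetical index name
                .id("1")                                         // document _id
                .source("title", "my first post", "year", 2021); // key/value field pairs
        IndexResponse response = client.index(request, RequestOptions.DEFAULT);
        return response.getId();
    }
}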

Q) Two case studies for Elasticsearch, or areas where we can use it?

Ans) a) ELK Stack

Any application generates logs. We can collect them with Logstash, which stores these logs in Elasticsearch. After insertion, we can view the data in Kibana, where we can write queries to analyze it in tabular or graphical form.

b) Searching text in a Java application

I have a blog; I can insert its content into Elasticsearch using Java (REST CRUD API).

On top of that, I can use Spring Data Elasticsearch (JPA-style repositories) to search text in Elasticsearch and bring related results back as the result of a GET API.

Q) Is Elasticsearch a NoSQL database?

Ans) Yes

Q) Is Elasticsearch built upon the Lucene engine?

Ans) Yes

Q) What are the terminologies of Elasticsearch?

Ans) Field, document, index and cluster.

Q) Map the above Elasticsearch terminologies to RDBMS?

Ans)

Elasticsearch    RDBMS
Cluster          Database
Index            Table
Document         Row
Field            Column

● Cluster: A cluster is a collection of one or more nodes that together holds the entire data. It provides federated indexing and search capabilities across all nodes and is identified by a unique name (by default it is ‘elasticsearch’).

● Node: A node is a single server which is a part of cluster, stores data and participates in the cluster’s indexing and search capabilities.

● Index: An index is a collection of documents with similar characteristics and is identified by a name. This name is used to refer to the index while performing indexing, search, update, and delete operations against the documents in it.

● Type: A type is a logical category/partition of an index whose semantics is completely up to the user. It is defined for documents that have a set of common fields. You can define more than one type in your index.

● Document: A document is a basic unit of information which can be indexed. It is expressed in JSON, which is a global internet data interchange format.

Documents also contain reserved fields that constitute the document metadata such as:

  1. _index – the index where the document resides
  2. _type – the type that the document represents
  3. _id – the unique identifier for the document

An example of a document:

{
   "_id": 3,
   "_type": "your index type",
   "_index": "your index name",
   "_source": {
      "age": 32,
      "name": ["arun"],
      "year": 1989
   }
}
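Reading such a document back from Java returns the same metadata fields. A minimal sketch with the 7.x high-level REST client; the index name and id here are placeholders:

import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public class DocumentReader {
    public static void printDocument(RestHighLevelClient client) throws Exception {
        GetRequest getRequest = new GetRequest("your_index_name", "3"); // _index and _id
        GetResponse response = client.get(getRequest, RequestOptions.DEFAULT);
        System.out.println(response.getIndex());        // _index metadata
        System.out.println(response.getId());           // _id metadata
        System.out.println(response.getSourceAsMap());  // the _source body
    }
}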

● Shards: Elasticsearch provides the ability to subdivide the index into multiple pieces called shards. Each shard is in itself a fully-functional and independent "index" that can be hosted on any node within the cluster.

● Replicas: Elasticsearch allows you to make one or more copies of your index’s shards which are called replica shards or replica.
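As an illustration, here is a minimal sketch (assuming the RestHighLevelClient configured at the end of this post) of creating an index with explicit shard and replica counts; the counts are arbitrary:

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.settings.Settings;

public class IndexCreator {
    public static void createIndex(RestHighLevelClient client) throws Exception {
        CreateIndexRequest request = new CreateIndexRequest("arun_nginx");
        request.settings(Settings.builder()
                .put("index.number_of_shards", 3)     // subdivide the index into 3 pieces
                .put("index.number_of_replicas", 1)); // one copy of each shard
        client.indices().create(request, RequestOptions.DEFAULT);
    }
}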

Q) Why is Elasticsearch faster in searching than file search/RDBMS search?

Ans) It all depends on how these systems store data, rather than how they retrieve it.

Let me explain: if I have 1000 blogs and three of them contain the word ShRaam,

then an RDBMS/file system will go blog by blog, search the entire content of each page, and then bring back the three that have the matching term.

Elasticsearch, on the other hand, makes use of an inverted index, i.e. it stores the words of the pages as keys pointing to those pages:

ShRaam → pages x, y and z

So when you search for the keyword ShRaam, it simply brings back the three pages where it is present, instead of searching through page content at query time.
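To make the idea concrete, here is a toy inverted index in plain Java. This is not how Lucene implements it internally; it is just the core data structure:

import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class InvertedIndexDemo {
    // word -> set of page ids that contain the word
    private final Map<String, Set<String>> index = new HashMap<>();

    public void addPage(String pageId, String content) {
        for (String word : content.toLowerCase().split("\\W+")) {
            index.computeIfAbsent(word, k -> new HashSet<>()).add(pageId);
        }
    }

    public Set<String> search(String word) {
        return index.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexDemo demo = new InvertedIndexDemo();
        demo.addPage("x", "a blog that mentions ShRaam");
        demo.addPage("y", "ShRaam appears here too");
        demo.addPage("z", "another ShRaam page");
        demo.addPage("w", "nothing relevant here");
        System.out.println(demo.search("ShRaam")); // pages x, y, z (in some order)
    }
}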

Q) Name three companies using Elasticsearch?

Ans) Netflix

Walmart

eBay

Spring Data Elasticsearch Queries

Add only the spring-data-elasticsearch dependency.
--------------------Model----------

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "arun_order") // index name assumed for illustration
public class ArunOrder {
    @Id
    private Integer id;

    private Integer userId;
    private String description;

    private Boolean hidden;

    @Field(type = FieldType.Date)
    private Long createdDate;

    @Field(type = FieldType.Date)
    private Long updatedDate;

    // getters and setters
}

------------------Repository--------------
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;

import java.util.List;

@Repository
public interface MyOrderElastricSearchRepository extends ElasticsearchRepository<ArunOrder, Integer> {

    // Search a text anywhere in the order description
    List<ArunOrder> findByDescriptionContaining(String subject);

    // Search a text in the description for orders of a particular user only
    List<ArunOrder> findByDescriptionContainingAndUserId(String subject, Integer userId);

    // Search a text in the description for visible orders of a particular user only
    List<ArunOrder> findByDescriptionContainingAndUserIdAndHiddenFalse(String subject, Integer userId);

    // Search a text in the description for visible orders of a particular user only, sorted by created date
    List<ArunOrder> findByDescriptionContainingAndUserIdAndHiddenFalseOrderByCreatedDateDesc(String subject, Integer userId);
}
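A minimal sketch of how such a repository might be used from a service; the service class and method names here are illustrative, not part of the original code:

import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class OrderSearchService {

    @Autowired
    private MyOrderElastricSearchRepository repository;

    // Spring Data derives an Elasticsearch query from the method name at runtime
    public List<ArunOrder> searchVisibleOrders(String text, Integer userId) {
        return repository.findByDescriptionContainingAndUserIdAndHiddenFalse(text, userId);
    }
}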

application.yml and jars required for Spring Boot 2.4+ with Elasticsearch 7.1+

build.gradle

ext {
    springBootVersion = '2.5.0'
}

dependencies {
    implementation('org.springframework.boot:spring-boot-starter-web') {
      exclude module: "spring-boot-starter-tomcat"
    }

    implementation('org.springframework.data:spring-data-elasticsearch')

}

application.yml

spring:
  elasticsearch:
    rest:
      uris: http://localhost:9200

or with username and password:

spring:
  elasticsearch:
    rest:
      uris: https://MyHost:9200
      username: arun
      password: mypassword

Elasticsearch Java Config

package com.arun.config;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.nio.client.HttpAsyncClientBuilder;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.RestClients;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.repository.config.EnableElasticsearchRepositories;

@Configuration
@EnableElasticsearchRepositories(basePackages = "com.arun.elasticsearch.repository")
@ComponentScan(basePackages = { "com.arun.elasticsearch.service" })
public class ElasticSearchConfig {
    private static final String HOST = "aws_host";   // or "localhost" for a local cluster
    private static final int PORT = 443;              // or 9200 for a local cluster
    private static final String PROTOCOL = "https";   // or "http" for a local cluster

    @Bean
    public RestHighLevelClient client() {
       final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY,
                new UsernamePasswordCredentials("UserName", "Password"));

        RestClientBuilder builder = RestClient.builder(new HttpHost(HOST, PORT, PROTOCOL))
                .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                    @Override
                    public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                        return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                    }
                });

        // return the built client; without this the bean method does not compile
        return new RestHighLevelClient(builder);
    }

    @Bean
    public ElasticsearchOperations elasticsearchTemplate() {
        // wires the Spring Data repositories to the client above (uses the imports already present)
        return new ElasticsearchRestTemplate(client());
    }
}
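With this client bean in place, you can also query indices directly. Here is a minimal sketch of a match query, assuming the "arun_nginx" index created by the Logstash pipeline above:

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class NginxLogSearcher {
    public static long countMatches(RestHighLevelClient client, String text) throws Exception {
        SearchRequest searchRequest = new SearchRequest("arun_nginx");
        searchRequest.source(new SearchSourceBuilder()
                .query(QueryBuilders.matchQuery("message", text))); // full-text match on the log line
        SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
        return response.getHits().getTotalHits().value;             // number of matching documents
    }
}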