Humio: Parsing a JSON substring in a message string

This blog post has been edited with ChatGPT March 14 version. Apologies for my laziness.

If you’re working with Humio logs and need to extract specific information from them, you’re not alone. Parsing logs can be a challenging task, but with the right query, you can easily filter and extract the data you need. In this blog post, we’ll show you how to parse a Humio log and find the top 10 error codes using a simple query.

Let’s start by looking at an example Humio log, with a message field like this:

Webhook from App: {"id":"12345678-1234-1234-1234-123456789012","errorCode":"ERROR_CODE_123","timestamp":"2023-03-25T15:25:02.685193369Z","$type":"ErrorEvent"}

In this log, we have an “ErrorEvent” that contains an “id”, “errorCode”, “timestamp”, and a “$type” field. We want to extract the “errorCode” field and find the top 10 error codes. The problem often is that the JSON sits inside a plain string, and parsing it can get a bit messy.

To do this, we can use the following query:

source=your_app_logs | regex("Webhook from App: (?<incomingWebhookJSON>\\S+)",field=message) | parseJson(field=incomingWebhookJSON) | top(errorCode)  

Let’s break down this query step by step:

  • source=your_app_logs: This filters the logs to only show those from the your_app_logs source. Replace this with the source of your own logs, and make sure your Humio dashboard only shows the message lines you plan to work on.
  • regex("Webhook from App: (?\incomingWebhookJSON>\\S+)",field=message): This parses the JSON substring of your message string, and copies it to a new field called: incomingWebhookJSON.
  • parseJson(field=incomingWebhookJSON): The calculated field from previous step is now turned into key value pairs, available for subsequent operations.
  • top(errorCode): This groups the logs by the “errorCode” field and lists the most frequent values, 10 by default (see the variation below).
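
If you want a different number of results, top() also accepts an explicit limit parameter; a small variation on the query above (same source and field names) would be:

source=your_app_logs | regex("Webhook from App: (?<incomingWebhookJSON>\\S+)",field=message) | parseJson(field=incomingWebhookJSON) | top(errorCode, limit=20)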

Profit

And that’s it! With this query, we can easily find the top 10 error codes in our logs. You can customize this query to match the structure of your logs and the field you want to extract.

In conclusion, parsing Humio logs doesn’t have to be difficult. With the right query, you can easily filter and extract the data you need. We hope this blog post has been helpful in showing you how to find the top 10 error codes in your logs.

Multiple containers on GCP with Nginx, Let’s-Encrypt and a web-server.

GCP provides Docker-optimised compute nodes where you can spin up very lightweight (minimal-footprint) containers. They run on the so-called Container-Optimised OS, and it works great for simple use cases. However, there is a limitation: you can only auto-configure it to run a single container at a time. The OS takes care of restarting, port mapping, etc., but this limitation sometimes makes it tough to run your hobby projects.

Let's go over a use case here. For this example, we take Metabase, which runs a Jetty app server on port 3000.

Use Case

  1. You have your primary docker service (Metabase) running on port 3000.
  2. You want to run an nginx proxy on 80/443.
  3. You want to set up a certbot + LetsEncrypt SSL certificate on this server as well, so you have your service securely out in the open.

Prerequisites

  1. You have a GCP Compute node up with Metabase running on port 3000. You can follow this tutorial here, in case you run into trouble.
  2. You have assigned a public IP to the node, and now your setup is accessible at http://your-public-ip:3000

Setup

At this point, you have a Metabase container running on your GCP machine. You can log in via SSH, or use the in-browser console, and confirm it with docker ps on your compute node. It's time to set up the nginx reverse proxy.

Starting with the basics, we need some nginx configurations. This config does two things:

  1. Start the nginx server on port 80/443. For this example, we use the host network itself, since GCP uses the host network when running containers.
  2. Point the certbot challenge location to the right place, so you can get your certs later.
user_name@gcpnode $ mkdir nginx 
user_name@gcpnode $ nano nginx/nginx.conf

Now, setting up nginx/nginx.conf is going to be easy. Let's see how it should look:

events {
  worker_connections  4096;  ## Default: 1024
}

http {
    log_format combined_ssl '$remote_addr - $remote_user [$time_local] '
                            '$ssl_protocol/$ssl_cipher '
                            '"$request" $status $body_bytes_sent '
                            '"$http_referer" "$http_user_agent"';
    server {
      listen 80;
      server_name subdomain.domain.com;
    
      location /.well-known/acme-challenge/ {
        root /var/www/certbot;
      }
    
      location / {
          return 301 https://$host$request_uri;
      }
  }

}

Now, that would be enough to serve the certbot challenges. Let's run the nginx server now. If it complains about missing directories, just create them (as below).
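
For reference, the bind-mounted directories used in the next command can be created up front (assuming the same /home/user_name layout as in this post):

user_name@gcpnode $ mkdir -p /home/user_name/certbot/letsencrypt /home/user_name/certbot/www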

user_name@gcpnode $ docker run --network host -p 80:80 -p 443:443 \
    -v /home/user_name/nginx/nginx.conf:/etc/nginx/nginx.conf \
    -v /home/user_name/certbot/letsencrypt:/etc/letsencrypt \
    -v /home/user_name/certbot/www:/var/www/certbot \
    -d nginx

Now check if nginx is running correctly, using docker ps. If things are good, it's time to run the certbot challenge.

user_name@gcpnode $ docker run --rm --name temp_certbot \
    -v /home/user_name/certbot/letsencrypt:/etc/letsencrypt \
    -v /home/user_name/certbot/www:/tmp/letsencrypt \
    -v /home/user_name/servers-data/certbot/log:/var/log \
    certbot/certbot:v1.8.0 certonly --webroot --agree-tos --renew-by-default \
    --preferred-challenges http-01 \
    --server https://acme-v02.api.letsencrypt.org/directory \
    --text --email useremail@domain.com \
    -w /tmp/letsencrypt -d subdomain.domain.com

There we go. If things go alright, you should have your certs in /home/user_name/certbot/letsencrypt.
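
A quick way to verify, assuming the same paths as above (you may need sudo, since the certbot container writes these files as root):

user_name@gcpnode $ sudo ls /home/user_name/certbot/letsencrypt/live/subdomain.domain.com
# should list cert.pem, chain.pem, fullchain.pem and privkey.pem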

Now that this is done, it's time to route some SSL traffic to your webserver. Let's edit the nginx config one more time and add the necessary routes to your web server. In this case, it's Metabase running on the same machine, serving HTTP on port 3000.

Let's update nginx/nginx.conf again to route the reverse-proxied traffic to your Metabase installation.

events {
  worker_connections  4096;  ## Default: 1024
}

http {
    log_format combined_ssl '$remote_addr - $remote_user [$time_local] '
                            '$ssl_protocol/$ssl_cipher '
                            '"$request" $status $body_bytes_sent '
                            '"$http_referer" "$http_user_agent"';
    server {
        listen 80;
        server_name subdomain.domain.com;

        location /.well-known/acme-challenge/ {
        root /var/www/certbot;
        }

        location / {
          return 301 https://$host$request_uri;
        }
    }

    server {
        listen 443 ssl;
        server_name subdomain.domain.com;

        access_log /var/log/nginx/access.log combined_ssl;

        ssl_certificate /etc/letsencrypt/live/subdomain.domain.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/subdomain.domain.com/privkey.pem;

        location / {
            set $upstream "site_upstream";

            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Host $http_host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

            proxy_set_header X-Real-Port $server_port;
            proxy_set_header X-Real-Scheme $scheme;
            proxy_set_header X-NginX-Proxy true;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header X-Forwarded-Ssl on;

            expires off;
            proxy_pass http://$upstream;
        }
    }

    upstream site_upstream{
        server your-gcp-private-ip:3000;
    }
}

Make sure to update the server_name, certificate paths and upstream address in the config to match your install. Once these are done, it's time to restart the nginx container to enjoy the new HTTPS service. You can do that with something like:

user_name@gcpnode $ docker ps # and, note the nginx container id. 
user_name@gcpnode $ docker stop <container_id> # pass the right id.
user_name@gcpnode $ docker run --network host -p 80:80 -p 443:443 \
    -v /home/user_name/nginx/nginx.conf:/etc/nginx/nginx.conf \
    -v /home/user_name/certbot/letsencrypt:/etc/letsencrypt \
    -v /home/user_name/certbot/www:/var/www/certbot \
    -d nginx

# Hopefully, things start alright here. Check docker logs for clarity. 
user_name@gcpnode $ docker logs -f <container_id> # Use the new id.

Now you should have your metabase service running under subdomain.domain.com. Things to think about:

  1. Container-Optimized OS on GCP uses the host network by default, so the -p port mappings in this example are effectively ignored. Running the containers on a separate Docker network would arguably be the cleaner way to do it.
  2. You should point an A record in your DNS at the GCP public IP.
  3. You should not need to expose any ports through the GCP firewall other than HTTP/HTTPS. You could even block HTTP once the certs are issued, but note that the http-01 renewal challenge needs port 80.
  4. Since Metabase handles sensitive data, it is always recommended to host it behind a secure-access VPN or Cloudflare for Teams for access control.

Hope it helps.

[OpenStack] Get IPv4 address of a VM from compute object

Recently came across this scenario:

  1. I create a VM with conn.compute.create_server(*args) and get back the server object. The server is allocated an IP over DHCP.
  2. I want the IPv4 address of the machine.

Seems tough? I finally found this one:

from openstack import connection

conn = connection.Connection(
    auth_url=configs['auth']['OS_AUTH_URL'],
    project_name=configs['auth']['OS_PROJECT_NAME'],
    username=configs['auth']['OS_USERNAME'],
    password=configs['auth']['OS_PASSWORD'],
    project_domain_name=configs['auth']['OS_PROJECT_DOMAIN_NAME'],
    user_domain_name=configs['auth']['OS_USER_DOMAIN_NAME']
)
# Define server_name, image, flavor, keypair, security_groups_list and
# user_data_file_opened elsewhere. An example network config is given below.
network = {
  "name": "personal_network",
  "security_group":"open",
  "subnet": {
    "name": "personal_network_subnet",
    "ip_version": "4",
    "cidr": "10.10.60.0/24",
    "dns_servers":["8.8.8.8","8.8.8.4"],
    "gateway_ip": "10.10.60.1"
  }
}
network_ = [x for x in conn.network.networks(name=network['name'])][0]
node = conn.compute.create_server(
    name=server_name,
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network_.id}],
    key_name=keypair.name,
    security_groups=security_groups_list,
    user_data=user_data_file_opened
)
node_ = conn.compute.wait_for_server(node, wait=360)
node_ip = conn.compute.get_server(node.id).to_dict()['addresses'][network['name']][0]['addr']

print(f'New node ip is {node_ip}')
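
If you would rather not hard-code the network name or the list index, a slightly more defensive variant (a sketch, not from the gist) walks the addresses dict and takes the first IPv4 entry:

# Sketch: pick the first IPv4 address on any network attached to the server.
addresses = conn.compute.get_server(node.id).to_dict()['addresses']
node_ip = next(
    entry['addr']
    for entries in addresses.values()
    for entry in entries
    if entry.get('version') == 4
)
print(f'New node ip is {node_ip}')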

I have pasted the gist here, https://gist.github.com/tonythomas01/e7cecc6c1aaa4d4ca221487659ef9f40

Tell me how it goes, good luck.

Basic CRUD with Openstack Python V2.x clients

Last week I had this shiny assignment from a company here, as part of a thesis interview, to build some Python scripts using the Openstack Python clients. It made me write some code which eventually got me through. The recent upgrade of the Openstack Python clients from API v1 to v2 has left most of the parts undocumented or scattered here and there, so here you go.

Keystone: authenticate

In case you are just playing around with the demo project and the default domains, this should return an auth session.

from keystoneauth1.identity import v3
from keystoneauth1 import session

def authenticate_and_return_session(auth_url='', username=None, password=None):
    auth = v3.Password(
        auth_url=auth_url, username=username,
        password=password, project_name="demo", user_domain_id="default",
        project_domain_id="default"
    )
    return session.Session(auth=auth)
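
Usage is straightforward; the auth URL and credentials below are placeholders for your own deployment:

sess = authenticate_and_return_session(
    auth_url='http://controller:5000/v3',
    username='admin',
    password='secret'
)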

Keystone: list projects

from keystoneclient.v3 import client as keystoneclient

def list_projects(session=None):
    keystone = keystoneclient.Client(session=session)
    return keystone.projects.list()
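
With the session from above, for example:

projects = list_projects(session=sess)
print([project.name for project in projects])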

Glance: list images and return the first one

You might now want to list all the images and pick which one to use:

from glanceclient import Client

def list_images(session=None):
    # now check out images from glance

    glance = Client('2', session=session)

    image_ids = []
    for image in glance.images.list():
        image_ids.append(image.id)

    print('{0} Images found'.format(len(image_ids)))
    # Return the first image id
    return image_ids[0]

Nova: list flavors and return the `tiny` one

from novaclient import client as novaclient

def get_your_flavor(session=None, flav_name='m1.tiny'):
    nova = novaclient.Client('2.1', session=session)
    return nova.flavors.find(name=flav_name)

Neutron: Create network

This is one of the important steps: here we create a custom network and subnet on which our VM will reside.


from neutronclient.v2_0 import client
def create_network(session, network_name='test_net'):
    neutron = client.Client(session=session)

    return neutron.create_network(
        body={"network": {"name": network_name, "admin_state_up": True}}
    )

Neutron: Create your custom subnet

def create_subnet(neutronclient=None, net=None, cidr='192.168.2.0/24'):
    return neutronclient.create_subnet(
        body={
            'subnet': {
                'name': 'test_sub', 'network_id': net['network']['id'],
                'ip_version': 4, 'cidr': cidr, 'enable_dhcp': True
            }
        }
    )

Neutron: Connect new subnet to the default router

This adds an interface on the default router for our new subnet:

sub = create_subnet(neutronclient=neutron, net=net)
neutron.add_interface_router(
    neutron.list_routers()['routers'][0]['id'],
    body={
        'subnet_id': sub['subnet']['id']
    }
)

Nova: Create your instance, connect your NIC to the new network

def create_instance(nova, net, image, flav, instance_ip_address='192.168.1.5'):
    # nova, net, image and flav come from the earlier Nova/Neutron/Glance steps
    nics = [
        {
            'net-id': net['network']['id'],
            'v4-fixed-ip': '{0}'.format(instance_ip_address)
        }
    ]
    instance = nova.servers.create(
        name='api-test', image=image, flavor=flav, nics=nics
    )
    if instance:
        print('Created: {0}'.format(instance))
    return instance

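To tie the pieces together, here is a rough, untested sketch chaining the helpers above; the auth URL, credentials and IP are placeholders:

# Rough sketch chaining the helper functions defined above.
sess = authenticate_and_return_session(
    auth_url='http://controller:5000/v3', username='demo', password='secret'
)
image_id = list_images(session=sess)      # first Glance image id
flavor = get_your_flavor(session=sess)    # m1.tiny by default

net = create_network(session=sess)        # creates 'test_net'
neutron = client.Client(session=sess)
sub = create_subnet(neutronclient=neutron, net=net)
neutron.add_interface_router(
    neutron.list_routers()['routers'][0]['id'],
    body={'subnet_id': sub['subnet']['id']}
)

nova = novaclient.Client('2.1', session=sess)
create_instance(nova, net, image_id, flavor, instance_ip_address='192.168.2.5')
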
That's it! You can see a better version here though. Leave a comment if you found this interesting.